1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
15/// We associate a few shadow bits with every byte of the application memory,
16/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
17/// bits on every memory read, propagate the shadow bits through some of the
18/// arithmetic instructions (including MOV), store the shadow bits on every
19/// memory write, and report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
21///
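/// As a rough sketch (not the IR the pass actually emits), instrumenting a
/// small copy-and-branch sequence behaves like the following, where
/// shadow_of() and report_if_poisoned() are illustrative stand-ins for the
/// real shadow mapping and runtime calls:
/// \code
///   uint32_t a    = *p;
///   uint32_t s_a  = *shadow_of(p);   // load shadow together with the value
///   uint32_t b    = a + 1;
///   uint32_t s_b  = s_a;             // propagate shadow through arithmetic
///   *q            = b;
///   *shadow_of(q) = s_b;             // store shadow together with the value
///   if (b)                           // branching on b is an unsafe use:
///     report_if_poisoned(s_b);       //   warn if any shadow bit is set
/// \endcode
///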
22/// But there are differences too. The first and the major one:
23/// compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But this brings the major issue
26/// as well: msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
44///
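/// For example, for a call `y = f(x)` the instrumented caller conceptually
/// does something like the sketch below (simplified; the real code writes to
/// the TLS arrays declared in createUserspaceApi(), at the offset assigned to
/// each argument):
/// \code
///   __msan_param_tls[0] = s_x;             // pass the shadow of the argument
///   int y = f(x);
///   uint64_t s_y = __msan_retval_tls[0];   // shadow of the return value
/// \endcode
///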
45/// Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
56///
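/// For a binary operation `c = a op b` this amounts to, roughly (which dirty
/// operand wins when both are poisoned is an implementation detail):
/// \code
///   origin_c = (s_b != 0) ? origin_b : origin_a;
/// \endcode
///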
57/// Every aligned group of 4 consecutive bytes of application memory has one
58/// origin value associated with it. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
65/// This way it avoids needlessly overwriting the origin of the 4-byte region
66/// on a short (i.e. 1-byte) clean store, and it is also good for performance.
67///
68/// Atomic handling.
69///
70/// Ideally, every atomic store of an application value should update the
71/// corresponding shadow location in an atomic way. Unfortunately, atomically
72/// storing to two disjoint locations cannot be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
83///
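/// For instance, an atomic release store is conceptually instrumented as
/// follows (a sketch; shadow_of() stands in for the real shadow mapping):
/// \code
///   *shadow_of(p) = 0;                         // store clean shadow first
///   __atomic_store_n(p, v, __ATOMIC_RELEASE);  // then the application store
/// \endcode
/// and the matching acquire load reads the shadow only after the application
/// load, so a racing reader can only observe shadow that is at least as clean
/// as the value it reads.
///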
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
87/// after the app operation. Computers don't work this way. Current
88/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
91///
92/// Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
95/// become initialized depending on the arguments. It may be possible to figure
96/// out which arguments are meant to point to inputs and outputs, but the
97/// actual semantics may only be visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate the offset for a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
100/// pointers. We take a conservative approach generating calls to
101/// __msan_instrument_asm_store(ptr, size),
102/// which defers the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
108///
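/// For example, for an asm statement that writes through a pointer operand,
/// the conservative mode conceptually inserts (sketch):
/// \code
///   __msan_instrument_asm_store(&out, sizeof(out));  // before the asm
///   asm volatile("..." :: "r"(&out) : "memory");
/// \endcode
///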
109/// KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112/// - KMSAN always tracks the origins and implies msan-keep-going=true;
113/// - KMSAN allocates shadow and origin memory for each page separately, so
114/// there are no explicit accesses to shadow and origin in the
115/// instrumentation.
116/// Shadow and origin values for a particular X-byte memory location
117/// (X=1,2,4,8) are accessed through pointers obtained via the
118/// __msan_metadata_ptr_for_load_X(ptr)
119/// __msan_metadata_ptr_for_store_X(ptr)
120/// functions. The corresponding functions check that the X-byte accesses
121/// are possible and return the shadow and origin pointers (see the sketch below).
122/// Arbitrary sized accesses are handled with:
123/// __msan_metadata_ptr_for_load_n(ptr, size)
124/// __msan_metadata_ptr_for_store_n(ptr, size);
125/// Note that the sanitizer code has to deal with how shadow/origin pairs
126/// returned by these functions are represented in different ABIs. In
127/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
128/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
129/// pointed to by a hidden parameter.
130/// - TLS variables are stored in a single per-task struct. A call to a
131/// function __msan_get_context_state() returning a pointer to that struct
132/// is inserted into every instrumented function before the entry block;
133/// - __msan_warning() takes a 32-bit origin parameter;
134/// - local variables are poisoned with __msan_poison_alloca() upon function
135/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
136/// function;
137/// - the pass doesn't declare any global variables or add global constructors
138/// to the translation unit.
139///
140/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
141/// calls, staying on the safe side with respect to possible false positives.
142///
143/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
144/// moment.
145///
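/// As an illustration of the KMSAN scheme above, the shadow/origin update for
/// a 4-byte store is conceptually (a sketch of what the emitted code does via
/// the runtime; the struct and field names below are illustrative only):
/// \code
///   struct msan_metadata { void *shadow, *origin; };
///   struct msan_metadata m = __msan_metadata_ptr_for_store_4(p);
///   *(uint32_t *)m.shadow = s_v;            // shadow of the stored value
///   if (s_v)
///     *(uint32_t *)m.origin = origin_of_v;  // origin only if poisoned
/// \endcode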
146//
147// FIXME: This sanitizer does not yet handle scalable vectors
148//
149//===----------------------------------------------------------------------===//
150
151#include "llvm/Transforms/Instrumentation/MemorySanitizer.h"
152#include "llvm/ADT/APInt.h"
153#include "llvm/ADT/ArrayRef.h"
154#include "llvm/ADT/DenseMap.h"
156#include "llvm/ADT/SetVector.h"
157#include "llvm/ADT/SmallPtrSet.h"
158#include "llvm/ADT/SmallVector.h"
160#include "llvm/ADT/StringRef.h"
164#include "llvm/IR/Argument.h"
166#include "llvm/IR/Attributes.h"
167#include "llvm/IR/BasicBlock.h"
168#include "llvm/IR/CallingConv.h"
169#include "llvm/IR/Constant.h"
170#include "llvm/IR/Constants.h"
171#include "llvm/IR/DataLayout.h"
172#include "llvm/IR/DerivedTypes.h"
173#include "llvm/IR/Function.h"
174#include "llvm/IR/GlobalValue.h"
176#include "llvm/IR/IRBuilder.h"
177#include "llvm/IR/InlineAsm.h"
178#include "llvm/IR/InstVisitor.h"
179#include "llvm/IR/InstrTypes.h"
180#include "llvm/IR/Instruction.h"
181#include "llvm/IR/Instructions.h"
183#include "llvm/IR/Intrinsics.h"
184#include "llvm/IR/IntrinsicsAArch64.h"
185#include "llvm/IR/IntrinsicsX86.h"
186#include "llvm/IR/MDBuilder.h"
187#include "llvm/IR/Module.h"
188#include "llvm/IR/Type.h"
189#include "llvm/IR/Value.h"
190#include "llvm/IR/ValueMap.h"
193#include "llvm/Support/Casting.h"
195#include "llvm/Support/Debug.h"
205#include <algorithm>
206#include <cassert>
207#include <cstddef>
208#include <cstdint>
209#include <memory>
210#include <numeric>
211#include <string>
212#include <tuple>
213
214using namespace llvm;
215
216#define DEBUG_TYPE "msan"
217
218DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
219 "Controls which checks to insert");
220
221DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
222 "Controls which instruction to instrument");
223
224static const unsigned kOriginSize = 4;
225static const Align kMinOriginAlignment = Align(4);
226static const Align kShadowTLSAlignment = Align(8);
227
228// These constants must be kept in sync with the ones in msan.h.
229static const unsigned kParamTLSSize = 800;
230static const unsigned kRetvalTLSSize = 800;
231
232// Access sizes are powers of two: 1, 2, 4, 8.
233static const size_t kNumberOfAccessSizes = 4;
234
235/// Track origins of uninitialized values.
236///
237/// Adds a section to MemorySanitizer report that points to the allocation
238/// (stack or heap) the uninitialized bits came from originally.
239static cl::opt<int> ClTrackOrigins(
240 "msan-track-origins",
241 cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
242 cl::init(0));
243
244static cl::opt<bool> ClKeepGoing("msan-keep-going",
245 cl::desc("keep going after reporting a UMR"),
246 cl::Hidden, cl::init(false));
247
248static cl::opt<bool>
249 ClPoisonStack("msan-poison-stack",
250 cl::desc("poison uninitialized stack variables"), cl::Hidden,
251 cl::init(true));
252
253static cl::opt<bool> ClPoisonStackWithCall(
254 "msan-poison-stack-with-call",
255 cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
256 cl::init(false));
257
258static cl::opt<int> ClPoisonStackPattern(
259 "msan-poison-stack-pattern",
260 cl::desc("poison uninitialized stack variables with the given pattern"),
261 cl::Hidden, cl::init(0xff));
262
263static cl::opt<bool>
264 ClPrintStackNames("msan-print-stack-names",
265 cl::desc("Print name of local stack variable"),
266 cl::Hidden, cl::init(true));
267
268static cl::opt<bool>
269 ClPoisonUndef("msan-poison-undef",
270 cl::desc("Poison fully undef temporary values. "
271 "Partially undefined constant vectors "
272 "are unaffected by this flag (see "
273 "-msan-poison-undef-vectors)."),
274 cl::Hidden, cl::init(true));
275
276static cl::opt<bool> ClPoisonUndefVectors(
277 "msan-poison-undef-vectors",
278 cl::desc("Precisely poison partially undefined constant vectors. "
279 "If false (legacy behavior), the entire vector is "
280 "considered fully initialized, which may lead to false "
281 "negatives. Fully undefined constant vectors are "
282 "unaffected by this flag (see -msan-poison-undef)."),
283 cl::Hidden, cl::init(false));
284
285static cl::opt<bool> ClPreciseDisjointOr(
286 "msan-precise-disjoint-or",
287 cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
288 "disjointedness is ignored (i.e., 1|1 is initialized)."),
289 cl::Hidden, cl::init(false));
290
291static cl::opt<bool>
292 ClHandleICmp("msan-handle-icmp",
293 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
294 cl::Hidden, cl::init(true));
295
296static cl::opt<bool>
297 ClHandleICmpExact("msan-handle-icmp-exact",
298 cl::desc("exact handling of relational integer ICmp"),
299 cl::Hidden, cl::init(true));
300
301static cl::opt<bool> ClHandleLifetimeIntrinsics(
302 "msan-handle-lifetime-intrinsics",
303 cl::desc(
304 "when possible, poison scoped variables at the beginning of the scope "
305 "(slower, but more precise)"),
306 cl::Hidden, cl::init(true));
307
308// When compiling the Linux kernel, we sometimes see false positives related to
309// MSan being unable to understand that inline assembly calls may initialize
310// local variables.
311// This flag makes the compiler conservatively unpoison every memory location
312// passed into an assembly call. Note that this may cause false positives.
313// Because it's impossible to figure out the array sizes, we can only unpoison
314// the first sizeof(type) bytes for each type* pointer.
315static cl::opt<bool> ClHandleAsmConservative(
316 "msan-handle-asm-conservative",
317 cl::desc("conservative handling of inline assembly"), cl::Hidden,
318 cl::init(true));
319
320// This flag controls whether we check the shadow of the address
321// operand of load or store. Such bugs are very rare, since load from
322// a garbage address typically results in SEGV, but still happen
323// (e.g. only lower bits of address are garbage, or the access happens
324// early at program startup where malloc-ed memory is more likely to
325// be zeroed). As of 2012-08-28 this flag adds 20% slowdown.
326static cl::opt<bool> ClCheckAccessAddress(
327 "msan-check-access-address",
328 cl::desc("report accesses through a pointer which has poisoned shadow"),
329 cl::Hidden, cl::init(true));
330
331static cl::opt<bool> ClEagerChecks(
332 "msan-eager-checks",
333 cl::desc("check arguments and return values at function call boundaries"),
334 cl::Hidden, cl::init(false));
335
336static cl::opt<bool> ClDumpStrictInstructions(
337 "msan-dump-strict-instructions",
338 cl::desc("print out instructions with default strict semantics, i.e., "
339 "check that all the inputs are fully initialized, and mark "
340 "the output as fully initialized. These semantics are applied "
341 "to instructions that could not be handled explicitly nor "
342 "heuristically."),
343 cl::Hidden, cl::init(false));
344
345// Currently, all the heuristically handled instructions are specifically
346// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
347// to parallel 'msan-dump-strict-instructions', and to keep the door open to
348// handling non-intrinsic instructions heuristically.
349static cl::opt<bool> ClDumpHeuristicInstructions(
350 "msan-dump-heuristic-instructions",
351 cl::desc("Prints 'unknown' instructions that were handled heuristically. "
352 "Use -msan-dump-strict-instructions to print instructions that "
353 "could not be handled explicitly nor heuristically."),
354 cl::Hidden, cl::init(false));
355
356static cl::opt<int> ClInstrumentationWithCallThreshold(
357 "msan-instrumentation-with-call-threshold",
358 cl::desc(
359 "If the function being instrumented requires more than "
360 "this number of checks and origin stores, use callbacks instead of "
361 "inline checks (-1 means never use callbacks)."),
362 cl::Hidden, cl::init(3500));
363
364static cl::opt<bool>
365 ClEnableKmsan("msan-kernel",
366 cl::desc("Enable KernelMemorySanitizer instrumentation"),
367 cl::Hidden, cl::init(false));
368
369static cl::opt<bool>
370 ClDisableChecks("msan-disable-checks",
371 cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
372 cl::init(false));
373
374static cl::opt<bool>
375 ClCheckConstantShadow("msan-check-constant-shadow",
376 cl::desc("Insert checks for constant shadow values"),
377 cl::Hidden, cl::init(true));
378
379// This is off by default because of a bug in gold:
380// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
381static cl::opt<bool>
382 ClWithComdat("msan-with-comdat",
383 cl::desc("Place MSan constructors in comdat sections"),
384 cl::Hidden, cl::init(false));
385
386// These options allow specifying custom memory map parameters.
387// See MemoryMapParams for details.
388static cl::opt<uint64_t> ClAndMask("msan-and-mask",
389 cl::desc("Define custom MSan AndMask"),
390 cl::Hidden, cl::init(0));
391
392static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
393 cl::desc("Define custom MSan XorMask"),
394 cl::Hidden, cl::init(0));
395
396static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
397 cl::desc("Define custom MSan ShadowBase"),
398 cl::Hidden, cl::init(0));
399
400static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
401 cl::desc("Define custom MSan OriginBase"),
402 cl::Hidden, cl::init(0));
403
404static cl::opt<int>
405 ClDisambiguateWarning("msan-disambiguate-warning-threshold",
406 cl::desc("Define threshold for number of checks per "
407 "debug location to force origin update."),
408 cl::Hidden, cl::init(3));
409
410const char kMsanModuleCtorName[] = "msan.module_ctor";
411const char kMsanInitName[] = "__msan_init";
412
413namespace {
414
415// Memory map parameters used in application-to-shadow address calculation.
416// Offset = (Addr & ~AndMask) ^ XorMask
417// Shadow = ShadowBase + Offset
418// Origin = OriginBase + Offset
419struct MemoryMapParams {
420 uint64_t AndMask;
421 uint64_t XorMask;
422 uint64_t ShadowBase;
423 uint64_t OriginBase;
424};
425
426struct PlatformMemoryMapParams {
427 const MemoryMapParams *bits32;
428 const MemoryMapParams *bits64;
429};
430
431} // end anonymous namespace
432
433// i386 Linux
434static const MemoryMapParams Linux_I386_MemoryMapParams = {
435 0x000080000000, // AndMask
436 0, // XorMask (not used)
437 0, // ShadowBase (not used)
438 0x000040000000, // OriginBase
439};
440
441// x86_64 Linux
442static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
443 0, // AndMask (not used)
444 0x500000000000, // XorMask
445 0, // ShadowBase (not used)
446 0x100000000000, // OriginBase
447};
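// For example, with these x86_64 parameters and the formula above
// (Offset = (Addr & ~AndMask) ^ XorMask):
//   Offset = 0x700000001000 ^ 0x500000000000 = 0x200000001000
//   Shadow = ShadowBase + Offset = 0x200000001000
//   Origin = OriginBase + Offset = 0x300000001000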
448
449// mips32 Linux
450// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
451// after picking good constants
452
453// mips64 Linux
454static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
455 0, // AndMask (not used)
456 0x008000000000, // XorMask
457 0, // ShadowBase (not used)
458 0x002000000000, // OriginBase
459};
460
461// ppc32 Linux
462// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
463// after picking good constants
464
465// ppc64 Linux
466static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
467 0xE00000000000, // AndMask
468 0x100000000000, // XorMask
469 0x080000000000, // ShadowBase
470 0x1C0000000000, // OriginBase
471};
472
473// s390x Linux
474static const MemoryMapParams Linux_S390X_MemoryMapParams = {
475 0xC00000000000, // AndMask
476 0, // XorMask (not used)
477 0x080000000000, // ShadowBase
478 0x1C0000000000, // OriginBase
479};
480
481// arm32 Linux
482// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
483// after picking good constants
484
485// aarch64 Linux
486static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
487 0, // AndMask (not used)
488 0x0B00000000000, // XorMask
489 0, // ShadowBase (not used)
490 0x0200000000000, // OriginBase
491};
492
493// loongarch64 Linux
494static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
495 0, // AndMask (not used)
496 0x500000000000, // XorMask
497 0, // ShadowBase (not used)
498 0x100000000000, // OriginBase
499};
500
501// riscv32 Linux
502// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
503// after picking good constants
504
505// aarch64 FreeBSD
506static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
507 0x1800000000000, // AndMask
508 0x0400000000000, // XorMask
509 0x0200000000000, // ShadowBase
510 0x0700000000000, // OriginBase
511};
512
513// i386 FreeBSD
514static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
515 0x000180000000, // AndMask
516 0x000040000000, // XorMask
517 0x000020000000, // ShadowBase
518 0x000700000000, // OriginBase
519};
520
521// x86_64 FreeBSD
522static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
523 0xc00000000000, // AndMask
524 0x200000000000, // XorMask
525 0x100000000000, // ShadowBase
526 0x380000000000, // OriginBase
527};
528
529// x86_64 NetBSD
530static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
531 0, // AndMask
532 0x500000000000, // XorMask
533 0, // ShadowBase
534 0x100000000000, // OriginBase
535};
536
537static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
538 &Linux_I386_MemoryMapParams,
539 &Linux_X86_64_MemoryMapParams,
540};
541
542static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
543 nullptr,
544 &Linux_MIPS64_MemoryMapParams,
545};
546
547static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
548 nullptr,
549 &Linux_PowerPC64_MemoryMapParams,
550};
551
552static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
553 nullptr,
554 &Linux_S390X_MemoryMapParams,
555};
556
557static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
558 nullptr,
559 &Linux_AArch64_MemoryMapParams,
560};
561
562static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
563 nullptr,
564 &Linux_LoongArch64_MemoryMapParams,
565};
566
567static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
568 nullptr,
569 &FreeBSD_AArch64_MemoryMapParams,
570};
571
572static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
573 &FreeBSD_I386_MemoryMapParams,
574 &FreeBSD_X86_64_MemoryMapParams,
575};
576
577static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
578 nullptr,
579 &NetBSD_X86_64_MemoryMapParams,
580};
581
582namespace {
583
584/// Instrument functions of a module to detect uninitialized reads.
585///
586/// Instantiating MemorySanitizer inserts the msan runtime library API function
587/// declarations into the module if they don't exist already. Instantiating
588/// ensures the __msan_init function is in the list of global constructors for
589/// the module.
590class MemorySanitizer {
591public:
592 MemorySanitizer(Module &M, MemorySanitizerOptions Options)
593 : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
594 Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
595 initializeModule(M);
596 }
597
598 // MSan cannot be moved or copied because of MapParams.
599 MemorySanitizer(MemorySanitizer &&) = delete;
600 MemorySanitizer &operator=(MemorySanitizer &&) = delete;
601 MemorySanitizer(const MemorySanitizer &) = delete;
602 MemorySanitizer &operator=(const MemorySanitizer &) = delete;
603
604 bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
605
606private:
607 friend struct MemorySanitizerVisitor;
608 friend struct VarArgHelperBase;
609 friend struct VarArgAMD64Helper;
610 friend struct VarArgAArch64Helper;
611 friend struct VarArgPowerPC64Helper;
612 friend struct VarArgPowerPC32Helper;
613 friend struct VarArgSystemZHelper;
614 friend struct VarArgI386Helper;
615 friend struct VarArgGenericHelper;
616
617 void initializeModule(Module &M);
618 void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
619 void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
620 void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
621
622 template <typename... ArgsTy>
623 FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
624 ArgsTy... Args);
625
626 /// True if we're compiling the Linux kernel.
627 bool CompileKernel;
628 /// Track origins (allocation points) of uninitialized values.
629 int TrackOrigins;
630 bool Recover;
631 bool EagerChecks;
632
633 Triple TargetTriple;
634 LLVMContext *C;
635 Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
636 Type *OriginTy;
637 PointerType *PtrTy; ///< Pointer type in the default address space.
638
639 // XxxTLS variables represent the per-thread state in MSan and per-task state
640 // in KMSAN.
641 // For the userspace these point to thread-local globals. In the kernel land
642 // they point to the members of a per-task struct obtained via a call to
643 // __msan_get_context_state().
644
645 /// Thread-local shadow storage for function parameters.
646 Value *ParamTLS;
647
648 /// Thread-local origin storage for function parameters.
649 Value *ParamOriginTLS;
650
651 /// Thread-local shadow storage for function return value.
652 Value *RetvalTLS;
653
654 /// Thread-local origin storage for function return value.
655 Value *RetvalOriginTLS;
656
657 /// Thread-local shadow storage for in-register va_arg function.
658 Value *VAArgTLS;
659
660 /// Thread-local origin storage for in-register va_arg function.
661 Value *VAArgOriginTLS;
662
663 /// Thread-local storage for the size of the va_arg overflow area.
664 Value *VAArgOverflowSizeTLS;
665
666 /// Are the instrumentation callbacks set up?
667 bool CallbacksInitialized = false;
668
669 /// The run-time callback to print a warning.
670 FunctionCallee WarningFn;
671
672 // These arrays are indexed by log2(AccessSize).
673 FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
674 FunctionCallee MaybeWarningVarSizeFn;
675 FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
676
677 /// Run-time helper that generates a new origin value for a stack
678 /// allocation.
679 FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
680 // No description version
681 FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
682
683 /// Run-time helper that poisons stack on function entry.
684 FunctionCallee MsanPoisonStackFn;
685
686 /// Run-time helper that records a store (or any event) of an
687 /// uninitialized value and returns an updated origin id encoding this info.
688 FunctionCallee MsanChainOriginFn;
689
690 /// Run-time helper that paints an origin over a region.
691 FunctionCallee MsanSetOriginFn;
692
693 /// MSan runtime replacements for memmove, memcpy and memset.
694 FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
695
696 /// KMSAN callback for task-local function argument shadow.
697 StructType *MsanContextStateTy;
698 FunctionCallee MsanGetContextStateFn;
699
700 /// Functions for poisoning/unpoisoning local variables
701 FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
702
703 /// Pair of shadow/origin pointers.
704 Type *MsanMetadata;
705
706 /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
707 FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
708 FunctionCallee MsanMetadataPtrForLoad_1_8[4];
709 FunctionCallee MsanMetadataPtrForStore_1_8[4];
710 FunctionCallee MsanInstrumentAsmStoreFn;
711
712 /// Storage for return values of the MsanMetadataPtrXxx functions.
713 Value *MsanMetadataAlloca;
714
715 /// Helper to choose between different MsanMetadataPtrXxx().
716 FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
717
718 /// Memory map parameters used in application-to-shadow calculation.
719 const MemoryMapParams *MapParams;
720
721 /// Custom memory map parameters used when -msan-shadow-base or
722 /// -msan-origin-base is provided.
723 MemoryMapParams CustomMapParams;
724
725 MDNode *ColdCallWeights;
726
727 /// Branch weights for origin store.
728 MDNode *OriginStoreWeights;
729};
730
731void insertModuleCtor(Module &M) {
732 getOrCreateSanitizerCtorAndInitFunctions(
733 M, kMsanModuleCtorName, kMsanInitName,
734 /*InitArgTypes=*/{},
735 /*InitArgs=*/{},
736 // This callback is invoked when the functions are created the first
737 // time. Hook them into the global ctors list in that case:
738 [&](Function *Ctor, FunctionCallee) {
739 if (!ClWithComdat) {
740 appendToGlobalCtors(M, Ctor, 0);
741 return;
742 }
743 Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
744 Ctor->setComdat(MsanCtorComdat);
745 appendToGlobalCtors(M, Ctor, 0, Ctor);
746 });
747}
748
749template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
750 return (Opt.getNumOccurrences() > 0) ? Opt : Default;
751}
752
753} // end anonymous namespace
754
755MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
756 bool EagerChecks)
757 : Kernel(getOptOrDefault(ClEnableKmsan, K)),
758 TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
759 Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
760 EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
761
762PreservedAnalyses MemorySanitizerPass::run(Module &M,
763 ModuleAnalysisManager &AM) {
764 // Return early if the nosanitize_memory module flag is present.
765 if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
766 return PreservedAnalyses::all();
767 bool Modified = false;
768 if (!Options.Kernel) {
769 insertModuleCtor(M);
770 Modified = true;
771 }
772
773 auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
774 for (Function &F : M) {
775 if (F.empty())
776 continue;
777 MemorySanitizer Msan(*F.getParent(), Options);
778 Modified |=
779 Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
780 }
781
782 if (!Modified)
783 return PreservedAnalyses::all();
784
785 PreservedAnalyses PA = PreservedAnalyses::none();
786 // GlobalsAA is considered stateless and does not get invalidated unless
787 // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
788 // make changes that require GlobalsAA to be invalidated.
789 PA.abandon<GlobalsAA>();
790 return PA;
791}
792
793void MemorySanitizerPass::printPipeline(
794 raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
795 static_cast<PassInfoMixin<MemorySanitizerPass> *>(this)->printPipeline(
796 OS, MapClassName2PassName);
797 OS << '<';
798 if (Options.Recover)
799 OS << "recover;";
800 if (Options.Kernel)
801 OS << "kernel;";
802 if (Options.EagerChecks)
803 OS << "eager-checks;";
804 OS << "track-origins=" << Options.TrackOrigins;
805 OS << '>';
806}
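// For example, a configuration with Recover=true and TrackOrigins=2 is
// rendered into the textual pipeline form as: msan<recover;track-origins=2>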
807
808/// Create a non-const global initialized with the given string.
809///
810/// Creates a writable global for Str so that we can pass it to the
811/// run-time lib. Runtime uses first 4 bytes of the string to store the
812/// frame ID, so the string needs to be mutable.
813static GlobalVariable *createPrivateConstGlobalForString(Module &M,
814 StringRef Str) {
815 Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
816 return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
817 GlobalValue::PrivateLinkage, StrConst, "");
818}
819
820template <typename... ArgsTy>
821FunctionCallee
822MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
823 ArgsTy... Args) {
824 if (TargetTriple.getArch() == Triple::systemz) {
825 // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
826 return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
827 std::forward<ArgsTy>(Args)...);
828 }
829
830 return M.getOrInsertFunction(Name, MsanMetadata,
831 std::forward<ArgsTy>(Args)...);
832}
833
834/// Create KMSAN API callbacks.
835void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
836 IRBuilder<> IRB(*C);
837
838 // These will be initialized in insertKmsanPrologue().
839 RetvalTLS = nullptr;
840 RetvalOriginTLS = nullptr;
841 ParamTLS = nullptr;
842 ParamOriginTLS = nullptr;
843 VAArgTLS = nullptr;
844 VAArgOriginTLS = nullptr;
845 VAArgOverflowSizeTLS = nullptr;
846
847 WarningFn = M.getOrInsertFunction("__msan_warning",
848 TLI.getAttrList(C, {0}, /*Signed=*/false),
849 IRB.getVoidTy(), IRB.getInt32Ty());
850
851 // Requests the per-task context state (kmsan_context_state*) from the
852 // runtime library.
853 MsanContextStateTy = StructType::get(
854 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
855 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
856 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
857 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
858 IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
859 OriginTy);
860 MsanGetContextStateFn =
861 M.getOrInsertFunction("__msan_get_context_state", PtrTy);
862
863 MsanMetadata = StructType::get(PtrTy, PtrTy);
864
865 for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
866 std::string name_load =
867 "__msan_metadata_ptr_for_load_" + std::to_string(size);
868 std::string name_store =
869 "__msan_metadata_ptr_for_store_" + std::to_string(size);
870 MsanMetadataPtrForLoad_1_8[ind] =
871 getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
872 MsanMetadataPtrForStore_1_8[ind] =
873 getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
874 }
875
876 MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
877 M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
878 MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
879 M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
880
881 // Functions for poisoning and unpoisoning memory.
882 MsanPoisonAllocaFn = M.getOrInsertFunction(
883 "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
884 MsanUnpoisonAllocaFn = M.getOrInsertFunction(
885 "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
886}
887
888static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
889 return M.getOrInsertGlobal(Name, Ty, [&] {
890 return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
891 nullptr, Name, nullptr,
892 GlobalVariable::InitialExecTLSModel);
893 });
894}
895
896/// Insert declarations for userspace-specific functions and globals.
897void MemorySanitizer::createUserspaceApi(Module &M,
898 const TargetLibraryInfo &TLI) {
899 IRBuilder<> IRB(*C);
900
901 // Create the callback.
902 // FIXME: this function should have "Cold" calling conv,
903 // which is not yet implemented.
904 if (TrackOrigins) {
905 StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
906 : "__msan_warning_with_origin_noreturn";
907 WarningFn = M.getOrInsertFunction(WarningFnName,
908 TLI.getAttrList(C, {0}, /*Signed=*/false),
909 IRB.getVoidTy(), IRB.getInt32Ty());
910 } else {
911 StringRef WarningFnName =
912 Recover ? "__msan_warning" : "__msan_warning_noreturn";
913 WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
914 }
915
916 // Create the global TLS variables.
917 RetvalTLS =
918 getOrInsertGlobal(M, "__msan_retval_tls",
919 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
920
921 RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
922
923 ParamTLS =
924 getOrInsertGlobal(M, "__msan_param_tls",
925 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
926
927 ParamOriginTLS =
928 getOrInsertGlobal(M, "__msan_param_origin_tls",
929 ArrayType::get(OriginTy, kParamTLSSize / 4));
930
931 VAArgTLS =
932 getOrInsertGlobal(M, "__msan_va_arg_tls",
933 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
934
935 VAArgOriginTLS =
936 getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
937 ArrayType::get(OriginTy, kParamTLSSize / 4));
938
939 VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
940 IRB.getIntPtrTy(M.getDataLayout()));
941
942 for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
943 AccessSizeIndex++) {
944 unsigned AccessSize = 1 << AccessSizeIndex;
945 std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
946 MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
947 FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
948 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
949 MaybeWarningVarSizeFn = M.getOrInsertFunction(
950 "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
951 IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
952 FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
953 MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
954 FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
955 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
956 IRB.getInt32Ty());
957 }
958
959 MsanSetAllocaOriginWithDescriptionFn =
960 M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
961 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
962 MsanSetAllocaOriginNoDescriptionFn =
963 M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
964 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
965 MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
966 IRB.getVoidTy(), PtrTy, IntptrTy);
967}
968
969/// Insert extern declarations of runtime-provided functions and globals.
970void MemorySanitizer::initializeCallbacks(Module &M,
971 const TargetLibraryInfo &TLI) {
972 // Only do this once.
973 if (CallbacksInitialized)
974 return;
975
976 IRBuilder<> IRB(*C);
977 // Initialize callbacks that are common for kernel and userspace
978 // instrumentation.
979 MsanChainOriginFn = M.getOrInsertFunction(
980 "__msan_chain_origin",
981 TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
982 IRB.getInt32Ty());
983 MsanSetOriginFn = M.getOrInsertFunction(
984 "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
985 IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
986 MemmoveFn =
987 M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
988 MemcpyFn =
989 M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
990 MemsetFn = M.getOrInsertFunction("__msan_memset",
991 TLI.getAttrList(C, {1}, /*Signed=*/true),
992 PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
993
994 MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
995 "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
996
997 if (CompileKernel) {
998 createKernelApi(M, TLI);
999 } else {
1000 createUserspaceApi(M, TLI);
1001 }
1002 CallbacksInitialized = true;
1003}
1004
1005FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
1006 int size) {
1007 FunctionCallee *Fns =
1008 isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
1009 switch (size) {
1010 case 1:
1011 return Fns[0];
1012 case 2:
1013 return Fns[1];
1014 case 4:
1015 return Fns[2];
1016 case 8:
1017 return Fns[3];
1018 default:
1019 return nullptr;
1020 }
1021}
1022
1023/// Module-level initialization.
1024///
1025/// Inserts a call to __msan_init into the module's constructor list.
1026void MemorySanitizer::initializeModule(Module &M) {
1027 auto &DL = M.getDataLayout();
1028
1029 TargetTriple = M.getTargetTriple();
1030
1031 bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
1032 bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
1033 // Check the overrides first
1034 if (ShadowPassed || OriginPassed) {
1035 CustomMapParams.AndMask = ClAndMask;
1036 CustomMapParams.XorMask = ClXorMask;
1037 CustomMapParams.ShadowBase = ClShadowBase;
1038 CustomMapParams.OriginBase = ClOriginBase;
1039 MapParams = &CustomMapParams;
1040 } else {
1041 switch (TargetTriple.getOS()) {
1042 case Triple::FreeBSD:
1043 switch (TargetTriple.getArch()) {
1044 case Triple::aarch64:
1045 MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
1046 break;
1047 case Triple::x86_64:
1048 MapParams = FreeBSD_X86_MemoryMapParams.bits64;
1049 break;
1050 case Triple::x86:
1051 MapParams = FreeBSD_X86_MemoryMapParams.bits32;
1052 break;
1053 default:
1054 report_fatal_error("unsupported architecture");
1055 }
1056 break;
1057 case Triple::NetBSD:
1058 switch (TargetTriple.getArch()) {
1059 case Triple::x86_64:
1060 MapParams = NetBSD_X86_MemoryMapParams.bits64;
1061 break;
1062 default:
1063 report_fatal_error("unsupported architecture");
1064 }
1065 break;
1066 case Triple::Linux:
1067 switch (TargetTriple.getArch()) {
1068 case Triple::x86_64:
1069 MapParams = Linux_X86_MemoryMapParams.bits64;
1070 break;
1071 case Triple::x86:
1072 MapParams = Linux_X86_MemoryMapParams.bits32;
1073 break;
1074 case Triple::mips64:
1075 case Triple::mips64el:
1076 MapParams = Linux_MIPS_MemoryMapParams.bits64;
1077 break;
1078 case Triple::ppc64:
1079 case Triple::ppc64le:
1080 MapParams = Linux_PowerPC_MemoryMapParams.bits64;
1081 break;
1082 case Triple::systemz:
1083 MapParams = Linux_S390_MemoryMapParams.bits64;
1084 break;
1085 case Triple::aarch64:
1086 case Triple::aarch64_be:
1087 MapParams = Linux_ARM_MemoryMapParams.bits64;
1088 break;
1089 case Triple::loongarch64:
1090 MapParams = Linux_LoongArch_MemoryMapParams.bits64;
1091 break;
1092 default:
1093 report_fatal_error("unsupported architecture");
1094 }
1095 break;
1096 default:
1097 report_fatal_error("unsupported operating system");
1098 }
1099 }
1100
1101 C = &(M.getContext());
1102 IRBuilder<> IRB(*C);
1103 IntptrTy = IRB.getIntPtrTy(DL);
1104 OriginTy = IRB.getInt32Ty();
1105 PtrTy = IRB.getPtrTy();
1106
1107 ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1108 OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1109
1110 if (!CompileKernel) {
1111 if (TrackOrigins)
1112 M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1113 return new GlobalVariable(
1114 M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1115 IRB.getInt32(TrackOrigins), "__msan_track_origins");
1116 });
1117
1118 if (Recover)
1119 M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1120 return new GlobalVariable(M, IRB.getInt32Ty(), true,
1121 GlobalValue::WeakODRLinkage,
1122 IRB.getInt32(Recover), "__msan_keep_going");
1123 });
1124 }
1125}
1126
1127namespace {
1128
1129/// A helper class that handles instrumentation of VarArg
1130/// functions on a particular platform.
1131///
1132/// Implementations are expected to insert the instrumentation
1133/// necessary to propagate argument shadow through VarArg function
1134/// calls. Visit* methods are called during an InstVisitor pass over
1135/// the function, and should avoid creating new basic blocks. A new
1136/// instance of this class is created for each instrumented function.
1137struct VarArgHelper {
1138 virtual ~VarArgHelper() = default;
1139
1140 /// Visit a CallBase.
1141 virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1142
1143 /// Visit a va_start call.
1144 virtual void visitVAStartInst(VAStartInst &I) = 0;
1145
1146 /// Visit a va_copy call.
1147 virtual void visitVACopyInst(VACopyInst &I) = 0;
1148
1149 /// Finalize function instrumentation.
1150 ///
1151 /// This method is called after visiting all interesting (see above)
1152 /// instructions in a function.
1153 virtual void finalizeInstrumentation() = 0;
1154};
1155
1156struct MemorySanitizerVisitor;
1157
1158} // end anonymous namespace
1159
1160static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1161 MemorySanitizerVisitor &Visitor);
1162
1163static unsigned TypeSizeToSizeIndex(TypeSize TS) {
1164 if (TS.isScalable())
1165 // Scalable types unconditionally take slowpaths.
1166 return kNumberOfAccessSizes;
1167 unsigned TypeSizeFixed = TS.getFixedValue();
1168 if (TypeSizeFixed <= 8)
1169 return 0;
1170 return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
1171}
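// For example: TypeSizeToSizeIndex(TypeSize::getFixed(8))   == 0 (1 byte),
//              TypeSizeToSizeIndex(TypeSize::getFixed(32))  == 2 (4 bytes),
//              TypeSizeToSizeIndex(TypeSize::getFixed(128)) == 4, which is
//              not < kNumberOfAccessSizes and therefore takes the slow path.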
1172
1173namespace {
1174
1175/// Helper class to attach debug information of the given instruction onto new
1176/// instructions inserted after.
1177class NextNodeIRBuilder : public IRBuilder<> {
1178public:
1179 explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1180 SetCurrentDebugLocation(IP->getDebugLoc());
1181 }
1182};
1183
1184/// This class does all the work for a given function. Store and Load
1185/// instructions store and load corresponding shadow and origin
1186/// values. Most instructions propagate shadow from arguments to their
1187/// return values. Certain instructions (most importantly, BranchInst)
1188/// test their argument shadow and print reports (with a runtime call) if it's
1189/// non-zero.
1190struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1191 Function &F;
1192 MemorySanitizer &MS;
1193 SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1194 ValueMap<Value *, Value *> ShadowMap, OriginMap;
1195 std::unique_ptr<VarArgHelper> VAHelper;
1196 const TargetLibraryInfo *TLI;
1197 Instruction *FnPrologueEnd;
1198 SmallVector<Instruction *, 16> Instructions;
1199
1200 // The following flags disable parts of MSan instrumentation based on
1201 // exclusion list contents and command-line options.
1202 bool InsertChecks;
1203 bool PropagateShadow;
1204 bool PoisonStack;
1205 bool PoisonUndef;
1206 bool PoisonUndefVectors;
1207
1208 struct ShadowOriginAndInsertPoint {
1209 Value *Shadow;
1210 Value *Origin;
1211 Instruction *OrigIns;
1212
1213 ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1214 : Shadow(S), Origin(O), OrigIns(I) {}
1215 };
1216 SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
1217 DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1218 SmallSetVector<AllocaInst *, 16> AllocaSet;
1219 SmallVector<std::pair<IntrinsicInst *, AllocaInst *>, 16> LifetimeStartList;
1220 SmallVector<StoreInst *, 16> StoreList;
1221 int64_t SplittableBlocksCount = 0;
1222
1223 MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1224 const TargetLibraryInfo &TLI)
1225 : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1226 bool SanitizeFunction =
1227 F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1228 InsertChecks = SanitizeFunction;
1229 PropagateShadow = SanitizeFunction;
1230 PoisonStack = SanitizeFunction && ClPoisonStack;
1231 PoisonUndef = SanitizeFunction && ClPoisonUndef;
1232 PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
1233
1234 // In the presence of unreachable blocks, we may see Phi nodes with
1235 // incoming nodes from such blocks. Since InstVisitor skips unreachable
1236 // blocks, such nodes will not have any shadow value associated with them.
1237 // It's easier to remove unreachable blocks than deal with missing shadow.
1238 removeUnreachableBlocks(F);
1239
1240 MS.initializeCallbacks(*F.getParent(), TLI);
1241 FnPrologueEnd =
1242 IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
1243 .CreateIntrinsic(Intrinsic::donothing, {});
1244
1245 if (MS.CompileKernel) {
1246 IRBuilder<> IRB(FnPrologueEnd);
1247 insertKmsanPrologue(IRB);
1248 }
1249
1250 LLVM_DEBUG(if (!InsertChecks) dbgs()
1251 << "MemorySanitizer is not inserting checks into '"
1252 << F.getName() << "'\n");
1253 }
1254
1255 bool instrumentWithCalls(Value *V) {
1256 // Constants likely will be eliminated by follow-up passes.
1257 if (isa<Constant>(V))
1258 return false;
1259 ++SplittableBlocksCount;
1260 return ClInstrumentationWithCallThreshold >= 0 &&
1261 SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1262 }
1263
1264 bool isInPrologue(Instruction &I) {
1265 return I.getParent() == FnPrologueEnd->getParent() &&
1266 (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1267 }
1268
1269 // Creates a new origin and records the stack trace. In general we can call
1270 // this function for any origin manipulation we like. However, it costs
1271 // runtime resources, so use it only where it can provide additional
1272 // information helpful to a user.
1273 Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1274 if (MS.TrackOrigins <= 1)
1275 return V;
1276 return IRB.CreateCall(MS.MsanChainOriginFn, V);
1277 }
1278
1279 Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1280 const DataLayout &DL = F.getDataLayout();
1281 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1282 if (IntptrSize == kOriginSize)
1283 return Origin;
1284 assert(IntptrSize == kOriginSize * 2);
1285 Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
1286 return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1287 }
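  // For example, with a 64-bit IntptrTy, originToIntptr() turns origin
  // 0xAABBCCDD into 0xAABBCCDDAABBCCDD, so a single intptr-sized store in
  // paintOrigin() below fills two adjacent 4-byte origin slots at once.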
1288
1289 /// Fill memory range with the given origin value.
1290 void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1291 TypeSize TS, Align Alignment) {
1292 const DataLayout &DL = F.getDataLayout();
1293 const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1294 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1295 assert(IntptrAlignment >= kMinOriginAlignment);
1296 assert(IntptrSize >= kOriginSize);
1297
1298 // Note: The loop-based form works for fixed-length vectors too; however,
1299 // we prefer to unroll and specialize the alignment handling below.
1300 if (TS.isScalable()) {
1301 Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
1302 Value *RoundUp =
1303 IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
1304 Value *End =
1305 IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
1306 auto [InsertPt, Index] =
1307 SplitBlockAndInsertSimpleForLoop(End, &*IRB.GetInsertPoint());
1308 IRB.SetInsertPoint(InsertPt);
1309
1310 Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
1311 IRB.CreateAlignedStore(Origin, GEP, kMinOriginAlignment);
1312 return;
1313 }
1314
1315 unsigned Size = TS.getFixedValue();
1316
1317 unsigned Ofs = 0;
1318 Align CurrentAlignment = Alignment;
1319 if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1320 Value *IntptrOrigin = originToIntptr(IRB, Origin);
1321 Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
1322 for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1323 Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1324 : IntptrOriginPtr;
1325 IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1326 Ofs += IntptrSize / kOriginSize;
1327 CurrentAlignment = IntptrAlignment;
1328 }
1329 }
1330
1331 for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1332 Value *GEP =
1333 i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1334 IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1335 CurrentAlignment = kMinOriginAlignment;
1336 }
1337 }
1338
1339 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1340 Value *OriginPtr, Align Alignment) {
1341 const DataLayout &DL = F.getDataLayout();
1342 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1343 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
1344 // ZExt cannot convert between vector and scalar
1345 Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1346 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1347 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1348 // Origin is not needed: value is initialized or const shadow is
1349 // ignored.
1350 return;
1351 }
1352 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1353 // Copy origin as the value is definitely uninitialized.
1354 paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1355 OriginAlignment);
1356 return;
1357 }
1358 // Fallback to runtime check, which still can be optimized out later.
1359 }
1360
1361 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1362 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1363 if (instrumentWithCalls(ConvertedShadow) &&
1364 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1365 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1366 Value *ConvertedShadow2 =
1367 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1368 CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
1369 CB->addParamAttr(0, Attribute::ZExt);
1370 CB->addParamAttr(2, Attribute::ZExt);
1371 } else {
1372 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1373 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1374 Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1375 IRBuilder<> IRBNew(CheckTerm);
1376 paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1377 OriginAlignment);
1378 }
1379 }
1380
1381 void materializeStores() {
1382 for (StoreInst *SI : StoreList) {
1383 IRBuilder<> IRB(SI);
1384 Value *Val = SI->getValueOperand();
1385 Value *Addr = SI->getPointerOperand();
1386 Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1387 Value *ShadowPtr, *OriginPtr;
1388 Type *ShadowTy = Shadow->getType();
1389 const Align Alignment = SI->getAlign();
1390 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1391 std::tie(ShadowPtr, OriginPtr) =
1392 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1393
1394 [[maybe_unused]] StoreInst *NewSI =
1395 IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1396 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1397
1398 if (SI->isAtomic())
1399 SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1400
1401 if (MS.TrackOrigins && !SI->isAtomic())
1402 storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1403 OriginAlignment);
1404 }
1405 }
1406
1407 // Returns true if Debug Location corresponds to multiple warnings.
1408 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1409 if (MS.TrackOrigins < 2)
1410 return false;
1411
1412 if (LazyWarningDebugLocationCount.empty())
1413 for (const auto &I : InstrumentationList)
1414 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1415
1416 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1417 }
1418
1419 /// Helper function to insert a warning at IRB's current insert point.
1420 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1421 if (!Origin)
1422 Origin = (Value *)IRB.getInt32(0);
1423 assert(Origin->getType()->isIntegerTy());
1424
1425 if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1426 // Try to create additional origin with debug info of the last origin
1427 // instruction. It may provide additional information to the user.
1428 if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1429 assert(MS.TrackOrigins);
1430 auto NewDebugLoc = OI->getDebugLoc();
1431 // Origin update with missing or the same debug location provides no
1432 // additional value.
1433 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1434 // Insert update just before the check, so we call runtime only just
1435 // before the report.
1436 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1437 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1438 Origin = updateOrigin(Origin, IRBOrigin);
1439 }
1440 }
1441 }
1442
1443 if (MS.CompileKernel || MS.TrackOrigins)
1444 IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1445 else
1446 IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1447 // FIXME: Insert UnreachableInst if !MS.Recover?
1448 // This may invalidate some of the following checks and needs to be done
1449 // at the very end.
1450 }
1451
1452 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1453 Value *Origin) {
1454 const DataLayout &DL = F.getDataLayout();
1455 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1456 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1457 if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
1458 // ZExt cannot convert between vector and scalar
1459 ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
1460 Value *ConvertedShadow2 =
1461 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1462
1463 if (SizeIndex < kNumberOfAccessSizes) {
1464 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1465 CallBase *CB = IRB.CreateCall(
1466 Fn,
1467 {ConvertedShadow2,
1468 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1469 CB->addParamAttr(0, Attribute::ZExt);
1470 CB->addParamAttr(1, Attribute::ZExt);
1471 } else {
1472 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1473 Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
1474 IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
1475 unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
1476 CallBase *CB = IRB.CreateCall(
1477 Fn,
1478 {ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
1479 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1480 CB->addParamAttr(1, Attribute::ZExt);
1481 CB->addParamAttr(2, Attribute::ZExt);
1482 }
1483 } else {
1484 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1485 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1486 Cmp, &*IRB.GetInsertPoint(),
1487 /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1488
1489 IRB.SetInsertPoint(CheckTerm);
1490 insertWarningFn(IRB, Origin);
1491 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1492 }
1493 }
1494
1495 void materializeInstructionChecks(
1496 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1497 const DataLayout &DL = F.getDataLayout();
1498 // Disable combining in some cases. TrackOrigins checks each shadow to pick
1499 // correct origin.
1500 bool Combine = !MS.TrackOrigins;
1501 Instruction *Instruction = InstructionChecks.front().OrigIns;
1502 Value *Shadow = nullptr;
1503 for (const auto &ShadowData : InstructionChecks) {
1504 assert(ShadowData.OrigIns == Instruction);
1505 IRBuilder<> IRB(Instruction);
1506
1507 Value *ConvertedShadow = ShadowData.Shadow;
1508
1509 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1510 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1511 // Skip, value is initialized or const shadow is ignored.
1512 continue;
1513 }
1514 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1515 // Report as the value is definitely uninitialized.
1516 insertWarningFn(IRB, ShadowData.Origin);
1517 if (!MS.Recover)
1518 return; // Always fail and stop here, no need to check the rest.
1519 // Skip the entire instruction.
1520 continue;
1521 }
1522 // Fallback to runtime check, which still can be optimized out later.
1523 }
1524
1525 if (!Combine) {
1526 materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1527 continue;
1528 }
1529
1530 if (!Shadow) {
1531 Shadow = ConvertedShadow;
1532 continue;
1533 }
1534
1535 Shadow = convertToBool(Shadow, IRB, "_mscmp");
1536 ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1537 Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1538 }
1539
1540 if (Shadow) {
1541 assert(Combine);
1542 IRBuilder<> IRB(Instruction);
1543 materializeOneCheck(IRB, Shadow, nullptr);
1544 }
1545 }
1546
1547 void materializeChecks() {
1548#ifndef NDEBUG
1549 // For assert below.
1550 SmallPtrSet<Instruction *, 16> Done;
1551#endif
1552
1553 for (auto I = InstrumentationList.begin();
1554 I != InstrumentationList.end();) {
1555 auto OrigIns = I->OrigIns;
1556 // Checks are grouped by the original instruction. All checks registered via
1557 // `insertCheckShadow` for an instruction are materialized at once.
1558 assert(Done.insert(OrigIns).second);
1559 auto J = std::find_if(I + 1, InstrumentationList.end(),
1560 [OrigIns](const ShadowOriginAndInsertPoint &R) {
1561 return OrigIns != R.OrigIns;
1562 });
1563 // Process all checks of the instruction at once.
1564 materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1565 I = J;
1566 }
1567
1568 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1569 }
1570
1571 // Set up the KMSAN prologue: fetch the per-task context state and cache pointers to its TLS slots.
1572 void insertKmsanPrologue(IRBuilder<> &IRB) {
1573 Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1574 Constant *Zero = IRB.getInt32(0);
1575 MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1576 {Zero, IRB.getInt32(0)}, "param_shadow");
1577 MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1578 {Zero, IRB.getInt32(1)}, "retval_shadow");
1579 MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1580 {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1581 MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1582 {Zero, IRB.getInt32(3)}, "va_arg_origin");
1583 MS.VAArgOverflowSizeTLS =
1584 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1585 {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1586 MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1587 {Zero, IRB.getInt32(5)}, "param_origin");
1588 MS.RetvalOriginTLS =
1589 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1590 {Zero, IRB.getInt32(6)}, "retval_origin");
1591 if (MS.TargetTriple.getArch() == Triple::systemz)
1592 MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
1593 }
1594
1595 /// Add MemorySanitizer instrumentation to a function.
1596 bool runOnFunction() {
1597 // Iterate all BBs in depth-first order and create shadow instructions
1598 // for all instructions (where applicable).
1599 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1600 for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1601 visit(*BB);
1602
1603 // `visit` above only collects instructions. Process them after iterating
1604 // over the CFG, to avoid placing requirements on CFG transformations.
1605 for (Instruction *I : Instructions)
1606 InstVisitor<MemorySanitizerVisitor>::visit(*I);
1607
1608 // Finalize PHI nodes.
1609 for (PHINode *PN : ShadowPHINodes) {
1610 PHINode *PNS = cast<PHINode>(getShadow(PN));
1611 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1612 size_t NumValues = PN->getNumIncomingValues();
1613 for (size_t v = 0; v < NumValues; v++) {
1614 PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1615 if (PNO)
1616 PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1617 }
1618 }
1619
1620 VAHelper->finalizeInstrumentation();
1621
1622 // Poison llvm.lifetime.start intrinsics, if we haven't fallen back to
1623 // instrumenting only allocas.
1624 if (InstrumentLifetimeStart) {
1625 for (auto Item : LifetimeStartList) {
1626 instrumentAlloca(*Item.second, Item.first);
1627 AllocaSet.remove(Item.second);
1628 }
1629 }
1630 // Poison the allocas for which we didn't instrument the corresponding
1631 // lifetime intrinsics.
1632 for (AllocaInst *AI : AllocaSet)
1633 instrumentAlloca(*AI);
1634
1635 // Insert shadow value checks.
1636 materializeChecks();
1637
1638 // Delayed instrumentation of StoreInst.
1639 // This may not add new address checks.
1640 materializeStores();
1641
1642 return true;
1643 }
1644
1645 /// Compute the shadow type that corresponds to a given Value.
1646 Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1647
1648 /// Compute the shadow type that corresponds to a given Type.
1649 Type *getShadowTy(Type *OrigTy) {
1650 if (!OrigTy->isSized()) {
1651 return nullptr;
1652 }
1653 // For integer type, shadow is the same as the original type.
1654 // This may return weird-sized types like i1.
1655 if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1656 return IT;
1657 const DataLayout &DL = F.getDataLayout();
1658 if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1659 uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1660 return VectorType::get(IntegerType::get(*MS.C, EltSize),
1661 VT->getElementCount());
1662 }
1663 if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1664 return ArrayType::get(getShadowTy(AT->getElementType()),
1665 AT->getNumElements());
1666 }
1667 if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1668 SmallVector<Type *, 4> Elements;
1669 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1670 Elements.push_back(getShadowTy(ST->getElementType(i)));
1671 StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1672 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1673 return Res;
1674 }
1675 uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1676 return IntegerType::get(*MS.C, TypeSize);
1677 }
1678
1679 /// Extract combined shadow of struct elements as a bool
1680 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1681 IRBuilder<> &IRB) {
1682 Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1683 Value *Aggregator = FalseVal;
1684
1685 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1686 // Combine by ORing together each element's bool shadow
1687 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1688 Value *ShadowBool = convertToBool(ShadowItem, IRB);
1689
1690 if (Aggregator != FalseVal)
1691 Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1692 else
1693 Aggregator = ShadowBool;
1694 }
1695
1696 return Aggregator;
1697 }
1698
1699 // Extract combined shadow of array elements
1700 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1701 IRBuilder<> &IRB) {
1702 if (!Array->getNumElements())
1703 return IRB.getIntN(/* width */ 1, /* value */ 0);
1704
1705 Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1706 Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1707
1708 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1709 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1710 Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1711 Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1712 }
1713 return Aggregator;
1714 }
1715
1716 /// Convert a shadow value to its flattened variant. The resulting
1717 /// shadow may not necessarily have the same bit width as the input
1718 /// value, but it will always be comparable to zero.
1719 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1720 if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1721 return collapseStructShadow(Struct, V, IRB);
1722 if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1723 return collapseArrayShadow(Array, V, IRB);
1724 if (isa<VectorType>(V->getType())) {
1725 if (isa<ScalableVectorType>(V->getType()))
1726 return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
1727 unsigned BitWidth =
1728 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1729 return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
1730 }
1731 return V;
1732 }
1733
1734 // Convert a scalar value to an i1 by comparing with 0
1735 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1736 Type *VTy = V->getType();
1737 if (!VTy->isIntegerTy())
1738 return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1739 if (VTy->getIntegerBitWidth() == 1)
1740 // Just converting a bool to a bool, so do nothing.
1741 return V;
1742 return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1743 }
1744
1745 Type *ptrToIntPtrType(Type *PtrTy) const {
1746 if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
1747 return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1748 VectTy->getElementCount());
1749 }
1750 assert(PtrTy->isIntOrPtrTy());
1751 return MS.IntptrTy;
1752 }
1753
1754 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1755 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1756 return VectorType::get(
1757 getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1758 VectTy->getElementCount());
1759 }
1760 assert(IntPtrTy == MS.IntptrTy);
1761 return MS.PtrTy;
1762 }
1763
1764 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1765 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1766 return ConstantVector::getSplat(
1767 VectTy->getElementCount(),
1768 constToIntPtr(VectTy->getElementType(), C));
1769 }
1770 assert(IntPtrTy == MS.IntptrTy);
1771 return ConstantInt::get(MS.IntptrTy, C);
1772 }
1773
1774 /// Returns the integer shadow offset that corresponds to a given
1775 /// application address, whereby:
1776 ///
1777 /// Offset = (Addr & ~AndMask) ^ XorMask
1778 /// Shadow = ShadowBase + Offset
1779 /// Origin = (OriginBase + Offset) & ~Alignment
1780 ///
1781 /// Note: for efficiency, many shadow mappings only use the XorMask
1782 /// and OriginBase; the AndMask and ShadowBase are often zero.
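/// For example, with hypothetical mapping parameters AndMask = 0,
/// XorMask = 0x500000000000 and ShadowBase = 0 (illustrative values only; the
/// real constants come from the per-platform MemoryMapParams), an application
/// address A = 0x700000001234 would yield:
///   Offset = A ^ 0x500000000000 = 0x200000001234
///   Shadow = 0x200000001234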
1783 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1784 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1785 Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1786
1787 if (uint64_t AndMask = MS.MapParams->AndMask)
1788 OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1789
1790 if (uint64_t XorMask = MS.MapParams->XorMask)
1791 OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1792 return OffsetLong;
1793 }
1794
1795 /// Compute the shadow and origin addresses corresponding to a given
1796 /// application address.
1797 ///
1798 /// Shadow = ShadowBase + Offset
1799 /// Origin = (OriginBase + Offset) & ~3ULL
1800 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1801 /// a single pointee.
1802 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1803 std::pair<Value *, Value *>
1804 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1805 MaybeAlign Alignment) {
1806 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1807 if (!VectTy) {
1808 assert(Addr->getType()->isPointerTy());
1809 } else {
1810 assert(VectTy->getElementType()->isPointerTy());
1811 }
1812 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1813 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1814 Value *ShadowLong = ShadowOffset;
1815 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1816 ShadowLong =
1817 IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1818 }
1819 Value *ShadowPtr = IRB.CreateIntToPtr(
1820 ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1821
1822 Value *OriginPtr = nullptr;
1823 if (MS.TrackOrigins) {
1824 Value *OriginLong = ShadowOffset;
1825 uint64_t OriginBase = MS.MapParams->OriginBase;
1826 if (OriginBase != 0)
1827 OriginLong =
1828 IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1829 if (!Alignment || *Alignment < kMinOriginAlignment) {
1830 uint64_t Mask = kMinOriginAlignment.value() - 1;
1831 OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1832 }
1833 OriginPtr = IRB.CreateIntToPtr(
1834 OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1835 }
1836 return std::make_pair(ShadowPtr, OriginPtr);
1837 }
1838
1839 template <typename... ArgsTy>
1840 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1841 ArgsTy... Args) {
1842 if (MS.TargetTriple.getArch() == Triple::systemz) {
1843 IRB.CreateCall(Callee,
1844 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1845 return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
1846 }
1847
1848 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1849 }
1850
1851 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1852 IRBuilder<> &IRB,
1853 Type *ShadowTy,
1854 bool isStore) {
1855 Value *ShadowOriginPtrs;
1856 const DataLayout &DL = F.getDataLayout();
1857 TypeSize Size = DL.getTypeStoreSize(ShadowTy);
1858
1859 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1860 Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
1861 if (Getter) {
1862 ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
1863 } else {
1864 Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1865 ShadowOriginPtrs = createMetadataCall(
1866 IRB,
1867 isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1868 AddrCast, SizeVal);
1869 }
1870 Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1871 ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
1872 Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1873
1874 return std::make_pair(ShadowPtr, OriginPtr);
1875 }
1876
1877 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1878 /// a single pointee.
1879 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1880 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1881 IRBuilder<> &IRB,
1882 Type *ShadowTy,
1883 bool isStore) {
1884 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1885 if (!VectTy) {
1886 assert(Addr->getType()->isPointerTy());
1887 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1888 }
1889
1890 // TODO: Support callbacks with vectors of addresses.
1891 unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
1892 Value *ShadowPtrs = ConstantInt::getNullValue(
1893 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1894 Value *OriginPtrs = nullptr;
1895 if (MS.TrackOrigins)
1896 OriginPtrs = ConstantInt::getNullValue(
1897 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1898 for (unsigned i = 0; i < NumElements; ++i) {
1899 Value *OneAddr =
1900 IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1901 auto [ShadowPtr, OriginPtr] =
1902 getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1903
1904 ShadowPtrs = IRB.CreateInsertElement(
1905 ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1906 if (MS.TrackOrigins)
1907 OriginPtrs = IRB.CreateInsertElement(
1908 OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1909 }
1910 return {ShadowPtrs, OriginPtrs};
1911 }
1912
1913 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1914 Type *ShadowTy,
1915 MaybeAlign Alignment,
1916 bool isStore) {
1917 if (MS.CompileKernel)
1918 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1919 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1920 }
1921
1922 /// Compute the shadow address for a given function argument.
1923 ///
1924 /// Shadow = ParamTLS+ArgOffset.
1925 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1926 Value *Base = IRB.CreatePointerCast(MS.ParamTLS, MS.IntptrTy);
1927 if (ArgOffset)
1928 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1929 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg");
1930 }
1931
1932 /// Compute the origin address for a given function argument.
1933 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1934 if (!MS.TrackOrigins)
1935 return nullptr;
1936 Value *Base = IRB.CreatePointerCast(MS.ParamOriginTLS, MS.IntptrTy);
1937 if (ArgOffset)
1938 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1939 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg_o");
1940 }
1941
1942 /// Compute the shadow address for a retval.
1943 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1944 return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
1945 }
1946
1947 /// Compute the origin address for a retval.
1948 Value *getOriginPtrForRetval() {
1949 // We keep a single origin for the entire retval. Might be too optimistic.
1950 return MS.RetvalOriginTLS;
1951 }
1952
1953 /// Set SV to be the shadow value for V.
1954 void setShadow(Value *V, Value *SV) {
1955 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1956 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1957 }
1958
1959 /// Set Origin to be the origin value for V.
1960 void setOrigin(Value *V, Value *Origin) {
1961 if (!MS.TrackOrigins)
1962 return;
1963 assert(!OriginMap.count(V) && "Values may only have one origin");
1964 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1965 OriginMap[V] = Origin;
1966 }
1967
1968 Constant *getCleanShadow(Type *OrigTy) {
1969 Type *ShadowTy = getShadowTy(OrigTy);
1970 if (!ShadowTy)
1971 return nullptr;
1972 return Constant::getNullValue(ShadowTy);
1973 }
1974
1975 /// Create a clean shadow value for a given value.
1976 ///
1977 /// Clean shadow (all zeroes) means all bits of the value are defined
1978 /// (initialized).
1979 Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
1980
1981 /// Create a dirty shadow of a given shadow type.
1982 Constant *getPoisonedShadow(Type *ShadowTy) {
1983 assert(ShadowTy);
1984 if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
1985 return Constant::getAllOnesValue(ShadowTy);
1986 if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
1987 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
1988 getPoisonedShadow(AT->getElementType()));
1989 return ConstantArray::get(AT, Vals);
1990 }
1991 if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
1992 SmallVector<Constant *, 4> Vals;
1993 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1994 Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
1995 return ConstantStruct::get(ST, Vals);
1996 }
1997 llvm_unreachable("Unexpected shadow type");
1998 }
1999
2000 /// Create a dirty shadow for a given value.
2001 Constant *getPoisonedShadow(Value *V) {
2002 Type *ShadowTy = getShadowTy(V);
2003 if (!ShadowTy)
2004 return nullptr;
2005 return getPoisonedShadow(ShadowTy);
2006 }
2007
2008 /// Create a clean (zero) origin.
2009 Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
2010
2011 /// Get the shadow value for a given Value.
2012 ///
2013 /// This function either returns the value set earlier with setShadow,
2014 /// or extracts it from ParamTLS (for function arguments).
2015 Value *getShadow(Value *V) {
2016 if (Instruction *I = dyn_cast<Instruction>(V)) {
2017 if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
2018 return getCleanShadow(V);
2019 // For instructions the shadow is already stored in the map.
2020 Value *Shadow = ShadowMap[V];
2021 if (!Shadow) {
2022 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2023 assert(Shadow && "No shadow for a value");
2024 }
2025 return Shadow;
2026 }
2027 // Handle fully undefined values
2028 // (partially undefined constant vectors are handled later)
2029 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
2030 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2031 : getCleanShadow(V);
2032 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2033 return AllOnes;
2034 }
2035 if (Argument *A = dyn_cast<Argument>(V)) {
2036 // For arguments we compute the shadow on demand and store it in the map.
2037 Value *&ShadowPtr = ShadowMap[V];
2038 if (ShadowPtr)
2039 return ShadowPtr;
2040 Function *F = A->getParent();
2041 IRBuilder<> EntryIRB(FnPrologueEnd);
2042 unsigned ArgOffset = 0;
2043 const DataLayout &DL = F->getDataLayout();
2044 for (auto &FArg : F->args()) {
2045 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2046 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2047 ? "vscale not fully supported\n"
2048 : "Arg is not sized\n"));
2049 if (A == &FArg) {
2050 ShadowPtr = getCleanShadow(V);
2051 setOrigin(A, getCleanOrigin());
2052 break;
2053 }
2054 continue;
2055 }
2056
2057 unsigned Size = FArg.hasByValAttr()
2058 ? DL.getTypeAllocSize(FArg.getParamByValType())
2059 : DL.getTypeAllocSize(FArg.getType());
2060
2061 if (A == &FArg) {
2062 bool Overflow = ArgOffset + Size > kParamTLSSize;
2063 if (FArg.hasByValAttr()) {
2064 // ByVal pointer itself has clean shadow. We copy the actual
2065 // argument shadow to the underlying memory.
2066 // Figure out maximal valid memcpy alignment.
2067 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2068 FArg.getParamAlign(), FArg.getParamByValType());
2069 Value *CpShadowPtr, *CpOriginPtr;
2070 std::tie(CpShadowPtr, CpOriginPtr) =
2071 getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
2072 /*isStore*/ true);
2073 if (!PropagateShadow || Overflow) {
2074 // ParamTLS overflow.
2075 EntryIRB.CreateMemSet(
2076 CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
2077 Size, ArgAlign);
2078 } else {
2079 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2080 const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
2081 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2082 CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
2083 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2084
2085 if (MS.TrackOrigins) {
2086 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2087 // FIXME: OriginSize should be:
2088 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2089 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
2090 EntryIRB.CreateMemCpy(
2091 CpOriginPtr,
2092 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
2093 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
2094 OriginSize);
2095 }
2096 }
2097 }
2098
2099 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2100 (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
2101 ShadowPtr = getCleanShadow(V);
2102 setOrigin(A, getCleanOrigin());
2103 } else {
2104 // Shadow over TLS
2105 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2106 ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
2107 kShadowTLSAlignment);
2108 if (MS.TrackOrigins) {
2109 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2110 setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
2111 }
2112 }
2113 LLVM_DEBUG(dbgs()
2114 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2115 break;
2116 }
2117
2118 ArgOffset += alignTo(Size, kShadowTLSAlignment);
2119 }
2120 assert(ShadowPtr && "Could not find shadow for an argument");
2121 return ShadowPtr;
2122 }
2123
2124 // Check for partially-undefined constant vectors
2125 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2126 if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
2127 cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
2128 PoisonUndefVectors) {
2129 unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
2130 SmallVector<Constant *, 32> ShadowVector(NumElems);
2131 for (unsigned i = 0; i != NumElems; ++i) {
2132 Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
2133 ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
2134 : getCleanShadow(Elem);
2135 }
2136
2137 Value *ShadowConstant = ConstantVector::get(ShadowVector);
2138 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2139 << *ShadowConstant << "\n");
2140
2141 return ShadowConstant;
2142 }
2143
2144 // TODO: partially-undefined constant arrays, structures, and nested types
2145
2146 // For everything else the shadow is zero.
2147 return getCleanShadow(V);
2148 }
2149
2150 /// Get the shadow for i-th argument of the instruction I.
2151 Value *getShadow(Instruction *I, int i) {
2152 return getShadow(I->getOperand(i));
2153 }
2154
2155 /// Get the origin for a value.
2156 Value *getOrigin(Value *V) {
2157 if (!MS.TrackOrigins)
2158 return nullptr;
2159 if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
2160 return getCleanOrigin();
2162 "Unexpected value type in getOrigin()");
2163 if (Instruction *I = dyn_cast<Instruction>(V)) {
2164 if (I->getMetadata(LLVMContext::MD_nosanitize))
2165 return getCleanOrigin();
2166 }
2167 Value *Origin = OriginMap[V];
2168 assert(Origin && "Missing origin");
2169 return Origin;
2170 }
2171
2172 /// Get the origin for i-th argument of the instruction I.
2173 Value *getOrigin(Instruction *I, int i) {
2174 return getOrigin(I->getOperand(i));
2175 }
2176
2177 /// Remember the place where a shadow check should be inserted.
2178 ///
2179 /// This location will be later instrumented with a check that will print a
2180 /// UMR warning at runtime if the shadow value is not 0.
2181 void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2182 assert(Shadow);
2183 if (!InsertChecks)
2184 return;
2185
2186 if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
2187 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2188 << *OrigIns << "\n");
2189 return;
2190 }
2191#ifndef NDEBUG
2192 Type *ShadowTy = Shadow->getType();
2193 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2194 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2195 "Can only insert checks for integer, vector, and aggregate shadow "
2196 "types");
2197#endif
2198 InstrumentationList.push_back(
2199 ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2200 }
2201
2202 /// Get shadow for value, and remember the place where a shadow check should
2203 /// be inserted.
2204 ///
2205 /// This location will be later instrumented with a check that will print a
2206 /// UMR warning at runtime if the value is not fully defined.
2207 void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
2208 assert(Val);
2209 Value *Shadow, *Origin;
2210 if (ClCheckConstantShadow) {
2211 Shadow = getShadow(Val);
2212 if (!Shadow)
2213 return;
2214 Origin = getOrigin(Val);
2215 } else {
2216 Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2217 if (!Shadow)
2218 return;
2219 Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2220 }
2221 insertCheckShadow(Shadow, Origin, OrigIns);
2222 }
2223
2224 AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2225 switch (a) {
2226 case AtomicOrdering::NotAtomic:
2227 return AtomicOrdering::NotAtomic;
2228 case AtomicOrdering::Unordered:
2229 case AtomicOrdering::Monotonic:
2230 case AtomicOrdering::Release:
2231 return AtomicOrdering::Release;
2232 case AtomicOrdering::Acquire:
2233 case AtomicOrdering::AcquireRelease:
2234 return AtomicOrdering::AcquireRelease;
2235 case AtomicOrdering::SequentiallyConsistent:
2236 return AtomicOrdering::SequentiallyConsistent;
2237 }
2238 llvm_unreachable("Unknown ordering");
2239 }
2240
2241 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2242 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2243 uint32_t OrderingTable[NumOrderings] = {};
2244
2245 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2246 OrderingTable[(int)AtomicOrderingCABI::release] =
2247 (int)AtomicOrderingCABI::release;
2248 OrderingTable[(int)AtomicOrderingCABI::consume] =
2249 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2250 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2251 (int)AtomicOrderingCABI::acq_rel;
2252 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2253 (int)AtomicOrderingCABI::seq_cst;
2254
2255 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2256 }
2257
2258 AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2259 switch (a) {
2260 case AtomicOrdering::NotAtomic:
2261 return AtomicOrdering::NotAtomic;
2262 case AtomicOrdering::Unordered:
2263 case AtomicOrdering::Monotonic:
2264 case AtomicOrdering::Acquire:
2265 return AtomicOrdering::Acquire;
2266 case AtomicOrdering::Release:
2267 case AtomicOrdering::AcquireRelease:
2268 return AtomicOrdering::AcquireRelease;
2269 case AtomicOrdering::SequentiallyConsistent:
2270 return AtomicOrdering::SequentiallyConsistent;
2271 }
2272 llvm_unreachable("Unknown ordering");
2273 }
2274
2275 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2276 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2277 uint32_t OrderingTable[NumOrderings] = {};
2278
2279 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2280 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2281 OrderingTable[(int)AtomicOrderingCABI::consume] =
2282 (int)AtomicOrderingCABI::acquire;
2283 OrderingTable[(int)AtomicOrderingCABI::release] =
2284 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2285 (int)AtomicOrderingCABI::acq_rel;
2286 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2287 (int)AtomicOrderingCABI::seq_cst;
2288
2289 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2290 }
2291
2292 // ------------------- Visitors.
2293 using InstVisitor<MemorySanitizerVisitor>::visit;
2294 void visit(Instruction &I) {
2295 if (I.getMetadata(LLVMContext::MD_nosanitize))
2296 return;
2297 // Don't want to visit if we're in the prologue
2298 if (isInPrologue(I))
2299 return;
2300 if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
2301 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2302 // We still need to set the shadow and origin to clean values.
2303 setShadow(&I, getCleanShadow(&I));
2304 setOrigin(&I, getCleanOrigin());
2305 return;
2306 }
2307
2308 Instructions.push_back(&I);
2309 }
2310
2311 /// Instrument LoadInst
2312 ///
2313 /// Loads the corresponding shadow and (optionally) origin.
2314 /// Optionally, checks that the load address is fully defined.
2315 void visitLoadInst(LoadInst &I) {
2316 assert(I.getType()->isSized() && "Load type must have size");
2317 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2318 NextNodeIRBuilder IRB(&I);
2319 Type *ShadowTy = getShadowTy(&I);
2320 Value *Addr = I.getPointerOperand();
2321 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2322 const Align Alignment = I.getAlign();
2323 if (PropagateShadow) {
2324 std::tie(ShadowPtr, OriginPtr) =
2325 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2326 setShadow(&I,
2327 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2328 } else {
2329 setShadow(&I, getCleanShadow(&I));
2330 }
2331
2332 if (ClCheckAccessAddress)
2333 insertCheckShadowOf(I.getPointerOperand(), &I);
2334
2335 if (I.isAtomic())
2336 I.setOrdering(addAcquireOrdering(I.getOrdering()));
2337
2338 if (MS.TrackOrigins) {
2339 if (PropagateShadow) {
2340 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2341 setOrigin(
2342 &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2343 } else {
2344 setOrigin(&I, getCleanOrigin());
2345 }
2346 }
2347 }
2348
2349 /// Instrument StoreInst
2350 ///
2351 /// Stores the corresponding shadow and (optionally) origin.
2352 /// Optionally, checks that the store address is fully defined.
2353 void visitStoreInst(StoreInst &I) {
2354 StoreList.push_back(&I);
2355 if (ClCheckAccessAddress)
2356 insertCheckShadowOf(I.getPointerOperand(), &I);
2357 }
2358
2359 void handleCASOrRMW(Instruction &I) {
2360 assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2361
2362 IRBuilder<> IRB(&I);
2363 Value *Addr = I.getOperand(0);
2364 Value *Val = I.getOperand(1);
2365 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2366 /*isStore*/ true)
2367 .first;
2368
2369 if (ClCheckAccessAddress)
2370 insertCheckShadowOf(Addr, &I);
2371
2372 // Only test the conditional argument of the cmpxchg instruction.
2373 // The other argument can potentially be uninitialized, but we cannot
2374 // detect this situation reliably without possible false positives.
2375 if (isa<AtomicCmpXchgInst>(I))
2376 insertCheckShadowOf(Val, &I);
2377
2378 IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2379
2380 setShadow(&I, getCleanShadow(&I));
2381 setOrigin(&I, getCleanOrigin());
2382 }
2383
2384 void visitAtomicRMWInst(AtomicRMWInst &I) {
2385 handleCASOrRMW(I);
2386 I.setOrdering(addReleaseOrdering(I.getOrdering()));
2387 }
2388
2389 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2390 handleCASOrRMW(I);
2391 I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2392 }
2393
2394 // Vector manipulation.
2395 void visitExtractElementInst(ExtractElementInst &I) {
2396 insertCheckShadowOf(I.getOperand(1), &I);
2397 IRBuilder<> IRB(&I);
2398 setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2399 "_msprop"));
2400 setOrigin(&I, getOrigin(&I, 0));
2401 }
2402
2403 void visitInsertElementInst(InsertElementInst &I) {
2404 insertCheckShadowOf(I.getOperand(2), &I);
2405 IRBuilder<> IRB(&I);
2406 auto *Shadow0 = getShadow(&I, 0);
2407 auto *Shadow1 = getShadow(&I, 1);
2408 setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2409 "_msprop"));
2410 setOriginForNaryOp(I);
2411 }
2412
2413 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2414 IRBuilder<> IRB(&I);
2415 auto *Shadow0 = getShadow(&I, 0);
2416 auto *Shadow1 = getShadow(&I, 1);
2417 setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2418 "_msprop"));
2419 setOriginForNaryOp(I);
2420 }
2421
2422 // Casts.
2423 void visitSExtInst(SExtInst &I) {
2424 IRBuilder<> IRB(&I);
2425 setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2426 setOrigin(&I, getOrigin(&I, 0));
2427 }
2428
2429 void visitZExtInst(ZExtInst &I) {
2430 IRBuilder<> IRB(&I);
2431 setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2432 setOrigin(&I, getOrigin(&I, 0));
2433 }
2434
2435 void visitTruncInst(TruncInst &I) {
2436 IRBuilder<> IRB(&I);
2437 setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2438 setOrigin(&I, getOrigin(&I, 0));
2439 }
2440
2441 void visitBitCastInst(BitCastInst &I) {
2442 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2443 // a musttail call and a ret, don't instrument. New instructions are not
2444 // allowed after a musttail call.
2445 if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2446 if (CI->isMustTailCall())
2447 return;
2448 IRBuilder<> IRB(&I);
2449 setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2450 setOrigin(&I, getOrigin(&I, 0));
2451 }
2452
2453 void visitPtrToIntInst(PtrToIntInst &I) {
2454 IRBuilder<> IRB(&I);
2455 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2456 "_msprop_ptrtoint"));
2457 setOrigin(&I, getOrigin(&I, 0));
2458 }
2459
2460 void visitIntToPtrInst(IntToPtrInst &I) {
2461 IRBuilder<> IRB(&I);
2462 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2463 "_msprop_inttoptr"));
2464 setOrigin(&I, getOrigin(&I, 0));
2465 }
2466
2467 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2468 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2469 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2470 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2471 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2472 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2473
2474 /// Propagate shadow for bitwise AND.
2475 ///
2476 /// This code is exact, i.e. if, for example, a bit in the left argument
2477 /// is defined and 0, then neither the value nor the definedness of the
2478 /// corresponding bit in B affects the resulting shadow.
2479 void visitAnd(BinaryOperator &I) {
2480 IRBuilder<> IRB(&I);
2481 // "And" of 0 and a poisoned value results in unpoisoned value.
2482 // 1&1 => 1; 0&1 => 0; p&1 => p;
2483 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2484 // 1&p => p; 0&p => 0; p&p => p;
2485 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
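// As an illustrative single-bit example: if V1 is a defined 0 (V1 = 0,
// S1 = 0), then S = (0 & S2) | (0 & S2) | (0 & V2) = 0, i.e. the result bit
// is a defined 0 regardless of whether the other operand bit is initialized.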
2486 Value *S1 = getShadow(&I, 0);
2487 Value *S2 = getShadow(&I, 1);
2488 Value *V1 = I.getOperand(0);
2489 Value *V2 = I.getOperand(1);
2490 if (V1->getType() != S1->getType()) {
2491 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2492 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2493 }
2494 Value *S1S2 = IRB.CreateAnd(S1, S2);
2495 Value *V1S2 = IRB.CreateAnd(V1, S2);
2496 Value *S1V2 = IRB.CreateAnd(S1, V2);
2497 setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2498 setOriginForNaryOp(I);
2499 }
2500
2501 void visitOr(BinaryOperator &I) {
2502 IRBuilder<> IRB(&I);
2503 // "Or" of 1 and a poisoned value results in unpoisoned value:
2504 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2505 // 1|0 => 1; 0|0 => 0; p|0 => p;
2506 // 1|p => 1; 0|p => p; p|p => p;
2507 //
2508 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2509 //
2510 // If the "disjoint OR" property is violated, the result is poison, and
2511 // hence the entire shadow is uninitialized:
2512 // S = S | SignExt(V1 & V2 != 0)
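// As an illustrative single-bit example: if V1 is a defined 1 (V1 = 1,
// S1 = 0), then (S1 & S2) | (~V1 & S2) | (S1 & ~V2) = 0, i.e. a defined 1
// operand makes the result bit a defined 1 no matter what the other bit is.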
2513 Value *S1 = getShadow(&I, 0);
2514 Value *S2 = getShadow(&I, 1);
2515 Value *V1 = I.getOperand(0);
2516 Value *V2 = I.getOperand(1);
2517 if (V1->getType() != S1->getType()) {
2518 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2519 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2520 }
2521
2522 Value *NotV1 = IRB.CreateNot(V1);
2523 Value *NotV2 = IRB.CreateNot(V2);
2524
2525 Value *S1S2 = IRB.CreateAnd(S1, S2);
2526 Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
2527 Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
2528
2529 Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
2530
2531 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
2532 Value *V1V2 = IRB.CreateAnd(V1, V2);
2533 Value *DisjointOrShadow = IRB.CreateSExt(
2534 IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
2535 S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
2536 }
2537
2538 setShadow(&I, S);
2539 setOriginForNaryOp(I);
2540 }
2541
2542 /// Default propagation of shadow and/or origin.
2543 ///
2544 /// This class implements the general case of shadow propagation, used in all
2545 /// cases where we don't know and/or don't care about what the operation
2546 /// actually does. It converts all input shadow values to a common type
2547 /// (extending or truncating as necessary), and bitwise OR's them.
2548 ///
2549 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2550 /// fully initialized), and less prone to false positives.
2551 ///
2552 /// This class also implements the general case of origin propagation. For a
2553 /// Nary operation, result origin is set to the origin of an argument that is
2554 /// not entirely initialized. If there is more than one such argument, the
2555 /// rightmost of them is picked. It does not matter which one is picked if all
2556 /// arguments are initialized.
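///
/// A typical use (see handleShadowOr() below) is, roughly:
///   ShadowAndOriginCombiner SC(this, IRB);
///   SC.Add(Op0).Add(Op1);
///   SC.Done(&I);
/// which ORs the operand shadows together and picks the origin of the
/// rightmost not-fully-initialized operand.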
2557 template <bool CombineShadow> class Combiner {
2558 Value *Shadow = nullptr;
2559 Value *Origin = nullptr;
2560 IRBuilder<> &IRB;
2561 MemorySanitizerVisitor *MSV;
2562
2563 public:
2564 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2565 : IRB(IRB), MSV(MSV) {}
2566
2567 /// Add a pair of shadow and origin values to the mix.
2568 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2569 if (CombineShadow) {
2570 assert(OpShadow);
2571 if (!Shadow)
2572 Shadow = OpShadow;
2573 else {
2574 OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2575 Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2576 }
2577 }
2578
2579 if (MSV->MS.TrackOrigins) {
2580 assert(OpOrigin);
2581 if (!Origin) {
2582 Origin = OpOrigin;
2583 } else {
2584 Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2585 // No point in adding something that might result in 0 origin value.
2586 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2587 Value *Cond = MSV->convertToBool(OpShadow, IRB);
2588 Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2589 }
2590 }
2591 }
2592 return *this;
2593 }
2594
2595 /// Add an application value to the mix.
2596 Combiner &Add(Value *V) {
2597 Value *OpShadow = MSV->getShadow(V);
2598 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2599 return Add(OpShadow, OpOrigin);
2600 }
2601
2602 /// Set the current combined values as the given instruction's shadow
2603 /// and origin.
2604 void Done(Instruction *I) {
2605 if (CombineShadow) {
2606 assert(Shadow);
2607 Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2608 MSV->setShadow(I, Shadow);
2609 }
2610 if (MSV->MS.TrackOrigins) {
2611 assert(Origin);
2612 MSV->setOrigin(I, Origin);
2613 }
2614 }
2615
2616 /// Store the current combined value at the specified origin
2617 /// location.
2618 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2619 if (MSV->MS.TrackOrigins) {
2620 assert(Origin);
2621 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
2622 }
2623 }
2624 };
2625
2626 using ShadowAndOriginCombiner = Combiner<true>;
2627 using OriginCombiner = Combiner<false>;
2628
2629 /// Propagate origin for arbitrary operation.
2630 void setOriginForNaryOp(Instruction &I) {
2631 if (!MS.TrackOrigins)
2632 return;
2633 IRBuilder<> IRB(&I);
2634 OriginCombiner OC(this, IRB);
2635 for (Use &Op : I.operands())
2636 OC.Add(Op.get());
2637 OC.Done(&I);
2638 }
2639
2640 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2641 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2642 "Vector of pointers is not a valid shadow type");
2643 return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2644 Ty->getScalarSizeInBits()
2645 : Ty->getPrimitiveSizeInBits();
2646 }
2647
2648 /// Cast between two shadow types, extending or truncating as
2649 /// necessary.
2650 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2651 bool Signed = false) {
2652 Type *srcTy = V->getType();
2653 if (srcTy == dstTy)
2654 return V;
2655 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2656 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2657 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2658 return IRB.CreateICmpNE(V, getCleanShadow(V));
2659
2660 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2661 return IRB.CreateIntCast(V, dstTy, Signed);
2662 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2663 cast<VectorType>(dstTy)->getElementCount() ==
2664 cast<VectorType>(srcTy)->getElementCount())
2665 return IRB.CreateIntCast(V, dstTy, Signed);
2666 Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2667 Value *V2 =
2668 IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2669 return IRB.CreateBitCast(V2, dstTy);
2670 // TODO: handle struct types.
2671 }
2672
2673 /// Cast an application value to the type of its own shadow.
2674 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2675 Type *ShadowTy = getShadowTy(V);
2676 if (V->getType() == ShadowTy)
2677 return V;
2678 if (V->getType()->isPtrOrPtrVectorTy())
2679 return IRB.CreatePtrToInt(V, ShadowTy);
2680 else
2681 return IRB.CreateBitCast(V, ShadowTy);
2682 }
2683
2684 /// Propagate shadow for arbitrary operation.
2685 void handleShadowOr(Instruction &I) {
2686 IRBuilder<> IRB(&I);
2687 ShadowAndOriginCombiner SC(this, IRB);
2688 for (Use &Op : I.operands())
2689 SC.Add(Op.get());
2690 SC.Done(&I);
2691 }
2692
2693 // Perform a bitwise OR on the horizontal pairs (or other specified grouping)
2694 // of elements.
2695 //
2696 // For example, suppose we have:
2697 // VectorA: <a1, a2, a3, a4, a5, a6>
2698 // VectorB: <b1, b2, b3, b4, b5, b6>
2699 // ReductionFactor: 3.
2700 // The output would be:
2701 // <a1|a2|a3, a4|a5|a6, b1|b2|b3, b4|b5|b6>
2702 //
2703 // This is convenient for instrumenting horizontal add/sub.
2704 // For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
2705 Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
2706 Value *VectorA, Value *VectorB) {
2707 assert(isa<FixedVectorType>(VectorA->getType()));
2708 unsigned TotalNumElems =
2709 cast<FixedVectorType>(VectorA->getType())->getNumElements();
2710
2711 if (VectorB) {
2712 assert(VectorA->getType() == VectorB->getType());
2713 TotalNumElems = TotalNumElems * 2;
2714 }
2715
2716 assert(TotalNumElems % ReductionFactor == 0);
2717
2718 Value *Or = nullptr;
2719
2720 IRBuilder<> IRB(&I);
2721 for (unsigned i = 0; i < ReductionFactor; i++) {
2722 SmallVector<int, 16> Mask;
2723 for (unsigned X = 0; X < TotalNumElems; X += ReductionFactor)
2724 Mask.push_back(X + i);
2725
2726 Value *Masked;
2727 if (VectorB)
2728 Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
2729 else
2730 Masked = IRB.CreateShuffleVector(VectorA, Mask);
2731
2732 if (Or)
2733 Or = IRB.CreateOr(Or, Masked);
2734 else
2735 Or = Masked;
2736 }
2737
2738 return Or;
2739 }
2740
2741 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2742 /// fields.
2743 ///
2744 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2745 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
2746 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
2747 assert(I.arg_size() == 1 || I.arg_size() == 2);
2748
2749 assert(I.getType()->isVectorTy());
2750 assert(I.getArgOperand(0)->getType()->isVectorTy());
2751
2752 [[maybe_unused]] FixedVectorType *ParamType =
2753 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2754 assert((I.arg_size() != 2) ||
2755 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2756 [[maybe_unused]] FixedVectorType *ReturnType =
2757 cast<FixedVectorType>(I.getType());
2758 assert(ParamType->getNumElements() * I.arg_size() ==
2759 2 * ReturnType->getNumElements());
2760
2761 IRBuilder<> IRB(&I);
2762
2763 // Horizontal OR of shadow
2764 Value *FirstArgShadow = getShadow(&I, 0);
2765 Value *SecondArgShadow = nullptr;
2766 if (I.arg_size() == 2)
2767 SecondArgShadow = getShadow(&I, 1);
2768
2769 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2770 SecondArgShadow);
2771
2772 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2773
2774 setShadow(&I, OrShadow);
2775 setOriginForNaryOp(I);
2776 }
2777
2778 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2779 /// fields, with the parameters reinterpreted to have elements of a specified
2780 /// width. For example:
2781 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2782 /// conceptually operates on
2783 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2784 /// and can be handled with ReinterpretElemWidth == 16.
2785 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
2786 int ReinterpretElemWidth) {
2787 assert(I.arg_size() == 1 || I.arg_size() == 2);
2788
2789 assert(I.getType()->isVectorTy());
2790 assert(I.getArgOperand(0)->getType()->isVectorTy());
2791
2792 FixedVectorType *ParamType =
2793 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2794 assert((I.arg_size() != 2) ||
2795 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2796
2797 [[maybe_unused]] FixedVectorType *ReturnType =
2798 cast<FixedVectorType>(I.getType());
2799 assert(ParamType->getNumElements() * I.arg_size() ==
2800 2 * ReturnType->getNumElements());
2801
2802 IRBuilder<> IRB(&I);
2803
2804 FixedVectorType *ReinterpretShadowTy = nullptr;
2805 assert(isAligned(Align(ReinterpretElemWidth),
2806 ParamType->getPrimitiveSizeInBits()));
2807 ReinterpretShadowTy = FixedVectorType::get(
2808 IRB.getIntNTy(ReinterpretElemWidth),
2809 ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2810
2811 // Horizontal OR of shadow
2812 Value *FirstArgShadow = getShadow(&I, 0);
2813 FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
2814
2815 // If we had two parameters each with an odd number of elements, the total
2816 // number of elements is even, but we have never seen this in extant
2817 // instruction sets, so we enforce that each parameter must have an even
2818 // number of elements.
2819 assert(isAligned(
2820 Align(2),
2821 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2822
2823 Value *SecondArgShadow = nullptr;
2824 if (I.arg_size() == 2) {
2825 SecondArgShadow = getShadow(&I, 1);
2826 SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
2827 }
2828
2829 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2830 SecondArgShadow);
2831
2832 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2833
2834 setShadow(&I, OrShadow);
2835 setOriginForNaryOp(I);
2836 }
2837
2838 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2839
2840 // Handle multiplication by constant.
2841 //
2842 // Handle a special case of multiplication by constant that may have one or
2843 // more zeros in the lower bits. This makes corresponding number of lower bits
2844 // of the result zero as well. We model it by shifting the other operand
2845 // shadow left by the required number of bits. Effectively, we transform
2846 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2847 // We use multiplication by 2**N instead of shift to cover the case of
2848 // multiplication by 0, which may occur in some elements of a vector operand.
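// For example (illustrative): multiplying by the constant 12 = 3 * 2**2
// forces the two low bits of the product to zero, so the other operand's
// shadow is multiplied by 4 (i.e. shifted left by 2), marking those low
// result bits as initialized.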
2849 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2850 Value *OtherArg) {
2851 Constant *ShadowMul;
2852 Type *Ty = ConstArg->getType();
2853 if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2854 unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2855 Type *EltTy = VTy->getElementType();
2856 SmallVector<Constant *, 16> Elements;
2857 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2858 if (ConstantInt *Elt =
2859 dyn_cast<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
2860 const APInt &V = Elt->getValue();
2861 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2862 Elements.push_back(ConstantInt::get(EltTy, V2));
2863 } else {
2864 Elements.push_back(ConstantInt::get(EltTy, 1));
2865 }
2866 }
2867 ShadowMul = ConstantVector::get(Elements);
2868 } else {
2869 if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2870 const APInt &V = Elt->getValue();
2871 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2872 ShadowMul = ConstantInt::get(Ty, V2);
2873 } else {
2874 ShadowMul = ConstantInt::get(Ty, 1);
2875 }
2876 }
2877
2878 IRBuilder<> IRB(&I);
2879 setShadow(&I,
2880 IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2881 setOrigin(&I, getOrigin(OtherArg));
2882 }
2883
2884 void visitMul(BinaryOperator &I) {
2885 Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2886 Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2887 if (constOp0 && !constOp1)
2888 handleMulByConstant(I, constOp0, I.getOperand(1));
2889 else if (constOp1 && !constOp0)
2890 handleMulByConstant(I, constOp1, I.getOperand(0));
2891 else
2892 handleShadowOr(I);
2893 }
2894
2895 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2896 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2897 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2898 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2899 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2900 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2901
2902 void handleIntegerDiv(Instruction &I) {
2903 IRBuilder<> IRB(&I);
2904 // Strict on the second argument.
2905 insertCheckShadowOf(I.getOperand(1), &I);
2906 setShadow(&I, getShadow(&I, 0));
2907 setOrigin(&I, getOrigin(&I, 0));
2908 }
2909
2910 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2911 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2912 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2913 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2914
2915 // Floating point division is side-effect free. We cannot require that the
2916 // divisor is fully initialized and must propagate shadow. See PR37523.
2917 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2918 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2919
2920 /// Instrument == and != comparisons.
2921 ///
2922 /// Sometimes the comparison result is known even if some of the bits of the
2923 /// arguments are not.
2924 void handleEqualityComparison(ICmpInst &I) {
2925 IRBuilder<> IRB(&I);
2926 Value *A = I.getOperand(0);
2927 Value *B = I.getOperand(1);
2928 Value *Sa = getShadow(A);
2929 Value *Sb = getShadow(B);
2930
2931 // Get rid of pointers and vectors of pointers.
2932 // For ints (and vectors of ints), types of A and Sa match,
2933 // and this is a no-op.
2934 A = IRB.CreatePointerCast(A, Sa->getType());
2935 B = IRB.CreatePointerCast(B, Sb->getType());
2936
2937 // A == B <==> (C = A^B) == 0
2938 // A != B <==> (C = A^B) != 0
2939 // Sc = Sa | Sb
2940 Value *C = IRB.CreateXor(A, B);
2941 Value *Sc = IRB.CreateOr(Sa, Sb);
2942 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2943 // Result is defined if one of the following is true
2944 // * there is a defined 1 bit in C
2945 // * C is fully defined
2946 // Si = !(C & ~Sc) && Sc
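// Worked example (illustrative values): for A = 0b10?? (two low bits
// undefined) and B = 0b0000, C = A ^ B has a defined 1 in bit 3, so the
// comparison result is known regardless of the undefined bits and the
// resulting shadow Si is 0 (defined).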
2947 Value *Zero = Constant::getNullValue(Sc->getType());
2948 Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2949 Value *LHS = IRB.CreateICmpNE(Sc, Zero);
2950 Value *RHS =
2951 IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
2952 Value *Si = IRB.CreateAnd(LHS, RHS);
2953 Si->setName("_msprop_icmp");
2954 setShadow(&I, Si);
2955 setOriginForNaryOp(I);
2956 }
2957
2958 /// Instrument relational comparisons.
2959 ///
2960 /// This function does exact shadow propagation for all relational
2961 /// comparisons of integers, pointers and vectors of those.
2962 /// FIXME: output seems suboptimal when one of the operands is a constant
2963 void handleRelationalComparisonExact(ICmpInst &I) {
2964 IRBuilder<> IRB(&I);
2965 Value *A = I.getOperand(0);
2966 Value *B = I.getOperand(1);
2967 Value *Sa = getShadow(A);
2968 Value *Sb = getShadow(B);
2969
2970 // Get rid of pointers and vectors of pointers.
2971 // For ints (and vectors of ints), types of A and Sa match,
2972 // and this is a no-op.
2973 A = IRB.CreatePointerCast(A, Sa->getType());
2974 B = IRB.CreatePointerCast(B, Sb->getType());
2975
2976 // Let [a0, a1] be the interval of possible values of A, taking into account
2977 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
2978 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
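// Worked example (unsigned, illustrative values): if A = 0b01?0 (bit 1
// undefined) then [a0, a1] = [0b0100, 0b0110]; for a fully defined
// B = 0b0010, both 0b0100 u> 0b0010 and 0b0110 u> 0b0010 hold, so A u> B
// is defined even though A itself is partly undefined.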
2979 bool IsSigned = I.isSigned();
2980
2981 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
2982 if (IsSigned) {
2983 // Sign-flip to map from signed range to unsigned range. Relation A vs B
2984 // should be preserved, if checked with `getUnsignedPredicate()`.
2985 // Relationship between Amin, Amax, Bmin, Bmax also will not be
2986 // affected, as they are created by effectively adding/subtracting from
2987 // A (or B) a value, derived from shadow, with no overflow, either
2988 // before or after sign flip.
2989 APInt MinVal =
2990 APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
2991 V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
2992 }
2993 // Minimize undefined bits.
2994 Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
2995 Value *Max = IRB.CreateOr(V, S);
2996 return std::make_pair(Min, Max);
2997 };
2998
2999 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
3000 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
3001 Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
3002 Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
3003
3004 Value *Si = IRB.CreateXor(S1, S2);
3005 setShadow(&I, Si);
3006 setOriginForNaryOp(I);
3007 }
3008
3009 /// Instrument signed relational comparisons.
3010 ///
3011 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
3012 /// bit of the shadow. Everything else is delegated to handleShadowOr().
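///
/// For example, x < 0 depends only on the sign bit of x, so the result is
/// undefined exactly when the sign bit of x's shadow is set; the code below
/// computes this as a signed comparison of the shadow against 0.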
3013 void handleSignedRelationalComparison(ICmpInst &I) {
3014 Constant *constOp;
3015 Value *op = nullptr;
3016 CmpInst::Predicate pre;
3017 if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
3018 op = I.getOperand(0);
3019 pre = I.getPredicate();
3020 } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
3021 op = I.getOperand(1);
3022 pre = I.getSwappedPredicate();
3023 } else {
3024 handleShadowOr(I);
3025 return;
3026 }
3027
3028 if ((constOp->isNullValue() &&
3029 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3030 (constOp->isAllOnesValue() &&
3031 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3032 IRBuilder<> IRB(&I);
3033 Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
3034 "_msprop_icmp_s");
3035 setShadow(&I, Shadow);
3036 setOrigin(&I, getOrigin(op));
3037 } else {
3038 handleShadowOr(I);
3039 }
3040 }
3041
3042 void visitICmpInst(ICmpInst &I) {
3043 if (!ClHandleICmp) {
3044 handleShadowOr(I);
3045 return;
3046 }
3047 if (I.isEquality()) {
3048 handleEqualityComparison(I);
3049 return;
3050 }
3051
3052 assert(I.isRelational());
3053 if (ClHandleICmpExact) {
3054 handleRelationalComparisonExact(I);
3055 return;
3056 }
3057 if (I.isSigned()) {
3058 handleSignedRelationalComparison(I);
3059 return;
3060 }
3061
3062 assert(I.isUnsigned());
3063 if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
3064 handleRelationalComparisonExact(I);
3065 return;
3066 }
3067
3068 handleShadowOr(I);
3069 }
3070
3071 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3072
3073 void handleShift(BinaryOperator &I) {
3074 IRBuilder<> IRB(&I);
3075 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3076 // Otherwise perform the same shift on S1.
3077 Value *S1 = getShadow(&I, 0);
3078 Value *S2 = getShadow(&I, 1);
3079 Value *S2Conv =
3080 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3081 Value *V2 = I.getOperand(1);
3082 Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
3083 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3084 setOriginForNaryOp(I);
3085 }
3086
3087 void visitShl(BinaryOperator &I) { handleShift(I); }
3088 void visitAShr(BinaryOperator &I) { handleShift(I); }
3089 void visitLShr(BinaryOperator &I) { handleShift(I); }
3090
3091 void handleFunnelShift(IntrinsicInst &I) {
3092 IRBuilder<> IRB(&I);
3093 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3094 // Otherwise perform the same shift on S0 and S1.
3095 Value *S0 = getShadow(&I, 0);
3096 Value *S1 = getShadow(&I, 1);
3097 Value *S2 = getShadow(&I, 2);
3098 Value *S2Conv =
3099 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3100 Value *V2 = I.getOperand(2);
3101 Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
3102 {S0, S1, V2});
3103 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3104 setOriginForNaryOp(I);
3105 }
3106
3107 /// Instrument llvm.memmove
3108 ///
3109 /// At this point we don't know if llvm.memmove will be inlined or not.
3110 /// If we don't instrument it and it gets inlined,
3111 /// our interceptor will not kick in and we will lose the memmove.
3112 /// If we instrument the call here, but it does not get inlined,
3113 /// we will memmove the shadow twice, which is bad in the case
3114 /// of overlapping regions. So, we simply lower the intrinsic to a call.
3115 ///
3116 /// Similar situation exists for memcpy and memset.
3117 void visitMemMoveInst(MemMoveInst &I) {
3118 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3119 IRBuilder<> IRB(&I);
3120 IRB.CreateCall(MS.MemmoveFn,
3121 {I.getArgOperand(0), I.getArgOperand(1),
3122 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3123 I.eraseFromParent();
3124 }
3125
3126 /// Instrument memcpy
3127 ///
3128 /// Similar to memmove: avoid copying shadow twice. This is somewhat
3129 /// unfortunate as it may slow down small constant memcpys.
3130 /// FIXME: consider doing manual inline for small constant sizes and proper
3131 /// alignment.
3132 ///
3133 /// Note: This also handles memcpy.inline, which promises no calls to external
3134 /// functions as an optimization. However, with instrumentation enabled this
3135 /// is difficult to promise; additionally, we know that the MSan runtime
3136 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3137 /// instrumentation it's safe to turn memcpy.inline into a call to
3138 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3139 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3140 void visitMemCpyInst(MemCpyInst &I) {
3141 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3142 IRBuilder<> IRB(&I);
3143 IRB.CreateCall(MS.MemcpyFn,
3144 {I.getArgOperand(0), I.getArgOperand(1),
3145 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3146 I.eraseFromParent();
3147 }
3148
3149 // Same as memcpy.
3150 void visitMemSetInst(MemSetInst &I) {
3151 IRBuilder<> IRB(&I);
3152 IRB.CreateCall(
3153 MS.MemsetFn,
3154 {I.getArgOperand(0),
3155 IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
3156 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3157 I.eraseFromParent();
3158 }
3159
3160 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3161
3162 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3163
3164 /// Handle vector store-like intrinsics.
3165 ///
3166 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3167 /// has 1 pointer argument and 1 vector argument, returns void.
3168 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3169 assert(I.arg_size() == 2);
3170
3171 IRBuilder<> IRB(&I);
3172 Value *Addr = I.getArgOperand(0);
3173 Value *Shadow = getShadow(&I, 1);
3174 Value *ShadowPtr, *OriginPtr;
3175
3176 // We don't know the pointer alignment (could be unaligned SSE store!).
3177 // Have to assume the worst case.
3178 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3179 Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
3180 IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
3181
3183 insertCheckShadowOf(Addr, &I);
3184
3185 // FIXME: factor out common code from materializeStores
3186 if (MS.TrackOrigins)
3187 IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
3188 return true;
3189 }
3190
3191 /// Handle vector load-like intrinsics.
3192 ///
3193 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3194 /// has 1 pointer argument, returns a vector.
3195 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3196 assert(I.arg_size() == 1);
3197
3198 IRBuilder<> IRB(&I);
3199 Value *Addr = I.getArgOperand(0);
3200
3201 Type *ShadowTy = getShadowTy(&I);
3202 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3203 if (PropagateShadow) {
3204 // We don't know the pointer alignment (could be unaligned SSE load!).
3205 // Have to assume the worst case.
3206 const Align Alignment = Align(1);
3207 std::tie(ShadowPtr, OriginPtr) =
3208 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3209 setShadow(&I,
3210 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
3211 } else {
3212 setShadow(&I, getCleanShadow(&I));
3213 }
3214
3215 if (ClCheckAccessAddress)
3216 insertCheckShadowOf(Addr, &I);
3217
3218 if (MS.TrackOrigins) {
3219 if (PropagateShadow)
3220 setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
3221 else
3222 setOrigin(&I, getCleanOrigin());
3223 }
3224 return true;
3225 }
3226
3227 /// Handle (SIMD arithmetic)-like intrinsics.
3228 ///
3229 /// Instrument intrinsics with any number of arguments of the same type [*],
3230 /// equal to the return type, plus a specified number of trailing flags of
3231 /// any type.
3232 ///
3233 /// [*] The type should be simple (no aggregates or pointers; vectors are
3234 /// fine).
3235 ///
3236 /// Caller guarantees that this intrinsic does not access memory.
3237 ///
3238 /// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
3239 /// by this handler. See horizontalReduce().
3240 ///
3241 /// TODO: permutation intrinsics are also often incorrectly matched.
3242 [[maybe_unused]] bool
3243 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3244 unsigned int trailingFlags) {
3245 Type *RetTy = I.getType();
3246 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3247 return false;
3248
3249 unsigned NumArgOperands = I.arg_size();
3250 assert(NumArgOperands >= trailingFlags);
3251 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3252 Type *Ty = I.getArgOperand(i)->getType();
3253 if (Ty != RetTy)
3254 return false;
3255 }
3256
3257 IRBuilder<> IRB(&I);
3258 ShadowAndOriginCombiner SC(this, IRB);
3259 for (unsigned i = 0; i < NumArgOperands; ++i)
3260 SC.Add(I.getArgOperand(i));
3261 SC.Done(&I);
3262
3263 return true;
3264 }
3265
3266 /// Returns whether it was able to heuristically instrument unknown
3267 /// intrinsics.
3268 ///
3269 /// The main purpose of this code is to do something reasonable with all
3270 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3271 /// We recognize several classes of intrinsics by their argument types and
3272 /// ModRefBehaviour and apply special instrumentation when we are reasonably
3273 /// sure that we know what the intrinsic does.
3274 ///
3275 /// We special-case intrinsics where this approach fails. See llvm.bswap
3276 /// handling as an example of that.
3277 bool maybeHandleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3278 unsigned NumArgOperands = I.arg_size();
3279 if (NumArgOperands == 0)
3280 return false;
3281
3282 if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
3283 I.getArgOperand(1)->getType()->isVectorTy() &&
3284 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3285 // This looks like a vector store.
3286 return handleVectorStoreIntrinsic(I);
3287 }
3288
3289 if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
3290 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3291 // This looks like a vector load.
3292 return handleVectorLoadIntrinsic(I);
3293 }
3294
3295 if (I.doesNotAccessMemory())
3296 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3297 return true;
3298
3299 // FIXME: detect and handle SSE maskstore/maskload?
3300 // Some cases are now handled in handleAVXMasked{Load,Store}.
3301 return false;
3302 }
3303
3304 bool maybeHandleUnknownIntrinsic(IntrinsicInst &I) {
3305 if (maybeHandleUnknownIntrinsicUnlogged(I)) {
3306 if (ClDumpStrictIntrinsics)
3307 dumpInst(I);
3308
3309 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3310 << "\n");
3311 return true;
3312 } else
3313 return false;
3314 }
3315
3316 void handleInvariantGroup(IntrinsicInst &I) {
3317 setShadow(&I, getShadow(&I, 0));
3318 setOrigin(&I, getOrigin(&I, 0));
3319 }
3320
3321 void handleLifetimeStart(IntrinsicInst &I) {
3322 if (!PoisonStack)
3323 return;
3324 AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
3325 if (AI)
3326 LifetimeStartList.push_back(std::make_pair(&I, AI));
3327 }
3328
3329 void handleBswap(IntrinsicInst &I) {
3330 IRBuilder<> IRB(&I);
3331 Value *Op = I.getArgOperand(0);
3332 Type *OpType = Op->getType();
3333 setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
3334 getShadow(Op)));
3335 setOrigin(&I, getOrigin(Op));
3336 }
3337
3338 // Uninitialized bits are ok if they appear after the leading/trailing 0's
3339 // and a 1. If the input is all zero, it is fully initialized iff
3340 // !is_zero_poison.
3341 //
3342 // e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
3343 // concrete value 0/1, and ? is an uninitialized bit:
3344 // - 0001 0??? is fully initialized
3345 // - 000? ???? is fully uninitialized (*)
3346 // - ???? ???? is fully uninitialized
3347 // - 0000 0000 is fully uninitialized if is_zero_poison,
3348 // fully initialized otherwise
3349 //
3350 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3351 // only need to poison 4 bits.
3352 //
3353 // OutputShadow =
3354 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3355 // || (is_zero_poison && AllZeroSrc)
3356 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3357 IRBuilder<> IRB(&I);
3358 Value *Src = I.getArgOperand(0);
3359 Value *SrcShadow = getShadow(Src);
3360
3361 Value *False = IRB.getInt1(false);
3362 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3363 I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
3364 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3365 I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
3366
3367 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3368 ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
3369
3370 Value *NotAllZeroShadow =
3371 IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
3372 Value *OutputShadow =
3373 IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
3374
3375 // If zero poison is requested, mix in with the shadow
3376 Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
3377 if (!IsZeroPoison->isZeroValue()) {
3378 Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
3379 OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
3380 }
3381
3382 OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
3383
3384 setShadow(&I, OutputShadow);
3385 setOriginForNaryOp(I);
3386 }
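// Worked example (added for exposition; not part of the upstream source),
// applying the formula above to an 8-bit ctlz with is_zero_poison = false:
//   Src = 0b0001'0??? -> SrcShadow = 0b0000'0111
//   ConcreteZerosCount = ctlz(Src) = 3, ShadowZerosCount = ctlz(0b0000'0111) = 5
//   (3 >= 5) is false -> OutputShadow = clean, matching "0001 0??? is fully
//   initialized" above.
//   Src = 0b000?'???? -> SrcShadow = 0b0001'1111, ShadowZerosCount = 3;
//   ConcreteZerosCount >= 3 for any value of the unknown bits and the shadow
//   is non-zero -> OutputShadow = all-ones (poisoned).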
3387
3388 /// Handle Arm NEON vector convert intrinsics.
3389 ///
3390 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3391 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3392 ///
3393 /// For x86 SSE vector convert intrinsics, see
3394 /// handleSSEVectorConvertIntrinsic().
3395 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3396 assert(I.arg_size() == 1);
3397
3398 IRBuilder<> IRB(&I);
3399 Value *S0 = getShadow(&I, 0);
3400
3401 /// For scalars:
3402 /// Since they are converting from floating-point to integer, the output is
3403 /// - fully uninitialized if *any* bit of the input is uninitialized
3404 /// - fully initialized if all bits of the input are initialized
3405 /// We apply the same principle on a per-field basis for vectors.
3406 Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
3407 getShadowTy(&I));
3408 setShadow(&I, OutShadow);
3409 setOriginForNaryOp(I);
3410 }
3411
3412 /// Some instructions have additional zero-elements in the return type
3413 /// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
3414 ///
3415 /// This function will return a vector type with the same number of elements
3416 /// as the input, but the same per-element width as the return value, e.g.,
3417 /// <8 x i8>.
3418 FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
3419 assert(isa<FixedVectorType>(getShadowTy(&I)));
3420 FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
3421
3422 // TODO: generalize beyond 2x?
3423 if (ShadowType->getElementCount() ==
3424 cast<VectorType>(Src->getType())->getElementCount() * 2)
3425 ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
3426
3427 assert(ShadowType->getElementCount() ==
3428 cast<VectorType>(Src->getType())->getElementCount());
3429
3430 return ShadowType;
3431 }
3432
3433 /// Doubles the length of a vector shadow (extending with zeros) if necessary
3434 /// to match the length of the shadow for the instruction.
3435 /// If scalar types of the vectors are different, it will use the type of the
3436 /// input vector.
3437 /// This is more type-safe than CreateShadowCast().
3438 Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
3439 IRBuilder<> IRB(&I);
3440 assert(isa<FixedVectorType>(Shadow->getType()));
3441 assert(isa<FixedVectorType>(I.getType()));
3442
3443 Value *FullShadow = getCleanShadow(&I);
3444 unsigned ShadowNumElems =
3445 cast<FixedVectorType>(Shadow->getType())->getNumElements();
3446 unsigned FullShadowNumElems =
3447 cast<FixedVectorType>(FullShadow->getType())->getNumElements();
3448
3449 assert((ShadowNumElems == FullShadowNumElems) ||
3450 (ShadowNumElems * 2 == FullShadowNumElems));
3451
3452 if (ShadowNumElems == FullShadowNumElems) {
3453 FullShadow = Shadow;
3454 } else {
3455 // TODO: generalize beyond 2x?
3456 SmallVector<int, 32> ShadowMask(FullShadowNumElems);
3457 std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
3458
3459 // Append zeros
3460 FullShadow =
3461 IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
3462 }
3463
3464 return FullShadow;
3465 }
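// Worked example (added for exposition; not part of the upstream source):
// if the instruction returns <8 x i16> but Shadow is <4 x i16>, the shuffle
// above uses mask <0, 1, 2, 3, 4, 5, 6, 7> over Shadow concatenated with a
// clean <4 x i16>: indices 0..3 pick the original shadow lanes, indices 4..7
// pick zero lanes, producing <s0, s1, s2, s3, 0, 0, 0, 0>, i.e. the extra
// return elements are reported as fully initialized.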
3466
3467 /// Handle x86 SSE vector conversion.
3468 ///
3469 /// e.g., single-precision to half-precision conversion:
3470 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3471 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3472 ///
3473 /// floating-point to integer:
3474 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3475 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3476 ///
3477 /// Note: if the output has more elements, they are zero-initialized (and
3478 /// therefore the shadow will also be initialized).
3479 ///
3480 /// This differs from handleSSEVectorConvertIntrinsic() because it
3481 /// propagates uninitialized shadow (instead of checking the shadow).
3482 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3483 bool HasRoundingMode) {
3484 if (HasRoundingMode) {
3485 assert(I.arg_size() == 2);
3486 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
3487 assert(RoundingMode->getType()->isIntegerTy());
3488 } else {
3489 assert(I.arg_size() == 1);
3490 }
3491
3492 Value *Src = I.getArgOperand(0);
3493 assert(Src->getType()->isVectorTy());
3494
3495 // The return type might have more elements than the input.
3496 // Temporarily shrink the return type's number of elements.
3497 VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
3498
3499 IRBuilder<> IRB(&I);
3500 Value *S0 = getShadow(&I, 0);
3501
3502 /// For scalars:
3503 /// Since they are converting to and/or from floating-point, the output is:
3504 /// - fully uninitialized if *any* bit of the input is uninitialized
3505 /// - fully initialized if all bits of the input are initialized
3506 /// We apply the same principle on a per-field basis for vectors.
3507 Value *Shadow =
3508 IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
3509
3510 // The return type might have more elements than the input.
3511 // Extend the return type back to its original width if necessary.
3512 Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
3513
3514 setShadow(&I, FullShadow);
3515 setOriginForNaryOp(I);
3516 }
3517
3518 // Instrument x86 SSE vector convert intrinsic.
3519 //
3520 // This function instruments intrinsics like cvtsi2ss:
3521 // %Out = int_xxx_cvtyyy(%ConvertOp)
3522 // or
3523 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
3524 // Intrinsic converts \p NumUsedElements elements of \p ConvertOp to the same
3525 // number of \p Out elements, and (if it has 2 arguments) copies the rest of the
3526 // elements from \p CopyOp.
3527 // In most cases the conversion involves a floating-point value which may trigger a
3528 // hardware exception when not fully initialized. For this reason we require
3529 // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
3530 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3531 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3532 // return a fully initialized value.
3533 //
3534 // For Arm NEON vector convert intrinsics, see
3535 // handleNEONVectorConvertIntrinsic().
3536 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3537 bool HasRoundingMode = false) {
3538 IRBuilder<> IRB(&I);
3539 Value *CopyOp, *ConvertOp;
3540
3541 assert((!HasRoundingMode ||
3542 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3543 "Invalid rounding mode");
3544
3545 switch (I.arg_size() - HasRoundingMode) {
3546 case 2:
3547 CopyOp = I.getArgOperand(0);
3548 ConvertOp = I.getArgOperand(1);
3549 break;
3550 case 1:
3551 ConvertOp = I.getArgOperand(0);
3552 CopyOp = nullptr;
3553 break;
3554 default:
3555 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3556 }
3557
3558 // The first *NumUsedElements* elements of ConvertOp are converted to the
3559 // same number of output elements. The rest of the output is copied from
3560 // CopyOp, or (if not available) filled with zeroes.
3561 // Combine shadow for elements of ConvertOp that are used in this operation,
3562 // and insert a check.
3563 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3564 // int->any conversion.
3565 Value *ConvertShadow = getShadow(ConvertOp);
3566 Value *AggShadow = nullptr;
3567 if (ConvertOp->getType()->isVectorTy()) {
3568 AggShadow = IRB.CreateExtractElement(
3569 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3570 for (int i = 1; i < NumUsedElements; ++i) {
3571 Value *MoreShadow = IRB.CreateExtractElement(
3572 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3573 AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3574 }
3575 } else {
3576 AggShadow = ConvertShadow;
3577 }
3578 assert(AggShadow->getType()->isIntegerTy());
3579 insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
3580
3581 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3582 // ConvertOp.
3583 if (CopyOp) {
3584 assert(CopyOp->getType() == I.getType());
3585 assert(CopyOp->getType()->isVectorTy());
3586 Value *ResultShadow = getShadow(CopyOp);
3587 Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3588 for (int i = 0; i < NumUsedElements; ++i) {
3589 ResultShadow = IRB.CreateInsertElement(
3590 ResultShadow, ConstantInt::getNullValue(EltTy),
3591 ConstantInt::get(IRB.getInt32Ty(), i));
3592 }
3593 setShadow(&I, ResultShadow);
3594 setOrigin(&I, getOrigin(CopyOp));
3595 } else {
3596 setShadow(&I, getCleanShadow(&I));
3597 setOrigin(&I, getCleanOrigin());
3598 }
3599 }
3600
3601 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3602 // zeroes if it is zero, and all ones otherwise.
3603 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3604 if (S->getType()->isVectorTy())
3605 S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3606 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3607 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3608 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3609 }
3610
3611 // Given a vector, extract its first element, and return all
3612 // zeroes if it is zero, and all ones otherwise.
3613 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3614 Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3615 Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3616 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3617 }
3618
3619 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3620 Type *T = S->getType();
3621 assert(T->isVectorTy());
3622 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3623 return IRB.CreateSExt(S2, T);
3624 }
3625
3626 // Instrument vector shift intrinsic.
3627 //
3628 // This function instruments intrinsics like int_x86_avx2_psll_w.
3629 // Intrinsic shifts %In by %ShiftSize bits.
3630 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3631 // size, and the rest is ignored. Behavior is defined even if shift size is
3632 // greater than register (or field) width.
3633 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3634 assert(I.arg_size() == 2);
3635 IRBuilder<> IRB(&I);
3636 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3637 // Otherwise perform the same shift on S1.
3638 Value *S1 = getShadow(&I, 0);
3639 Value *S2 = getShadow(&I, 1);
3640 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3641 : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3642 Value *V1 = I.getOperand(0);
3643 Value *V2 = I.getOperand(1);
3644 Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3645 {IRB.CreateBitCast(S1, V1->getType()), V2});
3646 Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3647 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3648 setOriginForNaryOp(I);
3649 }
3650
3651 // Get an MMX-sized (64-bit) vector type, or optionally, other sized
3652 // vectors.
3653 Type *getMMXVectorTy(unsigned EltSizeInBits,
3654 unsigned X86_MMXSizeInBits = 64) {
3655 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3656 "Illegal MMX vector element size");
3657 return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3658 X86_MMXSizeInBits / EltSizeInBits);
3659 }
3660
3661 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3662 // intrinsic.
3663 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3664 switch (id) {
3665 case Intrinsic::x86_sse2_packsswb_128:
3666 case Intrinsic::x86_sse2_packuswb_128:
3667 return Intrinsic::x86_sse2_packsswb_128;
3668
3669 case Intrinsic::x86_sse2_packssdw_128:
3670 case Intrinsic::x86_sse41_packusdw:
3671 return Intrinsic::x86_sse2_packssdw_128;
3672
3673 case Intrinsic::x86_avx2_packsswb:
3674 case Intrinsic::x86_avx2_packuswb:
3675 return Intrinsic::x86_avx2_packsswb;
3676
3677 case Intrinsic::x86_avx2_packssdw:
3678 case Intrinsic::x86_avx2_packusdw:
3679 return Intrinsic::x86_avx2_packssdw;
3680
3681 case Intrinsic::x86_mmx_packsswb:
3682 case Intrinsic::x86_mmx_packuswb:
3683 return Intrinsic::x86_mmx_packsswb;
3684
3685 case Intrinsic::x86_mmx_packssdw:
3686 return Intrinsic::x86_mmx_packssdw;
3687 default:
3688 llvm_unreachable("unexpected intrinsic id");
3689 }
3690 }
3691
3692 // Instrument vector pack intrinsic.
3693 //
3694 // This function instruments intrinsics like x86_mmx_packsswb, that
3695 // packs elements of 2 input vectors into half as many bits with saturation.
3696 // Shadow is propagated with the signed variant of the same intrinsic applied
3697 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3698 // MMXEltSizeInBits is used only for x86mmx arguments.
3699 void handleVectorPackIntrinsic(IntrinsicInst &I,
3700 unsigned MMXEltSizeInBits = 0) {
3701 assert(I.arg_size() == 2);
3702 IRBuilder<> IRB(&I);
3703 Value *S1 = getShadow(&I, 0);
3704 Value *S2 = getShadow(&I, 1);
3705 assert(S1->getType()->isVectorTy());
3706
3707 // SExt and ICmpNE below must apply to individual elements of input vectors.
3708 // In case of x86mmx arguments, cast them to appropriate vector types and
3709 // back.
3710 Type *T =
3711 MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
3712 if (MMXEltSizeInBits) {
3713 S1 = IRB.CreateBitCast(S1, T);
3714 S2 = IRB.CreateBitCast(S2, T);
3715 }
3716 Value *S1_ext =
3717 IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
3718 Value *S2_ext =
3719 IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
3720 if (MMXEltSizeInBits) {
3721 S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
3722 S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
3723 }
3724
3725 Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
3726 {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3727 "_msprop_vector_pack");
3728 if (MMXEltSizeInBits)
3729 S = IRB.CreateBitCast(S, getShadowTy(&I));
3730 setShadow(&I, S);
3731 setOriginForNaryOp(I);
3732 }
3733
3734 // Convert `Mask` into `<n x i1>`.
3735 Constant *createDppMask(unsigned Width, unsigned Mask) {
3736 SmallVector<Constant *, 8> R(Width);
3737 for (auto &M : R) {
3738 M = ConstantInt::getBool(F.getContext(), Mask & 1);
3739 Mask >>= 1;
3740 }
3741 return ConstantVector::get(R);
3742 }
3743
3744 // Calculate output shadow as array of booleans `<n x i1>`, assuming if any
3745 // arg is poisoned, entire dot product is poisoned.
3746 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3747 unsigned DstMask) {
3748 const unsigned Width =
3749 cast<FixedVectorType>(S->getType())->getNumElements();
3750
3751 S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
3752 Constant::getNullValue(S->getType()));
3753 Value *SElem = IRB.CreateOrReduce(S);
3754 Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
3755 Value *DstMaskV = createDppMask(Width, DstMask);
3756
3757 return IRB.CreateSelect(
3758 IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
3759 }
3760
3761 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3762 //
3763 // The 2 and 4 element versions produce a single scalar dot product and then
3764 // put it into the elements of the output vector selected by the 4 lowest bits
3765 // of the mask. The top 4 bits of the mask control which elements of the input
3766 // to use for the dot product.
3767 //
3768 // The 8 element version's mask still has only 4 bits for the input and 4 bits
3769 // for the output mask. According to the spec it simply operates as the 4
3770 // element version on the first 4 elements of the inputs and output, and then
3771 // on the last 4 elements of the inputs and output.
3772 void handleDppIntrinsic(IntrinsicInst &I) {
3773 IRBuilder<> IRB(&I);
3774
3775 Value *S0 = getShadow(&I, 0);
3776 Value *S1 = getShadow(&I, 1);
3777 Value *S = IRB.CreateOr(S0, S1);
3778
3779 const unsigned Width =
3780 cast<FixedVectorType>(S->getType())->getNumElements();
3781 assert(Width == 2 || Width == 4 || Width == 8);
3782
3783 const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3784 const unsigned SrcMask = Mask >> 4;
3785 const unsigned DstMask = Mask & 0xf;
3786
3787 // Calculate shadow as `<n x i1>`.
3788 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3789 if (Width == 8) {
3790 // The first 4 elements of the shadow are already calculated.
3791 // findDppPoisonedOutput operates on 32-bit masks, so we can just shift the masks and repeat.
3792 SI1 = IRB.CreateOr(
3793 SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
3794 }
3795 // Extend to real size of shadow, poisoning either all or none bits of an
3796 // element.
3797 S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
3798
3799 setShadow(&I, S);
3800 setOriginForNaryOp(I);
3801 }
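// Worked example (added for exposition; not part of the upstream source):
// for a 4-element dpps with mask 0x51, SrcMask = 0x5 selects input elements
// 0 and 2 and DstMask = 0x1 selects output element 0. If any shadow bit of
// input elements 0 or 2 is set, findDppPoisonedOutput yields <1, 0, 0, 0>
// and the sext marks output element 0 as fully poisoned; shadow in the
// unused elements 1 and 3 is ignored because the select replaces it with
// zero before the or-reduction.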
3802
3803 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3804 C = CreateAppToShadowCast(IRB, C);
3805 FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
3806 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3807 C = IRB.CreateAShr(C, ElSize - 1);
3808 FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
3809 return IRB.CreateTrunc(C, FVT);
3810 }
3811
3812 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3813 void handleBlendvIntrinsic(IntrinsicInst &I) {
3814 Value *C = I.getOperand(2);
3815 Value *T = I.getOperand(1);
3816 Value *F = I.getOperand(0);
3817
3818 Value *Sc = getShadow(&I, 2);
3819 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
3820
3821 {
3822 IRBuilder<> IRB(&I);
3823 // Extract top bit from condition and its shadow.
3824 C = convertBlendvToSelectMask(IRB, C);
3825 Sc = convertBlendvToSelectMask(IRB, Sc);
3826
3827 setShadow(C, Sc);
3828 setOrigin(C, Oc);
3829 }
3830
3831 handleSelectLikeInst(I, C, T, F);
3832 }
3833
3834 // Instrument sum-of-absolute-differences intrinsic.
3835 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3836 const unsigned SignificantBitsPerResultElement = 16;
3837 Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
3838 unsigned ZeroBitsPerResultElement =
3839 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3840
3841 IRBuilder<> IRB(&I);
3842 auto *Shadow0 = getShadow(&I, 0);
3843 auto *Shadow1 = getShadow(&I, 1);
3844 Value *S = IRB.CreateOr(Shadow0, Shadow1);
3845 S = IRB.CreateBitCast(S, ResTy);
3846 S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3847 ResTy);
3848 S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3849 S = IRB.CreateBitCast(S, getShadowTy(&I));
3850 setShadow(&I, S);
3851 setOriginForNaryOp(I);
3852 }
3853
3854 // Instrument multiply-add(-accumulate)? intrinsics.
3855 //
3856 // e.g., Two operands:
3857 // <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
3858 //
3859 // Two operands which require an EltSizeInBits override:
3860 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
3861 //
3862 // Three operands:
3863 // <4 x i32> @llvm.x86.avx512.vpdpbusd.128
3864 // (<4 x i32> %s, <16 x i8> %a, <16 x i8> %b)
3865 // (this is equivalent to multiply-add on %a and %b, followed by
3866 // adding/"accumulating" %s. "Accumulation" stores the result in one
3867 // of the source registers, but this accumulate vs. add distinction
3868 // is lost when dealing with LLVM intrinsics.)
3869 void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
3870 unsigned EltSizeInBits = 0) {
3871 IRBuilder<> IRB(&I);
3872
3873 [[maybe_unused]] FixedVectorType *ReturnType =
3874 cast<FixedVectorType>(I.getType());
3875 assert(isa<FixedVectorType>(ReturnType));
3876
3877 // Vectors A and B, and shadows
3878 Value *Va = nullptr;
3879 Value *Vb = nullptr;
3880 Value *Sa = nullptr;
3881 Value *Sb = nullptr;
3882
3883 assert(I.arg_size() == 2 || I.arg_size() == 3);
3884 if (I.arg_size() == 2) {
3885 Va = I.getOperand(0);
3886 Vb = I.getOperand(1);
3887
3888 Sa = getShadow(&I, 0);
3889 Sb = getShadow(&I, 1);
3890 } else if (I.arg_size() == 3) {
3891 // Operand 0 is the accumulator. We will deal with that below.
3892 Va = I.getOperand(1);
3893 Vb = I.getOperand(2);
3894
3895 Sa = getShadow(&I, 1);
3896 Sb = getShadow(&I, 2);
3897 }
3898
3899 FixedVectorType *ParamType = cast<FixedVectorType>(Va->getType());
3900 assert(ParamType == Vb->getType());
3901
3902 assert(ParamType->getPrimitiveSizeInBits() ==
3903 ReturnType->getPrimitiveSizeInBits());
3904
3905 if (I.arg_size() == 3) {
3906 [[maybe_unused]] auto *AccumulatorType =
3907 cast<FixedVectorType>(I.getOperand(0)->getType());
3908 assert(AccumulatorType == ReturnType);
3909 }
3910
3911 FixedVectorType *ImplicitReturnType = ReturnType;
3912 // Step 1: instrument multiplication of corresponding vector elements
3913 if (EltSizeInBits) {
3914 ImplicitReturnType = cast<FixedVectorType>(
3915 getMMXVectorTy(EltSizeInBits * ReductionFactor,
3916 ParamType->getPrimitiveSizeInBits()));
3917 ParamType = cast<FixedVectorType>(
3918 getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
3919
3920 Va = IRB.CreateBitCast(Va, ParamType);
3921 Vb = IRB.CreateBitCast(Vb, ParamType);
3922
3923 Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
3924 Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
3925 } else {
3926 assert(ParamType->getNumElements() ==
3927 ReturnType->getNumElements() * ReductionFactor);
3928 }
3929
3930 // Multiplying an *initialized* zero by an uninitialized element results in
3931 // an initialized zero element.
3932 //
3933 // This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
3934 // results in an unpoisoned value. We can therefore adapt the visitAnd()
3935 // instrumentation:
3936 // OutShadow = (SaNonZero & SbNonZero)
3937 // | (VaNonZero & SbNonZero)
3938 // | (SaNonZero & VbNonZero)
3939 // where non-zero is checked on a per-element basis (not per bit).
3940 Value *SZero = Constant::getNullValue(Va->getType());
3941 Value *VZero = Constant::getNullValue(Sa->getType());
3942 Value *SaNonZero = IRB.CreateICmpNE(Sa, SZero);
3943 Value *SbNonZero = IRB.CreateICmpNE(Sb, SZero);
3944 Value *VaNonZero = IRB.CreateICmpNE(Va, VZero);
3945 Value *VbNonZero = IRB.CreateICmpNE(Vb, VZero);
3946
3947 Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
3948 Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
3949 Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
3950
3951 // Each element of the vector is represented by a single bit (poisoned or
3952 // not) e.g., <8 x i1>.
3953 Value *And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
3954
3955 // Extend <8 x i1> to <8 x i16>.
3956 // (The real pmadd intrinsic would have computed intermediate values of
3957 // <8 x i32>, but that is irrelevant for our shadow purposes because we
3958 // consider each element to be either fully initialized or fully
3959 // uninitialized.)
3960 And = IRB.CreateSExt(And, Sa->getType());
3961
3962 // Step 2: instrument horizontal add
3963 // We don't need bit-precise horizontalReduce because we only want to check
3964 // if each pair/quad of elements is fully zero.
3965 // Cast to <4 x i32>.
3966 Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
3967
3968 // Compute <4 x i1>, then extend back to <4 x i32>.
3969 Value *OutShadow = IRB.CreateSExt(
3970 IRB.CreateICmpNE(Horizontal,
3971 Constant::getNullValue(Horizontal->getType())),
3972 ImplicitReturnType);
3973
3974 // Cast it back to the required fake return type (if MMX: <1 x i64>; for
3975 // AVX, it is already correct).
3976 if (EltSizeInBits)
3977 OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
3978
3979 // Step 3 (if applicable): instrument accumulator
3980 if (I.arg_size() == 3)
3981 OutShadow = IRB.CreateOr(OutShadow, getShadow(&I, 0));
3982
3983 setShadow(&I, OutShadow);
3984 setOriginForNaryOp(I);
3985 }
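// Worked example (added for exposition; not part of the upstream source),
// for llvm.x86.sse2.pmadd.wd (<8 x i16> -> <4 x i32>, ReductionFactor = 2),
// looking at input lanes 0 and 1, which reduce into output lane 0:
//   a0 = 0 (initialized), b0 = uninitialized -> product is a known 0
//   a1 = 3, b1 = 7, both initialized         -> product fully known
// Per-lane shadow: in lane 0, Sa is clean and Va is zero, so all three terms
// above are false; in lane 1 both shadows are clean; the horizontal add over
// the two lanes is therefore clean and output lane 0 is reported initialized.
// If a0 were a nonzero initialized value, the (VaNonZero & SbNonZero) term
// would fire and output lane 0 would be poisoned.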
3986
3987 // Instrument compare-packed intrinsic.
3988 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
3989 // all-ones shadow.
3990 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
3991 IRBuilder<> IRB(&I);
3992 Type *ResTy = getShadowTy(&I);
3993 auto *Shadow0 = getShadow(&I, 0);
3994 auto *Shadow1 = getShadow(&I, 1);
3995 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
3996 Value *S = IRB.CreateSExt(
3997 IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
3998 setShadow(&I, S);
3999 setOriginForNaryOp(I);
4000 }
4001
4002 // Instrument compare-scalar intrinsic.
4003 // This handles both cmp* intrinsics which return the result in the first
4004 // element of a vector, and comi* which return the result as i32.
4005 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
4006 IRBuilder<> IRB(&I);
4007 auto *Shadow0 = getShadow(&I, 0);
4008 auto *Shadow1 = getShadow(&I, 1);
4009 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4010 Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
4011 setShadow(&I, S);
4012 setOriginForNaryOp(I);
4013 }
4014
4015 // Instrument generic vector reduction intrinsics
4016 // by ORing together all their fields.
4017 //
4018 // If AllowShadowCast is true, the return type does not need to be the same
4019 // type as the fields
4020 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
4021 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
4022 assert(I.arg_size() == 1);
4023
4024 IRBuilder<> IRB(&I);
4025 Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
4026 if (AllowShadowCast)
4027 S = CreateShadowCast(IRB, S, getShadowTy(&I));
4028 else
4029 assert(S->getType() == getShadowTy(&I));
4030 setShadow(&I, S);
4031 setOriginForNaryOp(I);
4032 }
4033
4034 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
4035 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
4036 // %a1)
4037 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
4038 //
4039 // The type of the return value, initial starting value, and elements of the
4040 // vector must be identical.
4041 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
4042 assert(I.arg_size() == 2);
4043
4044 IRBuilder<> IRB(&I);
4045 Value *Shadow0 = getShadow(&I, 0);
4046 Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
4047 assert(Shadow0->getType() == Shadow1->getType());
4048 Value *S = IRB.CreateOr(Shadow0, Shadow1);
4049 assert(S->getType() == getShadowTy(&I));
4050 setShadow(&I, S);
4051 setOriginForNaryOp(I);
4052 }
4053
4054 // Instrument vector.reduce.or intrinsic.
4055 // Valid (non-poisoned) set bits in the operand pull low the
4056 // corresponding shadow bits.
4057 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
4058 assert(I.arg_size() == 1);
4059
4060 IRBuilder<> IRB(&I);
4061 Value *OperandShadow = getShadow(&I, 0);
4062 Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
4063 Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
4064 // Bit N is clean if any field's bit N is 1 and unpoisoned
4065 Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
4066 // Otherwise, it is clean if every field's bit N is unpoisoned
4067 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4068 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4069
4070 setShadow(&I, S);
4071 setOrigin(&I, getOrigin(&I, 0));
4072 }
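// Worked example (added for exposition; not part of the upstream source),
// looking only at bit 0 of a <2 x i8> llvm.vector.reduce.or:
//   element 0 has a known 1, element 1 has an unknown bit 0
//     -> ~V | S has bit 0 clear for element 0, the and-reduce clears bit 0,
//        and the result's bit 0 is clean (a known 1 forces the OR to 1).
//   element 0 has a known 0, element 1 has an unknown bit 0
//     -> the and-reduce keeps bit 0 set and the or-reduce of shadows sets it
//        too, so the result's bit 0 is poisoned (it depends on the unknown).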
4073
4074 // Instrument vector.reduce.and intrinsic.
4075 // Valid (non-poisoned) unset bits in the operand pull down the
4076 // corresponding shadow bits.
4077 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
4078 assert(I.arg_size() == 1);
4079
4080 IRBuilder<> IRB(&I);
4081 Value *OperandShadow = getShadow(&I, 0);
4082 Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
4083 // Bit N is clean if any field's bit N is 0 and unpoisoned
4084 Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
4085 // Otherwise, it is clean if every field's bit N is unpoisoned
4086 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4087 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4088
4089 setShadow(&I, S);
4090 setOrigin(&I, getOrigin(&I, 0));
4091 }
4092
4093 void handleStmxcsr(IntrinsicInst &I) {
4094 IRBuilder<> IRB(&I);
4095 Value *Addr = I.getArgOperand(0);
4096 Type *Ty = IRB.getInt32Ty();
4097 Value *ShadowPtr =
4098 getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
4099
4100 IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
4101
4102 if (ClCheckAccessAddress)
4103 insertCheckShadowOf(Addr, &I);
4104 }
4105
4106 void handleLdmxcsr(IntrinsicInst &I) {
4107 if (!InsertChecks)
4108 return;
4109
4110 IRBuilder<> IRB(&I);
4111 Value *Addr = I.getArgOperand(0);
4112 Type *Ty = IRB.getInt32Ty();
4113 const Align Alignment = Align(1);
4114 Value *ShadowPtr, *OriginPtr;
4115 std::tie(ShadowPtr, OriginPtr) =
4116 getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
4117
4118 if (ClCheckAccessAddress)
4119 insertCheckShadowOf(Addr, &I);
4120
4121 Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
4122 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
4123 : getCleanOrigin();
4124 insertCheckShadow(Shadow, Origin, &I);
4125 }
4126
4127 void handleMaskedExpandLoad(IntrinsicInst &I) {
4128 IRBuilder<> IRB(&I);
4129 Value *Ptr = I.getArgOperand(0);
4130 MaybeAlign Align = I.getParamAlign(0);
4131 Value *Mask = I.getArgOperand(1);
4132 Value *PassThru = I.getArgOperand(2);
4133
4134 if (ClCheckAccessAddress) {
4135 insertCheckShadowOf(Ptr, &I);
4136 insertCheckShadowOf(Mask, &I);
4137 }
4138
4139 if (!PropagateShadow) {
4140 setShadow(&I, getCleanShadow(&I));
4141 setOrigin(&I, getCleanOrigin());
4142 return;
4143 }
4144
4145 Type *ShadowTy = getShadowTy(&I);
4146 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4147 auto [ShadowPtr, OriginPtr] =
4148 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
4149
4150 Value *Shadow =
4151 IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
4152 getShadow(PassThru), "_msmaskedexpload");
4153
4154 setShadow(&I, Shadow);
4155
4156 // TODO: Store origins.
4157 setOrigin(&I, getCleanOrigin());
4158 }
4159
4160 void handleMaskedCompressStore(IntrinsicInst &I) {
4161 IRBuilder<> IRB(&I);
4162 Value *Values = I.getArgOperand(0);
4163 Value *Ptr = I.getArgOperand(1);
4164 MaybeAlign Align = I.getParamAlign(1);
4165 Value *Mask = I.getArgOperand(2);
4166
4167 if (ClCheckAccessAddress) {
4168 insertCheckShadowOf(Ptr, &I);
4169 insertCheckShadowOf(Mask, &I);
4170 }
4171
4172 Value *Shadow = getShadow(Values);
4173 Type *ElementShadowTy =
4174 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4175 auto [ShadowPtr, OriginPtrs] =
4176 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
4177
4178 IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
4179
4180 // TODO: Store origins.
4181 }
4182
4183 void handleMaskedGather(IntrinsicInst &I) {
4184 IRBuilder<> IRB(&I);
4185 Value *Ptrs = I.getArgOperand(0);
4186 const Align Alignment(
4187 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4188 Value *Mask = I.getArgOperand(2);
4189 Value *PassThru = I.getArgOperand(3);
4190
4191 Type *PtrsShadowTy = getShadowTy(Ptrs);
4192 if (ClCheckAccessAddress) {
4193 insertCheckShadowOf(Mask, &I);
4194 Value *MaskedPtrShadow = IRB.CreateSelect(
4195 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4196 "_msmaskedptrs");
4197 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4198 }
4199
4200 if (!PropagateShadow) {
4201 setShadow(&I, getCleanShadow(&I));
4202 setOrigin(&I, getCleanOrigin());
4203 return;
4204 }
4205
4206 Type *ShadowTy = getShadowTy(&I);
4207 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4208 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4209 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
4210
4211 Value *Shadow =
4212 IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
4213 getShadow(PassThru), "_msmaskedgather");
4214
4215 setShadow(&I, Shadow);
4216
4217 // TODO: Store origins.
4218 setOrigin(&I, getCleanOrigin());
4219 }
4220
4221 void handleMaskedScatter(IntrinsicInst &I) {
4222 IRBuilder<> IRB(&I);
4223 Value *Values = I.getArgOperand(0);
4224 Value *Ptrs = I.getArgOperand(1);
4225 const Align Alignment(
4226 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4227 Value *Mask = I.getArgOperand(3);
4228
4229 Type *PtrsShadowTy = getShadowTy(Ptrs);
4230 if (ClCheckAccessAddress) {
4231 insertCheckShadowOf(Mask, &I);
4232 Value *MaskedPtrShadow = IRB.CreateSelect(
4233 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4234 "_msmaskedptrs");
4235 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4236 }
4237
4238 Value *Shadow = getShadow(Values);
4239 Type *ElementShadowTy =
4240 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4241 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4242 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
4243
4244 IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
4245
4246 // TODO: Store origin.
4247 }
4248
4249 // Intrinsic::masked_store
4250 //
4251 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4252 // stores are lowered to Intrinsic::masked_store.
4253 void handleMaskedStore(IntrinsicInst &I) {
4254 IRBuilder<> IRB(&I);
4255 Value *V = I.getArgOperand(0);
4256 Value *Ptr = I.getArgOperand(1);
4257 const Align Alignment(
4258 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4259 Value *Mask = I.getArgOperand(3);
4260 Value *Shadow = getShadow(V);
4261
4262 if (ClCheckAccessAddress) {
4263 insertCheckShadowOf(Ptr, &I);
4264 insertCheckShadowOf(Mask, &I);
4265 }
4266
4267 Value *ShadowPtr;
4268 Value *OriginPtr;
4269 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
4270 Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
4271
4272 IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
4273
4274 if (!MS.TrackOrigins)
4275 return;
4276
4277 auto &DL = F.getDataLayout();
4278 paintOrigin(IRB, getOrigin(V), OriginPtr,
4279 DL.getTypeStoreSize(Shadow->getType()),
4280 std::max(Alignment, kMinOriginAlignment));
4281 }
4282
4283 // Intrinsic::masked_load
4284 //
4285 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4286 // loads are lowered to Intrinsic::masked_load.
4287 void handleMaskedLoad(IntrinsicInst &I) {
4288 IRBuilder<> IRB(&I);
4289 Value *Ptr = I.getArgOperand(0);
4290 const Align Alignment(
4291 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4292 Value *Mask = I.getArgOperand(2);
4293 Value *PassThru = I.getArgOperand(3);
4294
4295 if (ClCheckAccessAddress) {
4296 insertCheckShadowOf(Ptr, &I);
4297 insertCheckShadowOf(Mask, &I);
4298 }
4299
4300 if (!PropagateShadow) {
4301 setShadow(&I, getCleanShadow(&I));
4302 setOrigin(&I, getCleanOrigin());
4303 return;
4304 }
4305
4306 Type *ShadowTy = getShadowTy(&I);
4307 Value *ShadowPtr, *OriginPtr;
4308 std::tie(ShadowPtr, OriginPtr) =
4309 getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4310 setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
4311 getShadow(PassThru), "_msmaskedld"));
4312
4313 if (!MS.TrackOrigins)
4314 return;
4315
4316 // Choose between PassThru's and the loaded value's origins.
4317 Value *MaskedPassThruShadow = IRB.CreateAnd(
4318 getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
4319
4320 Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
4321
4322 Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
4323 Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
4324
4325 setOrigin(&I, Origin);
4326 }
4327
4328 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4329 // dst mask src
4330 //
4331 // AVX512 masked stores are lowered to Intrinsic::masked_store and are handled
4332 // by handleMaskedStore.
4333 //
4334 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4335 // vector of integers, unlike the LLVM masked intrinsics, which require a
4336 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4337 // mentions that the x86 backend does not know how to efficiently convert
4338 // from a vector of booleans back into the AVX mask format; therefore, they
4339 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4340 // intrinsics.
4341 void handleAVXMaskedStore(IntrinsicInst &I) {
4342 assert(I.arg_size() == 3);
4343
4344 IRBuilder<> IRB(&I);
4345
4346 Value *Dst = I.getArgOperand(0);
4347 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4348
4349 Value *Mask = I.getArgOperand(1);
4350 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4351
4352 Value *Src = I.getArgOperand(2);
4353 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4354
4355 const Align Alignment = Align(1);
4356
4357 Value *SrcShadow = getShadow(Src);
4358
4359 if (ClCheckAccessAddress) {
4360 insertCheckShadowOf(Dst, &I);
4361 insertCheckShadowOf(Mask, &I);
4362 }
4363
4364 Value *DstShadowPtr;
4365 Value *DstOriginPtr;
4366 std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
4367 Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
4368
4369 SmallVector<Value *, 2> ShadowArgs;
4370 ShadowArgs.append(1, DstShadowPtr);
4371 ShadowArgs.append(1, Mask);
4372 // The intrinsic may require floating-point but shadows can be arbitrary
4373 // bit patterns, of which some would be interpreted as "invalid"
4374 // floating-point values (NaN etc.); we assume the intrinsic will happily
4375 // copy them.
4376 ShadowArgs.append(1, IRB.CreateBitCast(SrcShadow, Src->getType()));
4377
4378 CallInst *CI =
4379 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
4380 setShadow(&I, CI);
4381
4382 if (!MS.TrackOrigins)
4383 return;
4384
4385 // Approximation only
4386 auto &DL = F.getDataLayout();
4387 paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
4388 DL.getTypeStoreSize(SrcShadow->getType()),
4389 std::max(Alignment, kMinOriginAlignment));
4390 }
4391
4392 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4393 // return src mask
4394 //
4395 // Masked-off values are replaced with 0, which conveniently also represents
4396 // initialized memory.
4397 //
4398 // AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
4399 // by handleMaskedLoad.
4400 //
4401 // We do not combine this with handleMaskedLoad; see comment in
4402 // handleAVXMaskedStore for the rationale.
4403 //
4404 // This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
4405 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4406 // parameter.
4407 void handleAVXMaskedLoad(IntrinsicInst &I) {
4408 assert(I.arg_size() == 2);
4409
4410 IRBuilder<> IRB(&I);
4411
4412 Value *Src = I.getArgOperand(0);
4413 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4414
4415 Value *Mask = I.getArgOperand(1);
4416 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4417
4418 const Align Alignment = Align(1);
4419
4420 if (ClCheckAccessAddress) {
4421 insertCheckShadowOf(Mask, &I);
4422 }
4423
4424 Type *SrcShadowTy = getShadowTy(Src);
4425 Value *SrcShadowPtr, *SrcOriginPtr;
4426 std::tie(SrcShadowPtr, SrcOriginPtr) =
4427 getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
4428
4429 SmallVector<Value *, 2> ShadowArgs;
4430 ShadowArgs.append(1, SrcShadowPtr);
4431 ShadowArgs.append(1, Mask);
4432
4433 CallInst *CI =
4434 IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
4435 // The AVX masked load intrinsics do not have integer variants. We use the
4436 // floating-point variants, which will happily copy the shadows even if
4437 // they are interpreted as "invalid" floating-point values (NaN etc.).
4438 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4439
4440 if (!MS.TrackOrigins)
4441 return;
4442
4443 // The "pass-through" value is always zero (initialized). To the extent
4444 // that that results in initialized aligned 4-byte chunks, the origin value
4445 // is ignored. It is therefore correct to simply copy the origin from src.
4446 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
4447 setOrigin(&I, PtrSrcOrigin);
4448 }
4449
4450 // Test whether the mask indices are initialized, only checking the bits that
4451 // are actually used.
4452 //
4453 // e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
4454 // used/checked.
4455 void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
4456 assert(isFixedIntVector(Idx));
4457 auto IdxVectorSize =
4458 cast<FixedVectorType>(Idx->getType())->getNumElements();
4459 assert(isPowerOf2_64(IdxVectorSize));
4460
4461 // A constant index has a fully-clean shadow; skip the check (the compiler may not fold it away on its own).
4462 if (isa<Constant>(Idx))
4463 return;
4464
4465 auto *IdxShadow = getShadow(Idx);
4466 Value *Truncated = IRB.CreateTrunc(
4467 IdxShadow,
4468 FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
4469 IdxVectorSize));
4470 insertCheckShadow(Truncated, getOrigin(Idx), I);
4471 }
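// Worked example (added for exposition; not part of the upstream source):
// for a <16 x i8> index vector, only log2(16) = 4 bits of each index select
// a lane, so the shadow is truncated to <16 x i4> before the check; an index
// whose low 4 bits are initialized passes even if its ignored high bits are
// not.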
4472
4473 // Instrument AVX permutation intrinsic.
4474 // We apply the same permutation (argument index 1) to the shadow.
4475 void handleAVXVpermilvar(IntrinsicInst &I) {
4476 IRBuilder<> IRB(&I);
4477 Value *Shadow = getShadow(&I, 0);
4478 maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
4479
4480 // Shadows are integer-ish types but some intrinsics require a
4481 // different (e.g., floating-point) type.
4482 Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
4483 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4484 {Shadow, I.getArgOperand(1)});
4485
4486 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4487 setOriginForNaryOp(I);
4488 }
4489
4490 // Instrument AVX permutation intrinsic.
4491 // We apply the same permutation (argument index 1) to the shadows.
4492 void handleAVXVpermi2var(IntrinsicInst &I) {
4493 assert(I.arg_size() == 3);
4494 assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
4495 assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
4496 assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
4497 [[maybe_unused]] auto ArgVectorSize =
4498 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4499 assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
4500 ->getNumElements() == ArgVectorSize);
4501 assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
4502 ->getNumElements() == ArgVectorSize);
4503 assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
4504 assert(I.getType() == I.getArgOperand(0)->getType());
4505 assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
4506 IRBuilder<> IRB(&I);
4507 Value *AShadow = getShadow(&I, 0);
4508 Value *Idx = I.getArgOperand(1);
4509 Value *BShadow = getShadow(&I, 2);
4510
4511 maskedCheckAVXIndexShadow(IRB, Idx, &I);
4512
4513 // Shadows are integer-ish types but some intrinsics require a
4514 // different (e.g., floating-point) type.
4515 AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
4516 BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
4517 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4518 {AShadow, Idx, BShadow});
4519 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4520 setOriginForNaryOp(I);
4521 }
4522
4523 [[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
4524 return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
4525 }
4526
4527 [[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
4528 return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
4529 }
4530
4531 [[maybe_unused]] static bool isFixedIntVector(const Value *V) {
4532 return isFixedIntVectorTy(V->getType());
4533 }
4534
4535 [[maybe_unused]] static bool isFixedFPVector(const Value *V) {
4536 return isFixedFPVectorTy(V->getType());
4537 }
4538
4539 // e.g., <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
4540 // (<16 x float> a, <16 x i32> writethru, i16 mask,
4541 // i32 rounding)
4542 //
4543 // Inconveniently, some similar intrinsics have a different operand order:
4544 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
4545 // (<16 x float> a, i32 rounding, <16 x i16> writethru,
4546 // i16 mask)
4547 //
4548 // If the return type has more elements than A, the excess elements are
4549 // zeroed (and the corresponding shadow is initialized).
4550 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
4551 // (<4 x float> a, i32 rounding, <8 x i16> writethru,
4552 // i8 mask)
4553 //
4554 // dst[i] = mask[i] ? convert(a[i]) : writethru[i]
4555 // dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
4556 // where all_or_nothing(x) is fully uninitialized if x has any
4557 // uninitialized bits
4558 void handleAVX512VectorConvertFPToInt(IntrinsicInst &I, bool LastMask) {
4559 IRBuilder<> IRB(&I);
4560
4561 assert(I.arg_size() == 4);
4562 Value *A = I.getOperand(0);
4563 Value *WriteThrough;
4564 Value *Mask;
4565 Value *RoundingMode;
4566 if (LastMask) {
4567 WriteThrough = I.getOperand(2);
4568 Mask = I.getOperand(3);
4569 RoundingMode = I.getOperand(1);
4570 } else {
4571 WriteThrough = I.getOperand(1);
4572 Mask = I.getOperand(2);
4573 RoundingMode = I.getOperand(3);
4574 }
4575
4576 assert(isFixedFPVector(A));
4577 assert(isFixedIntVector(WriteThrough));
4578
4579 unsigned ANumElements =
4580 cast<FixedVectorType>(A->getType())->getNumElements();
4581 [[maybe_unused]] unsigned WriteThruNumElements =
4582 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4583 assert(ANumElements == WriteThruNumElements ||
4584 ANumElements * 2 == WriteThruNumElements);
4585
4586 assert(Mask->getType()->isIntegerTy());
4587 unsigned MaskNumElements = Mask->getType()->getScalarSizeInBits();
4588 assert(ANumElements == MaskNumElements ||
4589 ANumElements * 2 == MaskNumElements);
4590
4591 assert(WriteThruNumElements == MaskNumElements);
4592
4593 // Some bits of the mask may be unused, though it's unusual to have partly
4594 // uninitialized bits.
4595 insertCheckShadowOf(Mask, &I);
4596
4597 assert(RoundingMode->getType()->isIntegerTy());
4598 // Only some bits of the rounding mode are used, though it's very
4599 // unusual to have uninitialized bits there (more commonly, it's a
4600 // constant).
4601 insertCheckShadowOf(RoundingMode, &I);
4602
4603 assert(I.getType() == WriteThrough->getType());
4604
4605 Value *AShadow = getShadow(A);
4606 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4607
4608 if (ANumElements * 2 == MaskNumElements) {
4609 // Ensure that the irrelevant bits of the mask are zero, hence selecting
4610 // from the zeroed shadow instead of the writethrough's shadow.
4611 Mask =
4612 IRB.CreateTrunc(Mask, IRB.getIntNTy(ANumElements), "_ms_mask_trunc");
4613 Mask =
4614 IRB.CreateZExt(Mask, IRB.getIntNTy(MaskNumElements), "_ms_mask_zext");
4615 }
4616
4617 // Convert i16 mask to <16 x i1>
4618 Mask = IRB.CreateBitCast(
4619 Mask, FixedVectorType::get(IRB.getInt1Ty(), MaskNumElements),
4620 "_ms_mask_bitcast");
4621
4622 /// For floating-point to integer conversion, the output is:
4623 /// - fully uninitialized if *any* bit of the input is uninitialized
4624 /// - fully initialized if all bits of the input are initialized
4625 /// We apply the same principle on a per-element basis for vectors.
4626 ///
4627 /// We use the scalar width of the return type instead of A's.
4628 AShadow = IRB.CreateSExt(
4629 IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow->getType())),
4630 getShadowTy(&I), "_ms_a_shadow");
4631
4632 Value *WriteThroughShadow = getShadow(WriteThrough);
4633 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow,
4634 "_ms_writethru_select");
4635
4636 setShadow(&I, Shadow);
4637 setOriginForNaryOp(I);
4638 }
4639
4640 // Instrument BMI / BMI2 intrinsics.
4641 // All of these intrinsics are Z = I(X, Y)
4642 // where the types of all operands and the result match, and are either i32 or
4643 // i64. The following instrumentation happens to work for all of them:
4644 // Sz = I(Sx, Y) | (sext (Sy != 0))
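  // This works because, roughly, these operations move or mask the bits of X
  // purely according to the control operand Y; applying the same operation to
  // Sx therefore relocates X's shadow bits to the output positions they
  // influence, while any uninitialized bit of Y can affect every output bit,
  // hence the sext term.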
4645 void handleBmiIntrinsic(IntrinsicInst &I) {
4646 IRBuilder<> IRB(&I);
4647 Type *ShadowTy = getShadowTy(&I);
4648
4649 // If any bit of the mask operand is poisoned, then the whole thing is.
4650 Value *SMask = getShadow(&I, 1);
4651 SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
4652 ShadowTy);
4653 // Apply the same intrinsic to the shadow of the first operand.
4654 Value *S = IRB.CreateCall(I.getCalledFunction(),
4655 {getShadow(&I, 0), I.getOperand(1)});
4656 S = IRB.CreateOr(SMask, S);
4657 setShadow(&I, S);
4658 setOriginForNaryOp(I);
4659 }
4660
4661 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4662 SmallVector<int, 8> Mask;
4663 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4664 Mask.append(2, X);
4665 }
4666 return Mask;
4667 }
4668
4669 // Instrument pclmul intrinsics.
4670 // These intrinsics operate either on odd or on even elements of the input
4671 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4672 // Replace the unused elements with copies of the used ones, ex:
4673 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4674 // or
4675 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4676 // and then apply the usual shadow combining logic.
4677 void handlePclmulIntrinsic(IntrinsicInst &I) {
4678 IRBuilder<> IRB(&I);
4679 unsigned Width =
4680 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4681 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4682 "pclmul 3rd operand must be a constant");
4683 unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
4684 Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
4685 getPclmulMask(Width, Imm & 0x01));
4686 Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
4687 getPclmulMask(Width, Imm & 0x10));
4688 ShadowAndOriginCombiner SOC(this, IRB);
4689 SOC.Add(Shuf0, getOrigin(&I, 0));
4690 SOC.Add(Shuf1, getOrigin(&I, 1));
4691 SOC.Done(&I);
4692 }
4693
4694 // Instrument _mm_*_sd|ss intrinsics
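  // e.g., <2 x double> @llvm.x86.sse41.round.sd(<2 x double> %a,
  //                                             <2 x double> %b, i32 %imm)
  // computes dst[0] from %b[0] and copies dst[1] from %a[1], so the result
  // shadow takes element 0 from %b's shadow and the remaining elements from
  // %a's shadow.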
4695 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4696 IRBuilder<> IRB(&I);
4697 unsigned Width =
4698 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4699 Value *First = getShadow(&I, 0);
4700 Value *Second = getShadow(&I, 1);
4701 // First element of second operand, remaining elements of first operand
4702 SmallVector<int, 16> Mask;
4703 Mask.push_back(Width);
4704 for (unsigned i = 1; i < Width; i++)
4705 Mask.push_back(i);
4706 Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
4707
4708 setShadow(&I, Shadow);
4709 setOriginForNaryOp(I);
4710 }
4711
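  // Instrument x86 vtest/ptest intrinsics
  // e.g., i32 @llvm.x86.sse41.ptestz(<2 x i64>, <2 x i64>)
  // The scalar result depends on every bit of both operands, so its shadow is
  // 1 if any bit of either operand's shadow is set, and 0 otherwise.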
4712 void handleVtestIntrinsic(IntrinsicInst &I) {
4713 IRBuilder<> IRB(&I);
4714 Value *Shadow0 = getShadow(&I, 0);
4715 Value *Shadow1 = getShadow(&I, 1);
4716 Value *Or = IRB.CreateOr(Shadow0, Shadow1);
4717 Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
4718 Value *Scalar = convertShadowToScalar(NZ, IRB);
4719 Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
4720
4721 setShadow(&I, Shadow);
4722 setOriginForNaryOp(I);
4723 }
4724
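  // Instrument _mm_{min,max}_sd|ss intrinsics: dst[0] = op(a[0], b[0]) and
  // dst[1..] = a[1..], so the shadow of element 0 is the OR of both operands'
  // element-0 shadows and the remaining elements take the first operand's
  // shadow.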
4725 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4726 IRBuilder<> IRB(&I);
4727 unsigned Width =
4728 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4729 Value *First = getShadow(&I, 0);
4730 Value *Second = getShadow(&I, 1);
4731 Value *OrShadow = IRB.CreateOr(First, Second);
4732 // First element of both OR'd together, remaining elements of first operand
4733 SmallVector<int, 16> Mask;
4734 Mask.push_back(Width);
4735 for (unsigned i = 1; i < Width; i++)
4736 Mask.push_back(i);
4737 Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
4738
4739 setShadow(&I, Shadow);
4740 setOriginForNaryOp(I);
4741 }
4742
4743 // _mm_round_pd / _mm_round_ps.
4744 // Similar to maybeHandleSimpleNomemIntrinsic except
4745 // the second argument is guaranteed to be a constant integer.
4746 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4747 assert(I.getArgOperand(0)->getType() == I.getType());
4748 assert(I.arg_size() == 2);
4749 assert(isa<ConstantInt>(I.getArgOperand(1)));
4750
4751 IRBuilder<> IRB(&I);
4752 ShadowAndOriginCombiner SC(this, IRB);
4753 SC.Add(I.getArgOperand(0));
4754 SC.Done(&I);
4755 }
4756
4757 // Instrument @llvm.abs intrinsic.
4758 //
4759 // e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
4760 // <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
4761 void handleAbsIntrinsic(IntrinsicInst &I) {
4762 assert(I.arg_size() == 2);
4763 Value *Src = I.getArgOperand(0);
4764 Value *IsIntMinPoison = I.getArgOperand(1);
4765
4766 assert(I.getType()->isIntOrIntVectorTy());
4767
4768 assert(Src->getType() == I.getType());
4769
4770 assert(IsIntMinPoison->getType()->isIntegerTy());
4771 assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
4772
4773 IRBuilder<> IRB(&I);
4774 Value *SrcShadow = getShadow(Src);
4775
4776 APInt MinVal =
4777 APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
4778 Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
4779 Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
4780
4781 Value *PoisonedShadow = getPoisonedShadow(Src);
4782 Value *PoisonedIfIntMinShadow =
4783 IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
4784 Value *Shadow =
4785 IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
4786
4787 setShadow(&I, Shadow);
4788 setOrigin(&I, getOrigin(&I, 0));
4789 }
4790
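  // Instrument llvm.is.fpclass: the i1 (or vector-of-i1) result is
  // uninitialized iff any bit of the corresponding operand element is
  // uninitialized.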
4791 void handleIsFpClass(IntrinsicInst &I) {
4792 IRBuilder<> IRB(&I);
4793 Value *Shadow = getShadow(&I, 0);
4794 setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
4795 setOrigin(&I, getOrigin(&I, 0));
4796 }
4797
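  // Instrument llvm.{s,u}{add,sub,mul}.with.overflow, which return
  // {result, i1 overflow}. Every bit of both operands can affect both fields,
  // so the result shadow is the OR of the operand shadows, and the overflow
  // bit's shadow is set iff that OR is nonzero.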
4798 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4799 IRBuilder<> IRB(&I);
4800 Value *Shadow0 = getShadow(&I, 0);
4801 Value *Shadow1 = getShadow(&I, 1);
4802 Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
4803 Value *ShadowElt1 =
4804 IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
4805
4806 Value *Shadow = PoisonValue::get(getShadowTy(&I));
4807 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
4808 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
4809
4810 setShadow(&I, Shadow);
4811 setOriginForNaryOp(I);
4812 }
4813
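  // Returns the shadow of element 0 (the "lower" element) of a fixed-width
  // vector value.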
4814 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4815 assert(isa<FixedVectorType>(V->getType()));
4816 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4817 Value *Shadow = getShadow(V);
4818 return IRB.CreateExtractElement(Shadow,
4819 ConstantInt::get(IRB.getInt32Ty(), 0));
4820 }
4821
4822 // Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
4823 //
4824 // e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
4825 // (<8 x i64>, <16 x i8>, i8)
4826 // A WriteThru Mask
4827 //
4828 // call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
4829 // (<16 x i32>, <16 x i8>, i16)
4830 //
4831 // Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
4832 // Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
4833 //
4834 // If Dst has more elements than A, the excess elements are zeroed (and the
4835 // corresponding shadow is initialized).
4836 //
4837 // Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
4838 // and is much faster than this handler.
4839 void handleAVX512VectorDownConvert(IntrinsicInst &I) {
4840 IRBuilder<> IRB(&I);
4841
4842 assert(I.arg_size() == 3);
4843 Value *A = I.getOperand(0);
4844 Value *WriteThrough = I.getOperand(1);
4845 Value *Mask = I.getOperand(2);
4846
4847 assert(isFixedIntVector(A));
4848 assert(isFixedIntVector(WriteThrough));
4849
4850 unsigned ANumElements =
4851 cast<FixedVectorType>(A->getType())->getNumElements();
4852 unsigned OutputNumElements =
4853 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4854 assert(ANumElements == OutputNumElements ||
4855 ANumElements * 2 == OutputNumElements);
4856
4857 assert(Mask->getType()->isIntegerTy());
4858 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4859 insertCheckShadowOf(Mask, &I);
4860
4861 assert(I.getType() == WriteThrough->getType());
4862
4863 // Widen the mask, if necessary, to have one bit per element of the output
4864 // vector.
4865 // We want the extra bits to have '1's, so that the CreateSelect will
4866 // select the values from AShadow instead of WriteThroughShadow ("maskless"
4867 // versions of the intrinsics are sometimes implemented using an all-1's
4868 // mask and an undefined value for WriteThroughShadow). We accomplish this
4869 // by using bitwise NOT before and after the ZExt.
4870 if (ANumElements != OutputNumElements) {
4871 Mask = IRB.CreateNot(Mask);
4872 Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
4873 "_ms_widen_mask");
4874 Mask = IRB.CreateNot(Mask);
4875 }
4876 Mask = IRB.CreateBitCast(
4877 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4878
4879 Value *AShadow = getShadow(A);
4880
4881 // The return type might have more elements than the input.
4882 // Temporarily shrink the return type's number of elements.
4883 VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
4884
4885 // PMOV truncates; PMOVS/PMOVUS uses signed/unsigned saturation.
4886 // This handler treats them all as truncation, which leads to some rare
4887 // false positives in the cases where the truncated bytes could
4888 // unambiguously saturate the value e.g., if A = ??????10 ????????
4889 // (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
4890 // fully defined, but the truncated byte is ????????.
4891 //
4892 // TODO: use GetMinMaxUnsigned() to handle saturation precisely.
4893 AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
4894 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4895
4896 Value *WriteThroughShadow = getShadow(WriteThrough);
4897
4898 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4899 setShadow(&I, Shadow);
4900 setOriginForNaryOp(I);
4901 }
4902
4903 // For sh.* compiler intrinsics:
4904 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
4905 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
4906 // A B WriteThru Mask RoundingMode
4907 //
4908 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
4909 // DstShadow[1..7] = AShadow[1..7]
4910 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
4911 IRBuilder<> IRB(&I);
4912
4913 assert(I.arg_size() == 5);
4914 Value *A = I.getOperand(0);
4915 Value *B = I.getOperand(1);
4916 Value *WriteThrough = I.getOperand(2);
4917 Value *Mask = I.getOperand(3);
4918 Value *RoundingMode = I.getOperand(4);
4919
4920 // Technically, we could probably just check whether the LSB is
4921 // initialized, but intuitively it feels like a partly uninitialized mask
4922 // is unintended, and we should warn the user immediately.
4923 insertCheckShadowOf(Mask, &I);
4924 insertCheckShadowOf(RoundingMode, &I);
4925
4926 assert(isa<FixedVectorType>(A->getType()));
4927 unsigned NumElements =
4928 cast<FixedVectorType>(A->getType())->getNumElements();
4929 assert(NumElements == 8);
4930 assert(A->getType() == B->getType());
4931 assert(B->getType() == WriteThrough->getType());
4932 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
4933 assert(RoundingMode->getType()->isIntegerTy());
4934
4935 Value *ALowerShadow = extractLowerShadow(IRB, A);
4936 Value *BLowerShadow = extractLowerShadow(IRB, B);
4937
4938 Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
4939
4940 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
4941
4942 Mask = IRB.CreateBitCast(
4943 Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
4944 Value *MaskLower =
4945 IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
4946
4947 Value *AShadow = getShadow(A);
4948 Value *DstLowerShadow =
4949 IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
4950 Value *DstShadow = IRB.CreateInsertElement(
4951 AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
4952 "_msprop");
4953
4954 setShadow(&I, DstShadow);
4955 setOriginForNaryOp(I);
4956 }
4957
4958 // Approximately handle AVX Galois Field Affine Transformation
4959 //
4960 // e.g.,
4961 // <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
4962 // <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
4963 // <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
4964 // Out A x b
4965 // where A and x are packed matrices, b is a vector,
4966 // Out = A * x + b in GF(2)
4967 //
4968 // Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
4969 // computation also includes a parity calculation.
4970 //
4971 // For the bitwise AND of bits V1 and V2, the exact shadow is:
4972 // Out_Shadow = (V1_Shadow & V2_Shadow)
4973 // | (V1 & V2_Shadow)
4974 // | (V1_Shadow & V2 )
4975 //
4976 // We approximate the shadow of gf2p8affineqb using:
4977 // Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
4978 // | gf2p8affineqb(x, A_shadow, 0)
4979 // | gf2p8affineqb(x_Shadow, A, 0)
4980 // | set1_epi8(b_Shadow)
4981 //
4982 // This approximation has false negatives: if an intermediate dot-product
4983 // contains an even number of 1's, the parity is 0.
4984 // It has no false positives.
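  // For instance, if exactly two uninitialized bits feed the same dot
  // product, their contributions to the parity cancel in GF(2) (1 xor 1 == 0),
  // so the computed shadow bit can be 0 even though the result bit depends on
  // uninitialized data.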
4985 void handleAVXGF2P8Affine(IntrinsicInst &I) {
4986 IRBuilder<> IRB(&I);
4987
4988 assert(I.arg_size() == 3);
4989 Value *A = I.getOperand(0);
4990 Value *X = I.getOperand(1);
4991 Value *B = I.getOperand(2);
4992
4993 assert(isFixedIntVector(A));
4994 assert(cast<VectorType>(A->getType())
4995 ->getElementType()
4996 ->getScalarSizeInBits() == 8);
4997
4998 assert(A->getType() == X->getType());
4999
5000 assert(B->getType()->isIntegerTy());
5001 assert(B->getType()->getScalarSizeInBits() == 8);
5002
5003 assert(I.getType() == A->getType());
5004
5005 Value *AShadow = getShadow(A);
5006 Value *XShadow = getShadow(X);
5007 Value *BZeroShadow = getCleanShadow(B);
5008
5009 CallInst *AShadowXShadow = IRB.CreateIntrinsic(
5010 I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
5011 CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5012 {X, AShadow, BZeroShadow});
5013 CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5014 {XShadow, A, BZeroShadow});
5015
5016 unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
5017 Value *BShadow = getShadow(B);
5018 Value *BBroadcastShadow = getCleanShadow(AShadow);
5019 // There is no LLVM IR intrinsic for _mm512_set1_epi8.
5020 // This loop generates a lot of LLVM IR, which we expect that CodeGen will
5021 // lower appropriately (e.g., VPBROADCASTB).
5022 // Besides, b is often a constant, in which case it is fully initialized.
5023 for (unsigned i = 0; i < NumElements; i++)
5024 BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
5025
5026 setShadow(&I, IRB.CreateOr(
5027 {AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
5028 setOriginForNaryOp(I);
5029 }
5030
5031 // Handle Arm NEON vector load intrinsics (vld*).
5032 //
5033 // The WithLane instructions (ld[234]lane) are similar to:
5034 // call {<4 x i32>, <4 x i32>, <4 x i32>}
5035 // @llvm.aarch64.neon.ld3lane.v4i32.p0
5036 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
5037 // %A)
5038 //
5039 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
5040 // to:
5041 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
5042 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
5043 unsigned int numArgs = I.arg_size();
5044
5045 // Return type is a struct of vectors of integers or floating-point
5046 assert(I.getType()->isStructTy());
5047 [[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
5048 assert(RetTy->getNumElements() > 0);
5049 assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
5050 RetTy->getElementType(0)->isFPOrFPVectorTy());
5051 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5052 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
5053
5054 if (WithLane) {
5055 // 2, 3 or 4 vectors, plus lane number, plus input pointer
5056 assert(4 <= numArgs && numArgs <= 6);
5057
5058 // Return type is a struct of the input vectors
5059 assert(RetTy->getNumElements() + 2 == numArgs);
5060 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5061 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
5062 } else {
5063 assert(numArgs == 1);
5064 }
5065
5066 IRBuilder<> IRB(&I);
5067
5068 SmallVector<Value *, 6> ShadowArgs;
5069 if (WithLane) {
5070 for (unsigned int i = 0; i < numArgs - 2; i++)
5071 ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
5072
5073 // Lane number, passed verbatim
5074 Value *LaneNumber = I.getArgOperand(numArgs - 2);
5075 ShadowArgs.push_back(LaneNumber);
5076
5077 // TODO: blend shadow of lane number into output shadow?
5078 insertCheckShadowOf(LaneNumber, &I);
5079 }
5080
5081 Value *Src = I.getArgOperand(numArgs - 1);
5082 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
5083
5084 Type *SrcShadowTy = getShadowTy(Src);
5085 auto [SrcShadowPtr, SrcOriginPtr] =
5086 getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
5087 ShadowArgs.push_back(SrcShadowPtr);
5088
5089 // The NEON vector load instructions handled by this function all have
5090 // integer variants. It is easier to use those rather than trying to cast
5091 // a struct of vectors of floats into a struct of vectors of integers.
5092 CallInst *CI =
5093 IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
5094 setShadow(&I, CI);
5095
5096 if (!MS.TrackOrigins)
5097 return;
5098
5099 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
5100 setOrigin(&I, PtrSrcOrigin);
5101 }
5102
5103 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
5104 /// and vst{2,3,4}lane).
5105 ///
5106 /// Arm NEON vector store intrinsics have the output address (pointer) as the
5107 /// last argument, with the initial arguments being the inputs (and lane
5108 /// number for vst{2,3,4}lane). They return void.
5109 ///
5110 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
5111 /// abcdabcdabcdabcd... into *outP
5112 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
5113 /// writes aaaa...bbbb...cccc...dddd... into *outP
5114 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
5115 /// These instructions can all be instrumented with essentially the same
5116 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
5117 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
5118 IRBuilder<> IRB(&I);
5119
5120 // Don't use getNumOperands() because it includes the callee
5121 int numArgOperands = I.arg_size();
5122
5123 // The last arg operand is the output (pointer)
5124 assert(numArgOperands >= 1);
5125 Value *Addr = I.getArgOperand(numArgOperands - 1);
5126 assert(Addr->getType()->isPointerTy());
5127 int skipTrailingOperands = 1;
5128
5129 if (ClCheckAccessAddress)
5130 insertCheckShadowOf(Addr, &I);
5131
5132 // Second-last operand is the lane number (for vst{2,3,4}lane)
5133 if (useLane) {
5134 skipTrailingOperands++;
5135 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
5136 assert(isa<IntegerType>(
5137 I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
5138 }
5139
5140 SmallVector<Value *, 8> ShadowArgs;
5141 // All the initial operands are the inputs
5142 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
5143 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
5144 Value *Shadow = getShadow(&I, i);
5145 ShadowArgs.append(1, Shadow);
5146 }
5147
5148 // MSan's getShadowTy assumes the LHS is the type we want the shadow for
5149 // e.g., for:
5150 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
5151 // we know the type of the output (and its shadow) is <16 x i8>.
5152 //
5153 // Arm NEON VST is unusual because the last argument is the output address:
5154 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
5155 // call void @llvm.aarch64.neon.st2.v16i8.p0
5156 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
5157 // and we have no type information about P's operand. We must manually
5158 // compute the type (<16 x i8> x 2).
5159 FixedVectorType *OutputVectorTy = FixedVectorType::get(
5160 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
5161 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
5162 (numArgOperands - skipTrailingOperands));
5163 Type *OutputShadowTy = getShadowTy(OutputVectorTy);
5164
5165 if (useLane)
5166 ShadowArgs.append(1,
5167 I.getArgOperand(numArgOperands - skipTrailingOperands));
5168
5169 Value *OutputShadowPtr, *OutputOriginPtr;
5170 // AArch64 NEON does not need alignment (unless OS requires it)
5171 std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
5172 Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
5173 ShadowArgs.append(1, OutputShadowPtr);
5174
5175 CallInst *CI =
5176 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
5177 setShadow(&I, CI);
5178
5179 if (MS.TrackOrigins) {
5180 // TODO: if we modelled the vst* instruction more precisely, we could
5181 // more accurately track the origins (e.g., if both inputs are
5182 // uninitialized for vst2, we currently blame the second input, even
5183 // though part of the output depends only on the first input).
5184 //
5185 // This is particularly imprecise for vst{2,3,4}lane, since only one
5186 // lane of each input is actually copied to the output.
5187 OriginCombiner OC(this, IRB);
5188 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
5189 OC.Add(I.getArgOperand(i));
5190
5191 const DataLayout &DL = F.getDataLayout();
5192 OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
5193 OutputOriginPtr);
5194 }
5195 }
5196
5197 /// Handle intrinsics by applying the intrinsic to the shadows.
5198 ///
5199 /// The trailing arguments are passed verbatim to the intrinsic, though any
5200 /// uninitialized trailing arguments can also taint the shadow e.g., for an
5201 /// intrinsic with one trailing verbatim argument:
5202 /// out = intrinsic(var1, var2, opType)
5203 /// we compute:
5204 /// shadow[out] =
5205 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
5206 ///
5207 /// Typically, shadowIntrinsicID will be specified by the caller to be
5208 /// I.getIntrinsicID(), but the caller can choose to replace it with another
5209 /// intrinsic of the same type.
5210 ///
5211 /// CAUTION: this assumes that the intrinsic will handle arbitrary
5212 /// bit-patterns (for example, if the intrinsic accepts floats for
5213 /// var1, we require that it doesn't care if inputs are NaNs).
5214 ///
5215 /// For example, this can be applied to the Arm NEON vector table intrinsics
5216 /// (tbl{1,2,3,4}).
5217 ///
5218 /// The origin is approximated using setOriginForNaryOp.
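  /// e.g., if this were applied to
  ///   <16 x i8> @llvm.aarch64.neon.tbl1(<16 x i8> %t, <16 x i8> %idx)
  /// with one trailing verbatim argument (the index), the formula above gives
  ///   shadow[out] = tbl1(shadow[%t], %idx) | shadow[%idx]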
5219 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
5220 Intrinsic::ID shadowIntrinsicID,
5221 unsigned int trailingVerbatimArgs) {
5222 IRBuilder<> IRB(&I);
5223
5224 assert(trailingVerbatimArgs < I.arg_size());
5225
5226 SmallVector<Value *, 8> ShadowArgs;
5227 // Don't use getNumOperands() because it includes the callee
5228 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
5229 Value *Shadow = getShadow(&I, i);
5230
5231 // Shadows are integer-ish types but some intrinsics require a
5232 // different (e.g., floating-point) type.
5233 ShadowArgs.push_back(
5234 IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
5235 }
5236
5237 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5238 i++) {
5239 Value *Arg = I.getArgOperand(i);
5240 ShadowArgs.push_back(Arg);
5241 }
5242
5243 CallInst *CI =
5244 IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
5245 Value *CombinedShadow = CI;
5246
5247 // Combine the computed shadow with the shadow of trailing args
5248 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5249 i++) {
5250 Value *Shadow =
5251 CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
5252 CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
5253 }
5254
5255 setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
5256
5257 setOriginForNaryOp(I);
5258 }
5259
5260 // Approximation only
5261 //
5262 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
5263 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
5264 assert(I.arg_size() == 2);
5265
5266 handleShadowOr(I);
5267 }
5268
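  // Instrument target-independent LLVM intrinsics. Returns true if the
  // intrinsic was handled, false if it should be handled elsewhere.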
5269 bool maybeHandleCrossPlatformIntrinsic(IntrinsicInst &I) {
5270 switch (I.getIntrinsicID()) {
5271 case Intrinsic::uadd_with_overflow:
5272 case Intrinsic::sadd_with_overflow:
5273 case Intrinsic::usub_with_overflow:
5274 case Intrinsic::ssub_with_overflow:
5275 case Intrinsic::umul_with_overflow:
5276 case Intrinsic::smul_with_overflow:
5277 handleArithmeticWithOverflow(I);
5278 break;
5279 case Intrinsic::abs:
5280 handleAbsIntrinsic(I);
5281 break;
5282 case Intrinsic::bitreverse:
5283 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
5284 /*trailingVerbatimArgs*/ 0);
5285 break;
5286 case Intrinsic::is_fpclass:
5287 handleIsFpClass(I);
5288 break;
5289 case Intrinsic::lifetime_start:
5290 handleLifetimeStart(I);
5291 break;
5292 case Intrinsic::launder_invariant_group:
5293 case Intrinsic::strip_invariant_group:
5294 handleInvariantGroup(I);
5295 break;
5296 case Intrinsic::bswap:
5297 handleBswap(I);
5298 break;
5299 case Intrinsic::ctlz:
5300 case Intrinsic::cttz:
5301 handleCountLeadingTrailingZeros(I);
5302 break;
5303 case Intrinsic::masked_compressstore:
5304 handleMaskedCompressStore(I);
5305 break;
5306 case Intrinsic::masked_expandload:
5307 handleMaskedExpandLoad(I);
5308 break;
5309 case Intrinsic::masked_gather:
5310 handleMaskedGather(I);
5311 break;
5312 case Intrinsic::masked_scatter:
5313 handleMaskedScatter(I);
5314 break;
5315 case Intrinsic::masked_store:
5316 handleMaskedStore(I);
5317 break;
5318 case Intrinsic::masked_load:
5319 handleMaskedLoad(I);
5320 break;
5321 case Intrinsic::vector_reduce_and:
5322 handleVectorReduceAndIntrinsic(I);
5323 break;
5324 case Intrinsic::vector_reduce_or:
5325 handleVectorReduceOrIntrinsic(I);
5326 break;
5327
5328 case Intrinsic::vector_reduce_add:
5329 case Intrinsic::vector_reduce_xor:
5330 case Intrinsic::vector_reduce_mul:
5331 // Signed/Unsigned Min/Max
5332 // TODO: handling similarly to AND/OR may be more precise.
5333 case Intrinsic::vector_reduce_smax:
5334 case Intrinsic::vector_reduce_smin:
5335 case Intrinsic::vector_reduce_umax:
5336 case Intrinsic::vector_reduce_umin:
5337 // TODO: this has no false positives, but arguably we should check that all
5338 // the bits are initialized.
5339 case Intrinsic::vector_reduce_fmax:
5340 case Intrinsic::vector_reduce_fmin:
5341 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
5342 break;
5343
5344 case Intrinsic::vector_reduce_fadd:
5345 case Intrinsic::vector_reduce_fmul:
5346 handleVectorReduceWithStarterIntrinsic(I);
5347 break;
5348
5349 case Intrinsic::scmp:
5350 case Intrinsic::ucmp: {
5351 handleShadowOr(I);
5352 break;
5353 }
5354
5355 case Intrinsic::fshl:
5356 case Intrinsic::fshr:
5357 handleFunnelShift(I);
5358 break;
5359
5360 case Intrinsic::is_constant:
5361 // The result of llvm.is.constant() is always defined.
5362 setShadow(&I, getCleanShadow(&I));
5363 setOrigin(&I, getCleanOrigin());
5364 break;
5365
5366 default:
5367 return false;
5368 }
5369
5370 return true;
5371 }
5372
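  // Instrument x86 SIMD and related intrinsics. Returns true if the intrinsic
  // was handled, false otherwise.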
5373 bool maybeHandleX86SIMDIntrinsic(IntrinsicInst &I) {
5374 switch (I.getIntrinsicID()) {
5375 case Intrinsic::x86_sse_stmxcsr:
5376 handleStmxcsr(I);
5377 break;
5378 case Intrinsic::x86_sse_ldmxcsr:
5379 handleLdmxcsr(I);
5380 break;
5381
5382 // Convert Scalar Double Precision Floating-Point Value
5383 // to Unsigned Doubleword Integer
5384 // etc.
5385 case Intrinsic::x86_avx512_vcvtsd2usi64:
5386 case Intrinsic::x86_avx512_vcvtsd2usi32:
5387 case Intrinsic::x86_avx512_vcvtss2usi64:
5388 case Intrinsic::x86_avx512_vcvtss2usi32:
5389 case Intrinsic::x86_avx512_cvttss2usi64:
5390 case Intrinsic::x86_avx512_cvttss2usi:
5391 case Intrinsic::x86_avx512_cvttsd2usi64:
5392 case Intrinsic::x86_avx512_cvttsd2usi:
5393 case Intrinsic::x86_avx512_cvtusi2ss:
5394 case Intrinsic::x86_avx512_cvtusi642sd:
5395 case Intrinsic::x86_avx512_cvtusi642ss:
5396 handleSSEVectorConvertIntrinsic(I, 1, true);
5397 break;
5398 case Intrinsic::x86_sse2_cvtsd2si64:
5399 case Intrinsic::x86_sse2_cvtsd2si:
5400 case Intrinsic::x86_sse2_cvtsd2ss:
5401 case Intrinsic::x86_sse2_cvttsd2si64:
5402 case Intrinsic::x86_sse2_cvttsd2si:
5403 case Intrinsic::x86_sse_cvtss2si64:
5404 case Intrinsic::x86_sse_cvtss2si:
5405 case Intrinsic::x86_sse_cvttss2si64:
5406 case Intrinsic::x86_sse_cvttss2si:
5407 handleSSEVectorConvertIntrinsic(I, 1);
5408 break;
5409 case Intrinsic::x86_sse_cvtps2pi:
5410 case Intrinsic::x86_sse_cvttps2pi:
5411 handleSSEVectorConvertIntrinsic(I, 2);
5412 break;
5413
5414 // TODO:
5415 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
5416 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
5417 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
5418
5419 case Intrinsic::x86_vcvtps2ph_128:
5420 case Intrinsic::x86_vcvtps2ph_256: {
5421 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
5422 break;
5423 }
5424
5425 // Convert Packed Single Precision Floating-Point Values
5426 // to Packed Signed Doubleword Integer Values
5427 //
5428 // <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
5429 // (<16 x float>, <16 x i32>, i16, i32)
5430 case Intrinsic::x86_avx512_mask_cvtps2dq_512:
5431 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/false);
5432 break;
5433
5434 // Convert Packed Double Precision Floating-Point Values
5435 // to Packed Single Precision Floating-Point Values
5436 case Intrinsic::x86_sse2_cvtpd2ps:
5437 case Intrinsic::x86_sse2_cvtps2dq:
5438 case Intrinsic::x86_sse2_cvtpd2dq:
5439 case Intrinsic::x86_sse2_cvttps2dq:
5440 case Intrinsic::x86_sse2_cvttpd2dq:
5441 case Intrinsic::x86_avx_cvt_pd2_ps_256:
5442 case Intrinsic::x86_avx_cvt_ps2dq_256:
5443 case Intrinsic::x86_avx_cvt_pd2dq_256:
5444 case Intrinsic::x86_avx_cvtt_ps2dq_256:
5445 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
5446 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
5447 break;
5448 }
5449
5450 // Convert Single-Precision FP Value to 16-bit FP Value
5451 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
5452 // (<16 x float>, i32, <16 x i16>, i16)
5453 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
5454 // (<4 x float>, i32, <8 x i16>, i8)
5455 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256
5456 // (<8 x float>, i32, <8 x i16>, i8)
5457 case Intrinsic::x86_avx512_mask_vcvtps2ph_512:
5458 case Intrinsic::x86_avx512_mask_vcvtps2ph_256:
5459 case Intrinsic::x86_avx512_mask_vcvtps2ph_128:
5460 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/true);
5461 break;
5462
5463 // Shift Packed Data (Left Logical, Right Arithmetic, Right Logical)
5464 case Intrinsic::x86_avx512_psll_w_512:
5465 case Intrinsic::x86_avx512_psll_d_512:
5466 case Intrinsic::x86_avx512_psll_q_512:
5467 case Intrinsic::x86_avx512_pslli_w_512:
5468 case Intrinsic::x86_avx512_pslli_d_512:
5469 case Intrinsic::x86_avx512_pslli_q_512:
5470 case Intrinsic::x86_avx512_psrl_w_512:
5471 case Intrinsic::x86_avx512_psrl_d_512:
5472 case Intrinsic::x86_avx512_psrl_q_512:
5473 case Intrinsic::x86_avx512_psra_w_512:
5474 case Intrinsic::x86_avx512_psra_d_512:
5475 case Intrinsic::x86_avx512_psra_q_512:
5476 case Intrinsic::x86_avx512_psrli_w_512:
5477 case Intrinsic::x86_avx512_psrli_d_512:
5478 case Intrinsic::x86_avx512_psrli_q_512:
5479 case Intrinsic::x86_avx512_psrai_w_512:
5480 case Intrinsic::x86_avx512_psrai_d_512:
5481 case Intrinsic::x86_avx512_psrai_q_512:
5482 case Intrinsic::x86_avx512_psra_q_256:
5483 case Intrinsic::x86_avx512_psra_q_128:
5484 case Intrinsic::x86_avx512_psrai_q_256:
5485 case Intrinsic::x86_avx512_psrai_q_128:
5486 case Intrinsic::x86_avx2_psll_w:
5487 case Intrinsic::x86_avx2_psll_d:
5488 case Intrinsic::x86_avx2_psll_q:
5489 case Intrinsic::x86_avx2_pslli_w:
5490 case Intrinsic::x86_avx2_pslli_d:
5491 case Intrinsic::x86_avx2_pslli_q:
5492 case Intrinsic::x86_avx2_psrl_w:
5493 case Intrinsic::x86_avx2_psrl_d:
5494 case Intrinsic::x86_avx2_psrl_q:
5495 case Intrinsic::x86_avx2_psra_w:
5496 case Intrinsic::x86_avx2_psra_d:
5497 case Intrinsic::x86_avx2_psrli_w:
5498 case Intrinsic::x86_avx2_psrli_d:
5499 case Intrinsic::x86_avx2_psrli_q:
5500 case Intrinsic::x86_avx2_psrai_w:
5501 case Intrinsic::x86_avx2_psrai_d:
5502 case Intrinsic::x86_sse2_psll_w:
5503 case Intrinsic::x86_sse2_psll_d:
5504 case Intrinsic::x86_sse2_psll_q:
5505 case Intrinsic::x86_sse2_pslli_w:
5506 case Intrinsic::x86_sse2_pslli_d:
5507 case Intrinsic::x86_sse2_pslli_q:
5508 case Intrinsic::x86_sse2_psrl_w:
5509 case Intrinsic::x86_sse2_psrl_d:
5510 case Intrinsic::x86_sse2_psrl_q:
5511 case Intrinsic::x86_sse2_psra_w:
5512 case Intrinsic::x86_sse2_psra_d:
5513 case Intrinsic::x86_sse2_psrli_w:
5514 case Intrinsic::x86_sse2_psrli_d:
5515 case Intrinsic::x86_sse2_psrli_q:
5516 case Intrinsic::x86_sse2_psrai_w:
5517 case Intrinsic::x86_sse2_psrai_d:
5518 case Intrinsic::x86_mmx_psll_w:
5519 case Intrinsic::x86_mmx_psll_d:
5520 case Intrinsic::x86_mmx_psll_q:
5521 case Intrinsic::x86_mmx_pslli_w:
5522 case Intrinsic::x86_mmx_pslli_d:
5523 case Intrinsic::x86_mmx_pslli_q:
5524 case Intrinsic::x86_mmx_psrl_w:
5525 case Intrinsic::x86_mmx_psrl_d:
5526 case Intrinsic::x86_mmx_psrl_q:
5527 case Intrinsic::x86_mmx_psra_w:
5528 case Intrinsic::x86_mmx_psra_d:
5529 case Intrinsic::x86_mmx_psrli_w:
5530 case Intrinsic::x86_mmx_psrli_d:
5531 case Intrinsic::x86_mmx_psrli_q:
5532 case Intrinsic::x86_mmx_psrai_w:
5533 case Intrinsic::x86_mmx_psrai_d:
5534 handleVectorShiftIntrinsic(I, /* Variable */ false);
5535 break;
5536 case Intrinsic::x86_avx2_psllv_d:
5537 case Intrinsic::x86_avx2_psllv_d_256:
5538 case Intrinsic::x86_avx512_psllv_d_512:
5539 case Intrinsic::x86_avx2_psllv_q:
5540 case Intrinsic::x86_avx2_psllv_q_256:
5541 case Intrinsic::x86_avx512_psllv_q_512:
5542 case Intrinsic::x86_avx2_psrlv_d:
5543 case Intrinsic::x86_avx2_psrlv_d_256:
5544 case Intrinsic::x86_avx512_psrlv_d_512:
5545 case Intrinsic::x86_avx2_psrlv_q:
5546 case Intrinsic::x86_avx2_psrlv_q_256:
5547 case Intrinsic::x86_avx512_psrlv_q_512:
5548 case Intrinsic::x86_avx2_psrav_d:
5549 case Intrinsic::x86_avx2_psrav_d_256:
5550 case Intrinsic::x86_avx512_psrav_d_512:
5551 case Intrinsic::x86_avx512_psrav_q_128:
5552 case Intrinsic::x86_avx512_psrav_q_256:
5553 case Intrinsic::x86_avx512_psrav_q_512:
5554 handleVectorShiftIntrinsic(I, /* Variable */ true);
5555 break;
5556
5557 case Intrinsic::x86_sse2_packsswb_128:
5558 case Intrinsic::x86_sse2_packssdw_128:
5559 case Intrinsic::x86_sse2_packuswb_128:
5560 case Intrinsic::x86_sse41_packusdw:
5561 case Intrinsic::x86_avx2_packsswb:
5562 case Intrinsic::x86_avx2_packssdw:
5563 case Intrinsic::x86_avx2_packuswb:
5564 case Intrinsic::x86_avx2_packusdw:
5565 handleVectorPackIntrinsic(I);
5566 break;
5567
5568 case Intrinsic::x86_sse41_pblendvb:
5569 case Intrinsic::x86_sse41_blendvpd:
5570 case Intrinsic::x86_sse41_blendvps:
5571 case Intrinsic::x86_avx_blendv_pd_256:
5572 case Intrinsic::x86_avx_blendv_ps_256:
5573 case Intrinsic::x86_avx2_pblendvb:
5574 handleBlendvIntrinsic(I);
5575 break;
5576
5577 case Intrinsic::x86_avx_dp_ps_256:
5578 case Intrinsic::x86_sse41_dppd:
5579 case Intrinsic::x86_sse41_dpps:
5580 handleDppIntrinsic(I);
5581 break;
5582
5583 case Intrinsic::x86_mmx_packsswb:
5584 case Intrinsic::x86_mmx_packuswb:
5585 handleVectorPackIntrinsic(I, 16);
5586 break;
5587
5588 case Intrinsic::x86_mmx_packssdw:
5589 handleVectorPackIntrinsic(I, 32);
5590 break;
5591
5592 case Intrinsic::x86_mmx_psad_bw:
5593 handleVectorSadIntrinsic(I, true);
5594 break;
5595 case Intrinsic::x86_sse2_psad_bw:
5596 case Intrinsic::x86_avx2_psad_bw:
5597 handleVectorSadIntrinsic(I);
5598 break;
5599
5600 // Multiply and Add Packed Words
5601 // < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
5602 // < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
5603 // <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>)
5604 //
5605 // Multiply and Add Packed Signed and Unsigned Bytes
5606 // < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
5607 // <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
5608 // <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
5609 //
5610 // These intrinsics are auto-upgraded into non-masked forms:
5611 // < 4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128
5612 // (<8 x i16>, <8 x i16>, <4 x i32>, i8)
5613 // < 8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256
5614 // (<16 x i16>, <16 x i16>, <8 x i32>, i8)
5615 // <16 x i32> @llvm.x86.avx512.mask.pmaddw.d.512
5616 // (<32 x i16>, <32 x i16>, <16 x i32>, i16)
5617 // < 8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128
5618 // (<16 x i8>, <16 x i8>, <8 x i16>, i8)
5619 // <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256
5620 // (<32 x i8>, <32 x i8>, <16 x i16>, i16)
5621 // <32 x i16> @llvm.x86.avx512.mask.pmaddubs.w.512
5622 // (<64 x i8>, <64 x i8>, <32 x i16>, i32)
5623 case Intrinsic::x86_sse2_pmadd_wd:
5624 case Intrinsic::x86_avx2_pmadd_wd:
5625 case Intrinsic::x86_avx512_pmaddw_d_512:
5626 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5627 case Intrinsic::x86_avx2_pmadd_ub_sw:
5628 case Intrinsic::x86_avx512_pmaddubs_w_512:
5629 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2);
5630 break;
5631
5632 // <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
5633 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5634 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/8);
5635 break;
5636
5637 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
5638 case Intrinsic::x86_mmx_pmadd_wd:
5639 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5640 break;
5641
5642 // AVX Vector Neural Network Instructions: bytes
5643 //
5644 // Multiply and Add Packed Signed and Unsigned Bytes
5645 // < 4 x i32> @llvm.x86.avx512.vpdpbusd.128
5646 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5647 // < 8 x i32> @llvm.x86.avx512.vpdpbusd.256
5648 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5649 // <16 x i32> @llvm.x86.avx512.vpdpbusd.512
5650 // (<16 x i32>, <64 x i8>, <64 x i8>)
5651 //
5652 // Multiply and Add Unsigned and Signed Bytes With Saturation
5653 // < 4 x i32> @llvm.x86.avx512.vpdpbusds.128
5654 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5655 // < 8 x i32> @llvm.x86.avx512.vpdpbusds.256
5656 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5657 // <16 x i32> @llvm.x86.avx512.vpdpbusds.512
5658 // (<16 x i32>, <64 x i8>, <64 x i8>)
5659 //
5660 // < 4 x i32> @llvm.x86.avx2.vpdpbssd.128
5661 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5662 // < 8 x i32> @llvm.x86.avx2.vpdpbssd.256
5663 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5664 //
5665 // < 4 x i32> @llvm.x86.avx2.vpdpbssds.128
5666 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5667 // < 8 x i32> @llvm.x86.avx2.vpdpbssds.256
5668 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5669 //
5670 // <16 x i32> @llvm.x86.avx10.vpdpbssd.512
5671 // (<16 x i32>, <16 x i32>, <16 x i32>)
5672 // <16 x i32> @llvm.x86.avx10.vpdpbssds.512
5673 // (<16 x i32>, <16 x i32>, <16 x i32>)
5674 //
5675 // These intrinsics are auto-upgraded into non-masked forms:
5676 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusd.128
5677 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5678 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusd.128
5679 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5680 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusd.256
5681 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5682 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusd.256
5683 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5684 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusd.512
5685 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5686 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusd.512
5687 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5688 //
5689 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusds.128
5690 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5691 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusds.128
5692 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5693 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusds.256
5694 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5695 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusds.256
5696 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5697 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusds.512
5698 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5699 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusds.512
5700 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5701 case Intrinsic::x86_avx512_vpdpbusd_128:
5702 case Intrinsic::x86_avx512_vpdpbusd_256:
5703 case Intrinsic::x86_avx512_vpdpbusd_512:
5704 case Intrinsic::x86_avx512_vpdpbusds_128:
5705 case Intrinsic::x86_avx512_vpdpbusds_256:
5706 case Intrinsic::x86_avx512_vpdpbusds_512:
5707 case Intrinsic::x86_avx2_vpdpbssd_128:
5708 case Intrinsic::x86_avx2_vpdpbssd_256:
5709 case Intrinsic::x86_avx2_vpdpbssds_128:
5710 case Intrinsic::x86_avx2_vpdpbssds_256:
5711 case Intrinsic::x86_avx10_vpdpbssd_512:
5712 case Intrinsic::x86_avx10_vpdpbssds_512:
5713 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/4, /*EltSize=*/8);
5714 break;
5715
5716 // AVX Vector Neural Network Instructions: words
5717 //
5718 // Multiply and Add Signed Word Integers
5719 // < 4 x i32> @llvm.x86.avx512.vpdpwssd.128
5720 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5721 // < 8 x i32> @llvm.x86.avx512.vpdpwssd.256
5722 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5723 // <16 x i32> @llvm.x86.avx512.vpdpwssd.512
5724 // (<16 x i32>, <16 x i32>, <16 x i32>)
5725 //
5726 // Multiply and Add Signed Word Integers With Saturation
5727 // < 4 x i32> @llvm.x86.avx512.vpdpwssds.128
5728 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5729 // < 8 x i32> @llvm.x86.avx512.vpdpwssds.256
5730 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5731 // <16 x i32> @llvm.x86.avx512.vpdpwssds.512
5732 // (<16 x i32>, <16 x i32>, <16 x i32>)
5733 //
5734 // These intrinsics are auto-upgraded into non-masked forms:
5735 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssd.128
5736 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5737 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssd.128
5738 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5739 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssd.256
5740 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5741 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssd.256
5742 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5743 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssd.512
5744 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5745 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssd.512
5746 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5747 //
5748 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssds.128
5749 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5750 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssds.128
5751 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5752 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssds.256
5753 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5754 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssds.256
5755 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5756 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssds.512
5757 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5758 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssds.512
5759 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5760 case Intrinsic::x86_avx512_vpdpwssd_128:
5761 case Intrinsic::x86_avx512_vpdpwssd_256:
5762 case Intrinsic::x86_avx512_vpdpwssd_512:
5763 case Intrinsic::x86_avx512_vpdpwssds_128:
5764 case Intrinsic::x86_avx512_vpdpwssds_256:
5765 case Intrinsic::x86_avx512_vpdpwssds_512:
5766 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5767 break;
5768
5769 // TODO: Dot Product of BF16 Pairs Accumulated Into Packed Single
5770 // Precision
5771 // <4 x float> @llvm.x86.avx512bf16.dpbf16ps.128
5772 // (<4 x float>, <8 x bfloat>, <8 x bfloat>)
5773 // <8 x float> @llvm.x86.avx512bf16.dpbf16ps.256
5774 // (<8 x float>, <16 x bfloat>, <16 x bfloat>)
5775 // <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512
5776 // (<16 x float>, <32 x bfloat>, <32 x bfloat>)
5777 // handleVectorPmaddIntrinsic() currently only handles integer types.
5778
5779 case Intrinsic::x86_sse_cmp_ss:
5780 case Intrinsic::x86_sse2_cmp_sd:
5781 case Intrinsic::x86_sse_comieq_ss:
5782 case Intrinsic::x86_sse_comilt_ss:
5783 case Intrinsic::x86_sse_comile_ss:
5784 case Intrinsic::x86_sse_comigt_ss:
5785 case Intrinsic::x86_sse_comige_ss:
5786 case Intrinsic::x86_sse_comineq_ss:
5787 case Intrinsic::x86_sse_ucomieq_ss:
5788 case Intrinsic::x86_sse_ucomilt_ss:
5789 case Intrinsic::x86_sse_ucomile_ss:
5790 case Intrinsic::x86_sse_ucomigt_ss:
5791 case Intrinsic::x86_sse_ucomige_ss:
5792 case Intrinsic::x86_sse_ucomineq_ss:
5793 case Intrinsic::x86_sse2_comieq_sd:
5794 case Intrinsic::x86_sse2_comilt_sd:
5795 case Intrinsic::x86_sse2_comile_sd:
5796 case Intrinsic::x86_sse2_comigt_sd:
5797 case Intrinsic::x86_sse2_comige_sd:
5798 case Intrinsic::x86_sse2_comineq_sd:
5799 case Intrinsic::x86_sse2_ucomieq_sd:
5800 case Intrinsic::x86_sse2_ucomilt_sd:
5801 case Intrinsic::x86_sse2_ucomile_sd:
5802 case Intrinsic::x86_sse2_ucomigt_sd:
5803 case Intrinsic::x86_sse2_ucomige_sd:
5804 case Intrinsic::x86_sse2_ucomineq_sd:
5805 handleVectorCompareScalarIntrinsic(I);
5806 break;
5807
5808 case Intrinsic::x86_avx_cmp_pd_256:
5809 case Intrinsic::x86_avx_cmp_ps_256:
5810 case Intrinsic::x86_sse2_cmp_pd:
5811 case Intrinsic::x86_sse_cmp_ps:
5812 handleVectorComparePackedIntrinsic(I);
5813 break;
5814
5815 case Intrinsic::x86_bmi_bextr_32:
5816 case Intrinsic::x86_bmi_bextr_64:
5817 case Intrinsic::x86_bmi_bzhi_32:
5818 case Intrinsic::x86_bmi_bzhi_64:
5819 case Intrinsic::x86_bmi_pdep_32:
5820 case Intrinsic::x86_bmi_pdep_64:
5821 case Intrinsic::x86_bmi_pext_32:
5822 case Intrinsic::x86_bmi_pext_64:
5823 handleBmiIntrinsic(I);
5824 break;
5825
5826 case Intrinsic::x86_pclmulqdq:
5827 case Intrinsic::x86_pclmulqdq_256:
5828 case Intrinsic::x86_pclmulqdq_512:
5829 handlePclmulIntrinsic(I);
5830 break;
5831
5832 case Intrinsic::x86_avx_round_pd_256:
5833 case Intrinsic::x86_avx_round_ps_256:
5834 case Intrinsic::x86_sse41_round_pd:
5835 case Intrinsic::x86_sse41_round_ps:
5836 handleRoundPdPsIntrinsic(I);
5837 break;
5838
5839 case Intrinsic::x86_sse41_round_sd:
5840 case Intrinsic::x86_sse41_round_ss:
5841 handleUnarySdSsIntrinsic(I);
5842 break;
5843
5844 case Intrinsic::x86_sse2_max_sd:
5845 case Intrinsic::x86_sse_max_ss:
5846 case Intrinsic::x86_sse2_min_sd:
5847 case Intrinsic::x86_sse_min_ss:
5848 handleBinarySdSsIntrinsic(I);
5849 break;
5850
5851 case Intrinsic::x86_avx_vtestc_pd:
5852 case Intrinsic::x86_avx_vtestc_pd_256:
5853 case Intrinsic::x86_avx_vtestc_ps:
5854 case Intrinsic::x86_avx_vtestc_ps_256:
5855 case Intrinsic::x86_avx_vtestnzc_pd:
5856 case Intrinsic::x86_avx_vtestnzc_pd_256:
5857 case Intrinsic::x86_avx_vtestnzc_ps:
5858 case Intrinsic::x86_avx_vtestnzc_ps_256:
5859 case Intrinsic::x86_avx_vtestz_pd:
5860 case Intrinsic::x86_avx_vtestz_pd_256:
5861 case Intrinsic::x86_avx_vtestz_ps:
5862 case Intrinsic::x86_avx_vtestz_ps_256:
5863 case Intrinsic::x86_avx_ptestc_256:
5864 case Intrinsic::x86_avx_ptestnzc_256:
5865 case Intrinsic::x86_avx_ptestz_256:
5866 case Intrinsic::x86_sse41_ptestc:
5867 case Intrinsic::x86_sse41_ptestnzc:
5868 case Intrinsic::x86_sse41_ptestz:
5869 handleVtestIntrinsic(I);
5870 break;
5871
5872 // Packed Horizontal Add/Subtract
5873 case Intrinsic::x86_ssse3_phadd_w:
5874 case Intrinsic::x86_ssse3_phadd_w_128:
5875 case Intrinsic::x86_avx2_phadd_w:
5876 case Intrinsic::x86_ssse3_phsub_w:
5877 case Intrinsic::x86_ssse3_phsub_w_128:
5878 case Intrinsic::x86_avx2_phsub_w: {
5879 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5880 break;
5881 }
5882
5883 // Packed Horizontal Add/Subtract
5884 case Intrinsic::x86_ssse3_phadd_d:
5885 case Intrinsic::x86_ssse3_phadd_d_128:
5886 case Intrinsic::x86_avx2_phadd_d:
5887 case Intrinsic::x86_ssse3_phsub_d:
5888 case Intrinsic::x86_ssse3_phsub_d_128:
5889 case Intrinsic::x86_avx2_phsub_d: {
5890 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
5891 break;
5892 }
5893
5894 // Packed Horizontal Add/Subtract and Saturate
5895 case Intrinsic::x86_ssse3_phadd_sw:
5896 case Intrinsic::x86_ssse3_phadd_sw_128:
5897 case Intrinsic::x86_avx2_phadd_sw:
5898 case Intrinsic::x86_ssse3_phsub_sw:
5899 case Intrinsic::x86_ssse3_phsub_sw_128:
5900 case Intrinsic::x86_avx2_phsub_sw: {
5901 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5902 break;
5903 }
5904
5905 // Packed Single/Double Precision Floating-Point Horizontal Add
5906 case Intrinsic::x86_sse3_hadd_ps:
5907 case Intrinsic::x86_sse3_hadd_pd:
5908 case Intrinsic::x86_avx_hadd_pd_256:
5909 case Intrinsic::x86_avx_hadd_ps_256:
5910 case Intrinsic::x86_sse3_hsub_ps:
5911 case Intrinsic::x86_sse3_hsub_pd:
5912 case Intrinsic::x86_avx_hsub_pd_256:
5913 case Intrinsic::x86_avx_hsub_ps_256: {
5914 handlePairwiseShadowOrIntrinsic(I);
5915 break;
5916 }
5917
5918 case Intrinsic::x86_avx_maskstore_ps:
5919 case Intrinsic::x86_avx_maskstore_pd:
5920 case Intrinsic::x86_avx_maskstore_ps_256:
5921 case Intrinsic::x86_avx_maskstore_pd_256:
5922 case Intrinsic::x86_avx2_maskstore_d:
5923 case Intrinsic::x86_avx2_maskstore_q:
5924 case Intrinsic::x86_avx2_maskstore_d_256:
5925 case Intrinsic::x86_avx2_maskstore_q_256: {
5926 handleAVXMaskedStore(I);
5927 break;
5928 }
5929
5930 case Intrinsic::x86_avx_maskload_ps:
5931 case Intrinsic::x86_avx_maskload_pd:
5932 case Intrinsic::x86_avx_maskload_ps_256:
5933 case Intrinsic::x86_avx_maskload_pd_256:
5934 case Intrinsic::x86_avx2_maskload_d:
5935 case Intrinsic::x86_avx2_maskload_q:
5936 case Intrinsic::x86_avx2_maskload_d_256:
5937 case Intrinsic::x86_avx2_maskload_q_256: {
5938 handleAVXMaskedLoad(I);
5939 break;
5940 }
5941
5942 // Packed
5943 case Intrinsic::x86_avx512fp16_add_ph_512:
5944 case Intrinsic::x86_avx512fp16_sub_ph_512:
5945 case Intrinsic::x86_avx512fp16_mul_ph_512:
5946 case Intrinsic::x86_avx512fp16_div_ph_512:
5947 case Intrinsic::x86_avx512fp16_max_ph_512:
5948 case Intrinsic::x86_avx512fp16_min_ph_512:
5949 case Intrinsic::x86_avx512_min_ps_512:
5950 case Intrinsic::x86_avx512_min_pd_512:
5951 case Intrinsic::x86_avx512_max_ps_512:
5952 case Intrinsic::x86_avx512_max_pd_512: {
5953 // These AVX512 variants contain the rounding mode as a trailing flag.
5954 // Earlier variants do not have a trailing flag and are already handled
5955 // by maybeHandleSimpleNomemIntrinsic(I, 0) via
5956 // maybeHandleUnknownIntrinsic.
5957 [[maybe_unused]] bool Success =
5958 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
5959 assert(Success);
5960 break;
5961 }
5962
5963 case Intrinsic::x86_avx_vpermilvar_pd:
5964 case Intrinsic::x86_avx_vpermilvar_pd_256:
5965 case Intrinsic::x86_avx512_vpermilvar_pd_512:
5966 case Intrinsic::x86_avx_vpermilvar_ps:
5967 case Intrinsic::x86_avx_vpermilvar_ps_256:
5968 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
5969 handleAVXVpermilvar(I);
5970 break;
5971 }
5972
5973 case Intrinsic::x86_avx512_vpermi2var_d_128:
5974 case Intrinsic::x86_avx512_vpermi2var_d_256:
5975 case Intrinsic::x86_avx512_vpermi2var_d_512:
5976 case Intrinsic::x86_avx512_vpermi2var_hi_128:
5977 case Intrinsic::x86_avx512_vpermi2var_hi_256:
5978 case Intrinsic::x86_avx512_vpermi2var_hi_512:
5979 case Intrinsic::x86_avx512_vpermi2var_pd_128:
5980 case Intrinsic::x86_avx512_vpermi2var_pd_256:
5981 case Intrinsic::x86_avx512_vpermi2var_pd_512:
5982 case Intrinsic::x86_avx512_vpermi2var_ps_128:
5983 case Intrinsic::x86_avx512_vpermi2var_ps_256:
5984 case Intrinsic::x86_avx512_vpermi2var_ps_512:
5985 case Intrinsic::x86_avx512_vpermi2var_q_128:
5986 case Intrinsic::x86_avx512_vpermi2var_q_256:
5987 case Intrinsic::x86_avx512_vpermi2var_q_512:
5988 case Intrinsic::x86_avx512_vpermi2var_qi_128:
5989 case Intrinsic::x86_avx512_vpermi2var_qi_256:
5990 case Intrinsic::x86_avx512_vpermi2var_qi_512:
5991 handleAVXVpermi2var(I);
5992 break;
5993
5994 // Packed Shuffle
5995 // llvm.x86.sse.pshuf.w(<1 x i64>, i8)
5996 // llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
5997 // llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
5998 // llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
5999 // llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
6000 //
6001 // The following intrinsics are auto-upgraded:
6002 // llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
6003 // llvm.x86.sse2.gpshufh.w(<8 x i16>, i8)
6004 // llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
6005 case Intrinsic::x86_avx2_pshuf_b:
6006 case Intrinsic::x86_sse_pshuf_w:
6007 case Intrinsic::x86_ssse3_pshuf_b_128:
6008 case Intrinsic::x86_ssse3_pshuf_b:
6009 case Intrinsic::x86_avx512_pshuf_b_512:
6010 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6011 /*trailingVerbatimArgs=*/1);
6012 break;
6013
6014 // AVX512 PMOV: Packed MOV, with truncation
6015 // Precisely handled by applying the same intrinsic to the shadow
6016 case Intrinsic::x86_avx512_mask_pmov_dw_512:
6017 case Intrinsic::x86_avx512_mask_pmov_db_512:
6018 case Intrinsic::x86_avx512_mask_pmov_qb_512:
6019 case Intrinsic::x86_avx512_mask_pmov_qw_512: {
6020 // Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
6021 // f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
6022 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6023 /*trailingVerbatimArgs=*/1);
6024 break;
6025 }
6026
6027 // AVX512 PMOV{S,US}: Packed MOV, with signed/unsigned saturation
6028 // Approximately handled using the corresponding truncation intrinsic
6029 // TODO: improve handleAVX512VectorDownConvert to precisely model saturation
6030 case Intrinsic::x86_avx512_mask_pmovs_dw_512:
6031 case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
6032 handleIntrinsicByApplyingToShadow(I,
6033 Intrinsic::x86_avx512_mask_pmov_dw_512,
6034 /* trailingVerbatimArgs=*/1);
6035 break;
6036 }
6037
6038 case Intrinsic::x86_avx512_mask_pmovs_db_512:
6039 case Intrinsic::x86_avx512_mask_pmovus_db_512: {
6040 handleIntrinsicByApplyingToShadow(I,
6041 Intrinsic::x86_avx512_mask_pmov_db_512,
6042 /* trailingVerbatimArgs=*/1);
6043 break;
6044 }
6045
6046 case Intrinsic::x86_avx512_mask_pmovs_qb_512:
6047 case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
6048 handleIntrinsicByApplyingToShadow(I,
6049 Intrinsic::x86_avx512_mask_pmov_qb_512,
6050 /* trailingVerbatimArgs=*/1);
6051 break;
6052 }
6053
6054 case Intrinsic::x86_avx512_mask_pmovs_qw_512:
6055 case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
6056 handleIntrinsicByApplyingToShadow(I,
6057 Intrinsic::x86_avx512_mask_pmov_qw_512,
6058 /* trailingVerbatimArgs=*/1);
6059 break;
6060 }
6061
6062 case Intrinsic::x86_avx512_mask_pmovs_qd_512:
6063 case Intrinsic::x86_avx512_mask_pmovus_qd_512:
6064 case Intrinsic::x86_avx512_mask_pmovs_wb_512:
6065 case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
6066 // Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
6067 // cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
6068 // slow-path handler.
6069 handleAVX512VectorDownConvert(I);
6070 break;
6071 }
6072
6073 // AVX512 FP16 Arithmetic
6074 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
6075 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
6076 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
6077 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
6078 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
6079 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
6080 visitGenericScalarHalfwordInst(I);
6081 break;
6082 }
6083
6084 // AVX Galois Field New Instructions
6085 case Intrinsic::x86_vgf2p8affineqb_128:
6086 case Intrinsic::x86_vgf2p8affineqb_256:
6087 case Intrinsic::x86_vgf2p8affineqb_512:
6088 handleAVXGF2P8Affine(I);
6089 break;
6090
6091 default:
6092 return false;
6093 }
6094
6095 return true;
6096 }
6097
6098 bool maybeHandleArmSIMDIntrinsic(IntrinsicInst &I) {
6099 switch (I.getIntrinsicID()) {
6100 case Intrinsic::aarch64_neon_rshrn:
6101 case Intrinsic::aarch64_neon_sqrshl:
6102 case Intrinsic::aarch64_neon_sqrshrn:
6103 case Intrinsic::aarch64_neon_sqrshrun:
6104 case Intrinsic::aarch64_neon_sqshl:
6105 case Intrinsic::aarch64_neon_sqshlu:
6106 case Intrinsic::aarch64_neon_sqshrn:
6107 case Intrinsic::aarch64_neon_sqshrun:
6108 case Intrinsic::aarch64_neon_srshl:
6109 case Intrinsic::aarch64_neon_sshl:
6110 case Intrinsic::aarch64_neon_uqrshl:
6111 case Intrinsic::aarch64_neon_uqrshrn:
6112 case Intrinsic::aarch64_neon_uqshl:
6113 case Intrinsic::aarch64_neon_uqshrn:
6114 case Intrinsic::aarch64_neon_urshl:
6115 case Intrinsic::aarch64_neon_ushl:
6116 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
6117 handleVectorShiftIntrinsic(I, /* Variable */ false);
6118 break;
6119
6120 // TODO: handling max/min similarly to AND/OR may be more precise
6121 // Floating-Point Maximum/Minimum Pairwise
6122 case Intrinsic::aarch64_neon_fmaxp:
6123 case Intrinsic::aarch64_neon_fminp:
6124 // Floating-Point Maximum/Minimum Number Pairwise
6125 case Intrinsic::aarch64_neon_fmaxnmp:
6126 case Intrinsic::aarch64_neon_fminnmp:
6127 // Signed/Unsigned Maximum/Minimum Pairwise
6128 case Intrinsic::aarch64_neon_smaxp:
6129 case Intrinsic::aarch64_neon_sminp:
6130 case Intrinsic::aarch64_neon_umaxp:
6131 case Intrinsic::aarch64_neon_uminp:
6132 // Add Pairwise
6133 case Intrinsic::aarch64_neon_addp:
6134 // Floating-point Add Pairwise
6135 case Intrinsic::aarch64_neon_faddp:
6136 // Add Long Pairwise
6137 case Intrinsic::aarch64_neon_saddlp:
6138 case Intrinsic::aarch64_neon_uaddlp: {
6139 handlePairwiseShadowOrIntrinsic(I);
6140 break;
6141 }
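// For example (a rough sketch; lane indices are illustrative): for
//   %r = call <4 x i32> @llvm.aarch64.neon.addp.v4i32(<4 x i32> %a, <4 x i32> %b)
// each result lane is produced from two adjacent source lanes, so the
// pairwise handler ORs the corresponding shadow lanes, roughly:
//   shadow(%r)[0] = shadow(%a)[0] | shadow(%a)[1]
//   shadow(%r)[1] = shadow(%a)[2] | shadow(%a)[3]
//   shadow(%r)[2] = shadow(%b)[0] | shadow(%b)[1]
//   shadow(%r)[3] = shadow(%b)[2] | shadow(%b)[3]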
6142
6143 // Floating-point Convert to integer, rounding to nearest with ties to Away
6144 case Intrinsic::aarch64_neon_fcvtas:
6145 case Intrinsic::aarch64_neon_fcvtau:
6146 // Floating-point convert to integer, rounding toward minus infinity
6147 case Intrinsic::aarch64_neon_fcvtms:
6148 case Intrinsic::aarch64_neon_fcvtmu:
6149 // Floating-point convert to integer, rounding to nearest with ties to even
6150 case Intrinsic::aarch64_neon_fcvtns:
6151 case Intrinsic::aarch64_neon_fcvtnu:
6152 // Floating-point convert to integer, rounding toward plus infinity
6153 case Intrinsic::aarch64_neon_fcvtps:
6154 case Intrinsic::aarch64_neon_fcvtpu:
6155 // Floating-point Convert to integer, rounding toward Zero
6156 case Intrinsic::aarch64_neon_fcvtzs:
6157 case Intrinsic::aarch64_neon_fcvtzu:
6158 // Floating-point convert to lower precision narrow, rounding to odd
6159 case Intrinsic::aarch64_neon_fcvtxn: {
6160 handleNEONVectorConvertIntrinsic(I);
6161 break;
6162 }
6163
6164 // Add reduction to scalar
6165 case Intrinsic::aarch64_neon_faddv:
6166 case Intrinsic::aarch64_neon_saddv:
6167 case Intrinsic::aarch64_neon_uaddv:
6168 // Signed/Unsigned min/max (Vector)
6169 // TODO: handling similarly to AND/OR may be more precise.
6170 case Intrinsic::aarch64_neon_smaxv:
6171 case Intrinsic::aarch64_neon_sminv:
6172 case Intrinsic::aarch64_neon_umaxv:
6173 case Intrinsic::aarch64_neon_uminv:
6174 // Floating-point min/max (vector)
6175 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
6176 // but our shadow propagation is the same.
6177 case Intrinsic::aarch64_neon_fmaxv:
6178 case Intrinsic::aarch64_neon_fminv:
6179 case Intrinsic::aarch64_neon_fmaxnmv:
6180 case Intrinsic::aarch64_neon_fminnmv:
6181 // Sum long across vector
6182 case Intrinsic::aarch64_neon_saddlv:
6183 case Intrinsic::aarch64_neon_uaddlv:
6184 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
6185 break;
6186
6187 case Intrinsic::aarch64_neon_ld1x2:
6188 case Intrinsic::aarch64_neon_ld1x3:
6189 case Intrinsic::aarch64_neon_ld1x4:
6190 case Intrinsic::aarch64_neon_ld2:
6191 case Intrinsic::aarch64_neon_ld3:
6192 case Intrinsic::aarch64_neon_ld4:
6193 case Intrinsic::aarch64_neon_ld2r:
6194 case Intrinsic::aarch64_neon_ld3r:
6195 case Intrinsic::aarch64_neon_ld4r: {
6196 handleNEONVectorLoad(I, /*WithLane=*/false);
6197 break;
6198 }
6199
6200 case Intrinsic::aarch64_neon_ld2lane:
6201 case Intrinsic::aarch64_neon_ld3lane:
6202 case Intrinsic::aarch64_neon_ld4lane: {
6203 handleNEONVectorLoad(I, /*WithLane=*/true);
6204 break;
6205 }
6206
6207 // Saturating extract narrow
6208 case Intrinsic::aarch64_neon_sqxtn:
6209 case Intrinsic::aarch64_neon_sqxtun:
6210 case Intrinsic::aarch64_neon_uqxtn:
6211 // These only have one argument, but we (ab)use handleShadowOr because it
6212 // does work on single argument intrinsics and will typecast the shadow
6213 // (and update the origin).
6214 handleShadowOr(I);
6215 break;
6216
6217 case Intrinsic::aarch64_neon_st1x2:
6218 case Intrinsic::aarch64_neon_st1x3:
6219 case Intrinsic::aarch64_neon_st1x4:
6220 case Intrinsic::aarch64_neon_st2:
6221 case Intrinsic::aarch64_neon_st3:
6222 case Intrinsic::aarch64_neon_st4: {
6223 handleNEONVectorStoreIntrinsic(I, false);
6224 break;
6225 }
6226
6227 case Intrinsic::aarch64_neon_st2lane:
6228 case Intrinsic::aarch64_neon_st3lane:
6229 case Intrinsic::aarch64_neon_st4lane: {
6230 handleNEONVectorStoreIntrinsic(I, true);
6231 break;
6232 }
6233
6234 // Arm NEON vector table intrinsics have the source/table register(s) as
6235 // arguments, followed by the index register. They return the output.
6236 //
6237 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
6238 // original value unchanged in the destination register.'
6239 // Conveniently, zero denotes a clean shadow, which means out-of-range
6240 // indices for TBL will initialize the user data with zero and also clean
6241 // the shadow. (For TBX, neither the user data nor the shadow will be
6242 // updated, which is also correct.)
6243 case Intrinsic::aarch64_neon_tbl1:
6244 case Intrinsic::aarch64_neon_tbl2:
6245 case Intrinsic::aarch64_neon_tbl3:
6246 case Intrinsic::aarch64_neon_tbl4:
6247 case Intrinsic::aarch64_neon_tbx1:
6248 case Intrinsic::aarch64_neon_tbx2:
6249 case Intrinsic::aarch64_neon_tbx3:
6250 case Intrinsic::aarch64_neon_tbx4: {
6251 // The last trailing argument (index register) should be handled verbatim
6252 handleIntrinsicByApplyingToShadow(
6253 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
6254 /*trailingVerbatimArgs*/ 1);
6255 break;
6256 }
6257
6258 case Intrinsic::aarch64_neon_fmulx:
6259 case Intrinsic::aarch64_neon_pmul:
6260 case Intrinsic::aarch64_neon_pmull:
6261 case Intrinsic::aarch64_neon_smull:
6262 case Intrinsic::aarch64_neon_pmull64:
6263 case Intrinsic::aarch64_neon_umull: {
6264 handleNEONVectorMultiplyIntrinsic(I);
6265 break;
6266 }
6267
6268 default:
6269 return false;
6270 }
6271
6272 return true;
6273 }
6274
6275 void visitIntrinsicInst(IntrinsicInst &I) {
6276 if (maybeHandleCrossPlatformIntrinsic(I))
6277 return;
6278
6279 if (maybeHandleX86SIMDIntrinsic(I))
6280 return;
6281
6282 if (maybeHandleArmSIMDIntrinsic(I))
6283 return;
6284
6285 if (maybeHandleUnknownIntrinsic(I))
6286 return;
6287
6288 visitInstruction(I);
6289 }
6290
6291 void visitLibAtomicLoad(CallBase &CB) {
6292 // Since we use getNextNode here, we can't have CB terminate the BB.
6293 assert(isa<CallInst>(CB));
6294
6295 IRBuilder<> IRB(&CB);
6296 Value *Size = CB.getArgOperand(0);
6297 Value *SrcPtr = CB.getArgOperand(1);
6298 Value *DstPtr = CB.getArgOperand(2);
6299 Value *Ordering = CB.getArgOperand(3);
6300 // Convert the call to have at least Acquire ordering to make sure
6301 // the shadow operations aren't reordered before it.
6302 Value *NewOrdering =
6303 IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
6304 CB.setArgOperand(3, NewOrdering);
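// For example (a sketch using the standard C ABI ordering values): a call
// made with memory_order_relaxed (0) is rewritten to use memory_order_acquire
// (2), so the shadow copy emitted after the call below cannot be reordered
// before the atomic load itself.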
6305
6306 NextNodeIRBuilder NextIRB(&CB);
6307 Value *SrcShadowPtr, *SrcOriginPtr;
6308 std::tie(SrcShadowPtr, SrcOriginPtr) =
6309 getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6310 /*isStore*/ false);
6311 Value *DstShadowPtr =
6312 getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6313 /*isStore*/ true)
6314 .first;
6315
6316 NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
6317 if (MS.TrackOrigins) {
6318 Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
6319 kMinOriginAlignment);
6320 Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
6321 NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
6322 }
6323 }
6324
6325 void visitLibAtomicStore(CallBase &CB) {
6326 IRBuilder<> IRB(&CB);
6327 Value *Size = CB.getArgOperand(0);
6328 Value *DstPtr = CB.getArgOperand(2);
6329 Value *Ordering = CB.getArgOperand(3);
6330 // Convert the call to have at least Release ordering to make sure
6331 // the shadow operations aren't reordered after it.
6332 Value *NewOrdering =
6333 IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
6334 CB.setArgOperand(3, NewOrdering);
6335
6336 Value *DstShadowPtr =
6337 getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
6338 /*isStore*/ true)
6339 .first;
6340
6341 // Atomic store always paints clean shadow/origin. See file header.
6342 IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
6343 Align(1));
6344 }
6345
6346 void visitCallBase(CallBase &CB) {
6347 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
6348 if (CB.isInlineAsm()) {
6349 // For inline asm (either a call to asm function, or callbr instruction),
6350 // do the usual thing: check argument shadow and mark all outputs as
6351 // clean. Note that any side effects of the inline asm that are not
6352 // immediately visible in its constraints are not handled.
6353 if (ClHandleAsmConservative)
6354 visitAsmInstruction(CB);
6355 else
6356 visitInstruction(CB);
6357 return;
6358 }
6359 LibFunc LF;
6360 if (TLI->getLibFunc(CB, LF)) {
6361 // libatomic.a functions need to have special handling because there isn't
6362 // a good way to intercept them or compile the library with
6363 // instrumentation.
6364 switch (LF) {
6365 case LibFunc_atomic_load:
6366 if (!isa<CallInst>(CB)) {
6367 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load."
6368 "Ignoring!\n";
6369 break;
6370 }
6371 visitLibAtomicLoad(CB);
6372 return;
6373 case LibFunc_atomic_store:
6374 visitLibAtomicStore(CB);
6375 return;
6376 default:
6377 break;
6378 }
6379 }
6380
6381 if (auto *Call = dyn_cast<CallInst>(&CB)) {
6382 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
6383
6384 // We are going to insert code that relies on the fact that the callee
6385 // will become a non-readonly function after it is instrumented by us. To
6386 // prevent this code from being optimized out, mark that function
6387 // non-readonly in advance.
6388 // TODO: We can likely do better than dropping memory() completely here.
6389 AttributeMask B;
6390 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
6391 
6392 Call->removeFnAttrs(B);
6393 if (Function *Func = Call->getCalledFunction()) {
6394 Func->removeFnAttrs(B);
6395 }
6396 
6397 maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
6398 }
6399 IRBuilder<> IRB(&CB);
6400 bool MayCheckCall = MS.EagerChecks;
6401 if (Function *Func = CB.getCalledFunction()) {
6402 // __sanitizer_unaligned_{load,store} functions may be called by users
6403 // and always expect shadows in the TLS. So don't check them.
6404 MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
6405 }
6406
6407 unsigned ArgOffset = 0;
6408 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
6409 for (const auto &[i, A] : llvm::enumerate(CB.args())) {
6410 if (!A->getType()->isSized()) {
6411 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
6412 continue;
6413 }
6414
6415 if (A->getType()->isScalableTy()) {
6416 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
6417 // Handle as noundef, but don't reserve tls slots.
6418 insertCheckShadowOf(A, &CB);
6419 continue;
6420 }
6421
6422 unsigned Size = 0;
6423 const DataLayout &DL = F.getDataLayout();
6424
6425 bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
6426 bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
6427 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
6428
6429 if (EagerCheck) {
6430 insertCheckShadowOf(A, &CB);
6431 Size = DL.getTypeAllocSize(A->getType());
6432 } else {
6433 [[maybe_unused]] Value *Store = nullptr;
6434 // Compute the Shadow for arg even if it is ByVal, because
6435 // in that case getShadow() will copy the actual arg shadow to
6436 // __msan_param_tls.
6437 Value *ArgShadow = getShadow(A);
6438 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
6439 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
6440 << " Shadow: " << *ArgShadow << "\n");
6441 if (ByVal) {
6442 // ByVal requires some special handling as it's too big for a single
6443 // load
6444 assert(A->getType()->isPointerTy() &&
6445 "ByVal argument is not a pointer!");
6446 Size = DL.getTypeAllocSize(CB.getParamByValType(i));
6447 if (ArgOffset + Size > kParamTLSSize)
6448 break;
6449 const MaybeAlign ParamAlignment(CB.getParamAlign(i));
6450 MaybeAlign Alignment = std::nullopt;
6451 if (ParamAlignment)
6452 Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
6453 Value *AShadowPtr, *AOriginPtr;
6454 std::tie(AShadowPtr, AOriginPtr) =
6455 getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
6456 /*isStore*/ false);
6457 if (!PropagateShadow) {
6458 Store = IRB.CreateMemSet(ArgShadowBase,
6459 Constant::getNullValue(IRB.getInt8Ty()),
6460 Size, Alignment);
6461 } else {
6462 Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
6463 Alignment, Size);
6464 if (MS.TrackOrigins) {
6465 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
6466 // FIXME: OriginSize should be:
6467 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
6468 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
6469 IRB.CreateMemCpy(
6470 ArgOriginBase,
6471 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
6472 AOriginPtr,
6473 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
6474 }
6475 }
6476 } else {
6477 // Any other parameters mean we need bit-grained tracking of uninit
6478 // data
6479 Size = DL.getTypeAllocSize(A->getType());
6480 if (ArgOffset + Size > kParamTLSSize)
6481 break;
6482 Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
6483 kShadowTLSAlignment);
6484 Constant *Cst = dyn_cast<Constant>(ArgShadow);
6485 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
6486 IRB.CreateStore(getOrigin(A),
6487 getOriginPtrForArgument(IRB, ArgOffset));
6488 }
6489 }
6490 assert(Store != nullptr);
6491 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
6492 }
6493 assert(Size != 0);
6494 ArgOffset += alignTo(Size, kShadowTLSAlignment);
6495 }
6496 LLVM_DEBUG(dbgs() << " done with call args\n");
6497
6498 FunctionType *FT = CB.getFunctionType();
6499 if (FT->isVarArg()) {
6500 VAHelper->visitCallBase(CB, IRB);
6501 }
6502
6503 // Now, get the shadow for the RetVal.
6504 if (!CB.getType()->isSized())
6505 return;
6506 // Don't emit the epilogue for musttail call returns.
6507 if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
6508 return;
6509
6510 if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
6511 setShadow(&CB, getCleanShadow(&CB));
6512 setOrigin(&CB, getCleanOrigin());
6513 return;
6514 }
6515
6516 IRBuilder<> IRBBefore(&CB);
6517 // Until we have full dynamic coverage, make sure the retval shadow is 0.
6518 Value *Base = getShadowPtrForRetval(IRBBefore);
6519 IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
6520 kShadowTLSAlignment);
6521 BasicBlock::iterator NextInsn;
6522 if (isa<CallInst>(CB)) {
6523 NextInsn = ++CB.getIterator();
6524 assert(NextInsn != CB.getParent()->end());
6525 } else {
6526 BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
6527 if (!NormalDest->getSinglePredecessor()) {
6528 // FIXME: this case is tricky, so we are just conservative here.
6529 // Perhaps we need to split the edge between this BB and NormalDest,
6530 // but a naive attempt to use SplitEdge leads to a crash.
6531 setShadow(&CB, getCleanShadow(&CB));
6532 setOrigin(&CB, getCleanOrigin());
6533 return;
6534 }
6535 // FIXME: NextInsn is likely in a basic block that has not been visited
6536 // yet. Anything inserted there will be instrumented by MSan later!
6537 NextInsn = NormalDest->getFirstInsertionPt();
6538 assert(NextInsn != NormalDest->end() &&
6539 "Could not find insertion point for retval shadow load");
6540 }
6541 IRBuilder<> IRBAfter(&*NextInsn);
6542 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
6543 getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
6544 "_msret");
6545 setShadow(&CB, RetvalShadow);
6546 if (MS.TrackOrigins)
6547 setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
6548 }
6549
6550 bool isAMustTailRetVal(Value *RetVal) {
6551 if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
6552 RetVal = I->getOperand(0);
6553 }
6554 if (auto *I = dyn_cast<CallInst>(RetVal)) {
6555 return I->isMustTailCall();
6556 }
6557 return false;
6558 }
6559
6560 void visitReturnInst(ReturnInst &I) {
6561 IRBuilder<> IRB(&I);
6562 Value *RetVal = I.getReturnValue();
6563 if (!RetVal)
6564 return;
6565 // Don't emit the epilogue for musttail call returns.
6566 if (isAMustTailRetVal(RetVal))
6567 return;
6568 Value *ShadowPtr = getShadowPtrForRetval(IRB);
6569 bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
6570 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
6571 // FIXME: Consider using SpecialCaseList to specify a list of functions that
6572 // must always return fully initialized values. For now, we hardcode "main".
6573 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
6574
6575 Value *Shadow = getShadow(RetVal);
6576 bool StoreOrigin = true;
6577 if (EagerCheck) {
6578 insertCheckShadowOf(RetVal, &I);
6579 Shadow = getCleanShadow(RetVal);
6580 StoreOrigin = false;
6581 }
6582
6583 // The caller may still expect information passed over TLS if we pass our
6584 // check
6585 if (StoreShadow) {
6586 IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
6587 if (MS.TrackOrigins && StoreOrigin)
6588 IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
6589 }
6590 }
6591
6592 void visitPHINode(PHINode &I) {
6593 IRBuilder<> IRB(&I);
6594 if (!PropagateShadow) {
6595 setShadow(&I, getCleanShadow(&I));
6596 setOrigin(&I, getCleanOrigin());
6597 return;
6598 }
6599
6600 ShadowPHINodes.push_back(&I);
6601 setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
6602 "_msphi_s"));
6603 if (MS.TrackOrigins)
6604 setOrigin(
6605 &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
6606 }
6607
6608 Value *getLocalVarIdptr(AllocaInst &I) {
6609 ConstantInt *IntConst =
6610 ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
6611 return new GlobalVariable(*F.getParent(), IntConst->getType(),
6612 /*isConstant=*/false, GlobalValue::PrivateLinkage,
6613 IntConst);
6614 }
6615
6616 Value *getLocalVarDescription(AllocaInst &I) {
6617 return createPrivateConstGlobalForString(*F.getParent(), I.getName());
6618 }
6619
6620 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6621 if (PoisonStack && ClPoisonStackWithCall) {
6622 IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
6623 } else {
6624 Value *ShadowBase, *OriginBase;
6625 std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
6626 &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
6627
6628 Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
6629 IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
6630 }
6631
6632 if (PoisonStack && MS.TrackOrigins) {
6633 Value *Idptr = getLocalVarIdptr(I);
6634 if (ClPrintStackNames) {
6635 Value *Descr = getLocalVarDescription(I);
6636 IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
6637 {&I, Len, Idptr, Descr});
6638 } else {
6639 IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
6640 }
6641 }
6642 }
6643
6644 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6645 Value *Descr = getLocalVarDescription(I);
6646 if (PoisonStack) {
6647 IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
6648 } else {
6649 IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
6650 }
6651 }
6652
6653 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
6654 if (!InsPoint)
6655 InsPoint = &I;
6656 NextNodeIRBuilder IRB(InsPoint);
6657 const DataLayout &DL = F.getDataLayout();
6658 TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
6659 Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
6660 if (I.isArrayAllocation())
6661 Len = IRB.CreateMul(Len,
6662 IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
6663
6664 if (MS.CompileKernel)
6665 poisonAllocaKmsan(I, IRB, Len);
6666 else
6667 poisonAllocaUserspace(I, IRB, Len);
6668 }
6669
6670 void visitAllocaInst(AllocaInst &I) {
6671 setShadow(&I, getCleanShadow(&I));
6672 setOrigin(&I, getCleanOrigin());
6673 // We'll get to this alloca later unless it's poisoned at the corresponding
6674 // llvm.lifetime.start.
6675 AllocaSet.insert(&I);
6676 }
6677
6678 void visitSelectInst(SelectInst &I) {
6679 // a = select b, c, d
6680 Value *B = I.getCondition();
6681 Value *C = I.getTrueValue();
6682 Value *D = I.getFalseValue();
6683
6684 handleSelectLikeInst(I, B, C, D);
6685 }
6686
6687 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
6688 IRBuilder<> IRB(&I);
6689
6690 Value *Sb = getShadow(B);
6691 Value *Sc = getShadow(C);
6692 Value *Sd = getShadow(D);
6693
6694 Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
6695 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
6696 Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
6697
6698 // Result shadow if condition shadow is 0.
6699 Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
6700 Value *Sa1;
6701 if (I.getType()->isAggregateType()) {
6702 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
6703 // an extra "select". This results in much more compact IR.
6704 // Sa = select Sb, poisoned, (select b, Sc, Sd)
6705 Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
6706 } else {
6707 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
6708 // If Sb (condition is poisoned), look for bits in c and d that are equal
6709 // and both unpoisoned.
6710 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
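// For example (a rough IR-like sketch; value names are illustrative), for a
// scalar select
//   %a = select i1 %b, i32 %c, i32 %d
// this handler computes approximately
//   %sa0 = select i1 %b, i32 %sc, i32 %sd
//   %sa1 = or i32 (or i32 (xor i32 %c, %d), %sc), %sd
//   %sa  = select i1 %sb, i32 %sa1, i32 %sa0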
6711
6712 // Cast arguments to shadow-compatible type.
6713 C = CreateAppToShadowCast(IRB, C);
6714 D = CreateAppToShadowCast(IRB, D);
6715
6716 // Result shadow if condition shadow is 1.
6717 Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
6718 }
6719 Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
6720 setShadow(&I, Sa);
6721 if (MS.TrackOrigins) {
6722 // Origins are always i32, so any vector conditions must be flattened.
6723 // FIXME: consider tracking vector origins for app vectors?
6724 if (B->getType()->isVectorTy()) {
6725 B = convertToBool(B, IRB);
6726 Sb = convertToBool(Sb, IRB);
6727 }
6728 // a = select b, c, d
6729 // Oa = Sb ? Ob : (b ? Oc : Od)
6730 setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
6731 }
6732 }
6733
6734 void visitLandingPadInst(LandingPadInst &I) {
6735 // Do nothing.
6736 // See https://github.com/google/sanitizers/issues/504
6737 setShadow(&I, getCleanShadow(&I));
6738 setOrigin(&I, getCleanOrigin());
6739 }
6740
6741 void visitCatchSwitchInst(CatchSwitchInst &I) {
6742 setShadow(&I, getCleanShadow(&I));
6743 setOrigin(&I, getCleanOrigin());
6744 }
6745
6746 void visitFuncletPadInst(FuncletPadInst &I) {
6747 setShadow(&I, getCleanShadow(&I));
6748 setOrigin(&I, getCleanOrigin());
6749 }
6750
6751 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
6752
6753 void visitExtractValueInst(ExtractValueInst &I) {
6754 IRBuilder<> IRB(&I);
6755 Value *Agg = I.getAggregateOperand();
6756 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
6757 Value *AggShadow = getShadow(Agg);
6758 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6759 Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
6760 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
6761 setShadow(&I, ResShadow);
6762 setOriginForNaryOp(I);
6763 }
6764
6765 void visitInsertValueInst(InsertValueInst &I) {
6766 IRBuilder<> IRB(&I);
6767 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
6768 Value *AggShadow = getShadow(I.getAggregateOperand());
6769 Value *InsShadow = getShadow(I.getInsertedValueOperand());
6770 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6771 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
6772 Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
6773 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
6774 setShadow(&I, Res);
6775 setOriginForNaryOp(I);
6776 }
6777
6778 void dumpInst(Instruction &I) {
6779 if (CallInst *CI = dyn_cast<CallInst>(&I)) {
6780 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
6781 } else {
6782 errs() << "ZZZ " << I.getOpcodeName() << "\n";
6783 }
6784 errs() << "QQQ " << I << "\n";
6785 }
6786
6787 void visitResumeInst(ResumeInst &I) {
6788 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
6789 // Nothing to do here.
6790 }
6791
6792 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
6793 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
6794 // Nothing to do here.
6795 }
6796
6797 void visitCatchReturnInst(CatchReturnInst &CRI) {
6798 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
6799 // Nothing to do here.
6800 }
6801
6802 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
6803 IRBuilder<> &IRB, const DataLayout &DL,
6804 bool isOutput) {
6805 // For each assembly argument, we check its value for being initialized.
6806 // If the argument is a pointer, we assume it points to a single element
6807 // of the corresponding type (or to an 8-byte word, if the type is unsized).
6808 // Each such pointer is instrumented with a call to the runtime library.
6809 Type *OpType = Operand->getType();
6810 // Check the operand value itself.
6811 insertCheckShadowOf(Operand, &I);
6812 if (!OpType->isPointerTy() || !isOutput) {
6813 assert(!isOutput);
6814 return;
6815 }
6816 if (!ElemTy->isSized())
6817 return;
6818 auto Size = DL.getTypeStoreSize(ElemTy);
6819 Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
6820 if (MS.CompileKernel) {
6821 IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
6822 } else {
6823 // ElemTy, derived from elementtype(), does not encode the alignment of
6824 // the pointer. Conservatively assume that the shadow memory is unaligned.
6825 // When Size is large, avoid StoreInst as it would expand to many
6826 // instructions.
6827 auto [ShadowPtr, _] =
6828 getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
6829 if (Size <= 32)
6830 IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
6831 else
6832 IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
6833 SizeVal, Align(1));
6834 }
6835 }
6836
6837 /// Get the number of output arguments returned by pointers.
6838 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
6839 int NumRetOutputs = 0;
6840 int NumOutputs = 0;
6841 Type *RetTy = cast<Value>(CB)->getType();
6842 if (!RetTy->isVoidTy()) {
6843 // Register outputs are returned via the CallInst return value.
6844 auto *ST = dyn_cast<StructType>(RetTy);
6845 if (ST)
6846 NumRetOutputs = ST->getNumElements();
6847 else
6848 NumRetOutputs = 1;
6849 }
6850 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
6851 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
6852 switch (Info.Type) {
6853 case InlineAsm::isOutput:
6854 NumOutputs++;
6855 break;
6856 default:
6857 break;
6858 }
6859 }
6860 return NumOutputs - NumRetOutputs;
6861 }
6862
6863 void visitAsmInstruction(Instruction &I) {
6864 // Conservative inline assembly handling: check for poisoned shadow of
6865 // asm() arguments, then unpoison the result and all the memory locations
6866 // pointed to by those arguments.
6867 // An inline asm() statement in C++ contains lists of input and output
6868 // arguments used by the assembly code. These are mapped to operands of the
6869 // CallInst as follows:
6870 // - nR register outputs ("=r") are returned by value in a single structure
6871 // (SSA value of the CallInst);
6872 // - nO other outputs ("=m" and others) are returned by pointer as first
6873 // nO operands of the CallInst;
6874 // - nI inputs ("r", "m" and others) are passed to CallInst as the
6875 // remaining nI operands.
6876 // The total number of asm() arguments in the source is nR+nO+nI, and the
6877 // corresponding CallInst has nO+nI+1 operands (the last operand is the
6878 // function to be called).
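// For example (a sketch; the constraint strings are illustrative): for
//   asm("..." : "=r"(x), "=m"(*p) : "r"(y));
// nR = 1 (x is returned as the CallInst result), nO = 1 (p is passed as the
// first operand), nI = 1 (y follows it), so the CallInst has
// nO + nI + 1 = 3 operands, the last one being the InlineAsm callee itself.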
6879 const DataLayout &DL = F.getDataLayout();
6880 CallBase *CB = cast<CallBase>(&I);
6881 IRBuilder<> IRB(&I);
6882 InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
6883 int OutputArgs = getNumOutputArgs(IA, CB);
6884 // The last operand of a CallInst is the function itself.
6885 int NumOperands = CB->getNumOperands() - 1;
6886
6887 // Check input arguments. We do this before unpoisoning the output
6888 // arguments, so that we won't overwrite uninit values before checking them.
6889 for (int i = OutputArgs; i < NumOperands; i++) {
6890 Value *Operand = CB->getOperand(i);
6891 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
6892 /*isOutput*/ false);
6893 }
6894 // Unpoison output arguments. This must happen before the actual InlineAsm
6895 // call, so that the shadow for memory published in the asm() statement
6896 // remains valid.
6897 for (int i = 0; i < OutputArgs; i++) {
6898 Value *Operand = CB->getOperand(i);
6899 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
6900 /*isOutput*/ true);
6901 }
6902
6903 setShadow(&I, getCleanShadow(&I));
6904 setOrigin(&I, getCleanOrigin());
6905 }
6906
6907 void visitFreezeInst(FreezeInst &I) {
6908 // Freeze always returns a fully defined value.
6909 setShadow(&I, getCleanShadow(&I));
6910 setOrigin(&I, getCleanOrigin());
6911 }
6912
6913 void visitInstruction(Instruction &I) {
6914 // Everything else: stop propagating and check for poisoned shadow.
6915 if (ClDumpStrictInstructions)
6916 dumpInst(I);
6917 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
6918 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
6919 Value *Operand = I.getOperand(i);
6920 if (Operand->getType()->isSized())
6921 insertCheckShadowOf(Operand, &I);
6922 }
6923 setShadow(&I, getCleanShadow(&I));
6924 setOrigin(&I, getCleanOrigin());
6925 }
6926};
6927
6928struct VarArgHelperBase : public VarArgHelper {
6929 Function &F;
6930 MemorySanitizer &MS;
6931 MemorySanitizerVisitor &MSV;
6932 SmallVector<CallInst *, 16> VAStartInstrumentationList;
6933 const unsigned VAListTagSize;
6934
6935 VarArgHelperBase(Function &F, MemorySanitizer &MS,
6936 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
6937 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
6938
6939 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6940 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
6941 return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6942 }
6943
6944 /// Compute the shadow address for a given va_arg.
6945 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6946 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
6947 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6948 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_s");
6949 }
6950
6951 /// Compute the shadow address for a given va_arg.
6952 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
6953 unsigned ArgSize) {
6954 // Make sure we don't overflow __msan_va_arg_tls.
6955 if (ArgOffset + ArgSize > kParamTLSSize)
6956 return nullptr;
6957 return getShadowPtrForVAArgument(IRB, ArgOffset);
6958 }
6959
6960 /// Compute the origin address for a given va_arg.
6961 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
6962 Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
6963 // getOriginPtrForVAArgument() is always called after
6964 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
6965 // overflow.
6966 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6967 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_o");
6968 }
6969
6970 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
6971 unsigned BaseOffset) {
6972 // The tail of __msan_va_arg_tls is not large enough to fit the full
6973 // value shadow, but it will be copied to the backup anyway. Make it
6974 // clean.
6975 if (BaseOffset >= kParamTLSSize)
6976 return;
6977 Value *TailSize =
6978 ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
6979 IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
6980 TailSize, Align(8));
6981 }
6982
6983 void unpoisonVAListTagForInst(IntrinsicInst &I) {
6984 IRBuilder<> IRB(&I);
6985 Value *VAListTag = I.getArgOperand(0);
6986 const Align Alignment = Align(8);
6987 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
6988 VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
6989 // Unpoison the whole __va_list_tag.
6990 IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
6991 VAListTagSize, Alignment, false);
6992 }
6993
6994 void visitVAStartInst(VAStartInst &I) override {
6995 if (F.getCallingConv() == CallingConv::Win64)
6996 return;
6997 VAStartInstrumentationList.push_back(&I);
6998 unpoisonVAListTagForInst(I);
6999 }
7000
7001 void visitVACopyInst(VACopyInst &I) override {
7002 if (F.getCallingConv() == CallingConv::Win64)
7003 return;
7004 unpoisonVAListTagForInst(I);
7005 }
7006};
7007
7008/// AMD64-specific implementation of VarArgHelper.
7009struct VarArgAMD64Helper : public VarArgHelperBase {
7010 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
7011 // See a comment in visitCallBase for more details.
7012 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
7013 static const unsigned AMD64FpEndOffsetSSE = 176;
7014 // If SSE is disabled, fp_offset in va_list is zero.
7015 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
7016
7017 unsigned AMD64FpEndOffset;
7018 AllocaInst *VAArgTLSCopy = nullptr;
7019 AllocaInst *VAArgTLSOriginCopy = nullptr;
7020 Value *VAArgOverflowSize = nullptr;
7021
7022 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7023
7024 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
7025 MemorySanitizerVisitor &MSV)
7026 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
7027 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
7028 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
7029 if (Attr.isStringAttribute() &&
7030 (Attr.getKindAsString() == "target-features")) {
7031 if (Attr.getValueAsString().contains("-sse"))
7032 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
7033 break;
7034 }
7035 }
7036 }
7037
7038 ArgKind classifyArgument(Value *arg) {
7039 // A very rough approximation of X86_64 argument classification rules.
7040 Type *T = arg->getType();
7041 if (T->isX86_FP80Ty())
7042 return AK_Memory;
7043 if (T->isFPOrFPVectorTy())
7044 return AK_FloatingPoint;
7045 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
7046 return AK_GeneralPurpose;
7047 if (T->isPointerTy())
7048 return AK_GeneralPurpose;
7049 return AK_Memory;
7050 }
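// For example (a sketch following the checks above): an i32 or a pointer
// classifies as AK_GeneralPurpose, a double or <4 x float> as
// AK_FloatingPoint, and an x86_fp80 (long double) or an aggregate as
// AK_Memory.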
7051
7052 // For VarArg functions, store the argument shadow in an ABI-specific format
7053 // that corresponds to va_list layout.
7054 // We do this because Clang lowers va_arg in the frontend, and this pass
7055 // only sees the low level code that deals with va_list internals.
7056 // A much easier alternative (provided that Clang emits va_arg instructions)
7057 // would have been to associate each live instance of va_list with a copy of
7058 // MSanParamTLS, and extract shadow on va_arg() call in the argument list
7059 // order.
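// For example (a rough sketch of the TLS layout produced below): for a call
//   printf("%d %f", i, d);
// the fixed format pointer only advances the general-purpose offset (no
// shadow is stored for it), the shadow of 'i' is stored at __msan_va_arg_tls
// offset 8 (general-purpose area, bytes 0..47), the shadow of 'd' at offset
// 48 (FP/SSE area, bytes 48..175), and arguments that spill to the stack
// start at offset 176 (AMD64FpEndOffset).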
7060 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7061 unsigned GpOffset = 0;
7062 unsigned FpOffset = AMD64GpEndOffset;
7063 unsigned OverflowOffset = AMD64FpEndOffset;
7064 const DataLayout &DL = F.getDataLayout();
7065
7066 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7067 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7068 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7069 if (IsByVal) {
7070 // ByVal arguments always go to the overflow area.
7071 // Fixed arguments passed through the overflow area will be stepped
7072 // over by va_start, so don't count them towards the offset.
7073 if (IsFixed)
7074 continue;
7075 assert(A->getType()->isPointerTy());
7076 Type *RealTy = CB.getParamByValType(ArgNo);
7077 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7078 uint64_t AlignedSize = alignTo(ArgSize, 8);
7079 unsigned BaseOffset = OverflowOffset;
7080 Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7081 Value *OriginBase = nullptr;
7082 if (MS.TrackOrigins)
7083 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7084 OverflowOffset += AlignedSize;
7085
7086 if (OverflowOffset > kParamTLSSize) {
7087 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7088 continue; // We have no space to copy shadow there.
7089 }
7090
7091 Value *ShadowPtr, *OriginPtr;
7092 std::tie(ShadowPtr, OriginPtr) =
7093 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
7094 /*isStore*/ false);
7095 IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
7096 kShadowTLSAlignment, ArgSize);
7097 if (MS.TrackOrigins)
7098 IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
7099 kShadowTLSAlignment, ArgSize);
7100 } else {
7101 ArgKind AK = classifyArgument(A);
7102 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
7103 AK = AK_Memory;
7104 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
7105 AK = AK_Memory;
7106 Value *ShadowBase, *OriginBase = nullptr;
7107 switch (AK) {
7108 case AK_GeneralPurpose:
7109 ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
7110 if (MS.TrackOrigins)
7111 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
7112 GpOffset += 8;
7113 assert(GpOffset <= kParamTLSSize);
7114 break;
7115 case AK_FloatingPoint:
7116 ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
7117 if (MS.TrackOrigins)
7118 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7119 FpOffset += 16;
7120 assert(FpOffset <= kParamTLSSize);
7121 break;
7122 case AK_Memory:
7123 if (IsFixed)
7124 continue;
7125 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7126 uint64_t AlignedSize = alignTo(ArgSize, 8);
7127 unsigned BaseOffset = OverflowOffset;
7128 ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7129 if (MS.TrackOrigins) {
7130 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7131 }
7132 OverflowOffset += AlignedSize;
7133 if (OverflowOffset > kParamTLSSize) {
7134 // We have no space to copy shadow there.
7135 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7136 continue;
7137 }
7138 }
7139 // Take fixed arguments into account for GpOffset and FpOffset,
7140 // but don't actually store shadows for them.
7141 // TODO(glider): don't call get*PtrForVAArgument() for them.
7142 if (IsFixed)
7143 continue;
7144 Value *Shadow = MSV.getShadow(A);
7145 IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
7146 if (MS.TrackOrigins) {
7147 Value *Origin = MSV.getOrigin(A);
7148 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7149 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7150 std::max(kShadowTLSAlignment, kMinOriginAlignment));
7151 }
7152 }
7153 }
7154 Constant *OverflowSize =
7155 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
7156 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7157 }
7158
7159 void finalizeInstrumentation() override {
7160 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7161 "finalizeInstrumentation called twice");
7162 if (!VAStartInstrumentationList.empty()) {
7163 // If there is a va_start in this function, make a backup copy of
7164 // va_arg_tls somewhere in the function entry block.
7165 IRBuilder<> IRB(MSV.FnPrologueEnd);
7166 VAArgOverflowSize =
7167 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7168 Value *CopySize = IRB.CreateAdd(
7169 ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
7170 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7171 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7172 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7173 CopySize, kShadowTLSAlignment, false);
7174
7175 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7176 Intrinsic::umin, CopySize,
7177 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7178 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7179 kShadowTLSAlignment, SrcSize);
7180 if (MS.TrackOrigins) {
7181 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7182 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
7183 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
7184 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
7185 }
7186 }
7187
7188 // Instrument va_start.
7189 // Copy va_list shadow from the backup copy of the TLS contents.
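// For reference (a sketch of the SysV x86_64 va_list layout that the
// constant offsets below rely on):
//   struct __va_list_tag {
//     unsigned int gp_offset;  // bytes 0..3
//     unsigned int fp_offset;  // bytes 4..7
//     void *overflow_arg_area; // bytes 8..15
//     void *reg_save_area;     // bytes 16..23
//   };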
7190 for (CallInst *OrigInst : VAStartInstrumentationList) {
7191 NextNodeIRBuilder IRB(OrigInst);
7192 Value *VAListTag = OrigInst->getArgOperand(0);
7193
7194 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
7195 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7196 ConstantInt::get(MS.IntptrTy, 16)),
7197 MS.PtrTy);
7198 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7199 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7200 const Align Alignment = Align(16);
7201 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7202 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7203 Alignment, /*isStore*/ true);
7204 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7205 AMD64FpEndOffset);
7206 if (MS.TrackOrigins)
7207 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
7208 Alignment, AMD64FpEndOffset);
7209 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
7210 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7211 ConstantInt::get(MS.IntptrTy, 8)),
7212 MS.PtrTy);
7213 Value *OverflowArgAreaPtr =
7214 IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
7215 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
7216 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
7217 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
7218 Alignment, /*isStore*/ true);
7219 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
7220 AMD64FpEndOffset);
7221 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
7222 VAArgOverflowSize);
7223 if (MS.TrackOrigins) {
7224 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
7225 AMD64FpEndOffset);
7226 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
7227 VAArgOverflowSize);
7228 }
7229 }
7230 }
7231};
7232
7233/// AArch64-specific implementation of VarArgHelper.
7234struct VarArgAArch64Helper : public VarArgHelperBase {
7235 static const unsigned kAArch64GrArgSize = 64;
7236 static const unsigned kAArch64VrArgSize = 128;
7237
7238 static const unsigned AArch64GrBegOffset = 0;
7239 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
7240 // Make VR space aligned to 16 bytes.
7241 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
7242 static const unsigned AArch64VrEndOffset =
7243 AArch64VrBegOffset + kAArch64VrArgSize;
7244 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
7245
7246 AllocaInst *VAArgTLSCopy = nullptr;
7247 Value *VAArgOverflowSize = nullptr;
7248
7249 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7250
7251 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
7252 MemorySanitizerVisitor &MSV)
7253 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
7254
7255 // A very rough approximation of aarch64 argument classification rules.
7256 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
7257 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
7258 return {AK_GeneralPurpose, 1};
7259 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
7260 return {AK_FloatingPoint, 1};
7261
7262 if (T->isArrayTy()) {
7263 auto R = classifyArgument(T->getArrayElementType());
7264 R.second *= T->getScalarType()->getArrayNumElements();
7265 return R;
7266 }
7267
7268 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
7269 auto R = classifyArgument(FV->getScalarType());
7270 R.second *= FV->getNumElements();
7271 return R;
7272 }
7273
7274 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
7275 return {AK_Memory, 0};
7276 }
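// For example (a sketch following the rules above): an i64 or a pointer
// yields {AK_GeneralPurpose, 1}, a double yields {AK_FloatingPoint, 1},
// [4 x i32] yields {AK_GeneralPurpose, 4}, and an unrecognized type falls
// back to {AK_Memory, 0}.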
7277
7278 // The instrumentation stores the argument shadow in a non-ABI-specific
7279 // format because it does not know which arguments are named (since Clang,
7280 // as in the x86_64 case, lowers va_arg in the frontend and this pass only
7281 // sees the low-level code that deals with va_list internals).
7282 // The first eight GR registers are saved in the first 64 bytes of the
7283 // va_arg TLS array, followed by the first eight FP/SIMD registers, and then
7284 // the remaining arguments.
7285 // Using constant offsets within the va_arg TLS array allows fast copying
7286 // in finalizeInstrumentation().
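// For example (a sketch of the va_arg TLS layout implied by the constants
// above): bytes 0..63 hold the shadow of x0-x7 (8 bytes each), bytes 64..191
// hold the shadow of v0-v7 (16 bytes each), and bytes from 192 onwards hold
// the shadow of arguments passed on the stack.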
7287 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7288 unsigned GrOffset = AArch64GrBegOffset;
7289 unsigned VrOffset = AArch64VrBegOffset;
7290 unsigned OverflowOffset = AArch64VAEndOffset;
7291
7292 const DataLayout &DL = F.getDataLayout();
7293 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7294 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7295 auto [AK, RegNum] = classifyArgument(A->getType());
7296 if (AK == AK_GeneralPurpose &&
7297 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
7298 AK = AK_Memory;
7299 if (AK == AK_FloatingPoint &&
7300 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
7301 AK = AK_Memory;
7302 Value *Base;
7303 switch (AK) {
7304 case AK_GeneralPurpose:
7305 Base = getShadowPtrForVAArgument(IRB, GrOffset);
7306 GrOffset += 8 * RegNum;
7307 break;
7308 case AK_FloatingPoint:
7309 Base = getShadowPtrForVAArgument(IRB, VrOffset);
7310 VrOffset += 16 * RegNum;
7311 break;
7312 case AK_Memory:
7313 // Don't count fixed arguments in the overflow area - va_start will
7314 // skip right over them.
7315 if (IsFixed)
7316 continue;
7317 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7318 uint64_t AlignedSize = alignTo(ArgSize, 8);
7319 unsigned BaseOffset = OverflowOffset;
7320 Base = getShadowPtrForVAArgument(IRB, BaseOffset);
7321 OverflowOffset += AlignedSize;
7322 if (OverflowOffset > kParamTLSSize) {
7323 // We have no space to copy shadow there.
7324 CleanUnusedTLS(IRB, Base, BaseOffset);
7325 continue;
7326 }
7327 break;
7328 }
7329 // Count Gp/Vr fixed arguments to their respective offsets, but don't
7330 // bother to actually store a shadow.
7331 if (IsFixed)
7332 continue;
7333 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7334 }
7335 Constant *OverflowSize =
7336 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
7337 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7338 }
7339
7340 // Retrieve a va_list field of 'void*' size.
7341 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7342 Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
7343 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7344 ConstantInt::get(MS.IntptrTy, offset)),
7345 MS.PtrTy);
7346 return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
7347 }
7348
7349 // Retrieve a va_list field of 'int' size.
7350 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7351 Value *SaveAreaPtr = IRB.CreateIntToPtr(
7352 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7353 ConstantInt::get(MS.IntptrTy, offset)),
7354 MS.PtrTy);
7355 Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
7356 return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
7357 }
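// For reference (a sketch of the AAPCS64 va_list layout that the field
// offsets passed to getVAField64/getVAField32 below rely on):
//   struct __va_list {
//     void *__stack;  // offset 0
//     void *__gr_top; // offset 8
//     void *__vr_top; // offset 16
//     int __gr_offs;  // offset 24
//     int __vr_offs;  // offset 28
//   };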
7358
7359 void finalizeInstrumentation() override {
7360 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7361 "finalizeInstrumentation called twice");
7362 if (!VAStartInstrumentationList.empty()) {
7363 // If there is a va_start in this function, make a backup copy of
7364 // va_arg_tls somewhere in the function entry block.
7365 IRBuilder<> IRB(MSV.FnPrologueEnd);
7366 VAArgOverflowSize =
7367 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7368 Value *CopySize = IRB.CreateAdd(
7369 ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
7370 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7371 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7372 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7373 CopySize, kShadowTLSAlignment, false);
7374
7375 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7376 Intrinsic::umin, CopySize,
7377 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7378 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7379 kShadowTLSAlignment, SrcSize);
7380 }
7381
7382 Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
7383 Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
7384
7385 // Instrument va_start, copy va_list shadow from the backup copy of
7386 // the TLS contents.
7387 for (CallInst *OrigInst : VAStartInstrumentationList) {
7388 NextNodeIRBuilder IRB(OrigInst);
7389
7390 Value *VAListTag = OrigInst->getArgOperand(0);
7391
7392 // The variadic ABI for AArch64 creates two areas in which the incoming
7393 // argument registers are saved (one for the 64-bit general-purpose
7394 // registers x0-x7 and another for the 128-bit FP/SIMD registers v0-v7).
7395 // We then need to propagate the argument shadow to both regions,
7396 // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
7397 // The shadow for the remaining arguments is written relative to 'va::stack'.
7398 // One caveat is that only the non-named (variadic) arguments need to be
7399 // propagated, whereas the call-site instrumentation saved shadow for
7400 // 'all' of the arguments. So when copying the shadow values from the
7401 // va_arg TLS array we need to adjust the offsets for both the GR and VR
7402 // regions based on the __{gr,vr}_offs values (which are set according to
7403 // the number of incoming named arguments).
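// Worked example (a sketch): with two named GR arguments,
// __gr_offs == 0 - ((8 - 2) * 8) == -48, so below
//   GrRegSaveAreaShadowPtrOff = 64 + (-48) = 16 and
//   GrCopySize = 64 - 16 = 48,
// i.e. the shadow of the two named register slots is skipped and only the
// 48 bytes covering the six variadic GR slots are copied.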
7404 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
7405
7406 // Read the stack pointer from the va_list.
7407 Value *StackSaveAreaPtr =
7408 IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
7409
7410 // Read both the __gr_top and __gr_off and add them up.
7411 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
7412 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
7413
7414 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
7415 IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
7416
7417 // Read both the __vr_top and __vr_off and add them up.
7418 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
7419 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
7420
7421 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
7422 IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
7423
7424 // The instrumentation does not know how many named arguments are being
7425 // used, and at the call site shadow was saved for all arguments. Since
7426 // __gr_offs is defined as '0 - ((8 - named_gr) * 8)', we propagate only the
7427 // variadic arguments by skipping the shadow bytes of the named arguments.
7428 Value *GrRegSaveAreaShadowPtrOff =
7429 IRB.CreateAdd(GrArgSize, GrOffSaveArea);
7430
7431 Value *GrRegSaveAreaShadowPtr =
7432 MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7433 Align(8), /*isStore*/ true)
7434 .first;
7435
7436 Value *GrSrcPtr =
7437 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
7438 Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
7439
7440 IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
7441 GrCopySize);
7442
7443 // Again, but for FP/SIMD values.
7444 Value *VrRegSaveAreaShadowPtrOff =
7445 IRB.CreateAdd(VrArgSize, VrOffSaveArea);
7446
7447 Value *VrRegSaveAreaShadowPtr =
7448 MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7449 Align(8), /*isStore*/ true)
7450 .first;
7451
7452 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
7453 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
7454 IRB.getInt32(AArch64VrBegOffset)),
7455 VrRegSaveAreaShadowPtrOff);
7456 Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
7457
7458 IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
7459 VrCopySize);
7460
7461 // And finally for remaining arguments.
7462 Value *StackSaveAreaShadowPtr =
7463 MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
7464 Align(16), /*isStore*/ true)
7465 .first;
7466
7467 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
7468 VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
7469
7470 IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
7471 Align(16), VAArgOverflowSize);
7472 }
7473 }
7474};
7475
7476/// PowerPC64-specific implementation of VarArgHelper.
7477struct VarArgPowerPC64Helper : public VarArgHelperBase {
7478 AllocaInst *VAArgTLSCopy = nullptr;
7479 Value *VAArgSize = nullptr;
7480
7481 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
7482 MemorySanitizerVisitor &MSV)
7483 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
7484
7485 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7486 // For PowerPC, we need to deal with the alignment of stack arguments -
7487 // they are mostly aligned to 8 bytes, but vectors and i128 arrays
7488 // are aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
7489 // For that reason, we compute the current offset from the stack pointer
7490 // (which is always properly aligned) and the offset of the first vararg,
7491 // then subtract them.
7492 unsigned VAArgBase;
7493 Triple TargetTriple(F.getParent()->getTargetTriple());
7494 // Parameter save area starts at 48 bytes from frame pointer for ABIv1,
7495 // and 32 bytes for ABIv2. This is usually determined by target
7496 // endianness, but in theory could be overridden by function attribute.
7497 if (TargetTriple.isPPC64ELFv2ABI())
7498 VAArgBase = 32;
7499 else
7500 VAArgBase = 48;
7501 unsigned VAArgOffset = VAArgBase;
7502 const DataLayout &DL = F.getDataLayout();
7503 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7504 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7505 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7506 if (IsByVal) {
7507 assert(A->getType()->isPointerTy());
7508 Type *RealTy = CB.getParamByValType(ArgNo);
7509 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7510 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
7511 if (ArgAlign < 8)
7512 ArgAlign = Align(8);
7513 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7514 if (!IsFixed) {
7515 Value *Base =
7516 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7517 if (Base) {
7518 Value *AShadowPtr, *AOriginPtr;
7519 std::tie(AShadowPtr, AOriginPtr) =
7520 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7521 kShadowTLSAlignment, /*isStore*/ false);
7522
7523 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7524 kShadowTLSAlignment, ArgSize);
7525 }
7526 }
7527 VAArgOffset += alignTo(ArgSize, Align(8));
7528 } else {
7529 Value *Base;
7530 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7531 Align ArgAlign = Align(8);
7532 if (A->getType()->isArrayTy()) {
7533 // Arrays are aligned to element size, except for long double
7534 // arrays, which are aligned to 8 bytes.
7535 Type *ElementTy = A->getType()->getArrayElementType();
7536 if (!ElementTy->isPPC_FP128Ty())
7537 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7538 } else if (A->getType()->isVectorTy()) {
7539 // Vectors are naturally aligned.
7540 ArgAlign = Align(ArgSize);
7541 }
7542 if (ArgAlign < 8)
7543 ArgAlign = Align(8);
7544 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7545 if (DL.isBigEndian()) {
7546           // Adjust the shadow for arguments smaller than 8 bytes to match
7547           // the placement of bits on big-endian systems.
7548 if (ArgSize < 8)
7549 VAArgOffset += (8 - ArgSize);
7550 }
7551 if (!IsFixed) {
7552 Base =
7553 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7554 if (Base)
7555 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7556 }
7557 VAArgOffset += ArgSize;
7558 VAArgOffset = alignTo(VAArgOffset, Align(8));
7559 }
7560 if (IsFixed)
7561 VAArgBase = VAArgOffset;
7562 }
7563
7564 Constant *TotalVAArgSize =
7565 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7566     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7567     // class member; the value stored here is the total size of all varargs.
7568 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7569 }
7570
7571 void finalizeInstrumentation() override {
7572 assert(!VAArgSize && !VAArgTLSCopy &&
7573 "finalizeInstrumentation called twice");
7574 IRBuilder<> IRB(MSV.FnPrologueEnd);
7575 VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7576 Value *CopySize = VAArgSize;
7577
7578 if (!VAStartInstrumentationList.empty()) {
7579 // If there is a va_start in this function, make a backup copy of
7580 // va_arg_tls somewhere in the function entry block.
7581
7582 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7583 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7584 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7585 CopySize, kShadowTLSAlignment, false);
7586
7587 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7588 Intrinsic::umin, CopySize,
7589 ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
7590 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7591 kShadowTLSAlignment, SrcSize);
7592 }
7593
7594 // Instrument va_start.
7595 // Copy va_list shadow from the backup copy of the TLS contents.
7596 for (CallInst *OrigInst : VAStartInstrumentationList) {
7597 NextNodeIRBuilder IRB(OrigInst);
7598 Value *VAListTag = OrigInst->getArgOperand(0);
7599 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7600
7601 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7602
7603 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7604 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7605 const DataLayout &DL = F.getDataLayout();
7606 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7607 const Align Alignment = Align(IntptrSize);
7608 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7609 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7610 Alignment, /*isStore*/ true);
7611 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7612 CopySize);
7613 }
7614 }
7615};
7616
7617/// PowerPC32-specific implementation of VarArgHelper.
7618struct VarArgPowerPC32Helper : public VarArgHelperBase {
7619 AllocaInst *VAArgTLSCopy = nullptr;
7620 Value *VAArgSize = nullptr;
7621
7622 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
7623 MemorySanitizerVisitor &MSV)
7624 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
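  // Note: the 12-byte tag matches the PPC32 SVR4 va_list layout, roughly
  //   { char gpr; char fpr; i16 reserved; void *overflow_arg_area;
  //     void *reg_save_area; }
  // so overflow_arg_area sits at offset 4 and reg_save_area at offset 8;
  // these are the two offsets used in finalizeInstrumentation() below.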
7625
7626 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7627 unsigned VAArgBase;
7628 // Parameter save area is 8 bytes from frame pointer in PPC32
7629 VAArgBase = 8;
7630 unsigned VAArgOffset = VAArgBase;
7631 const DataLayout &DL = F.getDataLayout();
7632 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7633 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7634 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7635 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7636 if (IsByVal) {
7637 assert(A->getType()->isPointerTy());
7638 Type *RealTy = CB.getParamByValType(ArgNo);
7639 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7640 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
7641 if (ArgAlign < IntptrSize)
7642 ArgAlign = Align(IntptrSize);
7643 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7644 if (!IsFixed) {
7645 Value *Base =
7646 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7647 if (Base) {
7648 Value *AShadowPtr, *AOriginPtr;
7649 std::tie(AShadowPtr, AOriginPtr) =
7650 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7651 kShadowTLSAlignment, /*isStore*/ false);
7652
7653 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7654 kShadowTLSAlignment, ArgSize);
7655 }
7656 }
7657 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
7658 } else {
7659 Value *Base;
7660 Type *ArgTy = A->getType();
7661
7662           // On PPC32, floating-point variable arguments are stored in a
7663           // separate area: fp_save_area = reg_save_area + 4*8. We do not copy
7664           // shadow for them, as they will be found when checking call arguments.
7665 if (!ArgTy->isFloatingPointTy()) {
7666 uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
7667 Align ArgAlign = Align(IntptrSize);
7668 if (ArgTy->isArrayTy()) {
7669 // Arrays are aligned to element size, except for long double
7670 // arrays, which are aligned to 8 bytes.
7671 Type *ElementTy = ArgTy->getArrayElementType();
7672 if (!ElementTy->isPPC_FP128Ty())
7673 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7674 } else if (ArgTy->isVectorTy()) {
7675 // Vectors are naturally aligned.
7676 ArgAlign = Align(ArgSize);
7677 }
7678 if (ArgAlign < IntptrSize)
7679 ArgAlign = Align(IntptrSize);
7680 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7681 if (DL.isBigEndian()) {
7682             // Adjust the shadow for arguments smaller than IntptrSize to
7683             // match the placement of bits on big-endian systems.
7684 if (ArgSize < IntptrSize)
7685 VAArgOffset += (IntptrSize - ArgSize);
7686 }
7687 if (!IsFixed) {
7688 Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
7689 ArgSize);
7690 if (Base)
7691               IRB.CreateAlignedStore(MSV.getShadow(A), Base,
7692                                      kShadowTLSAlignment);
7693           }
7694 VAArgOffset += ArgSize;
7695 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
7696 }
7697 }
7698 }
7699
7700 Constant *TotalVAArgSize =
7701 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7702     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7703     // class member; the value stored here is the total size of all varargs.
7704 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7705 }
7706
7707 void finalizeInstrumentation() override {
7708 assert(!VAArgSize && !VAArgTLSCopy &&
7709 "finalizeInstrumentation called twice");
7710 IRBuilder<> IRB(MSV.FnPrologueEnd);
7711 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
7712 Value *CopySize = VAArgSize;
7713
7714 if (!VAStartInstrumentationList.empty()) {
7715 // If there is a va_start in this function, make a backup copy of
7716 // va_arg_tls somewhere in the function entry block.
7717
7718 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7719 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7720 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7721 CopySize, kShadowTLSAlignment, false);
7722
7723 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7724 Intrinsic::umin, CopySize,
7725 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7726 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7727 kShadowTLSAlignment, SrcSize);
7728 }
7729
7730 // Instrument va_start.
7731 // Copy va_list shadow from the backup copy of the TLS contents.
7732 for (CallInst *OrigInst : VAStartInstrumentationList) {
7733 NextNodeIRBuilder IRB(OrigInst);
7734 Value *VAListTag = OrigInst->getArgOperand(0);
7735 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7736 Value *RegSaveAreaSize = CopySize;
7737
7738 // In PPC32 va_list_tag is a struct
7739 RegSaveAreaPtrPtr =
7740 IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
7741
7742       // On PPC32, reg_save_area can only hold 32 bytes of data.
7743 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
7744 Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
7745
7746 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7747 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7748
7749 const DataLayout &DL = F.getDataLayout();
7750 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7751 const Align Alignment = Align(IntptrSize);
7752
7753 { // Copy reg save area
7754 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7755 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7756 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7757 Alignment, /*isStore*/ true);
7758 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
7759 Alignment, RegSaveAreaSize);
7760
7761 RegSaveAreaShadowPtr =
7762 IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
7763 Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
7764 ConstantInt::get(MS.IntptrTy, 32));
7765 FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
7766         // We fill the FP shadow with zeroes, since uninitialized FP args
7767         // should already have been caught by the call-base checks.
7768 IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
7769 ConstantInt::get(MS.IntptrTy, 32), Alignment);
7770 }
7771
7772 { // Copy overflow area
7773 // RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
7774 Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
7775
7776 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7777 OverflowAreaPtrPtr =
7778 IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
7779 OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
7780
7781 Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
7782
7783 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
7784 std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
7785 MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
7786 Alignment, /*isStore*/ true);
7787
7788 Value *OverflowVAArgTLSCopyPtr =
7789 IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
7790 OverflowVAArgTLSCopyPtr =
7791 IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
7792
7793 OverflowVAArgTLSCopyPtr =
7794 IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
7795 IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
7796 OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
7797 }
7798 }
7799 }
7800};
7801
7802/// SystemZ-specific implementation of VarArgHelper.
7803struct VarArgSystemZHelper : public VarArgHelperBase {
7804 static const unsigned SystemZGpOffset = 16;
7805 static const unsigned SystemZGpEndOffset = 56;
7806 static const unsigned SystemZFpOffset = 128;
7807 static const unsigned SystemZFpEndOffset = 160;
7808 static const unsigned SystemZMaxVrArgs = 8;
7809 static const unsigned SystemZRegSaveAreaSize = 160;
7810 static const unsigned SystemZOverflowOffset = 160;
7811 static const unsigned SystemZVAListTagSize = 32;
7812 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
7813 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
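  // Note: the constants above mirror the s390x ELF ABI: the 160-byte register
  // save area keeps the GPR arguments r2-r6 at bytes 16-56 and the FPR
  // arguments f0/f2/f4/f6 at bytes 128-160, overflow arguments start at byte
  // 160, and the 32-byte va_list tag stores overflow_arg_area at offset 16
  // and reg_save_area at offset 24.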
7814
7815 bool IsSoftFloatABI;
7816 AllocaInst *VAArgTLSCopy = nullptr;
7817 AllocaInst *VAArgTLSOriginCopy = nullptr;
7818 Value *VAArgOverflowSize = nullptr;
7819
7820 enum class ArgKind {
7821 GeneralPurpose,
7822 FloatingPoint,
7823 Vector,
7824 Memory,
7825 Indirect,
7826 };
7827
7828 enum class ShadowExtension { None, Zero, Sign };
7829
7830 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
7831 MemorySanitizerVisitor &MSV)
7832 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
7833 IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
7834
7835 ArgKind classifyArgument(Type *T) {
7836 // T is a SystemZABIInfo::classifyArgumentType() output, and there are
7837 // only a few possibilities of what it can be. In particular, enums, single
7838 // element structs and large types have already been taken care of.
7839
7840 // Some i128 and fp128 arguments are converted to pointers only in the
7841 // back end.
7842 if (T->isIntegerTy(128) || T->isFP128Ty())
7843 return ArgKind::Indirect;
7844 if (T->isFloatingPointTy())
7845 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
7846 if (T->isIntegerTy() || T->isPointerTy())
7847 return ArgKind::GeneralPurpose;
7848 if (T->isVectorTy())
7849 return ArgKind::Vector;
7850 return ArgKind::Memory;
7851 }
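  // Illustrative mapping (hard-float ABI): i32 or i8* -> GeneralPurpose,
  // double -> FloatingPoint, <2 x i64> -> Vector, i128 or fp128 -> Indirect
  // (passed as a pointer in a GPR), remaining types -> Memory.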
7852
7853 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
7854 // ABI says: "One of the simple integer types no more than 64 bits wide.
7855 // ... If such an argument is shorter than 64 bits, replace it by a full
7856 // 64-bit integer representing the same number, using sign or zero
7857 // extension". Shadow for an integer argument has the same type as the
7858 // argument itself, so it can be sign or zero extended as well.
7859 bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
7860 bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
7861 if (ZExt) {
7862 assert(!SExt);
7863 return ShadowExtension::Zero;
7864 }
7865 if (SExt) {
7866 assert(!ZExt);
7867 return ShadowExtension::Sign;
7868 }
7869 return ShadowExtension::None;
7870 }
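  // Example (illustrative): a 'signext i32' vararg occupies a full 8-byte GPR
  // slot, so visitCallBase() sign-extends its shadow to i64 and stores it at
  // GpOffset; with ShadowExtension::None the shadow keeps its original width
  // and is stored at GpOffset + GapSize, skipping the unused leading bytes of
  // the big-endian slot.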
7871
7872 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7873 unsigned GpOffset = SystemZGpOffset;
7874 unsigned FpOffset = SystemZFpOffset;
7875 unsigned VrIndex = 0;
7876 unsigned OverflowOffset = SystemZOverflowOffset;
7877 const DataLayout &DL = F.getDataLayout();
7878 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7879 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7880 // SystemZABIInfo does not produce ByVal parameters.
7881 assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
7882 Type *T = A->getType();
7883 ArgKind AK = classifyArgument(T);
7884 if (AK == ArgKind::Indirect) {
7885 T = MS.PtrTy;
7886 AK = ArgKind::GeneralPurpose;
7887 }
7888 if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
7889 AK = ArgKind::Memory;
7890 if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
7891 AK = ArgKind::Memory;
7892 if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
7893 AK = ArgKind::Memory;
7894 Value *ShadowBase = nullptr;
7895 Value *OriginBase = nullptr;
7896 ShadowExtension SE = ShadowExtension::None;
7897 switch (AK) {
7898 case ArgKind::GeneralPurpose: {
7899 // Always keep track of GpOffset, but store shadow only for varargs.
7900 uint64_t ArgSize = 8;
7901 if (GpOffset + ArgSize <= kParamTLSSize) {
7902 if (!IsFixed) {
7903 SE = getShadowExtension(CB, ArgNo);
7904 uint64_t GapSize = 0;
7905 if (SE == ShadowExtension::None) {
7906 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
7907 assert(ArgAllocSize <= ArgSize);
7908 GapSize = ArgSize - ArgAllocSize;
7909 }
7910 ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
7911 if (MS.TrackOrigins)
7912 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
7913 }
7914 GpOffset += ArgSize;
7915 } else {
7916 GpOffset = kParamTLSSize;
7917 }
7918 break;
7919 }
7920 case ArgKind::FloatingPoint: {
7921 // Always keep track of FpOffset, but store shadow only for varargs.
7922 uint64_t ArgSize = 8;
7923 if (FpOffset + ArgSize <= kParamTLSSize) {
7924 if (!IsFixed) {
7925 // PoP says: "A short floating-point datum requires only the
7926 // left-most 32 bit positions of a floating-point register".
7927 // Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
7928 // don't extend shadow and don't mind the gap.
7929 ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
7930 if (MS.TrackOrigins)
7931 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7932 }
7933 FpOffset += ArgSize;
7934 } else {
7935 FpOffset = kParamTLSSize;
7936 }
7937 break;
7938 }
7939 case ArgKind::Vector: {
7940 // Keep track of VrIndex. No need to store shadow, since vector varargs
7941 // go through AK_Memory.
7942 assert(IsFixed);
7943 VrIndex++;
7944 break;
7945 }
7946 case ArgKind::Memory: {
7947 // Keep track of OverflowOffset and store shadow only for varargs.
7948 // Ignore fixed args, since we need to copy only the vararg portion of
7949 // the overflow area shadow.
7950 if (!IsFixed) {
7951 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
7952 uint64_t ArgSize = alignTo(ArgAllocSize, 8);
7953 if (OverflowOffset + ArgSize <= kParamTLSSize) {
7954 SE = getShadowExtension(CB, ArgNo);
7955 uint64_t GapSize =
7956 SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
7957 ShadowBase =
7958 getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
7959 if (MS.TrackOrigins)
7960 OriginBase =
7961 getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
7962 OverflowOffset += ArgSize;
7963 } else {
7964 OverflowOffset = kParamTLSSize;
7965 }
7966 }
7967 break;
7968 }
7969 case ArgKind::Indirect:
7970 llvm_unreachable("Indirect must be converted to GeneralPurpose");
7971 }
7972 if (ShadowBase == nullptr)
7973 continue;
7974 Value *Shadow = MSV.getShadow(A);
7975 if (SE != ShadowExtension::None)
7976 Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
7977 /*Signed*/ SE == ShadowExtension::Sign);
7978 ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
7979 IRB.CreateStore(Shadow, ShadowBase);
7980 if (MS.TrackOrigins) {
7981 Value *Origin = MSV.getOrigin(A);
7982 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7983         MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7984                         kMinOriginAlignment);
7985       }
7986 }
7987 Constant *OverflowSize = ConstantInt::get(
7988 IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
7989 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7990 }
7991
7992 void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
7993 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
7994 IRB.CreateAdd(
7995 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7996 ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
7997 MS.PtrTy);
7998 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7999 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8000 const Align Alignment = Align(8);
8001 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8002 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
8003 /*isStore*/ true);
8004 // TODO(iii): copy only fragments filled by visitCallBase()
8005 // TODO(iii): support packed-stack && !use-soft-float
8006 // For use-soft-float functions, it is enough to copy just the GPRs.
8007 unsigned RegSaveAreaSize =
8008 IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
8009 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8010 RegSaveAreaSize);
8011 if (MS.TrackOrigins)
8012 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
8013 Alignment, RegSaveAreaSize);
8014 }
8015
8016 // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
8017 // don't know real overflow size and can't clear shadow beyond kParamTLSSize.
8018 void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
8019 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
8020 IRB.CreateAdd(
8021 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8022 ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
8023 MS.PtrTy);
8024 Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
8025 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
8026 const Align Alignment = Align(8);
8027 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
8028 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
8029 Alignment, /*isStore*/ true);
8030 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
8031 SystemZOverflowOffset);
8032 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
8033 VAArgOverflowSize);
8034 if (MS.TrackOrigins) {
8035 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
8036 SystemZOverflowOffset);
8037 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
8038 VAArgOverflowSize);
8039 }
8040 }
8041
8042 void finalizeInstrumentation() override {
8043 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
8044 "finalizeInstrumentation called twice");
8045 if (!VAStartInstrumentationList.empty()) {
8046 // If there is a va_start in this function, make a backup copy of
8047 // va_arg_tls somewhere in the function entry block.
8048 IRBuilder<> IRB(MSV.FnPrologueEnd);
8049 VAArgOverflowSize =
8050 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8051 Value *CopySize =
8052 IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
8053 VAArgOverflowSize);
8054 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8055 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8056 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8057 CopySize, kShadowTLSAlignment, false);
8058
8059 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8060 Intrinsic::umin, CopySize,
8061 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8062 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8063 kShadowTLSAlignment, SrcSize);
8064 if (MS.TrackOrigins) {
8065 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8066 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
8067 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
8068 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
8069 }
8070 }
8071
8072 // Instrument va_start.
8073 // Copy va_list shadow from the backup copy of the TLS contents.
8074 for (CallInst *OrigInst : VAStartInstrumentationList) {
8075 NextNodeIRBuilder IRB(OrigInst);
8076 Value *VAListTag = OrigInst->getArgOperand(0);
8077 copyRegSaveArea(IRB, VAListTag);
8078 copyOverflowArea(IRB, VAListTag);
8079 }
8080 }
8081};
8082
8083/// i386-specific implementation of VarArgHelper.
8084struct VarArgI386Helper : public VarArgHelperBase {
8085 AllocaInst *VAArgTLSCopy = nullptr;
8086 Value *VAArgSize = nullptr;
8087
8088 VarArgI386Helper(Function &F, MemorySanitizer &MS,
8089 MemorySanitizerVisitor &MSV)
8090 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
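  // Note: on i386, va_list is a single 4-byte pointer into the stack argument
  // area (hence VAListTagSize == 4); finalizeInstrumentation() below reloads
  // that pointer from the va_list tag and copies the saved vararg shadow over
  // the memory it points to.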
8091
8092 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8093 const DataLayout &DL = F.getDataLayout();
8094 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8095 unsigned VAArgOffset = 0;
8096 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8097 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8098 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8099 if (IsByVal) {
8100 assert(A->getType()->isPointerTy());
8101 Type *RealTy = CB.getParamByValType(ArgNo);
8102 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8103 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8104 if (ArgAlign < IntptrSize)
8105 ArgAlign = Align(IntptrSize);
8106 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8107 if (!IsFixed) {
8108 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8109 if (Base) {
8110 Value *AShadowPtr, *AOriginPtr;
8111 std::tie(AShadowPtr, AOriginPtr) =
8112 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8113 kShadowTLSAlignment, /*isStore*/ false);
8114
8115 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8116 kShadowTLSAlignment, ArgSize);
8117 }
8118 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8119 }
8120 } else {
8121 Value *Base;
8122 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8123 Align ArgAlign = Align(IntptrSize);
8124 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8125 if (DL.isBigEndian()) {
8126           // Adjust the shadow for arguments smaller than IntptrSize to
8127           // match the placement of bits on big-endian systems.
8128 if (ArgSize < IntptrSize)
8129 VAArgOffset += (IntptrSize - ArgSize);
8130 }
8131 if (!IsFixed) {
8132 Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8133 if (Base)
8134 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8135 VAArgOffset += ArgSize;
8136 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8137 }
8138 }
8139 }
8140
8141 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8142     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8143     // class member; the value stored here is the total size of all varargs.
8144 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8145 }
8146
8147 void finalizeInstrumentation() override {
8148 assert(!VAArgSize && !VAArgTLSCopy &&
8149 "finalizeInstrumentation called twice");
8150 IRBuilder<> IRB(MSV.FnPrologueEnd);
8151 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8152 Value *CopySize = VAArgSize;
8153
8154 if (!VAStartInstrumentationList.empty()) {
8155 // If there is a va_start in this function, make a backup copy of
8156 // va_arg_tls somewhere in the function entry block.
8157 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8158 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8159 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8160 CopySize, kShadowTLSAlignment, false);
8161
8162 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8163 Intrinsic::umin, CopySize,
8164 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8165 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8166 kShadowTLSAlignment, SrcSize);
8167 }
8168
8169 // Instrument va_start.
8170 // Copy va_list shadow from the backup copy of the TLS contents.
8171 for (CallInst *OrigInst : VAStartInstrumentationList) {
8172 NextNodeIRBuilder IRB(OrigInst);
8173 Value *VAListTag = OrigInst->getArgOperand(0);
8174 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8175 Value *RegSaveAreaPtrPtr =
8176 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8177 PointerType::get(*MS.C, 0));
8178 Value *RegSaveAreaPtr =
8179 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8180 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8181 const DataLayout &DL = F.getDataLayout();
8182 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8183 const Align Alignment = Align(IntptrSize);
8184 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8185 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8186 Alignment, /*isStore*/ true);
8187 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8188 CopySize);
8189 }
8190 }
8191};
8192
8193/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
8194/// LoongArch64.
8195struct VarArgGenericHelper : public VarArgHelperBase {
8196 AllocaInst *VAArgTLSCopy = nullptr;
8197 Value *VAArgSize = nullptr;
8198
8199 VarArgGenericHelper(Function &F, MemorySanitizer &MS,
8200 MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
8201 : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
8202
8203 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8204 unsigned VAArgOffset = 0;
8205 const DataLayout &DL = F.getDataLayout();
8206 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8207 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8208 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8209 if (IsFixed)
8210 continue;
8211 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8212 if (DL.isBigEndian()) {
8213         // Adjust the shadow for arguments smaller than IntptrSize to match
8214         // the placement of bits on big-endian systems.
8215 if (ArgSize < IntptrSize)
8216 VAArgOffset += (IntptrSize - ArgSize);
8217 }
8218 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8219 VAArgOffset += ArgSize;
8220 VAArgOffset = alignTo(VAArgOffset, IntptrSize);
8221 if (!Base)
8222 continue;
8223 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8224 }
8225
8226 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8227     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8228     // class member; the value stored here is the total size of all varargs.
8229 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8230 }
8231
8232 void finalizeInstrumentation() override {
8233 assert(!VAArgSize && !VAArgTLSCopy &&
8234 "finalizeInstrumentation called twice");
8235 IRBuilder<> IRB(MSV.FnPrologueEnd);
8236 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8237 Value *CopySize = VAArgSize;
8238
8239 if (!VAStartInstrumentationList.empty()) {
8240 // If there is a va_start in this function, make a backup copy of
8241 // va_arg_tls somewhere in the function entry block.
8242 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8243 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8244 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8245 CopySize, kShadowTLSAlignment, false);
8246
8247 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8248 Intrinsic::umin, CopySize,
8249 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8250 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8251 kShadowTLSAlignment, SrcSize);
8252 }
8253
8254 // Instrument va_start.
8255 // Copy va_list shadow from the backup copy of the TLS contents.
8256 for (CallInst *OrigInst : VAStartInstrumentationList) {
8257 NextNodeIRBuilder IRB(OrigInst);
8258 Value *VAListTag = OrigInst->getArgOperand(0);
8259 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8260 Value *RegSaveAreaPtrPtr =
8261 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8262 PointerType::get(*MS.C, 0));
8263 Value *RegSaveAreaPtr =
8264 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8265 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8266 const DataLayout &DL = F.getDataLayout();
8267 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8268 const Align Alignment = Align(IntptrSize);
8269 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8270 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8271 Alignment, /*isStore*/ true);
8272 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8273 CopySize);
8274 }
8275 }
8276};
8277
8278 // ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
8279 // regarding VAArgs.
8280using VarArgARM32Helper = VarArgGenericHelper;
8281using VarArgRISCVHelper = VarArgGenericHelper;
8282using VarArgMIPSHelper = VarArgGenericHelper;
8283using VarArgLoongArch64Helper = VarArgGenericHelper;
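// Note: each of these targets keeps the "next argument" pointer at offset 0
// of its va_list tag, which is what lets VarArgGenericHelper simply load that
// pointer and copy the whole saved vararg shadow over the area it points to.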
8284
8285/// A no-op implementation of VarArgHelper.
8286struct VarArgNoOpHelper : public VarArgHelper {
8287 VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
8288 MemorySanitizerVisitor &MSV) {}
8289
8290 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
8291
8292 void visitVAStartInst(VAStartInst &I) override {}
8293
8294 void visitVACopyInst(VACopyInst &I) override {}
8295
8296 void finalizeInstrumentation() override {}
8297};
8298
8299} // end anonymous namespace
8300
8301static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
8302 MemorySanitizerVisitor &Visitor) {
8303   // VarArg handling is implemented only for the targets listed below; on any
8304   // other platform the no-op helper is used and false positives are possible.
8305 Triple TargetTriple(Func.getParent()->getTargetTriple());
8306
8307 if (TargetTriple.getArch() == Triple::x86)
8308 return new VarArgI386Helper(Func, Msan, Visitor);
8309
8310 if (TargetTriple.getArch() == Triple::x86_64)
8311 return new VarArgAMD64Helper(Func, Msan, Visitor);
8312
8313 if (TargetTriple.isARM())
8314 return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8315
8316 if (TargetTriple.isAArch64())
8317 return new VarArgAArch64Helper(Func, Msan, Visitor);
8318
8319 if (TargetTriple.isSystemZ())
8320 return new VarArgSystemZHelper(Func, Msan, Visitor);
8321
8322 // On PowerPC32 VAListTag is a struct
8323 // {char, char, i16 padding, char *, char *}
8324 if (TargetTriple.isPPC32())
8325 return new VarArgPowerPC32Helper(Func, Msan, Visitor);
8326
8327 if (TargetTriple.isPPC64())
8328 return new VarArgPowerPC64Helper(Func, Msan, Visitor);
8329
8330 if (TargetTriple.isRISCV32())
8331 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8332
8333 if (TargetTriple.isRISCV64())
8334 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8335
8336 if (TargetTriple.isMIPS32())
8337 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8338
8339 if (TargetTriple.isMIPS64())
8340 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8341
8342 if (TargetTriple.isLoongArch64())
8343 return new VarArgLoongArch64Helper(Func, Msan, Visitor,
8344 /*VAListTagSize=*/8);
8345
8346 return new VarArgNoOpHelper(Func, Msan, Visitor);
8347}
8348
8349bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
8350 if (!CompileKernel && F.getName() == kMsanModuleCtorName)
8351 return false;
8352
8353 if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
8354 return false;
8355
8356 MemorySanitizerVisitor Visitor(F, *this, TLI);
8357
8358   // Clear out memory attributes.
8359   AttributeMask B;
8360   B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
8361 F.removeFnAttrs(B);
8362
8363 return Visitor.runOnFunction();
8364}
#define Success
assert(UImm &&(UImm !=~static_cast< T >(0)) &&"Invalid immediate!")
constexpr LLT S1
This file implements a class to represent arbitrary precision integral constant values and operations...
static bool isStore(int Opcode)
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
static cl::opt< ITMode > IT(cl::desc("IT block support"), cl::Hidden, cl::init(DefaultIT), cl::values(clEnumValN(DefaultIT, "arm-default-it", "Generate any type of IT block"), clEnumValN(RestrictedIT, "arm-restrict-it", "Disallow complex IT blocks")))
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClWithComdat("asan-with-comdat", cl::desc("Place ASan constructors in comdat sections"), cl::Hidden, cl::init(true))
VarLocInsertPt getNextNode(const DbgRecord *DVR)
Atomic ordering constants.
This file contains the simple types necessary to represent the attributes associated with functions a...
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
static GCRegistry::Add< StatepointGC > D("statepoint-example", "an example strategy for statepoint")
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
Analysis containing CSE Info
Definition CSEInfo.cpp:27
This file contains the declarations for the subclasses of Constant, which represent the different fla...
const MemoryMapParams Linux_LoongArch64_MemoryMapParams
const MemoryMapParams Linux_X86_64_MemoryMapParams
static cl::opt< int > ClTrackOrigins("dfsan-track-origins", cl::desc("Track origins of labels"), cl::Hidden, cl::init(0))
static AtomicOrdering addReleaseOrdering(AtomicOrdering AO)
static AtomicOrdering addAcquireOrdering(AtomicOrdering AO)
const MemoryMapParams Linux_AArch64_MemoryMapParams
static bool isAMustTailRetVal(Value *RetVal)
This file provides an implementation of debug counters.
#define DEBUG_COUNTER(VARNAME, COUNTERNAME, DESC)
This file defines the DenseMap class.
This file builds on the ADT/GraphTraits.h file to build generic depth first graph iterator.
@ Default
static bool runOnFunction(Function &F, bool PostInlining)
This is the interface for a simple mod/ref and alias analysis over globals.
static size_t TypeSizeToSizeIndex(uint32_t TypeSize)
#define op(i)
Hexagon Common GEP
#define _
Hexagon Vector Combine
Module.h This file contains the declarations for the Module class.
static LVOptions Options
Definition LVOptions.cpp:25
#define F(x, y, z)
Definition MD5.cpp:55
#define I(x, y, z)
Definition MD5.cpp:58
Machine Check Debug Module
static const PlatformMemoryMapParams Linux_S390_MemoryMapParams
static const Align kMinOriginAlignment
static cl::opt< uint64_t > ClShadowBase("msan-shadow-base", cl::desc("Define custom MSan ShadowBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClPoisonUndef("msan-poison-undef", cl::desc("Poison fully undef temporary values. " "Partially undefined constant vectors " "are unaffected by this flag (see " "-msan-poison-undef-vectors)."), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_X86_MemoryMapParams
static cl::opt< uint64_t > ClOriginBase("msan-origin-base", cl::desc("Define custom MSan OriginBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClCheckConstantShadow("msan-check-constant-shadow", cl::desc("Insert checks for constant shadow values"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams
static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams
static const unsigned kOriginSize
static cl::opt< bool > ClWithComdat("msan-with-comdat", cl::desc("Place MSan constructors in comdat sections"), cl::Hidden, cl::init(false))
static cl::opt< int > ClTrackOrigins("msan-track-origins", cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden, cl::init(0))
Track origins of uninitialized values.
static cl::opt< int > ClInstrumentationWithCallThreshold("msan-instrumentation-with-call-threshold", cl::desc("If the function being instrumented requires more than " "this number of checks and origin stores, use callbacks instead of " "inline checks (-1 means never use callbacks)."), cl::Hidden, cl::init(3500))
static cl::opt< int > ClPoisonStackPattern("msan-poison-stack-pattern", cl::desc("poison uninitialized stack variables with the given pattern"), cl::Hidden, cl::init(0xff))
static const Align kShadowTLSAlignment
static cl::opt< bool > ClHandleICmpExact("msan-handle-icmp-exact", cl::desc("exact handling of relational integer ICmp"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams
static cl::opt< bool > ClDumpStrictInstructions("msan-dump-strict-instructions", cl::desc("print out instructions with default strict semantics i.e.," "check that all the inputs are fully initialized, and mark " "the output as fully initialized. These semantics are applied " "to instructions that could not be handled explicitly nor " "heuristically."), cl::Hidden, cl::init(false))
static Constant * getOrInsertGlobal(Module &M, StringRef Name, Type *Ty)
static cl::opt< bool > ClPreciseDisjointOr("msan-precise-disjoint-or", cl::desc("Precisely poison disjoint OR. If false (legacy behavior), " "disjointedness is ignored (i.e., 1|1 is initialized)."), cl::Hidden, cl::init(false))
static const MemoryMapParams Linux_S390X_MemoryMapParams
static cl::opt< bool > ClPoisonStack("msan-poison-stack", cl::desc("poison uninitialized stack variables"), cl::Hidden, cl::init(true))
static const MemoryMapParams Linux_I386_MemoryMapParams
const char kMsanInitName[]
static cl::opt< bool > ClPoisonUndefVectors("msan-poison-undef-vectors", cl::desc("Precisely poison partially undefined constant vectors. " "If false (legacy behavior), the entire vector is " "considered fully initialized, which may lead to false " "negatives. Fully undefined constant vectors are " "unaffected by this flag (see -msan-poison-undef)."), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPrintStackNames("msan-print-stack-names", cl::desc("Print name of local stack variable"), cl::Hidden, cl::init(true))
static cl::opt< uint64_t > ClAndMask("msan-and-mask", cl::desc("Define custom MSan AndMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleLifetimeIntrinsics("msan-handle-lifetime-intrinsics", cl::desc("when possible, poison scoped variables at the beginning of the scope " "(slower, but more precise)"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClKeepGoing("msan-keep-going", cl::desc("keep going after reporting a UMR"), cl::Hidden, cl::init(false))
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams
static GlobalVariable * createPrivateConstGlobalForString(Module &M, StringRef Str)
Create a non-const global initialized with the given string.
static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClEagerChecks("msan-eager-checks", cl::desc("check arguments and return values at function call boundaries"), cl::Hidden, cl::init(false))
static cl::opt< int > ClDisambiguateWarning("msan-disambiguate-warning-threshold", cl::desc("Define threshold for number of checks per " "debug location to force origin update."), cl::Hidden, cl::init(3))
static VarArgHelper * CreateVarArgHelper(Function &Func, MemorySanitizer &Msan, MemorySanitizerVisitor &Visitor)
static const MemoryMapParams Linux_MIPS64_MemoryMapParams
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams
static cl::opt< uint64_t > ClXorMask("msan-xor-mask", cl::desc("Define custom MSan XorMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleAsmConservative("msan-handle-asm-conservative", cl::desc("conservative handling of inline assembly"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams
static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams
static const unsigned kParamTLSSize
static cl::opt< bool > ClHandleICmp("msan-handle-icmp", cl::desc("propagate shadow through ICmpEQ and ICmpNE"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClEnableKmsan("msan-kernel", cl::desc("Enable KernelMemorySanitizer instrumentation"), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPoisonStackWithCall("msan-poison-stack-with-call", cl::desc("poison uninitialized stack variables with a call"), cl::Hidden, cl::init(false))
static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams
static cl::opt< bool > ClDumpHeuristicInstructions("msan-dump-heuristic-instructions", cl::desc("Prints 'unknown' instructions that were handled heuristically. " "Use -msan-dump-strict-instructions to print instructions that " "could not be handled explicitly nor heuristically."), cl::Hidden, cl::init(false))
static const unsigned kRetvalTLSSize
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams
const char kMsanModuleCtorName[]
static const MemoryMapParams FreeBSD_I386_MemoryMapParams
static cl::opt< bool > ClCheckAccessAddress("msan-check-access-address", cl::desc("report accesses through a pointer which has poisoned shadow"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClDisableChecks("msan-disable-checks", cl::desc("Apply no_sanitize to the whole file"), cl::Hidden, cl::init(false))
#define T
FunctionAnalysisManager FAM
if(PassOpts->AAPipeline)
const SmallVectorImpl< MachineOperand > & Cond
static const char * name
void visit(MachineFunction &MF, MachineBasicBlock &Start, std::function< void(MachineBasicBlock *)> op)
This file implements a set that has insertion order iteration characteristics.
This file defines the SmallPtrSet class.
This file defines the SmallVector class.
This file contains some functions that are useful when dealing with strings.
#define LLVM_DEBUG(...)
Definition Debug.h:119
static TableGen::Emitter::OptClass< SkeletonEmitter > X("gen-skeleton-class", "Generate example skeleton class")
static SymbolRef::Type getType(const Symbol *Sym)
Definition TapiFile.cpp:39
Value * RHS
Value * LHS
static APInt getSignedMinValue(unsigned numBits)
Gets minimum signed value of APInt for a specific bit width.
Definition APInt.h:219
void setAlignment(Align Align)
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
const T & front() const
front - Get the first element.
Definition ArrayRef.h:150
static LLVM_ABI ArrayType * get(Type *ElementType, uint64_t NumElements)
This static method is the primary way to construct an ArrayType.
This class stores enough information to efficiently remove some attributes from an existing AttrBuild...
AttributeMask & addAttribute(Attribute::AttrKind Val)
Add an attribute to the mask.
iterator end()
Definition BasicBlock.h:472
LLVM_ABI const_iterator getFirstInsertionPt() const
Returns an iterator to the first instruction in this block that is suitable for inserting a non-PHI i...
LLVM_ABI const BasicBlock * getSinglePredecessor() const
Return the predecessor of this block if it has a single predecessor block.
InstListType::iterator iterator
Instruction iterators...
Definition BasicBlock.h:170
bool isInlineAsm() const
Check if this call is an inline asm statement.
Function * getCalledFunction() const
Returns the function called, or null if this is an indirect function invocation or the function signa...
bool hasRetAttr(Attribute::AttrKind Kind) const
Determine whether the return value has the given attribute.
LLVM_ABI bool paramHasAttr(unsigned ArgNo, Attribute::AttrKind Kind) const
Determine whether the argument or parameter has the given attribute.
void removeFnAttrs(const AttributeMask &AttrsToRemove)
Removes the attributes from the function.
void setCannotMerge()
MaybeAlign getParamAlign(unsigned ArgNo) const
Extract the alignment for a call or parameter (0=unknown).
Type * getParamByValType(unsigned ArgNo) const
Extract the byval type for a call or parameter.
Value * getCalledOperand() const
Type * getParamElementType(unsigned ArgNo) const
Extract the elementtype type for a parameter.
Value * getArgOperand(unsigned i) const
void setArgOperand(unsigned i, Value *v)
FunctionType * getFunctionType() const
iterator_range< User::op_iterator > args()
Iteration adapter for range-for loops.
void addParamAttr(unsigned ArgNo, Attribute::AttrKind Kind)
Adds the attribute to the indicated argument.
Predicate
This enumeration lists the possible predicates for CmpInst subclasses.
Definition InstrTypes.h:678
@ ICMP_SLT
signed less than
Definition InstrTypes.h:707
@ ICMP_SLE
signed less or equal
Definition InstrTypes.h:708
@ ICMP_SGT
signed greater than
Definition InstrTypes.h:705
@ ICMP_SGE
signed greater or equal
Definition InstrTypes.h:706
static LLVM_ABI Constant * get(ArrayType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getString(LLVMContext &Context, StringRef Initializer, bool AddNull=true)
This method constructs a CDS and initializes it with a text string.
static LLVM_ABI Constant * get(LLVMContext &Context, ArrayRef< uint8_t > Elts)
get() constructors - Return a constant with vector type with an element count and element type matchi...
static ConstantInt * getSigned(IntegerType *Ty, int64_t V)
Return a ConstantInt with the specified value for the specified type.
Definition Constants.h:131
static LLVM_ABI ConstantInt * getBool(LLVMContext &Context, bool V)
static LLVM_ABI Constant * get(StructType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getSplat(ElementCount EC, Constant *Elt)
Return a ConstantVector with the specified constant in each element.
static LLVM_ABI Constant * get(ArrayRef< Constant * > V)
This is an important base class in LLVM.
Definition Constant.h:43
static LLVM_ABI Constant * getAllOnesValue(Type *Ty)
LLVM_ABI bool isAllOnesValue() const
Return true if this is the value that would be returned by getAllOnesValue.
static LLVM_ABI Constant * getNullValue(Type *Ty)
Constructor to create a '0' constant of arbitrary type.
LLVM_ABI Constant * getAggregateElement(unsigned Elt) const
For aggregates (struct/array/vector) return the constant that corresponds to the specified element if...
LLVM_ABI bool isZeroValue() const
Return true if the value is negative zero or null value.
Definition Constants.cpp:76
LLVM_ABI bool isNullValue() const
Return true if this is the value that would be returned by getNullValue.
Definition Constants.cpp:90
static bool shouldExecute(unsigned CounterName)
bool empty() const
Definition DenseMap.h:107
unsigned getNumElements() const
static LLVM_ABI FixedVectorType * get(Type *ElementType, unsigned NumElts)
Definition Type.cpp:803
static FixedVectorType * getHalfElementsVectorType(FixedVectorType *VTy)
A handy container for a FunctionType+Callee-pointer pair, which can be passed around as a single enti...
unsigned getNumParams() const
Return the number of fixed parameters this function type requires.
LLVM_ABI void setComdat(Comdat *C)
Definition Globals.cpp:214
@ PrivateLinkage
Like Internal, but omit from symbol table.
Definition GlobalValue.h:61
@ ExternalLinkage
Externally visible function.
Definition GlobalValue.h:53
Analysis pass providing a never-invalidated alias analysis result.
ConstantInt * getInt1(bool V)
Get a constant value representing either true or false.
Definition IRBuilder.h:497
Value * CreateInsertElement(Type *VecTy, Value *NewElt, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2571
Value * CreateConstGEP1_32(Type *Ty, Value *Ptr, unsigned Idx0, const Twine &Name="")
Definition IRBuilder.h:1936
AllocaInst * CreateAlloca(Type *Ty, unsigned AddrSpace, Value *ArraySize=nullptr, const Twine &Name="")
Definition IRBuilder.h:1830
IntegerType * getInt1Ty()
Fetch the type representing a single bit.
Definition IRBuilder.h:547
LLVM_ABI CallInst * CreateMaskedCompressStore(Value *Val, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr)
Create a call to Masked Compress Store intrinsic.
Value * CreateInsertValue(Value *Agg, Value *Val, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2625
Value * CreateExtractElement(Value *Vec, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2559
IntegerType * getIntNTy(unsigned N)
Fetch the type representing an N-bit integer.
Definition IRBuilder.h:575
LoadInst * CreateAlignedLoad(Type *Ty, Value *Ptr, MaybeAlign Align, const char *Name)
Definition IRBuilder.h:1864
Value * CreateZExtOrTrunc(Value *V, Type *DestTy, const Twine &Name="")
Create a ZExt or Trunc from the integer value V to DestTy.
Definition IRBuilder.h:2100
CallInst * CreateMemCpy(Value *Dst, MaybeAlign DstAlign, Value *Src, MaybeAlign SrcAlign, uint64_t Size, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memcpy between the specified pointers.
Definition IRBuilder.h:687
LLVM_ABI CallInst * CreateAndReduce(Value *Src)
Create a vector int AND reduction intrinsic of the source vector.
Value * CreatePointerCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2251
Value * CreateExtractValue(Value *Agg, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2618
LLVM_ABI CallInst * CreateMaskedLoad(Type *Ty, Value *Ptr, Align Alignment, Value *Mask, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Load intrinsic.
LLVM_ABI Value * CreateSelect(Value *C, Value *True, Value *False, const Twine &Name="", Instruction *MDFrom=nullptr)
BasicBlock::iterator GetInsertPoint() const
Definition IRBuilder.h:202
Value * CreateSExt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2094
Value * CreateIntToPtr(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2199
Value * CreateLShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1513
IntegerType * getInt32Ty()
Fetch the type representing a 32-bit integer.
Definition IRBuilder.h:562
ConstantInt * getInt8(uint8_t C)
Get a constant 8-bit value.
Definition IRBuilder.h:512
IntegerType * getInt64Ty()
Fetch the type representing a 64-bit integer.
Definition IRBuilder.h:567
Value * CreateUDiv(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1454
Value * CreateICmpNE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2333
Value * CreateGEP(Type *Ty, Value *Ptr, ArrayRef< Value * > IdxList, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:1923
Value * CreateNeg(Value *V, const Twine &Name="", bool HasNSW=false)
Definition IRBuilder.h:1781
LLVM_ABI CallInst * CreateOrReduce(Value *Src)
Create a vector int OR reduction intrinsic of the source vector.
LLVM_ABI Value * CreateBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with 2 operands which is mangled on the first type.
LLVM_ABI CallInst * CreateIntrinsic(Intrinsic::ID ID, ArrayRef< Type * > Types, ArrayRef< Value * > Args, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with Args, mangled using Types.
ConstantInt * getInt32(uint32_t C)
Get a constant 32-bit value.
Definition IRBuilder.h:522
PHINode * CreatePHI(Type *Ty, unsigned NumReservedValues, const Twine &Name="")
Definition IRBuilder.h:2494
Value * CreateNot(Value *V, const Twine &Name="")
Definition IRBuilder.h:1805
Value * CreateICmpEQ(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2329
LLVM_ABI DebugLoc getCurrentDebugLocation() const
Get location information used by debugging information.
Definition IRBuilder.cpp:63
Value * CreateSub(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1420
Value * CreateBitCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2204
ConstantInt * getIntN(unsigned N, uint64_t C)
Get a constant N-bit value, zero extended or truncated from a 64-bit value.
Definition IRBuilder.h:533
LoadInst * CreateLoad(Type *Ty, Value *Ptr, const char *Name)
Provided to resolve 'CreateLoad(Ty, Ptr, "...")' correctly, instead of converting the string to 'bool...
Definition IRBuilder.h:1847
Value * CreateShl(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1492
CallInst * CreateMemSet(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memset to the specified pointer and the specified value.
Definition IRBuilder.h:630
Value * CreateZExt(Value *V, Type *DestTy, const Twine &Name="", bool IsNonNeg=false)
Definition IRBuilder.h:2082
Value * CreateShuffleVector(Value *V1, Value *V2, Value *Mask, const Twine &Name="")
Definition IRBuilder.h:2593
LLVMContext & getContext() const
Definition IRBuilder.h:203
Value * CreateAnd(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1551
StoreInst * CreateStore(Value *Val, Value *Ptr, bool isVolatile=false)
Definition IRBuilder.h:1860
LLVM_ABI CallInst * CreateMaskedStore(Value *Val, Value *Ptr, Align Alignment, Value *Mask)
Create a call to Masked Store intrinsic.
Value * CreateAdd(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1403
Value * CreatePtrToInt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2194
Value * CreateIsNotNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg != 0.
Definition IRBuilder.h:2651
CallInst * CreateCall(FunctionType *FTy, Value *Callee, ArrayRef< Value * > Args={}, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:2508
Value * CreateTrunc(Value *V, Type *DestTy, const Twine &Name="", bool IsNUW=false, bool IsNSW=false)
Definition IRBuilder.h:2068
PointerType * getPtrTy(unsigned AddrSpace=0)
Fetch the type representing a pointer.
Definition IRBuilder.h:605
Value * CreateBinOp(Instruction::BinaryOps Opc, Value *LHS, Value *RHS, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:1708
Value * CreateICmpSLT(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2361
LLVM_ABI Value * CreateTypeSize(Type *Ty, TypeSize Size)
Create an expression which evaluates to the number of units in Size at runtime.
Value * CreateICmpUGE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2341
Value * CreateIntCast(Value *V, Type *DestTy, bool isSigned, const Twine &Name="")
Definition IRBuilder.h:2277
Value * CreateIsNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg == 0.
Definition IRBuilder.h:2646
void SetInsertPoint(BasicBlock *TheBB)
This specifies that created instructions should be appended to the end of the specified block.
Definition IRBuilder.h:207
Type * getVoidTy()
Fetch the type representing void.
Definition IRBuilder.h:600
StoreInst * CreateAlignedStore(Value *Val, Value *Ptr, MaybeAlign Align, bool isVolatile=false)
Definition IRBuilder.h:1883
LLVM_ABI CallInst * CreateMaskedExpandLoad(Type *Ty, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Expand Load intrinsic.
Value * CreateInBoundsPtrAdd(Value *Ptr, Value *Offset, const Twine &Name="")
Definition IRBuilder.h:2041
Value * CreateAShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1532
Value * CreateXor(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1599
Value * CreateICmp(CmpInst::Predicate P, Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2439
Value * CreateOr(Value *LHS, Value *RHS, const Twine &Name="", bool IsDisjoint=false)
Definition IRBuilder.h:1573
IntegerType * getInt8Ty()
Fetch the type representing an 8-bit integer.
Definition IRBuilder.h:552
Value * CreateMul(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1437
LLVM_ABI CallInst * CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment, Value *Mask=nullptr)
Create a call to Masked Scatter intrinsic.
LLVM_ABI CallInst * CreateMaskedGather(Type *Ty, Value *Ptrs, Align Alignment, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Gather intrinsic.
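A companion sketch for the gather builder, with the same caveat that ShadowTy, ShadowPtrs and PassThru are illustrative names: the shadow of a gathered vector can itself be gathered through shadow pointers using the application's mask:
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/Alignment.h"

using namespace llvm;

// Hypothetical helper: inactive lanes receive PassThru instead of memory.
static Value *gatherShadow(IRBuilder<> &IRB, Type *ShadowTy, Value *ShadowPtrs,
                           Align Alignment, Value *Mask, Value *PassThru) {
  return IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
                                PassThru, "shadow_gather");
}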
This provides a uniform API for creating instructions and inserting them into a basic block: either at the end of a BasicBlock, or at a specific iterator location in a block.
Definition IRBuilder.h:2780
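A minimal, self-contained example of the IRBuilder pattern used throughout instrumentation passes, assuming V and W are i32 values (the helper and value names are invented for illustration):
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// Position the builder immediately before an existing instruction, then emit
// new instructions at that point.
static Value *widenAndOr(Instruction *InsertBefore, Value *V, Value *W) {
  IRBuilder<> IRB(InsertBefore);
  Value *WideV = IRB.CreateZExt(V, IRB.getInt64Ty(), "wide_v");
  Value *WideW = IRB.CreateZExt(W, IRB.getInt64Ty(), "wide_w");
  return IRB.CreateOr(WideV, WideW, "combined");
}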
std::vector< ConstraintInfo > ConstraintInfoVector
Definition InlineAsm.h:123
void visit(Iterator Start, Iterator End)
Definition InstVisitor.h:87
const DebugLoc & getDebugLoc() const
Return the debug location for this node as a DebugLoc.
LLVM_ABI InstListType::iterator eraseFromParent()
This method unlinks 'this' from the containing basic block and deletes it.
MDNode * getMetadata(unsigned KindID) const
Get the metadata of given kind attached to this Instruction.
LLVM_ABI bool comesBefore(const Instruction *Other) const
Given an instruction Other in the same basic block as this instruction, return true if this instruction comes before Other.
static LLVM_ABI IntegerType * get(LLVMContext &C, unsigned NumBits)
This static method is the primary way of constructing an IntegerType.
Definition Type.cpp:319
LLVM_ABI MDNode * createUnlikelyBranchWeights()
Return metadata containing two branch weights, with significant bias towards false destination.
Definition MDBuilder.cpp:48
A Module instance is used to store all the information related to an LLVM module.
Definition Module.h:67
void addIncoming(Value *V, BasicBlock *BB)
Add an incoming value to the end of the PHI list.
static LLVM_ABI PoisonValue * get(Type *T)
Static factory methods - Return a 'poison' object of the specified type.
A set of analyses that are preserved following a run of a transformation pass.
Definition Analysis.h:112
static PreservedAnalyses none()
Convenience factory function for the empty preserved set.
Definition Analysis.h:115
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition Analysis.h:118
PreservedAnalyses & abandon()
Mark an analysis as abandoned.
Definition Analysis.h:171
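A small sketch of the usual way a module transformation reports its effect through this class (the helper function is hypothetical):
#include "llvm/IR/Analysis.h"

using namespace llvm;

// Conservatively invalidate all analyses when the IR was changed, and keep
// them all when it was not.
static PreservedAnalyses reportResult(bool Modified) {
  return Modified ? PreservedAnalyses::none() : PreservedAnalyses::all();
}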
bool remove(const value_type &X)
Remove an item from the set vector.
Definition SetVector.h:198
bool insert(const value_type &X)
Insert a new element into the SetVector.
Definition SetVector.h:168
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
void push_back(const T &Elt)
StringRef - Represent a constant reference to a string, i.e. a character array and a length, which need not be null terminated.
Definition StringRef.h:55
static LLVM_ABI StructType * get(LLVMContext &Context, ArrayRef< Type * > Elements, bool isPacked=false)
This static method is the primary way to create a literal StructType.
Definition Type.cpp:414
unsigned getNumElements() const
Random access to the elements.
Type * getElementType(unsigned N) const
Analysis pass providing the TargetLibraryInfo.
Provides information about what library functions are available for the current target.
AttributeList getAttrList(LLVMContext *C, ArrayRef< unsigned > ArgNos, bool Signed, bool Ret=false, AttributeList AL=AttributeList()) const
bool getLibFunc(StringRef funcName, LibFunc &F) const
Searches for a particular function name.
Triple - Helper class for working with autoconf configuration names.
Definition Triple.h:47
bool isMIPS64() const
Tests whether the target is MIPS 64-bit (little and big endian).
Definition Triple.h:1030
@ loongarch64
Definition Triple.h:65
bool isRISCV32() const
Tests whether the target is 32-bit RISC-V.
Definition Triple.h:1073
bool isPPC32() const
Tests whether the target is 32-bit PowerPC (little and big endian).
Definition Triple.h:1046
ArchType getArch() const
Get the parsed architecture type of this triple.
Definition Triple.h:411
bool isRISCV64() const
Tests whether the target is 64-bit RISC-V.
Definition Triple.h:1078
bool isLoongArch64() const
Tests whether the target is 64-bit LoongArch.
Definition Triple.h:1019
bool isMIPS32() const
Tests whether the target is MIPS 32-bit (little and big endian).
Definition Triple.h:1025
bool isARM() const
Tests whether the target is ARM (little and big endian).
Definition Triple.h:914
bool isPPC64() const
Tests whether the target is 64-bit PowerPC (little and big endian).
Definition Triple.h:1051
bool isAArch64() const
Tests whether the target is AArch64 (little and big endian).
Definition Triple.h:998
bool isSystemZ() const
Tests whether the target is SystemZ.
Definition Triple.h:1097
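As a hedged illustration of how these Triple predicates are typically consumed (the helper and the returned values are placeholders, not MemorySanitizer's actual per-target parameters):
#include "llvm/TargetParser/Triple.h"

using namespace llvm;

// Dispatch on the target architecture; unknown targets fall through to 0.
static unsigned pointerBitsFor(const Triple &TargetTriple) {
  if (TargetTriple.isAArch64() || TargetTriple.isRISCV64() ||
      TargetTriple.isLoongArch64() || TargetTriple.isMIPS64() ||
      TargetTriple.isPPC64() || TargetTriple.isSystemZ())
    return 64;
  if (TargetTriple.isARM() || TargetTriple.isRISCV32() ||
      TargetTriple.isMIPS32() || TargetTriple.isPPC32())
    return 32;
  return 0;
}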
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45
LLVM_ABI unsigned getIntegerBitWidth() const
bool isVectorTy() const
True if this is an instance of VectorType.
Definition Type.h:273
bool isArrayTy() const
True if this is an instance of ArrayType.
Definition Type.h:264
bool isIntOrIntVectorTy() const
Return true if this is an integer type or a vector of integer types.
Definition Type.h:246
bool isPointerTy() const
True if this is an instance of PointerType.
Definition Type.h:267
Type * getArrayElementType() const
Definition Type.h:408
bool isPPC_FP128Ty() const
Return true if this is powerpc long double.
Definition Type.h:165
static LLVM_ABI Type * getVoidTy(LLVMContext &C)
Definition Type.cpp:281
Type * getScalarType() const
If this is a vector type, return the element type, otherwise return 'this'.
Definition Type.h:352
LLVM_ABI TypeSize getPrimitiveSizeInBits() const LLVM_READONLY
Return the basic size of this type if it is a primitive type.
Definition Type.cpp:198
bool isSized(SmallPtrSetImpl< Type * > *Visited=nullptr) const
Return true if it makes sense to take the size of this type.
Definition Type.h:311
LLVM_ABI unsigned getScalarSizeInBits() const LLVM_READONLY
If this is a vector type, return the getPrimitiveSizeInBits value for the element type.
Definition Type.cpp:231
bool isFloatingPointTy() const
Return true if this is one of the floating-point types.
Definition Type.h:184
bool isIntOrPtrTy() const
Return true if this is an integer type or a pointer type.
Definition Type.h:255
bool isIntegerTy() const
True if this is an instance of IntegerType.
Definition Type.h:240
bool isFPOrFPVectorTy() const
Return true if this is a FP type or a vector of FP.
Definition Type.h:225
bool isVoidTy() const
Return true if this is 'void'.
Definition Type.h:139
Value * getOperand(unsigned i) const
Definition User.h:232
unsigned getNumOperands() const
Definition User.h:254
size_type count(const KeyT &Val) const
Return 1 if the specified key is in the map, 0 otherwise.
Definition ValueMap.h:156
Type * getType() const
All values are typed, get the type of this value.
Definition Value.h:256
LLVM_ABI void setName(const Twine &Name)
Change the name of the value.
Definition Value.cpp:390
LLVM_ABI StringRef getName() const
Return a constant reference to the value's name.
Definition Value.cpp:322
ElementCount getElementCount() const
Return an ElementCount instance to represent the (possibly scalable) number of elements in the vector.
Type * getElementType() const
int getNumOccurrences() const
constexpr ScalarTy getFixedValue() const
Definition TypeSize.h:200
constexpr bool isScalable() const
Returns whether the quantity is scaled by a runtime quantity (vscale).
Definition TypeSize.h:169
An efficient, type-erasing, non-owning reference to a callable.
const ParentTy * getParent() const
Definition ilist_node.h:34
self_iterator getIterator()
Definition ilist_node.h:134
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition raw_ostream.h:53
CallInst * Call
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
constexpr char Align[]
Key for Kernel::Arg::Metadata::mAlign.
constexpr std::underlying_type_t< E > Mask()
Get a bitmask with 1s in all places up to the high-order bit of E's largest value.
@ C
The default llvm calling convention, compatible with C.
Definition CallingConv.h:34
@ BasicBlock
Various leaf nodes.
Definition ISDOpcodes.h:81
initializer< Ty > init(const Ty &Val)
Function * Kernel
Summary of a kernel (=entry point for target offloading).
Definition OpenMPOpt.h:21
NodeAddr< FuncNode * > Func
Definition RDFGraph.h:393
friend class Instruction
Iterator for Instructions in a BasicBlock.
Definition BasicBlock.h:73
This is an optimization pass for GlobalISel generic memory operations.
unsigned Log2_32_Ceil(uint32_t Value)
Return the ceil log base 2 of the specified value, 32 if the value is zero.
Definition MathExtras.h:349
FunctionAddr VTableAddr Value
Definition InstrProf.h:137
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1685
auto enumerate(FirstRange &&First, RestRanges &&...Rest)
Given two or more input ranges, returns a new range whose values are tuples (A, B, C, ...), such that A is the 0-based index of the item in the sequence, and B, C, ... are the values from the original input ranges.
Definition STLExtras.h:2474
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:649
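A short example of the casting utilities listed here (isa, cast, dyn_cast); the predicate itself is made up for illustration:
#include "llvm/IR/Instructions.h"
#include "llvm/Support/Casting.h"

using namespace llvm;

// dyn_cast returns nullptr when I is not actually a StoreInst, so the type
// check and the downcast happen in one step.
static bool isVolatileStore(const Instruction *I) {
  if (const auto *SI = dyn_cast<StoreInst>(I))
    return SI->isVolatile();
  return false;
}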
@ Done
Definition Threading.h:60
bool isAligned(Align Lhs, uint64_t SizeInBytes)
Checks that SizeInBytes is a multiple of the alignment.
Definition Alignment.h:145
LLVM_ABI std::pair< Instruction *, Value * > SplitBlockAndInsertSimpleForLoop(Value *End, BasicBlock::iterator SplitBefore)
Insert a for (int i = 0; i < End; i++) loop structure (with the exception that End is assumed > 0, and thus not checked on entry) at SplitBefore. Returns the first insert point in the loop body and the induction variable.
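A sketch of how this helper is commonly used, for example to visit every element of a vector whose length is only known at run time; the wrapper below and its parameter names are illustrative:
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"

using namespace llvm;

// Build the loop, then hand the body insert point and induction variable to a
// callback. End is assumed to be a non-zero integer count.
static void forEachIndex(Instruction *InsertBefore, Value *End,
                         function_ref<void(IRBuilder<> &, Value *)> Body) {
  auto [BodyIP, Index] =
      SplitBlockAndInsertSimpleForLoop(End, InsertBefore->getIterator());
  IRBuilder<> IRB(BodyIP);
  Body(IRB, Index);
}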
InnerAnalysisManagerProxy< FunctionAnalysisManager, Module > FunctionAnalysisManagerModuleProxy
Provide the FunctionAnalysisManager to Module proxy.
constexpr bool isPowerOf2_64(uint64_t Value)
Return true if the argument is a power of two > 0 (64 bit edition.)
Definition MathExtras.h:293
unsigned Log2_64(uint64_t Value)
Return the floor log base 2 of the specified value, -1 if the value is zero.
Definition MathExtras.h:342
auto dyn_cast_or_null(const Y &Val)
Definition Casting.h:759
LLVM_ABI std::pair< Function *, FunctionCallee > getOrCreateSanitizerCtorAndInitFunctions(Module &M, StringRef CtorName, StringRef InitName, ArrayRef< Type * > InitArgTypes, ArrayRef< Value * > InitArgs, function_ref< void(Function *, FunctionCallee)> FunctionsCreatedCallback, StringRef VersionCheckName=StringRef(), bool Weak=false)
Creates sanitizer constructor function lazily.
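A hedged sketch of the usual call pattern, with placeholder names ("example.module_ctor", "__example_init") rather than MemorySanitizer's real ones:
#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"

using namespace llvm;

// Create (at most once) a module constructor that calls the runtime init
// function, and register it so it runs at program start-up.
static void installModuleCtor(Module &M) {
  getOrCreateSanitizerCtorAndInitFunctions(
      M, "example.module_ctor", "__example_init",
      /*InitArgTypes=*/{}, /*InitArgs=*/{},
      [&](Function *Ctor, FunctionCallee) {
        appendToGlobalCtors(M, Ctor, /*Priority=*/0);
      });
}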
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition Debug.cpp:207
LLVM_ABI void report_fatal_error(Error Err, bool gen_crash_diag=true)
Definition Error.cpp:167
class LLVM_GSL_OWNER SmallVector
Forward declaration of SmallVector so that calculateSmallVectorDefaultInlinedElements can reference sizeof(SmallVector<T, 0>).
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type arguments.
Definition Casting.h:548
LLVM_ABI bool isKnownNonZero(const Value *V, const SimplifyQuery &Q, unsigned Depth=0)
Return true if the given value is known to be non-zero when defined.
LLVM_ABI raw_fd_ostream & errs()
This returns a reference to a raw_ostream for standard error.
AtomicOrdering
Atomic ordering for LLVM's memory model.
@ First
Helpers to iterate all locations in the MemoryEffectsBase class.
Definition ModRef.h:71
IRBuilder(LLVMContext &, FolderTy, InserterTy, MDNode *, ArrayRef< OperandBundleDef >) -> IRBuilder< FolderTy, InserterTy >
@ Or
Bitwise or logical OR of integers.
@ And
Bitwise or logical AND of integers.
@ Add
Sum of integers.
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition Alignment.h:155
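A tiny, self-contained example of this alignment helper (the 8-byte figure is arbitrary):
#include "llvm/Support/Alignment.h"
#include <cstdint>

using namespace llvm;

// Round a byte count up to the next multiple of 8, e.g. 13 -> 16, 16 -> 16.
static uint64_t roundUpTo8(uint64_t SizeInBytes) {
  return alignTo(SizeInBytes, Align(8));
}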
DWARFExpression::Operation Op
RoundingMode
Rounding mode.
ArrayRef(const T &OneElt) -> ArrayRef< T >
constexpr unsigned BitWidth
LLVM_ABI void appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data=nullptr)
Append F to the list of global ctors of module M with the given Priority.
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:565
iterator_range< df_iterator< T > > depth_first(const T &G)
LLVM_ABI Instruction * SplitBlockAndInsertIfThen(Value *Cond, BasicBlock::iterator SplitBefore, bool Unreachable, MDNode *BranchWeights=nullptr, DomTreeUpdater *DTU=nullptr, LoopInfo *LI=nullptr, BasicBlock *ThenBlock=nullptr)
Split the containing block at the specified instruction - everything before SplitBefore stays in the old basic block, and the rest of the instructions in the BB are moved to a new block.
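This is the standard shape of a sanitizer's guarded slow path: split on a condition, mark the then-branch as unlikely with MDBuilder, and emit the slow-path code inside it. The sketch below is illustrative, and ReportFn stands in for whatever runtime callback is wanted:
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"

using namespace llvm;

// Branch on Cond before InsertBefore; the new then-block is weighted as
// unlikely and falls through to the original continuation.
static void emitGuardedCall(Instruction *InsertBefore, Value *Cond,
                            FunctionCallee ReportFn) {
  MDBuilder MDB(InsertBefore->getContext());
  Instruction *ThenTerm = SplitBlockAndInsertIfThen(
      Cond, InsertBefore->getIterator(), /*Unreachable=*/false,
      MDB.createUnlikelyBranchWeights());
  IRBuilder<> IRB(ThenTerm);
  IRB.CreateCall(ReportFn);
}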
LLVM_ABI void maybeMarkSanitizerLibraryCallNoBuiltin(CallInst *CI, const TargetLibraryInfo *TLI)
Given a CallInst, check if it calls a string function known to CodeGen, and mark it with NoBuiltin if so.
Definition Local.cpp:3829
LLVM_ABI bool removeUnreachableBlocks(Function &F, DomTreeUpdater *DTU=nullptr, MemorySSAUpdater *MSSAU=nullptr)
Remove all blocks that can not be reached from the function's entry.
Definition Local.cpp:2883
LLVM_ABI bool checkIfAlreadyInstrumented(Module &M, StringRef Flag)
Check if the module has the flag attached; if not, add the flag.
std::string itostr(int64_t X)
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition MIRParser.h:39
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition Alignment.h:39
uint64_t value() const
This is a hole in the type system and should not be abused.
Definition Alignment.h:85
LLVM_ABI void printPipeline(raw_ostream &OS, function_ref< StringRef(StringRef)> MapClassName2PassName)
LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition PassManager.h:70