1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
15/// We associate a few shadow bits with every byte of the application memory,
16/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
17/// bits on every memory read, propagate the shadow bits through some of the
18/// arithmetic instructions (including MOV), store the shadow bits on every
19/// memory write, report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
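///
/// As a rough sketch (not the exact IR the pass emits), for the C code
///   int b = a;
///   if (b) { ... }
/// the instrumented program conceptually does
///   shadow(b) = shadow(a);                  // propagate on the copy
///   if (shadow(b) != 0) __msan_warning();   // check before the branch
///   if (b) { ... }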
21///
22/// But there are differences too. The first and most important one is that
23/// we use compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But it also brings the major
26/// constraint: msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
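///
/// As a rough sketch, for a call `y = f(x)` the caller conceptually does
///   __msan_param_tls[0] = shadow(x);   // first argument's shadow at offset 0
///   y = f(x);
///   shadow(y) = __msan_retval_tls;     // pick up the return value shadow
/// while the callee reads its argument shadow from __msan_param_tls and
/// writes the shadow of its return value to __msan_retval_tls before
/// returning.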
44///
45/// Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
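///
/// As a sketch, with origin tracking enabled, for `c = a + b` the pass
/// conceptually emits
///   shadow(c) = shadow(a) | shadow(b);
///   origin(c) = (shadow(a) != 0) ? origin(a) : origin(b);
/// i.e. a "select" that picks the origin of a dirty operand.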
56///
57/// Every 4 aligned, consecutive bytes of application memory have one origin
58/// value associated with them. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
65/// This way it avoids needlessly overwriting the origin of the 4-byte region
66/// on a short (i.e. 1-byte) clean store, and it is also good for performance.
67///
68/// Atomic handling.
69///
70/// Ideally, every atomic store of an application value should update the
71/// corresponding shadow location in an atomic way. Unfortunately, an atomic
72/// store to two disjoint locations cannot be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
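///
/// As a sketch, an atomic release store of V to *p becomes
///   store 0 to shadow(p);                  // clean shadow, stored first
///   atomic store V to *p (release);
/// and an atomic acquire load from *p becomes
///   V = atomic load from *p (acquire);
///   S = load shadow(p);                    // shadow loaded after the app load
/// so a happens-before arc between the store and the load also orders the
/// shadow accesses.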
83///
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
87/// after the app operation. Computers don't work this way. Current
88/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
91///
92/// Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
95/// become initialized depending on the arguments. It may be possible to figure
96/// out which arguments are meant to point to inputs and outputs, but the
97/// actual semantics may only be visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate the offset for a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
100/// pointers. We take a conservative approach, generating calls to
101/// __msan_instrument_asm_store(ptr, size),
102/// which defers the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
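///
/// As a sketch, for an asm statement with a pointer operand such as
///   asm("..." : "=m"(*out));
/// the pass emits, right before the asm,
///   __msan_instrument_asm_store(out, sizeof(*out));
/// and the runtime decides whether it is safe to unpoison that range.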
108///
109/// KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112/// - KMSAN always tracks the origins and implies msan-keep-going=true;
113/// - KMSAN allocates shadow and origin memory for each page separately, so
114/// there are no explicit accesses to shadow and origin in the
115/// instrumentation.
116/// Shadow and origin values for a particular X-byte memory location
117/// (X=1,2,4,8) are accessed through pointers obtained via the
118/// __msan_metadata_ptr_for_load_X(ptr)
119/// __msan_metadata_ptr_for_store_X(ptr)
120/// functions. The corresponding functions check that the X-byte accesses
121/// are possible and return the pointers to shadow and origin memory (see the sketch after this list).
122/// Arbitrary sized accesses are handled with:
123/// __msan_metadata_ptr_for_load_n(ptr, size)
124/// __msan_metadata_ptr_for_store_n(ptr, size);
125/// Note that the sanitizer code has to deal with how shadow/origin pairs
126/// returned by these functions are represented in different ABIs. In
127/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
128/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
129/// pointed to by a hidden parameter.
130/// - TLS variables are stored in a single per-task struct. A call to a
131/// function __msan_get_context_state() returning a pointer to that struct
132/// is inserted into every instrumented function before the entry block;
133/// - __msan_warning() takes a 32-bit origin parameter;
134/// - local variables are poisoned with __msan_poison_alloca() upon function
135/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
136/// function;
137/// - the pass doesn't declare any global variables or add global constructors
138/// to the translation unit.
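///
/// As a sketch of the metadata calls (x86_64 shown), a 4-byte load from p
/// under KMSAN conceptually becomes
///   { shadow_ptr, origin_ptr } = __msan_metadata_ptr_for_load_4(p);
///   s = load 4 bytes from shadow_ptr;
///   o = load 4 bytes from origin_ptr;
///   v = load 4 bytes from p;
/// with the shadow/origin pair returned in RDX:RAX as described above.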
139///
140/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
141/// calls, making sure we're on the safe side wrt. possible false positives.
142///
143/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
144/// moment.
145///
146//
147// FIXME: This sanitizer does not yet handle scalable vectors
148//
149//===----------------------------------------------------------------------===//
150
152#include "llvm/ADT/APInt.h"
153#include "llvm/ADT/ArrayRef.h"
154#include "llvm/ADT/DenseMap.h"
156#include "llvm/ADT/SetVector.h"
157#include "llvm/ADT/SmallPtrSet.h"
158#include "llvm/ADT/SmallVector.h"
160#include "llvm/ADT/StringRef.h"
164#include "llvm/IR/Argument.h"
166#include "llvm/IR/Attributes.h"
167#include "llvm/IR/BasicBlock.h"
168#include "llvm/IR/CallingConv.h"
169#include "llvm/IR/Constant.h"
170#include "llvm/IR/Constants.h"
171#include "llvm/IR/DataLayout.h"
172#include "llvm/IR/DerivedTypes.h"
173#include "llvm/IR/Function.h"
174#include "llvm/IR/GlobalValue.h"
176#include "llvm/IR/IRBuilder.h"
177#include "llvm/IR/InlineAsm.h"
178#include "llvm/IR/InstVisitor.h"
179#include "llvm/IR/InstrTypes.h"
180#include "llvm/IR/Instruction.h"
181#include "llvm/IR/Instructions.h"
183#include "llvm/IR/Intrinsics.h"
184#include "llvm/IR/IntrinsicsAArch64.h"
185#include "llvm/IR/IntrinsicsX86.h"
186#include "llvm/IR/MDBuilder.h"
187#include "llvm/IR/Module.h"
188#include "llvm/IR/Type.h"
189#include "llvm/IR/Value.h"
190#include "llvm/IR/ValueMap.h"
193#include "llvm/Support/Casting.h"
195#include "llvm/Support/Debug.h"
205#include <algorithm>
206#include <cassert>
207#include <cstddef>
208#include <cstdint>
209#include <memory>
210#include <numeric>
211#include <string>
212#include <tuple>
213
214using namespace llvm;
215
216#define DEBUG_TYPE "msan"
217
218DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
219 "Controls which checks to insert");
220
221DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
222 "Controls which instruction to instrument");
223
224static const unsigned kOriginSize = 4;
225static const Align kMinOriginAlignment = Align(4);
226static const Align kShadowTLSAlignment = Align(8);
227
228// These constants must be kept in sync with the ones in msan.h.
229static const unsigned kParamTLSSize = 800;
230static const unsigned kRetvalTLSSize = 800;
231
232// Access sizes are powers of two: 1, 2, 4, 8.
233static const size_t kNumberOfAccessSizes = 4;
234
235/// Track origins of uninitialized values.
236///
237/// Adds a section to MemorySanitizer report that points to the allocation
238/// (stack or heap) the uninitialized bits came from originally.
240 "msan-track-origins",
241 cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
242 cl::init(0));
243
244static cl::opt<bool> ClKeepGoing("msan-keep-going",
245 cl::desc("keep going after reporting a UMR"),
246 cl::Hidden, cl::init(false));
247
248static cl::opt<bool>
249 ClPoisonStack("msan-poison-stack",
250 cl::desc("poison uninitialized stack variables"), cl::Hidden,
251 cl::init(true));
252
254 "msan-poison-stack-with-call",
255 cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
256 cl::init(false));
257
259 "msan-poison-stack-pattern",
260 cl::desc("poison uninitialized stack variables with the given pattern"),
261 cl::Hidden, cl::init(0xff));
262
263static cl::opt<bool>
264 ClPrintStackNames("msan-print-stack-names",
265 cl::desc("Print name of local stack variable"),
266 cl::Hidden, cl::init(true));
267
268static cl::opt<bool>
269 ClPoisonUndef("msan-poison-undef",
270 cl::desc("Poison fully undef temporary values. "
271 "Partially undefined constant vectors "
272 "are unaffected by this flag (see "
273 "-msan-poison-undef-vectors)."),
274 cl::Hidden, cl::init(true));
275
277 "msan-poison-undef-vectors",
278 cl::desc("Precisely poison partially undefined constant vectors. "
279 "If false (legacy behavior), the entire vector is "
280 "considered fully initialized, which may lead to false "
281 "negatives. Fully undefined constant vectors are "
282 "unaffected by this flag (see -msan-poison-undef)."),
283 cl::Hidden, cl::init(false));
284
286 "msan-precise-disjoint-or",
287 cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
288 "disjointedness is ignored (i.e., 1|1 is initialized)."),
289 cl::Hidden, cl::init(false));
290
291static cl::opt<bool>
292 ClHandleICmp("msan-handle-icmp",
293 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
294 cl::Hidden, cl::init(true));
295
296static cl::opt<bool>
297 ClHandleICmpExact("msan-handle-icmp-exact",
298 cl::desc("exact handling of relational integer ICmp"),
299 cl::Hidden, cl::init(true));
300
302 "msan-handle-lifetime-intrinsics",
303 cl::desc(
304 "when possible, poison scoped variables at the beginning of the scope "
305 "(slower, but more precise)"),
306 cl::Hidden, cl::init(true));
307
308// When compiling the Linux kernel, we sometimes see false positives related to
309// MSan being unable to understand that inline assembly calls may initialize
310// local variables.
311// This flag makes the compiler conservatively unpoison every memory location
312// passed into an assembly call. Note that this may cause false negatives.
313// Because it's impossible to figure out the array sizes, we can only unpoison
314// the first sizeof(type) bytes for each type* pointer.
316 "msan-handle-asm-conservative",
317 cl::desc("conservative handling of inline assembly"), cl::Hidden,
318 cl::init(true));
319
320// This flag controls whether we check the shadow of the address
321// operand of a load or store. Such bugs are very rare, since a load from
322// a garbage address typically results in SEGV, but they still happen
323// (e.g. only the lower bits of the address are garbage, or the access
324// happens early at program startup, where malloc-ed memory is more likely
325// to be zeroed). As of 2012-08-28 this flag adds a 20% slowdown.
326static cl::opt<bool> ClCheckAccessAddress(
327 "msan-check-access-address",
328 cl::desc("report accesses through a pointer which has poisoned shadow"),
329 cl::Hidden, cl::init(true));
330
332 "msan-eager-checks",
333 cl::desc("check arguments and return values at function call boundaries"),
334 cl::Hidden, cl::init(false));
335
337 "msan-dump-strict-instructions",
338 cl::desc("print out instructions with default strict semantics i.e.,"
339 "check that all the inputs are fully initialized, and mark "
340 "the output as fully initialized. These semantics are applied "
341 "to instructions that could not be handled explicitly nor "
342 "heuristically."),
343 cl::Hidden, cl::init(false));
344
345// Currently, all the heuristically handled instructions are specifically
346// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
347// to parallel 'msan-dump-strict-instructions', and to keep the door open to
348// handling non-intrinsic instructions heuristically.
350 "msan-dump-heuristic-instructions",
351 cl::desc("Prints 'unknown' instructions that were handled heuristically. "
352 "Use -msan-dump-strict-instructions to print instructions that "
353 "could not be handled explicitly nor heuristically."),
354 cl::Hidden, cl::init(false));
355
357 "msan-instrumentation-with-call-threshold",
358 cl::desc(
359 "If the function being instrumented requires more than "
360 "this number of checks and origin stores, use callbacks instead of "
361 "inline checks (-1 means never use callbacks)."),
362 cl::Hidden, cl::init(3500));
363
364static cl::opt<bool>
365 ClEnableKmsan("msan-kernel",
366 cl::desc("Enable KernelMemorySanitizer instrumentation"),
367 cl::Hidden, cl::init(false));
368
369static cl::opt<bool>
370 ClDisableChecks("msan-disable-checks",
371 cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
372 cl::init(false));
373
374static cl::opt<bool>
375 ClCheckConstantShadow("msan-check-constant-shadow",
376 cl::desc("Insert checks for constant shadow values"),
377 cl::Hidden, cl::init(true));
378
379// This is off by default because of a bug in gold:
380// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
381static cl::opt<bool>
382 ClWithComdat("msan-with-comdat",
383 cl::desc("Place MSan constructors in comdat sections"),
384 cl::Hidden, cl::init(false));
385
386// These options allow specifying custom memory map parameters.
387// See MemoryMapParams for details.
388static cl::opt<uint64_t> ClAndMask("msan-and-mask",
389 cl::desc("Define custom MSan AndMask"),
390 cl::Hidden, cl::init(0));
391
392static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
393 cl::desc("Define custom MSan XorMask"),
394 cl::Hidden, cl::init(0));
395
396static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
397 cl::desc("Define custom MSan ShadowBase"),
398 cl::Hidden, cl::init(0));
399
400static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
401 cl::desc("Define custom MSan OriginBase"),
402 cl::Hidden, cl::init(0));
403
404static cl::opt<int>
405 ClDisambiguateWarning("msan-disambiguate-warning-threshold",
406 cl::desc("Define threshold for number of checks per "
407 "debug location to force origin update."),
408 cl::Hidden, cl::init(3));
409
410const char kMsanModuleCtorName[] = "msan.module_ctor";
411const char kMsanInitName[] = "__msan_init";
412
413namespace {
414
415// Memory map parameters used in application-to-shadow address calculation.
416// Offset = (Addr & ~AndMask) ^ XorMask
417// Shadow = ShadowBase + Offset
418// Origin = OriginBase + Offset
419struct MemoryMapParams {
420 uint64_t AndMask;
421 uint64_t XorMask;
422 uint64_t ShadowBase;
423 uint64_t OriginBase;
424};
425
426struct PlatformMemoryMapParams {
427 const MemoryMapParams *bits32;
428 const MemoryMapParams *bits64;
429};
430
431} // end anonymous namespace
432
433// i386 Linux
434static const MemoryMapParams Linux_I386_MemoryMapParams = {
435 0x000080000000, // AndMask
436 0, // XorMask (not used)
437 0, // ShadowBase (not used)
438 0x000040000000, // OriginBase
439};
440
441// x86_64 Linux
442static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
443 0, // AndMask (not used)
444 0x500000000000, // XorMask
445 0, // ShadowBase (not used)
446 0x100000000000, // OriginBase
447};
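// A worked example with the x86_64 Linux parameters above:
//   Offset = (Addr & ~0) ^ 0x500000000000 = Addr ^ 0x500000000000
//   Shadow = 0 + Offset
//   Origin = 0x100000000000 + Offset
// e.g. Addr = 0x700000001000 maps to Shadow = 0x200000001000 and
// Origin = 0x300000001000.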
448
449// mips32 Linux
450// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
451// after picking good constants
452
453// mips64 Linux
454static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
455 0, // AndMask (not used)
456 0x008000000000, // XorMask
457 0, // ShadowBase (not used)
458 0x002000000000, // OriginBase
459};
460
461// ppc32 Linux
462// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
463// after picking good constants
464
465// ppc64 Linux
466static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
467 0xE00000000000, // AndMask
468 0x100000000000, // XorMask
469 0x080000000000, // ShadowBase
470 0x1C0000000000, // OriginBase
471};
472
473// s390x Linux
474static const MemoryMapParams Linux_S390X_MemoryMapParams = {
475 0xC00000000000, // AndMask
476 0, // XorMask (not used)
477 0x080000000000, // ShadowBase
478 0x1C0000000000, // OriginBase
479};
480
481// arm32 Linux
482// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
483// after picking good constants
484
485// aarch64 Linux
486static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
487 0, // AndMask (not used)
488 0x0B00000000000, // XorMask
489 0, // ShadowBase (not used)
490 0x0200000000000, // OriginBase
491};
492
493// loongarch64 Linux
494static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
495 0, // AndMask (not used)
496 0x500000000000, // XorMask
497 0, // ShadowBase (not used)
498 0x100000000000, // OriginBase
499};
500
501// riscv32 Linux
502// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
503// after picking good constants
504
505// aarch64 FreeBSD
506static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
507 0x1800000000000, // AndMask
508 0x0400000000000, // XorMask
509 0x0200000000000, // ShadowBase
510 0x0700000000000, // OriginBase
511};
512
513// i386 FreeBSD
514static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
515 0x000180000000, // AndMask
516 0x000040000000, // XorMask
517 0x000020000000, // ShadowBase
518 0x000700000000, // OriginBase
519};
520
521// x86_64 FreeBSD
522static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
523 0xc00000000000, // AndMask
524 0x200000000000, // XorMask
525 0x100000000000, // ShadowBase
526 0x380000000000, // OriginBase
527};
528
529// x86_64 NetBSD
530static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
531 0, // AndMask
532 0x500000000000, // XorMask
533 0, // ShadowBase
534 0x100000000000, // OriginBase
535};
536
537static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
538 &Linux_I386_MemoryMapParams,
539 &Linux_X86_64_MemoryMapParams,
540};
541
542static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
543 nullptr,
544 &Linux_MIPS64_MemoryMapParams,
545};
546
547static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
548 nullptr,
549 &Linux_PowerPC64_MemoryMapParams,
550};
551
552static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
553 nullptr,
554 &Linux_S390X_MemoryMapParams,
555};
556
557static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
558 nullptr,
559 &Linux_AArch64_MemoryMapParams,
560};
561
562static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
563 nullptr,
564 &Linux_LoongArch64_MemoryMapParams,
565};
566
567static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
568 nullptr,
569 &FreeBSD_AArch64_MemoryMapParams,
570};
571
572static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
573 &FreeBSD_I386_MemoryMapParams,
574 &FreeBSD_X86_64_MemoryMapParams,
575};
576
577static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
578 nullptr,
579 &NetBSD_X86_64_MemoryMapParams,
580};
581
582namespace {
583
584/// Instrument functions of a module to detect uninitialized reads.
585///
586/// Instantiating MemorySanitizer inserts the msan runtime library API function
587/// declarations into the module if they don't exist already. Instantiating
588/// ensures the __msan_init function is in the list of global constructors for
589/// the module.
590class MemorySanitizer {
591public:
592 MemorySanitizer(Module &M, MemorySanitizerOptions Options)
593 : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
594 Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
595 initializeModule(M);
596 }
597
598 // MSan cannot be moved or copied because of MapParams.
599 MemorySanitizer(MemorySanitizer &&) = delete;
600 MemorySanitizer &operator=(MemorySanitizer &&) = delete;
601 MemorySanitizer(const MemorySanitizer &) = delete;
602 MemorySanitizer &operator=(const MemorySanitizer &) = delete;
603
604 bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
605
606private:
607 friend struct MemorySanitizerVisitor;
608 friend struct VarArgHelperBase;
609 friend struct VarArgAMD64Helper;
610 friend struct VarArgAArch64Helper;
611 friend struct VarArgPowerPC64Helper;
612 friend struct VarArgPowerPC32Helper;
613 friend struct VarArgSystemZHelper;
614 friend struct VarArgI386Helper;
615 friend struct VarArgGenericHelper;
616
617 void initializeModule(Module &M);
618 void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
619 void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
620 void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
621
622 template <typename... ArgsTy>
623 FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
624 ArgsTy... Args);
625
626 /// True if we're compiling the Linux kernel.
627 bool CompileKernel;
628 /// Track origins (allocation points) of uninitialized values.
629 int TrackOrigins;
630 bool Recover;
631 bool EagerChecks;
632
633 Triple TargetTriple;
634 LLVMContext *C;
635 Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
636 Type *OriginTy;
637 PointerType *PtrTy; ///< Pointer type in the default address space.
638
639 // XxxTLS variables represent the per-thread state in MSan and per-task state
640 // in KMSAN.
641 // For the userspace these point to thread-local globals. In the kernel land
642 // they point to the members of a per-task struct obtained via a call to
643 // __msan_get_context_state().
644
645 /// Thread-local shadow storage for function parameters.
646 Value *ParamTLS;
647
648 /// Thread-local origin storage for function parameters.
649 Value *ParamOriginTLS;
650
651 /// Thread-local shadow storage for function return value.
652 Value *RetvalTLS;
653
654 /// Thread-local origin storage for function return value.
655 Value *RetvalOriginTLS;
656
657 /// Thread-local shadow storage for in-register va_arg function.
658 Value *VAArgTLS;
659
660 /// Thread-local origin storage for in-register va_arg function.
661 Value *VAArgOriginTLS;
662
663 /// Thread-local storage for the size of the va_arg overflow area.
664 Value *VAArgOverflowSizeTLS;
665
666 /// Are the instrumentation callbacks set up?
667 bool CallbacksInitialized = false;
668
669 /// The run-time callback to print a warning.
670 FunctionCallee WarningFn;
671
672 // These arrays are indexed by log2(AccessSize).
673 FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
674 FunctionCallee MaybeWarningVarSizeFn;
675 FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
676
677 /// Run-time helper that generates a new origin value for a stack
678 /// allocation.
679 FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
680 // No description version
681 FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
682
683 /// Run-time helper that poisons stack on function entry.
684 FunctionCallee MsanPoisonStackFn;
685
686 /// Run-time helper that records a store (or any event) of an
687 /// uninitialized value and returns an updated origin id encoding this info.
688 FunctionCallee MsanChainOriginFn;
689
690 /// Run-time helper that paints an origin over a region.
691 FunctionCallee MsanSetOriginFn;
692
693 /// MSan runtime replacements for memmove, memcpy and memset.
694 FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
695
696 /// KMSAN callback for task-local function argument shadow.
697 StructType *MsanContextStateTy;
698 FunctionCallee MsanGetContextStateFn;
699
700 /// Functions for poisoning/unpoisoning local variables
701 FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
702
703 /// Pair of shadow/origin pointers.
704 Type *MsanMetadata;
705
706 /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
707 FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
708 FunctionCallee MsanMetadataPtrForLoad_1_8[4];
709 FunctionCallee MsanMetadataPtrForStore_1_8[4];
710 FunctionCallee MsanInstrumentAsmStoreFn;
711
712 /// Storage for return values of the MsanMetadataPtrXxx functions.
713 Value *MsanMetadataAlloca;
714
715 /// Helper to choose between different MsanMetadataPtrXxx().
716 FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
717
718 /// Memory map parameters used in application-to-shadow calculation.
719 const MemoryMapParams *MapParams;
720
721 /// Custom memory map parameters used when -msan-shadow-base or
722 /// -msan-origin-base is provided.
723 MemoryMapParams CustomMapParams;
724
725 MDNode *ColdCallWeights;
726
727 /// Branch weights for origin store.
728 MDNode *OriginStoreWeights;
729};
730
731void insertModuleCtor(Module &M) {
732 getOrCreateSanitizerCtorAndInitFunctions(
733 M, kMsanModuleCtorName, kMsanInitName,
734 /*InitArgTypes=*/{},
735 /*InitArgs=*/{},
736 // This callback is invoked when the functions are created the first
737 // time. Hook them into the global ctors list in that case:
738 [&](Function *Ctor, FunctionCallee) {
739 if (!ClWithComdat) {
740 appendToGlobalCtors(M, Ctor, 0);
741 return;
742 }
743 Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
744 Ctor->setComdat(MsanCtorComdat);
745 appendToGlobalCtors(M, Ctor, 0, Ctor);
746 });
747}
748
749template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
750 return (Opt.getNumOccurrences() > 0) ? Opt : Default;
751}
752
753} // end anonymous namespace
754
755MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
756 bool EagerChecks)
757 : Kernel(getOptOrDefault(ClEnableKmsan, K)),
758 TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
759 Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
760 EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
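// For example, MemorySanitizerOptions(/*TO=*/0, /*R=*/false, /*K=*/true,
// /*EagerChecks=*/false) with no -msan-* flags on the command line yields
// Kernel=true, TrackOrigins=2 and Recover=true, matching the KMSAN defaults
// described in the file header.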
761
762PreservedAnalyses MemorySanitizerPass::run(Module &M,
763 ModuleAnalysisManager &AM) {
764 // Return early if the nosanitize_memory module flag is present for the module.
765 if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
766 return PreservedAnalyses::all();
767 bool Modified = false;
768 if (!Options.Kernel) {
769 insertModuleCtor(M);
770 Modified = true;
771 }
772
773 auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
774 for (Function &F : M) {
775 if (F.empty())
776 continue;
777 MemorySanitizer Msan(*F.getParent(), Options);
778 Modified |=
779 Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
780 }
781
782 if (!Modified)
783 return PreservedAnalyses::all();
784
785 PreservedAnalyses PA = PreservedAnalyses::none();
786 // GlobalsAA is considered stateless and does not get invalidated unless
787 // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
788 // make changes that require GlobalsAA to be invalidated.
789 PA.abandon<GlobalsAA>();
790 return PA;
791}
792
793void MemorySanitizerPass::printPipeline(
794 raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
795 static_cast<PassInfoMixin<MemorySanitizerPass> *>(this)->printPipeline(
796 OS, MapClassName2PassName);
797 OS << '<';
798 if (Options.Recover)
799 OS << "recover;";
800 if (Options.Kernel)
801 OS << "kernel;";
802 if (Options.EagerChecks)
803 OS << "eager-checks;";
804 OS << "track-origins=" << Options.TrackOrigins;
805 OS << '>';
806}
807
808/// Create a non-const global initialized with the given string.
809///
810/// Creates a writable global for Str so that we can pass it to the
811/// run-time lib. Runtime uses first 4 bytes of the string to store the
812/// frame ID, so the string needs to be mutable.
814 StringRef Str) {
815 Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
816 return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
817 GlobalValue::PrivateLinkage, StrConst, "");
818}
819
820template <typename... ArgsTy>
821FunctionCallee
822MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
823 ArgsTy... Args) {
824 if (TargetTriple.getArch() == Triple::systemz) {
825 // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
826 return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
827 std::forward<ArgsTy>(Args)...);
828 }
829
830 return M.getOrInsertFunction(Name, MsanMetadata,
831 std::forward<ArgsTy>(Args)...);
832}
833
834/// Create KMSAN API callbacks.
835void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
836 IRBuilder<> IRB(*C);
837
838 // These will be initialized in insertKmsanPrologue().
839 RetvalTLS = nullptr;
840 RetvalOriginTLS = nullptr;
841 ParamTLS = nullptr;
842 ParamOriginTLS = nullptr;
843 VAArgTLS = nullptr;
844 VAArgOriginTLS = nullptr;
845 VAArgOverflowSizeTLS = nullptr;
846
847 WarningFn = M.getOrInsertFunction("__msan_warning",
848 TLI.getAttrList(C, {0}, /*Signed=*/false),
849 IRB.getVoidTy(), IRB.getInt32Ty());
850
851 // Requests the per-task context state (kmsan_context_state*) from the
852 // runtime library.
853 MsanContextStateTy = StructType::get(
854 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
855 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
856 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
857 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
858 IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
859 OriginTy);
860 MsanGetContextStateFn =
861 M.getOrInsertFunction("__msan_get_context_state", PtrTy);
862
863 MsanMetadata = StructType::get(PtrTy, PtrTy);
864
865 for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
866 std::string name_load =
867 "__msan_metadata_ptr_for_load_" + std::to_string(size);
868 std::string name_store =
869 "__msan_metadata_ptr_for_store_" + std::to_string(size);
870 MsanMetadataPtrForLoad_1_8[ind] =
871 getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
872 MsanMetadataPtrForStore_1_8[ind] =
873 getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
874 }
875
876 MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
877 M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
878 MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
879 M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
880
881 // Functions for poisoning and unpoisoning memory.
882 MsanPoisonAllocaFn = M.getOrInsertFunction(
883 "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
884 MsanUnpoisonAllocaFn = M.getOrInsertFunction(
885 "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
886}
887
888static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
889 return M.getOrInsertGlobal(Name, Ty, [&] {
890 return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
891 nullptr, Name, nullptr,
892 GlobalVariable::InitialExecTLSModel);
893 });
894}
895
896/// Insert declarations for userspace-specific functions and globals.
897void MemorySanitizer::createUserspaceApi(Module &M,
898 const TargetLibraryInfo &TLI) {
899 IRBuilder<> IRB(*C);
900
901 // Create the callback.
902 // FIXME: this function should have "Cold" calling conv,
903 // which is not yet implemented.
904 if (TrackOrigins) {
905 StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
906 : "__msan_warning_with_origin_noreturn";
907 WarningFn = M.getOrInsertFunction(WarningFnName,
908 TLI.getAttrList(C, {0}, /*Signed=*/false),
909 IRB.getVoidTy(), IRB.getInt32Ty());
910 } else {
911 StringRef WarningFnName =
912 Recover ? "__msan_warning" : "__msan_warning_noreturn";
913 WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
914 }
915
916 // Create the global TLS variables.
917 RetvalTLS =
918 getOrInsertGlobal(M, "__msan_retval_tls",
919 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
920
921 RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
922
923 ParamTLS =
924 getOrInsertGlobal(M, "__msan_param_tls",
925 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
926
927 ParamOriginTLS =
928 getOrInsertGlobal(M, "__msan_param_origin_tls",
929 ArrayType::get(OriginTy, kParamTLSSize / 4));
930
931 VAArgTLS =
932 getOrInsertGlobal(M, "__msan_va_arg_tls",
933 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
934
935 VAArgOriginTLS =
936 getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
937 ArrayType::get(OriginTy, kParamTLSSize / 4));
938
939 VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
940 IRB.getIntPtrTy(M.getDataLayout()));
941
942 for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
943 AccessSizeIndex++) {
944 unsigned AccessSize = 1 << AccessSizeIndex;
945 std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
946 MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
947 FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
948 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
949 MaybeWarningVarSizeFn = M.getOrInsertFunction(
950 "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
951 IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
952 FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
953 MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
954 FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
955 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
956 IRB.getInt32Ty());
957 }
958
959 MsanSetAllocaOriginWithDescriptionFn =
960 M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
961 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
962 MsanSetAllocaOriginNoDescriptionFn =
963 M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
964 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
965 MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
966 IRB.getVoidTy(), PtrTy, IntptrTy);
967}
968
969/// Insert extern declaration of runtime-provided functions and globals.
970void MemorySanitizer::initializeCallbacks(Module &M,
971 const TargetLibraryInfo &TLI) {
972 // Only do this once.
973 if (CallbacksInitialized)
974 return;
975
976 IRBuilder<> IRB(*C);
977 // Initialize callbacks that are common for kernel and userspace
978 // instrumentation.
979 MsanChainOriginFn = M.getOrInsertFunction(
980 "__msan_chain_origin",
981 TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
982 IRB.getInt32Ty());
983 MsanSetOriginFn = M.getOrInsertFunction(
984 "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
985 IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
986 MemmoveFn =
987 M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
988 MemcpyFn =
989 M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
990 MemsetFn = M.getOrInsertFunction("__msan_memset",
991 TLI.getAttrList(C, {1}, /*Signed=*/true),
992 PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
993
994 MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
995 "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
996
997 if (CompileKernel) {
998 createKernelApi(M, TLI);
999 } else {
1000 createUserspaceApi(M, TLI);
1001 }
1002 CallbacksInitialized = true;
1003}
1004
1005FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
1006 int size) {
1007 FunctionCallee *Fns =
1008 isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
1009 switch (size) {
1010 case 1:
1011 return Fns[0];
1012 case 2:
1013 return Fns[1];
1014 case 4:
1015 return Fns[2];
1016 case 8:
1017 return Fns[3];
1018 default:
1019 return nullptr;
1020 }
1021}
1022
1023/// Module-level initialization.
1024///
1025/// inserts a call to __msan_init to the module's constructor list.
1026void MemorySanitizer::initializeModule(Module &M) {
1027 auto &DL = M.getDataLayout();
1028
1029 TargetTriple = M.getTargetTriple();
1030
1031 bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
1032 bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
1033 // Check the overrides first
1034 if (ShadowPassed || OriginPassed) {
1035 CustomMapParams.AndMask = ClAndMask;
1036 CustomMapParams.XorMask = ClXorMask;
1037 CustomMapParams.ShadowBase = ClShadowBase;
1038 CustomMapParams.OriginBase = ClOriginBase;
1039 MapParams = &CustomMapParams;
1040 } else {
1041 switch (TargetTriple.getOS()) {
1042 case Triple::FreeBSD:
1043 switch (TargetTriple.getArch()) {
1044 case Triple::aarch64:
1045 MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
1046 break;
1047 case Triple::x86_64:
1048 MapParams = FreeBSD_X86_MemoryMapParams.bits64;
1049 break;
1050 case Triple::x86:
1051 MapParams = FreeBSD_X86_MemoryMapParams.bits32;
1052 break;
1053 default:
1054 report_fatal_error("unsupported architecture");
1055 }
1056 break;
1057 case Triple::NetBSD:
1058 switch (TargetTriple.getArch()) {
1059 case Triple::x86_64:
1060 MapParams = NetBSD_X86_MemoryMapParams.bits64;
1061 break;
1062 default:
1063 report_fatal_error("unsupported architecture");
1064 }
1065 break;
1066 case Triple::Linux:
1067 switch (TargetTriple.getArch()) {
1068 case Triple::x86_64:
1069 MapParams = Linux_X86_MemoryMapParams.bits64;
1070 break;
1071 case Triple::x86:
1072 MapParams = Linux_X86_MemoryMapParams.bits32;
1073 break;
1074 case Triple::mips64:
1075 case Triple::mips64el:
1076 MapParams = Linux_MIPS_MemoryMapParams.bits64;
1077 break;
1078 case Triple::ppc64:
1079 case Triple::ppc64le:
1080 MapParams = Linux_PowerPC_MemoryMapParams.bits64;
1081 break;
1082 case Triple::systemz:
1083 MapParams = Linux_S390_MemoryMapParams.bits64;
1084 break;
1085 case Triple::aarch64:
1086 case Triple::aarch64_be:
1087 MapParams = Linux_ARM_MemoryMapParams.bits64;
1088 break;
1089 case Triple::loongarch64:
1090 MapParams = Linux_LoongArch_MemoryMapParams.bits64;
1091 break;
1092 default:
1093 report_fatal_error("unsupported architecture");
1094 }
1095 break;
1096 default:
1097 report_fatal_error("unsupported operating system");
1098 }
1099 }
1100
1101 C = &(M.getContext());
1102 IRBuilder<> IRB(*C);
1103 IntptrTy = IRB.getIntPtrTy(DL);
1104 OriginTy = IRB.getInt32Ty();
1105 PtrTy = IRB.getPtrTy();
1106
1107 ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1108 OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1109
1110 if (!CompileKernel) {
1111 if (TrackOrigins)
1112 M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1113 return new GlobalVariable(
1114 M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1115 IRB.getInt32(TrackOrigins), "__msan_track_origins");
1116 });
1117
1118 if (Recover)
1119 M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1120 return new GlobalVariable(M, IRB.getInt32Ty(), true,
1121 GlobalValue::WeakODRLinkage,
1122 IRB.getInt32(Recover), "__msan_keep_going");
1123 });
1124 }
1125}
1126
1127namespace {
1128
1129/// A helper class that handles instrumentation of VarArg
1130/// functions on a particular platform.
1131///
1132/// Implementations are expected to insert the instrumentation
1133/// necessary to propagate argument shadow through VarArg function
1134/// calls. Visit* methods are called during an InstVisitor pass over
1135/// the function, and should avoid creating new basic blocks. A new
1136/// instance of this class is created for each instrumented function.
1137struct VarArgHelper {
1138 virtual ~VarArgHelper() = default;
1139
1140 /// Visit a CallBase.
1141 virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1142
1143 /// Visit a va_start call.
1144 virtual void visitVAStartInst(VAStartInst &I) = 0;
1145
1146 /// Visit a va_copy call.
1147 virtual void visitVACopyInst(VACopyInst &I) = 0;
1148
1149 /// Finalize function instrumentation.
1150 ///
1151 /// This method is called after visiting all interesting (see above)
1152 /// instructions in a function.
1153 virtual void finalizeInstrumentation() = 0;
1154};
1155
1156struct MemorySanitizerVisitor;
1157
1158} // end anonymous namespace
1159
1160static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1161 MemorySanitizerVisitor &Visitor);
1162
1163static unsigned TypeSizeToSizeIndex(TypeSize TS) {
1164 if (TS.isScalable())
1165 // Scalable types unconditionally take slowpaths.
1166 return kNumberOfAccessSizes;
1167 unsigned TypeSizeFixed = TS.getFixedValue();
1168 if (TypeSizeFixed <= 8)
1169 return 0;
1170 return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
1171}
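// For example, for fixed sizes: i8 -> 0, i16 -> 1, i32 -> 2, i64 -> 3, while
// i128 and wider (like scalable vectors) yield an index >= kNumberOfAccessSizes
// and therefore take the slow, variable-size path.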
1172
1173namespace {
1174
1175/// Helper class to attach debug information of the given instruction onto new
1176/// instructions inserted after it.
1177class NextNodeIRBuilder : public IRBuilder<> {
1178public:
1179 explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1180 SetCurrentDebugLocation(IP->getDebugLoc());
1181 }
1182};
1183
1184/// This class does all the work for a given function. Store and Load
1185/// instructions store and load corresponding shadow and origin
1186/// values. Most instructions propagate shadow from arguments to their
1187/// return values. Certain instructions (most importantly, BranchInst)
1188/// test their argument shadow and print reports (with a runtime call) if it's
1189/// non-zero.
1190struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1191 Function &F;
1192 MemorySanitizer &MS;
1193 SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1194 ValueMap<Value *, Value *> ShadowMap, OriginMap;
1195 std::unique_ptr<VarArgHelper> VAHelper;
1196 const TargetLibraryInfo *TLI;
1197 Instruction *FnPrologueEnd;
1198 SmallVector<Instruction *, 16> Instructions;
1199
1200 // The following flags disable parts of MSan instrumentation based on
1201 // exclusion list contents and command-line options.
1202 bool InsertChecks;
1203 bool PropagateShadow;
1204 bool PoisonStack;
1205 bool PoisonUndef;
1206 bool PoisonUndefVectors;
1207
1208 struct ShadowOriginAndInsertPoint {
1209 Value *Shadow;
1210 Value *Origin;
1211 Instruction *OrigIns;
1212
1213 ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1214 : Shadow(S), Origin(O), OrigIns(I) {}
1215 };
1216 SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
1217 DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1218 SmallSetVector<AllocaInst *, 16> AllocaSet;
1219 SmallVector<std::pair<IntrinsicInst *, AllocaInst *>, 16> LifetimeStartList;
1220 SmallVector<StoreInst *, 16> StoreList;
1221 int64_t SplittableBlocksCount = 0;
1222
1223 MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1224 const TargetLibraryInfo &TLI)
1225 : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1226 bool SanitizeFunction =
1227 F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1228 InsertChecks = SanitizeFunction;
1229 PropagateShadow = SanitizeFunction;
1230 PoisonStack = SanitizeFunction && ClPoisonStack;
1231 PoisonUndef = SanitizeFunction && ClPoisonUndef;
1232 PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
1233
1234 // In the presence of unreachable blocks, we may see Phi nodes with
1235 // incoming nodes from such blocks. Since InstVisitor skips unreachable
1236 // blocks, such nodes will not have any shadow value associated with them.
1237 // It's easier to remove unreachable blocks than deal with missing shadow.
1238 removeUnreachableBlocks(F);
1239
1240 MS.initializeCallbacks(*F.getParent(), TLI);
1241 FnPrologueEnd =
1242 IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
1243 .CreateIntrinsic(Intrinsic::donothing, {});
1244
1245 if (MS.CompileKernel) {
1246 IRBuilder<> IRB(FnPrologueEnd);
1247 insertKmsanPrologue(IRB);
1248 }
1249
1250 LLVM_DEBUG(if (!InsertChecks) dbgs()
1251 << "MemorySanitizer is not inserting checks into '"
1252 << F.getName() << "'\n");
1253 }
1254
1255 bool instrumentWithCalls(Value *V) {
1256 // Constants likely will be eliminated by follow-up passes.
1257 if (isa<Constant>(V))
1258 return false;
1259 ++SplittableBlocksCount;
1260 return ClInstrumentationWithCallThreshold >= 0 &&
1261 SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1262 }
1263
1264 bool isInPrologue(Instruction &I) {
1265 return I.getParent() == FnPrologueEnd->getParent() &&
1266 (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1267 }
1268
1269 // Creates a new origin and records the stack trace. In general we can call
1270 // this function for any origin manipulation we like. However it will cost
1271 // runtime resources. So use this wisely only if it can provide additional
1272 // information helpful to a user.
1273 Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1274 if (MS.TrackOrigins <= 1)
1275 return V;
1276 return IRB.CreateCall(MS.MsanChainOriginFn, V);
1277 }
1278
1279 Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1280 const DataLayout &DL = F.getDataLayout();
1281 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1282 if (IntptrSize == kOriginSize)
1283 return Origin;
1284 assert(IntptrSize == kOriginSize * 2);
1285 Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
1286 return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1287 }
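  // For example, on a 64-bit target an origin of 0xAABBCCDD is widened to
  // 0xAABBCCDDAABBCCDD, so a single intptr-sized store in paintOrigin below
  // fills two adjacent 4-byte origin slots at once.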
1288
1289 /// Fill memory range with the given origin value.
1290 void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1291 TypeSize TS, Align Alignment) {
1292 const DataLayout &DL = F.getDataLayout();
1293 const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1294 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1295 assert(IntptrAlignment >= kMinOriginAlignment);
1296 assert(IntptrSize >= kOriginSize);
1297
1298 // Note: The loop based formation works for fixed length vectors too,
1299 // however we prefer to unroll and specialize alignment below.
1300 if (TS.isScalable()) {
1301 Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
1302 Value *RoundUp =
1303 IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
1304 Value *End =
1305 IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
1306 auto [InsertPt, Index] =
1307 SplitBlockAndInsertSimpleForLoop(End, &*IRB.GetInsertPoint());
1308 IRB.SetInsertPoint(InsertPt);
1309
1310 Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
1311 IRB.CreateAlignedStore(Origin, GEP, kMinOriginAlignment);
1312 return;
1313 }
1314
1315 unsigned Size = TS.getFixedValue();
1316
1317 unsigned Ofs = 0;
1318 Align CurrentAlignment = Alignment;
1319 if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1320 Value *IntptrOrigin = originToIntptr(IRB, Origin);
1321 Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
1322 for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1323 Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1324 : IntptrOriginPtr;
1325 IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1326 Ofs += IntptrSize / kOriginSize;
1327 CurrentAlignment = IntptrAlignment;
1328 }
1329 }
1330
1331 for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1332 Value *GEP =
1333 i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1334 IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1335 CurrentAlignment = kMinOriginAlignment;
1336 }
1337 }
1338
1339 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1340 Value *OriginPtr, Align Alignment) {
1341 const DataLayout &DL = F.getDataLayout();
1342 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1343 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
1344 // ZExt cannot convert between vector and scalar
1345 Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1346 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1347 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1348 // Origin is not needed: value is initialized or const shadow is
1349 // ignored.
1350 return;
1351 }
1352 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1353 // Copy origin as the value is definitely uninitialized.
1354 paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1355 OriginAlignment);
1356 return;
1357 }
1358 // Fallback to runtime check, which still can be optimized out later.
1359 }
1360
1361 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1362 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1363 if (instrumentWithCalls(ConvertedShadow) &&
1364 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1365 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1366 Value *ConvertedShadow2 =
1367 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1368 CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
1369 CB->addParamAttr(0, Attribute::ZExt);
1370 CB->addParamAttr(2, Attribute::ZExt);
1371 } else {
1372 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1373 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1374 Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1375 IRBuilder<> IRBNew(CheckTerm);
1376 paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1377 OriginAlignment);
1378 }
1379 }
1380
1381 void materializeStores() {
1382 for (StoreInst *SI : StoreList) {
1383 IRBuilder<> IRB(SI);
1384 Value *Val = SI->getValueOperand();
1385 Value *Addr = SI->getPointerOperand();
1386 Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1387 Value *ShadowPtr, *OriginPtr;
1388 Type *ShadowTy = Shadow->getType();
1389 const Align Alignment = SI->getAlign();
1390 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1391 std::tie(ShadowPtr, OriginPtr) =
1392 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1393
1394 [[maybe_unused]] StoreInst *NewSI =
1395 IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1396 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1397
1398 if (SI->isAtomic())
1399 SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1400
1401 if (MS.TrackOrigins && !SI->isAtomic())
1402 storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1403 OriginAlignment);
1404 }
1405 }
1406
1407 // Returns true if Debug Location corresponds to multiple warnings.
1408 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1409 if (MS.TrackOrigins < 2)
1410 return false;
1411
1412 if (LazyWarningDebugLocationCount.empty())
1413 for (const auto &I : InstrumentationList)
1414 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1415
1416 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1417 }
1418
1419 /// Helper function to insert a warning at IRB's current insert point.
1420 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1421 if (!Origin)
1422 Origin = (Value *)IRB.getInt32(0);
1423 assert(Origin->getType()->isIntegerTy());
1424
1425 if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1426 // Try to create additional origin with debug info of the last origin
1427 // instruction. It may provide additional information to the user.
1428 if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1429 assert(MS.TrackOrigins);
1430 auto NewDebugLoc = OI->getDebugLoc();
1431 // Origin update with missing or the same debug location provides no
1432 // additional value.
1433 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1434 // Insert update just before the check, so we call runtime only just
1435 // before the report.
1436 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1437 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1438 Origin = updateOrigin(Origin, IRBOrigin);
1439 }
1440 }
1441 }
1442
1443 if (MS.CompileKernel || MS.TrackOrigins)
1444 IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1445 else
1446 IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1447 // FIXME: Insert UnreachableInst if !MS.Recover?
1448 // This may invalidate some of the following checks and needs to be done
1449 // at the very end.
1450 }
1451
1452 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1453 Value *Origin) {
1454 const DataLayout &DL = F.getDataLayout();
1455 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1456 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1457 if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
1458 // ZExt cannot convert between vector and scalar
1459 ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
1460 Value *ConvertedShadow2 =
1461 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1462
1463 if (SizeIndex < kNumberOfAccessSizes) {
1464 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1465 CallBase *CB = IRB.CreateCall(
1466 Fn,
1467 {ConvertedShadow2,
1468 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1469 CB->addParamAttr(0, Attribute::ZExt);
1470 CB->addParamAttr(1, Attribute::ZExt);
1471 } else {
1472 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1473 Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
1474 IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
1475 unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
1476 CallBase *CB = IRB.CreateCall(
1477 Fn,
1478 {ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
1479 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1480 CB->addParamAttr(1, Attribute::ZExt);
1481 CB->addParamAttr(2, Attribute::ZExt);
1482 }
1483 } else {
1484 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1485 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1486 Cmp, &*IRB.GetInsertPoint(),
1487 /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1488
1489 IRB.SetInsertPoint(CheckTerm);
1490 insertWarningFn(IRB, Origin);
1491 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1492 }
1493 }
1494
1495 void materializeInstructionChecks(
1496 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1497 const DataLayout &DL = F.getDataLayout();
1498 // Disable combining in some cases. TrackOrigins checks each shadow to pick
1499 // correct origin.
1500 bool Combine = !MS.TrackOrigins;
1501 Instruction *Instruction = InstructionChecks.front().OrigIns;
1502 Value *Shadow = nullptr;
1503 for (const auto &ShadowData : InstructionChecks) {
1504 assert(ShadowData.OrigIns == Instruction);
1505 IRBuilder<> IRB(Instruction);
1506
1507 Value *ConvertedShadow = ShadowData.Shadow;
1508
1509 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1510 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1511 // Skip, value is initialized or const shadow is ignored.
1512 continue;
1513 }
1514 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1515 // Report as the value is definitely uninitialized.
1516 insertWarningFn(IRB, ShadowData.Origin);
1517 if (!MS.Recover)
1518 return; // Always fail and stop here, no need to check the rest.
1519 // Skip the entire instruction.
1520 continue;
1521 }
1522 // Fallback to runtime check, which still can be optimized out later.
1523 }
1524
1525 if (!Combine) {
1526 materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1527 continue;
1528 }
1529
1530 if (!Shadow) {
1531 Shadow = ConvertedShadow;
1532 continue;
1533 }
1534
1535 Shadow = convertToBool(Shadow, IRB, "_mscmp");
1536 ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1537 Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1538 }
1539
1540 if (Shadow) {
1541 assert(Combine);
1542 IRBuilder<> IRB(Instruction);
1543 materializeOneCheck(IRB, Shadow, nullptr);
1544 }
1545 }
1546
1547 void materializeChecks() {
1548#ifndef NDEBUG
1549 // For assert below.
1550 SmallPtrSet<Instruction *, 16> Done;
1551#endif
1552
1553 for (auto I = InstrumentationList.begin();
1554 I != InstrumentationList.end();) {
1555 auto OrigIns = I->OrigIns;
1556 // Checks are grouped by the original instruction. We materialize all checks
1557 // queued via `insertCheckShadow` for an instruction at once.
1558 assert(Done.insert(OrigIns).second);
1559 auto J = std::find_if(I + 1, InstrumentationList.end(),
1560 [OrigIns](const ShadowOriginAndInsertPoint &R) {
1561 return OrigIns != R.OrigIns;
1562 });
1563 // Process all checks of the instruction at once.
1564 materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1565 I = J;
1566 }
1567
1568 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1569 }
1570
1571 // Set up the KMSAN context state pointers in the function prologue.
1572 void insertKmsanPrologue(IRBuilder<> &IRB) {
1573 Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1574 Constant *Zero = IRB.getInt32(0);
1575 MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1576 {Zero, IRB.getInt32(0)}, "param_shadow");
1577 MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1578 {Zero, IRB.getInt32(1)}, "retval_shadow");
1579 MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1580 {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1581 MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1582 {Zero, IRB.getInt32(3)}, "va_arg_origin");
1583 MS.VAArgOverflowSizeTLS =
1584 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1585 {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1586 MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1587 {Zero, IRB.getInt32(5)}, "param_origin");
1588 MS.RetvalOriginTLS =
1589 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1590 {Zero, IRB.getInt32(6)}, "retval_origin");
1591 if (MS.TargetTriple.getArch() == Triple::systemz)
1592 MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
1593 }
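// Illustrative sketch (not emitted verbatim, and assuming the field order of
// MS.MsanContextStateTy matches the GEP indices above): the prologue built
// here is roughly
//   %state         = call ptr <MsanGetContextStateFn>()
//   %param_shadow  = getelementptr <ContextStateTy>, ptr %state, i32 0, i32 0
//   %retval_shadow = getelementptr <ContextStateTy>, ptr %state, i32 0, i32 1
//   ...
// with indices 2..6 covering va_arg_shadow, va_arg_origin,
// va_arg_overflow_size, param_origin and retval_origin.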
1594
1595 /// Add MemorySanitizer instrumentation to a function.
1596 bool runOnFunction() {
1597 // Iterate all BBs in depth-first order and create shadow instructions
1598 // for all instructions (where applicable).
1599 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1600 for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1601 visit(*BB);
1602
1603 // `visit` above only collects instructions. Process them after iterating the
1604 // CFG, so that CFG changes made by the instrumentation cannot invalidate the traversal.
1605 for (Instruction *I : Instructions)
1606 InstVisitor<MemorySanitizerVisitor>::visit(*I);
1607
1608 // Finalize PHI nodes.
1609 for (PHINode *PN : ShadowPHINodes) {
1610 PHINode *PNS = cast<PHINode>(getShadow(PN));
1611 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1612 size_t NumValues = PN->getNumIncomingValues();
1613 for (size_t v = 0; v < NumValues; v++) {
1614 PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1615 if (PNO)
1616 PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1617 }
1618 }
1619
1620 VAHelper->finalizeInstrumentation();
1621
1622 // Poison llvm.lifetime.start intrinsics, if we haven't fallen back to
1623 // instrumenting only allocas.
1624 if (InstrumentLifetimeStart) {
1625 for (auto Item : LifetimeStartList) {
1626 instrumentAlloca(*Item.second, Item.first);
1627 AllocaSet.remove(Item.second);
1628 }
1629 }
1630 // Poison the allocas for which we didn't instrument the corresponding
1631 // lifetime intrinsics.
1632 for (AllocaInst *AI : AllocaSet)
1633 instrumentAlloca(*AI);
1634
1635 // Insert shadow value checks.
1636 materializeChecks();
1637
1638 // Delayed instrumentation of StoreInst.
1639 // This may not add new address checks.
1640 materializeStores();
1641
1642 return true;
1643 }
1644
1645 /// Compute the shadow type that corresponds to a given Value.
1646 Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1647
1648 /// Compute the shadow type that corresponds to a given Type.
1649 Type *getShadowTy(Type *OrigTy) {
1650 if (!OrigTy->isSized()) {
1651 return nullptr;
1652 }
1653 // For integer type, shadow is the same as the original type.
1654 // This may return weird-sized types like i1.
1655 if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1656 return IT;
1657 const DataLayout &DL = F.getDataLayout();
1658 if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1659 uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1660 return VectorType::get(IntegerType::get(*MS.C, EltSize),
1661 VT->getElementCount());
1662 }
1663 if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1664 return ArrayType::get(getShadowTy(AT->getElementType()),
1665 AT->getNumElements());
1666 }
1667 if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1668 SmallVector<Type *, 4> Elements;
1669 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1670 Elements.push_back(getShadowTy(ST->getElementType(i)));
1671 StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1672 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1673 return Res;
1674 }
1675 uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1676 return IntegerType::get(*MS.C, TypeSize);
1677 }
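// A few illustrative results of the rules above:
//   getShadowTy(<4 x float>) -> <4 x i32>  (per-element integer of equal width)
//   getShadowTy({i64, i16})  -> {i64, i16} (element-wise recursion)
//   getShadowTy(double)      -> i64        (final getTypeSizeInBits fallback)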
1678
1679 /// Extract combined shadow of struct elements as a bool
1680 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1681 IRBuilder<> &IRB) {
1682 Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1683 Value *Aggregator = FalseVal;
1684
1685 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1686 // Combine by ORing together each element's bool shadow
1687 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1688 Value *ShadowBool = convertToBool(ShadowItem, IRB);
1689
1690 if (Aggregator != FalseVal)
1691 Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1692 else
1693 Aggregator = ShadowBool;
1694 }
1695
1696 return Aggregator;
1697 }
1698
1699 // Extract combined shadow of array elements
1700 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1701 IRBuilder<> &IRB) {
1702 if (!Array->getNumElements())
1703 return IRB.getIntN(/* width */ 1, /* value */ 0);
1704
1705 Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1706 Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1707
1708 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1709 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1710 Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1711 Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1712 }
1713 return Aggregator;
1714 }
1715
1716 /// Convert a shadow value to its flattened variant. The resulting
1717 /// shadow may not necessarily have the same bit width as the input
1718 /// value, but it will always be comparable to zero.
1719 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1720 if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1721 return collapseStructShadow(Struct, V, IRB);
1722 if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1723 return collapseArrayShadow(Array, V, IRB);
1724 if (isa<VectorType>(V->getType())) {
1725 if (isa<ScalableVectorType>(V->getType()))
1726 return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
1727 unsigned BitWidth =
1728 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1729 return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
1730 }
1731 return V;
1732 }
1733
1734 // Convert a scalar value to an i1 by comparing with 0
1735 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1736 Type *VTy = V->getType();
1737 if (!VTy->isIntegerTy())
1738 return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1739 if (VTy->getIntegerBitWidth() == 1)
1740 // Just converting a bool to a bool, so do nothing.
1741 return V;
1742 return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1743 }
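// For example, an i32 shadow %s becomes a single comparison, roughly
//   %_mscmp = icmp ne i32 %s, 0
// while aggregate and vector shadows are first collapsed by
// convertShadowToScalar above.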
1744
1745 Type *ptrToIntPtrType(Type *PtrTy) const {
1746 if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
1747 return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1748 VectTy->getElementCount());
1749 }
1750 assert(PtrTy->isIntOrPtrTy());
1751 return MS.IntptrTy;
1752 }
1753
1754 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1755 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1756 return VectorType::get(
1757 getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1758 VectTy->getElementCount());
1759 }
1760 assert(IntPtrTy == MS.IntptrTy);
1761 return MS.PtrTy;
1762 }
1763
1764 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1765 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1766 return ConstantVector::getSplat(
1767 VectTy->getElementCount(),
1768 constToIntPtr(VectTy->getElementType(), C));
1769 }
1770 assert(IntPtrTy == MS.IntptrTy);
1771 return ConstantInt::get(MS.IntptrTy, C);
1772 }
1773
1774 /// Returns the integer shadow offset that corresponds to a given
1775 /// application address, whereby:
1776 ///
1777 /// Offset = (Addr & ~AndMask) ^ XorMask
1778 /// Shadow = ShadowBase + Offset
1779 /// Origin = (OriginBase + Offset) & ~Alignment
1780 ///
1781 /// Note: for efficiency, many shadow mappings only use the XorMask
1782 /// and OriginBase; the AndMask and ShadowBase are often zero.
1783 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1784 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1785 Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1786
1787 if (uint64_t AndMask = MS.MapParams->AndMask)
1788 OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1789
1790 if (uint64_t XorMask = MS.MapParams->XorMask)
1791 OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1792 return OffsetLong;
1793 }
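// Worked example, assuming a mapping with AndMask == 0 and
// XorMask == 0x500000000000 (the Linux/x86_64 parameters defined earlier in
// this file): an application address A yields Offset = A ^ 0x500000000000,
// the shadow address is just Offset (ShadowBase == 0), and the origin address
// is OriginBase + Offset rounded down to a 4-byte boundary.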
1794
1795 /// Compute the shadow and origin addresses corresponding to a given
1796 /// application address.
1797 ///
1798 /// Shadow = ShadowBase + Offset
1799 /// Origin = (OriginBase + Offset) & ~3ULL
1800 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1801 /// a single pointee.
1802 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1803 std::pair<Value *, Value *>
1804 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1805 MaybeAlign Alignment) {
1806 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1807 if (!VectTy) {
1808 assert(Addr->getType()->isPointerTy());
1809 } else {
1810 assert(VectTy->getElementType()->isPointerTy());
1811 }
1812 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1813 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1814 Value *ShadowLong = ShadowOffset;
1815 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1816 ShadowLong =
1817 IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1818 }
1819 Value *ShadowPtr = IRB.CreateIntToPtr(
1820 ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1821
1822 Value *OriginPtr = nullptr;
1823 if (MS.TrackOrigins) {
1824 Value *OriginLong = ShadowOffset;
1825 uint64_t OriginBase = MS.MapParams->OriginBase;
1826 if (OriginBase != 0)
1827 OriginLong =
1828 IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1829 if (!Alignment || *Alignment < kMinOriginAlignment) {
1830 uint64_t Mask = kMinOriginAlignment.value() - 1;
1831 OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1832 }
1833 OriginPtr = IRB.CreateIntToPtr(
1834 OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1835 }
1836 return std::make_pair(ShadowPtr, OriginPtr);
1837 }
1838
1839 template <typename... ArgsTy>
1840 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1841 ArgsTy... Args) {
1842 if (MS.TargetTriple.getArch() == Triple::systemz) {
1843 IRB.CreateCall(Callee,
1844 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1845 return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
1846 }
1847
1848 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1849 }
1850
1851 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1852 IRBuilder<> &IRB,
1853 Type *ShadowTy,
1854 bool isStore) {
1855 Value *ShadowOriginPtrs;
1856 const DataLayout &DL = F.getDataLayout();
1857 TypeSize Size = DL.getTypeStoreSize(ShadowTy);
1858
1859 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1860 Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
1861 if (Getter) {
1862 ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
1863 } else {
1864 Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1865 ShadowOriginPtrs = createMetadataCall(
1866 IRB,
1867 isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1868 AddrCast, SizeVal);
1869 }
1870 Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1871 ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
1872 Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1873
1874 return std::make_pair(ShadowPtr, OriginPtr);
1875 }
1876
1877 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1878 /// a single pointee.
1879 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1880 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1881 IRBuilder<> &IRB,
1882 Type *ShadowTy,
1883 bool isStore) {
1884 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1885 if (!VectTy) {
1886 assert(Addr->getType()->isPointerTy());
1887 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1888 }
1889
1890 // TODO: Support callbacks with vectors of addresses.
1891 unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
1892 Value *ShadowPtrs = ConstantInt::getNullValue(
1893 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1894 Value *OriginPtrs = nullptr;
1895 if (MS.TrackOrigins)
1896 OriginPtrs = ConstantInt::getNullValue(
1897 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1898 for (unsigned i = 0; i < NumElements; ++i) {
1899 Value *OneAddr =
1900 IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1901 auto [ShadowPtr, OriginPtr] =
1902 getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1903
1904 ShadowPtrs = IRB.CreateInsertElement(
1905 ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1906 if (MS.TrackOrigins)
1907 OriginPtrs = IRB.CreateInsertElement(
1908 OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1909 }
1910 return {ShadowPtrs, OriginPtrs};
1911 }
1912
1913 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1914 Type *ShadowTy,
1915 MaybeAlign Alignment,
1916 bool isStore) {
1917 if (MS.CompileKernel)
1918 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1919 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1920 }
1921
1922 /// Compute the shadow address for a given function argument.
1923 ///
1924 /// Shadow = ParamTLS+ArgOffset.
1925 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1926 Value *Base = IRB.CreatePointerCast(MS.ParamTLS, MS.IntptrTy);
1927 if (ArgOffset)
1928 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1929 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg");
1930 }
1931
1932 /// Compute the origin address for a given function argument.
1933 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1934 if (!MS.TrackOrigins)
1935 return nullptr;
1936 Value *Base = IRB.CreatePointerCast(MS.ParamOriginTLS, MS.IntptrTy);
1937 if (ArgOffset)
1938 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1939 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg_o");
1940 }
1941
1942 /// Compute the shadow address for a retval.
1943 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1944 return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
1945 }
1946
1947 /// Compute the origin address for a retval.
1948 Value *getOriginPtrForRetval() {
1949 // We keep a single origin for the entire retval. Might be too optimistic.
1950 return MS.RetvalOriginTLS;
1951 }
1952
1953 /// Set SV to be the shadow value for V.
1954 void setShadow(Value *V, Value *SV) {
1955 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1956 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1957 }
1958
1959 /// Set Origin to be the origin value for V.
1960 void setOrigin(Value *V, Value *Origin) {
1961 if (!MS.TrackOrigins)
1962 return;
1963 assert(!OriginMap.count(V) && "Values may only have one origin");
1964 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1965 OriginMap[V] = Origin;
1966 }
1967
1968 Constant *getCleanShadow(Type *OrigTy) {
1969 Type *ShadowTy = getShadowTy(OrigTy);
1970 if (!ShadowTy)
1971 return nullptr;
1972 return Constant::getNullValue(ShadowTy);
1973 }
1974
1975 /// Create a clean shadow value for a given value.
1976 ///
1977 /// Clean shadow (all zeroes) means all bits of the value are defined
1978 /// (initialized).
1979 Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
1980
1981 /// Create a dirty shadow of a given shadow type.
1982 Constant *getPoisonedShadow(Type *ShadowTy) {
1983 assert(ShadowTy);
1984 if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
1985 return Constant::getAllOnesValue(ShadowTy);
1986 if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
1987 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
1988 getPoisonedShadow(AT->getElementType()));
1989 return ConstantArray::get(AT, Vals);
1990 }
1991 if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
1992 SmallVector<Constant *, 4> Vals;
1993 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1994 Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
1995 return ConstantStruct::get(ST, Vals);
1996 }
1997 llvm_unreachable("Unexpected shadow type");
1998 }
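// For example, getPoisonedShadow({i32, [2 x i8]}) builds the constant
//   { i32 -1, [2 x i8] [i8 -1, i8 -1] }
// i.e. every integer leaf of the shadow aggregate is all-ones.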
1999
2000 /// Create a dirty shadow for a given value.
2001 Constant *getPoisonedShadow(Value *V) {
2002 Type *ShadowTy = getShadowTy(V);
2003 if (!ShadowTy)
2004 return nullptr;
2005 return getPoisonedShadow(ShadowTy);
2006 }
2007
2008 /// Create a clean (zero) origin.
2009 Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
2010
2011 /// Get the shadow value for a given Value.
2012 ///
2013 /// This function either returns the value set earlier with setShadow,
2014 /// or extracts it from ParamTLS (for function arguments).
2015 Value *getShadow(Value *V) {
2016 if (Instruction *I = dyn_cast<Instruction>(V)) {
2017 if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
2018 return getCleanShadow(V);
2019 // For instructions the shadow is already stored in the map.
2020 Value *Shadow = ShadowMap[V];
2021 if (!Shadow) {
2022 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2023 assert(Shadow && "No shadow for a value");
2024 }
2025 return Shadow;
2026 }
2027 // Handle fully undefined values
2028 // (partially undefined constant vectors are handled later)
2029 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
2030 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2031 : getCleanShadow(V);
2032 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2033 return AllOnes;
2034 }
2035 if (Argument *A = dyn_cast<Argument>(V)) {
2036 // For arguments we compute the shadow on demand and store it in the map.
2037 Value *&ShadowPtr = ShadowMap[V];
2038 if (ShadowPtr)
2039 return ShadowPtr;
2040 Function *F = A->getParent();
2041 IRBuilder<> EntryIRB(FnPrologueEnd);
2042 unsigned ArgOffset = 0;
2043 const DataLayout &DL = F->getDataLayout();
2044 for (auto &FArg : F->args()) {
2045 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2046 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2047 ? "vscale not fully supported\n"
2048 : "Arg is not sized\n"));
2049 if (A == &FArg) {
2050 ShadowPtr = getCleanShadow(V);
2051 setOrigin(A, getCleanOrigin());
2052 break;
2053 }
2054 continue;
2055 }
2056
2057 unsigned Size = FArg.hasByValAttr()
2058 ? DL.getTypeAllocSize(FArg.getParamByValType())
2059 : DL.getTypeAllocSize(FArg.getType());
2060
2061 if (A == &FArg) {
2062 bool Overflow = ArgOffset + Size > kParamTLSSize;
2063 if (FArg.hasByValAttr()) {
2064 // ByVal pointer itself has clean shadow. We copy the actual
2065 // argument shadow to the underlying memory.
2066 // Figure out maximal valid memcpy alignment.
2067 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2068 FArg.getParamAlign(), FArg.getParamByValType());
2069 Value *CpShadowPtr, *CpOriginPtr;
2070 std::tie(CpShadowPtr, CpOriginPtr) =
2071 getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
2072 /*isStore*/ true);
2073 if (!PropagateShadow || Overflow) {
2074 // ParamTLS overflow.
2075 EntryIRB.CreateMemSet(
2076 CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
2077 Size, ArgAlign);
2078 } else {
2079 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2080 const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
2081 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2082 CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
2083 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2084
2085 if (MS.TrackOrigins) {
2086 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2087 // FIXME: OriginSize should be:
2088 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2089 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
2090 EntryIRB.CreateMemCpy(
2091 CpOriginPtr,
2092 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
2093 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
2094 OriginSize);
2095 }
2096 }
2097 }
2098
2099 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2100 (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
2101 ShadowPtr = getCleanShadow(V);
2102 setOrigin(A, getCleanOrigin());
2103 } else {
2104 // Shadow over TLS
2105 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2106 ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
2107 kShadowTLSAlignment);
2108 if (MS.TrackOrigins) {
2109 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2110 setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
2111 }
2112 }
2114 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2115 break;
2116 }
2117
2118 ArgOffset += alignTo(Size, kShadowTLSAlignment);
2119 }
2120 assert(ShadowPtr && "Could not find shadow for an argument");
2121 return ShadowPtr;
2122 }
2123
2124 // Check for partially-undefined constant vectors
2125 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2126 if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
2127 cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
2128 PoisonUndefVectors) {
2129 unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
2130 SmallVector<Constant *, 32> ShadowVector(NumElems);
2131 for (unsigned i = 0; i != NumElems; ++i) {
2132 Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
2133 ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
2134 : getCleanShadow(Elem);
2135 }
2136
2137 Value *ShadowConstant = ConstantVector::get(ShadowVector);
2138 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2139 << *ShadowConstant << "\n");
2140
2141 return ShadowConstant;
2142 }
2143
2144 // TODO: partially-undefined constant arrays, structures, and nested types
2145
2146 // For everything else the shadow is zero.
2147 return getCleanShadow(V);
2148 }
2149
2150 /// Get the shadow for i-th argument of the instruction I.
2151 Value *getShadow(Instruction *I, int i) {
2152 return getShadow(I->getOperand(i));
2153 }
2154
2155 /// Get the origin for a value.
2156 Value *getOrigin(Value *V) {
2157 if (!MS.TrackOrigins)
2158 return nullptr;
2159 if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
2160 return getCleanOrigin();
2162 "Unexpected value type in getOrigin()");
2163 if (Instruction *I = dyn_cast<Instruction>(V)) {
2164 if (I->getMetadata(LLVMContext::MD_nosanitize))
2165 return getCleanOrigin();
2166 }
2167 Value *Origin = OriginMap[V];
2168 assert(Origin && "Missing origin");
2169 return Origin;
2170 }
2171
2172 /// Get the origin for i-th argument of the instruction I.
2173 Value *getOrigin(Instruction *I, int i) {
2174 return getOrigin(I->getOperand(i));
2175 }
2176
2177 /// Remember the place where a shadow check should be inserted.
2178 ///
2179 /// This location will later be instrumented with a check that will print a
2180 /// UMR warning at runtime if the shadow value is not 0.
2181 void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2182 assert(Shadow);
2183 if (!InsertChecks)
2184 return;
2185
2186 if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
2187 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2188 << *OrigIns << "\n");
2189 return;
2190 }
2191#ifndef NDEBUG
2192 Type *ShadowTy = Shadow->getType();
2193 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2194 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2195 "Can only insert checks for integer, vector, and aggregate shadow "
2196 "types");
2197#endif
2198 InstrumentationList.push_back(
2199 ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2200 }
2201
2202 /// Get shadow for value, and remember the place where a shadow check should
2203 /// be inserted.
2204 ///
2205 /// This location will later be instrumented with a check that will print a
2206 /// UMR warning at runtime if the value is not fully defined.
2207 void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
2208 assert(Val);
2209 Value *Shadow, *Origin;
2210 if (ClCheckConstantShadow) {
2211 Shadow = getShadow(Val);
2212 if (!Shadow)
2213 return;
2214 Origin = getOrigin(Val);
2215 } else {
2216 Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2217 if (!Shadow)
2218 return;
2219 Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2220 }
2221 insertCheckShadow(Shadow, Origin, OrigIns);
2222 }
2223
2224 AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2225 switch (a) {
2226 case AtomicOrdering::NotAtomic:
2227 return AtomicOrdering::NotAtomic;
2228 case AtomicOrdering::Unordered:
2229 case AtomicOrdering::Monotonic:
2230 case AtomicOrdering::Release:
2231 return AtomicOrdering::Release;
2232 case AtomicOrdering::Acquire:
2233 case AtomicOrdering::AcquireRelease:
2234 return AtomicOrdering::AcquireRelease;
2235 case AtomicOrdering::SequentiallyConsistent:
2236 return AtomicOrdering::SequentiallyConsistent;
2237 }
2238 llvm_unreachable("Unknown ordering");
2239 }
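// For example, addReleaseOrdering(Monotonic) == Release and
// addReleaseOrdering(Acquire) == AcquireRelease: the ordering is strengthened
// on the release side so that the shadow written just before the atomic
// access is published together with the application store.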
2240
2241 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2242 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2243 uint32_t OrderingTable[NumOrderings] = {};
2244
2245 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2246 OrderingTable[(int)AtomicOrderingCABI::release] =
2247 (int)AtomicOrderingCABI::release;
2248 OrderingTable[(int)AtomicOrderingCABI::consume] =
2249 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2250 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2251 (int)AtomicOrderingCABI::acq_rel;
2252 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2253 (int)AtomicOrderingCABI::seq_cst;
2254
2255 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2256 }
2257
2258 AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2259 switch (a) {
2260 case AtomicOrdering::NotAtomic:
2261 return AtomicOrdering::NotAtomic;
2262 case AtomicOrdering::Unordered:
2263 case AtomicOrdering::Monotonic:
2264 case AtomicOrdering::Acquire:
2265 return AtomicOrdering::Acquire;
2266 case AtomicOrdering::Release:
2267 case AtomicOrdering::AcquireRelease:
2268 return AtomicOrdering::AcquireRelease;
2269 case AtomicOrdering::SequentiallyConsistent:
2270 return AtomicOrdering::SequentiallyConsistent;
2271 }
2272 llvm_unreachable("Unknown ordering");
2273 }
2274
2275 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2276 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2277 uint32_t OrderingTable[NumOrderings] = {};
2278
2279 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2280 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2281 OrderingTable[(int)AtomicOrderingCABI::consume] =
2282 (int)AtomicOrderingCABI::acquire;
2283 OrderingTable[(int)AtomicOrderingCABI::release] =
2284 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2285 (int)AtomicOrderingCABI::acq_rel;
2286 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2287 (int)AtomicOrderingCABI::seq_cst;
2288
2289 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2290 }
2291
2292 // ------------------- Visitors.
2293 using InstVisitor<MemorySanitizerVisitor>::visit;
2294 void visit(Instruction &I) {
2295 if (I.getMetadata(LLVMContext::MD_nosanitize))
2296 return;
2297 // Don't want to visit if we're in the prologue
2298 if (isInPrologue(I))
2299 return;
2300 if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
2301 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2302 // We still need to set the shadow and origin to clean values.
2303 setShadow(&I, getCleanShadow(&I));
2304 setOrigin(&I, getCleanOrigin());
2305 return;
2306 }
2307
2308 Instructions.push_back(&I);
2309 }
2310
2311 /// Instrument LoadInst
2312 ///
2313 /// Loads the corresponding shadow and (optionally) origin.
2314 /// Optionally, checks that the load address is fully defined.
2315 void visitLoadInst(LoadInst &I) {
2316 assert(I.getType()->isSized() && "Load type must have size");
2317 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2318 NextNodeIRBuilder IRB(&I);
2319 Type *ShadowTy = getShadowTy(&I);
2320 Value *Addr = I.getPointerOperand();
2321 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2322 const Align Alignment = I.getAlign();
2323 if (PropagateShadow) {
2324 std::tie(ShadowPtr, OriginPtr) =
2325 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2326 setShadow(&I,
2327 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2328 } else {
2329 setShadow(&I, getCleanShadow(&I));
2330 }
2331
2332 if (ClCheckAccessAddress)
2333 insertCheckShadowOf(I.getPointerOperand(), &I);
2334
2335 if (I.isAtomic())
2336 I.setOrdering(addAcquireOrdering(I.getOrdering()));
2337
2338 if (MS.TrackOrigins) {
2339 if (PropagateShadow) {
2340 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2341 setOrigin(
2342 &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2343 } else {
2344 setOrigin(&I, getCleanOrigin());
2345 }
2346 }
2347 }
2348
2349 /// Instrument StoreInst
2350 ///
2351 /// Stores the corresponding shadow and (optionally) origin.
2352 /// Optionally, checks that the store address is fully defined.
2353 void visitStoreInst(StoreInst &I) {
2354 StoreList.push_back(&I);
2355 if (ClCheckAccessAddress)
2356 insertCheckShadowOf(I.getPointerOperand(), &I);
2357 }
2358
2359 void handleCASOrRMW(Instruction &I) {
2360 assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2361
2362 IRBuilder<> IRB(&I);
2363 Value *Addr = I.getOperand(0);
2364 Value *Val = I.getOperand(1);
2365 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2366 /*isStore*/ true)
2367 .first;
2368
2369 if (ClCheckAccessAddress)
2370 insertCheckShadowOf(Addr, &I);
2371
2372 // Only test the conditional argument of the cmpxchg instruction.
2373 // The other argument can potentially be uninitialized, but we cannot
2374 // detect this situation reliably without risking false positives.
2375 if (isa<AtomicCmpXchgInst>(I))
2376 insertCheckShadowOf(Val, &I);
2377
2378 IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2379
2380 setShadow(&I, getCleanShadow(&I));
2381 setOrigin(&I, getCleanOrigin());
2382 }
2383
2384 void visitAtomicRMWInst(AtomicRMWInst &I) {
2385 handleCASOrRMW(I);
2386 I.setOrdering(addReleaseOrdering(I.getOrdering()));
2387 }
2388
2389 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2390 handleCASOrRMW(I);
2391 I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2392 }
2393
2394 // Vector manipulation.
2395 void visitExtractElementInst(ExtractElementInst &I) {
2396 insertCheckShadowOf(I.getOperand(1), &I);
2397 IRBuilder<> IRB(&I);
2398 setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2399 "_msprop"));
2400 setOrigin(&I, getOrigin(&I, 0));
2401 }
2402
2403 void visitInsertElementInst(InsertElementInst &I) {
2404 insertCheckShadowOf(I.getOperand(2), &I);
2405 IRBuilder<> IRB(&I);
2406 auto *Shadow0 = getShadow(&I, 0);
2407 auto *Shadow1 = getShadow(&I, 1);
2408 setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2409 "_msprop"));
2410 setOriginForNaryOp(I);
2411 }
2412
2413 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2414 IRBuilder<> IRB(&I);
2415 auto *Shadow0 = getShadow(&I, 0);
2416 auto *Shadow1 = getShadow(&I, 1);
2417 setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2418 "_msprop"));
2419 setOriginForNaryOp(I);
2420 }
2421
2422 // Casts.
2423 void visitSExtInst(SExtInst &I) {
2424 IRBuilder<> IRB(&I);
2425 setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2426 setOrigin(&I, getOrigin(&I, 0));
2427 }
2428
2429 void visitZExtInst(ZExtInst &I) {
2430 IRBuilder<> IRB(&I);
2431 setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2432 setOrigin(&I, getOrigin(&I, 0));
2433 }
2434
2435 void visitTruncInst(TruncInst &I) {
2436 IRBuilder<> IRB(&I);
2437 setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2438 setOrigin(&I, getOrigin(&I, 0));
2439 }
2440
2441 void visitBitCastInst(BitCastInst &I) {
2442 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2443 // a musttail call and a ret, don't instrument. New instructions are not
2444 // allowed after a musttail call.
2445 if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2446 if (CI->isMustTailCall())
2447 return;
2448 IRBuilder<> IRB(&I);
2449 setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2450 setOrigin(&I, getOrigin(&I, 0));
2451 }
2452
2453 void visitPtrToIntInst(PtrToIntInst &I) {
2454 IRBuilder<> IRB(&I);
2455 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2456 "_msprop_ptrtoint"));
2457 setOrigin(&I, getOrigin(&I, 0));
2458 }
2459
2460 void visitIntToPtrInst(IntToPtrInst &I) {
2461 IRBuilder<> IRB(&I);
2462 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2463 "_msprop_inttoptr"));
2464 setOrigin(&I, getOrigin(&I, 0));
2465 }
2466
2467 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2468 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2469 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2470 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2471 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2472 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2473
2474 /// Propagate shadow for bitwise AND.
2475 ///
2476 /// This code is exact, i.e. if, for example, a bit in the left argument
2477 /// is defined and 0, then neither the value nor the definedness of the
2478 /// corresponding bit in the right argument affects the resulting shadow.
2479 void visitAnd(BinaryOperator &I) {
2480 IRBuilder<> IRB(&I);
2481 // "And" of 0 and a poisoned value results in unpoisoned value.
2482 // 1&1 => 1; 0&1 => 0; p&1 => p;
2483 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2484 // 1&p => p; 0&p => 0; p&p => p;
2485 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
2486 Value *S1 = getShadow(&I, 0);
2487 Value *S2 = getShadow(&I, 1);
2488 Value *V1 = I.getOperand(0);
2489 Value *V2 = I.getOperand(1);
2490 if (V1->getType() != S1->getType()) {
2491 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2492 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2493 }
2494 Value *S1S2 = IRB.CreateAnd(S1, S2);
2495 Value *V1S2 = IRB.CreateAnd(V1, S2);
2496 Value *S1V2 = IRB.CreateAnd(S1, V2);
2497 setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2498 setOriginForNaryOp(I);
2499 }
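// Single-bit check of the formula above: if a bit of V1 is a defined 0, then
// S1 == 0 and V1 == 0 for that bit, so S1&S2, V1&S2 and S1&V2 are all 0 and
// the result bit is defined; if that bit of V1 is a defined 1 and the
// corresponding bit of V2 is poisoned, V1&S2 == 1 and the result bit is
// (correctly) poisoned.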
2500
2501 void visitOr(BinaryOperator &I) {
2502 IRBuilder<> IRB(&I);
2503 // "Or" of 1 and a poisoned value results in unpoisoned value:
2504 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2505 // 1|0 => 1; 0|0 => 0; p|0 => p;
2506 // 1|p => 1; 0|p => p; p|p => p;
2507 //
2508 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2509 //
2510 // If the "disjoint OR" property is violated, the result is poison, and
2511 // hence the entire shadow is uninitialized:
2512 // S = S | SignExt(V1 & V2 != 0)
2513 Value *S1 = getShadow(&I, 0);
2514 Value *S2 = getShadow(&I, 1);
2515 Value *V1 = I.getOperand(0);
2516 Value *V2 = I.getOperand(1);
2517 if (V1->getType() != S1->getType()) {
2518 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2519 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2520 }
2521
2522 Value *NotV1 = IRB.CreateNot(V1);
2523 Value *NotV2 = IRB.CreateNot(V2);
2524
2525 Value *S1S2 = IRB.CreateAnd(S1, S2);
2526 Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
2527 Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
2528
2529 Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
2530
2531 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
2532 Value *V1V2 = IRB.CreateAnd(V1, V2);
2533 Value *DisjointOrShadow = IRB.CreateSExt(
2534 IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
2535 S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
2536 }
2537
2538 setShadow(&I, S);
2539 setOriginForNaryOp(I);
2540 }
2541
2542 /// Default propagation of shadow and/or origin.
2543 ///
2544 /// This class implements the general case of shadow propagation, used in all
2545 /// cases where we don't know and/or don't care about what the operation
2546 /// actually does. It converts all input shadow values to a common type
2547 /// (extending or truncating as necessary), and bitwise OR's them.
2548 ///
2549 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2550 /// fully initialized), and less prone to false positives.
2551 ///
2552 /// This class also implements the general case of origin propagation. For an
2553 /// N-ary operation, the result origin is set to the origin of an argument that
2554 /// is not entirely initialized. If there is more than one such argument, the
2555 /// rightmost of them is picked. It does not matter which one is picked if all
2556 /// arguments are initialized.
2557 template <bool CombineShadow> class Combiner {
2558 Value *Shadow = nullptr;
2559 Value *Origin = nullptr;
2560 IRBuilder<> &IRB;
2561 MemorySanitizerVisitor *MSV;
2562
2563 public:
2564 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2565 : IRB(IRB), MSV(MSV) {}
2566
2567 /// Add a pair of shadow and origin values to the mix.
2568 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2569 if (CombineShadow) {
2570 assert(OpShadow);
2571 if (!Shadow)
2572 Shadow = OpShadow;
2573 else {
2574 OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2575 Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2576 }
2577 }
2578
2579 if (MSV->MS.TrackOrigins) {
2580 assert(OpOrigin);
2581 if (!Origin) {
2582 Origin = OpOrigin;
2583 } else {
2584 Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2585 // No point in adding something that might result in 0 origin value.
2586 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2587 Value *Cond = MSV->convertToBool(OpShadow, IRB);
2588 Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2589 }
2590 }
2591 }
2592 return *this;
2593 }
2594
2595 /// Add an application value to the mix.
2596 Combiner &Add(Value *V) {
2597 Value *OpShadow = MSV->getShadow(V);
2598 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2599 return Add(OpShadow, OpOrigin);
2600 }
2601
2602 /// Set the current combined values as the given instruction's shadow
2603 /// and origin.
2604 void Done(Instruction *I) {
2605 if (CombineShadow) {
2606 assert(Shadow);
2607 Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2608 MSV->setShadow(I, Shadow);
2609 }
2610 if (MSV->MS.TrackOrigins) {
2611 assert(Origin);
2612 MSV->setOrigin(I, Origin);
2613 }
2614 }
2615
2616 /// Store the current combined value at the specified origin
2617 /// location.
2618 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2619 if (MSV->MS.TrackOrigins) {
2620 assert(Origin);
2621 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
2622 }
2623 }
2624 };
2625
2626 using ShadowAndOriginCombiner = Combiner<true>;
2627 using OriginCombiner = Combiner<false>;
2628
2629 /// Propagate origin for arbitrary operation.
2630 void setOriginForNaryOp(Instruction &I) {
2631 if (!MS.TrackOrigins)
2632 return;
2633 IRBuilder<> IRB(&I);
2634 OriginCombiner OC(this, IRB);
2635 for (Use &Op : I.operands())
2636 OC.Add(Op.get());
2637 OC.Done(&I);
2638 }
2639
2640 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2641 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2642 "Vector of pointers is not a valid shadow type");
2643 return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2644 Ty->getScalarSizeInBits()
2645 : Ty->getPrimitiveSizeInBits();
2646 }
2647
2648 /// Cast between two shadow types, extending or truncating as
2649 /// necessary.
2650 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2651 bool Signed = false) {
2652 Type *srcTy = V->getType();
2653 if (srcTy == dstTy)
2654 return V;
2655 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2656 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2657 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2658 return IRB.CreateICmpNE(V, getCleanShadow(V));
2659
2660 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2661 return IRB.CreateIntCast(V, dstTy, Signed);
2662 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2663 cast<VectorType>(dstTy)->getElementCount() ==
2664 cast<VectorType>(srcTy)->getElementCount())
2665 return IRB.CreateIntCast(V, dstTy, Signed);
2666 Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2667 Value *V2 =
2668 IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2669 return IRB.CreateBitCast(V2, dstTy);
2670 // TODO: handle struct types.
2671 }
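// Illustrative paths through the casts above: <4 x i32> -> <4 x i64> takes
// the per-lane IntCast branch; <8 x i16> -> i128 falls through to the
// bitcast / intcast / bitcast sequence, which degenerates to a single bitcast
// because the bit widths already match.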
2672
2673 /// Cast an application value to the type of its own shadow.
2674 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2675 Type *ShadowTy = getShadowTy(V);
2676 if (V->getType() == ShadowTy)
2677 return V;
2678 if (V->getType()->isPtrOrPtrVectorTy())
2679 return IRB.CreatePtrToInt(V, ShadowTy);
2680 else
2681 return IRB.CreateBitCast(V, ShadowTy);
2682 }
2683
2684 /// Propagate shadow for arbitrary operation.
2685 void handleShadowOr(Instruction &I) {
2686 IRBuilder<> IRB(&I);
2687 ShadowAndOriginCombiner SC(this, IRB);
2688 for (Use &Op : I.operands())
2689 SC.Add(Op.get());
2690 SC.Done(&I);
2691 }
2692
2693 // Perform a bitwise OR on the horizontal pairs (or other specified grouping)
2694 // of elements.
2695 //
2696 // For example, suppose we have:
2697 // VectorA: <a1, a2, a3, a4, a5, a6>
2698 // VectorB: <b1, b2, b3, b4, b5, b6>
2699 // ReductionFactor: 3.
2700 // The output would be:
2701 // <a1|a2|a3, a4|a5|a6, b1|b2|b3, b4|b5|b6>
2702 //
2703 // This is convenient for instrumenting horizontal add/sub.
2704 // For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
2705 Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
2706 Value *VectorA, Value *VectorB) {
2707 assert(isa<FixedVectorType>(VectorA->getType()));
2708 unsigned TotalNumElems =
2709 cast<FixedVectorType>(VectorA->getType())->getNumElements();
2710
2711 if (VectorB) {
2712 assert(VectorA->getType() == VectorB->getType());
2713 TotalNumElems = TotalNumElems * 2;
2714 }
2715
2716 assert(TotalNumElems % ReductionFactor == 0);
2717
2718 Value *Or = nullptr;
2719
2720 IRBuilder<> IRB(&I);
2721 for (unsigned i = 0; i < ReductionFactor; i++) {
2722 SmallVector<int, 16> Mask;
2723 for (unsigned X = 0; X < TotalNumElems; X += ReductionFactor)
2724 Mask.push_back(X + i);
2725
2726 Value *Masked;
2727 if (VectorB)
2728 Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
2729 else
2730 Masked = IRB.CreateShuffleVector(VectorA, Mask);
2731
2732 if (Or)
2733 Or = IRB.CreateOr(Or, Masked);
2734 else
2735 Or = Masked;
2736 }
2737
2738 return Or;
2739 }
2740
2741 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2742 /// fields.
2743 ///
2744 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2745 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
2746 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
2747 assert(I.arg_size() == 1 || I.arg_size() == 2);
2748
2749 assert(I.getType()->isVectorTy());
2750 assert(I.getArgOperand(0)->getType()->isVectorTy());
2751
2752 [[maybe_unused]] FixedVectorType *ParamType =
2753 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2754 assert((I.arg_size() != 2) ||
2755 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2756 [[maybe_unused]] FixedVectorType *ReturnType =
2757 cast<FixedVectorType>(I.getType());
2758 assert(ParamType->getNumElements() * I.arg_size() ==
2759 2 * ReturnType->getNumElements());
2760
2761 IRBuilder<> IRB(&I);
2762
2763 // Horizontal OR of shadow
2764 Value *FirstArgShadow = getShadow(&I, 0);
2765 Value *SecondArgShadow = nullptr;
2766 if (I.arg_size() == 2)
2767 SecondArgShadow = getShadow(&I, 1);
2768
2769 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2770 SecondArgShadow);
2771
2772 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2773
2774 setShadow(&I, OrShadow);
2775 setOriginForNaryOp(I);
2776 }
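// Example (sketch): for <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16 with
// argument shadow <s0, s1, s2, s3>, horizontalReduce yields the <2 x i16>
// value <s0|s1, s2|s3>, which CreateShadowCast then widens to the <2 x i32>
// shadow of the result.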
2777
2778 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2779 /// fields, with the parameters reinterpreted to have elements of a specified
2780 /// width. For example:
2781 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2782 /// conceptually operates on
2783 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2784 /// and can be handled with ReinterpretElemWidth == 16.
2785 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
2786 int ReinterpretElemWidth) {
2787 assert(I.arg_size() == 1 || I.arg_size() == 2);
2788
2789 assert(I.getType()->isVectorTy());
2790 assert(I.getArgOperand(0)->getType()->isVectorTy());
2791
2792 FixedVectorType *ParamType =
2793 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2794 assert((I.arg_size() != 2) ||
2795 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2796
2797 [[maybe_unused]] FixedVectorType *ReturnType =
2798 cast<FixedVectorType>(I.getType());
2799 assert(ParamType->getNumElements() * I.arg_size() ==
2800 2 * ReturnType->getNumElements());
2801
2802 IRBuilder<> IRB(&I);
2803
2804 FixedVectorType *ReinterpretShadowTy = nullptr;
2805 assert(isAligned(Align(ReinterpretElemWidth),
2806 ParamType->getPrimitiveSizeInBits()));
2807 ReinterpretShadowTy = FixedVectorType::get(
2808 IRB.getIntNTy(ReinterpretElemWidth),
2809 ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2810
2811 // Horizontal OR of shadow
2812 Value *FirstArgShadow = getShadow(&I, 0);
2813 FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
2814
2815 // If we had two parameters each with an odd number of elements, the total
2816 // number of elements is even, but we have never seen this in extant
2817 // instruction sets, so we enforce that each parameter must have an even
2818 // number of elements.
2819 assert(isAligned(
2820 Align(2),
2821 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2822
2823 Value *SecondArgShadow = nullptr;
2824 if (I.arg_size() == 2) {
2825 SecondArgShadow = getShadow(&I, 1);
2826 SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
2827 }
2828
2829 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2830 SecondArgShadow);
2831
2832 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2833
2834 setShadow(&I, OrShadow);
2835 setOriginForNaryOp(I);
2836 }
2837
2838 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2839
2840 // Handle multiplication by constant.
2841 //
2842 // Handle a special case of multiplication by constant that may have one or
2843 // more zeros in the lower bits. This makes the same number of lower bits of
2844 // the result zero as well. We model it by shifting the other operand's
2845 // shadow left by the required number of bits. Effectively, we transform
2846 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2847 // We use multiplication by 2**N instead of shift to cover the case of
2848 // multiplication by 0, which may occur in some elements of a vector operand.
2849 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2850 Value *OtherArg) {
2851 Constant *ShadowMul;
2852 Type *Ty = ConstArg->getType();
2853 if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2854 unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2855 Type *EltTy = VTy->getElementType();
2856 SmallVector<Constant *, 16> Elements;
2857 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2858 if (ConstantInt *Elt =
2859 dyn_cast<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
2860 const APInt &V = Elt->getValue();
2861 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2862 Elements.push_back(ConstantInt::get(EltTy, V2));
2863 } else {
2864 Elements.push_back(ConstantInt::get(EltTy, 1));
2865 }
2866 }
2867 ShadowMul = ConstantVector::get(Elements);
2868 } else {
2869 if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2870 const APInt &V = Elt->getValue();
2871 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2872 ShadowMul = ConstantInt::get(Ty, V2);
2873 } else {
2874 ShadowMul = ConstantInt::get(Ty, 1);
2875 }
2876 }
2877
2878 IRBuilder<> IRB(&I);
2879 setShadow(&I,
2880 IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2881 setOrigin(&I, getOrigin(OtherArg));
2882 }
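// Worked example: for X * 12 (12 == 3 * 2^2, so countr_zero == 2) ShadowMul
// is 4 and the result shadow is Sx * 4; the two low bits of the shadow are
// always 0, matching the two low bits of the product that are known to be 0.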
2883
2884 void visitMul(BinaryOperator &I) {
2885 Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2886 Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2887 if (constOp0 && !constOp1)
2888 handleMulByConstant(I, constOp0, I.getOperand(1));
2889 else if (constOp1 && !constOp0)
2890 handleMulByConstant(I, constOp1, I.getOperand(0));
2891 else
2892 handleShadowOr(I);
2893 }
2894
2895 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2896 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2897 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2898 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2899 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2900 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2901
2902 void handleIntegerDiv(Instruction &I) {
2903 IRBuilder<> IRB(&I);
2904 // Strict on the second argument.
2905 insertCheckShadowOf(I.getOperand(1), &I);
2906 setShadow(&I, getShadow(&I, 0));
2907 setOrigin(&I, getOrigin(&I, 0));
2908 }
2909
2910 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2911 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2912 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2913 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2914
2915 // Floating point division is side-effect free. We cannot require that the
2916 // divisor is fully initialized, so we must propagate shadow instead. See PR37523.
2917 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2918 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2919
2920 /// Instrument == and != comparisons.
2921 ///
2922 /// Sometimes the comparison result is known even if some of the bits of the
2923 /// arguments are not.
2924 void handleEqualityComparison(ICmpInst &I) {
2925 IRBuilder<> IRB(&I);
2926 Value *A = I.getOperand(0);
2927 Value *B = I.getOperand(1);
2928 Value *Sa = getShadow(A);
2929 Value *Sb = getShadow(B);
2930
2931 // Get rid of pointers and vectors of pointers.
2932 // For ints (and vectors of ints), types of A and Sa match,
2933 // and this is a no-op.
2934 A = IRB.CreatePointerCast(A, Sa->getType());
2935 B = IRB.CreatePointerCast(B, Sb->getType());
2936
2937 // A == B <==> (C = A^B) == 0
2938 // A != B <==> (C = A^B) != 0
2939 // Sc = Sa | Sb
2940 Value *C = IRB.CreateXor(A, B);
2941 Value *Sc = IRB.CreateOr(Sa, Sb);
2942 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2943 // Result is defined if one of the following is true
2944 // * there is a defined 1 bit in C
2945 // * C is fully defined
2946 // Si = !(C & ~Sc) && Sc
2947 Value *Zero = Constant::getNullValue(Sc->getType());
2948 Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2949 Value *LHS = IRB.CreateICmpNE(Sc, Zero);
2950 Value *RHS =
2951 IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
2952 Value *Si = IRB.CreateAnd(LHS, RHS);
2953 Si->setName("_msprop_icmp");
2954 setShadow(&I, Si);
2955 setOriginForNaryOp(I);
2956 }
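// Worked example: let A be the 4-bit value 0b010? (Sa == 0b0001, only the
// low bit unknown) and B == 0 fully defined. Then C == A has a defined 1 in
// bit 2, so C & ~Sc is nonzero, RHS is false and Si == 0: A == B is known
// to be false despite the uninitialized bit.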
2957
2958 /// Instrument relational comparisons.
2959 ///
2960 /// This function does exact shadow propagation for all relational
2961 /// comparisons of integers, pointers and vectors of those.
2962 /// FIXME: output seems suboptimal when one of the operands is a constant
2963 void handleRelationalComparisonExact(ICmpInst &I) {
2964 IRBuilder<> IRB(&I);
2965 Value *A = I.getOperand(0);
2966 Value *B = I.getOperand(1);
2967 Value *Sa = getShadow(A);
2968 Value *Sb = getShadow(B);
2969
2970 // Get rid of pointers and vectors of pointers.
2971 // For ints (and vectors of ints), types of A and Sa match,
2972 // and this is a no-op.
2973 A = IRB.CreatePointerCast(A, Sa->getType());
2974 B = IRB.CreatePointerCast(B, Sb->getType());
2975
2976 // Let [a0, a1] be the interval of possible values of A, taking into account
2977 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
2978 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
2979 bool IsSigned = I.isSigned();
2980
2981 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
2982 if (IsSigned) {
2983 // Sign-flip to map from signed range to unsigned range. Relation A vs B
2984 // should be preserved, if checked with `getUnsignedPredicate()`.
2985 // The relationships between Amin, Amax, Bmin and Bmax are also not
2986 // affected, as they are created by effectively adding to / subtracting from
2987 // A (or B) a value derived from the shadow, with no overflow, either
2988 // before or after the sign flip.
2989 APInt MinVal =
2990 APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
2991 V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
2992 }
2993 // Minimize undefined bits.
2994 Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
2995 Value *Max = IRB.CreateOr(V, S);
2996 return std::make_pair(Min, Max);
2997 };
2998
2999 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
3000 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
3001 Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
3002 Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
3003
3004 Value *Si = IRB.CreateXor(S1, S2);
3005 setShadow(&I, Si);
3006 setOriginForNaryOp(I);
3007 }
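// Worked example (unsigned A < B): if A is 0b10?? (Sa == 0b0011) then
// [Amin, Amax] == [8, 11]; with a fully defined B == 12, both Amin < Bmax
// and Amax < Bmin hold, so S1 == S2 and the shadow of the comparison is 0:
// the result is known regardless of the two undefined bits.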
3008
3009 /// Instrument signed relational comparisons.
3010 ///
3011 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
3012 /// bit of the shadow. Everything else is delegated to handleShadowOr().
3013 void handleSignedRelationalComparison(ICmpInst &I) {
3014 Constant *constOp;
3015 Value *op = nullptr;
3016 CmpInst::Predicate pre;
3017 if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
3018 op = I.getOperand(0);
3019 pre = I.getPredicate();
3020 } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
3021 op = I.getOperand(1);
3022 pre = I.getSwappedPredicate();
3023 } else {
3024 handleShadowOr(I);
3025 return;
3026 }
3027
3028 if ((constOp->isNullValue() &&
3029 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3030 (constOp->isAllOnesValue() &&
3031 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3032 IRBuilder<> IRB(&I);
3033 Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
3034 "_msprop_icmp_s");
3035 setShadow(&I, Shadow);
3036 setOrigin(&I, getOrigin(op));
3037 } else {
3038 handleShadowOr(I);
3039 }
3040 }
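// For example, (x < 0) depends only on the sign bit of x, so the shadow of
// the comparison is just the sign bit of Sx, computed above as the i1 (or
// vector of i1) value (Sx s< 0).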
3041
3042 void visitICmpInst(ICmpInst &I) {
3043 if (!ClHandleICmp) {
3044 handleShadowOr(I);
3045 return;
3046 }
3047 if (I.isEquality()) {
3048 handleEqualityComparison(I);
3049 return;
3050 }
3051
3052 assert(I.isRelational());
3053 if (ClHandleICmpExact) {
3054 handleRelationalComparisonExact(I);
3055 return;
3056 }
3057 if (I.isSigned()) {
3058 handleSignedRelationalComparison(I);
3059 return;
3060 }
3061
3062 assert(I.isUnsigned());
3063 if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
3064 handleRelationalComparisonExact(I);
3065 return;
3066 }
3067
3068 handleShadowOr(I);
3069 }
3070
3071 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3072
3073 void handleShift(BinaryOperator &I) {
3074 IRBuilder<> IRB(&I);
3075 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3076 // Otherwise perform the same shift on S1.
3077 Value *S1 = getShadow(&I, 0);
3078 Value *S2 = getShadow(&I, 1);
3079 Value *S2Conv =
3080 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3081 Value *V2 = I.getOperand(1);
3082 Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
3083 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3084 setOriginForNaryOp(I);
3085 }
3086
3087 void visitShl(BinaryOperator &I) { handleShift(I); }
3088 void visitAShr(BinaryOperator &I) { handleShift(I); }
3089 void visitLShr(BinaryOperator &I) { handleShift(I); }
3090
3091 void handleFunnelShift(IntrinsicInst &I) {
3092 IRBuilder<> IRB(&I);
3093 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3094 // Otherwise perform the same shift on S0 and S1.
3095 Value *S0 = getShadow(&I, 0);
3096 Value *S1 = getShadow(&I, 1);
3097 Value *S2 = getShadow(&I, 2);
3098 Value *S2Conv =
3099 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3100 Value *V2 = I.getOperand(2);
3101 Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
3102 {S0, S1, V2});
3103 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3104 setOriginForNaryOp(I);
3105 }
3106
3107 /// Instrument llvm.memmove
3108 ///
3109 /// At this point we don't know if llvm.memmove will be inlined or not.
3110 /// If we don't instrument it and it gets inlined,
3111 /// our interceptor will not kick in and we will lose the memmove.
3112 /// If we instrument the call here, but it does not get inlined,
3113 /// we will memmove the shadow twice, which is bad in the case
3114 /// of overlapping regions. So, we simply lower the intrinsic to a call.
3115 ///
3116 /// Similar situation exists for memcpy and memset.
3117 void visitMemMoveInst(MemMoveInst &I) {
3118 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3119 IRBuilder<> IRB(&I);
3120 IRB.CreateCall(MS.MemmoveFn,
3121 {I.getArgOperand(0), I.getArgOperand(1),
3122 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3123 I.eraseFromParent();
3124 }
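// Note: no explicit shadow copy is emitted here; MS.MemmoveFn
// (__msan_memmove) is expected to move the shadow and origin of the affected
// bytes along with the application memory in the runtime library.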
3125
3126 /// Instrument memcpy
3127 ///
3128 /// Similar to memmove: avoid copying shadow twice. This is somewhat
3129 /// unfortunate as it may slow down small constant memcpys.
3130 /// FIXME: consider doing manual inline for small constant sizes and proper
3131 /// alignment.
3132 ///
3133 /// Note: This also handles memcpy.inline, which promises no calls to external
3134 /// functions as an optimization. However, with instrumentation enabled this
3135 /// is difficult to promise; additionally, we know that the MSan runtime
3136 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3137 /// instrumentation it's safe to turn memcpy.inline into a call to
3138 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3139 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3140 void visitMemCpyInst(MemCpyInst &I) {
3141 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3142 IRBuilder<> IRB(&I);
3143 IRB.CreateCall(MS.MemcpyFn,
3144 {I.getArgOperand(0), I.getArgOperand(1),
3145 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3146 I.eraseFromParent();
3147 }
3148
3149 // Same as memcpy.
3150 void visitMemSetInst(MemSetInst &I) {
3151 IRBuilder<> IRB(&I);
3152 IRB.CreateCall(
3153 MS.MemsetFn,
3154 {I.getArgOperand(0),
3155 IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
3156 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3157 I.eraseFromParent();
3158 }
3159
3160 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3161
3162 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3163
3164 /// Handle vector store-like intrinsics.
3165 ///
3166 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3167 /// has 1 pointer argument and 1 vector argument, returns void.
3168 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3169 assert(I.arg_size() == 2);
3170
3171 IRBuilder<> IRB(&I);
3172 Value *Addr = I.getArgOperand(0);
3173 Value *Shadow = getShadow(&I, 1);
3174 Value *ShadowPtr, *OriginPtr;
3175
3176 // We don't know the pointer alignment (could be unaligned SSE store!).
3177 // We have to assume the worst case.
3178 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3179 Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
3180 IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
3181
3182 if (ClCheckAccessAddress)
3183 insertCheckShadowOf(Addr, &I);
3184
3185 // FIXME: factor out common code from materializeStores
3186 if (MS.TrackOrigins)
3187 IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
3188 return true;
3189 }
3190
3191 /// Handle vector load-like intrinsics.
3192 ///
3193 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3194 /// has 1 pointer argument, returns a vector.
3195 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3196 assert(I.arg_size() == 1);
3197
3198 IRBuilder<> IRB(&I);
3199 Value *Addr = I.getArgOperand(0);
3200
3201 Type *ShadowTy = getShadowTy(&I);
3202 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3203 if (PropagateShadow) {
3204 // We don't know the pointer alignment (could be unaligned SSE load!).
3205 // We have to assume the worst case.
3206 const Align Alignment = Align(1);
3207 std::tie(ShadowPtr, OriginPtr) =
3208 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3209 setShadow(&I,
3210 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
3211 } else {
3212 setShadow(&I, getCleanShadow(&I));
3213 }
3214
3215 if (ClCheckAccessAddress)
3216 insertCheckShadowOf(Addr, &I);
3217
3218 if (MS.TrackOrigins) {
3219 if (PropagateShadow)
3220 setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
3221 else
3222 setOrigin(&I, getCleanOrigin());
3223 }
3224 return true;
3225 }
3226
3227 /// Handle (SIMD arithmetic)-like intrinsics.
3228 ///
3229 /// Instrument intrinsics with any number of arguments of the same type [*],
3230 /// equal to the return type, plus a specified number of trailing flags of
3231 /// any type.
3232 ///
3233 /// [*] The type should be simple (no aggregates or pointers; vectors are
3234 /// fine).
3235 ///
3236 /// Caller guarantees that this intrinsic does not access memory.
3237 ///
3238 /// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
3239 /// by this handler. See horizontalReduce().
3240 ///
3241 /// TODO: permutation intrinsics are also often incorrectly matched.
3242 [[maybe_unused]] bool
3243 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3244 unsigned int trailingFlags) {
3245 Type *RetTy = I.getType();
3246 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3247 return false;
3248
3249 unsigned NumArgOperands = I.arg_size();
3250 assert(NumArgOperands >= trailingFlags);
3251 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3252 Type *Ty = I.getArgOperand(i)->getType();
3253 if (Ty != RetTy)
3254 return false;
3255 }
3256
3257 IRBuilder<> IRB(&I);
3258 ShadowAndOriginCombiner SC(this, IRB);
3259 for (unsigned i = 0; i < NumArgOperands; ++i)
3260 SC.Add(I.getArgOperand(i));
3261 SC.Done(&I);
3262
3263 return true;
3264 }
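// For example, a hypothetical no-memory-access intrinsic of the form
//   <8 x i16> @llvm.x86.foo(<8 x i16> %a, <8 x i16> %b)
// matches this handler (all arguments have the return type), so its shadow
// is approximated as shadow(%a) | shadow(%b) by the combiner above.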
3265
3266 /// Returns whether it was able to heuristically instrument unknown
3267 /// intrinsics.
3268 ///
3269 /// The main purpose of this code is to do something reasonable with all
3270 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3271 /// We recognize several classes of intrinsics by their argument types and
3272 /// ModRefBehaviour and apply special instrumentation when we are reasonably
3273 /// sure that we know what the intrinsic does.
3274 ///
3275 /// We special-case intrinsics where this approach fails. See llvm.bswap
3276 /// handling as an example of that.
3277 bool maybeHandleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3278 unsigned NumArgOperands = I.arg_size();
3279 if (NumArgOperands == 0)
3280 return false;
3281
3282 if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
3283 I.getArgOperand(1)->getType()->isVectorTy() &&
3284 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3285 // This looks like a vector store.
3286 return handleVectorStoreIntrinsic(I);
3287 }
3288
3289 if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
3290 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3291 // This looks like a vector load.
3292 return handleVectorLoadIntrinsic(I);
3293 }
3294
3295 if (I.doesNotAccessMemory())
3296 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3297 return true;
3298
3299 // FIXME: detect and handle SSE maskstore/maskload?
3300 // Some cases are now handled in handleAVXMasked{Load,Store}.
3301 return false;
3302 }
3303
3304 bool maybeHandleUnknownIntrinsic(IntrinsicInst &I) {
3305 if (maybeHandleUnknownIntrinsicUnlogged(I)) {
3306 if (ClDumpStrictIntrinsics)
3307 dumpInst(I);
3308
3309 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3310 << "\n");
3311 return true;
3312 } else
3313 return false;
3314 }
3315
3316 void handleInvariantGroup(IntrinsicInst &I) {
3317 setShadow(&I, getShadow(&I, 0));
3318 setOrigin(&I, getOrigin(&I, 0));
3319 }
3320
3321 void handleLifetimeStart(IntrinsicInst &I) {
3322 if (!PoisonStack)
3323 return;
3324 AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
3325 if (AI)
3326 LifetimeStartList.push_back(std::make_pair(&I, AI));
3327 }
3328
3329 void handleBswap(IntrinsicInst &I) {
3330 IRBuilder<> IRB(&I);
3331 Value *Op = I.getArgOperand(0);
3332 Type *OpType = Op->getType();
3333 setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
3334 getShadow(Op)));
3335 setOrigin(&I, getOrigin(Op));
3336 }
3337
3338 // Uninitialized bits are ok if they appear after the leading/trailing 0's
3339 // and a 1. If the input is all zero, it is fully initialized iff
3340 // !is_zero_poison.
3341 //
3342 // e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
3343 // concrete value 0/1, and ? is an uninitialized bit:
3344 // - 0001 0??? is fully initialized
3345 // - 000? ???? is fully uninitialized (*)
3346 // - ???? ???? is fully uninitialized
3347 // - 0000 0000 is fully uninitialized if is_zero_poison,
3348 // fully initialized otherwise
3349 //
3350 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3351 // only need to poison 4 bits.
3352 //
3353 // OutputShadow =
3354 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3355 // || (is_zero_poison && AllZeroSrc)
3356 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3357 IRBuilder<> IRB(&I);
3358 Value *Src = I.getArgOperand(0);
3359 Value *SrcShadow = getShadow(Src);
3360
3361 Value *False = IRB.getInt1(false);
3362 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3363 I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
3364 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3365 I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
3366
3367 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3368 ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
3369
3370 Value *NotAllZeroShadow =
3371 IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
3372 Value *OutputShadow =
3373 IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
3374
3375 // If zero poison is requested, mix in with the shadow
3376 Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
3377 if (!IsZeroPoison->isZeroValue()) {
3378 Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
3379 OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
3380 }
3381
3382 OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
3383
3384 setShadow(&I, OutputShadow);
3385 setOriginForNaryOp(I);
3386 }
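// Walking through the formula above for ctlz on an i8: if Src = 0b00010101
// with shadow 0b00000111 (the "0001 0???" case), then ctlz(Src) = 3 and
// ctlz(Shadow) = 5, so CompareConcreteZeros = (3 >= 5) is false and the
// output is clean. If Src = 0b00000101 with shadow 0b00011111 (the
// "000? ????" case), then ctlz(Src) = 5 >= ctlz(Shadow) = 3 and the shadow
// is non-zero, so the output is fully poisoned.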
3387
3388 /// Handle Arm NEON vector convert intrinsics.
3389 ///
3390 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3391 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3392 ///
3393 /// For x86 SSE vector convert intrinsics, see
3394 /// handleSSEVectorConvertIntrinsic().
3395 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3396 assert(I.arg_size() == 1);
3397
3398 IRBuilder<> IRB(&I);
3399 Value *S0 = getShadow(&I, 0);
3400
3401 /// For scalars:
3402 /// Since they are converting from floating-point to integer, the output is
3403 /// - fully uninitialized if *any* bit of the input is uninitialized
3404 /// - fully initialized if all bits of the input are initialized
3405 /// We apply the same principle on a per-field basis for vectors.
3406 Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
3407 getShadowTy(&I));
3408 setShadow(&I, OutShadow);
3409 setOriginForNaryOp(I);
3410 }
3411
3412 /// Some instructions have additional zero-elements in the return type
3413 /// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
3414 ///
3415 /// This function will return a vector type with the same number of elements
3416 /// as the input, but the same per-element width as the return value, e.g.,
3417 /// <8 x i8>.
3418 FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
3419 assert(isa<FixedVectorType>(getShadowTy(&I)));
3420 FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
3421
3422 // TODO: generalize beyond 2x?
3423 if (ShadowType->getElementCount() ==
3424 cast<VectorType>(Src->getType())->getElementCount() * 2)
3425 ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
3426
3427 assert(ShadowType->getElementCount() ==
3428 cast<VectorType>(Src->getType())->getElementCount());
3429
3430 return ShadowType;
3431 }
3432
3433 /// Doubles the length of a vector shadow (extending with zeros) if necessary
3434 /// to match the length of the shadow for the instruction.
3435 /// If scalar types of the vectors are different, it will use the type of the
3436 /// input vector.
3437 /// This is more type-safe than CreateShadowCast().
3438 Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
3439 IRBuilder<> IRB(&I);
3440 assert(isa<FixedVectorType>(Shadow->getType()));
3441 assert(isa<FixedVectorType>(I.getType()));
3442
3443 Value *FullShadow = getCleanShadow(&I);
3444 unsigned ShadowNumElems =
3445 cast<FixedVectorType>(Shadow->getType())->getNumElements();
3446 unsigned FullShadowNumElems =
3447 cast<FixedVectorType>(FullShadow->getType())->getNumElements();
3448
3449 assert((ShadowNumElems == FullShadowNumElems) ||
3450 (ShadowNumElems * 2 == FullShadowNumElems));
3451
3452 if (ShadowNumElems == FullShadowNumElems) {
3453 FullShadow = Shadow;
3454 } else {
3455 // TODO: generalize beyond 2x?
3456 SmallVector<int, 32> ShadowMask(FullShadowNumElems);
3457 std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
3458
3459 // Append zeros
3460 FullShadow =
3461 IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
3462 }
3463
3464 return FullShadow;
3465 }
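// For example, if Shadow is <4 x i32> and the instruction returns <8 x i32>,
// the shuffle mask is [0, 1, 2, 3, 4, 5, 6, 7]; indices 4..7 select from the
// second operand (a clean shadow), so the appended elements are zero, i.e.
// initialized.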
3466
3467 /// Handle x86 SSE vector conversion.
3468 ///
3469 /// e.g., single-precision to half-precision conversion:
3470 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3471 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3472 ///
3473 /// floating-point to integer:
3474 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3475 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3476 ///
3477 /// Note: if the output has more elements, they are zero-initialized (and
3478 /// therefore the shadow will also be initialized).
3479 ///
3480 /// This differs from handleSSEVectorConvertIntrinsic() because it
3481 /// propagates uninitialized shadow (instead of checking the shadow).
3482 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3483 bool HasRoundingMode) {
3484 if (HasRoundingMode) {
3485 assert(I.arg_size() == 2);
3486 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
3487 assert(RoundingMode->getType()->isIntegerTy());
3488 } else {
3489 assert(I.arg_size() == 1);
3490 }
3491
3492 Value *Src = I.getArgOperand(0);
3493 assert(Src->getType()->isVectorTy());
3494
3495 // The return type might have more elements than the input.
3496 // Temporarily shrink the return type's number of elements.
3497 VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
3498
3499 IRBuilder<> IRB(&I);
3500 Value *S0 = getShadow(&I, 0);
3501
3502 /// For scalars:
3503 /// Since they are converting to and/or from floating-point, the output is:
3504 /// - fully uninitialized if *any* bit of the input is uninitialized
3505 /// - fully initialized if all bits of the input are initialized
3506 /// We apply the same principle on a per-field basis for vectors.
3507 Value *Shadow =
3508 IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
3509
3510 // The return type might have more elements than the input.
3511 // Extend the return type back to its original width if necessary.
3512 Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
3513
3514 setShadow(&I, FullShadow);
3515 setOriginForNaryOp(I);
3516 }
3517
3518 // Instrument x86 SSE vector convert intrinsic.
3519 //
3520 // This function instruments intrinsics like cvtsi2ss:
3521 // %Out = int_xxx_cvtyyy(%ConvertOp)
3522 // or
3523 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
3524 // Intrinsic converts \p NumUsedElements elements of \p ConvertOp to the same
3525 // number of \p Out elements, and (if it has 2 arguments) copies the rest of
3526 // the elements from \p CopyOp.
3527 // In most cases the conversion involves a floating-point value which may trigger a
3528 // hardware exception when not fully initialized. For this reason we require
3529 // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
3530 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3531 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3532 // return a fully initialized value.
3533 //
3534 // For Arm NEON vector convert intrinsics, see
3535 // handleNEONVectorConvertIntrinsic().
3536 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3537 bool HasRoundingMode = false) {
3538 IRBuilder<> IRB(&I);
3539 Value *CopyOp, *ConvertOp;
3540
3541 assert((!HasRoundingMode ||
3542 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3543 "Invalid rounding mode");
3544
3545 switch (I.arg_size() - HasRoundingMode) {
3546 case 2:
3547 CopyOp = I.getArgOperand(0);
3548 ConvertOp = I.getArgOperand(1);
3549 break;
3550 case 1:
3551 ConvertOp = I.getArgOperand(0);
3552 CopyOp = nullptr;
3553 break;
3554 default:
3555 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3556 }
3557
3558 // The first *NumUsedElements* elements of ConvertOp are converted to the
3559 // same number of output elements. The rest of the output is copied from
3560 // CopyOp, or (if not available) filled with zeroes.
3561 // Combine shadow for elements of ConvertOp that are used in this operation,
3562 // and insert a check.
3563 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3564 // int->any conversion.
3565 Value *ConvertShadow = getShadow(ConvertOp);
3566 Value *AggShadow = nullptr;
3567 if (ConvertOp->getType()->isVectorTy()) {
3568 AggShadow = IRB.CreateExtractElement(
3569 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3570 for (int i = 1; i < NumUsedElements; ++i) {
3571 Value *MoreShadow = IRB.CreateExtractElement(
3572 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3573 AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3574 }
3575 } else {
3576 AggShadow = ConvertShadow;
3577 }
3578 assert(AggShadow->getType()->isIntegerTy());
3579 insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
3580
3581 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3582 // ConvertOp.
3583 if (CopyOp) {
3584 assert(CopyOp->getType() == I.getType());
3585 assert(CopyOp->getType()->isVectorTy());
3586 Value *ResultShadow = getShadow(CopyOp);
3587 Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3588 for (int i = 0; i < NumUsedElements; ++i) {
3589 ResultShadow = IRB.CreateInsertElement(
3590 ResultShadow, ConstantInt::getNullValue(EltTy),
3591 ConstantInt::get(IRB.getInt32Ty(), i));
3592 }
3593 setShadow(&I, ResultShadow);
3594 setOrigin(&I, getOrigin(CopyOp));
3595 } else {
3596 setShadow(&I, getCleanShadow(&I));
3597 setOrigin(&I, getCleanOrigin());
3598 }
3599 }
3600
3601 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3602 // zeroes if it is zero, and all ones otherwise.
3603 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3604 if (S->getType()->isVectorTy())
3605 S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3606 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3607 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3608 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3609 }
3610
3611 // Given a vector, extract its first element, and return all
3612 // zeroes if it is zero, and all ones otherwise.
3613 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3614 Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3615 Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3616 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3617 }
3618
3619 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3620 Type *T = S->getType();
3621 assert(T->isVectorTy());
3622 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3623 return IRB.CreateSExt(S2, T);
3624 }
3625
3626 // Instrument vector shift intrinsic.
3627 //
3628 // This function instruments intrinsics like int_x86_avx2_psll_w.
3629 // Intrinsic shifts %In by %ShiftSize bits.
3630 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3631 // size, and the rest is ignored. Behavior is defined even if shift size is
3632 // greater than register (or field) width.
3633 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3634 assert(I.arg_size() == 2);
3635 IRBuilder<> IRB(&I);
3636 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3637 // Otherwise perform the same shift on S1.
3638 Value *S1 = getShadow(&I, 0);
3639 Value *S2 = getShadow(&I, 1);
3640 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3641 : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3642 Value *V1 = I.getOperand(0);
3643 Value *V2 = I.getOperand(1);
3644 Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3645 {IRB.CreateBitCast(S1, V1->getType()), V2});
3646 Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3647 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3648 setOriginForNaryOp(I);
3649 }
3650
3651 // Get an MMX-sized (64-bit) vector type, or optionally, other sized
3652 // vectors.
3653 Type *getMMXVectorTy(unsigned EltSizeInBits,
3654 unsigned X86_MMXSizeInBits = 64) {
3655 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3656 "Illegal MMX vector element size");
3657 return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3658 X86_MMXSizeInBits / EltSizeInBits);
3659 }
3660
3661 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3662 // intrinsic.
3663 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3664 switch (id) {
3665 case Intrinsic::x86_sse2_packsswb_128:
3666 case Intrinsic::x86_sse2_packuswb_128:
3667 return Intrinsic::x86_sse2_packsswb_128;
3668
3669 case Intrinsic::x86_sse2_packssdw_128:
3670 case Intrinsic::x86_sse41_packusdw:
3671 return Intrinsic::x86_sse2_packssdw_128;
3672
3673 case Intrinsic::x86_avx2_packsswb:
3674 case Intrinsic::x86_avx2_packuswb:
3675 return Intrinsic::x86_avx2_packsswb;
3676
3677 case Intrinsic::x86_avx2_packssdw:
3678 case Intrinsic::x86_avx2_packusdw:
3679 return Intrinsic::x86_avx2_packssdw;
3680
3681 case Intrinsic::x86_mmx_packsswb:
3682 case Intrinsic::x86_mmx_packuswb:
3683 return Intrinsic::x86_mmx_packsswb;
3684
3685 case Intrinsic::x86_mmx_packssdw:
3686 return Intrinsic::x86_mmx_packssdw;
3687
3688 case Intrinsic::x86_avx512_packssdw_512:
3689 case Intrinsic::x86_avx512_packusdw_512:
3690 return Intrinsic::x86_avx512_packssdw_512;
3691
3692 case Intrinsic::x86_avx512_packsswb_512:
3693 case Intrinsic::x86_avx512_packuswb_512:
3694 return Intrinsic::x86_avx512_packsswb_512;
3695
3696 default:
3697 llvm_unreachable("unexpected intrinsic id");
3698 }
3699 }
3700
3701 // Instrument vector pack intrinsic.
3702 //
3703 // This function instruments intrinsics like x86_mmx_packsswb, that
3704 // packs elements of 2 input vectors into half as many bits with saturation.
3705 // Shadow is propagated with the signed variant of the same intrinsic applied
3706 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3707 // MMXEltSizeInBits is used only for x86mmx arguments.
3708 //
3709 // TODO: consider using GetMinMaxUnsigned() to handle saturation precisely
3710 void handleVectorPackIntrinsic(IntrinsicInst &I,
3711 unsigned MMXEltSizeInBits = 0) {
3712 assert(I.arg_size() == 2);
3713 IRBuilder<> IRB(&I);
3714 Value *S1 = getShadow(&I, 0);
3715 Value *S2 = getShadow(&I, 1);
3716 assert(S1->getType()->isVectorTy());
3717
3718 // SExt and ICmpNE below must apply to individual elements of input vectors.
3719 // In case of x86mmx arguments, cast them to appropriate vector types and
3720 // back.
3721 Type *T =
3722 MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
3723 if (MMXEltSizeInBits) {
3724 S1 = IRB.CreateBitCast(S1, T);
3725 S2 = IRB.CreateBitCast(S2, T);
3726 }
3727 Value *S1_ext =
3728 IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
3729 Value *S2_ext =
3730 IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
3731 if (MMXEltSizeInBits) {
3732 S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
3733 S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
3734 }
3735
3736 Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
3737 {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3738 "_msprop_vector_pack");
3739 if (MMXEltSizeInBits)
3740 S = IRB.CreateBitCast(S, getShadowTy(&I));
3741 setShadow(&I, S);
3742 setOriginForNaryOp(I);
3743 }
3744
3745 // Convert `Mask` into `<n x i1>`.
3746 Constant *createDppMask(unsigned Width, unsigned Mask) {
3747 SmallVector<Constant *, 8> R(Width);
3748 for (auto &M : R) {
3749 M = ConstantInt::getBool(F.getContext(), Mask & 1);
3750 Mask >>= 1;
3751 }
3752 return ConstantVector::get(R);
3753 }
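// For example, createDppMask(4, 0b0101) returns <i1 1, i1 0, i1 1, i1 0>:
// bit i of Mask becomes element i of the result.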
3754
3755 // Calculate output shadow as array of booleans `<n x i1>`, assuming if any
3756 // arg is poisoned, entire dot product is poisoned.
3757 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3758 unsigned DstMask) {
3759 const unsigned Width =
3760 cast<FixedVectorType>(S->getType())->getNumElements();
3761
3762 S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
3763 Constant::getNullValue(S->getType()));
3764 Value *SElem = IRB.CreateOrReduce(S);
3765 Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
3766 Value *DstMaskV = createDppMask(Width, DstMask);
3767
3768 return IRB.CreateSelect(
3769 IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
3770 }
3771
3772 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3773 //
3774 // The 2- and 4-element versions produce a single scalar dot product and then
3775 // put it into the elements of the output vector selected by the 4 lowest bits
3776 // of the mask. The top 4 bits of the mask control which elements of the input
3777 // to use for the dot product.
3778 //
3779 // The 8-element version's mask still has only 4 bits for input and 4 bits for
3780 // the output mask. According to the spec it simply operates as the 4-element
3781 // version, first on the first 4 elements of the inputs and output, and then on
3782 // the last 4 elements.
3783 void handleDppIntrinsic(IntrinsicInst &I) {
3784 IRBuilder<> IRB(&I);
3785
3786 Value *S0 = getShadow(&I, 0);
3787 Value *S1 = getShadow(&I, 1);
3788 Value *S = IRB.CreateOr(S0, S1);
3789
3790 const unsigned Width =
3791 cast<FixedVectorType>(S->getType())->getNumElements();
3792 assert(Width == 2 || Width == 4 || Width == 8);
3793
3794 const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3795 const unsigned SrcMask = Mask >> 4;
3796 const unsigned DstMask = Mask & 0xf;
3797
3798 // Calculate shadow as `<n x i1>`.
3799 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3800 if (Width == 8) {
3801 // First 4 elements of shadow are already calculated. `findDppPoisonedOutput`
3802 // operates on 32-bit masks, so we can just shift the masks and repeat.
3803 SI1 = IRB.CreateOr(
3804 SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
3805 }
3806 // Extend to real size of shadow, poisoning either all or none bits of an
3807 // element.
3808 S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
3809
3810 setShadow(&I, S);
3811 setOriginForNaryOp(I);
3812 }
3813
3814 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3815 C = CreateAppToShadowCast(IRB, C);
3816 FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
3817 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3818 C = IRB.CreateAShr(C, ElSize - 1);
3819 FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
3820 return IRB.CreateTrunc(C, FVT);
3821 }
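// For example, for a <4 x float> blendv condition this casts to <4 x i32>,
// arithmetic-shifts each element right by 31 so only the (replicated) sign
// bit remains, and truncates to <4 x i1> for use as a select condition.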
3822
3823 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3824 void handleBlendvIntrinsic(IntrinsicInst &I) {
3825 Value *C = I.getOperand(2);
3826 Value *T = I.getOperand(1);
3827 Value *F = I.getOperand(0);
3828
3829 Value *Sc = getShadow(&I, 2);
3830 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
3831
3832 {
3833 IRBuilder<> IRB(&I);
3834 // Extract top bit from condition and its shadow.
3835 C = convertBlendvToSelectMask(IRB, C);
3836 Sc = convertBlendvToSelectMask(IRB, Sc);
3837
3838 setShadow(C, Sc);
3839 setOrigin(C, Oc);
3840 }
3841
3842 handleSelectLikeInst(I, C, T, F);
3843 }
3844
3845 // Instrument sum-of-absolute-differences intrinsic.
3846 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3847 const unsigned SignificantBitsPerResultElement = 16;
3848 Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
3849 unsigned ZeroBitsPerResultElement =
3850 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3851
3852 IRBuilder<> IRB(&I);
3853 auto *Shadow0 = getShadow(&I, 0);
3854 auto *Shadow1 = getShadow(&I, 1);
3855 Value *S = IRB.CreateOr(Shadow0, Shadow1);
3856 S = IRB.CreateBitCast(S, ResTy);
3857 S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3858 ResTy);
3859 S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3860 S = IRB.CreateBitCast(S, getShadowTy(&I));
3861 setShadow(&I, S);
3862 setOriginForNaryOp(I);
3863 }
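// For example, for x86 psadbw each 64-bit result lane is a sum of eight
// absolute byte differences and therefore fits in its low 16 bits; the LShr
// above clears the shadow of the upper ZeroBitsPerResultElement bits, since
// those result bits are always zero and hence initialized.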
3864
3865 // Instrument multiply-add(-accumulate)? intrinsics.
3866 //
3867 // e.g., Two operands:
3868 // <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
3869 //
3870 // Two operands which require an EltSizeInBits override:
3871 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
3872 //
3873 // Three operands:
3874 // <4 x i32> @llvm.x86.avx512.vpdpbusd.128
3875 // (<4 x i32> %s, <16 x i8> %a, <16 x i8> %b)
3876 // (this is equivalent to multiply-add on %a and %b, followed by
3877 // adding/"accumulating" %s. "Accumulation" stores the result in one
3878 // of the source registers, but this accumulate vs. add distinction
3879 // is lost when dealing with LLVM intrinsics.)
3880 void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
3881 unsigned EltSizeInBits = 0) {
3882 IRBuilder<> IRB(&I);
3883
3884 [[maybe_unused]] FixedVectorType *ReturnType =
3885 cast<FixedVectorType>(I.getType());
3886 assert(isa<FixedVectorType>(ReturnType));
3887
3888 // Vectors A and B, and shadows
3889 Value *Va = nullptr;
3890 Value *Vb = nullptr;
3891 Value *Sa = nullptr;
3892 Value *Sb = nullptr;
3893
3894 assert(I.arg_size() == 2 || I.arg_size() == 3);
3895 if (I.arg_size() == 2) {
3896 Va = I.getOperand(0);
3897 Vb = I.getOperand(1);
3898
3899 Sa = getShadow(&I, 0);
3900 Sb = getShadow(&I, 1);
3901 } else if (I.arg_size() == 3) {
3902 // Operand 0 is the accumulator. We will deal with that below.
3903 Va = I.getOperand(1);
3904 Vb = I.getOperand(2);
3905
3906 Sa = getShadow(&I, 1);
3907 Sb = getShadow(&I, 2);
3908 }
3909
3910 FixedVectorType *ParamType = cast<FixedVectorType>(Va->getType());
3911 assert(ParamType == Vb->getType());
3912
3913 assert(ParamType->getPrimitiveSizeInBits() ==
3914 ReturnType->getPrimitiveSizeInBits());
3915
3916 if (I.arg_size() == 3) {
3917 [[maybe_unused]] auto *AccumulatorType =
3918 cast<FixedVectorType>(I.getOperand(0)->getType());
3919 assert(AccumulatorType == ReturnType);
3920 }
3921
3922 FixedVectorType *ImplicitReturnType = ReturnType;
3923 // Step 1: instrument multiplication of corresponding vector elements
3924 if (EltSizeInBits) {
3925 ImplicitReturnType = cast<FixedVectorType>(
3926 getMMXVectorTy(EltSizeInBits * ReductionFactor,
3927 ParamType->getPrimitiveSizeInBits()));
3928 ParamType = cast<FixedVectorType>(
3929 getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
3930
3931 Va = IRB.CreateBitCast(Va, ParamType);
3932 Vb = IRB.CreateBitCast(Vb, ParamType);
3933
3934 Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
3935 Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
3936 } else {
3937 assert(ParamType->getNumElements() ==
3938 ReturnType->getNumElements() * ReductionFactor);
3939 }
3940
3941 // Multiplying an *initialized* zero by an uninitialized element results in
3942 // an initialized zero element.
3943 //
3944 // This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
3945 // results in an unpoisoned value. We can therefore adapt the visitAnd()
3946 // instrumentation:
3947 // OutShadow = (SaNonZero & SbNonZero)
3948 // | (VaNonZero & SbNonZero)
3949 // | (SaNonZero & VbNonZero)
3950 // where non-zero is checked on a per-element basis (not per bit).
3951 Value *SZero = Constant::getNullValue(Va->getType());
3952 Value *VZero = Constant::getNullValue(Sa->getType());
3953 Value *SaNonZero = IRB.CreateICmpNE(Sa, SZero);
3954 Value *SbNonZero = IRB.CreateICmpNE(Sb, SZero);
3955 Value *VaNonZero = IRB.CreateICmpNE(Va, VZero);
3956 Value *VbNonZero = IRB.CreateICmpNE(Vb, VZero);
3957
3958 Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
3959 Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
3960 Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
3961
3962 // Each element of the vector is represented by a single bit (poisoned or
3963 // not) e.g., <8 x i1>.
3964 Value *And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
3965
3966 // Extend <8 x i1> to <8 x i16>.
3967 // (The real pmadd intrinsic would have computed intermediate values of
3968 // <8 x i32>, but that is irrelevant for our shadow purposes because we
3969 // consider each element to be either fully initialized or fully
3970 // uninitialized.)
3971 And = IRB.CreateSExt(And, Sa->getType());
3972
3973 // Step 2: instrument horizontal add
3974 // We don't need bit-precise horizontalReduce because we only want to check
3975 // if each pair/quad of elements is fully zero.
3976 // Cast to <4 x i32>.
3977 Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
3978
3979 // Compute <4 x i1>, then extend back to <4 x i32>.
3980 Value *OutShadow = IRB.CreateSExt(
3981 IRB.CreateICmpNE(Horizontal,
3982 Constant::getNullValue(Horizontal->getType())),
3983 ImplicitReturnType);
3984
3985 // Cast it back to the required fake return type (if MMX: <1 x i64>; for
3986 // AVX, it is already correct).
3987 if (EltSizeInBits)
3988 OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
3989
3990 // Step 3 (if applicable): instrument accumulator
3991 if (I.arg_size() == 3)
3992 OutShadow = IRB.CreateOr(OutShadow, getShadow(&I, 0));
3993
3994 setShadow(&I, OutShadow);
3995 setOriginForNaryOp(I);
3996 }
3997
3998 // Instrument compare-packed intrinsic.
3999 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
4000 // all-ones shadow.
4001 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
4002 IRBuilder<> IRB(&I);
4003 Type *ResTy = getShadowTy(&I);
4004 auto *Shadow0 = getShadow(&I, 0);
4005 auto *Shadow1 = getShadow(&I, 1);
4006 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4007 Value *S = IRB.CreateSExt(
4008 IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
4009 setShadow(&I, S);
4010 setOriginForNaryOp(I);
4011 }
4012
4013 // Instrument compare-scalar intrinsic.
4014 // This handles both cmp* intrinsics which return the result in the first
4015 // element of a vector, and comi* which return the result as i32.
4016 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
4017 IRBuilder<> IRB(&I);
4018 auto *Shadow0 = getShadow(&I, 0);
4019 auto *Shadow1 = getShadow(&I, 1);
4020 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4021 Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
4022 setShadow(&I, S);
4023 setOriginForNaryOp(I);
4024 }
4025
4026 // Instrument generic vector reduction intrinsics
4027 // by ORing together all their fields.
4028 //
4029 // If AllowShadowCast is true, the return type does not need to be the same
4030 // type as the fields
4031 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
4032 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
4033 assert(I.arg_size() == 1);
4034
4035 IRBuilder<> IRB(&I);
4036 Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
4037 if (AllowShadowCast)
4038 S = CreateShadowCast(IRB, S, getShadowTy(&I));
4039 else
4040 assert(S->getType() == getShadowTy(&I));
4041 setShadow(&I, S);
4042 setOriginForNaryOp(I);
4043 }
4044
4045 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
4046 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
4047 // %a1)
4048 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
4049 //
4050 // The type of the return value, initial starting value, and elements of the
4051 // vector must be identical.
4052 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
4053 assert(I.arg_size() == 2);
4054
4055 IRBuilder<> IRB(&I);
4056 Value *Shadow0 = getShadow(&I, 0);
4057 Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
4058 assert(Shadow0->getType() == Shadow1->getType());
4059 Value *S = IRB.CreateOr(Shadow0, Shadow1);
4060 assert(S->getType() == getShadowTy(&I));
4061 setShadow(&I, S);
4062 setOriginForNaryOp(I);
4063 }
4064
4065 // Instrument vector.reduce.or intrinsic.
4066 // Valid (non-poisoned) set bits in the operand pull low the
4067 // corresponding shadow bits.
4068 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
4069 assert(I.arg_size() == 1);
4070
4071 IRBuilder<> IRB(&I);
4072 Value *OperandShadow = getShadow(&I, 0);
4073 Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
4074 Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
4075 // Bit N is clean if any field's bit N is 1 and unpoisoned
4076 Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
4077 // Otherwise, it is clean if every field's bit N is unpoisoned
4078 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4079 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4080
4081 setShadow(&I, S);
4082 setOrigin(&I, getOrigin(&I, 0));
4083 }
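// For example, reducing <2 x i8> {0xFF (clean), fully poisoned}:
// OperandUnsetOrPoison is {~0xFF | 0x00, ~v | 0xFF} = {0x00, 0xFF}, so the
// and-reduction OutShadowMask is 0x00 and the final shadow is 0: the known
// 0xFF element already forces every bit of the reduction result to 1.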
4084
4085 // Instrument vector.reduce.and intrinsic.
4086 // Valid (non-poisoned) unset bits in the operand pull down the
4087 // corresponding shadow bits.
4088 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
4089 assert(I.arg_size() == 1);
4090
4091 IRBuilder<> IRB(&I);
4092 Value *OperandShadow = getShadow(&I, 0);
4093 Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
4094 // Bit N is clean if any field's bit N is 0 and unpoisoned
4095 Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
4096 // Otherwise, it is clean if every field's bit N is unpoisoned
4097 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4098 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4099
4100 setShadow(&I, S);
4101 setOrigin(&I, getOrigin(&I, 0));
4102 }
4103
4104 void handleStmxcsr(IntrinsicInst &I) {
4105 IRBuilder<> IRB(&I);
4106 Value *Addr = I.getArgOperand(0);
4107 Type *Ty = IRB.getInt32Ty();
4108 Value *ShadowPtr =
4109 getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
4110
4111 IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
4112
4113 if (ClCheckAccessAddress)
4114 insertCheckShadowOf(Addr, &I);
4115 }
4116
4117 void handleLdmxcsr(IntrinsicInst &I) {
4118 if (!InsertChecks)
4119 return;
4120
4121 IRBuilder<> IRB(&I);
4122 Value *Addr = I.getArgOperand(0);
4123 Type *Ty = IRB.getInt32Ty();
4124 const Align Alignment = Align(1);
4125 Value *ShadowPtr, *OriginPtr;
4126 std::tie(ShadowPtr, OriginPtr) =
4127 getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
4128
4129 if (ClCheckAccessAddress)
4130 insertCheckShadowOf(Addr, &I);
4131
4132 Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
4133 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
4134 : getCleanOrigin();
4135 insertCheckShadow(Shadow, Origin, &I);
4136 }
4137
4138 void handleMaskedExpandLoad(IntrinsicInst &I) {
4139 IRBuilder<> IRB(&I);
4140 Value *Ptr = I.getArgOperand(0);
4141 MaybeAlign Align = I.getParamAlign(0);
4142 Value *Mask = I.getArgOperand(1);
4143 Value *PassThru = I.getArgOperand(2);
4144
4145 if (ClCheckAccessAddress) {
4146 insertCheckShadowOf(Ptr, &I);
4147 insertCheckShadowOf(Mask, &I);
4148 }
4149
4150 if (!PropagateShadow) {
4151 setShadow(&I, getCleanShadow(&I));
4152 setOrigin(&I, getCleanOrigin());
4153 return;
4154 }
4155
4156 Type *ShadowTy = getShadowTy(&I);
4157 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4158 auto [ShadowPtr, OriginPtr] =
4159 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
4160
4161 Value *Shadow =
4162 IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
4163 getShadow(PassThru), "_msmaskedexpload");
4164
4165 setShadow(&I, Shadow);
4166
4167 // TODO: Store origins.
4168 setOrigin(&I, getCleanOrigin());
4169 }
4170
4171 void handleMaskedCompressStore(IntrinsicInst &I) {
4172 IRBuilder<> IRB(&I);
4173 Value *Values = I.getArgOperand(0);
4174 Value *Ptr = I.getArgOperand(1);
4175 MaybeAlign Align = I.getParamAlign(1);
4176 Value *Mask = I.getArgOperand(2);
4177
4178 if (ClCheckAccessAddress) {
4179 insertCheckShadowOf(Ptr, &I);
4180 insertCheckShadowOf(Mask, &I);
4181 }
4182
4183 Value *Shadow = getShadow(Values);
4184 Type *ElementShadowTy =
4185 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4186 auto [ShadowPtr, OriginPtrs] =
4187 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
4188
4189 IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
4190
4191 // TODO: Store origins.
4192 }
4193
4194 void handleMaskedGather(IntrinsicInst &I) {
4195 IRBuilder<> IRB(&I);
4196 Value *Ptrs = I.getArgOperand(0);
4197 const Align Alignment(
4198 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4199 Value *Mask = I.getArgOperand(2);
4200 Value *PassThru = I.getArgOperand(3);
4201
4202 Type *PtrsShadowTy = getShadowTy(Ptrs);
4203 if (ClCheckAccessAddress) {
4204 insertCheckShadowOf(Mask, &I);
4205 Value *MaskedPtrShadow = IRB.CreateSelect(
4206 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4207 "_msmaskedptrs");
4208 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4209 }
4210
4211 if (!PropagateShadow) {
4212 setShadow(&I, getCleanShadow(&I));
4213 setOrigin(&I, getCleanOrigin());
4214 return;
4215 }
4216
4217 Type *ShadowTy = getShadowTy(&I);
4218 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4219 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4220 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
4221
4222 Value *Shadow =
4223 IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
4224 getShadow(PassThru), "_msmaskedgather");
4225
4226 setShadow(&I, Shadow);
4227
4228 // TODO: Store origins.
4229 setOrigin(&I, getCleanOrigin());
4230 }
4231
4232 void handleMaskedScatter(IntrinsicInst &I) {
4233 IRBuilder<> IRB(&I);
4234 Value *Values = I.getArgOperand(0);
4235 Value *Ptrs = I.getArgOperand(1);
4236 const Align Alignment(
4237 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4238 Value *Mask = I.getArgOperand(3);
4239
4240 Type *PtrsShadowTy = getShadowTy(Ptrs);
4241 if (ClCheckAccessAddress) {
4242 insertCheckShadowOf(Mask, &I);
4243 Value *MaskedPtrShadow = IRB.CreateSelect(
4244 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4245 "_msmaskedptrs");
4246 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4247 }
4248
4249 Value *Shadow = getShadow(Values);
4250 Type *ElementShadowTy =
4251 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4252 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4253 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
4254
4255 IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
4256
4257 // TODO: Store origin.
4258 }
4259
4260 // Intrinsic::masked_store
4261 //
4262 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4263 // stores are lowered to Intrinsic::masked_store.
4264 void handleMaskedStore(IntrinsicInst &I) {
4265 IRBuilder<> IRB(&I);
4266 Value *V = I.getArgOperand(0);
4267 Value *Ptr = I.getArgOperand(1);
4268 const Align Alignment(
4269 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4270 Value *Mask = I.getArgOperand(3);
4271 Value *Shadow = getShadow(V);
4272
4273 if (ClCheckAccessAddress) {
4274 insertCheckShadowOf(Ptr, &I);
4275 insertCheckShadowOf(Mask, &I);
4276 }
4277
4278 Value *ShadowPtr;
4279 Value *OriginPtr;
4280 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
4281 Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
4282
4283 IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
4284
4285 if (!MS.TrackOrigins)
4286 return;
4287
4288 auto &DL = F.getDataLayout();
4289 paintOrigin(IRB, getOrigin(V), OriginPtr,
4290 DL.getTypeStoreSize(Shadow->getType()),
4291 std::max(Alignment, kMinOriginAlignment));
4292 }
4293
4294 // Intrinsic::masked_load
4295 //
4296 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4297 // loads are lowered to Intrinsic::masked_load.
4298 void handleMaskedLoad(IntrinsicInst &I) {
4299 IRBuilder<> IRB(&I);
4300 Value *Ptr = I.getArgOperand(0);
4301 const Align Alignment(
4302 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4303 Value *Mask = I.getArgOperand(2);
4304 Value *PassThru = I.getArgOperand(3);
4305
4306 if (ClCheckAccessAddress) {
4307 insertCheckShadowOf(Ptr, &I);
4308 insertCheckShadowOf(Mask, &I);
4309 }
4310
4311 if (!PropagateShadow) {
4312 setShadow(&I, getCleanShadow(&I));
4313 setOrigin(&I, getCleanOrigin());
4314 return;
4315 }
4316
4317 Type *ShadowTy = getShadowTy(&I);
4318 Value *ShadowPtr, *OriginPtr;
4319 std::tie(ShadowPtr, OriginPtr) =
4320 getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4321 setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
4322 getShadow(PassThru), "_msmaskedld"));
4323
4324 if (!MS.TrackOrigins)
4325 return;
4326
4327 // Choose between PassThru's and the loaded value's origins.
4328 Value *MaskedPassThruShadow = IRB.CreateAnd(
4329 getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
4330
4331 Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
4332
4333 Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
4334 Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
4335
4336 setOrigin(&I, Origin);
4337 }
4338
4339 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4340 // dst mask src
4341 //
4342 // AVX512 masked stores are lowered to Intrinsic::masked_store and are handled
4343 // by handleMaskedStore.
4344 //
4345 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4346 // vector of integers, unlike the LLVM masked intrinsics, which require a
4347 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4348 // mentions that the x86 backend does not know how to efficiently convert
4349 // from a vector of booleans back into the AVX mask format; therefore, they
4350 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4351 // intrinsics.
4352 void handleAVXMaskedStore(IntrinsicInst &I) {
4353 assert(I.arg_size() == 3);
4354
4355 IRBuilder<> IRB(&I);
4356
4357 Value *Dst = I.getArgOperand(0);
4358 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4359
4360 Value *Mask = I.getArgOperand(1);
4361 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4362
4363 Value *Src = I.getArgOperand(2);
4364 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4365
4366 const Align Alignment = Align(1);
4367
4368 Value *SrcShadow = getShadow(Src);
4369
4370 if (ClCheckAccessAddress) {
4371 insertCheckShadowOf(Dst, &I);
4372 insertCheckShadowOf(Mask, &I);
4373 }
4374
4375 Value *DstShadowPtr;
4376 Value *DstOriginPtr;
4377 std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
4378 Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
4379
4380 SmallVector<Value *, 2> ShadowArgs;
4381 ShadowArgs.append(1, DstShadowPtr);
4382 ShadowArgs.append(1, Mask);
4383 // The intrinsic may require floating-point but shadows can be arbitrary
4384 // bit patterns, of which some would be interpreted as "invalid"
4385 // floating-point values (NaN etc.); we assume the intrinsic will happily
4386 // copy them.
4387 ShadowArgs.append(1, IRB.CreateBitCast(SrcShadow, Src->getType()));
4388
4389 CallInst *CI =
4390 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
4391 setShadow(&I, CI);
4392
4393 if (!MS.TrackOrigins)
4394 return;
4395
4396 // Approximation only
4397 auto &DL = F.getDataLayout();
4398 paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
4399 DL.getTypeStoreSize(SrcShadow->getType()),
4400 std::max(Alignment, kMinOriginAlignment));
4401 }
4402
4403 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4404 // return src mask
4405 //
4406 // Masked-off values are replaced with 0, which conveniently also represents
4407 // initialized memory.
4408 //
4409 // AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
4410 // by handleMaskedLoad.
4411 //
4412 // We do not combine this with handleMaskedLoad; see comment in
4413 // handleAVXMaskedStore for the rationale.
4414 //
4415 // This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
4416 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4417 // parameter.
4418 void handleAVXMaskedLoad(IntrinsicInst &I) {
4419 assert(I.arg_size() == 2);
4420
4421 IRBuilder<> IRB(&I);
4422
4423 Value *Src = I.getArgOperand(0);
4424 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4425
4426 Value *Mask = I.getArgOperand(1);
4427 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4428
4429 const Align Alignment = Align(1);
4430
4431 if (ClCheckAccessAddress) {
4432 insertCheckShadowOf(Mask, &I);
4433 }
4434
4435 Type *SrcShadowTy = getShadowTy(Src);
4436 Value *SrcShadowPtr, *SrcOriginPtr;
4437 std::tie(SrcShadowPtr, SrcOriginPtr) =
4438 getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
4439
4440 SmallVector<Value *, 2> ShadowArgs;
4441 ShadowArgs.append(1, SrcShadowPtr);
4442 ShadowArgs.append(1, Mask);
4443
4444 CallInst *CI =
4445 IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
4446 // The AVX masked load intrinsics do not have integer variants. We use the
4447 // floating-point variants, which will happily copy the shadows even if
4448 // they are interpreted as "invalid" floating-point values (NaN etc.).
4449 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4450
4451 if (!MS.TrackOrigins)
4452 return;
4453
4454 // The "pass-through" value is always zero (initialized). To the extent
4455 // that that results in initialized aligned 4-byte chunks, the origin value
4456 // is ignored. It is therefore correct to simply copy the origin from src.
4457 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
4458 setOrigin(&I, PtrSrcOrigin);
4459 }
4460
4461 // Test whether the mask indices are initialized, only checking the bits that
4462 // are actually used.
4463 //
4464 // e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
4465 // used/checked.
4466 void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
4467 assert(isFixedIntVector(Idx));
4468 auto IdxVectorSize =
4469 cast<FixedVectorType>(Idx->getType())->getNumElements();
4470 assert(isPowerOf2_64(IdxVectorSize));
4471
4472 // Compiler isn't smart enough, let's help it
4473 if (isa<Constant>(Idx))
4474 return;
4475
4476 auto *IdxShadow = getShadow(Idx);
4477 Value *Truncated = IRB.CreateTrunc(
4478 IdxShadow,
4479 FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
4480 IdxVectorSize));
4481 insertCheckShadow(Truncated, getOrigin(Idx), I);
4482 }
4483
4484 // Instrument AVX permutation intrinsic.
4485 // We apply the same permutation (argument index 1) to the shadow.
4486 void handleAVXVpermilvar(IntrinsicInst &I) {
4487 IRBuilder<> IRB(&I);
4488 Value *Shadow = getShadow(&I, 0);
4489 maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
4490
4491 // Shadows are integer-ish types but some intrinsics require a
4492 // different (e.g., floating-point) type.
4493 Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
4494 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4495 {Shadow, I.getArgOperand(1)});
4496
4497 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4498 setOriginForNaryOp(I);
4499 }
4500
4501 // Instrument AVX permutation intrinsic.
4502 // We apply the same permutation (argument index 1) to the shadows.
4503 void handleAVXVpermi2var(IntrinsicInst &I) {
4504 assert(I.arg_size() == 3);
4505 assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
4506 assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
4507 assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
4508 [[maybe_unused]] auto ArgVectorSize =
4509 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4510 assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
4511 ->getNumElements() == ArgVectorSize);
4512 assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
4513 ->getNumElements() == ArgVectorSize);
4514 assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
4515 assert(I.getType() == I.getArgOperand(0)->getType());
4516 assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
4517 IRBuilder<> IRB(&I);
4518 Value *AShadow = getShadow(&I, 0);
4519 Value *Idx = I.getArgOperand(1);
4520 Value *BShadow = getShadow(&I, 2);
4521
4522 maskedCheckAVXIndexShadow(IRB, Idx, &I);
4523
4524 // Shadows are integer-ish types but some intrinsics require a
4525 // different (e.g., floating-point) type.
4526 AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
4527 BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
4528 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4529 {AShadow, Idx, BShadow});
4530 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4531 setOriginForNaryOp(I);
4532 }
4533
4534 [[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
4535 return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
4536 }
4537
4538 [[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
4539 return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
4540 }
4541
4542 [[maybe_unused]] static bool isFixedIntVector(const Value *V) {
4543 return isFixedIntVectorTy(V->getType());
4544 }
4545
4546 [[maybe_unused]] static bool isFixedFPVector(const Value *V) {
4547 return isFixedFPVectorTy(V->getType());
4548 }
4549
4550 // e.g., <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
4551 // (<16 x float> a, <16 x i32> writethru, i16 mask,
4552 // i32 rounding)
4553 //
4554 // Inconveniently, some similar intrinsics have a different operand order:
4555 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
4556 // (<16 x float> a, i32 rounding, <16 x i16> writethru,
4557 // i16 mask)
4558 //
4559 // If the return type has more elements than A, the excess elements are
4560 // zeroed (and the corresponding shadow is initialized).
4561 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
4562 // (<4 x float> a, i32 rounding, <8 x i16> writethru,
4563 // i8 mask)
4564 //
4565 // dst[i] = mask[i] ? convert(a[i]) : writethru[i]
4566 // dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
4567 // where all_or_nothing(x) is fully uninitialized if x has any
4568 // uninitialized bits
4569 void handleAVX512VectorConvertFPToInt(IntrinsicInst &I, bool LastMask) {
4570 IRBuilder<> IRB(&I);
4571
4572 assert(I.arg_size() == 4);
4573 Value *A = I.getOperand(0);
4574 Value *WriteThrough;
4575 Value *Mask;
4576 Value *RoundingMode;
4577 if (LastMask) {
4578 WriteThrough = I.getOperand(2);
4579 Mask = I.getOperand(3);
4580 RoundingMode = I.getOperand(1);
4581 } else {
4582 WriteThrough = I.getOperand(1);
4583 Mask = I.getOperand(2);
4584 RoundingMode = I.getOperand(3);
4585 }
4586
4587 assert(isFixedFPVector(A));
4588 assert(isFixedIntVector(WriteThrough));
4589
4590 unsigned ANumElements =
4591 cast<FixedVectorType>(A->getType())->getNumElements();
4592 [[maybe_unused]] unsigned WriteThruNumElements =
4593 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4594 assert(ANumElements == WriteThruNumElements ||
4595 ANumElements * 2 == WriteThruNumElements);
4596
4597 assert(Mask->getType()->isIntegerTy());
4598 unsigned MaskNumElements = Mask->getType()->getScalarSizeInBits();
4599 assert(ANumElements == MaskNumElements ||
4600 ANumElements * 2 == MaskNumElements);
4601
4602 assert(WriteThruNumElements == MaskNumElements);
4603
4604 // Some bits of the mask may be unused, though it's unusual to have partly
4605 // uninitialized bits.
4606 insertCheckShadowOf(Mask, &I);
4607
4608 assert(RoundingMode->getType()->isIntegerTy());
4609 // Only some bits of the rounding mode are used, though it's very
4610 // unusual to have uninitialized bits there (more commonly, it's a
4611 // constant).
4612 insertCheckShadowOf(RoundingMode, &I);
4613
4614 assert(I.getType() == WriteThrough->getType());
4615
4616 Value *AShadow = getShadow(A);
4617 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4618
4619 if (ANumElements * 2 == MaskNumElements) {
4620 // Ensure that the irrelevant bits of the mask are zero, hence selecting
4621 // from the zeroed shadow instead of the writethrough's shadow.
4622 Mask =
4623 IRB.CreateTrunc(Mask, IRB.getIntNTy(ANumElements), "_ms_mask_trunc");
4624 Mask =
4625 IRB.CreateZExt(Mask, IRB.getIntNTy(MaskNumElements), "_ms_mask_zext");
4626 }
4627
4628 // Convert the integer mask to a vector of i1 (e.g., i16 to <16 x i1>)
4629 Mask = IRB.CreateBitCast(
4630 Mask, FixedVectorType::get(IRB.getInt1Ty(), MaskNumElements),
4631 "_ms_mask_bitcast");
4632
4633 /// For floating-point to integer conversion, the output is:
4634 /// - fully uninitialized if *any* bit of the input is uninitialized
4635 /// - fully initialized if all bits of the input are initialized
4636 /// We apply the same principle on a per-element basis for vectors.
4637 ///
4638 /// We use the scalar width of the return type instead of A's.
4639 AShadow = IRB.CreateSExt(
4640 IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow->getType())),
4641 getShadowTy(&I), "_ms_a_shadow");
4642
4643 Value *WriteThroughShadow = getShadow(WriteThrough);
4644 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow,
4645 "_ms_writethru_select");
4646
4647 setShadow(&I, Shadow);
4648 setOriginForNaryOp(I);
4649 }
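// Illustrative sketch (values assumed): with an all-ones mask and an
// A-shadow of <0, 0, 0x00010000, 0> (only a[2] partly uninitialized), the
// icmp ne / sext pair above turns that lane into all-ones, i.e. a[2]'s
// converted result is treated as fully uninitialized, and the final select
// keeps that lane rather than the writethrough's shadow.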
4650
4651 // Instrument BMI / BMI2 intrinsics.
4652 // All of these intrinsics are Z = I(X, Y)
4653 // where the types of all operands and the result match, and are either i32 or
4654 // i64. The following instrumentation happens to work for all of them:
4655 // Sz = I(Sx, Y) | (sext (Sy != 0))
4656 void handleBmiIntrinsic(IntrinsicInst &I) {
4657 IRBuilder<> IRB(&I);
4658 Type *ShadowTy = getShadowTy(&I);
4659
4660 // If any bit of the mask operand is poisoned, then the whole thing is.
4661 Value *SMask = getShadow(&I, 1);
4662 SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
4663 ShadowTy);
4664 // Apply the same intrinsic to the shadow of the first operand.
4665 Value *S = IRB.CreateCall(I.getCalledFunction(),
4666 {getShadow(&I, 0), I.getOperand(1)});
4667 S = IRB.CreateOr(SMask, S);
4668 setShadow(&I, S);
4669 setOriginForNaryOp(I);
4670 }
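// For example (illustrative only), for %r = @llvm.x86.bmi.pext.32(%x, %y)
// the instrumentation above computes roughly:
//   %smask    = sext (icmp ne %y_shadow, 0) to i32  ; any poison in the mask
//   %sx       = @llvm.x86.bmi.pext.32(%x_shadow, %y)
//   %r_shadow = or i32 %smask, %sx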
4671
4672 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4673 SmallVector<int, 8> Mask;
4674 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4675 Mask.append(2, X);
4676 }
4677 return Mask;
4678 }
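// For example, getPclmulMask(4, /*OddElements=*/false) yields {0, 0, 2, 2}
// and getPclmulMask(4, /*OddElements=*/true) yields {1, 1, 3, 3}, matching
// the even/odd cases described below.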
4679
4680 // Instrument pclmul intrinsics.
4681 // These intrinsics operate either on odd or on even elements of the input
4682 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4683 // Replace the unused elements with copies of the used ones, ex:
4684 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4685 // or
4686 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4687 // and then apply the usual shadow combining logic.
4688 void handlePclmulIntrinsic(IntrinsicInst &I) {
4689 IRBuilder<> IRB(&I);
4690 unsigned Width =
4691 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4692 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4693 "pclmul 3rd operand must be a constant");
4694 unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
4695 Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
4696 getPclmulMask(Width, Imm & 0x01));
4697 Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
4698 getPclmulMask(Width, Imm & 0x10));
4699 ShadowAndOriginCombiner SOC(this, IRB);
4700 SOC.Add(Shuf0, getOrigin(&I, 0));
4701 SOC.Add(Shuf1, getOrigin(&I, 1));
4702 SOC.Done(&I);
4703 }
4704
4705 // Instrument _mm_*_sd|ss intrinsics
4706 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4707 IRBuilder<> IRB(&I);
4708 unsigned Width =
4709 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4710 Value *First = getShadow(&I, 0);
4711 Value *Second = getShadow(&I, 1);
4712 // First element of second operand, remaining elements of first operand
4713 SmallVector<int, 16> Mask;
4714 Mask.push_back(Width);
4715 for (unsigned i = 1; i < Width; i++)
4716 Mask.push_back(i);
4717 Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
4718
4719 setShadow(&I, Shadow);
4720 setOriginForNaryOp(I);
4721 }
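// For example (sketch), for <4 x float> operands the mask above is
// {4, 1, 2, 3}: element 0 of the result shadow comes from the second
// operand, elements 1..3 keep the first operand's shadow.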
4722
4723 void handleVtestIntrinsic(IntrinsicInst &I) {
4724 IRBuilder<> IRB(&I);
4725 Value *Shadow0 = getShadow(&I, 0);
4726 Value *Shadow1 = getShadow(&I, 1);
4727 Value *Or = IRB.CreateOr(Shadow0, Shadow1);
4728 Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
4729 Value *Scalar = convertShadowToScalar(NZ, IRB);
4730 Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
4731
4732 setShadow(&I, Shadow);
4733 setOriginForNaryOp(I);
4734 }
4735
4736 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4737 IRBuilder<> IRB(&I);
4738 unsigned Width =
4739 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4740 Value *First = getShadow(&I, 0);
4741 Value *Second = getShadow(&I, 1);
4742 Value *OrShadow = IRB.CreateOr(First, Second);
4743 // First element of both OR'd together, remaining elements of first operand
4744 SmallVector<int, 16> Mask;
4745 Mask.push_back(Width);
4746 for (unsigned i = 1; i < Width; i++)
4747 Mask.push_back(i);
4748 Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
4749
4750 setShadow(&I, Shadow);
4751 setOriginForNaryOp(I);
4752 }
4753
4754 // _mm_round_pd / _mm_round_ps.
4755 // Similar to maybeHandleSimpleNomemIntrinsic, except that the second
4756 // argument is guaranteed to be a constant integer.
4757 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4758 assert(I.getArgOperand(0)->getType() == I.getType());
4759 assert(I.arg_size() == 2);
4760 assert(isa<ConstantInt>(I.getArgOperand(1)));
4761
4762 IRBuilder<> IRB(&I);
4763 ShadowAndOriginCombiner SC(this, IRB);
4764 SC.Add(I.getArgOperand(0));
4765 SC.Done(&I);
4766 }
4767
4768 // Instrument @llvm.abs intrinsic.
4769 //
4770 // e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
4771 // <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
4772 void handleAbsIntrinsic(IntrinsicInst &I) {
4773 assert(I.arg_size() == 2);
4774 Value *Src = I.getArgOperand(0);
4775 Value *IsIntMinPoison = I.getArgOperand(1);
4776
4777 assert(I.getType()->isIntOrIntVectorTy());
4778
4779 assert(Src->getType() == I.getType());
4780
4781 assert(IsIntMinPoison->getType()->isIntegerTy());
4782 assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
4783
4784 IRBuilder<> IRB(&I);
4785 Value *SrcShadow = getShadow(Src);
4786
4787 APInt MinVal =
4788 APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
4789 Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
4790 Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
4791
4792 Value *PoisonedShadow = getPoisonedShadow(Src);
4793 Value *PoisonedIfIntMinShadow =
4794 IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
4795 Value *Shadow =
4796 IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
4797
4798 setShadow(&I, Shadow);
4799 setOrigin(&I, getOrigin(&I, 0));
4800 }
4801
4802 void handleIsFpClass(IntrinsicInst &I) {
4803 IRBuilder<> IRB(&I);
4804 Value *Shadow = getShadow(&I, 0);
4805 setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
4806 setOrigin(&I, getOrigin(&I, 0));
4807 }
4808
4809 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4810 IRBuilder<> IRB(&I);
4811 Value *Shadow0 = getShadow(&I, 0);
4812 Value *Shadow1 = getShadow(&I, 1);
4813 Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
4814 Value *ShadowElt1 =
4815 IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
4816
4817 Value *Shadow = PoisonValue::get(getShadowTy(&I));
4818 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
4819 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
4820
4821 setShadow(&I, Shadow);
4822 setOriginForNaryOp(I);
4823 }
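// For example (illustrative only), for
//   {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
// the shadow of the {result, overflow} pair is approximately
//   { %a_shadow | %b_shadow, icmp ne (%a_shadow | %b_shadow), 0 }
// i.e. the overflow bit's shadow is set iff any input bit is poisoned.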
4824
4825 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4826 assert(isa<FixedVectorType>(V->getType()));
4827 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4828 Value *Shadow = getShadow(V);
4829 return IRB.CreateExtractElement(Shadow,
4830 ConstantInt::get(IRB.getInt32Ty(), 0));
4831 }
4832
4833 // Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
4834 //
4835 // e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
4836 // (<8 x i64>, <16 x i8>, i8)
4837 // A WriteThru Mask
4838 //
4839 // call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
4840 // (<16 x i32>, <16 x i8>, i16)
4841 //
4842 // Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
4843 // Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
4844 //
4845 // If Dst has more elements than A, the excess elements are zeroed (and the
4846 // corresponding shadow is initialized).
4847 //
4848 // Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
4849 // and is much faster than this handler.
4850 void handleAVX512VectorDownConvert(IntrinsicInst &I) {
4851 IRBuilder<> IRB(&I);
4852
4853 assert(I.arg_size() == 3);
4854 Value *A = I.getOperand(0);
4855 Value *WriteThrough = I.getOperand(1);
4856 Value *Mask = I.getOperand(2);
4857
4858 assert(isFixedIntVector(A));
4859 assert(isFixedIntVector(WriteThrough));
4860
4861 unsigned ANumElements =
4862 cast<FixedVectorType>(A->getType())->getNumElements();
4863 unsigned OutputNumElements =
4864 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4865 assert(ANumElements == OutputNumElements ||
4866 ANumElements * 2 == OutputNumElements);
4867
4868 assert(Mask->getType()->isIntegerTy());
4869 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4870 insertCheckShadowOf(Mask, &I);
4871
4872 assert(I.getType() == WriteThrough->getType());
4873
4874 // Widen the mask, if necessary, to have one bit per element of the output
4875 // vector.
4876 // We want the extra bits to have '1's, so that the CreateSelect will
4877 // select the values from AShadow instead of WriteThroughShadow ("maskless"
4878 // versions of the intrinsics are sometimes implemented using an all-1's
4879 // mask and an undefined value for WriteThroughShadow). We accomplish this
4880 // by using bitwise NOT before and after the ZExt.
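// For example (values assumed): widening an i8 mask 0b10110001 to i16
// gives NOT -> 0b01001110, ZExt -> 0b0000000001001110, and
// NOT -> 0b1111111110110001, i.e. the low half keeps the original mask
// and all of the new high bits are 1.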
4881 if (ANumElements != OutputNumElements) {
4882 Mask = IRB.CreateNot(Mask);
4883 Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
4884 "_ms_widen_mask");
4885 Mask = IRB.CreateNot(Mask);
4886 }
4887 Mask = IRB.CreateBitCast(
4888 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4889
4890 Value *AShadow = getShadow(A);
4891
4892 // The return type might have more elements than the input.
4893 // Temporarily shrink the return type's number of elements.
4894 VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
4895
4896 // PMOV truncates; PMOVS/PMOVUS uses signed/unsigned saturation.
4897 // This handler treats them all as truncation, which leads to some rare
4898 // false positives in the cases where the truncated bytes could
4899 // unambiguously saturate the value e.g., if A = ??????10 ????????
4900 // (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
4901 // fully defined, but the truncated byte is ????????.
4902 //
4903 // TODO: use GetMinMaxUnsigned() to handle saturation precisely.
4904 AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
4905 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4906
4907 Value *WriteThroughShadow = getShadow(WriteThrough);
4908
4909 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4910 setShadow(&I, Shadow);
4911 setOriginForNaryOp(I);
4912 }
4913
4914 // For sh.* compiler intrinsics:
4915 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
4916 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
4917 // A B WriteThru Mask RoundingMode
4918 //
4919 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
4920 // DstShadow[1..7] = AShadow[1..7]
4921 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
4922 IRBuilder<> IRB(&I);
4923
4924 assert(I.arg_size() == 5);
4925 Value *A = I.getOperand(0);
4926 Value *B = I.getOperand(1);
4927 Value *WriteThrough = I.getOperand(2);
4928 Value *Mask = I.getOperand(3);
4929 Value *RoundingMode = I.getOperand(4);
4930
4931 // Technically, we could probably just check whether the LSB is
4932 // initialized, but intuitively it feels like a partly uninitialized mask
4933 // is unintended, and we should warn the user immediately.
4934 insertCheckShadowOf(Mask, &I);
4935 insertCheckShadowOf(RoundingMode, &I);
4936
4937 assert(isa<FixedVectorType>(A->getType()));
4938 unsigned NumElements =
4939 cast<FixedVectorType>(A->getType())->getNumElements();
4940 assert(NumElements == 8);
4941 assert(A->getType() == B->getType());
4942 assert(B->getType() == WriteThrough->getType());
4943 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
4944 assert(RoundingMode->getType()->isIntegerTy());
4945
4946 Value *ALowerShadow = extractLowerShadow(IRB, A);
4947 Value *BLowerShadow = extractLowerShadow(IRB, B);
4948
4949 Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
4950
4951 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
4952
4953 Mask = IRB.CreateBitCast(
4954 Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
4955 Value *MaskLower =
4956 IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
4957
4958 Value *AShadow = getShadow(A);
4959 Value *DstLowerShadow =
4960 IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
4961 Value *DstShadow = IRB.CreateInsertElement(
4962 AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
4963 "_msprop");
4964
4965 setShadow(&I, DstShadow);
4966 setOriginForNaryOp(I);
4967 }
4968
4969 // Approximately handle AVX Galois Field Affine Transformation
4970 //
4971 // e.g.,
4972 // <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
4973 // <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
4974 // <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
4975 // Out A x b
4976 // where A and x are packed matrices, b is a vector,
4977 // Out = A * x + b in GF(2)
4978 //
4979 // Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
4980 // computation also includes a parity calculation.
4981 //
4982 // For the bitwise AND of bits V1 and V2, the exact shadow is:
4983 // Out_Shadow = (V1_Shadow & V2_Shadow)
4984 // | (V1 & V2_Shadow)
4985 // | (V1_Shadow & V2 )
4986 //
4987 // We approximate the shadow of gf2p8affineqb using:
4988 // Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
4989 // | gf2p8affineqb(x, A_shadow, 0)
4990 // | gf2p8affineqb(x_Shadow, A, 0)
4991 // | set1_epi8(b_Shadow)
4992 //
4993 // This approximation has false negatives: if an intermediate dot-product
4994 // contains an even number of 1's, the parity is 0.
4995 // It has no false positives.
4996 void handleAVXGF2P8Affine(IntrinsicInst &I) {
4997 IRBuilder<> IRB(&I);
4998
4999 assert(I.arg_size() == 3);
5000 Value *A = I.getOperand(0);
5001 Value *X = I.getOperand(1);
5002 Value *B = I.getOperand(2);
5003
5004 assert(isFixedIntVector(A));
5005 assert(cast<VectorType>(A->getType())
5006 ->getElementType()
5007 ->getScalarSizeInBits() == 8);
5008
5009 assert(A->getType() == X->getType());
5010
5011 assert(B->getType()->isIntegerTy());
5012 assert(B->getType()->getScalarSizeInBits() == 8);
5013
5014 assert(I.getType() == A->getType());
5015
5016 Value *AShadow = getShadow(A);
5017 Value *XShadow = getShadow(X);
5018 Value *BZeroShadow = getCleanShadow(B);
5019
5020 CallInst *AShadowXShadow = IRB.CreateIntrinsic(
5021 I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
5022 CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5023 {X, AShadow, BZeroShadow});
5024 CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5025 {XShadow, A, BZeroShadow});
5026
5027 unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
5028 Value *BShadow = getShadow(B);
5029 Value *BBroadcastShadow = getCleanShadow(AShadow);
5030 // There is no LLVM IR intrinsic for _mm512_set1_epi8.
5031 // This loop generates a lot of LLVM IR, which we expect that CodeGen will
5032 // lower appropriately (e.g., VPBROADCASTB).
5033 // Besides, b is often a constant, in which case it is fully initialized.
5034 for (unsigned i = 0; i < NumElements; i++)
5035 BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
5036
5037 setShadow(&I, IRB.CreateOr(
5038 {AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
5039 setOriginForNaryOp(I);
5040 }
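// For example (illustrative only): a dot-product over shadow bits that
// contains exactly two 1-bits has parity 0, so that term of the OR above
// contributes a clean bit even though two input bits are uninitialized; if
// the remaining terms are also clean for that bit, the poison is lost.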
5041
5042 // Handle Arm NEON vector load intrinsics (vld*).
5043 //
5044 // The WithLane instructions (ld[234]lane) are similar to:
5045 // call {<4 x i32>, <4 x i32>, <4 x i32>}
5046 // @llvm.aarch64.neon.ld3lane.v4i32.p0
5047 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
5048 // %A)
5049 //
5050 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
5051 // to:
5052 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
5053 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
5054 unsigned int numArgs = I.arg_size();
5055
5056 // Return type is a struct of vectors of integers or floating-point
5057 assert(I.getType()->isStructTy());
5058 [[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
5059 assert(RetTy->getNumElements() > 0);
5060     assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
5061            RetTy->getElementType(0)->isFPOrFPVectorTy());
5062 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5063 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
5064
5065 if (WithLane) {
5066 // 2, 3 or 4 vectors, plus lane number, plus input pointer
5067 assert(4 <= numArgs && numArgs <= 6);
5068
5069 // Return type is a struct of the input vectors
5070 assert(RetTy->getNumElements() + 2 == numArgs);
5071 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5072 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
5073 } else {
5074 assert(numArgs == 1);
5075 }
5076
5077 IRBuilder<> IRB(&I);
5078
5079 SmallVector<Value *, 6> ShadowArgs;
5080 if (WithLane) {
5081 for (unsigned int i = 0; i < numArgs - 2; i++)
5082 ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
5083
5084 // Lane number, passed verbatim
5085 Value *LaneNumber = I.getArgOperand(numArgs - 2);
5086 ShadowArgs.push_back(LaneNumber);
5087
5088 // TODO: blend shadow of lane number into output shadow?
5089 insertCheckShadowOf(LaneNumber, &I);
5090 }
5091
5092 Value *Src = I.getArgOperand(numArgs - 1);
5093 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
5094
5095 Type *SrcShadowTy = getShadowTy(Src);
5096 auto [SrcShadowPtr, SrcOriginPtr] =
5097 getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
5098 ShadowArgs.push_back(SrcShadowPtr);
5099
5100 // The NEON vector load instructions handled by this function all have
5101 // integer variants. It is easier to use those rather than trying to cast
5102 // a struct of vectors of floats into a struct of vectors of integers.
5103 CallInst *CI =
5104 IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
5105 setShadow(&I, CI);
5106
5107 if (!MS.TrackOrigins)
5108 return;
5109
5110 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
5111 setOrigin(&I, PtrSrcOrigin);
5112 }
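// For example (sketch, not the exact IR emitted): the shadow of
//   %res = call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
// is computed by applying the same intrinsic to the shadow memory of *%A:
//   %s   = call {<8 x i8>, <8 x i8>}
//              @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A_shadow_ptr)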
5113
5114 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
5115 /// and vst{2,3,4}lane).
5116 ///
5117 /// Arm NEON vector store intrinsics have the output address (pointer) as the
5118 /// last argument, with the initial arguments being the inputs (and lane
5119 /// number for vst{2,3,4}lane). They return void.
5120 ///
5121 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
5122 /// abcdabcdabcdabcd... into *outP
5123 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
5124 /// writes aaaa...bbbb...cccc...dddd... into *outP
5125 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
5126 /// These instructions can all be instrumented with essentially the same
5127 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
5128 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
5129 IRBuilder<> IRB(&I);
5130
5131 // Don't use getNumOperands() because it includes the callee
5132 int numArgOperands = I.arg_size();
5133
5134 // The last arg operand is the output (pointer)
5135 assert(numArgOperands >= 1);
5136 Value *Addr = I.getArgOperand(numArgOperands - 1);
5137 assert(Addr->getType()->isPointerTy());
5138 int skipTrailingOperands = 1;
5139
5140     if (ClCheckAccessAddress)
5141       insertCheckShadowOf(Addr, &I);
5142
5143 // Second-last operand is the lane number (for vst{2,3,4}lane)
5144 if (useLane) {
5145 skipTrailingOperands++;
5146 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
5147       assert(isa<IntegerType>(
5148           I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
5149 }
5150
5151 SmallVector<Value *, 8> ShadowArgs;
5152 // All the initial operands are the inputs
5153 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
5154 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
5155 Value *Shadow = getShadow(&I, i);
5156 ShadowArgs.append(1, Shadow);
5157 }
5158
5159 // MSan's GetShadowTy assumes the LHS is the type we want the shadow for
5160 // e.g., for:
5161 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
5162 // we know the type of the output (and its shadow) is <16 x i8>.
5163 //
5164 // Arm NEON VST is unusual because the last argument is the output address:
5165 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
5166 // call void @llvm.aarch64.neon.st2.v16i8.p0
5167 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
5168 // and we have no type information about P's operand. We must manually
5169 // compute the type (<16 x i8> x 2).
5170 FixedVectorType *OutputVectorTy = FixedVectorType::get(
5171 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
5172 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
5173 (numArgOperands - skipTrailingOperands));
5174 Type *OutputShadowTy = getShadowTy(OutputVectorTy);
5175
5176 if (useLane)
5177 ShadowArgs.append(1,
5178 I.getArgOperand(numArgOperands - skipTrailingOperands));
5179
5180 Value *OutputShadowPtr, *OutputOriginPtr;
5181 // AArch64 NEON does not need alignment (unless OS requires it)
5182 std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
5183 Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
5184 ShadowArgs.append(1, OutputShadowPtr);
5185
5186 CallInst *CI =
5187 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
5188 setShadow(&I, CI);
5189
5190 if (MS.TrackOrigins) {
5191 // TODO: if we modelled the vst* instruction more precisely, we could
5192 // more accurately track the origins (e.g., if both inputs are
5193 // uninitialized for vst2, we currently blame the second input, even
5194 // though part of the output depends only on the first input).
5195 //
5196 // This is particularly imprecise for vst{2,3,4}lane, since only one
5197 // lane of each input is actually copied to the output.
5198 OriginCombiner OC(this, IRB);
5199 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
5200 OC.Add(I.getArgOperand(i));
5201
5202 const DataLayout &DL = F.getDataLayout();
5203 OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
5204 OutputOriginPtr);
5205 }
5206 }
5207
5208 /// Handle intrinsics by applying the intrinsic to the shadows.
5209 ///
5210 /// The trailing arguments are passed verbatim to the intrinsic, though any
5211 /// uninitialized trailing arguments can also taint the shadow e.g., for an
5212 /// intrinsic with one trailing verbatim argument:
5213 /// out = intrinsic(var1, var2, opType)
5214 /// we compute:
5215 /// shadow[out] =
5216 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
5217 ///
5218 /// Typically, shadowIntrinsicID will be specified by the caller to be
5219 /// I.getIntrinsicID(), but the caller can choose to replace it with another
5220 /// intrinsic of the same type.
5221 ///
5222 /// CAUTION: this assumes that the intrinsic will handle arbitrary
5223 /// bit-patterns (for example, if the intrinsic accepts floats for
5224 /// var1, we require that it doesn't care if inputs are NaNs).
5225 ///
5226 /// For example, this can be applied to the Arm NEON vector table intrinsics
5227 /// (tbl{1,2,3,4}).
5228 ///
5229 /// The origin is approximated using setOriginForNaryOp.
5230 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
5231 Intrinsic::ID shadowIntrinsicID,
5232 unsigned int trailingVerbatimArgs) {
5233 IRBuilder<> IRB(&I);
5234
5235 assert(trailingVerbatimArgs < I.arg_size());
5236
5237 SmallVector<Value *, 8> ShadowArgs;
5238 // Don't use getNumOperands() because it includes the callee
5239 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
5240 Value *Shadow = getShadow(&I, i);
5241
5242 // Shadows are integer-ish types but some intrinsics require a
5243 // different (e.g., floating-point) type.
5244 ShadowArgs.push_back(
5245 IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
5246 }
5247
5248 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5249 i++) {
5250 Value *Arg = I.getArgOperand(i);
5251 ShadowArgs.push_back(Arg);
5252 }
5253
5254 CallInst *CI =
5255 IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
5256 Value *CombinedShadow = CI;
5257
5258 // Combine the computed shadow with the shadow of trailing args
5259 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5260 i++) {
5261 Value *Shadow =
5262 CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
5263 CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
5264 }
5265
5266 setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
5267
5268 setOriginForNaryOp(I);
5269 }
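// For example (sketch), when called with trailingVerbatimArgs = 1 on
//   <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a, <16 x i8> %b)
// the computed shadow is approximately
//   pshuf.b.128(shadow(%a), %b) | shadow(%b)
// with the trailing shadow cast to the result's shadow type before the OR.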
5270
5271 // Approximation only
5272 //
5273 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
5274 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
5275 assert(I.arg_size() == 2);
5276
5277 handleShadowOr(I);
5278 }
5279
5280 bool maybeHandleCrossPlatformIntrinsic(IntrinsicInst &I) {
5281 switch (I.getIntrinsicID()) {
5282 case Intrinsic::uadd_with_overflow:
5283 case Intrinsic::sadd_with_overflow:
5284 case Intrinsic::usub_with_overflow:
5285 case Intrinsic::ssub_with_overflow:
5286 case Intrinsic::umul_with_overflow:
5287 case Intrinsic::smul_with_overflow:
5288 handleArithmeticWithOverflow(I);
5289 break;
5290 case Intrinsic::abs:
5291 handleAbsIntrinsic(I);
5292 break;
5293 case Intrinsic::bitreverse:
5294 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
5295 /*trailingVerbatimArgs*/ 0);
5296 break;
5297 case Intrinsic::is_fpclass:
5298 handleIsFpClass(I);
5299 break;
5300 case Intrinsic::lifetime_start:
5301 handleLifetimeStart(I);
5302 break;
5303 case Intrinsic::launder_invariant_group:
5304 case Intrinsic::strip_invariant_group:
5305 handleInvariantGroup(I);
5306 break;
5307 case Intrinsic::bswap:
5308 handleBswap(I);
5309 break;
5310 case Intrinsic::ctlz:
5311 case Intrinsic::cttz:
5312 handleCountLeadingTrailingZeros(I);
5313 break;
5314 case Intrinsic::masked_compressstore:
5315 handleMaskedCompressStore(I);
5316 break;
5317 case Intrinsic::masked_expandload:
5318 handleMaskedExpandLoad(I);
5319 break;
5320 case Intrinsic::masked_gather:
5321 handleMaskedGather(I);
5322 break;
5323 case Intrinsic::masked_scatter:
5324 handleMaskedScatter(I);
5325 break;
5326 case Intrinsic::masked_store:
5327 handleMaskedStore(I);
5328 break;
5329 case Intrinsic::masked_load:
5330 handleMaskedLoad(I);
5331 break;
5332 case Intrinsic::vector_reduce_and:
5333 handleVectorReduceAndIntrinsic(I);
5334 break;
5335 case Intrinsic::vector_reduce_or:
5336 handleVectorReduceOrIntrinsic(I);
5337 break;
5338
5339 case Intrinsic::vector_reduce_add:
5340 case Intrinsic::vector_reduce_xor:
5341 case Intrinsic::vector_reduce_mul:
5342 // Signed/Unsigned Min/Max
5343 // TODO: handling similarly to AND/OR may be more precise.
5344 case Intrinsic::vector_reduce_smax:
5345 case Intrinsic::vector_reduce_smin:
5346 case Intrinsic::vector_reduce_umax:
5347 case Intrinsic::vector_reduce_umin:
5348 // TODO: this has no false positives, but arguably we should check that all
5349 // the bits are initialized.
5350 case Intrinsic::vector_reduce_fmax:
5351 case Intrinsic::vector_reduce_fmin:
5352 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
5353 break;
5354
5355 case Intrinsic::vector_reduce_fadd:
5356 case Intrinsic::vector_reduce_fmul:
5357 handleVectorReduceWithStarterIntrinsic(I);
5358 break;
5359
5360 case Intrinsic::scmp:
5361 case Intrinsic::ucmp: {
5362 handleShadowOr(I);
5363 break;
5364 }
5365
5366 case Intrinsic::fshl:
5367 case Intrinsic::fshr:
5368 handleFunnelShift(I);
5369 break;
5370
5371 case Intrinsic::is_constant:
5372 // The result of llvm.is.constant() is always defined.
5373 setShadow(&I, getCleanShadow(&I));
5374 setOrigin(&I, getCleanOrigin());
5375 break;
5376
5377 default:
5378 return false;
5379 }
5380
5381 return true;
5382 }
5383
5384 bool maybeHandleX86SIMDIntrinsic(IntrinsicInst &I) {
5385 switch (I.getIntrinsicID()) {
5386 case Intrinsic::x86_sse_stmxcsr:
5387 handleStmxcsr(I);
5388 break;
5389 case Intrinsic::x86_sse_ldmxcsr:
5390 handleLdmxcsr(I);
5391 break;
5392
5393 // Convert Scalar Double Precision Floating-Point Value
5394 // to Unsigned Doubleword Integer
5395 // etc.
5396 case Intrinsic::x86_avx512_vcvtsd2usi64:
5397 case Intrinsic::x86_avx512_vcvtsd2usi32:
5398 case Intrinsic::x86_avx512_vcvtss2usi64:
5399 case Intrinsic::x86_avx512_vcvtss2usi32:
5400 case Intrinsic::x86_avx512_cvttss2usi64:
5401 case Intrinsic::x86_avx512_cvttss2usi:
5402 case Intrinsic::x86_avx512_cvttsd2usi64:
5403 case Intrinsic::x86_avx512_cvttsd2usi:
5404 case Intrinsic::x86_avx512_cvtusi2ss:
5405 case Intrinsic::x86_avx512_cvtusi642sd:
5406 case Intrinsic::x86_avx512_cvtusi642ss:
5407 handleSSEVectorConvertIntrinsic(I, 1, true);
5408 break;
5409 case Intrinsic::x86_sse2_cvtsd2si64:
5410 case Intrinsic::x86_sse2_cvtsd2si:
5411 case Intrinsic::x86_sse2_cvtsd2ss:
5412 case Intrinsic::x86_sse2_cvttsd2si64:
5413 case Intrinsic::x86_sse2_cvttsd2si:
5414 case Intrinsic::x86_sse_cvtss2si64:
5415 case Intrinsic::x86_sse_cvtss2si:
5416 case Intrinsic::x86_sse_cvttss2si64:
5417 case Intrinsic::x86_sse_cvttss2si:
5418 handleSSEVectorConvertIntrinsic(I, 1);
5419 break;
5420 case Intrinsic::x86_sse_cvtps2pi:
5421 case Intrinsic::x86_sse_cvttps2pi:
5422 handleSSEVectorConvertIntrinsic(I, 2);
5423 break;
5424
5425 // TODO:
5426 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
5427 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
5428 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
5429
5430 case Intrinsic::x86_vcvtps2ph_128:
5431 case Intrinsic::x86_vcvtps2ph_256: {
5432 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
5433 break;
5434 }
5435
5436 // Convert Packed Single Precision Floating-Point Values
5437 // to Packed Signed Doubleword Integer Values
5438 //
5439 // <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
5440 // (<16 x float>, <16 x i32>, i16, i32)
5441 case Intrinsic::x86_avx512_mask_cvtps2dq_512:
5442 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/false);
5443 break;
5444
5445 // Convert Packed Double Precision Floating-Point Values
5446 // to Packed Single Precision Floating-Point Values
5447 case Intrinsic::x86_sse2_cvtpd2ps:
5448 case Intrinsic::x86_sse2_cvtps2dq:
5449 case Intrinsic::x86_sse2_cvtpd2dq:
5450 case Intrinsic::x86_sse2_cvttps2dq:
5451 case Intrinsic::x86_sse2_cvttpd2dq:
5452 case Intrinsic::x86_avx_cvt_pd2_ps_256:
5453 case Intrinsic::x86_avx_cvt_ps2dq_256:
5454 case Intrinsic::x86_avx_cvt_pd2dq_256:
5455 case Intrinsic::x86_avx_cvtt_ps2dq_256:
5456 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
5457 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
5458 break;
5459 }
5460
5461 // Convert Single-Precision FP Value to 16-bit FP Value
5462 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
5463 // (<16 x float>, i32, <16 x i16>, i16)
5464 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
5465 // (<4 x float>, i32, <8 x i16>, i8)
5466 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256
5467 // (<8 x float>, i32, <8 x i16>, i8)
5468 case Intrinsic::x86_avx512_mask_vcvtps2ph_512:
5469 case Intrinsic::x86_avx512_mask_vcvtps2ph_256:
5470 case Intrinsic::x86_avx512_mask_vcvtps2ph_128:
5471 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/true);
5472 break;
5473
5474 // Shift Packed Data (Left Logical, Right Arithmetic, Right Logical)
5475 case Intrinsic::x86_avx512_psll_w_512:
5476 case Intrinsic::x86_avx512_psll_d_512:
5477 case Intrinsic::x86_avx512_psll_q_512:
5478 case Intrinsic::x86_avx512_pslli_w_512:
5479 case Intrinsic::x86_avx512_pslli_d_512:
5480 case Intrinsic::x86_avx512_pslli_q_512:
5481 case Intrinsic::x86_avx512_psrl_w_512:
5482 case Intrinsic::x86_avx512_psrl_d_512:
5483 case Intrinsic::x86_avx512_psrl_q_512:
5484 case Intrinsic::x86_avx512_psra_w_512:
5485 case Intrinsic::x86_avx512_psra_d_512:
5486 case Intrinsic::x86_avx512_psra_q_512:
5487 case Intrinsic::x86_avx512_psrli_w_512:
5488 case Intrinsic::x86_avx512_psrli_d_512:
5489 case Intrinsic::x86_avx512_psrli_q_512:
5490 case Intrinsic::x86_avx512_psrai_w_512:
5491 case Intrinsic::x86_avx512_psrai_d_512:
5492 case Intrinsic::x86_avx512_psrai_q_512:
5493 case Intrinsic::x86_avx512_psra_q_256:
5494 case Intrinsic::x86_avx512_psra_q_128:
5495 case Intrinsic::x86_avx512_psrai_q_256:
5496 case Intrinsic::x86_avx512_psrai_q_128:
5497 case Intrinsic::x86_avx2_psll_w:
5498 case Intrinsic::x86_avx2_psll_d:
5499 case Intrinsic::x86_avx2_psll_q:
5500 case Intrinsic::x86_avx2_pslli_w:
5501 case Intrinsic::x86_avx2_pslli_d:
5502 case Intrinsic::x86_avx2_pslli_q:
5503 case Intrinsic::x86_avx2_psrl_w:
5504 case Intrinsic::x86_avx2_psrl_d:
5505 case Intrinsic::x86_avx2_psrl_q:
5506 case Intrinsic::x86_avx2_psra_w:
5507 case Intrinsic::x86_avx2_psra_d:
5508 case Intrinsic::x86_avx2_psrli_w:
5509 case Intrinsic::x86_avx2_psrli_d:
5510 case Intrinsic::x86_avx2_psrli_q:
5511 case Intrinsic::x86_avx2_psrai_w:
5512 case Intrinsic::x86_avx2_psrai_d:
5513 case Intrinsic::x86_sse2_psll_w:
5514 case Intrinsic::x86_sse2_psll_d:
5515 case Intrinsic::x86_sse2_psll_q:
5516 case Intrinsic::x86_sse2_pslli_w:
5517 case Intrinsic::x86_sse2_pslli_d:
5518 case Intrinsic::x86_sse2_pslli_q:
5519 case Intrinsic::x86_sse2_psrl_w:
5520 case Intrinsic::x86_sse2_psrl_d:
5521 case Intrinsic::x86_sse2_psrl_q:
5522 case Intrinsic::x86_sse2_psra_w:
5523 case Intrinsic::x86_sse2_psra_d:
5524 case Intrinsic::x86_sse2_psrli_w:
5525 case Intrinsic::x86_sse2_psrli_d:
5526 case Intrinsic::x86_sse2_psrli_q:
5527 case Intrinsic::x86_sse2_psrai_w:
5528 case Intrinsic::x86_sse2_psrai_d:
5529 case Intrinsic::x86_mmx_psll_w:
5530 case Intrinsic::x86_mmx_psll_d:
5531 case Intrinsic::x86_mmx_psll_q:
5532 case Intrinsic::x86_mmx_pslli_w:
5533 case Intrinsic::x86_mmx_pslli_d:
5534 case Intrinsic::x86_mmx_pslli_q:
5535 case Intrinsic::x86_mmx_psrl_w:
5536 case Intrinsic::x86_mmx_psrl_d:
5537 case Intrinsic::x86_mmx_psrl_q:
5538 case Intrinsic::x86_mmx_psra_w:
5539 case Intrinsic::x86_mmx_psra_d:
5540 case Intrinsic::x86_mmx_psrli_w:
5541 case Intrinsic::x86_mmx_psrli_d:
5542 case Intrinsic::x86_mmx_psrli_q:
5543 case Intrinsic::x86_mmx_psrai_w:
5544 case Intrinsic::x86_mmx_psrai_d:
5545 handleVectorShiftIntrinsic(I, /* Variable */ false);
5546 break;
5547 case Intrinsic::x86_avx2_psllv_d:
5548 case Intrinsic::x86_avx2_psllv_d_256:
5549 case Intrinsic::x86_avx512_psllv_d_512:
5550 case Intrinsic::x86_avx2_psllv_q:
5551 case Intrinsic::x86_avx2_psllv_q_256:
5552 case Intrinsic::x86_avx512_psllv_q_512:
5553 case Intrinsic::x86_avx2_psrlv_d:
5554 case Intrinsic::x86_avx2_psrlv_d_256:
5555 case Intrinsic::x86_avx512_psrlv_d_512:
5556 case Intrinsic::x86_avx2_psrlv_q:
5557 case Intrinsic::x86_avx2_psrlv_q_256:
5558 case Intrinsic::x86_avx512_psrlv_q_512:
5559 case Intrinsic::x86_avx2_psrav_d:
5560 case Intrinsic::x86_avx2_psrav_d_256:
5561 case Intrinsic::x86_avx512_psrav_d_512:
5562 case Intrinsic::x86_avx512_psrav_q_128:
5563 case Intrinsic::x86_avx512_psrav_q_256:
5564 case Intrinsic::x86_avx512_psrav_q_512:
5565 handleVectorShiftIntrinsic(I, /* Variable */ true);
5566 break;
5567
5568 // Pack with Signed/Unsigned Saturation
5569 case Intrinsic::x86_sse2_packsswb_128:
5570 case Intrinsic::x86_sse2_packssdw_128:
5571 case Intrinsic::x86_sse2_packuswb_128:
5572 case Intrinsic::x86_sse41_packusdw:
5573 case Intrinsic::x86_avx2_packsswb:
5574 case Intrinsic::x86_avx2_packssdw:
5575 case Intrinsic::x86_avx2_packuswb:
5576 case Intrinsic::x86_avx2_packusdw:
5577 // e.g., <64 x i8> @llvm.x86.avx512.packsswb.512
5578 // (<32 x i16> %a, <32 x i16> %b)
5579 // <32 x i16> @llvm.x86.avx512.packssdw.512
5580 // (<16 x i32> %a, <16 x i32> %b)
5581 // Note: AVX512 masked variants are auto-upgraded by LLVM.
5582 case Intrinsic::x86_avx512_packsswb_512:
5583 case Intrinsic::x86_avx512_packssdw_512:
5584 case Intrinsic::x86_avx512_packuswb_512:
5585 case Intrinsic::x86_avx512_packusdw_512:
5586 handleVectorPackIntrinsic(I);
5587 break;
5588
5589 case Intrinsic::x86_sse41_pblendvb:
5590 case Intrinsic::x86_sse41_blendvpd:
5591 case Intrinsic::x86_sse41_blendvps:
5592 case Intrinsic::x86_avx_blendv_pd_256:
5593 case Intrinsic::x86_avx_blendv_ps_256:
5594 case Intrinsic::x86_avx2_pblendvb:
5595 handleBlendvIntrinsic(I);
5596 break;
5597
5598 case Intrinsic::x86_avx_dp_ps_256:
5599 case Intrinsic::x86_sse41_dppd:
5600 case Intrinsic::x86_sse41_dpps:
5601 handleDppIntrinsic(I);
5602 break;
5603
5604 case Intrinsic::x86_mmx_packsswb:
5605 case Intrinsic::x86_mmx_packuswb:
5606 handleVectorPackIntrinsic(I, 16);
5607 break;
5608
5609 case Intrinsic::x86_mmx_packssdw:
5610 handleVectorPackIntrinsic(I, 32);
5611 break;
5612
5613 case Intrinsic::x86_mmx_psad_bw:
5614 handleVectorSadIntrinsic(I, true);
5615 break;
5616 case Intrinsic::x86_sse2_psad_bw:
5617 case Intrinsic::x86_avx2_psad_bw:
5618 handleVectorSadIntrinsic(I);
5619 break;
5620
5621 // Multiply and Add Packed Words
5622 // < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
5623 // < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
5624 // <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>)
5625 //
5626 // Multiply and Add Packed Signed and Unsigned Bytes
5627 // < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
5628 // <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
5629 // <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
5630 //
5631 // These intrinsics are auto-upgraded into non-masked forms:
5632 // < 4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128
5633 // (<8 x i16>, <8 x i16>, <4 x i32>, i8)
5634 // < 8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256
5635 // (<16 x i16>, <16 x i16>, <8 x i32>, i8)
5636 // <16 x i32> @llvm.x86.avx512.mask.pmaddw.d.512
5637 // (<32 x i16>, <32 x i16>, <16 x i32>, i16)
5638 // < 8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128
5639 // (<16 x i8>, <16 x i8>, <8 x i16>, i8)
5640 // <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256
5641 // (<32 x i8>, <32 x i8>, <16 x i16>, i16)
5642 // <32 x i16> @llvm.x86.avx512.mask.pmaddubs.w.512
5643 // (<64 x i8>, <64 x i8>, <32 x i16>, i32)
5644 case Intrinsic::x86_sse2_pmadd_wd:
5645 case Intrinsic::x86_avx2_pmadd_wd:
5646 case Intrinsic::x86_avx512_pmaddw_d_512:
5647 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5648 case Intrinsic::x86_avx2_pmadd_ub_sw:
5649 case Intrinsic::x86_avx512_pmaddubs_w_512:
5650 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2);
5651 break;
5652
5653 // <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
5654 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5655 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/8);
5656 break;
5657
5658 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
5659 case Intrinsic::x86_mmx_pmadd_wd:
5660 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5661 break;
5662
5663 // AVX Vector Neural Network Instructions: bytes
5664 //
5665 // Multiply and Add Packed Signed and Unsigned Bytes
5666 // < 4 x i32> @llvm.x86.avx512.vpdpbusd.128
5667 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5668 // < 8 x i32> @llvm.x86.avx512.vpdpbusd.256
5669 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5670 // <16 x i32> @llvm.x86.avx512.vpdpbusd.512
5671 // (<16 x i32>, <64 x i8>, <64 x i8>)
5672 //
5673 // Multiply and Add Unsigned and Signed Bytes With Saturation
5674 // < 4 x i32> @llvm.x86.avx512.vpdpbusds.128
5675 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5676 // < 8 x i32> @llvm.x86.avx512.vpdpbusds.256
5677 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5678 // <16 x i32> @llvm.x86.avx512.vpdpbusds.512
5679 // (<16 x i32>, <64 x i8>, <64 x i8>)
5680 //
5681 // < 4 x i32> @llvm.x86.avx2.vpdpbssd.128
5682 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5683 // < 8 x i32> @llvm.x86.avx2.vpdpbssd.256
5684 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5685 //
5686 // < 4 x i32> @llvm.x86.avx2.vpdpbssds.128
5687 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5688 // < 8 x i32> @llvm.x86.avx2.vpdpbssds.256
5689 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5690 //
5691 // <16 x i32> @llvm.x86.avx10.vpdpbssd.512
5692 // (<16 x i32>, <16 x i32>, <16 x i32>)
5693 // <16 x i32> @llvm.x86.avx10.vpdpbssds.512
5694 // (<16 x i32>, <16 x i32>, <16 x i32>)
5695 //
5696 // These intrinsics are auto-upgraded into non-masked forms:
5697 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusd.128
5698 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5699 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusd.128
5700 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5701 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusd.256
5702 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5703 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusd.256
5704 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5705 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusd.512
5706 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5707 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusd.512
5708 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5709 //
5710 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusds.128
5711 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5712 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusds.128
5713 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5714 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusds.256
5715 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5716 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusds.256
5717 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5718 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusds.512
5719 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5720 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusds.512
5721 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5722 case Intrinsic::x86_avx512_vpdpbusd_128:
5723 case Intrinsic::x86_avx512_vpdpbusd_256:
5724 case Intrinsic::x86_avx512_vpdpbusd_512:
5725 case Intrinsic::x86_avx512_vpdpbusds_128:
5726 case Intrinsic::x86_avx512_vpdpbusds_256:
5727 case Intrinsic::x86_avx512_vpdpbusds_512:
5728 case Intrinsic::x86_avx2_vpdpbssd_128:
5729 case Intrinsic::x86_avx2_vpdpbssd_256:
5730 case Intrinsic::x86_avx2_vpdpbssds_128:
5731 case Intrinsic::x86_avx2_vpdpbssds_256:
5732 case Intrinsic::x86_avx10_vpdpbssd_512:
5733 case Intrinsic::x86_avx10_vpdpbssds_512:
5734 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/4, /*EltSize=*/8);
5735 break;
5736
5737 // AVX Vector Neural Network Instructions: words
5738 //
5739 // Multiply and Add Signed Word Integers
5740 // < 4 x i32> @llvm.x86.avx512.vpdpwssd.128
5741 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5742 // < 8 x i32> @llvm.x86.avx512.vpdpwssd.256
5743 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5744 // <16 x i32> @llvm.x86.avx512.vpdpwssd.512
5745 // (<16 x i32>, <16 x i32>, <16 x i32>)
5746 //
5747 // Multiply and Add Signed Word Integers With Saturation
5748 // < 4 x i32> @llvm.x86.avx512.vpdpwssds.128
5749 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5750 // < 8 x i32> @llvm.x86.avx512.vpdpwssds.256
5751 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5752 // <16 x i32> @llvm.x86.avx512.vpdpwssds.512
5753 // (<16 x i32>, <16 x i32>, <16 x i32>)
5754 //
5755 // These intrinsics are auto-upgraded into non-masked forms:
5756 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssd.128
5757 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5758 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssd.128
5759 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5760 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssd.256
5761 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5762 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssd.256
5763 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5764 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssd.512
5765 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5766 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssd.512
5767 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5768 //
5769 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssds.128
5770 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5771 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssds.128
5772 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5773 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssds.256
5774 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5775 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssds.256
5776 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5777 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssds.512
5778 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5779 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssds.512
5780 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5781 case Intrinsic::x86_avx512_vpdpwssd_128:
5782 case Intrinsic::x86_avx512_vpdpwssd_256:
5783 case Intrinsic::x86_avx512_vpdpwssd_512:
5784 case Intrinsic::x86_avx512_vpdpwssds_128:
5785 case Intrinsic::x86_avx512_vpdpwssds_256:
5786 case Intrinsic::x86_avx512_vpdpwssds_512:
5787 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5788 break;
5789
5790 // TODO: Dot Product of BF16 Pairs Accumulated Into Packed Single
5791 // Precision
5792 // <4 x float> @llvm.x86.avx512bf16.dpbf16ps.128
5793 // (<4 x float>, <8 x bfloat>, <8 x bfloat>)
5794 // <8 x float> @llvm.x86.avx512bf16.dpbf16ps.256
5795 // (<8 x float>, <16 x bfloat>, <16 x bfloat>)
5796 // <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512
5797 // (<16 x float>, <32 x bfloat>, <32 x bfloat>)
5798 // handleVectorPmaddIntrinsic() currently only handles integer types.
5799
5800 case Intrinsic::x86_sse_cmp_ss:
5801 case Intrinsic::x86_sse2_cmp_sd:
5802 case Intrinsic::x86_sse_comieq_ss:
5803 case Intrinsic::x86_sse_comilt_ss:
5804 case Intrinsic::x86_sse_comile_ss:
5805 case Intrinsic::x86_sse_comigt_ss:
5806 case Intrinsic::x86_sse_comige_ss:
5807 case Intrinsic::x86_sse_comineq_ss:
5808 case Intrinsic::x86_sse_ucomieq_ss:
5809 case Intrinsic::x86_sse_ucomilt_ss:
5810 case Intrinsic::x86_sse_ucomile_ss:
5811 case Intrinsic::x86_sse_ucomigt_ss:
5812 case Intrinsic::x86_sse_ucomige_ss:
5813 case Intrinsic::x86_sse_ucomineq_ss:
5814 case Intrinsic::x86_sse2_comieq_sd:
5815 case Intrinsic::x86_sse2_comilt_sd:
5816 case Intrinsic::x86_sse2_comile_sd:
5817 case Intrinsic::x86_sse2_comigt_sd:
5818 case Intrinsic::x86_sse2_comige_sd:
5819 case Intrinsic::x86_sse2_comineq_sd:
5820 case Intrinsic::x86_sse2_ucomieq_sd:
5821 case Intrinsic::x86_sse2_ucomilt_sd:
5822 case Intrinsic::x86_sse2_ucomile_sd:
5823 case Intrinsic::x86_sse2_ucomigt_sd:
5824 case Intrinsic::x86_sse2_ucomige_sd:
5825 case Intrinsic::x86_sse2_ucomineq_sd:
5826 handleVectorCompareScalarIntrinsic(I);
5827 break;
5828
5829 case Intrinsic::x86_avx_cmp_pd_256:
5830 case Intrinsic::x86_avx_cmp_ps_256:
5831 case Intrinsic::x86_sse2_cmp_pd:
5832 case Intrinsic::x86_sse_cmp_ps:
5833 handleVectorComparePackedIntrinsic(I);
5834 break;
5835
5836 case Intrinsic::x86_bmi_bextr_32:
5837 case Intrinsic::x86_bmi_bextr_64:
5838 case Intrinsic::x86_bmi_bzhi_32:
5839 case Intrinsic::x86_bmi_bzhi_64:
5840 case Intrinsic::x86_bmi_pdep_32:
5841 case Intrinsic::x86_bmi_pdep_64:
5842 case Intrinsic::x86_bmi_pext_32:
5843 case Intrinsic::x86_bmi_pext_64:
5844 handleBmiIntrinsic(I);
5845 break;
5846
5847 case Intrinsic::x86_pclmulqdq:
5848 case Intrinsic::x86_pclmulqdq_256:
5849 case Intrinsic::x86_pclmulqdq_512:
5850 handlePclmulIntrinsic(I);
5851 break;
5852
5853 case Intrinsic::x86_avx_round_pd_256:
5854 case Intrinsic::x86_avx_round_ps_256:
5855 case Intrinsic::x86_sse41_round_pd:
5856 case Intrinsic::x86_sse41_round_ps:
5857 handleRoundPdPsIntrinsic(I);
5858 break;
5859
5860 case Intrinsic::x86_sse41_round_sd:
5861 case Intrinsic::x86_sse41_round_ss:
5862 handleUnarySdSsIntrinsic(I);
5863 break;
5864
5865 case Intrinsic::x86_sse2_max_sd:
5866 case Intrinsic::x86_sse_max_ss:
5867 case Intrinsic::x86_sse2_min_sd:
5868 case Intrinsic::x86_sse_min_ss:
5869 handleBinarySdSsIntrinsic(I);
5870 break;
5871
5872 case Intrinsic::x86_avx_vtestc_pd:
5873 case Intrinsic::x86_avx_vtestc_pd_256:
5874 case Intrinsic::x86_avx_vtestc_ps:
5875 case Intrinsic::x86_avx_vtestc_ps_256:
5876 case Intrinsic::x86_avx_vtestnzc_pd:
5877 case Intrinsic::x86_avx_vtestnzc_pd_256:
5878 case Intrinsic::x86_avx_vtestnzc_ps:
5879 case Intrinsic::x86_avx_vtestnzc_ps_256:
5880 case Intrinsic::x86_avx_vtestz_pd:
5881 case Intrinsic::x86_avx_vtestz_pd_256:
5882 case Intrinsic::x86_avx_vtestz_ps:
5883 case Intrinsic::x86_avx_vtestz_ps_256:
5884 case Intrinsic::x86_avx_ptestc_256:
5885 case Intrinsic::x86_avx_ptestnzc_256:
5886 case Intrinsic::x86_avx_ptestz_256:
5887 case Intrinsic::x86_sse41_ptestc:
5888 case Intrinsic::x86_sse41_ptestnzc:
5889 case Intrinsic::x86_sse41_ptestz:
5890 handleVtestIntrinsic(I);
5891 break;
5892
5893 // Packed Horizontal Add/Subtract
5894 case Intrinsic::x86_ssse3_phadd_w:
5895 case Intrinsic::x86_ssse3_phadd_w_128:
5896 case Intrinsic::x86_avx2_phadd_w:
5897 case Intrinsic::x86_ssse3_phsub_w:
5898 case Intrinsic::x86_ssse3_phsub_w_128:
5899 case Intrinsic::x86_avx2_phsub_w: {
5900 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5901 break;
5902 }
5903
5904 // Packed Horizontal Add/Subtract
5905 case Intrinsic::x86_ssse3_phadd_d:
5906 case Intrinsic::x86_ssse3_phadd_d_128:
5907 case Intrinsic::x86_avx2_phadd_d:
5908 case Intrinsic::x86_ssse3_phsub_d:
5909 case Intrinsic::x86_ssse3_phsub_d_128:
5910 case Intrinsic::x86_avx2_phsub_d: {
5911 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
5912 break;
5913 }
5914
5915 // Packed Horizontal Add/Subtract and Saturate
5916 case Intrinsic::x86_ssse3_phadd_sw:
5917 case Intrinsic::x86_ssse3_phadd_sw_128:
5918 case Intrinsic::x86_avx2_phadd_sw:
5919 case Intrinsic::x86_ssse3_phsub_sw:
5920 case Intrinsic::x86_ssse3_phsub_sw_128:
5921 case Intrinsic::x86_avx2_phsub_sw: {
5922 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5923 break;
5924 }
5925
5926 // Packed Single/Double Precision Floating-Point Horizontal Add
5927 case Intrinsic::x86_sse3_hadd_ps:
5928 case Intrinsic::x86_sse3_hadd_pd:
5929 case Intrinsic::x86_avx_hadd_pd_256:
5930 case Intrinsic::x86_avx_hadd_ps_256:
5931 case Intrinsic::x86_sse3_hsub_ps:
5932 case Intrinsic::x86_sse3_hsub_pd:
5933 case Intrinsic::x86_avx_hsub_pd_256:
5934 case Intrinsic::x86_avx_hsub_ps_256: {
5935 handlePairwiseShadowOrIntrinsic(I);
5936 break;
5937 }
5938
5939 case Intrinsic::x86_avx_maskstore_ps:
5940 case Intrinsic::x86_avx_maskstore_pd:
5941 case Intrinsic::x86_avx_maskstore_ps_256:
5942 case Intrinsic::x86_avx_maskstore_pd_256:
5943 case Intrinsic::x86_avx2_maskstore_d:
5944 case Intrinsic::x86_avx2_maskstore_q:
5945 case Intrinsic::x86_avx2_maskstore_d_256:
5946 case Intrinsic::x86_avx2_maskstore_q_256: {
5947 handleAVXMaskedStore(I);
5948 break;
5949 }
5950
5951 case Intrinsic::x86_avx_maskload_ps:
5952 case Intrinsic::x86_avx_maskload_pd:
5953 case Intrinsic::x86_avx_maskload_ps_256:
5954 case Intrinsic::x86_avx_maskload_pd_256:
5955 case Intrinsic::x86_avx2_maskload_d:
5956 case Intrinsic::x86_avx2_maskload_q:
5957 case Intrinsic::x86_avx2_maskload_d_256:
5958 case Intrinsic::x86_avx2_maskload_q_256: {
5959 handleAVXMaskedLoad(I);
5960 break;
5961 }
5962
5963     // Packed FP arithmetic (these AVX512 variants carry a trailing rounding-mode flag)
5964 case Intrinsic::x86_avx512fp16_add_ph_512:
5965 case Intrinsic::x86_avx512fp16_sub_ph_512:
5966 case Intrinsic::x86_avx512fp16_mul_ph_512:
5967 case Intrinsic::x86_avx512fp16_div_ph_512:
5968 case Intrinsic::x86_avx512fp16_max_ph_512:
5969 case Intrinsic::x86_avx512fp16_min_ph_512:
5970 case Intrinsic::x86_avx512_min_ps_512:
5971 case Intrinsic::x86_avx512_min_pd_512:
5972 case Intrinsic::x86_avx512_max_ps_512:
5973 case Intrinsic::x86_avx512_max_pd_512: {
5974 // These AVX512 variants contain the rounding mode as a trailing flag.
5975 // Earlier variants do not have a trailing flag and are already handled
5976 // by maybeHandleSimpleNomemIntrinsic(I, 0) via
5977 // maybeHandleUnknownIntrinsic.
5978 [[maybe_unused]] bool Success =
5979 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
5980 assert(Success);
5981 break;
5982 }
5983
5984 case Intrinsic::x86_avx_vpermilvar_pd:
5985 case Intrinsic::x86_avx_vpermilvar_pd_256:
5986 case Intrinsic::x86_avx512_vpermilvar_pd_512:
5987 case Intrinsic::x86_avx_vpermilvar_ps:
5988 case Intrinsic::x86_avx_vpermilvar_ps_256:
5989 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
5990 handleAVXVpermilvar(I);
5991 break;
5992 }
5993
5994 case Intrinsic::x86_avx512_vpermi2var_d_128:
5995 case Intrinsic::x86_avx512_vpermi2var_d_256:
5996 case Intrinsic::x86_avx512_vpermi2var_d_512:
5997 case Intrinsic::x86_avx512_vpermi2var_hi_128:
5998 case Intrinsic::x86_avx512_vpermi2var_hi_256:
5999 case Intrinsic::x86_avx512_vpermi2var_hi_512:
6000 case Intrinsic::x86_avx512_vpermi2var_pd_128:
6001 case Intrinsic::x86_avx512_vpermi2var_pd_256:
6002 case Intrinsic::x86_avx512_vpermi2var_pd_512:
6003 case Intrinsic::x86_avx512_vpermi2var_ps_128:
6004 case Intrinsic::x86_avx512_vpermi2var_ps_256:
6005 case Intrinsic::x86_avx512_vpermi2var_ps_512:
6006 case Intrinsic::x86_avx512_vpermi2var_q_128:
6007 case Intrinsic::x86_avx512_vpermi2var_q_256:
6008 case Intrinsic::x86_avx512_vpermi2var_q_512:
6009 case Intrinsic::x86_avx512_vpermi2var_qi_128:
6010 case Intrinsic::x86_avx512_vpermi2var_qi_256:
6011 case Intrinsic::x86_avx512_vpermi2var_qi_512:
6012 handleAVXVpermi2var(I);
6013 break;
6014
6015 // Packed Shuffle
6016 // llvm.x86.sse.pshuf.w(<1 x i64>, i8)
6017 // llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
6018 // llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
6019 // llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
6020 // llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
6021 //
6022 // The following intrinsics are auto-upgraded:
6023 // llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
6024     //   llvm.x86.sse2.pshufh.w(<8 x i16>, i8)
6025 // llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
6026 case Intrinsic::x86_avx2_pshuf_b:
6027 case Intrinsic::x86_sse_pshuf_w:
6028 case Intrinsic::x86_ssse3_pshuf_b_128:
6029 case Intrinsic::x86_ssse3_pshuf_b:
6030 case Intrinsic::x86_avx512_pshuf_b_512:
6031 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6032 /*trailingVerbatimArgs=*/1);
6033 break;
6034
6035 // AVX512 PMOV: Packed MOV, with truncation
6036 // Precisely handled by applying the same intrinsic to the shadow
6037 case Intrinsic::x86_avx512_mask_pmov_dw_512:
6038 case Intrinsic::x86_avx512_mask_pmov_db_512:
6039 case Intrinsic::x86_avx512_mask_pmov_qb_512:
6040 case Intrinsic::x86_avx512_mask_pmov_qw_512: {
6041 // Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
6042 // f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
6043 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6044 /*trailingVerbatimArgs=*/1);
6045 break;
6046 }
6047
6048     // AVX512 PMOV{S,US}: Packed MOV, with signed/unsigned saturation
6049 // Approximately handled using the corresponding truncation intrinsic
6050 // TODO: improve handleAVX512VectorDownConvert to precisely model saturation
6051 case Intrinsic::x86_avx512_mask_pmovs_dw_512:
6052 case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
6053 handleIntrinsicByApplyingToShadow(I,
6054 Intrinsic::x86_avx512_mask_pmov_dw_512,
6055 /* trailingVerbatimArgs=*/1);
6056 break;
6057 }
6058
6059 case Intrinsic::x86_avx512_mask_pmovs_db_512:
6060 case Intrinsic::x86_avx512_mask_pmovus_db_512: {
6061 handleIntrinsicByApplyingToShadow(I,
6062 Intrinsic::x86_avx512_mask_pmov_db_512,
6063 /* trailingVerbatimArgs=*/1);
6064 break;
6065 }
6066
6067 case Intrinsic::x86_avx512_mask_pmovs_qb_512:
6068 case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
6069 handleIntrinsicByApplyingToShadow(I,
6070 Intrinsic::x86_avx512_mask_pmov_qb_512,
6071 /*trailingVerbatimArgs=*/1);
6072 break;
6073 }
6074
6075 case Intrinsic::x86_avx512_mask_pmovs_qw_512:
6076 case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
6077 handleIntrinsicByApplyingToShadow(I,
6078 Intrinsic::x86_avx512_mask_pmov_qw_512,
6079 /*trailingVerbatimArgs=*/1);
6080 break;
6081 }
6082
6083 case Intrinsic::x86_avx512_mask_pmovs_qd_512:
6084 case Intrinsic::x86_avx512_mask_pmovus_qd_512:
6085 case Intrinsic::x86_avx512_mask_pmovs_wb_512:
6086 case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
6087 // Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
6088 // cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
6089 // slow-path handler.
6090 handleAVX512VectorDownConvert(I);
6091 break;
6092 }
6093
6094 // AVX512 FP16 Arithmetic
6095 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
6096 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
6097 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
6098 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
6099 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
6100 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
6101 visitGenericScalarHalfwordInst(I);
6102 break;
6103 }
6104
6105 // AVX Galois Field New Instructions
6106 case Intrinsic::x86_vgf2p8affineqb_128:
6107 case Intrinsic::x86_vgf2p8affineqb_256:
6108 case Intrinsic::x86_vgf2p8affineqb_512:
6109 handleAVXGF2P8Affine(I);
6110 break;
6111
6112 default:
6113 return false;
6114 }
6115
6116 return true;
6117 }
6118
6119 bool maybeHandleArmSIMDIntrinsic(IntrinsicInst &I) {
6120 switch (I.getIntrinsicID()) {
6121 case Intrinsic::aarch64_neon_rshrn:
6122 case Intrinsic::aarch64_neon_sqrshl:
6123 case Intrinsic::aarch64_neon_sqrshrn:
6124 case Intrinsic::aarch64_neon_sqrshrun:
6125 case Intrinsic::aarch64_neon_sqshl:
6126 case Intrinsic::aarch64_neon_sqshlu:
6127 case Intrinsic::aarch64_neon_sqshrn:
6128 case Intrinsic::aarch64_neon_sqshrun:
6129 case Intrinsic::aarch64_neon_srshl:
6130 case Intrinsic::aarch64_neon_sshl:
6131 case Intrinsic::aarch64_neon_uqrshl:
6132 case Intrinsic::aarch64_neon_uqrshrn:
6133 case Intrinsic::aarch64_neon_uqshl:
6134 case Intrinsic::aarch64_neon_uqshrn:
6135 case Intrinsic::aarch64_neon_urshl:
6136 case Intrinsic::aarch64_neon_ushl:
6137 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
6138 handleVectorShiftIntrinsic(I, /* Variable */ false);
6139 break;
6140
6141 // TODO: handling max/min similarly to AND/OR may be more precise
6142 // Floating-Point Maximum/Minimum Pairwise
6143 case Intrinsic::aarch64_neon_fmaxp:
6144 case Intrinsic::aarch64_neon_fminp:
6145 // Floating-Point Maximum/Minimum Number Pairwise
6146 case Intrinsic::aarch64_neon_fmaxnmp:
6147 case Intrinsic::aarch64_neon_fminnmp:
6148 // Signed/Unsigned Maximum/Minimum Pairwise
6149 case Intrinsic::aarch64_neon_smaxp:
6150 case Intrinsic::aarch64_neon_sminp:
6151 case Intrinsic::aarch64_neon_umaxp:
6152 case Intrinsic::aarch64_neon_uminp:
6153 // Add Pairwise
6154 case Intrinsic::aarch64_neon_addp:
6155 // Floating-point Add Pairwise
6156 case Intrinsic::aarch64_neon_faddp:
6157 // Add Long Pairwise
6158 case Intrinsic::aarch64_neon_saddlp:
6159 case Intrinsic::aarch64_neon_uaddlp: {
6160 handlePairwiseShadowOrIntrinsic(I);
6161 break;
6162 }
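// As a rough sketch of the pairwise handling above: for e.g. addp(a, b),
// each output lane is computed from two adjacent input lanes, so its shadow
// is taken to be the OR of the shadows of those two lanes, e.g.
// shadow(result[0]) = shadow(a[0]) | shadow(a[1]).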
6163
6164 // Floating-point Convert to integer, rounding to nearest with ties to Away
6165 case Intrinsic::aarch64_neon_fcvtas:
6166 case Intrinsic::aarch64_neon_fcvtau:
6167 // Floating-point convert to integer, rounding toward minus infinity
6168 case Intrinsic::aarch64_neon_fcvtms:
6169 case Intrinsic::aarch64_neon_fcvtmu:
6170 // Floating-point convert to integer, rounding to nearest with ties to even
6171 case Intrinsic::aarch64_neon_fcvtns:
6172 case Intrinsic::aarch64_neon_fcvtnu:
6173 // Floating-point convert to integer, rounding toward plus infinity
6174 case Intrinsic::aarch64_neon_fcvtps:
6175 case Intrinsic::aarch64_neon_fcvtpu:
6176 // Floating-point Convert to integer, rounding toward Zero
6177 case Intrinsic::aarch64_neon_fcvtzs:
6178 case Intrinsic::aarch64_neon_fcvtzu:
6179 // Floating-point convert to lower precision narrow, rounding to odd
6180 case Intrinsic::aarch64_neon_fcvtxn: {
6181 handleNEONVectorConvertIntrinsic(I);
6182 break;
6183 }
6184
6185 // Add reduction to scalar
6186 case Intrinsic::aarch64_neon_faddv:
6187 case Intrinsic::aarch64_neon_saddv:
6188 case Intrinsic::aarch64_neon_uaddv:
6189 // Signed/Unsigned min/max (Vector)
6190 // TODO: handling similarly to AND/OR may be more precise.
6191 case Intrinsic::aarch64_neon_smaxv:
6192 case Intrinsic::aarch64_neon_sminv:
6193 case Intrinsic::aarch64_neon_umaxv:
6194 case Intrinsic::aarch64_neon_uminv:
6195 // Floating-point min/max (vector)
6196 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
6197 // but our shadow propagation is the same.
6198 case Intrinsic::aarch64_neon_fmaxv:
6199 case Intrinsic::aarch64_neon_fminv:
6200 case Intrinsic::aarch64_neon_fmaxnmv:
6201 case Intrinsic::aarch64_neon_fminnmv:
6202 // Sum long across vector
6203 case Intrinsic::aarch64_neon_saddlv:
6204 case Intrinsic::aarch64_neon_uaddlv:
6205 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
6206 break;
6207
6208 case Intrinsic::aarch64_neon_ld1x2:
6209 case Intrinsic::aarch64_neon_ld1x3:
6210 case Intrinsic::aarch64_neon_ld1x4:
6211 case Intrinsic::aarch64_neon_ld2:
6212 case Intrinsic::aarch64_neon_ld3:
6213 case Intrinsic::aarch64_neon_ld4:
6214 case Intrinsic::aarch64_neon_ld2r:
6215 case Intrinsic::aarch64_neon_ld3r:
6216 case Intrinsic::aarch64_neon_ld4r: {
6217 handleNEONVectorLoad(I, /*WithLane=*/false);
6218 break;
6219 }
6220
6221 case Intrinsic::aarch64_neon_ld2lane:
6222 case Intrinsic::aarch64_neon_ld3lane:
6223 case Intrinsic::aarch64_neon_ld4lane: {
6224 handleNEONVectorLoad(I, /*WithLane=*/true);
6225 break;
6226 }
6227
6228 // Saturating extract narrow
6229 case Intrinsic::aarch64_neon_sqxtn:
6230 case Intrinsic::aarch64_neon_sqxtun:
6231 case Intrinsic::aarch64_neon_uqxtn:
6232 // These only have one argument, but we (ab)use handleShadowOr because it
6233 // does work on single argument intrinsics and will typecast the shadow
6234 // (and update the origin).
6235 handleShadowOr(I);
6236 break;
6237
6238 case Intrinsic::aarch64_neon_st1x2:
6239 case Intrinsic::aarch64_neon_st1x3:
6240 case Intrinsic::aarch64_neon_st1x4:
6241 case Intrinsic::aarch64_neon_st2:
6242 case Intrinsic::aarch64_neon_st3:
6243 case Intrinsic::aarch64_neon_st4: {
6244 handleNEONVectorStoreIntrinsic(I, false);
6245 break;
6246 }
6247
6248 case Intrinsic::aarch64_neon_st2lane:
6249 case Intrinsic::aarch64_neon_st3lane:
6250 case Intrinsic::aarch64_neon_st4lane: {
6251 handleNEONVectorStoreIntrinsic(I, true);
6252 break;
6253 }
6254
6255 // Arm NEON vector table intrinsics have the source/table register(s) as
6256 // arguments, followed by the index register. They return the output.
6257 //
6258 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
6259 // original value unchanged in the destination register.'
6260 // Conveniently, zero denotes a clean shadow, which means out-of-range
6261 // indices for TBL will initialize the user data with zero and also clean
6262 // the shadow. (For TBX, neither the user data nor the shadow will be
6263 // updated, which is also correct.)
6264 case Intrinsic::aarch64_neon_tbl1:
6265 case Intrinsic::aarch64_neon_tbl2:
6266 case Intrinsic::aarch64_neon_tbl3:
6267 case Intrinsic::aarch64_neon_tbl4:
6268 case Intrinsic::aarch64_neon_tbx1:
6269 case Intrinsic::aarch64_neon_tbx2:
6270 case Intrinsic::aarch64_neon_tbx3:
6271 case Intrinsic::aarch64_neon_tbx4: {
6272 // The last trailing argument (index register) should be handled verbatim
6273 handleIntrinsicByApplyingToShadow(
6274 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
6275 /*trailingVerbatimArgs*/ 1);
6276 break;
6277 }
6278
6279 case Intrinsic::aarch64_neon_fmulx:
6280 case Intrinsic::aarch64_neon_pmul:
6281 case Intrinsic::aarch64_neon_pmull:
6282 case Intrinsic::aarch64_neon_smull:
6283 case Intrinsic::aarch64_neon_pmull64:
6284 case Intrinsic::aarch64_neon_umull: {
6285 handleNEONVectorMultiplyIntrinsic(I);
6286 break;
6287 }
6288
6289 default:
6290 return false;
6291 }
6292
6293 return true;
6294 }
6295
6296 void visitIntrinsicInst(IntrinsicInst &I) {
6297 if (maybeHandleCrossPlatformIntrinsic(I))
6298 return;
6299
6300 if (maybeHandleX86SIMDIntrinsic(I))
6301 return;
6302
6303 if (maybeHandleArmSIMDIntrinsic(I))
6304 return;
6305
6306 if (maybeHandleUnknownIntrinsic(I))
6307 return;
6308
6309 visitInstruction(I);
6310 }
6311
6312 void visitLibAtomicLoad(CallBase &CB) {
6313 // Since we use getNextNode here, we can't have CB terminate the BB.
6314 assert(isa<CallInst>(CB));
6315
6316 IRBuilder<> IRB(&CB);
6317 Value *Size = CB.getArgOperand(0);
6318 Value *SrcPtr = CB.getArgOperand(1);
6319 Value *DstPtr = CB.getArgOperand(2);
6320 Value *Ordering = CB.getArgOperand(3);
6321 // Convert the call to have at least Acquire ordering to make sure
6322 // the shadow operations aren't reordered before it.
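// Sketch of the table trick used below, assuming the usual __ATOMIC_*
// encoding (relaxed=0 ... seq_cst=5): makeAddAcquireOrderingTable builds a
// constant vector mapping every memory order to one at least as strong as
// acquire (e.g. relaxed -> acquire, release -> acq_rel), and the
// extractelement picks the entry selected by the dynamic Ordering value.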
6323 Value *NewOrdering =
6324 IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
6325 CB.setArgOperand(3, NewOrdering);
6326
6327 NextNodeIRBuilder NextIRB(&CB);
6328 Value *SrcShadowPtr, *SrcOriginPtr;
6329 std::tie(SrcShadowPtr, SrcOriginPtr) =
6330 getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6331 /*isStore*/ false);
6332 Value *DstShadowPtr =
6333 getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6334 /*isStore*/ true)
6335 .first;
6336
6337 NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
6338 if (MS.TrackOrigins) {
6339 Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
6340 kMinOriginAlignment);
6341 Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
6342 NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
6343 }
6344 }
6345
6346 void visitLibAtomicStore(CallBase &CB) {
6347 IRBuilder<> IRB(&CB);
6348 Value *Size = CB.getArgOperand(0);
6349 Value *DstPtr = CB.getArgOperand(2);
6350 Value *Ordering = CB.getArgOperand(3);
6351 // Convert the call to have at least Release ordering to make sure
6352 // the shadow operations aren't reordered after it.
6353 Value *NewOrdering =
6354 IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
6355 CB.setArgOperand(3, NewOrdering);
6356
6357 Value *DstShadowPtr =
6358 getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
6359 /*isStore*/ true)
6360 .first;
6361
6362 // Atomic store always paints clean shadow/origin. See file header.
6363 IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
6364 Align(1));
6365 }
6366
6367 void visitCallBase(CallBase &CB) {
6368 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
6369 if (CB.isInlineAsm()) {
6370 // For inline asm (either a call to asm function, or callbr instruction),
6371 // do the usual thing: check argument shadow and mark all outputs as
6372 // clean. Note that any side effects of the inline asm that are not
6373 // immediately visible in its constraints are not handled.
6374 if (ClHandleAsmConservative)
6375 visitAsmInstruction(CB);
6376 else
6377 visitInstruction(CB);
6378 return;
6379 }
6380 LibFunc LF;
6381 if (TLI->getLibFunc(CB, LF)) {
6382 // libatomic.a functions need to have special handling because there isn't
6383 // a good way to intercept them or compile the library with
6384 // instrumentation.
6385 switch (LF) {
6386 case LibFunc_atomic_load:
6387 if (!isa<CallInst>(CB)) {
6388 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
6389 "Ignoring!\n";
6390 break;
6391 }
6392 visitLibAtomicLoad(CB);
6393 return;
6394 case LibFunc_atomic_store:
6395 visitLibAtomicStore(CB);
6396 return;
6397 default:
6398 break;
6399 }
6400 }
6401
6402 if (auto *Call = dyn_cast<CallInst>(&CB)) {
6403 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
6404
6405 // We are going to insert code that relies on the fact that the callee
6406 // will become a non-readonly function after it is instrumented by us. To
6407 // prevent this code from being optimized out, mark that function
6408 // non-readonly in advance.
6409 // TODO: We can likely do better than dropping memory() completely here.
6410 AttributeMask B;
6411 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
6412
6413 Call->removeFnAttrs(B);
6414 if (Function *Func = Call->getCalledFunction()) {
6415 Func->removeFnAttrs(B);
6416 }
6417
6418 maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
6419 }
6420 IRBuilder<> IRB(&CB);
6421 bool MayCheckCall = MS.EagerChecks;
6422 if (Function *Func = CB.getCalledFunction()) {
6423 // __sanitizer_unaligned_{load,store} functions may be called by users
6424 // and always expect shadows in the TLS. So don't check them.
6425 MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
6426 }
6427
6428 unsigned ArgOffset = 0;
6429 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
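// A sketch of the __msan_param_tls layout built by the loop below, assuming
// the usual 8-byte kShadowTLSAlignment: each argument's shadow is stored at
// ArgOffset, which is then rounded up to the next multiple of
// kShadowTLSAlignment. E.g. for foo(i32 a, double b), shadow(a) goes to
// offset 0 and shadow(b) to offset 8. Once kParamTLSSize would be exceeded,
// the remaining argument shadows are not stored at all.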
6430 for (const auto &[i, A] : llvm::enumerate(CB.args())) {
6431 if (!A->getType()->isSized()) {
6432 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
6433 continue;
6434 }
6435
6436 if (A->getType()->isScalableTy()) {
6437 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
6438 // Handle as noundef, but don't reserve tls slots.
6439 insertCheckShadowOf(A, &CB);
6440 continue;
6441 }
6442
6443 unsigned Size = 0;
6444 const DataLayout &DL = F.getDataLayout();
6445
6446 bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
6447 bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
6448 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
6449
6450 if (EagerCheck) {
6451 insertCheckShadowOf(A, &CB);
6452 Size = DL.getTypeAllocSize(A->getType());
6453 } else {
6454 [[maybe_unused]] Value *Store = nullptr;
6455 // Compute the Shadow for arg even if it is ByVal, because
6456 // in that case getShadow() will copy the actual arg shadow to
6457 // __msan_param_tls.
6458 Value *ArgShadow = getShadow(A);
6459 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
6460 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
6461 << " Shadow: " << *ArgShadow << "\n");
6462 if (ByVal) {
6463 // ByVal requires some special handling as it's too big for a single
6464 // load
6465 assert(A->getType()->isPointerTy() &&
6466 "ByVal argument is not a pointer!");
6467 Size = DL.getTypeAllocSize(CB.getParamByValType(i));
6468 if (ArgOffset + Size > kParamTLSSize)
6469 break;
6470 const MaybeAlign ParamAlignment(CB.getParamAlign(i));
6471 MaybeAlign Alignment = std::nullopt;
6472 if (ParamAlignment)
6473 Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
6474 Value *AShadowPtr, *AOriginPtr;
6475 std::tie(AShadowPtr, AOriginPtr) =
6476 getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
6477 /*isStore*/ false);
6478 if (!PropagateShadow) {
6479 Store = IRB.CreateMemSet(ArgShadowBase,
6480 Constant::getNullValue(IRB.getInt8Ty()),
6481 Size, Alignment);
6482 } else {
6483 Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
6484 Alignment, Size);
6485 if (MS.TrackOrigins) {
6486 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
6487 // FIXME: OriginSize should be:
6488 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
6489 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
6490 IRB.CreateMemCpy(
6491 ArgOriginBase,
6492 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
6493 AOriginPtr,
6494 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
6495 }
6496 }
6497 } else {
6498 // Any other parameters mean we need bit-grained tracking of uninit
6499 // data
6500 Size = DL.getTypeAllocSize(A->getType());
6501 if (ArgOffset + Size > kParamTLSSize)
6502 break;
6503 Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
6504 kShadowTLSAlignment);
6505 Constant *Cst = dyn_cast<Constant>(ArgShadow);
6506 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
6507 IRB.CreateStore(getOrigin(A),
6508 getOriginPtrForArgument(IRB, ArgOffset));
6509 }
6510 }
6511 assert(Store != nullptr);
6512 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
6513 }
6514 assert(Size != 0);
6515 ArgOffset += alignTo(Size, kShadowTLSAlignment);
6516 }
6517 LLVM_DEBUG(dbgs() << " done with call args\n");
6518
6519 FunctionType *FT = CB.getFunctionType();
6520 if (FT->isVarArg()) {
6521 VAHelper->visitCallBase(CB, IRB);
6522 }
6523
6524 // Now, get the shadow for the RetVal.
6525 if (!CB.getType()->isSized())
6526 return;
6527 // Don't emit the epilogue for musttail call returns.
6528 if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
6529 return;
6530
6531 if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
6532 setShadow(&CB, getCleanShadow(&CB));
6533 setOrigin(&CB, getCleanOrigin());
6534 return;
6535 }
6536
6537 IRBuilder<> IRBBefore(&CB);
6538 // Until we have full dynamic coverage, make sure the retval shadow is 0.
6539 Value *Base = getShadowPtrForRetval(IRBBefore);
6540 IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
6541 kShadowTLSAlignment);
6542 BasicBlock::iterator NextInsn;
6543 if (isa<CallInst>(CB)) {
6544 NextInsn = ++CB.getIterator();
6545 assert(NextInsn != CB.getParent()->end());
6546 } else {
6547 BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
6548 if (!NormalDest->getSinglePredecessor()) {
6549 // FIXME: this case is tricky, so we are just conservative here.
6550 // Perhaps we need to split the edge between this BB and NormalDest,
6551 // but a naive attempt to use SplitEdge leads to a crash.
6552 setShadow(&CB, getCleanShadow(&CB));
6553 setOrigin(&CB, getCleanOrigin());
6554 return;
6555 }
6556 // FIXME: NextInsn is likely in a basic block that has not been visited
6557 // yet. Anything inserted there will be instrumented by MSan later!
6558 NextInsn = NormalDest->getFirstInsertionPt();
6559 assert(NextInsn != NormalDest->end() &&
6560 "Could not find insertion point for retval shadow load");
6561 }
6562 IRBuilder<> IRBAfter(&*NextInsn);
6563 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
6564 getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
6565 "_msret");
6566 setShadow(&CB, RetvalShadow);
6567 if (MS.TrackOrigins)
6568 setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
6569 }
6570
6571 bool isAMustTailRetVal(Value *RetVal) {
6572 if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
6573 RetVal = I->getOperand(0);
6574 }
6575 if (auto *I = dyn_cast<CallInst>(RetVal)) {
6576 return I->isMustTailCall();
6577 }
6578 return false;
6579 }
6580
6581 void visitReturnInst(ReturnInst &I) {
6582 IRBuilder<> IRB(&I);
6583 Value *RetVal = I.getReturnValue();
6584 if (!RetVal)
6585 return;
6586 // Don't emit the epilogue for musttail call returns.
6587 if (isAMustTailRetVal(RetVal))
6588 return;
6589 Value *ShadowPtr = getShadowPtrForRetval(IRB);
6590 bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
6591 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
6592 // FIXME: Consider using SpecialCaseList to specify a list of functions that
6593 // must always return fully initialized values. For now, we hardcode "main".
6594 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
6595
6596 Value *Shadow = getShadow(RetVal);
6597 bool StoreOrigin = true;
6598 if (EagerCheck) {
6599 insertCheckShadowOf(RetVal, &I);
6600 Shadow = getCleanShadow(RetVal);
6601 StoreOrigin = false;
6602 }
6603
6604 // The caller may still expect information to be passed over TLS if we
6605 // pass our check.
6606 if (StoreShadow) {
6607 IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
6608 if (MS.TrackOrigins && StoreOrigin)
6609 IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
6610 }
6611 }
6612
6613 void visitPHINode(PHINode &I) {
6614 IRBuilder<> IRB(&I);
6615 if (!PropagateShadow) {
6616 setShadow(&I, getCleanShadow(&I));
6617 setOrigin(&I, getCleanOrigin());
6618 return;
6619 }
6620
6621 ShadowPHINodes.push_back(&I);
6622 setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
6623 "_msphi_s"));
6624 if (MS.TrackOrigins)
6625 setOrigin(
6626 &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
6627 }
6628
6629 Value *getLocalVarIdptr(AllocaInst &I) {
6630 ConstantInt *IntConst =
6631 ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
6632 return new GlobalVariable(*F.getParent(), IntConst->getType(),
6633 /*isConstant=*/false, GlobalValue::PrivateLinkage,
6634 IntConst);
6635 }
6636
6637 Value *getLocalVarDescription(AllocaInst &I) {
6638 return createPrivateConstGlobalForString(*F.getParent(), I.getName());
6639 }
6640
6641 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6642 if (PoisonStack && ClPoisonStackWithCall) {
6643 IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
6644 } else {
6645 Value *ShadowBase, *OriginBase;
6646 std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
6647 &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
6648
6649 Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
6650 IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
6651 }
6652
6653 if (PoisonStack && MS.TrackOrigins) {
6654 Value *Idptr = getLocalVarIdptr(I);
6655 if (ClPrintStackNames) {
6656 Value *Descr = getLocalVarDescription(I);
6657 IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
6658 {&I, Len, Idptr, Descr});
6659 } else {
6660 IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
6661 }
6662 }
6663 }
6664
6665 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6666 Value *Descr = getLocalVarDescription(I);
6667 if (PoisonStack) {
6668 IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
6669 } else {
6670 IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
6671 }
6672 }
6673
6674 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
6675 if (!InsPoint)
6676 InsPoint = &I;
6677 NextNodeIRBuilder IRB(InsPoint);
6678 const DataLayout &DL = F.getDataLayout();
6679 TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
6680 Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
6681 if (I.isArrayAllocation())
6682 Len = IRB.CreateMul(Len,
6683 IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
6684
6685 if (MS.CompileKernel)
6686 poisonAllocaKmsan(I, IRB, Len);
6687 else
6688 poisonAllocaUserspace(I, IRB, Len);
6689 }
6690
6691 void visitAllocaInst(AllocaInst &I) {
6692 setShadow(&I, getCleanShadow(&I));
6693 setOrigin(&I, getCleanOrigin());
6694 // We'll get to this alloca later unless it's poisoned at the corresponding
6695 // llvm.lifetime.start.
6696 AllocaSet.insert(&I);
6697 }
6698
6699 void visitSelectInst(SelectInst &I) {
6700 // a = select b, c, d
6701 Value *B = I.getCondition();
6702 Value *C = I.getTrueValue();
6703 Value *D = I.getFalseValue();
6704
6705 handleSelectLikeInst(I, B, C, D);
6706 }
6707
6708 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
6709 IRBuilder<> IRB(&I);
6710
6711 Value *Sb = getShadow(B);
6712 Value *Sc = getShadow(C);
6713 Value *Sd = getShadow(D);
6714
6715 Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
6716 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
6717 Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
6718
6719 // Result shadow if condition shadow is 0.
6720 Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
6721 Value *Sa1;
6722 if (I.getType()->isAggregateType()) {
6723 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
6724 // an extra "select". This results in much more compact IR.
6725 // Sa = select Sb, poisoned, (select b, Sc, Sd)
6726 Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
6727 } else {
6728 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
6729 // If Sb (condition is poisoned), look for bits in c and d that are equal
6730 // and both unpoisoned.
6731 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
6732
6733 // Cast arguments to shadow-compatible type.
6734 C = CreateAppToShadowCast(IRB, C);
6735 D = CreateAppToShadowCast(IRB, D);
6736
6737 // Result shadow if condition shadow is 1.
6738 Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
6739 }
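// Worked example of the formula above: if the condition's shadow Sb is all
// ones (fully uninitialized condition) but c == d and both are fully
// initialized, then (c^d)|Sc|Sd == 0 and the result is reported clean - an
// uninitialized condition bit cannot change the observable value.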
6740 Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
6741 setShadow(&I, Sa);
6742 if (MS.TrackOrigins) {
6743 // Origins are always i32, so any vector conditions must be flattened.
6744 // FIXME: consider tracking vector origins for app vectors?
6745 if (B->getType()->isVectorTy()) {
6746 B = convertToBool(B, IRB);
6747 Sb = convertToBool(Sb, IRB);
6748 }
6749 // a = select b, c, d
6750 // Oa = Sb ? Ob : (b ? Oc : Od)
6751 setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
6752 }
6753 }
6754
6755 void visitLandingPadInst(LandingPadInst &I) {
6756 // Do nothing.
6757 // See https://github.com/google/sanitizers/issues/504
6758 setShadow(&I, getCleanShadow(&I));
6759 setOrigin(&I, getCleanOrigin());
6760 }
6761
6762 void visitCatchSwitchInst(CatchSwitchInst &I) {
6763 setShadow(&I, getCleanShadow(&I));
6764 setOrigin(&I, getCleanOrigin());
6765 }
6766
6767 void visitFuncletPadInst(FuncletPadInst &I) {
6768 setShadow(&I, getCleanShadow(&I));
6769 setOrigin(&I, getCleanOrigin());
6770 }
6771
6772 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
6773
6774 void visitExtractValueInst(ExtractValueInst &I) {
6775 IRBuilder<> IRB(&I);
6776 Value *Agg = I.getAggregateOperand();
6777 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
6778 Value *AggShadow = getShadow(Agg);
6779 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6780 Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
6781 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
6782 setShadow(&I, ResShadow);
6783 setOriginForNaryOp(I);
6784 }
6785
6786 void visitInsertValueInst(InsertValueInst &I) {
6787 IRBuilder<> IRB(&I);
6788 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
6789 Value *AggShadow = getShadow(I.getAggregateOperand());
6790 Value *InsShadow = getShadow(I.getInsertedValueOperand());
6791 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6792 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
6793 Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
6794 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
6795 setShadow(&I, Res);
6796 setOriginForNaryOp(I);
6797 }
6798
6799 void dumpInst(Instruction &I) {
6800 if (CallInst *CI = dyn_cast<CallInst>(&I)) {
6801 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
6802 } else {
6803 errs() << "ZZZ " << I.getOpcodeName() << "\n";
6804 }
6805 errs() << "QQQ " << I << "\n";
6806 }
6807
6808 void visitResumeInst(ResumeInst &I) {
6809 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
6810 // Nothing to do here.
6811 }
6812
6813 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
6814 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
6815 // Nothing to do here.
6816 }
6817
6818 void visitCatchReturnInst(CatchReturnInst &CRI) {
6819 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
6820 // Nothing to do here.
6821 }
6822
6823 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
6824 IRBuilder<> &IRB, const DataLayout &DL,
6825 bool isOutput) {
6826 // For each assembly argument, we check its value for being initialized.
6827 // If the argument is a pointer, we assume it points to a single element
6828 // of the corresponding type (or to an 8-byte word, if the type is unsized).
6829 // Each such pointer is instrumented with a call to the runtime library.
6830 Type *OpType = Operand->getType();
6831 // Check the operand value itself.
6832 insertCheckShadowOf(Operand, &I);
6833 if (!OpType->isPointerTy() || !isOutput) {
6834 assert(!isOutput);
6835 return;
6836 }
6837 if (!ElemTy->isSized())
6838 return;
6839 auto Size = DL.getTypeStoreSize(ElemTy);
6840 Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
6841 if (MS.CompileKernel) {
6842 IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
6843 } else {
6844 // ElemTy, derived from elementtype(), does not encode the alignment of
6845 // the pointer. Conservatively assume that the shadow memory is unaligned.
6846 // When Size is large, avoid StoreInst as it would expand to many
6847 // instructions.
6848 auto [ShadowPtr, _] =
6849 getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
6850 if (Size <= 32)
6851 IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
6852 else
6853 IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
6854 SizeVal, Align(1));
6855 }
6856 }
6857
6858 /// Get the number of output arguments returned by pointers.
6859 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
6860 int NumRetOutputs = 0;
6861 int NumOutputs = 0;
6862 Type *RetTy = cast<Value>(CB)->getType();
6863 if (!RetTy->isVoidTy()) {
6864 // Register outputs are returned via the CallInst return value.
6865 auto *ST = dyn_cast<StructType>(RetTy);
6866 if (ST)
6867 NumRetOutputs = ST->getNumElements();
6868 else
6869 NumRetOutputs = 1;
6870 }
6871 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
6872 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
6873 switch (Info.Type) {
6874 case InlineAsm::isOutput:
6875 NumOutputs++;
6876 break;
6877 default:
6878 break;
6879 }
6880 }
6881 return NumOutputs - NumRetOutputs;
6882 }
6883
6884 void visitAsmInstruction(Instruction &I) {
6885 // Conservative inline assembly handling: check for poisoned shadow of
6886 // asm() arguments, then unpoison the result and all the memory locations
6887 // pointed to by those arguments.
6888 // An inline asm() statement in C++ contains lists of input and output
6889 // arguments used by the assembly code. These are mapped to operands of the
6890 // CallInst as follows:
6891 // - nR register outputs ("=r") are returned by value in a single structure
6892 // (SSA value of the CallInst);
6893 // - nO other outputs ("=m" and others) are returned by pointer as first
6894 // nO operands of the CallInst;
6895 // - nI inputs ("r", "m" and others) are passed to CallInst as the
6896 // remaining nI operands.
6897 // The total number of asm() arguments in the source is nR+nO+nI, and the
6898 // corresponding CallInst has nO+nI+1 operands (the last operand is the
6899 // function to be called).
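// For example, a hypothetical statement (for illustration only)
//   asm("..." : "=r"(ret), "=m"(out) : "r"(in));
// has nR = 1 ("=r"), nO = 1 ("=m") and nI = 1 ("r"); the CallInst returns
// the "=r" value and has nO+nI+1 = 3 operands: the pointer for "out", the
// value of "in", and the inline asm callee itself.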
6900 const DataLayout &DL = F.getDataLayout();
6901 CallBase *CB = cast<CallBase>(&I);
6902 IRBuilder<> IRB(&I);
6903 InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
6904 int OutputArgs = getNumOutputArgs(IA, CB);
6905 // The last operand of a CallInst is the function itself.
6906 int NumOperands = CB->getNumOperands() - 1;
6907
6908 // Check input arguments. Do this before unpoisoning output arguments, so
6909 // that we don't overwrite uninit values before checking them.
6910 for (int i = OutputArgs; i < NumOperands; i++) {
6911 Value *Operand = CB->getOperand(i);
6912 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
6913 /*isOutput*/ false);
6914 }
6915 // Unpoison output arguments. This must happen before the actual InlineAsm
6916 // call, so that the shadow for memory published in the asm() statement
6917 // remains valid.
6918 for (int i = 0; i < OutputArgs; i++) {
6919 Value *Operand = CB->getOperand(i);
6920 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
6921 /*isOutput*/ true);
6922 }
6923
6924 setShadow(&I, getCleanShadow(&I));
6925 setOrigin(&I, getCleanOrigin());
6926 }
6927
6928 void visitFreezeInst(FreezeInst &I) {
6929 // Freeze always returns a fully defined value.
6930 setShadow(&I, getCleanShadow(&I));
6931 setOrigin(&I, getCleanOrigin());
6932 }
6933
6934 void visitInstruction(Instruction &I) {
6935 // Everything else: stop propagating and check for poisoned shadow.
6936 if (ClDumpStrictInstructions)
6937 dumpInst(I);
6938 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
6939 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
6940 Value *Operand = I.getOperand(i);
6941 if (Operand->getType()->isSized())
6942 insertCheckShadowOf(Operand, &I);
6943 }
6944 setShadow(&I, getCleanShadow(&I));
6945 setOrigin(&I, getCleanOrigin());
6946 }
6947};
6948
6949struct VarArgHelperBase : public VarArgHelper {
6950 Function &F;
6951 MemorySanitizer &MS;
6952 MemorySanitizerVisitor &MSV;
6953 SmallVector<CallInst *, 16> VAStartInstrumentationList;
6954 const unsigned VAListTagSize;
6955
6956 VarArgHelperBase(Function &F, MemorySanitizer &MS,
6957 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
6958 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
6959
6960 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6961 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
6962 return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6963 }
6964
6965 /// Compute the shadow address for a given va_arg.
6966 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
6967 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
6968 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6969 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_s");
6970 }
6971
6972 /// Compute the shadow address for a given va_arg.
6973 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
6974 unsigned ArgSize) {
6975 // Make sure we don't overflow __msan_va_arg_tls.
6976 if (ArgOffset + ArgSize > kParamTLSSize)
6977 return nullptr;
6978 return getShadowPtrForVAArgument(IRB, ArgOffset);
6979 }
6980
6981 /// Compute the origin address for a given va_arg.
6982 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
6983 Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
6984 // getOriginPtrForVAArgument() is always called after
6985 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
6986 // overflow.
6987 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
6988 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_o");
6989 }
6990
6991 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
6992 unsigned BaseOffset) {
6993 // The tail of __msan_va_arg_tls is not large enough to fit the full
6994 // value shadow, but it will be copied to the backup anyway. Make it
6995 // clean.
6996 if (BaseOffset >= kParamTLSSize)
6997 return;
6998 Value *TailSize =
6999 ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
7000 IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
7001 TailSize, Align(8));
7002 }
7003
7004 void unpoisonVAListTagForInst(IntrinsicInst &I) {
7005 IRBuilder<> IRB(&I);
7006 Value *VAListTag = I.getArgOperand(0);
7007 const Align Alignment = Align(8);
7008 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
7009 VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
7010 // Unpoison the whole __va_list_tag.
7011 IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
7012 VAListTagSize, Alignment, false);
7013 }
7014
7015 void visitVAStartInst(VAStartInst &I) override {
7016 if (F.getCallingConv() == CallingConv::Win64)
7017 return;
7018 VAStartInstrumentationList.push_back(&I);
7019 unpoisonVAListTagForInst(I);
7020 }
7021
7022 void visitVACopyInst(VACopyInst &I) override {
7023 if (F.getCallingConv() == CallingConv::Win64)
7024 return;
7025 unpoisonVAListTagForInst(I);
7026 }
7027};
7028
7029/// AMD64-specific implementation of VarArgHelper.
7030struct VarArgAMD64Helper : public VarArgHelperBase {
7031 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
7032 // See a comment in visitCallBase for more details.
7033 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
7034 static const unsigned AMD64FpEndOffsetSSE = 176;
7035 // If SSE is disabled, fp_offset in va_list is zero.
7036 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
7037
7038 unsigned AMD64FpEndOffset;
7039 AllocaInst *VAArgTLSCopy = nullptr;
7040 AllocaInst *VAArgTLSOriginCopy = nullptr;
7041 Value *VAArgOverflowSize = nullptr;
7042
7043 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7044
7045 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
7046 MemorySanitizerVisitor &MSV)
7047 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
7048 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
7049 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
7050 if (Attr.isStringAttribute() &&
7051 (Attr.getKindAsString() == "target-features")) {
7052 if (Attr.getValueAsString().contains("-sse"))
7053 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
7054 break;
7055 }
7056 }
7057 }
7058
7059 ArgKind classifyArgument(Value *arg) {
7060 // A very rough approximation of X86_64 argument classification rules.
7061 Type *T = arg->getType();
7062 if (T->isX86_FP80Ty())
7063 return AK_Memory;
7064 if (T->isFPOrFPVectorTy())
7065 return AK_FloatingPoint;
7066 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
7067 return AK_GeneralPurpose;
7068 if (T->isPointerTy())
7069 return AK_GeneralPurpose;
7070 return AK_Memory;
7071 }
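// A few examples of this approximation: i32, i64 and pointers classify as
// AK_GeneralPurpose; float, double and <4 x float> as AK_FloatingPoint;
// x86_fp80 (long double), i128 and aggregates fall back to AK_Memory.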
7072
7073 // For VarArg functions, store the argument shadow in an ABI-specific format
7074 // that corresponds to va_list layout.
7075 // We do this because Clang lowers va_arg in the frontend, and this pass
7076 // only sees the low level code that deals with va_list internals.
7077 // A much easier alternative (provided that Clang emits va_arg instructions)
7078 // would have been to associate each live instance of va_list with a copy of
7079 // MSanParamTLS, and extract shadow on va_arg() call in the argument list
7080 // order.
7081 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7082 unsigned GpOffset = 0;
7083 unsigned FpOffset = AMD64GpEndOffset;
7084 unsigned OverflowOffset = AMD64FpEndOffset;
7085 const DataLayout &DL = F.getDataLayout();
7086
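// Sketch of the resulting va_arg TLS layout for this ABI (assuming SSE is
// enabled): byte offsets [0, 48) hold the shadow of general-purpose register
// arguments (8 bytes each), [48, 176) the shadow of floating-point register
// arguments (16 bytes each), and overflow (stack) arguments follow from
// offset 176. Fixed (named) arguments advance these offsets but their
// shadow is not stored here.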
7087 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7088 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7089 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7090 if (IsByVal) {
7091 // ByVal arguments always go to the overflow area.
7092 // Fixed arguments passed through the overflow area will be stepped
7093 // over by va_start, so don't count them towards the offset.
7094 if (IsFixed)
7095 continue;
7096 assert(A->getType()->isPointerTy());
7097 Type *RealTy = CB.getParamByValType(ArgNo);
7098 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7099 uint64_t AlignedSize = alignTo(ArgSize, 8);
7100 unsigned BaseOffset = OverflowOffset;
7101 Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7102 Value *OriginBase = nullptr;
7103 if (MS.TrackOrigins)
7104 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7105 OverflowOffset += AlignedSize;
7106
7107 if (OverflowOffset > kParamTLSSize) {
7108 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7109 continue; // We have no space to copy shadow there.
7110 }
7111
7112 Value *ShadowPtr, *OriginPtr;
7113 std::tie(ShadowPtr, OriginPtr) =
7114 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
7115 /*isStore*/ false);
7116 IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
7117 kShadowTLSAlignment, ArgSize);
7118 if (MS.TrackOrigins)
7119 IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
7120 kShadowTLSAlignment, ArgSize);
7121 } else {
7122 ArgKind AK = classifyArgument(A);
7123 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
7124 AK = AK_Memory;
7125 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
7126 AK = AK_Memory;
7127 Value *ShadowBase, *OriginBase = nullptr;
7128 switch (AK) {
7129 case AK_GeneralPurpose:
7130 ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
7131 if (MS.TrackOrigins)
7132 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
7133 GpOffset += 8;
7134 assert(GpOffset <= kParamTLSSize);
7135 break;
7136 case AK_FloatingPoint:
7137 ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
7138 if (MS.TrackOrigins)
7139 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7140 FpOffset += 16;
7141 assert(FpOffset <= kParamTLSSize);
7142 break;
7143 case AK_Memory:
7144 if (IsFixed)
7145 continue;
7146 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7147 uint64_t AlignedSize = alignTo(ArgSize, 8);
7148 unsigned BaseOffset = OverflowOffset;
7149 ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7150 if (MS.TrackOrigins) {
7151 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7152 }
7153 OverflowOffset += AlignedSize;
7154 if (OverflowOffset > kParamTLSSize) {
7155 // We have no space to copy shadow there.
7156 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7157 continue;
7158 }
7159 }
7160 // Take fixed arguments into account for GpOffset and FpOffset,
7161 // but don't actually store shadows for them.
7162 // TODO(glider): don't call get*PtrForVAArgument() for them.
7163 if (IsFixed)
7164 continue;
7165 Value *Shadow = MSV.getShadow(A);
7166 IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
7167 if (MS.TrackOrigins) {
7168 Value *Origin = MSV.getOrigin(A);
7169 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7170 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7171 std::max(kShadowTLSAlignment, kMinOriginAlignment));
7172 }
7173 }
7174 }
7175 Constant *OverflowSize =
7176 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
7177 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7178 }
7179
7180 void finalizeInstrumentation() override {
7181 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7182 "finalizeInstrumentation called twice");
7183 if (!VAStartInstrumentationList.empty()) {
7184 // If there is a va_start in this function, make a backup copy of
7185 // va_arg_tls somewhere in the function entry block.
7186 IRBuilder<> IRB(MSV.FnPrologueEnd);
7187 VAArgOverflowSize =
7188 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7189 Value *CopySize = IRB.CreateAdd(
7190 ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
7191 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7192 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7193 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7194 CopySize, kShadowTLSAlignment, false);
7195
7196 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7197 Intrinsic::umin, CopySize,
7198 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7199 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7200 kShadowTLSAlignment, SrcSize);
7201 if (MS.TrackOrigins) {
7202 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7203 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
7204 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
7205 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
7206 }
7207 }
7208
7209 // Instrument va_start.
7210 // Copy va_list shadow from the backup copy of the TLS contents.
7211 for (CallInst *OrigInst : VAStartInstrumentationList) {
7212 NextNodeIRBuilder IRB(OrigInst);
7213 Value *VAListTag = OrigInst->getArgOperand(0);
7214
7215 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
7216 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7217 ConstantInt::get(MS.IntptrTy, 16)),
7218 MS.PtrTy);
7219 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7220 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7221 const Align Alignment = Align(16);
7222 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7223 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7224 Alignment, /*isStore*/ true);
7225 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7226 AMD64FpEndOffset);
7227 if (MS.TrackOrigins)
7228 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
7229 Alignment, AMD64FpEndOffset);
7230 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
7231 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7232 ConstantInt::get(MS.IntptrTy, 8)),
7233 MS.PtrTy);
7234 Value *OverflowArgAreaPtr =
7235 IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
7236 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
7237 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
7238 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
7239 Alignment, /*isStore*/ true);
7240 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
7241 AMD64FpEndOffset);
7242 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
7243 VAArgOverflowSize);
7244 if (MS.TrackOrigins) {
7245 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
7246 AMD64FpEndOffset);
7247 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
7248 VAArgOverflowSize);
7249 }
7250 }
7251 }
7252};
7253
7254/// AArch64-specific implementation of VarArgHelper.
7255struct VarArgAArch64Helper : public VarArgHelperBase {
7256 static const unsigned kAArch64GrArgSize = 64;
7257 static const unsigned kAArch64VrArgSize = 128;
7258
7259 static const unsigned AArch64GrBegOffset = 0;
7260 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
7261 // Make VR space aligned to 16 bytes.
7262 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
7263 static const unsigned AArch64VrEndOffset =
7264 AArch64VrBegOffset + kAArch64VrArgSize;
7265 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
7266
7267 AllocaInst *VAArgTLSCopy = nullptr;
7268 Value *VAArgOverflowSize = nullptr;
7269
7270 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7271
7272 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
7273 MemorySanitizerVisitor &MSV)
7274 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
7275
7276 // A very rough approximation of aarch64 argument classification rules.
7277 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
7278 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
7279 return {AK_GeneralPurpose, 1};
7280 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
7281 return {AK_FloatingPoint, 1};
7282
7283 if (T->isArrayTy()) {
7284 auto R = classifyArgument(T->getArrayElementType());
7285 R.second *= T->getScalarType()->getArrayNumElements();
7286 return R;
7287 }
7288
7289 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
7290 auto R = classifyArgument(FV->getScalarType());
7291 R.second *= FV->getNumElements();
7292 return R;
7293 }
7294
7295 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
7296 return {AK_Memory, 0};
7297 }
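// For example, under these rules: i32 -> {AK_GeneralPurpose, 1},
// double -> {AK_FloatingPoint, 1}, [2 x i64] -> {AK_GeneralPurpose, 2},
// <4 x float> -> {AK_FloatingPoint, 4}, and i128 (or anything else
// unrecognized) -> {AK_Memory, 0}.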
7298
7299 // The instrumentation stores the argument shadow in a non ABI-specific
7300 // format because it does not know which arguments are named (since Clang,
7301 // as in the x86_64 case, lowers va_arg in the frontend and this pass only
7302 // sees the low-level code that deals with va_list internals).
7303 // The first eight GR registers are saved in the first 64 bytes of the
7304 // va_arg TLS array, followed by the first eight FP/SIMD registers, and then
7305 // the remaining arguments.
7306 // Using constant offset within the va_arg TLS array allows fast copy
7307 // in the finalize instrumentation.
7308 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7309 unsigned GrOffset = AArch64GrBegOffset;
7310 unsigned VrOffset = AArch64VrBegOffset;
7311 unsigned OverflowOffset = AArch64VAEndOffset;
7312
7313 const DataLayout &DL = F.getDataLayout();
7314 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7315 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7316 auto [AK, RegNum] = classifyArgument(A->getType());
7317 if (AK == AK_GeneralPurpose &&
7318 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
7319 AK = AK_Memory;
7320 if (AK == AK_FloatingPoint &&
7321 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
7322 AK = AK_Memory;
7323 Value *Base;
7324 switch (AK) {
7325 case AK_GeneralPurpose:
7326 Base = getShadowPtrForVAArgument(IRB, GrOffset);
7327 GrOffset += 8 * RegNum;
7328 break;
7329 case AK_FloatingPoint:
7330 Base = getShadowPtrForVAArgument(IRB, VrOffset);
7331 VrOffset += 16 * RegNum;
7332 break;
7333 case AK_Memory:
7334 // Don't count fixed arguments in the overflow area - va_start will
7335 // skip right over them.
7336 if (IsFixed)
7337 continue;
7338 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7339 uint64_t AlignedSize = alignTo(ArgSize, 8);
7340 unsigned BaseOffset = OverflowOffset;
7341 Base = getShadowPtrForVAArgument(IRB, BaseOffset);
7342 OverflowOffset += AlignedSize;
7343 if (OverflowOffset > kParamTLSSize) {
7344 // We have no space to copy shadow there.
7345 CleanUnusedTLS(IRB, Base, BaseOffset);
7346 continue;
7347 }
7348 break;
7349 }
7350 // Count Gp/Vr fixed arguments to their respective offsets, but don't
7351 // bother to actually store a shadow.
7352 if (IsFixed)
7353 continue;
7354 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7355 }
7356 Constant *OverflowSize =
7357 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
7358 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7359 }
7360
7361 // Retrieve a va_list field of 'void*' size.
7362 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7363 Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
7364 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7365 ConstantInt::get(MS.IntptrTy, offset)),
7366 MS.PtrTy);
7367 return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
7368 }
7369
7370 // Retrieve a va_list field of 'int' size.
7371 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7372 Value *SaveAreaPtr = IRB.CreateIntToPtr(
7373 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7374 ConstantInt::get(MS.IntptrTy, offset)),
7375 MS.PtrTy);
7376 Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
7377 return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
7378 }
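// The magic offsets used above and below follow the AArch64 AAPCS va_list
// layout, which on a 64-bit target is roughly:
//   struct __va_list { void *__stack;      // offset 0
//                      void *__gr_top;     // offset 8
//                      void *__vr_top;     // offset 16
//                      int   __gr_offs;    // offset 24
//                      int   __vr_offs; }; // offset 28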
7379
7380 void finalizeInstrumentation() override {
7381 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7382 "finalizeInstrumentation called twice");
7383 if (!VAStartInstrumentationList.empty()) {
7384 // If there is a va_start in this function, make a backup copy of
7385 // va_arg_tls somewhere in the function entry block.
7386 IRBuilder<> IRB(MSV.FnPrologueEnd);
7387 VAArgOverflowSize =
7388 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7389 Value *CopySize = IRB.CreateAdd(
7390 ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
7391 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7392 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7393 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7394 CopySize, kShadowTLSAlignment, false);
7395
7396 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7397 Intrinsic::umin, CopySize,
7398 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7399 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7400 kShadowTLSAlignment, SrcSize);
7401 }
7402
7403 Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
7404 Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
7405
7406 // Instrument va_start, copy va_list shadow from the backup copy of
7407 // the TLS contents.
7408 for (CallInst *OrigInst : VAStartInstrumentationList) {
7409 NextNodeIRBuilder IRB(OrigInst);
7410
7411 Value *VAListTag = OrigInst->getArgOperand(0);
7412
7413 // The variadic ABI for AArch64 creates two areas to save the incoming
7414 // argument registers (one for 64-bit general register xn-x7 and another
7415 // for 128-bit FP/SIMD vn-v7).
7416 // We then need to propagate the shadow arguments to both regions,
7417 // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
7418 // The shadow for the remaining arguments goes to the shadow of 'va::stack'.
7419 // One caveat is that only the unnamed (variadic) arguments need to be
7420 // propagated, whereas the call-site instrumentation saves shadow for
7421 // 'all' the arguments. So, to copy the shadow values from the va_arg TLS
7422 // array, we need to adjust the offsets for both the GR and VR fields
7423 // based on the __{gr,vr}_offs values (since those offsets account for the
7424 // incoming named arguments).
7425 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
7426
7427 // Read the stack pointer from the va_list.
7428 Value *StackSaveAreaPtr =
7429 IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
7430
7431 // Read both the __gr_top and __gr_off and add them up.
7432 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
7433 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
7434
7435 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
7436 IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
7437
7438 // Read both the __vr_top and __vr_off and add them up.
7439 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
7440 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
7441
7442 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
7443 IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
7444
7445 // We do not know how many named arguments are being used, and at the
7446 // call site all the arguments were saved. Since __gr_offs is defined as
7447 // '0 - ((8 - named_gr) * 8)', the idea is to propagate only the variadic
7448 // arguments by ignoring the bytes of shadow that belong to named arguments.
7449 Value *GrRegSaveAreaShadowPtrOff =
7450 IRB.CreateAdd(GrArgSize, GrOffSaveArea);
7451
7452 Value *GrRegSaveAreaShadowPtr =
7453 MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7454 Align(8), /*isStore*/ true)
7455 .first;
7456
7457 Value *GrSrcPtr =
7458 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
7459 Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
7460
7461 IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
7462 GrCopySize);
7463
7464 // Again, but for FP/SIMD values.
7465 Value *VrRegSaveAreaShadowPtrOff =
7466 IRB.CreateAdd(VrArgSize, VrOffSaveArea);
7467
7468 Value *VrRegSaveAreaShadowPtr =
7469 MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7470 Align(8), /*isStore*/ true)
7471 .first;
7472
7473 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
7474 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
7475 IRB.getInt32(AArch64VrBegOffset)),
7476 VrRegSaveAreaShadowPtrOff);
7477 Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
7478
7479 IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
7480 VrCopySize);
7481
7482 // And finally for remaining arguments.
7483 Value *StackSaveAreaShadowPtr =
7484 MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
7485 Align(16), /*isStore*/ true)
7486 .first;
7487
7488 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
7489 VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
7490
7491 IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
7492 Align(16), VAArgOverflowSize);
7493 }
7494 }
7495};
7496
7497/// PowerPC64-specific implementation of VarArgHelper.
7498struct VarArgPowerPC64Helper : public VarArgHelperBase {
7499 AllocaInst *VAArgTLSCopy = nullptr;
7500 Value *VAArgSize = nullptr;
7501
7502 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
7503 MemorySanitizerVisitor &MSV)
7504 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
7505
7506 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7507 // For PowerPC, we need to deal with the alignment of stack arguments -
7508 // they are mostly aligned to 8 bytes, but vectors and i128 arrays are
7509 // aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
7510 // For that reason, we compute the current offset from the stack pointer
7511 // (which is always properly aligned) and the offset of the first vararg,
7512 // then subtract them.
7513 unsigned VAArgBase;
7514 Triple TargetTriple(F.getParent()->getTargetTriple());
7515 // The parameter save area starts at 48 bytes from the frame pointer for
7516 // ABIv1, and at 32 bytes for ABIv2. This is usually determined by the
7517 // target endianness, but in theory could be overridden by a function attribute.
7518 if (TargetTriple.isPPC64ELFv2ABI())
7519 VAArgBase = 32;
7520 else
7521 VAArgBase = 48;
7522 unsigned VAArgOffset = VAArgBase;
7523 const DataLayout &DL = F.getDataLayout();
7524 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7525 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7526 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7527 if (IsByVal) {
7528 assert(A->getType()->isPointerTy());
7529 Type *RealTy = CB.getParamByValType(ArgNo);
7530 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7531 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
7532 if (ArgAlign < 8)
7533 ArgAlign = Align(8);
7534 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7535 if (!IsFixed) {
7536 Value *Base =
7537 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7538 if (Base) {
7539 Value *AShadowPtr, *AOriginPtr;
7540 std::tie(AShadowPtr, AOriginPtr) =
7541 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7542 kShadowTLSAlignment, /*isStore*/ false);
7543
7544 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7545 kShadowTLSAlignment, ArgSize);
7546 }
7547 }
7548 VAArgOffset += alignTo(ArgSize, Align(8));
7549 } else {
7550 Value *Base;
7551 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7552 Align ArgAlign = Align(8);
7553 if (A->getType()->isArrayTy()) {
7554 // Arrays are aligned to element size, except for long double
7555 // arrays, which are aligned to 8 bytes.
7556 Type *ElementTy = A->getType()->getArrayElementType();
7557 if (!ElementTy->isPPC_FP128Ty())
7558 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7559 } else if (A->getType()->isVectorTy()) {
7560 // Vectors are naturally aligned.
7561 ArgAlign = Align(ArgSize);
7562 }
7563 if (ArgAlign < 8)
7564 ArgAlign = Align(8);
7565 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7566 if (DL.isBigEndian()) {
7567            // Adjust the shadow for arguments of size < 8 to match the
7568            // placement of bits on a big-endian system.
7569 if (ArgSize < 8)
7570 VAArgOffset += (8 - ArgSize);
7571 }
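        // For example, on big-endian PPC64 a 4-byte value occupies bytes 4..7
        // of its 8-byte doubleword slot, so the shadow offset is advanced by
        // 8 - 4 = 4 to line up with the value's actual bytes.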
7572 if (!IsFixed) {
7573 Base =
7574 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7575 if (Base)
7576 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7577 }
7578 VAArgOffset += ArgSize;
7579 VAArgOffset = alignTo(VAArgOffset, Align(8));
7580 }
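      // Advancing VAArgBase past every fixed argument keeps the offsets passed
      // to getShadowPtrForVAArgument (VAArgOffset - VAArgBase) relative to the
      // first variadic argument.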
7581 if (IsFixed)
7582 VAArgBase = VAArgOffset;
7583 }
7584
7585 Constant *TotalVAArgSize =
7586 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7587    // Reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7588    // class member; it holds the total size of all varargs.
7589 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7590 }
7591
7592 void finalizeInstrumentation() override {
7593 assert(!VAArgSize && !VAArgTLSCopy &&
7594 "finalizeInstrumentation called twice");
7595 IRBuilder<> IRB(MSV.FnPrologueEnd);
7596 VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7597 Value *CopySize = VAArgSize;
7598
7599 if (!VAStartInstrumentationList.empty()) {
7600 // If there is a va_start in this function, make a backup copy of
7601 // va_arg_tls somewhere in the function entry block.
7602
7603 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7604 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7605 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7606 CopySize, kShadowTLSAlignment, false);
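      // CopySize (the total vararg size) may exceed kParamTLSSize, but only
      // umin(CopySize, kParamTLSSize) bytes are copied from the TLS below;
      // zero-filling first keeps any remaining tail of the copy clean.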
7607
7608 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7609 Intrinsic::umin, CopySize,
7610 ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
7611 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7612 kShadowTLSAlignment, SrcSize);
7613 }
7614
7615 // Instrument va_start.
7616 // Copy va_list shadow from the backup copy of the TLS contents.
7617 for (CallInst *OrigInst : VAStartInstrumentationList) {
7618 NextNodeIRBuilder IRB(OrigInst);
7619 Value *VAListTag = OrigInst->getArgOperand(0);
7620 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7621
7622 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7623
7624 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7625 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7626 const DataLayout &DL = F.getDataLayout();
7627 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7628 const Align Alignment = Align(IntptrSize);
7629 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7630 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7631 Alignment, /*isStore*/ true);
7632 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7633 CopySize);
7634 }
7635 }
7636};
7637
7638/// PowerPC32-specific implementation of VarArgHelper.
7639struct VarArgPowerPC32Helper : public VarArgHelperBase {
7640 AllocaInst *VAArgTLSCopy = nullptr;
7641 Value *VAArgSize = nullptr;
7642
7643 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
7644 MemorySanitizerVisitor &MSV)
7645 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
7646
7647 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7648 unsigned VAArgBase;
7649 // Parameter save area is 8 bytes from frame pointer in PPC32
7650 VAArgBase = 8;
7651 unsigned VAArgOffset = VAArgBase;
7652 const DataLayout &DL = F.getDataLayout();
7653 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7654 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7655 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7656 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7657 if (IsByVal) {
7658 assert(A->getType()->isPointerTy());
7659 Type *RealTy = CB.getParamByValType(ArgNo);
7660 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7661 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
7662 if (ArgAlign < IntptrSize)
7663 ArgAlign = Align(IntptrSize);
7664 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7665 if (!IsFixed) {
7666 Value *Base =
7667 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7668 if (Base) {
7669 Value *AShadowPtr, *AOriginPtr;
7670 std::tie(AShadowPtr, AOriginPtr) =
7671 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7672 kShadowTLSAlignment, /*isStore*/ false);
7673
7674 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7675 kShadowTLSAlignment, ArgSize);
7676 }
7677 }
7678 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
7679 } else {
7680 Value *Base;
7681 Type *ArgTy = A->getType();
7682
7683        // On PPC32, floating-point variable arguments are stored in a
7684        // separate area: fp_save_area = reg_save_area + 4*8. We do not copy
7685        // shadow for them, as they will be found when checking call arguments.
7686 if (!ArgTy->isFloatingPointTy()) {
7687 uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
7688 Align ArgAlign = Align(IntptrSize);
7689 if (ArgTy->isArrayTy()) {
7690 // Arrays are aligned to element size, except for long double
7691 // arrays, which are aligned to 8 bytes.
7692 Type *ElementTy = ArgTy->getArrayElementType();
7693 if (!ElementTy->isPPC_FP128Ty())
7694 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7695 } else if (ArgTy->isVectorTy()) {
7696 // Vectors are naturally aligned.
7697 ArgAlign = Align(ArgSize);
7698 }
7699 if (ArgAlign < IntptrSize)
7700 ArgAlign = Align(IntptrSize);
7701 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7702 if (DL.isBigEndian()) {
7703              // Adjust the shadow for arguments of size < IntptrSize to
7704              // match the placement of bits on a big-endian system.
7705 if (ArgSize < IntptrSize)
7706 VAArgOffset += (IntptrSize - ArgSize);
7707 }
7708 if (!IsFixed) {
7709 Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
7710 ArgSize);
7711 if (Base)
7712                IRB.CreateAlignedStore(MSV.getShadow(A), Base,
7713                                       kShadowTLSAlignment);
7714            }
7715 VAArgOffset += ArgSize;
7716 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
7717 }
7718 }
7719 }
7720
7721 Constant *TotalVAArgSize =
7722 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7723    // Reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7724    // class member; it holds the total size of all varargs.
7725 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7726 }
7727
7728 void finalizeInstrumentation() override {
7729 assert(!VAArgSize && !VAArgTLSCopy &&
7730 "finalizeInstrumentation called twice");
7731 IRBuilder<> IRB(MSV.FnPrologueEnd);
7732 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
7733 Value *CopySize = VAArgSize;
7734
7735 if (!VAStartInstrumentationList.empty()) {
7736 // If there is a va_start in this function, make a backup copy of
7737 // va_arg_tls somewhere in the function entry block.
7738
7739 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7740 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7741 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7742 CopySize, kShadowTLSAlignment, false);
7743
7744 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7745 Intrinsic::umin, CopySize,
7746 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7747 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7748 kShadowTLSAlignment, SrcSize);
7749 }
7750
7751 // Instrument va_start.
7752 // Copy va_list shadow from the backup copy of the TLS contents.
7753 for (CallInst *OrigInst : VAStartInstrumentationList) {
7754 NextNodeIRBuilder IRB(OrigInst);
7755 Value *VAListTag = OrigInst->getArgOperand(0);
7756 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7757 Value *RegSaveAreaSize = CopySize;
7758
7759      // On PPC32 va_list_tag is a struct; the reg_save_area pointer is at offset 8.
7760 RegSaveAreaPtrPtr =
7761 IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
7762
7763      // On PPC32 the reg_save_area can only hold 32 bytes of GPR argument data.
7764 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
7765 Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
7766
7767 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7768 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7769
7770 const DataLayout &DL = F.getDataLayout();
7771 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7772 const Align Alignment = Align(IntptrSize);
7773
7774 { // Copy reg save area
7775 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7776 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7777 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7778 Alignment, /*isStore*/ true);
7779 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
7780 Alignment, RegSaveAreaSize);
7781
7782 RegSaveAreaShadowPtr =
7783 IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
7784 Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
7785 ConstantInt::get(MS.IntptrTy, 32));
7786 FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
7787        // Fill the FP shadow with zeroes: uninitialized FP args should
7788        // already have been reported by the call-site check.
7789 IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
7790 ConstantInt::get(MS.IntptrTy, 32), Alignment);
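        // The GPR save area holds the 8 argument GPRs, r3-r10 (8 * 4 bytes =
        // 32), which is why the FP save area shadow starts at offset 32 from
        // the reg save area shadow.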
7791 }
7792
7793 { // Copy overflow area
7794        // RegSaveAreaSize is umin(CopySize, 32), so the subtraction below
7795        // cannot underflow.
7795 Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
7796
7797 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7798 OverflowAreaPtrPtr =
7799 IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
7800 OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
7801
7802 Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
7803
7804 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
7805 std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
7806 MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
7807 Alignment, /*isStore*/ true);
7808
7809 Value *OverflowVAArgTLSCopyPtr =
7810 IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
7811 OverflowVAArgTLSCopyPtr =
7812 IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
7813
7814 OverflowVAArgTLSCopyPtr =
7815 IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
7816 IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
7817 OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
7818 }
7819 }
7820 }
7821};
7822
7823/// SystemZ-specific implementation of VarArgHelper.
7824struct VarArgSystemZHelper : public VarArgHelperBase {
7825 static const unsigned SystemZGpOffset = 16;
7826 static const unsigned SystemZGpEndOffset = 56;
7827 static const unsigned SystemZFpOffset = 128;
7828 static const unsigned SystemZFpEndOffset = 160;
7829 static const unsigned SystemZMaxVrArgs = 8;
7830 static const unsigned SystemZRegSaveAreaSize = 160;
7831 static const unsigned SystemZOverflowOffset = 160;
7832 static const unsigned SystemZVAListTagSize = 32;
7833 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
7834 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
7835
7836 bool IsSoftFloatABI;
7837 AllocaInst *VAArgTLSCopy = nullptr;
7838 AllocaInst *VAArgTLSOriginCopy = nullptr;
7839 Value *VAArgOverflowSize = nullptr;
7840
7841 enum class ArgKind {
7842 GeneralPurpose,
7843 FloatingPoint,
7844 Vector,
7845 Memory,
7846 Indirect,
7847 };
7848
7849 enum class ShadowExtension { None, Zero, Sign };
7850
7851 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
7852 MemorySanitizerVisitor &MSV)
7853 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
7854 IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
7855
7856 ArgKind classifyArgument(Type *T) {
7857 // T is a SystemZABIInfo::classifyArgumentType() output, and there are
7858 // only a few possibilities of what it can be. In particular, enums, single
7859 // element structs and large types have already been taken care of.
7860
7861 // Some i128 and fp128 arguments are converted to pointers only in the
7862 // back end.
7863 if (T->isIntegerTy(128) || T->isFP128Ty())
7864 return ArgKind::Indirect;
7865 if (T->isFloatingPointTy())
7866 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
7867 if (T->isIntegerTy() || T->isPointerTy())
7868 return ArgKind::GeneralPurpose;
7869 if (T->isVectorTy())
7870 return ArgKind::Vector;
7871 return ArgKind::Memory;
7872 }
7873
7874 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
7875 // ABI says: "One of the simple integer types no more than 64 bits wide.
7876 // ... If such an argument is shorter than 64 bits, replace it by a full
7877 // 64-bit integer representing the same number, using sign or zero
7878 // extension". Shadow for an integer argument has the same type as the
7879 // argument itself, so it can be sign or zero extended as well.
7880 bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
7881 bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
7882 if (ZExt) {
7883 assert(!SExt);
7884 return ShadowExtension::Zero;
7885 }
7886 if (SExt) {
7887 assert(!ZExt);
7888 return ShadowExtension::Sign;
7889 }
7890 return ShadowExtension::None;
7891 }
7892
7893 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7894 unsigned GpOffset = SystemZGpOffset;
7895 unsigned FpOffset = SystemZFpOffset;
7896 unsigned VrIndex = 0;
7897 unsigned OverflowOffset = SystemZOverflowOffset;
7898 const DataLayout &DL = F.getDataLayout();
7899 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7900 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7901 // SystemZABIInfo does not produce ByVal parameters.
7902 assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
7903 Type *T = A->getType();
7904 ArgKind AK = classifyArgument(T);
7905 if (AK == ArgKind::Indirect) {
7906 T = MS.PtrTy;
7907 AK = ArgKind::GeneralPurpose;
7908 }
7909 if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
7910 AK = ArgKind::Memory;
7911 if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
7912 AK = ArgKind::Memory;
7913 if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
7914 AK = ArgKind::Memory;
7915 Value *ShadowBase = nullptr;
7916 Value *OriginBase = nullptr;
7917 ShadowExtension SE = ShadowExtension::None;
7918 switch (AK) {
7919 case ArgKind::GeneralPurpose: {
7920 // Always keep track of GpOffset, but store shadow only for varargs.
7921 uint64_t ArgSize = 8;
7922 if (GpOffset + ArgSize <= kParamTLSSize) {
7923 if (!IsFixed) {
7924 SE = getShadowExtension(CB, ArgNo);
7925 uint64_t GapSize = 0;
7926 if (SE == ShadowExtension::None) {
7927 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
7928 assert(ArgAllocSize <= ArgSize);
7929 GapSize = ArgSize - ArgAllocSize;
7930 }
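            // With no sign/zero extension the shadow is stored at the high end
            // of the 8-byte GPR slot: e.g. for an i32, GapSize = 8 - 4 = 4,
            // matching where the value's bytes sit on big-endian SystemZ.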
7931 ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
7932 if (MS.TrackOrigins)
7933 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
7934 }
7935 GpOffset += ArgSize;
7936 } else {
7937 GpOffset = kParamTLSSize;
7938 }
7939 break;
7940 }
7941 case ArgKind::FloatingPoint: {
7942 // Always keep track of FpOffset, but store shadow only for varargs.
7943 uint64_t ArgSize = 8;
7944 if (FpOffset + ArgSize <= kParamTLSSize) {
7945 if (!IsFixed) {
7946 // PoP says: "A short floating-point datum requires only the
7947 // left-most 32 bit positions of a floating-point register".
7948 // Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
7949 // don't extend shadow and don't mind the gap.
7950 ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
7951 if (MS.TrackOrigins)
7952 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7953 }
7954 FpOffset += ArgSize;
7955 } else {
7956 FpOffset = kParamTLSSize;
7957 }
7958 break;
7959 }
7960 case ArgKind::Vector: {
7961 // Keep track of VrIndex. No need to store shadow, since vector varargs
7962 // go through AK_Memory.
7963 assert(IsFixed);
7964 VrIndex++;
7965 break;
7966 }
7967 case ArgKind::Memory: {
7968 // Keep track of OverflowOffset and store shadow only for varargs.
7969 // Ignore fixed args, since we need to copy only the vararg portion of
7970 // the overflow area shadow.
7971 if (!IsFixed) {
7972 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
7973 uint64_t ArgSize = alignTo(ArgAllocSize, 8);
7974 if (OverflowOffset + ArgSize <= kParamTLSSize) {
7975 SE = getShadowExtension(CB, ArgNo);
7976 uint64_t GapSize =
7977 SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
7978 ShadowBase =
7979 getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
7980 if (MS.TrackOrigins)
7981 OriginBase =
7982 getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
7983 OverflowOffset += ArgSize;
7984 } else {
7985 OverflowOffset = kParamTLSSize;
7986 }
7987 }
7988 break;
7989 }
7990 case ArgKind::Indirect:
7991 llvm_unreachable("Indirect must be converted to GeneralPurpose");
7992 }
7993 if (ShadowBase == nullptr)
7994 continue;
7995 Value *Shadow = MSV.getShadow(A);
7996 if (SE != ShadowExtension::None)
7997 Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
7998 /*Signed*/ SE == ShadowExtension::Sign);
7999 ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
8000 IRB.CreateStore(Shadow, ShadowBase);
8001 if (MS.TrackOrigins) {
8002 Value *Origin = MSV.getOrigin(A);
8003 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
8004        MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
8005                        kMinOriginAlignment);
8006      }
8007 }
8008 Constant *OverflowSize = ConstantInt::get(
8009 IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
8010 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
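    // OverflowSize excludes the fixed SystemZOverflowOffset (160-byte) prefix;
    // finalizeInstrumentation() adds it back when sizing the TLS backup copy.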
8011 }
8012
8013 void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
8014 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
8015 IRB.CreateAdd(
8016 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8017 ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
8018 MS.PtrTy);
8019 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8020 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8021 const Align Alignment = Align(8);
8022 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8023 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
8024 /*isStore*/ true);
8025 // TODO(iii): copy only fragments filled by visitCallBase()
8026 // TODO(iii): support packed-stack && !use-soft-float
8027 // For use-soft-float functions, it is enough to copy just the GPRs.
8028 unsigned RegSaveAreaSize =
8029 IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
8030 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8031 RegSaveAreaSize);
8032 if (MS.TrackOrigins)
8033 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
8034 Alignment, RegSaveAreaSize);
8035 }
8036
8037 // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
8038 // don't know real overflow size and can't clear shadow beyond kParamTLSSize.
8039 void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
8040 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
8041 IRB.CreateAdd(
8042 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8043 ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
8044 MS.PtrTy);
8045 Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
8046 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
8047 const Align Alignment = Align(8);
8048 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
8049 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
8050 Alignment, /*isStore*/ true);
8051 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
8052 SystemZOverflowOffset);
8053 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
8054 VAArgOverflowSize);
8055 if (MS.TrackOrigins) {
8056 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
8057 SystemZOverflowOffset);
8058 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
8059 VAArgOverflowSize);
8060 }
8061 }
8062
8063 void finalizeInstrumentation() override {
8064 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
8065 "finalizeInstrumentation called twice");
8066 if (!VAStartInstrumentationList.empty()) {
8067 // If there is a va_start in this function, make a backup copy of
8068 // va_arg_tls somewhere in the function entry block.
8069 IRBuilder<> IRB(MSV.FnPrologueEnd);
8070 VAArgOverflowSize =
8071 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8072 Value *CopySize =
8073 IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
8074 VAArgOverflowSize);
8075 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8076 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8077 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8078 CopySize, kShadowTLSAlignment, false);
8079
8080 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8081 Intrinsic::umin, CopySize,
8082 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8083 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8084 kShadowTLSAlignment, SrcSize);
8085 if (MS.TrackOrigins) {
8086 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8087 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
8088 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
8089 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
8090 }
8091 }
8092
8093 // Instrument va_start.
8094 // Copy va_list shadow from the backup copy of the TLS contents.
8095 for (CallInst *OrigInst : VAStartInstrumentationList) {
8096 NextNodeIRBuilder IRB(OrigInst);
8097 Value *VAListTag = OrigInst->getArgOperand(0);
8098 copyRegSaveArea(IRB, VAListTag);
8099 copyOverflowArea(IRB, VAListTag);
8100 }
8101 }
8102};
8103
8104/// i386-specific implementation of VarArgHelper.
8105struct VarArgI386Helper : public VarArgHelperBase {
8106 AllocaInst *VAArgTLSCopy = nullptr;
8107 Value *VAArgSize = nullptr;
8108
8109 VarArgI386Helper(Function &F, MemorySanitizer &MS,
8110 MemorySanitizerVisitor &MSV)
8111 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
8112
8113 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8114 const DataLayout &DL = F.getDataLayout();
8115 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8116 unsigned VAArgOffset = 0;
8117 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8118 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8119 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8120 if (IsByVal) {
8121 assert(A->getType()->isPointerTy());
8122 Type *RealTy = CB.getParamByValType(ArgNo);
8123 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8124 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8125 if (ArgAlign < IntptrSize)
8126 ArgAlign = Align(IntptrSize);
8127 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8128 if (!IsFixed) {
8129 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8130 if (Base) {
8131 Value *AShadowPtr, *AOriginPtr;
8132 std::tie(AShadowPtr, AOriginPtr) =
8133 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8134 kShadowTLSAlignment, /*isStore*/ false);
8135
8136 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8137 kShadowTLSAlignment, ArgSize);
8138 }
8139 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8140 }
8141 } else {
8142 Value *Base;
8143 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8144 Align ArgAlign = Align(IntptrSize);
8145 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8146 if (DL.isBigEndian()) {
8147            // Adjust the shadow for arguments of size < IntptrSize to
8148            // match the placement of bits on a big-endian system.
8149 if (ArgSize < IntptrSize)
8150 VAArgOffset += (IntptrSize - ArgSize);
8151 }
8152 if (!IsFixed) {
8153 Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8154 if (Base)
8155 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8156 VAArgOffset += ArgSize;
8157 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8158 }
8159 }
8160 }
8161
8162 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8163    // Reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8164    // class member; it holds the total size of all varargs.
8165 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8166 }
8167
8168 void finalizeInstrumentation() override {
8169 assert(!VAArgSize && !VAArgTLSCopy &&
8170 "finalizeInstrumentation called twice");
8171 IRBuilder<> IRB(MSV.FnPrologueEnd);
8172 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8173 Value *CopySize = VAArgSize;
8174
8175 if (!VAStartInstrumentationList.empty()) {
8176 // If there is a va_start in this function, make a backup copy of
8177 // va_arg_tls somewhere in the function entry block.
8178 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8179 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8180 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8181 CopySize, kShadowTLSAlignment, false);
8182
8183 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8184 Intrinsic::umin, CopySize,
8185 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8186 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8187 kShadowTLSAlignment, SrcSize);
8188 }
8189
8190 // Instrument va_start.
8191 // Copy va_list shadow from the backup copy of the TLS contents.
8192 for (CallInst *OrigInst : VAStartInstrumentationList) {
8193 NextNodeIRBuilder IRB(OrigInst);
8194 Value *VAListTag = OrigInst->getArgOperand(0);
8195 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8196 Value *RegSaveAreaPtrPtr =
8197 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8198 PointerType::get(*MS.C, 0));
8199 Value *RegSaveAreaPtr =
8200 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8201 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8202 const DataLayout &DL = F.getDataLayout();
8203 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8204 const Align Alignment = Align(IntptrSize);
8205 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8206 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8207 Alignment, /*isStore*/ true);
8208 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8209 CopySize);
8210 }
8211 }
8212};
8213
8214/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
8215/// LoongArch64.
8216struct VarArgGenericHelper : public VarArgHelperBase {
8217 AllocaInst *VAArgTLSCopy = nullptr;
8218 Value *VAArgSize = nullptr;
8219
8220 VarArgGenericHelper(Function &F, MemorySanitizer &MS,
8221 MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
8222 : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
8223
8224 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8225 unsigned VAArgOffset = 0;
8226 const DataLayout &DL = F.getDataLayout();
8227 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8228 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8229 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8230 if (IsFixed)
8231 continue;
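      // Unlike the target-specific helpers above, fixed arguments contribute
      // nothing to the offset here: vararg shadow slots are laid out from
      // offset 0 in the order the variadic arguments appear.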
8232 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8233 if (DL.isBigEndian()) {
8234        // Adjust the shadow for arguments of size < IntptrSize to match
8235        // the placement of bits on a big-endian system.
8236 if (ArgSize < IntptrSize)
8237 VAArgOffset += (IntptrSize - ArgSize);
8238 }
8239 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8240 VAArgOffset += ArgSize;
8241 VAArgOffset = alignTo(VAArgOffset, IntptrSize);
8242 if (!Base)
8243 continue;
8244 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8245 }
8246
8247 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8248    // Reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8249    // class member; it holds the total size of all varargs.
8250 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8251 }
8252
8253 void finalizeInstrumentation() override {
8254 assert(!VAArgSize && !VAArgTLSCopy &&
8255 "finalizeInstrumentation called twice");
8256 IRBuilder<> IRB(MSV.FnPrologueEnd);
8257 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8258 Value *CopySize = VAArgSize;
8259
8260 if (!VAStartInstrumentationList.empty()) {
8261 // If there is a va_start in this function, make a backup copy of
8262 // va_arg_tls somewhere in the function entry block.
8263 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8264 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8265 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8266 CopySize, kShadowTLSAlignment, false);
8267
8268 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8269 Intrinsic::umin, CopySize,
8270 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8271 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8272 kShadowTLSAlignment, SrcSize);
8273 }
8274
8275 // Instrument va_start.
8276 // Copy va_list shadow from the backup copy of the TLS contents.
8277 for (CallInst *OrigInst : VAStartInstrumentationList) {
8278 NextNodeIRBuilder IRB(OrigInst);
8279 Value *VAListTag = OrigInst->getArgOperand(0);
8280 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8281 Value *RegSaveAreaPtrPtr =
8282 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8283 PointerType::get(*MS.C, 0));
8284 Value *RegSaveAreaPtr =
8285 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
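      // On these targets the va_list begins with (or simply is) a pointer to
      // the saved arguments, so the tag itself is reinterpreted as a
      // pointer-to-pointer and the save area address is loaded from offset 0.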
8286 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8287 const DataLayout &DL = F.getDataLayout();
8288 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8289 const Align Alignment = Align(IntptrSize);
8290 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8291 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8292 Alignment, /*isStore*/ true);
8293 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8294 CopySize);
8295 }
8296 }
8297};
8298
8299// ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
8300// regarding varargs.
8301using VarArgARM32Helper = VarArgGenericHelper;
8302using VarArgRISCVHelper = VarArgGenericHelper;
8303using VarArgMIPSHelper = VarArgGenericHelper;
8304using VarArgLoongArch64Helper = VarArgGenericHelper;
8305
8306/// A no-op implementation of VarArgHelper.
8307struct VarArgNoOpHelper : public VarArgHelper {
8308 VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
8309 MemorySanitizerVisitor &MSV) {}
8310
8311 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
8312
8313 void visitVAStartInst(VAStartInst &I) override {}
8314
8315 void visitVACopyInst(VACopyInst &I) override {}
8316
8317 void finalizeInstrumentation() override {}
8318};
8319
8320} // end anonymous namespace
8321
8322static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
8323 MemorySanitizerVisitor &Visitor) {
8324  // Dispatch to a target-specific VarArgHelper. Targets without a dedicated
8325  // implementation fall back to VarArgNoOpHelper; false positives are then possible.
8326 Triple TargetTriple(Func.getParent()->getTargetTriple());
8327
8328 if (TargetTriple.getArch() == Triple::x86)
8329 return new VarArgI386Helper(Func, Msan, Visitor);
8330
8331 if (TargetTriple.getArch() == Triple::x86_64)
8332 return new VarArgAMD64Helper(Func, Msan, Visitor);
8333
8334 if (TargetTriple.isARM())
8335 return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8336
8337 if (TargetTriple.isAArch64())
8338 return new VarArgAArch64Helper(Func, Msan, Visitor);
8339
8340 if (TargetTriple.isSystemZ())
8341 return new VarArgSystemZHelper(Func, Msan, Visitor);
8342
8343 // On PowerPC32 VAListTag is a struct
8344 // {char, char, i16 padding, char *, char *}
8345 if (TargetTriple.isPPC32())
8346 return new VarArgPowerPC32Helper(Func, Msan, Visitor);
8347
8348 if (TargetTriple.isPPC64())
8349 return new VarArgPowerPC64Helper(Func, Msan, Visitor);
8350
8351 if (TargetTriple.isRISCV32())
8352 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8353
8354 if (TargetTriple.isRISCV64())
8355 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8356
8357 if (TargetTriple.isMIPS32())
8358 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8359
8360 if (TargetTriple.isMIPS64())
8361 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8362
8363 if (TargetTriple.isLoongArch64())
8364 return new VarArgLoongArch64Helper(Func, Msan, Visitor,
8365 /*VAListTagSize=*/8);
8366
8367 return new VarArgNoOpHelper(Func, Msan, Visitor);
8368}
8369
8370bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
8371 if (!CompileKernel && F.getName() == kMsanModuleCtorName)
8372 return false;
8373
8374 if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
8375 return false;
8376
8377 MemorySanitizerVisitor Visitor(F, *this, TLI);
8378
8379  // Clear out memory attributes.
8380  AttributeMask B;
8381  B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
8382 F.removeFnAttrs(B);
8383
8384 return Visitor.runOnFunction();
8385}
#define Success
assert(UImm &&(UImm !=~static_cast< T >(0)) &&"Invalid immediate!")
constexpr LLT S1
This file implements a class to represent arbitrary precision integral constant values and operations...
static bool isStore(int Opcode)
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
static cl::opt< ITMode > IT(cl::desc("IT block support"), cl::Hidden, cl::init(DefaultIT), cl::values(clEnumValN(DefaultIT, "arm-default-it", "Generate any type of IT block"), clEnumValN(RestrictedIT, "arm-restrict-it", "Disallow complex IT blocks")))
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClWithComdat("asan-with-comdat", cl::desc("Place ASan constructors in comdat sections"), cl::Hidden, cl::init(true))
VarLocInsertPt getNextNode(const DbgRecord *DVR)
Atomic ordering constants.
This file contains the simple types necessary to represent the attributes associated with functions a...
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
static GCRegistry::Add< StatepointGC > D("statepoint-example", "an example strategy for statepoint")
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
Analysis containing CSE Info
Definition CSEInfo.cpp:27
This file contains the declarations for the subclasses of Constant, which represent the different fla...
const MemoryMapParams Linux_LoongArch64_MemoryMapParams
const MemoryMapParams Linux_X86_64_MemoryMapParams
static cl::opt< int > ClTrackOrigins("dfsan-track-origins", cl::desc("Track origins of labels"), cl::Hidden, cl::init(0))
static AtomicOrdering addReleaseOrdering(AtomicOrdering AO)
static AtomicOrdering addAcquireOrdering(AtomicOrdering AO)
const MemoryMapParams Linux_AArch64_MemoryMapParams
static bool isAMustTailRetVal(Value *RetVal)
This file provides an implementation of debug counters.
#define DEBUG_COUNTER(VARNAME, COUNTERNAME, DESC)
This file defines the DenseMap class.
This file builds on the ADT/GraphTraits.h file to build generic depth first graph iterator.
@ Default
static bool runOnFunction(Function &F, bool PostInlining)
This is the interface for a simple mod/ref and alias analysis over globals.
static size_t TypeSizeToSizeIndex(uint32_t TypeSize)
#define op(i)
Hexagon Common GEP
#define _
Hexagon Vector Combine
Module.h This file contains the declarations for the Module class.
static LVOptions Options
Definition LVOptions.cpp:25
#define F(x, y, z)
Definition MD5.cpp:55
#define I(x, y, z)
Definition MD5.cpp:58
Machine Check Debug Module
static const PlatformMemoryMapParams Linux_S390_MemoryMapParams
static const Align kMinOriginAlignment
static cl::opt< uint64_t > ClShadowBase("msan-shadow-base", cl::desc("Define custom MSan ShadowBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClPoisonUndef("msan-poison-undef", cl::desc("Poison fully undef temporary values. " "Partially undefined constant vectors " "are unaffected by this flag (see " "-msan-poison-undef-vectors)."), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_X86_MemoryMapParams
static cl::opt< uint64_t > ClOriginBase("msan-origin-base", cl::desc("Define custom MSan OriginBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClCheckConstantShadow("msan-check-constant-shadow", cl::desc("Insert checks for constant shadow values"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams
static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams
static const unsigned kOriginSize
static cl::opt< bool > ClWithComdat("msan-with-comdat", cl::desc("Place MSan constructors in comdat sections"), cl::Hidden, cl::init(false))
static cl::opt< int > ClTrackOrigins("msan-track-origins", cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden, cl::init(0))
Track origins of uninitialized values.
static cl::opt< int > ClInstrumentationWithCallThreshold("msan-instrumentation-with-call-threshold", cl::desc("If the function being instrumented requires more than " "this number of checks and origin stores, use callbacks instead of " "inline checks (-1 means never use callbacks)."), cl::Hidden, cl::init(3500))
static cl::opt< int > ClPoisonStackPattern("msan-poison-stack-pattern", cl::desc("poison uninitialized stack variables with the given pattern"), cl::Hidden, cl::init(0xff))
static const Align kShadowTLSAlignment
static cl::opt< bool > ClHandleICmpExact("msan-handle-icmp-exact", cl::desc("exact handling of relational integer ICmp"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams
static cl::opt< bool > ClDumpStrictInstructions("msan-dump-strict-instructions", cl::desc("print out instructions with default strict semantics i.e.," "check that all the inputs are fully initialized, and mark " "the output as fully initialized. These semantics are applied " "to instructions that could not be handled explicitly nor " "heuristically."), cl::Hidden, cl::init(false))
static Constant * getOrInsertGlobal(Module &M, StringRef Name, Type *Ty)
static cl::opt< bool > ClPreciseDisjointOr("msan-precise-disjoint-or", cl::desc("Precisely poison disjoint OR. If false (legacy behavior), " "disjointedness is ignored (i.e., 1|1 is initialized)."), cl::Hidden, cl::init(false))
static const MemoryMapParams Linux_S390X_MemoryMapParams
static cl::opt< bool > ClPoisonStack("msan-poison-stack", cl::desc("poison uninitialized stack variables"), cl::Hidden, cl::init(true))
static const MemoryMapParams Linux_I386_MemoryMapParams
const char kMsanInitName[]
static cl::opt< bool > ClPoisonUndefVectors("msan-poison-undef-vectors", cl::desc("Precisely poison partially undefined constant vectors. " "If false (legacy behavior), the entire vector is " "considered fully initialized, which may lead to false " "negatives. Fully undefined constant vectors are " "unaffected by this flag (see -msan-poison-undef)."), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPrintStackNames("msan-print-stack-names", cl::desc("Print name of local stack variable"), cl::Hidden, cl::init(true))
static cl::opt< uint64_t > ClAndMask("msan-and-mask", cl::desc("Define custom MSan AndMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleLifetimeIntrinsics("msan-handle-lifetime-intrinsics", cl::desc("when possible, poison scoped variables at the beginning of the scope " "(slower, but more precise)"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClKeepGoing("msan-keep-going", cl::desc("keep going after reporting a UMR"), cl::Hidden, cl::init(false))
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams
static GlobalVariable * createPrivateConstGlobalForString(Module &M, StringRef Str)
Create a non-const global initialized with the given string.
static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClEagerChecks("msan-eager-checks", cl::desc("check arguments and return values at function call boundaries"), cl::Hidden, cl::init(false))
static cl::opt< int > ClDisambiguateWarning("msan-disambiguate-warning-threshold", cl::desc("Define threshold for number of checks per " "debug location to force origin update."), cl::Hidden, cl::init(3))
static VarArgHelper * CreateVarArgHelper(Function &Func, MemorySanitizer &Msan, MemorySanitizerVisitor &Visitor)
static const MemoryMapParams Linux_MIPS64_MemoryMapParams
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams
static cl::opt< uint64_t > ClXorMask("msan-xor-mask", cl::desc("Define custom MSan XorMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleAsmConservative("msan-handle-asm-conservative", cl::desc("conservative handling of inline assembly"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams
static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams
static const unsigned kParamTLSSize
static cl::opt< bool > ClHandleICmp("msan-handle-icmp", cl::desc("propagate shadow through ICmpEQ and ICmpNE"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClEnableKmsan("msan-kernel", cl::desc("Enable KernelMemorySanitizer instrumentation"), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPoisonStackWithCall("msan-poison-stack-with-call", cl::desc("poison uninitialized stack variables with a call"), cl::Hidden, cl::init(false))
static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams
static cl::opt< bool > ClDumpHeuristicInstructions("msan-dump-heuristic-instructions", cl::desc("Prints 'unknown' instructions that were handled heuristically. " "Use -msan-dump-strict-instructions to print instructions that " "could not be handled explicitly nor heuristically."), cl::Hidden, cl::init(false))
static const unsigned kRetvalTLSSize
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams
const char kMsanModuleCtorName[]
static const MemoryMapParams FreeBSD_I386_MemoryMapParams
static cl::opt< bool > ClCheckAccessAddress("msan-check-access-address", cl::desc("report accesses through a pointer which has poisoned shadow"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClDisableChecks("msan-disable-checks", cl::desc("Apply no_sanitize to the whole file"), cl::Hidden, cl::init(false))
#define T
FunctionAnalysisManager FAM
if(PassOpts->AAPipeline)
const SmallVectorImpl< MachineOperand > & Cond
static const char * name
void visit(MachineFunction &MF, MachineBasicBlock &Start, std::function< void(MachineBasicBlock *)> op)
This file implements a set that has insertion order iteration characteristics.
This file defines the SmallPtrSet class.
This file defines the SmallVector class.
This file contains some functions that are useful when dealing with strings.
#define LLVM_DEBUG(...)
Definition Debug.h:114
static TableGen::Emitter::OptClass< SkeletonEmitter > X("gen-skeleton-class", "Generate example skeleton class")
static SymbolRef::Type getType(const Symbol *Sym)
Definition TapiFile.cpp:39
Value * RHS
Value * LHS
static APInt getSignedMinValue(unsigned numBits)
Gets minimum signed value of APInt for a specific bit width.
Definition APInt.h:219
void setAlignment(Align Align)
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
const T & front() const
front - Get the first element.
Definition ArrayRef.h:150
static LLVM_ABI ArrayType * get(Type *ElementType, uint64_t NumElements)
This static method is the primary way to construct an ArrayType.
This class stores enough information to efficiently remove some attributes from an existing AttrBuild...
AttributeMask & addAttribute(Attribute::AttrKind Val)
Add an attribute to the mask.
iterator end()
Definition BasicBlock.h:472
LLVM_ABI const_iterator getFirstInsertionPt() const
Returns an iterator to the first instruction in this block that is suitable for inserting a non-PHI i...
LLVM_ABI const BasicBlock * getSinglePredecessor() const
Return the predecessor of this block if it has a single predecessor block.
InstListType::iterator iterator
Instruction iterators...
Definition BasicBlock.h:170
bool isInlineAsm() const
Check if this call is an inline asm statement.
Function * getCalledFunction() const
Returns the function called, or null if this is an indirect function invocation or the function signa...
bool hasRetAttr(Attribute::AttrKind Kind) const
Determine whether the return value has the given attribute.
LLVM_ABI bool paramHasAttr(unsigned ArgNo, Attribute::AttrKind Kind) const
Determine whether the argument or parameter has the given attribute.
void removeFnAttrs(const AttributeMask &AttrsToRemove)
Removes the attributes from the function.
void setCannotMerge()
MaybeAlign getParamAlign(unsigned ArgNo) const
Extract the alignment for a call or parameter (0=unknown).
Type * getParamByValType(unsigned ArgNo) const
Extract the byval type for a call or parameter.
Value * getCalledOperand() const
Type * getParamElementType(unsigned ArgNo) const
Extract the elementtype type for a parameter.
Value * getArgOperand(unsigned i) const
void setArgOperand(unsigned i, Value *v)
FunctionType * getFunctionType() const
iterator_range< User::op_iterator > args()
Iteration adapter for range-for loops.
void addParamAttr(unsigned ArgNo, Attribute::AttrKind Kind)
Adds the attribute to the indicated argument.
Predicate
This enumeration lists the possible predicates for CmpInst subclasses.
Definition InstrTypes.h:678
@ ICMP_SLT
signed less than
Definition InstrTypes.h:707
@ ICMP_SLE
signed less or equal
Definition InstrTypes.h:708
@ ICMP_SGT
signed greater than
Definition InstrTypes.h:705
@ ICMP_SGE
signed greater or equal
Definition InstrTypes.h:706
static LLVM_ABI Constant * get(ArrayType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getString(LLVMContext &Context, StringRef Initializer, bool AddNull=true)
This method constructs a CDS and initializes it with a text string.
static LLVM_ABI Constant * get(LLVMContext &Context, ArrayRef< uint8_t > Elts)
get() constructors - Return a constant with vector type with an element count and element type matchi...
static ConstantInt * getSigned(IntegerType *Ty, int64_t V)
Return a ConstantInt with the specified value for the specified type.
Definition Constants.h:131
static LLVM_ABI ConstantInt * getBool(LLVMContext &Context, bool V)
static LLVM_ABI Constant * get(StructType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getSplat(ElementCount EC, Constant *Elt)
Return a ConstantVector with the specified constant in each element.
static LLVM_ABI Constant * get(ArrayRef< Constant * > V)
This is an important base class in LLVM.
Definition Constant.h:43
static LLVM_ABI Constant * getAllOnesValue(Type *Ty)
LLVM_ABI bool isAllOnesValue() const
Return true if this is the value that would be returned by getAllOnesValue.
static LLVM_ABI Constant * getNullValue(Type *Ty)
Constructor to create a '0' constant of arbitrary type.
LLVM_ABI Constant * getAggregateElement(unsigned Elt) const
For aggregates (struct/array/vector) return the constant that corresponds to the specified element if...
LLVM_ABI bool isZeroValue() const
Return true if the value is negative zero or null value.
Definition Constants.cpp:76
LLVM_ABI bool isNullValue() const
Return true if this is the value that would be returned by getNullValue.
Definition Constants.cpp:90
static bool shouldExecute(unsigned CounterName)
bool empty() const
Definition DenseMap.h:107
unsigned getNumElements() const
static LLVM_ABI FixedVectorType * get(Type *ElementType, unsigned NumElts)
Definition Type.cpp:803
static FixedVectorType * getHalfElementsVectorType(FixedVectorType *VTy)
A handy container for a FunctionType+Callee-pointer pair, which can be passed around as a single enti...
unsigned getNumParams() const
Return the number of fixed parameters this function type requires.
LLVM_ABI void setComdat(Comdat *C)
Definition Globals.cpp:214
@ PrivateLinkage
Like Internal, but omit from symbol table.
Definition GlobalValue.h:61
@ ExternalLinkage
Externally visible function.
Definition GlobalValue.h:53
Analysis pass providing a never-invalidated alias analysis result.
ConstantInt * getInt1(bool V)
Get a constant value representing either true or false.
Definition IRBuilder.h:497
Value * CreateInsertElement(Type *VecTy, Value *NewElt, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2571
Value * CreateConstGEP1_32(Type *Ty, Value *Ptr, unsigned Idx0, const Twine &Name="")
Definition IRBuilder.h:1936
AllocaInst * CreateAlloca(Type *Ty, unsigned AddrSpace, Value *ArraySize=nullptr, const Twine &Name="")
Definition IRBuilder.h:1830
IntegerType * getInt1Ty()
Fetch the type representing a single bit.
Definition IRBuilder.h:547
LLVM_ABI CallInst * CreateMaskedCompressStore(Value *Val, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr)
Create a call to Masked Compress Store intrinsic.
Value * CreateInsertValue(Value *Agg, Value *Val, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2625
Value * CreateExtractElement(Value *Vec, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2559
IntegerType * getIntNTy(unsigned N)
Fetch the type representing an N-bit integer.
Definition IRBuilder.h:575
LoadInst * CreateAlignedLoad(Type *Ty, Value *Ptr, MaybeAlign Align, const char *Name)
Definition IRBuilder.h:1864
Value * CreateZExtOrTrunc(Value *V, Type *DestTy, const Twine &Name="")
Create a ZExt or Trunc from the integer value V to DestTy.
Definition IRBuilder.h:2100
CallInst * CreateMemCpy(Value *Dst, MaybeAlign DstAlign, Value *Src, MaybeAlign SrcAlign, uint64_t Size, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memcpy between the specified pointers.
Definition IRBuilder.h:687
LLVM_ABI CallInst * CreateAndReduce(Value *Src)
Create a vector int AND reduction intrinsic of the source vector.
Value * CreatePointerCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2251
Value * CreateExtractValue(Value *Agg, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2618
LLVM_ABI CallInst * CreateMaskedLoad(Type *Ty, Value *Ptr, Align Alignment, Value *Mask, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Load intrinsic.
LLVM_ABI Value * CreateSelect(Value *C, Value *True, Value *False, const Twine &Name="", Instruction *MDFrom=nullptr)
BasicBlock::iterator GetInsertPoint() const
Definition IRBuilder.h:202
Value * CreateSExt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2094
Value * CreateIntToPtr(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2199
Value * CreateLShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1513
IntegerType * getInt32Ty()
Fetch the type representing a 32-bit integer.
Definition IRBuilder.h:562
ConstantInt * getInt8(uint8_t C)
Get a constant 8-bit value.
Definition IRBuilder.h:512
IntegerType * getInt64Ty()
Fetch the type representing a 64-bit integer.
Definition IRBuilder.h:567
Value * CreateUDiv(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1454
Value * CreateICmpNE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2333
Value * CreateGEP(Type *Ty, Value *Ptr, ArrayRef< Value * > IdxList, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:1923
Value * CreateNeg(Value *V, const Twine &Name="", bool HasNSW=false)
Definition IRBuilder.h:1781
LLVM_ABI CallInst * CreateOrReduce(Value *Src)
Create a vector int OR reduction intrinsic of the source vector.
LLVM_ABI Value * CreateBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with 2 operands which is mangled on the first type.
LLVM_ABI CallInst * CreateIntrinsic(Intrinsic::ID ID, ArrayRef< Type * > Types, ArrayRef< Value * > Args, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with Args, mangled using Types.
ConstantInt * getInt32(uint32_t C)
Get a constant 32-bit value.
Definition IRBuilder.h:522
PHINode * CreatePHI(Type *Ty, unsigned NumReservedValues, const Twine &Name="")
Definition IRBuilder.h:2494
Value * CreateNot(Value *V, const Twine &Name="")
Definition IRBuilder.h:1805
Value * CreateICmpEQ(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2329
LLVM_ABI DebugLoc getCurrentDebugLocation() const
Get location information used by debugging information.
Definition IRBuilder.cpp:63
Value * CreateSub(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1420
Value * CreateBitCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2204
ConstantInt * getIntN(unsigned N, uint64_t C)
Get a constant N-bit value, zero extended or truncated from a 64-bit value.
Definition IRBuilder.h:533
LoadInst * CreateLoad(Type *Ty, Value *Ptr, const char *Name)
Provided to resolve 'CreateLoad(Ty, Ptr, "...")' correctly, instead of converting the string to 'bool...
Definition IRBuilder.h:1847
Value * CreateShl(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1492
CallInst * CreateMemSet(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memset to the specified pointer and the specified value.
Definition IRBuilder.h:630
Value * CreateZExt(Value *V, Type *DestTy, const Twine &Name="", bool IsNonNeg=false)
Definition IRBuilder.h:2082
Value * CreateShuffleVector(Value *V1, Value *V2, Value *Mask, const Twine &Name="")
Definition IRBuilder.h:2593
LLVMContext & getContext() const
Definition IRBuilder.h:203
Value * CreateAnd(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1551
StoreInst * CreateStore(Value *Val, Value *Ptr, bool isVolatile=false)
Definition IRBuilder.h:1860
LLVM_ABI CallInst * CreateMaskedStore(Value *Val, Value *Ptr, Align Alignment, Value *Mask)
Create a call to Masked Store intrinsic.
Value * CreateAdd(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1403
Value * CreatePtrToInt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2194
Value * CreateIsNotNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg != 0.
Definition IRBuilder.h:2651
CallInst * CreateCall(FunctionType *FTy, Value *Callee, ArrayRef< Value * > Args={}, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:2508
Value * CreateTrunc(Value *V, Type *DestTy, const Twine &Name="", bool IsNUW=false, bool IsNSW=false)
Definition IRBuilder.h:2068
PointerType * getPtrTy(unsigned AddrSpace=0)
Fetch the type representing a pointer.
Definition IRBuilder.h:605
Value * CreateBinOp(Instruction::BinaryOps Opc, Value *LHS, Value *RHS, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:1708
Value * CreateICmpSLT(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2361
LLVM_ABI Value * CreateTypeSize(Type *Ty, TypeSize Size)
Create an expression which evaluates to the number of units in Size at runtime.
Value * CreateICmpUGE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2341
Value * CreateIntCast(Value *V, Type *DestTy, bool isSigned, const Twine &Name="")
Definition IRBuilder.h:2277
Value * CreateIsNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg == 0.
Definition IRBuilder.h:2646
void SetInsertPoint(BasicBlock *TheBB)
This specifies that created instructions should be appended to the end of the specified block.
Definition IRBuilder.h:207
Type * getVoidTy()
Fetch the type representing void.
Definition IRBuilder.h:600
StoreInst * CreateAlignedStore(Value *Val, Value *Ptr, MaybeAlign Align, bool isVolatile=false)
Definition IRBuilder.h:1883
LLVM_ABI CallInst * CreateMaskedExpandLoad(Type *Ty, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Expand Load intrinsic.
Value * CreateInBoundsPtrAdd(Value *Ptr, Value *Offset, const Twine &Name="")
Definition IRBuilder.h:2041
Value * CreateAShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1532
Value * CreateXor(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1599
Value * CreateICmp(CmpInst::Predicate P, Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2439
Value * CreateOr(Value *LHS, Value *RHS, const Twine &Name="", bool IsDisjoint=false)
Definition IRBuilder.h:1573
IntegerType * getInt8Ty()
Fetch the type representing an 8-bit integer.
Definition IRBuilder.h:552
Value * CreateMul(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1437
LLVM_ABI CallInst * CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment, Value *Mask=nullptr)
Create a call to Masked Scatter intrinsic.
LLVM_ABI CallInst * CreateMaskedGather(Type *Ty, Value *Ptrs, Align Alignment, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Gather intrinsic.
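The masked load/store/gather/scatter builders above mirror the llvm.masked.* intrinsics; the pass uses them so that shadow moves with exactly the same mask as the application's memory access. A minimal sketch, assuming an in-scope IRBuilder IRB and hypothetical Shadow, ShadowPtr, ShadowPtrs and Mask values mirroring the instrumented operation's operands.
// Write shadow only for the lanes the application actually stores.
IRB.CreateMaskedStore(Shadow, ShadowPtr, Align(8), Mask);

// Gather shadow through a vector of shadow pointers; disabled lanes take the
// clean (all-zero) pass-through value so they cannot raise false reports.
Value *PassThru = Constant::getNullValue(Shadow->getType());
Value *GatheredShadow = IRB.CreateMaskedGather(
    Shadow->getType(), ShadowPtrs, Align(8), Mask, PassThru);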
This provides a uniform API for creating instructions and inserting them into a basic block: either at the end of a BasicBlock, or at a specific iterator location in a block.
Definition IRBuilder.h:2780
std::vector< ConstraintInfo > ConstraintInfoVector
Definition InlineAsm.h:123
void visit(Iterator Start, Iterator End)
Definition InstVisitor.h:87
const DebugLoc & getDebugLoc() const
Return the debug location for this node as a DebugLoc.
LLVM_ABI InstListType::iterator eraseFromParent()
This method unlinks 'this' from the containing basic block and deletes it.
MDNode * getMetadata(unsigned KindID) const
Get the metadata of given kind attached to this Instruction.
LLVM_ABI bool comesBefore(const Instruction *Other) const
Given an instruction Other in the same basic block as this instruction, return true if this instruction comes before Other.
static LLVM_ABI IntegerType * get(LLVMContext &C, unsigned NumBits)
This static method is the primary way of constructing an IntegerType.
Definition Type.cpp:319
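IntegerType::get is how scalar shadow types are typically built: an integer of the same bit width as the original value, in the same LLVMContext, so every application bit has a corresponding shadow bit. A minimal sketch with hypothetical Ctx and OrigTy names.
// Shadow of a sized scalar value: an integer of equal width.
IntegerType *ShadowTy =
    IntegerType::get(Ctx, OrigTy->getPrimitiveSizeInBits().getFixedValue());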
LLVM_ABI MDNode * createUnlikelyBranchWeights()
Return metadata containing two branch weights, with significant bias towards false destination.
Definition MDBuilder.cpp:48
A Module instance is used to store all the information related to an LLVM module.
Definition Module.h:67
void addIncoming(Value *V, BasicBlock *BB)
Add an incoming value to the end of the PHI list.
static LLVM_ABI PoisonValue * get(Type *T)
Static factory methods - Return a 'poison' object of the specified type.
A set of analyses that are preserved following a run of a transformation pass.
Definition Analysis.h:112
static PreservedAnalyses none()
Convenience factory function for the empty preserved set.
Definition Analysis.h:115
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition Analysis.h:118
PreservedAnalyses & abandon()
Mark an analysis as abandoned.
Definition Analysis.h:171
bool remove(const value_type &X)
Remove an item from the set vector.
Definition SetVector.h:198
bool insert(const value_type &X)
Insert a new element into the SetVector.
Definition SetVector.h:168
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
void push_back(const T &Elt)
StringRef - Represent a constant reference to a string, i.e. a character array and a length, which need not be null terminated.
Definition StringRef.h:55
static LLVM_ABI StructType * get(LLVMContext &Context, ArrayRef< Type * > Elements, bool isPacked=false)
This static method is the primary way to create a literal StructType.
Definition Type.cpp:414
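StructType::get builds literal (unnamed) struct types; shadows of aggregates are assembled this way, one shadow type per original element. A minimal sketch, with Ctx standing in for a hypothetical LLVMContext reference.
// A literal struct { i64, i32 }.
StructType *STy = StructType::get(
    Ctx, {Type::getInt64Ty(Ctx), Type::getInt32Ty(Ctx)});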
unsigned getNumElements() const
Random access to the elements.
Type * getElementType(unsigned N) const
Analysis pass providing the TargetLibraryInfo.
Provides information about what library functions are available for the current target.
AttributeList getAttrList(LLVMContext *C, ArrayRef< unsigned > ArgNos, bool Signed, bool Ret=false, AttributeList AL=AttributeList()) const
bool getLibFunc(StringRef funcName, LibFunc &F) const
Searches for a particular function name.
Triple - Helper class for working with autoconf configuration names.
Definition Triple.h:47
bool isMIPS64() const
Tests whether the target is MIPS 64-bit (little and big endian).
Definition Triple.h:1030
@ loongarch64
Definition Triple.h:65
bool isRISCV32() const
Tests whether the target is 32-bit RISC-V.
Definition Triple.h:1073
bool isPPC32() const
Tests whether the target is 32-bit PowerPC (little and big endian).
Definition Triple.h:1046
ArchType getArch() const
Get the parsed architecture type of this triple.
Definition Triple.h:411
bool isRISCV64() const
Tests whether the target is 64-bit RISC-V.
Definition Triple.h:1078
bool isLoongArch64() const
Tests whether the target is 64-bit LoongArch.
Definition Triple.h:1019
bool isMIPS32() const
Tests whether the target is MIPS 32-bit (little and big endian).
Definition Triple.h:1025
bool isARM() const
Tests whether the target is ARM (little and big endian).
Definition Triple.h:914
bool isPPC64() const
Tests whether the target is 64-bit PowerPC (little and big endian).
Definition Triple.h:1051
bool isAArch64() const
Tests whether the target is AArch64 (little and big endian).
Definition Triple.h:998
bool isSystemZ() const
Tests whether the target is SystemZ.
Definition Triple.h:1097
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45
LLVM_ABI unsigned getIntegerBitWidth() const
bool isVectorTy() const
True if this is an instance of VectorType.
Definition Type.h:273
bool isArrayTy() const
True if this is an instance of ArrayType.
Definition Type.h:264
bool isIntOrIntVectorTy() const
Return true if this is an integer type or a vector of integer types.
Definition Type.h:246
bool isPointerTy() const
True if this is an instance of PointerType.
Definition Type.h:267
Type * getArrayElementType() const
Definition Type.h:408
bool isPPC_FP128Ty() const
Return true if this is powerpc long double.
Definition Type.h:165
static LLVM_ABI Type * getVoidTy(LLVMContext &C)
Definition Type.cpp:281
Type * getScalarType() const
If this is a vector type, return the element type, otherwise return 'this'.
Definition Type.h:352
LLVM_ABI TypeSize getPrimitiveSizeInBits() const LLVM_READONLY
Return the basic size of this type if it is a primitive type.
Definition Type.cpp:198
bool isSized(SmallPtrSetImpl< Type * > *Visited=nullptr) const
Return true if it makes sense to take the size of this type.
Definition Type.h:311
LLVM_ABI unsigned getScalarSizeInBits() const LLVM_READONLY
If this is a vector type, return the getPrimitiveSizeInBits value for the element type.
Definition Type.cpp:231
bool isFloatingPointTy() const
Return true if this is one of the floating-point types.
Definition Type.h:184
bool isIntOrPtrTy() const
Return true if this is an integer type or a pointer type.
Definition Type.h:255
bool isIntegerTy() const
True if this is an instance of IntegerType.
Definition Type.h:240
bool isFPOrFPVectorTy() const
Return true if this is a FP type or a vector of FP.
Definition Type.h:225
bool isVoidTy() const
Return true if this is 'void'.
Definition Type.h:139
Value * getOperand(unsigned i) const
Definition User.h:232
unsigned getNumOperands() const
Definition User.h:254
size_type count(const KeyT &Val) const
Return 1 if the specified key is in the map, 0 otherwise.
Definition ValueMap.h:156
Type * getType() const
All values are typed, get the type of this value.
Definition Value.h:256
LLVM_ABI void setName(const Twine &Name)
Change the name of the value.
Definition Value.cpp:390
LLVM_ABI StringRef getName() const
Return a constant reference to the value's name.
Definition Value.cpp:322
ElementCount getElementCount() const
Return an ElementCount instance to represent the (possibly scalable) number of elements in the vector...
Type * getElementType() const
int getNumOccurrences() const
constexpr ScalarTy getFixedValue() const
Definition TypeSize.h:200
constexpr bool isScalable() const
Returns whether the quantity is scaled by a runtime quantity (vscale).
Definition TypeSize.h:169
An efficient, type-erasing, non-owning reference to a callable.
const ParentTy * getParent() const
Definition ilist_node.h:34
self_iterator getIterator()
Definition ilist_node.h:134
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition raw_ostream.h:53
CallInst * Call
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
constexpr char Align[]
Key for Kernel::Arg::Metadata::mAlign.
constexpr std::underlying_type_t< E > Mask()
Get a bitmask with 1s in all places up to the high-order bit of E's largest value.
@ C
The default llvm calling convention, compatible with C.
Definition CallingConv.h:34
@ BasicBlock
Various leaf nodes.
Definition ISDOpcodes.h:81
initializer< Ty > init(const Ty &Val)
Function * Kernel
Summary of a kernel (=entry point for target offloading).
Definition OpenMPOpt.h:21
NodeAddr< FuncNode * > Func
Definition RDFGraph.h:393
friend class Instruction
Iterator for Instructions in a BasicBlock.
Definition BasicBlock.h:73
This is an optimization pass for GlobalISel generic memory operations.
unsigned Log2_32_Ceil(uint32_t Value)
Return the ceil log base 2 of the specified value, 32 if the value is zero.
Definition MathExtras.h:355
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1665
auto enumerate(FirstRange &&First, RestRanges &&...Rest)
Given two or more input ranges, returns a new range whose values are tuples (A, B,...
Definition STLExtras.h:2454
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:649
@ Done
Definition Threading.h:60
bool isAligned(Align Lhs, uint64_t SizeInBytes)
Checks that SizeInBytes is a multiple of the alignment.
Definition Alignment.h:145
LLVM_ABI std::pair< Instruction *, Value * > SplitBlockAndInsertSimpleForLoop(Value *End, BasicBlock::iterator SplitBefore)
Insert a for (int i = 0; i < End; i++) loop structure (with the exception that End is assumed > 0, and thus not checked on entry) at SplitBefore; returns the first insert point in the loop body and the PHINode for the induction variable.
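This helper is useful when an operation has to be instrumented one element at a time, e.g. when the element count is only known at run time. A minimal sketch, assuming a hypothetical trip count NumElems (> 0) and a BasicBlock::iterator SplitBefore.
// Materialize "for (i = 0; i < NumElems; ++i)" and get an insert point
// inside the body plus the induction PHI.
auto [BodyIP, Index] = SplitBlockAndInsertSimpleForLoop(NumElems, SplitBefore);
IRBuilder<> LoopIRB(BodyIP);
// ... emit per-element shadow work here, indexed by Index ...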
InnerAnalysisManagerProxy< FunctionAnalysisManager, Module > FunctionAnalysisManagerModuleProxy
Provide the FunctionAnalysisManager to Module proxy.
constexpr bool isPowerOf2_64(uint64_t Value)
Return true if the argument is a power of two > 0 (64 bit edition.)
Definition MathExtras.h:293
unsigned Log2_64(uint64_t Value)
Return the floor log base 2 of the specified value, -1 if the value is zero.
Definition MathExtras.h:348
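These MathExtras helpers turn power-of-two sizes into shift amounts and round bit counts up, a recurring pattern when sizing shadow accesses. A tiny sketch, assuming llvm/Support/MathExtras.h is included.
// For power-of-two sizes, scaling by the size becomes a shift.
uint64_t Size = 8;
bool IsPow2 = isPowerOf2_64(Size);   // true
unsigned ShiftAmt = Log2_64(Size);   // 3, since 8 == 1u << 3
unsigned CeilBits = Log2_32_Ceil(5); // 3, smallest n with (1u << n) >= 5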
auto dyn_cast_or_null(const Y &Val)
Definition Casting.h:759
LLVM_ABI std::pair< Function *, FunctionCallee > getOrCreateSanitizerCtorAndInitFunctions(Module &M, StringRef CtorName, StringRef InitName, ArrayRef< Type * > InitArgTypes, ArrayRef< Value * > InitArgs, function_ref< void(Function *, FunctionCallee)> FunctionsCreatedCallback, StringRef VersionCheckName=StringRef(), bool Weak=false)
Creates sanitizer constructor function lazily.
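This is the standard way a sanitizer pass registers its runtime initializer: lazily create a module constructor that calls the init function, then append it to llvm.global_ctors (see appendToGlobalCtors further down). A minimal sketch, assuming a Module M; the "msan.module_ctor" and "__msan_init" names follow the MemorySanitizer convention and are shown for illustration.
getOrCreateSanitizerCtorAndInitFunctions(
    M, /*CtorName=*/"msan.module_ctor", /*InitName=*/"__msan_init",
    /*InitArgTypes=*/{}, /*InitArgs=*/{},
    // The callback runs only when the constructor is actually created.
    [&](Function *Ctor, FunctionCallee) {
      appendToGlobalCtors(M, Ctor, /*Priority=*/0);
    });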
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition Debug.cpp:207
LLVM_ABI void report_fatal_error(Error Err, bool gen_crash_diag=true)
Definition Error.cpp:167
class LLVM_GSL_OWNER SmallVector
Forward declaration of SmallVector so that calculateSmallVectorDefaultInlinedElements can reference sizeof(SmallVector<T, 0>).
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type arguments.
Definition Casting.h:548
LLVM_ABI bool isKnownNonZero(const Value *V, const SimplifyQuery &Q, unsigned Depth=0)
Return true if the given value is known to be non-zero when defined.
LLVM_ABI raw_fd_ostream & errs()
This returns a reference to a raw_ostream for standard error.
AtomicOrdering
Atomic ordering for LLVM's memory model.
@ First
Helpers to iterate all locations in the MemoryEffectsBase class.
Definition ModRef.h:71
IRBuilder(LLVMContext &, FolderTy, InserterTy, MDNode *, ArrayRef< OperandBundleDef >) -> IRBuilder< FolderTy, InserterTy >
@ Or
Bitwise or logical OR of integers.
@ And
Bitwise or logical AND of integers.
@ Add
Sum of integers.
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition Alignment.h:155
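Align and alignTo are used wherever byte counts or addresses must be rounded up to their natural granularity. A tiny sketch.
// Round 13 bytes up to the next 8-byte boundary.
uint64_t Rounded = alignTo(13, Align(8)); // 16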
DWARFExpression::Operation Op
RoundingMode
Rounding mode.
ArrayRef(const T &OneElt) -> ArrayRef< T >
constexpr unsigned BitWidth
LLVM_ABI void appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data=nullptr)
Append F to the list of global ctors of module M with the given Priority.
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:565
iterator_range< df_iterator< T > > depth_first(const T &G)
LLVM_ABI Instruction * SplitBlockAndInsertIfThen(Value *Cond, BasicBlock::iterator SplitBefore, bool Unreachable, MDNode *BranchWeights=nullptr, DomTreeUpdater *DTU=nullptr, LoopInfo *LI=nullptr, BasicBlock *ThenBlock=nullptr)
Split the containing block at the specified instruction - everything before SplitBefore stays in the old basic block, the remaining instructions move to a new block, and the two are connected through a new 'then' block guarded by Cond; returns the 'then' block's terminator.
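Together with MDBuilder::createUnlikelyBranchWeights above, this is the canonical shape of an inserted shadow check: branch to a cold block when the shadow is non-zero and call the runtime there. A minimal sketch, assuming a boolean Cond (e.g. produced by CreateIsNotNull on the converted shadow), a BasicBlock::iterator SplitBefore, and a FunctionCallee WarningFn (hypothetical names).
MDBuilder MDB(Cond->getContext());
// Cold "then" block executed only when Cond is true.
Instruction *CheckTerm = SplitBlockAndInsertIfThen(
    Cond, SplitBefore, /*Unreachable=*/false,
    MDB.createUnlikelyBranchWeights());
IRBuilder<> ThenIRB(CheckTerm);
ThenIRB.CreateCall(WarningFn);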
LLVM_ABI void maybeMarkSanitizerLibraryCallNoBuiltin(CallInst *CI, const TargetLibraryInfo *TLI)
Given a CallInst, check if it calls a string function known to CodeGen, and mark it with NoBuiltin if so.
Definition Local.cpp:3832
LLVM_ABI bool removeUnreachableBlocks(Function &F, DomTreeUpdater *DTU=nullptr, MemorySSAUpdater *MSSAU=nullptr)
Remove all blocks that can not be reached from the function's entry.
Definition Local.cpp:2883
LLVM_ABI bool checkIfAlreadyInstrumented(Module &M, StringRef Flag)
Check if module has flag attached, if not add the flag.
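A typical use is an early exit at the top of the pass's run method so a module is never instrumented twice. A minimal sketch; the "nosanitize_memory" flag name follows MemorySanitizer's usage and is shown as an assumption.
// Bail out (preserving everything) if the module already carries the marker
// flag; otherwise the flag is added and instrumentation proceeds.
if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
  return PreservedAnalyses::all();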
std::string itostr(int64_t X)
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition MIRParser.h:39
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition Alignment.h:39
uint64_t value() const
This is a hole in the type system and should not be abused.
Definition Alignment.h:85
LLVM_ABI void printPipeline(raw_ostream &OS, function_ref< StringRef(StringRef)> MapClassName2PassName)
LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition PassManager.h:70