MemorySanitizer.cpp
1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
15/// We associate a few shadow bits with every byte of the application memory,
16/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
17/// bits on every memory read, propagate the shadow bits through some of the
18/// arithmetic instructions (including MOV), store the shadow bits on every
19/// memory write, report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
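///
/// For example, instrumentation of an integer addition conceptually becomes
/// (illustrative sketch, not the exact IR the pass emits):
///   c = a + b                           ; application code
///   shadow(c) = shadow(a) | shadow(b)   ; any poisoned input poisons c
/// and a conditional branch on c is preceded by a check that shadow(c) is all
/// zeros, with a call to one of the __msan_warning* functions otherwise.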
21///
22/// But there are differences too. The first and the major one:
23/// compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But this brings the major issue
26/// as well: msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
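///
/// Conceptually (illustrative only), a call "r = f(x)" is instrumented as:
///   store shadow(x) -> __msan_param_tls   ; caller side, before the call
///   r = f(x)                              ; callee reads its slot on entry
///   shadow(r) = load __msan_retval_tls    ; caller side, after the call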
44///
45/// Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
56///
57/// Every 4 aligned, consecutive bytes of application memory have one origin
58/// value associated with them. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
65/// This way it avoids needlessly overwriting the origin of the 4-byte region
66/// on a short (i.e. 1 byte) clean store, and it is also good for performance.
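///
/// For example (illustrative): a store of a byte with poisoned shadow to
/// application address 0x1003 updates the single 4-byte origin slot covering
/// app addresses 0x1000..0x1003, so a later warning on any of those bytes
/// points at that allocation site, while a fully initialized store to the
/// same address leaves the slot untouched.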
67///
68/// Atomic handling.
69///
70/// Ideally, every atomic store of an application value should update the
71/// corresponding shadow location in an atomic way. Unfortunately, an atomic
72/// store to two disjoint locations cannot be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
83///
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
87/// after the app operation. Computers don't work this way. The current
88/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
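///
/// Sketch of the resulting ordering (illustrative only):
///   store 0 -> shadow(addr)         ; clean shadow is stored first...
///   atomic store val -> addr        ; ...then the application value
/// and for loads:
///   val = atomic load addr          ; application load happens first...
///   s   = load shadow(addr)         ; ...then its shadow is loaded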
91///
92/// Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
95/// become initialized depending on the arguments. It may be possible to figure
96/// out which arguments are meant to point to inputs and outputs, but the
97/// actual semantics may only be visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate the offset for a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
100/// pointers. We take a conservative approach, generating calls to
101/// __msan_instrument_asm_store(ptr, size),
102/// which defer the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
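///
/// For example (illustrative), for an asm statement with a 4-byte memory
/// output operand v, the pass emits, right before the asm:
///   call __msan_instrument_asm_store(&v, 4)
/// and the runtime decides whether it is safe to unpoison those 4 bytes.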
108///
109/// KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112/// - KMSAN always tracks the origins and implies msan-keep-going=true;
113/// - KMSAN allocates shadow and origin memory for each page separately, so
114/// there are no explicit accesses to shadow and origin in the
115/// instrumentation.
116/// Shadow and origin values for a particular X-byte memory location
117/// (X=1,2,4,8) are accessed through pointers obtained via the
118/// __msan_metadata_ptr_for_load_X(ptr)
119/// __msan_metadata_ptr_for_store_X(ptr)
120/// functions. The corresponding functions check that the X-byte accesses
121/// are possible and return the pointers to shadow and origin memory.
122/// Arbitrary sized accesses are handled with:
123/// __msan_metadata_ptr_for_load_n(ptr, size)
124/// __msan_metadata_ptr_for_store_n(ptr, size);
125/// Note that the sanitizer code has to deal with how shadow/origin pairs
126/// returned by these functions are represented in different ABIs. In
127/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
128/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
129/// pointed to by a hidden parameter.
130/// - TLS variables are stored in a single per-task struct. A call to a
131/// function __msan_get_context_state() returning a pointer to that struct
132/// is inserted into every instrumented function at the start of the entry block;
133/// - __msan_warning() takes a 32-bit origin parameter;
134/// - local variables are poisoned with __msan_poison_alloca() upon function
135/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
136/// function;
137/// - the pass doesn't declare any global variables or add global constructors
138/// to the translation unit.
139///
140/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
141/// calls, making sure we're on the safe side wrt. possible false positives.
142///
143/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
144/// moment.
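///
/// For example (illustrative), a KMSAN-instrumented 8-byte store conceptually
/// becomes:
///   {shadow_ptr, origin_ptr} = __msan_metadata_ptr_for_store_8(addr)
///   store shadow(val) -> shadow_ptr
///   store origin(val) -> origin_ptr    ; only when the shadow is non-zero
///   store val -> addr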
145///
146//
147// FIXME: This sanitizer does not yet handle scalable vectors
148//
149//===----------------------------------------------------------------------===//
150
152#include "llvm/ADT/APInt.h"
153#include "llvm/ADT/ArrayRef.h"
154#include "llvm/ADT/DenseMap.h"
156#include "llvm/ADT/SetVector.h"
157#include "llvm/ADT/SmallPtrSet.h"
158#include "llvm/ADT/SmallVector.h"
160#include "llvm/ADT/StringRef.h"
164#include "llvm/IR/Argument.h"
166#include "llvm/IR/Attributes.h"
167#include "llvm/IR/BasicBlock.h"
168#include "llvm/IR/CallingConv.h"
169#include "llvm/IR/Constant.h"
170#include "llvm/IR/Constants.h"
171#include "llvm/IR/DataLayout.h"
172#include "llvm/IR/DerivedTypes.h"
173#include "llvm/IR/Function.h"
174#include "llvm/IR/GlobalValue.h"
176#include "llvm/IR/IRBuilder.h"
177#include "llvm/IR/InlineAsm.h"
178#include "llvm/IR/InstVisitor.h"
179#include "llvm/IR/InstrTypes.h"
180#include "llvm/IR/Instruction.h"
181#include "llvm/IR/Instructions.h"
183#include "llvm/IR/Intrinsics.h"
184#include "llvm/IR/IntrinsicsAArch64.h"
185#include "llvm/IR/IntrinsicsX86.h"
186#include "llvm/IR/MDBuilder.h"
187#include "llvm/IR/Module.h"
188#include "llvm/IR/Type.h"
189#include "llvm/IR/Value.h"
190#include "llvm/IR/ValueMap.h"
193#include "llvm/Support/Casting.h"
195#include "llvm/Support/Debug.h"
205#include <algorithm>
206#include <cassert>
207#include <cstddef>
208#include <cstdint>
209#include <memory>
210#include <numeric>
211#include <string>
212#include <tuple>
213
214using namespace llvm;
215
216#define DEBUG_TYPE "msan"
217
218DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
219 "Controls which checks to insert");
220
221DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
222 "Controls which instruction to instrument");
223
224static const unsigned kOriginSize = 4;
227
228// These constants must be kept in sync with the ones in msan.h.
229static const unsigned kParamTLSSize = 800;
230static const unsigned kRetvalTLSSize = 800;
231
232// Access sizes are powers of two: 1, 2, 4, 8.
233static const size_t kNumberOfAccessSizes = 4;
234
235/// Track origins of uninitialized values.
236///
237/// Adds a section to the MemorySanitizer report that points to the allocation
238/// (stack or heap) the uninitialized bits came from originally.
240 "msan-track-origins",
241 cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
242 cl::init(0));
243
244static cl::opt<bool> ClKeepGoing("msan-keep-going",
245 cl::desc("keep going after reporting a UMR"),
246 cl::Hidden, cl::init(false));
247
248static cl::opt<bool>
249 ClPoisonStack("msan-poison-stack",
250 cl::desc("poison uninitialized stack variables"), cl::Hidden,
251 cl::init(true));
252
254 "msan-poison-stack-with-call",
255 cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
256 cl::init(false));
257
259 "msan-poison-stack-pattern",
260 cl::desc("poison uninitialized stack variables with the given pattern"),
261 cl::Hidden, cl::init(0xff));
262
263static cl::opt<bool>
264 ClPrintStackNames("msan-print-stack-names",
265 cl::desc("Print name of local stack variable"),
266 cl::Hidden, cl::init(true));
267
268static cl::opt<bool>
269 ClPoisonUndef("msan-poison-undef",
270 cl::desc("Poison fully undef temporary values. "
271 "Partially undefined constant vectors "
272 "are unaffected by this flag (see "
273 "-msan-poison-undef-vectors)."),
274 cl::Hidden, cl::init(true));
275
277 "msan-poison-undef-vectors",
278 cl::desc("Precisely poison partially undefined constant vectors. "
279 "If false (legacy behavior), the entire vector is "
280 "considered fully initialized, which may lead to false "
281 "negatives. Fully undefined constant vectors are "
282 "unaffected by this flag (see -msan-poison-undef)."),
283 cl::Hidden, cl::init(false));
284
286 "msan-precise-disjoint-or",
287 cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
288 "disjointedness is ignored (i.e., 1|1 is initialized)."),
289 cl::Hidden, cl::init(false));
290
291static cl::opt<bool>
292 ClHandleICmp("msan-handle-icmp",
293 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
294 cl::Hidden, cl::init(true));
295
296static cl::opt<bool>
297 ClHandleICmpExact("msan-handle-icmp-exact",
298 cl::desc("exact handling of relational integer ICmp"),
299 cl::Hidden, cl::init(true));
300
302 "msan-handle-lifetime-intrinsics",
303 cl::desc(
304 "when possible, poison scoped variables at the beginning of the scope "
305 "(slower, but more precise)"),
306 cl::Hidden, cl::init(true));
307
308// When compiling the Linux kernel, we sometimes see false positives related to
309// MSan being unable to understand that inline assembly calls may initialize
310// local variables.
311// This flag makes the compiler conservatively unpoison every memory location
312// passed into an assembly call. Note that this may cause false negatives.
313// Because it's impossible to figure out the array sizes, we can only unpoison
314// the first sizeof(type) bytes for each type* pointer.
316 "msan-handle-asm-conservative",
317 cl::desc("conservative handling of inline assembly"), cl::Hidden,
318 cl::init(true));
319
320// This flag controls whether we check the shadow of the address
321// operand of load or store. Such bugs are very rare, since a load from
322// a garbage address typically results in SEGV, but they still happen
323// (e.g. only the lower bits of the address are garbage, or the access
324// happens early at program startup where malloc-ed memory is more likely
325// to be zeroed). As of 2012-08-28 this flag adds 20% slowdown.
327 "msan-check-access-address",
328 cl::desc("report accesses through a pointer which has poisoned shadow"),
329 cl::Hidden, cl::init(true));
330
332 "msan-eager-checks",
333 cl::desc("check arguments and return values at function call boundaries"),
334 cl::Hidden, cl::init(false));
335
337 "msan-dump-strict-instructions",
338 cl::desc("print out instructions with default strict semantics i.e.,"
339 "check that all the inputs are fully initialized, and mark "
340 "the output as fully initialized. These semantics are applied "
341 "to instructions that could not be handled explicitly nor "
342 "heuristically."),
343 cl::Hidden, cl::init(false));
344
345// Currently, all the heuristically handled instructions are specifically
346// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
347// to parallel 'msan-dump-strict-instructions', and to keep the door open to
348// handling non-intrinsic instructions heuristically.
350 "msan-dump-heuristic-instructions",
351 cl::desc("Prints 'unknown' instructions that were handled heuristically. "
352 "Use -msan-dump-strict-instructions to print instructions that "
353 "could not be handled explicitly nor heuristically."),
354 cl::Hidden, cl::init(false));
355
357 "msan-instrumentation-with-call-threshold",
358 cl::desc(
359 "If the function being instrumented requires more than "
360 "this number of checks and origin stores, use callbacks instead of "
361 "inline checks (-1 means never use callbacks)."),
362 cl::Hidden, cl::init(3500));
363
364static cl::opt<bool>
365 ClEnableKmsan("msan-kernel",
366 cl::desc("Enable KernelMemorySanitizer instrumentation"),
367 cl::Hidden, cl::init(false));
368
369static cl::opt<bool>
370 ClDisableChecks("msan-disable-checks",
371 cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
372 cl::init(false));
373
374static cl::opt<bool>
375 ClCheckConstantShadow("msan-check-constant-shadow",
376 cl::desc("Insert checks for constant shadow values"),
377 cl::Hidden, cl::init(true));
378
379// This is off by default because of a bug in gold:
380// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
381static cl::opt<bool>
382 ClWithComdat("msan-with-comdat",
383 cl::desc("Place MSan constructors in comdat sections"),
384 cl::Hidden, cl::init(false));
385
386// These options allow specifying custom memory map parameters.
387// See MemoryMapParams for details.
388static cl::opt<uint64_t> ClAndMask("msan-and-mask",
389 cl::desc("Define custom MSan AndMask"),
390 cl::Hidden, cl::init(0));
391
392static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
393 cl::desc("Define custom MSan XorMask"),
394 cl::Hidden, cl::init(0));
395
396static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
397 cl::desc("Define custom MSan ShadowBase"),
398 cl::Hidden, cl::init(0));
399
400static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
401 cl::desc("Define custom MSan OriginBase"),
402 cl::Hidden, cl::init(0));
403
404static cl::opt<int>
405 ClDisambiguateWarning("msan-disambiguate-warning-threshold",
406 cl::desc("Define threshold for number of checks per "
407 "debug location to force origin update."),
408 cl::Hidden, cl::init(3));
409
410const char kMsanModuleCtorName[] = "msan.module_ctor";
411const char kMsanInitName[] = "__msan_init";
412
413namespace {
414
415// Memory map parameters used in application-to-shadow address calculation.
416// Offset = (Addr & ~AndMask) ^ XorMask
417// Shadow = ShadowBase + Offset
418// Origin = OriginBase + Offset
419struct MemoryMapParams {
420 uint64_t AndMask;
421 uint64_t XorMask;
422 uint64_t ShadowBase;
423 uint64_t OriginBase;
424};
425
426struct PlatformMemoryMapParams {
427 const MemoryMapParams *bits32;
428 const MemoryMapParams *bits64;
429};
430
431} // end anonymous namespace
432
433// i386 Linux
434static const MemoryMapParams Linux_I386_MemoryMapParams = {
435 0x000080000000, // AndMask
436 0, // XorMask (not used)
437 0, // ShadowBase (not used)
438 0x000040000000, // OriginBase
439};
440
441// x86_64 Linux
442static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
443 0, // AndMask (not used)
444 0x500000000000, // XorMask
445 0, // ShadowBase (not used)
446 0x100000000000, // OriginBase
447};
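
// Illustrative arithmetic only: with the x86_64 Linux parameters above, an
// application address A maps to
//   Shadow(A) = A ^ 0x500000000000
//   Origin(A) = 0x100000000000 + (A ^ 0x500000000000)
// so A = 0x700000001000 gives Shadow = 0x200000001000 and
// Origin = 0x300000001000.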
448
449// mips32 Linux
450// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
451// after picking good constants
452
453// mips64 Linux
454static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
455 0, // AndMask (not used)
456 0x008000000000, // XorMask
457 0, // ShadowBase (not used)
458 0x002000000000, // OriginBase
459};
460
461// ppc32 Linux
462// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
463// after picking good constants
464
465// ppc64 Linux
466static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
467 0xE00000000000, // AndMask
468 0x100000000000, // XorMask
469 0x080000000000, // ShadowBase
470 0x1C0000000000, // OriginBase
471};
472
473// s390x Linux
474static const MemoryMapParams Linux_S390X_MemoryMapParams = {
475 0xC00000000000, // AndMask
476 0, // XorMask (not used)
477 0x080000000000, // ShadowBase
478 0x1C0000000000, // OriginBase
479};
480
481// arm32 Linux
482// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
483// after picking good constants
484
485// aarch64 Linux
486static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
487 0, // AndMask (not used)
488 0x0B00000000000, // XorMask
489 0, // ShadowBase (not used)
490 0x0200000000000, // OriginBase
491};
492
493// loongarch64 Linux
494static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
495 0, // AndMask (not used)
496 0x500000000000, // XorMask
497 0, // ShadowBase (not used)
498 0x100000000000, // OriginBase
499};
500
501// riscv32 Linux
502// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
503// after picking good constants
504
505// aarch64 FreeBSD
506static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
507 0x1800000000000, // AndMask
508 0x0400000000000, // XorMask
509 0x0200000000000, // ShadowBase
510 0x0700000000000, // OriginBase
511};
512
513// i386 FreeBSD
514static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
515 0x000180000000, // AndMask
516 0x000040000000, // XorMask
517 0x000020000000, // ShadowBase
518 0x000700000000, // OriginBase
519};
520
521// x86_64 FreeBSD
522static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
523 0xc00000000000, // AndMask
524 0x200000000000, // XorMask
525 0x100000000000, // ShadowBase
526 0x380000000000, // OriginBase
527};
528
529// x86_64 NetBSD
530static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
531 0, // AndMask
532 0x500000000000, // XorMask
533 0, // ShadowBase
534 0x100000000000, // OriginBase
535};
536
537static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
540};
541
542static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
543 nullptr,
545};
546
547static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
548 nullptr,
550};
551
552static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
553 nullptr,
555};
556
557static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
558 nullptr,
560};
561
562static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
563 nullptr,
565};
566
567static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
568 nullptr,
570};
571
572static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
575};
576
577static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
578 nullptr,
580};
581
582namespace {
583
584/// Instrument functions of a module to detect uninitialized reads.
585///
586/// Instantiating MemorySanitizer inserts the msan runtime library API function
587/// declarations into the module if they don't exist already. Instantiating
588/// ensures the __msan_init function is in the list of global constructors for
589/// the module.
590class MemorySanitizer {
591public:
592 MemorySanitizer(Module &M, MemorySanitizerOptions Options)
593 : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
594 Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
595 initializeModule(M);
596 }
597
598 // MSan cannot be moved or copied because of MapParams.
599 MemorySanitizer(MemorySanitizer &&) = delete;
600 MemorySanitizer &operator=(MemorySanitizer &&) = delete;
601 MemorySanitizer(const MemorySanitizer &) = delete;
602 MemorySanitizer &operator=(const MemorySanitizer &) = delete;
603
604 bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
605
606private:
607 friend struct MemorySanitizerVisitor;
608 friend struct VarArgHelperBase;
609 friend struct VarArgAMD64Helper;
610 friend struct VarArgAArch64Helper;
611 friend struct VarArgPowerPC64Helper;
612 friend struct VarArgPowerPC32Helper;
613 friend struct VarArgSystemZHelper;
614 friend struct VarArgI386Helper;
615 friend struct VarArgGenericHelper;
616
617 void initializeModule(Module &M);
618 void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
619 void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
620 void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
621
622 template <typename... ArgsTy>
623 FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
624 ArgsTy... Args);
625
626 /// True if we're compiling the Linux kernel.
627 bool CompileKernel;
628 /// Track origins (allocation points) of uninitialized values.
629 int TrackOrigins;
630 bool Recover;
631 bool EagerChecks;
632
633 Triple TargetTriple;
634 LLVMContext *C;
635 Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
636 Type *OriginTy;
637 PointerType *PtrTy; ///< Pointer type in the default address space.
638
639 // XxxTLS variables represent the per-thread state in MSan and per-task state
640 // in KMSAN.
641 // For the userspace these point to thread-local globals. In the kernel land
642 // they point to the members of a per-task struct obtained via a call to
643 // __msan_get_context_state().
644
645 /// Thread-local shadow storage for function parameters.
646 Value *ParamTLS;
647
648 /// Thread-local origin storage for function parameters.
649 Value *ParamOriginTLS;
650
651 /// Thread-local shadow storage for function return value.
652 Value *RetvalTLS;
653
654 /// Thread-local origin storage for function return value.
655 Value *RetvalOriginTLS;
656
657 /// Thread-local shadow storage for in-register va_arg function.
658 Value *VAArgTLS;
659
660 /// Thread-local origin storage for in-register va_arg function.
661 Value *VAArgOriginTLS;
662
663 /// Thread-local storage for the size of the va_arg overflow area.
664 Value *VAArgOverflowSizeTLS;
665
666 /// Are the instrumentation callbacks set up?
667 bool CallbacksInitialized = false;
668
669 /// The run-time callback to print a warning.
670 FunctionCallee WarningFn;
671
672 // These arrays are indexed by log2(AccessSize).
673 FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
674 FunctionCallee MaybeWarningVarSizeFn;
675 FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
676
677 /// Run-time helper that generates a new origin value for a stack
678 /// allocation.
679 FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
680 /// Same helper, but without the stack variable description.
681 FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
682
683 /// Run-time helper that poisons stack on function entry.
684 FunctionCallee MsanPoisonStackFn;
685
686 /// Run-time helper that records a store (or any event) of an
687 /// uninitialized value and returns an updated origin id encoding this info.
688 FunctionCallee MsanChainOriginFn;
689
690 /// Run-time helper that paints an origin over a region.
691 FunctionCallee MsanSetOriginFn;
692
693 /// MSan runtime replacements for memmove, memcpy and memset.
694 FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
695
696 /// KMSAN callback for task-local function argument shadow.
697 StructType *MsanContextStateTy;
698 FunctionCallee MsanGetContextStateFn;
699
700 /// Functions for poisoning/unpoisoning local variables
701 FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
702
703 /// Pair of shadow/origin pointers.
704 Type *MsanMetadata;
705
706 /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
707 FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
708 FunctionCallee MsanMetadataPtrForLoad_1_8[4];
709 FunctionCallee MsanMetadataPtrForStore_1_8[4];
710 FunctionCallee MsanInstrumentAsmStoreFn;
711
712 /// Storage for return values of the MsanMetadataPtrXxx functions.
713 Value *MsanMetadataAlloca;
714
715 /// Helper to choose between different MsanMetadataPtrXxx().
716 FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
717
718 /// Memory map parameters used in application-to-shadow calculation.
719 const MemoryMapParams *MapParams;
720
721 /// Custom memory map parameters used when -msan-shadow-base or
722 /// -msan-origin-base is provided.
723 MemoryMapParams CustomMapParams;
724
725 MDNode *ColdCallWeights;
726
727 /// Branch weights for origin store.
728 MDNode *OriginStoreWeights;
729};
730
731void insertModuleCtor(Module &M) {
734 /*InitArgTypes=*/{},
735 /*InitArgs=*/{},
736 // This callback is invoked when the functions are created the first
737 // time. Hook them into the global ctors list in that case:
738 [&](Function *Ctor, FunctionCallee) {
739 if (!ClWithComdat) {
740 appendToGlobalCtors(M, Ctor, 0);
741 return;
742 }
743 Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
744 Ctor->setComdat(MsanCtorComdat);
745 appendToGlobalCtors(M, Ctor, 0, Ctor);
746 });
747}
748
749template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
750 return (Opt.getNumOccurrences() > 0) ? Opt : Default;
751}
752
753} // end anonymous namespace
754
756 bool EagerChecks)
757 : Kernel(getOptOrDefault(ClEnableKmsan, K)),
758 TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
759 Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
760 EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
761
764 // Return early if nosanitize_memory module flag is present for the module.
765 if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
766 return PreservedAnalyses::all();
767 bool Modified = false;
768 if (!Options.Kernel) {
769 insertModuleCtor(M);
770 Modified = true;
771 }
772
773 auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
774 for (Function &F : M) {
775 if (F.empty())
776 continue;
777 MemorySanitizer Msan(*F.getParent(), Options);
778 Modified |=
779 Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
780 }
781
782 if (!Modified)
783 return PreservedAnalyses::all();
784
786 // GlobalsAA is considered stateless and does not get invalidated unless
787 // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
788 // make changes that require GlobalsAA to be invalidated.
789 PA.abandon<GlobalsAA>();
790 return PA;
791}
792
794 raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
796 OS, MapClassName2PassName);
797 OS << '<';
798 if (Options.Recover)
799 OS << "recover;";
800 if (Options.Kernel)
801 OS << "kernel;";
802 if (Options.EagerChecks)
803 OS << "eager-checks;";
804 OS << "track-origins=" << Options.TrackOrigins;
805 OS << '>';
806}
807
808/// Create a non-const global initialized with the given string.
809///
810/// Creates a writable global for Str so that we can pass it to the
811/// run-time lib. Runtime uses first 4 bytes of the string to store the
812/// frame ID, so the string needs to be mutable.
814 StringRef Str) {
815 Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
816 return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
817 GlobalValue::PrivateLinkage, StrConst, "");
818}
819
820template <typename... ArgsTy>
822MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
823 ArgsTy... Args) {
824 if (TargetTriple.getArch() == Triple::systemz) {
825 // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
826 return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
827 std::forward<ArgsTy>(Args)...);
828 }
829
830 return M.getOrInsertFunction(Name, MsanMetadata,
831 std::forward<ArgsTy>(Args)...);
832}
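
// Illustrative declarations produced by the helper above (conceptual, not
// verbatim IR): on most targets
//   declare { ptr, ptr } @__msan_metadata_ptr_for_load_1(ptr)
// while on SystemZ the shadow/origin pair is written through a hidden first
// parameter:
//   declare void @__msan_metadata_ptr_for_load_1(ptr, ptr)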
833
834/// Create KMSAN API callbacks.
835void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
836 IRBuilder<> IRB(*C);
837
838 // These will be initialized in insertKmsanPrologue().
839 RetvalTLS = nullptr;
840 RetvalOriginTLS = nullptr;
841 ParamTLS = nullptr;
842 ParamOriginTLS = nullptr;
843 VAArgTLS = nullptr;
844 VAArgOriginTLS = nullptr;
845 VAArgOverflowSizeTLS = nullptr;
846
847 WarningFn = M.getOrInsertFunction("__msan_warning",
848 TLI.getAttrList(C, {0}, /*Signed=*/false),
849 IRB.getVoidTy(), IRB.getInt32Ty());
850
851 // Requests the per-task context state (kmsan_context_state*) from the
852 // runtime library.
853 MsanContextStateTy = StructType::get(
854 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
855 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
856 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
857 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
858 IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
859 OriginTy);
860 MsanGetContextStateFn =
861 M.getOrInsertFunction("__msan_get_context_state", PtrTy);
862
863 MsanMetadata = StructType::get(PtrTy, PtrTy);
864
865 for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
866 std::string name_load =
867 "__msan_metadata_ptr_for_load_" + std::to_string(size);
868 std::string name_store =
869 "__msan_metadata_ptr_for_store_" + std::to_string(size);
870 MsanMetadataPtrForLoad_1_8[ind] =
871 getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
872 MsanMetadataPtrForStore_1_8[ind] =
873 getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
874 }
875
876 MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
877 M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
878 MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
879 M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
880
881 // Functions for poisoning and unpoisoning memory.
882 MsanPoisonAllocaFn = M.getOrInsertFunction(
883 "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
884 MsanUnpoisonAllocaFn = M.getOrInsertFunction(
885 "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
886}
887
889 return M.getOrInsertGlobal(Name, Ty, [&] {
890 return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
891 nullptr, Name, nullptr,
893 });
894}
895
896/// Insert declarations for userspace-specific functions and globals.
897void MemorySanitizer::createUserspaceApi(Module &M,
898 const TargetLibraryInfo &TLI) {
899 IRBuilder<> IRB(*C);
900
901 // Create the callback.
902 // FIXME: this function should have "Cold" calling conv,
903 // which is not yet implemented.
904 if (TrackOrigins) {
905 StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
906 : "__msan_warning_with_origin_noreturn";
907 WarningFn = M.getOrInsertFunction(WarningFnName,
908 TLI.getAttrList(C, {0}, /*Signed=*/false),
909 IRB.getVoidTy(), IRB.getInt32Ty());
910 } else {
911 StringRef WarningFnName =
912 Recover ? "__msan_warning" : "__msan_warning_noreturn";
913 WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
914 }
915
916 // Create the global TLS variables.
917 RetvalTLS =
918 getOrInsertGlobal(M, "__msan_retval_tls",
919 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
920
921 RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
922
923 ParamTLS =
924 getOrInsertGlobal(M, "__msan_param_tls",
925 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
926
927 ParamOriginTLS =
928 getOrInsertGlobal(M, "__msan_param_origin_tls",
929 ArrayType::get(OriginTy, kParamTLSSize / 4));
930
931 VAArgTLS =
932 getOrInsertGlobal(M, "__msan_va_arg_tls",
933 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
934
935 VAArgOriginTLS =
936 getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
937 ArrayType::get(OriginTy, kParamTLSSize / 4));
938
939 VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
940 IRB.getIntPtrTy(M.getDataLayout()));
941
942 for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
943 AccessSizeIndex++) {
944 unsigned AccessSize = 1 << AccessSizeIndex;
945 std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
946 MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
947 FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
948 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
949 MaybeWarningVarSizeFn = M.getOrInsertFunction(
950 "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
951 IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
952 FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
953 MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
954 FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
955 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
956 IRB.getInt32Ty());
957 }
958
959 MsanSetAllocaOriginWithDescriptionFn =
960 M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
961 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
962 MsanSetAllocaOriginNoDescriptionFn =
963 M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
964 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
965 MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
966 IRB.getVoidTy(), PtrTy, IntptrTy);
967}
968
969/// Insert extern declarations of runtime-provided functions and globals.
970void MemorySanitizer::initializeCallbacks(Module &M,
971 const TargetLibraryInfo &TLI) {
972 // Only do this once.
973 if (CallbacksInitialized)
974 return;
975
976 IRBuilder<> IRB(*C);
977 // Initialize callbacks that are common for kernel and userspace
978 // instrumentation.
979 MsanChainOriginFn = M.getOrInsertFunction(
980 "__msan_chain_origin",
981 TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
982 IRB.getInt32Ty());
983 MsanSetOriginFn = M.getOrInsertFunction(
984 "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
985 IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
986 MemmoveFn =
987 M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
988 MemcpyFn =
989 M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
990 MemsetFn = M.getOrInsertFunction("__msan_memset",
991 TLI.getAttrList(C, {1}, /*Signed=*/true),
992 PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
993
994 MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
995 "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
996
997 if (CompileKernel) {
998 createKernelApi(M, TLI);
999 } else {
1000 createUserspaceApi(M, TLI);
1001 }
1002 CallbacksInitialized = true;
1003}
1004
1005FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
1006 int size) {
1007 FunctionCallee *Fns =
1008 isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
1009 switch (size) {
1010 case 1:
1011 return Fns[0];
1012 case 2:
1013 return Fns[1];
1014 case 4:
1015 return Fns[2];
1016 case 8:
1017 return Fns[3];
1018 default:
1019 return nullptr;
1020 }
1021}
1022
1023/// Module-level initialization.
1024///
1025/// Inserts a call to __msan_init into the module's constructor list.
1026void MemorySanitizer::initializeModule(Module &M) {
1027 auto &DL = M.getDataLayout();
1028
1029 TargetTriple = M.getTargetTriple();
1030
1031 bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
1032 bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
1033 // Check the overrides first
1034 if (ShadowPassed || OriginPassed) {
1035 CustomMapParams.AndMask = ClAndMask;
1036 CustomMapParams.XorMask = ClXorMask;
1037 CustomMapParams.ShadowBase = ClShadowBase;
1038 CustomMapParams.OriginBase = ClOriginBase;
1039 MapParams = &CustomMapParams;
1040 } else {
1041 switch (TargetTriple.getOS()) {
1042 case Triple::FreeBSD:
1043 switch (TargetTriple.getArch()) {
1044 case Triple::aarch64:
1045 MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
1046 break;
1047 case Triple::x86_64:
1048 MapParams = FreeBSD_X86_MemoryMapParams.bits64;
1049 break;
1050 case Triple::x86:
1051 MapParams = FreeBSD_X86_MemoryMapParams.bits32;
1052 break;
1053 default:
1054 report_fatal_error("unsupported architecture");
1055 }
1056 break;
1057 case Triple::NetBSD:
1058 switch (TargetTriple.getArch()) {
1059 case Triple::x86_64:
1060 MapParams = NetBSD_X86_MemoryMapParams.bits64;
1061 break;
1062 default:
1063 report_fatal_error("unsupported architecture");
1064 }
1065 break;
1066 case Triple::Linux:
1067 switch (TargetTriple.getArch()) {
1068 case Triple::x86_64:
1069 MapParams = Linux_X86_MemoryMapParams.bits64;
1070 break;
1071 case Triple::x86:
1072 MapParams = Linux_X86_MemoryMapParams.bits32;
1073 break;
1074 case Triple::mips64:
1075 case Triple::mips64el:
1076 MapParams = Linux_MIPS_MemoryMapParams.bits64;
1077 break;
1078 case Triple::ppc64:
1079 case Triple::ppc64le:
1080 MapParams = Linux_PowerPC_MemoryMapParams.bits64;
1081 break;
1082 case Triple::systemz:
1083 MapParams = Linux_S390_MemoryMapParams.bits64;
1084 break;
1085 case Triple::aarch64:
1086 case Triple::aarch64_be:
1087 MapParams = Linux_ARM_MemoryMapParams.bits64;
1088 break;
1090 MapParams = Linux_LoongArch_MemoryMapParams.bits64;
1091 break;
1092 default:
1093 report_fatal_error("unsupported architecture");
1094 }
1095 break;
1096 default:
1097 report_fatal_error("unsupported operating system");
1098 }
1099 }
1100
1101 C = &(M.getContext());
1102 IRBuilder<> IRB(*C);
1103 IntptrTy = IRB.getIntPtrTy(DL);
1104 OriginTy = IRB.getInt32Ty();
1105 PtrTy = IRB.getPtrTy();
1106
1107 ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1108 OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1109
1110 if (!CompileKernel) {
1111 if (TrackOrigins)
1112 M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1113 return new GlobalVariable(
1114 M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1115 IRB.getInt32(TrackOrigins), "__msan_track_origins");
1116 });
1117
1118 if (Recover)
1119 M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1120 return new GlobalVariable(M, IRB.getInt32Ty(), true,
1121 GlobalValue::WeakODRLinkage,
1122 IRB.getInt32(Recover), "__msan_keep_going");
1123 });
1124 }
1125}
1126
1127namespace {
1128
1129/// A helper class that handles instrumentation of VarArg
1130/// functions on a particular platform.
1131///
1132/// Implementations are expected to insert the instrumentation
1133/// necessary to propagate argument shadow through VarArg function
1134/// calls. Visit* methods are called during an InstVisitor pass over
1135/// the function, and should avoid creating new basic blocks. A new
1136/// instance of this class is created for each instrumented function.
1137struct VarArgHelper {
1138 virtual ~VarArgHelper() = default;
1139
1140 /// Visit a CallBase.
1141 virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1142
1143 /// Visit a va_start call.
1144 virtual void visitVAStartInst(VAStartInst &I) = 0;
1145
1146 /// Visit a va_copy call.
1147 virtual void visitVACopyInst(VACopyInst &I) = 0;
1148
1149 /// Finalize function instrumentation.
1150 ///
1151 /// This method is called after visiting all interesting (see above)
1152 /// instructions in a function.
1153 virtual void finalizeInstrumentation() = 0;
1154};
1155
1156struct MemorySanitizerVisitor;
1157
1158} // end anonymous namespace
1159
1160static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1161 MemorySanitizerVisitor &Visitor);
1162
1163static unsigned TypeSizeToSizeIndex(TypeSize TS) {
1164 if (TS.isScalable())
1165 // Scalable types unconditionally take slowpaths.
1166 return kNumberOfAccessSizes;
1167 unsigned TypeSizeFixed = TS.getFixedValue();
1168 if (TypeSizeFixed <= 8)
1169 return 0;
1170 return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
1171}
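
// Illustrative values (the argument is a size in bits): 8 -> 0, 16 -> 1,
// 32 -> 2, 64 -> 3; 128 bits and larger, as well as scalable sizes, yield an
// index >= kNumberOfAccessSizes, which callers treat as "no fitting callback,
// use the generic path".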
1172
1173namespace {
1174
1175/// Helper class to attach the debug information of the given instruction to
1176/// new instructions inserted after it.
1177class NextNodeIRBuilder : public IRBuilder<> {
1178public:
1179 explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1180 SetCurrentDebugLocation(IP->getDebugLoc());
1181 }
1182};
1183
1184/// This class does all the work for a given function. Store and Load
1185/// instructions store and load corresponding shadow and origin
1186/// values. Most instructions propagate shadow from arguments to their
1187/// return values. Certain instructions (most importantly, BranchInst)
1188/// test their argument shadow and print reports (with a runtime call) if it's
1189/// non-zero.
1190struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1191 Function &F;
1192 MemorySanitizer &MS;
1193 SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1194 ValueMap<Value *, Value *> ShadowMap, OriginMap;
1195 std::unique_ptr<VarArgHelper> VAHelper;
1196 const TargetLibraryInfo *TLI;
1197 Instruction *FnPrologueEnd;
1198 SmallVector<Instruction *, 16> Instructions;
1199
1200 // The following flags disable parts of MSan instrumentation based on
1201 // exclusion list contents and command-line options.
1202 bool InsertChecks;
1203 bool PropagateShadow;
1204 bool PoisonStack;
1205 bool PoisonUndef;
1206 bool PoisonUndefVectors;
1207
1208 struct ShadowOriginAndInsertPoint {
1209 Value *Shadow;
1210 Value *Origin;
1211 Instruction *OrigIns;
1212
1213 ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1214 : Shadow(S), Origin(O), OrigIns(I) {}
1215 };
1217 DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1218 SmallSetVector<AllocaInst *, 16> AllocaSet;
1221 int64_t SplittableBlocksCount = 0;
1222
1223 MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1224 const TargetLibraryInfo &TLI)
1225 : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1226 bool SanitizeFunction =
1227 F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1228 InsertChecks = SanitizeFunction;
1229 PropagateShadow = SanitizeFunction;
1230 PoisonStack = SanitizeFunction && ClPoisonStack;
1231 PoisonUndef = SanitizeFunction && ClPoisonUndef;
1232 PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
1233
1234 // In the presence of unreachable blocks, we may see Phi nodes with
1235 // incoming nodes from such blocks. Since InstVisitor skips unreachable
1236 // blocks, such nodes will not have any shadow value associated with them.
1237 // It's easier to remove unreachable blocks than deal with missing shadow.
1239
1240 MS.initializeCallbacks(*F.getParent(), TLI);
1241 FnPrologueEnd =
1242 IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
1243 .CreateIntrinsic(Intrinsic::donothing, {});
1244
1245 if (MS.CompileKernel) {
1246 IRBuilder<> IRB(FnPrologueEnd);
1247 insertKmsanPrologue(IRB);
1248 }
1249
1250 LLVM_DEBUG(if (!InsertChecks) dbgs()
1251 << "MemorySanitizer is not inserting checks into '"
1252 << F.getName() << "'\n");
1253 }
1254
1255 bool instrumentWithCalls(Value *V) {
1256 // Constants likely will be eliminated by follow-up passes.
1257 if (isa<Constant>(V))
1258 return false;
1259 ++SplittableBlocksCount;
1261 SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1262 }
1263
1264 bool isInPrologue(Instruction &I) {
1265 return I.getParent() == FnPrologueEnd->getParent() &&
1266 (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1267 }
1268
1269 // Creates a new origin and records the stack trace. In general we can call
1270 // this function for any origin manipulation we like. However it will cost
1271 // runtime resources. So use this wisely only if it can provide additional
1272 // information helpful to a user.
1273 Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1274 if (MS.TrackOrigins <= 1)
1275 return V;
1276 return IRB.CreateCall(MS.MsanChainOriginFn, V);
1277 }
1278
1279 Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1280 const DataLayout &DL = F.getDataLayout();
1281 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1282 if (IntptrSize == kOriginSize)
1283 return Origin;
1284 assert(IntptrSize == kOriginSize * 2);
1285 Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
1286 return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1287 }
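
// For example (illustrative): on a 64-bit target an origin id 0xAABBCCDD is
// widened to 0xAABBCCDDAABBCCDD, so a single 8-byte store in paintOrigin
// below paints two adjacent 4-byte origin slots at once.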
1288
1289 /// Fill memory range with the given origin value.
1290 void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1291 TypeSize TS, Align Alignment) {
1292 const DataLayout &DL = F.getDataLayout();
1293 const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1294 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1295 assert(IntptrAlignment >= kMinOriginAlignment);
1296 assert(IntptrSize >= kOriginSize);
1297
1298 // Note: The loop-based form works for fixed-length vectors too; however,
1299 // we prefer to unroll and specialize the alignment below.
1300 if (TS.isScalable()) {
1301 Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
1302 Value *RoundUp =
1303 IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
1304 Value *End =
1305 IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
1306 auto [InsertPt, Index] =
1308 IRB.SetInsertPoint(InsertPt);
1309
1310 Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
1312 return;
1313 }
1314
1315 unsigned Size = TS.getFixedValue();
1316
1317 unsigned Ofs = 0;
1318 Align CurrentAlignment = Alignment;
1319 if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1320 Value *IntptrOrigin = originToIntptr(IRB, Origin);
1321 Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
1322 for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1323 Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1324 : IntptrOriginPtr;
1325 IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1326 Ofs += IntptrSize / kOriginSize;
1327 CurrentAlignment = IntptrAlignment;
1328 }
1329 }
1330
1331 for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1332 Value *GEP =
1333 i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1334 IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1335 CurrentAlignment = kMinOriginAlignment;
1336 }
1337 }
1338
1339 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1340 Value *OriginPtr, Align Alignment) {
1341 const DataLayout &DL = F.getDataLayout();
1342 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1343 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
1344 // ZExt cannot convert between vector and scalar
1345 Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1346 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1347 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1348 // Origin is not needed: value is initialized or const shadow is
1349 // ignored.
1350 return;
1351 }
1352 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1353 // Copy origin as the value is definitely uninitialized.
1354 paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1355 OriginAlignment);
1356 return;
1357 }
1358 // Fall back to a runtime check, which can still be optimized out later.
1359 }
1360
1361 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1362 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1363 if (instrumentWithCalls(ConvertedShadow) &&
1364 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1365 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1366 Value *ConvertedShadow2 =
1367 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1368 CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
1369 CB->addParamAttr(0, Attribute::ZExt);
1370 CB->addParamAttr(2, Attribute::ZExt);
1371 } else {
1372 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1374 Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1375 IRBuilder<> IRBNew(CheckTerm);
1376 paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1377 OriginAlignment);
1378 }
1379 }
1380
1381 void materializeStores() {
1382 for (StoreInst *SI : StoreList) {
1383 IRBuilder<> IRB(SI);
1384 Value *Val = SI->getValueOperand();
1385 Value *Addr = SI->getPointerOperand();
1386 Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1387 Value *ShadowPtr, *OriginPtr;
1388 Type *ShadowTy = Shadow->getType();
1389 const Align Alignment = SI->getAlign();
1390 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1391 std::tie(ShadowPtr, OriginPtr) =
1392 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1393
1394 [[maybe_unused]] StoreInst *NewSI =
1395 IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1396 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1397
1398 if (SI->isAtomic())
1399 SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1400
1401 if (MS.TrackOrigins && !SI->isAtomic())
1402 storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1403 OriginAlignment);
1404 }
1405 }
1406
1407 // Returns true if Debug Location corresponds to multiple warnings.
1408 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1409 if (MS.TrackOrigins < 2)
1410 return false;
1411
1412 if (LazyWarningDebugLocationCount.empty())
1413 for (const auto &I : InstrumentationList)
1414 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1415
1416 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1417 }
1418
1419 /// Helper function to insert a warning at IRB's current insert point.
1420 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1421 if (!Origin)
1422 Origin = (Value *)IRB.getInt32(0);
1423 assert(Origin->getType()->isIntegerTy());
1424
1425 if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1426 // Try to create additional origin with debug info of the last origin
1427 // instruction. It may provide additional information to the user.
1428 if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1429 assert(MS.TrackOrigins);
1430 auto NewDebugLoc = OI->getDebugLoc();
1431 // Origin update with missing or the same debug location provides no
1432 // additional value.
1433 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1434 // Insert update just before the check, so we call runtime only just
1435 // before the report.
1436 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1437 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1438 Origin = updateOrigin(Origin, IRBOrigin);
1439 }
1440 }
1441 }
1442
1443 if (MS.CompileKernel || MS.TrackOrigins)
1444 IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1445 else
1446 IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1447 // FIXME: Insert UnreachableInst if !MS.Recover?
1448 // This may invalidate some of the following checks and needs to be done
1449 // at the very end.
1450 }
1451
1452 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1453 Value *Origin) {
1454 const DataLayout &DL = F.getDataLayout();
1455 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1456 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1457 if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
1458 // ZExt cannot convert between vector and scalar
1459 ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
1460 Value *ConvertedShadow2 =
1461 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1462
1463 if (SizeIndex < kNumberOfAccessSizes) {
1464 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1465 CallBase *CB = IRB.CreateCall(
1466 Fn,
1467 {ConvertedShadow2,
1468 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1469 CB->addParamAttr(0, Attribute::ZExt);
1470 CB->addParamAttr(1, Attribute::ZExt);
1471 } else {
1472 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1473 Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
1474 IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
1475 unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
1476 CallBase *CB = IRB.CreateCall(
1477 Fn,
1478 {ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
1479 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1480 CB->addParamAttr(1, Attribute::ZExt);
1481 CB->addParamAttr(2, Attribute::ZExt);
1482 }
1483 } else {
1484 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1486 Cmp, &*IRB.GetInsertPoint(),
1487 /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1488
1489 IRB.SetInsertPoint(CheckTerm);
1490 insertWarningFn(IRB, Origin);
1491 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1492 }
1493 }
1494
1495 void materializeInstructionChecks(
1496 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1497 const DataLayout &DL = F.getDataLayout();
1498 // Disable combining in some cases. TrackOrigins checks each shadow to pick
1499 // correct origin.
1500 bool Combine = !MS.TrackOrigins;
1501 Instruction *Instruction = InstructionChecks.front().OrigIns;
1502 Value *Shadow = nullptr;
1503 for (const auto &ShadowData : InstructionChecks) {
1504 assert(ShadowData.OrigIns == Instruction);
1505 IRBuilder<> IRB(Instruction);
1506
1507 Value *ConvertedShadow = ShadowData.Shadow;
1508
1509 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1510 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1511 // Skip, value is initialized or const shadow is ignored.
1512 continue;
1513 }
1514 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1515 // Report as the value is definitely uninitialized.
1516 insertWarningFn(IRB, ShadowData.Origin);
1517 if (!MS.Recover)
1518 return; // Always fail and stop here, no need to check the rest.
1519 // Skip the entire instruction.
1520 continue;
1521 }
1522 // Fall back to a runtime check, which can still be optimized out later.
1523 }
1524
1525 if (!Combine) {
1526 materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1527 continue;
1528 }
1529
1530 if (!Shadow) {
1531 Shadow = ConvertedShadow;
1532 continue;
1533 }
1534
1535 Shadow = convertToBool(Shadow, IRB, "_mscmp");
1536 ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1537 Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1538 }
1539
1540 if (Shadow) {
1541 assert(Combine);
1542 IRBuilder<> IRB(Instruction);
1543 materializeOneCheck(IRB, Shadow, nullptr);
1544 }
1545 }
1546
1547 void materializeChecks() {
1548#ifndef NDEBUG
1549 // For assert below.
1550 SmallPtrSet<Instruction *, 16> Done;
1551#endif
1552
1553 for (auto I = InstrumentationList.begin();
1554 I != InstrumentationList.end();) {
1555 auto OrigIns = I->OrigIns;
1556 // Checks are grouped by the original instruction. We materialize all
1557 // checks queued via `insertCheckShadow` for an instruction at once.
1558 assert(Done.insert(OrigIns).second);
1559 auto J = std::find_if(I + 1, InstrumentationList.end(),
1560 [OrigIns](const ShadowOriginAndInsertPoint &R) {
1561 return OrigIns != R.OrigIns;
1562 });
1563 // Process all checks of instruction at once.
1564 materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1565 I = J;
1566 }
1567
1568 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1569 }
1570
1571 // Inserts the KMSAN function prologue that loads the per-task context state.
1572 void insertKmsanPrologue(IRBuilder<> &IRB) {
1573 Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1574 Constant *Zero = IRB.getInt32(0);
1575 MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1576 {Zero, IRB.getInt32(0)}, "param_shadow");
1577 MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1578 {Zero, IRB.getInt32(1)}, "retval_shadow");
1579 MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1580 {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1581 MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1582 {Zero, IRB.getInt32(3)}, "va_arg_origin");
1583 MS.VAArgOverflowSizeTLS =
1584 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1585 {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1586 MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1587 {Zero, IRB.getInt32(5)}, "param_origin");
1588 MS.RetvalOriginTLS =
1589 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1590 {Zero, IRB.getInt32(6)}, "retval_origin");
1591 if (MS.TargetTriple.getArch() == Triple::systemz)
1592 MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
1593 }
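// Illustrative sketch only: the seven GEPs above assume the runtime packs its
// per-task context state in this order (names taken from the GEP labels
// above, not from the kernel headers):
//   { param_shadow, retval_shadow, va_arg_shadow, va_arg_origin,
//     va_arg_overflow_size, param_origin, retval_origin }
// so, e.g., the shadow of the current call's arguments is addressed as
// ContextState->param_shadow + ArgOffset, analogous to __msan_param_tls in
// userspace MSan.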
1594
1595 /// Add MemorySanitizer instrumentation to a function.
1596 bool runOnFunction() {
1597 // Iterate all BBs in depth-first order and create shadow instructions
1598 // for all instructions (where applicable).
1599 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1600 for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1601 visit(*BB);
1602
1603 // `visit` above only collects instructions. Process them after iterating
1604 // CFG to avoid requirement on CFG transformations.
1605 for (Instruction *I : Instructions)
1606 InstVisitor<MemorySanitizerVisitor>::visit(*I);
1607
1608 // Finalize PHI nodes.
1609 for (PHINode *PN : ShadowPHINodes) {
1610 PHINode *PNS = cast<PHINode>(getShadow(PN));
1611 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1612 size_t NumValues = PN->getNumIncomingValues();
1613 for (size_t v = 0; v < NumValues; v++) {
1614 PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1615 if (PNO)
1616 PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1617 }
1618 }
1619
1620 VAHelper->finalizeInstrumentation();
1621
1622 // Poison llvm.lifetime.start intrinsics, if we haven't fallen back to
1623 // instrumenting only allocas.
1624 if (InstrumentLifetimeStart) {
1625 for (auto Item : LifetimeStartList) {
1626 instrumentAlloca(*Item.second, Item.first);
1627 AllocaSet.remove(Item.second);
1628 }
1629 }
1630 // Poison the allocas for which we didn't instrument the corresponding
1631 // lifetime intrinsics.
1632 for (AllocaInst *AI : AllocaSet)
1633 instrumentAlloca(*AI);
1634
1635 // Insert shadow value checks.
1636 materializeChecks();
1637
1638 // Delayed instrumentation of StoreInst.
1639 // This may not add new address checks.
1640 materializeStores();
1641
1642 return true;
1643 }
1644
1645 /// Compute the shadow type that corresponds to a given Value.
1646 Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1647
1648 /// Compute the shadow type that corresponds to a given Type.
1649 Type *getShadowTy(Type *OrigTy) {
1650 if (!OrigTy->isSized()) {
1651 return nullptr;
1652 }
1653 // For integer type, shadow is the same as the original type.
1654 // This may return weird-sized types like i1.
1655 if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1656 return IT;
1657 const DataLayout &DL = F.getDataLayout();
1658 if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1659 uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1660 return VectorType::get(IntegerType::get(*MS.C, EltSize),
1661 VT->getElementCount());
1662 }
1663 if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1664 return ArrayType::get(getShadowTy(AT->getElementType()),
1665 AT->getNumElements());
1666 }
1667 if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1668 SmallVector<Type *, 4> Elements;
1669 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1670 Elements.push_back(getShadowTy(ST->getElementType(i)));
1671 StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1672 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1673 return Res;
1674 }
1675 uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1676 return IntegerType::get(*MS.C, TypeSize);
1677 }
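// Illustrative examples of the getShadowTy() mapping above (not exhaustive):
//   i32          ==> i32
//   float        ==> i32
//   <4 x float>  ==> <4 x i32>
//   [8 x i16]    ==> [8 x i16]
//   { i64, i8 }  ==> { i64, i8 }
// i.e. every value gets an integer-shaped shadow of the same width and
// layout, so propagation can reuse the structure of the original type.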
1678
1679 /// Extract combined shadow of struct elements as a bool
1680 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1681 IRBuilder<> &IRB) {
1682 Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1683 Value *Aggregator = FalseVal;
1684
1685 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1686 // Combine by ORing together each element's bool shadow
1687 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1688 Value *ShadowBool = convertToBool(ShadowItem, IRB);
1689
1690 if (Aggregator != FalseVal)
1691 Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1692 else
1693 Aggregator = ShadowBool;
1694 }
1695
1696 return Aggregator;
1697 }
1698
1699 // Extract combined shadow of array elements
1700 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1701 IRBuilder<> &IRB) {
1702 if (!Array->getNumElements())
1703 return IRB.getIntN(/* width */ 1, /* value */ 0);
1704
1705 Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1706 Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1707
1708 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1709 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1710 Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1711 Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1712 }
1713 return Aggregator;
1714 }
1715
1716 /// Convert a shadow value to its flattened variant. The resulting
1717 /// shadow may not necessarily have the same bit width as the input
1718 /// value, but it will always be comparable to zero.
1719 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1720 if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1721 return collapseStructShadow(Struct, V, IRB);
1722 if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1723 return collapseArrayShadow(Array, V, IRB);
1724 if (isa<VectorType>(V->getType())) {
1725 if (isa<ScalableVectorType>(V->getType()))
1726 return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
1727 unsigned BitWidth =
1728 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1729 return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
1730 }
1731 return V;
1732 }
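// Illustrative examples of the collapsing above, assuming fixed-size shadows:
//   <4 x i32>          ==> i128  (plain bitcast; zero iff fully clean)
//   [4 x i16]          ==> i16   (OR of the flattened array elements)
//   { i32, i64 }       ==> i1    (OR of each field's "is poisoned" bit)
//   <vscale x 4 x i32> ==> i32   (vector.reduce.or, then collapsed again)
// In every case the result compares unequal to zero exactly when some part
// of the original shadow is nonzero.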
1733
1734 // Convert a scalar value to an i1 by comparing with 0
1735 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1736 Type *VTy = V->getType();
1737 if (!VTy->isIntegerTy())
1738 return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1739 if (VTy->getIntegerBitWidth() == 1)
1740 // Just converting a bool to a bool, so do nothing.
1741 return V;
1742 return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1743 }
1744
1745 Type *ptrToIntPtrType(Type *PtrTy) const {
1746 if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
1747 return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1748 VectTy->getElementCount());
1749 }
1750 assert(PtrTy->isIntOrPtrTy());
1751 return MS.IntptrTy;
1752 }
1753
1754 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1755 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1756 return VectorType::get(
1757 getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1758 VectTy->getElementCount());
1759 }
1760 assert(IntPtrTy == MS.IntptrTy);
1761 return MS.PtrTy;
1762 }
1763
1764 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1765 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1766 return ConstantVector::getSplat(
1767 VectTy->getElementCount(),
1768 constToIntPtr(VectTy->getElementType(), C));
1769 }
1770 assert(IntPtrTy == MS.IntptrTy);
1771 return ConstantInt::get(MS.IntptrTy, C);
1772 }
1773
1774 /// Returns the integer shadow offset that corresponds to a given
1775 /// application address, whereby:
1776 ///
1777 /// Offset = (Addr & ~AndMask) ^ XorMask
1778 /// Shadow = ShadowBase + Offset
1779 /// Origin = (OriginBase + Offset) & ~Alignment
1780 ///
1781 /// Note: for efficiency, many shadow mappings only use the XorMask
1782 /// and OriginBase; the AndMask and ShadowBase are often zero.
1783 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1784 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1785 Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1786
1787 if (uint64_t AndMask = MS.MapParams->AndMask)
1788 OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1789
1790 if (uint64_t XorMask = MS.MapParams->XorMask)
1791 OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1792 return OffsetLong;
1793 }
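// Worked example (illustrative; the concrete constants come from the
// MemoryMapParams chosen for the target elsewhere in this file): with
// AndMask == 0 and XorMask == 0x500000000000 the code above reduces to a
// single XOR, roughly
//   %offset = xor i64 %addr_int, 0x500000000000
// and, if ShadowBase is also 0, the shadow pointer is simply the inttoptr of
// %offset.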
1794
1795 /// Compute the shadow and origin addresses corresponding to a given
1796 /// application address.
1797 ///
1798 /// Shadow = ShadowBase + Offset
1799 /// Origin = (OriginBase + Offset) & ~3ULL
1800 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow
1801 /// type of a single pointee.
1802 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1803 std::pair<Value *, Value *>
1804 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1805 MaybeAlign Alignment) {
1806 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1807 if (!VectTy) {
1808 assert(Addr->getType()->isPointerTy());
1809 } else {
1810 assert(VectTy->getElementType()->isPointerTy());
1811 }
1812 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1813 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1814 Value *ShadowLong = ShadowOffset;
1815 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1816 ShadowLong =
1817 IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1818 }
1819 Value *ShadowPtr = IRB.CreateIntToPtr(
1820 ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1821
1822 Value *OriginPtr = nullptr;
1823 if (MS.TrackOrigins) {
1824 Value *OriginLong = ShadowOffset;
1825 uint64_t OriginBase = MS.MapParams->OriginBase;
1826 if (OriginBase != 0)
1827 OriginLong =
1828 IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1829 if (!Alignment || *Alignment < kMinOriginAlignment) {
1830 uint64_t Mask = kMinOriginAlignment.value() - 1;
1831 OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1832 }
1833 OriginPtr = IRB.CreateIntToPtr(
1834 OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1835 }
1836 return std::make_pair(ShadowPtr, OriginPtr);
1837 }
1838
1839 template <typename... ArgsTy>
1840 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1841 ArgsTy... Args) {
1842 if (MS.TargetTriple.getArch() == Triple::systemz) {
1843 IRB.CreateCall(Callee,
1844 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1845 return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
1846 }
1847
1848 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1849 }
1850
1851 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1852 IRBuilder<> &IRB,
1853 Type *ShadowTy,
1854 bool isStore) {
1855 Value *ShadowOriginPtrs;
1856 const DataLayout &DL = F.getDataLayout();
1857 TypeSize Size = DL.getTypeStoreSize(ShadowTy);
1858
1859 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1860 Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
1861 if (Getter) {
1862 ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
1863 } else {
1864 Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1865 ShadowOriginPtrs = createMetadataCall(
1866 IRB,
1867 isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1868 AddrCast, SizeVal);
1869 }
1870 Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1871 ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
1872 Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1873
1874 return std::make_pair(ShadowPtr, OriginPtr);
1875 }
1876
1877 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow
1878 /// type of a single pointee.
1879 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1880 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1881 IRBuilder<> &IRB,
1882 Type *ShadowTy,
1883 bool isStore) {
1884 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1885 if (!VectTy) {
1886 assert(Addr->getType()->isPointerTy());
1887 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1888 }
1889
1890 // TODO: Support callbacks with vectors of addresses.
1891 unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
1892 Value *ShadowPtrs = ConstantInt::getNullValue(
1893 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1894 Value *OriginPtrs = nullptr;
1895 if (MS.TrackOrigins)
1896 OriginPtrs = ConstantInt::getNullValue(
1897 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1898 for (unsigned i = 0; i < NumElements; ++i) {
1899 Value *OneAddr =
1900 IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1901 auto [ShadowPtr, OriginPtr] =
1902 getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1903
1904 ShadowPtrs = IRB.CreateInsertElement(
1905 ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1906 if (MS.TrackOrigins)
1907 OriginPtrs = IRB.CreateInsertElement(
1908 OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1909 }
1910 return {ShadowPtrs, OriginPtrs};
1911 }
1912
1913 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1914 Type *ShadowTy,
1915 MaybeAlign Alignment,
1916 bool isStore) {
1917 if (MS.CompileKernel)
1918 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1919 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1920 }
1921
1922 /// Compute the shadow address for a given function argument.
1923 ///
1924 /// Shadow = ParamTLS+ArgOffset.
1925 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1926 Value *Base = IRB.CreatePointerCast(MS.ParamTLS, MS.IntptrTy);
1927 if (ArgOffset)
1928 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1929 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg");
1930 }
1931
1932 /// Compute the origin address for a given function argument.
1933 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1934 if (!MS.TrackOrigins)
1935 return nullptr;
1936 Value *Base = IRB.CreatePointerCast(MS.ParamOriginTLS, MS.IntptrTy);
1937 if (ArgOffset)
1938 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
1939 return IRB.CreateIntToPtr(Base, IRB.getPtrTy(0), "_msarg_o");
1940 }
1941
1942 /// Compute the shadow address for a retval.
1943 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1944 return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
1945 }
1946
1947 /// Compute the origin address for a retval.
1948 Value *getOriginPtrForRetval() {
1949 // We keep a single origin for the entire retval. Might be too optimistic.
1950 return MS.RetvalOriginTLS;
1951 }
1952
1953 /// Set SV to be the shadow value for V.
1954 void setShadow(Value *V, Value *SV) {
1955 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1956 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1957 }
1958
1959 /// Set Origin to be the origin value for V.
1960 void setOrigin(Value *V, Value *Origin) {
1961 if (!MS.TrackOrigins)
1962 return;
1963 assert(!OriginMap.count(V) && "Values may only have one origin");
1964 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1965 OriginMap[V] = Origin;
1966 }
1967
1968 Constant *getCleanShadow(Type *OrigTy) {
1969 Type *ShadowTy = getShadowTy(OrigTy);
1970 if (!ShadowTy)
1971 return nullptr;
1972 return Constant::getNullValue(ShadowTy);
1973 }
1974
1975 /// Create a clean shadow value for a given value.
1976 ///
1977 /// Clean shadow (all zeroes) means all bits of the value are defined
1978 /// (initialized).
1979 Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
1980
1981 /// Create a dirty shadow of a given shadow type.
1982 Constant *getPoisonedShadow(Type *ShadowTy) {
1983 assert(ShadowTy);
1984 if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
1985 return Constant::getAllOnesValue(ShadowTy);
1986 if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
1987 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
1988 getPoisonedShadow(AT->getElementType()));
1989 return ConstantArray::get(AT, Vals);
1990 }
1991 if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
1992 SmallVector<Constant *, 4> Vals;
1993 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1994 Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
1995 return ConstantStruct::get(ST, Vals);
1996 }
1997 llvm_unreachable("Unexpected shadow type");
1998 }
1999
2000 /// Create a dirty shadow for a given value.
2001 Constant *getPoisonedShadow(Value *V) {
2002 Type *ShadowTy = getShadowTy(V);
2003 if (!ShadowTy)
2004 return nullptr;
2005 return getPoisonedShadow(ShadowTy);
2006 }
2007
2008 /// Create a clean (zero) origin.
2009 Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
2010
2011 /// Get the shadow value for a given Value.
2012 ///
2013 /// This function either returns the value set earlier with setShadow,
2014 /// or extracts it from ParamTLS (for function arguments).
2015 Value *getShadow(Value *V) {
2016 if (Instruction *I = dyn_cast<Instruction>(V)) {
2017 if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
2018 return getCleanShadow(V);
2019 // For instructions the shadow is already stored in the map.
2020 Value *Shadow = ShadowMap[V];
2021 if (!Shadow) {
2022 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2023 assert(Shadow && "No shadow for a value");
2024 }
2025 return Shadow;
2026 }
2027 // Handle fully undefined values
2028 // (partially undefined constant vectors are handled later)
2029 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
2030 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2031 : getCleanShadow(V);
2032 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2033 return AllOnes;
2034 }
2035 if (Argument *A = dyn_cast<Argument>(V)) {
2036 // For arguments we compute the shadow on demand and store it in the map.
2037 Value *&ShadowPtr = ShadowMap[V];
2038 if (ShadowPtr)
2039 return ShadowPtr;
2040 Function *F = A->getParent();
2041 IRBuilder<> EntryIRB(FnPrologueEnd);
2042 unsigned ArgOffset = 0;
2043 const DataLayout &DL = F->getDataLayout();
2044 for (auto &FArg : F->args()) {
2045 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2046 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2047 ? "vscale not fully supported\n"
2048 : "Arg is not sized\n"));
2049 if (A == &FArg) {
2050 ShadowPtr = getCleanShadow(V);
2051 setOrigin(A, getCleanOrigin());
2052 break;
2053 }
2054 continue;
2055 }
2056
2057 unsigned Size = FArg.hasByValAttr()
2058 ? DL.getTypeAllocSize(FArg.getParamByValType())
2059 : DL.getTypeAllocSize(FArg.getType());
2060
2061 if (A == &FArg) {
2062 bool Overflow = ArgOffset + Size > kParamTLSSize;
2063 if (FArg.hasByValAttr()) {
2064 // ByVal pointer itself has clean shadow. We copy the actual
2065 // argument shadow to the underlying memory.
2066 // Figure out maximal valid memcpy alignment.
2067 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2068 FArg.getParamAlign(), FArg.getParamByValType());
2069 Value *CpShadowPtr, *CpOriginPtr;
2070 std::tie(CpShadowPtr, CpOriginPtr) =
2071 getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
2072 /*isStore*/ true);
2073 if (!PropagateShadow || Overflow) {
2074 // ParamTLS overflow.
2075 EntryIRB.CreateMemSet(
2076 CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
2077 Size, ArgAlign);
2078 } else {
2079 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2080 const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
2081 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2082 CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
2083 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2084
2085 if (MS.TrackOrigins) {
2086 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2087 // FIXME: OriginSize should be:
2088 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2089 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
2090 EntryIRB.CreateMemCpy(
2091 CpOriginPtr,
2092 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
2093 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
2094 OriginSize);
2095 }
2096 }
2097 }
2098
2099 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2100 (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
2101 ShadowPtr = getCleanShadow(V);
2102 setOrigin(A, getCleanOrigin());
2103 } else {
2104 // Shadow over TLS
2105 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2106 ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
2107 kShadowTLSAlignment);
2108 if (MS.TrackOrigins) {
2109 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2110 setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
2111 }
2112 }
2113 LLVM_DEBUG(dbgs()
2114 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2115 break;
2116 }
2117
2118 ArgOffset += alignTo(Size, kShadowTLSAlignment);
2119 }
2120 assert(ShadowPtr && "Could not find shadow for an argument");
2121 return ShadowPtr;
2122 }
2123
2124 // Check for partially-undefined constant vectors
2125 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2126 if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
2127 cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
2128 PoisonUndefVectors) {
2129 unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
2130 SmallVector<Constant *, 32> ShadowVector(NumElems);
2131 for (unsigned i = 0; i != NumElems; ++i) {
2132 Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
2133 ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
2134 : getCleanShadow(Elem);
2135 }
2136
2137 Value *ShadowConstant = ConstantVector::get(ShadowVector);
2138 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2139 << *ShadowConstant << "\n");
2140
2141 return ShadowConstant;
2142 }
2143
2144 // TODO: partially-undefined constant arrays, structures, and nested types
2145
2146 // For everything else the shadow is zero.
2147 return getCleanShadow(V);
2148 }
2149
2150 /// Get the shadow for i-th argument of the instruction I.
2151 Value *getShadow(Instruction *I, int i) {
2152 return getShadow(I->getOperand(i));
2153 }
2154
2155 /// Get the origin for a value.
2156 Value *getOrigin(Value *V) {
2157 if (!MS.TrackOrigins)
2158 return nullptr;
2159 if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
2160 return getCleanOrigin();
2162 "Unexpected value type in getOrigin()");
2163 if (Instruction *I = dyn_cast<Instruction>(V)) {
2164 if (I->getMetadata(LLVMContext::MD_nosanitize))
2165 return getCleanOrigin();
2166 }
2167 Value *Origin = OriginMap[V];
2168 assert(Origin && "Missing origin");
2169 return Origin;
2170 }
2171
2172 /// Get the origin for i-th argument of the instruction I.
2173 Value *getOrigin(Instruction *I, int i) {
2174 return getOrigin(I->getOperand(i));
2175 }
2176
2177 /// Remember the place where a shadow check should be inserted.
2178 ///
2179 /// This location will later be instrumented with a check that prints a
2180 /// UMR warning at runtime if the shadow value is not 0.
2181 void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2182 assert(Shadow);
2183 if (!InsertChecks)
2184 return;
2185
2186 if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
2187 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2188 << *OrigIns << "\n");
2189 return;
2190 }
2191#ifndef NDEBUG
2192 Type *ShadowTy = Shadow->getType();
2193 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2194 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2195 "Can only insert checks for integer, vector, and aggregate shadow "
2196 "types");
2197#endif
2198 InstrumentationList.push_back(
2199 ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2200 }
2201
2202 /// Get shadow for value, and remember the place where a shadow check should
2203 /// be inserted.
2204 ///
2205 /// This location will later be instrumented with a check that prints a
2206 /// UMR warning at runtime if the value is not fully defined.
2207 void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
2208 assert(Val);
2209 Value *Shadow, *Origin;
2210 if (ClCheckConstantShadow) {
2211 Shadow = getShadow(Val);
2212 if (!Shadow)
2213 return;
2214 Origin = getOrigin(Val);
2215 } else {
2216 Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2217 if (!Shadow)
2218 return;
2219 Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2220 }
2221 insertCheckShadow(Shadow, Origin, OrigIns);
2222 }
2223
2224 AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2225 switch (a) {
2226 case AtomicOrdering::NotAtomic:
2227 return AtomicOrdering::NotAtomic;
2228 case AtomicOrdering::Unordered:
2229 case AtomicOrdering::Monotonic:
2230 case AtomicOrdering::Release:
2231 return AtomicOrdering::Release;
2232 case AtomicOrdering::Acquire:
2233 case AtomicOrdering::AcquireRelease:
2234 return AtomicOrdering::AcquireRelease;
2235 case AtomicOrdering::SequentiallyConsistent:
2236 return AtomicOrdering::SequentiallyConsistent;
2237 }
2238 llvm_unreachable("Unknown ordering");
2239 }
2240
2241 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2242 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2243 uint32_t OrderingTable[NumOrderings] = {};
2244
2245 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2246 OrderingTable[(int)AtomicOrderingCABI::release] =
2247 (int)AtomicOrderingCABI::release;
2248 OrderingTable[(int)AtomicOrderingCABI::consume] =
2249 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2250 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2251 (int)AtomicOrderingCABI::acq_rel;
2252 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2253 (int)AtomicOrderingCABI::seq_cst;
2254
2255 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2256 }
2257
2258 AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2259 switch (a) {
2260 case AtomicOrdering::NotAtomic:
2261 return AtomicOrdering::NotAtomic;
2262 case AtomicOrdering::Unordered:
2263 case AtomicOrdering::Monotonic:
2264 case AtomicOrdering::Acquire:
2265 return AtomicOrdering::Acquire;
2266 case AtomicOrdering::Release:
2267 case AtomicOrdering::AcquireRelease:
2268 return AtomicOrdering::AcquireRelease;
2269 case AtomicOrdering::SequentiallyConsistent:
2270 return AtomicOrdering::SequentiallyConsistent;
2271 }
2272 llvm_unreachable("Unknown ordering");
2273 }
2274
2275 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2276 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2277 uint32_t OrderingTable[NumOrderings] = {};
2278
2279 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2280 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2281 OrderingTable[(int)AtomicOrderingCABI::consume] =
2282 (int)AtomicOrderingCABI::acquire;
2283 OrderingTable[(int)AtomicOrderingCABI::release] =
2284 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2285 (int)AtomicOrderingCABI::acq_rel;
2286 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2287 (int)AtomicOrderingCABI::seq_cst;
2288
2289 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2290 }
2291
2292 // ------------------- Visitors.
2293 using InstVisitor<MemorySanitizerVisitor>::visit;
2294 void visit(Instruction &I) {
2295 if (I.getMetadata(LLVMContext::MD_nosanitize))
2296 return;
2297 // Don't want to visit if we're in the prologue
2298 if (isInPrologue(I))
2299 return;
2300 if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
2301 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2302 // We still need to set the shadow and origin to clean values.
2303 setShadow(&I, getCleanShadow(&I));
2304 setOrigin(&I, getCleanOrigin());
2305 return;
2306 }
2307
2308 Instructions.push_back(&I);
2309 }
2310
2311 /// Instrument LoadInst
2312 ///
2313 /// Loads the corresponding shadow and (optionally) origin.
2314 /// Optionally, checks that the load address is fully defined.
2315 void visitLoadInst(LoadInst &I) {
2316 assert(I.getType()->isSized() && "Load type must have size");
2317 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2318 NextNodeIRBuilder IRB(&I);
2319 Type *ShadowTy = getShadowTy(&I);
2320 Value *Addr = I.getPointerOperand();
2321 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2322 const Align Alignment = I.getAlign();
2323 if (PropagateShadow) {
2324 std::tie(ShadowPtr, OriginPtr) =
2325 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2326 setShadow(&I,
2327 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2328 } else {
2329 setShadow(&I, getCleanShadow(&I));
2330 }
2331
2332 if (ClCheckAccessAddress)
2333 insertCheckShadowOf(I.getPointerOperand(), &I);
2334
2335 if (I.isAtomic())
2336 I.setOrdering(addAcquireOrdering(I.getOrdering()));
2337
2338 if (MS.TrackOrigins) {
2339 if (PropagateShadow) {
2340 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2341 setOrigin(
2342 &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2343 } else {
2344 setOrigin(&I, getCleanOrigin());
2345 }
2346 }
2347 }
2348
2349 /// Instrument StoreInst
2350 ///
2351 /// Stores the corresponding shadow and (optionally) origin.
2352 /// Optionally, checks that the store address is fully defined.
2353 void visitStoreInst(StoreInst &I) {
2354 StoreList.push_back(&I);
2355 if (ClCheckAccessAddress)
2356 insertCheckShadowOf(I.getPointerOperand(), &I);
2357 }
2358
2359 void handleCASOrRMW(Instruction &I) {
2360 assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2361
2362 IRBuilder<> IRB(&I);
2363 Value *Addr = I.getOperand(0);
2364 Value *Val = I.getOperand(1);
2365 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2366 /*isStore*/ true)
2367 .first;
2368
2369 if (ClCheckAccessAddress)
2370 insertCheckShadowOf(Addr, &I);
2371
2372 // Only test the conditional argument of cmpxchg instruction.
2373 // The other argument can potentially be uninitialized, but we can not
2374 // detect this situation reliably without possible false positives.
2375 if (isa<AtomicCmpXchgInst>(I))
2376 insertCheckShadowOf(Val, &I);
2377
2378 IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2379
2380 setShadow(&I, getCleanShadow(&I));
2381 setOrigin(&I, getCleanOrigin());
2382 }
2383
2384 void visitAtomicRMWInst(AtomicRMWInst &I) {
2385 handleCASOrRMW(I);
2386 I.setOrdering(addReleaseOrdering(I.getOrdering()));
2387 }
2388
2389 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2390 handleCASOrRMW(I);
2391 I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2392 }
2393
2394 // Vector manipulation.
2395 void visitExtractElementInst(ExtractElementInst &I) {
2396 insertCheckShadowOf(I.getOperand(1), &I);
2397 IRBuilder<> IRB(&I);
2398 setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2399 "_msprop"));
2400 setOrigin(&I, getOrigin(&I, 0));
2401 }
2402
2403 void visitInsertElementInst(InsertElementInst &I) {
2404 insertCheckShadowOf(I.getOperand(2), &I);
2405 IRBuilder<> IRB(&I);
2406 auto *Shadow0 = getShadow(&I, 0);
2407 auto *Shadow1 = getShadow(&I, 1);
2408 setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2409 "_msprop"));
2410 setOriginForNaryOp(I);
2411 }
2412
2413 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2414 IRBuilder<> IRB(&I);
2415 auto *Shadow0 = getShadow(&I, 0);
2416 auto *Shadow1 = getShadow(&I, 1);
2417 setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2418 "_msprop"));
2419 setOriginForNaryOp(I);
2420 }
2421
2422 // Casts.
2423 void visitSExtInst(SExtInst &I) {
2424 IRBuilder<> IRB(&I);
2425 setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2426 setOrigin(&I, getOrigin(&I, 0));
2427 }
2428
2429 void visitZExtInst(ZExtInst &I) {
2430 IRBuilder<> IRB(&I);
2431 setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2432 setOrigin(&I, getOrigin(&I, 0));
2433 }
2434
2435 void visitTruncInst(TruncInst &I) {
2436 IRBuilder<> IRB(&I);
2437 setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2438 setOrigin(&I, getOrigin(&I, 0));
2439 }
2440
2441 void visitBitCastInst(BitCastInst &I) {
2442 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2443 // a musttail call and a ret, don't instrument. New instructions are not
2444 // allowed after a musttail call.
2445 if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2446 if (CI->isMustTailCall())
2447 return;
2448 IRBuilder<> IRB(&I);
2449 setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2450 setOrigin(&I, getOrigin(&I, 0));
2451 }
2452
2453 void visitPtrToIntInst(PtrToIntInst &I) {
2454 IRBuilder<> IRB(&I);
2455 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2456 "_msprop_ptrtoint"));
2457 setOrigin(&I, getOrigin(&I, 0));
2458 }
2459
2460 void visitIntToPtrInst(IntToPtrInst &I) {
2461 IRBuilder<> IRB(&I);
2462 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2463 "_msprop_inttoptr"));
2464 setOrigin(&I, getOrigin(&I, 0));
2465 }
2466
2467 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2468 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2469 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2470 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2471 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2472 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2473
2474 /// Propagate shadow for bitwise AND.
2475 ///
2476 /// This code is exact, i.e. if, for example, a bit in the left argument
2477 /// is defined and 0, then neither the value nor the definedness of the
2478 /// corresponding bit in B affects the resulting shadow.
2479 void visitAnd(BinaryOperator &I) {
2480 IRBuilder<> IRB(&I);
2481 // "And" of 0 and a poisoned value results in unpoisoned value.
2482 // 1&1 => 1; 0&1 => 0; p&1 => p;
2483 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2484 // 1&p => p; 0&p => 0; p&p => p;
2485 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
2486 Value *S1 = getShadow(&I, 0);
2487 Value *S2 = getShadow(&I, 1);
2488 Value *V1 = I.getOperand(0);
2489 Value *V2 = I.getOperand(1);
2490 if (V1->getType() != S1->getType()) {
2491 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2492 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2493 }
2494 Value *S1S2 = IRB.CreateAnd(S1, S2);
2495 Value *V1S2 = IRB.CreateAnd(V1, S2);
2496 Value *S1V2 = IRB.CreateAnd(S1, V2);
2497 setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2498 setOriginForNaryOp(I);
2499 }
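// Illustrative check of the exactness claim for visitAnd above: if V1 == 0
// and is fully defined (S1 == 0) while V2 is fully poisoned (S2 == ~0), then
//   S = (S1 & S2) | (V1 & S2) | (S1 & V2) = 0 | 0 | 0 = 0,
// so the result is reported clean, which is correct because 0 & x == 0 for
// every possible value of the poisoned operand.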
2500
2501 void visitOr(BinaryOperator &I) {
2502 IRBuilder<> IRB(&I);
2503 // "Or" of 1 and a poisoned value results in unpoisoned value:
2504 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2505 // 1|0 => 1; 0|0 => 0; p|0 => p;
2506 // 1|p => 1; 0|p => p; p|p => p;
2507 //
2508 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2509 //
2510 // If the "disjoint OR" property is violated, the result is poison, and
2511 // hence the entire shadow is uninitialized:
2512 // S = S | SignExt(V1 & V2 != 0)
2513 Value *S1 = getShadow(&I, 0);
2514 Value *S2 = getShadow(&I, 1);
2515 Value *V1 = I.getOperand(0);
2516 Value *V2 = I.getOperand(1);
2517 if (V1->getType() != S1->getType()) {
2518 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2519 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2520 }
2521
2522 Value *NotV1 = IRB.CreateNot(V1);
2523 Value *NotV2 = IRB.CreateNot(V2);
2524
2525 Value *S1S2 = IRB.CreateAnd(S1, S2);
2526 Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
2527 Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
2528
2529 Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
2530
2531 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
2532 Value *V1V2 = IRB.CreateAnd(V1, V2);
2533 Value *DisjointOrShadow = IRB.CreateSExt(
2534 IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
2535 S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
2536 }
2537
2538 setShadow(&I, S);
2539 setOriginForNaryOp(I);
2540 }
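// Illustrative check of the visitOr propagation above, dual to visitAnd: if
// V1 == ~0 and is fully defined (S1 == 0) while V2 is fully poisoned, then
//   S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2) = 0,
// which is correct because 1 | x == 1. With ClPreciseDisjointOr set and the
// `disjoint` flag on the or, lanes where V1 & V2 != 0 are additionally forced
// to an all-ones shadow, matching the poison produced by a violated
// `disjoint`.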
2541
2542 /// Default propagation of shadow and/or origin.
2543 ///
2544 /// This class implements the general case of shadow propagation, used in all
2545 /// cases where we don't know and/or don't care about what the operation
2546 /// actually does. It converts all input shadow values to a common type
2547 /// (extending or truncating as necessary), and bitwise OR's them.
2548 ///
2549 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2550 /// fully initialized), and less prone to false positives.
2551 ///
2552 /// This class also implements the general case of origin propagation. For a
2553 /// Nary operation, result origin is set to the origin of an argument that is
2554 /// not entirely initialized. If there is more than one such argument, the
2555 /// rightmost of them is picked. It does not matter which one is picked if all
2556 /// arguments are initialized.
2557 template <bool CombineShadow> class Combiner {
2558 Value *Shadow = nullptr;
2559 Value *Origin = nullptr;
2560 IRBuilder<> &IRB;
2561 MemorySanitizerVisitor *MSV;
2562
2563 public:
2564 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2565 : IRB(IRB), MSV(MSV) {}
2566
2567 /// Add a pair of shadow and origin values to the mix.
2568 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2569 if (CombineShadow) {
2570 assert(OpShadow);
2571 if (!Shadow)
2572 Shadow = OpShadow;
2573 else {
2574 OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2575 Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2576 }
2577 }
2578
2579 if (MSV->MS.TrackOrigins) {
2580 assert(OpOrigin);
2581 if (!Origin) {
2582 Origin = OpOrigin;
2583 } else {
2584 Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2585 // No point in adding something that might result in 0 origin value.
2586 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2587 Value *Cond = MSV->convertToBool(OpShadow, IRB);
2588 Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2589 }
2590 }
2591 }
2592 return *this;
2593 }
2594
2595 /// Add an application value to the mix.
2596 Combiner &Add(Value *V) {
2597 Value *OpShadow = MSV->getShadow(V);
2598 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2599 return Add(OpShadow, OpOrigin);
2600 }
2601
2602 /// Set the current combined values as the given instruction's shadow
2603 /// and origin.
2604 void Done(Instruction *I) {
2605 if (CombineShadow) {
2606 assert(Shadow);
2607 Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2608 MSV->setShadow(I, Shadow);
2609 }
2610 if (MSV->MS.TrackOrigins) {
2611 assert(Origin);
2612 MSV->setOrigin(I, Origin);
2613 }
2614 }
2615
2616 /// Store the current combined value at the specified origin
2617 /// location.
2618 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2619 if (MSV->MS.TrackOrigins) {
2620 assert(Origin);
2621 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
2622 }
2623 }
2624 };
2625
2626 using ShadowAndOriginCombiner = Combiner<true>;
2627 using OriginCombiner = Combiner<false>;
2628
2629 /// Propagate origin for arbitrary operation.
2630 void setOriginForNaryOp(Instruction &I) {
2631 if (!MS.TrackOrigins)
2632 return;
2633 IRBuilder<> IRB(&I);
2634 OriginCombiner OC(this, IRB);
2635 for (Use &Op : I.operands())
2636 OC.Add(Op.get());
2637 OC.Done(&I);
2638 }
2639
2640 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2641 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2642 "Vector of pointers is not a valid shadow type");
2643 return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2644 Ty->getScalarSizeInBits()
2645 : Ty->getPrimitiveSizeInBits();
2646 }
2647
2648 /// Cast between two shadow types, extending or truncating as
2649 /// necessary.
2650 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2651 bool Signed = false) {
2652 Type *srcTy = V->getType();
2653 if (srcTy == dstTy)
2654 return V;
2655 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2656 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2657 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2658 return IRB.CreateICmpNE(V, getCleanShadow(V));
2659
2660 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2661 return IRB.CreateIntCast(V, dstTy, Signed);
2662 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2663 cast<VectorType>(dstTy)->getElementCount() ==
2664 cast<VectorType>(srcTy)->getElementCount())
2665 return IRB.CreateIntCast(V, dstTy, Signed);
2666 Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2667 Value *V2 =
2668 IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2669 return IRB.CreateBitCast(V2, dstTy);
2670 // TODO: handle struct types.
2671 }
2672
2673 /// Cast an application value to the type of its own shadow.
2674 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2675 Type *ShadowTy = getShadowTy(V);
2676 if (V->getType() == ShadowTy)
2677 return V;
2678 if (V->getType()->isPtrOrPtrVectorTy())
2679 return IRB.CreatePtrToInt(V, ShadowTy);
2680 else
2681 return IRB.CreateBitCast(V, ShadowTy);
2682 }
2683
2684 /// Propagate shadow for arbitrary operation.
2685 void handleShadowOr(Instruction &I) {
2686 IRBuilder<> IRB(&I);
2687 ShadowAndOriginCombiner SC(this, IRB);
2688 for (Use &Op : I.operands())
2689 SC.Add(Op.get());
2690 SC.Done(&I);
2691 }
2692
2693 // Perform a bitwise OR on the horizontal pairs (or other specified grouping)
2694 // of elements.
2695 //
2696 // For example, suppose we have:
2697 // VectorA: <a1, a2, a3, a4, a5, a6>
2698 // VectorB: <b1, b2, b3, b4, b5, b6>
2699 // ReductionFactor: 3.
2700 // The output would be:
2701 // <a1|a2|a3, a4|a5|a6, b1|b2|b3, b4|b5|b6>
2702 //
2703 // This is convenient for instrumenting horizontal add/sub.
2704 // For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
2705 Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
2706 Value *VectorA, Value *VectorB) {
2707 assert(isa<FixedVectorType>(VectorA->getType()));
2708 unsigned TotalNumElems =
2709 cast<FixedVectorType>(VectorA->getType())->getNumElements();
2710
2711 if (VectorB) {
2712 assert(VectorA->getType() == VectorB->getType());
2713 TotalNumElems = TotalNumElems * 2;
2714 }
2715
2716 assert(TotalNumElems % ReductionFactor == 0);
2717
2718 Value *Or = nullptr;
2719
2720 IRBuilder<> IRB(&I);
2721 for (unsigned i = 0; i < ReductionFactor; i++) {
2722 SmallVector<int, 16> Mask;
2723 for (unsigned X = 0; X < TotalNumElems; X += ReductionFactor)
2724 Mask.push_back(X + i);
2725
2726 Value *Masked;
2727 if (VectorB)
2728 Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
2729 else
2730 Masked = IRB.CreateShuffleVector(VectorA, Mask);
2731
2732 if (Or)
2733 Or = IRB.CreateOr(Or, Masked);
2734 else
2735 Or = Masked;
2736 }
2737
2738 return Or;
2739 }
2740
2741 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2742 /// fields.
2743 ///
2744 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2745 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
2746 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
2747 assert(I.arg_size() == 1 || I.arg_size() == 2);
2748
2749 assert(I.getType()->isVectorTy());
2750 assert(I.getArgOperand(0)->getType()->isVectorTy());
2751
2752 [[maybe_unused]] FixedVectorType *ParamType =
2753 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2754 assert((I.arg_size() != 2) ||
2755 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2756 [[maybe_unused]] FixedVectorType *ReturnType =
2757 cast<FixedVectorType>(I.getType());
2758 assert(ParamType->getNumElements() * I.arg_size() ==
2759 2 * ReturnType->getNumElements());
2760
2761 IRBuilder<> IRB(&I);
2762
2763 // Horizontal OR of shadow
2764 Value *FirstArgShadow = getShadow(&I, 0);
2765 Value *SecondArgShadow = nullptr;
2766 if (I.arg_size() == 2)
2767 SecondArgShadow = getShadow(&I, 1);
2768
2769 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2770 SecondArgShadow);
2771
2772 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2773
2774 setShadow(&I, OrShadow);
2775 setOriginForNaryOp(I);
2776 }
2777
2778 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2779 /// fields, with the parameters reinterpreted to have elements of a specified
2780 /// width. For example:
2781 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2782 /// conceptually operates on
2783 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2784 /// and can be handled with ReinterpretElemWidth == 16.
2785 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
2786 int ReinterpretElemWidth) {
2787 assert(I.arg_size() == 1 || I.arg_size() == 2);
2788
2789 assert(I.getType()->isVectorTy());
2790 assert(I.getArgOperand(0)->getType()->isVectorTy());
2791
2792 FixedVectorType *ParamType =
2793 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2794 assert((I.arg_size() != 2) ||
2795 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2796
2797 [[maybe_unused]] FixedVectorType *ReturnType =
2798 cast<FixedVectorType>(I.getType());
2799 assert(ParamType->getNumElements() * I.arg_size() ==
2800 2 * ReturnType->getNumElements());
2801
2802 IRBuilder<> IRB(&I);
2803
2804 FixedVectorType *ReinterpretShadowTy = nullptr;
2805 assert(isAligned(Align(ReinterpretElemWidth),
2806 ParamType->getPrimitiveSizeInBits()));
2807 ReinterpretShadowTy = FixedVectorType::get(
2808 IRB.getIntNTy(ReinterpretElemWidth),
2809 ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2810
2811 // Horizontal OR of shadow
2812 Value *FirstArgShadow = getShadow(&I, 0);
2813 FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
2814
2815 // If we had two parameters each with an odd number of elements, the total
2816 // number of elements would still be even, but we have never seen this in
2817 // extant instruction sets, so we require that each parameter has an even
2818 // number of elements.
2819 assert(isAligned(
2820 Align(2),
2821 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2822
2823 Value *SecondArgShadow = nullptr;
2824 if (I.arg_size() == 2) {
2825 SecondArgShadow = getShadow(&I, 1);
2826 SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
2827 }
2828
2829 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2830 SecondArgShadow);
2831
2832 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2833
2834 setShadow(&I, OrShadow);
2835 setOriginForNaryOp(I);
2836 }
2837
2838 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2839
2840 // Handle multiplication by constant.
2841 //
2842 // Handle a special case of multiplication by constant that may have one or
2843 // more zeros in the lower bits. This makes corresponding number of lower bits
2844 // of the result zero as well. We model it by shifting the other operand
2845 // shadow left by the required number of bits. Effectively, we transform
2846 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2847 // We use multiplication by 2**N instead of shift to cover the case of
2848 // multiplication by 0, which may occur in some elements of a vector operand.
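// For example (illustrative): for X * 24 == X * (3 * 2**3) the three low
// bits of the product are always zero, so the shadow below is computed as
// Sx * 8 (i.e. Sx << 3) and the low bits of the result are reported as
// initialized.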
2849 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2850 Value *OtherArg) {
2851 Constant *ShadowMul;
2852 Type *Ty = ConstArg->getType();
2853 if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2854 unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2855 Type *EltTy = VTy->getElementType();
2856 SmallVector<Constant *, 16> Elements;
2857 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2858 if (ConstantInt *Elt =
2859 dyn_cast_or_null<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
2860 const APInt &V = Elt->getValue();
2861 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2862 Elements.push_back(ConstantInt::get(EltTy, V2));
2863 } else {
2864 Elements.push_back(ConstantInt::get(EltTy, 1));
2865 }
2866 }
2867 ShadowMul = ConstantVector::get(Elements);
2868 } else {
2869 if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2870 const APInt &V = Elt->getValue();
2871 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2872 ShadowMul = ConstantInt::get(Ty, V2);
2873 } else {
2874 ShadowMul = ConstantInt::get(Ty, 1);
2875 }
2876 }
2877
2878 IRBuilder<> IRB(&I);
2879 setShadow(&I,
2880 IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2881 setOrigin(&I, getOrigin(OtherArg));
2882 }
2883
2884 void visitMul(BinaryOperator &I) {
2885 Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2886 Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2887 if (constOp0 && !constOp1)
2888 handleMulByConstant(I, constOp0, I.getOperand(1));
2889 else if (constOp1 && !constOp0)
2890 handleMulByConstant(I, constOp1, I.getOperand(0));
2891 else
2892 handleShadowOr(I);
2893 }
2894
2895 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2896 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2897 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2898 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2899 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2900 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2901
2902 void handleIntegerDiv(Instruction &I) {
2903 IRBuilder<> IRB(&I);
2904 // Strict on the second argument.
2905 insertCheckShadowOf(I.getOperand(1), &I);
2906 setShadow(&I, getShadow(&I, 0));
2907 setOrigin(&I, getOrigin(&I, 0));
2908 }
2909
2910 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2911 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2912 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2913 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2914
2915 // Floating point division is side-effect free. We can not require that the
2916 // divisor is fully initialized and must propagate shadow. See PR37523.
2917 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2918 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2919
2920 /// Instrument == and != comparisons.
2921 ///
2922 /// Sometimes the comparison result is known even if some of the bits of the
2923 /// arguments are not.
2924 void handleEqualityComparison(ICmpInst &I) {
2925 IRBuilder<> IRB(&I);
2926 Value *A = I.getOperand(0);
2927 Value *B = I.getOperand(1);
2928 Value *Sa = getShadow(A);
2929 Value *Sb = getShadow(B);
2930
2931 // Get rid of pointers and vectors of pointers.
2932 // For ints (and vectors of ints), types of A and Sa match,
2933 // and this is a no-op.
2934 A = IRB.CreatePointerCast(A, Sa->getType());
2935 B = IRB.CreatePointerCast(B, Sb->getType());
2936
2937 // A == B <==> (C = A^B) == 0
2938 // A != B <==> (C = A^B) != 0
2939 // Sc = Sa | Sb
2940 Value *C = IRB.CreateXor(A, B);
2941 Value *Sc = IRB.CreateOr(Sa, Sb);
2942 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2943 // Result is defined if one of the following is true
2944 // * there is a defined 1 bit in C
2945 // * C is fully defined
2946 // Si = !(C & ~Sc) && Sc
2947 Value *Zero = Constant::getNullValue(Sc->getType());
2948 Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2949 Value *LHS = IRB.CreateICmpNE(Sc, Zero);
2950 Value *RHS =
2951 IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
2952 Value *Si = IRB.CreateAnd(LHS, RHS);
2953 Si->setName("_msprop_icmp");
2954 setShadow(&I, Si);
2955 setOriginForNaryOp(I);
2956 }
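// Worked example for handleEqualityComparison above (illustrative): let
//   A = 0b10?? (Sa = 0b0011) and B = 0b0100 (Sb = 0).
// Then Sc = 0b0011 and C = A ^ B has a defined 1 in bit 3, so (C & ~Sc) != 0,
// Si is false, and the comparison result is reported as fully defined: A and
// B differ no matter what the two unknown bits of A are.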
2957
2958 /// Instrument relational comparisons.
2959 ///
2960 /// This function does exact shadow propagation for all relational
2961 /// comparisons of integers, pointers and vectors of those.
2962 /// FIXME: output seems suboptimal when one of the operands is a constant
2963 void handleRelationalComparisonExact(ICmpInst &I) {
2964 IRBuilder<> IRB(&I);
2965 Value *A = I.getOperand(0);
2966 Value *B = I.getOperand(1);
2967 Value *Sa = getShadow(A);
2968 Value *Sb = getShadow(B);
2969
2970 // Get rid of pointers and vectors of pointers.
2971 // For ints (and vectors of ints), types of A and Sa match,
2972 // and this is a no-op.
2973 A = IRB.CreatePointerCast(A, Sa->getType());
2974 B = IRB.CreatePointerCast(B, Sb->getType());
2975
2976 // Let [a0, a1] be the interval of possible values of A, taking into account
2977 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
2978 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
2979 bool IsSigned = I.isSigned();
2980
2981 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
2982 if (IsSigned) {
2983 // Sign-flip to map from signed range to unsigned range. Relation A vs B
2984 // should be preserved, if checked with `getUnsignedPredicate()`.
2985 // Relationship between Amin, Amax, Bmin, Bmax also will not be
2986 // affected, as they are created by effectively adding/subtracting from
2987 // A (or B) a value, derived from shadow, with no overflow, either
2988 // before or after sign flip.
2989 APInt MinVal =
2990 APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
2991 V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
2992 }
2993 // Minimize undefined bits.
2994 Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
2995 Value *Max = IRB.CreateOr(V, S);
2996 return std::make_pair(Min, Max);
2997 };
2998
2999 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
3000 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
3001 Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
3002 Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
3003
3004 Value *Si = IRB.CreateXor(S1, S2);
3005 setShadow(&I, Si);
3006 setOriginForNaryOp(I);
3007 }
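// Worked example for handleRelationalComparisonExact above (illustrative):
// for an unsigned A > B with A = 0b10?? ([Amin, Amax] = [8, 11]) and a fully
// defined B = 5 ([Bmin, Bmax] = [5, 5]):
//   S1 = (Amin > Bmax) = true, S2 = (Amax > Bmin) = true, Si = S1 ^ S2 = 0,
// i.e. the result is defined, since every possible A is greater than B.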
3008
3009 /// Instrument signed relational comparisons.
3010 ///
3011 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
3012 /// bit of the shadow. Everything else is delegated to handleShadowOr().
3013 void handleSignedRelationalComparison(ICmpInst &I) {
3014 Constant *constOp;
3015 Value *op = nullptr;
3016 CmpInst::Predicate pre;
3017 if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
3018 op = I.getOperand(0);
3019 pre = I.getPredicate();
3020 } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
3021 op = I.getOperand(1);
3022 pre = I.getSwappedPredicate();
3023 } else {
3024 handleShadowOr(I);
3025 return;
3026 }
3027
3028 if ((constOp->isNullValue() &&
3029 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3030 (constOp->isAllOnesValue() &&
3031 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3032 IRBuilder<> IRB(&I);
3033 Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
3034 "_msprop_icmp_s");
3035 setShadow(&I, Shadow);
3036 setOrigin(&I, getOrigin(op));
3037 } else {
3038 handleShadowOr(I);
3039 }
3040 }
3041
3042 void visitICmpInst(ICmpInst &I) {
3043 if (!ClHandleICmp) {
3044 handleShadowOr(I);
3045 return;
3046 }
3047 if (I.isEquality()) {
3048 handleEqualityComparison(I);
3049 return;
3050 }
3051
3052 assert(I.isRelational());
3053 if (ClHandleICmpExact) {
3054 handleRelationalComparisonExact(I);
3055 return;
3056 }
3057 if (I.isSigned()) {
3058 handleSignedRelationalComparison(I);
3059 return;
3060 }
3061
3062 assert(I.isUnsigned());
3063 if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
3064 handleRelationalComparisonExact(I);
3065 return;
3066 }
3067
3068 handleShadowOr(I);
3069 }
3070
3071 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3072
3073 void handleShift(BinaryOperator &I) {
3074 IRBuilder<> IRB(&I);
3075 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3076 // Otherwise perform the same shift on S1.
3077 Value *S1 = getShadow(&I, 0);
3078 Value *S2 = getShadow(&I, 1);
3079 Value *S2Conv =
3080 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3081 Value *V2 = I.getOperand(1);
3082 Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
3083 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3084 setOriginForNaryOp(I);
3085 }
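// Illustrative example for handleShift above: for `%r = shl i32 %x, %n` the
// shadow becomes roughly
//   %sr = or i32 (shl i32 %sx, %n), sext(%sn != 0)
// i.e. a fully defined shift amount shifts the shadow in lock-step with the
// value, while a poisoned shift amount poisons the whole result.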
3086
3087 void visitShl(BinaryOperator &I) { handleShift(I); }
3088 void visitAShr(BinaryOperator &I) { handleShift(I); }
3089 void visitLShr(BinaryOperator &I) { handleShift(I); }
3090
3091 void handleFunnelShift(IntrinsicInst &I) {
3092 IRBuilder<> IRB(&I);
3093 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3094 // Otherwise perform the same shift on S0 and S1.
3095 Value *S0 = getShadow(&I, 0);
3096 Value *S1 = getShadow(&I, 1);
3097 Value *S2 = getShadow(&I, 2);
3098 Value *S2Conv =
3099 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3100 Value *V2 = I.getOperand(2);
3101 Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
3102 {S0, S1, V2});
3103 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3104 setOriginForNaryOp(I);
3105 }
3106
3107 /// Instrument llvm.memmove
3108 ///
3109 /// At this point we don't know if llvm.memmove will be inlined or not.
3110 /// If we don't instrument it and it gets inlined,
3111 /// our interceptor will not kick in and we will lose the memmove.
3112 /// If we instrument the call here, but it does not get inlined,
3113 /// we will memmove the shadow twice, which is bad in the case
3114 /// of overlapping regions. So, we simply lower the intrinsic to a call.
3115 ///
3116 /// Similar situation exists for memcpy and memset.
3117 void visitMemMoveInst(MemMoveInst &I) {
3118 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3119 IRBuilder<> IRB(&I);
3120 IRB.CreateCall(MS.MemmoveFn,
3121 {I.getArgOperand(0), I.getArgOperand(1),
3122 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3123 I.eraseFromParent();
3124 }
3125
3126 /// Instrument memcpy
3127 ///
3128 /// Similar to memmove: avoid copying shadow twice. This is somewhat
3129 /// unfortunate as it may slow down small constant memcpys.
3130 /// FIXME: consider doing manual inline for small constant sizes and proper
3131 /// alignment.
3132 ///
3133 /// Note: This also handles memcpy.inline, which promises no calls to external
3134 /// functions as an optimization. However, with instrumentation enabled this
3135 /// is difficult to promise; additionally, we know that the MSan runtime
3136 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3137 /// instrumentation it's safe to turn memcpy.inline into a call to
3138 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3139 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3140 void visitMemCpyInst(MemCpyInst &I) {
3141 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3142 IRBuilder<> IRB(&I);
3143 IRB.CreateCall(MS.MemcpyFn,
3144 {I.getArgOperand(0), I.getArgOperand(1),
3145 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3146 I.eraseFromParent();
3147 }
3148
3149 // Same as memcpy.
3150 void visitMemSetInst(MemSetInst &I) {
3151 IRBuilder<> IRB(&I);
3152 IRB.CreateCall(
3153 MS.MemsetFn,
3154 {I.getArgOperand(0),
3155 IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
3156 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3157 I.eraseFromParent();
3158 }
3159
3160 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3161
3162 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3163
3164 /// Handle vector store-like intrinsics.
3165 ///
3166 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3167 /// has 1 pointer argument and 1 vector argument, returns void.
3168 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3169 assert(I.arg_size() == 2);
3170
3171 IRBuilder<> IRB(&I);
3172 Value *Addr = I.getArgOperand(0);
3173 Value *Shadow = getShadow(&I, 1);
3174 Value *ShadowPtr, *OriginPtr;
3175
3176 // We don't know the pointer alignment (could be unaligned SSE store!).
3177 // We have to assume the worst case.
3178 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3179 Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
3180 IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
3181
3182 if (ClCheckAccessAddress)
3183 insertCheckShadowOf(Addr, &I);
3184
3185 // FIXME: factor out common code from materializeStores
3186 if (MS.TrackOrigins)
3187 IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
3188 return true;
3189 }
3190
3191 /// Handle vector load-like intrinsics.
3192 ///
3193 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3194 /// has 1 pointer argument, returns a vector.
3195 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3196 assert(I.arg_size() == 1);
3197
3198 IRBuilder<> IRB(&I);
3199 Value *Addr = I.getArgOperand(0);
3200
3201 Type *ShadowTy = getShadowTy(&I);
3202 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3203 if (PropagateShadow) {
3204 // We don't know the pointer alignment (could be unaligned SSE load!).
3205 // We have to assume the worst case.
3206 const Align Alignment = Align(1);
3207 std::tie(ShadowPtr, OriginPtr) =
3208 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3209 setShadow(&I,
3210 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
3211 } else {
3212 setShadow(&I, getCleanShadow(&I));
3213 }
3214
3215 if (ClCheckAccessAddress)
3216 insertCheckShadowOf(Addr, &I);
3217
3218 if (MS.TrackOrigins) {
3219 if (PropagateShadow)
3220 setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
3221 else
3222 setOrigin(&I, getCleanOrigin());
3223 }
3224 return true;
3225 }
3226
3227 /// Handle (SIMD arithmetic)-like intrinsics.
3228 ///
3229 /// Instrument intrinsics with any number of arguments of the same type [*],
3230 /// equal to the return type, plus a specified number of trailing flags of
3231 /// any type.
3232 ///
3233 /// [*] The type should be simple (no aggregates or pointers; vectors are
3234 /// fine).
3235 ///
3236 /// Caller guarantees that this intrinsic does not access memory.
3237 ///
3238 /// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
3239 /// by this handler. See horizontalReduce().
3240 ///
3241 /// TODO: permutation intrinsics are also often incorrectly matched.
3242 [[maybe_unused]] bool
3243 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3244 unsigned int trailingFlags) {
3245 Type *RetTy = I.getType();
3246 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3247 return false;
3248
3249 unsigned NumArgOperands = I.arg_size();
3250 assert(NumArgOperands >= trailingFlags);
3251 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3252 Type *Ty = I.getArgOperand(i)->getType();
3253 if (Ty != RetTy)
3254 return false;
3255 }
3256
3257 IRBuilder<> IRB(&I);
3258 ShadowAndOriginCombiner SC(this, IRB);
3259 for (unsigned i = 0; i < NumArgOperands; ++i)
3260 SC.Add(I.getArgOperand(i));
3261 SC.Done(&I);
3262
3263 return true;
3264 }
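// For illustration (example intrinsic, not special-cased here): something like
//   <4 x float> @llvm.x86.sse.max.ps(<4 x float>, <4 x float>)
// fits this pattern, assuming it is marked as not accessing memory: every
// argument has the return type, so its shadow is simply the OR of the
// argument shadows combined by ShadowAndOriginCombiner.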
3265
3266 /// Returns whether it was able to heuristically instrument unknown
3267 /// intrinsics.
3268 ///
3269 /// The main purpose of this code is to do something reasonable with all
3270 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3271 /// We recognize several classes of intrinsics by their argument types and
3272 /// ModRefBehaviour and apply special instrumentation when we are reasonably
3273 /// sure that we know what the intrinsic does.
3274 ///
3275 /// We special-case intrinsics where this approach fails. See llvm.bswap
3276 /// handling as an example of that.
3277 bool maybeHandleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3278 unsigned NumArgOperands = I.arg_size();
3279 if (NumArgOperands == 0)
3280 return false;
3281
3282 if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
3283 I.getArgOperand(1)->getType()->isVectorTy() &&
3284 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3285 // This looks like a vector store.
3286 return handleVectorStoreIntrinsic(I);
3287 }
3288
3289 if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
3290 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3291 // This looks like a vector load.
3292 return handleVectorLoadIntrinsic(I);
3293 }
3294
3295 if (I.doesNotAccessMemory())
3296 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3297 return true;
3298
3299 // FIXME: detect and handle SSE maskstore/maskload?
3300 // Some cases are now handled in handleAVXMasked{Load,Store}.
3301 return false;
3302 }
3303
3304 bool maybeHandleUnknownIntrinsic(IntrinsicInst &I) {
3305 if (maybeHandleUnknownIntrinsicUnlogged(I)) {
3307 dumpInst(I);
3308
3309 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3310 << "\n");
3311 return true;
3312 } else
3313 return false;
3314 }
3315
3316 void handleInvariantGroup(IntrinsicInst &I) {
3317 setShadow(&I, getShadow(&I, 0));
3318 setOrigin(&I, getOrigin(&I, 0));
3319 }
3320
3321 void handleLifetimeStart(IntrinsicInst &I) {
3322 if (!PoisonStack)
3323 return;
3324 AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
3325 if (AI)
3326 LifetimeStartList.push_back(std::make_pair(&I, AI));
3327 }
3328
3329 void handleBswap(IntrinsicInst &I) {
3330 IRBuilder<> IRB(&I);
3331 Value *Op = I.getArgOperand(0);
3332 Type *OpType = Op->getType();
3333 setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
3334 getShadow(Op)));
3335 setOrigin(&I, getOrigin(Op));
3336 }
3337
3338 // Uninitialized bits are ok if they appear after the leading/trailing 0's
3339 // and a 1. If the input is all zero, it is fully initialized iff
3340 // !is_zero_poison.
3341 //
3342 // e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
3343 // concrete value 0/1, and ? is an uninitialized bit:
3344 // - 0001 0??? is fully initialized
3345 // - 000? ???? is fully uninitialized (*)
3346 // - ???? ???? is fully uninitialized
3347 // - 0000 0000 is fully uninitialized if is_zero_poison,
3348 // fully initialized otherwise
3349 //
3350 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3351 // only need to poison 4 bits.
3352 //
3353 // OutputShadow =
3354 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3355 // || (is_zero_poison && AllZeroSrc)
3356 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3357 IRBuilder<> IRB(&I);
3358 Value *Src = I.getArgOperand(0);
3359 Value *SrcShadow = getShadow(Src);
3360
3361 Value *False = IRB.getInt1(false);
3362 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3363 I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
3364 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3365 I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
3366
3367 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3368 ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
3369
3370 Value *NotAllZeroShadow =
3371 IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
3372 Value *OutputShadow =
3373 IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
3374
3375 // If zero poison is requested, mix in with the shadow
3376 Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
3377 if (!IsZeroPoison->isZeroValue()) {
3378 Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
3379 OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
3380 }
3381
3382 OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
3383
3384 setShadow(&I, OutputShadow);
3385 setOriginForNaryOp(I);
3386 }
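// Worked example of the formula above for ctlz on i8 (illustrative values):
//   Src = 0b0001'0110, SrcShadow = 0b0000'0111 (three low bits unknown)
//   ConcreteZerosCount = 3, ShadowZerosCount = 5
//   3 >= 5 is false -> the output is clean: the three leading zeros and the
//   first 1 are all initialized, so the count cannot change.
// With SrcShadow = 0b0001'1100 instead, ShadowZerosCount = 3, 3 >= 3 holds and
// the shadow is non-zero, so the output is poisoned.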
3387
3388 /// Handle Arm NEON vector convert intrinsics.
3389 ///
3390 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3391 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3392 ///
3393 /// For x86 SSE vector convert intrinsics, see
3394 /// handleSSEVectorConvertIntrinsic().
3395 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3396 assert(I.arg_size() == 1);
3397
3398 IRBuilder<> IRB(&I);
3399 Value *S0 = getShadow(&I, 0);
3400
3401 /// For scalars:
3402 /// Since they are converting from floating-point to integer, the output is
3403 /// - fully uninitialized if *any* bit of the input is uninitialized
3404 /// - fully initialized if all bits of the input are initialized
3405 /// We apply the same principle on a per-field basis for vectors.
3406 Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
3407 getShadowTy(&I));
3408 setShadow(&I, OutShadow);
3409 setOriginForNaryOp(I);
3410 }
3411
3412 /// Some instructions have additional zero-elements in the return type
3413 /// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
3414 ///
3415 /// This function will return a vector type with the same number of elements
3416 /// as the input, but the same per-element width as the return value, e.g.,
3417 /// <8 x i8>.
3418 FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
3419 assert(isa<FixedVectorType>(getShadowTy(&I)));
3420 FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
3421
3422 // TODO: generalize beyond 2x?
3423 if (ShadowType->getElementCount() ==
3424 cast<VectorType>(Src->getType())->getElementCount() * 2)
3425 ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
3426
3427 assert(ShadowType->getElementCount() ==
3428 cast<VectorType>(Src->getType())->getElementCount());
3429
3430 return ShadowType;
3431 }
3432
3433 /// Doubles the length of a vector shadow (extending with zeros) if necessary
3434 /// to match the length of the shadow for the instruction.
3435 /// If scalar types of the vectors are different, it will use the type of the
3436 /// input vector.
3437 /// This is more type-safe than CreateShadowCast().
3438 Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
3439 IRBuilder<> IRB(&I);
3441 assert(isa<FixedVectorType>(I.getType()));
3442
3443 Value *FullShadow = getCleanShadow(&I);
3444 unsigned ShadowNumElems =
3445 cast<FixedVectorType>(Shadow->getType())->getNumElements();
3446 unsigned FullShadowNumElems =
3447 cast<FixedVectorType>(FullShadow->getType())->getNumElements();
3448
3449 assert((ShadowNumElems == FullShadowNumElems) ||
3450 (ShadowNumElems * 2 == FullShadowNumElems));
3451
3452 if (ShadowNumElems == FullShadowNumElems) {
3453 FullShadow = Shadow;
3454 } else {
3455 // TODO: generalize beyond 2x?
3456 SmallVector<int, 32> ShadowMask(FullShadowNumElems);
3457 std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
3458
3459 // Append zeros
3460 FullShadow =
3461 IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
3462 }
3463
3464 return FullShadow;
3465 }
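// For illustration: extending a <4 x i32> shadow to <8 x i32> uses a
// shufflevector with mask <0,1,2,3,4,5,6,7>; indices 0..3 pick the original
// shadow elements and indices 4..7 pick elements of the clean (all-zero)
// shadow passed as the second operand, so the extra lanes come out
// initialized.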
3466
3467 /// Handle x86 SSE vector conversion.
3468 ///
3469 /// e.g., single-precision to half-precision conversion:
3470 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3471 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3472 ///
3473 /// floating-point to integer:
3474 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3475 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3476 ///
3477 /// Note: if the output has more elements, they are zero-initialized (and
3478 /// therefore the shadow will also be initialized).
3479 ///
3480 /// This differs from handleSSEVectorConvertIntrinsic() because it
3481 /// propagates uninitialized shadow (instead of checking the shadow).
3482 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3483 bool HasRoundingMode) {
3484 if (HasRoundingMode) {
3485 assert(I.arg_size() == 2);
3486 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
3487 assert(RoundingMode->getType()->isIntegerTy());
3488 } else {
3489 assert(I.arg_size() == 1);
3490 }
3491
3492 Value *Src = I.getArgOperand(0);
3493 assert(Src->getType()->isVectorTy());
3494
3495 // The return type might have more elements than the input.
3496 // Temporarily shrink the return type's number of elements.
3497 VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
3498
3499 IRBuilder<> IRB(&I);
3500 Value *S0 = getShadow(&I, 0);
3501
3502 /// For scalars:
3503 /// Since they are converting to and/or from floating-point, the output is:
3504 /// - fully uninitialized if *any* bit of the input is uninitialized
3505 /// - fully initialized if all bits of the input are initialized
3506 /// We apply the same principle on a per-field basis for vectors.
3507 Value *Shadow =
3508 IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
3509
3510 // The return type might have more elements than the input.
3511 // Extend the return type back to its original width if necessary.
3512 Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
3513
3514 setShadow(&I, FullShadow);
3515 setOriginForNaryOp(I);
3516 }
3517
3518 // Instrument x86 SSE vector convert intrinsic.
3519 //
3520 // This function instruments intrinsics like cvtsi2ss:
3521 // %Out = int_xxx_cvtyyy(%ConvertOp)
3522 // or
3523 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
3524 // Intrinsic converts \p NumUsedElements elements of \p ConvertOp to the same
3525 // number of \p Out elements, and (if it has 2 arguments) copies the rest of the
3526 // elements from \p CopyOp.
3527 // In most cases the conversion involves a floating-point value, which may trigger a
3528 // hardware exception when not fully initialized. For this reason we require
3529 // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
3530 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3531 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3532 // return a fully initialized value.
3533 //
3534 // For Arm NEON vector convert intrinsics, see
3535 // handleNEONVectorConvertIntrinsic().
3536 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3537 bool HasRoundingMode = false) {
3538 IRBuilder<> IRB(&I);
3539 Value *CopyOp, *ConvertOp;
3540
3541 assert((!HasRoundingMode ||
3542 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3543 "Invalid rounding mode");
3544
3545 switch (I.arg_size() - HasRoundingMode) {
3546 case 2:
3547 CopyOp = I.getArgOperand(0);
3548 ConvertOp = I.getArgOperand(1);
3549 break;
3550 case 1:
3551 ConvertOp = I.getArgOperand(0);
3552 CopyOp = nullptr;
3553 break;
3554 default:
3555 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3556 }
3557
3558 // The first *NumUsedElements* elements of ConvertOp are converted to the
3559 // same number of output elements. The rest of the output is copied from
3560 // CopyOp, or (if not available) filled with zeroes.
3561 // Combine shadow for elements of ConvertOp that are used in this operation,
3562 // and insert a check.
3563 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3564 // int->any conversion.
3565 Value *ConvertShadow = getShadow(ConvertOp);
3566 Value *AggShadow = nullptr;
3567 if (ConvertOp->getType()->isVectorTy()) {
3568 AggShadow = IRB.CreateExtractElement(
3569 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3570 for (int i = 1; i < NumUsedElements; ++i) {
3571 Value *MoreShadow = IRB.CreateExtractElement(
3572 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3573 AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3574 }
3575 } else {
3576 AggShadow = ConvertShadow;
3577 }
3578 assert(AggShadow->getType()->isIntegerTy());
3579 insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
3580
3581 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3582 // ConvertOp.
3583 if (CopyOp) {
3584 assert(CopyOp->getType() == I.getType());
3585 assert(CopyOp->getType()->isVectorTy());
3586 Value *ResultShadow = getShadow(CopyOp);
3587 Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3588 for (int i = 0; i < NumUsedElements; ++i) {
3589 ResultShadow = IRB.CreateInsertElement(
3590 ResultShadow, ConstantInt::getNullValue(EltTy),
3591 ConstantInt::get(IRB.getInt32Ty(), i));
3592 }
3593 setShadow(&I, ResultShadow);
3594 setOrigin(&I, getOrigin(CopyOp));
3595 } else {
3596 setShadow(&I, getCleanShadow(&I));
3597 setOrigin(&I, getCleanOrigin());
3598 }
3599 }
3600
3601 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3602 // zeroes if it is zero, and all ones otherwise.
3603 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3604 if (S->getType()->isVectorTy())
3605 S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3606 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3607 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3608 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3609 }
3610
3611 // Given a vector, extract its first element, and return all
3612 // zeroes if it is zero, and all ones otherwise.
3613 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3614 Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3615 Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3616 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3617 }
3618
3619 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3620 Type *T = S->getType();
3621 assert(T->isVectorTy());
3622 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3623 return IRB.CreateSExt(S2, T);
3624 }
3625
3626 // Instrument vector shift intrinsic.
3627 //
3628 // This function instruments intrinsics like int_x86_avx2_psll_w.
3629 // Intrinsic shifts %In by %ShiftSize bits.
3630 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3631 // size, and the rest is ignored. Behavior is defined even if shift size is
3632 // greater than register (or field) width.
3633 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3634 assert(I.arg_size() == 2);
3635 IRBuilder<> IRB(&I);
3636 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3637 // Otherwise perform the same shift on S1.
3638 Value *S1 = getShadow(&I, 0);
3639 Value *S2 = getShadow(&I, 1);
3640 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3641 : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3642 Value *V1 = I.getOperand(0);
3643 Value *V2 = I.getOperand(1);
3644 Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3645 {IRB.CreateBitCast(S1, V1->getType()), V2});
3646 Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3647 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3648 setOriginForNaryOp(I);
3649 }
3650
3651 // Get an MMX-sized (64-bit) vector type, or optionally, other sized
3652 // vectors.
3653 Type *getMMXVectorTy(unsigned EltSizeInBits,
3654 unsigned X86_MMXSizeInBits = 64) {
3655 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3656 "Illegal MMX vector element size");
3657 return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3658 X86_MMXSizeInBits / EltSizeInBits);
3659 }
3660
3661 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3662 // intrinsic.
3663 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3664 switch (id) {
3665 case Intrinsic::x86_sse2_packsswb_128:
3666 case Intrinsic::x86_sse2_packuswb_128:
3667 return Intrinsic::x86_sse2_packsswb_128;
3668
3669 case Intrinsic::x86_sse2_packssdw_128:
3670 case Intrinsic::x86_sse41_packusdw:
3671 return Intrinsic::x86_sse2_packssdw_128;
3672
3673 case Intrinsic::x86_avx2_packsswb:
3674 case Intrinsic::x86_avx2_packuswb:
3675 return Intrinsic::x86_avx2_packsswb;
3676
3677 case Intrinsic::x86_avx2_packssdw:
3678 case Intrinsic::x86_avx2_packusdw:
3679 return Intrinsic::x86_avx2_packssdw;
3680
3681 case Intrinsic::x86_mmx_packsswb:
3682 case Intrinsic::x86_mmx_packuswb:
3683 return Intrinsic::x86_mmx_packsswb;
3684
3685 case Intrinsic::x86_mmx_packssdw:
3686 return Intrinsic::x86_mmx_packssdw;
3687
3688 case Intrinsic::x86_avx512_packssdw_512:
3689 case Intrinsic::x86_avx512_packusdw_512:
3690 return Intrinsic::x86_avx512_packssdw_512;
3691
3692 case Intrinsic::x86_avx512_packsswb_512:
3693 case Intrinsic::x86_avx512_packuswb_512:
3694 return Intrinsic::x86_avx512_packsswb_512;
3695
3696 default:
3697 llvm_unreachable("unexpected intrinsic id");
3698 }
3699 }
3700
3701 // Instrument vector pack intrinsic.
3702 //
3703 // This function instruments intrinsics like x86_mmx_packsswb, that
3704 // packs elements of 2 input vectors into half as many bits with saturation.
3705 // Shadow is propagated with the signed variant of the same intrinsic applied
3706 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3707 // MMXEltSizeInBits is used only for x86mmx arguments.
3708 //
3709 // TODO: consider using GetMinMaxUnsigned() to handle saturation precisely
3710 void handleVectorPackIntrinsic(IntrinsicInst &I,
3711 unsigned MMXEltSizeInBits = 0) {
3712 assert(I.arg_size() == 2);
3713 IRBuilder<> IRB(&I);
3714 Value *S1 = getShadow(&I, 0);
3715 Value *S2 = getShadow(&I, 1);
3716 assert(S1->getType()->isVectorTy());
3717
3718 // SExt and ICmpNE below must apply to individual elements of input vectors.
3719 // In case of x86mmx arguments, cast them to appropriate vector types and
3720 // back.
3721 Type *T =
3722 MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
3723 if (MMXEltSizeInBits) {
3724 S1 = IRB.CreateBitCast(S1, T);
3725 S2 = IRB.CreateBitCast(S2, T);
3726 }
3727 Value *S1_ext =
3728 IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
3729 Value *S2_ext =
3730 IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
3731 if (MMXEltSizeInBits) {
3732 S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
3733 S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
3734 }
3735
3736 Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
3737 {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3738 "_msprop_vector_pack");
3739 if (MMXEltSizeInBits)
3740 S = IRB.CreateBitCast(S, getShadowTy(&I));
3741 setShadow(&I, S);
3742 setOriginForNaryOp(I);
3743 }
3744
3745 // Convert `Mask` into `<n x i1>`.
3746 Constant *createDppMask(unsigned Width, unsigned Mask) {
3747 SmallVector<Constant *, 4> R(Width);
3748 for (auto &M : R) {
3749 M = ConstantInt::getBool(F.getContext(), Mask & 1);
3750 Mask >>= 1;
3751 }
3752 return ConstantVector::get(R);
3753 }
3754
3755 // Calculate the output shadow as an array of booleans `<n x i1>`, assuming
3756 // that if any arg is poisoned, the entire dot product is poisoned.
3757 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3758 unsigned DstMask) {
3759 const unsigned Width =
3760 cast<FixedVectorType>(S->getType())->getNumElements();
3761
3762 S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
3763 Constant::getNullValue(S->getType()));
3764 Value *SElem = IRB.CreateOrReduce(S);
3765 Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
3766 Value *DstMaskV = createDppMask(Width, DstMask);
3767
3768 return IRB.CreateSelect(
3769 IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
3770 }
3771
3772 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3773 //
3774 // The 2 and 4 element versions produce a single scalar dot product and then
3775 // put it into the elements of the output vector selected by the 4 lowest bits
3776 // of the mask. The top 4 bits of the mask control which elements of the input
3777 // to use for the dot product.
3778 //
3779 // The 8 element version's mask still has only 4 bits for the input and 4 bits
3780 // for the output mask. According to the spec it just operates as the 4 element
3781 // version on the first 4 elements of inputs and output, and then on the last
3782 // 4 elements of inputs and output.
3783 void handleDppIntrinsic(IntrinsicInst &I) {
3784 IRBuilder<> IRB(&I);
3785
3786 Value *S0 = getShadow(&I, 0);
3787 Value *S1 = getShadow(&I, 1);
3788 Value *S = IRB.CreateOr(S0, S1);
3789
3790 const unsigned Width =
3791 cast<FixedVectorType>(S->getType())->getNumElements();
3792 assert(Width == 2 || Width == 4 || Width == 8);
3793
3794 const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3795 const unsigned SrcMask = Mask >> 4;
3796 const unsigned DstMask = Mask & 0xf;
3797
3798 // Calculate shadow as `<n x i1>`.
3799 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3800 if (Width == 8) {
3801 // The first 4 elements of the shadow are already calculated. findDppPoisonedOutput
3802 // operates on 32-bit masks, so we can just shift the masks and repeat.
3803 SI1 = IRB.CreateOr(
3804 SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
3805 }
3806 // Extend to real size of shadow, poisoning either all or none bits of an
3807 // element.
3808 S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
3809
3810 setShadow(&I, S);
3811 setOriginForNaryOp(I);
3812 }
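// For illustration: dpps with mask 0xF1 has SrcMask = 0xF (use all 4 input
// elements for the dot product) and DstMask = 0x1 (write the result only to
// element 0). If any selected input lane is poisoned, findDppPoisonedOutput
// returns <1,0,0,0>, which is then sext'ed so that exactly the written lane of
// the output is reported as uninitialized.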
3813
3814 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3815 C = CreateAppToShadowCast(IRB, C);
3816 FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
3817 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3818 C = IRB.CreateAShr(C, ElSize - 1);
3819 FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
3820 return IRB.CreateTrunc(C, FVT);
3821 }
3822
3823 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3824 void handleBlendvIntrinsic(IntrinsicInst &I) {
3825 Value *C = I.getOperand(2);
3826 Value *T = I.getOperand(1);
3827 Value *F = I.getOperand(0);
3828
3829 Value *Sc = getShadow(&I, 2);
3830 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
3831
3832 {
3833 IRBuilder<> IRB(&I);
3834 // Extract top bit from condition and its shadow.
3835 C = convertBlendvToSelectMask(IRB, C);
3836 Sc = convertBlendvToSelectMask(IRB, Sc);
3837
3838 setShadow(C, Sc);
3839 setOrigin(C, Oc);
3840 }
3841
3842 handleSelectLikeInst(I, C, T, F);
3843 }
3844
3845 // Instrument sum-of-absolute-differences intrinsic.
3846 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3847 const unsigned SignificantBitsPerResultElement = 16;
3848 Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
3849 unsigned ZeroBitsPerResultElement =
3850 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3851
3852 IRBuilder<> IRB(&I);
3853 auto *Shadow0 = getShadow(&I, 0);
3854 auto *Shadow1 = getShadow(&I, 1);
3855 Value *S = IRB.CreateOr(Shadow0, Shadow1);
3856 S = IRB.CreateBitCast(S, ResTy);
3857 S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3858 ResTy);
3859 S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3860 S = IRB.CreateBitCast(S, getShadowTy(&I));
3861 setShadow(&I, S);
3862 setOriginForNaryOp(I);
3863 }
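// For illustration: psadbw produces at most 16 significant bits per 64-bit
// result element (a sum of 8 absolute byte differences), so after the
// all-or-nothing sext the shadow is shifted right by 48 bits. The upper 48
// bits of each result element stay clean (they are always zero in the real
// result) and the low 16 bits become poisoned if any input byte in that
// group was poisoned.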
3864
3865 // Instrument multiply-add(-accumulate)? intrinsics.
3866 //
3867 // e.g., Two operands:
3868 // <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
3869 //
3870 // Two operands which require an EltSizeInBits override:
3871 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
3872 //
3873 // Three operands:
3874 // <4 x i32> @llvm.x86.avx512.vpdpbusd.128
3875 // (<4 x i32> %s, <16 x i8> %a, <16 x i8> %b)
3876 // (this is equivalent to multiply-add on %a and %b, followed by
3877 // adding/"accumulating" %s. "Accumulation" stores the result in one
3878 // of the source registers, but this accumulate vs. add distinction
3879 // is lost when dealing with LLVM intrinsics.)
3880 void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
3881 unsigned EltSizeInBits = 0) {
3882 IRBuilder<> IRB(&I);
3883
3884 [[maybe_unused]] FixedVectorType *ReturnType =
3885 cast<FixedVectorType>(I.getType());
3886 assert(isa<FixedVectorType>(ReturnType));
3887
3888 // Vectors A and B, and shadows
3889 Value *Va = nullptr;
3890 Value *Vb = nullptr;
3891 Value *Sa = nullptr;
3892 Value *Sb = nullptr;
3893
3894 assert(I.arg_size() == 2 || I.arg_size() == 3);
3895 if (I.arg_size() == 2) {
3896 Va = I.getOperand(0);
3897 Vb = I.getOperand(1);
3898
3899 Sa = getShadow(&I, 0);
3900 Sb = getShadow(&I, 1);
3901 } else if (I.arg_size() == 3) {
3902 // Operand 0 is the accumulator. We will deal with that below.
3903 Va = I.getOperand(1);
3904 Vb = I.getOperand(2);
3905
3906 Sa = getShadow(&I, 1);
3907 Sb = getShadow(&I, 2);
3908 }
3909
3910 FixedVectorType *ParamType = cast<FixedVectorType>(Va->getType());
3911 assert(ParamType == Vb->getType());
3912
3913 assert(ParamType->getPrimitiveSizeInBits() ==
3914 ReturnType->getPrimitiveSizeInBits());
3915
3916 if (I.arg_size() == 3) {
3917 [[maybe_unused]] auto *AccumulatorType =
3918 cast<FixedVectorType>(I.getOperand(0)->getType());
3919 assert(AccumulatorType == ReturnType);
3920 }
3921
3922 FixedVectorType *ImplicitReturnType = ReturnType;
3923 // Step 1: instrument multiplication of corresponding vector elements
3924 if (EltSizeInBits) {
3925 ImplicitReturnType = cast<FixedVectorType>(
3926 getMMXVectorTy(EltSizeInBits * ReductionFactor,
3927 ParamType->getPrimitiveSizeInBits()));
3928 ParamType = cast<FixedVectorType>(
3929 getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
3930
3931 Va = IRB.CreateBitCast(Va, ParamType);
3932 Vb = IRB.CreateBitCast(Vb, ParamType);
3933
3934 Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
3935 Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
3936 } else {
3937 assert(ParamType->getNumElements() ==
3938 ReturnType->getNumElements() * ReductionFactor);
3939 }
3940
3941 // Multiplying an *initialized* zero by an uninitialized element results in
3942 // an initialized zero element.
3943 //
3944 // This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
3945 // results in an unpoisoned value. We can therefore adapt the visitAnd()
3946 // instrumentation:
3947 // OutShadow = (SaNonZero & SbNonZero)
3948 // | (VaNonZero & SbNonZero)
3949 // | (SaNonZero & VbNonZero)
3950 // where non-zero is checked on a per-element basis (not per bit).
3951 Value *SZero = Constant::getNullValue(Va->getType());
3952 Value *VZero = Constant::getNullValue(Sa->getType());
3953 Value *SaNonZero = IRB.CreateICmpNE(Sa, SZero);
3954 Value *SbNonZero = IRB.CreateICmpNE(Sb, SZero);
3955 Value *VaNonZero = IRB.CreateICmpNE(Va, VZero);
3956 Value *VbNonZero = IRB.CreateICmpNE(Vb, VZero);
3957
3958 Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
3959 Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
3960 Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
3961
3962 // Each element of the vector is represented by a single bit (poisoned or
3963 // not) e.g., <8 x i1>.
3964 Value *And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
3965
3966 // Extend <8 x i1> to <8 x i16>.
3967 // (The real pmadd intrinsic would have computed intermediate values of
3968 // <8 x i32>, but that is irrelevant for our shadow purposes because we
3969 // consider each element to be either fully initialized or fully
3970 // uninitialized.)
3971 And = IRB.CreateSExt(And, Sa->getType());
3972
3973 // Step 2: instrument horizontal add
3974 // We don't need bit-precise horizontalReduce because we only want to check
3975 // if each pair/quad of elements is fully zero.
3976 // Cast to <4 x i32>.
3977 Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
3978
3979 // Compute <4 x i1>, then extend back to <4 x i32>.
3980 Value *OutShadow = IRB.CreateSExt(
3981 IRB.CreateICmpNE(Horizontal,
3982 Constant::getNullValue(Horizontal->getType())),
3983 ImplicitReturnType);
3984
3985 // Cast it back to the required fake return type (if MMX: <1 x i64>; for
3986 // AVX, it is already correct).
3987 if (EltSizeInBits)
3988 OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
3989
3990 // Step 3 (if applicable): instrument accumulator
3991 if (I.arg_size() == 3)
3992 OutShadow = IRB.CreateOr(OutShadow, getShadow(&I, 0));
3993
3994 setShadow(&I, OutShadow);
3995 setOriginForNaryOp(I);
3996 }
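// Worked example of the per-element rule above (illustrative values):
// multiplying lanes with Va[i] = 5, Sa[i] fully poisoned, and Vb[i] = 0 clean:
//   SaNonZero & SbNonZero = 1 & 0 = 0
//   VaNonZero & SbNonZero = 1 & 0 = 0
//   SaNonZero & VbNonZero = 1 & 0 = 0
// so the product lane is treated as an initialized zero, matching the
// "initialized zero times uninitialized is initialized zero" observation.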
3997
3998 // Instrument compare-packed intrinsic.
3999 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
4000 // all-ones shadow.
4001 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
4002 IRBuilder<> IRB(&I);
4003 Type *ResTy = getShadowTy(&I);
4004 auto *Shadow0 = getShadow(&I, 0);
4005 auto *Shadow1 = getShadow(&I, 1);
4006 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4007 Value *S = IRB.CreateSExt(
4008 IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
4009 setShadow(&I, S);
4010 setOriginForNaryOp(I);
4011 }
4012
4013 // Instrument compare-scalar intrinsic.
4014 // This handles both cmp* intrinsics which return the result in the first
4015 // element of a vector, and comi* which return the result as i32.
4016 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
4017 IRBuilder<> IRB(&I);
4018 auto *Shadow0 = getShadow(&I, 0);
4019 auto *Shadow1 = getShadow(&I, 1);
4020 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4021 Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
4022 setShadow(&I, S);
4023 setOriginForNaryOp(I);
4024 }
4025
4026 // Instrument generic vector reduction intrinsics
4027 // by ORing together all their fields.
4028 //
4029 // If AllowShadowCast is true, the return type does not need to be the same
4030 // type as the fields
4031 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
4032 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
4033 assert(I.arg_size() == 1);
4034
4035 IRBuilder<> IRB(&I);
4036 Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
4037 if (AllowShadowCast)
4038 S = CreateShadowCast(IRB, S, getShadowTy(&I));
4039 else
4040 assert(S->getType() == getShadowTy(&I));
4041 setShadow(&I, S);
4042 setOriginForNaryOp(I);
4043 }
4044
4045 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
4046 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
4047 // %a1)
4048 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
4049 //
4050 // The type of the return value, initial starting value, and elements of the
4051 // vector must be identical.
4052 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
4053 assert(I.arg_size() == 2);
4054
4055 IRBuilder<> IRB(&I);
4056 Value *Shadow0 = getShadow(&I, 0);
4057 Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
4058 assert(Shadow0->getType() == Shadow1->getType());
4059 Value *S = IRB.CreateOr(Shadow0, Shadow1);
4060 assert(S->getType() == getShadowTy(&I));
4061 setShadow(&I, S);
4062 setOriginForNaryOp(I);
4063 }
4064
4065 // Instrument vector.reduce.or intrinsic.
4066 // Valid (non-poisoned) set bits in the operand pull low the
4067 // corresponding shadow bits.
4068 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
4069 assert(I.arg_size() == 1);
4070
4071 IRBuilder<> IRB(&I);
4072 Value *OperandShadow = getShadow(&I, 0);
4073 Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
4074 Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
4075 // Bit N is clean if any field's bit N is 1 and unpoisoned
4076 Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
4077 // Otherwise, it is clean if every field's bit N is unpoisoned
4078 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4079 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4080
4081 setShadow(&I, S);
4082 setOrigin(&I, getOrigin(&I, 0));
4083 }
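// Worked example (illustrative values): reduce.or of <2 x i8> with values
// <0xFF, 0x00> and shadows <0x0F, 0x00>:
//   OperandUnsetOrPoison = <~0xFF | 0x0F, ~0x00 | 0x00> = <0x0F, 0xFF>
//   OutShadowMask = and-reduce = 0x0F
//   OrShadow      = or-reduce of the shadows = 0x0F
//   S = 0x0F & 0x0F = 0x0F
// Bits 4..7 are clean because lane 0 contributes initialized 1s there; bits
// 0..3 stay poisoned because no lane has an initialized 1 in them.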
4084
4085 // Instrument vector.reduce.and intrinsic.
4086 // Valid (non-poisoned) unset bits in the operand pull down the
4087 // corresponding shadow bits.
4088 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
4089 assert(I.arg_size() == 1);
4090
4091 IRBuilder<> IRB(&I);
4092 Value *OperandShadow = getShadow(&I, 0);
4093 Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
4094 // Bit N is clean if any field's bit N is 0 and unpoisoned
4095 Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
4096 // Otherwise, it is clean if every field's bit N is unpoisoned
4097 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4098 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4099
4100 setShadow(&I, S);
4101 setOrigin(&I, getOrigin(&I, 0));
4102 }
4103
4104 void handleStmxcsr(IntrinsicInst &I) {
4105 IRBuilder<> IRB(&I);
4106 Value *Addr = I.getArgOperand(0);
4107 Type *Ty = IRB.getInt32Ty();
4108 Value *ShadowPtr =
4109 getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
4110
4111 IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
4112
4113 if (ClCheckAccessAddress)
4114 insertCheckShadowOf(Addr, &I);
4115 }
4116
4117 void handleLdmxcsr(IntrinsicInst &I) {
4118 if (!InsertChecks)
4119 return;
4120
4121 IRBuilder<> IRB(&I);
4122 Value *Addr = I.getArgOperand(0);
4123 Type *Ty = IRB.getInt32Ty();
4124 const Align Alignment = Align(1);
4125 Value *ShadowPtr, *OriginPtr;
4126 std::tie(ShadowPtr, OriginPtr) =
4127 getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
4128
4129 if (ClCheckAccessAddress)
4130 insertCheckShadowOf(Addr, &I);
4131
4132 Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
4133 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
4134 : getCleanOrigin();
4135 insertCheckShadow(Shadow, Origin, &I);
4136 }
4137
4138 void handleMaskedExpandLoad(IntrinsicInst &I) {
4139 IRBuilder<> IRB(&I);
4140 Value *Ptr = I.getArgOperand(0);
4141 MaybeAlign Align = I.getParamAlign(0);
4142 Value *Mask = I.getArgOperand(1);
4143 Value *PassThru = I.getArgOperand(2);
4144
4145 if (ClCheckAccessAddress) {
4146 insertCheckShadowOf(Ptr, &I);
4147 insertCheckShadowOf(Mask, &I);
4148 }
4149
4150 if (!PropagateShadow) {
4151 setShadow(&I, getCleanShadow(&I));
4152 setOrigin(&I, getCleanOrigin());
4153 return;
4154 }
4155
4156 Type *ShadowTy = getShadowTy(&I);
4157 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4158 auto [ShadowPtr, OriginPtr] =
4159 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
4160
4161 Value *Shadow =
4162 IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
4163 getShadow(PassThru), "_msmaskedexpload");
4164
4165 setShadow(&I, Shadow);
4166
4167 // TODO: Store origins.
4168 setOrigin(&I, getCleanOrigin());
4169 }
4170
4171 void handleMaskedCompressStore(IntrinsicInst &I) {
4172 IRBuilder<> IRB(&I);
4173 Value *Values = I.getArgOperand(0);
4174 Value *Ptr = I.getArgOperand(1);
4175 MaybeAlign Align = I.getParamAlign(1);
4176 Value *Mask = I.getArgOperand(2);
4177
4178 if (ClCheckAccessAddress) {
4179 insertCheckShadowOf(Ptr, &I);
4180 insertCheckShadowOf(Mask, &I);
4181 }
4182
4183 Value *Shadow = getShadow(Values);
4184 Type *ElementShadowTy =
4185 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4186 auto [ShadowPtr, OriginPtrs] =
4187 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
4188
4189 IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
4190
4191 // TODO: Store origins.
4192 }
4193
4194 void handleMaskedGather(IntrinsicInst &I) {
4195 IRBuilder<> IRB(&I);
4196 Value *Ptrs = I.getArgOperand(0);
4197 const Align Alignment(
4198 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4199 Value *Mask = I.getArgOperand(2);
4200 Value *PassThru = I.getArgOperand(3);
4201
4202 Type *PtrsShadowTy = getShadowTy(Ptrs);
4203 if (ClCheckAccessAddress) {
4204 insertCheckShadowOf(Mask, &I);
4205 Value *MaskedPtrShadow = IRB.CreateSelect(
4206 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4207 "_msmaskedptrs");
4208 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4209 }
4210
4211 if (!PropagateShadow) {
4212 setShadow(&I, getCleanShadow(&I));
4213 setOrigin(&I, getCleanOrigin());
4214 return;
4215 }
4216
4217 Type *ShadowTy = getShadowTy(&I);
4218 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4219 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4220 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
4221
4222 Value *Shadow =
4223 IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
4224 getShadow(PassThru), "_msmaskedgather");
4225
4226 setShadow(&I, Shadow);
4227
4228 // TODO: Store origins.
4229 setOrigin(&I, getCleanOrigin());
4230 }
4231
4232 void handleMaskedScatter(IntrinsicInst &I) {
4233 IRBuilder<> IRB(&I);
4234 Value *Values = I.getArgOperand(0);
4235 Value *Ptrs = I.getArgOperand(1);
4236 const Align Alignment(
4237 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4238 Value *Mask = I.getArgOperand(3);
4239
4240 Type *PtrsShadowTy = getShadowTy(Ptrs);
4241 if (ClCheckAccessAddress) {
4242 insertCheckShadowOf(Mask, &I);
4243 Value *MaskedPtrShadow = IRB.CreateSelect(
4244 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4245 "_msmaskedptrs");
4246 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4247 }
4248
4249 Value *Shadow = getShadow(Values);
4250 Type *ElementShadowTy =
4251 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4252 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4253 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
4254
4255 IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
4256
4257 // TODO: Store origin.
4258 }
4259
4260 // Intrinsic::masked_store
4261 //
4262 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4263 // stores are lowered to Intrinsic::masked_store.
4264 void handleMaskedStore(IntrinsicInst &I) {
4265 IRBuilder<> IRB(&I);
4266 Value *V = I.getArgOperand(0);
4267 Value *Ptr = I.getArgOperand(1);
4268 const Align Alignment(
4269 cast<ConstantInt>(I.getArgOperand(2))->getZExtValue());
4270 Value *Mask = I.getArgOperand(3);
4271 Value *Shadow = getShadow(V);
4272
4273 if (ClCheckAccessAddress) {
4274 insertCheckShadowOf(Ptr, &I);
4275 insertCheckShadowOf(Mask, &I);
4276 }
4277
4278 Value *ShadowPtr;
4279 Value *OriginPtr;
4280 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
4281 Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
4282
4283 IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
4284
4285 if (!MS.TrackOrigins)
4286 return;
4287
4288 auto &DL = F.getDataLayout();
4289 paintOrigin(IRB, getOrigin(V), OriginPtr,
4290 DL.getTypeStoreSize(Shadow->getType()),
4291 std::max(Alignment, kMinOriginAlignment));
4292 }
4293
4294 // Intrinsic::masked_load
4295 //
4296 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4297 // loads are lowered to Intrinsic::masked_load.
4298 void handleMaskedLoad(IntrinsicInst &I) {
4299 IRBuilder<> IRB(&I);
4300 Value *Ptr = I.getArgOperand(0);
4301 const Align Alignment(
4302 cast<ConstantInt>(I.getArgOperand(1))->getZExtValue());
4303 Value *Mask = I.getArgOperand(2);
4304 Value *PassThru = I.getArgOperand(3);
4305
4306 if (ClCheckAccessAddress) {
4307 insertCheckShadowOf(Ptr, &I);
4308 insertCheckShadowOf(Mask, &I);
4309 }
4310
4311 if (!PropagateShadow) {
4312 setShadow(&I, getCleanShadow(&I));
4313 setOrigin(&I, getCleanOrigin());
4314 return;
4315 }
4316
4317 Type *ShadowTy = getShadowTy(&I);
4318 Value *ShadowPtr, *OriginPtr;
4319 std::tie(ShadowPtr, OriginPtr) =
4320 getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4321 setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
4322 getShadow(PassThru), "_msmaskedld"));
4323
4324 if (!MS.TrackOrigins)
4325 return;
4326
4327 // Choose between PassThru's and the loaded value's origins.
4328 Value *MaskedPassThruShadow = IRB.CreateAnd(
4329 getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
4330
4331 Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
4332
4333 Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
4334 Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
4335
4336 setOrigin(&I, Origin);
4337 }
4338
4339 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4340 // dst mask src
4341 //
4342 // AVX512 masked stores are lowered to Intrinsic::masked_store and are handled
4343 // by handleMaskedStore.
4344 //
4345 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4346 // vector of integers, unlike the LLVM masked intrinsics, which require a
4347 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4348 // mentions that the x86 backend does not know how to efficiently convert
4349 // from a vector of booleans back into the AVX mask format; therefore, they
4350 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4351 // intrinsics.
4352 void handleAVXMaskedStore(IntrinsicInst &I) {
4353 assert(I.arg_size() == 3);
4354
4355 IRBuilder<> IRB(&I);
4356
4357 Value *Dst = I.getArgOperand(0);
4358 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4359
4360 Value *Mask = I.getArgOperand(1);
4361 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4362
4363 Value *Src = I.getArgOperand(2);
4364 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4365
4366 const Align Alignment = Align(1);
4367
4368 Value *SrcShadow = getShadow(Src);
4369
4370 if (ClCheckAccessAddress) {
4371 insertCheckShadowOf(Dst, &I);
4372 insertCheckShadowOf(Mask, &I);
4373 }
4374
4375 Value *DstShadowPtr;
4376 Value *DstOriginPtr;
4377 std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
4378 Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
4379
4380 SmallVector<Value *, 2> ShadowArgs;
4381 ShadowArgs.append(1, DstShadowPtr);
4382 ShadowArgs.append(1, Mask);
4383 // The intrinsic may require floating-point but shadows can be arbitrary
4384 // bit patterns, of which some would be interpreted as "invalid"
4385 // floating-point values (NaN etc.); we assume the intrinsic will happily
4386 // copy them.
4387 ShadowArgs.append(1, IRB.CreateBitCast(SrcShadow, Src->getType()));
4388
4389 CallInst *CI =
4390 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
4391 setShadow(&I, CI);
4392
4393 if (!MS.TrackOrigins)
4394 return;
4395
4396 // Approximation only
4397 auto &DL = F.getDataLayout();
4398 paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
4399 DL.getTypeStoreSize(SrcShadow->getType()),
4400 std::max(Alignment, kMinOriginAlignment));
4401 }
4402
4403 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4404 // return src mask
4405 //
4406 // Masked-off values are replaced with 0, which conveniently also represents
4407 // initialized memory.
4408 //
4409 // AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
4410 // by handleMaskedLoad.
4411 //
4412 // We do not combine this with handleMaskedLoad; see comment in
4413 // handleAVXMaskedStore for the rationale.
4414 //
4415 // This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
4416 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4417 // parameter.
4418 void handleAVXMaskedLoad(IntrinsicInst &I) {
4419 assert(I.arg_size() == 2);
4420
4421 IRBuilder<> IRB(&I);
4422
4423 Value *Src = I.getArgOperand(0);
4424 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4425
4426 Value *Mask = I.getArgOperand(1);
4427 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4428
4429 const Align Alignment = Align(1);
4430
4431 if (ClCheckAccessAddress) {
4432 insertCheckShadowOf(Mask, &I);
4433 }
4434
4435 Type *SrcShadowTy = getShadowTy(Src);
4436 Value *SrcShadowPtr, *SrcOriginPtr;
4437 std::tie(SrcShadowPtr, SrcOriginPtr) =
4438 getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
4439
4440 SmallVector<Value *, 2> ShadowArgs;
4441 ShadowArgs.append(1, SrcShadowPtr);
4442 ShadowArgs.append(1, Mask);
4443
4444 CallInst *CI =
4445 IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
4446 // The AVX masked load intrinsics do not have integer variants. We use the
4447 // floating-point variants, which will happily copy the shadows even if
4448 // they are interpreted as "invalid" floating-point values (NaN etc.).
4449 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4450
4451 if (!MS.TrackOrigins)
4452 return;
4453
4454 // The "pass-through" value is always zero (initialized). To the extent
4455 // that that results in initialized aligned 4-byte chunks, the origin value
4456 // is ignored. It is therefore correct to simply copy the origin from src.
4457 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
4458 setOrigin(&I, PtrSrcOrigin);
4459 }
4460
4461 // Test whether the mask indices are initialized, only checking the bits that
4462 // are actually used.
4463 //
4464 // e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
4465 // used/checked.
4466 void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
4467 assert(isFixedIntVector(Idx));
4468 auto IdxVectorSize =
4469 cast<FixedVectorType>(Idx->getType())->getNumElements();
4470 assert(isPowerOf2_64(IdxVectorSize));
4471
4472 // Compiler isn't smart enough, let's help it
4473 if (isa<Constant>(Idx))
4474 return;
4475
4476 auto *IdxShadow = getShadow(Idx);
4477 Value *Truncated = IRB.CreateTrunc(
4478 IdxShadow,
4479 FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
4480 IdxVectorSize));
4481 insertCheckShadow(Truncated, getOrigin(Idx), I);
4482 }
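// For illustration: with a <32 x i16> index vector only log2(32) = 5 bits of
// each index are meaningful, so the index shadow is truncated to <32 x i5>
// before the check; uninitialized bits above bit 4 of an index are ignored.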
4483
4484 // Instrument AVX permutation intrinsic.
4485 // We apply the same permutation (argument index 1) to the shadow.
4486 void handleAVXVpermilvar(IntrinsicInst &I) {
4487 IRBuilder<> IRB(&I);
4488 Value *Shadow = getShadow(&I, 0);
4489 maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
4490
4491 // Shadows are integer-ish types but some intrinsics require a
4492 // different (e.g., floating-point) type.
4493 Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
4494 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4495 {Shadow, I.getArgOperand(1)});
4496
4497 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4498 setOriginForNaryOp(I);
4499 }
4500
4501 // Instrument AVX permutation intrinsic.
4502 // We apply the same permutation (argument index 1) to the shadows.
4503 void handleAVXVpermi2var(IntrinsicInst &I) {
4504 assert(I.arg_size() == 3);
4505 assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
4506 assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
4507 assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
4508 [[maybe_unused]] auto ArgVectorSize =
4509 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4510 assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
4511 ->getNumElements() == ArgVectorSize);
4512 assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
4513 ->getNumElements() == ArgVectorSize);
4514 assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
4515 assert(I.getType() == I.getArgOperand(0)->getType());
4516 assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
4517 IRBuilder<> IRB(&I);
4518 Value *AShadow = getShadow(&I, 0);
4519 Value *Idx = I.getArgOperand(1);
4520 Value *BShadow = getShadow(&I, 2);
4521
4522 maskedCheckAVXIndexShadow(IRB, Idx, &I);
4523
4524 // Shadows are integer-ish types but some intrinsics require a
4525 // different (e.g., floating-point) type.
4526 AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
4527 BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
4528 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4529 {AShadow, Idx, BShadow});
4530 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4531 setOriginForNaryOp(I);
4532 }
4533
4534 [[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
4535 return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
4536 }
4537
4538 [[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
4539 return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
4540 }
4541
4542 [[maybe_unused]] static bool isFixedIntVector(const Value *V) {
4543 return isFixedIntVectorTy(V->getType());
4544 }
4545
4546 [[maybe_unused]] static bool isFixedFPVector(const Value *V) {
4547 return isFixedFPVectorTy(V->getType());
4548 }
4549
4550 // e.g., <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
4551 // (<16 x float> a, <16 x i32> writethru, i16 mask,
4552 // i32 rounding)
4553 //
4554 // Inconveniently, some similar intrinsics have a different operand order:
4555 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
4556 // (<16 x float> a, i32 rounding, <16 x i16> writethru,
4557 // i16 mask)
4558 //
4559 // If the return type has more elements than A, the excess elements are
4560 // zeroed (and the corresponding shadow is initialized).
4561 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
4562 // (<4 x float> a, i32 rounding, <8 x i16> writethru,
4563 // i8 mask)
4564 //
4565 // dst[i] = mask[i] ? convert(a[i]) : writethru[i]
4566 // dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
4567 // where all_or_nothing(x) is fully uninitialized if x has any
4568 // uninitialized bits
4569 void handleAVX512VectorConvertFPToInt(IntrinsicInst &I, bool LastMask) {
4570 IRBuilder<> IRB(&I);
4571
4572 assert(I.arg_size() == 4);
4573 Value *A = I.getOperand(0);
4574 Value *WriteThrough;
4575 Value *Mask;
4576 Value *RoundingMode;
4577 if (LastMask) {
4578 WriteThrough = I.getOperand(2);
4579 Mask = I.getOperand(3);
4580 RoundingMode = I.getOperand(1);
4581 } else {
4582 WriteThrough = I.getOperand(1);
4583 Mask = I.getOperand(2);
4584 RoundingMode = I.getOperand(3);
4585 }
4586
4587 assert(isFixedFPVector(A));
4588 assert(isFixedIntVector(WriteThrough));
4589
4590 unsigned ANumElements =
4591 cast<FixedVectorType>(A->getType())->getNumElements();
4592 [[maybe_unused]] unsigned WriteThruNumElements =
4593 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4594 assert(ANumElements == WriteThruNumElements ||
4595 ANumElements * 2 == WriteThruNumElements);
4596
4597 assert(Mask->getType()->isIntegerTy());
4598 unsigned MaskNumElements = Mask->getType()->getScalarSizeInBits();
4599 assert(ANumElements == MaskNumElements ||
4600 ANumElements * 2 == MaskNumElements);
4601
4602 assert(WriteThruNumElements == MaskNumElements);
4603
4604 // Some bits of the mask may be unused, though it's unusual to have partly
4605 // uninitialized bits.
4606 insertCheckShadowOf(Mask, &I);
4607
4608 assert(RoundingMode->getType()->isIntegerTy());
4609 // Only some bits of the rounding mode are used, though it's very
4610 // unusual to have uninitialized bits there (more commonly, it's a
4611 // constant).
4612 insertCheckShadowOf(RoundingMode, &I);
4613
4614 assert(I.getType() == WriteThrough->getType());
4615
4616 Value *AShadow = getShadow(A);
4617 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4618
4619 if (ANumElements * 2 == MaskNumElements) {
4620 // Ensure that the irrelevant bits of the mask are zero, hence selecting
4621 // from the zeroed shadow instead of the writethrough's shadow.
4622 Mask =
4623 IRB.CreateTrunc(Mask, IRB.getIntNTy(ANumElements), "_ms_mask_trunc");
4624 Mask =
4625 IRB.CreateZExt(Mask, IRB.getIntNTy(MaskNumElements), "_ms_mask_zext");
4626 }
4627
4628 // Convert i16 mask to <16 x i1>
4629 Mask = IRB.CreateBitCast(
4630 Mask, FixedVectorType::get(IRB.getInt1Ty(), MaskNumElements),
4631 "_ms_mask_bitcast");
4632
4633 /// For floating-point to integer conversion, the output is:
4634 /// - fully uninitialized if *any* bit of the input is uninitialized
4635 /// - fully initialized if all bits of the input are initialized
4636 /// We apply the same principle on a per-element basis for vectors.
4637 ///
4638 /// We use the scalar width of the return type instead of A's.
4639 AShadow = IRB.CreateSExt(
4640 IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow->getType())),
4641 getShadowTy(&I), "_ms_a_shadow");
4642
4643 Value *WriteThroughShadow = getShadow(WriteThrough);
4644 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow,
4645 "_ms_writethru_select");
4646
4647 setShadow(&I, Shadow);
4648 setOriginForNaryOp(I);
4649 }
4650
4651 // Instrument BMI / BMI2 intrinsics.
4652 // All of these intrinsics are Z = I(X, Y)
4653 // where the types of all operands and the result match, and are either i32 or
4654 // i64. The following instrumentation happens to work for all of them:
4655 // Sz = I(Sx, Y) | (sext (Sy != 0))
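// Illustrative example (values chosen for the sketch): for pext with
// Sx = 0b0110 and a fully initialized mask Y = 0b1010 (Sy = 0),
//   Sz = pext(0b0110, 0b1010) | 0 = 0b01,
// i.e., only the result bit extracted from the tainted source bit is
// poisoned; any uninitialized bit in Y would instead poison all of Z.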
4656 void handleBmiIntrinsic(IntrinsicInst &I) {
4657 IRBuilder<> IRB(&I);
4658 Type *ShadowTy = getShadowTy(&I);
4659
4660 // If any bit of the mask operand is poisoned, then the whole thing is.
4661 Value *SMask = getShadow(&I, 1);
4662 SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
4663 ShadowTy);
4664 // Apply the same intrinsic to the shadow of the first operand.
4665 Value *S = IRB.CreateCall(I.getCalledFunction(),
4666 {getShadow(&I, 0), I.getOperand(1)});
4667 S = IRB.CreateOr(SMask, S);
4668 setShadow(&I, S);
4669 setOriginForNaryOp(I);
4670 }
4671
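// Builds the shuffle mask that duplicates the used elements over the unused
// ones, e.g. (illustrative): getPclmulMask(4, /*OddElements=*/false) yields
// {0, 0, 2, 2} and getPclmulMask(4, /*OddElements=*/true) yields {1, 1, 3, 3}.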
4672 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4673 SmallVector<int, 8> Mask;
4674 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4675 Mask.append(2, X);
4676 }
4677 return Mask;
4678 }
4679
4680 // Instrument pclmul intrinsics.
4681 // These intrinsics operate either on odd or on even elements of the input
4682 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4683 // Replace the unused elements with copies of the used ones, ex:
4684 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4685 // or
4686 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4687 // and then apply the usual shadow combining logic.
4688 void handlePclmulIntrinsic(IntrinsicInst &I) {
4689 IRBuilder<> IRB(&I);
4690 unsigned Width =
4691 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4692 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4693 "pclmul 3rd operand must be a constant");
4694 unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
4695 Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
4696 getPclmulMask(Width, Imm & 0x01));
4697 Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
4698 getPclmulMask(Width, Imm & 0x10));
4699 ShadowAndOriginCombiner SOC(this, IRB);
4700 SOC.Add(Shuf0, getOrigin(&I, 0));
4701 SOC.Add(Shuf1, getOrigin(&I, 1));
4702 SOC.Done(&I);
4703 }
4704
4705 // Instrument _mm_*_sd|ss intrinsics
4706 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4707 IRBuilder<> IRB(&I);
4708 unsigned Width =
4709 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4710 Value *First = getShadow(&I, 0);
4711 Value *Second = getShadow(&I, 1);
4712 // First element of second operand, remaining elements of first operand
4713 SmallVector<int, 16> Mask;
4714 Mask.push_back(Width);
4715 for (unsigned i = 1; i < Width; i++)
4716 Mask.push_back(i);
4717 Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
4718
4719 setShadow(&I, Shadow);
4720 setOriginForNaryOp(I);
4721 }
4722
4723 void handleVtestIntrinsic(IntrinsicInst &I) {
4724 IRBuilder<> IRB(&I);
4725 Value *Shadow0 = getShadow(&I, 0);
4726 Value *Shadow1 = getShadow(&I, 1);
4727 Value *Or = IRB.CreateOr(Shadow0, Shadow1);
4728 Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
4729 Value *Scalar = convertShadowToScalar(NZ, IRB);
4730 Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
4731
4732 setShadow(&I, Shadow);
4733 setOriginForNaryOp(I);
4734 }
4735
4736 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4737 IRBuilder<> IRB(&I);
4738 unsigned Width =
4739 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4740 Value *First = getShadow(&I, 0);
4741 Value *Second = getShadow(&I, 1);
4742 Value *OrShadow = IRB.CreateOr(First, Second);
4743 // First element of both OR'd together, remaining elements of first operand
4744 SmallVector<int, 16> Mask;
4745 Mask.push_back(Width);
4746 for (unsigned i = 1; i < Width; i++)
4747 Mask.push_back(i);
4748 Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
4749
4750 setShadow(&I, Shadow);
4751 setOriginForNaryOp(I);
4752 }
4753
4754 // _mm_round_pd / _mm_round_ps.
4755 // Similar to maybeHandleSimpleNomemIntrinsic except
4756 // the second argument is guaranteed to be a constant integer.
4757 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4758 assert(I.getArgOperand(0)->getType() == I.getType());
4759 assert(I.arg_size() == 2);
4760 assert(isa<ConstantInt>(I.getArgOperand(1)));
4761
4762 IRBuilder<> IRB(&I);
4763 ShadowAndOriginCombiner SC(this, IRB);
4764 SC.Add(I.getArgOperand(0));
4765 SC.Done(&I);
4766 }
4767
4768 // Instrument @llvm.abs intrinsic.
4769 //
4770 // e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
4771 // <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
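//
// Shadow propagation below (a sketch): the result inherits Src's shadow,
// except that when <is_int_min_poison> is true, elements where Src equals the
// signed minimum produce poison, so their shadow is made fully uninitialized.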
4772 void handleAbsIntrinsic(IntrinsicInst &I) {
4773 assert(I.arg_size() == 2);
4774 Value *Src = I.getArgOperand(0);
4775 Value *IsIntMinPoison = I.getArgOperand(1);
4776
4777 assert(I.getType()->isIntOrIntVectorTy());
4778
4779 assert(Src->getType() == I.getType());
4780
4781 assert(IsIntMinPoison->getType()->isIntegerTy());
4782 assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
4783
4784 IRBuilder<> IRB(&I);
4785 Value *SrcShadow = getShadow(Src);
4786
4787 APInt MinVal =
4788 APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
4789 Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
4790 Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
4791
4792 Value *PoisonedShadow = getPoisonedShadow(Src);
4793 Value *PoisonedIfIntMinShadow =
4794 IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
4795 Value *Shadow =
4796 IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
4797
4798 setShadow(&I, Shadow);
4799 setOrigin(&I, getOrigin(&I, 0));
4800 }
4801
4802 void handleIsFpClass(IntrinsicInst &I) {
4803 IRBuilder<> IRB(&I);
4804 Value *Shadow = getShadow(&I, 0);
4805 setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
4806 setOrigin(&I, getOrigin(&I, 0));
4807 }
4808
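// Instrument llvm.*.with.overflow.*, which return {T, i1} (a sketch of the
// propagation below): the value element's shadow is the OR of both operand
// shadows, and the overflow flag's shadow is poisoned whenever that OR is
// non-zero, since any uninitialized input bit could flip the overflow flag.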
4809 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4810 IRBuilder<> IRB(&I);
4811 Value *Shadow0 = getShadow(&I, 0);
4812 Value *Shadow1 = getShadow(&I, 1);
4813 Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
4814 Value *ShadowElt1 =
4815 IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
4816
4817 Value *Shadow = PoisonValue::get(getShadowTy(&I));
4818 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
4819 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
4820
4821 setShadow(&I, Shadow);
4822 setOriginForNaryOp(I);
4823 }
4824
4825 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4826 assert(isa<FixedVectorType>(V->getType()));
4827 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4828 Value *Shadow = getShadow(V);
4829 return IRB.CreateExtractElement(Shadow,
4830 ConstantInt::get(IRB.getInt32Ty(), 0));
4831 }
4832
4833 // Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
4834 //
4835 // e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
4836 // (<8 x i64>, <16 x i8>, i8)
4837 // A WriteThru Mask
4838 //
4839 // call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
4840 // (<16 x i32>, <16 x i8>, i16)
4841 //
4842 // Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
4843 // Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
4844 //
4845 // If Dst has more elements than A, the excess elements are zeroed (and the
4846 // corresponding shadow is initialized).
4847 //
4848 // Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
4849 // and is much faster than this handler.
4850 void handleAVX512VectorDownConvert(IntrinsicInst &I) {
4851 IRBuilder<> IRB(&I);
4852
4853 assert(I.arg_size() == 3);
4854 Value *A = I.getOperand(0);
4855 Value *WriteThrough = I.getOperand(1);
4856 Value *Mask = I.getOperand(2);
4857
4858 assert(isFixedIntVector(A));
4859 assert(isFixedIntVector(WriteThrough));
4860
4861 unsigned ANumElements =
4862 cast<FixedVectorType>(A->getType())->getNumElements();
4863 unsigned OutputNumElements =
4864 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4865 assert(ANumElements == OutputNumElements ||
4866 ANumElements * 2 == OutputNumElements);
4867
4868 assert(Mask->getType()->isIntegerTy());
4869 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4870 insertCheckShadowOf(Mask, &I);
4871
4872 assert(I.getType() == WriteThrough->getType());
4873
4874 // Widen the mask, if necessary, to have one bit per element of the output
4875 // vector.
4876 // We want the extra bits to have '1's, so that the CreateSelect will
4877 // select the values from AShadow instead of WriteThroughShadow ("maskless"
4878 // versions of the intrinsics are sometimes implemented using an all-1's
4879 // mask and an undefined value for WriteThroughShadow). We accomplish this
4880 // by using bitwise NOT before and after the ZExt.
4881 if (ANumElements != OutputNumElements) {
4882 Mask = IRB.CreateNot(Mask);
4883 Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
4884 "_ms_widen_mask");
4885 Mask = IRB.CreateNot(Mask);
4886 }
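// Worked example of the widening above (illustrative only): an i8 mask
// 0b10110001 widened to i16 goes
//   NOT  -> 0b01001110
//   ZExt -> 0b00000000'01001110
//   NOT  -> 0b11111111'10110001
// so the extra high bits are all 1's and select the zero-extended
// (initialized) part of AShadow rather than WriteThroughShadow.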
4887 Mask = IRB.CreateBitCast(
4888 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4889
4890 Value *AShadow = getShadow(A);
4891
4892 // The return type might have more elements than the input.
4893 // Temporarily shrink the return type's number of elements.
4894 VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
4895
4896 // PMOV truncates; PMOVS/PMOVUS use signed/unsigned saturation.
4897 // This handler treats them all as truncation, which leads to some rare
4898 // false positives in the cases where the truncated bytes could
4899 // unambiguously saturate the value e.g., if A = ??????10 ????????
4900 // (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
4901 // fully defined, but the truncated byte is ????????.
4902 //
4903 // TODO: use GetMinMaxUnsigned() to handle saturation precisely.
4904 AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
4905 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4906
4907 Value *WriteThroughShadow = getShadow(WriteThrough);
4908
4909 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4910 setShadow(&I, Shadow);
4911 setOriginForNaryOp(I);
4912 }
4913
4914 // Handle llvm.x86.avx512.* instructions that take a vector of floating-point
4915 // values and perform an operation whose shadow propagation should be handled
4916 // as all-or-nothing [*], with masking provided by a vector and a mask
4917 // supplied as an integer.
4918 //
4919 // [*] if all bits of a vector element are initialized, the output is fully
4920 // initialized; otherwise, the output is fully uninitialized
4921 //
4922 // e.g., <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
4923 // (<16 x float>, <16 x float>, i16)
4924 // A WriteThru Mask
4925 //
4926 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
4927 // (<2 x double>, <2 x double>, i8)
4928 //
4929 // Dst[i] = Mask[i] ? some_op(A[i]) : WriteThru[i]
4930 // Dst_shadow[i] = Mask[i] ? all_or_nothing(A_shadow[i]) : WriteThru_shadow[i]
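// For instance (values illustrative): for the rcp14.pd.128 case above with
// mask = 0b01 and any uninitialized bit in A[0], Dst_shadow[0] becomes all
// ones while Dst_shadow[1] is taken from WriteThru's element 1.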
4931 void handleAVX512VectorGenericMaskedFP(IntrinsicInst &I) {
4932 IRBuilder<> IRB(&I);
4933
4934 assert(I.arg_size() == 3);
4935 Value *A = I.getOperand(0);
4936 Value *WriteThrough = I.getOperand(1);
4937 Value *Mask = I.getOperand(2);
4938
4939 assert(isFixedFPVector(A));
4940 assert(isFixedFPVector(WriteThrough));
4941
4942 [[maybe_unused]] unsigned ANumElements =
4943 cast<FixedVectorType>(A->getType())->getNumElements();
4944 unsigned OutputNumElements =
4945 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4946 assert(ANumElements == OutputNumElements);
4947
4948 assert(Mask->getType()->isIntegerTy());
4949 // Some bits of the mask might be unused, but check them all anyway
4950 // (typically the mask is an integer constant).
4951 insertCheckShadowOf(Mask, &I);
4952
4953 // The mask has 1 bit per element of A, but a minimum of 8 bits.
4954 if (Mask->getType()->getScalarSizeInBits() == 8 && ANumElements < 8)
4955 Mask = IRB.CreateTrunc(Mask, Type::getIntNTy(*MS.C, ANumElements));
4956 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4957
4958 assert(I.getType() == WriteThrough->getType());
4959
4960 Mask = IRB.CreateBitCast(
4961 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4962
4963 Value *AShadow = getShadow(A);
4964
4965 // All-or-nothing shadow
4966 AShadow = IRB.CreateSExt(IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow)),
4967 AShadow->getType());
4968
4969 Value *WriteThroughShadow = getShadow(WriteThrough);
4970
4971 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4972 setShadow(&I, Shadow);
4973
4974 setOriginForNaryOp(I);
4975 }
4976
4977 // For sh.* compiler intrinsics:
4978 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
4979 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
4980 // A B WriteThru Mask RoundingMode
4981 //
4982 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
4983 // DstShadow[1..7] = AShadow[1..7]
4984 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
4985 IRBuilder<> IRB(&I);
4986
4987 assert(I.arg_size() == 5);
4988 Value *A = I.getOperand(0);
4989 Value *B = I.getOperand(1);
4990 Value *WriteThrough = I.getOperand(2);
4991 Value *Mask = I.getOperand(3);
4992 Value *RoundingMode = I.getOperand(4);
4993
4994 // Technically, we could probably just check whether the LSB is
4995 // initialized, but intuitively it feels like a partly uninitialized mask
4996 // is unintended, and we should warn the user immediately.
4997 insertCheckShadowOf(Mask, &I);
4998 insertCheckShadowOf(RoundingMode, &I);
4999
5000 assert(isa<FixedVectorType>(A->getType()));
5001 unsigned NumElements =
5002 cast<FixedVectorType>(A->getType())->getNumElements();
5003 assert(NumElements == 8);
5004 assert(A->getType() == B->getType());
5005 assert(B->getType() == WriteThrough->getType());
5006 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
5007 assert(RoundingMode->getType()->isIntegerTy());
5008
5009 Value *ALowerShadow = extractLowerShadow(IRB, A);
5010 Value *BLowerShadow = extractLowerShadow(IRB, B);
5011
5012 Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
5013
5014 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
5015
5016 Mask = IRB.CreateBitCast(
5017 Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
5018 Value *MaskLower =
5019 IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
5020
5021 Value *AShadow = getShadow(A);
5022 Value *DstLowerShadow =
5023 IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
5024 Value *DstShadow = IRB.CreateInsertElement(
5025 AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
5026 "_msprop");
5027
5028 setShadow(&I, DstShadow);
5029 setOriginForNaryOp(I);
5030 }
5031
5032 // Approximately handle AVX Galois Field Affine Transformation
5033 //
5034 // e.g.,
5035 // <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
5036 // <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
5037 // <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
5038 // Out A x b
5039 // where A and x are packed matrices, b is a vector,
5040 // Out = A * x + b in GF(2)
5041 //
5042 // Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
5043 // computation also includes a parity calculation.
5044 //
5045 // For the bitwise AND of bits V1 and V2, the exact shadow is:
5046 // Out_Shadow = (V1_Shadow & V2_Shadow)
5047 // | (V1 & V2_Shadow)
5048 // | (V1_Shadow & V2 )
5049 //
5050 // We approximate the shadow of gf2p8affineqb using:
5051 // Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
5052 // | gf2p8affineqb(x, A_shadow, 0)
5053 // | gf2p8affineqb(x_Shadow, A, 0)
5054 // | set1_epi8(b_Shadow)
5055 //
5056 // This approximation has false negatives: if an intermediate dot-product
5057 // contains an even number of 1's, the parity is 0.
5058 // It has no false positives.
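//
// Single-bit illustration of the exact AND shadow above: an initialized
// V1 = 0 forces the product to 0 whatever V2 is, so Out_Shadow = 0; an
// initialized V1 = 1 with an uninitialized V2 yields Out_Shadow = 1 via the
// (V1 & V2_Shadow) term.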
5059 void handleAVXGF2P8Affine(IntrinsicInst &I) {
5060 IRBuilder<> IRB(&I);
5061
5062 assert(I.arg_size() == 3);
5063 Value *A = I.getOperand(0);
5064 Value *X = I.getOperand(1);
5065 Value *B = I.getOperand(2);
5066
5067 assert(isFixedIntVector(A));
5068 assert(cast<VectorType>(A->getType())
5069 ->getElementType()
5070 ->getScalarSizeInBits() == 8);
5071
5072 assert(A->getType() == X->getType());
5073
5074 assert(B->getType()->isIntegerTy());
5075 assert(B->getType()->getScalarSizeInBits() == 8);
5076
5077 assert(I.getType() == A->getType());
5078
5079 Value *AShadow = getShadow(A);
5080 Value *XShadow = getShadow(X);
5081 Value *BZeroShadow = getCleanShadow(B);
5082
5083 CallInst *AShadowXShadow = IRB.CreateIntrinsic(
5084 I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
5085 CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5086 {X, AShadow, BZeroShadow});
5087 CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5088 {XShadow, A, BZeroShadow});
5089
5090 unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
5091 Value *BShadow = getShadow(B);
5092 Value *BBroadcastShadow = getCleanShadow(AShadow);
5093 // There is no LLVM IR intrinsic for _mm512_set1_epi8.
5094 // This loop generates a lot of LLVM IR, which we expect that CodeGen will
5095 // lower appropriately (e.g., VPBROADCASTB).
5096 // Besides, b is often a constant, in which case it is fully initialized.
5097 for (unsigned i = 0; i < NumElements; i++)
5098 BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
5099
5100 setShadow(&I, IRB.CreateOr(
5101 {AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
5102 setOriginForNaryOp(I);
5103 }
5104
5105 // Handle Arm NEON vector load intrinsics (vld*).
5106 //
5107 // The WithLane instructions (ld[234]lane) are similar to:
5108 // call {<4 x i32>, <4 x i32>, <4 x i32>}
5109 // @llvm.aarch64.neon.ld3lane.v4i32.p0
5110 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
5111 // %A)
5112 //
5113 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
5114 // to:
5115 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
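//
// Sketch of the instrumentation below: for the ld2 example above, the shadow
// is produced by emitting the same ld2 intrinsic on the shadow address of %A;
// the with-lane variants additionally pass the shadows of the input vectors
// and the lane number itself.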
5116 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
5117 unsigned int numArgs = I.arg_size();
5118
5119 // Return type is a struct of vectors of integers or floating-point
5120 assert(I.getType()->isStructTy());
5121 [[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
5122 assert(RetTy->getNumElements() > 0);
5123 assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
5124 RetTy->getElementType(0)->isFPOrFPVectorTy());
5125 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5126 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
5127
5128 if (WithLane) {
5129 // 2, 3 or 4 vectors, plus lane number, plus input pointer
5130 assert(4 <= numArgs && numArgs <= 6);
5131
5132 // Return type is a struct of the input vectors
5133 assert(RetTy->getNumElements() + 2 == numArgs);
5134 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5135 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
5136 } else {
5137 assert(numArgs == 1);
5138 }
5139
5140 IRBuilder<> IRB(&I);
5141
5142 SmallVector<Value *, 6> ShadowArgs;
5143 if (WithLane) {
5144 for (unsigned int i = 0; i < numArgs - 2; i++)
5145 ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
5146
5147 // Lane number, passed verbatim
5148 Value *LaneNumber = I.getArgOperand(numArgs - 2);
5149 ShadowArgs.push_back(LaneNumber);
5150
5151 // TODO: blend shadow of lane number into output shadow?
5152 insertCheckShadowOf(LaneNumber, &I);
5153 }
5154
5155 Value *Src = I.getArgOperand(numArgs - 1);
5156 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
5157
5158 Type *SrcShadowTy = getShadowTy(Src);
5159 auto [SrcShadowPtr, SrcOriginPtr] =
5160 getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
5161 ShadowArgs.push_back(SrcShadowPtr);
5162
5163 // The NEON vector load instructions handled by this function all have
5164 // integer variants. It is easier to use those rather than trying to cast
5165 // a struct of vectors of floats into a struct of vectors of integers.
5166 CallInst *CI =
5167 IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
5168 setShadow(&I, CI);
5169
5170 if (!MS.TrackOrigins)
5171 return;
5172
5173 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
5174 setOrigin(&I, PtrSrcOrigin);
5175 }
5176
5177 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
5178 /// and vst{2,3,4}lane).
5179 ///
5180 /// Arm NEON vector store intrinsics have the output address (pointer) as the
5181 /// last argument, with the initial arguments being the inputs (and lane
5182 /// number for vst{2,3,4}lane). They return void.
5183 ///
5184 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
5185 /// abcdabcdabcdabcd... into *outP
5186 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
5187 /// writes aaaa...bbbb...cccc...dddd... into *outP
5188 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
5189 /// These instructions can all be instrumented with essentially the same
5190 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
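///
/// For instance (a sketch), st2 (inA, inB, outP) is instrumented by emitting
/// st2 (shadow(inA), shadow(inB), shadow_ptr(outP)), so the input shadows are
/// stored, interleaved, into the shadow of *outP.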
5191 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
5192 IRBuilder<> IRB(&I);
5193
5194 // Don't use getNumOperands() because it includes the callee
5195 int numArgOperands = I.arg_size();
5196
5197 // The last arg operand is the output (pointer)
5198 assert(numArgOperands >= 1);
5199 Value *Addr = I.getArgOperand(numArgOperands - 1);
5200 assert(Addr->getType()->isPointerTy());
5201 int skipTrailingOperands = 1;
5202
5203 if (ClCheckAccessAddress)
5204 insertCheckShadowOf(Addr, &I);
5205
5206 // Second-last operand is the lane number (for vst{2,3,4}lane)
5207 if (useLane) {
5208 skipTrailingOperands++;
5209 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
5210 assert(isa<IntegerType>(
5211 I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
5212 }
5213
5214 SmallVector<Value *, 8> ShadowArgs;
5215 // All the initial operands are the inputs
5216 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
5217 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
5218 Value *Shadow = getShadow(&I, i);
5219 ShadowArgs.append(1, Shadow);
5220 }
5221
5222 // MSan's GetShadowTy assumes the LHS is the type we want the shadow for
5223 // e.g., for:
5224 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
5225 // we know the type of the output (and its shadow) is <16 x i8>.
5226 //
5227 // Arm NEON VST is unusual because the last argument is the output address:
5228 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
5229 // call void @llvm.aarch64.neon.st2.v16i8.p0
5230 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
5231 // and we have no type information about P's operand. We must manually
5232 // compute the type (<16 x i8> x 2).
5233 FixedVectorType *OutputVectorTy = FixedVectorType::get(
5234 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
5235 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
5236 (numArgOperands - skipTrailingOperands));
5237 Type *OutputShadowTy = getShadowTy(OutputVectorTy);
5238
5239 if (useLane)
5240 ShadowArgs.append(1,
5241 I.getArgOperand(numArgOperands - skipTrailingOperands));
5242
5243 Value *OutputShadowPtr, *OutputOriginPtr;
5244 // AArch64 NEON does not need alignment (unless OS requires it)
5245 std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
5246 Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
5247 ShadowArgs.append(1, OutputShadowPtr);
5248
5249 CallInst *CI =
5250 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
5251 setShadow(&I, CI);
5252
5253 if (MS.TrackOrigins) {
5254 // TODO: if we modelled the vst* instruction more precisely, we could
5255 // more accurately track the origins (e.g., if both inputs are
5256 // uninitialized for vst2, we currently blame the second input, even
5257 // though part of the output depends only on the first input).
5258 //
5259 // This is particularly imprecise for vst{2,3,4}lane, since only one
5260 // lane of each input is actually copied to the output.
5261 OriginCombiner OC(this, IRB);
5262 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
5263 OC.Add(I.getArgOperand(i));
5264
5265 const DataLayout &DL = F.getDataLayout();
5266 OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
5267 OutputOriginPtr);
5268 }
5269 }
5270
5271 /// Handle intrinsics by applying the intrinsic to the shadows.
5272 ///
5273 /// The trailing arguments are passed verbatim to the intrinsic, though any
5274 /// uninitialized trailing arguments can also taint the shadow e.g., for an
5275 /// intrinsic with one trailing verbatim argument:
5276 /// out = intrinsic(var1, var2, opType)
5277 /// we compute:
5278 /// shadow[out] =
5279 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
5280 ///
5281 /// Typically, shadowIntrinsicID will be specified by the caller to be
5282 /// I.getIntrinsicID(), but the caller can choose to replace it with another
5283 /// intrinsic of the same type.
5284 ///
5285 /// CAUTION: this assumes that the intrinsic will handle arbitrary
5286 /// bit-patterns (for example, if the intrinsic accepts floats for
5287 /// var1, we require that it doesn't care if inputs are NaNs).
5288 ///
5289 /// For example, this can be applied to the Arm NEON vector table intrinsics
5290 /// (tbl{1,2,3,4}).
5291 ///
5292 /// The origin is approximated using setOriginForNaryOp.
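///
/// For instance (a sketch), for llvm.x86.ssse3.pshuf.b.128(a, b) handled with
/// one trailing verbatim argument, the emitted shadow is
///   pshuf.b.128(shadow[a], b) | cast(shadow[b])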
5293 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
5294 Intrinsic::ID shadowIntrinsicID,
5295 unsigned int trailingVerbatimArgs) {
5296 IRBuilder<> IRB(&I);
5297
5298 assert(trailingVerbatimArgs < I.arg_size());
5299
5300 SmallVector<Value *, 8> ShadowArgs;
5301 // Don't use getNumOperands() because it includes the callee
5302 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
5303 Value *Shadow = getShadow(&I, i);
5304
5305 // Shadows are integer-ish types but some intrinsics require a
5306 // different (e.g., floating-point) type.
5307 ShadowArgs.push_back(
5308 IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
5309 }
5310
5311 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5312 i++) {
5313 Value *Arg = I.getArgOperand(i);
5314 ShadowArgs.push_back(Arg);
5315 }
5316
5317 CallInst *CI =
5318 IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
5319 Value *CombinedShadow = CI;
5320
5321 // Combine the computed shadow with the shadow of trailing args
5322 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5323 i++) {
5324 Value *Shadow =
5325 CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
5326 CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
5327 }
5328
5329 setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
5330
5331 setOriginForNaryOp(I);
5332 }
5333
5334 // Approximation only
5335 //
5336 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
5337 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
5338 assert(I.arg_size() == 2);
5339
5340 handleShadowOr(I);
5341 }
5342
5343 bool maybeHandleCrossPlatformIntrinsic(IntrinsicInst &I) {
5344 switch (I.getIntrinsicID()) {
5345 case Intrinsic::uadd_with_overflow:
5346 case Intrinsic::sadd_with_overflow:
5347 case Intrinsic::usub_with_overflow:
5348 case Intrinsic::ssub_with_overflow:
5349 case Intrinsic::umul_with_overflow:
5350 case Intrinsic::smul_with_overflow:
5351 handleArithmeticWithOverflow(I);
5352 break;
5353 case Intrinsic::abs:
5354 handleAbsIntrinsic(I);
5355 break;
5356 case Intrinsic::bitreverse:
5357 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
5358 /*trailingVerbatimArgs*/ 0);
5359 break;
5360 case Intrinsic::is_fpclass:
5361 handleIsFpClass(I);
5362 break;
5363 case Intrinsic::lifetime_start:
5364 handleLifetimeStart(I);
5365 break;
5366 case Intrinsic::launder_invariant_group:
5367 case Intrinsic::strip_invariant_group:
5368 handleInvariantGroup(I);
5369 break;
5370 case Intrinsic::bswap:
5371 handleBswap(I);
5372 break;
5373 case Intrinsic::ctlz:
5374 case Intrinsic::cttz:
5375 handleCountLeadingTrailingZeros(I);
5376 break;
5377 case Intrinsic::masked_compressstore:
5378 handleMaskedCompressStore(I);
5379 break;
5380 case Intrinsic::masked_expandload:
5381 handleMaskedExpandLoad(I);
5382 break;
5383 case Intrinsic::masked_gather:
5384 handleMaskedGather(I);
5385 break;
5386 case Intrinsic::masked_scatter:
5387 handleMaskedScatter(I);
5388 break;
5389 case Intrinsic::masked_store:
5390 handleMaskedStore(I);
5391 break;
5392 case Intrinsic::masked_load:
5393 handleMaskedLoad(I);
5394 break;
5395 case Intrinsic::vector_reduce_and:
5396 handleVectorReduceAndIntrinsic(I);
5397 break;
5398 case Intrinsic::vector_reduce_or:
5399 handleVectorReduceOrIntrinsic(I);
5400 break;
5401
5402 case Intrinsic::vector_reduce_add:
5403 case Intrinsic::vector_reduce_xor:
5404 case Intrinsic::vector_reduce_mul:
5405 // Signed/Unsigned Min/Max
5406 // TODO: handling similarly to AND/OR may be more precise.
5407 case Intrinsic::vector_reduce_smax:
5408 case Intrinsic::vector_reduce_smin:
5409 case Intrinsic::vector_reduce_umax:
5410 case Intrinsic::vector_reduce_umin:
5411 // TODO: this has no false positives, but arguably we should check that all
5412 // the bits are initialized.
5413 case Intrinsic::vector_reduce_fmax:
5414 case Intrinsic::vector_reduce_fmin:
5415 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
5416 break;
5417
5418 case Intrinsic::vector_reduce_fadd:
5419 case Intrinsic::vector_reduce_fmul:
5420 handleVectorReduceWithStarterIntrinsic(I);
5421 break;
5422
5423 case Intrinsic::scmp:
5424 case Intrinsic::ucmp: {
5425 handleShadowOr(I);
5426 break;
5427 }
5428
5429 case Intrinsic::fshl:
5430 case Intrinsic::fshr:
5431 handleFunnelShift(I);
5432 break;
5433
5434 case Intrinsic::is_constant:
5435 // The result of llvm.is.constant() is always defined.
5436 setShadow(&I, getCleanShadow(&I));
5437 setOrigin(&I, getCleanOrigin());
5438 break;
5439
5440 default:
5441 return false;
5442 }
5443
5444 return true;
5445 }
5446
5447 bool maybeHandleX86SIMDIntrinsic(IntrinsicInst &I) {
5448 switch (I.getIntrinsicID()) {
5449 case Intrinsic::x86_sse_stmxcsr:
5450 handleStmxcsr(I);
5451 break;
5452 case Intrinsic::x86_sse_ldmxcsr:
5453 handleLdmxcsr(I);
5454 break;
5455
5456 // Convert Scalar Double Precision Floating-Point Value
5457 // to Unsigned Doubleword Integer
5458 // etc.
5459 case Intrinsic::x86_avx512_vcvtsd2usi64:
5460 case Intrinsic::x86_avx512_vcvtsd2usi32:
5461 case Intrinsic::x86_avx512_vcvtss2usi64:
5462 case Intrinsic::x86_avx512_vcvtss2usi32:
5463 case Intrinsic::x86_avx512_cvttss2usi64:
5464 case Intrinsic::x86_avx512_cvttss2usi:
5465 case Intrinsic::x86_avx512_cvttsd2usi64:
5466 case Intrinsic::x86_avx512_cvttsd2usi:
5467 case Intrinsic::x86_avx512_cvtusi2ss:
5468 case Intrinsic::x86_avx512_cvtusi642sd:
5469 case Intrinsic::x86_avx512_cvtusi642ss:
5470 handleSSEVectorConvertIntrinsic(I, 1, true);
5471 break;
5472 case Intrinsic::x86_sse2_cvtsd2si64:
5473 case Intrinsic::x86_sse2_cvtsd2si:
5474 case Intrinsic::x86_sse2_cvtsd2ss:
5475 case Intrinsic::x86_sse2_cvttsd2si64:
5476 case Intrinsic::x86_sse2_cvttsd2si:
5477 case Intrinsic::x86_sse_cvtss2si64:
5478 case Intrinsic::x86_sse_cvtss2si:
5479 case Intrinsic::x86_sse_cvttss2si64:
5480 case Intrinsic::x86_sse_cvttss2si:
5481 handleSSEVectorConvertIntrinsic(I, 1);
5482 break;
5483 case Intrinsic::x86_sse_cvtps2pi:
5484 case Intrinsic::x86_sse_cvttps2pi:
5485 handleSSEVectorConvertIntrinsic(I, 2);
5486 break;
5487
5488 // TODO:
5489 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
5490 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
5491 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
5492
5493 case Intrinsic::x86_vcvtps2ph_128:
5494 case Intrinsic::x86_vcvtps2ph_256: {
5495 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
5496 break;
5497 }
5498
5499 // Convert Packed Single Precision Floating-Point Values
5500 // to Packed Signed Doubleword Integer Values
5501 //
5502 // <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
5503 // (<16 x float>, <16 x i32>, i16, i32)
5504 case Intrinsic::x86_avx512_mask_cvtps2dq_512:
5505 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/false);
5506 break;
5507
5508 // Convert Packed Double Precision Floating-Point Values
5509 // to Packed Single Precision Floating-Point Values
5510 case Intrinsic::x86_sse2_cvtpd2ps:
5511 case Intrinsic::x86_sse2_cvtps2dq:
5512 case Intrinsic::x86_sse2_cvtpd2dq:
5513 case Intrinsic::x86_sse2_cvttps2dq:
5514 case Intrinsic::x86_sse2_cvttpd2dq:
5515 case Intrinsic::x86_avx_cvt_pd2_ps_256:
5516 case Intrinsic::x86_avx_cvt_ps2dq_256:
5517 case Intrinsic::x86_avx_cvt_pd2dq_256:
5518 case Intrinsic::x86_avx_cvtt_ps2dq_256:
5519 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
5520 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
5521 break;
5522 }
5523
5524 // Convert Single-Precision FP Value to 16-bit FP Value
5525 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
5526 // (<16 x float>, i32, <16 x i16>, i16)
5527 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
5528 // (<4 x float>, i32, <8 x i16>, i8)
5529 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256
5530 // (<8 x float>, i32, <8 x i16>, i8)
5531 case Intrinsic::x86_avx512_mask_vcvtps2ph_512:
5532 case Intrinsic::x86_avx512_mask_vcvtps2ph_256:
5533 case Intrinsic::x86_avx512_mask_vcvtps2ph_128:
5534 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/true);
5535 break;
5536
5537 // Shift Packed Data (Left Logical, Right Arithmetic, Right Logical)
5538 case Intrinsic::x86_avx512_psll_w_512:
5539 case Intrinsic::x86_avx512_psll_d_512:
5540 case Intrinsic::x86_avx512_psll_q_512:
5541 case Intrinsic::x86_avx512_pslli_w_512:
5542 case Intrinsic::x86_avx512_pslli_d_512:
5543 case Intrinsic::x86_avx512_pslli_q_512:
5544 case Intrinsic::x86_avx512_psrl_w_512:
5545 case Intrinsic::x86_avx512_psrl_d_512:
5546 case Intrinsic::x86_avx512_psrl_q_512:
5547 case Intrinsic::x86_avx512_psra_w_512:
5548 case Intrinsic::x86_avx512_psra_d_512:
5549 case Intrinsic::x86_avx512_psra_q_512:
5550 case Intrinsic::x86_avx512_psrli_w_512:
5551 case Intrinsic::x86_avx512_psrli_d_512:
5552 case Intrinsic::x86_avx512_psrli_q_512:
5553 case Intrinsic::x86_avx512_psrai_w_512:
5554 case Intrinsic::x86_avx512_psrai_d_512:
5555 case Intrinsic::x86_avx512_psrai_q_512:
5556 case Intrinsic::x86_avx512_psra_q_256:
5557 case Intrinsic::x86_avx512_psra_q_128:
5558 case Intrinsic::x86_avx512_psrai_q_256:
5559 case Intrinsic::x86_avx512_psrai_q_128:
5560 case Intrinsic::x86_avx2_psll_w:
5561 case Intrinsic::x86_avx2_psll_d:
5562 case Intrinsic::x86_avx2_psll_q:
5563 case Intrinsic::x86_avx2_pslli_w:
5564 case Intrinsic::x86_avx2_pslli_d:
5565 case Intrinsic::x86_avx2_pslli_q:
5566 case Intrinsic::x86_avx2_psrl_w:
5567 case Intrinsic::x86_avx2_psrl_d:
5568 case Intrinsic::x86_avx2_psrl_q:
5569 case Intrinsic::x86_avx2_psra_w:
5570 case Intrinsic::x86_avx2_psra_d:
5571 case Intrinsic::x86_avx2_psrli_w:
5572 case Intrinsic::x86_avx2_psrli_d:
5573 case Intrinsic::x86_avx2_psrli_q:
5574 case Intrinsic::x86_avx2_psrai_w:
5575 case Intrinsic::x86_avx2_psrai_d:
5576 case Intrinsic::x86_sse2_psll_w:
5577 case Intrinsic::x86_sse2_psll_d:
5578 case Intrinsic::x86_sse2_psll_q:
5579 case Intrinsic::x86_sse2_pslli_w:
5580 case Intrinsic::x86_sse2_pslli_d:
5581 case Intrinsic::x86_sse2_pslli_q:
5582 case Intrinsic::x86_sse2_psrl_w:
5583 case Intrinsic::x86_sse2_psrl_d:
5584 case Intrinsic::x86_sse2_psrl_q:
5585 case Intrinsic::x86_sse2_psra_w:
5586 case Intrinsic::x86_sse2_psra_d:
5587 case Intrinsic::x86_sse2_psrli_w:
5588 case Intrinsic::x86_sse2_psrli_d:
5589 case Intrinsic::x86_sse2_psrli_q:
5590 case Intrinsic::x86_sse2_psrai_w:
5591 case Intrinsic::x86_sse2_psrai_d:
5592 case Intrinsic::x86_mmx_psll_w:
5593 case Intrinsic::x86_mmx_psll_d:
5594 case Intrinsic::x86_mmx_psll_q:
5595 case Intrinsic::x86_mmx_pslli_w:
5596 case Intrinsic::x86_mmx_pslli_d:
5597 case Intrinsic::x86_mmx_pslli_q:
5598 case Intrinsic::x86_mmx_psrl_w:
5599 case Intrinsic::x86_mmx_psrl_d:
5600 case Intrinsic::x86_mmx_psrl_q:
5601 case Intrinsic::x86_mmx_psra_w:
5602 case Intrinsic::x86_mmx_psra_d:
5603 case Intrinsic::x86_mmx_psrli_w:
5604 case Intrinsic::x86_mmx_psrli_d:
5605 case Intrinsic::x86_mmx_psrli_q:
5606 case Intrinsic::x86_mmx_psrai_w:
5607 case Intrinsic::x86_mmx_psrai_d:
5608 handleVectorShiftIntrinsic(I, /* Variable */ false);
5609 break;
5610 case Intrinsic::x86_avx2_psllv_d:
5611 case Intrinsic::x86_avx2_psllv_d_256:
5612 case Intrinsic::x86_avx512_psllv_d_512:
5613 case Intrinsic::x86_avx2_psllv_q:
5614 case Intrinsic::x86_avx2_psllv_q_256:
5615 case Intrinsic::x86_avx512_psllv_q_512:
5616 case Intrinsic::x86_avx2_psrlv_d:
5617 case Intrinsic::x86_avx2_psrlv_d_256:
5618 case Intrinsic::x86_avx512_psrlv_d_512:
5619 case Intrinsic::x86_avx2_psrlv_q:
5620 case Intrinsic::x86_avx2_psrlv_q_256:
5621 case Intrinsic::x86_avx512_psrlv_q_512:
5622 case Intrinsic::x86_avx2_psrav_d:
5623 case Intrinsic::x86_avx2_psrav_d_256:
5624 case Intrinsic::x86_avx512_psrav_d_512:
5625 case Intrinsic::x86_avx512_psrav_q_128:
5626 case Intrinsic::x86_avx512_psrav_q_256:
5627 case Intrinsic::x86_avx512_psrav_q_512:
5628 handleVectorShiftIntrinsic(I, /* Variable */ true);
5629 break;
5630
5631 // Pack with Signed/Unsigned Saturation
5632 case Intrinsic::x86_sse2_packsswb_128:
5633 case Intrinsic::x86_sse2_packssdw_128:
5634 case Intrinsic::x86_sse2_packuswb_128:
5635 case Intrinsic::x86_sse41_packusdw:
5636 case Intrinsic::x86_avx2_packsswb:
5637 case Intrinsic::x86_avx2_packssdw:
5638 case Intrinsic::x86_avx2_packuswb:
5639 case Intrinsic::x86_avx2_packusdw:
5640 // e.g., <64 x i8> @llvm.x86.avx512.packsswb.512
5641 // (<32 x i16> %a, <32 x i16> %b)
5642 // <32 x i16> @llvm.x86.avx512.packssdw.512
5643 // (<16 x i32> %a, <16 x i32> %b)
5644 // Note: AVX512 masked variants are auto-upgraded by LLVM.
5645 case Intrinsic::x86_avx512_packsswb_512:
5646 case Intrinsic::x86_avx512_packssdw_512:
5647 case Intrinsic::x86_avx512_packuswb_512:
5648 case Intrinsic::x86_avx512_packusdw_512:
5649 handleVectorPackIntrinsic(I);
5650 break;
5651
5652 case Intrinsic::x86_sse41_pblendvb:
5653 case Intrinsic::x86_sse41_blendvpd:
5654 case Intrinsic::x86_sse41_blendvps:
5655 case Intrinsic::x86_avx_blendv_pd_256:
5656 case Intrinsic::x86_avx_blendv_ps_256:
5657 case Intrinsic::x86_avx2_pblendvb:
5658 handleBlendvIntrinsic(I);
5659 break;
5660
5661 case Intrinsic::x86_avx_dp_ps_256:
5662 case Intrinsic::x86_sse41_dppd:
5663 case Intrinsic::x86_sse41_dpps:
5664 handleDppIntrinsic(I);
5665 break;
5666
5667 case Intrinsic::x86_mmx_packsswb:
5668 case Intrinsic::x86_mmx_packuswb:
5669 handleVectorPackIntrinsic(I, 16);
5670 break;
5671
5672 case Intrinsic::x86_mmx_packssdw:
5673 handleVectorPackIntrinsic(I, 32);
5674 break;
5675
5676 case Intrinsic::x86_mmx_psad_bw:
5677 handleVectorSadIntrinsic(I, true);
5678 break;
5679 case Intrinsic::x86_sse2_psad_bw:
5680 case Intrinsic::x86_avx2_psad_bw:
5681 handleVectorSadIntrinsic(I);
5682 break;
5683
5684 // Multiply and Add Packed Words
5685 // < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
5686 // < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
5687 // <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>)
5688 //
5689 // Multiply and Add Packed Signed and Unsigned Bytes
5690 // < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
5691 // <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
5692 // <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
5693 //
5694 // These intrinsics are auto-upgraded into non-masked forms:
5695 // < 4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128
5696 // (<8 x i16>, <8 x i16>, <4 x i32>, i8)
5697 // < 8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256
5698 // (<16 x i16>, <16 x i16>, <8 x i32>, i8)
5699 // <16 x i32> @llvm.x86.avx512.mask.pmaddw.d.512
5700 // (<32 x i16>, <32 x i16>, <16 x i32>, i16)
5701 // < 8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128
5702 // (<16 x i8>, <16 x i8>, <8 x i16>, i8)
5703 // <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256
5704 // (<32 x i8>, <32 x i8>, <16 x i16>, i16)
5705 // <32 x i16> @llvm.x86.avx512.mask.pmaddubs.w.512
5706 // (<64 x i8>, <64 x i8>, <32 x i16>, i32)
5707 case Intrinsic::x86_sse2_pmadd_wd:
5708 case Intrinsic::x86_avx2_pmadd_wd:
5709 case Intrinsic::x86_avx512_pmaddw_d_512:
5710 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5711 case Intrinsic::x86_avx2_pmadd_ub_sw:
5712 case Intrinsic::x86_avx512_pmaddubs_w_512:
5713 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2);
5714 break;
5715
5716 // <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
5717 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5718 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/8);
5719 break;
5720
5721 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
5722 case Intrinsic::x86_mmx_pmadd_wd:
5723 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5724 break;
5725
5726 // AVX Vector Neural Network Instructions: bytes
5727 //
5728 // Multiply and Add Packed Signed and Unsigned Bytes
5729 // < 4 x i32> @llvm.x86.avx512.vpdpbusd.128
5730 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5731 // < 8 x i32> @llvm.x86.avx512.vpdpbusd.256
5732 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5733 // <16 x i32> @llvm.x86.avx512.vpdpbusd.512
5734 // (<16 x i32>, <64 x i8>, <64 x i8>)
5735 //
5736 // Multiply and Add Unsigned and Signed Bytes With Saturation
5737 // < 4 x i32> @llvm.x86.avx512.vpdpbusds.128
5738 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5739 // < 8 x i32> @llvm.x86.avx512.vpdpbusds.256
5740 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5741 // <16 x i32> @llvm.x86.avx512.vpdpbusds.512
5742 // (<16 x i32>, <64 x i8>, <64 x i8>)
5743 //
5744 // < 4 x i32> @llvm.x86.avx2.vpdpbssd.128
5745 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5746 // < 8 x i32> @llvm.x86.avx2.vpdpbssd.256
5747 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5748 //
5749 // < 4 x i32> @llvm.x86.avx2.vpdpbssds.128
5750 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5751 // < 8 x i32> @llvm.x86.avx2.vpdpbssds.256
5752 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5753 //
5754 // <16 x i32> @llvm.x86.avx10.vpdpbssd.512
5755 // (<16 x i32>, <16 x i32>, <16 x i32>)
5756 // <16 x i32> @llvm.x86.avx10.vpdpbssds.512
5757 // (<16 x i32>, <16 x i32>, <16 x i32>)
5758 //
5759 // These intrinsics are auto-upgraded into non-masked forms:
5760 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusd.128
5761 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5762 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusd.128
5763 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5764 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusd.256
5765 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5766 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusd.256
5767 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5768 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusd.512
5769 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5770 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusd.512
5771 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5772 //
5773 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusds.128
5774 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5775 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusds.128
5776 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5777 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusds.256
5778 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5779 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusds.256
5780 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5781 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusds.512
5782 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5783 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusds.512
5784 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5785 case Intrinsic::x86_avx512_vpdpbusd_128:
5786 case Intrinsic::x86_avx512_vpdpbusd_256:
5787 case Intrinsic::x86_avx512_vpdpbusd_512:
5788 case Intrinsic::x86_avx512_vpdpbusds_128:
5789 case Intrinsic::x86_avx512_vpdpbusds_256:
5790 case Intrinsic::x86_avx512_vpdpbusds_512:
5791 case Intrinsic::x86_avx2_vpdpbssd_128:
5792 case Intrinsic::x86_avx2_vpdpbssd_256:
5793 case Intrinsic::x86_avx2_vpdpbssds_128:
5794 case Intrinsic::x86_avx2_vpdpbssds_256:
5795 case Intrinsic::x86_avx10_vpdpbssd_512:
5796 case Intrinsic::x86_avx10_vpdpbssds_512:
5797 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/4, /*EltSize=*/8);
5798 break;
5799
5800 // AVX Vector Neural Network Instructions: words
5801 //
5802 // Multiply and Add Signed Word Integers
5803 // < 4 x i32> @llvm.x86.avx512.vpdpwssd.128
5804 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5805 // < 8 x i32> @llvm.x86.avx512.vpdpwssd.256
5806 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5807 // <16 x i32> @llvm.x86.avx512.vpdpwssd.512
5808 // (<16 x i32>, <16 x i32>, <16 x i32>)
5809 //
5810 // Multiply and Add Signed Word Integers With Saturation
5811 // < 4 x i32> @llvm.x86.avx512.vpdpwssds.128
5812 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5813 // < 8 x i32> @llvm.x86.avx512.vpdpwssds.256
5814 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5815 // <16 x i32> @llvm.x86.avx512.vpdpwssds.512
5816 // (<16 x i32>, <16 x i32>, <16 x i32>)
5817 //
5818 // These intrinsics are auto-upgraded into non-masked forms:
5819 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssd.128
5820 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5821 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssd.128
5822 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5823 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssd.256
5824 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5825 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssd.256
5826 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5827 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssd.512
5828 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5829 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssd.512
5830 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5831 //
5832 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssds.128
5833 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5834 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssds.128
5835 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5836 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssds.256
5837 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5838 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssds.256
5839 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5840 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssds.512
5841 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5842 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssds.512
5843 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5844 case Intrinsic::x86_avx512_vpdpwssd_128:
5845 case Intrinsic::x86_avx512_vpdpwssd_256:
5846 case Intrinsic::x86_avx512_vpdpwssd_512:
5847 case Intrinsic::x86_avx512_vpdpwssds_128:
5848 case Intrinsic::x86_avx512_vpdpwssds_256:
5849 case Intrinsic::x86_avx512_vpdpwssds_512:
5850 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2, /*EltSize=*/16);
5851 break;
5852
5853 // TODO: Dot Product of BF16 Pairs Accumulated Into Packed Single
5854 // Precision
5855 // <4 x float> @llvm.x86.avx512bf16.dpbf16ps.128
5856 // (<4 x float>, <8 x bfloat>, <8 x bfloat>)
5857 // <8 x float> @llvm.x86.avx512bf16.dpbf16ps.256
5858 // (<8 x float>, <16 x bfloat>, <16 x bfloat>)
5859 // <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512
5860 // (<16 x float>, <32 x bfloat>, <32 x bfloat>)
5861 // handleVectorPmaddIntrinsic() currently only handles integer types.
5862
5863 case Intrinsic::x86_sse_cmp_ss:
5864 case Intrinsic::x86_sse2_cmp_sd:
5865 case Intrinsic::x86_sse_comieq_ss:
5866 case Intrinsic::x86_sse_comilt_ss:
5867 case Intrinsic::x86_sse_comile_ss:
5868 case Intrinsic::x86_sse_comigt_ss:
5869 case Intrinsic::x86_sse_comige_ss:
5870 case Intrinsic::x86_sse_comineq_ss:
5871 case Intrinsic::x86_sse_ucomieq_ss:
5872 case Intrinsic::x86_sse_ucomilt_ss:
5873 case Intrinsic::x86_sse_ucomile_ss:
5874 case Intrinsic::x86_sse_ucomigt_ss:
5875 case Intrinsic::x86_sse_ucomige_ss:
5876 case Intrinsic::x86_sse_ucomineq_ss:
5877 case Intrinsic::x86_sse2_comieq_sd:
5878 case Intrinsic::x86_sse2_comilt_sd:
5879 case Intrinsic::x86_sse2_comile_sd:
5880 case Intrinsic::x86_sse2_comigt_sd:
5881 case Intrinsic::x86_sse2_comige_sd:
5882 case Intrinsic::x86_sse2_comineq_sd:
5883 case Intrinsic::x86_sse2_ucomieq_sd:
5884 case Intrinsic::x86_sse2_ucomilt_sd:
5885 case Intrinsic::x86_sse2_ucomile_sd:
5886 case Intrinsic::x86_sse2_ucomigt_sd:
5887 case Intrinsic::x86_sse2_ucomige_sd:
5888 case Intrinsic::x86_sse2_ucomineq_sd:
5889 handleVectorCompareScalarIntrinsic(I);
5890 break;
5891
5892 case Intrinsic::x86_avx_cmp_pd_256:
5893 case Intrinsic::x86_avx_cmp_ps_256:
5894 case Intrinsic::x86_sse2_cmp_pd:
5895 case Intrinsic::x86_sse_cmp_ps:
5896 handleVectorComparePackedIntrinsic(I);
5897 break;
5898
5899 case Intrinsic::x86_bmi_bextr_32:
5900 case Intrinsic::x86_bmi_bextr_64:
5901 case Intrinsic::x86_bmi_bzhi_32:
5902 case Intrinsic::x86_bmi_bzhi_64:
5903 case Intrinsic::x86_bmi_pdep_32:
5904 case Intrinsic::x86_bmi_pdep_64:
5905 case Intrinsic::x86_bmi_pext_32:
5906 case Intrinsic::x86_bmi_pext_64:
5907 handleBmiIntrinsic(I);
5908 break;
5909
5910 case Intrinsic::x86_pclmulqdq:
5911 case Intrinsic::x86_pclmulqdq_256:
5912 case Intrinsic::x86_pclmulqdq_512:
5913 handlePclmulIntrinsic(I);
5914 break;
5915
5916 case Intrinsic::x86_avx_round_pd_256:
5917 case Intrinsic::x86_avx_round_ps_256:
5918 case Intrinsic::x86_sse41_round_pd:
5919 case Intrinsic::x86_sse41_round_ps:
5920 handleRoundPdPsIntrinsic(I);
5921 break;
5922
5923 case Intrinsic::x86_sse41_round_sd:
5924 case Intrinsic::x86_sse41_round_ss:
5925 handleUnarySdSsIntrinsic(I);
5926 break;
5927
5928 case Intrinsic::x86_sse2_max_sd:
5929 case Intrinsic::x86_sse_max_ss:
5930 case Intrinsic::x86_sse2_min_sd:
5931 case Intrinsic::x86_sse_min_ss:
5932 handleBinarySdSsIntrinsic(I);
5933 break;
5934
5935 case Intrinsic::x86_avx_vtestc_pd:
5936 case Intrinsic::x86_avx_vtestc_pd_256:
5937 case Intrinsic::x86_avx_vtestc_ps:
5938 case Intrinsic::x86_avx_vtestc_ps_256:
5939 case Intrinsic::x86_avx_vtestnzc_pd:
5940 case Intrinsic::x86_avx_vtestnzc_pd_256:
5941 case Intrinsic::x86_avx_vtestnzc_ps:
5942 case Intrinsic::x86_avx_vtestnzc_ps_256:
5943 case Intrinsic::x86_avx_vtestz_pd:
5944 case Intrinsic::x86_avx_vtestz_pd_256:
5945 case Intrinsic::x86_avx_vtestz_ps:
5946 case Intrinsic::x86_avx_vtestz_ps_256:
5947 case Intrinsic::x86_avx_ptestc_256:
5948 case Intrinsic::x86_avx_ptestnzc_256:
5949 case Intrinsic::x86_avx_ptestz_256:
5950 case Intrinsic::x86_sse41_ptestc:
5951 case Intrinsic::x86_sse41_ptestnzc:
5952 case Intrinsic::x86_sse41_ptestz:
5953 handleVtestIntrinsic(I);
5954 break;
5955
5956 // Packed Horizontal Add/Subtract
5957 case Intrinsic::x86_ssse3_phadd_w:
5958 case Intrinsic::x86_ssse3_phadd_w_128:
5959 case Intrinsic::x86_avx2_phadd_w:
5960 case Intrinsic::x86_ssse3_phsub_w:
5961 case Intrinsic::x86_ssse3_phsub_w_128:
5962 case Intrinsic::x86_avx2_phsub_w: {
5963 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5964 break;
5965 }
5966
5967 // Packed Horizontal Add/Subtract
5968 case Intrinsic::x86_ssse3_phadd_d:
5969 case Intrinsic::x86_ssse3_phadd_d_128:
5970 case Intrinsic::x86_avx2_phadd_d:
5971 case Intrinsic::x86_ssse3_phsub_d:
5972 case Intrinsic::x86_ssse3_phsub_d_128:
5973 case Intrinsic::x86_avx2_phsub_d: {
5974 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
5975 break;
5976 }
5977
5978 // Packed Horizontal Add/Subtract and Saturate
5979 case Intrinsic::x86_ssse3_phadd_sw:
5980 case Intrinsic::x86_ssse3_phadd_sw_128:
5981 case Intrinsic::x86_avx2_phadd_sw:
5982 case Intrinsic::x86_ssse3_phsub_sw:
5983 case Intrinsic::x86_ssse3_phsub_sw_128:
5984 case Intrinsic::x86_avx2_phsub_sw: {
5985 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
5986 break;
5987 }
5988
5989 // Packed Single/Double Precision Floating-Point Horizontal Add
5990 case Intrinsic::x86_sse3_hadd_ps:
5991 case Intrinsic::x86_sse3_hadd_pd:
5992 case Intrinsic::x86_avx_hadd_pd_256:
5993 case Intrinsic::x86_avx_hadd_ps_256:
5994 case Intrinsic::x86_sse3_hsub_ps:
5995 case Intrinsic::x86_sse3_hsub_pd:
5996 case Intrinsic::x86_avx_hsub_pd_256:
5997 case Intrinsic::x86_avx_hsub_ps_256: {
5998 handlePairwiseShadowOrIntrinsic(I);
5999 break;
6000 }
6001
6002 case Intrinsic::x86_avx_maskstore_ps:
6003 case Intrinsic::x86_avx_maskstore_pd:
6004 case Intrinsic::x86_avx_maskstore_ps_256:
6005 case Intrinsic::x86_avx_maskstore_pd_256:
6006 case Intrinsic::x86_avx2_maskstore_d:
6007 case Intrinsic::x86_avx2_maskstore_q:
6008 case Intrinsic::x86_avx2_maskstore_d_256:
6009 case Intrinsic::x86_avx2_maskstore_q_256: {
6010 handleAVXMaskedStore(I);
6011 break;
6012 }
6013
6014 case Intrinsic::x86_avx_maskload_ps:
6015 case Intrinsic::x86_avx_maskload_pd:
6016 case Intrinsic::x86_avx_maskload_ps_256:
6017 case Intrinsic::x86_avx_maskload_pd_256:
6018 case Intrinsic::x86_avx2_maskload_d:
6019 case Intrinsic::x86_avx2_maskload_q:
6020 case Intrinsic::x86_avx2_maskload_d_256:
6021 case Intrinsic::x86_avx2_maskload_q_256: {
6022 handleAVXMaskedLoad(I);
6023 break;
6024 }
6025
6026 // Packed
6027 case Intrinsic::x86_avx512fp16_add_ph_512:
6028 case Intrinsic::x86_avx512fp16_sub_ph_512:
6029 case Intrinsic::x86_avx512fp16_mul_ph_512:
6030 case Intrinsic::x86_avx512fp16_div_ph_512:
6031 case Intrinsic::x86_avx512fp16_max_ph_512:
6032 case Intrinsic::x86_avx512fp16_min_ph_512:
6033 case Intrinsic::x86_avx512_min_ps_512:
6034 case Intrinsic::x86_avx512_min_pd_512:
6035 case Intrinsic::x86_avx512_max_ps_512:
6036 case Intrinsic::x86_avx512_max_pd_512: {
6037 // These AVX512 variants contain the rounding mode as a trailing flag.
6038 // Earlier variants do not have a trailing flag and are already handled
6039 // by maybeHandleSimpleNomemIntrinsic(I, 0) via
6040 // maybeHandleUnknownIntrinsic.
6041 [[maybe_unused]] bool Success =
6042 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
6043 assert(Success);
6044 break;
6045 }
6046
6047 case Intrinsic::x86_avx_vpermilvar_pd:
6048 case Intrinsic::x86_avx_vpermilvar_pd_256:
6049 case Intrinsic::x86_avx512_vpermilvar_pd_512:
6050 case Intrinsic::x86_avx_vpermilvar_ps:
6051 case Intrinsic::x86_avx_vpermilvar_ps_256:
6052 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
6053 handleAVXVpermilvar(I);
6054 break;
6055 }
6056
6057 case Intrinsic::x86_avx512_vpermi2var_d_128:
6058 case Intrinsic::x86_avx512_vpermi2var_d_256:
6059 case Intrinsic::x86_avx512_vpermi2var_d_512:
6060 case Intrinsic::x86_avx512_vpermi2var_hi_128:
6061 case Intrinsic::x86_avx512_vpermi2var_hi_256:
6062 case Intrinsic::x86_avx512_vpermi2var_hi_512:
6063 case Intrinsic::x86_avx512_vpermi2var_pd_128:
6064 case Intrinsic::x86_avx512_vpermi2var_pd_256:
6065 case Intrinsic::x86_avx512_vpermi2var_pd_512:
6066 case Intrinsic::x86_avx512_vpermi2var_ps_128:
6067 case Intrinsic::x86_avx512_vpermi2var_ps_256:
6068 case Intrinsic::x86_avx512_vpermi2var_ps_512:
6069 case Intrinsic::x86_avx512_vpermi2var_q_128:
6070 case Intrinsic::x86_avx512_vpermi2var_q_256:
6071 case Intrinsic::x86_avx512_vpermi2var_q_512:
6072 case Intrinsic::x86_avx512_vpermi2var_qi_128:
6073 case Intrinsic::x86_avx512_vpermi2var_qi_256:
6074 case Intrinsic::x86_avx512_vpermi2var_qi_512:
6075 handleAVXVpermi2var(I);
6076 break;
6077
6078 // Packed Shuffle
6079 // llvm.x86.sse.pshuf.w(<1 x i64>, i8)
6080 // llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
6081 // llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
6082 // llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
6083 // llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
6084 //
6085 // The following intrinsics are auto-upgraded:
6086 // llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
6087 // llvm.x86.sse2.pshufh.w(<8 x i16>, i8)
6088 // llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
6089 case Intrinsic::x86_avx2_pshuf_b:
6090 case Intrinsic::x86_sse_pshuf_w:
6091 case Intrinsic::x86_ssse3_pshuf_b_128:
6092 case Intrinsic::x86_ssse3_pshuf_b:
6093 case Intrinsic::x86_avx512_pshuf_b_512:
6094 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6095 /*trailingVerbatimArgs=*/1);
6096 break;
6097
6098 // AVX512 PMOV: Packed MOV, with truncation
6099 // Precisely handled by applying the same intrinsic to the shadow
6100 case Intrinsic::x86_avx512_mask_pmov_dw_512:
6101 case Intrinsic::x86_avx512_mask_pmov_db_512:
6102 case Intrinsic::x86_avx512_mask_pmov_qb_512:
6103 case Intrinsic::x86_avx512_mask_pmov_qw_512: {
6104 // Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
6105 // f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
6106 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6107 /*trailingVerbatimArgs=*/1);
6108 break;
6109 }
6110
6111 // AVX512 PMOV{S,US}: Packed MOV, with signed/unsigned saturation
6112 // Approximately handled using the corresponding truncation intrinsic
6113 // TODO: improve handleAVX512VectorDownConvert to precisely model saturation
6114 case Intrinsic::x86_avx512_mask_pmovs_dw_512:
6115 case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
6116 handleIntrinsicByApplyingToShadow(I,
6117 Intrinsic::x86_avx512_mask_pmov_dw_512,
6118 /* trailingVerbatimArgs=*/1);
6119 break;
6120 }
6121
6122 case Intrinsic::x86_avx512_mask_pmovs_db_512:
6123 case Intrinsic::x86_avx512_mask_pmovus_db_512: {
6124 handleIntrinsicByApplyingToShadow(I,
6125 Intrinsic::x86_avx512_mask_pmov_db_512,
6126 /* trailingVerbatimArgs=*/1);
6127 break;
6128 }
6129
6130 case Intrinsic::x86_avx512_mask_pmovs_qb_512:
6131 case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
6132 handleIntrinsicByApplyingToShadow(I,
6133 Intrinsic::x86_avx512_mask_pmov_qb_512,
6134 /* trailingVerbatimArgs=*/1);
6135 break;
6136 }
6137
6138 case Intrinsic::x86_avx512_mask_pmovs_qw_512:
6139 case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
6140 handleIntrinsicByApplyingToShadow(I,
6141 Intrinsic::x86_avx512_mask_pmov_qw_512,
6142 /* trailingVerbatimArgs=*/1);
6143 break;
6144 }
6145
6146 case Intrinsic::x86_avx512_mask_pmovs_qd_512:
6147 case Intrinsic::x86_avx512_mask_pmovus_qd_512:
6148 case Intrinsic::x86_avx512_mask_pmovs_wb_512:
6149 case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
6150 // Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
6151 // cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
6152 // slow-path handler.
6153 handleAVX512VectorDownConvert(I);
6154 break;
6155 }
6156
6157 // AVX512/AVX10 Reciprocal Square Root
6158 // <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
6159 // (<16 x float>, <16 x float>, i16)
6160 // <8 x float> @llvm.x86.avx512.rsqrt14.ps.256
6161 // (<8 x float>, <8 x float>, i8)
6162 // <4 x float> @llvm.x86.avx512.rsqrt14.ps.128
6163 // (<4 x float>, <4 x float>, i8)
6164 //
6165 // <8 x double> @llvm.x86.avx512.rsqrt14.pd.512
6166 // (<8 x double>, <8 x double>, i8)
6167 // <4 x double> @llvm.x86.avx512.rsqrt14.pd.256
6168 // (<4 x double>, <4 x double>, i8)
6169 // <2 x double> @llvm.x86.avx512.rsqrt14.pd.128
6170 // (<2 x double>, <2 x double>, i8)
6171 //
6172 // <32 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.512
6173 // (<32 x bfloat>, <32 x bfloat>, i32)
6174 // <16 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.256
6175 // (<16 x bfloat>, <16 x bfloat>, i16)
6176 // <8 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.128
6177 // (<8 x bfloat>, <8 x bfloat>, i8)
6178 //
6179 // <32 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.512
6180 // (<32 x half>, <32 x half>, i32)
6181 // <16 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.256
6182 // (<16 x half>, <16 x half>, i16)
6183 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.128
6184 // (<8 x half>, <8 x half>, i8)
6185 //
6186 // TODO: 3-operand variants are not handled:
6187 // <2 x double> @llvm.x86.avx512.rsqrt14.sd
6188 // (<2 x double>, <2 x double>, <2 x double>, i8)
6189 // <4 x float> @llvm.x86.avx512.rsqrt14.ss
6190 // (<4 x float>, <4 x float>, <4 x float>, i8)
6191 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.sh
6192 // (<8 x half>, <8 x half>, <8 x half>, i8)
6193 case Intrinsic::x86_avx512_rsqrt14_ps_512:
6194 case Intrinsic::x86_avx512_rsqrt14_ps_256:
6195 case Intrinsic::x86_avx512_rsqrt14_ps_128:
6196 case Intrinsic::x86_avx512_rsqrt14_pd_512:
6197 case Intrinsic::x86_avx512_rsqrt14_pd_256:
6198 case Intrinsic::x86_avx512_rsqrt14_pd_128:
6199 case Intrinsic::x86_avx10_mask_rsqrt_bf16_512:
6200 case Intrinsic::x86_avx10_mask_rsqrt_bf16_256:
6201 case Intrinsic::x86_avx10_mask_rsqrt_bf16_128:
6202 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_512:
6203 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_256:
6204 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_128:
6205 handleAVX512VectorGenericMaskedFP(I);
6206 break;
6207
6208 // AVX512/AVX10 Reciprocal
6209 // <16 x float> @llvm.x86.avx512.rcp14.ps.512
6210 // (<16 x float>, <16 x float>, i16)
6211 // <8 x float> @llvm.x86.avx512.rcp14.ps.256
6212 // (<8 x float>, <8 x float>, i8)
6213 // <4 x float> @llvm.x86.avx512.rcp14.ps.128
6214 // (<4 x float>, <4 x float>, i8)
6215 //
6216 // <8 x double> @llvm.x86.avx512.rcp14.pd.512
6217 // (<8 x double>, <8 x double>, i8)
6218 // <4 x double> @llvm.x86.avx512.rcp14.pd.256
6219 // (<4 x double>, <4 x double>, i8)
6220 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
6221 // (<2 x double>, <2 x double>, i8)
6222 //
6223 // <32 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.512
6224 // (<32 x bfloat>, <32 x bfloat>, i32)
6225 // <16 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.256
6226 // (<16 x bfloat>, <16 x bfloat>, i16)
6227 // <8 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.128
6228 // (<8 x bfloat>, <8 x bfloat>, i8)
6229 //
6230 // <32 x half> @llvm.x86.avx512fp16.mask.rcp.ph.512
6231 // (<32 x half>, <32 x half>, i32)
6232 // <16 x half> @llvm.x86.avx512fp16.mask.rcp.ph.256
6233 // (<16 x half>, <16 x half>, i16)
6234 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.ph.128
6235 // (<8 x half>, <8 x half>, i8)
6236 //
6237 // TODO: 3-operand variants are not handled:
6238 // <2 x double> @llvm.x86.avx512.rcp14.sd
6239 // (<2 x double>, <2 x double>, <2 x double>, i8)
6240 // <4 x float> @llvm.x86.avx512.rcp14.ss
6241 // (<4 x float>, <4 x float>, <4 x float>, i8)
6242 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.sh
6243 // (<8 x half>, <8 x half>, <8 x half>, i8)
6244 case Intrinsic::x86_avx512_rcp14_ps_512:
6245 case Intrinsic::x86_avx512_rcp14_ps_256:
6246 case Intrinsic::x86_avx512_rcp14_ps_128:
6247 case Intrinsic::x86_avx512_rcp14_pd_512:
6248 case Intrinsic::x86_avx512_rcp14_pd_256:
6249 case Intrinsic::x86_avx512_rcp14_pd_128:
6250 case Intrinsic::x86_avx10_mask_rcp_bf16_512:
6251 case Intrinsic::x86_avx10_mask_rcp_bf16_256:
6252 case Intrinsic::x86_avx10_mask_rcp_bf16_128:
6253 case Intrinsic::x86_avx512fp16_mask_rcp_ph_512:
6254 case Intrinsic::x86_avx512fp16_mask_rcp_ph_256:
6255 case Intrinsic::x86_avx512fp16_mask_rcp_ph_128:
6256 handleAVX512VectorGenericMaskedFP(I);
6257 break;
6258
6259 // AVX512 FP16 Arithmetic
6260 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
6261 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
6262 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
6263 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
6264 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
6265 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
6266 visitGenericScalarHalfwordInst(I);
6267 break;
6268 }
6269
6270 // AVX Galois Field New Instructions
6271 case Intrinsic::x86_vgf2p8affineqb_128:
6272 case Intrinsic::x86_vgf2p8affineqb_256:
6273 case Intrinsic::x86_vgf2p8affineqb_512:
6274 handleAVXGF2P8Affine(I);
6275 break;
6276
6277 default:
6278 return false;
6279 }
6280
6281 return true;
6282 }
6283
6284 bool maybeHandleArmSIMDIntrinsic(IntrinsicInst &I) {
6285 switch (I.getIntrinsicID()) {
6286 case Intrinsic::aarch64_neon_rshrn:
6287 case Intrinsic::aarch64_neon_sqrshl:
6288 case Intrinsic::aarch64_neon_sqrshrn:
6289 case Intrinsic::aarch64_neon_sqrshrun:
6290 case Intrinsic::aarch64_neon_sqshl:
6291 case Intrinsic::aarch64_neon_sqshlu:
6292 case Intrinsic::aarch64_neon_sqshrn:
6293 case Intrinsic::aarch64_neon_sqshrun:
6294 case Intrinsic::aarch64_neon_srshl:
6295 case Intrinsic::aarch64_neon_sshl:
6296 case Intrinsic::aarch64_neon_uqrshl:
6297 case Intrinsic::aarch64_neon_uqrshrn:
6298 case Intrinsic::aarch64_neon_uqshl:
6299 case Intrinsic::aarch64_neon_uqshrn:
6300 case Intrinsic::aarch64_neon_urshl:
6301 case Intrinsic::aarch64_neon_ushl:
6302 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
6303 handleVectorShiftIntrinsic(I, /* Variable */ false);
6304 break;
6305
6306 // TODO: handling max/min similarly to AND/OR may be more precise
6307 // Floating-Point Maximum/Minimum Pairwise
6308 case Intrinsic::aarch64_neon_fmaxp:
6309 case Intrinsic::aarch64_neon_fminp:
6310 // Floating-Point Maximum/Minimum Number Pairwise
6311 case Intrinsic::aarch64_neon_fmaxnmp:
6312 case Intrinsic::aarch64_neon_fminnmp:
6313 // Signed/Unsigned Maximum/Minimum Pairwise
6314 case Intrinsic::aarch64_neon_smaxp:
6315 case Intrinsic::aarch64_neon_sminp:
6316 case Intrinsic::aarch64_neon_umaxp:
6317 case Intrinsic::aarch64_neon_uminp:
6318 // Add Pairwise
6319 case Intrinsic::aarch64_neon_addp:
6320 // Floating-point Add Pairwise
6321 case Intrinsic::aarch64_neon_faddp:
6322 // Add Long Pairwise
6323 case Intrinsic::aarch64_neon_saddlp:
6324 case Intrinsic::aarch64_neon_uaddlp: {
6325 handlePairwiseShadowOrIntrinsic(I);
6326 break;
6327 }
6328
6329 // Floating-point Convert to integer, rounding to nearest with ties to Away
6330 case Intrinsic::aarch64_neon_fcvtas:
6331 case Intrinsic::aarch64_neon_fcvtau:
6332 // Floating-point convert to integer, rounding toward minus infinity
6333 case Intrinsic::aarch64_neon_fcvtms:
6334 case Intrinsic::aarch64_neon_fcvtmu:
6335 // Floating-point convert to integer, rounding to nearest with ties to even
6336 case Intrinsic::aarch64_neon_fcvtns:
6337 case Intrinsic::aarch64_neon_fcvtnu:
6338 // Floating-point convert to integer, rounding toward plus infinity
6339 case Intrinsic::aarch64_neon_fcvtps:
6340 case Intrinsic::aarch64_neon_fcvtpu:
6341 // Floating-point Convert to integer, rounding toward Zero
6342 case Intrinsic::aarch64_neon_fcvtzs:
6343 case Intrinsic::aarch64_neon_fcvtzu:
6344 // Floating-point convert to lower precision narrow, rounding to odd
6345 case Intrinsic::aarch64_neon_fcvtxn: {
6346 handleNEONVectorConvertIntrinsic(I);
6347 break;
6348 }
6349
6350 // Add reduction to scalar
6351 case Intrinsic::aarch64_neon_faddv:
6352 case Intrinsic::aarch64_neon_saddv:
6353 case Intrinsic::aarch64_neon_uaddv:
6354 // Signed/Unsigned min/max (Vector)
6355 // TODO: handling similarly to AND/OR may be more precise.
6356 case Intrinsic::aarch64_neon_smaxv:
6357 case Intrinsic::aarch64_neon_sminv:
6358 case Intrinsic::aarch64_neon_umaxv:
6359 case Intrinsic::aarch64_neon_uminv:
6360 // Floating-point min/max (vector)
6361 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
6362 // but our shadow propagation is the same.
6363 case Intrinsic::aarch64_neon_fmaxv:
6364 case Intrinsic::aarch64_neon_fminv:
6365 case Intrinsic::aarch64_neon_fmaxnmv:
6366 case Intrinsic::aarch64_neon_fminnmv:
6367 // Sum long across vector
6368 case Intrinsic::aarch64_neon_saddlv:
6369 case Intrinsic::aarch64_neon_uaddlv:
6370 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
6371 break;
6372
6373 case Intrinsic::aarch64_neon_ld1x2:
6374 case Intrinsic::aarch64_neon_ld1x3:
6375 case Intrinsic::aarch64_neon_ld1x4:
6376 case Intrinsic::aarch64_neon_ld2:
6377 case Intrinsic::aarch64_neon_ld3:
6378 case Intrinsic::aarch64_neon_ld4:
6379 case Intrinsic::aarch64_neon_ld2r:
6380 case Intrinsic::aarch64_neon_ld3r:
6381 case Intrinsic::aarch64_neon_ld4r: {
6382 handleNEONVectorLoad(I, /*WithLane=*/false);
6383 break;
6384 }
6385
6386 case Intrinsic::aarch64_neon_ld2lane:
6387 case Intrinsic::aarch64_neon_ld3lane:
6388 case Intrinsic::aarch64_neon_ld4lane: {
6389 handleNEONVectorLoad(I, /*WithLane=*/true);
6390 break;
6391 }
6392
6393 // Saturating extract narrow
6394 case Intrinsic::aarch64_neon_sqxtn:
6395 case Intrinsic::aarch64_neon_sqxtun:
6396 case Intrinsic::aarch64_neon_uqxtn:
6397 // These only have one argument, but we (ab)use handleShadowOr because it
6398 // does work on single argument intrinsics and will typecast the shadow
6399 // (and update the origin).
6400 handleShadowOr(I);
6401 break;
6402
6403 case Intrinsic::aarch64_neon_st1x2:
6404 case Intrinsic::aarch64_neon_st1x3:
6405 case Intrinsic::aarch64_neon_st1x4:
6406 case Intrinsic::aarch64_neon_st2:
6407 case Intrinsic::aarch64_neon_st3:
6408 case Intrinsic::aarch64_neon_st4: {
6409 handleNEONVectorStoreIntrinsic(I, false);
6410 break;
6411 }
6412
6413 case Intrinsic::aarch64_neon_st2lane:
6414 case Intrinsic::aarch64_neon_st3lane:
6415 case Intrinsic::aarch64_neon_st4lane: {
6416 handleNEONVectorStoreIntrinsic(I, true);
6417 break;
6418 }
6419
6420 // Arm NEON vector table intrinsics have the source/table register(s) as
6421 // arguments, followed by the index register. They return the output.
6422 //
6423 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
6424 // original value unchanged in the destination register.'
6425 // Conveniently, zero denotes a clean shadow, which means out-of-range
6426 // indices for TBL will initialize the user data with zero and also clean
6427 // the shadow. (For TBX, neither the user data nor the shadow will be
6428 // updated, which is also correct.)
6429 case Intrinsic::aarch64_neon_tbl1:
6430 case Intrinsic::aarch64_neon_tbl2:
6431 case Intrinsic::aarch64_neon_tbl3:
6432 case Intrinsic::aarch64_neon_tbl4:
6433 case Intrinsic::aarch64_neon_tbx1:
6434 case Intrinsic::aarch64_neon_tbx2:
6435 case Intrinsic::aarch64_neon_tbx3:
6436 case Intrinsic::aarch64_neon_tbx4: {
6437 // The last trailing argument (index register) should be handled verbatim
6438 handleIntrinsicByApplyingToShadow(
6439 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
6440 /*trailingVerbatimArgs*/ 1);
6441 break;
6442 }
6443
6444 case Intrinsic::aarch64_neon_fmulx:
6445 case Intrinsic::aarch64_neon_pmul:
6446 case Intrinsic::aarch64_neon_pmull:
6447 case Intrinsic::aarch64_neon_smull:
6448 case Intrinsic::aarch64_neon_pmull64:
6449 case Intrinsic::aarch64_neon_umull: {
6450 handleNEONVectorMultiplyIntrinsic(I);
6451 break;
6452 }
6453
6454 default:
6455 return false;
6456 }
6457
6458 return true;
6459 }
6460
6461 void visitIntrinsicInst(IntrinsicInst &I) {
6462 if (maybeHandleCrossPlatformIntrinsic(I))
6463 return;
6464
6465 if (maybeHandleX86SIMDIntrinsic(I))
6466 return;
6467
6468 if (maybeHandleArmSIMDIntrinsic(I))
6469 return;
6470
6471 if (maybeHandleUnknownIntrinsic(I))
6472 return;
6473
6474 visitInstruction(I);
6475 }
6476
6477 void visitLibAtomicLoad(CallBase &CB) {
6478 // Since we use getNextNode here, we can't have CB terminate the BB.
6479 assert(isa<CallInst>(CB));
6480
6481 IRBuilder<> IRB(&CB);
6482 Value *Size = CB.getArgOperand(0);
6483 Value *SrcPtr = CB.getArgOperand(1);
6484 Value *DstPtr = CB.getArgOperand(2);
6485 Value *Ordering = CB.getArgOperand(3);
6486 // Convert the call to have at least Acquire ordering to make sure
6487 // the shadow operations aren't reordered before it.
6488 Value *NewOrdering =
6489 IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
6490 CB.setArgOperand(3, NewOrdering);
6491
6492 NextNodeIRBuilder NextIRB(&CB);
6493 Value *SrcShadowPtr, *SrcOriginPtr;
6494 std::tie(SrcShadowPtr, SrcOriginPtr) =
6495 getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6496 /*isStore*/ false);
6497 Value *DstShadowPtr =
6498 getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6499 /*isStore*/ true)
6500 .first;
6501
6502 NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
6503 if (MS.TrackOrigins) {
6504 Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
6505 kMinOriginAlignment);
6506 Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
6507 NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
6508 }
6509 }
6510
6511 void visitLibAtomicStore(CallBase &CB) {
6512 IRBuilder<> IRB(&CB);
6513 Value *Size = CB.getArgOperand(0);
6514 Value *DstPtr = CB.getArgOperand(2);
6515 Value *Ordering = CB.getArgOperand(3);
6516 // Convert the call to have at least Release ordering to make sure
6517 // the shadow operations aren't reordered after it.
6518 Value *NewOrdering =
6519 IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
6520 CB.setArgOperand(3, NewOrdering);
6521
6522 Value *DstShadowPtr =
6523 getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
6524 /*isStore*/ true)
6525 .first;
6526
6527 // Atomic store always paints clean shadow/origin. See file header.
6528 IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
6529 Align(1));
6530 }
6531
6532 void visitCallBase(CallBase &CB) {
6533 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
6534 if (CB.isInlineAsm()) {
6535 // For inline asm (either a call to asm function, or callbr instruction),
6536 // do the usual thing: check argument shadow and mark all outputs as
6537 // clean. Note that any side effects of the inline asm that are not
6538 // immediately visible in its constraints are not handled.
6539 if (ClHandleAsmConservative)
6540 visitAsmInstruction(CB);
6541 else
6542 visitInstruction(CB);
6543 return;
6544 }
6545 LibFunc LF;
6546 if (TLI->getLibFunc(CB, LF)) {
6547 // libatomic.a functions need to have special handling because there isn't
6548 // a good way to intercept them or compile the library with
6549 // instrumentation.
6550 switch (LF) {
6551 case LibFunc_atomic_load:
6552 if (!isa<CallInst>(CB)) {
6553 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
6554 "Ignoring!\n";
6555 break;
6556 }
6557 visitLibAtomicLoad(CB);
6558 return;
6559 case LibFunc_atomic_store:
6560 visitLibAtomicStore(CB);
6561 return;
6562 default:
6563 break;
6564 }
6565 }
6566
6567 if (auto *Call = dyn_cast<CallInst>(&CB)) {
6568 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
6569
6570 // We are going to insert code that relies on the fact that the callee
6571 // will become a non-readonly function after it is instrumented by us. To
6572 // prevent this code from being optimized out, mark that function
6573 // non-readonly in advance.
6574 // TODO: We can likely do better than dropping memory() completely here.
6575 AttributeMask B;
6576 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
6577
6578 Call->removeFnAttrs(B);
6579 if (Function *Func = Call->getCalledFunction()) {
6580 Func->removeFnAttrs(B);
6581 }
6582
6583 maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
6584 }
6585 IRBuilder<> IRB(&CB);
6586 bool MayCheckCall = MS.EagerChecks;
6587 if (Function *Func = CB.getCalledFunction()) {
6588 // __sanitizer_unaligned_{load,store} functions may be called by users
6589 // and always expect shadows in the TLS. So don't check them.
6590 MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
6591 }
6592
6593 unsigned ArgOffset = 0;
6594 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
6595 for (const auto &[i, A] : llvm::enumerate(CB.args())) {
6596 if (!A->getType()->isSized()) {
6597 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
6598 continue;
6599 }
6600
6601 if (A->getType()->isScalableTy()) {
6602 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
6603 // Handle as noundef, but don't reserve tls slots.
6604 insertCheckShadowOf(A, &CB);
6605 continue;
6606 }
6607
6608 unsigned Size = 0;
6609 const DataLayout &DL = F.getDataLayout();
6610
6611 bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
6612 bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
6613 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
6614
6615 if (EagerCheck) {
6616 insertCheckShadowOf(A, &CB);
6617 Size = DL.getTypeAllocSize(A->getType());
6618 } else {
6619 [[maybe_unused]] Value *Store = nullptr;
6620 // Compute the Shadow for arg even if it is ByVal, because
6621 // in that case getShadow() will copy the actual arg shadow to
6622 // __msan_param_tls.
6623 Value *ArgShadow = getShadow(A);
6624 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
6625 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
6626 << " Shadow: " << *ArgShadow << "\n");
6627 if (ByVal) {
6628 // ByVal requires some special handling as it's too big for a single
6629 // load
6630 assert(A->getType()->isPointerTy() &&
6631 "ByVal argument is not a pointer!");
6632 Size = DL.getTypeAllocSize(CB.getParamByValType(i));
6633 if (ArgOffset + Size > kParamTLSSize)
6634 break;
6635 const MaybeAlign ParamAlignment(CB.getParamAlign(i));
6636 MaybeAlign Alignment = std::nullopt;
6637 if (ParamAlignment)
6638 Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
6639 Value *AShadowPtr, *AOriginPtr;
6640 std::tie(AShadowPtr, AOriginPtr) =
6641 getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
6642 /*isStore*/ false);
6643 if (!PropagateShadow) {
6644 Store = IRB.CreateMemSet(ArgShadowBase,
6645 Constant::getNullValue(IRB.getInt8Ty()),
6646 Size, Alignment);
6647 } else {
6648 Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
6649 Alignment, Size);
6650 if (MS.TrackOrigins) {
6651 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
6652 // FIXME: OriginSize should be:
6653 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
6654 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
6655 IRB.CreateMemCpy(
6656 ArgOriginBase,
6657 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
6658 AOriginPtr,
6659 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
6660 }
6661 }
6662 } else {
6663 // Any other parameters mean we need bit-grained tracking of uninit
6664 // data
6665 Size = DL.getTypeAllocSize(A->getType());
6666 if (ArgOffset + Size > kParamTLSSize)
6667 break;
6668 Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
6669 kShadowTLSAlignment);
6670 Constant *Cst = dyn_cast<Constant>(ArgShadow);
6671 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
6672 IRB.CreateStore(getOrigin(A),
6673 getOriginPtrForArgument(IRB, ArgOffset));
6674 }
6675 }
6676 assert(Store != nullptr);
6677 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
6678 }
6679 assert(Size != 0);
6680 ArgOffset += alignTo(Size, kShadowTLSAlignment);
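// Note (added for clarity): each argument's shadow occupies an 8-byte-aligned
// slot in __msan_param_tls (kShadowTLSAlignment is 8), so e.g. an i32
// argument still advances ArgOffset by 8.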
6681 }
6682 LLVM_DEBUG(dbgs() << " done with call args\n");
6683
6684 FunctionType *FT = CB.getFunctionType();
6685 if (FT->isVarArg()) {
6686 VAHelper->visitCallBase(CB, IRB);
6687 }
6688
6689 // Now, get the shadow for the RetVal.
6690 if (!CB.getType()->isSized())
6691 return;
6692 // Don't emit the epilogue for musttail call returns.
6693 if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
6694 return;
6695
6696 if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
6697 setShadow(&CB, getCleanShadow(&CB));
6698 setOrigin(&CB, getCleanOrigin());
6699 return;
6700 }
6701
6702 IRBuilder<> IRBBefore(&CB);
6703 // Until we have full dynamic coverage, make sure the retval shadow is 0.
6704 Value *Base = getShadowPtrForRetval(IRBBefore);
6705 IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
6706 kShadowTLSAlignment);
6707 BasicBlock::iterator NextInsn;
6708 if (isa<CallInst>(CB)) {
6709 NextInsn = ++CB.getIterator();
6710 assert(NextInsn != CB.getParent()->end());
6711 } else {
6712 BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
6713 if (!NormalDest->getSinglePredecessor()) {
6714 // FIXME: this case is tricky, so we are just conservative here.
6715 // Perhaps we need to split the edge between this BB and NormalDest,
6716 // but a naive attempt to use SplitEdge leads to a crash.
6717 setShadow(&CB, getCleanShadow(&CB));
6718 setOrigin(&CB, getCleanOrigin());
6719 return;
6720 }
6721 // FIXME: NextInsn is likely in a basic block that has not been visited
6722 // yet. Anything inserted there will be instrumented by MSan later!
6723 NextInsn = NormalDest->getFirstInsertionPt();
6724 assert(NextInsn != NormalDest->end() &&
6725 "Could not find insertion point for retval shadow load");
6726 }
6727 IRBuilder<> IRBAfter(&*NextInsn);
6728 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
6729 getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
6730 "_msret");
6731 setShadow(&CB, RetvalShadow);
6732 if (MS.TrackOrigins)
6733 setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
6734 }
6735
6736 bool isAMustTailRetVal(Value *RetVal) {
6737 if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
6738 RetVal = I->getOperand(0);
6739 }
6740 if (auto *I = dyn_cast<CallInst>(RetVal)) {
6741 return I->isMustTailCall();
6742 }
6743 return false;
6744 }
6745
6746 void visitReturnInst(ReturnInst &I) {
6747 IRBuilder<> IRB(&I);
6748 Value *RetVal = I.getReturnValue();
6749 if (!RetVal)
6750 return;
6751 // Don't emit the epilogue for musttail call returns.
6752 if (isAMustTailRetVal(RetVal))
6753 return;
6754 Value *ShadowPtr = getShadowPtrForRetval(IRB);
6755 bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
6756 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
6757 // FIXME: Consider using SpecialCaseList to specify a list of functions that
6758 // must always return fully initialized values. For now, we hardcode "main".
6759 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
6760
6761 Value *Shadow = getShadow(RetVal);
6762 bool StoreOrigin = true;
6763 if (EagerCheck) {
6764 insertCheckShadowOf(RetVal, &I);
6765 Shadow = getCleanShadow(RetVal);
6766 StoreOrigin = false;
6767 }
6768
6769 // The caller may still expect information passed over TLS if we pass our
6770 // check
6771 if (StoreShadow) {
6772 IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
6773 if (MS.TrackOrigins && StoreOrigin)
6774 IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
6775 }
6776 }
6777
6778 void visitPHINode(PHINode &I) {
6779 IRBuilder<> IRB(&I);
6780 if (!PropagateShadow) {
6781 setShadow(&I, getCleanShadow(&I));
6782 setOrigin(&I, getCleanOrigin());
6783 return;
6784 }
6785
6786 ShadowPHINodes.push_back(&I);
6787 setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
6788 "_msphi_s"));
6789 if (MS.TrackOrigins)
6790 setOrigin(
6791 &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
6792 }
6793
6794 Value *getLocalVarIdptr(AllocaInst &I) {
6795 ConstantInt *IntConst =
6796 ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
6797 return new GlobalVariable(*F.getParent(), IntConst->getType(),
6798 /*isConstant=*/false, GlobalValue::PrivateLinkage,
6799 IntConst);
6800 }
6801
6802 Value *getLocalVarDescription(AllocaInst &I) {
6803 return createPrivateConstGlobalForString(*F.getParent(), I.getName());
6804 }
6805
6806 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6807 if (PoisonStack && ClPoisonStackWithCall) {
6808 IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
6809 } else {
6810 Value *ShadowBase, *OriginBase;
6811 std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
6812 &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
6813
6814 Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
6815 IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
6816 }
6817
6818 if (PoisonStack && MS.TrackOrigins) {
6819 Value *Idptr = getLocalVarIdptr(I);
6820 if (ClPrintStackNames) {
6821 Value *Descr = getLocalVarDescription(I);
6822 IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
6823 {&I, Len, Idptr, Descr});
6824 } else {
6825 IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
6826 }
6827 }
6828 }
6829
6830 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6831 Value *Descr = getLocalVarDescription(I);
6832 if (PoisonStack) {
6833 IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
6834 } else {
6835 IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
6836 }
6837 }
6838
6839 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
6840 if (!InsPoint)
6841 InsPoint = &I;
6842 NextNodeIRBuilder IRB(InsPoint);
6843 const DataLayout &DL = F.getDataLayout();
6844 TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
6845 Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
6846 if (I.isArrayAllocation())
6847 Len = IRB.CreateMul(Len,
6848 IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
6849
6850 if (MS.CompileKernel)
6851 poisonAllocaKmsan(I, IRB, Len);
6852 else
6853 poisonAllocaUserspace(I, IRB, Len);
6854 }
6855
6856 void visitAllocaInst(AllocaInst &I) {
6857 setShadow(&I, getCleanShadow(&I));
6858 setOrigin(&I, getCleanOrigin());
6859 // We'll get to this alloca later unless it's poisoned at the corresponding
6860 // llvm.lifetime.start.
6861 AllocaSet.insert(&I);
6862 }
6863
6864 void visitSelectInst(SelectInst &I) {
6865 // a = select b, c, d
6866 Value *B = I.getCondition();
6867 Value *C = I.getTrueValue();
6868 Value *D = I.getFalseValue();
6869
6870 handleSelectLikeInst(I, B, C, D);
6871 }
6872
6873 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
6874 IRBuilder<> IRB(&I);
6875
6876 Value *Sb = getShadow(B);
6877 Value *Sc = getShadow(C);
6878 Value *Sd = getShadow(D);
6879
6880 Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
6881 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
6882 Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
6883
6884 // Result shadow if condition shadow is 0.
6885 Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
6886 Value *Sa1;
6887 if (I.getType()->isAggregateType()) {
6888 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
6889 // an extra "select". This results in much more compact IR.
6890 // Sa = select Sb, poisoned, (select b, Sc, Sd)
6891 Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
6892 } else {
6893 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
6894 // If Sb (condition is poisoned), look for bits in c and d that are equal
6895 // and both unpoisoned.
6896 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
6897
6898 // Cast arguments to shadow-compatible type.
6899 C = CreateAppToShadowCast(IRB, C);
6900 D = CreateAppToShadowCast(IRB, D);
6901
6902 // Result shadow if condition shadow is 1.
6903 Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
6904 }
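// Illustrative sketch (pseudo-IR, names invented) of what is emitted for
//   %a = select i1 %b, i32 %c, i32 %d:
//   %sa0 = select i1 %b, i32 %sc, i32 %sd              ; clean condition
//   %sa1 = or i32 (or i32 (xor i32 %c, %d), %sc), %sd  ; poisoned condition
//   %sa  = select i1 %sb, i32 %sa1, i32 %sa0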
6905 Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
6906 setShadow(&I, Sa);
6907 if (MS.TrackOrigins) {
6908 // Origins are always i32, so any vector conditions must be flattened.
6909 // FIXME: consider tracking vector origins for app vectors?
6910 if (B->getType()->isVectorTy()) {
6911 B = convertToBool(B, IRB);
6912 Sb = convertToBool(Sb, IRB);
6913 }
6914 // a = select b, c, d
6915 // Oa = Sb ? Ob : (b ? Oc : Od)
6916 setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
6917 }
6918 }
6919
6920 void visitLandingPadInst(LandingPadInst &I) {
6921 // Do nothing.
6922 // See https://github.com/google/sanitizers/issues/504
6923 setShadow(&I, getCleanShadow(&I));
6924 setOrigin(&I, getCleanOrigin());
6925 }
6926
6927 void visitCatchSwitchInst(CatchSwitchInst &I) {
6928 setShadow(&I, getCleanShadow(&I));
6929 setOrigin(&I, getCleanOrigin());
6930 }
6931
6932 void visitFuncletPadInst(FuncletPadInst &I) {
6933 setShadow(&I, getCleanShadow(&I));
6934 setOrigin(&I, getCleanOrigin());
6935 }
6936
6937 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
6938
6939 void visitExtractValueInst(ExtractValueInst &I) {
6940 IRBuilder<> IRB(&I);
6941 Value *Agg = I.getAggregateOperand();
6942 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
6943 Value *AggShadow = getShadow(Agg);
6944 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6945 Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
6946 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
6947 setShadow(&I, ResShadow);
6948 setOriginForNaryOp(I);
6949 }
6950
6951 void visitInsertValueInst(InsertValueInst &I) {
6952 IRBuilder<> IRB(&I);
6953 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
6954 Value *AggShadow = getShadow(I.getAggregateOperand());
6955 Value *InsShadow = getShadow(I.getInsertedValueOperand());
6956 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
6957 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
6958 Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
6959 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
6960 setShadow(&I, Res);
6961 setOriginForNaryOp(I);
6962 }
6963
6964 void dumpInst(Instruction &I) {
6965 if (CallInst *CI = dyn_cast<CallInst>(&I)) {
6966 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
6967 } else {
6968 errs() << "ZZZ " << I.getOpcodeName() << "\n";
6969 }
6970 errs() << "QQQ " << I << "\n";
6971 }
6972
6973 void visitResumeInst(ResumeInst &I) {
6974 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
6975 // Nothing to do here.
6976 }
6977
6978 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
6979 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
6980 // Nothing to do here.
6981 }
6982
6983 void visitCatchReturnInst(CatchReturnInst &CRI) {
6984 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
6985 // Nothing to do here.
6986 }
6987
6988 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
6989 IRBuilder<> &IRB, const DataLayout &DL,
6990 bool isOutput) {
6991 // For each assembly argument, we check its value for being initialized.
6992 // If the argument is a pointer, we assume it points to a single element
6993 // of the corresponding type (or to an 8-byte word, if the type is unsized).
6994 // Each such pointer is instrumented with a call to the runtime library.
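// Example (added for clarity): for asm("..." : "=m"(x)) with 'int x', ElemTy
// is i32, so the 4 bytes of shadow behind the pointer operand are reset to
// clean below.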
6995 Type *OpType = Operand->getType();
6996 // Check the operand value itself.
6997 insertCheckShadowOf(Operand, &I);
6998 if (!OpType->isPointerTy() || !isOutput) {
6999 assert(!isOutput);
7000 return;
7001 }
7002 if (!ElemTy->isSized())
7003 return;
7004 auto Size = DL.getTypeStoreSize(ElemTy);
7005 Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
7006 if (MS.CompileKernel) {
7007 IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
7008 } else {
7009 // ElemTy, derived from elementtype(), does not encode the alignment of
7010 // the pointer. Conservatively assume that the shadow memory is unaligned.
7011 // When Size is large, avoid StoreInst as it would expand to many
7012 // instructions.
7013 auto [ShadowPtr, _] =
7014 getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
7015 if (Size <= 32)
7016 IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
7017 else
7018 IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
7019 SizeVal, Align(1));
7020 }
7021 }
7022
7023 /// Get the number of output arguments returned by pointers.
7024 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
7025 int NumRetOutputs = 0;
7026 int NumOutputs = 0;
7027 Type *RetTy = cast<Value>(CB)->getType();
7028 if (!RetTy->isVoidTy()) {
7029 // Register outputs are returned via the CallInst return value.
7030 auto *ST = dyn_cast<StructType>(RetTy);
7031 if (ST)
7032 NumRetOutputs = ST->getNumElements();
7033 else
7034 NumRetOutputs = 1;
7035 }
7036 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
7037 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
7038 switch (Info.Type) {
7039 case InlineAsm::isOutput:
7040 NumOutputs++;
7041 break;
7042 default:
7043 break;
7044 }
7045 }
7046 return NumOutputs - NumRetOutputs;
7047 }
7048
7049 void visitAsmInstruction(Instruction &I) {
7050 // Conservative inline assembly handling: check for poisoned shadow of
7051 // asm() arguments, then unpoison the result and all the memory locations
7052 // pointed to by those arguments.
7053 // An inline asm() statement in C++ contains lists of input and output
7054 // arguments used by the assembly code. These are mapped to operands of the
7055 // CallInst as follows:
7056 // - nR register outputs ("=r") are returned by value in a single structure
7057 // (SSA value of the CallInst);
7058 // - nO other outputs ("=m" and others) are returned by pointer as first
7059 // nO operands of the CallInst;
7060 // - nI inputs ("r", "m" and others) are passed to CallInst as the
7061 // remaining nI operands.
7062 // The total number of asm() arguments in the source is nR+nO+nI, and the
7063 // corresponding CallInst has nO+nI+1 operands (the last operand is the
7064 // function to be called).
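// Illustrative example: asm("..." : "=r"(ret), "=m"(out) : "r"(a), "m"(b))
// has nR=1, nO=1, nI=2; the CallInst returns the "=r" value and has four
// operands: the "=m" pointer, the two inputs, and the asm callee itself.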
7065 const DataLayout &DL = F.getDataLayout();
7066 CallBase *CB = cast<CallBase>(&I);
7067 IRBuilder<> IRB(&I);
7068 InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
7069 int OutputArgs = getNumOutputArgs(IA, CB);
7070 // The last operand of a CallInst is the function itself.
7071 int NumOperands = CB->getNumOperands() - 1;
7072
7073 // Check input arguments. Doing so before unpoisoning output arguments, so
7074 // that we won't overwrite uninit values before checking them.
7075 for (int i = OutputArgs; i < NumOperands; i++) {
7076 Value *Operand = CB->getOperand(i);
7077 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7078 /*isOutput*/ false);
7079 }
7080 // Unpoison output arguments. This must happen before the actual InlineAsm
7081 // call, so that the shadow for memory published in the asm() statement
7082 // remains valid.
7083 for (int i = 0; i < OutputArgs; i++) {
7084 Value *Operand = CB->getOperand(i);
7085 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7086 /*isOutput*/ true);
7087 }
7088
7089 setShadow(&I, getCleanShadow(&I));
7090 setOrigin(&I, getCleanOrigin());
7091 }
7092
7093 void visitFreezeInst(FreezeInst &I) {
7094 // Freeze always returns a fully defined value.
7095 setShadow(&I, getCleanShadow(&I));
7096 setOrigin(&I, getCleanOrigin());
7097 }
7098
7099 void visitInstruction(Instruction &I) {
7100 // Everything else: stop propagating and check for poisoned shadow.
7101 if (ClDumpStrictInstructions)
7102 dumpInst(I);
7103 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
7104 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
7105 Value *Operand = I.getOperand(i);
7106 if (Operand->getType()->isSized())
7107 insertCheckShadowOf(Operand, &I);
7108 }
7109 setShadow(&I, getCleanShadow(&I));
7110 setOrigin(&I, getCleanOrigin());
7111 }
7112};
7113
7114struct VarArgHelperBase : public VarArgHelper {
7115 Function &F;
7116 MemorySanitizer &MS;
7117 MemorySanitizerVisitor &MSV;
7118 SmallVector<CallInst *, 16> VAStartInstrumentationList;
7119 const unsigned VAListTagSize;
7120
7121 VarArgHelperBase(Function &F, MemorySanitizer &MS,
7122 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
7123 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
7124
7125 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7126 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
7127 return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
7128 }
7129
7130 /// Compute the shadow address for a given va_arg.
7131 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7132 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
7133 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
7134 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_s");
7135 }
7136
7137 /// Compute the shadow address for a given va_arg.
7138 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
7139 unsigned ArgSize) {
7140 // Make sure we don't overflow __msan_va_arg_tls.
7141 if (ArgOffset + ArgSize > kParamTLSSize)
7142 return nullptr;
7143 return getShadowPtrForVAArgument(IRB, ArgOffset);
7144 }
7145
7146 /// Compute the origin address for a given va_arg.
7147 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
7148 Value *Base = IRB.CreatePointerCast(MS.VAArgOriginTLS, MS.IntptrTy);
7149 // getOriginPtrForVAArgument() is always called after
7150 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
7151 // overflow.
7152 Base = IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
7153 return IRB.CreateIntToPtr(Base, MS.PtrTy, "_msarg_va_o");
7154 }
7155
7156 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
7157 unsigned BaseOffset) {
7158 // The tail of __msan_va_arg_tls is not large enough to fit the full
7159 // value shadow, but it will be copied to the backup anyway. Make it
7160 // clean.
7161 if (BaseOffset >= kParamTLSSize)
7162 return;
7163 Value *TailSize =
7164 ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
7165 IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
7166 TailSize, Align(8));
7167 }
7168
7169 void unpoisonVAListTagForInst(IntrinsicInst &I) {
7170 IRBuilder<> IRB(&I);
7171 Value *VAListTag = I.getArgOperand(0);
7172 const Align Alignment = Align(8);
7173 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
7174 VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
7175 // Unpoison the whole __va_list_tag.
7176 IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
7177 VAListTagSize, Alignment, false);
7178 }
7179
7180 void visitVAStartInst(VAStartInst &I) override {
7181 if (F.getCallingConv() == CallingConv::Win64)
7182 return;
7183 VAStartInstrumentationList.push_back(&I);
7184 unpoisonVAListTagForInst(I);
7185 }
7186
7187 void visitVACopyInst(VACopyInst &I) override {
7188 if (F.getCallingConv() == CallingConv::Win64)
7189 return;
7190 unpoisonVAListTagForInst(I);
7191 }
7192};
7193
7194/// AMD64-specific implementation of VarArgHelper.
7195struct VarArgAMD64Helper : public VarArgHelperBase {
7196 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
7197 // See a comment in visitCallBase for more details.
7198 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
7199 static const unsigned AMD64FpEndOffsetSSE = 176;
7200 // If SSE is disabled, fp_offset in va_list is zero.
7201 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
7202
7203 unsigned AMD64FpEndOffset;
7204 AllocaInst *VAArgTLSCopy = nullptr;
7205 AllocaInst *VAArgTLSOriginCopy = nullptr;
7206 Value *VAArgOverflowSize = nullptr;
7207
7208 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7209
7210 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
7211 MemorySanitizerVisitor &MSV)
7212 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
7213 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
7214 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
7215 if (Attr.isStringAttribute() &&
7216 (Attr.getKindAsString() == "target-features")) {
7217 if (Attr.getValueAsString().contains("-sse"))
7218 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
7219 break;
7220 }
7221 }
7222 }
7223
7224 ArgKind classifyArgument(Value *arg) {
7225 // A very rough approximation of X86_64 argument classification rules.
7226 Type *T = arg->getType();
7227 if (T->isX86_FP80Ty())
7228 return AK_Memory;
7229 if (T->isFPOrFPVectorTy())
7230 return AK_FloatingPoint;
7231 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
7232 return AK_GeneralPurpose;
7233 if (T->isPointerTy())
7234 return AK_GeneralPurpose;
7235 return AK_Memory;
7236 }
7237
7238 // For VarArg functions, store the argument shadow in an ABI-specific format
7239 // that corresponds to va_list layout.
7240 // We do this because Clang lowers va_arg in the frontend, and this pass
7241 // only sees the low level code that deals with va_list internals.
7242 // A much easier alternative (provided that Clang emits va_arg instructions)
7243 // would have been to associate each live instance of va_list with a copy of
7244 // MSanParamTLS, and extract shadow on va_arg() call in the argument list
7245 // order.
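// Resulting __msan_va_arg_tls layout (mirroring the AMD64 register save
// area, with SSE enabled): bytes [0, 48) hold GP register argument shadow,
// [48, 176) hold FP/SSE register argument shadow (16 bytes per slot), and
// overflow (stack) arguments follow from offset 176.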
7246 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7247 unsigned GpOffset = 0;
7248 unsigned FpOffset = AMD64GpEndOffset;
7249 unsigned OverflowOffset = AMD64FpEndOffset;
7250 const DataLayout &DL = F.getDataLayout();
7251
7252 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7253 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7254 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7255 if (IsByVal) {
7256 // ByVal arguments always go to the overflow area.
7257 // Fixed arguments passed through the overflow area will be stepped
7258 // over by va_start, so don't count them towards the offset.
7259 if (IsFixed)
7260 continue;
7261 assert(A->getType()->isPointerTy());
7262 Type *RealTy = CB.getParamByValType(ArgNo);
7263 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7264 uint64_t AlignedSize = alignTo(ArgSize, 8);
7265 unsigned BaseOffset = OverflowOffset;
7266 Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7267 Value *OriginBase = nullptr;
7268 if (MS.TrackOrigins)
7269 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7270 OverflowOffset += AlignedSize;
7271
7272 if (OverflowOffset > kParamTLSSize) {
7273 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7274 continue; // We have no space to copy shadow there.
7275 }
7276
7277 Value *ShadowPtr, *OriginPtr;
7278 std::tie(ShadowPtr, OriginPtr) =
7279 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
7280 /*isStore*/ false);
7281 IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
7282 kShadowTLSAlignment, ArgSize);
7283 if (MS.TrackOrigins)
7284 IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
7285 kShadowTLSAlignment, ArgSize);
7286 } else {
7287 ArgKind AK = classifyArgument(A);
7288 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
7289 AK = AK_Memory;
7290 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
7291 AK = AK_Memory;
7292 Value *ShadowBase, *OriginBase = nullptr;
7293 switch (AK) {
7294 case AK_GeneralPurpose:
7295 ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
7296 if (MS.TrackOrigins)
7297 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
7298 GpOffset += 8;
7299 assert(GpOffset <= kParamTLSSize);
7300 break;
7301 case AK_FloatingPoint:
7302 ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
7303 if (MS.TrackOrigins)
7304 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7305 FpOffset += 16;
7306 assert(FpOffset <= kParamTLSSize);
7307 break;
7308 case AK_Memory:
7309 if (IsFixed)
7310 continue;
7311 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7312 uint64_t AlignedSize = alignTo(ArgSize, 8);
7313 unsigned BaseOffset = OverflowOffset;
7314 ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7315 if (MS.TrackOrigins) {
7316 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7317 }
7318 OverflowOffset += AlignedSize;
7319 if (OverflowOffset > kParamTLSSize) {
7320 // We have no space to copy shadow there.
7321 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7322 continue;
7323 }
7324 }
7325 // Take fixed arguments into account for GpOffset and FpOffset,
7326 // but don't actually store shadows for them.
7327 // TODO(glider): don't call get*PtrForVAArgument() for them.
7328 if (IsFixed)
7329 continue;
7330 Value *Shadow = MSV.getShadow(A);
7331 IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
7332 if (MS.TrackOrigins) {
7333 Value *Origin = MSV.getOrigin(A);
7334 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7335 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7336 std::max(kShadowTLSAlignment, kMinOriginAlignment));
7337 }
7338 }
7339 }
7340 Constant *OverflowSize =
7341 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
7342 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7343 }
7344
7345 void finalizeInstrumentation() override {
7346 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7347 "finalizeInstrumentation called twice");
7348 if (!VAStartInstrumentationList.empty()) {
7349 // If there is a va_start in this function, make a backup copy of
7350 // va_arg_tls somewhere in the function entry block.
7351 IRBuilder<> IRB(MSV.FnPrologueEnd);
7352 VAArgOverflowSize =
7353 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7354 Value *CopySize = IRB.CreateAdd(
7355 ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
7356 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7357 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7358 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7359 CopySize, kShadowTLSAlignment, false);
7360
7361 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7362 Intrinsic::umin, CopySize,
7363 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7364 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7365 kShadowTLSAlignment, SrcSize);
7366 if (MS.TrackOrigins) {
7367 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7368 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
7369 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
7370 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
7371 }
7372 }
7373
7374 // Instrument va_start.
7375 // Copy va_list shadow from the backup copy of the TLS contents.
7376 for (CallInst *OrigInst : VAStartInstrumentationList) {
7377 NextNodeIRBuilder IRB(OrigInst);
7378 Value *VAListTag = OrigInst->getArgOperand(0);
7379
7380 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
7381 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7382 ConstantInt::get(MS.IntptrTy, 16)),
7383 MS.PtrTy);
7384 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7385 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7386 const Align Alignment = Align(16);
7387 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7388 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7389 Alignment, /*isStore*/ true);
7390 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7391 AMD64FpEndOffset);
7392 if (MS.TrackOrigins)
7393 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
7394 Alignment, AMD64FpEndOffset);
7395 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
7396 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7397 ConstantInt::get(MS.IntptrTy, 8)),
7398 MS.PtrTy);
7399 Value *OverflowArgAreaPtr =
7400 IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
7401 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
7402 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
7403 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
7404 Alignment, /*isStore*/ true);
7405 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
7406 AMD64FpEndOffset);
7407 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
7408 VAArgOverflowSize);
7409 if (MS.TrackOrigins) {
7410 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
7411 AMD64FpEndOffset);
7412 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
7413 VAArgOverflowSize);
7414 }
7415 }
7416 }
7417};
7418
7419/// AArch64-specific implementation of VarArgHelper.
7420struct VarArgAArch64Helper : public VarArgHelperBase {
7421 static const unsigned kAArch64GrArgSize = 64;
7422 static const unsigned kAArch64VrArgSize = 128;
7423
7424 static const unsigned AArch64GrBegOffset = 0;
7425 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
7426 // Make VR space aligned to 16 bytes.
7427 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
7428 static const unsigned AArch64VrEndOffset =
7429 AArch64VrBegOffset + kAArch64VrArgSize;
7430 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
7431
7432 AllocaInst *VAArgTLSCopy = nullptr;
7433 Value *VAArgOverflowSize = nullptr;
7434
7435 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7436
7437 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
7438 MemorySanitizerVisitor &MSV)
7439 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
7440
7441 // A very rough approximation of aarch64 argument classification rules.
7442 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
7443 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
7444 return {AK_GeneralPurpose, 1};
7445 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
7446 return {AK_FloatingPoint, 1};
7447
7448 if (T->isArrayTy()) {
7449 auto R = classifyArgument(T->getArrayElementType());
7450 R.second *= T->getScalarType()->getArrayNumElements();
7451 return R;
7452 }
7453
7454 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
7455 auto R = classifyArgument(FV->getScalarType());
7456 R.second *= FV->getNumElements();
7457 return R;
7458 }
7459
7460 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
7461 return {AK_Memory, 0};
7462 }
7463
7464 // The instrumentation stores the argument shadow in a non-ABI-specific
7465 // format because it does not know which arguments are named (since Clang,
7466 // as in the x86_64 case, lowers va_arg in the frontend and this pass only
7467 // sees the low-level code that deals with va_list internals).
7468 // The first eight GR registers are saved in the first 64 bytes of the
7469 // va_arg TLS array, followed by the first eight FP/SIMD registers, and
7470 // then the remaining arguments.
7471 // Using constant offsets within the va_arg TLS array allows a fast copy
7472 // in finalizeInstrumentation.
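// Resulting __msan_va_arg_tls layout, per the constants above: bytes [0, 64)
// hold GR argument shadow, [64, 192) hold VR (FP/SIMD) argument shadow, and
// stack (overflow) arguments follow from offset 192.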
7473 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7474 unsigned GrOffset = AArch64GrBegOffset;
7475 unsigned VrOffset = AArch64VrBegOffset;
7476 unsigned OverflowOffset = AArch64VAEndOffset;
7477
7478 const DataLayout &DL = F.getDataLayout();
7479 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7480 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7481 auto [AK, RegNum] = classifyArgument(A->getType());
7482 if (AK == AK_GeneralPurpose &&
7483 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
7484 AK = AK_Memory;
7485 if (AK == AK_FloatingPoint &&
7486 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
7487 AK = AK_Memory;
7488 Value *Base;
7489 switch (AK) {
7490 case AK_GeneralPurpose:
7491 Base = getShadowPtrForVAArgument(IRB, GrOffset);
7492 GrOffset += 8 * RegNum;
7493 break;
7494 case AK_FloatingPoint:
7495 Base = getShadowPtrForVAArgument(IRB, VrOffset);
7496 VrOffset += 16 * RegNum;
7497 break;
7498 case AK_Memory:
7499 // Don't count fixed arguments in the overflow area - va_start will
7500 // skip right over them.
7501 if (IsFixed)
7502 continue;
7503 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7504 uint64_t AlignedSize = alignTo(ArgSize, 8);
7505 unsigned BaseOffset = OverflowOffset;
7506 Base = getShadowPtrForVAArgument(IRB, BaseOffset);
7507 OverflowOffset += AlignedSize;
7508 if (OverflowOffset > kParamTLSSize) {
7509 // We have no space to copy shadow there.
7510 CleanUnusedTLS(IRB, Base, BaseOffset);
7511 continue;
7512 }
7513 break;
7514 }
7515 // Count Gp/Vr fixed arguments to their respective offsets, but don't
7516 // bother to actually store a shadow.
7517 if (IsFixed)
7518 continue;
7519 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7520 }
7521 Constant *OverflowSize =
7522 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
7523 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7524 }
7525
7526 // Retrieve a va_list field of 'void*' size.
7527 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7528 Value *SaveAreaPtrPtr = IRB.CreateIntToPtr(
7529 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7530 ConstantInt::get(MS.IntptrTy, offset)),
7531 MS.PtrTy);
7532 return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
7533 }
7534
7535 // Retrieve a va_list field of 'int' size.
7536 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7537 Value *SaveAreaPtr = IRB.CreateIntToPtr(
7538 IRB.CreateAdd(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
7539 ConstantInt::get(MS.IntptrTy, offset)),
7540 MS.PtrTy);
7541 Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
7542 return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
7543 }
7544
7545 void finalizeInstrumentation() override {
7546 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7547 "finalizeInstrumentation called twice");
7548 if (!VAStartInstrumentationList.empty()) {
7549 // If there is a va_start in this function, make a backup copy of
7550 // va_arg_tls somewhere in the function entry block.
7551 IRBuilder<> IRB(MSV.FnPrologueEnd);
7552 VAArgOverflowSize =
7553 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7554 Value *CopySize = IRB.CreateAdd(
7555 ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
7556 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7557 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7558 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7559 CopySize, kShadowTLSAlignment, false);
7560
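      // CopySize can exceed kParamTLSSize when the overflow arguments did not
      // fit into the TLS array, so clamp the copy to the bytes that were
      // actually written there; the rest of the backup stays zero-initialized.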
7561 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7562 Intrinsic::umin, CopySize,
7563 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7564 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7565 kShadowTLSAlignment, SrcSize);
7566 }
7567
7568 Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
7569 Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
7570
7571     // Instrument va_start: copy the va_list shadow from the backup copy of
7572     // the TLS contents.
7573 for (CallInst *OrigInst : VAStartInstrumentationList) {
7574 NextNodeIRBuilder IRB(OrigInst);
7575
7576 Value *VAListTag = OrigInst->getArgOperand(0);
7577
7578       // The variadic ABI for AArch64 creates two register save areas for the
7579       // incoming arguments (one for the 64-bit general-purpose registers that
7580       // may hold variadic values, xn-x7, and another for the 128-bit FP/SIMD
7581       // registers vn-v7).
7582       // We then need to propagate the shadow onto both regions,
7583       // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs',
7584       // and the shadow of the remaining arguments onto 'va::stack'.
7585       // One caveat is that only the unnamed (variadic) arguments need to be
7586       // propagated, whereas the call-site instrumentation saved the shadow of
7587       // 'all' the arguments. So, when copying shadow values out of the va_arg
7588       // TLS array, we adjust the offsets of both the GR and VR regions by the
7589       // __{gr,vr}_offs values (which depend on the number of named arguments).
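      // For reference, the AAPCS64 va_list that the constant offsets below
      // index into is laid out as:
      //   struct va_list {
      //     void *__stack;   // offset 0:  next stack (overflow) argument
      //     void *__gr_top;  // offset 8:  end of the GR register save area
      //     void *__vr_top;  // offset 16: end of the VR register save area
      //     int   __gr_offs; // offset 24: negative offset from __gr_top
      //     int   __vr_offs; // offset 28: negative offset from __vr_top
      //   };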
7590 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
7591
7592 // Read the stack pointer from the va_list.
7593 Value *StackSaveAreaPtr =
7594 IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
7595
7596 // Read both the __gr_top and __gr_off and add them up.
7597 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
7598 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
7599
7600 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
7601 IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
7602
7603 // Read both the __vr_top and __vr_off and add them up.
7604 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
7605 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
7606
7607 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
7608 IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
7609
7610       // We do not know how many named arguments were used, and at the call
7611       // site all the arguments were saved. Since __gr_offs is defined as
7612       // '0 - ((8 - named_gr) * 8)', the idea is to propagate only the variadic
7613       // arguments by skipping the shadow bytes that belong to named arguments.
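      // For example, with two named GR arguments and kAArch64GrArgSize == 64,
      // __gr_offs == -(8 - 2) * 8 == -48, so the source offset below becomes
      // 64 - 48 == 16 (skipping the shadow of x0 and x1) and the copy size
      // becomes 64 - 16 == 48 bytes, covering x2-x7.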
7614 Value *GrRegSaveAreaShadowPtrOff =
7615 IRB.CreateAdd(GrArgSize, GrOffSaveArea);
7616
7617 Value *GrRegSaveAreaShadowPtr =
7618 MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7619 Align(8), /*isStore*/ true)
7620 .first;
7621
7622 Value *GrSrcPtr =
7623 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
7624 Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
7625
7626 IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
7627 GrCopySize);
7628
7629 // Again, but for FP/SIMD values.
7630 Value *VrRegSaveAreaShadowPtrOff =
7631 IRB.CreateAdd(VrArgSize, VrOffSaveArea);
7632
7633 Value *VrRegSaveAreaShadowPtr =
7634 MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7635 Align(8), /*isStore*/ true)
7636 .first;
7637
7638 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
7639 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
7640 IRB.getInt32(AArch64VrBegOffset)),
7641 VrRegSaveAreaShadowPtrOff);
7642 Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
7643
7644 IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
7645 VrCopySize);
7646
7647 // And finally for remaining arguments.
7648 Value *StackSaveAreaShadowPtr =
7649 MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
7650 Align(16), /*isStore*/ true)
7651 .first;
7652
7653 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
7654 VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
7655
7656 IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
7657 Align(16), VAArgOverflowSize);
7658 }
7659 }
7660};
7661
7662/// PowerPC64-specific implementation of VarArgHelper.
7663struct VarArgPowerPC64Helper : public VarArgHelperBase {
7664 AllocaInst *VAArgTLSCopy = nullptr;
7665 Value *VAArgSize = nullptr;
7666
7667 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
7668 MemorySanitizerVisitor &MSV)
7669 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
7670
7671 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7672     // For PowerPC, we need to deal with the alignment of stack arguments -
7673     // they are mostly aligned to 8 bytes, but vectors and i128 arrays
7674     // are aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
7675     // For that reason, we compute the current offset from the stack pointer
7676     // (which is always properly aligned) and the offset of the first vararg,
7677     // then subtract them.
7678 unsigned VAArgBase;
7679 Triple TargetTriple(F.getParent()->getTargetTriple());
7680 // Parameter save area starts at 48 bytes from frame pointer for ABIv1,
7681 // and 32 bytes for ABIv2. This is usually determined by target
7682     // endianness, but in theory could be overridden by a function attribute.
7683 if (TargetTriple.isPPC64ELFv2ABI())
7684 VAArgBase = 32;
7685 else
7686 VAArgBase = 48;
7687 unsigned VAArgOffset = VAArgBase;
7688 const DataLayout &DL = F.getDataLayout();
7689 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7690 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7691 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7692 if (IsByVal) {
7693 assert(A->getType()->isPointerTy());
7694 Type *RealTy = CB.getParamByValType(ArgNo);
7695 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7696 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
7697 if (ArgAlign < 8)
7698 ArgAlign = Align(8);
7699 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7700 if (!IsFixed) {
7701 Value *Base =
7702 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7703 if (Base) {
7704 Value *AShadowPtr, *AOriginPtr;
7705 std::tie(AShadowPtr, AOriginPtr) =
7706 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7707 kShadowTLSAlignment, /*isStore*/ false);
7708
7709 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7710 kShadowTLSAlignment, ArgSize);
7711 }
7712 }
7713 VAArgOffset += alignTo(ArgSize, Align(8));
7714 } else {
7715 Value *Base;
7716 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7717 Align ArgAlign = Align(8);
7718 if (A->getType()->isArrayTy()) {
7719 // Arrays are aligned to element size, except for long double
7720 // arrays, which are aligned to 8 bytes.
7721 Type *ElementTy = A->getType()->getArrayElementType();
7722 if (!ElementTy->isPPC_FP128Ty())
7723 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7724 } else if (A->getType()->isVectorTy()) {
7725 // Vectors are naturally aligned.
7726 ArgAlign = Align(ArgSize);
7727 }
7728 if (ArgAlign < 8)
7729 ArgAlign = Align(8);
7730 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7731 if (DL.isBigEndian()) {
7732         // Adjust the shadow for arguments with size < 8 to match the
7733         // placement of bits on a big-endian system.
7734 if (ArgSize < 8)
7735 VAArgOffset += (8 - ArgSize);
7736 }
7737 if (!IsFixed) {
7738 Base =
7739 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7740 if (Base)
7741 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7742 }
7743 VAArgOffset += ArgSize;
7744 VAArgOffset = alignTo(VAArgOffset, Align(8));
7745 }
7746 if (IsFixed)
7747 VAArgBase = VAArgOffset;
7748 }
7749
7750 Constant *TotalVAArgSize =
7751 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7752     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7753     // class member; here it holds the total size of all varargs.
7754 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7755 }
7756
7757 void finalizeInstrumentation() override {
7758 assert(!VAArgSize && !VAArgTLSCopy &&
7759 "finalizeInstrumentation called twice");
7760 IRBuilder<> IRB(MSV.FnPrologueEnd);
7761 VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7762 Value *CopySize = VAArgSize;
7763
7764 if (!VAStartInstrumentationList.empty()) {
7765 // If there is a va_start in this function, make a backup copy of
7766 // va_arg_tls somewhere in the function entry block.
7767
7768 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7769 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7770 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7771 CopySize, kShadowTLSAlignment, false);
7772
7773 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7774 Intrinsic::umin, CopySize,
7775 ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
7776 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7777 kShadowTLSAlignment, SrcSize);
7778 }
7779
7780 // Instrument va_start.
7781 // Copy va_list shadow from the backup copy of the TLS contents.
7782 for (CallInst *OrigInst : VAStartInstrumentationList) {
7783 NextNodeIRBuilder IRB(OrigInst);
7784 Value *VAListTag = OrigInst->getArgOperand(0);
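      // On PPC64 va_list is a simple pointer, so the va_list tag itself holds
      // the address of the parameter save area whose shadow we overwrite.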
7785 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7786
7787 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7788
7789 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7790 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7791 const DataLayout &DL = F.getDataLayout();
7792 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7793 const Align Alignment = Align(IntptrSize);
7794 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7795 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7796 Alignment, /*isStore*/ true);
7797 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7798 CopySize);
7799 }
7800 }
7801};
7802
7803/// PowerPC32-specific implementation of VarArgHelper.
7804struct VarArgPowerPC32Helper : public VarArgHelperBase {
7805 AllocaInst *VAArgTLSCopy = nullptr;
7806 Value *VAArgSize = nullptr;
7807
7808 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
7809 MemorySanitizerVisitor &MSV)
7810 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
7811
7812 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7813 unsigned VAArgBase;
7814 // Parameter save area is 8 bytes from frame pointer in PPC32
7815 VAArgBase = 8;
7816 unsigned VAArgOffset = VAArgBase;
7817 const DataLayout &DL = F.getDataLayout();
7818 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7819 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7820 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7821 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7822 if (IsByVal) {
7823 assert(A->getType()->isPointerTy());
7824 Type *RealTy = CB.getParamByValType(ArgNo);
7825 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7826 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
7827 if (ArgAlign < IntptrSize)
7828 ArgAlign = Align(IntptrSize);
7829 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7830 if (!IsFixed) {
7831 Value *Base =
7832 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7833 if (Base) {
7834 Value *AShadowPtr, *AOriginPtr;
7835 std::tie(AShadowPtr, AOriginPtr) =
7836 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7837 kShadowTLSAlignment, /*isStore*/ false);
7838
7839 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7840 kShadowTLSAlignment, ArgSize);
7841 }
7842 }
7843 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
7844 } else {
7845 Value *Base;
7846 Type *ArgTy = A->getType();
7847
7848           // On PPC32, floating-point variable arguments are stored in a
7849           // separate area: fp_save_area = reg_save_area + 4*8. We do not copy
7850           // shadow for them, as they will be found when checking call arguments.
7851 if (!ArgTy->isFloatingPointTy()) {
7852 uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
7853 Align ArgAlign = Align(IntptrSize);
7854 if (ArgTy->isArrayTy()) {
7855 // Arrays are aligned to element size, except for long double
7856 // arrays, which are aligned to 8 bytes.
7857 Type *ElementTy = ArgTy->getArrayElementType();
7858 if (!ElementTy->isPPC_FP128Ty())
7859 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7860 } else if (ArgTy->isVectorTy()) {
7861 // Vectors are naturally aligned.
7862 ArgAlign = Align(ArgSize);
7863 }
7864 if (ArgAlign < IntptrSize)
7865 ArgAlign = Align(IntptrSize);
7866 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7867 if (DL.isBigEndian()) {
7868           // Adjust the shadow for arguments with size < IntptrSize to match
7869           // the placement of bits on a big-endian system.
7870 if (ArgSize < IntptrSize)
7871 VAArgOffset += (IntptrSize - ArgSize);
7872 }
7873 if (!IsFixed) {
7874 Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
7875 ArgSize);
7876 if (Base)
7877               IRB.CreateAlignedStore(MSV.getShadow(A), Base,
7878                                      kShadowTLSAlignment);
7879           }
7880 VAArgOffset += ArgSize;
7881 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
7882 }
7883 }
7884 }
7885
7886 Constant *TotalVAArgSize =
7887 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7888     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
7889     // class member; here it holds the total size of all varargs.
7890 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7891 }
7892
7893 void finalizeInstrumentation() override {
7894 assert(!VAArgSize && !VAArgTLSCopy &&
7895 "finalizeInstrumentation called twice");
7896 IRBuilder<> IRB(MSV.FnPrologueEnd);
7897 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
7898 Value *CopySize = VAArgSize;
7899
7900 if (!VAStartInstrumentationList.empty()) {
7901 // If there is a va_start in this function, make a backup copy of
7902 // va_arg_tls somewhere in the function entry block.
7903
7904 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7905 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7906 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7907 CopySize, kShadowTLSAlignment, false);
7908
7909 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7910 Intrinsic::umin, CopySize,
7911 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7912 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7913 kShadowTLSAlignment, SrcSize);
7914 }
7915
7916 // Instrument va_start.
7917 // Copy va_list shadow from the backup copy of the TLS contents.
7918 for (CallInst *OrigInst : VAStartInstrumentationList) {
7919 NextNodeIRBuilder IRB(OrigInst);
7920 Value *VAListTag = OrigInst->getArgOperand(0);
7921 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7922 Value *RegSaveAreaSize = CopySize;
7923
7924       // In PPC32 va_list_tag is a struct rather than a single pointer.
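      // Its layout is, roughly:
      //   { char gpr; char fpr; i16 reserved;
      //     i8 *overflow_arg_area; i8 *reg_save_area }
      // so the reg_save_area pointer is read from offset 8 here, and the
      // overflow_arg_area pointer from offset 4 in the overflow copy below.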
7925 RegSaveAreaPtrPtr =
7926 IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
7927
7928       // On PPC32 the reg_save_area can only hold 32 bytes of data
7929 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
7930 Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
7931
7932 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7933 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7934
7935 const DataLayout &DL = F.getDataLayout();
7936 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7937 const Align Alignment = Align(IntptrSize);
7938
7939 { // Copy reg save area
7940 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7941 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7942 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7943 Alignment, /*isStore*/ true);
7944 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
7945 Alignment, RegSaveAreaSize);
7946
7947 RegSaveAreaShadowPtr =
7948 IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
7949 Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
7950 ConstantInt::get(MS.IntptrTy, 32));
7951 FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
7952         // We fill the FP shadow with zeroes, as uninitialized FP args should
7953         // have been found during the call-site check.
7954 IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
7955 ConstantInt::get(MS.IntptrTy, 32), Alignment);
7956 }
7957
7958 { // Copy overflow area
7959 // RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
7960 Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
7961
7962 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7963 OverflowAreaPtrPtr =
7964 IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
7965 OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
7966
7967 Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
7968
7969 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
7970 std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
7971 MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
7972 Alignment, /*isStore*/ true);
7973
7974 Value *OverflowVAArgTLSCopyPtr =
7975 IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
7976 OverflowVAArgTLSCopyPtr =
7977 IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
7978
7979 OverflowVAArgTLSCopyPtr =
7980 IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
7981 IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
7982 OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
7983 }
7984 }
7985 }
7986};
7987
7988/// SystemZ-specific implementation of VarArgHelper.
7989struct VarArgSystemZHelper : public VarArgHelperBase {
7990 static const unsigned SystemZGpOffset = 16;
7991 static const unsigned SystemZGpEndOffset = 56;
7992 static const unsigned SystemZFpOffset = 128;
7993 static const unsigned SystemZFpEndOffset = 160;
7994 static const unsigned SystemZMaxVrArgs = 8;
7995 static const unsigned SystemZRegSaveAreaSize = 160;
7996 static const unsigned SystemZOverflowOffset = 160;
7997 static const unsigned SystemZVAListTagSize = 32;
7998 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
7999 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
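  // These offsets follow the SystemZ __va_list_tag layout:
  //   { i64 __gpr; i64 __fpr; ptr __overflow_arg_area; ptr __reg_save_area }
  // i.e. the overflow-area pointer lives at offset 16 and the register save
  // area pointer at offset 24.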
8000
8001 bool IsSoftFloatABI;
8002 AllocaInst *VAArgTLSCopy = nullptr;
8003 AllocaInst *VAArgTLSOriginCopy = nullptr;
8004 Value *VAArgOverflowSize = nullptr;
8005
8006 enum class ArgKind {
8007 GeneralPurpose,
8008 FloatingPoint,
8009 Vector,
8010 Memory,
8011 Indirect,
8012 };
8013
8014 enum class ShadowExtension { None, Zero, Sign };
8015
8016 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
8017 MemorySanitizerVisitor &MSV)
8018 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
8019 IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
8020
8021 ArgKind classifyArgument(Type *T) {
8022 // T is a SystemZABIInfo::classifyArgumentType() output, and there are
8023 // only a few possibilities of what it can be. In particular, enums, single
8024 // element structs and large types have already been taken care of.
8025
8026 // Some i128 and fp128 arguments are converted to pointers only in the
8027 // back end.
8028 if (T->isIntegerTy(128) || T->isFP128Ty())
8029 return ArgKind::Indirect;
8030 if (T->isFloatingPointTy())
8031 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
8032 if (T->isIntegerTy() || T->isPointerTy())
8033 return ArgKind::GeneralPurpose;
8034 if (T->isVectorTy())
8035 return ArgKind::Vector;
8036 return ArgKind::Memory;
8037 }
8038
8039 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
8040 // ABI says: "One of the simple integer types no more than 64 bits wide.
8041 // ... If such an argument is shorter than 64 bits, replace it by a full
8042 // 64-bit integer representing the same number, using sign or zero
8043 // extension". Shadow for an integer argument has the same type as the
8044 // argument itself, so it can be sign or zero extended as well.
8045 bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
8046 bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
8047 if (ZExt) {
8048 assert(!SExt);
8049 return ShadowExtension::Zero;
8050 }
8051 if (SExt) {
8052 assert(!ZExt);
8053 return ShadowExtension::Sign;
8054 }
8055 return ShadowExtension::None;
8056 }
8057
8058 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8059 unsigned GpOffset = SystemZGpOffset;
8060 unsigned FpOffset = SystemZFpOffset;
8061 unsigned VrIndex = 0;
8062 unsigned OverflowOffset = SystemZOverflowOffset;
8063 const DataLayout &DL = F.getDataLayout();
8064 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8065 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8066 // SystemZABIInfo does not produce ByVal parameters.
8067 assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
8068 Type *T = A->getType();
8069 ArgKind AK = classifyArgument(T);
8070 if (AK == ArgKind::Indirect) {
8071 T = MS.PtrTy;
8072 AK = ArgKind::GeneralPurpose;
8073 }
8074 if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
8075 AK = ArgKind::Memory;
8076 if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
8077 AK = ArgKind::Memory;
8078 if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
8079 AK = ArgKind::Memory;
8080 Value *ShadowBase = nullptr;
8081 Value *OriginBase = nullptr;
8082 ShadowExtension SE = ShadowExtension::None;
8083 switch (AK) {
8084 case ArgKind::GeneralPurpose: {
8085 // Always keep track of GpOffset, but store shadow only for varargs.
8086 uint64_t ArgSize = 8;
8087 if (GpOffset + ArgSize <= kParamTLSSize) {
8088 if (!IsFixed) {
8089 SE = getShadowExtension(CB, ArgNo);
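            // SystemZ is big-endian, so an unextended sub-64-bit value occupies
            // the rightmost bytes of its 8-byte slot; shift the shadow by the
            // leading gap so it lines up with those bytes.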
8090 uint64_t GapSize = 0;
8091 if (SE == ShadowExtension::None) {
8092 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8093 assert(ArgAllocSize <= ArgSize);
8094 GapSize = ArgSize - ArgAllocSize;
8095 }
8096 ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
8097 if (MS.TrackOrigins)
8098 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
8099 }
8100 GpOffset += ArgSize;
8101 } else {
8102 GpOffset = kParamTLSSize;
8103 }
8104 break;
8105 }
8106 case ArgKind::FloatingPoint: {
8107 // Always keep track of FpOffset, but store shadow only for varargs.
8108 uint64_t ArgSize = 8;
8109 if (FpOffset + ArgSize <= kParamTLSSize) {
8110 if (!IsFixed) {
8111 // PoP says: "A short floating-point datum requires only the
8112 // left-most 32 bit positions of a floating-point register".
8113 // Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
8114 // don't extend shadow and don't mind the gap.
8115 ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
8116 if (MS.TrackOrigins)
8117 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
8118 }
8119 FpOffset += ArgSize;
8120 } else {
8121 FpOffset = kParamTLSSize;
8122 }
8123 break;
8124 }
8125 case ArgKind::Vector: {
8126 // Keep track of VrIndex. No need to store shadow, since vector varargs
8127 // go through AK_Memory.
8128 assert(IsFixed);
8129 VrIndex++;
8130 break;
8131 }
8132 case ArgKind::Memory: {
8133 // Keep track of OverflowOffset and store shadow only for varargs.
8134 // Ignore fixed args, since we need to copy only the vararg portion of
8135 // the overflow area shadow.
8136 if (!IsFixed) {
8137 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8138 uint64_t ArgSize = alignTo(ArgAllocSize, 8);
8139 if (OverflowOffset + ArgSize <= kParamTLSSize) {
8140 SE = getShadowExtension(CB, ArgNo);
8141 uint64_t GapSize =
8142 SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
8143 ShadowBase =
8144 getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
8145 if (MS.TrackOrigins)
8146 OriginBase =
8147 getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
8148 OverflowOffset += ArgSize;
8149 } else {
8150 OverflowOffset = kParamTLSSize;
8151 }
8152 }
8153 break;
8154 }
8155 case ArgKind::Indirect:
8156 llvm_unreachable("Indirect must be converted to GeneralPurpose");
8157 }
8158 if (ShadowBase == nullptr)
8159 continue;
8160 Value *Shadow = MSV.getShadow(A);
8161 if (SE != ShadowExtension::None)
8162 Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
8163 /*Signed*/ SE == ShadowExtension::Sign);
8164 ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
8165 IRB.CreateStore(Shadow, ShadowBase);
8166 if (MS.TrackOrigins) {
8167 Value *Origin = MSV.getOrigin(A);
8168 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
8169         MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
8170                         kMinOriginAlignment);
8171       }
8172 }
8173 Constant *OverflowSize = ConstantInt::get(
8174 IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
8175 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
8176 }
8177
8178 void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
8179 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
8180 IRB.CreateAdd(
8181 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8182 ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
8183 MS.PtrTy);
8184 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8185 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8186 const Align Alignment = Align(8);
8187 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8188 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
8189 /*isStore*/ true);
8190 // TODO(iii): copy only fragments filled by visitCallBase()
8191 // TODO(iii): support packed-stack && !use-soft-float
8192 // For use-soft-float functions, it is enough to copy just the GPRs.
8193 unsigned RegSaveAreaSize =
8194 IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
8195 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8196 RegSaveAreaSize);
8197 if (MS.TrackOrigins)
8198 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
8199 Alignment, RegSaveAreaSize);
8200 }
8201
8202 // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
8203 // don't know the real overflow size and can't clear shadow beyond kParamTLSSize.
8204 void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
8205 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
8206 IRB.CreateAdd(
8207 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8208 ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
8209 MS.PtrTy);
8210 Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
8211 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
8212 const Align Alignment = Align(8);
8213 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
8214 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
8215 Alignment, /*isStore*/ true);
8216 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
8217 SystemZOverflowOffset);
8218 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
8219 VAArgOverflowSize);
8220 if (MS.TrackOrigins) {
8221 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
8222 SystemZOverflowOffset);
8223 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
8224 VAArgOverflowSize);
8225 }
8226 }
8227
8228 void finalizeInstrumentation() override {
8229 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
8230 "finalizeInstrumentation called twice");
8231 if (!VAStartInstrumentationList.empty()) {
8232 // If there is a va_start in this function, make a backup copy of
8233 // va_arg_tls somewhere in the function entry block.
8234 IRBuilder<> IRB(MSV.FnPrologueEnd);
8235 VAArgOverflowSize =
8236 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8237 Value *CopySize =
8238 IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
8239 VAArgOverflowSize);
8240 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8241 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8242 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8243 CopySize, kShadowTLSAlignment, false);
8244
8245 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8246 Intrinsic::umin, CopySize,
8247 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8248 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8249 kShadowTLSAlignment, SrcSize);
8250 if (MS.TrackOrigins) {
8251 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8252 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
8253 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
8254 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
8255 }
8256 }
8257
8258 // Instrument va_start.
8259 // Copy va_list shadow from the backup copy of the TLS contents.
8260 for (CallInst *OrigInst : VAStartInstrumentationList) {
8261 NextNodeIRBuilder IRB(OrigInst);
8262 Value *VAListTag = OrigInst->getArgOperand(0);
8263 copyRegSaveArea(IRB, VAListTag);
8264 copyOverflowArea(IRB, VAListTag);
8265 }
8266 }
8267};
8268
8269/// i386-specific implementation of VarArgHelper.
8270struct VarArgI386Helper : public VarArgHelperBase {
8271 AllocaInst *VAArgTLSCopy = nullptr;
8272 Value *VAArgSize = nullptr;
8273
8274 VarArgI386Helper(Function &F, MemorySanitizer &MS,
8275 MemorySanitizerVisitor &MSV)
8276 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
8277
8278 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8279 const DataLayout &DL = F.getDataLayout();
8280 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8281 unsigned VAArgOffset = 0;
8282 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8283 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8284 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8285 if (IsByVal) {
8286 assert(A->getType()->isPointerTy());
8287 Type *RealTy = CB.getParamByValType(ArgNo);
8288 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8289 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8290 if (ArgAlign < IntptrSize)
8291 ArgAlign = Align(IntptrSize);
8292 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8293 if (!IsFixed) {
8294 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8295 if (Base) {
8296 Value *AShadowPtr, *AOriginPtr;
8297 std::tie(AShadowPtr, AOriginPtr) =
8298 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8299 kShadowTLSAlignment, /*isStore*/ false);
8300
8301 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8302 kShadowTLSAlignment, ArgSize);
8303 }
8304 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8305 }
8306 } else {
8307 Value *Base;
8308 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8309 Align ArgAlign = Align(IntptrSize);
8310 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8311 if (DL.isBigEndian()) {
8312           // Adjust the shadow for arguments with size < IntptrSize to match
8313           // the placement of bits on a big-endian system.
8314 if (ArgSize < IntptrSize)
8315 VAArgOffset += (IntptrSize - ArgSize);
8316 }
8317 if (!IsFixed) {
8318 Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8319 if (Base)
8320 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8321 VAArgOffset += ArgSize;
8322 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8323 }
8324 }
8325 }
8326
8327 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8328     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8329     // class member; here it holds the total size of all varargs.
8330 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8331 }
8332
8333 void finalizeInstrumentation() override {
8334 assert(!VAArgSize && !VAArgTLSCopy &&
8335 "finalizeInstrumentation called twice");
8336 IRBuilder<> IRB(MSV.FnPrologueEnd);
8337 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8338 Value *CopySize = VAArgSize;
8339
8340 if (!VAStartInstrumentationList.empty()) {
8341 // If there is a va_start in this function, make a backup copy of
8342 // va_arg_tls somewhere in the function entry block.
8343 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8344 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8345 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8346 CopySize, kShadowTLSAlignment, false);
8347
8348 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8349 Intrinsic::umin, CopySize,
8350 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8351 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8352 kShadowTLSAlignment, SrcSize);
8353 }
8354
8355 // Instrument va_start.
8356 // Copy va_list shadow from the backup copy of the TLS contents.
8357 for (CallInst *OrigInst : VAStartInstrumentationList) {
8358 NextNodeIRBuilder IRB(OrigInst);
8359 Value *VAListTag = OrigInst->getArgOperand(0);
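      // On i386 va_list is a single pointer to the next stack argument, so
      // loading through the tag yields the base of the argument area whose
      // shadow is overwritten below.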
8360 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8361 Value *RegSaveAreaPtrPtr =
8362 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8363 PointerType::get(*MS.C, 0));
8364 Value *RegSaveAreaPtr =
8365 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8366 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8367 const DataLayout &DL = F.getDataLayout();
8368 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8369 const Align Alignment = Align(IntptrSize);
8370 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8371 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8372 Alignment, /*isStore*/ true);
8373 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8374 CopySize);
8375 }
8376 }
8377};
8378
8379/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
8380/// LoongArch64.
8381struct VarArgGenericHelper : public VarArgHelperBase {
8382 AllocaInst *VAArgTLSCopy = nullptr;
8383 Value *VAArgSize = nullptr;
8384
8385 VarArgGenericHelper(Function &F, MemorySanitizer &MS,
8386 MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
8387 : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
8388
8389 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8390 unsigned VAArgOffset = 0;
8391 const DataLayout &DL = F.getDataLayout();
8392 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8393 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8394 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8395 if (IsFixed)
8396 continue;
8397 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8398 if (DL.isBigEndian()) {
8399         // Adjust the shadow for arguments with size < IntptrSize to match the
8400         // placement of bits on a big-endian system.
8401 if (ArgSize < IntptrSize)
8402 VAArgOffset += (IntptrSize - ArgSize);
8403 }
8404 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8405 VAArgOffset += ArgSize;
8406 VAArgOffset = alignTo(VAArgOffset, IntptrSize);
8407 if (!Base)
8408 continue;
8409 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8410 }
8411
8412 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8413     // We reuse VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a new
8414     // class member; here it holds the total size of all varargs.
8415 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8416 }
8417
8418 void finalizeInstrumentation() override {
8419 assert(!VAArgSize && !VAArgTLSCopy &&
8420 "finalizeInstrumentation called twice");
8421 IRBuilder<> IRB(MSV.FnPrologueEnd);
8422 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8423 Value *CopySize = VAArgSize;
8424
8425 if (!VAStartInstrumentationList.empty()) {
8426 // If there is a va_start in this function, make a backup copy of
8427 // va_arg_tls somewhere in the function entry block.
8428 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8429 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8430 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8431 CopySize, kShadowTLSAlignment, false);
8432
8433 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8434 Intrinsic::umin, CopySize,
8435 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8436 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8437 kShadowTLSAlignment, SrcSize);
8438 }
8439
8440 // Instrument va_start.
8441 // Copy va_list shadow from the backup copy of the TLS contents.
8442 for (CallInst *OrigInst : VAStartInstrumentationList) {
8443 NextNodeIRBuilder IRB(OrigInst);
8444 Value *VAListTag = OrigInst->getArgOperand(0);
8445 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8446 Value *RegSaveAreaPtrPtr =
8447 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8448 PointerType::get(*MS.C, 0));
8449 Value *RegSaveAreaPtr =
8450 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8451 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8452 const DataLayout &DL = F.getDataLayout();
8453 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8454 const Align Alignment = Align(IntptrSize);
8455 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8456 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8457 Alignment, /*isStore*/ true);
8458 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8459 CopySize);
8460 }
8461 }
8462};
8463
8464 // ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
8465// regarding VAArgs.
8466using VarArgARM32Helper = VarArgGenericHelper;
8467using VarArgRISCVHelper = VarArgGenericHelper;
8468using VarArgMIPSHelper = VarArgGenericHelper;
8469using VarArgLoongArch64Helper = VarArgGenericHelper;
8470
8471/// A no-op implementation of VarArgHelper.
8472struct VarArgNoOpHelper : public VarArgHelper {
8473 VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
8474 MemorySanitizerVisitor &MSV) {}
8475
8476 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
8477
8478 void visitVAStartInst(VAStartInst &I) override {}
8479
8480 void visitVACopyInst(VACopyInst &I) override {}
8481
8482 void finalizeInstrumentation() override {}
8483};
8484
8485} // end anonymous namespace
8486
8487static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
8488 MemorySanitizerVisitor &Visitor) {
8489   // VarArg handling is implemented only for the targets listed below; on any
8490   // other platform the no-op helper is used and false positives are possible.
8491 Triple TargetTriple(Func.getParent()->getTargetTriple());
8492
8493 if (TargetTriple.getArch() == Triple::x86)
8494 return new VarArgI386Helper(Func, Msan, Visitor);
8495
8496 if (TargetTriple.getArch() == Triple::x86_64)
8497 return new VarArgAMD64Helper(Func, Msan, Visitor);
8498
8499 if (TargetTriple.isARM())
8500 return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8501
8502 if (TargetTriple.isAArch64())
8503 return new VarArgAArch64Helper(Func, Msan, Visitor);
8504
8505 if (TargetTriple.isSystemZ())
8506 return new VarArgSystemZHelper(Func, Msan, Visitor);
8507
8508 // On PowerPC32 VAListTag is a struct
8509 // {char, char, i16 padding, char *, char *}
8510 if (TargetTriple.isPPC32())
8511 return new VarArgPowerPC32Helper(Func, Msan, Visitor);
8512
8513 if (TargetTriple.isPPC64())
8514 return new VarArgPowerPC64Helper(Func, Msan, Visitor);
8515
8516 if (TargetTriple.isRISCV32())
8517 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8518
8519 if (TargetTriple.isRISCV64())
8520 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8521
8522 if (TargetTriple.isMIPS32())
8523 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8524
8525 if (TargetTriple.isMIPS64())
8526 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8527
8528 if (TargetTriple.isLoongArch64())
8529 return new VarArgLoongArch64Helper(Func, Msan, Visitor,
8530 /*VAListTagSize=*/8);
8531
8532 return new VarArgNoOpHelper(Func, Msan, Visitor);
8533}
8534
8535bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
8536 if (!CompileKernel && F.getName() == kMsanModuleCtorName)
8537 return false;
8538
8539 if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
8540 return false;
8541
8542 MemorySanitizerVisitor Visitor(F, *this, TLI);
8543
8544   // Clear out memory attributes.
8545   AttributeMask B;
8546   B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
8547 F.removeFnAttrs(B);
8548
8549 return Visitor.runOnFunction();
8550}
Definition IRBuilder.h:533
LoadInst * CreateLoad(Type *Ty, Value *Ptr, const char *Name)
Provided to resolve 'CreateLoad(Ty, Ptr, "...")' correctly, instead of converting the string to 'bool...
Definition IRBuilder.h:1847
Value * CreateShl(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1492
CallInst * CreateMemSet(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memset to the specified pointer and the specified value.
Definition IRBuilder.h:630
Value * CreateZExt(Value *V, Type *DestTy, const Twine &Name="", bool IsNonNeg=false)
Definition IRBuilder.h:2082
Value * CreateShuffleVector(Value *V1, Value *V2, Value *Mask, const Twine &Name="")
Definition IRBuilder.h:2593
LLVMContext & getContext() const
Definition IRBuilder.h:203
Value * CreateAnd(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1551
StoreInst * CreateStore(Value *Val, Value *Ptr, bool isVolatile=false)
Definition IRBuilder.h:1860
LLVM_ABI CallInst * CreateMaskedStore(Value *Val, Value *Ptr, Align Alignment, Value *Mask)
Create a call to Masked Store intrinsic.
Value * CreateAdd(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1403
Value * CreatePtrToInt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2194
Value * CreateIsNotNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg != 0.
Definition IRBuilder.h:2651
CallInst * CreateCall(FunctionType *FTy, Value *Callee, ArrayRef< Value * > Args={}, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:2508
Value * CreateTrunc(Value *V, Type *DestTy, const Twine &Name="", bool IsNUW=false, bool IsNSW=false)
Definition IRBuilder.h:2068
PointerType * getPtrTy(unsigned AddrSpace=0)
Fetch the type representing a pointer.
Definition IRBuilder.h:605
Value * CreateBinOp(Instruction::BinaryOps Opc, Value *LHS, Value *RHS, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:1708
Value * CreateICmpSLT(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2361
LLVM_ABI Value * CreateTypeSize(Type *Ty, TypeSize Size)
Create an expression which evaluates to the number of units in Size at runtime.
Value * CreateICmpUGE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2341
Value * CreateIntCast(Value *V, Type *DestTy, bool isSigned, const Twine &Name="")
Definition IRBuilder.h:2277
Value * CreateIsNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg == 0.
Definition IRBuilder.h:2646
void SetInsertPoint(BasicBlock *TheBB)
This specifies that created instructions should be appended to the end of the specified block.
Definition IRBuilder.h:207
Type * getVoidTy()
Fetch the type representing void.
Definition IRBuilder.h:600
StoreInst * CreateAlignedStore(Value *Val, Value *Ptr, MaybeAlign Align, bool isVolatile=false)
Definition IRBuilder.h:1883
LLVM_ABI CallInst * CreateMaskedExpandLoad(Type *Ty, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Expand Load intrinsic.
Value * CreateInBoundsPtrAdd(Value *Ptr, Value *Offset, const Twine &Name="")
Definition IRBuilder.h:2041
Value * CreateAShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1532
Value * CreateXor(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1599
Value * CreateICmp(CmpInst::Predicate P, Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2439
Value * CreateOr(Value *LHS, Value *RHS, const Twine &Name="", bool IsDisjoint=false)
Definition IRBuilder.h:1573
IntegerType * getInt8Ty()
Fetch the type representing an 8-bit integer.
Definition IRBuilder.h:552
Value * CreateMul(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1437
LLVM_ABI CallInst * CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment, Value *Mask=nullptr)
Create a call to Masked Scatter intrinsic.
LLVM_ABI CallInst * CreateMaskedGather(Type *Ty, Value *Ptrs, Align Alignment, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Gather intrinsic.
This provides a uniform API for creating instructions and inserting them into a basic block: either a...
Definition IRBuilder.h:2780
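Most of the IRBuilder calls listed above appear in this file as small shadow-manipulation idioms. The helpers below sketch three such idioms under an invented and-mask address mapping; the helper names and the mapping formula are illustrative placeholders, not the pass's actual parameters.

#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// Map an application address to a (hypothetical) shadow address:
// shadow = addr & ~MappingMask.
static Value *mapToShadowPtr(IRBuilder<> &IRB, Value *Addr, uint64_t MappingMask) {
  Value *AddrInt = IRB.CreatePtrToInt(Addr, IRB.getInt64Ty(), "addr_int");
  Value *ShadowInt = IRB.CreateAnd(AddrInt, IRB.getInt64(~MappingMask), "shadow_int");
  return IRB.CreateIntToPtr(ShadowInt, IRB.getPtrTy(), "shadow_ptr");
}

// Propagate shadow through a binary operation: the result is poisoned if
// either operand is poisoned, so the operand shadows are OR'ed together.
static Value *propagateShadow(IRBuilder<> &IRB, Value *ShadowA, Value *ShadowB) {
  return IRB.CreateOr(ShadowA, ShadowB, "_msprop");
}

// Reduce a shadow value to an i1 "is poisoned" flag.
static Value *shadowToBool(IRBuilder<> &IRB, Value *Shadow) {
  return IRB.CreateICmpNE(Shadow, Constant::getNullValue(Shadow->getType()), "_mscmp");
}

Each helper expects the builder to already be positioned at the desired insertion point, e.g. via IRBuilder<> IRB(&SomeInstruction).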
std::vector< ConstraintInfo > ConstraintInfoVector
Definition InlineAsm.h:123
void visit(Iterator Start, Iterator End)
Definition InstVisitor.h:87
const DebugLoc & getDebugLoc() const
Return the debug location for this node as a DebugLoc.
LLVM_ABI InstListType::iterator eraseFromParent()
This method unlinks 'this' from the containing basic block and deletes it.
MDNode * getMetadata(unsigned KindID) const
Get the metadata of given kind attached to this Instruction.
LLVM_ABI bool comesBefore(const Instruction *Other) const
Given an instruction Other in the same basic block as this instruction, return true if this instructi...
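The Instruction queries above are the kind an instrumentation pass uses to reason about placement and reporting. A small sketch with hypothetical helper names:

#include <cassert>
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/Instruction.h"

using namespace llvm;

// True if Def is defined before Use in the same basic block, i.e. code
// inserted next to Use may safely refer to a value computed next to Def.
static bool definedBefore(const Instruction *Def, const Instruction *Use) {
  assert(Def->getParent() == Use->getParent() && "same-block query only");
  return Def->comesBefore(Use);
}

// Source location to attach to any report emitted for I; may be empty when
// the module was built without debug info.
static DebugLoc reportLocationFor(const Instruction &I) {
  return I.getDebugLoc();
}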
static LLVM_ABI IntegerType * get(LLVMContext &C, unsigned NumBits)
This static method is the primary way of constructing an IntegerType.
Definition Type.cpp:319
LLVM_ABI MDNode * createUnlikelyBranchWeights()
Return metadata containing two branch weights, with a significant bias towards the false destination.
Definition MDBuilder.cpp:48
A Module instance is used to store all the information related to an LLVM module.
Definition Module.h:67
void addIncoming(Value *V, BasicBlock *BB)
Add an incoming value to the end of the PHI list.
static LLVM_ABI PoisonValue * get(Type *T)
Static factory methods - Return a 'poison' object of the specified type.
A set of analyses that are preserved following a run of a transformation pass.
Definition Analysis.h:112
static PreservedAnalyses none()
Convenience factory function for the empty preserved set.
Definition Analysis.h:115
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition Analysis.h:118
PreservedAnalyses & abandon()
Mark an analysis as abandoned.
Definition Analysis.h:171
bool remove(const value_type &X)
Remove an item from the set vector.
Definition SetVector.h:198
bool insert(const value_type &X)
Insert a new element into the SetVector.
Definition SetVector.h:168
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
void push_back(const T &Elt)
StringRef - Represent a constant reference to a string, i.e.
Definition StringRef.h:55
static LLVM_ABI StructType * get(LLVMContext &Context, ArrayRef< Type * > Elements, bool isPacked=false)
This static method is the primary way to create a literal StructType.
Definition Type.cpp:414
unsigned getNumElements() const
Random access to the elements.
Type * getElementType(unsigned N) const
Analysis pass providing the TargetLibraryInfo.
Provides information about what library functions are available for the current target.
AttributeList getAttrList(LLVMContext *C, ArrayRef< unsigned > ArgNos, bool Signed, bool Ret=false, AttributeList AL=AttributeList()) const
bool getLibFunc(StringRef funcName, LibFunc &F) const
Searches for a particular function name.
Triple - Helper class for working with autoconf configuration names.
Definition Triple.h:47
bool isMIPS64() const
Tests whether the target is MIPS 64-bit (little and big endian).
Definition Triple.h:1030
@ loongarch64
Definition Triple.h:65
bool isRISCV32() const
Tests whether the target is 32-bit RISC-V.
Definition Triple.h:1073
bool isPPC32() const
Tests whether the target is 32-bit PowerPC (little and big endian).
Definition Triple.h:1046
ArchType getArch() const
Get the parsed architecture type of this triple.
Definition Triple.h:411
bool isRISCV64() const
Tests whether the target is 64-bit RISC-V.
Definition Triple.h:1078
bool isLoongArch64() const
Tests whether the target is 64-bit LoongArch.
Definition Triple.h:1019
bool isMIPS32() const
Tests whether the target is MIPS 32-bit (little and big endian).
Definition Triple.h:1025
bool isARM() const
Tests whether the target is ARM (little and big endian).
Definition Triple.h:914
bool isPPC64() const
Tests whether the target is 64-bit PowerPC (little and big endian).
Definition Triple.h:1051
bool isAArch64() const
Tests whether the target is AArch64 (little and big endian).
Definition Triple.h:998
bool isSystemZ() const
Tests whether the target is SystemZ.
Definition Triple.h:1097
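The Triple predicates above are typically used for per-target dispatch, for example when choosing shadow-mapping parameters. A sketch with placeholder constants, not the mappings actually used by this pass:

#include <cstdint>
#include <optional>
#include "llvm/TargetParser/Triple.h"

using namespace llvm;

struct ShadowParams {
  uint64_t AndMask;
  uint64_t XorMask;
};

static std::optional<ShadowParams> selectShadowParams(const Triple &TT) {
  if (TT.isAArch64())
    return ShadowParams{0, 0x0600000000000ULL}; // placeholder values
  if (TT.isMIPS64())
    return ShadowParams{0, 0x0008000000000ULL}; // placeholder values
  if (TT.isRISCV64() || TT.isLoongArch64())
    return ShadowParams{0, 0x0500000000000ULL}; // placeholder values
  return std::nullopt;                          // unsupported in this sketch
}

A Triple can be constructed from a target string, e.g. Triple TT("x86_64-unknown-linux-gnu"), or obtained from the module being instrumented.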
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45
LLVM_ABI unsigned getIntegerBitWidth() const
bool isVectorTy() const
True if this is an instance of VectorType.
Definition Type.h:273
bool isArrayTy() const
True if this is an instance of ArrayType.
Definition Type.h:264
bool isIntOrIntVectorTy() const
Return true if this is an integer type or a vector of integer types.
Definition Type.h:246
bool isPointerTy() const
True if this is an instance of PointerType.
Definition Type.h:267
Type * getArrayElementType() const
Definition Type.h:408
bool isPPC_FP128Ty() const
Return true if this is powerpc long double.
Definition Type.h:165
static LLVM_ABI Type * getVoidTy(LLVMContext &C)
Definition Type.cpp:281
Type * getScalarType() const
If this is a vector type, return the element type, otherwise return 'this'.
Definition Type.h:352
LLVM_ABI TypeSize getPrimitiveSizeInBits() const LLVM_READONLY
Return the basic size of this type if it is a primitive type.
Definition Type.cpp:198
bool isSized(SmallPtrSetImpl< Type * > *Visited=nullptr) const
Return true if it makes sense to take the size of this type.
Definition Type.h:311
LLVM_ABI unsigned getScalarSizeInBits() const LLVM_READONLY
If this is a vector type, return the getPrimitiveSizeInBits value for the element type.
Definition Type.cpp:231
bool isFloatingPointTy() const
Return true if this is one of the floating-point types.
Definition Type.h:184
bool isIntOrPtrTy() const
Return true if this is an integer type or a pointer type.
Definition Type.h:255
bool isIntegerTy() const
True if this is an instance of IntegerType.
Definition Type.h:240
bool isFPOrFPVectorTy() const
Return true if this is a FP type or a vector of FP.
Definition Type.h:225
bool isVoidTy() const
Return true if this is 'void'.
Definition Type.h:139
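The Type predicates and size queries above are what a shadow-type computation leans on: integer, pointer and floating-point values get an integer shadow of matching width, and vectors get a vector of integer shadows. A simplified sketch with aggregates omitted; this is not the pass's actual shadow-type logic:

#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/Casting.h"

using namespace llvm;

static Type *sketchShadowTypeFor(Type *OrigTy, const DataLayout &DL) {
  if (!OrigTy->isSized())
    return nullptr; // nothing to shadow

  if (auto *VT = dyn_cast<VectorType>(OrigTy)) {
    // Same element count, integer elements of the scalar's bit width.
    unsigned EltBits = VT->getScalarSizeInBits();
    return VectorType::get(IntegerType::get(OrigTy->getContext(), EltBits),
                           VT->getElementCount());
  }

  if (OrigTy->isIntOrPtrTy() || OrigTy->isFloatingPointTy()) {
    // Flatten scalars to a plain integer of the same bit width.
    unsigned Bits = DL.getTypeSizeInBits(OrigTy).getFixedValue();
    return IntegerType::get(OrigTy->getContext(), Bits);
  }

  return nullptr; // structs/arrays omitted from this sketch
}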
Value * getOperand(unsigned i) const
Definition User.h:232
unsigned getNumOperands() const
Definition User.h:254
size_type count(const KeyT &Val) const
Return 1 if the specified key is in the map, 0 otherwise.
Definition ValueMap.h:156
Type * getType() const
All values are typed, get the type of this value.
Definition Value.h:256
LLVM_ABI void setName(const Twine &Name)
Change the name of the value.
Definition Value.cpp:390
LLVM_ABI StringRef getName() const
Return a constant reference to the value's name.
Definition Value.cpp:322
ElementCount getElementCount() const
Return an ElementCount instance to represent the (possibly scalable) number of elements in the vector...
Type * getElementType() const
int getNumOccurrences() const
constexpr ScalarTy getFixedValue() const
Definition TypeSize.h:200
constexpr bool isScalable() const
Returns whether the quantity is scaled by a runtime quantity (vscale).
Definition TypeSize.h:169
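FixedVectorType, ElementCount and TypeSize show up together whenever an instrumentation decision depends on lane count or on whether a size is scalable. A small standalone example:

#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/LLVMContext.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  // <8 x i16>
  FixedVectorType *V8i16 = FixedVectorType::get(IntegerType::get(Ctx, 16), 8);

  ElementCount EC = V8i16->getElementCount();
  bool Scalable = EC.isScalable();            // false for a FixedVectorType
  unsigned NumElts = V8i16->getNumElements(); // 8

  // A half-width vector, as used when an instruction pairs up adjacent lanes
  // (getHalfElementsVectorType performs the same computation).
  FixedVectorType *V4i16 =
      FixedVectorType::get(V8i16->getElementType(), NumElts / 2);

  (void)Scalable;
  (void)V4i16;
  return 0;
}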
An efficient, type-erasing, non-owning reference to a callable.
const ParentTy * getParent() const
Definition ilist_node.h:34
self_iterator getIterator()
Definition ilist_node.h:130
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition raw_ostream.h:53
CallInst * Call
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
constexpr char Align[]
Key for Kernel::Arg::Metadata::mAlign.
constexpr std::underlying_type_t< E > Mask()
Get a bitmask with 1s in all places up to the high-order bit of E's largest value.
@ C
The default llvm calling convention, compatible with C.
Definition CallingConv.h:34
@ BasicBlock
Various leaf nodes.
Definition ISDOpcodes.h:81
initializer< Ty > init(const Ty &Val)
Function * Kernel
Summary of a kernel (=entry point for target offloading).
Definition OpenMPOpt.h:21
NodeAddr< FuncNode * > Func
Definition RDFGraph.h:393
friend class Instruction
Iterator for Instructions in a BasicBlock.
Definition BasicBlock.h:73
This is an optimization pass for GlobalISel generic memory operations.
unsigned Log2_32_Ceil(uint32_t Value)
Return the ceil log base 2 of the specified value, 32 if the value is zero.
Definition MathExtras.h:355
FunctionAddr VTableAddr Value
Definition InstrProf.h:137
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1657
auto enumerate(FirstRange &&First, RestRanges &&...Rest)
Given two or more input ranges, returns a new range whose values are tuples (A, B,...
Definition STLExtras.h:2452
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:649
@ Done
Definition Threading.h:60
bool isAligned(Align Lhs, uint64_t SizeInBytes)
Checks that SizeInBytes is a multiple of the alignment.
Definition Alignment.h:145
LLVM_ABI std::pair< Instruction *, Value * > SplitBlockAndInsertSimpleForLoop(Value *End, BasicBlock::iterator SplitBefore)
Insert a for (int i = 0; i < End; i++) loop structure (with the exception that End is assumed > 0,...
InnerAnalysisManagerProxy< FunctionAnalysisManager, Module > FunctionAnalysisManagerModuleProxy
Provide the FunctionAnalysisManager to Module proxy.
constexpr bool isPowerOf2_64(uint64_t Value)
Return true if the argument is a power of two > 0 (64 bit edition.)
Definition MathExtras.h:293
unsigned Log2_64(uint64_t Value)
Return the floor log base 2 of the specified value, -1 if the value is zero.
Definition MathExtras.h:348
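The integer-math helpers above (isPowerOf2_64, Log2_64, Log2_32_Ceil) usually feed size and index calculations during instrumentation. A tiny worked example with arbitrary numbers:

#include <cassert>
#include <cstdint>
#include "llvm/Support/MathExtras.h"

using namespace llvm;

int main() {
  uint64_t StoreSize = 24; // bytes of some stored value (example only)

  assert(isPowerOf2_64(32) && !isPowerOf2_64(StoreSize));

  unsigned FloorLog = Log2_64(StoreSize);                             // 4
  unsigned CeilLog = Log2_32_Ceil(static_cast<uint32_t>(StoreSize));  // 5

  (void)FloorLog;
  (void)CeilLog;
  return 0;
}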
auto dyn_cast_or_null(const Y &Val)
Definition Casting.h:759
LLVM_ABI std::pair< Function *, FunctionCallee > getOrCreateSanitizerCtorAndInitFunctions(Module &M, StringRef CtorName, StringRef InitName, ArrayRef< Type * > InitArgTypes, ArrayRef< Value * > InitArgs, function_ref< void(Function *, FunctionCallee)> FunctionsCreatedCallback, StringRef VersionCheckName=StringRef(), bool Weak=false)
Creates sanitizer constructor function lazily.
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition Debug.cpp:207
LLVM_ABI void report_fatal_error(Error Err, bool gen_crash_diag=true)
Definition Error.cpp:167
class LLVM_GSL_OWNER SmallVector
Forward declaration of SmallVector so that calculateSmallVectorDefaultInlinedElements can reference s...
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type argu...
Definition Casting.h:548
LLVM_ABI bool isKnownNonZero(const Value *V, const SimplifyQuery &Q, unsigned Depth=0)
Return true if the given value is known to be non-zero when defined.
LLVM_ABI raw_fd_ostream & errs()
This returns a reference to a raw_ostream for standard error.
AtomicOrdering
Atomic ordering for LLVM's memory model.
@ First
Helpers to iterate all locations in the MemoryEffectsBase class.
Definition ModRef.h:71
IRBuilder(LLVMContext &, FolderTy, InserterTy, MDNode *, ArrayRef< OperandBundleDef >) -> IRBuilder< FolderTy, InserterTy >
@ Or
Bitwise or logical OR of integers.
@ And
Bitwise or logical AND of integers.
@ Add
Sum of integers.
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition Alignment.h:155
DWARFExpression::Operation Op
RoundingMode
Rounding mode.
ArrayRef(const T &OneElt) -> ArrayRef< T >
constexpr unsigned BitWidth
LLVM_ABI void appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data=nullptr)
Append F to the list of global ctors of module M with the given Priority.
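getOrCreateSanitizerCtorAndInitFunctions and appendToGlobalCtors are the usual pair for registering a runtime initializer from a module pass. A sketch using a hypothetical runtime name (__mysan_init); the real ctor name, init function and priority used by this pass may differ:

#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"

using namespace llvm;

static void insertModuleCtorSketch(Module &M) {
  getOrCreateSanitizerCtorAndInitFunctions(
      M, /*CtorName=*/"mysan.module_ctor", /*InitName=*/"__mysan_init",
      /*InitArgTypes=*/{}, /*InitArgs=*/{},
      // Invoked when the ctor/init pair is created.
      [&](Function *Ctor, FunctionCallee) {
        // Lower priority values run earlier in llvm.global_ctors.
        appendToGlobalCtors(M, Ctor, /*Priority=*/0);
      });
}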
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:565
iterator_range< df_iterator< T > > depth_first(const T &G)
LLVM_ABI Instruction * SplitBlockAndInsertIfThen(Value *Cond, BasicBlock::iterator SplitBefore, bool Unreachable, MDNode *BranchWeights=nullptr, DomTreeUpdater *DTU=nullptr, LoopInfo *LI=nullptr, BasicBlock *ThenBlock=nullptr)
Split the containing block at the specified instruction - everything before SplitBefore stays in the ...
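SplitBlockAndInsertIfThen together with MDBuilder::createUnlikelyBranchWeights is the standard way to carve out a cold "report" block after a check. A sketch of that shape; __mysan_warning is a placeholder callee, not the actual runtime entry point used by this pass:

#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"

using namespace llvm;

static void emitPoisonCheckSketch(Instruction *InsertBefore, Value *Shadow) {
  Module *M = InsertBefore->getModule();
  IRBuilder<> IRB(InsertBefore);

  // Cond is true when any shadow bit is set, i.e. the value is poisoned.
  Value *Cond = IRB.CreateICmpNE(
      Shadow, Constant::getNullValue(Shadow->getType()), "_mscmp");

  // Split the block and branch to a fresh "then" block when Cond holds,
  // marking that branch as unlikely so the fast path stays straight-line.
  MDBuilder MDB(IRB.getContext());
  Instruction *CheckTerm = SplitBlockAndInsertIfThen(
      Cond, InsertBefore->getIterator(), /*Unreachable=*/false,
      MDB.createUnlikelyBranchWeights());

  // Call the (placeholder) warning function from the cold block.
  IRB.SetInsertPoint(CheckTerm);
  FunctionCallee Warning = M->getOrInsertFunction(
      "__mysan_warning", Type::getVoidTy(IRB.getContext()));
  IRB.CreateCall(Warning);
}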
LLVM_ABI void maybeMarkSanitizerLibraryCallNoBuiltin(CallInst *CI, const TargetLibraryInfo *TLI)
Given a CallInst, check if it calls a string function known to CodeGen, and mark it with NoBuiltin if...
Definition Local.cpp:3832
LLVM_ABI bool removeUnreachableBlocks(Function &F, DomTreeUpdater *DTU=nullptr, MemorySSAUpdater *MSSAU=nullptr)
Remove all blocks that cannot be reached from the function's entry.
Definition Local.cpp:2883
LLVM_ABI bool checkIfAlreadyInstrumented(Module &M, StringRef Flag)
Check if the module has the flag attached; if not, add the flag.
std::string itostr(int64_t X)
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition MIRParser.h:39
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition Alignment.h:39
uint64_t value() const
This is a hole in the type system and should not be abused.
Definition Alignment.h:85
LLVM_ABI void printPipeline(raw_ostream &OS, function_ref< StringRef(StringRef)> MapClassName2PassName)
LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition PassManager.h:70
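The final entries (PreservedAnalyses, the ModuleAnalysisManager typedef, PassInfoMixin and a run(Module&, ModuleAnalysisManager&) method) describe the new-pass-manager shape that MemorySanitizerPass follows. A toy skeleton of that shape, not the sanitizer pass itself:

#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

namespace {
struct ToyInstrumentationPass : PassInfoMixin<ToyInstrumentationPass> {
  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM) {
    (void)AM; // a real pass would query analyses here
    bool Changed = false;
    for (Function &F : M)
      if (!F.isDeclaration())
        Changed = true; // placeholder for real instrumentation work

    // Report conservatively: nothing is guaranteed preserved if the module
    // changed, everything is preserved otherwise.
    return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
  }
};
} // namespace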