LLVM 22.0.0git
|
#include "Target/AMDGPU/AMDGPUTargetTransformInfo.h"
Public Types | |
enum class | KnownIEEEMode { Unknown , On , Off } |
Definition at line 63 of file AMDGPUTargetTransformInfo.h.
|
strong |
Enumerator | |
---|---|
Unknown | |
On | |
Off |
Definition at line 285 of file AMDGPUTargetTransformInfo.h.
|
explicit |
Definition at line 305 of file AMDGPUTargetTransformInfo.cpp.
References F, and llvm::DenormalMode::getPreserveSign().
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 189 of file AMDGPUTargetTransformInfo.h.
References llvm::AMDGPU::addrspacesMayAlias().
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1422 of file AMDGPUTargetTransformInfo.cpp.
References adjustInliningThresholdUsingCallee(), ArgAllocaCost, llvm::BasicTTIImplBase< GCNTTIImpl >::DL, and getCallArgsTotalAllocaSize().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1305 of file AMDGPUTargetTransformInfo.cpp.
References llvm::TargetLoweringBase::getTargetMachine(), InlineMaxBB, and llvm::SIModeRegisterDefaults::isInlineCompatible().
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 205 of file AMDGPUTargetTransformInfo.h.
References llvm::AMDGPUAS::LOCAL_ADDRESS, llvm::AMDGPUAS::PRIVATE_ADDRESS, and llvm::AMDGPUAS::REGION_ADDRESS.
bool GCNTTIImpl::canSimplifyLegacyMulToMul | ( | const Instruction & | I, |
const Value * | Op0, | ||
const Value * | Op1, | ||
InstCombiner & | IC | ||
) | const |
Definition at line 391 of file AMDGPUInstCombineIntrinsic.cpp.
References llvm::InstCombiner::getSimplifyQuery(), llvm::SimplifyQuery::getWithInstruction(), I, llvm::isKnownNeverInfOrNaN(), llvm::PatternMatch::m_FiniteNonZero(), and llvm::PatternMatch::match().
Referenced by instCombineIntrinsic().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1118 of file AMDGPUTargetTransformInfo.cpp.
References llvm::SmallVectorTemplateBase< T, bool >::push_back().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1514 of file AMDGPUTargetTransformInfo.cpp.
References F, llvm::AMDGPUSubtarget::getFlatWorkGroupSizes(), llvm::AMDGPUSubtarget::getMaxNumWorkGroups(), llvm::AMDGPUSubtarget::getWavesPerEU(), and llvm::SmallVectorTemplateBase< T, bool >::push_back().
GCNTTIImpl::KnownIEEEMode GCNTTIImpl::fpenvIEEEMode | ( | const Instruction & | I | ) | const |
Returns KnownIEEEMode::On if the use context is known to assume "amdgpu-ieee"="true", and KnownIEEEMode::Off if it is known to assume "amdgpu-ieee"="false".
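The tri-state result can be illustrated with a standalone sketch. It does not use LLVM types: `classifyIEEEMode`, `canAssumeIEEEOff`, and the `std::optional<bool>` parameter are hypothetical stand-ins for "what is known about the "amdgpu-ieee" setting". The real function also consults subtarget and calling-convention state (see its References list), which is not modeled here.

```cpp
#include <cassert>
#include <optional>

// Mirrors the three enumerators documented for GCNTTIImpl::KnownIEEEMode.
enum class KnownIEEEMode { Unknown, On, Off };

// Hypothetical helper (not part of LLVM): maps what is known about the
// "amdgpu-ieee" setting to the tri-state result.
//   std::nullopt -> nothing can be assumed
//   true / false -> "amdgpu-ieee"="true" / "false" can be assumed
KnownIEEEMode classifyIEEEMode(std::optional<bool> AmdgpuIEEE) {
  if (!AmdgpuIEEE.has_value())
    return KnownIEEEMode::Unknown;
  return *AmdgpuIEEE ? KnownIEEEMode::On : KnownIEEEMode::Off;
}

// Hypothetical caller-side check: a transform that is only valid with IEEE
// mode disabled must not fire when the mode is Unknown.
bool canAssumeIEEEOff(KnownIEEEMode Mode) {
  return Mode == KnownIEEEMode::Off;
}

// Small demonstration of the mapping.
void demoIEEEMode() {
  assert(classifyIEEEMode(true) == KnownIEEEMode::On);
  assert(!canAssumeIEEEOff(classifyIEEEMode(std::nullopt)));
}
```

The point of the third state is that callers such as cost queries or intrinsic combines can treat Unknown conservatively instead of guessing.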
Definition at line 1531 of file AMDGPUTargetTransformInfo.cpp.
References F, llvm::Attribute::getValueAsBool(), llvm::GCNSubtarget::hasIEEEMode(), I, llvm::AMDGPU::isShader(), llvm::Attribute::isValid(), Off, On, and Unknown.
Referenced by getIntrinsicInstrCost(), and instCombineIntrinsic().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 530 of file AMDGPUTargetTransformInfo.cpp.
References llvm::ISD::ADD, llvm::ISD::AND, CostKind, llvm::FAdd, llvm::ISD::FADD, llvm::FPOpFusion::Fast, llvm::ISD::FDIV, llvm::ISD::FMUL, llvm::ISD::FNEG, llvm::ISD::FREM, llvm::ISD::FSUB, llvm::BasicTTIImplBase< GCNTTIImpl >::getArithmeticInstrCost(), llvm::TargetLoweringBase::getTargetMachine(), llvm::AMDGPUSubtarget::has16BitInsts(), llvm::Instruction::hasAllowContract(), llvm::Instruction::hasApproxFunc(), llvm::AMDGPUSubtarget::hasMadMacF32Insts(), llvm::Value::hasOneUse(), llvm::GCNSubtarget::hasPackedFP32Ops(), llvm::GCNSubtarget::hasUsableDivScaleConditionOutput(), llvm::TargetLoweringBase::InstructionOpcodeToISD(), llvm::AMDGPUTargetLowering::isFNegFree(), llvm::PatternMatch::m_FPOne(), llvm::PatternMatch::match(), llvm::ISD::MUL, llvm::TargetMachine::Options, Options, llvm::ISD::OR, llvm::ISD::SHL, llvm::ISD::SRA, llvm::ISD::SRL, llvm::ISD::SUB, llvm::TargetTransformInfo::TCC_Free, llvm::Value::user_begin(), and llvm::ISD::XOR.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 850 of file AMDGPUTargetTransformInfo.cpp.
References CostKind, llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::BasicTTIImplBase< GCNTTIImpl >::getArithmeticReductionCost(), llvm::EVT::getScalarSizeInBits(), llvm::TargetLoweringBase::getValueType(), llvm::AMDGPUSubtarget::hasVOP3PInsts(), and llvm::TargetTransformInfo::requiresOrderedReduction().
|
inlineoverridevirtual |
Data cache line size for LoopDataPrefetch pass. Has no use before GFX12.
Reimplemented from llvm::BasicTTIImplBase< GCNTTIImpl >.
Definition at line 273 of file AMDGPUTargetTransformInfo.h.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1433 of file AMDGPUTargetTransformInfo.cpp.
References ArgAllocaCost, ArgAllocaCutoff, llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::AllocaInst::getAllocatedType(), getCallArgsTotalAllocaSize(), llvm::CallBase::getCalledFunction(), getInliningThresholdMultiplier(), llvm::DataLayout::getTypeAllocSize(), and llvm::none_of().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 819 of file AMDGPUTargetTransformInfo.cpp.
References assert(), CostKind, llvm::BasicTTIImplBase< GCNTTIImpl >::getCFInstrCost(), I, llvm::TargetTransformInfo::TCK_CodeSize, and llvm::TargetTransformInfo::TCK_SizeAndLatency.
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 193 of file AMDGPUTargetTransformInfo.h.
References llvm::AMDGPUAS::FLAT_ADDRESS.
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 256 of file AMDGPUTargetTransformInfo.h.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1417 of file AMDGPUTargetTransformInfo.cpp.
References llvm::TargetTransformInfoImplBase::getInliningLastCallToStaticBonus(), and getInliningThresholdMultiplier().
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 251 of file AMDGPUTargetTransformInfo.h.
Referenced by getCallerAllocaCost(), and getInliningLastCallToStaticBonus().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 710 of file AMDGPUTargetTransformInfo.cpp.
References llvm::any_of(), CostKind, fpenvIEEEMode(), llvm::IntrinsicCostAttributes::getID(), llvm::IntrinsicCostAttributes::getInst(), llvm::BasicTTIImplBase< GCNTTIImpl >::getIntrinsicInstrCost(), llvm::IntrinsicCostAttributes::getReturnType(), llvm::AMDGPUSubtarget::hasFastFMAF32(), llvm::GCNSubtarget::hasPackedFP32Ops(), llvm::AMDGPUSubtarget::hasVOP3PInsts(), II, intrinsicHasPackedVectorBenefit(), Off, and RetTy.
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 380 of file AMDGPUTargetTransformInfo.cpp.
References llvm::AMDGPUAS::BUFFER_FAT_POINTER, llvm::AMDGPUAS::BUFFER_RESOURCE, llvm::AMDGPUAS::BUFFER_STRIDED_POINTER, llvm::AMDGPUAS::CONSTANT_ADDRESS, llvm::AMDGPUAS::CONSTANT_ADDRESS_32BIT, llvm::GCNSubtarget::getMaxPrivateElementSize(), llvm::AMDGPUAS::GLOBAL_ADDRESS, and llvm::AMDGPUAS::PRIVATE_ADDRESS.
Referenced by getMemoryOpCost().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 359 of file AMDGPUTargetTransformInfo.cpp.
References llvm::Type::getScalarSizeInBits().
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 348 of file AMDGPUTargetTransformInfo.cpp.
References llvm::AMDGPUSubtarget::has16BitInsts(), and llvm::GCNSubtarget::hasPackedFP32Ops().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 495 of file AMDGPUTargetTransformInfo.cpp.
References llvm::ElementCount::isScalar().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 422 of file AMDGPUTargetTransformInfo.cpp.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 426 of file AMDGPUTargetTransformInfo.cpp.
References Context, llvm::FixedVectorType::get(), llvm::Type::getInt32Ty(), llvm::Type::getIntNTy(), llvm::Length, and MemcpyLoopUnroll.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 453 of file AMDGPUTargetTransformInfo.cpp.
References Context, llvm::FixedVectorType::get(), llvm::Type::getInt16Ty(), llvm::Type::getInt32Ty(), llvm::Type::getInt64Ty(), llvm::Type::getInt8Ty(), llvm::TargetTransformInfoImplBase::getMemcpyLoopResidualLoweringType(), and llvm::SmallVectorTemplateBase< T, bool >::push_back().
|
overridevirtual |
Accounts for the reduced cost of loading i8 vector types.
For example, the cost of loading four i8 values is the same as the cost of loading a single i32 value.
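The arithmetic behind that example can be sketched in isolation. The names `ceilDiv` and `i8VectorLoadCost` and the `RegBits` parameter are assumptions made for illustration. The actual override works on LLVM types and queries the register width itself, and its exact rounding may differ from this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division. LLVM provides llvm::divideCeil; it is reimplemented here
// only so that the sketch stands alone.
constexpr std::uint64_t ceilDiv(std::uint64_t Num, std::uint64_t Den) {
  return (Num + Den - 1) / Den;
}

// Hypothetical cost model: one cost unit per register-sized chunk of an i8
// vector. RegBits is an assumed stand-in for the load/store register width.
constexpr std::uint64_t i8VectorLoadCost(std::uint64_t NumI8Elts,
                                         std::uint64_t RegBits) {
  return ceilDiv(NumI8Elts * 8, RegBits);
}

// Small demonstration: four i8 values fit in one 32-bit chunk.
inline void demoI8LoadCost() {
  assert(i8VectorLoadCost(4, 32) == 1);
}
```

With a 32-bit chunk, eight i8 values would cost two units under this model, which is the same as two i32 loads.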
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1547 of file AMDGPUTargetTransformInfo.cpp.
References CostKind, llvm::divideCeil(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, getLoadStoreVecRegBitWidth(), llvm::BasicTTIImplBase< GCNTTIImpl >::getMemoryOpCost(), llvm::DataLayout::getTypeSizeInBits(), and I.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 868 of file AMDGPUTargetTransformInfo.cpp.
References CostKind, llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::BasicTTIImplBase< GCNTTIImpl >::getMinMaxReductionCost(), llvm::EVT::getScalarSizeInBits(), llvm::TargetLoweringBase::getValueType(), and llvm::AMDGPUSubtarget::hasVOP3PInsts().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 344 of file AMDGPUTargetTransformInfo.cpp.
When counting parts on AMD GPUs, account for i8s being grouped together under a single i32 value.
Otherwise fall back to base implementation.
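A minimal sketch of that grouping rule, with a fallback path, might look as follows. `VecDesc`, `basePartsPlaceholder`, and `numberOfParts` are hypothetical names. The placeholder does not reproduce what the base implementation computes, and the rounding used in the LLVM source may differ.

```cpp
#include <cassert>

// Hypothetical description of a fixed-width vector type.
struct VecDesc {
  unsigned ElementBits;  // width of one element in bits
  unsigned NumElements;  // number of elements in the vector
};

// Placeholder for the base implementation. Assumption for illustration only:
// one part per element.
unsigned basePartsPlaceholder(const VecDesc &V) { return V.NumElements; }

// i8 vectors are counted in groups of four (one i32 per group); every other
// type is handed to the fallback.
unsigned numberOfParts(const VecDesc &V) {
  if (V.ElementBits == 8)
    return (V.NumElements + 3) / 4;  // round up to whole i32 groups
  return basePartsPlaceholder(V);
}

// Small demonstration: sixteen i8 elements form four i32-sized parts.
void demoNumberOfParts() {
  assert(numberOfParts(VecDesc{8, 16}) == 4);
}
```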
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1564 of file AMDGPUTargetTransformInfo.cpp.
References llvm::divideCeil(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), and llvm::BasicTTIImplBase< GCNTTIImpl >::getNumberOfParts().
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 320 of file AMDGPUTargetTransformInfo.cpp.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1480 of file AMDGPUTargetTransformInfo.cpp.
References llvm::AMDGPUTTIImpl::getPeelingPreferences().
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 116 of file AMDGPUTargetTransformInfo.h.
References assert(), llvm::isPowerOf2_32(), and llvm::TargetTransformInfo::PSK_FastHardware.
|
overridevirtual |
How much before a load we should place the prefetch instruction.
This is currently measured in number of IR instructions.
Reimplemented from llvm::BasicTTIImplBase< GCNTTIImpl >.
Definition at line 1506 of file AMDGPUTargetTransformInfo.cpp.
References llvm::GCNSubtarget::hasPrefetch().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 332 of file AMDGPUTargetTransformInfo.cpp.
References llvm::TypeSize::getFixed(), llvm::TypeSize::getScalable(), llvm::GCNSubtarget::hasPackedFP32Ops(), llvm_unreachable, llvm::TargetTransformInfo::RGK_FixedWidthVector, llvm::TargetTransformInfo::RGK_ScalableVector, and llvm::TargetTransformInfo::RGK_Scalar.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1222 of file AMDGPUTargetTransformInfo.cpp.
References llvm::alignTo(), CostKind, llvm::count_if(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::VectorType::getElementType(), llvm::GCNSubtarget::getGeneration(), llvm::BasicTTIImplBase< GCNTTIImpl >::getShuffleCost(), llvm::DataLayout::getTypeSizeInBits(), llvm::AMDGPUSubtarget::hasVOP3PInsts(), llvm::BasicTTIImplBase< GCNTTIImpl >::improveShuffleKindFromMask(), llvm::TargetTransformInfo::SK_Broadcast, llvm::TargetTransformInfo::SK_ExtractSubvector, llvm::TargetTransformInfo::SK_InsertSubvector, llvm::TargetTransformInfo::SK_PermuteSingleSrc, llvm::TargetTransformInfo::SK_PermuteTwoSrc, llvm::TargetTransformInfo::SK_Reverse, llvm::TargetTransformInfo::SK_Select, llvm::TargetTransformInfo::SK_Splice, and llvm::AMDGPUSubtarget::VOLCANIC_ISLANDS.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 370 of file AMDGPUTargetTransformInfo.cpp.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 504 of file AMDGPUTargetTransformInfo.cpp.
References llvm::CallBase::getArgOperand(), llvm::IntrinsicInst::getIntrinsicID(), Info, and llvm::SequentiallyConsistent.
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1474 of file AMDGPUTargetTransformInfo.cpp.
References llvm::AMDGPUTTIImpl::getUnrollingPreferences().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 882 of file AMDGPUTargetTransformInfo.cpp.
References CostKind, llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::DataLayout::getTypeSizeInBits(), llvm::BasicTTIImplBase< GCNTTIImpl >::getVectorInstrCost(), and llvm::AMDGPUSubtarget::has16BitInsts().
|
inline |
Definition at line 236 of file AMDGPUTargetTransformInfo.h.
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 316 of file AMDGPUTargetTransformInfo.cpp.
References F, and llvm::AMDGPUSubtarget::isSingleLaneExecution().
Instruction * GCNTTIImpl::hoistLaneIntrinsicThroughOperand | ( | InstCombiner & | IC, |
IntrinsicInst & | II | ||
) | const |
Definition at line 557 of file AMDGPUInstCombineIntrinsic.cpp.
References assert(), llvm::InstCombiner::Builder, llvm::Instruction::clone(), llvm::DominatorTree::dominates(), llvm::InstCombiner::getDominatorTree(), llvm::User::getOperand(), llvm::User::getOperandUse(), llvm::Intrinsic::getOrInsertDeclaration(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::Value::hasOneUser(), II, isTriviallyUniform(), llvm::BasicTTIImplBase< GCNTTIImpl >::isTypeLegal(), OpIdx, rewriteCall(), and llvm::User::setOperand().
Referenced by instCombineIntrinsic().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 639 of file AMDGPUInstCombineIntrinsic.cpp.
References llvm::CallBase::addFnAttr(), llvm::FastMathFlags::allowContract(), assert(), llvm::APFloat::bitcastToAPInt(), llvm::InstCombiner::Builder, llvm::CallingConv::C, canContractSqrtToRsq(), canSimplifyLegacyMulToMul(), llvm::ConstantFoldCompareInstOperands(), llvm::APFloat::convert(), llvm::IRBuilderBase::CreateAShr(), llvm::IRBuilderBase::CreateExtractVector(), llvm::IRBuilderBase::CreateFAddFMF(), llvm::IRBuilderBase::CreateFMulFMF(), llvm::IRBuilderBase::CreateICmpNE(), llvm::IRBuilderBase::CreateInsertElement(), llvm::IRBuilderBase::CreateIntrinsic(), llvm::IRBuilderBase::CreateLShr(), llvm::IRBuilderBase::CreateMaximumNum(), llvm::IRBuilderBase::CreateMaxNum(), llvm::IRBuilderBase::CreateMinimumNum(), llvm::IRBuilderBase::CreateMinNum(), llvm::IRBuilderBase::CreateSExt(), llvm::IRBuilderBase::CreateShl(), llvm::IRBuilderBase::CreateZExt(), defaultComponentBroadcast(), llvm::APFloat::divide(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::InstCombiner::eraseInstFromFunction(), llvm::Exponent, llvm::FAdd, llvm::fcAllFlags, llvm::CmpInst::FIRST_FCMP_PREDICATE, llvm::CmpInst::FIRST_ICMP_PREDICATE, fmed3AMDGCN(), llvm::FMul, llvm::AMDGPU::MFMAScaleFormats::FP4_E2M1, llvm::AMDGPU::MFMAScaleFormats::FP6_E2M3, llvm::AMDGPU::MFMAScaleFormats::FP6_E3M2, llvm::AMDGPU::MFMAScaleFormats::FP8_E4M3, llvm::AMDGPU::MFMAScaleFormats::FP8_E5M2, fpenvIEEEMode(), llvm::frexp(), llvm::MDNode::get(), llvm::MetadataAsValue::get(), llvm::MDString::get(), llvm::FixedVectorType::get(), llvm::UndefValue::get(), llvm::PoisonValue::get(), llvm::ConstantInt::getFalse(), llvm::FPMathOperator::getFastMathFlags(), llvm::Type::getFltSemantics(), llvm::Type::getHalfTy(), llvm::AMDGPU::getImageDimIntrinsicInfo(), llvm::ConstantFP::getInfinity(), llvm::IRBuilderBase::getInt64(), llvm::Type::getIntegerBitWidth(), llvm::IRBuilderBase::getIntNTy(), llvm::CmpInst::getInversePredicate(), llvm::ConstantFP::getNaN(), llvm::Constant::getNullValue(), llvm::Intrinsic::getOrInsertDeclaration(), 
llvm::APFloat::getQNaN(), llvm::APFloat::getSemantics(), llvm::InstCombiner::getSimplifyQuery(), llvm::CmpInst::getSwappedPredicate(), llvm::Value::getType(), getType(), llvm::ConstantInt::getValue(), llvm::ConstantFP::getValueAPF(), llvm::AMDGPUSubtarget::getWavefrontSize(), llvm::APFloat::getZero(), llvm::ConstantFP::getZero(), llvm::APInt::getZExtValue(), llvm::ConstantInt::getZExtValue(), llvm::GCNSubtarget::hasDefaultComponentBroadcast(), llvm::GCNSubtarget::hasDefaultComponentZero(), llvm::GCNSubtarget::hasMed3_16(), hoistLaneIntrinsicThroughOperand(), I, llvm::CmpInst::ICMP_EQ, llvm::CmpInst::ICMP_NE, Idx, llvm::APFloatBase::IEEEhalf(), llvm::APFloatBase::IEK_Inf, llvm::APFloatBase::IEK_NaN, II, llvm::Type::isDoubleTy(), llvm::Type::isFloatTy(), llvm::CmpInst::isFPPredicate(), llvm::Type::isHalfTy(), llvm::APFloat::isInfinity(), llvm::Type::isIntegerTy(), llvm::APFloat::isNaN(), llvm::APFloat::isNegInfinity(), llvm::Constant::isNullValue(), llvm::APFloat::isPosInfinity(), llvm::APFloat::isSignaling(), llvm::CmpInst::isSigned(), isTriviallyUniform(), llvm::SimplifyQuery::isUndefValue(), llvm::GCNSubtarget::isWave32(), llvm::GCNSubtarget::isWaveSizeKnown(), llvm::CmpInst::LAST_FCMP_PREDICATE, llvm::CmpInst::LAST_ICMP_PREDICATE, llvm_unreachable, llvm::PatternMatch::m_AllOnes(), llvm::PatternMatch::m_AnyZeroFP(), llvm::PatternMatch::m_APFloat(), llvm::PatternMatch::m_Cmp(), llvm::PatternMatch::m_ConstantFP(), llvm::PatternMatch::m_FPExt(), llvm::PatternMatch::m_One(), llvm::PatternMatch::m_SExt(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::m_Zero(), llvm::PatternMatch::m_ZExt(), llvm::PatternMatch::m_ZExtOrSExt(), llvm::Make_64(), llvm::APFloat::makeQuiet(), llvm::PatternMatch::match(), matchFPExtFromF16(), llvm::NearestTiesToEven, Off, llvm::Offset, On, llvm::InstCombiner::replaceInstUsesWith(), llvm::InstCombiner::replaceOperand(), llvm::APFloatBase::rmNearestTiesToEven, llvm::APFloatBase::rmTowardZero, llvm::scalbn(), llvm::Signed, 
simplifyAMDGCNImageIntrinsic(), simplifyAMDGCNMemoryIntrinsicDemanded(), simplifyDemandedLaneMaskArg(), std::swap(), llvm::Value::takeName(), trimTrailingZerosInVector(), llvm::APInt::trunc(), Unknown, llvm::AMDGPU::wmmaScaleF8F6F4FormatToNumRegs(), X, and Y.
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1044 of file AMDGPUTargetTransformInfo.cpp.
References llvm::CallingConv::C, llvm::computeKnownBits(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, F, llvm::ExtractValueInst::getIndices(), llvm::User::getOperand(), llvm::AMDGPUSubtarget::getWavefrontSizeLog2(), llvm::AMDGPUSubtarget::hasWavefrontsEvenlySplittingXDim(), I, llvm::CallBase::isInlineAsm(), isInlineAsmSourceOfDivergence(), llvm::AMDGPU::isIntrinsicAlwaysUniform(), llvm::PatternMatch::m_AShr(), llvm::PatternMatch::m_c_And(), llvm::PatternMatch::m_ConstantInt(), llvm::PatternMatch::m_LShr(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::match(), and llvm::ArrayRef< T >::size().
bool GCNTTIImpl::isInlineAsmSourceOfDivergence | ( | const CallInst * | CI, |
ArrayRef< unsigned > | Indices = {} |
||
) | const |
Analyzes whether the results of inline asm are divergent.
If Indices is empty, this analyzes the collective result of all output registers. Otherwise, it queries only a specific result index, for inline asm that returns multiple registers in a struct.
Definition at line 914 of file AMDGPUTargetTransformInfo.cpp.
References llvm::TargetLowering::ComputeConstraintToUse(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::ArrayRef< T >::empty(), llvm::Instruction::getDataLayout(), llvm::SITargetLowering::getRegForInlineAsmConstraint(), llvm::GCNSubtarget::getRegisterInfo(), llvm::InlineAsm::isOutput, llvm::TargetLowering::ParseConstraints(), llvm::ArrayRef< T >::size(), and TRI.
Referenced by isAlwaysUniform(), and isSourceOfDivergence().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 410 of file AMDGPUTargetTransformInfo.cpp.
References isLegalToVectorizeMemChain().
bool GCNTTIImpl::isLegalToVectorizeMemChain | ( | unsigned | ChainSizeInBytes, |
Align | Alignment, | ||
unsigned | AddrSpace | ||
) | const |
Definition at line 397 of file AMDGPUTargetTransformInfo.cpp.
References llvm::GCNSubtarget::getMaxPrivateElementSize(), llvm::GCNSubtarget::hasUnalignedScratchAccessEnabled(), and llvm::AMDGPUAS::PRIVATE_ADDRESS.
Referenced by isLegalToVectorizeLoadChain(), and isLegalToVectorizeStoreChain().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 416 of file AMDGPUTargetTransformInfo.cpp.
References isLegalToVectorizeMemChain().
|
overridevirtual |
Whether it is profitable to sink the operands of an Instruction I to the basic block of I.
This allows modifiers (such as abs and neg) to be used more often.
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1289 of file AMDGPUTargetTransformInfo.cpp.
References llvm::any_of(), llvm::SmallVectorBase< Size_T >::empty(), I, llvm::PatternMatch::m_FAbs(), llvm::PatternMatch::m_FNeg(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::match(), and llvm::SmallVectorTemplateBase< T, bool >::push_back().
bool GCNTTIImpl::isReadRegisterSourceOfDivergence | ( | const IntrinsicInst * | ReadReg | ) | const |
Definition at line 950 of file AMDGPUTargetTransformInfo.cpp.
References llvm::CallBase::getArgOperand(), llvm::Value::getType(), llvm::MVT::getVT(), and RegName.
Referenced by isSourceOfDivergence().
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 973 of file AMDGPUTargetTransformInfo.cpp.
References A, F, llvm::AMDGPUAS::FLAT_ADDRESS, llvm::AMDGPUSubtarget::getReqdWorkGroupSize(), llvm::GCNSubtarget::hasGloballyAddressableScratch(), llvm::AMDGPUSubtarget::hasWavefrontsEvenlySplittingXDim(), llvm::AMDGPU::isArgPassedInSGPR(), isInlineAsmSourceOfDivergence(), llvm::AMDGPU::isIntrinsicSourceOfDivergence(), isReadRegisterSourceOfDivergence(), and llvm::AMDGPUAS::PRIVATE_ADDRESS.
|
inlineoverridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 180 of file AMDGPUTargetTransformInfo.h.
References llvm::AMDGPU::addrspacesMayAlias().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1134 of file AMDGPUTargetTransformInfo.cpp.
References B, llvm::computeKnownBits(), llvm::KnownBits::countMinLeadingOnes(), llvm::BasicTTIImplBase< GCNTTIImpl >::DL, llvm::Type::getContext(), llvm::ConstantInt::getFalse(), llvm::Intrinsic::getOrInsertDeclaration(), llvm::Type::getPointerAddressSpace(), llvm::DataLayout::getPointerSizeInBits(), llvm::TargetLoweringBase::getTargetMachine(), llvm::ConstantInt::getTrue(), llvm::Value::getType(), II, llvm::AMDGPU::isExtendedGlobalAddrSpace(), llvm::AMDGPUAS::LOCAL_ADDRESS, and llvm::AMDGPUAS::PRIVATE_ADDRESS.
Reimplemented from llvm::BasicTTIImplBase< GCNTTIImpl >.
Definition at line 1510 of file AMDGPUTargetTransformInfo.cpp.
References llvm::AMDGPU::isFlatGlobalAddrSpace().
Value * GCNTTIImpl::simplifyAMDGCNLaneIntrinsicDemanded | ( | InstCombiner & | IC, |
IntrinsicInst & | II, | ||
const APInt & | DemandedElts, | ||
APInt & | UndefElts | ||
) | const |
Definition at line 1912 of file AMDGPUInstCombineIntrinsic.cpp.
References llvm::InstCombiner::Builder, llvm::APInt::countr_zero(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateExtractElement(), llvm::IRBuilderBase::CreateInsertElement(), llvm::IRBuilderBase::CreateShuffleVector(), llvm::FixedVectorType::get(), llvm::PoisonValue::get(), llvm::APInt::getActiveBits(), llvm::IRBuilderBase::GetInsertBlock(), llvm::BasicBlock::getModule(), llvm::Intrinsic::getOrInsertDeclaration(), I, II, and llvm::BasicTTIImplBase< GCNTTIImpl >::isTypeLegal().
Referenced by simplifyDemandedVectorEltsIntrinsic().
bool GCNTTIImpl::simplifyDemandedLaneMaskArg | ( | InstCombiner & | IC, |
IntrinsicInst & | II, | ||
unsigned | LaneArgIdx | ||
) | const |
Simplify a lane index operand (e.g. llvm.amdgcn.readlane src1).
The instruction only reads the low 5 bits for wave32, and 6 bits for wave64.
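Because only the low bits of the lane operand are read, the higher bits can be treated as not demanded. The following standalone sketch shows the resulting mask. `laneIndexBits`, `demandedLaneMask`, and `effectiveLane` are hypothetical names, and the wavefront size is passed in directly rather than queried from a subtarget.

```cpp
#include <cassert>
#include <cstdint>

// Number of lane-index bits read: log2 of the wavefront size
// (5 for wave32, 6 for wave64). Assumes WaveSize is a power of two.
constexpr unsigned laneIndexBits(unsigned WaveSize) {
  unsigned Bits = 0;
  while ((1u << Bits) < WaveSize)
    ++Bits;
  return Bits;
}

// Mask covering exactly the bits of the lane operand that are read.
constexpr std::uint32_t demandedLaneMask(unsigned WaveSize) {
  return (1u << laneIndexBits(WaveSize)) - 1u;
}

// The lane actually selected once the unread high bits are dropped.
constexpr std::uint32_t effectiveLane(std::uint32_t LaneOperand,
                                      unsigned WaveSize) {
  return LaneOperand & demandedLaneMask(WaveSize);
}

// Small demonstration: operand 37 selects lane 5 on wave32, lane 37 on wave64.
inline void demoLaneMask() {
  assert(effectiveLane(37, 32) == 5);
  assert(effectiveLane(37, 64) == 37);
}
```

A constant lane operand with stray high bits can therefore be replaced by its masked value without changing which lane is read.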
Definition at line 518 of file AMDGPUInstCombineIntrinsic.cpp.
References llvm::KnownBits::getConstant(), llvm::Value::getType(), llvm::AMDGPUSubtarget::getWavefrontSizeLog2(), II, llvm::KnownBits::isConstant(), and llvm::InstCombiner::SimplifyDemandedBits().
Referenced by instCombineIntrinsic().
|
overridevirtual |
Reimplemented from llvm::TargetTransformInfoImplBase.
Definition at line 1978 of file AMDGPUInstCombineIntrinsic.cpp.
References II, simplifyAMDGCNLaneIntrinsicDemanded(), and simplifyAMDGCNMemoryIntrinsicDemanded().