Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions include/dxc/DXIL/DxilInstructions.h
Original file line number Diff line number Diff line change
Expand Up @@ -645,6 +645,42 @@ struct LlvmInst_VAArg {
bool isAllowed() const { return false; }
};

/// This instruction extracts from vector
struct LlvmInst_ExtractElement {
llvm::Instruction *Instr;
// Construction and identification
LlvmInst_ExtractElement(llvm::Instruction *pInstr) : Instr(pInstr) {}
operator bool() const {
return Instr->getOpcode() == llvm::Instruction::ExtractElement;
}
// Validation support
bool isAllowed() const { return true; }
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little suprised to see ExtractElement`InsertElement\ShuffleVector` weren't already defined since I expected at least the first two as required for scalarization which DXIL already supports.

If these are new for vectorization, should isAllowed() be checking IsSM69Plus() instead of always returning true?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scalarization took place before DXIL was generated. The output has no native vectors in non-lib shaders and so had no need for these functions. Intermediate steps might have these, but were disallowed for validation of final output. For similar reasons as their inclusion in intermediate steps, library shaders would allow these ops as they preserved native vectors in function interfaces. The check for library shader allowed ops accounts for this in validation.

case Instruction::InsertElement:

The inclusion here adds them to the list of instructions that are approved for general validation by IsLLVMInstructionAllowed, a generated function that merely
returns whether the opcode is in the list of approved LLVM instructions for a non-lib. It is generated into DxilValidationImpl.inc and called here:

if (!IsLLVMInstructionAllowed(I)) {

Similar to this function, this member is meant to return just true or false if it is allowed at all. I should add a 6.9 check to validation though.

};

/// This instruction inserts into vector
struct LlvmInst_InsertElement {
llvm::Instruction *Instr;
// Construction and identification
LlvmInst_InsertElement(llvm::Instruction *pInstr) : Instr(pInstr) {}
operator bool() const {
return Instr->getOpcode() == llvm::Instruction::InsertElement;
}
// Validation support
bool isAllowed() const { return true; }
};

/// This instruction Shuffle two vectors
struct LlvmInst_ShuffleVector {
llvm::Instruction *Instr;
// Construction and identification
LlvmInst_ShuffleVector(llvm::Instruction *pInstr) : Instr(pInstr) {}
operator bool() const {
return Instr->getOpcode() == llvm::Instruction::ShuffleVector;
}
// Validation support
bool isAllowed() const { return true; }
};

/// This instruction extracts from aggregate
struct LlvmInst_ExtractValue {
llvm::Instruction *Instr;
Expand Down
2 changes: 2 additions & 0 deletions lib/DxilValidation/DxilValidation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2158,6 +2158,8 @@ static bool ValidateType(Type *Ty, ValidationContext &ValCtx,
return true;

if (Ty->isVectorTy()) {
if (ValCtx.DxilMod.GetShaderModel()->IsSM69Plus())
return true;
ValCtx.EmitTypeError(Ty, ValidationRule::TypesNoVector);
return false;
}
Expand Down
6 changes: 6 additions & 0 deletions lib/HLSL/DxilLinker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1255,6 +1255,12 @@ void DxilLinkJob::RunPreparePass(Module &M) {
// For static global handle.
PM.add(createLowerStaticGlobalIntoAlloca());

// Change dynamic indexing vector to array where vectors aren't
// supported, but might be there from the initial compile.
if (!pSM->IsSM69Plus())
PM.add(
createDynamicIndexingVectorToArrayPass(false /* ReplaceAllVector */));

// Remove MultiDimArray from function call arg.
PM.add(createMultiDimArrayToOneDimArrayPass());

Expand Down
44 changes: 28 additions & 16 deletions lib/HLSL/HLMatrixBitcastLowerPass.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -113,13 +113,13 @@ class MatrixBitcastLowerPass : public FunctionPass {

// Lower matrix first.
for (BitCastInst *BCI : matCastSet) {
lowerMatrix(BCI, BCI->getOperand(0));
lowerMatrix(DM, BCI, BCI->getOperand(0));
}
return bUpdated;
}

private:
void lowerMatrix(Instruction *M, Value *A);
void lowerMatrix(DxilModule &DM, Instruction *M, Value *A);
bool hasCallUser(Instruction *M);
};

Expand Down Expand Up @@ -180,7 +180,8 @@ Value *CreateEltGEP(Value *A, unsigned i, Value *zeroIdx,
}
} // namespace

void MatrixBitcastLowerPass::lowerMatrix(Instruction *M, Value *A) {
void MatrixBitcastLowerPass::lowerMatrix(DxilModule &DM, Instruction *M,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing any use of DM, other than the recursive one required by this function signature.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a relic from when I was getting the version from it.

Value *A) {
for (auto it = M->user_begin(); it != M->user_end();) {
User *U = *(it++);
if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(U)) {
Expand All @@ -193,31 +194,42 @@ void MatrixBitcastLowerPass::lowerMatrix(Instruction *M, Value *A) {
SmallVector<Value *, 2> idxList(GEP->idx_begin(), GEP->idx_end());
DXASSERT(idxList.size() == 2,
"else not one dim matrix array index to matrix");

HLMatrixType MatTy = HLMatrixType::cast(EltTy);
Value *matSize = Builder.getInt32(MatTy.getNumElements());
idxList.back() = Builder.CreateMul(idxList.back(), matSize);
if (!DM.GetShaderModel()->IsSM69Plus()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we base the low-level decisions on a descriptively named boolean field in the MatrixBitcastLowerPass class, initialized based on the shader model, instead of passing the DxilModule down and checking the shader model each time?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even something about scalarizing operations, rather than anything to do with the native DXIL vector support?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the behavior for vec1s should proceed exactly as previously, I've added indications of native vector support which will use that and the vector size to determine what to do in all of these passes.

HLMatrixType MatTy = HLMatrixType::cast(EltTy);
Value *matSize = Builder.getInt32(MatTy.getNumElements());
idxList.back() = Builder.CreateMul(idxList.back(), matSize);
}
Value *NewGEP = Builder.CreateGEP(A, idxList);
lowerMatrix(GEP, NewGEP);
lowerMatrix(DM, GEP, NewGEP);
DXASSERT(GEP->user_empty(), "else lower matrix fail");
GEP->eraseFromParent();
} else {
DXASSERT(0, "invalid GEP for matrix");
}
} else if (BitCastInst *BCI = dyn_cast<BitCastInst>(U)) {
lowerMatrix(BCI, A);
lowerMatrix(DM, BCI, A);
DXASSERT(BCI->user_empty(), "else lower matrix fail");
BCI->eraseFromParent();
} else if (LoadInst *LI = dyn_cast<LoadInst>(U)) {
if (VectorType *Ty = dyn_cast<VectorType>(LI->getType())) {
IRBuilder<> Builder(LI);
Value *zeroIdx = Builder.getInt32(0);
unsigned vecSize = Ty->getNumElements();
Value *NewVec = UndefValue::get(LI->getType());
for (unsigned i = 0; i < vecSize; i++) {
Value *GEP = CreateEltGEP(A, i, zeroIdx, Builder);
Value *Elt = Builder.CreateLoad(GEP);
NewVec = Builder.CreateInsertElement(NewVec, Elt, i);
Value *NewVec = nullptr;
if (DM.GetShaderModel()->IsSM69Plus()) {
// Just create a replacement load using the vector pointer.
Instruction *NewLI = LI->clone();
unsigned VecIdx = NewLI->getNumOperands() - 1;
NewLI->setOperand(VecIdx, A);
Builder.Insert(NewLI);
NewVec = NewLI;
} else {
Value *zeroIdx = Builder.getInt32(0);
unsigned vecSize = Ty->getNumElements();
NewVec = UndefValue::get(LI->getType());
for (unsigned i = 0; i < vecSize; i++) {
Value *GEP = CreateEltGEP(A, i, zeroIdx, Builder);
Value *Elt = Builder.CreateLoad(GEP);
NewVec = Builder.CreateInsertElement(NewVec, Elt, i);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that below this point, there's:

    } else if (StoreInst *ST = dyn_cast<StoreInst>(U)) {

where it still scalarizes the store for the vector.

Did you mean to leave that scalarization in?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I omitted store support in error. Clearly we need some testing for it as it never caused any issues.

}
LI->replaceAllUsesWith(NewVec);
LI->eraseFromParent();
Expand Down
6 changes: 6 additions & 0 deletions lib/Transforms/Scalar/DxilEliminateVector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@
// //
///////////////////////////////////////////////////////////////////////////////

#include "dxc/DXIL/DxilModule.h"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumedly this isn't needed.

Suggested change
#include "dxc/DXIL/DxilModule.h"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Left in by mistake.

#include "llvm/IR/Dominators.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Pass.h"
Expand Down Expand Up @@ -151,6 +153,10 @@ bool DxilEliminateVector::TryRewriteDebugInfoForVector(InsertElementInst *IE) {

bool DxilEliminateVector::runOnFunction(Function &F) {

if (F.getParent()->HasDxilModule())
if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is where we presumably will still want to do something different for vec1, right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's one of the places. However, this has no effect on the operators, but the intrinsics, which will come in another changes. So this particular file's changes are being reverted for now.


auto *DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
DxilValueCache *DVC = &getAnalysis<DxilValueCache>();

Expand Down
18 changes: 15 additions & 3 deletions lib/Transforms/Scalar/LowerTypePasses.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
//===----------------------------------------------------------------------===//

#include "dxc/DXIL/DxilConstants.h"
#include "dxc/DXIL/DxilModule.h"
#include "dxc/DXIL/DxilOperations.h"
#include "dxc/DXIL/DxilUtil.h"
#include "dxc/HLSL/HLModule.h"
Expand Down Expand Up @@ -180,10 +181,12 @@ bool LowerTypePass::runOnModule(Module &M) {
namespace {
class DynamicIndexingVectorToArray : public LowerTypePass {
bool ReplaceAllVectors;
bool SupportsVectors;

public:
explicit DynamicIndexingVectorToArray(bool ReplaceAll = false)
: LowerTypePass(ID), ReplaceAllVectors(ReplaceAll) {}
: LowerTypePass(ID), ReplaceAllVectors(ReplaceAll),
SupportsVectors(false) {}
static char ID; // Pass identification, replacement for typeid
void applyOptions(PassOptions O) override;
void dumpConfig(raw_ostream &OS) override;
Expand All @@ -194,6 +197,7 @@ class DynamicIndexingVectorToArray : public LowerTypePass {
Type *lowerType(Type *Ty) override;
Constant *lowerInitVal(Constant *InitVal, Type *NewTy) override;
StringRef getGlobalPrefix() override { return ".v"; }
void initialize(Module &M) override;

private:
bool HasVectorDynamicIndexing(Value *V);
Expand All @@ -207,6 +211,11 @@ class DynamicIndexingVectorToArray : public LowerTypePass {
void ReplaceAddrSpaceCast(ConstantExpr *CE, Value *A, IRBuilder<> &Builder);
};

void DynamicIndexingVectorToArray::initialize(Module &M) {
if (M.HasHLModule())
SupportsVectors = M.GetHLModule().GetShaderModel()->IsSM69Plus();
}

void DynamicIndexingVectorToArray::applyOptions(PassOptions O) {
GetPassOptionBool(O, "ReplaceAllVectors", &ReplaceAllVectors,
ReplaceAllVectors);
Expand Down Expand Up @@ -286,7 +295,7 @@ void DynamicIndexingVectorToArray::ReplaceStaticIndexingOnVector(Value *V) {
StoreInst *stInst = cast<StoreInst>(GEPUser);
Value *val = stInst->getValueOperand();
Value *ldVal = Builder.CreateLoad(V);
ldVal = Builder.CreateInsertElement(ldVal, val, constIdx);
ldVal = Builder.CreateInsertElement(ldVal, val, constIdx); // UGH
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to elaborate on the "UGH" comment? Is it on the general operation being performed, or something specific about this line?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was just how I identified places that I needed to do something to deal with scalarization. Obviously not meant to be left in. I have now done what I intended to address this.

Builder.CreateStore(ldVal, V);
stInst->eraseFromParent();
}
Expand All @@ -306,8 +315,11 @@ void DynamicIndexingVectorToArray::ReplaceStaticIndexingOnVector(Value *V) {
}

bool DynamicIndexingVectorToArray::needToLower(Value *V) {
// Only needed where vectors aren't supported.
if (SupportsVectors)
return false;
Type *Ty = V->getType()->getPointerElementType();
if (dyn_cast<VectorType>(Ty)) {
if (isa<VectorType>(Ty)) {
if (isa<GlobalVariable>(V) || ReplaceAllVectors) {
return true;
}
Expand Down
18 changes: 11 additions & 7 deletions lib/Transforms/Scalar/ScalarReplAggregatesHLSL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1869,7 +1869,8 @@ bool SROAGlobalAndAllocas(HLModule &HLM, bool bHasDbgInfo) {
// if
// all its users can be transformed, then split up the aggregate into its
// separate elements.
if (ShouldAttemptScalarRepl(AI) && isSafeAllocaToScalarRepl(AI)) {
if (!HLM.GetShaderModel()->IsSM69Plus() && ShouldAttemptScalarRepl(AI) &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pass is so complicated as it is, can we capture a bool for DXIL vector support earlier and use that instead of the individual SM 6.9 checks here as well?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'll do that throughout.

isSafeAllocaToScalarRepl(AI)) {
std::vector<Value *> Elts;
IRBuilder<> Builder(dxilutil::FindAllocaInsertionPt(AI));
bool hasPrecise = HLModule::HasPreciseAttributeWithMetadata(AI);
Expand Down Expand Up @@ -1945,8 +1946,9 @@ bool SROAGlobalAndAllocas(HLModule &HLM, bool bHasDbgInfo) {
continue;
}

// Flat Global vector if no dynamic vector indexing.
bool bFlatVector = !hasDynamicVectorIndexing(GV);
// Flat Global vector if no dynamic vector indexing and pre-6.9.
bool bFlatVector =
!hasDynamicVectorIndexing(GV) && !HLM.GetShaderModel()->IsSM69Plus();

if (bFlatVector) {
GVDbgOffset &dbgOffset = GVDbgOffsetMap[GV];
Expand Down Expand Up @@ -1980,10 +1982,12 @@ bool SROAGlobalAndAllocas(HLModule &HLM, bool bHasDbgInfo) {
} else {
// SROA_Parameter_HLSL has no access to a domtree, if one is needed,
// it'll be generated
SROAed = SROA_Helper::DoScalarReplacement(
GV, Elts, Builder, bFlatVector,
// TODO: set precise.
/*hasPrecise*/ false, typeSys, DL, DeadInsts, /*DT*/ nullptr);
if (!HLM.GetShaderModel()->IsSM69Plus()) {
SROAed = SROA_Helper::DoScalarReplacement(
GV, Elts, Builder, bFlatVector,
// TODO: set precise.
/*hasPrecise*/ false, typeSys, DL, DeadInsts, /*DT*/ nullptr);
}
}

if (SROAed) {
Expand Down
6 changes: 6 additions & 0 deletions lib/Transforms/Scalar/Scalarizer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,8 @@
//
//===----------------------------------------------------------------------===//

#include "dxc/DXIL/DxilModule.h"

#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstVisitor.h"
Expand Down Expand Up @@ -290,6 +292,10 @@ bool Scalarizer::doInitialization(Module &M) {
}

bool Scalarizer::runOnFunction(Function &F) {
if (F.getParent()->HasDxilModule())
if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (F.getParent()->HasDxilModule())
if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
auto *Parent = F.getParent();
if (Parent->HasDxilModule())
if (Parent->GetDxilModule().GetShaderModel()->IsSM69Plus())

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could even simplify this further to just:

SupportsVectors = F.getParent()->HasDxilModule() && F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus();

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel like these comments are pulling in the same direction. The first suggests adding a variable to prevent redundant parent retrieval. The second retains that redundancy, but places it on a single line that clang-format will just break up into multiple lines again. I will merge the conditionals though.

Though the compiler will likely remove any redundancy, I'm better disposed toward the temp variable unless there is some coding guideline that insists on a ?: operator in this case.

My reading of https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable doesn't suggest that auto is best here as it's not clear what type the parent will be without knowledge of what function parents always(?) are. So I'm going to spell out Module

return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually want to turn off scalarization entirely? Maybe this is one place where we need to preserve it for vec1 only?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper support for vec1 was missing from this change. I've added it now.


for (Function::iterator BBI = F.begin(), BBE = F.end(); BBI != BBE; ++BBI) {
BasicBlock *BB = BBI;
for (BasicBlock::iterator II = BB->begin(), IE = BB->end(); II != IE;) {
Expand Down
15 changes: 1 addition & 14 deletions tools/clang/lib/CodeGen/CGExprScalar.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3713,20 +3713,7 @@ VisitAbstractConditionalOperator(const AbstractConditionalOperator *E) {
llvm::Value *CondV = CGF.EmitScalarExpr(condExpr);
llvm::Value *LHS = Visit(lhsExpr);
llvm::Value *RHS = Visit(rhsExpr);
if (llvm::VectorType *VT = dyn_cast<llvm::VectorType>(CondV->getType())) {
llvm::VectorType *ResultVT = cast<llvm::VectorType>(LHS->getType());
llvm::Value *result = llvm::UndefValue::get(ResultVT);
for (unsigned i = 0; i < VT->getNumElements(); i++) {
llvm::Value *EltCond = Builder.CreateExtractElement(CondV, i);
llvm::Value *EltL = Builder.CreateExtractElement(LHS, i);
llvm::Value *EltR = Builder.CreateExtractElement(RHS, i);
llvm::Value *EltSelect = Builder.CreateSelect(EltCond, EltL, EltR);
result = Builder.CreateInsertElement(result, EltSelect, i);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess nothing depended on this early scalarization of select? I wonder why it was added in the first place. I could imagine subtle regressions by removing this code if some optimization before scalarization depended on it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing I could discern. If you have suggestions on how to further verify this, I'm happy to try.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it was part of HLSL 2018 and earlier, and might be related to how the non-short-circuting behavior was implemented.

I see you have some HLSL 2018 tests below. I'll look at them more closely to see that they preserve non-short-circuiting.

Code coverage does indicate that this isn't dead code: https://microsoft.github.io/DirectXShaderCompiler/coverage/home/runner/work/DirectXShaderCompiler/DirectXShaderCompiler/tools/clang/lib/CodeGen/CGExprScalar.cpp.html#L3709

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it's not dead, but I've found a lot of pre-emptive scalarization that just duplicates the work that later scalarization passes would do anyway. Short-circuiting involvement is a possibility though.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aspect of this code that prevented short-circuiting was returning before the logic below could potentially insert conditional jumps. The scalarization was incidental. This is why the scalar cases would skip it and just generate the select operation, which would still prevent short-circuiting without any scalarization as scalars don't short circuit any more than vectors do in HLSL 2018. That select generation is the behavior I maintained for vectors and scalars. In pre-6.9 cases, the vector select will be scalarized functionally equivalently as before.

With this hard-coded scalarization in place, it can't be skipped with the existing changes to scalarization passes. We'd need a special case conditional for 6.9 here.

}
return result;
} else {
return Builder.CreateSelect(CondV, LHS, RHS);
}
return Builder.CreateSelect(CondV, LHS, RHS);
}
if (hlsl::IsHLSLMatType(E->getType())) {
llvm::Value *Cond = CGF.EmitScalarExpr(condExpr);
Expand Down
12 changes: 8 additions & 4 deletions tools/clang/lib/Sema/SemaHLSL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5185,8 +5185,8 @@ class HLSLExternalSource : public ExternalSemaSource {
}
return false;
} else if (Template->getTemplatedDecl()->hasAttr<HLSLTessPatchAttr>()) {
DXASSERT(TemplateArgList.size() == 1,
"Tessellation patch has more than one template arg");
DXASSERT(TemplateArgList.size() > 0,
"Tessellation patch should have at least one template args");
const TemplateArgumentLoc &argLoc = TemplateArgList[0];
const TemplateArgument &arg = argLoc.getArgument();
DXASSERT(arg.getKind() == TemplateArgument::ArgKind::Type, "");
Expand Down Expand Up @@ -6487,6 +6487,9 @@ bool HLSLExternalSource::MatchArguments(
}
}

std::string profile = m_sema->getLangOpts().HLSLProfile;
const ShaderModel *SM = hlsl::ShaderModel::GetByName(profile.c_str());
Comment on lines +6706 to +6707
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we couldn't have a LangOpt for HLSLLongVectorSupport, initialized based on the shader model instead?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could. We had a slew of these as part of earlier shader models that eventually got removed and merged into a single SM. I could see the readability benefit of being able to query it individually and also in implementing the hidden flag to force scalarization. It's just that precedent of removing similar flags to minimize test area that gives me pause.


// Populate argTypes.
for (size_t i = 0; i <= Args.size(); i++) {
const HLSL_INTRINSIC_ARGUMENT *pArgument = &pIntrinsic->pArgs[i];
Expand Down Expand Up @@ -6657,8 +6660,9 @@ bool HLSLExternalSource::MatchArguments(
}

// Verify that the final results are in bounds.
CAB(uCols > 0 && uCols <= MaxVectorSize && uRows > 0 &&
uRows <= MaxVectorSize,
CAB((uCols > 0 && uRows > 0 &&
((uCols <= MaxVectorSize && uRows <= MaxVectorSize) ||
(SM->IsSM69Plus() && uRows == 1))),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this change relevant to allowing native LLVM vectors? Shouldn't this be part of #7143 instead?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be part of this change, but it isn't part of the decls change either. This is for argument matching of an HLSL intrinsic, which we aren't doing quite yet either.

i);

// Const
Expand Down
Loading