Allow native vectors for LLVM operations by pow2clk · Pull Request #7155 · microsoft/DirectXShaderCompiler

pow2clk · 2025-02-20T22:08:59Z

Disables various forms of scalarization and vector elimination to permit
vectors to pass through to final DXIL when used in native LLVM
operations and loading/storing.

Introduces a few vector manipulation llvm instructions to DXIL allowing
for them to appear in output DXIL.

Skips passes for 6.9 that scalarize, convert to arrays, or otherwise eliminate vectors.
This eliminates the element-by-element loading of the vectors
In many cases, this required plumbing the shader model information to
passes that didn't have it before.

Many changes were needed for the MatrixBitcastLower pass related to
linking to avoid converting matrix vectors, but also to perform the
conversion if a shader was compiled for 6.9+, but then linked to a
earlier target.
This now adapts to the linker target to either preserve vectors for 6.9 or arrays for previous versions.
This requires running the DynamicIndexing VectorToArray pass during linking since 6_x and 6_9+ will fail to run this in the initial compile, but will still need to lower vectors to arrays.

Ternary conditional/select operators were element extracted in codegen.
Removing this allows 6.9 to preserve the vectors, but also maintains
behavior for previous shader models because the operations get
scalarized later anyway.

Keep groupshared variables as vectors for 6.9. They are no longer represented as indivual groupshared scalars.

Adds extensive tests for these operations using different types and
sizes and testing them appropriately. Booleans produce significantly
different code, so they get their own test.

Fixes #7123

Disables various forms of scalarization and vector elimination to permit vectors to pass through to final DXIL when used in native LLVM operations and loading/storing. Introduces a few vector manipulation llvm instructions to DXIL allowing for them to appear in output DXIL. Skips passes for 6.9 that scalarize, convert to arrays, or otherwise eliminate vectors. This eliminates the element-by-element loading of the vectors In many cases, this required plumbing the shader model information to passes that didn't have it before. Many changes were needed for the MatrixBitcastLower pass related to linking to avoid converting matrix vectors, but also to perform the conversion if a shader was compiled for 6.9+, but then linked to a earlier target. This now adapts to the linker target to either preserve vectors for 6.9 or arrays for previous versions. This requires running the DynamicIndexing VectorToArray pass during linking since 6_x and 6_9+ will fail to run this in the initial compile, but will still need to lower vectors to arrays. Ternary conditional/select operators were element extracted in codegen. Removing this allows 6.9 to preserve the vectors, but also maintains behavior for previous shader models because the operations get scalarized later anyway. Keep groupshared variables as vectors for 6.9. They are no longer represented as indivual groupshared scalars. Adds extensive tests for these operations using different types and sizes and testing them appropriately. Booleans produce significantly different code, so they get their own test. Fixes microsoft#7123

pow2clk · 2025-02-20T22:10:14Z

The first two commits are dependencies that belong to a different PR. Only the last is relevant to this: 68a284b

This got lost somewhere

llvm-beanz

This PR has changes to multiple passes that do not have IR tests. We should have tests that verify correctness of the changes in the passes separately from the Clang-level changes.

llvm-beanz · 2025-02-25T02:26:50Z

 def err_hlsl_out_indices_array_incorrect_access: Error<
   "a vector in out indices array must be accessed as a whole">;
+def err_hlsl_unsupported_long_vector: Error<
+   "Vectors of over 4 elements in %0 are not supported">;


Suggested change

"Vectors of over 4 elements in %0 are not supported">;

"vectors of over 4 elements in %0 are not supported">;

This change is not part of this PR, but a prerequisite as I tried to explain in this comment: #7155 (comment) I have fixed it in the relevant PR: #7143

llvm-beanz · 2025-02-25T02:29:10Z

  return false;
 }

+bool hlsl::HasLongVecs(const QualType &qt) {


nit: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

As above, this is not part of this PR, but part of the prerequisite #7143. I renamed it inline with your suggestion on that change, which I'm afraid doesn't match the coding standards you cited. e010223#r1960826249

llvm-beanz · 2025-02-25T02:29:36Z

+  if (qt.isNull()) {
+    return false;
+  }


nit: https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements

Suggested change

if (qt.isNull()) {

return false;

}

if (qt.isNull())

return false;

Not part of this PR. I've responded to it here: 50d1af5

tex3d · 2025-02-25T21:05:27Z

-        HLMatrixType MatTy = HLMatrixType::cast(EltTy);
-        Value *matSize = Builder.getInt32(MatTy.getNumElements());
-        idxList.back() = Builder.CreateMul(idxList.back(), matSize);
+        if (!DM.GetShaderModel()->IsSM69Plus()) {


Can we base the low-level decisions on a descriptively named boolean field in the MatrixBitcastLowerPass class, initialized based on the shader model, instead of passing the DxilModule down and checking the shader model each time?

Maybe even something about scalarizing operations, rather than anything to do with the native DXIL vector support?

Since the behavior for vec1s should proceed exactly as previously, I've added indications of native vector support which will use that and the vector size to determine what to do in all of these passes.

tex3d · 2025-02-25T21:10:14Z

+            Value *GEP = CreateEltGEP(A, i, zeroIdx, Builder);
+            Value *Elt = Builder.CreateLoad(GEP);
+            NewVec = Builder.CreateInsertElement(NewVec, Elt, i);
+          }


I noticed that below this point, there's:

} else if (StoreInst *ST = dyn_cast<StoreInst>(U)) {

where it still scalarizes the store for the vector.

Did you mean to leave that scalarization in?

Yeah, I omitted store support in error. Clearly we need some testing for it as it never caused any issues.

tex3d · 2025-02-25T21:15:51Z


+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
+      return false;


So, this is where we presumably will still want to do something different for vec1, right?

It's one of the places. However, this has no effect on the operators, but the intrinsics, which will come in another changes. So this particular file's changes are being reverted for now.

tex3d · 2025-02-25T21:21:00Z

            Value *val = stInst->getValueOperand();
            Value *ldVal = Builder.CreateLoad(V);
-            ldVal = Builder.CreateInsertElement(ldVal, val, constIdx);
+            ldVal = Builder.CreateInsertElement(ldVal, val, constIdx); // UGH


Did you want to elaborate on the "UGH" comment? Is it on the general operation being performed, or something specific about this line?

It was just how I identified places that I needed to do something to deal with scalarization. Obviously not meant to be left in. I have now done what I intended to address this.

tex3d · 2025-02-25T21:22:28Z

      // all its users can be transformed, then split up the aggregate into its
      // separate elements.
-      if (ShouldAttemptScalarRepl(AI) && isSafeAllocaToScalarRepl(AI)) {
+      if (!HLM.GetShaderModel()->IsSM69Plus() && ShouldAttemptScalarRepl(AI) &&


This pass is so complicated as it is, can we capture a bool for DXIL vector support earlier and use that instead of the individual SM 6.9 checks here as well?

Yeah. I'll do that throughout.

tex3d · 2025-02-25T21:28:50Z

 bool Scalarizer::runOnFunction(Function &F) {
+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
+      return false;


Do we actually want to turn off scalarization entirely? Maybe this is one place where we need to preserve it for vec1 only?

Proper support for vec1 was missing from this change. I've added it now.

tex3d · 2025-02-25T21:42:09Z

-          llvm::Value *EltL = Builder.CreateExtractElement(LHS, i);
-          llvm::Value *EltR = Builder.CreateExtractElement(RHS, i);
-          llvm::Value *EltSelect = Builder.CreateSelect(EltCond, EltL, EltR);
-          result = Builder.CreateInsertElement(result, EltSelect, i);


I guess nothing depended on this early scalarization of select? I wonder why it was added in the first place. I could imagine subtle regressions by removing this code if some optimization before scalarization depended on it.

Nothing I could discern. If you have suggestions on how to further verify this, I'm happy to try.

This looks like it was part of HLSL 2018 and earlier, and might be related to how the non-short-circuting behavior was implemented.

I see you have some HLSL 2018 tests below. I'll look at them more closely to see that they preserve non-short-circuiting.

Code coverage does indicate that this isn't dead code: https://microsoft.github.io/DirectXShaderCompiler/coverage/home/runner/work/DirectXShaderCompiler/DirectXShaderCompiler/tools/clang/lib/CodeGen/CGExprScalar.cpp.html#L3709

I'm sure it's not dead, but I've found a lot of pre-emptive scalarization that just duplicates the work that later scalarization passes would do anyway. Short-circuiting involvement is a possibility though.

The aspect of this code that prevented short-circuiting was returning before the logic below could potentially insert conditional jumps. The scalarization was incidental. This is why the scalar cases would skip it and just generate the select operation, which would still prevent short-circuiting without any scalarization as scalars don't short circuit any more than vectors do in HLSL 2018. That select generation is the behavior I maintained for vectors and scalars. In pre-6.9 cases, the vector select will be scalarized functionally equivalently as before.

With this hard-coded scalarization in place, it can't be skipped with the existing changes to scalarization passes. We'd need a special case conditional for 6.9 here.

tex3d · 2025-02-25T21:45:10Z

+  std::string profile = m_sema->getLangOpts().HLSLProfile;
+  const ShaderModel *SM = hlsl::ShaderModel::GetByName(profile.c_str());


I wonder if we couldn't have a LangOpt for HLSLLongVectorSupport, initialized based on the shader model instead?

We could. We had a slew of these as part of earlier shader models that eventually got removed and merged into a single SM. I could see the readability benefit of being able to query it individually and also in implementing the hidden flag to force scalarization. It's just that precedent of removing similar flags to minimize test area that gives me pause.

tex3d · 2025-02-25T22:04:31Z

-              uRows <= MaxVectorSize,
+      CAB((uCols > 0 && uRows > 0 &&
+           ((uCols <= MaxVectorSize && uRows <= MaxVectorSize) ||
+            (SM->IsSM69Plus() && uRows == 1))),


How is this change relevant to allowing native LLVM vectors? Shouldn't this be part of #7143 instead?

It shouldn't be part of this change, but it isn't part of the decls change either. This is for argument matching of an HLSL intrinsic, which we aren't doing quite yet either.

tex3d · 2025-02-25T22:16:38Z

+
+  Link(L"main", L"ps_6_9", pLinker, {libName, libName2},
+       {"alloca [2 x <12 x float>]",
+        "getelementptr [12 x float], [12 x float]*"},


I'm not sure how to tell if this is right. Shouldn't this be dealing with a vector rather than an array here? Can I see the IR?

I wrote this in December. I don't remember myself just now. I'll look into it and get back to you.

Well I was ready to write something here about how this was compiled as 6.3 because that's what the file says, but it's actually 6_x. The 12 element array comes from HLMatrixLower, which I hadn't touched.

Okay, it's just scratch space used to move the elements of the matrix around. They use vectors too. I don't know why the test cares about it so much, but it doesn't need changing. I'll take a look at leveraging long vectors for matrices later, but this works perfectly fine for now.

pow2clk · 2025-03-10T11:56:28Z

I have fixed the merge conflicts and responded to some of the feedback locally, but the way the dependencies are set up, if I push those changes before #7143 is merged, it's going to make distinguishing the prerequisite changes and the changes that are the subject of this PR even harder.

joaosaffran

Code looks good to me. Not an expert though

pow2clk

Updated and added to pending comments I forgot to submit some time ago. A new commit should be coming soon, but there are a lot of IR tests to write.

pow2clk · 2025-03-14T18:34:42Z

-        HLMatrixType MatTy = HLMatrixType::cast(EltTy);
-        Value *matSize = Builder.getInt32(MatTy.getNumElements());
-        idxList.back() = Builder.CreateMul(idxList.back(), matSize);
+        if (!DM.GetShaderModel()->IsSM69Plus()) {


Since the behavior for vec1s should proceed exactly as previously, I've added indications of native vector support which will use that and the vector size to determine what to do in all of these passes.

pow2clk · 2025-03-14T18:35:02Z

+            Value *GEP = CreateEltGEP(A, i, zeroIdx, Builder);
+            Value *Elt = Builder.CreateLoad(GEP);
+            NewVec = Builder.CreateInsertElement(NewVec, Elt, i);
+          }


Yeah, I omitted store support in error. Clearly we need some testing for it as it never caused any issues.

pow2clk · 2025-03-14T18:36:00Z


+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
+      return false;


It's one of the places. However, this has no effect on the operators, but the intrinsics, which will come in another changes. So this particular file's changes are being reverted for now.

pow2clk · 2025-03-14T18:36:40Z

            Value *val = stInst->getValueOperand();
            Value *ldVal = Builder.CreateLoad(V);
-            ldVal = Builder.CreateInsertElement(ldVal, val, constIdx);
+            ldVal = Builder.CreateInsertElement(ldVal, val, constIdx); // UGH


It was just how I identified places that I needed to do something to deal with scalarization. Obviously not meant to be left in. I have now done what I intended to address this.

pow2clk · 2025-03-17T17:43:56Z

      // all its users can be transformed, then split up the aggregate into its
      // separate elements.
-      if (ShouldAttemptScalarRepl(AI) && isSafeAllocaToScalarRepl(AI)) {
+      if (!HLM.GetShaderModel()->IsSM69Plus() && ShouldAttemptScalarRepl(AI) &&


Yeah. I'll do that throughout.

pow2clk · 2025-03-17T17:44:26Z

 bool Scalarizer::runOnFunction(Function &F) {
+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
+      return false;


Proper support for vec1 was missing from this change. I've added it now.

pow2clk · 2025-03-17T17:45:16Z

-          llvm::Value *EltL = Builder.CreateExtractElement(LHS, i);
-          llvm::Value *EltR = Builder.CreateExtractElement(RHS, i);
-          llvm::Value *EltSelect = Builder.CreateSelect(EltCond, EltL, EltR);
-          result = Builder.CreateInsertElement(result, EltSelect, i);


Nothing I could discern. If you have suggestions on how to further verify this, I'm happy to try.

pow2clk · 2025-03-17T17:47:25Z

+  std::string profile = m_sema->getLangOpts().HLSLProfile;
+  const ShaderModel *SM = hlsl::ShaderModel::GetByName(profile.c_str());


We could. We had a slew of these as part of earlier shader models that eventually got removed and merged into a single SM. I could see the readability benefit of being able to query it individually and also in implementing the hidden flag to force scalarization. It's just that precedent of removing similar flags to minimize test area that gives me pause.

pow2clk · 2025-03-17T17:47:56Z

-              uRows <= MaxVectorSize,
+      CAB((uCols > 0 && uRows > 0 &&
+           ((uCols <= MaxVectorSize && uRows <= MaxVectorSize) ||
+            (SM->IsSM69Plus() && uRows == 1))),


It shouldn't be part of this change, but it isn't part of the decls change either. This is for argument matching of an HLSL intrinsic, which we aren't doing quite yet either.

pow2clk · 2025-03-17T17:49:07Z

+
+  Link(L"main", L"ps_6_9", pLinker, {libName, libName2},
+       {"alloca [2 x <12 x float>]",
+        "getelementptr [12 x float], [12 x float]*"},


I wrote this in December. I don't remember myself just now. I'll look into it and get back to you.

farzonl · 2025-03-21T18:55:22Z


-      // Flat Global vector if no dynamic vector indexing.
+      // Flatten Global vector if no dynamic vector indexing.
      bool bFlatVector = !hasDynamicVectorIndexing(GV);


So I understand not flattening vectors. does this also mean we won't flatten multi-dimensional arrays of vectors if we have dynamic vector indexing?

Vectors and their dynamic indexing have no bearing on multi-dimensional array flattening. hasDynamicVectorIndexing strips off all array GEPs and only pays attention to how the vector itself is indexed. The DynamicVectorIndexingVectorToArray pass is run before MultiDimArrayToOneDimArray is. By the time we're worried about flattening arrays, any vector flattening has already been done. In non-lib pre-6.9 cases, that means the vector element looks just like another layer of the multidimensional array and gets flattened with the rest of it. In cases where the vector survives, it becomes the element of the flattened array just like the scalar would. That's why no changes to MultiDimArrayToOneDimArray were required.

farzonl · 2025-03-21T19:13:17Z

+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())


Suggested change

if (F.getParent()->HasDxilModule())

if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())

auto *Parent = F.getParent();

if (Parent->HasDxilModule())

if (Parent->GetDxilModule().GetShaderModel()->IsSM69Plus())

Could even simplify this further to just:

SupportsVectors = F.getParent()->HasDxilModule() && F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus();

I don't feel like these comments are pulling in the same direction. The first suggests adding a variable to prevent redundant parent retrieval. The second retains that redundancy, but places it on a single line that clang-format will just break up into multiple lines again. I will merge the conditionals though.

Though the compiler will likely remove any redundancy, I'm better disposed toward the temp variable unless there is some coding guideline that insists on a ?: operator in this case.

My reading of https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable doesn't suggest that auto is best here as it's not clear what type the parent will be without knowledge of what function parents always(?) are. So I'm going to spell out Module

farzonl · 2025-03-21T19:15:37Z

+!3 = !{i32 1, i32 6}
+!4 = !{i32 1, i32 6}
+!5 = !{!"lib", i32 6, i32 6}
+!6 = !{i32 1, float ([3 x <3 x float>]*)* @"\01?bar@@YAMY02V?$vector@M$02@@@Z", !7, float ([3 x <3 x float>]*)* @"\01?foo@@YAMY02V?$vector@M$02@@@Z", !7}


what happens if we have a [3 x [3 x <3 x float>]]?

Nothing different. Multidimensional array flattening only interacts with vector flattening when the vectors look like arrays. Anyway, this test is only affected by this change because it was missing the needed metadata to reconstruct the HLModule when invoked with dxopt. It didn't need it before. Because of the shader model dependency, it does now.

farzonl · 2025-03-21T19:24:22Z

            "OTHER", 53, "VAArg", "VAArgInst", "vaarg instruction", "", []
        )
+
+        self.add_llvm_instr(


I understand why we are doing this since this change maps back to the change in include/dxc/DXIL/DxilInstructions.h.

I'm curious what it means though. What does it mean to add_llvm_instr in this case? I don't think it means anything about validity because i've seen these instructions emitted by DXC before?

Does this means there is a way to print out all the llvm instructions DXC supports via querying the DirectXShaderCompiler.utils.hct.hctdb_instrhelp package?

I'm curious for my own reasons because I want to build a Set difference of DXC to upstream llvm and know which instructions need to be legalized.

It does have to do with the validity in non-lib shaders. As I at least partially explained above in DxilInstructions.h, it prevents these from being immediately rejected for non-lib shaders in validation:

DirectXShaderCompiler/lib/DxilValidation/DxilValidation.cpp

Line 2694 in b646ad3

if (!IsLLVMInstructionAllowed(I)) {

.

You could construct that query or you could just look at the LLVM operations indicated by a call to add_llvm_instr in hctdb.py, or you could look at the final form of the IsLLVMInstructionAllowed function the DxilValidationImpl.inc file from a build of DXC and see the opcode value range checks. The only exceptions previously between libs and other shaders were these three, so there are no other gotchas where an operation might seem to be disallowed by these metrics, but really is for a specific case.

I don't know why the order of the extractelements varies, but it does even for the same build depending on whether a debugger is attached. It would seem to be an unordered container object, but there is no such. It just generates them as it goes and never touches them again which would suggest the opposite order in a few cases. They are constant expressions, that's probably why they move around, but I can't find where to make them consistent.

farzonl · 2025-03-21T23:08:20Z

+// CHECK-SAME: ([11 x <[[NUM:[0-9][0-9]*]] x [[TYPE:[a-z0-9]*]]>]*
+export void bittwiddlers(inout vector<TYPE, NUM> things[11]) {
+  // CHECK: [[adr1:%[0-9]*]] = getelementptr inbounds [11 x <[[NUM]] x [[TYPE]]>], [11 x <[[NUM]] x [[TYPE]]>]* %things, i32 0, i32 1
+  // CHECK: [[vec1:%[0-9]*]] = load <[[NUM]] x [[TYPE]]>, <[[NUM]] x [[TYPE]]>* [[adr1]]


would it make sense to make most of these CHECK-NEXT?

Most of them could be, but not all. Some of them have a few commands I left out because they weren't part of the logic flow of interest. I think because I meticulously track the creation and usage of all the values, the matches must either be the instructions I intend or I suppose theoretically some that are entirely equivalent in function. It isn't necessary to the correct outcome that they be on immediately subsequent lines and the dependency tracking means there's no risk of false matches that CHECK-NEXT might prevent otherwise.

That's why it's not needed. I don't want to do it because rechecking all these for which ones can and can't and which ones may on one platform and not an another is not an exercise that I want to undergo without reason to do so.

llvm-beanz · 2025-03-21T14:48:08Z

+#include "dxc/DXIL/DxilModule.h"
+


Presumedly this isn't needed.

Suggested change

#include "dxc/DXIL/DxilModule.h"

Yeah. Left in by mistake.

llvm-beanz · 2025-03-21T14:50:48Z

-                                  const DataLayout &DL,
+                                  bool SupportsVectors, bool hasPrecise,
+                                  DxilTypeSystem &typeSys, const DataLayout &DL,
                                  SmallVector<Value *, 32> &DeadInsts,


I'm not requesting that you fix this, but the use of a SmallVector as an argument does stand out to me. Generally the convention is to use a SmallVectorImpl instead as function arguments so that the function can be used with SmallVectors of different static allocation size.

llvm-beanz · 2025-03-21T14:56:05Z

-          llvm::Value *EltL = Builder.CreateExtractElement(LHS, i);
-          llvm::Value *EltR = Builder.CreateExtractElement(RHS, i);
-          llvm::Value *EltSelect = Builder.CreateSelect(EltCond, EltL, EltR);
-          result = Builder.CreateInsertElement(result, EltSelect, i);


This looks like it was part of HLSL 2018 and earlier, and might be related to how the non-short-circuting behavior was implemented.

I see you have some HLSL 2018 tests below. I'll look at them more closely to see that they preserve non-short-circuiting.

Code coverage does indicate that this isn't dead code: https://microsoft.github.io/DirectXShaderCompiler/coverage/home/runner/work/DirectXShaderCompiler/DirectXShaderCompiler/tools/clang/lib/CodeGen/CGExprScalar.cpp.html#L3709

llvm-beanz · 2025-03-21T21:48:35Z

+  if (F.getParent()->HasDxilModule())
+    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())


Could even simplify this further to just:

SupportsVectors = F.getParent()->HasDxilModule() && F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus();

llvm-beanz · 2025-03-22T00:36:49Z

      }

-      // Flat Global vector if no dynamic vector indexing.
+      // Flatten Global vector if no dynamic vector indexing.


nit: If we're fixing a comment, might as well make it a complete sentence.

Suggested change

// Flatten Global vector if no dynamic vector indexing.

// Flatten global vector if no dynamic vector indexing occurs.

I worded it more in keeping with the name of the utility function.

llvm-beanz · 2025-03-22T00:49:21Z

+  // CHECK: [[bvec5:%[0-9]*]] = icmp ne <[[NUM]] x i32> [[vec5]], zeroinitializer
+  // MORE STUFF
+
+  res[3] = truth[3] ? truth[4] : truth[5];


Can we have a HLSL 2018 test case something like:

export float4 main(bool4 B, float4 X, float4 Y) { return B ? ++X : Y; }

What I want to verify is that the side-effect of the LHS always occurs and no branch is generated under HLSL 2018.

It's a good suggestion. The native vectors might interfere with that behavior.

Initialize mattype with check for matrixness remove leftover include reword comment refactor SM69 conditional to avoid double parent retrieval Add test that confirms no short-circuiting with native vector logic ops and HLSL 2018. Revise vec1 scalarizer test that was mistakenly generated with HLSL 2021 which included short-circuiting. Add validation check for vector operations in pre-6.9

bogner

Generally looks good to me, though it'd be good to get explicit approval from one of the folks who've done a few rounds of review on it.

Some of the LLVM IR tests have a few bits of unreferenced or unnecessary metadata that it'd be nice to clean up before committing.

bogner · 2025-03-24T17:56:12Z

+  // Since HL module can't be created when linking, check for that first.
+  // Otherwise, either retrieve or generate the HL module.


Should this say "Since dxmodule can't be created when linking..."?

No, we actually can create a dxil module when linking. Maybe I'll replace it with getorcreatedxilmodule just in case. We can't create an hlmodule while linking

I see why it's confusing now though. the "it" was supposed to be a dxil module, but that is not at all clear from the wording.

tex3d · 2025-03-25T17:02:07Z

 } // namespace

-void MatrixBitcastLowerPass::lowerMatrix(Instruction *M, Value *A) {
+void MatrixBitcastLowerPass::lowerMatrix(DxilModule &DM, Instruction *M,


I'm not seeing any use of DM, other than the recursive one required by this function signature.

Yeah, it's a relic from when I was getting the version from it.

tex3d · 2025-03-25T17:02:54Z

  StringRef getPassName() const override { return "Matrix Bitcast lower"; }
  bool runOnFunction(Function &F) override {
+    DxilModule &DM = F.getParent()->GetOrCreateDxilModule();
+    SupportsVectors = DM.GetShaderModel()->IsSM69Plus();


If you want to avoid full DxilModule dependency, I think it would be reasonable to have something that can look up the DXIL version in the module without loading full metadata, whether or not it's HLM or DM.

I like this idea. I know I mentioned it somewhere before. I'd prefer it not block this change though.

tex3d · 2025-03-25T17:23:01Z

+    SupportsVectors = M.GetDxilModule().GetShaderModel()->IsSM69Plus();
+  } else {
+    HLModule &HLM = M.GetOrCreateHLModule();
+    SupportsVectors = HLM.GetShaderModel()->IsSM69Plus();


I mentioned earlier, it might be better to get the DXIL version from metadata without relying on full HLModule/DxilModule metadata loading. Maybe we can add a dxilutil::ReadDxilVersion function?

But the fixes to the HLModule loading are appreciated!

pow2clk · 2025-03-25T17:27:31Z

Generally looks good to me, though it'd be good to get explicit approval from one of the folks who've done a few rounds of review on it.

Some of the LLVM IR tests have a few bits of unreferenced or unnecessary metadata that it'd be nice to clean up before committing.

Just as an informational note, I discovered that the !paseresume line that I think was identified as unneeded at some point actually turns out to be in some places. Without it, we don't get the shader model, so we fall back to default pre-6.9 behavior. That's okay for the vec1 tests because they should be scalarized, but they weren't actually confirming that the scalarization still took place in 6.9, which was their intent.

Remove some unneeded parts of IR tests. Add some back. I discovered that the pauseresume line is at least sometimes needed to construct the module with the shader model information. I fear some of the tests were passing before becuase they expected no changes, but were operating in non-6.9 mode. So they passed, but for the wrong reason. Remove unused DM param to matrixbitcastlowerpass lower matrix call. reword confusing dynindexingvectortoarray comment

tex3d · 2025-03-25T17:56:46Z


 bool Scalarizer::runOnFunction(Function &F) {
+  Module *M = F.getParent();
+  if (M->HasDxilModule() && M->GetDxilModule().GetShaderModel()->IsSM69Plus())


I think the independent lookup of DXIL version would help here, since it won't do something different when there's no created DxilModule already present.

I could also replace it with getorcreate. I was a bit nervous about that becuase of how it failed for hlmodules, but I've since learned that it's a lot more robust and I've made the hlmodule variant similarly so.

Adds a dxilutil function to retrieve shader model from metadata. To enable this, the metadata helper implementation is split into a few utility functions which will continue to throw the exceptions as they did before, but allows passes to call them without fear of crashes as a result of that. Passes will assume the default shader model where it can't retrieve it for one reason or another. Changes scalarizer to try to retrieve the shader module when no module is found. Changes dyanmicvector to array to use the utility function and never try to recreate the hlmodule. Instead it will try the dxilmodule and if that isn't present, it will use the metadata for shader model only.

With the minimal metadata version requirement, these can be more succinct

pow2clk requested a review from a team as a code owner February 20, 2025 22:09

damyanp requested review from bob80905, llvm-beanz and tex3d February 20, 2025 22:30

damyanp assigned llvm-beanz and tex3d Feb 24, 2025

fix assert for tesselation patch template args

556f6e6

This got lost somewhere

llvm-beanz reviewed Feb 25, 2025

View reviewed changes

damyanp assigned pow2clk and unassigned llvm-beanz and tex3d Feb 25, 2025

tex3d self-assigned this Feb 25, 2025

tex3d reviewed Feb 25, 2025

View reviewed changes

damyanp unassigned tex3d Mar 7, 2025

pow2clk removed their assignment Mar 14, 2025

joaosaffran approved these changes Mar 14, 2025

View reviewed changes

pow2clk commented Mar 17, 2025

View reviewed changes

Merge 'origin/main' into longvec_operators

654bf8a

farzonl reviewed Mar 21, 2025

View reviewed changes

Greg Roth added 2 commits March 21, 2025 15:46

another constant case I caught just in case

a3a39b8

farzonl reviewed Mar 21, 2025

View reviewed changes

llvm-beanz reviewed Mar 22, 2025

View reviewed changes

Greg Roth added 4 commits March 24, 2025 04:35

clang-format

1596d99

fixup mistakenly submitted validation code

411b921

clang-format

8dc2a3e

bogner approved these changes Mar 24, 2025

View reviewed changes

llvm-beanz approved these changes Mar 25, 2025

View reviewed changes

tex3d reviewed Mar 25, 2025

View reviewed changes

Greg Roth added 4 commits March 25, 2025 17:07

small clang-format change

ba831c0

Retrieve dxil version instead of shader model from metadata

7292edb

trim down IR passes

96ac2e2

With the minimal metadata version requirement, these can be more succinct

pow2clk merged commit 1eb83c7 into microsoft:main Mar 26, 2025
12 checks passed

github-project-automation Bot moved this from New to Done in HLSL Roadmap Mar 26, 2025

pow2clk deleted the longvec_operators branch March 26, 2025 11:19

pow2clk mentioned this pull request Mar 26, 2025

[SM6.9] Scalarization passes involves a lot of type checks for vec1s #7261

Closed

damyanp moved this from Needs Review to Closed in HLSL Support Apr 22, 2025

damyanp removed this from HLSL Support Jun 25, 2025

	"Vectors of over 4 elements in %0 are not supported">;
	"vectors of over 4 elements in %0 are not supported">;

		std::string profile = m_sema->getLangOpts().HLSLProfile;
		const ShaderModel *SM = hlsl::ShaderModel::GetByName(profile.c_str());

		if (F.getParent()->HasDxilModule())
		if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())

-  if (F.getParent()->HasDxilModule())
-    if (F.getParent()->GetDxilModule().GetShaderModel()->IsSM69Plus())
+  auto *Parent = F.getParent();
+  if (Parent->HasDxilModule())
+    if (Parent->GetDxilModule().GetShaderModel()->IsSM69Plus())

	// Flatten Global vector if no dynamic vector indexing.
	// Flatten global vector if no dynamic vector indexing occurs.

		// Since HL module can't be created when linking, check for that first.
		// Otherwise, either retrieve or generate the HL module.

Conversation

pow2clk commented Feb 20, 2025

Uh oh!

pow2clk commented Feb 20, 2025

Uh oh!

llvm-beanz left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pow2clk commented Mar 10, 2025

Uh oh!

joaosaffran left a comment

Choose a reason for hiding this comment

Uh oh!

pow2clk left a comment

Choose a reason for hiding this comment

Uh oh!