[SM6.9] Enable trivial native vector Dxil Operations plus a few#7321
[SM6.9] Enable trivial native vector Dxil Operations plus a few#7321pow2clk wants to merge 6 commits intomicrosoft:mainfrom
Conversation
This enables the generation of native vector DXIL Operations that are "trivial", meaning they take only a single DXOp Call instruction to implement as well as a few others that either only took such a call and some llvm operations or were of particular interest for other reasons. This involves allowing the overloads by adding the vector indication in hctdb, altering the lowering to maintain the vectors instead of scalarizing them, and a few sundry changes to fix issues along the way. The "trivial" dxil operations that return a different value from the overload type had to be moved out of the way and given their own lowering function so that the main function could generate vectors conditional on the version and vector type. These will be added in a later change. While the long vector supporting intrinsics that weren't given this treatment will continue to generate scalarized operations, some of them needed some work as well. The dot product for float vectors longer than 4 had to take the integer fallback path, which required some small modificaitons and a rename. Additionally, a heuristic for pow that malfunctioned with too many elements had to have a limit placed on it. Since the or()/and()/select() intrinsics translate directly to LLVM ops, they can have their lowering scalarization removed and what future scalarization might be needed by the current version can be done by later passes as with other LLVM operators. An issue with a special value used to represent unassined dimensions had to be addressed since new dimensions can exceed that value. It's now MAX_INT. Contributes to microsoft#7120, but I'd prefer to leave it open until all intrinsics are covered
Any altered function is brought inline with LLVM coding standards for varaible capitalization.
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
| @@ -96,16 +96,16 @@ const OP::OpCodeProperty OP::m_OpCodeProps[(unsigned)OP::OpCode::NumOpCodes] = { | |||
| "unary", | |||
There was a problem hiding this comment.
REVIEWER GUIDANCE: I recommend initially just looking at this commit: 2bdf33e The rest are NFC formatting and style changes or autogenerated code changes.
|
REVIEWER GUIDANCE: I recommend initially just looking at this commit: 2bdf33e The rest are NFC formatting and style changes or autogenerated code changes. |
|
Note that an IR test is incoming. |
Was using int dot for the float operands as it was originally an int-only lowering function.
llvm-beanz
left a comment
There was a problem hiding this comment.
This is a mess to review. HLOperationLower.cpp has a massive volume of unrelated changes to code style. Pointing reviewers at the first commit that occurs before the reformatting is extremely unhelpful.
As I go through the first commit, GitHub won't let me comment on out-of-date lines. This means as a reviewer I need to review both he single commit and the whole change in parallel to see if what I'm commenting on is fixed and to put a comment in the appropriate place.
If you want to do a mass reformat, please put it in a separate PR.
| else | ||
| return Builder.CreateCall(Func, Args); // Cannot add name to void. |
There was a problem hiding this comment.
see: https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return
| else | |
| return Builder.CreateCall(Func, Args); // Cannot add name to void. | |
| return Builder.CreateCall(Func, Args); // Cannot add name to void. |
| // CHECK: call float @dx.op.unary.f32(i32 17, float %{{.*}}) ; Atan(value) | ||
| // CHECK: call float @dx.op.unary.f32(i32 17, float %{{.*}}) ; Atan(value) | ||
| // CHECK: call float @dx.op.unary.f32(i32 17, float %{{.*}}) ; Atan(value) | ||
| // CHECK: call float @dx.op.unary.f32(i32 17, float %{{.*}}) ; Atan(value) |
There was a problem hiding this comment.
Why don't these use the supported vector overloads for expansions?
I think it would have made sense for each of these operations with corresponding vector DXIL ops:
- atan2 (Atan)
- fmod (FAbs, Frc)
- ldexp (Exp)
- pow (Log, Exp)
- modf (Round_z)
| // of `SupportsVectors`, which is deteremined by version and opcode support. | ||
| Value *TrivialDxilOperation(OP::OpCode Opcode, ArrayRef<Value *> Args, Type *Ty, | ||
| Type *RetTy, OP *OP, IRBuilder<> &Builder, | ||
| bool SupportsVectors = false) { |
There was a problem hiding this comment.
Instead of the caller supplying SupportsVectors, couldn't this be looked up by the operation and the module? Something like:
if (Ty->isVectorTy() && Ty->getVectorNumElements() > 1 &&
OP->GetModule()->GetHLModule()->GetShaderModel()->IsSM69Plus() &&
OP::IsOverloadLegal(Opcode, Ty)) {
// ...
}
REVIEWER GUIDANCE: I recommend initially just looking at this commit: 2bdf33e The rest are NFC formatting and style changes or autogenerated code changes. << Remove before submitting.
This enables the generation of native vector DXIL Operations
that are "trivial", meaning they take only a single DXOp Call
instruction to implement as well as a few others that either only took
such a call and some llvm operations or were of particular interest for
other reasons.
This involves allowing the overloads by adding the vector indication in
hctdb, altering the lowering to maintain the vectors instead of
scalarizing them, and a few sundry changes to fix issues along the way.
The "trivial" dxil operations that return a different value from the
overload type had to be moved out of the way and given their own
lowering function so that the main function could generate vectors
conditional on the version and vector type. These will be added in a
later change.
While the long vector supporting intrinsics that weren't given this
treatment will continue to generate scalarized operations, some of them
needed some work as well. The dot product for float vectors longer than
4 had to take the integer fallback path, which required some small
modifications and a rename.
Additionally, a heuristic for pow that malfunctioned with too many
elements had to have a limit placed on it.
Since the or()/and()/select() intrinsics translate directly to LLVM ops,
they can have their lowering scalarization removed and what future
scalarization might be needed by the current version can be done by
later passes as with other LLVM operators.
An issue with a special value used to represent unassigned dimensions had
to be addressed since new dimensions can exceed that value. It's now
MAX_INT.
Contributes to #7120, but I'd prefer to leave it open until all
intrinsics are covered