[CoopVec] Add pixel shader and multi-layer support to Mul and OuterProduct tests#7437
…mory and improved input vector/matrix test patterns
✅ With the latest revision this PR passed the C/C++ code formatter.
For test code, I think it's better to fail out at this point (`default:`). Refers to: tools/clang/unittests/HLSLExec/CoopVec.h:176 in f77d76f.
Probably better to use VERIFY_FAIL to stop test execution at this point? I'm assuming it's not valuable to continue running further. Log::Error will continue execution of the test case, but mark the test as failed. Or was that the intention? Refers to: tools/clang/unittests/HLSLExec/CoopVec.h:139 in f77d76f.
Should this abort the test by using a VERIFY_* macro? The same pattern appears in some of the other helpers as well. Refers to: tools/clang/unittests/HLSLExec/CoopVec.h:82 in f77d76f.
float Elt = 0.0f;

if (IsIntegralDataType(MatrixInterpretation))
  Elt = static_cast<float>(Rnd() & 0x7) - 3.0f;
They're for getting a specific range of random numbers. I added comments.
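To make the ranges concrete, here is a minimal sketch that enumerates every value the masked expressions can produce. The arithmetic is copied from the diff (`& 0x7` for integral matrix types, and the similar `& 0x3` expression used for HALF vectors in the same change); the helper names `integralElt`, `halfElt`, and `distinctValues` are hypothetical, and the `Raw` argument stands in for the value returned by `Rnd()`.

```cpp
#include <cstdint>
#include <set>

// Hypothetical helpers mirroring the expressions in the diff.
// Raw stands in for the value returned by Rnd().
inline float integralElt(uint32_t Raw) {
  // Raw & 0x7 is in [0, 7]; subtracting 3 shifts it to [-3, 4].
  return static_cast<float>(Raw & 0x7) - 3.0f;
}

inline float halfElt(uint32_t Raw) {
  // Raw & 0x3 is in [0, 3]; (v - 1) / 2 yields -0.5, 0.0, 0.5 or 1.0.
  return (static_cast<float>(Raw & 0x3) - 1.0f) / 2.0f;
}

// Collects the distinct outputs of F over enough inputs to cover both masks.
template <typename Fn> std::set<float> distinctValues(Fn F) {
  std::set<float> Values;
  for (uint32_t Raw = 0; Raw < 16; ++Raw)
    Values.insert(F(Raw));
  return Values;
}
```

All of these values are small integers or multiples of 0.5, so they are exactly representable in half precision.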
T *Vec = getVector<T>(I);
for (size_t J = 0; J < VectorSize; ++J)
  if constexpr (std::is_same_v<T, DirectX::PackedVector::HALF>) {
    float Elt = (static_cast<float>(Rnd() & 0x3) - 1.0f) / 2.0f;

if (MatrixLayout == D3D12_LINEAR_ALGEBRA_MATRIX_LAYOUT_ROW_MAJOR) {
  ConvertInfo.DestInfo.DestStride =
      (static_cast<UINT>(getVectorSize()) * DestEltSize + 15) & ~15;
This is for alignment. Added comments.
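As a quick illustration of the `(x + 15) & ~15` idiom in the snippet, the sketch below rounds a byte count up to the next multiple of 16. The helper names `alignUp16` and `rowStride` are hypothetical and do not come from the PR; only the arithmetic is taken from the diff.

```cpp
#include <cstdint>

// Rounds Bytes up to the next multiple of 16 (unchanged if already aligned).
// Adding 15 carries any unaligned value past the next boundary, and
// masking with ~15 clears the low four bits.
constexpr uint32_t alignUp16(uint32_t Bytes) { return (Bytes + 15u) & ~15u; }

// Hypothetical helper: stride in bytes of one row-major row holding
// VectorSize elements of EltSize bytes each.
constexpr uint32_t rowStride(uint32_t VectorSize, uint32_t EltSize) {
  return alignUp16(VectorSize * EltSize);
}

static_assert(alignUp16(16) == 16, "aligned sizes are left unchanged");
static_assert(alignUp16(17) == 32, "unaligned sizes round up");
```

For example, ten 2-byte elements occupy 20 bytes, so `rowStride(10, 2)` evaluates to 32.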
This change adds preliminary pixel shader support by wrapping the existing test code in a function, which is called by both compute and pixel shaders. The same test patterns are used for both, mapping threads to input/bias vectors and output buffer offsets. For pixel shaders, an atomic counter is used to implement a poor man's mapping of pixel shader threads to a range of thread IDs.
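The atomic-counter mapping can be pictured with a CPU-side analogy in C++. This is only a sketch under assumptions: the names `claimThreadId` and `assignIds` are hypothetical, and the actual test does this in shader code, where invocations run concurrently and the atomic increment is what keeps the IDs unique. Here the invocations are simulated sequentially in a caller-supplied arrival order.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the shader-side counter: each caller receives
// the previous counter value, so IDs are unique and densely packed from 0.
inline uint32_t claimThreadId(std::atomic<uint32_t> &Counter) {
  return Counter.fetch_add(1, std::memory_order_relaxed);
}

// Simulates pixel invocations arriving in ArrivalOrder (a permutation of
// 0..N-1) and returns the thread ID each pixel index ended up with.
std::vector<uint32_t> assignIds(const std::vector<uint32_t> &ArrivalOrder) {
  std::atomic<uint32_t> Counter{0};
  std::vector<uint32_t> IdOfPixel(ArrivalOrder.size(), 0);
  for (uint32_t Pixel : ArrivalOrder)
    IdOfPixel[Pixel] = claimThreadId(Counter);
  return IdOfPixel;
}
```

The ID a pixel receives depends on arrival order rather than screen position, which is sufficient here because each ID only selects an input/bias vector and an output offset.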
Multi-layer support is also added, currently limited to two layers. For ease of implementation, the first layer in a multi-layer config always uses a square matrix.
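A minimal CPU sketch of a two-layer evaluation shows why a square first layer is convenient: its output has the same length as the input, so the second layer can consume it without any size bookkeeping. Everything here is an assumption for illustration (the `Matrix` alias, the `mulAdd` and `twoLayer` names, a bias per layer, and no activation between layers); it is not the PR's reference implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // row-major: Rows x Cols

// Computes W * X + Bias for one layer.
std::vector<float> mulAdd(const Matrix &W, const std::vector<float> &X,
                          const std::vector<float> &Bias) {
  assert(W.size() == Bias.size());
  std::vector<float> Out(W.size());
  for (size_t R = 0; R < W.size(); ++R) {
    assert(W[R].size() == X.size());
    float Acc = Bias[R];
    for (size_t C = 0; C < X.size(); ++C)
      Acc += W[R][C] * X[C];
    Out[R] = Acc;
  }
  return Out;
}

// Layer 1 is square (N x N), so its output length matches the input length
// and feeds directly into layer 2 (M x N).
std::vector<float> twoLayer(const Matrix &Square, const Matrix &W2,
                            const std::vector<float> &X,
                            const std::vector<float> &B1,
                            const std::vector<float> &B2) {
  return mulAdd(W2, mulAdd(Square, X, B1), B2);
}
```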
The input patterns are now slightly more interesting: inputs are generated randomly by a generator seeded with a constant value, so every run sees the same data. The range of values is limited to keep the accumulated difference between the CoopVec GPU implementation and the CPU reference implementation small.
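The constant-seed approach can be sketched as follows. The generator type (`std::mt19937`), the seed value, and the function name `makeTestInput` are assumptions chosen for illustration; only the masking expression comes from the diff in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical fill routine: a fixed seed makes the sequence reproducible
// from run to run, and the mask keeps every value within [-3, 4].
std::vector<float> makeTestInput(size_t Count, uint32_t Seed = 1234u) {
  std::mt19937 Rnd(Seed); // generator type and seed are assumptions
  std::vector<float> Values;
  Values.reserve(Count);
  for (size_t I = 0; I < Count; ++I)
    Values.push_back(static_cast<float>(Rnd() & 0x7) - 3.0f);
  return Values;
}
```

Because the seed is fixed, two calls with the same arguments return identical vectors, which keeps any GPU/CPU mismatch reproducible.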