16 changes: 9 additions & 7 deletions tools/clang/unittests/HLSLExec/LinAlgTests.cpp
@@ -73,11 +73,13 @@ struct MatrixParams {
bool Enable16Bit;
bool EmulateTest;

- size_t strideBytes() const {
+ size_t rowStride() const {
uint32_t ES = elementSize(CompType);
if (Layout == LinalgMatrixLayout::RowMajor)
return N * ES;
- return M * ES;
+ if (Layout == LinalgMatrixLayout::ColumnMajor)
+ return M * ES;
+ return 0;
}

size_t totalElements() const { return M * N; }
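
To make the effect of this hunk concrete, here is a minimal standalone sketch of the new stride logic. The `Layout` enum, the `Params` struct, and the explicit `ElemSize` field are simplified stand-ins assumed for illustration, not the definitions from LinAlgTests.cpp; only the branch structure mirrors the diff.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for LinalgMatrixLayout; the real enum may have
// different enumerators and values.
enum class Layout { RowMajor, ColumnMajor, OuterProductOptimal };

// Hypothetical, trimmed-down stand-in for MatrixParams.
struct Params {
  Layout L;
  std::size_t M;          // rows
  std::size_t N;          // columns
  std::uint32_t ElemSize; // bytes per element, assumed known up front

  // Mirrors the branch structure in the diff: a byte stride for the two
  // linear layouts, and 0 for any other layout.
  std::size_t rowStride() const {
    if (L == Layout::RowMajor)
      return N * ElemSize;
    if (L == Layout::ColumnMajor)
      return M * ElemSize;
    return 0;
  }
};
```

Under these assumptions, a 16x16 matrix of 2-byte elements gets a stride of 32 bytes in either linear layout, while a layout such as `OuterProductOptimal` gets 0, where the old code would have fallen through to `M * ES`.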
@@ -94,7 +96,7 @@ static std::string buildCompilerArgs(const MatrixParams &Params,
SS << " -DN_DIM=" << Params.N;
SS << " -DUSE=" << static_cast<int>(Params.Use);
SS << " -DSCOPE=" << static_cast<int>(Params.Scope);
- SS << " -DSTRIDE=" << Params.strideBytes();
+ SS << " -DSTRIDE=" << Params.rowStride();
The stride is a problem for group shared load and store. Per the spec, the stride for group shared memory is a count of elements rather than bytes, so it should be N or M for group shared.

These calls need to be fixed:
__builtin_LinAlg_MatrixLoadFromMemory(
Mat, GsData, OFFSET, STRIDE, LAYOUT);
__builtin_LinAlg_MatrixStoreToMemory(
Mat, GsData, OFFSET, STRIDE, LAYOUT);

Also, the test sets the group shared offset to 0, which is fine here, but I guess the offset for group shared is also a count of elements?

Collaborator Author
Working on a fix for the stride issue!

IIRC OFFSET is still a proper offset into the array. If your group shared array is larger than a single matrix, it may contain other data before or after the matrix data. Either way, we should clarify that in the spec. I'll make a note.
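
The distinction raised in this thread can be sketched as follows. This is a hedged illustration, assuming (per the review comment) that group shared loads and stores take a stride counted in elements while the existing `-DSTRIDE` value is in bytes. `strideInElements` and `elementIndex` are hypothetical helpers, not part of the test file or the spec, and treating the offset as an element count is exactly the open question from the thread, not a confirmed rule.

```cpp
#include <cstddef>

// Hypothetical helper: convert a byte stride into an element-count stride.
// Returns 0 when the element size is 0 to avoid dividing by zero.
constexpr std::size_t strideInElements(std::size_t StrideBytes,
                                       std::size_t ElemSize) {
  return ElemSize == 0 ? 0 : StrideBytes / ElemSize;
}

// Hypothetical helper: index of element (Row, Col) of a row-major matrix
// that starts OffsetElems elements into a larger array. The offset is
// ASSUMED here to be an element count; the spec may define it differently.
constexpr std::size_t elementIndex(std::size_t OffsetElems,
                                   std::size_t StrideElems, std::size_t Row,
                                   std::size_t Col) {
  return OffsetElems + Row * StrideElems + Col;
}
```

For a 16x16 row-major F16 matrix, the byte stride is 32 and the element stride is 16, which matches the "N or M" value the comment asks for. A non-zero offset simply shifts the matrix past whatever other data shares the array.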

SS << " -DLAYOUT=" << static_cast<int>(Params.Layout);
SS << " -DELEM_SIZE=" << static_cast<int>(elementSize(Params.CompType));
SS << " -DNUMTHREADS=" << Params.NumThreads;
@@ -620,7 +622,7 @@ void DxilConf_SM610_LinAlg::AccumulateDescriptor_Thread_16x16_F16() {
Params.N = 16;
Params.Use = MatrixUse::Accumulator;
Params.Scope = MatrixScope::Thread;
- Params.Layout = LinalgMatrixLayout::RowMajor;
+ Params.Layout = LinalgMatrixLayout::OuterProductOptimal;
Params.NumThreads = 1;
Params.Enable16Bit = true;
runAccumulateDescriptor(D3DDevice, DxcSupport, Params, 19, VerboseLogging);
@@ -1220,7 +1222,7 @@ void DxilConf_SM610_LinAlg::MatVecMul_Thread_16x16_F16() {
Params.M = 16;
Params.N = 16;
Params.Scope = MatrixScope::Thread;
- Params.Layout = LinalgMatrixLayout::RowMajor;
+ Params.Layout = LinalgMatrixLayout::OuterProductOptimal;
Params.NumThreads = 1;
Params.Enable16Bit = true;
runMatVecMul(D3DDevice, DxcSupport, Params, VerboseLogging,
@@ -1315,7 +1317,7 @@ void DxilConf_SM610_LinAlg::MatVecMulAdd_Thread_16x16_F16() {
Params.M = 16;
Params.N = 16;
Params.Scope = MatrixScope::Thread;
- Params.Layout = LinalgMatrixLayout::RowMajor;
+ Params.Layout = LinalgMatrixLayout::OuterProductOptimal;
Comment thread
V-FEXrt marked this conversation as resolved.
Outdated
Params.NumThreads = 1;
Params.Enable16Bit = true;
runMatVecMulAdd(D3DDevice, DxcSupport, Params, VerboseLogging,
@@ -1399,7 +1401,7 @@ void DxilConf_SM610_LinAlg::OuterProduct_Thread_16x16_F16() {
Params.M = 16;
Params.N = 16;
Params.Scope = MatrixScope::Thread;
- Params.Layout = LinalgMatrixLayout::RowMajor;
+ Params.Layout = LinalgMatrixLayout::OuterProductOptimal;
Params.NumThreads = 1;
Params.Enable16Bit = true;
runOuterProduct(D3DDevice, DxcSupport, Params, VerboseLogging);