Commit d8cfc9e

alsepkow and Copilot committed
Use half-precision ULP for min16float dot product tolerance
The dot product tolerance computation was using float32 ULPs for HLSLMin16Float_t, but the GPU may compute at float16 precision. With NUM=256 elements the accumulated error exceeds the float32-based epsilon.

Use HLSLHalf_t::GetULP to compute half-precision ULPs for min16float, matching the approach already used for HLSLHalf_t.

Co-authored-by: Copilot <[email protected]>
1 parent 351dda7 commit d8cfc9e

1 file changed

Lines changed: 6 additions & 1 deletion

tools/clang/unittests/HLSLExec/LongVectors.cpp

@@ -1359,7 +1359,12 @@ static double computeAbsoluteEpsilon(double A, double ULPTolerance) {
 
   if constexpr (std::is_same_v<T, HLSLHalf_t>)
     ULP = HLSLHalf_t::GetULP(A);
-  else
+  else if constexpr (std::is_same_v<T, HLSLMin16Float_t>) {
+    // Min precision floats may be computed at float16 on the GPU, so use
+    // half-precision ULP for tolerance. Reuse HLSLHalf_t::GetULP which
+    // computes ULP by incrementing the float16 bit representation.
+    ULP = HLSLHalf_t::GetULP(HLSLHalf_t(static_cast<float>(A)));
+  } else
     ULP =
         std::nextafter(static_cast<T>(A), std::numeric_limits<T>::infinity()) -
         static_cast<T>(A);
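
To illustrate why the choice of ULP matters here, the following standalone sketch compares the spacing of adjacent float16 values with the spacing of adjacent float32 values near the same number. It does not use the test suite's types: `halfUlp` and `floatUlp` are hypothetical helpers written for this example, and `halfUlp` derives the spacing arithmetically from the IEEE 754 binary16 layout (10 explicit mantissa bits, smallest normal 2^-14) instead of incrementing a bit pattern as `HLSLHalf_t::GetULP` is described as doing. It assumes finite inputs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of the HLSLExec tests): spacing between
// adjacent IEEE 754 binary16 values in the binade containing X.
double halfUlp(double X) {
  assert(std::isfinite(X) && "sketch only handles finite inputs");
  const double AbsX = std::fabs(X);
  const double MinNormal = std::ldexp(1.0, -14); // smallest normal float16
  if (AbsX < MinNormal)
    return std::ldexp(1.0, -24); // uniform spacing of zero and subnormals
  // 10 explicit mantissa bits: ULP = 2^(exponent - 10).
  return std::ldexp(1.0, std::ilogb(AbsX) - 10);
}

// Hypothetical helper: float32 ULP computed the same way as the generic
// branch of the diff, via std::nextafter toward +infinity.
double floatUlp(double X) {
  const float F = static_cast<float>(X);
  const float Next = std::nextafter(F, std::numeric_limits<float>::infinity());
  return static_cast<double>(Next - F);
}
```

For positive normal values the two results differ by a factor of 2^13 = 8192 (23 versus 10 explicit mantissa bits), so a tolerance expressed as a fixed number of float32 ULPs is far tighter than the same number of float16 ULPs. For example, `halfUlp(256.0)` is 0.25 while `floatUlp(256.0)` is 2^-15.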
