GVN and SROA miscompile min precision vector element access #8268

@alsepkow

Description

Multiple optimization passes mishandle min precision vector types because of DXC's padded data layout (i16:32, f16:32): getTypeSizeInBits returns padded sizes for vectors (an HLSL-specific change) but primitive sizes for scalars. This causes three related bugs affecting element access (the [] operator) on min16float, min16int, and min16uint vectors.

Bug 1: GVN ICE (Internal Compiler Error)

CanCoerceMustAliasedValueToLoad computes an integer type using the padded size (e.g., 96 bits for <3 x half> instead of 48). CoerceAvailableValueToLoadType then attempts a bitcast from the 48-bit LLVM type to i96, which triggers an LLVM assert.
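
The width mismatch behind this assert can be reproduced with plain arithmetic. The following is a minimal standalone C++ sketch, not DXC or LLVM code: ScalarModel, vectorPrimitiveBits, vectorPaddedBits, bitcastWidthsMatch, and checkBug1Widths are hypothetical names, and the 16-bit and 32-bit figures are the ones quoted in this issue for half under f16:32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a scalar type under DXC's padded layout.
struct ScalarModel {
  uint32_t primitiveBits; // width of the LLVM type itself, e.g. 16 for half
  uint32_t allocBits;     // storage stride under the layout, e.g. 32 for f16:32
};

// Bits the LLVM vector value actually carries: N * primitive width.
constexpr uint32_t vectorPrimitiveBits(ScalarModel s, uint32_t n) {
  return n * s.primitiveBits;
}

// Size this issue reports for vectors from getTypeSizeInBits: N * alloc width.
constexpr uint32_t vectorPaddedBits(ScalarModel s, uint32_t n) {
  return n * s.allocBits;
}

// A bitcast is only valid between types of equal bit width.
constexpr bool bitcastWidthsMatch(uint32_t fromBits, uint32_t toBits) {
  return fromBits == toBits;
}

// Walks through the <3 x half> case described above.
inline void checkBug1Widths() {
  const ScalarModel half16{16, 32};
  const uint32_t valueBits = vectorPrimitiveBits(half16, 3); // 48
  const uint32_t coerceBits = vectorPaddedBits(half16, 3);   // 96
  assert(valueBits == 48);
  assert(coerceBits == 96);
  // Casting the 48-bit value to an integer of coerceBits (i96) is invalid.
  assert(!bitcastWidthsMatch(valueBits, coerceBits));
}
```

Under these assumptions the integer type chosen for coercion is twice as wide as the value being coerced, which is the condition the bitcast rejects.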

Bug 2: GVN Incorrect Store-to-Load Forwarding (Silent Miscompile)

GVN's processLoad forwards a store <3 x i16> zeroinitializer directly to a later load <3 x i16>, ignoring the intervening i16 stores to individual vector elements. This happens because MemoryDependenceAnalysis uses padded type sizes to determine aliasing.

Bug 3: SROA Element Misindexing (Silent Miscompile)

This is the root cause of the test failures. SROA's getNaturalGEPRecursively uses getTypeSizeInBits (primitive size: 2 bytes for i16) for vector element offset calculations, while GEP offset computation uses getTypeAllocSize (padded size: 4 bytes with i16:32). Because of this mismatch, byte offset 4 (element 1) is mapped to vector index 4/2 = 2 instead of 4/4 = 1, so SROA misplaces or eliminates stores to vector elements.

Result: Only element [0] is correct; elements [1] and [2] are zeroed.
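
The index arithmetic above can be checked in isolation. This is a minimal standalone C++ sketch, not SROA code: gepByteOffset, indexFromOffset, and checkIndexRecovery are hypothetical helpers, and the 2-byte and 4-byte sizes are the i16 figures given in this issue.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of vector element `index` when the stride is the alloc size
// (4 bytes for i16 under i16:32, per this issue).
constexpr uint64_t gepByteOffset(uint64_t index, uint64_t allocBytes) {
  return index * allocBytes;
}

// Recover a vector index from a byte offset by dividing by an element size.
constexpr uint64_t indexFromOffset(uint64_t byteOffset, uint64_t elemBytes) {
  return byteOffset / elemBytes;
}

// Compares consistent and mismatched element sizes for a 3-element vector.
inline void checkIndexRecovery() {
  const uint64_t primitiveBytes = 2; // i16 primitive size
  const uint64_t allocBytes = 4;     // i16 alloc size with i16:32
  for (uint64_t i = 0; i < 3; ++i) {
    const uint64_t off = gepByteOffset(i, allocBytes);
    // Dividing by the same stride that produced the offset round-trips.
    assert(indexFromOffset(off, allocBytes) == i);
    // Dividing by the primitive size doubles the index: 0, 2, 4.
    assert(indexFromOffset(off, primitiveBytes) == 2 * i);
  }
}
```

With mismatched sizes only index 0 maps to itself; element 1 lands on index 2, and element 2 lands on index 4, which is outside a 3-element vector.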

Repro

RWByteAddressBuffer g_In : register(u0);
RWByteAddressBuffer g_Out : register(u1);

[numthreads(1,1,1)]
void main() {
  vector<int, 3> raw = g_In.Load< vector<int, 3> >(0);
  vector<min16int, 3> v = (vector<min16int, 3>)raw;
  vector<min16int, 3> out_v = (min16int)0;
  out_v[0] = v[0];
  out_v[2] = v[2];
  out_v[1] = v[1];
  g_Out.Store< vector<int, 3> >(0, (vector<int, 3>)out_v);
}

Compile with: dxc -T cs_6_9 repro.hlsl

  • -O0 / -Od: correct results
  • -O1 (default): Bug 1 (ICE) or Bug 3 (wrong results)

Also reproduces with min16float and min16uint.

Root Cause

DXC's data layout pads min precision types: i16:32 and f16:32. The HLSL change in DataLayout::getTypeSizeInBits (lines 540-543) makes vector sizes use getTypeAllocSizeInBits per element, so getTypeSizeInBits(<3 x i16>) = 96 (3 x 32). For scalars, however, getTypeSizeInBits(i16) = 16, which is the primitive width.

This inconsistency propagates through:

  • GVN: Uses padded vector sizes for bitcast width calculations and alias reasoning
  • SROA: Uses primitive scalar sizes for vector element offsets but padded alloc sizes for GEP offsets — causing index mismatches

Fix

Three guards in lib/Transforms/Scalar/GVN.cpp and lib/Transforms/Scalar/SROA.cpp:

  1. GVN CanCoerceMustAliasedValueToLoad: Reject coercion when type sizes include padding
  2. GVN processLoad: Skip store-to-load forwarding for padded types
  3. SROA: Use getTypeAllocSizeInBits for vector element sizes in getNaturalGEPRecursively, isVectorPromotionViable, and AllocaSliceRewriter, matching GEP offset calculations
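
As a rough illustration of the shape of guards 1 and 2, the sketch below rejects coercion whenever a type's layout size differs from its primitive size. It is a standalone C++ model under stated assumptions, not the code in the fix branch: TypeSizes, hasLayoutPadding, mayCoerce, and checkGuards are hypothetical names, and the size comparison in mayCoerce is a simplification of what GVN actually checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of one type's sizes under the data layout.
struct TypeSizes {
  uint64_t primitiveBits; // bits carried by the LLVM type itself
  uint64_t layoutBits;    // bits reported once layout padding is applied
};

// True when the layout size includes padding beyond the primitive width.
constexpr bool hasLayoutPadding(TypeSizes t) {
  return t.layoutBits != t.primitiveBits;
}

// Simplified coercion check: refuse padded types outright, otherwise require
// the stored value to be at least as wide as the loaded one.
constexpr bool mayCoerce(TypeSizes stored, TypeSizes loaded) {
  return !hasLayoutPadding(stored) && !hasLayoutPadding(loaded) &&
         stored.layoutBits >= loaded.layoutBits;
}

inline void checkGuards() {
  const TypeSizes minPrecVec{48, 96}; // <3 x i16> with i16:32, per this issue
  const TypeSizes int32Vec{96, 96};   // <3 x i32>, assumed to have no padding
  assert(!mayCoerce(minPrecVec, minPrecVec)); // padded: forwarding skipped
  assert(mayCoerce(int32Vec, int32Vec));      // unpadded: unchanged behavior
}
```

The design choice modeled here is conservative: padded types lose the optimization entirely instead of having the width computation corrected, which keeps unpadded types on the existing path.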

Fix branch: https://github.com/alsepkow/DirectXShaderCompiler/tree/user/alsepkow/fix-min-precision-opt-bugs
Squashed commit: alsepkow@b34136b9a

Environment

  • DXC version: 1.9.0 (main branch, SM 6.9)
  • Affects: all min precision types (min16float, min16int, min16uint) with vector element access
  • Does NOT affect native 16-bit types (half with -enable-16bit-types)
