[Experiment] intrinsify span.slice #128298

Closed
EgorBo wants to merge 1 commit into dotnet:main from EgorBo:experiment-intrinsify-span-slice

EgorBo commented May 17, 2026

void Foo(Span<int> src)
{
    int i = 0;
    for (; i <= src.Length - Vector128<int>.Count; i += Vector128<int>.Count)
    {
        Vector128.Create(42).CopyTo(src.Slice(i));
    }
}
 ; Assembly listing for method Program:Foo(System.Span`1[int]) (FullOpts)
-; 0 inlinees with PGO data; 2 single block inlinees; 2 inlinees without PGO data
+; 0 inlinees with PGO data; 1 single block inlinees; 1 inlinees without PGO data
 ;
 ; Lcl frame size = 40
 
 G_M28304_IG01:
        sub      rsp, 40
 G_M28304_IG02:
        mov      rax, bword ptr [rcx]
        mov      ecx, dword ptr [rcx+0x08]
        xor      edx, edx
        lea      r8d, [rcx-0x04]
        test     r8d, r8d
        jl       SHORT G_M28304_IG04
        align    [0 bytes for IG03]
 G_M28304_IG03:
-       cmp      edx, ecx
-       ja       SHORT G_M28304_IG05
-       mov      r10d, edx
-       lea      r10, bword ptr [rax+4*r10]
-       mov      r9d, ecx
-       sub      r9d, edx
-       cmp      r9d, 4
-       jl       SHORT G_M28304_IG06
+       mov      r10d, ecx
+       sub      r10d, edx
+       lea      r9, bword ptr [rax+4*rdx]
+       cmp      r10d, 4
+       jl       SHORT G_M28304_IG05
        vbroadcastss xmm0, dword ptr [reloc @RWD00]
-       vmovups  xmmword ptr [r10], xmm0
+       vmovups  xmmword ptr [r9], xmm0
        add      edx, 4
        cmp      edx, r8d
        jle      SHORT G_M28304_IG03
 G_M28304_IG04:
        add      rsp, 40
        ret
 G_M28304_IG05:
-       call     [System.ThrowHelper:ThrowArgumentOutOfRangeException()]
-       int3
-G_M28304_IG06:
        call     [System.ThrowHelper:ThrowArgumentException_DestinationTooShort()]
        int3
 RWD00   dd      0000002Ah               ; 5.88545e-44
 
-; Total bytes of code 85, prolog size 4, PerfScore 40.50, instruction count 27
+; Total bytes of code 71, prolog size 4, PerfScore 34.50, instruction count 22
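
To make the diff easier to read, here is a minimal C# sketch of the two checks the loop body performs before each store. `SliceSketch`, `SliceFrom`, and `FillBlocks` are hypothetical names, and `SliceFrom` only approximates what the managed `Slice(int)` is assumed to do; it is not the CoreLib source. The first check corresponds to the removed `cmp edx, ecx` / `ja` pair, the second to the `cmp ..., 4` / `jl` pair that remains.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class SliceSketch
{
    // Hypothetical stand-in for Span<T>.Slice(int): one unsigned compare
    // rejects negative starts and starts past the end, then the result is
    // "reference + start, length - start".
    public static Span<int> SliceFrom(Span<int> span, int start)
    {
        if (unchecked((uint)start > (uint)span.Length))
            throw new ArgumentOutOfRangeException(nameof(start));

        return MemoryMarshal.CreateSpan(
            ref Unsafe.Add(ref MemoryMarshal.GetReference(span), start),
            span.Length - start);
    }

    // Scalar model of the loop in the description; width plays the role of
    // Vector128<int>.Count. Returns the number of blocks written.
    public static int FillBlocks(Span<int> dst, int value, int width)
    {
        if (width <= 0)
            throw new ArgumentOutOfRangeException(nameof(width));

        int blocks = 0;
        for (int i = 0; i <= dst.Length - width; i += width)
        {
            // Check 1 (inside SliceFrom): 0 <= i <= dst.Length.
            // The loop condition already implies it.
            Span<int> tail = SliceFrom(dst, i);

            // Check 2 (the CopyTo-style check): enough room for one block.
            if (tail.Length < width)
                throw new ArgumentException("Destination too short.");

            tail.Slice(0, width).Fill(value);
            blocks++;
        }
        return blocks;
    }
}
```

With a 10-element buffer and a width of 4, the loop visits `i = 0` and `i = 4` and leaves the last two elements untouched; neither check can fail on any iteration, which is the property the JIT would need to prove in order to drop them.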

Copilot AI review requested due to automatic review settings May 17, 2026 12:32
EgorBo commented May 17, 2026

@MihuBot

github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) May 17, 2026
@dotnet-policy-service

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Pull request overview

Experimental change that intrinsifies the single-argument Span<T>.Slice(int) / ReadOnlySpan<T>.Slice(int) in the JIT. Instead of inlining the managed body, the importer now emits an explicit GT_BOUNDS_CHECK(start, length + 1) plus direct field writes for the resulting span, which produces tighter codegen. Combined with the new VN/range-check normalization, this lets range-check elimination see through Slice inside hand-vectorized loops: the motivating loop in the description loses one bounds check and one branch.
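
The `length + 1` bound may look odd at first, so here is a minimal C# sketch of why it matches the managed check. It assumes a bounds-check node fails when `(uint)index >= (uint)bound`; the class and method names are hypothetical and only model the arithmetic, not the JIT's actual data structures.

```csharp
using System;

public static class BoundsCheckSketch
{
    // Condition under which the managed Slice(int) is assumed to throw:
    // start is negative or greater than length.
    public static bool ManagedSliceThrows(int start, int length)
        => unchecked((uint)start > (uint)length);

    // Assumed semantics of a bounds-check node: fail when the index,
    // viewed as unsigned, is not strictly below the bound.
    public static bool BoundsCheckThrows(int index, int bound)
        => unchecked((uint)index >= (uint)bound);

    // The shape described above: BOUNDS_CHECK(start, length + 1).
    // For any length >= 0, (uint)(length + 1) does not wrap to zero,
    // so "start >= length + 1" is the same test as "start > length".
    public static bool ExpandedSliceThrows(int start, int length)
        => BoundsCheckThrows(start, unchecked(length + 1));
}
```

Because a span's length is never negative, the two predicates agree for every `start`, including `start == length` (allowed, yields an empty span) and `length == int.MaxValue` (where `length + 1` wraps as a signed value but is still 2^31 when viewed as unsigned).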

Changes:

  • Add NI_System_Span_Slice / NI_System_ReadOnlySpan_Slice intrinsics, recognized only for the 1-arg overload, and expand them in impIntrinsic into a bounds check + byref offset + new length, writing into a fresh span temp.
  • Adjust VN (fgValueNumberTree) and RangeCheck::MergeEdgeAssertionsWorker to treat a length + 1 bound as equivalent to length for assertion/range-check purposes (the form produced by the new Slice expansion).
  • Tag Span.Slice(int) / ReadOnlySpan.Slice(int) with [Intrinsic], exclude them from CALLEE_INTRINSIC inline scoring in fgFindJumpTargets, and remove the Debug.Assert(length >= 0) from the internal span constructors.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Private.CoreLib/src/System/Span.cs | Marks Slice(int) [Intrinsic]; removes Debug.Assert(length >= 0) from internal ctor (unexplained). |
| src/libraries/System.Private.CoreLib/src/System/ReadOnlySpan.cs | Same as above for ReadOnlySpan<T>. |
| src/coreclr/jit/namedintrinsiclist.h | New NI_System_Span_Slice / NI_System_ReadOnlySpan_Slice enum entries. |
| src/coreclr/jit/importercalls.cpp | Recognizes only the 1-arg Slice overload and expands it into bounds-check + byref-offset + length-write into a span temp. |
| src/coreclr/jit/fgbasic.cpp | Excludes the new Slice intrinsics from CALLEE_INTRINSIC inline observation so caller heuristics are unchanged. |
| src/coreclr/jit/valuenum.cpp | Registers the inner X as a checked bound when a bound is of the form X + 1, enabling RCE on the Slice-emitted shape. |
| src/coreclr/jit/rangecheck.cpp | Normalizes assertion limits whose bound VN differs from preferredBoundVN by +1 so TightenLimit prefers the tighter asserted limit. |

@@ -130,8 +130,6 @@ public Span(ref T reference)
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal Span(ref T reference, int length)
{

break;
}

case NI_System_Span_Slice:
Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
