Skip to content

JIT: Add support for async variant delegate GDV#128273

Draft
jakobbotsch wants to merge 5 commits into
dotnet:mainfrom
jakobbotsch:async-delegate-gdv
Draft

JIT: Add support for async variant delegate GDV#128273
jakobbotsch wants to merge 5 commits into
dotnet:mainfrom
jakobbotsch:async-delegate-gdv

Conversation

@jakobbotsch
Copy link
Copy Markdown
Member

@jakobbotsch jakobbotsch commented May 15, 2026

Switch to the async variant if user implemented when doing delegate GDV. This avoids calling through the task-returning thunk, and further allows inlining the async variant (under the current restrictions).

class Program
{
    public static async Task Main()
    {
        Func<int, Task<int>> del = async x => x * x;

        for (int i = 0; i < 10; i++)
        {
            for (int j = 0; j < 100; j++)
            {
                await Sum(100, del);
            }

            await Task.Delay(100);
        }

        Stopwatch timer = Stopwatch.StartNew();
        await Sum(100_000_000, del);
        Console.WriteLine("Took {0} ms", timer.ElapsedMilliseconds);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private static async Task<int> Sum(int n, Func<int, Task<int>> func)
    {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += await func(i);
        return sum;
    }
}

Before: Took 1334 ms
After: Took 122 ms

There are some unnecessary context saves/restores in the inner loop, but at least we do GDV, hoisting and inlining now.

Fix #128260

Switch to the async variant if user implemented when doing delegate GDV.
The async variant can then further be inlined.
Copilot AI review requested due to automatic review settings May 15, 2026 23:17
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 15, 2026
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends CoreCLR JIT indirect-call transformations so that guarded devirtualization (GDV) of delegate Invoke within async methods can opportunistically switch a guessed target to its user-implemented async variant (when available), enabling additional hoisting/inlining and avoiding the thunk path. It also introduces a small CallArgs helper to standardize insertion of the async-continuation well-known argument.

Changes:

  • Enhance GuardedDevirtualizationTransformer to detect AsyncHelpers.Await consumption of a GDV return value, split the await region, and (for applicable guesses) switch to the async “other variant” method handle.
  • Add CallArgs::InsertAsyncContinuation and use it from the importer path for ldvirtftn async calls (and from the new GDV async-variant path).
  • Refactor/rename some transformer state to m_ members and adjust block weight handling when an await remainder block is introduced.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
src/coreclr/jit/indirectcalltransformer.cpp Adds async-variant selection for delegate GDV in async methods, plus await splitting and related refactoring.
src/coreclr/jit/importercalls.cpp Switches async-continuation insertion for ldvirtftn async calls to a shared CallArgs helper.
src/coreclr/jit/gentree.h Declares the new CallArgs::InsertAsyncContinuation helper.
src/coreclr/jit/gentree.cpp Implements CallArgs::InsertAsyncContinuation insertion policy for the async continuation arg.
Comments suppressed due to low confidence (1)

src/coreclr/jit/indirectcalltransformer.cpp:873

  • CreateOrGetLocalForRetExpr writes the unused-return result to *isUnused, but the subsequent conditional checks m_returnValueUnused instead of *isUnused. This breaks callers that pass a different output flag (e.g. m_awaitReturnValueUnused in SplitAwait) and can cause the ret-expr to be linked/handled incorrectly (temp allocated vs NOP substitution) depending on stale member state.
            unsigned resultLcl = BAD_VAR_NUM;
            if (noReturnValue)
            {
                JITDUMP("Linking GT_RET_EXPR [%06u] for VOID return to NOP\n",
                        compiler->dspTreeID(inlineInfo->retExpr));
                inlineInfo->retExpr->gtSubstExpr = compiler->gtNewNothingNode();
            }
            else if (m_returnValueUnused)
            {
                JITDUMP("Linking GT_RET_EXPR [%06u] for UNUSED return to NOP\n",
                        compiler->dspTreeID(inlineInfo->retExpr));
                inlineInfo->retExpr->gtSubstExpr = compiler->gtNewNothingNode();
            }

Comment on lines +309 to 320
Compiler* compiler = nullptr;
BasicBlock* currBlock = nullptr;
BasicBlock* m_remainderBlock = nullptr;
BasicBlock* m_checkBlock = nullptr;
BasicBlock* m_thenBlock = nullptr;
BasicBlock* m_elseBlock = nullptr;
Statement* stmt = nullptr;
GenTreeCall* m_origCall = nullptr;
unsigned m_likelihood = HIGH_PROBABILITY;

const int HIGH_PROBABILITY = 80;
};
Comment on lines +384 to +391
m_checkBlock = CreateAndInsertBasicBlock(BBJ_ALWAYS, currBlock, currBlock);
GenTree* fatPointerMask = new (compiler, GT_CNS_INT) GenTreeIntCon(TYP_I_IMPL, FAT_POINTER_MASK);
GenTree* m_fptrAddressCopy = compiler->gtCloneExpr(m_fptrAddress);
GenTree* fatPointerAnd = compiler->gtNewOperNode(GT_AND, TYP_I_IMPL, m_fptrAddressCopy, fatPointerMask);
GenTree* zero = new (compiler, GT_CNS_INT) GenTreeIntCon(TYP_I_IMPL, 0);
GenTree* fatPointerCmp = compiler->gtNewOperNode(GT_NE, TYP_INT, fatPointerAnd, zero);
GenTree* jmpTree = compiler->gtNewOperNode(GT_JTRUE, TYP_VOID, fatPointerCmp);
Statement* jmpStmt = compiler->fgNewStmtFromTree(jmpTree, stmt->GetDebugInfo());
Comment on lines +430 to +432
GenTree* m_fptrAddressCopy = compiler->gtCloneExpr(m_fptrAddress);
GenTree* fatPointerMask = new (compiler, GT_CNS_INT) GenTreeIntCon(TYP_I_IMPL, FAT_POINTER_MASK);
return compiler->gtNewOperNode(GT_SUB, m_pointerType, m_fptrAddressCopy, fatPointerMask);
Comment on lines 845 to 852
if (!noReturnValue)
{
Statement* const nextStmt = stmt->GetNextStmt();
if (nextStmt != nullptr)
{
Compiler::FindLinkData fld = compiler->gtFindLink(nextStmt, retExprNode);
GenTree* const parent = fld.parent;

//
// So to figure out the proper divisor, we start with 1.0 and subtract off each
// preceeding test's likelihood of success.
// preceeding test's m_likelihood of success.
Comment on lines +1797 to +1804
CallArg* CallArgs::InsertAsyncContinuation(Compiler* comp, GenTree* node)
{
NewCallArg newArg = NewCallArg::Primitive(node).WellKnown(WellKnownArg::AsyncContinuation);
assert(FindWellKnownArg(WellKnownArg::InstParam) == nullptr);

if (Target::g_tgtArgOrder == Target::ARG_ORDER_R2L)
{
CallArg* retBufferArg = GetRetBufferArg();
Copilot AI review requested due to automatic review settings May 16, 2026 12:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment on lines +382 to +389
m_checkBlock = CreateAndInsertBasicBlock(BBJ_ALWAYS, currBlock, currBlock);
GenTree* fatPointerMask = new (compiler, GT_CNS_INT) GenTreeIntCon(TYP_I_IMPL, FAT_POINTER_MASK);
GenTree* m_fptrAddressCopy = compiler->gtCloneExpr(m_fptrAddress);
GenTree* fatPointerAnd = compiler->gtNewOperNode(GT_AND, TYP_I_IMPL, m_fptrAddressCopy, fatPointerMask);
GenTree* zero = new (compiler, GT_CNS_INT) GenTreeIntCon(TYP_I_IMPL, 0);
GenTree* fatPointerCmp = compiler->gtNewOperNode(GT_NE, TYP_INT, fatPointerAnd, zero);
GenTree* jmpTree = compiler->gtNewOperNode(GT_JTRUE, TYP_VOID, fatPointerCmp);
Statement* jmpStmt = compiler->fgNewStmtFromTree(jmpTree, stmt->GetDebugInfo());
Comment on lines +1117 to +1121
// No configuration is currently matched for the awaiter.
callInfo->ContinuationContextHandling = ContinuationContextHandling::ContinueOnCapturedContext;
call->SetIsAsync(callInfo);
call->gtArgs.InsertAsyncContinuation(compiler, compiler->gtNewNull());

//
// So to figure out the proper divisor, we start with 1.0 and subtract off each
// preceeding test's likelihood of success.
// preceeding test's m_likelihood of success.

// When the current candidate has sufficiently high likelihood, scan
// When the current candidate has sufficiently high m_likelihood, scan
// the remainer block looking for another GDV candidate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI runtime-async

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JIT: We should support switching delegate GDVs to async variants

3 participants