⚡ Bolt: [performance improvement] optimize path allocations in incremental invalidation #218

Open
bashandbone wants to merge 1 commit into main from bolt-optimize-pathbuf-allocs-968163309968730354

Conversation

@bashandbone bashandbone commented May 9, 2026

💡 What: Removed &v.to_path_buf() calls in crates/flow/src/incremental/invalidation.rs map queries (get_mut and get), using borrowed v instead. Cleaned up explicit lifetime lint warnings in check_var.rs along the way.
🎯 Why: &v.to_path_buf() allocates a new owned PathBuf on the heap just to serve as a lookup key. Doing this inside the inner loop of Tarjan's strongly connected components (SCC) algorithm, which runs once per graph edge, caused O(E) short-lived allocations per SCC computation. Because PathBuf implements Borrow<Path>, RapidMap can be queried with the borrowed &Path directly, with no allocation.
📊 Impact: Faster invalidation propagation with a lower peak memory footprint on graphs with large, densely connected components.
🔬 Measurement: Verified via memory correctness traces (test_get.rs checks) and cargo test -p thread-flow --test invalidation_tests. All SCC, topological, and tree validation checks passed successfully.
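
The difference between the two lookup styles can be sketched as follows. This is a minimal illustration, not the code from invalidation.rs: it assumes std::collections::HashMap as a stand-in for RapidMap, and the function names (lookup_allocating, lookup_borrowed, lower_lowlink) are hypothetical.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Before: builds a temporary PathBuf (one heap allocation) only to
/// borrow it again as the lookup key.
fn lookup_allocating(map: &HashMap<PathBuf, usize>, v: &Path) -> Option<usize> {
    map.get(&v.to_path_buf()).copied()
}

/// After: HashMap::get accepts any &Q where K: Borrow<Q>. Since
/// PathBuf: Borrow<Path>, the borrowed &Path is a valid key as-is.
fn lookup_borrowed(map: &HashMap<PathBuf, usize>, v: &Path) -> Option<usize> {
    map.get(v).copied()
}

/// The same bound applies to get_mut, so an in-place update also
/// needs no temporary PathBuf.
fn lower_lowlink(map: &mut HashMap<PathBuf, usize>, v: &Path, candidate: usize) {
    if let Some(low) = map.get_mut(v) {
        *low = (*low).min(candidate);
    }
}
```

Both lookups return the same value because PathBuf and Path hash and compare identically; only the allocation differs.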


PR created automatically by Jules for task 968163309968730354 started by @bashandbone

Summary by Sourcery

Optimize incremental invalidation graph traversal to reduce path allocation overhead and clean up lifetime usage in variable checking utilities.

Enhancements:

  • Reuse borrowed Path references in Tarjan SCC indices/lowlink maps to avoid repeated PathBuf allocations during invalidation traversal.
  • Simplify function signatures in variable constraint/transform checks by removing unnecessary explicit lifetimes and using reference types directly.

…ath` instead of allocating `PathBuf` for map lookups

Co-authored-by: bashandbone <[email protected]>
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings May 9, 2026 17:48

sourcery-ai Bot commented May 9, 2026

Reviewer's Guide

Optimizes incremental invalidation SCC traversal by using borrowed path references instead of heap-allocating PathBufs on every map access, and cleans up unnecessary explicit lifetimes in rule-engine variable checking helpers.

Sequence diagram for optimized SCC traversal map lookups in incremental invalidation

sequenceDiagram
    participant InvalidationDetector
    participant TarjanState
    participant LowlinksMap as RapidMap_lowlinks
    participant IndicesMap as RapidMap_indices
    participant OnStackSet as RapidSet_on_stack
    participant V as Path_v
    participant Dep as Path_dep

    InvalidationDetector->>TarjanState: tarjan_visit(V)
    loop for_each_dependency_of_v
        InvalidationDetector->>TarjanState: next dependency -> Dep
        alt dep_not_visited
            InvalidationDetector->>TarjanState: recurse tarjan_visit(Dep)
            TarjanState->>LowlinksMap: get(Dep)
            LowlinksMap-->>TarjanState: w_lowlink
            Note over TarjanState,LowlinksMap: Before: get_mut(&v.to_path_buf())
            TarjanState->>LowlinksMap: get_mut(V)
            LowlinksMap-->>TarjanState: v_lowlink_ref
            TarjanState->>TarjanState: *v_lowlink_ref = min(*v_lowlink_ref, w_lowlink)
        else dep_on_stack
            TarjanState->>OnStackSet: contains(Dep)
            OnStackSet-->>TarjanState: true
            TarjanState->>IndicesMap: get(Dep)
            IndicesMap-->>TarjanState: w_index
            Note over TarjanState,LowlinksMap: Before: get_mut(&v.to_path_buf())
            TarjanState->>LowlinksMap: get_mut(V)
            LowlinksMap-->>TarjanState: v_lowlink_ref
            TarjanState->>TarjanState: *v_lowlink_ref = min(*v_lowlink_ref, w_index)
        end
    end

    Note over TarjanState,IndicesMap: Before: get(&v.to_path_buf())
    TarjanState->>IndicesMap: get(V)
    IndicesMap-->>TarjanState: v_index

    Note over TarjanState,LowlinksMap: Before: get(&v.to_path_buf())
    TarjanState->>LowlinksMap: get(V)
    LowlinksMap-->>TarjanState: v_lowlink

    TarjanState->>TarjanState: if v_lowlink == v_index then form_SCC()
    TarjanState-->>InvalidationDetector: SCC for V

Class diagram for updated invalidation and rule-engine helper APIs

classDiagram
    class InvalidationDetector {
        +tarjan_visit(v: &Path, state: &mut TarjanState)
    }

    class TarjanState {
        +indices: RapidMap~Path, usize~
        +lowlinks: RapidMap~Path, usize~
        +on_stack: RapidSet~Path~
        +stack: Vec~Path~
    }

    InvalidationDetector --> TarjanState: uses

    class CheckVarModule {
        +get_vars_from_rules(rule: &Rule, utils: &RuleRegistration) RapidSet~String~
        +check_var_in_constraints(vars: RapidSet~String~, constraints: &RapidMap~MetaVariableID, Rule~) RResult~RapidSet~String~~
        +check_var_in_transform(vars: RapidSet~String~, transform: &Option~Transform~) RResult~RapidSet~String~~
    }

    class Rule {
        +defined_vars() Iterator~String~
        +meta_vars() Iterator~String~
    }

    class RuleRegistration {
        +constraints: RapidMap~MetaVariableID, Rule~
        +transform: Option~Transform~
    }

    class RapidMap~K, V~ {
    }

    class RapidSet~T~ {
    }

    class Transform {
    }

    class MetaVariableID {
    }

    class RResult~T~ {
    }

    CheckVarModule --> Rule: reads
    CheckVarModule --> RuleRegistration: reads
    CheckVarModule --> RapidMap: uses
    CheckVarModule --> RapidSet: uses
    CheckVarModule --> Transform: uses
    CheckVarModule --> MetaVariableID: uses
    CheckVarModule --> RResult: returns

File-Level Changes

Change: Avoid heap allocations during SCC traversal in incremental invalidation by using borrowed Path keys in RapidMap lookups.
Details:
  • Replace lowlink map lookups using &v.to_path_buf() with lookups directly on borrowed &Path values in Tarjan SCC recursion.
  • Replace index map lookups using &v.to_path_buf() with lookups directly on borrowed &Path values.
  • Add explanatory comments documenting that the change avoids O(E) heap allocations inside the SCC traversal loop.
Files: crates/flow/src/incremental/invalidation.rs

Change: Simplify function signatures by removing unnecessary explicit lifetimes in rule-engine variable/constraint helpers.
Details:
  • Remove explicit lifetime parameter from check_var_in_constraints and accept &RapidMap instead of a lifetime-annotated reference.
  • Remove explicit lifetime parameter from check_var_in_transform and accept &Option instead of a lifetime-annotated reference.
  • Rely on elided lifetimes for function signatures to address explicit lifetime lint warnings.
Files: crates/rule-engine/src/check_var.rs
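
The lifetime cleanup can be illustrated with a small sketch. The Rule struct, its defined field, and both function names below are hypothetical stand-ins, and std collections replace RapidMap/RapidSet; the real signatures in check_var.rs may differ. The assumption is that the named lifetime appeared only on an input reference while the return type was owned, which is the pattern a lint such as Clippy's needless_lifetimes reports.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for the rule type; not the real rule-engine struct.
struct Rule {
    defined: Vec<String>,
}

// Before: 'a is named explicitly but ties nothing together, because the
// returned set owns its Strings and borrows nothing from `constraints`.
fn check_explicit<'a>(
    mut vars: HashSet<String>,
    constraints: &'a HashMap<String, Rule>,
) -> HashSet<String> {
    for rule in constraints.values() {
        vars.extend(rule.defined.iter().cloned());
    }
    vars
}

// After: the compiler elides the lifetime; the signature means the same
// thing and behavior is unchanged.
fn check_elided(
    mut vars: HashSet<String>,
    constraints: &HashMap<String, Rule>,
) -> HashSet<String> {
    for rule in constraints.values() {
        vars.extend(rule.defined.iter().cloned());
    }
    vars
}
```

An explicit lifetime is still required when the return type borrows from one of several reference parameters; elision only removes annotations the compiler can infer unambiguously.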

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • In the Tarjan loop you repeatedly call state.lowlinks.get_mut(v) in both branches; consider factoring this lookup so v_lowlink is obtained once per iteration (or once per v) to reduce RapidMap lookups in this hot path.
  • The new // Bolt: ... comments are quite tool-specific; consider rephrasing them in neutral terms (e.g., just explaining the allocation avoidance) to keep the codebase free of vendor- or tool-branded annotations.
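
One possible shape for the first suggestion is sketched below. It is an assumption-laden illustration rather than the project's tarjan_visit: std HashMap/HashSet stand in for RapidMap/RapidSet, and the Graph alias, the sccs and next_index fields, and find_sccs are hypothetical. The lowlink of v is kept in a local variable and written back with a single get_mut after the dependency loop.

```rust
use std::collections::{HashMap, HashSet};
use std::path::{Path, PathBuf};

// Hypothetical adjacency list: file -> files it depends on.
type Graph = HashMap<PathBuf, Vec<PathBuf>>;

#[derive(Default)]
struct TarjanState {
    next_index: usize,
    indices: HashMap<PathBuf, usize>,
    lowlinks: HashMap<PathBuf, usize>,
    on_stack: HashSet<PathBuf>,
    stack: Vec<PathBuf>,
    sccs: Vec<Vec<PathBuf>>,
}

fn tarjan_visit(graph: &Graph, v: &Path, state: &mut TarjanState) {
    let v_index = state.next_index;
    state.next_index += 1;
    // Owned keys are still needed once per node when inserting.
    state.indices.insert(v.to_path_buf(), v_index);
    state.lowlinks.insert(v.to_path_buf(), v_index);
    state.stack.push(v.to_path_buf());
    state.on_stack.insert(v.to_path_buf());

    // Track v's lowlink locally instead of calling get_mut(v) per edge.
    let mut v_lowlink = v_index;
    for dep in graph.get(v).map(Vec::as_slice).unwrap_or(&[]) {
        if !state.indices.contains_key(dep) {
            tarjan_visit(graph, dep, state);
            v_lowlink = v_lowlink.min(state.lowlinks[dep]);
        } else if state.on_stack.contains(dep) {
            // Back edges read `indices`, never an ancestor's lowlink, so
            // the deferred write-back below cannot be observed early.
            v_lowlink = v_lowlink.min(state.indices[dep]);
        }
    }
    // Single borrowed-key write-back per node.
    if let Some(low) = state.lowlinks.get_mut(v) {
        *low = v_lowlink;
    }

    // v is the root of an SCC: pop the stack down to and including v.
    if v_lowlink == v_index {
        let mut scc = Vec::new();
        while let Some(w) = state.stack.pop() {
            state.on_stack.remove(&w);
            let is_root = w.as_path() == v;
            scc.push(w);
            if is_root {
                break;
            }
        }
        state.sccs.push(scc);
    }
}

fn find_sccs(graph: &Graph) -> Vec<Vec<PathBuf>> {
    let mut state = TarjanState::default();
    for v in graph.keys() {
        if !state.indices.contains_key(v) {
            tarjan_visit(graph, v, &mut state);
        }
    }
    state.sccs
}
```

Deferring the write is safe in this variant only because the on-stack branch uses the dependency's index; a variant that reads lowlinks of on-stack nodes would need the map kept up to date on every edge.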

Copilot AI left a comment

Pull request overview

This PR reduces heap allocations during incremental invalidation’s Tarjan SCC traversal by switching RapidMap lookups from &v.to_path_buf() (allocating) to borrowed &Path lookups, leveraging PathBuf: Borrow<Path> support in the underlying hash map. It also simplifies a couple of helper function signatures in the rule-engine to remove unnecessary explicit lifetimes.

Changes:

  • Avoids per-edge PathBuf allocations in Tarjan’s algorithm by using borrowed &Path for RapidMap::get(_mut) queries.
  • Removes explicit lifetimes from two internal helper functions in check_var.rs (constraints/transform checks).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| crates/rule-engine/src/check_var.rs | Removes unnecessary explicit lifetimes in helper function signatures. |
| crates/flow/src/incremental/invalidation.rs | Replaces allocating to_path_buf() map lookups with borrowed &Path lookups in Tarjan SCC traversal. |
