perf(ci): build heph bin + both cdylibs in one cargo invocation#126

Merged
raphaelvigee merged 1 commit into master from ci/combine-build-invocation on Jun 24, 2026

@raphaelvigee

Member

What

The build job ran three sequential cargo build calls — heph bin, plugin-go-cdylib, plugin-gha-cdylib. Collapse them into one:

cargo build --release --locked --target $T --bin heph --lib \
  -p heph -p plugin-go-cdylib -p plugin-gha-cdylib

--bin heph selects the binary; --lib adds every selected package's lib target (both cdylibs). heph's own lib is already built as the bin's dependency, so it costs nothing extra.
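
As a rough before/after sketch of the step: the three "before" commands below are an assumption about how the separate calls were shaped (this PR only states that there were three), and the default for `T`, the `DRY_RUN` switch, and the `run` helper are hypothetical additions so the commands can be inspected without building anything.

```shell
# Target triple; the default is a placeholder, CI supplies the real value.
T="${T:-x86_64-unknown-linux-gnu}"
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# ASSUMED shape of the old step: one cargo invocation per artifact.
build_separate() {
  run cargo build --release --locked --target "$T" -p heph --bin heph &&
  run cargo build --release --locked --target "$T" -p plugin-go-cdylib --lib &&
  run cargo build --release --locked --target "$T" -p plugin-gha-cdylib --lib
}

# New step, as given in this PR: one invocation, three artifacts.
build_combined() {
  run cargo build --release --locked --target "$T" --bin heph --lib \
    -p heph -p plugin-go-cdylib -p plugin-gha-cdylib
}
```

Calling `build_combined` with `DRY_RUN=0` runs the real build; with the default it only prints the command line.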

Why

The three crates share nearly the whole workspace dep graph, so the deps were already compiled only once. But under the release profile (lto = "thin", opt-level = 3) each artifact's final codegen/LTO pass is heavy: three separate invocations serialize those passes and re-pay cargo startup + freshness resolution each time. A single invocation lets cargo's jobserver overlap them, so one artifact's link tail can run while the next artifact's codegen keeps the remaining cores busy.

Why not split into parallel jobs

The original question was whether splitting core CLI + the 2 plugins into separate parallel jobs would be faster. It would not in the common case: each runner would have to recompile the shared dep graph cold (sccache is restored at job start, so concurrent jobs don't see each other's fresh writes), trading the serialized-LTO tail for a duplicated dep compile plus per-job nix-setup + artifact-download overhead. A job split only pays off in the warm-sccache + dominant-LTO regime, and even then is overhead-gated — worth measuring separately, not assuming.
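
One way to take that measurement, sketched under assumptions: a small wall-clock helper that can wrap either build variant on a runner, once with a cold sccache and once with a warm one. `time_cmd` is a hypothetical name, not something from this repo.

```shell
# Print "<label> <elapsed-seconds>" for a command and preserve its exit status.
# Resolution is whole seconds, which is enough for multi-minute release builds.
time_cmd() {
  label="$1"
  shift
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s)
  printf '%s %s\n' "$label" "$((end - start))"
  return "$status"
}

# Example (not executed here):
#   time_cmd combined cargo build --release --locked --target "$T" --bin heph --lib \
#     -p heph -p plugin-go-cdylib -p plugin-gha-cdylib
```

If a finer view is needed, cargo's own `--timings` flag reports per-unit build times, which should show whether the three artifacts' codegen actually overlaps.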

Test

CI-only YAML change. Validated locally that cargo accepts the combined target selection and compiles all three targets (--bin heph --lib -p heph -p plugin-go-cdylib -p plugin-gha-cdylib). The build matrix on this PR exercises it end-to-end.

🤖 Generated with Claude Code

The build step ran three sequential `cargo build` calls (heph bin,
plugin-go-cdylib, plugin-gha-cdylib). They share nearly the whole
workspace dep graph, so the deps were already compiled only once. But
under the release profile (thin-LTO + opt-level=3) each artifact's final
codegen/LTO pass is heavy, and three separate invocations serialize
those passes and re-pay cargo startup + freshness resolution each time.

Collapse them into a single invocation:

    cargo build --bin heph --lib -p heph -p plugin-go-cdylib -p plugin-gha-cdylib

`--bin heph` selects the binary; `--lib` adds every selected package's
lib target (both cdylibs). heph's own lib is already built as the bin's
dependency, so it costs nothing extra. cargo's jobserver can now overlap
the three artifacts' codegen — one artifact's link tail fills cores
while the next codegens — instead of running them back to back.

This does NOT split the build across parallel jobs: that would force
each runner to recompile the shared dep graph cold (sccache is restored
at job start, so concurrent jobs don't see each other's fresh writes),
trading the serialized-LTO tail for a duplicated dep compile.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@raphaelvigee raphaelvigee enabled auto-merge (squash) June 24, 2026 20:01
@raphaelvigee raphaelvigee merged commit 886a77c into master Jun 24, 2026
23 of 24 checks passed
@raphaelvigee raphaelvigee deleted the ci/combine-build-invocation branch June 24, 2026 20:15