feat(results): versioned/arch storage layout + celeris-results ingest #37
Open
FumingPower3925 wants to merge 2 commits into
This was referenced May 31, 2026
…rkflow
Migrate legacy schema_version 3.0 results into the v5.2 split tree and add the
docs-side storage/publish maintenance (#165 storage, #166 publish).
- results/ retree: v1.0.0/{x86.json,arm64.json} (3.0, arch "x86") lifted to
v1.0.0/20260320/{x86_64,arm64}/{summary.json,timeseries.json.gz,
histograms.json.gz,env.json}; latest/ now per-arch (latest/<arch>/) mirroring
the four-file split; legacy flat files removed.
- index.json -> schema "index/1": versions -> dates -> arches -> runs -> file
pointers + derived headline numbers; latest + default_run for O(1) resolution.
- scripts/: lib/results.mjs (shared), update-index.mjs (idempotent full-tree
rebuild, single source of truth = the files), refresh-latest.mjs (per-arch
mirror), validate-results.mjs (cell contract check), migrate-v1.mjs (one-shot
3.0->5.2 lift; no fabricated HDR/SLO/rated data, gaps recorded in
env.json.migration_notes).
- sync-benchmarks.yml: listen on canonical event_type celeris-results (+ manual
workflow_dispatch); validate -> update index.json -> refresh latest/ -> commit.
Single writer of index.json and latest/.
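A minimal sketch of the "idempotent full-tree rebuild" idea, not the actual update-index.mjs. It assumes the manifest is derived purely from the committed file paths. The function name buildIndex, the field names (dates, arches, files) and the shape of latest are hypothetical, and the runs level and derived headline numbers are left out for brevity.

```javascript
// Hypothetical sketch: derive an "index/1"-style manifest from cell file paths.
// Field names below are assumptions for illustration, not the real schema.
const CELL_FILES = ["summary.json", "timeseries.json.gz", "histograms.json.gz", "env.json"];
const ARCHES = ["x86_64", "arm64"];

function buildIndex(paths) {
  // version -> date -> arch -> Set of file names
  const tree = new Map();
  for (const p of paths) {
    // Only results/<version>/<yyyymmdd>/<arch>/<file> counts as a cell file.
    const m = /^results\/([^/]+)\/(\d{8})\/([^/]+)\/([^/]+)$/.exec(p);
    if (!m) continue;
    const [, version, date, arch, file] = m;
    if (!ARCHES.includes(arch) || !CELL_FILES.includes(file)) continue;
    if (!tree.has(version)) tree.set(version, new Map());
    const dates = tree.get(version);
    if (!dates.has(date)) dates.set(date, new Map());
    const arches = dates.get(date);
    if (!arches.has(arch)) arches.set(arch, new Set());
    arches.get(arch).add(file);
  }

  // Sort every level so the output does not depend on input order.
  // Note: plain string sort is naive for versions (v1.10.0 sorts before v1.2.0).
  const versions = [...tree.keys()].sort().map((version) => ({
    version,
    dates: [...tree.get(version).keys()].sort().map((date) => ({
      date,
      arches: [...tree.get(version).get(date).keys()].sort().map((arch) => ({
        arch,
        files: [...tree.get(version).get(date).get(arch)].sort(),
      })),
    })),
  }));

  // yyyymmdd strings compare correctly as plain strings.
  let latest = null;
  for (const v of versions) {
    for (const d of v.dates) {
      if (!latest || d.date > latest.date) latest = { version: v.version, date: d.date };
    }
  }
  return { schema: "index/1", versions, latest };
}

console.log(JSON.stringify(buildIndex([])));
// prints {"schema":"index/1","versions":[],"latest":null}
```

Because the output is a pure function of the file list, running the rebuild twice, or with the files listed in a different order, yields the same manifest.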
2b27525 to
aef563c
The earlier commit migrated the old benchmarks-repo v1.0.0 dataset into the
new layout. That data is FAKE (no cluster run has happened) and carries the
deprecated AWS-era environment provenance, which has been deleted from the
project. Accommodating it was wrong.
Removed:
- results/v1.0.0/** and results/latest/** (all fabricated)
- scripts/migrate-v1.mjs (one-shot migrator for the dead layout)
- the legacy x86.json comment reference in refresh-latest.mjs
The repo now ships only the infrastructure with NO data:
- results/index.json — an empty manifest (versions: [], latest: null)
- results/README.md — documents the layout the publish pipeline will fill
- scripts/{update-index,validate-results,refresh-latest,lib/results}.mjs
- .github/workflows/sync-benchmarks.yml (celeris-results ingest)
The first real data appears when the benchmark tier runs on the cluster.
update-index + validate-results both run clean on the empty tree.
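As an illustration of what "run clean on the empty tree" could mean for the manifest, here is a small sketch. validateManifest is a hypothetical helper, not the contents of validate-results.mjs, and the rule that latest must be null when versions is empty is an assumption drawn from the empty manifest described above.

```javascript
// Hypothetical manifest check; returns a list of error strings (empty = clean).
function validateManifest(manifest) {
  if (manifest === null || typeof manifest !== "object") {
    return ["manifest must be an object"];
  }
  const errors = [];
  if (manifest.schema !== "index/1") {
    errors.push(`unexpected schema: ${manifest.schema}`);
  }
  if (!Array.isArray(manifest.versions)) {
    errors.push("versions must be an array");
  } else if (manifest.versions.length === 0 && manifest.latest !== null) {
    // Assumption: with no data committed there is nothing for latest to point at.
    errors.push("latest must be null when there are no versions");
  }
  return errors;
}

// The empty manifest this commit ships (schema index/1, no versions, no latest).
const emptyManifest = { schema: "index/1", versions: [], latest: null };
console.log(validateManifest(emptyManifest).length); // prints 0
```

Returning a list of errors instead of throwing on the first one lets a CI step report every problem in a single run.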
Docs-side of goceleris/probatorium Benchmarking GA Phase 4 (#165 storage, #166 publish).
Establishes the clean results-storage layout + ingest pipeline for benchmark data. No data is committed — the repo ships only the infrastructure; the first real results land when the benchmark tier runs on the cluster.
- results/index.json: empty manifest (versions: [], latest: null, schema index/1).
- results/README.md: documents the results/<version>/<yyyymmdd>/<arch>/{summary.json,timeseries.json.gz,histograms.json.gz,env.json} layout (arch = x86_64 | arm64).
- scripts/: update-index.mjs (rebuild the manifest from committed cells), validate-results.mjs (schema-validate committed files), refresh-latest.mjs (mirror newest run into latest/), lib/results.mjs (shared constants). All run clean on the empty tree.
- .github/workflows/sync-benchmarks.yml: listens on the single celeris-results repository_dispatch event. probatorium commits the result files directly; this workflow rebuilds index.json, refreshes latest/, and validates the tree.

Pairs with the probatorium publish rewrite (mage Publish writes the tree + fires the pointer dispatch). Designed so a future dashboard/website can consume index.json + the per-cell JSON directly.
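A rough sketch of how a consumer might turn the manifest into file paths under the layout described above. The helpers cellPaths and resolveLatest are hypothetical, and so is the assumption that latest carries a version and a date; only the path layout and the two arch names come from this PR.

```javascript
// Hypothetical consumer-side helpers; the shape of index.latest is an assumption.
const CELL_FILES = ["summary.json", "timeseries.json.gz", "histograms.json.gz", "env.json"];
const ARCHES = ["x86_64", "arm64"];

// Build the four file paths of one cell: results/<version>/<yyyymmdd>/<arch>/<file>.
function cellPaths(version, date, arch) {
  if (!ARCHES.includes(arch)) throw new Error(`unknown arch: ${arch}`);
  if (!/^\d{8}$/.test(date)) throw new Error(`date must be yyyymmdd: ${date}`);
  return CELL_FILES.map((file) => `results/${version}/${date}/${arch}/${file}`);
}

// Resolve the newest cell for an arch without scanning the tree.
// Returns null while the manifest is still empty (latest: null).
function resolveLatest(index, arch) {
  if (!index.latest) return null;
  return cellPaths(index.latest.version, index.latest.date, arch);
}

console.log(resolveLatest({ schema: "index/1", versions: [], latest: null }, "arm64"));
// prints null
```

Rejecting unknown arch values such as the legacy "x86" keeps a consumer from silently requesting paths that the layout never produces.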