feat(results): versioned/arch storage layout + celeris-results ingest #37

Open
FumingPower3925 wants to merge 2 commits into main from feat/benchmarking-storage

FumingPower3925 (Contributor) commented May 31, 2026

Docs-side of goceleris/probatorium Benchmarking GA Phase 4 (#165 storage, #166 publish).

Establishes the clean results-storage layout + ingest pipeline for benchmark data. No data is committed — the repo ships only the infrastructure; the first real results land when the benchmark tier runs on the cluster.

  • results/index.json: empty manifest (versions: [], latest: null, schema index/1).
  • results/README.md: documents the results/<version>/<yyyymmdd>/<arch>/{summary.json,timeseries.json.gz,histograms.json.gz,env.json} layout (arch = x86_64 | arm64).
  • scripts/: update-index.mjs (rebuilds the manifest from committed cells), validate-results.mjs (schema-validates committed files), refresh-latest.mjs (mirrors the newest run into latest/), lib/results.mjs (shared constants). All run clean on the empty tree.
  • .github/workflows/sync-benchmarks.yml: listens for the single celeris-results repository_dispatch event. probatorium commits the result files directly; this workflow then rebuilds index.json, refreshes latest/, and validates the tree.
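
To make the storage layout above concrete, here is a minimal sketch of path helpers for one cell. It is not the contents of lib/results.mjs: the names `ARCHES`, `CELL_FILES`, `cellPath`, and `parseCellPath` are hypothetical, and the four file names are taken from the layout described in this PR.

```javascript
// Hypothetical helpers for the results/<version>/<yyyymmdd>/<arch>/ layout.
// These names are assumptions for illustration, not the lib/results.mjs API.
const ARCHES = ["x86_64", "arm64"];
const CELL_FILES = ["summary.json", "timeseries.json.gz", "histograms.json.gz", "env.json"];

// Build the repo-relative path of one file inside a cell.
function cellPath(version, date, arch, file) {
  if (!/^\d{8}$/.test(date)) throw new Error(`bad date: ${date}`);
  if (!ARCHES.includes(arch)) throw new Error(`bad arch: ${arch}`);
  if (!CELL_FILES.includes(file)) throw new Error(`bad file: ${file}`);
  return ["results", version, date, arch, file].join("/");
}

// Parse a path back into its components; returns null for anything off-layout.
function parseCellPath(path) {
  const parts = path.split("/");
  if (parts.length !== 5 || parts[0] !== "results") return null;
  const [, version, date, arch, file] = parts;
  if (!/^\d{8}$/.test(date)) return null;
  if (!ARCHES.includes(arch) || !CELL_FILES.includes(file)) return null;
  return { version, date, arch, file };
}
```

Rejecting the legacy `x86` arch name at this level keeps a stale producer from silently creating a third architecture directory.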

Pairs with the probatorium publish rewrite (mage Publish writes the tree + fires the pointer dispatch). Designed so a future dashboard/website can consume index.json + the per-cell JSON directly.
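
As a rough illustration of how a dashboard might consume the manifest, the sketch below handles the empty state this PR ships. The key name `schema` and the function `describeManifest` are assumptions; only `versions`, `latest`, and the value `index/1` come from the description above.

```javascript
// Sketch of a manifest consumer. The `schema` key name and the return shape
// are assumptions; the PR only states versions: [], latest: null, schema index/1.
function describeManifest(index) {
  if (index.schema !== "index/1") {
    throw new Error(`unsupported schema: ${index.schema}`);
  }
  if (!Array.isArray(index.versions)) {
    throw new Error("versions must be an array");
  }
  // An empty manifest is valid: no benchmark run has been published yet.
  if (index.versions.length === 0 || index.latest === null) {
    return { hasData: false, latest: null };
  }
  return { hasData: true, latest: index.latest };
}

// The empty manifest as described in this PR (key name `schema` assumed).
const emptyManifest = { schema: "index/1", versions: [], latest: null };
```

Treating "no data yet" as a normal result rather than an error lets a consumer render a placeholder until the first cluster run lands.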

Note: the previous revision of this PR mistakenly migrated the old benchmarks-repo v1.0.0 dataset (fabricated data + deprecated AWS-era environment). That has been fully removed — nothing here is real data, and there are zero references to the old infrastructure.

…rkflow

Migrate legacy schema_version 3.0 results into the v5.2 split tree and add the
docs-side storage/publish maintenance (#165 storage, #166 publish).

- results/ retree: v1.0.0/{x86.json,arm64.json} (3.0, arch "x86") lifted to
  v1.0.0/20260320/{x86_64,arm64}/{summary.json,timeseries.json.gz,
  histograms.json.gz,env.json}; latest/ now per-arch (latest/<arch>/) mirroring
  the four-file split; legacy flat files removed.
- index.json -> schema "index/1": versions -> dates -> arches -> runs -> file
  pointers + derived headline numbers; latest + default_run for O(1) resolution.
- scripts/: lib/results.mjs (shared), update-index.mjs (idempotent full-tree
  rebuild, single source of truth = the files), refresh-latest.mjs (per-arch
  mirror), validate-results.mjs (cell contract check), migrate-v1.mjs (one-shot
  3.0->5.2 lift; no fabricated HDR/SLO/rated data, gaps recorded in
  env.json.migration_notes).
- sync-benchmarks.yml: listen on canonical event_type celeris-results (+ manual
  workflow_dispatch); validate -> update index.json -> refresh latest/ -> commit.
  Single writer of index.json and latest/.
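
The "idempotent full-tree rebuild" idea can be sketched as a pure function over the committed file paths. This is a simplified illustration, not update-index.mjs: `buildIndex` and `CELL_RE` are hypothetical names, the runs, file-pointer, headline-number, and default_run fields are omitted, and picking `latest` by plain string sort is a simplification.

```javascript
// Sketch: derive versions -> dates -> arches from committed cell file paths.
// Output shape is an assumption; the real index/1 schema carries more fields.
const CELL_RE = /^results\/([^/]+)\/(\d{8})\/(x86_64|arm64)\/([^/]+)$/;

function buildIndex(paths) {
  const tree = new Map(); // version -> date -> arch -> Set of file names
  for (const p of paths) {
    const m = CELL_RE.exec(p);
    if (!m) continue; // ignore latest/ mirrors, README.md, index.json, etc.
    const [, version, date, arch, file] = m;
    if (!tree.has(version)) tree.set(version, new Map());
    const dates = tree.get(version);
    if (!dates.has(date)) dates.set(date, new Map());
    const arches = dates.get(date);
    if (!arches.has(arch)) arches.set(arch, new Set());
    arches.get(arch).add(file);
  }
  // Sort at every level so the same tree always yields byte-identical output.
  const versions = [...tree.keys()].sort().map((version) => ({
    version,
    dates: [...tree.get(version).keys()].sort().map((date) => ({
      date,
      arches: [...tree.get(version).get(date).keys()].sort().map((arch) => ({
        arch,
        files: [...tree.get(version).get(date).get(arch)].sort(),
      })),
    })),
  }));
  // Simplification: string order, not semantic-version order.
  const latest = versions.length ? versions[versions.length - 1].version : null;
  return { schema: "index/1", versions, latest };
}
```

Because the function depends only on the set of paths, running it twice or feeding the paths in a different order produces the same manifest, which is what makes the files the single source of truth.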

FumingPower3925 force-pushed the feat/benchmarking-storage branch from 2b27525 to aef563c on May 31, 2026 06:41

The earlier commit migrated the old benchmarks-repo v1.0.0 dataset into the
new layout. That data is FAKE (no cluster run has happened) and carries the
deprecated AWS-era environment provenance, which has been deleted from the
project. Accommodating it was wrong.

Removed:
- results/v1.0.0/** and results/latest/** (all fabricated)
- scripts/migrate-v1.mjs (one-shot migrator for the dead layout)
- the legacy x86.json comment reference in refresh-latest.mjs

The repo now ships only the infrastructure with NO data:
- results/index.json — an empty manifest (versions: [], latest: null)
- results/README.md — documents the layout the publish pipeline will fill
- scripts/{update-index,validate-results,refresh-latest,lib/results}.mjs
- .github/workflows/sync-benchmarks.yml (celeris-results ingest)

The first real data appears when the benchmark tier runs on the cluster.
update-index + validate-results both run clean on the empty tree.
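
To show why validation passes trivially on an empty tree, here is a minimal sketch of a cell completeness check. It is not validate-results.mjs, which also schema-validates file contents; `REQUIRED_FILES` and `validateCells` are hypothetical names, and the input shape is an assumption.

```javascript
// Sketch of a cell completeness check (file presence only, no schema checks).
const REQUIRED_FILES = ["summary.json", "timeseries.json.gz", "histograms.json.gz", "env.json"];

// `cells` maps a cell directory (results/<version>/<yyyymmdd>/<arch>) to the
// list of file names found in it. Returns a list of human-readable errors.
function validateCells(cells) {
  const errors = [];
  for (const [dir, files] of Object.entries(cells)) {
    for (const required of REQUIRED_FILES) {
      if (!files.includes(required)) errors.push(`${dir}: missing ${required}`);
    }
    for (const file of files) {
      if (!REQUIRED_FILES.includes(file)) errors.push(`${dir}: unexpected ${file}`);
    }
  }
  return errors;
}
```

With no cells there is nothing to iterate over, so the error list is empty and the check exits clean.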