Pin Crashlanded no-agent baseline (N=4, scoring 1.1)#30

Merged: jkbennitt merged 1 commit into master from feat/crashlanded-baseline on Jun 10, 2026

@jkbennitt

Member

What

Pins the unmanaged (no-agent) reference for Crashlanded Survival that agent runs are measured against.

Generated by scripts/calibrate_baseline.py crashlanded --seeds 42 43 44 45 against the live game (PR #29 infrastructure). Four seeded --no-agent --until-death colonies, each reloading rle_crashlanded_v1 fresh and running to natural death.

Result

Runs: N=4, seeds 42-45
Outcomes: defeat, defeat, defeat, defeat (all colonists dead)
Mean time-to-end: 8.0d (95% CI 4.5-11.5d)
Trajectory: 244 points
scoring_version: 1.1
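
For readers checking the interval: the reported 95% CI has a half-width of 3.5d around the 8.0d mean. The PR does not say how calibrate_baseline.py computes it, so the following is only a sketch of one common approach, a Student's t interval, which is the usual choice at N=4. The function name `mean_ci95` and the sample values are hypothetical; the per-run times from this calibration are not reproduced here.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def mean_ci95(samples):
    """Return (mean, low, high) for a t-based 95% confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    # Standard error of the mean, using the sample (n - 1) standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(n)
    half_width = T_CRIT_95[n - 1] * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical time-to-end values in days, NOT the runs from this PR.
days = [6.0, 7.0, 9.0, 10.0]
mean, low, high = mean_ci95(days)
print(f"{mean:.1f}d (95% CI {low:.1f}-{high:.1f}d)")
```

With only three degrees of freedom the critical value is 3.182 rather than the large-sample 1.96, which is one reason a four-run baseline produces a wide interval.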

Provenance (pinned, loader fails fast on drift)

  • save_sha256 29530cd8... (matches scenario YAML pin)
  • rimapi_dll_sha256 8b26c382...
  • rle_commit 1961e91
  • scoring_version 1.1
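
The fail-fast behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's loader: the `Provenance` dataclass, `ProvenanceDrift` exception, and `check_provenance` function are hypothetical names, and the hash strings are placeholders because the PR only shows truncated digests.

```python
from dataclasses import dataclass, fields, replace


class ProvenanceDrift(RuntimeError):
    """Raised when a pinned provenance field no longer matches."""


@dataclass(frozen=True)
class Provenance:
    save_sha256: str
    rimapi_dll_sha256: str
    rle_commit: str
    scoring_version: str


def check_provenance(pinned, current):
    """Compare every pinned field and raise, naming all that drifted."""
    drifted = [
        f.name
        for f in fields(Provenance)
        if getattr(pinned, f.name) != getattr(current, f.name)
    ]
    if drifted:
        raise ProvenanceDrift("provenance drift in: " + ", ".join(drifted))


# Placeholder hashes; rle_commit and scoring_version are the values in this PR.
pinned = Provenance(
    save_sha256="<pinned-save-hash>",
    rimapi_dll_sha256="<pinned-dll-hash>",
    rle_commit="1961e91",
    scoring_version="1.1",
)

check_provenance(pinned, pinned)  # identical provenance passes silently

try:
    check_provenance(pinned, replace(pinned, scoring_version="1.2"))
except ProvenanceDrift as exc:
    print(exc)
```

Raising on the first load, rather than warning, keeps an agent run from being scored against a baseline produced under a different save, DLL, commit, or scoring version.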

Notes

  • Per-run CSVs live under results/baseline/ (gitignored); only the aggregated sidecar is committed.
  • The sidecar was round-tripped through the strict load_baseline at calibration time, and tests/unit/test_baseline.py passes.
  • Data-only change (no code touched).
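
A round-trip check of the kind mentioned in the notes might look like the sketch below. It assumes a JSON sidecar and a made-up key set. The real sidecar format and the real load_baseline signature are not shown in this PR, so `load_strict`, `round_trip`, and `REQUIRED_KEYS` are hypothetical stand-ins. The values are taken from the Result section; the 244-point trajectory is omitted.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema: the loader rejects both missing and unexpected keys.
REQUIRED_KEYS = {
    "scenario", "seeds", "outcomes", "mean_days", "ci95_days", "scoring_version",
}


def load_strict(path):
    """Load a sidecar and reject any deviation from the expected key set."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("sidecar must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    unexpected = data.keys() - REQUIRED_KEYS
    if missing or unexpected:
        raise ValueError(
            f"sidecar schema mismatch: missing={sorted(missing)}, "
            f"unexpected={sorted(unexpected)}"
        )
    return data


def round_trip(sidecar, path):
    """Write the sidecar, reload it strictly, and confirm nothing changed."""
    Path(path).write_text(json.dumps(sidecar, indent=2, sort_keys=True), encoding="utf-8")
    loaded = load_strict(path)
    if loaded != sidecar:
        raise ValueError("sidecar changed across write/load")
    return loaded


sidecar = {
    "scenario": "rle_crashlanded_v1",
    "seeds": [42, 43, 44, 45],
    "outcomes": ["defeat"] * 4,
    "mean_days": 8.0,
    "ci95_days": [4.5, 11.5],
    "scoring_version": "1.1",
}

with tempfile.TemporaryDirectory() as tmp:
    round_trip(sidecar, Path(tmp) / "baseline.json")
```

Running the strict loader at write time means a malformed sidecar is caught during calibration instead of at the start of a later agent run.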

🤖 Generated with Claude Code

Calibrated via scripts/calibrate_baseline.py against the live game:
four seeded --no-agent --until-death runs (seeds 42-45), each reloading
rle_crashlanded_v1 fresh and running to natural colony death.

All four end in DEFEAT (all colonists dead), mean time-to-end 8.0d
(95% CI 4.5-11.5d), 244-point score trajectory. Provenance pinned in
the sidecar: save_sha256, rimapi_dll_sha256, rle_commit 1961e91,
scoring_version 1.1. Loader fails fast if any of these drift.

This is the unmanaged reference the agent runs are measured against.

Co-Authored-By: Claude Fable 5 <[email protected]>
@jkbennitt jkbennitt merged commit e091252 into master Jun 10, 2026
3 checks passed
@jkbennitt jkbennitt deleted the feat/crashlanded-baseline branch June 10, 2026 08:34