Fix run-integrity issues #25/#26/#27 from the Fable 5 live run by jkbennitt · Pull Request #28 · AppSprout-dev/RLE

jkbennitt · 2026-06-10T03:56:42Z

Closes #25, closes #26, closes #27. All three were found via the Fable 5 live run's artifacts (results/fable5-live-N1).

#25 — threat_response unwinnable

first_draft_tick is now actually written: successful draft executions record per-threat response delay in loop ticks (_record_draft_response); threats appearing while colonists are already drafted record an instant (0-tick) response.
Null incident placeholders (enemy_count=0 && threat_level=0.0) no longer enter threats_seen — the live run's "threat" was a null /incidents entry, and DefenseCommander's refusal to draft was correct.
SCORING_VERSION 1.0 → 1.1: v1.0 scores with non-empty threats_seen are not comparable (with the artifact excluded, the Fable run's 0.754 recomputes to ~0.834).
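
A minimal sketch of the tracking described above, not the PR's actual code. The names `ThreatRecord`, `ThreatTracker`, `observe`, and `is_null_incident` are hypothetical, as is the assumption that incidents arrive as dicts keyed by an id; only `first_draft_tick`, `threats_seen`, `enemy_count`, and `threat_level` come from the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThreatRecord:
    first_seen_tick: int
    first_draft_tick: Optional[int] = None  # stays None until a draft succeeds

    @property
    def response_delay(self) -> Optional[int]:
        """Delay in loop ticks, or None if nobody has been drafted yet."""
        if self.first_draft_tick is None:
            return None
        return self.first_draft_tick - self.first_seen_tick


def is_null_incident(incident: dict) -> bool:
    # Placeholder entries report no enemies and a zero threat level.
    return incident.get("enemy_count", 0) == 0 and incident.get("threat_level", 0.0) == 0.0


@dataclass
class ThreatTracker:
    threats_seen: dict = field(default_factory=dict)

    def observe(self, incident_id: str, incident: dict, tick: int, colonists_drafted: bool) -> None:
        if is_null_incident(incident) or incident_id in self.threats_seen:
            return  # placeholders never register; known threats keep their record
        record = ThreatRecord(first_seen_tick=tick)
        if colonists_drafted:
            record.first_draft_tick = tick  # already drafted: instant (0-tick) response
        self.threats_seen[incident_id] = record

    def record_draft_response(self, tick: int) -> None:
        # Call after a successful draft execution; stamps every unanswered threat.
        for record in self.threats_seen.values():
            if record.first_draft_tick is None:
                record.first_draft_tick = tick
```

With this shape, a run that only ever sees null placeholders keeps `threats_seen` empty, so a response metric computed over it is not forced to zero.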

#26 — wandering shelter

First successful get_terrain_summary is pinned for the client lifetime. Terrain is static; re-deriving the colony center from live pawn positions made the verified sites chase the builders (10 blueprints, 10 sites, 0 shelters, mood 0.52→0.36).
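
The pinning behavior can be sketched roughly as follows. This is an illustration under assumptions, not the client in the PR: `PinnedTerrainClient` and the injected `fetch_summary` callable are hypothetical, and a failed fetch is assumed to return `None`.

```python
from typing import Callable, Optional


class PinnedTerrainClient:
    """Caches the first successful terrain summary for the object's lifetime."""

    def __init__(self, fetch_summary: Callable[[], Optional[dict]]) -> None:
        self._fetch_summary = fetch_summary  # hypothetical: returns None on failure
        self._pinned: Optional[dict] = None

    def get_terrain_summary(self) -> Optional[dict]:
        if self._pinned is not None:
            return self._pinned  # terrain is static, so never re-derive
        summary = self._fetch_summary()
        if summary is not None:
            self._pinned = summary  # pin only a successful result
        return summary
```

Failed fetches are not pinned, so the client keeps retrying until one succeeds; after that, later pawn movement cannot shift the reported sites.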

#27 — pawn-targeted writes 1/13

work_priority accepts the three shapes models emit ({"<WorkType>": n} documented, {"work_type": X, "priority": n}, {"work_priorities": {...}}) — previously the second shape posted literal work="work_type", priority="Research" to RIMAPI, which is what the "null-ref" actually was.
job_assign accepts the job alias and maps x/z to target_position; an absent job name now fails visibly (ActionOutcome error) instead of posting an empty JobDef.
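
As a rough sketch of the work_priority normalization, the three shapes can be folded into one mapping before anything is posted. The function name `normalize_work_priority` is hypothetical, and the sketch assumes the pawn identifier has already been removed from `params` and that priorities are integers; raising `ValueError` stands in for the visible ActionOutcome error.

```python
from typing import Any


def normalize_work_priority(params: dict[str, Any]) -> dict[str, int]:
    """Return {work_type: priority}, or raise ValueError for unusable params."""
    if "work_priorities" in params:
        candidate = params["work_priorities"]  # nested shape
    elif "work_type" in params:
        if "priority" not in params:
            raise ValueError("work_type given without a priority")
        candidate = {params["work_type"]: params["priority"]}  # keyed shape
    else:
        candidate = params  # documented shape: {"<WorkType>": n}

    if not isinstance(candidate, dict) or not candidate:
        raise ValueError(f"unrecognized work_priority params: {params!r}")

    normalized: dict[str, int] = {}
    for work_type, priority in candidate.items():
        # bool is a subclass of int, so reject it explicitly.
        if (
            not isinstance(work_type, str)
            or isinstance(priority, bool)
            or not isinstance(priority, int)
        ):
            raise ValueError(f"bad work_priority entry: {work_type!r}={priority!r}")
        normalized[work_type] = priority
    return normalized
```

Checking for the `work_type` key before falling back to the documented shape is what keeps the literal key name from being posted as a work type.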

Observability

ActionOutcome + ACTION_EXEC events now carry action parameters (root-causing #27 required exhuming payloads from raw LLM text).
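
One way to picture the change is shown below; it is a sketch only. `OutcomeSketch`, its fields, the event layout produced by `to_action_exec_event`, and the error text in the usage note are all assumptions, since the real ActionOutcome definition is not shown here.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class OutcomeSketch:
    action: str
    success: bool
    parameters: dict[str, Any] = field(default_factory=dict)  # the payload that was sent
    error: Optional[str] = None


def to_action_exec_event(outcome: OutcomeSketch, tick: int) -> dict[str, Any]:
    # Flatten the outcome into the event so a failure can be diagnosed
    # from the event log alone, without going back to raw LLM text.
    return {"event": "ACTION_EXEC", "tick": tick, **asdict(outcome)}
```

For example, `to_action_exec_event(OutcomeSketch("stockpile_zone", False, {"priority": "Important"}, "invalid priority"), tick=1)` yields an event whose `parameters` field shows the offending string directly.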

Verification

440 tests pass (17 new across executor normalization, threat tracking, terrain pinning), ruff clean, mypy strict clean.
Live 2-tick preflight pending RimWorld being loaded — will report results on this PR before merge.

🤖 Generated with Claude Code

#25: threat_response was unwinnable. first_draft_tick was never written anywhere, so the metric zeroed permanently once any threat registered, and null incident placeholders (enemy_count=0, threat_level=0.0) counted as threats. Draft-action execution now records per-threat response delays (instant if already drafted when the threat appears), placeholders are filtered, and SCORING_VERSION bumps to 1.1; 1.0 scores with non-empty threats_seen are not comparable.

#26: MAP_SUMMARY sites chased the builders. The colony center was re-derived from live pawn positions every tick, invalidating the previous shelter blueprint (10 sites in 10 ticks, none built). The first successful terrain summary is now pinned for the client's lifetime; terrain is static.

#27: pawn-targeted writes were 1/13. work_priority params passed verbatim posted garbage (work="work_type", priority="Research") and job_assign defaulted a missing job_def to an empty string. Both handlers now normalize the shapes models actually emit, and unnormalizable params fail visibly through ActionOutcome instead of posting junk.

Observability: ActionOutcome (and ACTION_EXEC events) now carry the action parameters. Debugging #27 required digging payloads out of raw LLM output because events omitted them.

Also includes the launch-chart/quote-card generation script for the Fable 5 run artifacts.

Co-Authored-By: Claude Fable 5 <[email protected]>

jkbennitt · 2026-06-10T04:13:09Z

Live 2-tick preflight complete (results/fable5-preflight5, Fable 5, seed 42). All five verification targets pass:

Target	Before (live-N1)	After
`threat_response` with no hostiles	0.0 (phantom null incident)	1.000
Pawn-targeted write success	1/13	3/3 (both `work_type` and nested `work_priorities` shapes normalized; `bed_rest` also landing)
Shelter site across ticks	new site every tick	(133,138) both ticks — pin holds
`ACTION_EXEC` parameters	absent	present — and immediately useful: both remaining failures are fully diagnosable from the event alone
`scoring_version`	1.0	1.1

Composite 0.859 over 2 ticks, 14/14 deliberations parsed, zero retries, $0 on subscription.

Two new visible (non-silent) failures surfaced by the params observability, neither in scope here: stockpile_zone priority sent as the string "Important", and a t1 growing_zone rejected with "Invalid plant definition: Plant_Rice" while t0's 'plant': 'rice' succeeded — RIMAPI def-name quirk worth a small follow-up issue.

jkbennitt merged commit 34cb377 into master Jun 10, 2026
3 checks passed

jkbennitt deleted the fix/run-integrity branch June 10, 2026 04:13

jkbennitt merged 1 commit into master from fix/run-integrity
