Fix run-integrity issues #25/#26/#27 from the Fable 5 live run #28
Merged
#25: threat_response was unwinnable. first_draft_tick was never written anywhere, so the metric zeroed permanently once any threat registered, and null incident placeholders (enemy_count=0, threat_level=0.0) counted as threats. Draft-action execution now records per-threat response delays (instant if already drafted when the threat appears), placeholders are filtered, and SCORING_VERSION bumps to 1.1; 1.0 scores with non-empty threats_seen are not comparable.

#26: MAP_SUMMARY sites chased the builders. The colony center was re-derived from live pawn positions every tick, invalidating the previous shelter blueprint (10 sites in 10 ticks, none built). The first successful terrain summary is now pinned for the client's lifetime; terrain is static.

#27: pawn-targeted writes were 1/13. work_priority params passed verbatim posted garbage (work="work_type", priority="Research") and job_assign defaulted a missing job_def to an empty string. Both handlers now normalize the shapes models actually emit, and unnormalizable params fail visibly through ActionOutcome instead of posting junk.

Observability: ActionOutcome (and ACTION_EXEC events) now carry the action parameters; debugging #27 required digging payloads out of raw LLM output because events omitted them.

Also includes the launch-chart/quote-card generation script for the Fable 5 run artifacts.

Co-Authored-By: Claude Fable 5 <[email protected]>
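The #25 behaviour described in the commit message (placeholder filtering plus per-threat response delays, with an instant response when colonists are already drafted) might look roughly like the Python sketch below. This is an illustration only, not the project's code: `ThreatTracker`, `Threat`, `observe`, `first_seen_tick`, and the `drafted` flag are hypothetical names, and only `first_draft_tick`, `threats_seen`, `enemy_count`, and `threat_level` come from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Threat:
    """Hypothetical threat record (not the project's actual type)."""
    threat_id: str
    enemy_count: int
    threat_level: float
    first_seen_tick: int
    first_draft_tick: Optional[int] = None  # None until a draft responds


def is_placeholder(enemy_count: int, threat_level: float) -> bool:
    # Null incident entries carry no enemies and no threat level.
    return enemy_count == 0 and threat_level == 0.0


@dataclass
class ThreatTracker:
    """Hypothetical tracker; assumes drafts are never undone (a simplification)."""
    threats_seen: Dict[str, Threat] = field(default_factory=dict)
    drafted: bool = False

    def observe(self, threat_id: str, enemy_count: int,
                threat_level: float, tick: int) -> None:
        # Placeholders and already-known threats are ignored.
        if is_placeholder(enemy_count, threat_level) or threat_id in self.threats_seen:
            return
        threat = Threat(threat_id, enemy_count, threat_level, first_seen_tick=tick)
        if self.drafted:
            # Colonists are already drafted: instant (0-tick) response.
            threat.first_draft_tick = tick
        self.threats_seen[threat_id] = threat

    def record_draft_response(self, tick: int) -> None:
        # Called after a successful draft action.
        self.drafted = True
        for threat in self.threats_seen.values():
            if threat.first_draft_tick is None:
                threat.first_draft_tick = tick

    def response_delays(self) -> Dict[str, Optional[int]]:
        # Delay in loop ticks per threat; None means no response yet.
        return {
            tid: None if t.first_draft_tick is None
            else t.first_draft_tick - t.first_seen_tick
            for tid, t in self.threats_seen.items()
        }


tracker = ThreatTracker()
tracker.observe("null-incident", 0, 0.0, tick=1)  # filtered as a placeholder
tracker.observe("raid-1", 3, 0.6, tick=2)
tracker.record_draft_response(tick=5)
tracker.observe("raid-2", 2, 0.4, tick=7)         # appears while already drafted
delays = tracker.response_delays()
```

In this sketch the placeholder never enters `threats_seen`, `raid-1` gets a 3-tick delay, and `raid-2` gets a 0-tick delay.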
Live 2-tick preflight complete (
Composite 0.859 over 2 ticks, 14/14 deliberations parsed, zero retries, $0 on subscription. Two new visible (non-silent) failures surfaced by the params observability, neither in scope here:
Closes #25, closes #26, closes #27. All three were found via the Fable 5 live run's artifacts (results/fable5-live-N1).
#25: threat_response unwinnable
- first_draft_tick is now actually written: successful draft executions record the per-threat response delay in loop ticks (_record_draft_response); threats that appear while colonists are already drafted record an instant (0-tick) response.
- Null incident placeholders (enemy_count=0 && threat_level=0.0) no longer enter threats_seen. The live run's "threat" was a null /incidents entry, and DefenseCommander's refusal to draft was correct.
- SCORING_VERSION 1.0 → 1.1: v1.0 scores with non-empty threats_seen are not comparable (the Fable run's 0.754 ex-artifact recomputes to ~0.834).

#26: wandering shelter
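As a rough illustration of the pinning idea behind the #26 fix (keep the first successful terrain summary and stop re-deriving it), here is a minimal Python sketch. `PinnedTerrainSummary`, the `fetch` callable, and the `center` key are hypothetical; the real client and the shape of the `get_terrain_summary` result are not shown in this PR.

```python
from typing import Callable, Optional


class PinnedTerrainSummary:
    """Caches the first successful summary for this object's lifetime."""

    def __init__(self, fetch: Callable[[], Optional[dict]]) -> None:
        self._fetch = fetch  # hypothetical fetcher; returns None on failure
        self._pinned: Optional[dict] = None

    def get(self) -> Optional[dict]:
        if self._pinned is None:
            # Not pinned yet: try again. A failed fetch (None) is retried later.
            self._pinned = self._fetch()
        return self._pinned


# Simulated responses: one failure, then two different "centers".
responses = iter([None, {"center": (10, 12)}, {"center": (40, 3)}])
pinned = PinnedTerrainSummary(lambda: next(responses))
results = [pinned.get() for _ in range(4)]
```

After the first success the fetcher is no longer called, so later calls keep returning the same summary even though the simulated source would have produced a different center.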
The first successful get_terrain_summary is pinned for the client lifetime. Terrain is static; re-deriving the colony center from live pawn positions made the verified sites chase the builders (10 blueprints, 10 sites, 0 shelters, mood 0.52→0.36).

#27: pawn-targeted writes 1/13
- work_priority accepts the three shapes models emit: {"<WorkType>": n} (documented), {"work_type": X, "priority": n}, and {"work_priorities": {...}}. Previously the second shape posted literal work="work_type", priority="Research" to RIMAPI, which is what the "null-ref" actually was.
- job_assign accepts the job alias and maps x/z to target_position; an absent job name now fails visibly (ActionOutcome error) instead of posting an empty JobDef.

Observability
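To show how the #27 normalization and a parameter-carrying outcome could fit together, here is a hedged Python sketch. It is not the project's handler: this `ActionOutcome` is a hypothetical stand-in with assumed fields (`ok`, `parameters`, `error`), and `normalize_work_priority` and `execute_work_priority` are illustrative names. Only the three parameter shapes come from the PR text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ActionOutcome:
    """Hypothetical stand-in; the real ActionOutcome's fields are assumed."""
    ok: bool
    parameters: Dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


def normalize_work_priority(params: Dict[str, Any]) -> Dict[str, int]:
    """Return {work_type: priority}, or raise ValueError if unusable."""
    if isinstance(params.get("work_priorities"), dict):
        raw = params["work_priorities"]                    # {"work_priorities": {...}}
    elif "work_type" in params and "priority" in params:
        raw = {params["work_type"]: params["priority"]}    # {"work_type": X, "priority": n}
    else:
        raw = params                                       # {"<WorkType>": n}
    if not raw:
        raise ValueError("no work priorities given")
    normalized: Dict[str, int] = {}
    for work_type, priority in raw.items():
        # Reject anything that is not a name mapped to a plain integer.
        if (not isinstance(work_type, str) or isinstance(priority, bool)
                or not isinstance(priority, int)):
            raise ValueError(f"unusable entry: {work_type!r}={priority!r}")
        normalized[work_type] = priority
    return normalized


def execute_work_priority(params: Dict[str, Any]) -> ActionOutcome:
    try:
        normalize_work_priority(params)
    except ValueError as exc:
        # Fail visibly and keep the offending parameters for debugging.
        return ActionOutcome(ok=False, parameters=params, error=str(exc))
    # A real handler would post the normalized priorities to the game API here.
    return ActionOutcome(ok=True, parameters=params)


good = execute_work_priority({"work_type": "Research", "priority": 2})
bad = execute_work_priority({"work_type": "Research"})  # priority missing
```

In this sketch the malformed call yields a failed outcome that still carries its parameters, rather than a junk write.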
ActionOutcome + ACTION_EXEC events now carry action parameters (root-causing #27 required exhuming payloads from raw LLM text).

Verification
🤖 Generated with Claude Code