erain · erain · Jun 10, 2026 · Jun 10, 2026
diff --git a/README.md b/README.md
@@ -307,10 +307,50 @@ system prompts; unknown versions error with the available list.
 
 **Long context.** `AgentOptions.Compactor` + `CompactionThreshold`.
 `glue.KeepRecentMessages(n)` is the zero-dependency default;
-`SummarizingCompactor` is token-aware
+`SummarizingCompactor` is token-aware and produces a **structured
+state snapshot** (goal / constraints / progress / next steps, exact
+paths and errors preserved) with a prompt-injection firewall, a
+cumulative read/modified **file ledger** that survives repeated
+compactions, splits that never sever a tool-call/result pair, and an
+inflation guard; `KeepRecentTokens` keeps the recent tail by token
+budget instead of message count
 ([ADR-0002](docs/adr/0002-context-compaction.md),
 [ADR-0007](docs/adr/0007-memory-layer.md)).
 
+## Harness reliability
+
+The loop absorbs the failure shapes that waste agent turns — on by
+default, each with an opt-out
+([docs/coding-harness-roadmap.md](docs/coding-harness-roadmap.md)
+records the analysis behind them):
+
+- **History hardening.** Every run repairs the transcript first:
+  dangling tool calls from an interrupted turn get synthesized error
+  results, orphaned results and empty turns are dropped, and turns
+  from a different model lose their thinking signatures — the things
+  providers reject with opaque 400s (`loop.HardenHistory`).
+- **Classified retries.** Transient provider failures (429/5xx,
+  dropped streams) retry with backoff, honoring `Retry-After` /
+  Gemini `RetryInfo` hints; auth and invalid-request errors fail
+  fast. Context overflow surfaces as a typed `*loop.OverflowError`
+  that sessions answer by compacting once and retrying once. Opt out
+  with `RunRequest.Retry.Disabled`
+  ([ADR-0017](docs/adr/0017-loop-retry-overflow-recovery.md)).
+- **Guardrails.** Repeating the same tool call with identical
+  arguments, or burning consecutive all-error tool rounds, first
+  draws a corrective message and then halts the run with a typed
+  error (`RunRequest.Guardrails`).
+- **Stall recovery.** `AgentOptions.AutoContinue` nudges a model that
+  narrates "I will now…" and stops without acting — bounded to twice
+  per run; the `glue` binary enables it for providers that declare
+  the stall in the capability registry.
+
+**Per-model capabilities.** Providers declare harness-relevant facts
+at registration (context window, parallel-tool safety, prompt
+variant, auto-continue proneness). The loop and CLIs query them via
+`providers.CapabilitiesFor(name)` instead of switching on provider
+names.
+
 ## Coding tools
 
 `tools/coding.Tools(...)` assembles a permission-gated local coding
@@ -324,6 +364,26 @@ go run ./cmd/glue run --provider codex --coding --work . \
   --prompt "Run the tests and fix the first failure."
 ```
 
+The bundle is built to tolerate model sloppiness instead of bouncing
+it back:
+
+- **`edit_file` repairs near-miss matches** — a deterministic ladder
+  (whitespace → indentation, with the replacement re-indented to the
+  file's real indentation → smart-quote/dash folding → block-anchor)
+  plus over-escape repair; non-exact matches are named in the result,
+  and success echoes the updated lines so the model doesn't re-read
+  the file. CRLF and BOMs are preserved.
+- **`shell_exec` keeps head *and* tail** of long output with an
+  omitted-bytes marker and the complete stream spooled to a named
+  temp file; timeouts keep the partial output. `read_file` pages by
+  line offset and says exactly how to continue.
+- **The system prompt is assembled from the active toolset**
+  (`coding.SystemPrompt`): one line per registered tool plus their
+  usage guidelines, in a terse variant for frontier models and an
+  explicit variant for open-weight ones — it cannot drift from the
+  tools actually available. Tools contribute their own text via
+  `ToolSpec.PromptSnippet` / `PromptGuidelines`.
+
 Side-effecting tools (`write_file`, `edit_file`, `shell_exec`) are
 permission-gated; reads and navigation are not. Execution defaults to
 the local process via `glue.Executor` — not a sandbox. Implement your

diff --git a/docs/building-agents.md b/docs/building-agents.md
@@ -387,6 +387,15 @@ and asserting on the `ToolResult` (including `IsError` for the recovery
 path). Keep live provider tests gated behind an env-var check and out of
 CI.
 
+Two loop defaults to know when scripting *failures* in tests:
+transient-looking provider errors ("rate limit", "503", dropped
+streams) are retried with backoff, and pathological tool patterns (the
+same call repeated, all-error rounds) trigger guardrails. When a fake
+provider should fail *fast*, use error text that classifies as fatal
+(e.g. "invalid request"); callers driving `loop.Run` directly can also
+pass `Retry: loop.RetryPolicy{Disabled: true}` or
+`Guardrails: loop.GuardrailPolicy{Disabled: true}`.
+
 ## Going further
 
 You now have the full shape of a Glue agent. The advanced surfaces, each

diff --git a/docs/coding-harness-roadmap.md b/docs/coding-harness-roadmap.md
@@ -2,6 +2,11 @@
 
 Date: 2026-06-09. Tracker: [#110](https://github.com/erain/glue/issues/110).
 
+> **Status: shipped.** All eight queue items below landed the same day
+> as [v1.13.0](https://github.com/erain/glue/releases/tag/v1.13.0)
+> (PRs #346–#353). This document remains the record of the analysis
+> and of what was deliberately deferred (the P3 notes).
+
 This is a source-verified analysis of four reference coding-agent
 harnesses — [pi](https://github.com/earendil-works/pi),
 [Cline](https://github.com/cline/cline),
@@ -251,16 +256,17 @@ fallback (v1.8.0). What Gemini CLI additionally does that we don't:
 
 ## Implementation order
 
-Filed as one-issue-one-PR items under tracker #110:
-
-1. P0.1 edit_file repair ladder + instructive errors (+ escape repair) — [#338](https://github.com/erain/glue/issues/338).
-2. P0.2 structured truncation for shell_exec / read_file — [#339](https://github.com/erain/glue/issues/339).
-3. P0.3 history hardening before send/resume — [#340](https://github.com/erain/glue/issues/340).
-4. P1.4 retry/overflow state machine — [#341](https://github.com/erain/glue/issues/341).
-5. P1.6 compaction upgrade — [#342](https://github.com/erain/glue/issues/342).
-6. P2.7 Gemini next-speaker check + stall recovery — [#343](https://github.com/erain/glue/issues/343).
-7. P2.8 loop & mistake guardrails — [#344](https://github.com/erain/glue/issues/344).
-8. P1.5 per-model capability registry + tool-owned prompt snippets — [#345](https://github.com/erain/glue/issues/345).
+Filed as one-issue-one-PR items under tracker #110 — **all shipped in
+v1.13.0**:
+
+1. ✅ P0.1 edit_file repair ladder + instructive errors (+ escape repair) — [#338](https://github.com/erain/glue/issues/338), PR #346.
+2. ✅ P0.2 structured truncation for shell_exec / read_file — [#339](https://github.com/erain/glue/issues/339), PR #347.
+3. ✅ P0.3 history hardening before send/resume — [#340](https://github.com/erain/glue/issues/340), PR #348.
+4. ✅ P1.4 retry/overflow state machine — [#341](https://github.com/erain/glue/issues/341), PR #349, [ADR-0017](adr/0017-loop-retry-overflow-recovery.md).
+5. ✅ P1.6 compaction upgrade — [#342](https://github.com/erain/glue/issues/342), PR #350.
+6. ✅ P2.7 Gemini next-speaker check + stall recovery — [#343](https://github.com/erain/glue/issues/343), PR #351 (surrogate sub-item verified moot per #313).
+7. ✅ P2.8 loop & mistake guardrails — [#344](https://github.com/erain/glue/issues/344), PR #352.
+8. ✅ P1.5 per-model capability registry + tool-owned prompt snippets — [#345](https://github.com/erain/glue/issues/345), PR #353.
 
 Items 1–3 are pure-Go, dependency-free, and benefit every provider;
 they go first. Item 8 touches public API shape (registry), so it goes

diff --git a/docs/design.md b/docs/design.md
@@ -179,12 +179,24 @@ until the provider stops or the context is canceled.
    tool-call IDs are normalized, and turns from a different model lose
    their thinking blocks and provider signatures — providers reject all
    of these with opaque 400s otherwise.
-2. Ask the provider to stream an assistant response.
+2. Ask the provider to stream an assistant response. Transient
+   failures (429/5xx, dropped streams) retry with classified backoff
+   under `RunRequest.Retry`; context overflow surfaces as a typed
+   `*loop.OverflowError` that the session layer answers by compacting
+   once and retrying once
+   ([ADR-0017](adr/0017-loop-retry-overflow-recovery.md)).
 3. Emit text/tool/lifecycle events as provider events arrive.
-4. Append the final assistant message to the transcript.
+4. Append the final assistant message to the transcript. A turn that
+   narrates a future action without calling a tool gets a bounded
+   "Please continue." nudge when `RunRequest.AutoContinue` is set,
+   which recovers from the Gemini narrate-then-stop stall.
 5. If the assistant requested tools, execute the requested tools.
 6. Append tool result messages in deterministic order.
-7. Repeat from step 2 until no tool calls remain.
+7. Guardrails inspect the round: repeated identical calls or
+   consecutive all-error rounds first draw a corrective injected
+   message, then halt the run with a typed error
+   (`RunRequest.Guardrails`).
+8. Repeat from step 2 until no tool calls remain.
 
 The concrete P0 entry point is `loop.Run(ctx, loop.RunRequest)`. It returns a
 `loop.RunResult` containing both the full transcript and the messages produced by

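Reviewer note on step 7 of the loop description above: the warn-then-halt escalation for repeated identical calls can be sketched as a small streak counter. This is a sketch under assumptions, not the shipped guardrail: `repeatGuard`, `verdict`, and both thresholds are hypothetical, and the all-error-round check is not modeled.

```go
package main

import "fmt"

// verdict is what the hypothetical guard tells the loop to do next.
type verdict int

const (
	proceed verdict = iota // nothing suspicious; keep going
	warn                   // inject a corrective message into the transcript
	halt                   // stop the run with a typed error
)

// repeatGuard counts consecutive tool calls that share the same name and the
// same raw argument string. The thresholds are illustrative assumptions.
type repeatGuard struct {
	warnAt, haltAt int
	lastKey        string
	streak         int
}

// observe records one tool call and returns the escalation level for it.
func (g *repeatGuard) observe(name, args string) verdict {
	// A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
	key := name + "\x00" + args
	if key == g.lastKey {
		g.streak++
	} else {
		g.lastKey, g.streak = key, 1
	}
	switch {
	case g.streak >= g.haltAt:
		return halt
	case g.streak >= g.warnAt:
		return warn
	default:
		return proceed
	}
}

func main() {
	g := &repeatGuard{warnAt: 3, haltAt: 5}
	for i := 1; i <= 5; i++ {
		v := g.observe("read_file", `{"path":"main.go"}`)
		fmt.Printf("call %d -> verdict %d\n", i, v)
	}
}
```

Comparing raw argument strings is the simplest possible identity check; a real guard might canonicalize the JSON first so that key order does not hide a repeat.
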
diff --git a/docs/project-plan.md b/docs/project-plan.md
@@ -165,14 +165,19 @@ milestone).
   agent ships as its own product face with a homepage
   (<https://glue-coding-agent-site.vercel.app>, repo
   [glue-coding-agent-site](https://github.com/erain/glue-coding-agent-site)).
-  Next: **harness quality** — a source-verified analysis of pi, Cline,
-  Codex CLI, and Gemini CLI distilled into
-  [`coding-harness-roadmap.md`](coding-harness-roadmap.md) (edit-repair
-  ladder, structured truncation, history hardening, retry/overflow
-  recovery, compaction upgrades, Gemini loop polish), prioritized for
-  Gemini 3.x first and open-weight OpenRouter/NVIDIA models second.
-  Still planned beyond that: daemon goal endpoints, a sandboxed
-  `Executor` backend (container/VM), and TUI-on-`glue connect`.
+  The **harness-quality phase shipped as `v1.13.0`**: a source-verified
+  analysis of pi, Cline, Codex CLI, and Gemini CLI
+  ([`coding-harness-roadmap.md`](coding-harness-roadmap.md)) landed as
+  eight PRs — edit-repair ladder, structured truncation, history
+  hardening, retry/overflow recovery
+  ([ADR-0017](adr/0017-loop-retry-overflow-recovery.md)), compaction
+  upgrades, next-speaker stall recovery, loop guardrails, and the
+  per-model capability registry with tool-owned prompt assembly —
+  prioritized for Gemini 3.x first and open-weight OpenRouter/NVIDIA
+  models second. Still planned: daemon goal endpoints, a sandboxed
+  `Executor` backend (container/VM), TUI-on-`glue connect`, and the
+  roadmap's deferred P3 notes (XML tool-calling fallback, parallel-tool
+  read/write locking, goal-loop budget wind-down).
 - **Track B — Peggy.** Peggy v0.1–v0.5 plus dogfood hardening (M1–M6)
   shipped: single-prompt CLI, Telegram channel, durable sqlite+FTS5
   memory with curated recall, opt-in coding tools, MCP servers, the

diff --git a/docs/provider-guide.md b/docs/provider-guide.md
@@ -175,6 +175,36 @@ for the `glue.Provider` implementation. Reference upstream
 open-source CLIs as the protocol spec rather than copying code, and
 quarantine all vendor-specific headers and base URLs in the package.
 
+## Registering with the driver registry
+
+Shipped providers register themselves in `init()` so callers can
+construct them by name through `providers.New("<name>")` (this is how
+the `glue` binary's `--provider` flag works). Registration also
+declares **capabilities** — harness-relevant facts the loop and CLIs
+query through `providers.CapabilitiesFor(name)` instead of switching
+on provider names:
+
+```go
+func init() {
+	providers.Register("acme", providers.Factory{
+		New:          func() loop.Provider { return New(Options{}) },
+		DefaultModel: DefaultModel,
+		EnvKey:       "ACME_API_KEY",
+		Capabilities: providers.Capabilities{
+			ContextWindow: 131_072, // default model's window; 0 = unknown
+			ParallelTools: false,   // safe to run tool calls concurrently?
+			PromptVariant: "",      // "" explicit (open-weight), "terse" frontier
+			AutoContinue:  false,   // prone to the narrate-then-stop stall?
+		},
+	})
+}
+```
+
+Declare conservatively: the zero value means "assume nothing", and
+consumers treat unknown capabilities as the safe default. Out-of-tree
+providers do not have to register at all — construct them directly and
+pass them to `glue.NewAgent`.
+
 ## Common mistakes
 
 - **Aliasing the same `Message` across `Start` and `Done`.** The loop