Goal
Reduce the end-to-end cost of generating a termchart diagram from a coding agent, on two axes:
- Latency: wall-clock time from "agent decides to draw" to "diagram visible".
- Tokens: input + output (+ cache) tokens needed to produce a correct diagram, including retries.
Measured across three runners (Claude Code, AGY, OpenCode) on VMs, baseline vs a combined-fixes build.
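To make the two axes concrete, a per-run record could look roughly like the sketch below. The names (`Attempt`, `RunMetrics`, `cache_tokens`, the timestamp fields) and the sample numbers are illustrative assumptions, not the schema the harness actually emits.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    """Token usage for one attempt (hypothetical fields)."""
    input_tokens: int
    output_tokens: int
    cache_tokens: int = 0


@dataclass
class RunMetrics:
    """One diagram generation, retries included (hypothetical schema)."""
    decided_at: float  # seconds: agent decides to draw
    visible_at: float  # seconds: diagram becomes visible
    attempts: list[Attempt] = field(default_factory=list)

    @property
    def latency_s(self) -> float:
        # Wall-clock latency for the whole run, across all attempts.
        return self.visible_at - self.decided_at

    @property
    def total_tokens(self) -> int:
        # Retries count toward the cost, so sum over every attempt.
        return sum(
            a.input_tokens + a.output_tokens + a.cache_tokens
            for a in self.attempts
        )


# Made-up values: one failed attempt followed by one successful retry.
run = RunMetrics(
    decided_at=10.0,
    visible_at=14.5,
    attempts=[Attempt(1200, 300), Attempt(1500, 280, cache_tokens=900)],
)
print(run.latency_s, run.total_tokens)
```

Summing over attempts rather than recording only the successful one keeps retry cost visible in the token axis.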
Plan
Full plan tracked in-repo: docs/plans/2026-06-07-latency-token-experimentation-plan.md (branch perf/reduce-latency).
Where the cost comes from (grounded)
- Tokens: plugin/skills/diagram-recipes/examples/ is ~298 KB; the two *-matrix.component.json files (~89 KB) are pretty-printed.
- Latency: A* router cliff at 25 nodes (render.ts), --fit 10x re-render, begin = 3 POSTs, push vs patch (18–190x tokens/change), 2–3 server dagre passes, cold npm i -g.
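As a rough illustration of the pretty-printing overhead noted above, the sketch below compares the byte size of an indented and a compact serialization of the same JSON value. The sample document and the `size_savings` helper are made up for the example, and byte count is only a proxy for token count; the real savings on the *-matrix.component.json files would need to be measured.

```python
import json


def size_savings(doc: object) -> tuple[int, int]:
    """Return (pretty_bytes, compact_bytes) for the same JSON value."""
    pretty = json.dumps(doc, indent=2)
    # separators=(",", ":") drops the spaces json.dumps inserts by default.
    compact = json.dumps(doc, separators=(",", ":"))
    return len(pretty.encode("utf-8")), len(compact.encode("utf-8"))


# Hypothetical stand-in for an example file: a small node/edge graph.
sample = {
    "nodes": [{"id": f"n{i}", "label": f"Node {i}"} for i in range(25)],
    "edges": [{"from": f"n{i}", "to": f"n{i + 1}"} for i in range(24)],
}

pretty_bytes, compact_bytes = size_savings(sample)
print(f"pretty={pretty_bytes} B, compact={compact_bytes} B")
```

Both forms parse back to the same value, so minifying changes only the size of what the agent has to read.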
Decisions
- One shared model held constant across all runners (fair comparison).
- First pass = pilot: {baseline, c1} only, then gate to the full T*/L* ablation.
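The pilot gate could be expressed as a simple check on the {baseline, c1} results, as in the sketch below. The function name, the use of medians, and the 10% threshold are all assumptions for illustration; the actual gating criterion is whatever the in-repo plan specifies.

```python
from statistics import median


def should_run_full_ablation(
    baseline: list[float],
    c1: list[float],
    min_gain: float = 0.10,  # hypothetical threshold, not from the plan
) -> bool:
    """True if c1's median improves on baseline's by at least min_gain."""
    base = median(baseline)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    # Relative improvement: positive when c1 is cheaper than baseline.
    gain = (base - median(c1)) / base
    return gain >= min_gain


# Made-up latency samples in seconds.
clear_win = should_run_full_ablation([10.0, 12.0, 11.0], [8.0, 9.0, 8.5])
no_win = should_run_full_ablation([10.0, 12.0, 11.0], [10.8, 11.0, 10.9])
print(clear_win, no_win)
```

The same check applies to either axis, since both latency and tokens are lower-is-better.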
P0 status (landed)
Harness in scripts/experiments/ + corpus_run.py --metrics-out; --dry-run produces a full report; 20 unit tests pass.
Next