Studying the gap between what agents know and when they act on it.
Updated Mar 29, 2026 (TypeScript)
Toy 7. An elimination-filter landscape applying two structural constraints simultaneously to map which objective classes can persist under sustained optimization pressure — and which cannot. Includes a four-stage scenario engine and open-question frontier. Companion simulation for The Shape of What Does Not End — Series 2, Part 4.
RL-style eval measuring intent/action divergence in frontier agents: model acknowledges a correction, then acts on the stale value anyway. 3 scenarios, 371 trials on claude-haiku-4-5, Sonnet 4.6, GPT-5.4, and Gemini 3.1 Pro Preview.
Execution layer for skill-dispatcher — runs multi-phase agent chains end-to-end with per-step telemetry and chain_id correlation
Audit and reduce YAML frontmatter bloat in AgentSkill SKILL.md files. Automates deduplication, flattening, and noise removal.
Toy 5. An interactive proxy decay simulator showing how optimization pressure erodes the modeling capacity required to distinguish proxy from territory — producing self-reinforcing V(t) degradation that becomes progressively harder to correct. Companion simulation for The Depth Constraint — Series 2, Part 2.
Canonical home for AI Behavior Science research and the Founding Territory Paper
Reviews and modernizes stacks, packages, SDKs, and tooling before code is written against them.
Audits frontend implementations for design-system drift across CSS, Tailwind, JSX, TSX, Vue, and Angular code.
High-performance routing engine that selects the best agent skill for a task and emits structured handoff decisions.
Manages durable cross-agent shared memory for stable conventions, reusable policies, and organization-wide operating rules.
Toy 4. An AI-powered goal-chain classifier and invariance stress tester. Enter any goal, trace the instrumental chain to its terminal type, then probe whether the classification survives symbolic, blind, independent, and adversarial re-examination. Companion simulation for The Invariant Drive — Series 2, Part 1 of the Architecture of Thriving.
Produces auditable token-usage and cost reports from runtime evidence, normalized usage bundles, and repository-level report sets.
MCP server for AI agent research — captures LLM reasoning, model identity, and feedback via schema injection
Text systems, behavioral contracts, AI protocols, and small tools for keeping AI-assisted work operational. I built these tools to reduce drift, force useful decisions, and keep messy AI-assisted work usable.
Toy 6. An interactive phase-space instrument mapping Ψ = S/D — the ratio of capability to modeling depth that determines whether a system is in the viable, transitional, or failure-mode-dominant regime. Includes the Inner Crossing animation. Companion simulation for The Inner Crossing — Series 2, Part 3.
Scores and improves prompts for clarity, consistency, signal density, structure, and runtime fit.
An easy-to-integrate Unity finite-state machine (FSM) for basic enemy AI behaviors, using ScriptableObject assets to define customizable, reusable AI states such as Idle, Chase, and Attack.
Audits APIs against OpenAPI, AsyncAPI, JSON Schema, protobuf, or PRD contracts to catch drift before release.