Full Elixir by deepfates · Pull Request #19 · deepfates/cantrip

deepfates · 2026-05-29T06:29:11Z

Reattaches the work done in deepfates/grimoire (162 commits, full history preserved) onto cantrip's template history.

grimoire was created from the cantrip template, so its root "Initial commit" was a fresh snapshot of cantrip main's tip (identical tree) with no ancestry link. This branch re-parents that root onto cantrip main (a0841ee) via graft + history rewrite, so the branch descends from main and merges as a normal fast-forwardable PR.

Net change: replatform from the clj/ + ex/ + ts/ + py/ monorepo scaffold to the Elixir implementation promoted to the repo root (lib/, test/, docs/).

Build a tests.yaml conformance runner (loader, runner, expect) that exercises the Elixir implementation against the shared 71-case behavioral spec. Fixed all discovered divergences:

- call_entity raises on error in code medium (COMP-6, COMP-8)
- cantrip_error propagation through child entities (COMP-8)
- child turn sequence preservation in parent loom (COMP-5, LOOM-8)
- ACP session inference when sessionId omitted (PROD-6, ENTITY-5)
- malformed done call treated as error, not termination (LOOP-7)
- tool result ID mismatch validation (LLM-7)
- circle rejects missing medium declaration (MEDIUM-1)

198 tests, 0 failures.

Instrument EntityServer with :telemetry events for entity lifecycle, turn lifecycle, gate execution, and code medium evaluation. 8 new tests.

Events:

- [:cantrip, :entity, :start/:stop]
- [:cantrip, :turn, :start/:stop]
- [:cantrip, :gate, :start/:stop]
- [:cantrip, :code, :eval]

206 tests, 0 failures.

Cantrip.Familiar builds a production-ready persistent coding assistant with read_file, list_dir, search, and done gates. JSONL loom persistence.

Mix tasks:

- mix cantrip.familiar: REPL mode with persistent entity
- mix cantrip.cast "intent": single-shot mode

12 new tests. 218 total, 0 failures.

The familiar now uses code medium and constructs child cantrips at runtime via cantrip()/cast()/cast_batch()/dispose() gates. Entity writes Elixir that observes the codebase, builds specialized children with chosen LLMs/mediums/gates/wards, and composes their results. Replaces the previous conversation-medium filesystem assistant. 20 tests. 234 total, 0 failures.

- Add BashMedium: shell command execution via System.cmd with SUBMIT: termination pattern, output truncation, configurable cwd/timeout
- Fix systemic normalize_opts bug: bare values (strings, numbers) passed to code-medium gates were silently erased to %{}. Gate closures now pass bare values through; call_entity wraps strings as %{intent: value}; cantrip/cast_batch raise clear errors on invalid input
- Fix fork message reconstruction: include tool_calls on assistant messages and tool_call_id on tool messages; code-medium turns use user-message format instead of orphaned tool messages
- Add bash support to Circle (type normalization, tool_view, capability text), EntityServer (execute_turn routing, message construction), and Familiar (system prompt documents bash children, cast() return value clarity)
- Rewrite livebook demo with real LLM calls
- Add tests: bash medium (14), code medium bare-value ergonomics (3), fork message format (2)

The Agent Client Protocol uses "sessionUpdate" as the discriminant key in session/update notifications, not "kind". Also strip extra fields from PromptResponse: the spec says only stopReason + optional _meta.

- protocol.ex: "kind" → "sessionUpdate", result is just {stopReason}
- Update all tests and fixtures to match corrected wire format
- Conformance expect checker now searches across all replies per step
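The corrected wire shapes can be sketched as plain Elixir maps. Only the "sessionUpdate" key and the stopReason-only result come from the commit message; the module name, the "agent_message_chunk" variant, the content layout, and the "end_turn" value are illustrative assumptions, not a copy of protocol.ex.

```elixir
defmodule AcpShapeSketch do
  # Hypothetical helper: a session/update notification whose update is
  # discriminated by "sessionUpdate" (previously "kind").
  def session_update(session_id, text) do
    %{
      "jsonrpc" => "2.0",
      "method" => "session/update",
      "params" => %{
        "sessionId" => session_id,
        "update" => %{
          # Assumed variant name, used here only for illustration.
          "sessionUpdate" => "agent_message_chunk",
          "content" => %{"type" => "text", "text" => text}
        }
      }
    }
  end

  # Hypothetical helper: the prompt result carries only stopReason.
  def prompt_response(request_id, stop_reason \\ "end_turn") do
    %{
      "jsonrpc" => "2.0",
      "id" => request_id,
      "result" => %{"stopReason" => stop_reason}
    }
  end
end
```

Keeping the result map to a single key makes it easy for a fixture to assert that no extra fields leak into the response.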

Sandbox: Add Cantrip.CodeMedium.DuneSandbox as opt-in restricted evaluation via %{sandbox: :dune} ward. Blocks File, System, Node, Process, spawn. Gate closures work through Dune sessions with persistent bindings across turns. 20 tests.

ReqLLM: Add Cantrip.LLMs.ReqLLM adapter (req_llm v1.9.0) supporting 18+ providers. llm_from_env now prefers ReqLLM for all known providers, falling back to legacy adapters only when unavailable. 15 tests.

Verified: real LLM smoke test passes through both OpenAI and Anthropic via ReqLLM. 288 tests, 0 failures.

Delete the old Protocol module and its tests (m11, m14, m15, m16). AgentHandler is now the single ACP path: a plain module with ETS state, no GenServer bottleneck. Each request runs in a Connection Task concurrently.

- AgentHandler stores sessions and last_answer in public ETS
- meta field on NewSessionRequest passes through to runtime (for LLM injection in tests)
- Conformance runner updated to use AgentHandler with JSON reply reconstruction
- Familiar and divergence tests migrated to typed ACP structs
- Stdio integration test covers full JSON wire format via spawned BEAM
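The "plain module with ETS state" idea can be illustrated with a minimal sketch. The module name, table name, and session fields below are hypothetical; the point is only that a public named table lets concurrent Tasks read and write session state without funnelling every request through one GenServer.

```elixir
defmodule SessionTableSketch do
  # Hypothetical table name; the real AgentHandler may use a different one.
  @table :acp_sessions_sketch

  # Create a public named table owned by the calling process.
  def init do
    :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
    :ok
  end

  # Any process may insert because the table is :public.
  def put_session(id, session), do: :ets.insert(@table, {id, session})

  # Look a session up by id without going through a server process.
  def fetch_session(id) do
    case :ets.lookup(@table, id) do
      [{^id, session}] -> {:ok, session}
      [] -> :error
    end
  end
end
```

The trade-off is that the table disappears when its owner exits, so in practice the owner would be a long-lived process.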

- Add EventBridge: translates {:cantrip_event, _} messages into ACP session_notification calls (tool_call, tool_call_update, thought chunks)
- AgentHandler spawns a bridge per prompt and injects stream_to into session
- Cantrip.summon/3 accepts opts (e.g. stream_to:) passed to EntityServer
- Runtimes (Cantrip, Familiar) forward stream_to from session to summon

- WARD-1/COMP-6: extract_numerics guard n>0 → n>=0 so max_depth:0 is preserved during ward composition and delegation gates are stripped
- A.12: Save/restore :cantrip_familiar_store across eval Tasks so child cantrips constructed on turn N survive to turn N+1
- LLM-3: Preserve base_url and api_key through ReqLLM normalize_state and pass them in build_opts; extract OPENAI_BASE_URL in llm_from_env
- COMP-8: cast_batch sequential fallback now raises on child failure (is_error: true) matching cast behavior

Red-green TDD for each fix.
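The WARD-1/COMP-6 guard change is easiest to see side by side. This is a minimal sketch, assuming a ward is a map with a max_depth key; the function names are hypothetical and the real extract_numerics may be shaped differently.

```elixir
defmodule WardGuardSketch do
  # Old guard: a limit of zero fails n > 0 and falls through, so it is lost.
  def limit_old(%{max_depth: n}) when is_integer(n) and n > 0, do: {:ok, n}
  def limit_old(_ward), do: :none

  # Fixed guard: zero is a valid limit meaning "no delegation at all".
  def limit_new(%{max_depth: n}) when is_integer(n) and n >= 0, do: {:ok, n}
  def limit_new(_ward), do: :none
end
```

With the old guard, `max_depth: 0` is indistinguishable from "no limit declared", which is why delegation gates were not stripped.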

cantrip.cast was routing everything through the Familiar orchestrator (code medium, filesystem gates, child cantrips). Now it creates a minimal conversation cantrip with just a done gate — the simplest useful cast per the spec. Use --familiar / -f for the orchestrator.

* fix: harden bash sandbox workloads
* test: show bash workload sandbox failures
* fix: restore bubblewrap /dev mount behavior
* fix: unshare user for bubblewrap network isolation
* fix: allow bwrap loopback setup
* test: split bash workload and netns coverage
* fix: avoid bwrap user namespace requirement
* ci: install uidmap for bubblewrap workloads
* ci: enable bubblewrap workload tests
* fix: address bash workload review

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>

max_turns accumulated across sends in a summoned entity (REPL / ACP session). Once cumulative turns crossed the limit, every later intent truncated immediately and the session was bricked: the visible 'How can that possibly be the max turn limit' symptom from dogfooding mix cantrip.familiar.

max_turns is meant to bound the work for ONE intent, not the lifetime of the entity. Reset the per-episode turn counter on each new intent; message history, loom, and code_state still persist across sends.

Also point TMPDIR at the always-writable per-session sandbox dir so shell heredocs / process substitution work on TMPDIR-honoring shells (modern bash on Linux) without widening the sandbox. macOS bash 3.2 ignores TMPDIR and uses /tmp, so heredocs there still need an explicit bash_writable_paths entry; the sandbox stays deny-by-default.

Regression coverage: test/persistent_turn_budget_test.exs.

mix verify green: 642 tests, 0 failures, credo clean.
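The per-intent budget can be sketched with a plain map standing in for entity state. All names here (the module, turns_this_intent, history) are hypothetical; the sketch only shows the counter resetting on each new intent while accumulated history survives.

```elixir
defmodule TurnBudgetSketch do
  # Hypothetical entity state: only the fields needed to show the reset.
  def new(max_turns), do: %{max_turns: max_turns, turns_this_intent: 0, history: []}

  # A new intent resets the per-intent counter; history persists across sends.
  def send_intent(entity, intent) do
    %{entity | turns_this_intent: 0, history: entity.history ++ [{:intent, intent}]}
  end

  # Refuse to run once the budget for the current intent is spent.
  def take_turn(%{turns_this_intent: n, max_turns: max} = entity) when n >= max do
    {:truncated, entity}
  end

  # Otherwise consume one turn from the current intent's budget.
  def take_turn(entity) do
    updated = %{
      entity
      | turns_this_intent: entity.turns_this_intent + 1,
        history: entity.history ++ [:turn]
    }

    {:ok, updated}
  end
end
```

Without the reset in send_intent, the third call to take_turn below would poison every later intent, which matches the "bricked session" symptom described above.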

Copilot

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

deepfates and others added 30 commits March 22, 2026 21:31

Add Livebook demo notebook with streaming and loom visualization

39b3eb5

Six sections covering basic cast, multi-turn gates, streaming events, custom gates, composition with call_entity, and loom table rendering. All sections use FakeLLM — no API keys needed.

Add telemetry dashboard section to Livebook notebook

6dfde38

Section 7 wires telemetry events into Kino widgets with color-coded real-time display, plus summary tables for turn/gate metrics.

Auto-transform bare gate calls to dot-calls in code medium

0438a75

LLMs write done(x) instead of done.(x) — now both work. Source-level transform adds dots before parsing, skipping strings and module-qualified calls. 8 new tests. 230 total, 0 failures.
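A simplified sketch of such a source-level rewrite is shown here. It is not the project's implementation: the module name and the regex approach are assumptions, and it only understands double-quoted strings (heredocs, sigils, charlists, and comments are ignored).

```elixir
defmodule DotCallSketch do
  # Rewrites bare calls such as done(x) to done.(x) for the given gate names.
  def transform(source, gate_names) do
    names = Enum.map_join(gate_names, "|", &Regex.escape/1)

    # A gate name not preceded by an identifier character or a dot
    # (so Foo.done( is left alone), directly followed by "(".
    bare_call = Regex.compile!("(?<![\\w.])(#{names})\\(")

    # Split on double-quoted string literals and keep them in the result,
    # so string contents are never rewritten.
    ~r/("(?:\\.|[^"\\])*")/
    |> Regex.split(source, include_captures: true)
    |> Enum.map_join(fn
      "\"" <> _ = literal -> literal
      code -> Regex.replace(bare_call, code, "\\1.(")
    end)
  end
end
```

A call that already uses the dot form is untouched because the gate name is then followed by `.` rather than `(`.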

Add ACP mode to Familiar with --acp flag

3c8a88e

mix cantrip.familiar --acp starts an ACP stdio server using the Familiar's gates and identity. New Runtime.Familiar module handles session construction.

Fix Mix tasks to handle non-string results from code medium

00cd6cd

The code-medium familiar can return maps from done(). Handle gracefully with inspect/2 instead of crashing on String.Chars protocol.

Coerce non-string done() results at the gate boundary

763271b

Code medium naturally produces Elixir terms. The done gate now renders non-binary values with inspect/2 instead of passing raw maps/lists through to callers.
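The coercion at the gate boundary amounts to a two-clause function. This is a minimal sketch with a hypothetical module name; it relies only on the fact that String.Chars is not implemented for maps, so interpolating or calling to_string/1 on one raises Protocol.UndefinedError.

```elixir
defmodule DoneRenderSketch do
  # Binaries pass through unchanged.
  def render(value) when is_binary(value), do: value

  # Every other term is rendered with inspect/2 (default options here),
  # so callers always receive a string.
  def render(value), do: inspect(value)
end
```

Doing this once at the gate means downstream code such as Mix tasks can print the result without special-casing maps or lists.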

Expose loom as a data binding in code medium

05a36d5

The entity's loom is now available as a plain variable in code medium. No gate, no file read — just `loom.turns` to access conversation history directly from process state.

Update README with familiar, telemetry, Livebook, conformance docs

a1b2a89

Document the code-medium familiar with orchestration gates, loom as data binding, ACP editor setup for Zed, Livebook notebook, telemetry events, and 71/71 conformance. Remove stale limitations.

Fix Livebook notebook Mix.install app name to :cantrip_ex

5e1fd10

Fix Livebook path resolution with __DIR__

ebf2ebd

Add ACP wrapper script for Zed editor integration

d84c308

Inject cwd into Familiar system prompt via ACP runtime

286eb7d

The ACP client sends the project working directory but the Familiar had no way to know where to look. Now the runtime appends the cwd to the system prompt so the Familiar orients itself on first turn.

Move ACP startup message to stderr

a83d71d

The "Familiar ACP server starting on stdio..." message was written to stdout via Mix.shell().info, corrupting the JSON-RPC stream.
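On a stdio JSON-RPC transport, stdout must carry protocol frames only. A minimal sketch of the separation, with a hypothetical module name:

```elixir
defmodule StdioLogSketch do
  # Human-readable diagnostics go to stderr so they never mix with frames.
  def log(message), do: IO.puts(:stderr, message)

  # Protocol output is the only thing written to stdout.
  def reply(json_line), do: IO.puts(:stdio, json_line)
end
```

Any client that parses stdout line by line would otherwise fail on the startup banner as the first "message".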

Merge pull request #1 from deepfates/conformance-telemetry-familiar

f9e95f6

Add conformance runner, telemetry, Livebook, and familiar

Replace hand-rolled .env parser with dotenvy

99705de

Use Dotenvy.source/2 with side_effect callback instead of the custom load_dotenv function. Only sets env vars not already defined.

Fix normalize_opts erasing bare values in compile_and_load closure

6a1faa1

The catch-all clause in normalize_opts converted bare values (strings, numbers) to empty maps. compile_and_load now uses the same inline normalization as gate closures: maps/lists normalize, bare values pass through.
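The bug and the fix can be sketched as two small functions. The names are hypothetical and the list clause assumes keyword lists; only the behaviour described above (catch-all returning an empty map versus passing bare values through) is taken from the commit message.

```elixir
defmodule NormalizeSketch do
  # Buggy shape: the catch-all clause silently erases bare values.
  def normalize_old(opts) when is_map(opts), do: opts
  def normalize_old(opts) when is_list(opts), do: Map.new(opts)
  def normalize_old(_other), do: %{}

  # Fixed shape: maps and keyword lists normalize, bare values pass through.
  def normalize_new(opts) when is_map(opts), do: opts
  def normalize_new(opts) when is_list(opts), do: Map.new(opts)
  def normalize_new(other), do: other
end
```

With the old catch-all, a gate called with a bare string received an empty map and lost its argument without any error.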

Add dotenvy, nimble_options, mox, and agent_client_protocol deps

e05b207

Production deps: dotenvy ~> 0.8, nimble_options ~> 1.1, agent_client_protocol (f1729 GitHub). Test-only deps: mox ~> 1.2.

Use nimble_options for retry config validation

4e4b410

Replace hand-rolled normalize_retry with NimbleOptions schema validation. Provides clear error messages for invalid retry config (e.g. wrong types).

Default to conversation medium when none specified (MEDIUM-1)

1874bd5

The spec says "if no medium is specified, the default is conversation" but validate_medium rejected empty medium_sources. Now it accepts them, matching Circle.new which already defaulted type to :conversation.
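A minimal sketch of the relaxed validation follows. The module name, the return tuples, and the rejection of anything other than a single known medium are assumptions for illustration; only the empty-list default to :conversation comes from the commit message.

```elixir
defmodule MediumSketch do
  # Medium names mentioned elsewhere in this PR.
  @known [:conversation, :code, :bash]

  # No medium declared: default to conversation instead of rejecting.
  def validate_medium([]), do: {:ok, :conversation}

  # Exactly one known medium declared.
  def validate_medium([medium]) when medium in @known, do: {:ok, medium}

  # Anything else is rejected (assumed behaviour).
  def validate_medium(sources) when is_list(sources), do: {:error, {:invalid_medium, sources}}
end
```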

deepfates and others added 23 commits May 28, 2026 09:19

fix: include env example in hex package (#88)

31eafbd

fix: constrain public API docs surface (#89)

8f7c35b

* fix: constrain public api docs surface
* test: derive public api guard from compiled modules
* docs: avoid supervisor names as module links

fix: teach familiar synthesis composition (#90)

e097448

chore: prepare v1.3.0 release state (#91)

f87a625

fix: fail closed and redact observation args (#94)

a80df48

docs: update cleanup status after v1.3.1 (#95)

21ff718

docs: refresh migration and package docs (#101)

c3c377b

feat: orient conversation entities (#104)

7332fd4

docs: add spellbook and public module voice (#105)

988e0b3

chore: prepare v1.3.2 release (#106)

e4d0e80

chore: post-v1.3.2 hardening followup

577b6c1

docs: add v1.3.2 inhabitant affordance audit

28cf044

fix: make familiar default sandbox unrestricted

4d9d759

Merge pull request #113 from deepfates/codex/post-v132-hardening-followup

69781d1

chore: post-v1.3.2 hardening followup

Merge pull request #121 from deepfates/codex/familiar-default-unrestricted-115

4f451b8

fix: make Familiar default sandbox unrestricted

fix: clarify bash filesystem write affordance (#123)

58db1a7

test: cover Mnesia familiar rehydration (#124)

0a06cdb

docs: tighten v1.3.3 familiar guidance

2658ccd

chore: prepare v1.3.3 release

13b58dc

chore: remove resolved v132 audit artifact

eb453a6

docs: trim aspirational README paths

d882db6

deepfates force-pushed the import-copied-work branch from 3eb121a to fb3c893 Compare May 29, 2026 06:33

deepfates requested a review from Copilot May 29, 2026 06:41

Copilot AI reviewed May 29, 2026

deepfates changed the title ~~Import Elixir rewrite from grimoire~~ Convert to full Elixir and level up May 29, 2026

deepfates changed the title ~~Convert to full Elixir and level up~~ Full Elixir May 29, 2026

deepfates merged commit 5de6e23 into main May 29, 2026
5 of 6 checks passed

deepfates deleted the import-copied-work branch May 29, 2026 06:56


deepfates merged 161 commits into main from import-copied-work


3 participants
