A daemon-first orchestration engine for chip design verification (DV). Icarium receives natural-language or structured commands, routes them to multi-step pipeline definitions called gears, and manages parallel LLM calls and simulation processes to drive DV tasks to completion.
Icarium defines schema/plugin_schema.json — the contract that knowledge-graph
indexer plugins must follow. What consumes that data is outside icarium's scope.
| Dependency | Version | Notes |
|---|---|---|
| Zig | 0.14+ | brew install zig |
| PostgreSQL | 16+ | brew install postgresql@16 |
| pgvector | 0.7+ | brew install pgvector |
| libpq | (with PG) | headers in /opt/homebrew/opt/postgresql@16/include |
| ONNX Runtime | 1.17+ | for the NER indexer plugin only |
The daemon (icariumd) and CLI (icarium) have no ONNX dependency. Only the
icarium-indexer-codebert plugin binary needs ONNX Runtime.
# daemon + CLI only (no ONNX)
zig build
# full build including the NER plugin (requires ONNX Runtime)
zig build \
-Donnxruntime-include=/opt/homebrew/include/onnxruntime \
-Donnxruntime-lib=/opt/homebrew/lib
# run C smoke tests (NER layer)
zig build test

Build outputs in zig-out/bin/:
- icariumd: the daemon
- icarium: CLI client
- icarium-indexer-codebert: built-in NER extractor plugin
Override library search paths if your PostgreSQL lives elsewhere:
zig build \
-Dpq-include=/usr/include/postgresql \
-Dpq-lib=/usr/lib

# 1. Create the database
createdb icarium
# 2. Apply the schema
psql icarium -f schema/001_init.sql
# 3. Initialise config in your project root
icarium init
# 4. Start the daemon
icariumd start
# 5. Verify
icarium status
# 6. Index your SV/UVM project
icarium index --project myproject --root /path/to/testbench
# 7. Query indexed entities
echo '{"method":"query","type":"entities","kind":"UVM_AGENT"}' | nc -U /tmp/icarium.sock

`icarium init` writes a template icarium.toml. Key sections:
[indexer]
plugin = "icarium-indexer-codebert" # extractor plugin on $PATH
models_dir = "" # default: $ICARIUM_MODELS
[db]
conninfo = "dbname=icarium host=localhost"
[daemon]
socket = "/tmp/icarium.sock"
log_level = "info"
[llm]
# endpoint = "https://api.anthropic.com/v1"
# model = "claude-sonnet-4-6"
# api_key_env = "ANTHROPIC_API_KEY"
# — or any OpenAI-compatible endpoint:
# endpoint = "http://localhost:11434/v1"
# model = "qwen2.5-coder:32b"

icariumd start    Daemonize and start listening on /tmp/icarium.sock
icariumd stop     Send SIGTERM to the running daemon
icariumd status   Check daemon health and print task queue stats
icarium init      Write icarium.toml in the current directory
icarium index     Trigger incremental NER index (also called by git hook)
All communication is newline-delimited JSON over the Unix socket at
/tmp/icarium.sock. One request per connection; the daemon writes one response
and closes.
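The protocol above can be exercised from a script as well as from nc. The following is a minimal Python client sketch, assuming the daemon closes the connection after writing its single response, as described. The function name `icarium_request` and its parameters are illustrative and not part of Icarium.

```python
import json
import socket


def icarium_request(payload, socket_path="/tmp/icarium.sock", timeout=5.0):
    """Send one JSON request over the Unix socket and return the decoded reply.

    Hypothetical helper: one connection per request, newline-terminated JSON.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(json.dumps(payload).encode("utf-8") + b"\n")
        chunks = []
        while True:
            # Read until the daemon closes the connection (EOF).
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

With a daemon running, `icarium_request({"method": "query", "type": "entities", "kind": "UVM_AGENT"})` would send the same request as the nc one-liner.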
# generic client one-liner
echo '<json>' | nc -U /tmp/icarium.sock

Query the local entity store built by indexer plugins. All parameters except
type are optional filters.
// List entities by kind or name pattern
{"method": "query", "type": "entities", "kind": "UVM_AGENT"}
{"method": "query", "type": "entities", "name": "axi*"}
// Relation graph hop
{"method": "query", "type": "relations", "from": "axi_agent", "rel": "HAS_DRIVER"}
// Assembled context centered on an entity
{"method": "query", "type": "context", "focus": "dma_agent", "depth": 1}
// UVM agents with no covergroup (coverage gap report)
{"method": "query", "type": "no_covergroup"}
{"method": "coverage_gaps"}

Add "project": "myproject" to scope any query to a named project.
Run a gear end-to-end. The daemon matches the query to a gear by trigger,
runs each stage (LLM call or shell command), substitutes {stage_id} template
tokens between stages, and returns the final synthesized output.
{"method": "gear.run", "query": "close coverage on the AXI agent"}
→ {"result": {"gear": "close_coverage", "output": "..."}}
{"method": "gear.run", "query": "triage the nightly regression"}
→ {"result": {"gear": "triage", "output": "..."}}

Lookup only (no execution):
{"method": "gear.find", "q": "close coverage"}
→ {"result": {"name": "close_coverage", "stages": 5, "triggers": 4}}

{"method": "kanban.add", "params": {"title": "close AXI coverage", "gear": "close_coverage", "priority": 80}}
→ {"task_id": "a1b2c3...", "status": "triage"}
{"method": "kanban.list", "params": {"status": "todo", "limit": 20}}
{"method": "kanban.get", "params": {"task_id": "a1b2c3..."}}
{"method": "kanban.move", "params": {"task_id": "a1b2c3...", "status": "done"}}
{"method": "kanban.link", "params": {"parent_id": "...", "child_id": "..."}}

Kanban task statuses: triage → todo → ready → running → blocked → review → done | archived
┌─────────────────────────────────────────────────────┐
│ icariumd │
│ │
│ Unix socket accept loop (single-threaded) │
│ └── ipc.zig — method routing │
│ │
│ Executor thread (queue.zig) │
│ └── runs shell/index/triage tasks │
│ └── fires hooks on state changes │
│ │
│ Entity store queries (db.c → libpq) │
│ └── entities, relationships indexed by plugins │
│ │
│ Gear registry (gear_registry.zig) │
│ └── loads *.gear files from gears/, ~/.icarium/ │
│ │
│ Gear executor (executor.zig) │
│ └── stage loop, template fill, LLM + process │
│ └── called synchronously by gear.run IPC method │
│ │
│ Plugin registry (plugin_registry.zig) │
│ └── kanban (in-process, plugins/kanban.zig) │
│ └── extractor plugins (short-lived subprocesses) │
└─────────────────────────────────────────────────────┘
│ libpq
┌────────▼────────────────────────────────────────────┐
│ PostgreSQL │
│ entities, relationships, tasks, findings │
│ kanban_tasks, kanban_task_links, kanban_events │
│ pgvector (embedding vector(768) on entities) │
└─────────────────────────────────────────────────────┘
src/c/
db.c / db.h PostgreSQL interface (libpq): entities, tasks,
entity store queries, kanban CRUD
validate.c / .h NDJSON record validator for plugin output
infer.c / .h ONNX Runtime inference (NER model)
tok.c / .h BPE tokeniser (matches GraphCodeBERT vocab)
index.c / .h Entity extraction pipeline (tok → infer → emit)
plugin_main.c icarium-indexer-codebert binary entry point
src/zig/
main.zig CLI entry point: init / start / stop / status / index
daemon.zig Double-fork, pidfile, socket accept loop, startup init
ipc.zig IPC method dispatch (all JSON-over-Unix-socket handlers)
queue.zig In-memory task queue + executor thread
query.zig Entity store query wrappers (db.c → IPC handlers)
gear.zig Gear file parser (YAML-like, arena-allocated)
gear_registry.zig Gear discovery: ICARIUM_GEARS, ./gears/, ~/.icarium/gears/
executor.zig Gear stage runner: template fill, LLM calls, process stages
plugin_registry.zig Plugin manifest parser + in-process capability dispatch
hooks.zig Fire-and-forget hook registry (64 slots)
plugins/kanban.zig Kanban capability plugin (handles kanban.* methods)
plugin_runner.zig Extractor plugin subprocess runner
config.zig icarium.toml parser
index_cmd.zig `icarium index` subcommand
setup.zig `icarium init` subcommand
cli.zig Shared CLI utilities
c.zig C FFI bindings import
The schema is applied automatically at daemon startup. To apply it manually:
psql icarium -f schema/001_init.sql
entities NER-extracted SV/UVM entities (kind, name, file, line, confidence,
embedding vector(768))
relationships Directed structural edges between entities (kind, from_id, to_id)
tasks Daemon task queue (shell / index / triage jobs)
findings Structured LLM analysis results
icarium_projects Named projects with root paths
kanban_tasks Kanban board cards (9 statuses, gear_name, gear_run_id)
kanban_task_links Parent/child dependency edges between cards
kanban_events Audit trail (status changes, comments, finding links)
Entity kinds indexed by the built-in NER model:
MODULE PORT PARAMETER PACKAGE INTERFACE COVERGROUP ASSERTION
UVM_AGENT UVM_DRIVER UVM_MONITOR UVM_SEQUENCER UVM_SCOREBOARD
UVM_ENV UVM_TEST UVM_SEQUENCE CLASS
Icarium publishes schema/plugin_schema.json — the NDJSON record format that
all extractor plugins must emit. Each line is either an entity or a relation:
// Entity record
{"kind": "entity", "type": "UVM_AGENT", "name": "axi_agent",
"file": "tb/axi_agent.sv", "line_start": 12, "line_end": 89,
"confidence": 0.97}
// Relation record
{"kind": "relation", "type": "HAS_DRIVER",
"from_kind": "UVM_AGENT", "from_name": "axi_agent",
"to_kind": "UVM_DRIVER", "to_name": "axi_driver",
"confidence": 0.90}

Icarium validates every record from every plugin against this schema before ingesting it. Any system that consumes the entity/relation tables, whether a graph database, a vector store, or an analysis tool, works from this contract.
Icarium supports two plugin kinds, both declared via a plugin.yaml manifest.
Short-lived subprocesses. Read file paths from stdin, write NDJSON records to
stdout conforming to schema/plugin_schema.json.
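A plugin author may want to pre-check extractor output before handing it to the daemon. The sketch below is one way to do that in Python. The required fields are inferred from the two example records shown earlier, and the [0, 1] confidence range is an assumption. The authoritative rules live in schema/plugin_schema.json and may differ. `check_record` is a hypothetical name.

```python
import json

# Field lists inferred from the example records only (an assumption);
# schema/plugin_schema.json is the real contract.
REQUIRED = {
    "entity": {"type": str, "name": str, "file": str,
               "line_start": int, "line_end": int,
               "confidence": (int, float)},
    "relation": {"type": str, "from_kind": str, "from_name": str,
                 "to_kind": str, "to_name": str,
                 "confidence": (int, float)},
}


def check_record(line):
    """Return None if the NDJSON line passes the checks, else a reason string."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError as exc:
        return f"not JSON: {exc.msg}"
    if not isinstance(rec, dict):
        return "record is not a JSON object"
    kind = rec.get("kind")
    fields = REQUIRED.get(kind) if isinstance(kind, str) else None
    if fields is None:
        return f"unknown kind: {kind!r}"
    for name, expected in fields.items():
        if name not in rec:
            return f"missing field: {name}"
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(rec[name], bool) or not isinstance(rec[name], expected):
            return f"wrong type for field: {name}"
    if not 0.0 <= rec["confidence"] <= 1.0:
        return "confidence outside [0, 1]"
    return None
```

Running each stdout line of a plugin through `check_record` and printing any non-None result gives a quick report before indexing.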
name: icarium-indexer-codebert
kind: extractor
emits_kinds: [UVM_AGENT, UVM_DRIVER, MODULE, COVERGROUP, ...]
emits_relations: [HAS_DRIVER, HAS_MONITOR, EXTENDS, ...]
executable: icarium-indexer-codebert

The built-in extractor (icarium-indexer-codebert) runs a fine-tuned
GraphCodeBERT model (125M parameters, MIT licence) for SV/UVM named-entity
recognition. Model files live in $ICARIUM_MODELS or the path set in
icarium.toml.
Model performance (epoch 5, 6,257-file corpus): F1 = 0.972 · Precision = 0.969 · Recall = 0.975 · Accuracy = 0.995
In-process method handlers registered at startup. The kanban plugin ships built-in.
name: icarium-kanban
kind: capability
provides_methods: [kanban.add, kanban.list, kanban.get, kanban.update,
kanban.move, kanban.link]
provides_hooks: [on_task_complete, on_finding]

Daemon scans at startup:
- `$ICARIUM_BIN/../plugins/<name>/plugin.yaml` (built-in)
- `~/.icarium/plugins/<name>/plugin.yaml` (user)
- `./.icarium/plugins/<name>/plugin.yaml` (project, requires `ICARIUM_ENABLE_PROJECT_PLUGINS=1`)
Hooks are fire-and-forget notifications fired on daemon events:
| Hook | Fires when |
|---|---|
| pre_index | Before plugin_runner starts on a file batch |
| post_index | After plugin_runner completes |
| on_task_complete | A task queue entry reaches done or failed |
| on_finding | A triage gear writes a finding |
| on_gear_stage_complete | A gear executor stage finishes |
| on_gear_complete | A gear run reaches its termination condition |
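To illustrate the fire-and-forget behaviour, here is a small Python sketch of a bounded hook registry. It is not the hooks.zig implementation. The class and method names are hypothetical, and treating the 64 slots as a single global cap across all hooks is an assumption.

```python
MAX_SLOTS = 64  # assumed to be one global cap shared by all hooks


class HookRegistry:
    """Hypothetical registry: subscribers are notified, results are ignored."""

    def __init__(self):
        self._slots = []  # list of (hook_name, callback) pairs

    def register(self, hook_name, callback):
        """Add a subscriber; return False when every slot is taken."""
        if len(self._slots) >= MAX_SLOTS:
            return False
        self._slots.append((hook_name, callback))
        return True

    def fire(self, hook_name, payload):
        """Call every subscriber of hook_name; return how many completed."""
        completed = 0
        for name, callback in self._slots:
            if name != hook_name:
                continue
            try:
                callback(payload)
                completed += 1
            except Exception:
                # Fire-and-forget: a failing hook must not affect the caller.
                pass
        return completed
```

The caller never sees a subscriber's return value or exception, which is the property the term fire-and-forget describes here.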
Gears are YAML-like pipeline definitions that describe multi-step DV tasks. The daemon loads them at startup from (in priority order):
- ICARIUM_GEARS env var directory
- ./gears/ alongside the binary
- ~/.icarium/gears/
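The priority order can be sketched as follows. This Python version makes several assumptions: "alongside the binary" means a gears directory next to the executable, a gear is identified by its file stem, and an earlier directory shadows later ones for the same name. Both function names are hypothetical.

```python
import os
from pathlib import Path


def gear_search_dirs(binary_dir, env=None, home=None):
    """Return candidate gear directories, highest priority first."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    dirs = []
    if env.get("ICARIUM_GEARS"):
        dirs.append(Path(env["ICARIUM_GEARS"]))
    dirs.append(Path(binary_dir) / "gears")
    dirs.append(home / ".icarium" / "gears")
    return dirs


def discover_gears(dirs):
    """Map gear name -> path of the first *.gear file found for that name."""
    found = {}
    for directory in dirs:
        if not directory.is_dir():
            continue  # missing directories are skipped silently
        for path in sorted(directory.glob("*.gear")):
            found.setdefault(path.stem, path)  # earlier directories win
    return found
```

Under these assumptions, a triage.gear in the ICARIUM_GEARS directory would take precedence over a built-in gear with the same file name.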
name: close_coverage
version: 2
triggers:
- "close coverage"
- "coverage closure"
stages:
- id: decompose
type: llm
prompt: "Identify which functional scenarios will close remaining coverage holes.\n\nTask: {input}\nKnown entities:\n{context}"
- id: execute
type: process
prompt: "echo 'Coverage plan: {decompose}' && date"
- id: analyze
type: llm
prompt: "Analyze the simulation output and identify remaining gaps.\n\nPlan:\n{decompose}\n\nOutput:\n{execute}"
- id: synthesize
type: llm
prompt: "Synthesize into a prioritized action list.\n\nAnalysis:\n{analyze}"
termination:
condition: synthesize.status == done
max_iterations: 3
on_max: return_last_synthesize

Template tokens: {input} = original user query, {context} = entity-store
context injected by the executor, {stage_id} = output from a prior stage.
\n in prompt strings is unescaped to a real newline at runtime.
Stage types: llm process parallel_llm condition
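The template-token rules described above can be sketched in Python as follows. This is an illustration, not the executor.zig implementation: `fill_template` is a hypothetical name, and leaving unknown tokens untouched is an assumption about behaviour the text does not specify.

```python
import re

_TOKEN = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def fill_template(prompt, user_input, context, stage_outputs):
    """Expand {input}, {context} and {stage_id} tokens in a gear prompt."""
    values = dict(stage_outputs)
    values["input"] = user_input
    values["context"] = context
    # Unescape the two-character sequence backslash + n before substituting,
    # so text coming from earlier stages is inserted verbatim.
    text = prompt.replace("\\n", "\n")
    # Single pass: substituted text is not scanned again for tokens.
    return _TOKEN.sub(lambda m: values.get(m.group(1), m.group(0)), text)
```

For the analyze stage of close_coverage, `stage_outputs` would hold the outputs of decompose and execute keyed by their stage ids.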
| Gear | Triggers | Description |
|---|---|---|
| close_coverage | "close coverage", "coverage closure" | Decompose → simulate → analyze gaps (parallel) → synthesize |
| triage | "triage", "failures", "regression triage" | Parse failure logs → cluster → root-cause per cluster → synthesize |
| simulate | "simulate", "run sim", "smoke test" | Build run command → launch subprocess → parse pass/fail |
| debug | "debug", "why is", "investigate" | Gather context → 3 hypotheses → verify each → rank |
| Phase | Goal | Status |
|---|---|---|
| 1 — Entity store queries | IPC query handlers against indexed entity/relation tables | ✓ Done |
| 2 — Gear format + parser | Load and validate gear definition files | ✓ Done |
| 2B — Plugin infrastructure | Plugin manifests, hooks, kanban plugin | ✓ Done |
| 3 — LLM pool | Parallel structured LLM calls (Anthropic + OpenAI-compat) | ✓ Done |
| 4 — Gear executor | Run a gear end-to-end: stage loop, template fill, iteration | ✓ Done |
| 5 — Router | Natural language → gear selection (trigger match → embedding → LLM) | Next |
| 6 — Relation extraction | Heuristic SV relation extractor (EXTENDS, HAS_DRIVER, DRIVES…) | |
| 7 — pgvector embeddings | Semantic entity search via HNSW index | |
| 8 — TUI | Interactive REPL + live task queue + findings panes | |
| 9 — Hardening | icariumd doctor, config validation, structured errors |
src/zig/llm.zig — two backends: Anthropic (native tool_use for structured
output) and OpenAI-compatible (json_schema mode). Synchronous single calls via
llm.call() and parallel fan-out via llm.callParallel() (thread-per-request).
Backend auto-detected from endpoint URL; API key resolved from env at init time.
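The thread-per-request fan-out can be illustrated with a short Python sketch. It mirrors the idea only. `call_parallel` and its `call` argument are hypothetical stand-ins for llm.callParallel() and a single blocking request, and re-raising the first error is a design choice of the sketch.

```python
import threading


def call_parallel(call, prompts):
    """Run call(prompt) on one thread per prompt; keep results in input order."""
    results = [None] * len(prompts)
    errors = [None] * len(prompts)

    def worker(index, prompt):
        try:
            results[index] = call(prompt)
        except Exception as exc:  # collected here, re-raised after join
            errors[index] = exc

    threads = [threading.Thread(target=worker, args=(i, p))
               for i, p in enumerate(prompts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    for exc in errors:
        if exc is not None:
            raise exc
    return results
```

Each worker writes only to its own index, so no lock is needed for the result list.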
src/zig/executor.zig — the stage runner called by gear.run. Loops through
each stage in order; for llm/parallel_llm stages calls llm.call() with the
template-filled prompt; for process stages runs the command via icr_exec_shell.
Template engine substitutes {input}, {context}, and {stage_id} tokens using
prior stage outputs. Respects termination.condition and max_iterations. All
four built-in gear files carry concrete prompt templates.
src/zig/router.zig — three-tier query classification:
- Structural match: pure entity-store queries ("list all UVM agents", "which modules have no covergroup") bypass gears entirely and go directly to the `query` IPC handler.
- Trigger match: substring match against gear trigger lists. Already implemented in gear_registry.zig; the router formalizes it as a first pass.
- Embedding similarity: encode the query with the encoder model, then compute cosine similarity against cached gear trigger embeddings. Fallback when trigger match misses.
- LLM fallback: for ambiguous or novel queries, a lightweight LLM call with the gear list selects the best match or returns `null` for direct entity-store dispatch.
Done when a router test suite correctly routes 20 labelled queries across all four gear types plus the structural bypass case, with no LLM calls for trigger-matched or structural inputs.
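Since the router is not implemented yet, the following Python sketch only illustrates the planned cascade. Everything in it is hypothetical: the `route` signature, the injected `is_structural`, `embed` and `llm_pick` callables, and the 0.8 similarity threshold. A real version would also cache trigger embeddings instead of recomputing them.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query, gears, is_structural, embed, llm_pick, threshold=0.8):
    """gears maps gear name -> trigger list. Returns (tier, gear_name).

    gear_name is None when the query should go to the entity store.
    """
    q = query.lower()
    if is_structural(q):
        return ("structural", None)
    for name, triggers in gears.items():
        if any(t.lower() in q for t in triggers):
            return ("trigger", name)
    q_vec = embed(q)
    best_name, best_score = None, 0.0
    for name, triggers in gears.items():
        for trigger in triggers:
            score = cosine(q_vec, embed(trigger))
            if score > best_score:
                best_name, best_score = name, score
    if best_score >= threshold:
        return ("embedding", best_name)
    return ("llm", llm_pick(query, list(gears)))
```

Because the cheaper tiers return early, structural and trigger-matched inputs never reach `embed` or `llm_pick`, which is the property the acceptance criterion asks for.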
Single-threaded IPC loop: ipc.zig uses module-level g_resp_buf and
g_data_buf instead of stack buffers. Returning slices into stack-allocated
arrays from dispatch() is UB; the module-level buffers are safe because the
accept loop handles one connection at a time.
Gear parser: line-oriented indent state machine in gear.zig. Indent 0 =
top-level scalars/section headers; indent 2 = list items; indent 4 = object
fields within list items. All strings are arena-allocated; call gear.deinit()
to free.
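The indent rules can be illustrated with a small Python sketch. It is not a port of gear.zig. It assumes that indent-2 lines without a leading dash are key/value fields of the enclosing section (as in termination), keeps every value as a string, and `parse_gear` is a hypothetical name.

```python
def _scalar(raw):
    """Strip whitespace and one pair of surrounding double quotes."""
    raw = raw.strip()
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        return raw[1:-1]
    return raw


def parse_gear(text):
    """Parse a gear definition into nested dicts and lists of strings."""
    gear, section, item = {}, None, None
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        body = line.strip()
        if indent == 0:
            key, _, value = body.partition(":")
            item = None
            if value.strip():
                gear[key], section = _scalar(value), None  # top-level scalar
            else:
                section = key                              # section header
        elif indent == 2 and section is not None:
            if body.startswith("- "):
                entries = gear.setdefault(section, [])
                entry = body[2:].strip()
                key, sep, value = entry.partition(": ")
                if sep and not entry.startswith('"'):
                    item = {key: _scalar(value)}           # object list item
                    entries.append(item)
                else:
                    item = None
                    entries.append(_scalar(entry))         # scalar list item
            else:
                key, _, value = body.partition(":")
                gear.setdefault(section, {})[key] = _scalar(value)
        elif indent == 4 and item is not None:
            key, _, value = body.partition(":")
            item[key] = _scalar(value)                     # field of list item
        else:
            raise ValueError(f"unexpected indent {indent}: {body!r}")
    return gear
```

Splitting each field on its first colon keeps prompt strings that themselves contain colons intact.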
Optional SQL filters: nullable query parameters use PostgreSQL's
$N::text IS NULL OR col = $N pattern. When libpq passes a C NULL pointer for
a params-array entry, $N becomes SQL NULL, the IS NULL branch is TRUE, and
the filter is skipped entirely.
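The same optional-filter idea can be tried without a PostgreSQL server. The sketch below swaps in SQLite through Python's sqlite3 module as a stand-in, so it uses named parameters instead of $N and omits the PostgreSQL-specific ::text cast. The table layout and the `find_entities` helper are illustrative assumptions.

```python
import sqlite3

# Each filter is skipped when its parameter is NULL (Python None).
QUERY = """
SELECT name FROM entities
WHERE (:kind IS NULL OR kind = :kind)
  AND (:file IS NULL OR file = :file)
ORDER BY name
"""


def find_entities(conn, kind=None, file=None):
    """Return entity names, applying only the filters that were supplied."""
    rows = conn.execute(QUERY, {"kind": kind, "file": file})
    return [name for (name,) in rows]
```

One prepared statement therefore serves every combination of filters, which avoids building SQL strings dynamically.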
In-process capability plugins: plugins/kanban.zig is linked directly into
the daemon rather than running as a sidecar process. The external-process
protocol (Unix socket NDJSON with {"ready":true,"socket":"..."} handshake)
is planned for Phase 9 (Hardening).
Source available. See LICENCE file.