Feat/skill management #24

Open
aarontuor wants to merge 38 commits into main from feat/skill-management
@aarontuor

Contributor

Adds a skills discovery feature and improves skills management by relying on each agent's native skill usage. Merges the two MCP servers into one to resolve the problem of communication between them. Updates the use cases and docs.

aarontuor and others added 16 commits June 24, 2026 09:11
Bridge dsagt's skill system into agent-native discovery and add a
searchable catalog of installable skills from external GitHub repos.
Two tiers:

- Catalog — external Agent-Skills repos (default: K-Dense scientific,
  140+) cloned + indexed into per-source `skills_catalog__<slug>` KB
  collections. Searchable via search_skills, never loaded into the
  agent's context.
- Installed — a chosen skill copied into <project>/skills/ and mirrored
  into .claude/skills/ for Claude Code's native discovery.

Changes:
- commands/skills_catalog.py: shallow-clone cache, recursive SKILL.md
  discovery, per-source indexing (idempotent re-sync via drop+rebuild),
  find/install. Known sources: scientific, anthropic, antigravity, composio.
- MCP tools: install_skill + catalog-spanning search_skills (registry);
  add_skill_source / list_skill_sources (knowledge). Added to auto-allow.
- agents/base._mirror_skills_to + ClaudeSetup hook: manifest-gated mirror
  into .claude/skills/ (never clobbers user skills; reaps stale entries;
  trims >1536-char descriptions in the copy only).
- Bundled skill-creator meta-skill (Anthropic template + condensed spec).
- CLI: dsagt skills sync/add/list/search.
- Config: skills block (sources/populate_native/populate_catalog),
  backfilled for old configs; setup-kb syncs the default catalog
  (--no-skill-catalog to skip); reserve .skill_sources; kb_from_config.
- dsagt_instructions.md: two-tier guidance (native vs catalog/install).
- use_cases/isaac_skills_demo: runnable mock of the isaac_vasp workflow
  exercising the full flow with tiny mock VASP data.

Tests: test_skills_catalog.py + config/server-routing additions
(201 passed, 13 skipped); black + ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
First-pass vetting of use_cases/isaac_skills_demo against the real
K-Dense catalog surfaced one bug: `dsagt skills add <proj> <source>`
synced + indexed the catalog but never wrote the source into
dsagt_config.yaml, so a later config-driven `dsagt skills sync` would
forget it. Only the `add_skill_source` MCP tool persisted.

- Move the persist logic into a shared `persist_source_to_config` helper
  in skills_catalog.py; call it from both the CLI add-source path and the
  knowledge-server `add_skill_source` handler (removes the duplicated
  `_persist_skill_source`).
- Regression test for the helper (append + dedupe + missing-config no-op).
- Add use_cases/isaac_skills_demo/PROMPTS.md: the 8-prompt hand-pass
  script plus first-pass results (init/mirror, 146-skill sync, search,
  install pymatgen, native re-mirror, add anthropic all verified).

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
The isaac_skills_demo rebuild path (rm + init) left the external catalog
empty because a fresh `init` copies only the shared KB, while the catalog
is project-scoped. The agent's search_skills then correctly returned no
catalog hits. Add the required `dsagt skills sync` step to the rebuild
block, a pre-launch catalog check, and a note on the global setup-kb
alternative.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
An empty/unsynced external skill catalog was indistinguishable from a
genuine no-match, forcing the agent through a multi-step discovery dance
and a misleading "skill unusable until restart" message.

- search_skills: when no catalog is synced, say so and point at
  list_skill_sources / add_skill_source instead of a bare no-match
- list_skill_sources: flag each known source synced/available with its
  indexed count, rather than two parallel lists to cross-reference
- install_skill: state the skill is usable this session immediately;
  restart only enables hands-free native auto-invocation
- dsagt_instructions: document the catalog as opt-in and the
  list_skill_sources -> add_skill_source -> search_skills -> install flow

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Bump to 0.2.0 and make dsagt.__version__ the single source of truth —
pyproject reads it via setuptools dynamic metadata, so future bumps
touch one line. Add CHANGELOG (Keep a Changelog format).

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
- README/docs: document the non-developer install (pip install
  git+https://github.com/AI-ModCon/dsagt.git into any 3.12/3.13 env);
  note uv is dev/CI-only and conda/venv both work
- de-duplicate the supported-agents table and install block via
  mkdocs-include-markdown so docs/index.md pulls them from the README
- correct the Python prerequisite to 3.12/3.13
- cli.md: drop the uv-sync-specific install assumption
- docs CI builds with the locked docs dependency group via uv

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…Catalog + keyword fallback

Collapse the two MCP servers (dsagt-registry-server + dsagt-knowledge-server)
into a single dsagt-server backed by one shared KnowledgeBase, and finish the
skills-discovery refactor from design-notes/genesis-skills-comparison.md §10 and
skills-catalog-server-merge.md.

Server merge
- New commands/dsagt_server.py: create_dsagt_server composes both modules'
  (tools, handlers) under one Server("dsagt") with a type-dispatched call_tool
  (registry str passthrough + knowledge dict->json + error wrap); one shared-KB
  main() with the cross-backend guard living once in _build_kb_from_config.
- registry_server / knowledge_server keep create_*_server as thin test-facing
  wrappers via extracted _registry_tools_and_handlers / _knowledge_tools_and_handlers;
  their standalone main()s + entry points are removed.
- pyproject: dsagt-registry-server + dsagt-knowledge-server -> dsagt-server.
- Per-agent MCP config collapsed to one "dsagt" entry across claude/goose/codex/
  cline/roo/opencode (+ base helpers, info.py span buckets).
- Compat is rebuild-not-migrate: re-run `dsagt start` to regenerate config
  (README upgrade note + cline .cline-data caveat). No migration shims.
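
The type-dispatched `call_tool` described above can be pictured with a minimal sketch. It is synchronous and does not use the MCP SDK, and every name in it (`make_call_tool`, `registry_echo`, `knowledge_lookup`) is hypothetical rather than taken from dsagt_server.py. It only shows the three behaviors the bullet names: string passthrough, dict-to-JSON, and error wrapping.

```python
import json
from typing import Any, Callable

Handler = Callable[..., Any]


def make_call_tool(
    registry_handlers: dict[str, Handler],
    knowledge_handlers: dict[str, Handler],
) -> Callable[[str, dict[str, Any]], str]:
    """Compose two handler tables behind one dispatcher (illustrative only)."""
    handlers = {**registry_handlers, **knowledge_handlers}

    def call_tool(name: str, arguments: dict[str, Any]) -> str:
        handler = handlers.get(name)
        if handler is None:
            return json.dumps({"error": f"unknown tool: {name}"})
        try:
            result = handler(**arguments)
        except Exception as exc:
            # Error wrap: a failing handler yields a JSON error payload
            # instead of propagating the exception to the caller.
            return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
        if isinstance(result, str):
            return result  # registry-style handlers: string passthrough
        return json.dumps(result)  # knowledge-style handlers: dict -> JSON

    return call_tool


# Hypothetical handlers, one per style.
call_tool = make_call_tool(
    {"registry_echo": lambda text: f"echo: {text}"},
    {"knowledge_lookup": lambda query: {"hits": [query]}},
)
```

Dispatching on the result type rather than on the owning module keeps the composed server free of per-tool special cases, which is one plausible reason for the design.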

Skills discovery
- New SkillsCatalog (commands/skills_catalog.py) owns the catalog data plane
  (sync/install/search/list_sources + ChromaDB-vs-keyword backend selection);
  SkillRouter is now a thin render/MCP facade over it.
- New skill_keyword.py: Genesis-faithful token-overlap scorer (no-embedder
  fallback). skill_discovery.py: stateless SkillRouter.
- Catalog-only search + frontmatter-only indexing; removed the dead
  skills-collection indexing (SkillRegistry + setup_core_kb bundled skills).
- Struck the stale bundled datacard-generator skill (now via `dsagt skills add
  <project> genesis`); skill-creator is the only bundled skill.
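
For readers unfamiliar with token-overlap scoring, a minimal sketch of the idea behind the no-embedder fallback follows. It is not the Genesis scorer or the code in skill_keyword.py. The tokenizer, the normalization by query length, and the function names are all assumptions chosen for illustration.

```python
import re

_TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(_TOKEN.findall(text.lower()))


def overlap_score(query: str, name: str, description: str) -> float:
    """Fraction of query tokens found in the skill's name or description."""
    query_tokens = tokenize(query)
    if not query_tokens:
        return 0.0
    doc_tokens = tokenize(name) | tokenize(description)
    return len(query_tokens & doc_tokens) / len(query_tokens)


def keyword_search(query: str, skills: list[dict], limit: int = 5) -> list[tuple[float, str]]:
    """Rank skills by overlap score, dropping skills with no shared token."""
    scored = [
        (overlap_score(query, skill["name"], skill["description"]), skill["name"])
        for skill in skills
    ]
    ranked = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return ranked[:limit]


# Illustrative frontmatter-only records (made up for this sketch).
skills = [
    {"name": "pymatgen", "description": "Parse VASP outputs with pymatgen"},
    {"name": "datacard", "description": "Generate a dataset datacard"},
]
results = keyword_search("parse vasp", skills)
```

Because only the name and description are scored, a scorer like this fits the frontmatter-only indexing described in the same commit.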

Docs/diagram
- Architecture figure: one MCP box spanning Knowledge + Registry, "Server"
  dropped from both labels, Skills -> Skills Catalog; regenerated PNGs.
- README/docs/design-notes updated.

Tests: new test_dsagt_server.py + test_skill_discovery.py; config-shape, info,
and smoke-test assertions updated to the single-server shape. 338 passed /
13 skipped across affected suites; ruff + black clean on changed files.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…overy

Bump to 0.3.0 (breaking: the registry/knowledge MCP servers merge into one
dsagt-server; the old console scripts are removed). CHANGELOG documents the
motivation and a rebuild-not-migrate upgrade note; README version refs bumped.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
- Add the skill-routing diagram (assets/skills-routing.png) to tools-skills.md
  with a "Skill discovery architecture" section motivating the two-tier
  (catalog vs native) split, catalog-only search, the keyword fallback, the
  single SkillRouter entry point, and per-source federation/provenance.
- Sweep README + docs for the old two-server model: mcp-servers, architecture,
  cli, developer, quickstart, knowledge-base now describe one dsagt-server.
- Fix stale skill-indexing language: bundled/installed skills are auto-
  discovered natively (not indexed); setup-kb rebuilds Tool Specs only; the
  KB collection tables list "Skills Catalog" (external, per-source) instead of
  a bundled "Skills" collection.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
The README/docs documented `dsagt --version` but the CLI had no such flag
(argparse errored). Wire it from dsagt.__version__ so it reports the release
the docs claim. Found while vetting the isaac_skills_demo walkthrough.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…source name clashes

When the same skill name exists in more than one synced source (the clone
cache is machine-global, so this happens even across projects), the install
guard now resolves a source-qualified '<slug>/<skill>' name instead of dead-
ending. find_catalog_skill parses the '<slug>/' prefix and scopes the search;
install_skill (MCP) and 'dsagt skills add' inherit it. The CLI 'add' routes a
'<synced-slug>/<skill>' target to install (not a clone) and now prints a clean
error instead of a traceback on an ambiguous bare name. The ambiguity message
points at the qualified form, which actually works now.

Found while vetting the isaac_skills_demo walkthrough.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…package; consolidate skill modules into skills.py

The single dsagt-server now lives in src/dsagt/mcp/, composed from per-concern
tool modules behind one dispatch shell:
- mcp/server.py   — composition, main(), shared build_dispatch_server + KB startup
- mcp/registry_tools.py / knowledge_tools.py / memory_tools.py / skill_tools.py
This replaces the registry_server / knowledge_server / dsagt_server trio and
moves importable logic out of commands/ (which is for entry points). Entry point
repointed to dsagt.mcp.server:main; agents are unchanged (same `dsagt-server`).

Skill discovery is consolidated into one module src/dsagt/skills.py — the
catalog data plane (SkillsCatalog), the SkillRouter render facade, and the
Genesis-derived keyword scorer (was skill_keyword + skill_discovery +
commands/skills_catalog).

install_skill now returns a terse confirmation (the install→use→restart model
already lives in the agent instructions).

Remove dead src/session.py. Migrate tests to the new layout (new
test_memory_tools / test_skill_tools; server-startup reworked to fast,
network-free entry-point checks; memory + kb_search tests rehomed). Update
CLAUDE.md and docs/mcp-servers.md to the merged-server / four-concern shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…ntmatter

Third-party catalogs (e.g. Genesis) ship SKILL.md files whose unquoted
`description` contains a colon (`…readiness levels: Level 1…`), which strict
YAML rejects — so _parse_frontmatter raised and the skill was silently dropped
from indexing, search, and install. Genesis's `generating-datacards`
(datacard-generator) was one such casualty.

_parse_frontmatter now falls back to a lenient flat key:value parse on YAML
error, recovering name/description/tags. dsagt-authored specs are valid YAML, so
the fallback never fires for them. Adds TestLenientFrontmatter.
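
The fallback can be sketched as below, under stated assumptions: the strict parser is passed in as a callable (with PyYAML one would likely pass `yaml.safe_load`), the lenient pass handles only top-level `key: value` lines, and the sample description is made up. This is not the body of `_parse_frontmatter`.

```python
from typing import Callable


def lenient_frontmatter(block: str) -> dict[str, str]:
    """Flat key: value parse that splits each line on the FIRST colon only."""
    fields: dict[str, str] = {}
    for line in block.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "---" or stripped.startswith("#"):
            continue
        if line[0].isspace():
            continue  # skip indented (nested or continuation) lines
        key, sep, value = stripped.partition(":")
        if not sep or not key.strip():
            continue
        fields[key.strip()] = value.strip().strip("'\"")
    return fields


def parse_frontmatter(block: str, strict_parser: Callable[[str], dict]) -> dict:
    """Try the strict parser first; fall back to the lenient parse on any error."""
    try:
        return strict_parser(block)
    except Exception:
        return lenient_frontmatter(block)


def always_fails(_block: str) -> dict:
    """Stand-in for a strict YAML parser rejecting the unquoted colon."""
    raise ValueError("mapping values are not allowed here")


# Illustrative frontmatter; the description text is invented for this sketch.
block = "name: generating-datacards\ndescription: Scores readiness levels: Level 1 to Level 3\n"
parsed = parse_frontmatter(block, always_fails)
```

Splitting on the first colon only is what lets the rest of the description, later colons included, survive as the value.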

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Consolidate the branch into one 0.2.0 release: a single CHANGELOG entry (external
skill catalogs incl. the Genesis integration, native discovery, catalog-only
search + keyword fallback, license/PROVENANCE capture, server merge) replacing
the interim 0.2.0/0.3.0 entries; version set to 0.2.0 (one minor bump from
master's 0.1.0). README/architecture diagram refreshed.

Add two skill-management use cases:
- isaac_skills_demo — agent-led arc (what we have → find more → sync → install →
  create) that authors a vasp-to-isaac converter parsing with real pymatgen.
- genesis_skills — pull Genesis curation skills, ingest domain into the KB, and
  generate a datacard for a finished dataset.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@ajtritt ajtritt assigned aarontuor and unassigned ajtritt Jun 26, 2026
@ajtritt ajtritt requested review from a team, RAVarikoti, ajtritt and jeanbez June 26, 2026 18:19
aarontuor and others added 7 commits June 28, 2026 20:51
Recover observability and episodic memory without the LiteLLM proxy, by
reading each agent's own on-disk session transcript instead of intercepting
LLM traffic — no proxy, no credentials, serverless trace store.

Trace pipeline (Phase 2), all in one src/dsagt/traces.py:
- `Trace` — the canonical session form as nested dicts (a list of span dicts
  + compose/query/to_exchanges methods; the span/message/block shapes have a
  single home in Trace's add_* methods).
- `Reader`/`Translator` ABCs with one subclass per agent (claude/codex/goose/
  opencode/cline); codex/goose/opencode/cline share the Translator turn-
  template, claude is bespoke, claude+codex share JsonlReader.
- `TraceCollector` — the in-session heartbeat: read → translate → hand the
  Trace to its consumers, each with its own ack set for idempotency.
- `MLflowSink` (observability) consumes Trace + dict spans directly.

Episodic memory (Phase 3, opt-in via `dsagt init --episodic`):
- `memory.MemoryExtractor` — a TraceCollector consumer: Tier-0 mechanical
  chunk+tag+embed always; Tier-1 distillation via `judge.LocalJudge`
  (Qwen2.5-1.5B GGUF, GBNF-grammar JSON), degrading to Tier-0 on failure.
- Recency-weighted retrieval over session_memory (episodic.recency_half_life_days).
- `judge.py` — Judge.create → LocalJudge (default) / APIJudge; llama-cpp-python
  added to core, pinned to an integrity-verified CPU wheel.
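
As a rough illustration of what a half-life setting such as `episodic.recency_half_life_days` implies, the sketch below applies exponential decay to a similarity score. How dsagt actually combines recency with similarity is not stated here. The multiplicative form and both function names are assumptions.

```python
def recency_weight(age_days: float, half_life_days: float) -> float:
    """Exponential decay: the weight halves every half_life_days."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def rerank(hits: list[tuple[str, float, float]], half_life_days: float) -> list[tuple[float, str]]:
    """Re-score (text, similarity, age_days) hits by similarity * recency weight."""
    scored = [
        (similarity * recency_weight(age_days, half_life_days), text)
        for text, similarity, age_days in hits
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Made-up hits: an older, closer match versus a fresh, weaker one.
ranked = rerank([("old note", 0.9, 14.0), ("new note", 0.6, 0.0)], half_life_days=7.0)
```

With a 7-day half-life, the 14-day-old hit keeps a quarter of its similarity, so the fresher note ranks first in this example.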

Tool-use indexing:
- `provenance.ToolUseIndexer` — incremental, idempotent embedding of dsagt-run
  records into the tool_use collection on the heartbeat + before
  reconstruct_pipeline (fixes the prior re-index-everything duplicate bug).

Removed: the LiteLLM proxy (commands/proxy_server.py, test_proxy_callback,
proxy_walkthrough) and the roo agent. Docs (README, docs/ site, CHANGELOG)
updated to the post-proxy, serverless, episodic-enabled reality.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…r MCP

Reserve "tool"/"tools" for the MCP + agent sense; registered CLI
executables the agent runs via dsagt-run are now "codes" throughout.

- Registry: <project>/tools/ → codes/ (+ codes/scripts/), ToolRegistry →
  CodeRegistry, save_tool_spec → save_code_spec, search_registry takes
  code_name, dsagt-run --tool → --code, list_tools/get_tool →
  list_codes/get_code.
- Collections/spans: tools → codes, tool_use → code_use, tool.execute →
  code.execute; provenance/registry metadata tool_name → code_name.
  Kept tool_name for session_memory (agent tool calls); kb_search now
  accepts both code_name and tool_name.
- src/dsagt/tools/ → src/dsagt/codes/, run_tool.py → run_code.py
  (+ package-data, dsagt-run entry point, bundled scan_directory).
- Agent instructions, CLAUDE.md, README, and all docs updated to the
  convention. Kept the Anthropic tool_use block type and the MCP SDK
  @server.list_tools() decorator (both the reserved sense).

Docs also: fixed staleness (dropped Roo, the removed mlflow-autolog hook,
the removed LLM judge), trimmed implementation-detail TMI and negative
framing, simplified jargon (provisioning → setup, dropped "layer").

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
…nit`

- Restructure the docs into one page per capability, each ending with a
  hands-on "Try it" walkthrough and a pointer to the use cases:
  provenance (code registry + dsagt-run + reconstruction), knowledge-base
  (domain-knowledge cataloging + hybrid vector search + shared-store note),
  skills, memory, observability. Split the old codes-&-skills page, move
  memory + execution-record content out of knowledge-base, and regroup the
  mkdocs nav under a "Capabilities" section.
- Docs now lead with the interactive `dsagt init` menu; every walkthrough
  uses bare `dsagt init` + `dsagt start`. The pre-menu flags (positional
  name, --agent, --location, --include/--exclude, --episodic) are documented
  as deprecated backcompat only in the CLI reference and README.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
The handler now always forwards where_document (the regex/contains
document-content filter) to kb.search; the three call-signature
assertions predated that argument.

Co-Authored-By: Claude Fable 5 <[email protected]>
…flow deprecation

- http_request was the only MCP tool with no handler-level test; add
  success/method-forwarding/transport-error cases with a mocked
  httpx.AsyncClient.
- test_running_job held its worker in 'running' with a real
  time.sleep(10) that asyncio.run joined at loop shutdown; gate on a
  threading.Event released once the test observes the running state.
- test_install_skill_routes_and_reports_missing built a real
  local-embedding KnowledgeBase it never used (the install path builds
  its own SkillRouter) and scanned the machine-global skill_sources
  cache; use a mock KB and point SKILL_SOURCES_DIR at a tmp dir.
- mlflow.search_traces(experiment_ids=...) is deprecated in favor of
  locations= (FutureWarning under mlflow 3.11); switch the five call
  sites.
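
The `threading.Event` gating mentioned for test_running_job can be sketched in isolation. This is not the actual test. The state dict, the worker, and `run_gated_worker` are hypothetical stand-ins that show the pattern of holding a worker in a state until the observer releases it.

```python
import threading


def run_gated_worker() -> tuple[str, str]:
    """Hold a worker in 'running' until the observer has seen that state."""
    started = threading.Event()
    release = threading.Event()
    state = {"status": "pending"}

    def worker() -> None:
        state["status"] = "running"
        started.set()
        # Block until released instead of sleeping for a fixed time;
        # the timeout only guards against a hung test.
        release.wait(timeout=5)
        state["status"] = "done"

    thread = threading.Thread(target=worker)
    thread.start()
    started.wait(timeout=5)
    observed = state["status"]  # guaranteed 'running': release is not set yet
    release.set()
    thread.join(timeout=5)
    return observed, state["status"]
```

The worker finishes as soon as the observation is made, so nothing lingers to be joined at shutdown, which matches the runtime drop reported below.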

Suite: 621 passed, 0 warnings, ~31s (was ~56s).

Co-Authored-By: Claude Fable 5 <[email protected]>
aarontuor and others added 15 commits July 2, 2026 12:49
…Tel/csvtool era removed

Smoke test now runs two back-to-back dsagt start sessions and asserts
15 artifacts instead of 7:

- Agent LLM-trace recovery is a HARD assertion (was informational,
  waiting on 'Phase 2' — the trace pipeline landed). The OTel block and
  its agent_otel_support helper are removed from source entirely
  (function, __all__, otel_payload_support class vars, docstring matrix)
  — the smoke script was the last consumer.
- greet.py replaces the abandoned-csvtool fixture as a registerable AND
  executable code (stdlib-only, documented GRT-42 error code); the
  knowledge/ docs describe it, so registration, execution, provenance,
  and retrieval all reference one consistent tool. Execution is proven
  by greet's actual stdout in the trace_archive record; retrieval by
  the agent answering GRT-42 (only present in the ingested docs).
- New coverage: skill catalog search+install, episodic memory
  (--episodic init, session_memory collection), code_use indexing,
  reconstruct_pipeline prompt, state.yaml session log + trace_source,
  dsagt info, and session 2's cross-session recall + startup catch-up.
- ExplicitMemory now roots at <project>/.dsagt/, as CLAUDE.md and the
  user docs already document (the code was the outlier, rooting at the
  project root).
- Knowledge-ingest assert fixed to chroma_ids.json — route.json marks
  routed external collections, which ingest never creates.
- Serverless staleness: run.sh header, cli.py 'MLflow ports' docstring.
- manual_runs/ walkthroughs moved to tests/manual_walkthroughs/ (they
  document the dropped roo agent — kept off the docs site until
  refreshed); developer.md link updated.

Verified end-to-end with --agent claude: 15/15 checks pass (the one
first-run failure was the stale route.json assert, re-verified against
the run's artifacts after the fix). Unit suite: 621 passed.

Co-Authored-By: Claude Fable 5 <[email protected]>
…ening from the 4-agent sweep

Running the smoke test across codex/goose/cline/opencode surfaced one
product bug and three harness issues:

- CodexReader scanned the global ~/.codex/sessions/, but DSAGT always
  launches codex with CODEX_HOME=<project>/.codex-data (its MCP config
  lives there), so rollouts land project-local and the live trace
  pipeline collected NOTHING for codex — no MLflow traces, no episodic
  memory, no trace_source stamp for crash catch-up.  Default the reader
  to <project>/.codex-data/sessions.  Verified end-to-end: codex smoke
  now 15/15.
- The smoke agent-trace check masked that bug: its service.name span
  heuristic counted internal DSAGT spans lacking the attribute as agent
  traces.  Count MLflowSink's positive dsagt.trace_id trace-metadata
  marker instead (validated per-store: claude 1, goose 5, opencode 1,
  broken-codex 0).
- cline: dsagt start --script hard-errors by design (agents/cline.py),
  so the harness SKIPs cline with that explanation instead of 15 red
  checks.
- greet execution: goose put 'Ahoy' in both arg slots and opencode ran
  greet through bare bash (bypassing dsagt-run) — the prompt now names
  both arguments and steers to the spec's dsagt-run command, and the
  assert matches the greeting prefix only.  Opencode passes with the
  steered prompt.

Sweep verdicts: claude 15/15, codex 15/15, opencode 15/15,
goose 14/15→pass (both changed checks verified against its artifacts),
cline SKIP.  Unit suite: 621 passed.

Co-Authored-By: Claude Fable 5 <[email protected]>
Agents routinely sidestepped MCP-dispatched execution with their own
bash tools; dsagt-run inside the stored command makes that path
harmless. Capture the rationale where the wrapping code lives.

Co-Authored-By: Claude Fable 5 <[email protected]>
…tive skills

Registered codes move from flat codes/<name>.md to skill-standard
directories (codes/<name>/SKILL.md + per-code scripts/), making each
code a self-contained, portable dir structurally identical to an
installed skill.  Because codes now share the skill envelope,
setup_skills mirrors them into the agent's native skills dir unchanged
— native discovery puts the exact dsagt-run command in context at
invocation time, a second discovery path (alongside search_registry)
aimed at the from-memory command-reconstruction failure mode observed
with opencode.  The agent-facing MCP surface is unchanged.

- Code names constrained to the skill charset (lowercase-hyphen);
  save_code_spec validates with an actionable error.  Bundled code
  renamed scan_directory → scan-directory (python module unchanged).
- SKILL.md body leads with the exact wrapped command + the
  copy-byte-for-byte instruction — what native discovery injects.
- Mirror order: codes first, skills last — a deliberately installed
  instruction skill wins a name collision with a code.
- dsagt start again refreshes the dynamic agent record (MCP config +
  native mirror, all idempotent) — the refresh had been dropped from
  _cmd_start, silently breaking install_skill's 'available after the
  next dsagt start' promise; restored, so mid-session-registered codes
  and installed skills mirror at the next start.
- opencode gains its missing native_skills_dir (.agents/skills, the
  AGENTS.md-convention dir codex/goose already use) — it was the one
  agent with no native mirror at all.
- Smoke test asserts both native mirrors (bundled code at session 1's
  start, mid-session-registered greet at session 2's).
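
A minimal sketch of name validation with an actionable error, in the spirit of the `save_code_spec` check above. The exact charset dsagt enforces is not spelled out in this message. Allowing digits alongside lowercase letters and hyphens, and the name `validate_code_name`, are assumptions.

```python
import re

# Assumed charset: lowercase alphanumeric runs joined by single hyphens.
_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_code_name(name: str) -> str:
    """Return the name if valid; otherwise raise with a suggested fix."""
    if _NAME.fullmatch(name):
        return name
    suggestion = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    hint = f"; try {suggestion!r}" if suggestion else ""
    raise ValueError(
        f"invalid code name {name!r}: use lowercase letters, digits, and hyphens{hint}"
    )
```

For example, `scan_directory` would be rejected with a hint pointing at `scan-directory`, mirroring the bundled-code rename.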

Verified end-to-end: smoke sweep PASS 4/4 (claude, codex, goose,
opencode; cline SKIPs by design).  Unit suite: 624 passed.

Known issue (pre-existing, surfaced by a parallel sweep): concurrent
dsagt init runs race on projects.yaml registration — smoke-test --all
can clobber project registrations; needs a file lock.

Co-Authored-By: Claude Fable 5 <[email protected]>
- dsagt init hard-failed for cline without ANTHROPIC_API_KEY /
  ANTHROPIC_MODEL in the shell — a batch-mode-era requirement, though
  cline batch is unconditionally unsupported.  Worse, running
  `cline auth -p anthropic` from scavenged env vars can clobber an
  existing provider integration (e.g. an OpenAI subscription login).
  BYOA: the user configures cline before pointing dsagt at it; dsagt
  writes only the MCP config.  credential_env_vars/hints dropped.
- cline 3.x moved --config to a global option (must precede the
  subcommand) and made `mcp add` open an interactive wizard without
  --yes; both fixed (probe-verified against cline 3.0.34).

Verified: credential-less `dsagt init --agent cline` completes —
instructions written, MCP server registered + env-patched, 2 skills
mirrored natively.

Co-Authored-By: Claude Fable 5 <[email protected]>
…ctions

agent-card.md described the pre-0.2.0 architecture (two MCP servers,
LiteLLM proxy, OTel/OTLP, setup-kb/mlflow/memory/stop commands, Roo
Code, dsagt_config.yaml, faiss).  Rewritten in the same card format:
single dsagt-server with all 20 tools enumerated by concern, serverless
sqlite MLflow store, BYOA auth, transcript trace pipeline, codes as
skill-standard dirs with native mirroring, current agents/deps/CLI, and
a skill_discovery capability replacing the no-longer-bundled
datacard-generator.

dsagt_instructions.md fixes: the 'scientific' skill source doesn't
exist (→ k-dense-ai); codes/scripts/ → codes/<name>/scripts/; code-name
charset rule beside the save_code_spec guidance; and a note that
registered codes appear among native skills, whose SKILL.md opens with
the exact command — same verbatim-executable rule as section 1b.

Co-Authored-By: Claude Fable 5 <[email protected]>
…l hint machinery

Extends the cline decision to every agent: the user configures and
authenticates their agent (shell env vars, subscription logins) BEFORE
pointing dsagt at it.  dsagt neither hints, warns, nor troubleshoots.

Removed: credential_env_vars / credential_hints class vars on all five
setups, byoa_env_hints(), and the launch-time 'agent will use
preconfigured env vars' transparency note (+ their 18 tests).

Kept (functional, not troubleshooting): opencode's {env:VAR} provider
interpolation in opencode.json, codex's auth.json propagation into the
per-project CODEX_HOME, and batch-mode requirements that genuinely need
an env value (OPENCODE_MODEL for opencode run -m).

Co-Authored-By: Claude Fable 5 <[email protected]>
python -m execution is the only machine-independent executable for
package-shipped code, and module paths can't contain the hyphen the
skill-standard dir name requires — so bundled implementation modules
sit beside their spec dirs, unlike project codes whose scripts are
self-contained in codes/<name>/scripts/.

Co-Authored-By: Claude Fable 5 <[email protected]>
…l whitelist

Re-verified against cline 3.0.34: batch runs fine (act mode, subscription
auth honored), but the headless CLI session path never loads MCP servers
— a probe session with a registered dsagt entry exposed zero MCP tools.
Also verified: cline auth state rides with its config dir, so ANY
per-project --config/CLINE_DIR isolation loses a subscription login;
subscription users need a single global MCP entry (the server is
cwd-self-sufficient).  run_script error + smoke SKIP updated to the
current rationale; re-probe on cline upgrades.

Co-Authored-By: Claude Fable 5 <[email protected]>
…never bridge

Deeper probes (spawn watcher, MCP-child cwd/env probe, tool-inventory
enumeration): headless cline spawns registered MCP servers with
cwd = session dir and full shell env, and dsagt-server answers the
initialize handshake in ~1.6s — but the session's model toolset
contains only cline built-ins + team tools, zero MCP entries.  Also
pinpointed the auth coupling: providers.json (subscription/OAuth
creds) lives inside the settings dir, which is why any --config
isolation goes Unauthorized.  A global MCP entry is not viable either:
it makes every cline session everywhere spawn dsagt-server.

Co-Authored-By: Claude Fable 5 <[email protected]>
…e, one format

Parity: every available code — bundled or agent-registered — now lives
in <project>/codes/<name>/ as a self-contained skill-standard dir.
ensure_bundled_copies() runs at dsagt init (never clobbers an existing
dir; re-init after upgrade refreshes untouched copies).  This dissolves
the bundled/project two-layer merge in CodeRegistry (single glob, no
source_tools_dir seam) and the scan_directory.py placement asymmetry:
the script moves inside scan-directory/scripts/ and the executable
becomes an ordinary project-relative path instead of python -m.
reindex_all removed (no production caller).

Verified: claude smoke 18/18 with scan-directory executing via the
project-relative path.  Unit suite: 607 passed.

Co-Authored-By: Claude Fable 5 <[email protected]>
…bal auth + settings untouched

Cline 3.x resolves its MCP settings file and providers.json (auth)
through INDEPENDENT env overrides.  Setting CLINE_MCP_SETTINGS_PATH to
<project>/.cline-data/cline_mcp_settings.json in runtime_env gives all
three at once (verified live): per-project dsagt server spawns, the
user's subscription/codex auth keeps working, and the global
mcpServers list stays empty.

write_dynamic now hand-writes that file directly (the 'only cline mcp
add is loaded' behavior is gone in 3.x; verified a hand-written file
loads) with the env block in transport.env — no cline binary needed at
init, no add+patch dance, preserves user-added entries.  The
auth-clobbering CLINE_DIR runtime isolation is dropped.

Co-Authored-By: Claude Fable 5 <[email protected]>
Non-invasive correctness pass across the use-case walkthroughs (comb-flow-uni
left untouched by request).  Each demo gets an 'Estimated time' header with an
honest runtime + data/credential gate.

Uniform staleness fixed:
- tool→code terminology; save_tool_spec→save_code_spec; tools/<x>.md→
  codes/<x>/SKILL.md (registry is skill-standard dirs now).
- references to removed commands dropped: dsagt mlflow / dsagt stop /
  dsagt setup-kb / dsagt skills are replaced with the current
  init/start/info surface or the MCP-tool equivalents.
- serverless observability: deleted the tokamak OTEL_* export block and
  'dsagt mlflow' server start; point at sqlite:///<project>/mlflow.db and the
  'mlflow ui --backend-store-uri' view.
- BYOA: dropped 'set your API keys in dsagt_config.yaml' edits; agents are
  pre-authenticated. Config path corrected to .dsagt/config.yaml.
- isaac_skills_demo: 'scientific' skill source → 'k-dense-ai' (the real
  KNOWN_SOURCES alias); setup uses 'dsagt init --exclude genesis'.

Per-folder:
- isaac_vasp: rewrote the 12-line stub into a real ~15-min walkthrough that
  registers the bundled NEB converter as a code and runs it via dsagt-run on
  the vendored fixture data; fixed a dangling script ref (ase_slab_db_to_isaac
  → ase_db_to_isaac) in the skill.
- tokamak_stability: README + AGENTS.md + m3dc1-skill swept (filenames like
  m3dc1_tools.py preserved).
- .gitignore: guard mlflow.db + mlruns/ so session trace dumps never land in
  git again.

Deferred (flagged, not done): externalizing isaac_vasp's 32 MB OUTCAR
fixtures — the pymatgen source path 404s and I couldn't verify a fetch URL
without risking a broken demo. comb-flow-uni left entirely as-is.

Co-Authored-By: Claude Fable 5 <[email protected]>
A 12-line plan outline fully superseded by isolate_demo.md (same
walkthrough, more detail).  Referenced nowhere in README/docs.

Co-Authored-By: Claude Fable 5 <[email protected]>
… transcript

- docs/use-cases/*: 'tool registration' → 'code registration' (index synced
  with README), '**Tools:**' header labels → '**Codes:**', and the vasp/cryoem
  overviews reworded (registers codes, not tools; ISAAC record not database).
- Removed use_cases/microbial_isolates/isolate_session.txt — a personal scratch
  transcript with hardcoded paths and the dead two-server launch commands,
  referenced nowhere; same class as the demoplan.md already removed.

Co-Authored-By: Claude Fable 5 <[email protected]>

2 participants