
Skills, File System, Console capabilities via pydantic-ai vs custom artifact store. #88

@NISH1001

pydantic-ai is the direction we will take, so we want to minimize custom implementation.

Claude helped me work through the design comparing our ArtifactStore (custom implementation) vs pydantic-ai capabilities. Verdict + plan below.

Scope: read-only for skills/artifacts (knowledge inputs), read+write for per-session workspaces (ephemeral local — papers, plots, generated files). Agent-driven writes back to GitHub remain out of scope.


ArtifactStore vs pydantic-ai capabilities — finalized verdict

Scope

In akd-ext, agents need to:

  1. Read skills/artifacts (knowledge inputs) — bundled with the package and/or hosted on a public GitHub repo. Read-only.
  2. Read + write a per-session workspace — ephemeral local directory where the agent produces outputs (papers, plots, generated files) the user can inspect at end of chat.

Agent-driven writes back to GitHub (publishing produced artifacts back to a repo) are explicitly out of scope for this plan.

(akd-ext nomenclature: what pydantic-ai-skills calls a "skill" we call an "artifact"; same shape, just naming. We use the terms interchangeably below; on disk both are SKILL.md files with YAML frontmatter.)

Verdict

ArtifactStore is fully redundant. Adopt pydantic-ai-skills for the read side and pydantic-ai-backend for the session-workspace read/write side. Both responsibilities map cleanly onto pydantic-ai capabilities; neither requires custom infrastructure. The work-in-progress on feature/github-store (GitHubArtifactStore) is paused — neither merged nor deleted — because GitHub access in akd-ext is read-only via GitSkillsRegistry, and writes to GitHub are out of scope. Immediate priority: migrate closed_loop_cm1 agents off their eager system-prompt context injection (12k+ lines of cm1_readme.md per call) to SkillsCapability's progressive disclosure, for the token win.

Three needs → two pydantic-ai capabilities

| Agent need | pydantic-ai answer | Custom code |
|---|---|---|
| Read skills/artifacts from local dirs (e.g. closed_loop_cm1/context/*.md bundled in the package) | SkillsCapability(directories=[...]) from pydantic-ai-skills | None |
| Read skills/artifacts from a public GitHub repo | SkillsCapability(registries=[GitSkillsRegistry(repo_url=...)]) — clones, caches, supports auth, version pinning | None |
| Read + write a per-session workspace (papers, plots, generated files) | ConsoleCapability(backend=LocalBackend(root_dir=session_path), permissions=DEFAULT_RULESET) from pydantic-ai-backend | None (per-session dir lifecycle is already handled by pydantic-ai-backend's SessionManager) |

Both SkillsCapability configurations can coexist on a single agent. The ConsoleCapability composes alongside them — they're orthogonal: skills are loaded knowledge (immutable inputs), session workspace is generated output (ephemeral, local-only). The agent sees one unified set of tools.

SkillsCapability ships read-only progressive disclosure: at construction it injects skill name + description into the system prompt (cheap), then exposes list_skills / load_skill(name) / read_skill_resource(...) / run_skill_script(...) tools so the model fetches full content only when relevant.
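The disclosure pattern itself is small enough to sketch in plain Python. The `SkillIndex` class below is hypothetical (stdlib only, not the pydantic-ai-skills API); it only illustrates the split between cheap metadata for the system prompt and on-demand full-content loads:

```python
from pathlib import Path

class SkillIndex:
    """Minimal illustration of progressive disclosure: cheap metadata up
    front, full content only on demand. Hypothetical class, not the
    pydantic-ai-skills API."""

    def __init__(self, directory: Path):
        # Map skill name -> SKILL.md path; only paths are held in memory.
        self._skills = {
            p.parent.name: p for p in sorted(directory.glob("*/SKILL.md"))
        }

    def list_skills(self) -> list[str]:
        # Cheap: names only, suitable for system-prompt injection.
        return list(self._skills)

    def load_skill(self, name: str) -> str:
        # Expensive: full markdown body, fetched only when the model asks.
        return self._skills[name].read_text()
```

The real capability adds `read_skill_resource` / `run_skill_script` on top, but the token economics come entirely from this list-then-load split.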

ConsoleCapability ships ls / read / write / edit / glob / grep filtered by permission rulesets (READONLY / DEFAULT / STRICT / PERMISSIVE). With LocalBackend(root_dir=session_path), the agent is sandboxed inside the session workspace — it cannot read or write outside of it.
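The sandboxing guarantee reduces to a path-containment check on every operation. A sketch of what that check has to do (hypothetical helper, not the pydantic-ai-backend implementation — note the `resolve()` call, which is what defeats `..` traversal and symlink escapes):

```python
from pathlib import Path

def resolve_in_sandbox(root_dir: Path, candidate: str) -> Path:
    """Resolve a candidate path and reject anything escaping root_dir,
    including `..` traversal, absolute paths, and symlinks (handled by
    resolve()). Illustrative sketch of LocalBackend-style sandboxing."""
    root = root_dir.resolve()
    target = (root / candidate).resolve()
    if root != target and root not in target.parents:
        raise PermissionError(f"{candidate!r} escapes workspace {root}")
    return target
```

Every `ls` / `read` / `write` / `edit` call would pass through a check like this before the permission ruleset is even consulted.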

Why ArtifactStore is fully redundant — two facts

Fact 1: A real, present skills use case is burning tokens today

akd_ext/agents/closed_loop_cm1/capability_feasibility_mapper.py:369-374 (on feature/pydantic_ai_base_agent):

cluster_it_context: str = Field(
    default_factory=lambda: (Path(__file__).parent / "context" / "cluster_it.md").read_text(),
)
cm1_readme_context: str = Field(
    default_factory=lambda: (Path(__file__).parent / "context" / "cm1_readme.md").read_text(),
)

_create_agent (lines 436-442) concatenates both into agent.instructions on every construction. cm1_readme.md is 12,115 lines — currently injected into the system prompt on every model call. This is precisely the failure mode SkillsCapability's progressive disclosure was designed to fix.

Fact 2: ArtifactStore has zero call sites

Cross-branch grep for ArtifactStore, GitHubArtifact, LocalArtifact: hits only inside akd_ext/artifacts/ itself. No agent imports it. No tool consumes it. No system prompt builder uses __str__(). It's purely speculative infrastructure.

Dependency assessment (verified, not skimmed)

pydantic-ai-skills (Doug Trajano) — safe to depend on

  • 261 ⭐, 21 forks, v0.8.0 released April 21 2026
  • 17 issues total, 0 open — every reported issue closed
  • 20 PRs merged including headline features: SkillsCapability, GitSkillsRegistry, auto_reload, generic SkillsToolset[Any]
  • Recent commits: responsive iteration on real user reports
  • Single maintainer — real risk, but the data format (markdown + YAML frontmatter) is fully portable; rebuild cost if abandoned is ~1 day for a homegrown ~150 LOC AbstractCapability subclass

pydantic-ai-backend (Vstorm) — safe to depend on

  • 83 ⭐, 19 forks, v0.2.5 released April 20 2026
  • 3 issues total, 0 open
  • Multi-contributor (Kacper Włodarczyk, DEENUU1, ilayu, community PRs) — better bus factor than skills
  • Recent activity: docker+daytona session manager, async protocol, sandbox sessions
  • Used here only for local-filesystem ConsoleCapability (per-session workspace). No GitHub backend, no external network.

Both are alpha-stage but actively maintained. No abandonment signals.

Action sequence

Prerequisite: Merge feature/pydantic_ai_base_agent

Everything below builds on the pydantic-ai foundation. Until that branch lands on develop (or whatever your integration branch is), the capability work has no place to live. Confirm mergeability and ship it first.

Step 1 — Adopt SkillsCapability for closed_loop_cm1, local-only (priority: token win, ~1 day)

On a new branch off the now-merged pydantic-ai base:

  • pyproject.toml: add pydantic-ai-skills>=0.8.0
  • Convert akd_ext/agents/closed_loop_cm1/context/cluster_it.md and cm1_readme.md into SKILL.md format with YAML frontmatter:
    ---
    name: cluster-it-infrastructure
    description: NCAR/Frontera cluster compute resources, scheduling, storage layout for HPC feasibility analysis.
    ---
    
    <existing markdown body>
    
    Each lives at e.g. akd_ext/agents/closed_loop_cm1/skills/cluster-it-infrastructure/SKILL.md.
  • akd_ext/agents/closed_loop_cm1/capability_feasibility_mapper.py:
    • Delete cluster_it_context and cm1_readme_context config fields (lines 369-376)
    • Delete the extra += concatenation in _create_agent (lines 436-442)
    • Add SkillsCapability(directories=[Path(__file__).parent / "skills"]) to the agent's capability list. Read _base/pydantic_ai/_capabilities.py first to find the composition site.
  • Apply the same pattern to peers in closed_loop_cm1/: experiment_implementation.py, workflow_spec_builder.py, research_report_generator.py, interpretation_paper_assembly.py.
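The conversion in the second bullet is mechanical enough to script. A sketch (hypothetical helper; the name/description values passed in are illustrative — the real ones come from the frontmatter example above):

```python
from pathlib import Path

def wrap_as_skill(src_md: Path, skills_root: Path,
                  name: str, description: str) -> Path:
    """Wrap an existing context .md file in SKILL.md format: YAML
    frontmatter (name + description) followed by the untouched body.
    Illustrative one-off migration helper, not part of pydantic-ai-skills."""
    skill_dir = skills_root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    out = skill_dir / "SKILL.md"
    frontmatter = f"---\nname: {name}\ndescription: {description}\n---\n\n"
    out.write_text(frontmatter + src_md.read_text())
    return out
```

Running it once per context file yields the `skills/<name>/SKILL.md` layout SkillsCapability expects, with the markdown bodies unchanged.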

Verification:

  • uv run pytest tests/agents/test_capability_feasibility_mapper.py -v (existing tests; may need updates if they assert on the deleted config fields)
  • Run a representative query against the agent. Confirm via debug logging that cm1_readme.md is not in the initial system prompt, is available via list_skills, and that load_skill("cm1-readme") returns the content when called
  • Compare token counts before/after on a representative input — the win should be visible

Step 2 — Session-workspace ConsoleCapability (when first agent produces file outputs)

When the first agent needs to write outputs the user wants to inspect (likely research_report_generator or interpretation_paper_assembly):

  • pyproject.toml: add pydantic-ai-backend>=0.2.5
  • Decide on session-workspace lifecycle: per-chat-session temp dir vs. per-user persistent dir vs. pydantic-ai-backend's SessionManager. Probably: a small wrapper around SessionManager that scopes a workspace dir to the chat session and exposes the path to the agent at run time.
  • Wire into agents that produce outputs:
    capabilities=[
        SkillsCapability(directories=[...]),   # from Step 1
        ConsoleCapability(
            backend=LocalBackend(root_dir=session_workspace_path),
            permissions=DEFAULT_RULESET,
        ),
    ]
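One possible shape for the lifecycle wrapper mentioned above, using a stdlib temp dir and assuming per-chat-session scoping (`SessionWorkspace` is a hypothetical name, not pydantic-ai-backend's SessionManager):

```python
import shutil
import tempfile
from pathlib import Path

class SessionWorkspace:
    """Scope an ephemeral workspace dir to one chat session. The path would
    be handed to ConsoleCapability's LocalBackend at agent run time; cleanup
    happens when the session ends. Hypothetical wrapper, illustration only."""

    def __init__(self, session_id: str):
        self.path = Path(tempfile.mkdtemp(prefix=f"akd-session-{session_id}-"))

    def close(self, keep: bool = False) -> None:
        # keep=True leaves the dir in place so the user can inspect outputs
        # (papers, plots, generated files) after the chat ends.
        if not keep:
            shutil.rmtree(self.path, ignore_errors=True)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```

Whether `keep` defaults to True (user inspects outputs) or False (truly ephemeral) is exactly the lifecycle decision flagged in the bullet above.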

Verification:

  • Run an agent that produces a file (e.g. a markdown report) in its workspace. Confirm the file lands at session_workspace_path and is accessible to the user post-run.
  • Confirm sandboxing: agent attempts to read/write outside root_dir should fail.

Step 3 — Add GitSkillsRegistry once a public skills/artifacts repo exists (read-only, ~half day)

When the canonical AKD skills/artifacts repo is set up on GitHub (e.g. NASA-IMPACT/akd-skills or similar), wire it in:

SkillsCapability(
    directories=[Path(__file__).parent / "skills"],   # local fallback
    registries=[GitSkillsRegistry(
        repo_url="https://github.com/NASA-IMPACT/akd-skills",
        path="skills",
        target_dir="./.cache/akd-skills",
        clone_options=GitCloneOptions(depth=1, single_branch=True),
    )],
)

Agents pull canonical artifacts from GitHub at startup and use them via the same progressive-disclosure surface as local skills. Read-only — no commit/push semantics.
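What `depth=1` / `single_branch` buy can be sketched as the git invocation a read-only registry cache might issue (hypothetical helper; it only builds the command, no network, and GitSkillsRegistry's actual behavior may differ):

```python
from pathlib import Path

def git_fetch_command(repo_url: str, target_dir: str, depth: int = 1) -> list[str]:
    """Build the git command a read-only skills cache might run: a shallow,
    single-branch clone on first use, a fast-forward pull on cache hits.
    Illustrative sketch, not the GitSkillsRegistry implementation."""
    target = Path(target_dir)
    if (target / ".git").exists():
        # Cache hit: refresh in place. Read-only semantics — never push.
        return ["git", "-C", str(target), "pull", "--ff-only"]
    # First use: shallow clone keeps the cache small and startup fast.
    return ["git", "clone", "--depth", str(depth), "--single-branch",
            repo_url, str(target)]
```

The shallow clone is what keeps agent startup cheap even as the canonical skills repo accumulates history.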

Verification:

  • Tiny PydanticAIBaseAgent instantiation against the public repo. Confirm list_skills reports the GitHub-hosted artifacts, load_skill(...) returns content. Skip CI runs unless network is allowed.

Step 4 — Park feature/github-store

Don't merge. Don't delete. The PyGithub plumbing in akd_ext/artifacts/stores/github.py:14-210 (sha-aware writes, fast-path no-op detection, github_client= injection pattern) is good code — keep the branch as a parking lot. If a GitHub-write use case ever lands later (e.g. CARE agents publishing produced artifacts back to a repo), this is the implementation foundation to revive. Until then, leave it dormant.

Scope note

This plan covers read-only skills/artifact loading plus read-write session workspaces. Agent-driven writes to GitHub are intentionally out of scope here — defer that decision to a separate plan when a concrete write use case lands.

The mechanical work in Step 1 is well-scoped and ready to execute as a regular task once the pydantic-ai base is merged. Steps 2 and 3 unblock when their respective use cases / infrastructure are ready.
