pydantic-ai is the direction we will take, so we want to minimize custom implementation.
Claude helped me work through the design comparing our ArtifactStore (custom implementation) vs pydantic-ai capabilities. Verdict + plan below.
Scope: read-only for skills/artifacts (knowledge inputs), read+write for per-session workspaces (ephemeral local — papers, plots, generated files). Agent-driven writes back to GitHub remain out of scope.
ArtifactStore vs pydantic-ai capabilities — finalized verdict
Scope
In akd-ext, agents need to:
- Read skills/artifacts (knowledge inputs) — bundled with the package and/or hosted on a public GitHub repo. Read-only.
- Read + write a per-session workspace — ephemeral local directory where the agent produces outputs (papers, plots, generated files) the user can inspect at end of chat.
Agent-driven writes back to GitHub (publishing produced artifacts back to a repo) are explicitly out of scope for this plan.
(akd-ext nomenclature: what pydantic-ai-skills calls a "skill" we call an "artifact"; same shape, just naming. We use the terms interchangeably below; on disk both are SKILL.md files with YAML frontmatter.)
Verdict
`ArtifactStore` is fully redundant. Adopt `pydantic-ai-skills` for the read side and `pydantic-ai-backend` for the session-workspace read/write side. Both responsibilities map cleanly onto pydantic-ai capabilities; neither requires custom infrastructure. The work-in-progress on `feature/github-store` (`GitHubArtifactStore`) is paused — neither merged nor deleted — because GitHub access in akd-ext is read-only via `GitSkillsRegistry`, and writes to GitHub are out of scope. Immediate priority: migrate `closed_loop_cm1` agents off their eager system-prompt context injection (12k+ lines of `cm1_readme.md` per call) to `SkillsCapability`'s progressive disclosure, for the token win.
Three needs → two pydantic-ai capabilities
| Agent need | pydantic-ai answer | Custom code |
| --- | --- | --- |
| Read skills/artifacts from local dirs (e.g. `closed_loop_cm1/context/*.md` bundled in the package) | `SkillsCapability(directories=[...])` from `pydantic-ai-skills` | None |
| Read skills/artifacts from a public GitHub repo | `SkillsCapability(registries=[GitSkillsRegistry(repo_url=...)])` — clones, caches, supports auth, version pinning | None |
| Read + write a per-session workspace (papers, plots, generated files) | `ConsoleCapability(backend=LocalBackend(root_dir=session_path), permissions=DEFAULT_RULESET)` from `pydantic-ai-backend` | None (per-session dir lifecycle is already handled by `pydantic-ai-backend`'s `SessionManager`) |
Both `SkillsCapability` configurations can coexist on a single agent. The `ConsoleCapability` composes alongside them — they're orthogonal: skills are loaded knowledge (immutable inputs), the session workspace is generated output (ephemeral, local-only). The agent sees one unified set of tools.
`SkillsCapability` ships read-only progressive disclosure: at construction it injects skill name + description into the system prompt (cheap), then exposes `list_skills` / `load_skill(name)` / `read_skill_resource(...)` / `run_skill_script(...)` tools so the model fetches full content only when relevant.
`ConsoleCapability` ships `ls` / `read` / `write` / `edit` / `glob` / `grep` filtered by permission rulesets (`READONLY` / `DEFAULT` / `STRICT` / `PERMISSIVE`). With `LocalBackend(root_dir=session_path)`, the agent is sandboxed inside the session workspace — it cannot read or write outside of it.
Why ArtifactStore is fully redundant — two facts
Fact 1: A real, present skills use case is burning tokens today
`akd_ext/agents/closed_loop_cm1/capability_feasibility_mapper.py:369-374` (on `feature/pydantic_ai_base_agent`):

```python
cluster_it_context: str = Field(
    default_factory=lambda: (Path(__file__).parent / "context" / "cluster_it.md").read_text(),
)
cm1_readme_context: str = Field(
    default_factory=lambda: (Path(__file__).parent / "context" / "cm1_readme.md").read_text(),
)
```
`_create_agent` (lines 436-442) concatenates both into `agent.instructions` on every construction. `cm1_readme.md` is 12,115 lines — currently injected into the system prompt on every model call. This is precisely the failure mode `SkillsCapability`'s progressive disclosure was designed to fix.
Fact 2: ArtifactStore has zero call sites
Cross-branch grep for `ArtifactStore`, `GitHubArtifact`, `LocalArtifact`: hits only inside `akd_ext/artifacts/` itself. No agent imports it. No tool consumes it. No system-prompt builder uses `__str__()`. It's purely speculative infrastructure.
Dependency assessment (verified, not skimmed)
pydantic-ai-skills (Doug Trajano) — safe to depend on
- 261 ⭐, 21 forks, v0.8.0 released April 21 2026
- 17 issues total, 0 open — every reported issue closed
- 20 PRs merged, including headline features: `SkillsCapability`, `GitSkillsRegistry`, `auto_reload`, generic `SkillsToolset[Any]`
- Recent commits: responsive iteration on real user reports
- Single maintainer — real risk, but the data format (markdown + YAML frontmatter) is fully portable; rebuild cost if abandoned is ~1 day for a homegrown ~150 LOC `AbstractCapability` subclass
pydantic-ai-backend (Vstorm) — safe to depend on
- 83 ⭐, 19 forks, v0.2.5 released April 20 2026
- 3 issues total, 0 open
- Multi-contributor (Kacper Włodarczyk, DEENUU1, ilayu, community PRs) — better bus factor than skills
- Recent activity: docker+daytona session manager, async protocol, sandbox sessions
- Used here only for the local-filesystem `ConsoleCapability` (per-session workspace). No GitHub backend, no external network.
Both are alpha-stage but actively maintained. No abandonment signals.
Action sequence
Prerequisite: Merge feature/pydantic_ai_base_agent
Everything below builds on the pydantic-ai foundation. Until that branch lands on develop (or whatever your integration branch is), the capability work has no place to live. Confirm mergeability and ship it first.
Step 1 — Adopt SkillsCapability for closed_loop_cm1, local-only (priority: token win, ~1 day)
On a new branch off the now-merged pydantic-ai base:
- `pyproject.toml`: add `pydantic-ai-skills>=0.8.0`
- Convert `akd_ext/agents/closed_loop_cm1/context/cluster_it.md` and `cm1_readme.md` into `SKILL.md` format with YAML frontmatter:

```markdown
---
name: cluster-it-infrastructure
description: NCAR/Frontera cluster compute resources, scheduling, storage layout for HPC feasibility analysis.
---
<existing markdown body>
```

Each lives at e.g. `akd_ext/agents/closed_loop_cm1/skills/cluster-it-infrastructure/SKILL.md`.
- In `akd_ext/agents/closed_loop_cm1/capability_feasibility_mapper.py`:
  - Delete the `cluster_it_context` and `cm1_readme_context` config fields (lines 369-376)
  - Delete the `extra +=` concatenation in `_create_agent` (lines 436-442)
  - Add `SkillsCapability(directories=[Path(__file__).parent / "skills"])` to the agent's capability list. Read `_base/pydantic_ai/_capabilities.py` first to find the composition site.
- Apply the same pattern to the peers in `closed_loop_cm1/`: `experiment_implementation.py`, `workflow_spec_builder.py`, `research_report_generator.py`, `interpretation_paper_assembly.py`.
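As a sanity check on the SKILL.md conversion above, the frontmatter can be read back with a few lines of stdlib Python. This is a simplified sketch — a real loader would use a proper YAML parser; this just shows the on-disk shape:

```python
def read_skill_header(skill_md: str) -> dict[str, str]:
    """Minimal SKILL.md frontmatter reader (simplified sketch).

    Splits on the `---` fence pair and parses flat `key: value` lines.
    """
    _, front, _body = skill_md.split("---\n", 2)
    header = {}
    for line in front.splitlines():
        key, _, value = line.partition(":")
        if key.strip():
            header[key.strip()] = value.strip()
    return header

doc = """---
name: cluster-it-infrastructure
description: NCAR/Frontera cluster compute resources.
---
<existing markdown body>
"""
assert read_skill_header(doc)["name"] == "cluster-it-infrastructure"
```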
Verification:
- `uv run pytest tests/agents/test_capability_feasibility_mapper.py -v` (existing tests; may need updates if they assert on the deleted config fields)
- Run a representative query against the agent. Confirm via debug logging that `cm1_readme.md` is not in the initial system prompt, is available via `list_skills`, and that `load_skill("cm1-readme")` returns the content when called
- Compare token counts before/after on a representative input — the win should be visible
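The size of the win can be ballparked before running anything, using the common rule of thumb of roughly four characters per token. The numbers below are illustrative only, assuming ~80 characters per readme line:

```python
def rough_tokens(text: str) -> int:
    # Rule-of-thumb estimate: ~4 characters per token for English prose.
    return max(1, len(text) // 4)

# Before: the whole readme is injected into the system prompt on every call.
eager = rough_tokens("x" * 80 * 12_115)   # ~12k lines at an assumed ~80 chars/line

# After: only a one-line name + description is always present.
lazy = rough_tokens("cm1-readme: CM1 model reference for HPC feasibility analysis.")

assert eager > 1000 * lazy  # the reduction is orders of magnitude
```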
Step 2 — Session-workspace ConsoleCapability (when first agent produces file outputs)
When the first agent needs to write outputs the user wants to inspect (likely research_report_generator or interpretation_paper_assembly):
- `pyproject.toml`: add `pydantic-ai-backend>=0.2.5`
- Decide on the session-workspace lifecycle: per-chat-session temp dir vs. per-user persistent dir vs. `pydantic-ai-backend.SessionManager`. Probably: a small wrapper around `SessionManager` that scopes a workspace dir to the chat session and exposes the path to the agent at run time.
- Wire into agents that produce outputs:

```python
capabilities=[
    SkillsCapability(directories=[...]),  # from Step 1
    ConsoleCapability(
        backend=LocalBackend(root_dir=session_workspace_path),
        permissions=DEFAULT_RULESET,
    ),
]
```
Verification:
- Run an agent that produces a file (e.g. a markdown report) in its workspace. Confirm the file lands at `session_workspace_path` and is accessible to the user post-run.
- Confirm sandboxing: agent attempts to read or write outside `root_dir` should fail.
Step 3 — Add GitSkillsRegistry once a public skills/artifacts repo exists (read-only, ~half day)
When the canonical AKD skills/artifacts repo is set up on GitHub (e.g. NASA-IMPACT/akd-skills or similar), wire it in:
```python
SkillsCapability(
    directories=[Path(__file__).parent / "skills"],  # local fallback
    registries=[GitSkillsRegistry(
        repo_url="https://github.com/NASA-IMPACT/akd-skills",
        path="skills",
        target_dir="./.cache/akd-skills",
        clone_options=GitCloneOptions(depth=1, single_branch=True),
    )],
)
```
Agents pull canonical artifacts from GitHub at startup and use them via the same progressive-disclosure surface as local skills. Read-only — no commit/push semantics.
Verification:
- Tiny `PydanticAIBaseAgent` instantiation against the public repo. Confirm `list_skills` reports the GitHub-hosted artifacts and `load_skill(...)` returns content. Skip CI runs unless network is allowed.
Step 4 — Park feature/github-store
Don't merge. Don't delete. The PyGithub plumbing in `akd_ext/artifacts/stores/github.py:14-210` (SHA-aware writes, fast-path no-op detection, `github_client=` injection pattern) is good code — keep the branch as a parking lot. If a GitHub-write use case ever lands later (e.g. CARE agents publishing produced artifacts back to a repo), this is the implementation foundation to revive. Until then, leave it dormant.
Scope note
This plan covers read-only skills/artifact loading plus read-write session workspaces. Agent-driven writes to GitHub are intentionally out of scope here — defer that decision to a separate plan when a concrete write use case lands.
The mechanical work in Step 1 is well-scoped and ready to execute as a regular task once the pydantic-ai base is merged. Steps 2 and 3 unblock when their respective use cases / infrastructure are ready.