feat(appkit): agents() plugin, createAgent(def), and markdown-driven agents #304
MarioCadenas wants to merge 10 commits into agent/v2/3-plugin-infra from
Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.
### Reference application: agent-app
`apps/agent-app/` — a standalone app purpose-built around the agents
feature. Ships with:
- `server.ts` — full example of code-defined agents via `fromPlugin`:
  ```ts
  const support = createAgent({
    instructions: "…",
    tools: {
      ...fromPlugin(analytics),
      ...fromPlugin(files),
      get_weather,
      "mcp.vector-search": mcpServer("vector-search", "https://…"),
    },
  });
  await createApp({
    plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
  });
  ```
- `config/agents/assistant.md` — markdown-driven agent alongside the
code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and
deploy scripts.
### dev-playground chat UI + demo agent
`apps/dev-playground/client/src/routes/agent.route.tsx` — chat UI with
inline autocomplete (hits the `autocomplete` markdown agent) and a
full threaded conversation panel (hits the default agent).
`apps/dev-playground/server/index.ts` — adds a code-defined `helper`
agent using `fromPlugin(analytics)` alongside the markdown-driven
`autocomplete` agent in `config/agents/`. Exercises the mixed-style
setup (markdown + code) against the same plugin list.
`apps/dev-playground/config/agents/*.md` — both agents defined with
valid YAML frontmatter.
### Docs
`docs/docs/plugins/agents.md` — progressive five-level guide:
1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).
Plus a configuration reference, runtime API reference, and frontmatter
schema table.
`docs/docs/api/appkit/` — regenerated typedoc for the new public
surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig,
ToolkitEntry, ToolkitOptions, all adapter types, and the agents
plugin factory).
### Template
`template/appkit.plugins.json` — adds the `agent` plugin entry so
`npx @databricks/appkit init --features agent` scaffolds the plugin
correctly.
### Test plan
- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint
clean
Signed-off-by: MarioCadenas <[email protected]>
### Zero-trust MCP host policy documentation (S1 security)
Documents the new `mcp` configuration block and the rules it enforces:
same-origin-only by default, explicit `trustedHosts` for external MCP
servers, plaintext `http://` refused outside localhost-in-dev, and
DNS-level blocking of private / link-local IP ranges (covers cloud
metadata services). See PR #302 for the policy implementation and
PR #304 for the `AgentsPluginConfig.mcp` wiring.
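
To make the four rules concrete, here is a minimal sketch of how such a policy check could be ordered. It is not the implementation from PR #302: `McpPolicy`, `checkMcpUrl`, `isPrivateIPv4`, and the `appOrigin` / `isDev` fields are hypothetical names, and the sketch only inspects IPv4 literals in the URL, whereas the real policy is described as blocking at the DNS level (after resolution).

```typescript
// Hypothetical sketch of the documented rules; names and shapes are assumptions.
interface McpPolicy {
  appOrigin: string; // origin the app itself is served from
  trustedHosts: string[]; // explicitly allowed external hostnames
  isDev: boolean; // whether plaintext localhost URLs are tolerated
}

type Verdict = { allowed: true } | { allowed: false; reason: string };

// True for IPv4 literals in private, loopback, or link-local ranges.
function isPrivateIPv4(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254) // link-local, where cloud metadata services live
  );
}

function checkMcpUrl(raw: string, policy: McpPolicy): Verdict {
  const url = new URL(raw);
  const isLocalhost = url.hostname === "localhost" || url.hostname === "127.0.0.1";

  // Rule: plaintext http:// is refused unless it is localhost in dev.
  if (url.protocol === "http:") {
    return policy.isDev && isLocalhost
      ? { allowed: true }
      : { allowed: false, reason: "plaintext http refused" };
  }
  if (url.protocol !== "https:") {
    return { allowed: false, reason: "unsupported scheme" };
  }
  // Rule: private and link-local addresses are blocked (literal check only here).
  if (isPrivateIPv4(url.hostname)) {
    return { allowed: false, reason: "private or link-local address" };
  }
  // Rule: same-origin is allowed by default.
  if (url.origin === new URL(policy.appOrigin).origin) {
    return { allowed: true };
  }
  // Rule: anything else must be listed in trustedHosts.
  if (policy.trustedHosts.indexOf(url.hostname) !== -1) {
    return { allowed: true };
  }
  return { allowed: false, reason: "host not trusted" };
}
```

The scheme check runs first so that an `http://` URL to a trusted host is still refused; the allowlist is consulted last so that it can never override the address-range block.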
Signed-off-by: MarioCadenas <[email protected]>
### HITL approval UI + SQL safety docs (S2 security, Layer 1-3)
- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection
covering `analytics.query` readOnly enforcement, `lakebase.query`
opt-in via `exposeAsAgentTool`, and the approval flow. New
"Human-in-the-loop approval for destructive tools" subsection
documents the config, SSE event shape, and `POST /chat/approve`
contract.
- `apps/agent-app`: approval-card component rendered inline in the
chat stream whenever an `appkit.approval_pending` event arrives.
Destructive badge + Approve/Deny buttons POST to
`/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent
route, using the existing appkit-ui `Button` component and
Tailwind utility classes.
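
As an illustration of the client side of this flow, the sketch below recognises a pending-approval event and builds the POST that an Approve or Deny button would send. Only the event name, the `/api/agent/approve` path, and the `streamId` / `approvalId` fields come from the description above; the `type` discriminator, the `decision` field, and the helper names are assumptions, not the documented contract.

```typescript
// Assumed event payload: only streamId and approvalId are named in the source.
interface ApprovalPendingEvent {
  type: "appkit.approval_pending";
  streamId: string;
  approvalId: string;
}

type Decision = "approve" | "deny"; // hypothetical wire values

interface ApprovalRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Narrow an already-parsed SSE payload to the approval event.
function isApprovalPending(value: unknown): value is ApprovalPendingEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v["type"] === "appkit.approval_pending" &&
    typeof v["streamId"] === "string" &&
    typeof v["approvalId"] === "string"
  );
}

// Build the request an Approve/Deny button would send.
function buildApprovalRequest(
  event: ApprovalPendingEvent,
  decision: Decision,
): ApprovalRequest {
  return {
    url: "/api/agent/approve",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Echo back the identifiers carried by the event so the server can
    // match the decision to the paused tool call.
    body: JSON.stringify({
      streamId: event.streamId,
      approvalId: event.approvalId,
      decision,
    }),
  };
}
```

A button handler could then call `fetch(req.url, req)` with the returned object; keeping the request construction pure makes it easy to unit-test without a running server.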
Signed-off-by: MarioCadenas <[email protected]>
### Auto-inherit posture documentation (S3 security, Layer 3)
Updates `docs/docs/plugins/agents.md` to document the new
two-key auto-inherit model introduced in PR #302 (per-tool
`autoInheritable` flag) and PR #304 (safe-by-default
`autoInheritTools: { file: false, code: false }`). Adds an
"Auto-inherit posture" subsection explaining that the developer
must opt into `autoInheritTools` AND the plugin author must mark
each tool `autoInheritable: true` for a tool to spread without
explicit wiring.
Includes a table documenting the `autoInheritable` marking on each
core plugin tool, plus an example of the setup-time audit log so
operators can see exactly what's inherited vs. skipped.
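
The two-key rule can be sketched as a single AND over both flags. This is an illustration only: `PluginTool`, `partitionInherited`, and the tool names in the usage note are hypothetical, and the mapping of `file` to markdown-defined agents and `code` to code-defined agents is inferred from the surrounding description.

```typescript
type AgentOrigin = "file" | "code";

interface AutoInheritTools {
  file: boolean;
  code: boolean;
}

// Hypothetical tool shape: only the autoInheritable flag is from the source.
interface PluginTool {
  name: string;
  autoInheritable?: boolean;
}

// Safe-by-default posture described above.
const SAFE_DEFAULT: AutoInheritTools = { file: false, code: false };

// Split tools into those an agent inherits without explicit wiring and
// those it skips, which is also the information an audit log would report.
function partitionInherited(
  tools: PluginTool[],
  origin: AgentOrigin,
  config: AutoInheritTools = SAFE_DEFAULT,
): { inherited: string[]; skipped: string[] } {
  const inherited: string[] = [];
  const skipped: string[] = [];
  for (const tool of tools) {
    // Key 1: the developer opted in for this agent origin.
    // Key 2: the plugin author marked the tool as auto-inheritable.
    const ok = config[origin] && tool.autoInheritable === true;
    (ok ? inherited : skipped).push(tool.name);
  }
  return { inherited, skipped };
}
```

With `{ file: true, code: false }`, a markdown agent would inherit a tool marked `autoInheritable: true` and skip an unmarked one, while a code-defined agent would inherit nothing until it wires tools explicitly.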
Signed-off-by: MarioCadenas <[email protected]>
- **Reference app no longer ships hardcoded dogfood URLs.** The three
`https://e2-dogfood.staging.cloud.databricks.com/...` and
`https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP
URLs in `apps/agent-app/server.ts` are replaced with optional
env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When
set, their hostnames are auto-added to `agents({ mcp: { trustedHosts
} })`. `.env.example` uses placeholder values the reader can replace
instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The
prior `appkit.agent as { list, getDefault }` cast papered over the
plugin-name mismatch fixed in PR #304. The runtime key now matches
the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the
defaults flipped to `{ file: false, code: false }` (PR #304, S-3),
the reference now explicitly enables `autoInheritTools: { file:
true }` so the markdown agents that ship alongside the code-defined
one still pick up the analytics / files read-only tools. This is the
pattern a real deployment should follow — opt in deliberately.
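
The first bullet's env-driven wiring might look roughly like the sketch below. The two variable names come from the text above; `trustedHostsFromEnv` is a hypothetical helper, and the environment is passed in as a plain object so the function stays testable.

```typescript
// Variable names taken from the description above.
const MCP_URL_VARS = ["VECTOR_SEARCH_MCP_URL", "CUSTOM_MCP_URL"];

// Derive the trustedHosts list from whichever optional MCP URLs are set.
function trustedHostsFromEnv(env: Record<string, string | undefined>): string[] {
  const hosts: string[] = [];
  for (const key of MCP_URL_VARS) {
    const value = env[key];
    if (!value) continue; // both variables are optional
    const host = new URL(value).hostname; // hostname excludes the port
    if (hosts.indexOf(host) === -1) hosts.push(host);
  }
  return hosts;
}
```

In a Node server the result could be passed as `agents({ mcp: { trustedHosts: trustedHostsFromEnv(process.env) } })`, so an unset variable adds nothing to the allowlist and a malformed URL fails fast at startup.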
Signed-off-by: MarioCadenas <[email protected]>
- `apps/dev-playground/config/agents/autocomplete.md` sets
`ephemeral: true`. Each debounced autocomplete keystroke no longer
leaves an orphan thread in `InMemoryThreadStore` — the server now
deletes the thread in the stream's `finally` (PR #304). Closes R1
from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral`
frontmatter key alongside the other AgentDefinition knobs.
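
The cleanup pattern behind `ephemeral` can be sketched as follows. This is a simplified, synchronous stand-in: `ThreadStore`, `createStore`, and `streamOnce` are hypothetical, the real `InMemoryThreadStore` API is not shown in this PR description, and the actual stream is presumably asynchronous.

```typescript
// Minimal stand-in for a thread store; not the real InMemoryThreadStore API.
interface ThreadStore {
  create(): string;
  delete(id: string): void;
  size(): number;
}

function createStore(): ThreadStore {
  const threads: Record<string, true> = {};
  let next = 0;
  return {
    create() {
      const id = "t" + next++;
      threads[id] = true;
      return id;
    },
    delete(id) {
      delete threads[id];
    },
    size() {
      return Object.keys(threads).length;
    },
  };
}

// Stream one reply; ephemeral agents drop their thread when the stream ends.
function streamOnce(
  store: ThreadStore,
  ephemeral: boolean,
  chunks: string[],
  emit: (chunk: string) => void,
): void {
  const threadId = store.create();
  try {
    for (const chunk of chunks) emit(chunk);
  } finally {
    // Runs whether the loop completes or emit throws midway.
    if (ephemeral) store.delete(threadId);
  }
}
```

Putting the delete in `finally` rather than after the loop is what keeps failed or interrupted autocomplete requests from leaving threads behind.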
Signed-off-by: MarioCadenas <[email protected]>
Documents the MVP resource caps landed in PR #304: the static
request-body caps (enforced by the Zod schemas) and the three
configurable runtime limits (`maxConcurrentStreamsPerUser`,
`maxToolCalls`, `maxSubAgentDepth`). Includes the config-block
shape in the main reference and a new "Resource limits" subsection
under the Configuration section explaining the intent and per-user
semantics of each cap.
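
To illustrate the per-user semantics, here is a minimal sketch of a gate for the concurrent-stream cap. The three limit names are from the paragraph above; the `AgentLimits` grouping, the `StreamGate` class, and the numeric values used with it are hypothetical, since neither the config nesting nor the defaults are stated here.

```typescript
// Limit names from the description; where they sit in the config is an assumption.
interface AgentLimits {
  maxConcurrentStreamsPerUser: number;
  maxToolCalls: number;
  maxSubAgentDepth: number;
}

// Tracks open streams per user and refuses new ones past the cap.
class StreamGate {
  private limits: AgentLimits;
  private active: Record<string, number> = {};

  constructor(limits: AgentLimits) {
    this.limits = limits;
  }

  // Returns false when the user is already at the cap.
  tryAcquire(userId: string): boolean {
    const open = this.active[userId] || 0;
    if (open >= this.limits.maxConcurrentStreamsPerUser) return false;
    this.active[userId] = open + 1;
    return true;
  }

  // Call when a stream ends, ideally from a finally block.
  release(userId: string): void {
    const open = this.active[userId] || 0;
    if (open <= 1) delete this.active[userId];
    else this.active[userId] = open - 1;
  }
}
```

Because the counter is keyed by user, one user hitting the cap does not block anyone else; `maxToolCalls` and `maxSubAgentDepth` would be enforced separately inside a single run.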
Signed-off-by: MarioCadenas <[email protected]>
863439e to
cca914f
Compare
a6567bc to
5d0fae2
Compare
…template
Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.
`apps/agent-app/` — a standalone app purpose-built around the agents
feature. Ships with:
- `server.ts` — full example of code-defined agents via `fromPlugin`:
```ts
const support = createAgent({
instructions: "…",
tools: {
...fromPlugin(analytics),
...fromPlugin(files),
get_weather,
"mcp.vector-search": mcpServer("vector-search", "https://…"),
},
});
await createApp({
plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
});
```
- `config/agents/assistant.md` — markdown-driven agent alongside the
code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and
deploy scripts.
`apps/dev-playground/client/src/routes/agent.route.tsx` — chat UI with
inline autocomplete (hits the `autocomplete` markdown agent) and a
full threaded conversation panel (hits the default agent).
`apps/dev-playground/server/index.ts` — adds a code-defined `helper`
agent using `fromPlugin(analytics)` alongside the markdown-driven
`autocomplete` agent in `config/agents/`. Exercises the mixed-style
setup (markdown + code) against the same plugin list.
`apps/dev-playground/config/agents/*.md` — both agents defined with
valid YAML frontmatter.
`docs/docs/plugins/agents.md` — progressive five-level guide:
1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).
Plus a configuration reference, runtime API reference, and frontmatter
schema table.
`docs/docs/api/appkit/` — regenerated typedoc for the new public
surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig,
ToolkitEntry, ToolkitOptions, all adapter types, and the agents
plugin factory).
`template/appkit.plugins.json` — adds the `agent` plugin entry so
`npx @databricks/appkit init --features agent` scaffolds the plugin
correctly.
- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint
clean
Signed-off-by: MarioCadenas <[email protected]>
Documents the new `mcp` configuration block and the rules it enforces:
same-origin-only by default, explicit `trustedHosts` for external MCP
servers, plaintext `http://` refused outside localhost-in-dev, and
DNS-level blocking of private / link-local IP ranges (covers cloud
metadata services). See PR #302 for the policy implementation and
PR #304 for the `AgentsPluginConfig.mcp` wiring.
Signed-off-by: MarioCadenas <[email protected]>
- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection
covering `analytics.query` readOnly enforcement, `lakebase.query`
opt-in via `exposeAsAgentTool`, and the approval flow. New
"Human-in-the-loop approval for destructive tools" subsection
documents the config, SSE event shape, and `POST /chat/approve`
contract.
- `apps/agent-app`: approval-card component rendered inline in the
chat stream whenever an `appkit.approval_pending` event arrives.
Destructive badge + Approve/Deny buttons POST to
`/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent
route, using the existing appkit-ui `Button` component and
Tailwind utility classes.
Signed-off-by: MarioCadenas <[email protected]>
Updates `docs/docs/plugins/agents.md` to document the new
two-key auto-inherit model introduced in PR #302 (per-tool
`autoInheritable` flag) and PR #304 (safe-by-default
`autoInheritTools: { file: false, code: false }`). Adds an
"Auto-inherit posture" subsection explaining that the developer
must opt into `autoInheritTools` AND the plugin author must mark
each tool `autoInheritable: true` for a tool to spread without
explicit wiring.
Includes a table documenting the `autoInheritable` marking on each
core plugin tool, plus an example of the setup-time audit log so
operators can see exactly what's inherited vs. skipped.
Signed-off-by: MarioCadenas <[email protected]>
- **Reference app no longer ships hardcoded dogfood URLs.** The three
`https://e2-dogfood.staging.cloud.databricks.com/...` and
`https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP
URLs in `apps/agent-app/server.ts` are replaced with optional
env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When
set, their hostnames are auto-added to `agents({ mcp: { trustedHosts
} })`. `.env.example` uses placeholder values the reader can replace
instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The
prior `appkit.agent as { list, getDefault }` cast papered over the
plugin-name mismatch fixed in PR #304. The runtime key now matches
the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the
defaults flipped to `{ file: false, code: false }` (PR #304, S-3),
the reference now explicitly enables `autoInheritTools: { file:
true }` so the markdown agents that ship alongside the code-defined
one still pick up the analytics / files read-only tools. This is the
pattern a real deployment should follow — opt in deliberately.
Signed-off-by: MarioCadenas <[email protected]>
- `apps/dev-playground/config/agents/autocomplete.md` sets
`ephemeral: true`. Each debounced autocomplete keystroke no longer
leaves an orphan thread in `InMemoryThreadStore` — the server now
deletes the thread in the stream's `finally` (PR #304). Closes R1
from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral`
frontmatter key alongside the other AgentDefinition knobs.
Signed-off-by: MarioCadenas <[email protected]>
Documents the MVP resource caps landed in PR #304: the static
request-body caps (enforced by the Zod schemas) and the three
configurable runtime limits (`maxConcurrentStreamsPerUser`,
`maxToolCalls`, `maxSubAgentDepth`). Includes the config-block
shape in the main reference and a new "Resource limits" subsection
under the Configuration section explaining the intent and per-user
semantics of each cap.
Signed-off-by: MarioCadenas <[email protected]>
5d0fae2 to
af9b6ee
Compare
cca914f to
a3d2cc6
Compare
…template
Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.
`apps/agent-app/` — a standalone app purpose-built around the agents
feature. Ships with:
- `server.ts` — full example of code-defined agents via `fromPlugin`:
```ts
const support = createAgent({
instructions: "…",
tools: {
...fromPlugin(analytics),
...fromPlugin(files),
get_weather,
"mcp.vector-search": mcpServer("vector-search", "https://…"),
},
});
await createApp({
plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
});
```
- `config/agents/assistant.md` — markdown-driven agent alongside the
code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and
deploy scripts.
`apps/dev-playground/client/src/routes/agent.route.tsx` — chat UI with
inline autocomplete (hits the `autocomplete` markdown agent) and a
full threaded conversation panel (hits the default agent).
`apps/dev-playground/server/index.ts` — adds a code-defined `helper`
agent using `fromPlugin(analytics)` alongside the markdown-driven
`autocomplete` agent in `config/agents/`. Exercises the mixed-style
setup (markdown + code) against the same plugin list.
`apps/dev-playground/config/agents/*.md` — both agents defined with
valid YAML frontmatter.
`docs/docs/plugins/agents.md` — progressive five-level guide:
1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).
Plus a configuration reference, runtime API reference, and frontmatter
schema table.
`docs/docs/api/appkit/` — regenerated typedoc for the new public
surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig,
ToolkitEntry, ToolkitOptions, all adapter types, and the agents
plugin factory).
`template/appkit.plugins.json` — adds the `agent` plugin entry so
`npx @databricks/appkit init --features agent` scaffolds the plugin
correctly.
- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint
clean
Signed-off-by: MarioCadenas <[email protected]>
Documents the new `mcp` configuration block and the rules it enforces:
same-origin-only by default, explicit `trustedHosts` for external MCP
servers, plaintext `http://` refused outside localhost-in-dev, and
DNS-level blocking of private / link-local IP ranges (covers cloud
metadata services). See PR #302 for the policy implementation and
PR #304 for the `AgentsPluginConfig.mcp` wiring.
Signed-off-by: MarioCadenas <[email protected]>
- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection
covering `analytics.query` readOnly enforcement, `lakebase.query`
opt-in via `exposeAsAgentTool`, and the approval flow. New
"Human-in-the-loop approval for destructive tools" subsection
documents the config, SSE event shape, and `POST /chat/approve`
contract.
- `apps/agent-app`: approval-card component rendered inline in the
chat stream whenever an `appkit.approval_pending` event arrives.
Destructive badge + Approve/Deny buttons POST to
`/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent
route, using the existing appkit-ui `Button` component and
Tailwind utility classes.
Signed-off-by: MarioCadenas <[email protected]>
Updates `docs/docs/plugins/agents.md` to document the new
two-key auto-inherit model introduced in PR #302 (per-tool
`autoInheritable` flag) and PR #304 (safe-by-default
`autoInheritTools: { file: false, code: false }`). Adds an
"Auto-inherit posture" subsection explaining that the developer
must opt into `autoInheritTools` AND the plugin author must mark
each tool `autoInheritable: true` for a tool to spread without
explicit wiring.
Includes a table documenting the `autoInheritable` marking on each
core plugin tool, plus an example of the setup-time audit log so
operators can see exactly what's inherited vs. skipped.
Signed-off-by: MarioCadenas <[email protected]>
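The two-key model described in the commit above can be sketched as a small predicate. This is only an illustration: the `AgentOrigin`, `ToolMeta`, and `inheritedTools` names are hypothetical and not taken from the AppKit source, and the example tool name `files.delete` is invented. Only the `autoInheritTools` defaults and the `autoInheritable` flag come from the commit message.

```typescript
// Hypothetical shapes for illustration; the real AppKit types may differ.
type AgentOrigin = "file" | "code";

interface AutoInheritConfig {
  file: boolean; // applies to markdown-driven agents
  code: boolean; // applies to code-defined agents
}

interface ToolMeta {
  name: string;
  autoInheritable?: boolean; // set by the plugin author
}

// Safe-by-default posture described in the commit message.
const DEFAULT_AUTO_INHERIT: AutoInheritConfig = { file: false, code: false };

// A tool spreads only when BOTH keys are turned: the developer opted in for
// this agent origin AND the plugin author marked the tool auto-inheritable.
function inheritedTools(
  origin: AgentOrigin,
  tools: ToolMeta[],
  config: AutoInheritConfig = DEFAULT_AUTO_INHERIT,
): { inherited: string[]; skipped: string[] } {
  const inherited: string[] = [];
  const skipped: string[] = [];
  for (const tool of tools) {
    const allowed = config[origin] && tool.autoInheritable === true;
    (allowed ? inherited : skipped).push(tool.name);
  }
  return { inherited, skipped };
}
```

Returning both lists mirrors the idea of an audit log: an operator can see what was inherited and what was skipped.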
- **Reference app no longer ships hardcoded dogfood URLs.** The three
`https://e2-dogfood.staging.cloud.databricks.com/...` and
`https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP
URLs in `apps/agent-app/server.ts` are replaced with optional
env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When
set, their hostnames are auto-added to `agents({ mcp: { trustedHosts
} })`. `.env.example` uses placeholder values the reader can replace
instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The
prior `appkit.agent as { list, getDefault }` cast papered over the
plugin-name mismatch fixed in PR #304. The runtime key now matches
the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the
defaults flipped to `{ file: false, code: false }` (PR #304, S-3),
the reference now explicitly enables `autoInheritTools: { file:
true }` so the markdown agents that ship alongside the code-defined
one still pick up the analytics / files read-only tools. This is the
pattern a real deployment should follow — opt in deliberately.
Signed-off-by: MarioCadenas <[email protected]>
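A minimal sketch of how env-driven MCP URLs could be turned into a `trustedHosts` list, assuming the two variable names from the commit above. The `trustedHostsFromEnv` helper is hypothetical and is not the reference app's actual code.

```typescript
// Env var names come from the commit message; the helper itself is hypothetical.
type Env = Record<string, string | undefined>;

const MCP_URL_VARS = ["VECTOR_SEARCH_MCP_URL", "CUSTOM_MCP_URL"] as const;

function trustedHostsFromEnv(env: Env): string[] {
  const hosts = new Set<string>();
  for (const name of MCP_URL_VARS) {
    const raw = env[name];
    if (!raw) continue; // optional: unset means no external MCP server
    hosts.add(new URL(raw).hostname); // throws on a malformed URL
  }
  return Array.from(hosts);
}
```

The result would be passed as `agents({ mcp: { trustedHosts } })`, so a host is only trusted when an operator has deliberately set the corresponding variable.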
- `apps/dev-playground/config/agents/autocomplete.md` sets
`ephemeral: true`. Each debounced autocomplete keystroke no longer
leaves an orphan thread in `InMemoryThreadStore` — the server now
deletes the thread in the stream's `finally` (PR #304). Closes R1
from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral`
frontmatter key alongside the other AgentDefinition knobs.
Signed-off-by: MarioCadenas <[email protected]>
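The cleanup-in-`finally` behaviour for ephemeral agents can be illustrated with a toy store. Everything here is an assumption for the sake of the example: `TinyThreadStore` is not `InMemoryThreadStore`, and a synchronous generator stands in for the real asynchronous stream.

```typescript
// Hypothetical minimal thread store; AppKit's InMemoryThreadStore API may differ.
class TinyThreadStore {
  private threads = new Map<string, string[]>();
  create(id: string): void {
    this.threads.set(id, []);
  }
  append(id: string, message: string): void {
    this.threads.get(id)?.push(message);
  }
  delete(id: string): void {
    this.threads.delete(id);
  }
  has(id: string): boolean {
    return this.threads.has(id);
  }
}

// Sync generator for brevity; the real stream is asynchronous.
function* streamReply(
  store: TinyThreadStore,
  threadId: string,
  chunks: string[],
  ephemeral: boolean,
): Generator<string, void, undefined> {
  store.create(threadId);
  try {
    for (const chunk of chunks) {
      store.append(threadId, chunk);
      yield chunk;
    }
  } finally {
    // Runs on normal completion, on error, and when the consumer stops early.
    if (ephemeral) store.delete(threadId);
  }
}
```

Because the deletion sits in `finally`, an aborted autocomplete request cleans up just like a completed one.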
Documents the MVP resource caps landed in PR #304: the static
request-body caps (enforced by the Zod schemas) and the three
configurable runtime limits (`maxConcurrentStreamsPerUser`,
`maxToolCalls`, `maxSubAgentDepth`). Includes the config-block
shape in the main reference and a new "Resource limits" subsection
under the Configuration section explaining the intent and per-user
semantics of each cap.
Signed-off-by: MarioCadenas <[email protected]>
…agents
The main product layer. Turns an AppKit app into an AI-agent host with
markdown-driven agent discovery, code-defined agents, sub-agents, and
a standalone run-without-HTTP executor.
Agent runtime files land in core/agent/ from day one:
core/agent/create-agent.ts — createAgent() definition factory
core/agent/run-agent.ts — standalone adapter loop (no HTTP)
core/agent/load-agents.ts — markdown agent discovery
core/agent/system-prompt.ts — base system prompt + composition
core/agent/types.ts — updated with AgentDefinition,
AgentsPluginConfig, RegisteredAgent, etc.
HTTP-facing concerns stay in plugins/agents/:
agents.ts, thread-store.ts, tool-approval-gate.ts,
event-channel.ts, event-translator.ts, schemas.ts,
defaults.ts, manifest.json
Tool-agnostic guidelines instead of SQL/files-specific defaults; accept full PromptContext in buildBaseSystemPrompt for parity with custom callbacks.
Signed-off-by: MarioCadenas <[email protected]>
Register DATABRICKS_SERVING_ENDPOINT_NAME as optional CAN_QUERY so apps using Databricks-hosted agent models get resource wiring; optional when agents use only external adapters. Sync template/appkit.plugins.json.
Signed-off-by: MarioCadenas <[email protected]>
Align optional serving resource with `DatabricksAdapter.fromModelServing()`, which reads `DATABRICKS_AGENT_ENDPOINT`, not `DATABRICKS_SERVING_ENDPOINT_NAME` (serving plugin). Sync template.
Signed-off-by: MarioCadenas <[email protected]>
BREAKING CHANGE: top-level config/agents/*.md is no longer loaded. Use <agentId>/agent.md. The skills directory name is reserved and skipped. Orphan top-level .md files error at load; subdirs without agent.md error. Export agentIdFromMarkdownPath for path-based id resolution.
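A rough sketch of the folder-based id rule from the breaking change above. The function name `agentIdFromRelativePath` is hypothetical (the exported `agentIdFromMarkdownPath` may behave differently), and the check for subdirectories that lack `agent.md` is left out because it needs a directory listing.

```typescript
// Hypothetical helper; paths are relative to the agents directory and use "/".
const RESERVED_DIRS = new Set(["skills"]);

function agentIdFromRelativePath(relPath: string): string | null {
  const parts = relPath.split("/").filter((p) => p.length > 0);
  if (parts.length === 1 && parts[0].endsWith(".md")) {
    // Top-level markdown files are no longer loaded and are treated as errors.
    throw new Error(`Orphan top-level markdown file: ${relPath}; use <agentId>/agent.md`);
  }
  if (parts.length === 2 && parts[1] === "agent.md") {
    if (RESERVED_DIRS.has(parts[0])) return null; // reserved name, skipped
    return parts[0]; // the folder name is the agent id
  }
  return null; // anything else is not an agent definition
}
```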
The MCP transport client and host policy aren't agents-specific; they are HTTP + JSON-RPC transport with URL/DNS allowlisting. Move them under packages/appkit/src/connectors/mcp/ so they sit alongside the other transport-layer modules (serving, genie, sql-warehouse, lakebase, …) and stop being reachable only through the agents plugin.
- Move mcp-client.ts -> connectors/mcp/client.ts
- Move mcp-host-policy.ts -> connectors/mcp/host-policy.ts
- Move McpEndpointConfig type -> connectors/mcp/types.ts
- Add connectors/mcp/index.ts barrel; re-export from connectors/index.ts
- Move mcp-client / mcp-host-policy tests to connectors/mcp/tests/
- Agents plugin keeps hosted-tools.ts (HostedTool sugar + resolve) and imports connector types from ../../connectors/mcp.
- tools/ barrel no longer re-exports AppKitMcpClient (never was public).
No behaviour change. All existing tests pass against the new paths.
…dispatchToolCall
Three small helpers pulled out of the AgentsPlugin streaming path to cut duplication and shrink the two large methods.
- normalize-result.ts: void->"", JSON-stringify, 50K truncation with a human-readable marker. Unit-testable (previously covered only via the HTTP path).
- consume-adapter-stream.ts: the 'message_delta' + 'message' accumulation loop shared between _streamAgent and runSubAgent. Accepts an optional signal and per-event side-effect callback (for SSE translation).
- tool-dispatch.ts: one place that fans out toolkit/function/mcp/subagent entries. 'never'-typed default forces exhaustiveness: adding a fifth source is now a compile error at every call site.
_streamAgent: executeTool closure shrinks from ~60 lines of dispatch + normalize to a single dispatchToolCall + normalizeToolResult call. Stream consumption collapses to consumeAdapterStream.
runSubAgent: childExecute shrinks from ~30 lines of if/else dispatch to one dispatchToolCall call. Adapter loop collapses to consumeAdapterStream.
Behaviour change (minor): childExecute previously silently fell through to 'Unsupported sub-agent tool source' when mcpClient or PluginContext was missing; now it throws the same specific error as the main stream. Matches the main-path behaviour.
Tests: 15 new unit tests for normalizeToolResult + consumeAdapterStream. dispatchToolCall is exercised transitively through the full agent suite (288 existing tests still pass, 303 total on this branch).
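The `never`-typed default mentioned for tool-dispatch.ts is a standard TypeScript exhaustiveness pattern. A minimal sketch follows, with hypothetical entry shapes; only the four source names (toolkit, function, mcp, subagent) come from the commit message, and the function returns a description string rather than actually executing anything.

```typescript
// Hypothetical entry shapes; only the four source names come from the commit.
type ToolEntry =
  | { source: "toolkit"; plugin: string; localName: string }
  | { source: "function"; run: (args: unknown) => unknown }
  | { source: "mcp"; server: string; tool: string }
  | { source: "subagent"; agentId: string };

function describeDispatch(entry: ToolEntry, args: unknown): string {
  switch (entry.source) {
    case "toolkit":
      return `plugin ${entry.plugin} -> ${entry.localName}`;
    case "function":
      return `inline -> ${String(entry.run(args))}`;
    case "mcp":
      return `mcp ${entry.server} -> ${entry.tool}`;
    case "subagent":
      return `sub-agent ${entry.agentId}`;
    default: {
      // If a fifth source is added to ToolEntry and not handled above,
      // `entry` is no longer `never` here and this assignment fails to compile.
      const unreachable: never = entry;
      throw new Error(`Unsupported tool source: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Keeping the `throw` in the default branch also covers values that bypass the type system at runtime, for example entries parsed from untyped JSON.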
… → def
The `annotations` field (notably `destructive: true`) was silently dropped
as tools flowed from `tool({...})` into the resolved `AgentToolDefinition`,
so user-defined destructive tools never triggered the approval gate.
- `ToolConfig` now accepts `annotations?: ToolAnnotations`.
- `tool()` forwards it to the returned `FunctionTool`.
- `FunctionTool` exposes `annotations` and `functionToolToDefinition`
preserves it on the definition it builds.
- `AgentsPlugin` reads the flag via `isDestructiveToolEntry()` (falls back
to `functionTool.annotations` so a future divergence between def and
function cannot re-introduce the bug) and emits the merged annotations
via `combinedToolAnnotations()` on the `approval_pending` SSE payload.
Covered by `tests/tool-approval-gate.test.ts` and
`tests/function-tool.test.ts`.
ToolAnnotations.destructive is binary and has started to mislead: "save_view" captures a screenshot and creates a new file, which is nothing like deleting a dashboard, yet both trip the same red "destructive" approval card.
This adds a semantic `effect` enum with four tiers (`read`, `write`, `update`, `destructive`) so tool authors can tell the UI what blast radius they actually have. The approval gate fires for any mutating effect (`write`/`update`/`destructive`) and continues to honour the legacy `destructive: true` flag so existing tools keep their current red treatment without migration.
Callers consuming `annotations` over the wire (MCP clients, approval UIs) can now differentiate; the playground will ship a tiered approval card as a follow-up.
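The gating rule stated in this commit message (any mutating effect, plus the legacy flag) can be written as a single predicate. This is a sketch under assumptions: the `requiresApproval` name and the trimmed-down `ToolAnnotations` interface are illustrative and are not copied from the plugin.

```typescript
// The four effect tiers and the legacy flag come from the commit message;
// the function name and the reduced interface are hypothetical.
type ToolEffect = "read" | "write" | "update" | "destructive";

interface ToolAnnotations {
  destructive?: boolean; // legacy binary flag
  effect?: ToolEffect; // newer, tiered description of blast radius
}

function requiresApproval(annotations: ToolAnnotations | undefined): boolean {
  if (!annotations) return false;
  // Legacy tools keep their behaviour without migration.
  if (annotations.destructive === true) return true;
  // Any mutating tier gates; only an explicit "read" (or no effect) passes.
  return annotations.effect !== undefined && annotations.effect !== "read";
}
```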
Follow-up for connector relocate: re-export AppKitMcpClient from connectors/mcp. Adjust Vitest mock pool typing without non-null mocks. Signed-off-by: MarioCadenas <[email protected]>
should we hide this plugin until all PRs are merged?
btw maybe we should rename this file as beta.gen.ts? so that we don't review an autogenerated file?
(and add "generated, do not edit" header)
"@types/semver": "7.7.1",
"dotenv": "16.6.1",
"express": "4.22.0",
"js-yaml": "^4.1.1",
Suggested change:
- "js-yaml": "^4.1.1",
+ "js-yaml": "4.1.1",
Agentic review:
Code Review: agent/v2/4-agents-plugin
Scope
Branch: agent/v2/4-agents-plugin (16 commits, 48 files, +5563/-32 lines)
Base: main at a7ebc57
Mode: report-only (plan mode)
Intent
Add a complete agents() plugin to AppKit: markdown-driven agent definitions with folder-based discovery (<id>/agent.md), code-defined agents via createAgent(), human-in-the-loop approval gate for destructive tool calls, sub-agent delegation with depth limiting, SSE streaming with Responses API-compatible event translation, pluggable ThreadStore, MCP tool integration, toolkit inheritance, DoS protection (concurrent stream limits, tool-call budgets), and refactoring of MCP client and utility extraction.
Review Team
- correctness (always)
- testing (always)
- maintainability (always)
- security -- tool approval gates, OBO token handling, user auth, input validation
- api-contract -- new REST routes, SSE events, Responses API compat
- reliability -- abort signals, timeouts, event channel, async error paths
- adversarial -- 5500+ line diff, external APIs, user input
- kieran-typescript -- large TypeScript codebase, discriminated unions, type guards
Findings
P1 -- Critical
| # | File | Issue | Reviewer | Confidence |
|---|---|---|---|---|
| 1 | agents.ts:751 | Approval gate ignores `effect` field. The gate checks `entry.def.annotations?.destructive === true` but never reads `effect`. A tool with `effect: "destructive"` (the new preferred API) and no legacy `destructive: true` boolean bypasses approval entirely. The JSDoc on tool.ts:13-16 and function-tool.ts:12-15 explicitly states "any mutating value forces the agents-plugin approval gate." The implementation contradicts the documented contract. Fix: check `const isDestructive = ann?.destructive === true \|\| ann?.effect === "destructive";` (or broaden to `write`/`update` if the JSDoc intent holds). | security, correctness | 0.95 |
| 2 | agents.ts:991-1023 | Sub-agent `childExecute` bypasses tool-call budget. The `toolCallsUsed` counter and budget check live in the top-level `executeTool` closure (line 735-745). `runSubAgent` creates its own `childExecute` that never increments or checks the budget. A sub-agent can make unlimited tool calls, defeating `limits.maxToolCalls`. Fix: pass the budget counter (or a shared budget object) into `runSubAgent` and check on every child tool call. | correctness, adversarial | 0.95 |
| 3 | agents.ts:991-1023 | Sub-agent `childExecute` bypasses approval gate. The top-level `executeTool` checks `approvalPolicy.requireForDestructive` (line 751) before executing destructive tools and emits `approval_pending` events. `childExecute` in `runSubAgent` does none of this. A sub-agent calling a destructive tool executes it without human approval. Fix: apply the same approval check in `childExecute`, or factor the gate logic into a shared helper. | security | 0.90 |
| 4 | agents.ts:661-707 | `/invocations` bypasses concurrent stream limit. `_handleChat` checks `countUserStreams(userId) >= limits.maxConcurrentStreamsPerUser` before streaming. `_handleInvocations` calls `_streamAgent` without the same check. A client can bypass the rate limit by hitting `/invocations` instead of `/chat`. Fix: add the same guard to `_handleInvocations`. | security, reliability | 0.92 |
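One way to read the fix suggested for finding 2 is a budget object shared by reference between the parent run and every sub-agent. The sketch below is illustrative only: `ToolCallBudget` and `makeExecutor` are hypothetical names, and the executor returns a label instead of running a real tool.

```typescript
// Hypothetical budget object, shared by reference across parent and children.
class ToolCallBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Throws once the shared budget is spent, regardless of who is calling.
  consume(toolName: string): void {
    if (this.used >= this.max) {
      throw new Error(`Tool-call budget of ${this.max} exhausted before calling ${toolName}`);
    }
    this.used += 1;
  }

  get remaining(): number {
    return this.max - this.used;
  }
}

type Execute = (toolName: string) => string;

// Parent and child executors are built from the SAME budget instance,
// so a sub-agent cannot make calls the parent run could not afford.
function makeExecutor(budget: ToolCallBudget, label: string): Execute {
  return (toolName) => {
    budget.consume(toolName);
    return `${label}:${toolName}`;
  };
}
```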
P2 -- Moderate
| # | File | Issue | Reviewer | Confidence |
|---|---|---|---|---|
| 5 | consume-adapter-stream.ts, normalize-result.ts | Extracted utilities are dead code. `consumeAdapterStream` and `normalizeToolResult` were extracted into core/agent/ (with tests), but no production code imports them. The same logic is still inlined in agents.ts:815-826 (result normalization), agents.ts:885-896 (stream consumption), run-agent.ts:97-105, and agents.ts:1058-1067. Three call sites duplicate the accumulation pattern. Either use the extracted functions or remove them. | maintainability | 0.95 |
| 6 | agents.ts:148-156 | `reload()` is non-atomic. `this.agents.clear()` runs before `await this.loadAgents()`. If `loadAgents` throws (e.g. malformed markdown), the registry is empty and new requests get "No agent registered" errors. In-flight streams holding old references still work but new ones fail. Fix: build into a new Map, then swap on success. | reliability | 0.85 |
| 7 | agents.ts:1072 | `_handleCancel` uses unsafe type assertion for `streamId`. Unlike `/chat`, `/approve`, and `/invocations` which use Zod schemas, the cancel route extracts `streamId` via `req.body as { streamId?: string }`. This is inconsistent and skips validation. Fix: add a small Zod schema (or validate inline). | api-contract, security | 0.80 |
| 8 | agents.ts:815-826 | Tool result type inconsistency. When `serialized.length > MAX`, the return is a truncated string. When `<= MAX`, the return is the raw result (which may be an object). Adapters receive different types depending on length. The extracted `normalizeToolResult` has the same behavior (returns `result` not `serialized`). This is intentional (preserving structured data for short results), but consider documenting the contract or always returning strings. | correctness | 0.70 |
| 9 | event-channel.ts:23-31 | `EventChannel` has no backpressure. `push()` accumulates into `this.queue` without bound. If the SSE consumer is slow (network backpressure, paused tab), the queue grows indefinitely. For typical chat streams this is fine; for high-throughput tool-calling agents it could cause memory pressure. Consider a max queue size with drop/error policy. | performance | 0.65 |
| 10 | load-agents.ts:106,141,145 | Synchronous filesystem I/O in agent loader. `fs.readFileSync`, `fs.existsSync`, `fs.readdirSync` block the event loop. Acceptable during startup, but `reload()` can be called at runtime via the exported `appkit.agents.reload()` API, which would block the event loop while reading files. | performance | 0.65 |
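The build-then-swap fix proposed for finding 6 can be sketched as follows. The `AgentRegistry` class and the synchronous `load` callback are assumptions made to keep the example short; the real `loadAgents` is asynchronous, but the same ordering applies with `await`.

```typescript
// Hypothetical registry; only the build-then-swap idea comes from finding 6.
type AgentDef = { instructions: string };

class AgentRegistry {
  private agents = new Map<string, AgentDef>();

  get(id: string): AgentDef | undefined {
    return this.agents.get(id);
  }

  // `load` stands in for the markdown loader and may throw on malformed input.
  reload(load: () => Map<string, AgentDef>): void {
    const next = load(); // if this throws, this.agents is left untouched
    this.agents = next; // single reference swap, only on success
  }
}
```

A failed reload then leaves the previous registry in place, so new requests keep resolving agents instead of hitting an empty map.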
P3 -- Low
| # | File | Issue | Reviewer | Confidence |
|---|---|---|---|---|
| 11 | agents.ts:1176-1188 | `printRegistry()` uses `console.log` instead of logger. Inconsistent with the rest of the file which uses `createLogger("agents")`. The formatted output is intentionally styled with picocolors, so this may be deliberate for terminal aesthetics. | maintainability | 0.60 |
Coverage
- Suppressed: 0 findings below 0.60 confidence.
- Untracked files: None.
- Testing: Test coverage is extensive (~1800 lines of tests across 7 test files). Tests cover plugin lifecycle, event translation, thread store, approval gate, DoS limits, and load-agents. The tool-call budget bypass in sub-agents (# 2) is not tested. The `effect` field interaction with the approval gate (# 1) is not tested.
- Residual risks: The `InMemoryThreadStore` has no eviction, bounds, or TTL. The code warns about this in both prod and dev, which is good. A follow-up for bounded eviction is mentioned in comments.
Verdict: Not ready -- fix P1 items before merge
Fix order:
- # 1 (approval gate `effect` gap) -- security hole, contradicts documented API
- # 2 + # 3 (sub-agent budget + approval bypass) -- these are the same code path; fix together
- # 4 (`/invocations` rate-limit bypass) -- one-line fix
- # 5 (dead code) -- either wire the extracted utilities or delete them
P2 items # 6-# 10 are lower priority and could be addressed in follow-ups, though # 6 (reload() atomicity) is worth fixing now since it's small.
You might consider using an external lib:
EventChannel (~70 lines) -- This is a basic unbounded async queue (push/consume as async iterable). Existing alternatives:
- @repeaterjs/repeater -- Almost exactly this API: push-based async iterable creation with close/error semantics and optional backpressure. Would replace EventChannel entirely.
- Node.js built-in events.on(emitter, event) -- Returns an AsyncIterator from an EventEmitter. Could work but is clunkier (requires wrapping in an EventEmitter, less clean close/error semantics).
- Web Streams ReadableStream -- The ReadableStream constructor with a controller gives push/pull with built-in backpressure. Slightly heavier API surface.
@repeaterjs/repeater is the cleanest drop-in. That said, 70 lines of zero-dependency code is defensible in an SDK -- adding a dependency has its own cost.
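For comparison with the external options, here is a sketch of what a bounded, zero-dependency channel might look like. It is not the PR's `EventChannel`: the `BoundedChannel` name, the drop-when-full policy, and the single-consumer assumption are all choices made for this example.

```typescript
// A hypothetical bounded push/consume channel; assumes a single consumer.
class BoundedChannel<T> {
  private queue: T[] = [];
  private waiter: (() => void) | null = null;
  private closed = false;
  private readonly maxQueued: number;

  constructor(maxQueued: number) {
    this.maxQueued = maxQueued;
  }

  // Returns false (the event is dropped) when the channel is closed or the
  // consumer has fallen more than maxQueued events behind.
  push(event: T): boolean {
    if (this.closed || this.queue.length >= this.maxQueued) return false;
    this.queue.push(event);
    this.wake();
    return true;
  }

  close(): void {
    this.closed = true;
    this.wake();
  }

  private wake(): void {
    const w = this.waiter;
    this.waiter = null;
    if (w) w();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<T, void, undefined> {
    while (true) {
      // Drain everything queued so far before checking for closure.
      while (this.queue.length > 0) yield this.queue.shift() as T;
      if (this.closed) return;
      // Sleep until the next push() or close() wakes the consumer.
      await new Promise<void>((resolve) => {
        this.waiter = resolve;
      });
    }
  }
}
```

Dropping on overflow is only one policy; throwing, or closing the channel with an error, would be equally reasonable depending on whether losing an SSE event is acceptable.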
The main product layer. Turns an AppKit app into an AI-agent host with
markdown-driven agent discovery, code-defined agents, sub-agents,
human-in-the-loop approval, zero-trust MCP, and a standalone
run-without-HTTP executor.
`createAgent(def)`: pure factory
`packages/appkit/src/core/create-agent-def.ts`. Returns the passed-in definition after cycle-detecting the sub-agent graph. No adapter construction, no side effects, so it is safe at module top-level. The returned `AgentDefinition` is plain data, consumable by either `agents({ agents })` or `runAgent(def, input)`.
`agents()` plugin
`packages/appkit/src/plugins/agents/agents.ts`. `AgentsPlugin` class:
- `config/agents/*.md` (configurable dir) via real YAML frontmatter parsing (`js-yaml`). Frontmatter schema: `endpoint`, `model`, `toolkits`, `tools`, `agents`, `default`, `maxSteps`, `maxTokens`, `baseSystemPrompt`, `ephemeral`. Unknown keys logged, invalid YAML throws at boot.
- `agents({ agents: { name: def } })`. Code wins on key collision.
- (`agents:` frontmatter or `agents:` code field): synthesized as `agent-<key>` tools on the parent. Markdown `agents: [...]` resolves against both markdown siblings and code-defined agents passed via `LoadContext.codeAgents`, so a markdown orchestrator can delegate to a code-defined specialist.
- `ToolkitEntry`s, inline `FunctionTool`s, or `HostedTool`s.
- `ToolProvider` plugin's tools whose author marked `autoInheritable: true`. Asymmetric default: markdown agents inherit (`file: true`), code-defined agents don't (`code: false`).
- `POST /api/agents/invocations` (OpenAI Responses compatible) + `POST /api/agents/chat`, `POST /api/agents/cancel`, `POST /api/agents/approve`, `GET /api/agents/threads/:id`, `DELETE /api/agents/threads/:id`, `GET /api/agents/info`.
- `executeStream`. Tool calls dispatch through `PluginContext.executeTool(req, pluginName, localName, args, signal)` for OBO, telemetry, and timeout.
- `appkit.agents.{register, list, get, reload, getDefault, getThreads}` runtime helpers.
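The cycle detection that `createAgent(def)` is described as performing can be sketched as a depth-first walk over the sub-agent graph. The `AgentDef` shape, `assertAcyclic`, and `createAgentSketch` below are hypothetical and are not the actual implementation in create-agent-def.ts.

```typescript
// Hypothetical minimal definition: `agents` maps a key to a sub-agent definition.
interface AgentDef {
  name: string;
  agents?: Record<string, AgentDef>;
}

function assertAcyclic(root: AgentDef): void {
  const onPath = new Set<AgentDef>(); // definitions on the current DFS path
  const done = new Set<AgentDef>(); // fully explored, known to be cycle-free

  const visit = (def: AgentDef, trail: string[]): void => {
    if (done.has(def)) return;
    if (onPath.has(def)) {
      throw new Error(`Sub-agent cycle: ${[...trail, def.name].join(" -> ")}`);
    }
    onPath.add(def);
    for (const child of Object.values(def.agents ?? {})) {
      visit(child, [...trail, def.name]);
    }
    onPath.delete(def);
    done.add(def);
  };

  visit(root, []);
}

// Pure factory in the spirit described above: validate, then hand back the
// same object. No adapters are constructed and nothing is registered.
function createAgentSketch<T extends AgentDef>(def: T): T {
  assertAcyclic(def);
  return def;
}
```

Because the walk compares object identity, a definition reused under two different parents is fine; only a path that returns to one of its own ancestors is rejected.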
Human-in-the-loop approval gate
Any tool annotated `destructive: true` pauses the stream, emits an `appkit.approval_pending` SSE event, and waits for a `POST /api/agents/approve` decision from the same user who initiated the run. A missing decision after `approval.timeoutMs` auto-denies.
Enabled by default (`approval.requireForDestructive: true`); opt out for dev. Per-user ownership enforced (`x-forwarded-user`).
Zero-trust MCP host policy
`tools/mcp-host-policy.ts` enforces an allowlist on every MCP URL before the first byte is sent. Same-origin Databricks workspace URLs are admitted by default; any other host must be explicitly trusted via `agents({ mcp: { trustedHosts: [...] } })`. Blocks link-local (cloud metadata at 169.254/16), RFC1918, CGNAT, loopback, ULA, multicast, and IPv4-mapped IPv6 equivalents at DNS-resolve time. Workspace credentials (service-principal on `initialize`/`tools/list`; caller OBO on `tools/call`) are never attached to non-workspace hosts.
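The IPv4 part of the range blocking described above can be sketched as a CIDR membership test applied to an already-resolved address. The CIDR table is an assumption based on the standard definitions of the named ranges, the function names are hypothetical, and IPv6 (ULA, IPv4-mapped forms) is deliberately left out of this sketch.

```typescript
// Assumed CIDR blocks for the IPv4 ranges named in the description.
const BLOCKED_V4: Array<[string, number]> = [
  ["10.0.0.0", 8], // RFC 1918
  ["172.16.0.0", 12], // RFC 1918
  ["192.168.0.0", 16], // RFC 1918
  ["100.64.0.0", 10], // CGNAT
  ["127.0.0.0", 8], // loopback
  ["169.254.0.0", 16], // link-local, includes cloud metadata endpoints
  ["224.0.0.0", 4], // multicast
];

// Converts a dotted-quad string to an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`Not a dotted-quad IPv4 address: ${ip}`);
  const [a, b, c, d] = parts.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True when the address falls inside any blocked CIDR range.
function isBlockedIPv4(ip: string): boolean {
  const addr = ipv4ToInt(ip);
  return BLOCKED_V4.some(([base, bits]) => {
    const mask = (0xffffffff << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
  });
}
```

Running the check on the resolved address rather than on the hostname is what makes it resistant to a public DNS name that points at a private IP.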
DoS caps
`limits: { maxConcurrentStreamsPerUser, maxToolCalls, maxSubAgentDepth }` (defaults 5 / 50 / 3). Chat bodies are capped at 64k characters via Zod schema; the 6th concurrent stream for the same user returns 429; tool budget exhaustion aborts the run with a clear error.
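The per-user stream cap and the depth cap can be illustrated with a small counter. The defaults 5 / 50 / 3 come from the description above; `StreamLimiter` and `assertDepth` are hypothetical names, and the status codes are returned as plain numbers instead of being written to a real HTTP response.

```typescript
interface Limits {
  maxConcurrentStreamsPerUser: number;
  maxToolCalls: number;
  maxSubAgentDepth: number;
}

// Defaults as stated in the PR description.
const DEFAULT_LIMITS: Limits = {
  maxConcurrentStreamsPerUser: 5,
  maxToolCalls: 50,
  maxSubAgentDepth: 3,
};

// Hypothetical per-user counter of open streams.
class StreamLimiter {
  private active = new Map<string, number>();
  private readonly limits: Limits;

  constructor(limits: Limits = DEFAULT_LIMITS) {
    this.limits = limits;
  }

  // Returns the HTTP status the route would answer with.
  tryOpen(userId: string): 200 | 429 {
    const current = this.active.get(userId) ?? 0;
    if (current >= this.limits.maxConcurrentStreamsPerUser) return 429;
    this.active.set(userId, current + 1);
    return 200;
  }

  // Must be called when a stream ends, whether it completed or aborted.
  close(userId: string): void {
    const current = this.active.get(userId) ?? 0;
    if (current <= 1) this.active.delete(userId);
    else this.active.set(userId, current - 1);
  }
}

// Rejects delegation chains deeper than the configured cap.
function assertDepth(depth: number, limits: Limits = DEFAULT_LIMITS): void {
  if (depth > limits.maxSubAgentDepth) {
    throw new Error(`Sub-agent depth ${depth} exceeds limit ${limits.maxSubAgentDepth}`);
  }
}
```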
`runAgent(def, input)`: standalone executor
`packages/appkit/src/core/run-agent.ts`. Runs an `AgentDefinition` without `createApp` or HTTP. Drives the adapter's event stream to completion, executing inline tools + sub-agents along the way. Aggregates events into `{ text, events }`. Useful for tests, CLI scripts, and offline pipelines.
Event translation and thread storage
- `AgentEventTranslator`: stateful converter from internal `AgentEvent`s to OpenAI Responses API `ResponseStreamEvent`s with strictly monotonic `sequence_number` and `output_index`.
- `InMemoryThreadStore`: per-user conversation persistence with explicit `ephemeral: true` opt-in on `AgentDefinition` for stateless agents (autocomplete, one-shot tools).
- `buildBaseSystemPrompt` + `composeSystemPrompt`: formats the AppKit base prompt (with plugin names and tool names) and layers the agent's instructions on top.
Frontmatter loader
- `load-agents.ts`: reads `*.md` files, parses YAML frontmatter with `js-yaml`, resolves `toolkits: [...]` entries against the plugin provider index at load time, wraps ambient tools (from `agents({ tools: {...} })`) for `tools: [...]` frontmatter references.
- `loadAgentsFromDir` runs a two-pass resolver so `agents:` references can be resolved regardless of file-system iteration order; supports markdown siblings + code-defined agents (via `LoadContext.codeAgents`) with code precedence on collision.
Plumbing
- `js-yaml` + `@types/js-yaml` deps.
- `/api/agents/*` (plural, matches the `appkit.agents.*` runtime handle).
- `agents`, `createAgent`, `runAgent`, `AgentDefinition`, `AgentsPluginConfig`, `AgentTool`, `ToolkitEntry`, `ToolkitOptions`, `BaseSystemPromptOption`, `PromptContext`, `isToolkitEntry`, `loadAgentFromFile`, `loadAgentsFromDir`.
Test plan
- `toolkits`/`tools` resolution, `agents:` sibling resolution regardless of order, mutual delegation, missing/self/non-array refs, deduplication, `loadAgentFromFile` rejection of `agents:`, markdown → code references, code precedence on collision.
- stream-owner enforcement.
- tool-call budget, sub-agent depth cap.
- message-interruption-by-tool-call, undefined-result coalescing.
- full range, IPv4-mapped colon-hex normalization.
Signed-off-by: MarioCadenas [email protected]
PR Stack
- `agents()` plugin + `createAgent(def)` + markdown-driven agents (this PR)
- `fromPlugin()` DX + `runAgent` plugins arg + toolkit-resolver: feat(appkit): fromPlugin() DX, runAgent plugins arg, shared toolkit-resolver #305
Demo
agent-demo.mp4