Small, dependency-light Python runtime for building local agent workflows.
The distribution package is protocol-lattice-py-agent; the Python import
package is py_agent.
- Agent runtime with `generate`, `generate_with_files`, and `generate_stream`
- In-memory short-term memory plus vector-style long-term memory
- Pluggable LLM providers and embedders
- Local tools, sub-agents, and agent-as-tool adapters
- Shared memory spaces with simple ACLs
- Input and output guardrails
- ADK-style composition helpers for model, memory, tool, and sub-agent modules
- UTCP tool/manual generation
- Optional Code Mode and FastMCP examples

The base package has no runtime dependencies.

From GitHub:

    python -m pip install "git+https://github.com/Protocol-Lattice/py-agent.git"

From this checkout:

    python -m pip install -e .

Optional extras:

    python -m pip install -e ".[providers]"
    python -m pip install -e ".[fastembed]"
    python -m pip install -e ".[utcp]"
    python -m pip install -e ".[mcp]"
    python -m pip install -e ".[dev]"

Extras map to:

| Extra | Adds |
|---|---|
| `providers` | OpenAI, Anthropic, Gemini, and FastEmbed SDKs |
| `fastembed` | FastEmbed only |
| `utcp` | UTCP, UTCP HTTP, and Code Mode integrations |
| `mcp` | FastMCP |
| `dev` | pytest |

    from py_agent import Agent, Options
    from py_agent.memory import InMemoryStore, MemoryBank, SessionMemory
    from py_agent.models import DummyLLM

    memory = SessionMemory(MemoryBank(InMemoryStore()), short_term_size=8)
    agent = Agent(
        Options(
            model=DummyLLM("local:"),
            memory=memory,
            system_prompt="You are concise and helpful.",
        )
    )

    print(agent.generate("demo-session", "Say hello in one sentence."))

Run the bundled version:

    python examples/quickstart.py

An `Agent` is configured with `Options`:

    from py_agent import Options

    options = Options(
        model=model,
        memory=memory,
        system_prompt="You coordinate a helpful assistant.",
        context_limit=8,
        tools=[],
        subagents=[],
    )

The model object can implement any of these methods:
- `generate(prompt) -> str`
- `generate_with_files(prompt, files) -> str`
- `generate_stream(prompt) -> Iterable[StreamChunk]`

The package includes `DummyLLM` for local tests and provider wrappers for
OpenAI, Anthropic, Gemini, and Ollama.
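
Because the model is described only by the methods it implements, a plain Python class should be enough for experiments. The sketch below is a hypothetical stand-in, not part of the package: it implements only `generate(prompt) -> str` and leaves out streaming, since the fields of `StreamChunk` are not described here.

```python
class UppercaseModel:
    """Hypothetical model that upper-cases the prompt instead of calling an LLM."""

    def generate(self, prompt: str) -> str:
        # A real model would send the prompt to a provider; this one is purely local.
        return prompt.upper()


if __name__ == "__main__":
    model = UppercaseModel()
    print(model.generate("hello"))
```

An instance like this would presumably be passed as `Options(model=UppercaseModel(), ...)`; what prompt text the agent actually hands to `generate` is not specified in this README.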

Use `new_llm_provider` when you want provider selection by name:

    from py_agent.models import new_llm_provider

    model = new_llm_provider("openai", model="gpt-4o-mini")

Supported provider names:

| Provider | Aliases | Environment |
|---|---|---|
| OpenAI | `openai` | `OPENAI_API_KEY` or `OPENAI_KEY` |
| Gemini | `gemini`, `google` | `GOOGLE_API_KEY` or `GEMINI_API_KEY` |
| Anthropic | `anthropic`, `claude` | `ANTHROPIC_API_KEY` |
| Ollama | `ollama` | optional `OLLAMA_HOST`, default `http://localhost:11434` |
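
Since several providers accept more than one environment variable, a small preflight check can make missing credentials obvious before a provider is constructed. The helper names below (`first_set`, `has_credentials`) are hypothetical, the left-to-right lookup order is an assumption, and only the canonical provider names are handled, not aliases such as `google` or `claude`.

```python
import os
from typing import Mapping, Optional, Sequence

# Variable names come from the table above; checking them left to right is an assumption.
PROVIDER_ENV_VARS = {
    "openai": ("OPENAI_API_KEY", "OPENAI_KEY"),
    "gemini": ("GOOGLE_API_KEY", "GEMINI_API_KEY"),
    "anthropic": ("ANTHROPIC_API_KEY",),
}


def first_set(names: Sequence[str], env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty value among the named variables, or None."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def has_credentials(provider: str, env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical preflight check; providers without listed keys (e.g. Ollama) pass."""
    names = PROVIDER_ENV_VARS.get(provider, ())
    return not names or first_set(names, env) is not None


if __name__ == "__main__":
    for provider in ("openai", "gemini", "anthropic", "ollama"):
        print(provider, has_credentials(provider))
```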

Response caching is opt-in through environment variables:

| Variable | Purpose |
|---|---|
| `AGENT_LLM_CACHE_SIZE` | Enables cache when set to a positive integer |
| `AGENT_LLM_CACHE_TTL` | Cache TTL in seconds, default 300 |
| `AGENT_LLM_CACHE_PATH` | Cache file path, default `.agent_cache.json` |
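
As a minimal sketch, the cache could be switched on from Python before the provider is created. The size and TTL values here are arbitrary examples. This README does not say when the variables are read, so setting them before importing `py_agent` is the cautious assumption.

```python
import os

# Example values only; any positive integer for the size enables the cache.
os.environ["AGENT_LLM_CACHE_SIZE"] = "128"
os.environ["AGENT_LLM_CACHE_TTL"] = "600"  # seconds
os.environ["AGENT_LLM_CACHE_PATH"] = ".agent_cache.json"

# Import py_agent and build the provider only after this point (assumption:
# the variables may be read at import or construction time).
```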

Memory uses `DummyEmbedder` by default, which is deterministic and local. For
real vector search, configure `ADK_EMBED_PROVIDER` and optionally
`ADK_EMBED_MODEL`.

    from py_agent.memory import AutoEmbedder, InMemoryStore, MemoryBank, SessionMemory

    memory = SessionMemory(MemoryBank(InMemoryStore())).with_embedder(AutoEmbedder())

Supported embedding providers:

| Provider | Aliases | Default model | Setup |
|---|---|---|---|
| FastEmbed | `fastembed`, `fast_embed`, `fastembedder`, `fastembeeder` | `BAAI/bge-small-en-v1.5` | `pip install fastembed` |
| OpenAI | `openai` | `text-embedding-3-small` | `OPENAI_API_KEY` or `OPENAI_KEY` |
| Gemini | `gemini`, `google`, `vertex`, `vertexai` | `text-embedding-004` | `GOOGLE_API_KEY` or `GEMINI_API_KEY` |
| Ollama | `ollama` | `nomic-embed-text` | optional `OLLAMA_HOST` |
| Claude/Voyage | `claude`, `anthropic`, `voyage`, `voyageai` | `voyage-3.5` | `VOYAGE_API_KEY` |

FastEmbed can also be used directly:

    from py_agent.memory import FastEmbedder, FastEmbeeder

    embedder = FastEmbedder()
    same_embedder_class = FastEmbeeder

`AutoEmbedder()` falls back to `DummyEmbedder` when no provider can be selected.
Claude/Anthropic embeddings are backed by Voyage AI because Anthropic does not
provide a first-party embeddings API.
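
To illustrate what "deterministic and local" buys you in tests, here is a toy hash-based embedder. It is a hypothetical illustration and not how `DummyEmbedder` is implemented: the same text always maps to the same vector with no network call, but the vectors carry no semantic meaning, which is why a real provider is needed for useful vector search.

```python
import hashlib
import math
from typing import List


def toy_embed(text: str, dims: int = 8) -> List[float]:
    """Hypothetical embedding: hash each token into one signed bucket, then normalize."""
    vector = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = digest[0] % dims
        sign = 1.0 if digest[1] % 2 == 0 else -1.0
        vector[index] += sign
    norm = math.sqrt(sum(v * v for v in vector))
    # An all-zero vector (empty text or cancelling tokens) is returned unchanged.
    return [v / norm for v in vector] if norm else vector


def cosine(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    first = toy_embed("note")
    second = toy_embed("note")
    print(first == second, cosine(first, second))
```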

The core memory components are:

| Component | Purpose |
|---|---|
| `InMemoryStore` | Stores long-term memory records in process |
| `MemoryBank` | Adapter around a vector store |
| `SessionMemory` | Short-term window, embedding, retrieval, and flush logic |
| `SharedSession` | Cross-session memory spaces with ACL checks |
| `SpaceRegistry` | Creates, grants, revokes, lists, and expires spaces |
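
The split between a bounded short-term window and a long-term store can be pictured with a toy model. `ToyWindowMemory` below is hypothetical and omits embeddings and retrieval. It also assumes the oldest entry is dropped when the window is full; this README does not say what `SessionMemory` does in that case.

```python
from collections import deque
from typing import Deque, Dict, List


class ToyWindowMemory:
    """Hypothetical illustration of a per-session window plus a flush step."""

    def __init__(self, short_term_size: int = 8) -> None:
        self.short_term_size = short_term_size
        self.short_term: Dict[str, Deque[str]] = {}
        self.long_term: Dict[str, List[str]] = {}

    def add_short_term(self, session_id: str, content: str) -> None:
        # deque(maxlen=...) silently discards the oldest item once full (assumed policy).
        window = self.short_term.setdefault(
            session_id, deque(maxlen=self.short_term_size)
        )
        window.append(content)

    def flush_to_long_term(self, session_id: str) -> int:
        """Move everything in the window to long-term storage; return the count moved."""
        window = self.short_term.get(session_id)
        if not window:
            return 0
        moved = len(window)
        self.long_term.setdefault(session_id, []).extend(window)
        window.clear()
        return moved


if __name__ == "__main__":
    memory = ToyWindowMemory(short_term_size=2)
    for note in ("first", "second", "third"):
        memory.add_short_term("user-123", note)
    print(memory.flush_to_long_term("user-123"), memory.long_term["user-123"])
```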

Typical use:

    from py_agent.memory import InMemoryStore, MemoryBank, SessionMemory

    store = InMemoryStore()
    memory = SessionMemory(MemoryBank(store), short_term_size=8)

    memory.add_short_term("user-123", "Remember this note.", {"role": "user"}, memory.embed("Remember this note."))
    memory.flush_to_long_term("user-123")
    records = memory.retrieve_context("user-123", "note", limit=4)

Tools expose `spec()` and `invoke(request)`:

    from py_agent import ToolRequest, ToolResponse, ToolSpec


    class EchoTool:
        def spec(self) -> ToolSpec:
            return ToolSpec(
                name="echo",
                description="Return the input text.",
                input_schema={
                    "type": "object",
                    "properties": {"input": {"type": "string"}},
                    "required": ["input"],
                },
            )

        def invoke(self, request: ToolRequest) -> ToolResponse:
            return ToolResponse(content=str(request.arguments["input"]))

Register tools through `Options`:

    agent = Agent(Options(model=model, memory=memory, tools=[EchoTool()]))
    print(agent.generate("demo", 'tool: echo {"input": "hello"}'))

Sub-agents expose `name()`, `description()`, and `run(input_text)`. Agents can
also be wrapped as tools:

    agent_tool = agent.as_tool("assistant", "Run the assistant agent.")

See `examples/tools_and_subagents.py` for a complete local workflow.
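
For comparison with the tool example, a sub-agent can be sketched as a plain class with the three methods named above. `WordCountSubAgent` is a hypothetical example, and the `str` return type of `run` is an assumption, since this README gives only the method names.

```python
class WordCountSubAgent:
    """Hypothetical sub-agent that reports how many words it received."""

    def name(self) -> str:
        return "word-counter"

    def description(self) -> str:
        return "Count the words in the input text."

    def run(self, input_text: str) -> str:
        # Assumed to return plain text, like the agent's own generate().
        return f"{len(input_text.split())} words"


if __name__ == "__main__":
    sub_agent = WordCountSubAgent()
    print(sub_agent.name(), "->", sub_agent.run("one two three"))
```

An object like this would presumably be registered through `Options(subagents=[WordCountSubAgent()])`, mirroring how tools are passed.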

Input and output guardrails are plain Python policies and transformers:

    from py_agent.guardrails import InputGuardrails, RegexInputBlocklistPolicy

    guardrails = InputGuardrails(
        safety_policies=[RegexInputBlocklistPolicy([r"(?i)\bunsafe\b"])]
    )

Available guardrail helpers include:

- `RegexBlocklistPolicy`
- `RegexInputBlocklistPolicy`
- `PromptInjectionDetectorPolicy`
- `LLMEvaluatorPolicy`
- `LLMEvaluatorInputPolicy`
- `PIIMaskerTransformer`
- `RegexInputReplaceTransformer`
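
To show what a regex blocklist does conceptually, the sketch below checks text against the same pattern used in the guardrails example, using only the standard `re` module. The function names are hypothetical and this is not the package's implementation; how the real policies report a violation is not described here.

```python
import re
from typing import Iterable, List, Optional, Pattern


def compile_blocklist(patterns: Iterable[str]) -> List[Pattern[str]]:
    """Compile the raw pattern strings once so they can be reused per request."""
    return [re.compile(pattern) for pattern in patterns]


def first_match(text: str, blocklist: List[Pattern[str]]) -> Optional[str]:
    """Return the source of the first pattern found in the text, or None if clean."""
    for pattern in blocklist:
        if pattern.search(text):
            return pattern.pattern
    return None


if __name__ == "__main__":
    blocklist = compile_blocklist([r"(?i)\bunsafe\b"])
    for sample in ("This is UNSAFE input", "Drive unsafely", "All good"):
        print(repr(sample), "->", first_match(sample, blocklist))
```

The `\b` word boundaries mean the pattern matches "unsafe" as a whole word in any case, but not "unsafely".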

Use the ADK when you want declarative assembly for a coordinator agent:

    from py_agent.adk import (
        AgentDevelopmentKit,
        in_memory_memory_module,
        model_module,
        static_model_provider,
        with_default_system_prompt,
        with_modules,
    )
    from py_agent.models import DummyLLM

    kit = AgentDevelopmentKit.new(
        with_default_system_prompt("You coordinate a helpful assistant."),
        with_modules(
            model_module("llm", static_model_provider(DummyLLM("local:"))),
            in_memory_memory_module(window=8),
        ),
    )

    agent = kit.build_agent()
    print(agent.generate("user-123", "Draft a short project update."))

Any agent can describe itself as a UTCP 1.x tool:

    tool = agent.as_utcp_tool(
        "assistant.run",
        "Run the assistant agent.",
        base_url="https://agents.example.com",
    )

    manual = agent.as_utcp_manual(
        "assistant",
        "Run the assistant agent.",
        base_url="https://agents.example.com",
    )

Multiple agents can be exposed from one registry:

    from py_agent import AgentUTCPBinding, AgentUTCPRegistry

    registry = AgentUTCPRegistry(
        "team",
        [
            AgentUTCPBinding("researcher", researcher_agent, "Run the researcher agent."),
            AgentUTCPBinding("writer", writer_agent, "Run the writer agent."),
        ],
        base_url="https://agents.example.com",
    )

    manual = registry.manual()
    result = registry.call(
        "team.researcher",
        {"instruction": "Find two project facts.", "session_id": "user-123"},
    )

Run the dependency-free local UTCP HTTP example:

    python examples/utcp_http_server.py

Then query it:

    curl http://127.0.0.1:8765/utcp
    curl -X POST http://127.0.0.1:8765/tools/researcher \
      -H 'Content-Type: application/json' \
      -d '{"instruction":"List two facts.","session_id":"demo"}'

Code Mode clients can be exposed as regular tools:

    from py_agent import new_code_mode_tool

    code_tool = new_code_mode_tool(code_mode_client)
    agent = Agent(Options(model=model, memory=memory, tools=[code_tool]))

Install optional dependencies with:

    python -m pip install -e ".[utcp]"

The runnable example uses a local stand-in client:

    python examples/code_mode_tool.py

Install FastMCP support:

    python -m pip install -e ".[mcp]"

Run the in-process client demo:

    python examples/fastmcp_client_demo.py

Run the server over Streamable HTTP:

    FASTMCP_TRANSPORT=streamable-http python examples/fastmcp_agents_server.py

The server exposes `researcher` and `writer` tools plus a `utcp://manual`
resource.

| Example | Purpose |
|---|---|
| `examples/quickstart.py` | Minimal local agent |
| `examples/tools_and_subagents.py` | Tool call, sub-agent call, and agent-as-tool |
| `examples/agent_as_utcp_tool.py` | Single agent exposed as a UTCP tool/manual |
| `examples/agents_as_utcp_tools.py` | Multi-agent UTCP registry |
| `examples/utcp_http_server.py` | Dependency-free HTTP wrapper for the UTCP registry |
| `examples/code_mode_tool.py` | Code Mode tool adapter with a demo async client |
| `examples/fastmcp_agents_server.py` | FastMCP server exposing the agent team |
| `examples/fastmcp_client_demo.py` | In-process FastMCP smoke demo |

Common local smoke runs:

    python examples/quickstart.py
    python examples/tools_and_subagents.py
    python examples/agent_as_utcp_tool.py
    python examples/agents_as_utcp_tools.py
    python examples/code_mode_tool.py

Project layout:

    src/py_agent/
      agent.py        core runtime, tools, sub-agents, UTCP adapters
      adk.py          module container and provider wiring
      cache.py        small TTL LRU cache
      concurrent.py   worker helpers
      guardrails.py   input and output guardrails
      memory.py       stores, sessions, shared spaces, embedders
      models.py       model wrappers and attachment helpers
      subagents.py    built-in researcher sub-agent
      types.py        shared dataclasses and protocols
    examples/
    tests/

Install test dependencies and run:

    python -m pip install -e ".[dev]"
    pytest

The suite covers the core agent runtime, memory, embedders, guardrails, ADK, UTCP artifacts, and runnable examples.