Merged
3 changes: 2 additions & 1 deletion CHANGELOG.md
Expand Up @@ -14,10 +14,11 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
- **Per-fetch prompt cache control: `cache_ttl_seconds`** (proposal 0072, prompt-management §5 / §6, spec v0.63.0). `PromptBackend.fetch`, `PromptManager.fetch`, and `PromptManager.get` gain an optional `cache_ttl_seconds` read-side control: `None` preserves current behavior, `0` forces a fresh read past any client-side cache, and `N > 0` bounds a served entry's staleness to N seconds; a negative value is rejected at the manager. It governs only which cached entry may be served, not whether or how results are cached. The bundled filesystem backend is cacheless and ignores it; the bundled Langfuse backend forwards it to the Langfuse SDK's `get_prompt` cache. Conformance fixtures 033/034 run through a caching harness backend (conformance-adapter §6.8: `source_read_count` plus a controllable `advance_clock`).
- **Failure-isolation `catch` gate + cause-chain classification primitive** (proposal 0074, pipeline-utilities §6.3 / §6.4, spec v0.65.0). `FailureIsolationMiddleware` gains an optional `catch`: a set of error categories. An exception is caught only if the *derived category* of its cause chain (the outermost non-carrier link's category, resolved through the engine's `node_exception` carriers, the same value reported as `caught_exception.category`) is in the set. This closes a degrade-into-crash footgun: at a wrapping placement (subgraph, fan-out instance, branch) the engine wraps the originating failure in a carrier, so a `predicate` inspecting the surface exception sees only the carrier and misses it, whereas `catch` classifies through the carrier. `catch` composes with `predicate` as a conjunction; both default permissive (both unset stays catch-all), and a null derived category never matches a non-empty set. The carrier-skipping walk behind `catch` and `caught_exception` is promoted to a public primitive, `classify_cause_chain(exc) -> CaughtException` (the ordered `chain`, the derived `category`, and its `message` — the same record the event carries), exported from `openarmature.graph` for use in a custom `predicate`, a router, a metric, or a full-chain retry classifier. The default retry classifier stays deliberately single-level (it classifies at re-attempt granularity); this is now documented, with no behavior change. Conformance fixture 072 (catch matches through an instance-placement carrier and degrades; a non-matching catch propagates with no event). The optional native-exception-type `catch` form (spec MAY) is not shipped.
- **Inline-callable parallel branches and conditional `when`** (proposal 0075, pipeline-utilities §11, spec v0.66.0). `ParallelBranchesNode` gains two additive branch forms. A branch may now give its work as `call`, an inline async function over the parent state returning a parent-shaped partial update, instead of a compiled `subgraph` with its own state schema and `inputs` / `outputs` projection; the returned partial is the branch's contribution directly, merged via the parent reducer with no projection. This makes the primitive adoptable for the "M heterogeneous lightweight parallel calls over shared state, each independently failure-isolated" shape (hybrid recall, paired reads) that previously dropped to a hand-rolled gather, while reusing the existing concurrency, fail-fast cancellation, per-branch failure isolation, and reducer fan-in. A branch gives its work as exactly one of `subgraph` / `call`, and a callable branch declares no `inputs` / `outputs`, else a new compile-time `ParallelBranchesInvalidBranchSpec`; a node may mix the two forms freely. A branch (either form) may also carry an optional `when` predicate over the parent state, evaluated once at dispatch: a `False` result skips the branch entirely (no dispatch, contribution, observer events, or span), and an all-skipped node is a valid no-op distinct from the compile-time `ParallelBranchesNoBranches`. A callable branch is the unit of work, so it emits one `started` / `completed` observer pair keyed by `branch_name` (rendered as a single branch span); a skipped branch emits nothing. `ParallelBranchesInvalidBranchSpec` is exported from `openarmature.graph`. Conformance fixtures 073 (two callable branches merge to disjoint fields), 074 (conditional `when` skips / dispatches), and 075 (callable branch failure-isolation degrade) run in `test_pipeline_utilities`.
- **Tool-call request observability on LLM spans** (proposal 0076, observability §5.5.1 / §5.5.10 / §5.5.5, spec v0.67.0). The tool calls a model requests in its completion now have an output-side home on the `openarmature.llm.complete` span, closing the gap where they surfaced only incidentally on the next turn's input history. *Which* tools were requested renders by default as three ungated identity projections (the class of `openarmature.llm.model`): `openarmature.llm.output.tool_calls.count`, `.names`, and `.ids`, with `.names` and `.ids` index-aligned in request order and `.count` equal to their length. The full request, arguments included, renders as the payload-gated `openarmature.llm.output.tool_calls`, a JSON `[{id, name, arguments}]` array reusing the input tool-call encoding, surfaced only with `disable_provider_payload=False`. The whole family is emitted only on a tool-calling completion; a completion that requests no tools emits none of it (absence, not `count = 0`). The typed `LlmCompletionEvent` gains an additive `output_tool_calls` field carrying the `ToolCall` records, the source the span attributes render from (in Python the OTel span renders from the per-attempt `LlmRetryAttemptEvent`, which carries the field too). This is the request side; the tool-execution complement (a separate `openarmature.tool.call` span) is a later proposal, joined to this one by the `ToolCall.id`. A Langfuse request-side mapping is out of scope. Conformance fixtures 085 (two requested calls surface count / names / ids), 086 (no calls, family absent), and 087 (payload gating: identity survives payload-off while the full serialization is suppressed) run in `test_observability`.

### Changed

- **Pinned spec advances v0.60.0 → v0.66.1** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), and v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression), v0.65.0 (proposal 0074, the failure-isolation `catch` gate above), v0.66.0 (proposal 0075, the inline-callable parallel branches and conditional `when` above), and the v0.66.1 patch (an observability §8 call-level-retry Langfuse-mapping clarification reconciling §8 with the per-attempt §5.5 spans: one terminal Generation per `complete()` call, not one per attempt, which the Langfuse observer already renders by driving the Generation from the terminal `LlmCompletionEvent` / `LlmFailedEvent` and skipping the per-attempt `LlmRetryAttemptEvent`; no behavior or fixture change). `conformance.toml` records 0061 / 0072 / 0074 / 0075 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
- **Pinned spec advances v0.60.0 → v0.67.0** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression), v0.65.0 (proposal 0074, the failure-isolation `catch` gate above), v0.66.0 (proposal 0075, the inline-callable parallel branches and conditional `when` above), the v0.66.1 patch (an observability §8 call-level-retry Langfuse-mapping clarification reconciling §8 with the per-attempt §5.5 spans: one terminal Generation per `complete()` call, not one per attempt, which the Langfuse observer already renders by driving the Generation from the terminal `LlmCompletionEvent` / `LlmFailedEvent` and skipping the per-attempt `LlmRetryAttemptEvent`; no behavior or fixture change), and v0.67.0 (proposal 0076, the tool-call request observability above). `conformance.toml` records 0061 / 0072 / 0074 / 0075 / 0076 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.

## [0.14.0] — 2026-06-17

10 changes: 9 additions & 1 deletion conformance.toml
Expand Up @@ -32,7 +32,7 @@

[manifest]
implementation = "openarmature-python"
spec_pin = "v0.66.1"
spec_pin = "v0.67.0"

# Status values:
# implemented — shipped behavior matches the proposal's contract
Expand Down Expand Up @@ -731,3 +731,11 @@ note = "FailureIsolationMiddleware gains an optional `catch` set of error catego
status = "implemented"
since = "0.15.0"
note = "ParallelBranchesNode gains two additive branch forms. (1) Inline-callable branches (§11.1.1): a BranchSpec may give its work as `call` (an async function over the parent state returning a parent-shaped partial update) instead of a compiled `subgraph` + inputs/outputs projection; the contribution is the returned partial directly, merged via the parent reducer with no projection (§11.4). Exactly one of subgraph/call per branch, and a callable branch declares no inputs/outputs, else parallel_branches_invalid_branch_spec (a new compile-time category); a node MAY mix subgraph and callable branches. Per-leg failure isolation on a callable branch is the existing §11.7 branch-middleware contract (wrap the callable in FailureIsolationMiddleware). (2) Conditional branches (§11.10): a BranchSpec may carry an optional `when` predicate (parent_state) -> bool, evaluated once at dispatch; false skips the branch entirely (no dispatch, contribution, observer events, or span). All-branches-skipped is a valid no-op, distinct from the compile-time parallel_branches_no_branches (empty declared mapping). graph-engine §6 / observability §5.7: a callable branch is the unit -- it emits one started/completed pair keyed by branch_name (rendered as a branch span via the existing §5.7 machinery), a skipped branch emits nothing. Fixtures 073 (two callable branches merge to disjoint fields), 074 (when false skips / true dispatches), 075 (callable branch + FailureIsolationMiddleware degrades, sibling completes, category resolves through the chain)."

# Spec v0.67.0 (proposal 0076). Tool-call request observability on the
# LLM completion span (observability §5.5.1 / §5.5.10 / §5.5.5,
# graph-engine §6).
[proposals."0076"]
status = "implemented"
since = "0.15.0"
note = "The model's output tool calls get an output-side home on the openarmature.llm.complete span. observability §5.5.10 adds the UNGATED identity projections openarmature.llm.output.tool_calls.count / .names / .ids (the class of openarmature.llm.model / attempt_index; emitted only on a tool-calling completion, omitted entirely otherwise -- not count=0); .names and .ids are index-aligned in request order, .count equals their length. §5.5.1 adds the GATED openarmature.llm.output.tool_calls, the full [{id, name, arguments}] serialization (reusing the §5.5.5 input tool-call encoding) carrying the arguments, suppressed under disable_provider_payload and subject to the truncation contract. graph-engine §6: LlmCompletionEvent gains an output_tool_calls field (the ToolCall records, populated unconditionally). python carries the field on BOTH the terminal LlmCompletionEvent (spec-conformance + the Langfuse/consumer path) and the python-internal per-attempt LlmRetryAttemptEvent, and the OTel observer renders the span attributes from the per-attempt event (the LLM-span source since 0050) -- mirroring how output_content already works. OA-namespace, no gen_ai.* mirror (the attempt_index precedent). Langfuse request-side mapping is OUT OF SCOPE (proposal defers it as future work); no Langfuse change. Fixtures 085 (two calls -> count/names/ids), 086 (no calls -> family absent), 087 (payload-gating: identity survives off, gated full present only on)."
31 changes: 30 additions & 1 deletion docs/concepts/observability.md
Expand Up @@ -739,13 +739,19 @@ observer = OTelObserver(
)
```

This surfaces three attributes:
This surfaces four attributes:

- `openarmature.llm.input.messages`: JSON-encoded message array
(the spec §3 message shape: `{role, content, tool_calls?, …}`).
- `openarmature.llm.output.content`: the assistant's response
content string verbatim. Omitted for tool-call-only responses
with empty content.
- `openarmature.llm.output.tool_calls`: JSON-encoded `[{id, name,
arguments}]` array of the tool calls the model requested (the same
encoding `tool_calls` uses inside `input.messages`). This is the
output-side home for the request, including the call arguments, so
it is payload-gated. Emitted only when the response requests tool
calls.
- `openarmature.llm.request.extras`: JSON-encoded `RuntimeConfig`
extras bag (provider-specific pass-through fields like
`repetition_penalty` for vLLM, or `top_k` for HuggingFace
Expand All @@ -757,6 +763,29 @@ observability. The flag name keeps symmetry with `disable_llm_spans`:
the default value (`True`) reads as "the observer disables payload
emission by default."

#### Output tool-call identity (ungated)

The full `openarmature.llm.output.tool_calls` carries the arguments, so
it is payload-gated. But *which* tools the model asked for (their
names and ids) is identity, not payload: the same class as
`openarmature.llm.model`. Three identity projections therefore render
**regardless** of `disable_provider_payload`, so the request is visible
under the default payload-off posture and queryable without parsing
JSON:

- `openarmature.llm.output.tool_calls.count`: the number of tool calls
requested (an int, equal to the length of `.names`).
- `openarmature.llm.output.tool_calls.names`: the requested tool names,
in request order.
- `openarmature.llm.output.tool_calls.ids`: the requested `ToolCall`
ids, index-aligned with `.names` (`names[i]` / `ids[i]` describe the
same call), the linkage to a downstream tool execution.

The whole family (these three plus the gated full serialization) is
emitted **only** on a tool-calling completion. A completion that
requests no tools emits none of them; absence means "no tools
requested", distinct from `count = 0`.
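
The gating rules above can be sketched in a few lines. This is an
illustration of the described behavior, not the observer's actual
code: the `ToolCall` dataclass and the `tool_call_attributes` helper
are hypothetical stand-ins, `arguments` is assumed to be a plain dict,
and truncation is not modeled.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for the library's ToolCall record."""

    id: str
    name: str
    arguments: dict[str, Any]  # assumption: arguments as a plain dict


PREFIX = "openarmature.llm.output.tool_calls"


def tool_call_attributes(
    tool_calls: list[ToolCall],
    disable_provider_payload: bool = True,
) -> dict[str, Any]:
    """Project requested tool calls into span attributes (illustrative only)."""
    # No tools requested: emit nothing at all, not count = 0.
    if not tool_calls:
        return {}
    # Identity projections are ungated: they render under either flag value.
    attrs: dict[str, Any] = {
        f"{PREFIX}.count": len(tool_calls),
        f"{PREFIX}.names": [call.name for call in tool_calls],
        f"{PREFIX}.ids": [call.id for call in tool_calls],
    }
    # The full serialization carries the arguments, so it is payload-gated.
    if not disable_provider_payload:
        attrs[PREFIX] = json.dumps(
            [
                {"id": call.id, "name": call.name, "arguments": call.arguments}
                for call in tool_calls
            ]
        )
    return attrs


calls = [
    ToolCall(id="call_1", name="search", arguments={"q": "weather"}),
    ToolCall(id="call_2", name="lookup", arguments={"key": "abc"}),
]
default_attrs = tool_call_attributes(calls)  # payload-off default
full_attrs = tool_call_attributes(calls, disable_provider_payload=False)
empty_attrs = tool_call_attributes([])  # no tools requested
```

In this sketch `default_attrs` holds only the three identity keys,
`full_attrs` adds the JSON-encoded full request, and `empty_attrs` is
an empty dict, matching the "absence, not `count = 0`" rule.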

#### Truncation

Each payload attribute is capped at `payload_max_bytes` UTF-8 bytes
1 change: 1 addition & 0 deletions docs/model-providers/authoring.md
Expand Up @@ -305,6 +305,7 @@ of:
finish_reason=response.finish_reason,
input_messages=serialized_messages,
output_content=response.message.content or None,
output_tool_calls=list(response.message.tool_calls or []),
request_params=request_params,
request_extras=request_extras,
active_prompt=None,
2 changes: 1 addition & 1 deletion pyproject.toml
Expand Up @@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
openarmature = "openarmature.cli:main"

[tool.openarmature]
spec_version = "0.66.1"
spec_version = "0.67.0"

[dependency-groups]
dev = [
4 changes: 2 additions & 2 deletions src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
# OpenArmature — Agent documentation

*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.66.1). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.67.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*

## TL;DR

Expand All @@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:

## Capability contracts

_Sourced from openarmature-spec v0.66.1. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
_Sourced from openarmature-spec v0.67.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._

### Capability: `graph-engine`

2 changes: 1 addition & 1 deletion src/openarmature/__init__.py
Expand Up @@ -25,7 +25,7 @@
"""

__version__ = "0.14.0"
__spec_version__ = "0.66.1"
__spec_version__ = "0.67.0"
# Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
# package-registry name for this implementation. Surfaces on every
# OTel invocation span as ``openarmature.implementation.name`` and on