LunarCommand · chris-colinsky · Jun 21, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -14,10 +14,11 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 - **Per-fetch prompt cache control: `cache_ttl_seconds`** (proposal 0072, prompt-management §5 / §6, spec v0.63.0). `PromptBackend.fetch`, `PromptManager.fetch`, and `PromptManager.get` gain an optional `cache_ttl_seconds` read-side control: `None` preserves current behavior, `0` forces a fresh read past any client-side cache, and `N > 0` bounds a served entry's staleness to N seconds; a negative value is rejected at the manager. It governs only which cached entry may be served, not whether or how results are cached. The bundled filesystem backend is cacheless and ignores it; the bundled Langfuse backend forwards it to the Langfuse SDK's `get_prompt` cache. Conformance fixtures 033/034 run through a caching harness backend (conformance-adapter §6.8: `source_read_count` plus a controllable `advance_clock`).
 - **Failure-isolation `catch` gate + cause-chain classification primitive** (proposal 0074, pipeline-utilities §6.3 / §6.4, spec v0.65.0). `FailureIsolationMiddleware` gains an optional `catch`: a set of error categories. An exception is caught only if the *derived category* of its cause chain (the outermost non-carrier link's category, resolved through the engine's `node_exception` carriers, the same value reported as `caught_exception.category`) is in the set. This closes a degrade-into-crash footgun: at a wrapping placement (subgraph, fan-out instance, branch) the engine wraps the originating failure in a carrier, so a `predicate` inspecting the surface exception sees only the carrier and misses it, whereas `catch` classifies through the carrier. `catch` composes with `predicate` as a conjunction; both default permissive (both unset stays catch-all), and a null derived category never matches a non-empty set. The carrier-skipping walk behind `catch` and `caught_exception` is promoted to a public primitive, `classify_cause_chain(exc) -> CaughtException` (the ordered `chain`, the derived `category`, and its `message` — the same record the event carries), exported from `openarmature.graph` for use in a custom `predicate`, a router, a metric, or a full-chain retry classifier. The default retry classifier stays deliberately single-level (it classifies at re-attempt granularity); this is now documented, with no behavior change. Conformance fixture 072 (catch matches through an instance-placement carrier and degrades; a non-matching catch propagates with no event). The optional native-exception-type `catch` form (spec MAY) is not shipped.
 - **Inline-callable parallel branches and conditional `when`** (proposal 0075, pipeline-utilities §11, spec v0.66.0). `ParallelBranchesNode` gains two additive branch forms. A branch may now give its work as `call`, an inline async function over the parent state returning a parent-shaped partial update, instead of a compiled `subgraph` with its own state schema and `inputs` / `outputs` projection; the returned partial is the branch's contribution directly, merged via the parent reducer with no projection. This makes the primitive adoptable for the "M heterogeneous lightweight parallel calls over shared state, each independently failure-isolated" shape (hybrid recall, paired reads) that previously dropped to a hand-rolled gather, while reusing the existing concurrency, fail-fast cancellation, per-branch failure isolation, and reducer fan-in. A branch gives its work as exactly one of `subgraph` / `call`, and a callable branch declares no `inputs` / `outputs`, else a new compile-time `ParallelBranchesInvalidBranchSpec`; a node may mix the two forms freely. A branch (either form) may also carry an optional `when` predicate over the parent state, evaluated once at dispatch: a `False` result skips the branch entirely (no dispatch, contribution, observer events, or span), and an all-skipped node is a valid no-op distinct from the compile-time `ParallelBranchesNoBranches`. A callable branch is the unit of work, so it emits one `started` / `completed` observer pair keyed by `branch_name` (rendered as a single branch span); a skipped branch emits nothing. `ParallelBranchesInvalidBranchSpec` is exported from `openarmature.graph`. Conformance fixtures 073 (two callable branches merge to disjoint fields), 074 (conditional `when` skips / dispatches), and 075 (callable branch failure-isolation degrade) run in `test_pipeline_utilities`.
+- **Tool-call request observability on LLM spans** (proposal 0076, observability §5.5.1 / §5.5.10 / §5.5.5, spec v0.67.0). The tool calls a model requests in its completion now have an output-side home on the `openarmature.llm.complete` span, closing the gap where they surfaced only incidentally on the next turn's input history. *Which* tools were requested renders by default as three ungated identity projections (the same class as `openarmature.llm.model`): `openarmature.llm.output.tool_calls.count`, `.names`, and `.ids`, with `.names` and `.ids` index-aligned in request order and `.count` equal to their length. The full request, arguments included, renders as the payload-gated `openarmature.llm.output.tool_calls`, a JSON `[{id, name, arguments}]` array reusing the input tool-call encoding, surfaced only with `disable_provider_payload=False`. The whole family is emitted only on a tool-calling completion; a completion that requests no tools emits none of it (absence, not `count = 0`). The typed `LlmCompletionEvent` gains an additive `output_tool_calls` field carrying the `ToolCall` records, the source the span attributes render from (in Python the OTel span renders from the per-attempt `LlmRetryAttemptEvent`, which carries the field too). This is the request side; the tool-execution complement (a separate `openarmature.tool.call` span) is a later proposal, joined to this one by the `ToolCall.id`. A Langfuse request-side mapping is out of scope. Conformance fixtures 085 (two requested calls surface count / names / ids), 086 (no calls, family absent), and 087 (payload gating: identity survives payload-off while the full serialization is suppressed) run in `test_observability`.
 
 ### Changed
 
-- **Pinned spec advances v0.60.0 → v0.66.1** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), and v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression), v0.65.0 (proposal 0074, the failure-isolation `catch` gate above), v0.66.0 (proposal 0075, the inline-callable parallel branches and conditional `when` above), and the v0.66.1 patch (an observability §8 call-level-retry Langfuse-mapping clarification reconciling §8 with the per-attempt §5.5 spans: one terminal Generation per `complete()` call, not one per attempt, which the Langfuse observer already renders by driving the Generation from the terminal `LlmCompletionEvent` / `LlmFailedEvent` and skipping the per-attempt `LlmRetryAttemptEvent`; no behavior or fixture change). `conformance.toml` records 0061 / 0072 / 0074 / 0075 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
+- **Pinned spec advances v0.60.0 → v0.67.0** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above), v0.62.0 (proposal 0064, the Langfuse session/user population above), v0.63.0 (proposal 0072, the prompt cache control above), the v0.63.1 patch (pipeline-utilities coverage fixtures 070/071 for the already-implemented 0069 / 0070 behavior, no new proposal), v0.64.0 (proposal 0073, GenAI semconv adoption reconciliation: OA retains `gen_ai.system` despite the upstream rename to `gen_ai.provider.name`; textual-only, with no emitted-attribute or fixture change, so the existing `gen_ai.*` fixtures stand as the retention regression), v0.65.0 (proposal 0074, the failure-isolation `catch` gate above), v0.66.0 (proposal 0075, the inline-callable parallel branches and conditional `when` above), the v0.66.1 patch (an observability §8 call-level-retry Langfuse-mapping clarification reconciling §8 with the per-attempt §5.5 spans: one terminal Generation per `complete()` call, not one per attempt, which the Langfuse observer already renders by driving the Generation from the terminal `LlmCompletionEvent` / `LlmFailedEvent` and skipping the per-attempt `LlmRetryAttemptEvent`; no behavior or fixture change), and v0.67.0 (proposal 0076, the tool-call request observability above). `conformance.toml` records 0061 / 0072 / 0074 / 0075 / 0076 `implemented`, 0064 `partial` (its `sessionId` half is dormant pending the sessions capability), and 0073 `textual-only`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
 
 ## [0.14.0] — 2026-06-17
 

diff --git a/conformance.toml b/conformance.toml
@@ -32,7 +32,7 @@
 
 [manifest]
 implementation = "openarmature-python"
-spec_pin = "v0.66.1"
+spec_pin = "v0.67.0"
 
 # Status values:
 #   implemented   — shipped behavior matches the proposal's contract
@@ -731,3 +731,11 @@ note = "FailureIsolationMiddleware gains an optional `catch` set of error catego
 status = "implemented"
 since = "0.15.0"
 note = "ParallelBranchesNode gains two additive branch forms. (1) Inline-callable branches (§11.1.1): a BranchSpec may give its work as `call` (an async function over the parent state returning a parent-shaped partial update) instead of a compiled `subgraph` + inputs/outputs projection; the contribution is the returned partial directly, merged via the parent reducer with no projection (§11.4). Exactly one of subgraph/call per branch, and a callable branch declares no inputs/outputs, else parallel_branches_invalid_branch_spec (a new compile-time category); a node MAY mix subgraph and callable branches. Per-leg failure isolation on a callable branch is the existing §11.7 branch-middleware contract (wrap the callable in FailureIsolationMiddleware). (2) Conditional branches (§11.10): a BranchSpec may carry an optional `when` predicate (parent_state) -> bool, evaluated once at dispatch; false skips the branch entirely (no dispatch, contribution, observer events, or span). All-branches-skipped is a valid no-op, distinct from the compile-time parallel_branches_no_branches (empty declared mapping). graph-engine §6 / observability §5.7: a callable branch is the unit -- it emits one started/completed pair keyed by branch_name (rendered as a branch span via the existing §5.7 machinery), a skipped branch emits nothing. Fixtures 073 (two callable branches merge to disjoint fields), 074 (when false skips / true dispatches), 075 (callable branch + FailureIsolationMiddleware degrades, sibling completes, category resolves through the chain)."
+
+# Spec v0.67.0 (proposal 0076).  Tool-call request observability on the
+# LLM completion span (observability §5.5.1 / §5.5.10 / §5.5.5,
+# graph-engine §6).
+[proposals."0076"]
+status = "implemented"
+since = "0.15.0"
+note = "The model's output tool calls get an output-side home on the openarmature.llm.complete span. observability §5.5.10 adds the UNGATED identity projections openarmature.llm.output.tool_calls.count / .names / .ids (the class of openarmature.llm.model / attempt_index; emitted only on a tool-calling completion, omitted entirely otherwise -- not count=0); .names and .ids are index-aligned in request order, .count equals their length. §5.5.1 adds the GATED openarmature.llm.output.tool_calls, the full [{id, name, arguments}] serialization (reusing the §5.5.5 input tool-call encoding) carrying the arguments, suppressed under disable_provider_payload and subject to the truncation contract. graph-engine §6: LlmCompletionEvent gains an output_tool_calls field (the ToolCall records, populated unconditionally). python carries the field on BOTH the terminal LlmCompletionEvent (spec-conformance + the Langfuse/consumer path) and the python-internal per-attempt LlmRetryAttemptEvent, and the OTel observer renders the span attributes from the per-attempt event (the LLM-span source since 0050) -- mirroring how output_content already works. OA-namespace, no gen_ai.* mirror (the attempt_index precedent). Langfuse request-side mapping is OUT OF SCOPE (proposal defers it as future work); no Langfuse change. Fixtures 085 (two calls -> count/names/ids), 086 (no calls -> family absent), 087 (payload-gating: identity survives off, gated full present only on)."
diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md
@@ -739,13 +739,19 @@ observer = OTelObserver(
 )
 ```
 
-This surfaces three attributes:
+This surfaces four attributes:
 
 - `openarmature.llm.input.messages`: JSON-encoded message array
   (the spec §3 message shape: `{role, content, tool_calls?, …}`).
 - `openarmature.llm.output.content`: the assistant's response
   content string verbatim. Omitted for tool-call-only responses
   with empty content.
+- `openarmature.llm.output.tool_calls`: JSON-encoded `[{id, name,
+  arguments}]` array of the tool calls the model requested (the same
+  encoding `tool_calls` uses inside `input.messages`). This is the
+  output-side home for the request, including the call arguments, so
+  it is payload-gated. Emitted only when the response requests tool
+  calls.
 - `openarmature.llm.request.extras`: JSON-encoded `RuntimeConfig`
   extras bag (provider-specific pass-through fields like
   `repetition_penalty` for vLLM, or `top_k` for HuggingFace
@@ -757,6 +763,29 @@ observability. The flag name keeps symmetry with `disable_llm_spans`:
 the default value (`True`) reads as "the observer disables payload
 emission by default."
 
+#### Output tool-call identity (ungated)
+
+The full `openarmature.llm.output.tool_calls` carries the arguments, so
+it is payload-gated. But *which* tools the model asked for (their
+names and ids) is identity, not payload: the same class as
+`openarmature.llm.model`. Three identity projections therefore render
+**regardless** of `disable_provider_payload`, so the request is
+visible under the default payload-off posture and queryable without
+parsing JSON:
+
+- `openarmature.llm.output.tool_calls.count`: the number of tool calls
+  requested (an int, equal to the length of `.names`).
+- `openarmature.llm.output.tool_calls.names`: the requested tool names,
+  in request order.
+- `openarmature.llm.output.tool_calls.ids`: the requested `ToolCall`
+  ids, index-aligned with `.names` (`names[i]` and `ids[i]` describe
+  the same call). The id is the linkage to a downstream tool execution.
+
+The whole family (these three plus the gated full serialization) is
+emitted **only** on a tool-calling completion. A completion that
+requests no tools emits none of them: "no tools requested" is
+signalled by absence, never by `count = 0`.
+
 #### Truncation
 
 Each payload attribute is capped at `payload_max_bytes` UTF-8 bytes

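The gating rules documented in the hunk above can be sketched as a single pure function. This is an illustrative sketch only, not the observer's actual code: the `ToolCall` dataclass here is a hypothetical stand-in for the library's record, `tool_call_attributes` is an invented helper name, `arguments` is assumed to be a JSON-serializable dict, and the `payload_max_bytes` truncation step is left out.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for the library's ToolCall record."""

    id: str
    name: str
    arguments: dict[str, Any]  # assumption: a JSON-serializable mapping


PREFIX = "openarmature.llm.output.tool_calls"


def tool_call_attributes(
    calls: list[ToolCall],
    disable_provider_payload: bool = True,
) -> dict[str, Any]:
    """Derive the output tool-call attribute family for one completion."""
    # No tool calls requested: the whole family is absent (not count = 0).
    if not calls:
        return {}

    # Identity projections: rendered regardless of the payload flag.
    attrs: dict[str, Any] = {
        f"{PREFIX}.count": len(calls),
        f"{PREFIX}.names": [c.name for c in calls],
        f"{PREFIX}.ids": [c.id for c in calls],
    }

    # Full serialization carries the arguments, so it is payload-gated.
    if not disable_provider_payload:
        attrs[PREFIX] = json.dumps(
            [{"id": c.id, "name": c.name, "arguments": c.arguments} for c in calls]
        )
    return attrs


if __name__ == "__main__":
    calls = [
        ToolCall("call_1", "search", {"q": "weather"}),
        ToolCall("call_2", "fetch", {"url": "https://example.com"}),
    ]
    print(tool_call_attributes(calls))  # identity only
    print(tool_call_attributes(calls, disable_provider_payload=False))
    print(tool_call_attributes([]))  # empty dict: family absent
```

Building `.names` and `.ids` from the same ordered list is what keeps them index-aligned, and returning an empty mapping for the no-call case is what makes absence, rather than a zero count, the signal.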
diff --git a/docs/model-providers/authoring.md b/docs/model-providers/authoring.md
@@ -305,6 +305,7 @@ of:
               finish_reason=response.finish_reason,
               input_messages=serialized_messages,
               output_content=response.message.content or None,
+              output_tool_calls=list(response.message.tool_calls or []),
               request_params=request_params,
               request_extras=request_extras,
               active_prompt=None,

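The `list(response.message.tool_calls or [])` idiom in the hunk above does two things: it maps a missing (`None`) or empty `tool_calls` to an empty list, and it copies the sequence so the event does not alias the provider's response object. A minimal sketch of that behavior, using hypothetical `SimpleNamespace` messages and an invented `normalize_tool_calls` helper in place of a real provider response:

```python
from types import SimpleNamespace


def normalize_tool_calls(tool_calls):
    """Return a fresh list; None or an empty sequence becomes []."""
    return list(tool_calls or [])


# Hypothetical provider messages, for illustration only.
text_only = SimpleNamespace(content="hello", tool_calls=None)
original = [{"id": "call_1", "name": "search"}]
calling = SimpleNamespace(content="", tool_calls=original)

no_calls = normalize_tool_calls(text_only.tool_calls)
some_calls = normalize_tool_calls(calling.tool_calls)

print(no_calls)    # an empty list for a text-only response
print(some_calls)  # a copy of the requested calls, in request order

# The neighbouring `output_content=... or None` line is the mirror image:
# empty content on a tool-call-only response collapses to None.
print(calling.content or None)
```

With the field always a list, a consumer can check `if event.output_tool_calls:` without a separate `None` branch.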
diff --git a/openarmature-spec b/openarmature-spec
diff --git a/pyproject.toml b/pyproject.toml
@@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
 openarmature = "openarmature.cli:main"
 
 [tool.openarmature]
-spec_version = "0.66.1"
+spec_version = "0.67.0"
 
 [dependency-groups]
 dev = [

diff --git a/src/openarmature/AGENTS.md b/src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
 # OpenArmature — Agent documentation
 
-*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.66.1). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.67.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
 
 ## TL;DR
 
@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:
 
 ## Capability contracts
 
-_Sourced from openarmature-spec v0.66.1. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
+_Sourced from openarmature-spec v0.67.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
 
 ### Capability: `graph-engine`
 

diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py
@@ -25,7 +25,7 @@
 """
 
 __version__ = "0.14.0"
-__spec_version__ = "0.66.1"
+__spec_version__ = "0.67.0"
 # Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
 # package-registry name for this implementation. Surfaces on every
 # OTel invocation span as ``openarmature.implementation.name`` and on
+16 −0		CHANGELOG.md
+2 −2		README.md
+12 −1		docs/compatibility.md
+2 −1		docs/proposals.md
+1 −0		docs/proposals/0076-tool-call-request-observability-llm-spans.md
+230 −0		proposals/0076-tool-call-request-observability-llm-spans.md
+5 −3		spec/graph-engine/spec.md
+28 −0		spec/observability/conformance/085-llm-tool-call-request-attributes.md
+67 −0		spec/observability/conformance/085-llm-tool-call-request-attributes.yaml
+24 −0		spec/observability/conformance/086-llm-tool-call-request-absent.md
+57 −0		spec/observability/conformance/086-llm-tool-call-request-absent.yaml
+35 −0		spec/observability/conformance/087-llm-tool-call-request-survives-payload-gating.md
+126 −0		spec/observability/conformance/087-llm-tool-call-request-survives-payload-gating.yaml
+67 −8		spec/observability/spec.md