Add tool-call request observability on LLM spans (0076) #176

Merged: chris-colinsky merged 2 commits into main from feature/0076-tool-call-request-observability on Jun 21, 2026
@chris-colinsky (Member)

Implements accepted proposal 0076 (spec v0.67.0), which gives the model's output tool calls an output-side home on the openarmature.llm.complete span. The spec pin advances from v0.66.1 to v0.67.0 (0076 is the only proposal in the delta).

What changed

The output payload openarmature.llm.output.content is response text and is empty on a tool-call-only completion, so the tool calls a model requests had no output-side home and surfaced only incidentally on the next turn's input history. This adds them in two layers on the existing LLM span:

  • Ungated identity (the class of openarmature.llm.model): openarmature.llm.output.tool_calls.count / .names / .ids, so which tools were requested survives the default payload-off posture and is queryable without parsing JSON. .names / .ids are index-aligned in request order; .count equals their length.
  • Gated full: openarmature.llm.output.tool_calls, the full [{id, name, arguments}] serialization carrying the arguments, behind disable_provider_payload.

The whole family is emitted only on a tool-calling completion; a completion that requests no tools emits none of it (absence, not count = 0).
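
The two layers and the absence rule can be sketched as a single projection function. This is a minimal illustration, not the observer's actual code: the ToolCall dataclass, the function name tool_call_attributes, and the choice of a JSON string for the gated value are assumptions made for the example. Only the attribute names and the disable_provider_payload gate come from the description above.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for the project's ToolCall record."""
    id: str
    name: str
    arguments: dict[str, Any]


PREFIX = "openarmature.llm.output.tool_calls"


def tool_call_attributes(
    tool_calls: list[ToolCall], *, disable_provider_payload: bool = True
) -> dict[str, Any]:
    """Project tool calls into span attributes (illustrative sketch only)."""
    # Absence, not count = 0: a completion that requests no tools emits nothing.
    if not tool_calls:
        return {}

    # Ungated identity: always emitted, index-aligned in request order.
    attrs: dict[str, Any] = {
        f"{PREFIX}.count": len(tool_calls),
        f"{PREFIX}.names": [call.name for call in tool_calls],
        f"{PREFIX}.ids": [call.id for call in tool_calls],
    }

    # Gated full serialization: carries the arguments, so it is emitted only
    # when payloads are enabled. Assumption: the value is a JSON string.
    if not disable_provider_payload:
        attrs[PREFIX] = json.dumps(
            [
                {"id": call.id, "name": call.name, "arguments": call.arguments}
                for call in tool_calls
            ]
        )
    return attrs
```

With the payload gate left on, only the three identity attributes are present; passing disable_provider_payload=False adds the full serialization alongside them.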

Implementation note

The proposal frames this as a field on LlmCompletionEvent. In this implementation the OTel span has rendered from the internal LlmRetryAttemptEvent since 0050 (the terminal LlmCompletionEvent is the Langfuse/consumer path), so the output_tool_calls field lands on both events and the observer renders the span attributes from the per-attempt one, mirroring how output_content already works. This is the same internal-event latitude the spec blessed for the 0050 per-attempt surface (5.5 does not pin which internal event the observer renders from); no spec implication.
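
As a rough picture of the event shapes described above, the sketch below shows the same tool-call records landing on both events. The class bodies are heavily reduced and hypothetical: only the event names and the output_content / output_tool_calls field names come from this PR, while the attempt field, the tuple type, and the events_for_final_attempt helper are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for the project's ToolCall record."""
    id: str
    name: str
    arguments: dict[str, Any]


@dataclass(frozen=True)
class LlmRetryAttemptEvent:
    """Per-attempt event (reduced shape); the OTel span renders from this one."""
    attempt: int
    output_content: str = ""
    output_tool_calls: tuple[ToolCall, ...] = ()


@dataclass(frozen=True)
class LlmCompletionEvent:
    """Terminal event (reduced shape); the Langfuse/consumer path reads this one."""
    output_content: str = ""
    output_tool_calls: tuple[ToolCall, ...] = ()


def events_for_final_attempt(
    attempt: int, content: str, tool_calls: list[ToolCall]
) -> tuple[LlmRetryAttemptEvent, LlmCompletionEvent]:
    """Populate both events from one provider response (hypothetical helper)."""
    calls = tuple(tool_calls)  # populated unconditionally, never payload-gated
    return (
        LlmRetryAttemptEvent(attempt, content, calls),
        LlmCompletionEvent(content, calls),
    )
```

Because both events carry identical records, an observer can render from either one without the span and the consumer path disagreeing about which tools were requested.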

The {id, name, arguments} encoding is shared with the input-message serialization via a single serialize_tool_calls helper, and the ungated identity arrays are deliberately untruncated (truncating would break the count == len(names) invariant, the index-alignment, or the id linkage to a downstream tool execution).
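
To make the truncation argument concrete, here is a small check of the identity invariants together with a deliberately naive truncation. Both helpers (identity_is_consistent and truncate_list) are hypothetical and exist only for this illustration; they are not part of the codebase.

```python
from typing import Any

PREFIX = "openarmature.llm.output.tool_calls"


def identity_is_consistent(attrs: dict[str, Any]) -> bool:
    """Check count == len(names) == len(ids) on a span-attribute mapping."""
    count = attrs.get(f"{PREFIX}.count")
    names = attrs.get(f"{PREFIX}.names")
    ids = attrs.get(f"{PREFIX}.ids")
    # A wholly absent family is consistent (no tools were requested).
    if count is None and names is None and ids is None:
        return True
    # A partially present family is not.
    if count is None or names is None or ids is None:
        return False
    return count == len(names) == len(ids)


def truncate_list(values: list[str], limit: int) -> list[str]:
    """Naive truncation policy, shown only to demonstrate the breakage."""
    return values[:limit]


attrs = {
    f"{PREFIX}.count": 3,
    f"{PREFIX}.names": ["search", "fetch", "summarize"],
    f"{PREFIX}.ids": ["call_1", "call_2", "call_3"],
}
truncated = {**attrs, f"{PREFIX}.names": truncate_list(attrs[f"{PREFIX}.names"], 2)}
```

Here attrs passes the check while truncated fails it: the third id no longer has a name at the same index, which is the index-alignment loss the paragraph above describes.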

A Langfuse request-side mapping is out of scope (the proposal defers it as future work); the execution-side openarmature.tool.call span is a later proposal, joined to this one by the ToolCall.id.

Tests

  • Unit: observer rendering (identity, absence, payload-gating including the gated value) and provider population on both events.
  • Conformance: fixtures 085 / 086 / 087 wired through the generic LLM-payload runner, with a presence-only attributes_present clause added for 087's gated attribute.
  • Full suite green; ruff + pyright clean; mkdocs build --strict clean.
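
A presence-only clause of the kind mentioned for fixture 087 might look like the sketch below. The function name check_span_attributes, its signature, and the failure-message format are assumptions; only the attributes_present clause name comes from the description above. The idea is that exact-match expectations and presence-only expectations are checked side by side, which is useful when a fixture should not pin the exact value of an attribute.

```python
from typing import Any, Optional


def check_span_attributes(
    actual: dict[str, Any],
    *,
    attributes: Optional[dict[str, Any]] = None,
    attributes_present: Optional[list[str]] = None,
) -> list[str]:
    """Return human-readable failures; an empty list means the span passes."""
    failures: list[str] = []

    # Exact-match clause: the key must exist and the value must be equal.
    for key, expected in (attributes or {}).items():
        if key not in actual:
            failures.append(f"missing attribute: {key}")
        elif actual[key] != expected:
            failures.append(f"{key}: expected {expected!r}, got {actual[key]!r}")

    # Presence-only clause: the key must exist; its value is not compared.
    for key in attributes_present or []:
        if key not in actual:
            failures.append(f"missing attribute: {key}")

    return failures
```

A runner built this way can assert the ungated identity values exactly while requiring only that the gated attribute is emitted when payloads are enabled.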

The model's output tool calls had no output-side home: output.content
is response text and is empty on a tool-call-only completion, so the
requested calls surfaced only incidentally as next-turn input history.

Add an output_tool_calls field (the ToolCall records, populated
unconditionally) to LlmCompletionEvent and the per-attempt
LlmRetryAttemptEvent. On a tool-calling completion the OTel observer
emits the ungated identity projections
openarmature.llm.output.tool_calls.count / .names / .ids plus the
payload-gated full [{id, name, arguments}] serialization. Identity is
deliberately untruncated; the gated full shares a serialize_tool_calls
encoder with the input-message side.

Implements proposal 0076 (observability 5.5.1 / 5.5.10).
Advance the spec pin v0.66.1 -> v0.67.0 across the four sync points
(submodule, __spec_version__, pyproject, conformance manifest) and the
smoke assertion; regenerate the bundled AGENTS.md.

Wire conformance fixtures 085-087 through the generic LLM-payload
runner (with a presence-only attributes_present clause for 087's gated
attribute) and record proposal 0076 implemented. Document the new
attributes in the observability concept and the provider-authoring
event snippet; add the 0076 CHANGELOG entry.

Copilot AI review requested due to automatic review settings June 21, 2026 17:13

Copilot AI left a comment

Pull request overview

Implements accepted spec proposal 0076 (spec v0.67.0) by surfacing model-requested tool calls on the openarmature.llm.complete OTel span, including ungated identity projections and a payload-gated full serialization, and updates the repo’s spec pin + docs/tests accordingly.

Changes:

  • Add span attributes for output tool-call identity (.count/.names/.ids) and payload-gated full tool-call serialization (openarmature.llm.output.tool_calls) on openarmature.llm.complete.
  • Populate output_tool_calls on both LlmCompletionEvent and LlmRetryAttemptEvent, and centralize tool-call serialization via serialize_tool_calls.
  • Bump spec pin to v0.67.0 and extend unit + conformance coverage and documentation to match.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/openarmature/observability/otel/observer.py | Emits new tool-call-related span attributes (ungated identity + gated full serialization). |
| src/openarmature/observability/llm_event.py | Adds shared serialize_tool_calls helper for consistent tool-call encoding. |
| src/openarmature/llm/providers/openai.py | Populates typed-event output_tool_calls and reuses shared tool-call serialization for payload messages. |
| src/openarmature/graph/events.py | Adds output_tool_calls fields to typed LLM events (completion + per-attempt). |
| tests/unit/test_observability_otel.py | Unit tests for new OTel span attributes, absence semantics, and payload gating. |
| tests/unit/test_llm_provider.py | Unit test asserting provider populates output_tool_calls on both relevant typed events. |
| tests/conformance/test_observability.py | Wires new conformance fixtures and adds support for presence-only attribute assertions. |
| tests/test_smoke.py | Updates smoke assertion for __spec_version__ to v0.67.0. |
| pyproject.toml | Updates tool.openarmature.spec_version to v0.67.0. |
| src/openarmature/__init__.py | Updates __spec_version__ to v0.67.0. |
| src/openarmature/AGENTS.md | Updates embedded spec version reference to v0.67.0. |
| docs/model-providers/authoring.md | Updates provider authoring example to include output_tool_calls. |
| docs/concepts/observability.md | Documents the new output tool-call attributes and gating/absence semantics. |
| conformance.toml | Updates spec pin to v0.67.0 and records proposal 0076 as implemented. |
| CHANGELOG.md | Adds release note entry for proposal 0076 and updates spec-pin narrative to v0.67.0. |

Comment thread src/openarmature/graph/events.py
Comment thread src/openarmature/graph/events.py
@chris-colinsky merged commit cd3a8ad into main on Jun 21, 2026
7 checks passed
@chris-colinsky deleted the feature/0076-tool-call-request-observability branch on June 21, 2026 17:19
2 participants