Guard against unaccounted observability conformance fixtures by chris-colinsky · Pull Request #180 · LunarCommand/openarmature-python

chris-colinsky · 2026-06-23T03:32:24Z

What

The observability conformance harness silently called pytest.skip on any pinned fixture missing from the _SUPPORTED_FIXTURES allowlist, so a future spec fixture that nobody had wired up would skip rather than fail CI. This adds a fail-on-unknown guard and converts the silent skips into explicit, documented accounting.

Changes

- New test_observability_fixture_coverage_is_complete guard: every pinned observability fixture must be either run (_SUPPORTED_FIXTURES) or explicitly accounted for. It also catches stale entries (a documented fixture no longer on disk) and overlaps between the supported and not-run sets.
- Three documented buckets replace the silent skip:
  - _DEFERRED_FIXTURES (future capability): the 11 embedding fixtures (074-083 plus 089), gated on the embedding capability (proposal 0059, lands in v0.16.0); plus the nested-lineage Langfuse case (039), whose stale "0045 not implemented" reason is corrected (0045 is implemented; 039 is deferred because of a nested-case harness limitation).
  - _UNIT_TESTED_FIXTURES (32): implemented behavior covered by the dedicated unit suite rather than the YAML harness, each cited to its covering file.
  - _CONVENTION_ONLY_FIXTURES (3): the proposal 0048 section 9 queryable-observer pattern, which is convention-only and satisfied by documentation (no library surface).
- Close two genuine gaps the accounting surfaced (fixtures 064/066): the active_prompt / active_prompt_group event-population path was implemented in the provider, but only the observer's span rendering of an injected field was tested. Two new provider tests drive complete() inside with_active_prompt / with_active_prompt_group and assert that the emitted event carries the record.
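The accounting rule behind the guard comes down to set arithmetic. The following is a minimal sketch of that idea, not the code in tests/conformance/test_observability.py: the function name, its parameters, and the sample fixture IDs are all hypothetical.

```python
from typing import Iterable


def find_accounting_problems(
    pinned: Iterable[str],
    supported: Iterable[str],
    not_run_buckets: dict[str, Iterable[str]],
) -> list[str]:
    """Return readable problems; an empty list means every fixture is accounted for.

    All names here are illustrative assumptions, not the harness's real API.
    """
    pinned_set = set(pinned)
    supported_set = set(supported)
    problems: list[str] = []

    # Union of every documented "not run here" bucket.
    not_run: set[str] = set()
    for bucket_name, ids in not_run_buckets.items():
        ids_set = set(ids)
        # A fixture cannot be both run and documented as not run.
        for fixture_id in sorted(ids_set & supported_set):
            problems.append(f"overlap: {fixture_id} is in supported and {bucket_name}")
        not_run |= ids_set

    # Pinned on disk but neither run nor documented: this used to skip silently.
    for fixture_id in sorted(pinned_set - supported_set - not_run):
        problems.append(f"unaccounted: {fixture_id}")

    # Documented in a bucket but no longer pinned on disk.
    for fixture_id in sorted(not_run - pinned_set):
        problems.append(f"stale: {fixture_id}")

    return problems


# Hypothetical data: "099" is pinned but unaccounted, "064" is documented but gone.
example_problems = find_accounting_problems(
    pinned={"001", "039", "074", "099"},
    supported={"001"},
    not_run_buckets={"deferred": {"039", "074"}, "unit_tested": {"064"}},
)
```

In a pytest guard, the result would typically be asserted empty, with the joined problem list as the failure message, so that CI names the exact fixture that needs wiring or documenting.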

Coverage after this change

Of 98 pinned observability fixtures: 51 run in the harness, 32 covered by unit tests, 11 deferred on the embedding capability (v0.16.0), 3 convention-only, 1 fixture-wiring-pending. Behavior-wise, only the 11 embedding fixtures await unimplemented capability; everything else is implemented and tested.
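The tally above can be checked mechanically. This snippet only restates the counts from the preceding paragraph; the dictionary keys are descriptive labels chosen for illustration, not identifiers from the codebase.

```python
# Counts copied from the coverage summary; the labels are illustrative only.
coverage = {
    "run in harness": 51,
    "covered by unit tests": 32,
    "deferred on embedding capability": 11,
    "convention-only": 3,
    "fixture-wiring-pending": 1,
}

total = sum(coverage.values())

# Fixtures whose behavior is already implemented (everything except embeddings).
implemented = total - coverage["deferred on embedding capability"]

assert total == 98, f"buckets sum to {total}, expected 98 pinned fixtures"
```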

Came out of the v0.15.0 release review (spec finding: the allowlist silent-skip).

The OpenAI provider populates LlmCompletionEvent.active_prompt / active_prompt_group from the with_active_prompt context, but no test drove complete() inside a real prompt context to assert the event carries the record -- only the observer's span rendering of an injected field was covered. Add two provider tests closing that gap (conformance fixtures 064 / 066).

The conformance harness silently pytest.skip-ped any observability fixture not in _SUPPORTED_FIXTURES, so a future unwired spec fixture would not fail CI. Add test_observability_fixture_coverage_is_complete: every pinned fixture must be run or explicitly accounted for; the guard also catches stale entries and supported/not-run overlaps. Restructure the silent skips into three documented buckets: _DEFERRED_FIXTURES (future capability -- the embedding fixtures gated on proposal 0059, plus the nested-lineage case, whose stale "0045 not implemented" reason is corrected), _UNIT_TESTED_FIXTURES (32, each cited to its covering unit-suite file), and _CONVENTION_ONLY_FIXTURES (the 0048 section 9 queryable-observer pattern, doc-satisfied).

Copilot

Pull request overview

Adds a completeness guard to the observability conformance harness so newly pinned spec fixtures can’t silently skip CI, and fills two LLM provider event-coverage gaps around prompt context propagation.

Changes:

Add an explicit accounting/coverage guard for pinned observability fixtures and replace silent “unknown fixture” skips with documented buckets.
Document deferred, unit-tested, and convention-only observability fixtures (with reasons) and enforce no stale/overlapping entries.
Add unit tests ensuring OpenAIProvider.complete() populates LlmCompletionEvent.active_prompt / active_prompt_group from prompt context.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/unit/test_llm_provider.py | Adds provider unit tests for `active_prompt` / `active_prompt_group` on typed completion events. |
| tests/conformance/test_observability.py | Adds fixture coverage guard and explicit “not run here” accounting buckets to prevent silent skips. |


PR #180 review: the active_prompt / active_prompt_group tests asserted object identity (is), coupling them to the provider passing the exact instance through. The contract is the populated record, so a reasonable snapshot/copy refactor would break them. Use pydantic structural equality (==), which is copy-robust and still checks the full record.
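The difference between the two assertions shows up as soon as the emitter copies the record. In this sketch a frozen dataclass stands in for the pydantic model so that it runs on the standard library alone; pydantic models likewise compare field by field under ==. The emit_* helpers are hypothetical and only model the two possible provider behaviors.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PromptRecord:
    """Stand-in for the pydantic record; dataclasses also compare by field values."""
    name: str
    version: int


def emit_by_reference(record: PromptRecord) -> PromptRecord:
    # Models a provider that passes the exact instance through.
    return record


def emit_snapshot(record: PromptRecord) -> PromptRecord:
    # Models a refactor that copies the record before attaching it to the event.
    return replace(record)


original = PromptRecord(name="greeting", version=2)
snapshot = emit_snapshot(original)

# Structural equality holds for both behaviors and still checks every field.
assert emit_by_reference(original) == original
assert snapshot == original
assert snapshot != PromptRecord(name="greeting", version=3)

# Identity holds only for pass-through, so an `is` assertion breaks on the copy.
assert emit_by_reference(original) is original
assert snapshot is not original
```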

chris-colinsky added 2 commits June 22, 2026 20:31

Copilot AI review requested due to automatic review settings June 23, 2026 03:32

Copilot started reviewing on behalf of chris-colinsky June 23, 2026 03:32

Copilot AI reviewed Jun 23, 2026

Comment thread tests/unit/test_llm_provider.py Outdated

Comment thread tests/unit/test_llm_provider.py Outdated

chris-colinsky merged commit d29c6b9 into main Jun 23, 2026
5 checks passed

chris-colinsky deleted the chore/observability-fixture-coverage-guard branch June 23, 2026 03:43

chris-colinsky merged 3 commits into main from chore/observability-fixture-coverage-guard
