Bead: asupersync-2a6k9.7.4
Parent: asupersync-2a6k9.7, asupersync-2a6k9
Author: SilentFinch (codex-cli / gpt-5-codex)
Date: 2026-03-21
Contract Version: lab-live-virtualized-surface-matrix-v1
Dependencies: docs/lab_live_differential_scope_matrix.md, docs/lab_live_verification_taxonomy.md, docs/lab_live_time_normalization_policy.md, docs/lab_live_scenario_adapter_contract.md, docs/lab_live_normalized_observable_schema.md, docs/lab_live_divergence_taxonomy.md, src/lab/dual_run.rs, tests/common/mod.rs, README.md, docs/WASM.md
This document defines the Phase 2 coverage matrix and observability contract for virtualized expansion surfaces in the lab-vs-live differential program.
asupersync-2a6k9.6.6 defines the core pilot coverage matrix for Phase 1
semantic-runtime surfaces. That matrix is the base contract. This document does
not replace it and it does not invent a second testing language. Instead, it
extends the same T0 / T1 / T2 / T3 / T4 / T5 vocabulary to the
first expansion surfaces where timing, virtualization, and external-boundary
truthfulness become the dominant risk.
The goal is simple: make future timer and transport beads prove that they are running controlled experiments instead of fragile demos with optimistic prose.
This matrix is downstream of:
- `docs/lab_live_differential_scope_matrix.md`
- `docs/lab_live_verification_taxonomy.md`
- `docs/lab_live_time_normalization_policy.md`
- `docs/lab_live_scenario_adapter_contract.md`
- `docs/lab_live_normalized_observable_schema.md`
- `docs/lab_live_divergence_taxonomy.md`
- `src/lab/dual_run.rs`
- `tests/common/mod.rs`
These documents already define the admitted rollout ladder, the
lab-live-scenario-spec-v1 contract, the lab-live-normalized-observable-v1
schema, the lab-live-verification-taxonomy-v1 tiers, and the
lab-live-time-normalization-v1 timing/noise classes.
This bead exists because Phase 2 needs one more layer:
- a matrix that says which virtualized surfaces are being widened,
- the unit/e2e/logging floor for each surface,
- the machine-readable field contract that must appear in
  `CaptureManifest`, `LiveRunMetadata`, and retained reports,
- the failure patterns that must be treated as scope violations instead of runtime bugs.
No Phase 2 widening work is complete unless the bead can point to one explicit matrix row in this document and satisfy every required column in that row.
That means:
- the row must reuse the Phase 1 vocabulary instead of inventing new tiers,
- the row must name the exact virtualization boundary,
- the row must declare the minimum unit checks and dual-run scripts,
- the row must declare the minimum observability hooks and required logs,
- the row must say which failures are genuine semantic mismatches and which failures are merely invalid or weakly-controlled experiments.
If a timer or transport-style experiment cannot meet that bar, the correct
result is insufficient_observability, blocked_missing_virtualization,
blocked_missing_verification, blocked_scope_red_line, or
unsupported_time_surface, not a soft "probably okay."
This document explicitly extends asupersync-2a6k9.6.6.
The Phase 1 matrix already proves:
- which semantic-core surfaces are worth trusting first,
- that `T0 unit_contract`, `T2 dual_run_smoke`, `T3 pilot_surface`, and `T4 negative_control` are mandatory for executable parity claims,
- that future beads must retain structured logs and replayable artifacts.
Phase 2 inherits those same requirements and adds four new obligations:
- every widened surface must declare a `virtualization_boundary`,
- every widened surface must declare how time facts land in `semantic_time`, `qualified_time`, `scheduler_noise_signal`, `provenance_only_time`, or `unsupported_time_surface`,
- every widened surface must declare the minimum `CaptureManifest` observability floor using `observed`, `inferred`, and `unsupported`,
- every widened surface must define "invalid experiment" cases that are not allowed to masquerade as real runtime defects.
The important discipline rule is:
- reuse the core pilot vocabulary and expand it,
- do not invent a second testing language,
- do not fork the differential program into a separate Phase 2 dialect.
Every Phase 2 matrix row must publish the following machine-readable columns:
| Column | Meaning | Why it is mandatory |
|---|---|---|
| `surface_family` | stable surface token | keeps Phase 2 rows queryable and comparable |
| `phase` | rollout phase (`Phase 2` or gated `Phase 3` descendant) | binds the row back to the scope ladder |
| `runtime_profile` | declared scenario/runtime lane | keeps adapters from silently widening ambient execution |
| `virtualization_boundary` | exact boundary that constrains externality | prevents over-claiming uncontrolled host behavior |
| `unit_checks` | minimum `T0 unit_contract` floor | ensures local contracts are pinned before wider runs |
| `golden_fixtures` | minimum `T1 golden_fixture` floor | freezes any normalized or report-shaped artifacts |
| `dual_run_scripts` | minimum `T2 dual_run_smoke` or `T3 pilot_surface` commands | proves the shared lab/live scenario contract is really exercised |
| `required_log_fields` | mandatory field set for reports, bundles, and logs | keeps operator/debug evidence stable |
| `invalid_experiment_signals` | failures that mean the experiment was under-controlled | prevents policy drift into false bug reports |
| `promotion_floor` | minimum evidence required before the row may be widened further | blocks scope creep from getting ahead of observability |
Rows may add explanatory prose, but they are not allowed to omit any of these columns.
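As an illustration of the "no omitted columns" rule, the sketch below models one matrix row and reports which mandatory columns are empty. The `MatrixRow` type, its field types, and `missing_columns` are hypothetical names introduced here; they are not taken from `src/lab/dual_run.rs`.

```rust
/// Hypothetical model of one Phase 2 matrix row.
/// Field names mirror the ten mandatory columns; the Rust types are assumptions.
#[derive(Debug, Clone, Default)]
pub struct MatrixRow {
    pub surface_family: String,
    pub phase: String,
    pub runtime_profile: String,
    pub virtualization_boundary: String,
    pub unit_checks: Vec<String>,
    pub golden_fixtures: Vec<String>,
    pub dual_run_scripts: Vec<String>,
    pub required_log_fields: Vec<String>,
    pub invalid_experiment_signals: Vec<String>,
    pub promotion_floor: String,
}

impl MatrixRow {
    /// Returns the names of mandatory columns that are blank or empty.
    /// Scalar columns are reported first, then list-valued columns.
    pub fn missing_columns(&self) -> Vec<&'static str> {
        let mut missing = Vec::new();
        let scalars = [
            ("surface_family", &self.surface_family),
            ("phase", &self.phase),
            ("runtime_profile", &self.runtime_profile),
            ("virtualization_boundary", &self.virtualization_boundary),
            ("promotion_floor", &self.promotion_floor),
        ];
        for (name, value) in scalars {
            if value.trim().is_empty() {
                missing.push(name);
            }
        }
        let lists = [
            ("unit_checks", &self.unit_checks),
            ("golden_fixtures", &self.golden_fixtures),
            ("dual_run_scripts", &self.dual_run_scripts),
            ("required_log_fields", &self.required_log_fields),
            ("invalid_experiment_signals", &self.invalid_experiment_signals),
        ];
        for (name, values) in lists {
            if values.is_empty() {
                missing.push(name);
            }
        }
        missing
    }
}
```

A contract test could treat any non-empty result from `missing_columns` as a failed row, which keeps the check mechanical rather than a matter of review judgment.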
The rows below are the normative matrix for timer, transport, and the first captured-boundary descendants of transport.
| `surface_family` | `phase` | `runtime_profile` | `virtualization_boundary` | `unit_checks` | `golden_fixtures` | `dual_run_scripts` | `required_log_fields` | `invalid_experiment_signals` | `promotion_floor` |
|---|---|---|---|---|---|---|---|---|---|
| `timer_surface` | Phase 2 | `phase2.timer_virtualized` | scenario-declared clock and deadline boundary only; no ambient wall-clock claim | timeout classification, timer cancellation, logical deadline mapping, scenario-clock validation | normalized time bundle and capture-manifest shape for timer scenarios | one `T2 dual_run_smoke` plus one `T3 pilot_surface` family over admitted timer semantics | `scenario_clock_id`, `clock_source`, `logical_deadline_id`, `timeout_budget_class`, `timeout_outcome_class`, `logical_elapsed_ticks`, `normalization_window`, `time_policy_class`, `scheduler_noise_class`, `CaptureManifest`, `LiveRunMetadata`, `ReplayMetadata` | missing `scenario_clock_id`, missing `logical_deadline_id`, wall-clock-only reasoning, or unsupported timer fields marked as semantic | `T0`, `T1`, `T2`, `T3`, `T4` complete before broader timing claims |
| `virtual_transport_surface` | Phase 2 | `phase2.transport_loopback` | loopback or explicit virtual transport with captured peer model; no ambient internet | ordering-class contract, in-flight cancel cleanup, transport-close semantics, capture-manifest completeness | normalized transport summary and retained bundle schema | one `T2 dual_run_smoke` plus one `T3 pilot_surface` family over virtualized delivery/cancel cases | `virtualization_boundary`, `capture_manifest_path`, `normalized_record_path`, `artifact_bundle`, `repro_command`, `event_hash`, `schedule_hash`, `nondeterminism_notes`, `surface_family`, `observability_status` | uncontrolled peer timing, real network dependency, missing capture packet, or transport evidence only inferred from wall-clock logs | `T0`, `T1`, `T2`, `T3`, `T4` complete before HTTP/gRPC-on-captured-boundaries claims |
| `http_surface` | gated Phase 3 descendant of Phase 2 transport proof | `phase3.http_captured_boundary` | HTTP over loopback or virtualized transport with explicit timeout and peer-model contract | request/response termination contract, protocol-version mapping, shutdown/cancel boundary, malformed-artifact rejection | normalized request/response bundle and gate packet fixture | one `T2 dual_run_smoke`, one `T3 pilot_surface`, and at least one `T4 negative_control` proving malformed or under-observed traces are rejected | `surface_family`, `virtualization_boundary`, `scenario_clock_id`, `logical_deadline_id`, `capture_manifest_path`, `normalized_record_path`, `artifact_bundle`, `eligibility_verdict`, `observability_status`, `unsupported_reason` | real-internet RTT, uncontrolled TLS/DNS timing, opaque upstream behavior, or missing normalized peer evidence | transport row must already be credible; no direct jump from Phase 1 to HTTP parity |
| `browser_surface` | gated Phase 3 descendant of captured transport and host-boundary work | `phase3.browser_captured_boundary` | explicit host-role contract plus admitted lane boundary; no opaque browser-host parity claim | lane classification, downgrade semantics, host-role logging, unsupported-host rejection, bridge-only proof | host classification packet, downgrade artifact, and gate packet fixture | one `T2 dual_run_smoke` on an admitted lane, one `T4 negative_control`, and `T5 stress_nightly` before noisy multi-host claims | `surface_family`, `host_role`, `support_class`, `reason_code`, `lane_id`, `eligibility_verdict`, `observability_status`, `capture_manifest_path`, `artifact_bundle`, `repro_command`, `unsupported_reason` | opaque browser scheduler claims, service-worker lifetime parity claims, shared-worker parity without promotion, or missing lane/host metadata | captured transport row plus host gate must already hold; browser support remains lane-scoped, not host-global |
The http_surface and browser_surface rows are deliberately included here
because Phase 2 transport evidence is what makes those later captured-boundary
surfaces honest. They are not promoted by this bead; they are constrained by
it.
The timer row is the first place where time may become semantic instead of remaining explanatory. That promotion is only legitimate when the row carries:
- `scenario_clock_id`
- `clock_source`
- `logical_deadline_id`
- `timeout_budget_class`
- `timeout_outcome_class`
- `logical_elapsed_ticks`
- `normalization_window`
- `suppression_reason`
- `rerun_decision`
The row must also name which fields are merely qualified_time and which are
actually semantic_time. For example:
- `timeout_outcome_class` may be semantic when the scenario clock and deadline are explicit,
- `logical_elapsed_ticks` may only be compared through the declared `normalization_window`,
- `wall_elapsed_ns`, `monotonic_start_ns`, `monotonic_end_ns`, `now_nanos`, and `steps_delta` remain provenance only.
The timer row must never accept a report that explains a semantic mismatch with
"scheduler noise" alone. scheduler_noise_signal may explain drift, but it may
not erase a real timeout or cancellation contract break.
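The timer-row rules above can be sketched as a small classifier. This is a minimal illustration, not the repository's implementation: `TimeClass`, `TimerDeclarations`, `classify_timer_field`, and `noise_may_suppress` are hypothetical names. Two choices are assumptions beyond what this document states: treating a windowed `logical_elapsed_ticks` as `qualified_time`, and falling back to `unsupported_time_surface` when the required declarations are missing.

```rust
/// The five time classes named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeClass {
    SemanticTime,
    QualifiedTime,
    SchedulerNoiseSignal,
    ProvenanceOnlyTime,
    UnsupportedTimeSurface,
}

/// Hypothetical summary of what a timer report explicitly declared.
#[derive(Debug, Clone, Copy, Default)]
pub struct TimerDeclarations {
    pub has_scenario_clock_id: bool,
    pub has_logical_deadline_id: bool,
    pub has_normalization_window: bool,
}

/// Classifies a timer field by name, given the report's declarations.
pub fn classify_timer_field(field: &str, decl: TimerDeclarations) -> TimeClass {
    match field {
        // Semantic only when the scenario clock and the deadline are both explicit.
        "timeout_outcome_class"
            if decl.has_scenario_clock_id && decl.has_logical_deadline_id =>
        {
            TimeClass::SemanticTime
        }
        // Assumption: comparable only through a declared normalization window.
        "logical_elapsed_ticks" if decl.has_normalization_window => TimeClass::QualifiedTime,
        // Raw clock readings and step counters stay provenance only.
        "wall_elapsed_ns" | "monotonic_start_ns" | "monotonic_end_ns" | "now_nanos"
        | "steps_delta" => TimeClass::ProvenanceOnlyTime,
        // Assumption: anything else, or a field missing its declarations, is unsupported.
        _ => TimeClass::UnsupportedTimeSurface,
    }
}

/// Scheduler noise may explain drift on non-semantic fields,
/// but it may never suppress a mismatch on a semantic field.
pub fn noise_may_suppress(class: TimeClass) -> bool {
    matches!(
        class,
        TimeClass::QualifiedTime | TimeClass::SchedulerNoiseSignal
    )
}
```

Keeping the suppression rule as a separate function makes the "noise cannot erase a semantic mismatch" constraint testable on its own, independent of how fields are classified.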
The transport row is still about semantic control, not about network realism.
The minimum truthful boundary is:
- loopback or explicitly virtualized transport only,
- captured peer behavior only,
- explicit cancel/close semantics,
- retained lab/live normalized records and reproducible artifacts,
- no ambient DNS, TLS, packet loss, or remote-peer timing claims.
The required logs for this row must be strong enough to answer:
- what transport model was used?
- what peer boundary was captured?
- which delivery/cancel/close events were observed versus inferred?
- what artifact bundle or replay command reproduces the mismatch?
If the row cannot answer those questions from stable fields, it is not ready to
host asupersync-2a6k9.7.2.
This row exists to stop future HTTP work from treating "runs over loopback" as synonymous with "truthful parity."
The HTTP row inherits the transport row and additionally requires:
- request/response termination semantics,
- protocol version evidence,
- timeout/cancel boundary evidence,
- explicit peer-model capture,
- rejection of malformed or under-observed artifacts.
The row must classify weak experiments as blocked work rather than accepting
them as partial passes. The correct results for weak experiments are
blocked_missing_virtualization, blocked_missing_observability, or
blocked_missing_verification.
The browser row is the most likely place for over-claiming.
This row therefore requires explicit browser-boundary metadata:
- `host_role`
- `support_class`
- `reason_code`
- `lane_id`
- `eligibility_verdict`
- `observability_status`
The row must also preserve the code-facing downgrade vocabulary already present
in src/lab/dual_run.rs, especially:
- `support_class = bridge_only`
- `reason_code = downgrade_to_server_bridge`
- `reason_code = unsupported_runtime_context`
An admitted bridge_only downgrade can be a truthful captured lane. It is not
proof of full browser-runtime parity. The matrix must keep those ideas
separate.
Every Phase 2 row must emit or define the following stable fields whenever it claims meaningful differential evidence:
- `surface_family`
- `phase`
- `runtime_profile`
- `virtualization_boundary`
- `scenario_clock_id`
- `clock_source`
- `logical_deadline_id`
- `timeout_budget_class`
- `timeout_outcome_class`
- `logical_elapsed_ticks`
- `normalization_window`
- `time_policy_class`
- `scheduler_noise_class`
- `suppression_reason`
- `rerun_decision`
- `observability_status`
- `eligibility_verdict`
- `capture_manifest_path`
- `normalized_record_path`
- `artifact_bundle`
- `repro_command`
- `unsupported_reason`
Rows that touch host or external-surface gates must additionally emit:
- `host_role`
- `support_class`
- `reason_code`
- `lane_id`
The retained bundle must be rich enough to connect those report fields back to:
- `CaptureManifest`
- `FieldObservability`
- `LiveRunMetadata`
- `ReplayMetadata`
- `nondeterminism_notes`
- `artifact_path`
- `config_hash`
- `trace_fingerprint`
- `schedule_hash`
- `event_hash`
- `event_count`
This is how future contributors prove that a row is controlled, replayable, and auditable instead of merely "covered."
CaptureManifest is the minimum observability packet for live-side widening
work.
Every executable Phase 2 bead must say which of its important fields are:
- `observed`
- `inferred`
- `unsupported`
and it must retain unsupported_fields explicitly.
Normative rules:
- a field may not be presented as semantic if the best available capture class is `unsupported`,
- a field marked only as `inferred` may support triage or a gate packet, but it should not silently satisfy a row that demands direct semantic capture,
- when the row depends on `CaptureManifest`, a missing manifest is itself an `invalid_experiment_signal`,
- `observability_status` must summarize whether the live adapter actually met the row's declared capture floor.
This is the main discipline tool that keeps widened surfaces from being graded with weaker evidence than the core semantic pilots.
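One way to read the normative rules above is as a capture-floor check over the manifest. The sketch below is hypothetical: `CaptureClass` and `meets_capture_floor` are illustrative names, and the manifest is modeled as a plain slice of `(field, class)` pairs rather than the real `CaptureManifest` type.

```rust
/// Capture classes a live adapter may report for a field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureClass {
    Observed,
    Inferred,
    Unsupported,
}

/// Returns true only if every field the row demands direct capture for
/// is present in the manifest and marked `Observed`.
///
/// A missing manifest (`None`) fails the floor outright, mirroring the rule
/// that a missing manifest is itself an invalid-experiment signal.
pub fn meets_capture_floor(
    manifest: Option<&[(&str, CaptureClass)]>,
    required_direct: &[&str],
) -> bool {
    let entries = match manifest {
        Some(entries) => entries,
        None => return false,
    };
    required_direct.iter().all(|name| {
        entries
            .iter()
            .any(|(field, class)| field == name && *class == CaptureClass::Observed)
    })
}
```

Under this reading, an `Inferred` entry never satisfies a direct-capture demand, although it could still be carried into a triage or gate packet by separate logic.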
The table below defines failure patterns that must be classified as invalid experiments or scope violations rather than runtime bugs.
| Failure pattern | Required classification | Why |
|---|---|---|
| timer report lacks `scenario_clock_id` or `logical_deadline_id` but compares timeout behavior semantically | `insufficient_observability` | the time contract is incomplete |
| timer suite relies only on `wall_elapsed_ns` or other wall-clock values | `unsupported_time_surface` | raw wall-clock evidence is not an admitted semantic surface |
| transport suite talks to uncontrolled remote peers or ambient internet services | `blocked_scope_red_line` | the virtualization boundary was abandoned |
| transport bundle lacks `capture_manifest_path` or stable normalized records | `blocked_missing_observability` | the experiment cannot defend its observables |
| surface row changes comparator/report behavior without `T1 golden_fixture` or `T4 negative_control` evidence | `blocked_missing_verification` | policy-shaping work needs stronger proof than a single pass |
| browser row omits `host_role`, `support_class`, `reason_code`, or `lane_id` | `blocked_missing_observability` | host-boundary truthfulness depends on those fields |
| browser claim treats `bridge_only` downgrade as full host parity | `blocked_scope_red_line` | downgrade is an admitted fallback, not a full support proof |
| a report uses `scheduler_noise_signal` to erase a hard semantic mismatch | policy violation | noise may explain drift, but it cannot rewrite a real failure |
These classifications are deliberately conservative. The whole point of this bead is to keep Phase 2 from diluting the trust story established by Phase 1.
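The failure-pattern table could be encoded as a total mapping so that no pattern is left without a classification. The sketch below is illustrative only: the `FailurePattern` enum and `required_classification` function are hypothetical names, and the returned strings are the classification tokens from the table.

```rust
/// Hypothetical enumeration of the failure patterns listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailurePattern {
    TimerMissingClockOrDeadline,
    TimerWallClockOnly,
    TransportUncontrolledPeer,
    TransportMissingManifestOrRecords,
    ComparatorChangeWithoutFixtureOrControl,
    BrowserMissingHostMetadata,
    BridgeOnlyClaimedAsFullParity,
    NoiseErasesSemanticMismatch,
}

/// Maps each failure pattern to the classification the table requires.
/// The match is exhaustive, so adding a pattern without a classification
/// fails to compile.
pub fn required_classification(pattern: FailurePattern) -> &'static str {
    match pattern {
        FailurePattern::TimerMissingClockOrDeadline => "insufficient_observability",
        FailurePattern::TimerWallClockOnly => "unsupported_time_surface",
        FailurePattern::TransportUncontrolledPeer => "blocked_scope_red_line",
        FailurePattern::TransportMissingManifestOrRecords => "blocked_missing_observability",
        FailurePattern::ComparatorChangeWithoutFixtureOrControl => {
            "blocked_missing_verification"
        }
        FailurePattern::BrowserMissingHostMetadata => "blocked_missing_observability",
        FailurePattern::BridgeOnlyClaimedAsFullParity => "blocked_scope_red_line",
        // The table gives this one as prose rather than as a token.
        FailurePattern::NoiseErasesSemanticMismatch => "policy violation",
    }
}
```

Using an exhaustive `match` rather than a lookup map is a deliberate choice here: the compiler then enforces that every new pattern receives an explicit, conservative classification.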
This matrix is directly downstream of asupersync-2a6k9.6.6 and directly
constrains:
| Downstream bead | What it must inherit from this document |
|---|---|
| `asupersync-2a6k9.7.1` | timer and virtual-time suites must use the timer row, its field vocabulary, and its invalid-experiment policy |
| `asupersync-2a6k9.7.2` | virtualized transport suites must use the transport row, the capture-manifest floor, and the loopback/virtual boundary law |
| `asupersync-2a6k9.7.3` | raw-socket, HTTP, and browser gate packets must use the exact `eligibility_verdict`, `support_class`, `reason_code`, `lane_id`, and observability vocabulary here |
| `asupersync-2a6k9.8.1` | normal CI gates must not treat a widened Phase 2 surface as credible unless the row-level log and artifact floor is met |
| `README.md` and `docs/WASM.md` | browser-facing support claims must remain lane-scoped and downgrade-aware, not host-global |
If a later bead makes a broader claim than one of these rows allows, that bead is out of contract even if its code "works" in a demo.
Future widening beads should be able to copy this checklist directly into their notes or PR description:
- Name the `surface_family`.
- Name the `phase` and `runtime_profile`.
- State the exact `virtualization_boundary`.
- List the minimum `T0 unit_contract` checks.
- List the minimum `T1 golden_fixture` outputs.
- List the exact `T2 dual_run_smoke` and `T3 pilot_surface` scripts.
- List the exact `T4 negative_control` and `T5 stress_nightly` expectations.
- Declare the required `CaptureManifest` and `LiveRunMetadata` fields.
- Declare which failures are semantic mismatches versus invalid experiments.
- Provide `rch`-offloaded validation commands.
If a widening bead cannot fill out all ten items, it is still planning work, not completed verification work.
Every change to this contract must be validated with:
- `rch exec -- cargo fmt --check`
- `rch exec -- cargo check --all-targets`
- `rch exec -- cargo clippy --all-targets -- -D warnings`
- `rch exec -- cargo test --test lab_live_virtualized_surface_matrix_contract -- --nocapture`
Rows that move from contract-only status into executable timer or transport work should additionally replay the relevant executable anchors, such as:
- `rch exec -- cargo test --test time_e2e -- --nocapture`
- `rch exec -- cargo test --test e2e_transport -- --nocapture`
The purpose of the extra commands is not to inflate ceremony. It is to ensure that Phase 2 widening beads keep the same disciplined proof posture as the core Phase 1 pilots.