test: add aFIPC numerical regression fixtures #90
Pull request overview
This PR adds deterministic, fixture-backed regression tests to lock down two historically high-risk behaviors in autoFIPC()—prior-update linking semantics and IPD anchor filtering—without modifying R/aFIPC.R. It also records a short decision note explaining why PR #87 is being replaced rather than repaired.
Changes:
- Add a new regression test file covering (1) free-mean vs fixed-normal prior-update behavior and (2) IPD filtering followed by fixed-parameter linking assertions.
- Add a dedicated fixture definitions file to centralize scenario parameters (seeds, sample sizes, drift index, thresholds).
- Add a plan/decision markdown note documenting the rationale for replacing PR #87.
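A shared fixture file of the kind described above might look like the following minimal sketch. Every name and value in it (`fipc_fixtures`, `prior_update`, `ipd_filter`, `simulate_scenario`, the seeds, sizes, and threshold) is a hypothetical placeholder, not the contents of tests/testthat/fixtures/fipc-regression-fixtures.R.

```r
# Hypothetical fixture definitions; names and values are illustrative only.
fipc_fixtures <- list(
  prior_update = list(seed = 1001L, n_old = 500L, n_new = 500L),
  ipd_filter   = list(seed = 2002L, n_old = 500L, n_new = 500L,
                      drift_index = 3L, ipd_threshold = 0.3)
)

# Draw reproducible data for a scenario from its fixture entry.
simulate_scenario <- function(fx) {
  set.seed(fx$seed)
  list(old = rnorm(fx$n_old), new = rnorm(fx$n_new))
}

# Two runs from the same fixture entry should be identical.
a <- simulate_scenario(fipc_fixtures$prior_update)
b <- simulate_scenario(fipc_fixtures$prior_update)
identical(a, b)
```

Keeping the seed next to the sample sizes means a test only has to name the scenario, so the same inputs are regenerated on every run.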
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/testthat/test-regression-fixtures.R | Adds two deterministic regression tests for prior-update behavior and IPD anchor filtering/fixing behavior. |
| tests/testthat/fixtures/fipc-regression-fixtures.R | Adds shared fixture parameter lists (seeds, sample sizes, drift index, thresholds) used by regression tests. |
| docs/plans/2026-07-02-regression-fixtures.md | Documents the decision to replace PR #87 and clarifies the intended (non-algorithmic) scope of the replacement. |
Pull request overview
OpenCode reviewed the current-head evidence but found unresolved reviewer or review-agent threads before approval.
Findings
1. HIGH .github/workflows/opencode-review.yml:1 - Unresolved reviewer thread blocks automated approval
- Problem: OpenCode reached an APPROVE control result, but the approval step found unresolved, non-outdated human or review-agent thread evidence on the current pull request.
- Root cause: Reviewer and review-agent feedback can arrive after bounded model evidence is prepared, so the approval step must re-query GitHub immediately before publishing an approval.
- Fix: Address or resolve the listed reviewer thread(s), then re-run OpenCode on the current head.
- Regression test: Keep the approval gate querying reviewThreads(first: 100) after model output and before create_pull_review APPROVE, including bot review agents other than OpenCode itself.
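The gate logic described in this finding can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's code: the thread records are invented, the `opencode` login is a guess, and `blocking_threads` and `gate_result` are hypothetical helpers. Only the `isResolved` and `isOutdated` field names mirror GitHub's review-thread fields.

```r
# Hypothetical thread records, shaped loosely after GitHub review-thread fields.
threads <- list(
  list(author = "copilot-pull-request-reviewer", isResolved = FALSE, isOutdated = FALSE),
  list(author = "opencode",                      isResolved = FALSE, isOutdated = FALSE),
  list(author = "human-reviewer",                isResolved = TRUE,  isOutdated = FALSE)
)

# Keep threads that are unresolved, not outdated, and not authored by the
# reviewing agent itself (assumed login: "opencode").
blocking_threads <- function(threads, own_login = "opencode") {
  Filter(
    function(th) !th$isResolved && !th$isOutdated && th$author != own_login,
    threads
  )
}

# Downgrade an APPROVE to REQUEST_CHANGES when any blocking thread remains.
gate_result <- function(model_result, threads) {
  if (model_result == "APPROVE" && length(blocking_threads(threads)) > 0) {
    return("REQUEST_CHANGES")
  }
  model_result
}

gate_result("APPROVE", threads)
```

Running the check after model output, as the finding asks, means threads opened while the model was working are still seen before an approval is published.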
Review thread evidence
Latest unresolved reviewer thread evidence
tests/testthat/test-regression-fixtures.R line 196
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-02T00:15:33Z
- Comment URL: #90 (comment)
- Comment excerpt: 'IPDCommonItemList' is a data.frame created inside 'autoFIPC()' via 'data.frame(rbind(...))' without 'stringsAsFactors = FALSE', so under R < 4.0 (or if 'stringsAsFactors' is enabled) its columns can be factors. Using 'unlist()' on a data.frame row will coerce factor columns to their underlying integer codes, producing incorrect item names (e.g., "1", "2") and making the test non-portable.
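One way to make the row extraction independent of `stringsAsFactors` is to convert each column to character explicitly instead of calling `unlist()` on the row. The sketch below uses a made-up one-row data frame (`items`, with invented columns `old` and `new`) as a stand-in for `IPDCommonItemList`; it does not reproduce the real object.

```r
# Hypothetical stand-in for one row of IPDCommonItemList with a factor column.
items <- data.frame(
  old = factor("itemB", levels = c("itemA", "itemB")),
  new = "itemB",
  stringsAsFactors = FALSE
)
item_row <- items[1, ]

# With mixed factor and character columns, unlist() can fall back to the
# factor's underlying integer code rather than its label.
naive <- unlist(item_row)

# Converting each column explicitly keeps the item labels.
safe <- vapply(item_row, as.character, character(1))
safe
```

The explicit conversion behaves the same whether the columns arrive as factors or as character vectors, which is what makes the assertion portable.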
tests/testthat/test-regression-fixtures.R line 103
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-02T00:15:34Z
- Comment URL: #90 (comment)
- Comment excerpt: In the fixed-normal case, 'forceNormalZeroOne = TRUE' forces 'freeMEAN <- FALSE' inside 'autoFIPC()' (see 'R/aFIPC.R'). Passing 'freeMEAN = TRUE' here is misleading and makes the test rely on an internal side effect rather than the explicit API inputs.
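The concern can be illustrated with a toy function that mimics only the override described in this comment. `resolve_prior_flags` is a hypothetical stand-in and is not part of `autoFIPC()`; the two argument names are taken from the comment, everything else is assumed.

```r
# Toy stand-in, NOT the real autoFIPC(): it only mimics the described override.
resolve_prior_flags <- function(freeMEAN, forceNormalZeroOne) {
  if (forceNormalZeroOne) freeMEAN <- FALSE  # the silent internal override
  list(freeMEAN = freeMEAN, forceNormalZeroOne = forceNormalZeroOne)
}

# The call pattern the reviewer flagged: freeMEAN = TRUE is silently discarded.
misleading <- resolve_prior_flags(freeMEAN = TRUE,  forceNormalZeroOne = TRUE)

# Stating the effective value makes the test's intent explicit.
explicit <- resolve_prior_flags(freeMEAN = FALSE, forceNormalZeroOne = TRUE)

identical(misleading, explicit)
```

Both calls end in the same state, so passing `freeMEAN = FALSE` in the fixed-normal test would change nothing numerically while removing the reliance on the side effect.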
docs/plans/2026-07-02-regression-fixtures.md line 21
- Latest reviewer comment: @copilot-pull-request-reviewer at 2026-07-02T00:15:34Z
- Comment URL: #90 (comment)
- Comment excerpt: This PR adds the regression fixtures/tests described in 'ARCHITECTURE.md' as a future roadmap item ("Add non-interactive regression fixtures for historically trusted FIPC results."). With these fixtures now present, 'ARCHITECTURE.md' should be updated so the roadmap reflects the current state of the repository.

- Result: REQUEST_CHANGES
- Reason: unresolved reviewer or review-agent thread(s) were present before approval.
- Head SHA: 51f701f8b4f60eb2d3b00f1799e68f273cba989f
- Workflow run: 28556297634
- Workflow attempt: 1
Changed-File Evidence Map
flowchart LR
PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
Evidence --> S1["Docs: 2026-07-02-regression-fixtures.md"]
S1 --> I1["operator or user guidance"]
I1 --> R1["Review risk: Docs: 2026-07-02-regression-fixtures.md"]
R1 --> V1["docs review"]
Evidence --> S2["Test (2 files)"]
S2 --> I2["regression suite"]
I2 --> R2["Review risk: Test (2 files)"]
R2 --> V2["targeted test run"]
Head commit changed from 51f701f to 6beeeca.
Pull request overview
OpenCode exhausted the configured model pool without a usable current-head review conclusion. This is not approval evidence, so the PR is blocked until a source-backed review can establish approval sufficiency or identify concrete fixes.
Findings
1. HIGH ARCHITECTURE.md:1 - OpenCode could not establish approval sufficiency
- Problem: every configured model path failed to produce a usable current-head control block.
- Root cause: model execution, timeout, export, normalization, or approval-gate validation did not complete after exponential retry across the configured model pool.
- Impact: approving from deterministic check state alone would miss PR-intent mismatches, missing files, edge-case bugs, robustness gaps, UX/DX regressions, security issues, and CodeGraph-backed base/head flow changes.
- Fix: rerun OpenCode after model availability recovers, or update the PR with the missing files, tests, docs, generated artifacts, and verification evidence needed for a source-backed review conclusion.
- Regression test: keep the approval gate posting REQUEST_CHANGES, not APPROVE or check-only failure, when no model produces a valid current-head review.
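The fail-closed rule in the regression-test bullet can be sketched as a small decision function. `review_conclusion` and its inputs are hypothetical; the sketch only assumes that each model attempt yields either a valid control result or something unusable.

```r
# Hypothetical decision helper: without a valid model conclusion the gate
# requests changes, even when every deterministic check has passed.
review_conclusion <- function(model_outcomes, checks_passed) {
  valid <- model_outcomes[model_outcomes %in% c("APPROVE", "REQUEST_CHANGES")]
  if (length(valid) == 0) {
    # checks_passed is deliberately not consulted in this branch.
    return("REQUEST_CHANGES")
  }
  if (!checks_passed) {
    return("REQUEST_CHANGES")
  }
  valid[[1]]
}

# Every model path failed, checks are green: still REQUEST_CHANGES.
review_conclusion(c("timeout", "export_error"), checks_passed = TRUE)
```

Treating an exhausted model pool as a blocking result keeps passing checks from being read as approval evidence.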
Summary
- Result: REQUEST_CHANGES
- Reason: coverage-evidence passed and peer GitHub Checks completed without failures, but no model produced a valid review control block.
- Deterministic evidence checked but not used for approval: current-head changed-file evidence (ARCHITECTURE.md, docs/plans/2026-07-02-regression-fixtures.md, tests/testthat/fixtures/fipc-regression-fixtures.R, tests/testthat/test-regression-fixtures.R); coverage-evidence result success; peer checks from statusCheckRollup excluding this OpenCode check.
- Model outcome: model_pool=exhausted; selected_model=none.
- Head SHA: 6beeecad14a279b1c946946ff77d931697937438
- Workflow run: 28557005459
- Workflow attempt: 1
No PR approval was posted because model-output failure is not evidence that the PR has no blockers.
Changed-File Evidence Map
flowchart LR
PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
Evidence --> S1["Changed file: ARCHITECTURE.md"]
S1 --> I1["repository behavior"]
I1 --> R1["Review risk: Changed file: ARCHITECTURE.md"]
R1 --> V1["required checks"]
Evidence --> S2["Docs: 2026-07-02-regression-fixtures.md"]
S2 --> I2["operator or user guidance"]
I2 --> R2["Review risk: Docs: 2026-07-02-regression-fixtures.md"]
R2 --> V2["docs review"]
Evidence --> S3["Test (2 files)"]
S3 --> I3["regression suite"]
I3 --> R3["Review risk: Test (2 files)"]
R3 --> V3["targeted test run"]
ARCHITECTURE.md roadmap item under review:
- Before: "Add non-interactive regression fixtures for historically trusted FIPC results."
- After: "Maintain non-interactive regression fixtures for historically trusted"
HIGH OpenCode could not establish approval sufficiency
- Problem: the model pool exhausted without a valid current-head review control block, so this changed line cannot be approved from deterministic check state alone.
- Impact: PR-intent mismatches, missing files, robustness bugs, UX/DX regressions, and CodeGraph-backed flow changes could be missed.
- Fix: rerun OpenCode after model availability recovers, or add the missing source/test/docs/generated verification evidence needed for a source-backed approval.
- Verification: rerun the OpenCode Review workflow and confirm it emits APPROVE or source-backed REQUEST_CHANGES for this head SHA.
Pull request overview
OpenCode exhausted the configured model pool without a usable current-head review conclusion. This is not approval evidence, so the PR is blocked until a source-backed review can establish approval sufficiency or identify concrete fixes.
Findings
1. HIGH ARCHITECTURE.md:1 - OpenCode could not establish approval sufficiency
- Problem: every configured model path failed to produce a usable current-head control block.
- Root cause: model execution, timeout, export, normalization, or approval-gate validation did not complete after exponential retry across the configured model pool.
- Impact: approving from deterministic check state alone would miss PR-intent mismatches, missing files, edge-case bugs, robustness gaps, UX/DX regressions, security issues, and CodeGraph-backed base/head flow changes.
- Fix: rerun OpenCode after model availability recovers, or update the PR with the missing files, tests, docs, generated artifacts, and verification evidence needed for a source-backed review conclusion.
- Regression test: keep the approval gate posting REQUEST_CHANGES, not APPROVE or check-only failure, when no model produces a valid current-head review.
Summary
- Result: REQUEST_CHANGES
- Reason: coverage-evidence passed and peer GitHub Checks completed without failures, but no model produced a valid review control block.
- Deterministic evidence checked but not used for approval: current-head changed-file evidence (ARCHITECTURE.md, docs/plans/2026-07-02-regression-fixtures.md, tests/testthat/fixtures/fipc-regression-fixtures.R, tests/testthat/test-regression-fixtures.R); coverage-evidence result success; peer checks from statusCheckRollup excluding this OpenCode check.
- Model outcome: model_pool=exhausted; selected_model=none.
- Head SHA: 6beeecad14a279b1c946946ff77d931697937438
- Workflow run: 28557005459
- Workflow attempt: 2
No PR approval was posted because model-output failure is not evidence that the PR has no blockers.
Changed-File Evidence Map
flowchart LR
PR["PR changed files"] --> Evidence["OpenCode bounded evidence"]
Evidence --> S1["Changed file: ARCHITECTURE.md"]
S1 --> I1["repository behavior"]
I1 --> R1["Review risk: Changed file: ARCHITECTURE.md"]
R1 --> V1["required checks"]
Evidence --> S2["Docs: 2026-07-02-regression-fixtures.md"]
S2 --> I2["operator or user guidance"]
I2 --> R2["Review risk: Docs: 2026-07-02-regression-fixtures.md"]
R2 --> V2["docs review"]
Evidence --> S3["Test (2 files)"]
S3 --> I3["regression suite"]
I3 --> R3["Review risk: Test (2 files)"]
R3 --> V3["targeted test run"]
ARCHITECTURE.md roadmap item under review:
- Before: "Add non-interactive regression fixtures for historically trusted FIPC results."
- After: "Maintain non-interactive regression fixtures for historically trusted"
HIGH OpenCode could not establish approval sufficiency
- Problem: the model pool exhausted without a valid current-head review control block, so this changed line cannot be approved from deterministic check state alone.
- Impact: PR-intent mismatches, missing files, robustness bugs, UX/DX regressions, and CodeGraph-backed flow changes could be missed.
- Fix: rerun OpenCode after model availability recovers, or add the missing source/test/docs/generated verification evidence needed for a source-backed approval.
- Verification: rerun the OpenCode Review workflow and confirm it emits APPROVE or source-backed REQUEST_CHANGES for this head SHA.
Summary
R/aFIPC.R numerical behavior.
Validation
- R_PROFILE_USER=/dev/null Rscript -e 'pkgload::load_all(); testthat::test_file("tests/testthat/test-regression-fixtures.R")'
- R_PROFILE_USER=/dev/null Rscript -e 'pkgload::load_all(); testthat::test_dir("tests/testthat")'
- R_PROFILE_USER=/dev/null Rscript -e 'rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "error")'
- npx markdownlint-cli2 docs/plans/2026-07-02-regression-fixtures.md
- git diff --check

Local R CMD check result: 0 errors, 0 warnings, 2 existing NOTEs (New submission, stats::na.omit namespace NOTE).

Risk
Low. This is additive regression evidence plus a decision note. It intentionally avoids the failing exact drift-removal assertion from #87 (Add fixture-backed regression coverage for prior-update behavior and IPD anchor filtering) because current CI showed autoFIPC() retained that synthetic drifted anchor.