Harden news workflow pass‑2/re-run contracts and rebalance schedule/model budget for reliable 60‑min runs #2495
Copilot wants to merge 3 commits into
🏷️ Automatic Labeling Summary
This PR has been automatically labeled based on the files changed and PR metadata. Applied labels: size-xs
🔍 Lighthouse Performance Audit
📥 Download full Lighthouse report Budget Compliance: Performance budgets enforced via |
… window Agent-Logs-Url: https://github.com/Hack23/riksdagsmonitor/sessions/e51960bd-f4d0-48d4-9b62-9767d7a0283a Co-authored-by: pethers <[email protected]>
Pull request overview
This PR strengthens the agentic news workflow reliability contract by hardening Pass‑2/improvement-mode gating, standardizing re-run metadata requirements, and rebalancing schedule/model settings to better fit the 60‑minute runtime envelope.
Changes:
- Tightens the analysis gate to require an explicit Pass‑2 completion declaration and (in improvement-mode) a canonical `## Re-run log` schema.
- Adjusts operational knobs: moves realtime monitor weekday afternoon cron to 15:30 UTC and switches long-horizon workflows to `claude-sonnet-4.6`.
- Adds a focused Vitest contract test to lock the new schedule/model/contract invariants.
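
The schedule change in the list above can be pinned by a small parsing check. The following is a minimal sketch, assuming a standard five-field cron expression (GitHub Actions evaluates `schedule` cron in UTC). The value `30 15 * * 1-5` and the helper name `parseFixedCron` are illustrative assumptions; the actual expression in `news-realtime-monitor.md` is not quoted in this PR page.

```typescript
// Assumed cron value for illustration only; the real expression is not shown here.
const ASSUMED_CRON = "30 15 * * 1-5";

interface CronTime {
  minute: number;
  hour: number;
  dayOfWeek: string;
}

// Hypothetical helper: parses a five-field cron expression whose minute and
// hour fields are fixed numbers, and returns them with the day-of-week field.
function parseFixedCron(expr: string): CronTime {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, , , dayOfWeek] = fields;
  if (!/^\d+$/.test(minute) || !/^\d+$/.test(hour)) {
    throw new Error("minute and hour must be fixed numbers");
  }
  return { minute: Number(minute), hour: Number(hour), dayOfWeek };
}

const slot = parseFixedCron(ASSUMED_CRON);
console.log(`${slot.hour}:${slot.minute} UTC on days ${slot.dayOfWeek}`);
```

A contract test could compare the parsed hour and minute against 15 and 30 instead of matching the raw string, which keeps the assertion stable if whitespace in the workflow file changes.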
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| tests/news-workflow-pass2-budget-contract.test.ts | Adds contract assertions for cron timing, model selection, and Pass‑2/re-run schema presence. |
| analysis/templates/methodology-reflection.md | Updates the methodology reflection template with Pass‑2 status and re-run log sections. |
| .github/workflows/news-realtime-monitor.md | Moves the weekday afternoon schedule to 15:30 UTC for better same-day chamber coverage. |
| .github/workflows/news-election-cycle.md | Switches engine model to Sonnet and clarifies anchor-coverage scope policy. |
| .github/workflows/news-year-ahead.md | Switches engine model to Sonnet for throughput/budget tuning. |
| .github/prompts/04-analysis-pipeline.md | Adds an explicit Pass‑2 declaration step and parallelisation guidance. |
| .github/prompts/05-analysis-gate.md | Hardens the gate to require explicit Pass‑2 declaration and improvement-mode rerun schema fields. |
| .github/prompts/07-commit-and-pr.md | Clarifies/standardizes the canonical improvement-mode rerun marker schema. |
| .github/prompts/ext/cycle-rollover.md | Clarifies anchor coverage rules when cycle_anchor=both. |
| .github/workflows/news-realtime-monitor.lock.yml | Updates compiled workflow lock output for schedule and compilation changes. |
| .github/workflows/news-election-cycle.lock.yml | Updates compiled workflow lock output for model/compilation changes. |
| .github/workflows/news-year-ahead.lock.yml | Updates compiled workflow lock output for model/compilation changes. |
| .github/workflows/news-committee-reports.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-evening-analysis.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-interpellations.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-month-ahead.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-monthly-review.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-motions.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-propositions.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-quarter-ahead.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-translate.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-week-ahead.lock.yml | Updates compiled workflow lock output for compilation changes. |
| .github/workflows/news-weekly-review.lock.yml | Updates compiled workflow lock output for compilation changes. |

| **Generated** | `YYYY-MM-DD HH:MM UTC` |
| **Workflow** | `e.g., news-morning-propositions` |
| **Run duration** | `minutes` |
| **Pass-2 status** | `executed in full` *(required literal for gate compliance)* |

## 🔁 Re-run log (improvement-mode only)

> Use this exact schema on every improvement re-run. Keep the heading stable (`## Re-run log`) so the gate can validate contract compliance across all 14 news workflows.

- **Re-run**: `YYYY-MM-DD HH:MM UTC` · workflow=`$GITHUB_WORKFLOW` · run_id=`$GITHUB_RUN_ID` · attempt=`$GITHUB_RUN_ATTEMPT`
- new dok_ids: `<count or "none">`
- artifacts extended: `<comma-separated list or "none — content stable">`
- flags closed: `<count>`
- vintage refresh: `<"yes" or "no, IMF WEO Apr-2026 still current">`
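
The re-run schema above lends itself to a mechanical completeness check. Below is a minimal sketch, assuming the field labels exactly as they appear in the template excerpt. The helper name `missingRerunFields` and its return shape are hypothetical, not part of the PR.

```typescript
// Field labels copied from the template excerpt; everything else is an
// illustrative assumption.
const RERUN_FIELDS = [
  "run_id=",
  "attempt=",
  "new dok_ids:",
  "artifacts extended:",
  "flags closed:",
  "vintage refresh:",
] as const;

// Hypothetical helper: returns the schema elements that are absent from the
// re-run log section of a methodology-reflection document.
function missingRerunFields(markdown: string): string[] {
  // Accept the heading with or without a leading emoji token.
  const heading = /^##\s+(?:\S+\s+)?Re-run log\b.*$/m;
  const match = heading.exec(markdown);
  if (!match) return ["## Re-run log", ...RERUN_FIELDS];
  // Only inspect text that follows the heading line.
  const section = markdown.slice(match.index + match[0].length);
  return RERUN_FIELDS.filter((field) => !section.includes(field));
}
```

An empty result means every field label was found after the heading; any returned label names the part of the schema that still needs to be filled in.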

grep -qE 'Pass-2[[:space:]]+status:[[:space:]]*executed[[:space:]]+in[[:space:]]+full' "$ANALYSIS_DIR/methodology-reflection.md" \
  || { echo "❌ methodology-reflection.md: missing canonical 'Pass-2 status: executed in full' declaration"; FAIL=1; }
if grep -qiE 'Pass-2[[:space:]]+status:[[:space:]]*(not[[:space:]]+executed|skipped|deferred|partial)' "$ANALYSIS_DIR/methodology-reflection.md"; then
  echo "❌ methodology-reflection.md: Pass-2 cannot be marked not executed/skipped/deferred/partial"
  FAIL=1
fi
if [ "${IMPROVEMENT_MODE:-false}" = "true" ]; then
  grep -qE '^##[[:space:]]+Re-run[[:space:]]+log' "$ANALYSIS_DIR/methodology-reflection.md" \
    || { echo "❌ methodology-reflection.md: improvement-mode requires '## Re-run log'"; FAIL=1; }
  grep -qE 'run_id[=:][[:space:]]*'"${GITHUB_RUN_ID:-}" "$ANALYSIS_DIR/methodology-reflection.md" \
    || { echo "❌ methodology-reflection.md: improvement-mode requires current run_id in Re-run log"; FAIL=1; }
  grep -qE 'attempt[=:][[:space:]]*'"${GITHUB_RUN_ATTEMPT:-}" "$ANALYSIS_DIR/methodology-reflection.md" \
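
To reason about what the Pass‑2 checks quoted above accept, it can help to replay them outside the shell. The sketch below is a rough JavaScript-regex translation of the two Pass‑2 patterns (`[[:space:]]` becomes `\s`). The function name `pass2GateErrors` and the error strings are hypothetical, and the translation is an approximation of POSIX ERE semantics, not the gate itself.

```typescript
// Approximate translations of the two quoted grep patterns.
const PASS2_DECLARED = /Pass-2\s+status:\s*executed\s+in\s+full/;
const PASS2_FORBIDDEN =
  /Pass-2\s+status:\s*(not\s+executed|skipped|deferred|partial)/i;

// Hypothetical helper: returns one message per failed Pass-2 check.
function pass2GateErrors(markdown: string): string[] {
  const errors: string[] = [];
  if (!PASS2_DECLARED.test(markdown)) {
    errors.push("missing 'Pass-2 status: executed in full' declaration");
  }
  if (PASS2_FORBIDDEN.test(markdown)) {
    errors.push("Pass-2 marked not executed/skipped/deferred/partial");
  }
  return errors;
}

// Colon form passes; a table-cell form does not match the colon pattern.
console.log(pass2GateErrors("Pass-2 status: executed in full").length);
console.log(pass2GateErrors("| **Pass-2 status** | `executed in full` |").length);
```

Under this translation, the table-row form shown in the template excerpt does not satisfy the colon-form pattern, because `status` is followed by `**` and a cell separator and not by `:`. Whether the full template also carries the colon form elsewhere is not visible in the excerpt, so this is worth confirming against the actual file.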

expect(template).toContain('## 🔁 Re-run log (improvement-mode only)');
expect(template).toContain('run_id=`$GITHUB_RUN_ID`');
expect(template).toContain('attempt=`$GITHUB_RUN_ATTEMPT`');
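
The template markers asserted above wrap their values in backticks, while the quoted gate pattern expects the value directly after `run_id=` or `attempt=`. If a filled-in log keeps the backticks, a stricter or looser matcher may behave differently than intended. The sketch below shows one tolerant alternative; it is not the PR's implementation, and the helper name `hasRunMarker` is hypothetical. It also rejects an empty value, since an unset variable would otherwise reduce the pattern to a bare `run_id=` prefix.

```typescript
// Hypothetical tolerant matcher: accepts run_id=123, run_id: 123 and
// run_id=`123`, but not a longer number that merely starts with the value.
function hasRunMarker(
  markdown: string,
  key: "run_id" | "attempt",
  value: string,
): boolean {
  // Refuse empty or non-numeric values so an unset variable cannot match.
  if (!/^\d+$/.test(value)) return false;
  const pattern = new RegExp(key + "[=:]\\s*`?" + value + "(?!\\d)");
  return pattern.test(markdown);
}

console.log(hasRunMarker("run_id=`123` · attempt=`2`", "run_id", "123"));
```

The negative lookahead `(?!\d)` keeps `run_id=1234` from satisfying a check for run `123`, which a plain prefix match would allow.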

token: ${{ secrets.GH_AW_GITHUB_TOKEN || secrets.GITHUB_TOKEN }}
persist-credentials: false
fetch-depth: 0
fetch-depth: 1

This change addresses recurring 2026-05-14 cohort failures where workflows silently shipped incomplete Pass‑2 output, showed inconsistent re-run metadata, and missed midday chamber updates in realtime monitoring. The update tightens shared contracts and adjusts workflow timing/model choices to improve deterministic delivery within the 60-minute envelope.
Pass‑2 gate hardening (fail-loud)
- `methodology-reflection.md` declares: `Pass-2 status: executed in full`

Canonical re-run schema normalization (all news workflows)
- `## Re-run log` contract in shared prompt/template surfaces.
- `run_id`, `attempt`, `new dok_ids`, `artifacts extended`, `flags closed`, `vintage refresh`.

Realtime monitor coverage window adjustment

Budget/throughput tuning for long-horizon workflows
- `news-election-cycle` and `news-year-ahead` engine model to `claude-sonnet-4.6` to reduce runtime pressure while preserving two-pass workflow semantics.

Cycle-rollover policy clarification
- `cycle_anchor=both`, skipping an anchor requires formal rollover-window reason, not runtime exhaustion wording.

Contract-level regression coverage
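
The model-selection part of the regression coverage could be checked along the following lines. This is a minimal sketch, assuming each workflow markdown file starts with a `---` delimited front matter block that contains a `model:` key. That layout, the sample text, and the helper name `engineModel` are assumptions for illustration; the actual test in `tests/news-workflow-pass2-budget-contract.test.ts` is not reproduced on this page.

```typescript
// Hypothetical helper: extracts the first `model:` value from a front matter
// block, or returns undefined when no front matter or model key is present.
function engineModel(workflowMarkdown: string): string | undefined {
  const frontMatter = /^---\n([\s\S]*?)\n---/.exec(workflowMarkdown);
  if (!frontMatter) return undefined;
  const model = /^\s*model:\s*["']?([^"'\s#]+)/m.exec(frontMatter[1]);
  return model?.[1];
}

// Assumed sample; the real front matter layout may differ.
const sample = [
  "---",
  "engine:",
  "  model: claude-sonnet-4.6",
  "---",
  "# Workflow body",
].join("\n");

console.log(engineModel(sample));
```

Extracting the value before asserting on it gives a clearer failure message than a substring match, since the test can report which model string was actually found.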