Context
Feedback from a single completed RLCR session (exited complete — acceptance criteria met, code review passed). Substantive work landed in the initial round; three follow-up rounds each handled exactly one automated review finding. This report is fully sanitized — it concerns the methodology only, not any project.
Headline issue: reviewer verdict oscillation
Two consecutive review rounds issued opposite, equally-confident directives on the same single location: round N required a change; round N+1 required reverting that exact change on a more refined argument. Net effect of two full rounds = no change to the deliverable. Each finding was internally plausible; the loop had no mechanism to notice round N+1 was reversing round N, and trusted each single-reviewer verdict in isolation.
Suggestions
- Reversal detection + forced adjudication. Before accepting a finding, compare its target against the immediately preceding round's changes. If a finding would revert or contradict a change the loop itself just made, flag it as contested (not routine-blocking). On the first detected reversal, force an explicit adjudication: state both arguments, identify the deciding principle, and record that principle in the artifact immediately, not after paying for two rounds. Escalate persistent oscillation to a human or a second reviewer.
- Batch low-severity findings + severity-gated iteration. Round count tracked the arrival rate of small findings rather than remaining risk. Let a review pass emit all remaining minor findings together (one consolidated polish round); once no high-severity findings remain, switch to a final-polish mode that finalizes after one cleanup round instead of looping once per nit.
- Explicit, justified blocking-vs-queued routing. Every finding (including a prose-only one that changed no computed value) was locally promoted to "blocking," so severity labels carried no information. Require each finding to be routed blocking or queued with a one-line justification; default no-result-change findings to queued. Test: "Does any externally-observable value or conclusion change?"
- Independent cross-check per key claim (analysis-only work). Verification relied on internal-consistency checks + a re-derivation reaching the same numbers; no external oracle existed, which is why consistency checking couldn't break the oscillation tie (both readings were internally consistent). Require at least one independent cross-check per key claim from a different angle (sanity bound, degenerate case, second derivation).
- Promote oscillation-resolving distinctions to durable lessons. The recurring trap behind the oscillation was documented in the artifact but never promoted to a cross-session lesson because it occurred in only one location. Treat any self-reversal as a lesson trigger in its own right, regardless of how many locations it touches.
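The reversal-detection, routing, and severity-gate suggestions above could be sketched roughly as follows. This is a minimal illustration, not the RLCR loop's actual implementation: every name here (`Change`, `Finding`, `Route`, `is_reversal`, `route`, `next_mode`) is hypothetical, and it assumes each round records its edits as before/after text at a named location and that each finding carries the answer to the routing test.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    BLOCKING = "blocking"    # an externally observable value or conclusion changes
    QUEUED = "queued"        # no result change; defer to one consolidated polish round
    CONTESTED = "contested"  # would undo the previous round's change; needs adjudication


@dataclass(frozen=True)
class Change:
    """One edit applied by the loop in the previous round (hypothetical record)."""
    location: str
    before: str
    after: str


@dataclass(frozen=True)
class Finding:
    """One review finding (hypothetical record)."""
    location: str
    proposed: str          # content the reviewer wants at `location`
    changes_result: bool   # "Does any externally-observable value or conclusion change?"
    justification: str     # the required one-line routing justification


def is_reversal(finding: Finding, previous_changes: list[Change]) -> bool:
    """True if the finding would restore what the previous round replaced."""
    return any(
        change.location == finding.location and change.before == finding.proposed
        for change in previous_changes
    )


def route(finding: Finding, previous_changes: list[Change]) -> Route:
    """Check for a reversal first, then apply the result-change test."""
    if is_reversal(finding, previous_changes):
        return Route.CONTESTED
    return Route.BLOCKING if finding.changes_result else Route.QUEUED


def next_mode(routes: list[Route]) -> str:
    """Severity gate: keep iterating only while blocking findings remain."""
    if Route.CONTESTED in routes:
        return "adjudicate"
    if Route.BLOCKING in routes:
        return "iterate"
    return "final-polish"


# Usage: round N changed "loc-1" from "old" to "new"; round N+1 asks for "old" again.
round_n = [Change("loc-1", before="old", after="new")]
revert = Finding("loc-1", "old", True, "refined argument for the original wording")
print(route(revert, round_n))  # Route.CONTESTED
```

Exact text comparison is the simplest possible reversal test and is used here only to keep the sketch short; a real loop would probably need a looser notion of "contradicts the previous change," which this report does not specify.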
What worked well (preserve)
Up-front testable acceptance criteria + an immutable goal anchor kept execution within its declared scope across all rounds with zero drift, even during the unproductive churn. Keep this as the default.
Filed from an RLCR methodology-analysis phase; fully sanitized (no project-specific information).