feat: confidence-gated cost cascade for AI name recovery (metaharness thesis) by ruvnet · Pull Request #5 · ruvnet/rudevolution

ruvnet · 2026-06-27T17:56:50Z

Summary

Wires the metaharness cost-cascade thesis onto ruDevolution's existing
per-inference confidence score: run a cheap model tier first, and escalate
to a frontier tier only when confidence < threshold. Most names get
recovered cheaply ($0 corpus tier); you pay the frontier price only for the
hard, low-confidence ones. This is a pure cost-Pareto move: no accuracy is
lost relative to the cheap tier, because the cheap answer is kept whenever it
already clears the bar and is otherwise replaced only by a higher-confidence one.

Default behavior is unchanged — the cascade is strictly opt-in. The
standard decompile() pipeline is untouched and all 59 pre-existing tests stay
green.

Part 1 — Honest architecture review (overlap with metaharness)

ruDevolution already implements much of the metaharness thesis natively:

| Capability | ruDevolution (native) | metaharness analogue | Verdict |
|---|---|---|---|
| Self-learning | `inferrer::learn_from_ground_truth` extracts `LearnedPattern`s from ground-truth comparisons; `TrainingCorpus` (210+ patterns) feeds inference; real_world tests run "with learning". | darwin evolve loop | Already has it. Don't bolt on a second learner. |
| Witness chains | `witness.rs`: SHA3-256 content hashes + binary Merkle root, `verify_witness_chain` self-check, serializable `WitnessChainData`, self-verified inside `decompile()`. | ADR-011 Ed25519 witness | Already has it. (Hash-chain provenance vs signed authorship; see interop note.) |
| MinCut module detection | `partitioner.rs`: exact MinCut via `ruvector-mincut::GraphPartitioner` for <5K nodes, Louvain (rayon-parallel) for ≥5K. | ruvector-mincut / ADR-190 | Already has it: literally the same `ruvector-mincut` crate. |
| Confidence scoring | `InferredName.confidence` (0..1) per inference, 5-strategy ladder (corpus → string patterns → property correlation → multi-literal → structural), `Confidence::{High,Medium,Low}` thresholds. | confidence gate | Already has it, and this is the hook. |
| AI inference path | `neural.rs` (feature `neural`): `NeuralInferrer` with 3 backends (pure-Rust transformer `.bin`, ONNX `.onnx`, GGUF/RVF stub). `infer_names_neural` already does a crude "neural first, fall back to corpus" at a hardcoded 0.8 cutoff. | model router | Partial. Single model, fixed cutoff, no cost awareness, no tier ordering, no outcome recording. |

Where metaharness genuinely adds value: the AI inference path. ruDevolution
has a confidence gate but spends it on a fallback decision (neural → corpus),
not a cost-routing decision (cheap → frontier, pay only on escalation). That
is exactly the cascade thesis, and it slots onto the existing confidence field
with zero redundant machinery. Self-learning, witnesses, and MinCut are not
touched — they already exist.

Part 2 — What was implemented (the genuine fit)

New src/cascade.rs:

- `NameInferrer` trait: one method (`infer`) plus `label()`/`cost()`. A tier is
  anything from the $0 corpus inferrer to a neural model to a remote frontier API.
- `CorpusTier`: wraps the existing `inferrer::infer_declaration_name` +
  `TrainingCorpus` as the cheap ($0) tier. No new inference logic.
- `CascadeInferrer`: cheapest-first tiers + escalation threshold (default
  0.9, mirroring `Confidence::High`). Runs tiers in order, stops as soon as a
  tier clears the threshold (never invokes more expensive tiers), and keeps the
  highest-confidence answer if none clear it.
- `CascadeOutcome` / `CascadeStats`: per-inference record (winning tier,
  confidence, tiers tried, escalated?, did escalation change the answer?, cost)
  and aggregates (cheap-win rate, cost spent vs frontier-only baseline, savings).
- `suggest_threshold` / `self_tune`: read recorded outcomes and adjust the
  threshold for the next run: lower it when the frontier keeps confirming
  the cheap tier (stop paying for confirmations), raise it when the frontier
  keeps overturning it. This is the self-tuning loop that closes with
  ruDevolution's existing "gets smarter every run" learning.
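
The self-tuning rule in the last item can be sketched as follows. This is a minimal illustration, not the code in `src/cascade.rs`: the `EscalationRecord` type, the function signature, the step size, the overturn-rate cutoffs, and the clamping range are all assumptions chosen for the example. Only the direction of the adjustment (lower on confirmations, raise on overturns) comes from the description above.

```rust
/// Hypothetical, trimmed-down outcome record: only the two flags the
/// tuning rule needs. The real `CascadeOutcome` carries more fields.
#[derive(Debug, Clone, Copy)]
pub struct EscalationRecord {
    pub escalated: bool,
    /// True when the frontier tier returned a different answer than the cheap tier.
    pub changed_answer: bool,
}

/// Suggest the threshold for the next run from recorded outcomes.
/// All constants below are illustrative assumptions.
pub fn suggest_threshold(current: f64, outcomes: &[EscalationRecord]) -> f64 {
    const STEP: f64 = 0.05; // how far to move per run
    const LOW_OVERTURN: f64 = 0.2; // frontier mostly confirms the cheap tier
    const HIGH_OVERTURN: f64 = 0.5; // frontier mostly overturns the cheap tier
    const MIN: f64 = 0.5;
    const MAX: f64 = 0.99;

    let escalated = outcomes.iter().filter(|o| o.escalated).count();
    if escalated == 0 {
        // No escalations were recorded, so there is no evidence to act on.
        return current;
    }
    let overturned = outcomes
        .iter()
        .filter(|o| o.escalated && o.changed_answer)
        .count();
    let overturn_rate = overturned as f64 / escalated as f64;

    let next = if overturn_rate <= LOW_OVERTURN {
        current - STEP // stop paying for confirmations
    } else if overturn_rate >= HIGH_OVERTURN {
        current + STEP // the cheap tier is being trusted too much
    } else {
        current
    };
    next.clamp(MIN, MAX)
}
```

With these assumed constants, a run in which every escalation merely confirmed the cheap answer moves a 0.9 threshold to roughly 0.85, while a run in which every escalation overturned it moves the threshold to roughly 0.95.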

src/lib.rs: pub mod cascade + opt-in infer_names_cascade(modules, tiers, threshold).
The default decompile() path is not changed.

examples/cost_cascade.rs: deterministic, model-free, $0 demo. Output:

a    -> tier=corpus         conf=0.95 escalated=false cost=0
b    -> tier=corpus         conf=0.95 escalated=false cost=0
c    -> tier=frontier(mock) conf=0.93 escalated=true cost=100
d    -> tier=frontier(mock) conf=0.93 escalated=true cost=100
cheap wins: 2 (50%)  escalations: 2  cost saved: 200 (50% vs frontier-only)
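
The summary line can be reproduced from the four per-name records with a few lines of arithmetic. The sketch below is an assumption about how such aggregates might be computed, not the actual `CascadeStats` implementation: the `Record` and `Stats` types are hypothetical, a "cheap win" is taken to mean "did not escalate", and the frontier-only baseline is taken to be the frontier cost paid once per name.

```rust
/// Hypothetical, trimmed-down per-inference record.
#[derive(Debug, Clone, Copy)]
pub struct Record {
    pub escalated: bool,
    pub cost: u64,
}

/// Hypothetical aggregate, loosely modelled on the fields the PR lists.
#[derive(Debug, Clone, PartialEq)]
pub struct Stats {
    pub total: usize,
    pub cheap_wins: usize,
    pub escalations: usize,
    pub cost_spent: u64,
    pub frontier_only_cost: u64,
}

/// Percentage helper that avoids dividing by zero on empty runs.
fn pct(part: f64, whole: f64) -> f64 {
    if whole == 0.0 { 0.0 } else { 100.0 * part / whole }
}

impl Stats {
    /// `frontier_cost` is the price of one frontier call (assumed flat).
    pub fn from_records(records: &[Record], frontier_cost: u64) -> Self {
        let escalations = records.iter().filter(|r| r.escalated).count();
        Stats {
            total: records.len(),
            cheap_wins: records.len() - escalations,
            escalations,
            cost_spent: records.iter().map(|r| r.cost).sum(),
            // Baseline: every name is sent straight to the frontier tier.
            frontier_only_cost: frontier_cost * records.len() as u64,
        }
    }

    pub fn cost_saved(&self) -> u64 {
        self.frontier_only_cost.saturating_sub(self.cost_spent)
    }

    pub fn summary(&self) -> String {
        format!(
            "cheap wins: {} ({:.0}%)  escalations: {}  cost saved: {} ({:.0}% vs frontier-only)",
            self.cheap_wins,
            pct(self.cheap_wins as f64, self.total as f64),
            self.escalations,
            self.cost_saved(),
            pct(self.cost_saved() as f64, self.frontier_only_cost as f64),
        )
    }
}
```

For the demo above, the baseline is 4 × 100 = 400, the cascade spends 0 + 0 + 100 + 100 = 200, and the saving is 400 − 200 = 200, which is 50% of the baseline.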

How it maps to the existing confidence score

The cascade reads nothing new — it routes purely on InferredName.confidence,
the score the crate has always produced. The cheap tier is the existing
5-strategy inferrer verbatim; escalation fires exactly when that score is below
the bar.
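
A minimal sketch of routing on that single score is shown below. It is not the code in `src/cascade.rs`: `InferredName` is reduced to two fields, the trait signatures are guessed from the method names in this PR, and `TableTier`, `Outcome`, `Cascade`, and `demo_cascade` are hypothetical names introduced for the illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `InferredName`.
#[derive(Debug, Clone, PartialEq)]
pub struct InferredName {
    pub name: String,
    pub confidence: f64, // 0.0..=1.0
}

/// One tier. Method names follow the PR text; signatures are assumptions.
pub trait NameInferrer {
    fn infer(&self, ident: &str) -> Option<InferredName>;
    fn label(&self) -> &str;
    fn cost(&self) -> u64;
}

/// Deterministic mock tier backed by a lookup table.
pub struct TableTier {
    pub label: &'static str,
    pub cost: u64,
    pub answers: HashMap<&'static str, (&'static str, f64)>,
}

impl NameInferrer for TableTier {
    fn infer(&self, ident: &str) -> Option<InferredName> {
        self.answers.get(ident).map(|(name, confidence)| InferredName {
            name: name.to_string(),
            confidence: *confidence,
        })
    }
    fn label(&self) -> &str { self.label }
    fn cost(&self) -> u64 { self.cost }
}

#[derive(Debug, Clone, PartialEq)]
pub struct Outcome {
    pub answer: InferredName,
    pub tier: String,
    pub tiers_tried: usize,
    pub escalated: bool,
    pub cost: u64,
}

pub struct Cascade {
    pub tiers: Vec<Box<dyn NameInferrer>>, // ordered cheapest first
    pub threshold: f64,
}

impl Cascade {
    pub fn infer(&self, ident: &str) -> Option<Outcome> {
        let mut best: Option<(InferredName, String)> = None;
        let mut tiers_tried = 0;
        let mut cost = 0;
        for tier in &self.tiers {
            tiers_tried += 1;
            cost += tier.cost();
            if let Some(answer) = tier.infer(ident) {
                // Keep the highest-confidence answer seen so far.
                let improves = best
                    .as_ref()
                    .map_or(true, |(b, _)| answer.confidence > b.confidence);
                if improves {
                    best = Some((answer, tier.label().to_string()));
                }
            }
            // Escalate only while confidence < threshold.
            if best.as_ref().map_or(false, |(b, _)| b.confidence >= self.threshold) {
                break;
            }
        }
        best.map(|(answer, tier)| Outcome {
            answer,
            tier,
            tiers_tried,
            escalated: tiers_tried > 1,
            cost,
        })
    }
}

/// Two-tier demo: a $0 table tier and a mock frontier tier costing 100.
pub fn demo_cascade() -> Cascade {
    let corpus = TableTier {
        label: "corpus",
        cost: 0,
        answers: HashMap::from([("a", ("alpha", 0.95)), ("c", ("gamma_guess", 0.40))]),
    };
    let frontier = TableTier {
        label: "frontier(mock)",
        cost: 100,
        answers: HashMap::from([("a", ("alpha", 0.99)), ("c", ("gamma", 0.93))]),
    };
    Cascade { tiers: vec![Box::new(corpus), Box::new(frontier)], threshold: 0.9 }
}
```

In this sketch `demo_cascade().infer("a")` stops at the corpus tier because 0.95 clears the 0.9 bar, while `infer("c")` escalates because 0.40 does not. A cascade built with only one tier has nothing to escalate to, so it returns exactly what that tier returns.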

Default-unchanged confirmation

- `decompile()` / `decompile_default()` still call `inferrer::infer_names`; no change.
- A single-tier cascade can never escalate, so its output equals that of the
  wrapped tier (tests `single_tier_never_escalates_default_unchanged` and
  `single_corpus_matches_existing_inferrer`).
- All 59 pre-existing tests pass unmodified.

Part 3 — Interop notes (design-level, no heavy cross-language dep)

- ruDevolution as darwin's decompile engine: the cascade's `CascadeStats`
  (cheap-win rate, cost saved, escalation overturn rate) are exactly the fitness
  signals a darwin/metaharness self-improvement loop optimizes. Expose them as a
  run summary and darwin can evolve the threshold + tier mix.
- `@metaharness/router` as an optional sidecar: it fits behind `NameInferrer`
  as a frontier tier. Implement `infer()` to POST the `InferenceContext` to the
  router over HTTP and map the response into `InferredName{confidence}`. No
  cross-language build dependency; the router is just one more `Box<dyn NameInferrer>`.
  Kept as a documented bridge, not wired in (no real model calls in this PR).
- Witness-chain alignment: ruDevolution uses SHA3-256 Merkle content hashing;
  ADR-011 uses Ed25519 signatures. They compose rather than conflict: the
  Merkle `chain_root` is the natural payload to Ed25519-sign for authorship,
  giving "these bytes derive from that bundle" (have) + "signed by this agent"
  (add). Noted as a clean future bridge, not implemented here.
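
The sidecar note could be realized along the following lines. This is a sketch under stated assumptions, not part of the PR: the trait signatures are guessed from the method names above, `SidecarTier` and `Transport` are hypothetical, and the HTTP call is abstracted behind an injected closure, so no router protocol or client library is assumed. In a real bridge the closure would wrap whatever HTTP client posts the `InferenceContext` to the router.

```rust
/// Hypothetical stand-in for the crate's `InferredName`.
#[derive(Debug, Clone, PartialEq)]
pub struct InferredName {
    pub name: String,
    pub confidence: f64,
}

/// Method names follow the PR text; signatures are assumptions.
pub trait NameInferrer {
    fn infer(&self, ident: &str) -> Option<InferredName>;
    fn label(&self) -> &str;
    fn cost(&self) -> u64;
}

/// Injected transport: takes the identifier to resolve and returns the
/// sidecar's (name, confidence) pair, or an error string on failure.
/// The real request and response shapes are not specified in this PR.
pub type Transport = Box<dyn Fn(&str) -> Result<(String, f64), String>>;

/// A frontier tier that delegates to an out-of-process service.
pub struct SidecarTier {
    pub transport: Transport,
    pub cost_per_call: u64,
}

impl NameInferrer for SidecarTier {
    fn infer(&self, ident: &str) -> Option<InferredName> {
        match (self.transport)(ident) {
            // Accept only usable answers and force confidence into 0..=1.
            Ok((name, confidence)) if !name.is_empty() && confidence.is_finite() => {
                Some(InferredName {
                    name,
                    confidence: confidence.clamp(0.0, 1.0),
                })
            }
            // A transport error or unusable reply yields no answer, so a
            // cascade would simply keep the best cheaper-tier answer.
            _ => None,
        }
    }
    fn label(&self) -> &str {
        "sidecar"
    }
    fn cost(&self) -> u64 {
        self.cost_per_call
    }
}
```

Returning `None` on failure rather than an error keeps a sidecar outage from breaking a run, at the price of hiding the failure from the caller. Whether that trade is right depends on how outcomes are recorded.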

Test results

cargo test (built against the real ruvector-mincut dep chain):

lib unit:     54 passed   (41 prior + 13 new cascade tests)
ground_truth:  5 passed
integration:   8 passed
real_world:    4 passed
doc-tests:     1 passed
------------------------------
TOTAL:        72 passed, 0 failed   (was 59; +13 new, 0 pre-existing changed)

cargo clippy --all-targets: zero new warnings from cascade.rs/lib.rs
(the repo's pre-existing clippy warnings in other files are untouched).

Notes / honest status

Nothing stubbed in the cascade itself — routing, threshold gating,
best-answer keeping, outcome recording, stats, and self-tuning are all real and
tested. The example's "frontier" tier is an intentional deterministic mock so
the demo is $0 and reproducible; production plugs a real model into the same trait.
Build note: the public repo inherits version.workspace/serde.workspace
and a ../ruvector-mincut path dep, so it builds as a member of the ruvector
workspace (where ruvector-mincut lives). This PR was built/tested against that
real dependency; the committed Cargo.toml keeps the workspace inheritance
intact and only adds the [[example]] entry.

🤖 Generated with claude-flow

https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf

Wire the metaharness cost-cascade thesis onto rudevolution's existing per-inference confidence score: run a cheap model tier first and escalate to a frontier tier ONLY when confidence < threshold. Most names are recovered cheaply; pay frontier cost only for the hard, low-confidence ones.

- src/cascade.rs: NameInferrer trait, CorpusTier ($0 built-in tier), CascadeInferrer (cheapest-first routing, keeps best answer), CascadeOutcome + CascadeStats (per-inference + aggregate cost accounting), and suggest_threshold/self_tune that learn from recorded outcomes to self-tune the escalation threshold over runs (closes the self-learning loop).
- src/lib.rs: pub mod cascade + opt-in infer_names_cascade(); default decompile() path is untouched.
- examples/cost_cascade.rs: deterministic, model-free ($0) demo.
- 13 new unit tests; all 59 pre-existing tests stay green (72 total).
- Default behavior unchanged: a single-tier cascade == existing inferrer.

No real model calls. No provider hardcoded: bring your own frontier tier via the NameInferrer trait (e.g. neural::NeuralInferrer or an external sidecar such as @metaharness/router).

Co-Authored-By: claude-flow <[email protected]>
Claude-Session: https://claude.ai/code/session_019rVRYrRDKyxYK18kuVrDSf

ruvnet wants to merge 1 commit into main from feat/metaharness-cost-cascade
