
feat: profile precision enhancement v2 — Beta confidence model with 44 fixes#152

Open
rookie136 wants to merge 1 commit into tickernelz:main from rookie136:main


rookie136 commented Jul 3, 2026


Summary

Replaces the linear confidence model with a Beta posterior model, and adds injection quality improvements, semantic deduplication, isolated Thompson sampling for weak hits, and validator extensions.

Core Changes

Beta Confidence Model

  • syncConfidence(item): confidence = alpha/(alpha+beta) × min(timeFactor, trendMultiplier)
  • Alpha/beta accumulation replaces direct confidence manipulation
  • lazyMigrateAlpha: detects legacy Thompson prior (0.5, 1.5) and migrates to evidence-based alpha
  • Fills missing fields: weakAlpha, weakBeta, lastMatchTime, firstSeen, pendingValidation
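
A minimal sketch of the confidence formula above, for readers who want to see the arithmetic. The `ProfileItem` shape is hypothetical, and `timeFactor` and `trendMultiplier` are passed in as arguments only because this description does not say how they are derived; the real `syncConfidence(item)` takes just the item.

```typescript
// Hypothetical item shape: only alpha, beta and confidence are named in this PR.
interface ProfileItem {
  alpha: number;
  beta: number;
  confidence: number;
}

// Assumption: timeFactor and trendMultiplier are supplied by the caller.
function syncConfidenceSketch(
  item: ProfileItem,
  timeFactor: number,
  trendMultiplier: number,
): number {
  const total = item.alpha + item.beta;
  // Posterior mean of Beta(alpha, beta); guard against an empty prior.
  const posteriorMean = total > 0 ? item.alpha / total : 0;
  item.confidence = posteriorMean * Math.min(timeFactor, trendMultiplier);
  return item.confidence;
}

const sampleItem: ProfileItem = { alpha: 3, beta: 1, confidence: 0 };
const syncedConfidence = syncConfidenceSketch(sampleItem, 1.0, 0.8);
console.log(syncedConfidence.toFixed(2)); // 0.75 * 0.8
```

Because confidence is always recomputed from alpha and beta, it cannot drift away from the accumulated evidence the way a directly mutated value can.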

Merge Method Constraints

  • combineItems → mergeConfirmedMatch (renamed): only called by exact/strong/weak-upgrade
  • Forced merge and cross-validation use direct field manipulation (avoid F5-9/F5-10 alpha desync)
  • combineThree: alpha=a1+a2+1, beta=b1+b2, frequency=f1+f2+1
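
The `combineThree` arithmetic can be illustrated as follows. This is only a sketch of the stated sums; the `EvidenceCounts` type and the `combineCountsSketch` name are hypothetical and not taken from the code in this PR.

```typescript
// Hypothetical container for the three fields named in the combineThree rule.
interface EvidenceCounts {
  alpha: number;
  beta: number;
  frequency: number;
}

// alpha = a1 + a2 + 1, beta = b1 + b2, frequency = f1 + f2 + 1
function combineCountsSketch(a: EvidenceCounts, b: EvidenceCounts): EvidenceCounts {
  return {
    alpha: a.alpha + b.alpha + 1,
    beta: a.beta + b.beta,
    frequency: a.frequency + b.frequency + 1,
  };
}

const combinedCounts = combineCountsSketch(
  { alpha: 2, beta: 1, frequency: 3 },
  { alpha: 1.5, beta: 0.5, frequency: 2 },
);
console.log(combinedCounts); // { alpha: 4.5, beta: 1.5, frequency: 6 }
```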

Thompson Sampling (Isolated)

  • Weak hits use isolated weakAlpha/weakBeta
  • Cumulative limit > 7 triggers forced judgment (mean ≥ 0.45 upgrades)
  • Upgrade: total alpha +1.0 (mergeConfirmedMatch +0.5 + bonus +0.5)
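
A rough sketch of the forced-judgment rule, under two assumptions that this description does not confirm: that the "cumulative" count is `weakAlpha + weakBeta`, and that an item failing the mean test is simply not upgraded (the `"no_upgrade"` outcome is a placeholder). The type and function names are hypothetical.

```typescript
// Hypothetical state: the isolated weak counters plus the main alpha.
interface WeakHitState {
  alpha: number;
  weakAlpha: number;
  weakBeta: number;
}

type WeakDecision = "keep_sampling" | "upgrade" | "no_upgrade";

const CUMULATIVE_LIMIT = 7; // forced judgment once the cumulative count exceeds 7
const UPGRADE_MEAN = 0.45; // posterior mean needed to upgrade

function judgeWeakHitsSketch(state: WeakHitState): WeakDecision {
  // Assumption: "cumulative" means the sum of the isolated weak counters.
  const cumulative = state.weakAlpha + state.weakBeta;
  if (cumulative <= CUMULATIVE_LIMIT) return "keep_sampling";

  const mean = state.weakAlpha / cumulative;
  if (mean >= UPGRADE_MEAN) {
    // Total alpha +1.0 on upgrade (+0.5 from mergeConfirmedMatch, +0.5 bonus).
    state.alpha += 1.0;
    return "upgrade";
  }
  return "no_upgrade";
}

const promising: WeakHitState = { alpha: 1, weakAlpha: 4, weakBeta: 4 };
const promisingDecision = judgeWeakHitsSketch(promising);
console.log(promisingDecision, promising.alpha);
```

Keeping `weakAlpha`/`weakBeta` separate means weak hits influence the main alpha only through the explicit upgrade step.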

Decay & Staleness

  • Removed linear decay and conf < 0.3 hard threshold
  • Stale condition: alpha ≤ 2 && age > 30 days
  • Removed decayWeakHits, applyConfidenceDecay, capConfidences
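
The stale condition reduces to a single predicate. The sketch below assumes age is measured from a `firstSeen` timestamp in milliseconds; the description does not say whether `firstSeen` or `lastMatchTime` is used, and `isStaleSketch` is a hypothetical name.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Stale when evidence is thin (alpha <= 2) and the item is older than 30 days.
// Assumption: age is counted from firstSeen, given in epoch milliseconds.
function isStaleSketch(alpha: number, firstSeen: number, now: number): boolean {
  const ageDays = (now - firstSeen) / DAY_MS;
  return alpha <= 2 && ageDays > 30;
}

const referenceNow = Date.UTC(2026, 6, 3); // 2026-07-03
const oldThinItem = isStaleSketch(1.5, referenceNow - 45 * DAY_MS, referenceNow);
const oldStrongItem = isStaleSketch(6, referenceNow - 45 * DAY_MS, referenceNow);
console.log(oldThinItem, oldStrongItem);
```

Unlike the removed `conf < 0.3` threshold, this rule never discards an item that has accumulated evidence, however old it is.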

Semantic Deduplication

  • deduplicateItems: LLM checks same-category pairs with cos ≥ 0.50
  • Max 5 LLM calls per cycle, dedupCheckedCache LRU (1000 limit)
  • Merged items accumulate alpha/beta/weakAlpha/weakBeta
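
A sketch of how candidate pairs might be chosen before any LLM call, using the thresholds listed above. Everything here is illustrative: the `EmbeddedItem` shape, the pair-key format, and the `Map`-based eviction are assumptions, and the LLM check itself is omitted.

```typescript
// Hypothetical item shape with a precomputed embedding.
interface EmbeddedItem {
  id: string;
  category: string;
  embedding: number[];
}

const COS_THRESHOLD = 0.5;
const MAX_LLM_CALLS = 5;
const CACHE_LIMIT = 1000;

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// A Map keeps insertion order, so the first key is the least recently added.
function rememberPair(cache: Map<string, boolean>, key: string): void {
  cache.delete(key);
  cache.set(key, true);
  if (cache.size > CACHE_LIMIT) {
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
}

// Returns at most MAX_LLM_CALLS unseen same-category pairs with cos >= 0.50.
function selectDedupCandidates(
  items: EmbeddedItem[],
  cache: Map<string, boolean>,
): [string, string][] {
  const pairs: [string, string][] = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (pairs.length >= MAX_LLM_CALLS) return pairs;
      const a = items[i], b = items[j];
      if (a.category !== b.category) continue;
      const key = [a.id, b.id].sort().join("|");
      if (cache.has(key)) continue;
      if (cosineSimilarity(a.embedding, b.embedding) < COS_THRESHOLD) continue;
      rememberPair(cache, key);
      pairs.push([a.id, b.id]);
    }
  }
  return pairs;
}

const pairCache = new Map<string, boolean>();
const sampleItems: EmbeddedItem[] = [
  { id: "a", category: "style", embedding: [1, 0] },
  { id: "b", category: "style", embedding: [1, 0.1] },
  { id: "c", category: "tools", embedding: [1, 0] },
  { id: "d", category: "style", embedding: [0, 1] },
];
const firstPass = selectDedupCandidates(sampleItems, pairCache);
const secondPass = selectDedupCandidates(sampleItems, pairCache);
console.log(firstPass, secondPass);
```

In this sketch the second pass returns nothing because the only qualifying pair is already cached, which is consistent with a steady state of zero LLM calls per cycle.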

pendingValidation

  • New items start with pendingValidation = true
  • Confirmed in the next cycle: alpha += 1.0, followed by syncConfidence
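
The confirmation step might look roughly like this. The sketch assumes the flag is cleared on confirmation and uses the plain posterior mean as a stand-in for `syncConfidence` (the time and trend factors are left out); the `PendingItem` type and function name are hypothetical.

```typescript
// Hypothetical item shape for the pendingValidation flow.
interface PendingItem {
  alpha: number;
  beta: number;
  confidence: number;
  pendingValidation: boolean;
}

// Returns true if the item was pending and has now been confirmed.
function confirmPendingSketch(item: PendingItem): boolean {
  if (!item.pendingValidation) return false;
  item.alpha += 1.0;
  item.pendingValidation = false; // assumption: the flag is cleared here
  // Stand-in for syncConfidence: posterior mean without time/trend factors.
  item.confidence = item.alpha / (item.alpha + item.beta);
  return true;
}

const freshItem: PendingItem = { alpha: 1, beta: 1, confidence: 0.5, pendingValidation: true };
const wasConfirmed = confirmPendingSketch(freshItem);
console.log(wasConfirmed, freshItem.alpha, freshItem.confidence.toFixed(3));
```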

Validator (5 verdicts)

  • confirmed/contradicted/no_evidence/inaccurate/oversimplified
  • Covers preferences (top-5) + patterns (top-3)
  • oversimplified triggers evolveAndUpdate when evidence ≥ 3
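
The five verdicts and the evolve trigger can be sketched as a union type and a guard. The selection helper assumes "top" means highest confidence, which this description does not state; `ScoredEntry`, `selectForValidation` and `shouldEvolve` are hypothetical names.

```typescript
type Verdict =
  | "confirmed"
  | "contradicted"
  | "no_evidence"
  | "inaccurate"
  | "oversimplified";

// Hypothetical entry shape used only for ranking.
interface ScoredEntry {
  text: string;
  confidence: number;
}

// Assumption: "top-N" ranks by confidence, highest first.
function topByConfidence(entries: ScoredEntry[], n: number): ScoredEntry[] {
  return [...entries].sort((a, b) => b.confidence - a.confidence).slice(0, n);
}

// Validation covers the top 5 preferences plus the top 3 patterns.
function selectForValidation(
  preferences: ScoredEntry[],
  patterns: ScoredEntry[],
): ScoredEntry[] {
  return [...topByConfidence(preferences, 5), ...topByConfidence(patterns, 3)];
}

// oversimplified leads to evolveAndUpdate only with at least 3 pieces of evidence.
function shouldEvolve(verdict: Verdict, evidenceCount: number): boolean {
  return verdict === "oversimplified" && evidenceCount >= 3;
}

const samplePreferences = Array.from({ length: 6 }, (_, i) => ({
  text: `pref-${i}`,
  confidence: i / 10,
}));
const samplePatterns = Array.from({ length: 4 }, (_, i) => ({
  text: `pattern-${i}`,
  confidence: i / 10,
}));
const validationBatch = selectForValidation(samplePreferences, samplePatterns);
console.log(validationBatch.length, shouldEvolve("oversimplified", 3));
```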

Injection Quality

  • dedupByCategory + scoreByRecency (conf×0.7 + exp(-age/90)×0.3)
  • Workflow context anchor in analysis prompt
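
The recency score can be written out directly from the formula above. The sketch assumes `age` is in days, so 90 acts as the decay constant; the function name is hypothetical.

```typescript
// score = conf * 0.7 + exp(-age / 90) * 0.3
// Assumption: age is measured in days.
function scoreByRecencySketch(confidence: number, ageDays: number): number {
  return confidence * 0.7 + Math.exp(-ageDays / 90) * 0.3;
}

const freshModerate = scoreByRecencySketch(0.6, 0); // 0.42 + 0.30
const agedModerate = scoreByRecencySketch(0.6, 180); // recency term has decayed
console.log(freshModerate.toFixed(3), agedModerate.toFixed(3));
```

With these weights confidence dominates the ranking, and recency mostly separates items of similar confidence.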

Bug Fixes

  • P3: retry path now calls decayInMemory
  • P10: evolve no longer adopts a result when evidence < 3

Deployment Results (2026-07-03)

| Metric | Before | After |
| --- | --- | --- |
| conf saturation (≥0.98) | 7/22 | 2/20 |
| conf=1.0 | 6 items | 0 |
| conf range | 0.50~0.998 | 0.42~0.99 |
| Dedup merges | 0 | 2 pairs |
| Forced merge conf→1 | 16/cycle | 0 |
| Decay false positives | 12/cycle | 0 |
| Build | | 0 errors |

@rookie136 rookie136 force-pushed the main branch 2 times, most recently from 4c9e610 to 1022f81 Compare July 3, 2026 08:22
…4 fixes

Core changes:
- Beta unified confidence model (syncConfidence, alpha/beta accumulation replaces confidence saturation)
- mergeConfirmedMatch rename (combineItems → called only for confirmed matches)
- forced merge / cross-validation use direct field manipulation (avoid alpha desync)
- Thompson weak-hit sampling uses isolated weakAlpha/weakBeta
- Semantic dedup pass: deduplicateItems + checkSemanticDuplicate + pair cache LRU
- pendingValidation cross-batch confirmation + lazy migration
- Remove capConfidences / applyConfidenceDecay / decayWeakHits dead code
- Validator extended: 5-value verdict + patterns coverage + oversimplified → evolve
- Injection quality: dedupByCategory + scoreByRecency + workflow context anchor
- AI-cleanup preserveKeys updated for Beta fields + alpha accumulation
- Bug fixes: P3 retry path decay, P10 evolve evidence threshold >= 3

Deployment results (2026-07-03):
- confidence distribution: 0.42~0.99 (distinguishable)
- 0 saturation events
- dedup: steady-state 0 LLM calls per cycle
- 0 decay false positives
- Thompson weak/familiar isolation correct
- injection: 5-dimension full coverage