Ed0C97/grounding-toolkit

path README.md
section repo-root
doc-type package-readme
status stable
last_updated 2026-05-03

grounding-toolkit

Multi-tier groundedness & hallucination detection for LLM outputs.

The 7th sibling of the Sentinel monorepo, alongside cognis-toolkit, exchequer-toolkit, dspy-toolkit, rlm-toolkit, ocr-toolkit, and pdf-finder-toolkit.

Features

  • Cascade verification: consensus prior, lexical, semantic, NLI, LLM-judge
  • Preventive grounding: structured signatures that force citation_span emission
  • Multimodal: tables, KV pairs, figures, signatures
  • Numerical derivation: generic DerivationVerifier that re-computes user-supplied formulas against grounded primitives
  • Temporal & definitional consistency
  • Cross-document grounding (multi-doc linker)
  • Multilingual (locale-tag driven, no hard-coded languages)
  • Explainability: evidence pointers + conflict spans + reasoning trace
  • Confidence calibration: Bayesian posterior + uncertainty quantification
  • Audit: cryptographic Merkle proofs + immutable reasoning logs
  • Adversarial robustness
  • Calibration framework: gold-truth schema + Bayesian threshold tuner + feedback loop
  • Eval harness: RAGAS, DeepEval, TruLens adapters
  • Constitutional rules engine: pluggable YAML-loaded rules (consumer supplies the rule set; toolkit only ships the matcher)
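
One way to picture the cascade in the first bullet is as a series of tiers run cheapest-first, where a later tier is consulted only when an earlier one is inconclusive. The README does not spell out how the tiers are combined, so the short-circuit logic below is an assumption, and `TierResult`, `lexical_tier`, `abstaining_tier`, `run_cascade`, and both thresholds are hypothetical names for illustration rather than the toolkit's API:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class TierResult:
    verdict: str   # "GROUNDED", "UNGROUNDED", or "UNCERTAIN"
    score: float   # support score in [0.0, 1.0] from the deciding tier
    tier: str      # name of the last tier that produced a score


# Hypothetical tier signature: return a support score, or None to abstain.
Tier = Callable[[str, str], Optional[float]]


def lexical_tier(claim: str, source: str) -> Optional[float]:
    """Toy lexical check: fraction of claim tokens found in the source."""
    claim_tokens = claim.lower().split()
    if not claim_tokens:
        return None
    source_tokens = set(source.lower().split())
    hits = sum(1 for tok in claim_tokens if tok in source_tokens)
    return hits / len(claim_tokens)


def abstaining_tier(claim: str, source: str) -> Optional[float]:
    """Stand-in for a heavier tier (semantic, NLI, LLM-judge) with no backend."""
    return None


def run_cascade(
    claim: str,
    source: str,
    tiers: Sequence[Tuple[str, Tier]],
    accept: float = 0.9,   # assumed threshold: stop and accept at or above this
    reject: float = 0.1,   # assumed threshold: stop and reject at or below this
) -> TierResult:
    last_score, last_tier = 0.5, "none"
    for name, tier in tiers:
        score = tier(claim, source)
        if score is None:
            continue  # tier abstained, fall through to the next one
        last_score, last_tier = score, name
        if score >= accept:
            return TierResult("GROUNDED", score, name)
        if score <= reject:
            return TierResult("UNGROUNDED", score, name)
    return TierResult("UNCERTAIN", last_score, last_tier)


src = "Total debt is EUR 8.4M as of year end"
tiers = [("lexical", lexical_tier), ("semantic", abstaining_tier)]

grounded = run_cascade("Total debt is EUR 8.4M", src, tiers)
ungrounded = run_cascade("Revenue doubled", src, tiers)
uncertain = run_cascade("Total debt is USD 9M", src, tiers)
print(grounded.verdict, ungrounded.verdict, uncertain.verdict)
```

In this sketch a partially matching claim is neither accepted nor rejected by the lexical tier, so it falls through and ends as UNCERTAIN once the remaining tier abstains. A real deployment would plug actual semantic, NLI, or judge backends into the later slots.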

Design principles

  • Provider-agnostic: embedding / NLI / LLM-judge / retrieval are Protocols. Inject any backend (LLM, OpenAI, local Sentence-Transformers, custom, ...).
  • Domain-agnostic: zero domain knowledge baked in. The toolkit ships a pure verification engine. Domain-specific assets — financial ratios, legal-clause libraries, regulatory rule sets — live in the consumer and are passed as data.
  • Single-tenant friendly, multi-tenant ready: per-tenant Source objects + injectable audit logger.
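
The provider-agnostic principle relies on Python's structural typing: any object with the right methods satisfies a `typing.Protocol`, with no inheritance needed. A minimal sketch of the idea, assuming an embedding interface with a single `embed` method; `EmbeddingBackend`, `LetterCountEmbedder`, and `semantic_similarity` are hypothetical names, not the Protocols the toolkit actually defines:

```python
import math
from typing import List, Protocol, Sequence


class EmbeddingBackend(Protocol):
    """Hypothetical interface: anything with this method can be injected."""

    def embed(self, texts: Sequence[str]) -> List[List[float]]: ...


class LetterCountEmbedder:
    """Toy backend: a 26-dimensional vector of ASCII letter counts."""

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            counts = [0.0] * 26
            for ch in text.lower():
                if "a" <= ch <= "z":
                    counts[ord(ch) - ord("a")] += 1.0
            vectors.append(counts)
        return vectors


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_similarity(backend: EmbeddingBackend, claim: str, passage: str) -> float:
    """Depends only on the Protocol, never on a concrete provider."""
    claim_vec, passage_vec = backend.embed([claim, passage])
    return cosine(claim_vec, passage_vec)


same = semantic_similarity(LetterCountEmbedder(), "abc", "cab")
disjoint = semantic_similarity(LetterCountEmbedder(), "abc", "xyz")
print(round(same, 3), round(disjoint, 3))
```

Swapping in a hosted or local model would then mean writing another class with the same `embed` method and passing it in; the calling code stays unchanged.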

Quick start

from grounding import GroundingVerifier, Claim, Source

verifier = GroundingVerifier()
claim = Claim(text='Total debt is EUR 8.4M', citation_span=(5, 1240, 1280))
# document_full_text is supplied by the caller: the source document's full text as a str
source = Source.from_text(document_full_text)

result = verifier.verify(claim, source)
print(result.verdict)             # GROUNDED | UNGROUNDED | UNCERTAIN
print(result.confidence)          # 0.0 - 1.0
print(result.evidence_pointers)   # [(doc_id, page, char_start, char_end), ...]
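
A consumer will typically branch on the verdict and confidence together. The sketch below shows one possible triage policy. It uses a stand-in `StubResult` so that it runs without the toolkit installed, and it assumes the verdict compares equal to the strings shown in the comments above; `StubResult`, `triage`, and the 0.8 threshold are hypothetical:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StubResult:
    """Stand-in carrying the three fields printed in the quick start."""

    verdict: str                      # "GROUNDED" | "UNGROUNDED" | "UNCERTAIN"
    confidence: float                 # 0.0 - 1.0
    evidence_pointers: List[Tuple] = field(default_factory=list)


def triage(result: StubResult, min_confidence: float = 0.8) -> str:
    """Map a verification result to a consumer-side action."""
    if result.confidence >= min_confidence:
        if result.verdict == "GROUNDED":
            return "accept"
        if result.verdict == "UNGROUNDED":
            return "reject"
    # Low confidence or an UNCERTAIN verdict goes to a human reviewer.
    return "review"


print(triage(StubResult("GROUNDED", 0.95)))
print(triage(StubResult("UNGROUNDED", 0.91)))
print(triage(StubResult("GROUNDED", 0.55)))
print(triage(StubResult("UNCERTAIN", 0.99)))
```

Sending low-confidence results to review regardless of verdict is a deliberately conservative choice; the threshold would normally be tuned against labelled examples.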

Installation

pip install grounding-toolkit              # core (lexical + consensus tiers)
pip install "grounding-toolkit[all]"         # full stack with all integrations (quoted so shells such as zsh do not expand the brackets)
pip install "grounding-toolkit[cognis,dspy]" # selective

Documentation

See STRUCTURE.md for repository layout and docs/ARCHITECTURE.md for design.

Contributing

See CONTRIBUTING.md.

License

MIT

Release v2026.5.16.12 (2026-05-12)

Toolkit adoption sweep. Sentinel, arbor and tay now consume the sibling toolkits at the maximum reasonable level for their domain. A per-consumer matrix ships at each repo root in TOOLKIT_COVERAGE.md.

Consumption summary (after this sweep):

Consumer   Raw             Effective
sentinel   33% (310/926)   ~46%
arbor      17% (162/926)   ~33%
tay        16% (150/926)   ~30%

Effective coverage discounts structural overhead (errors, DTOs, Protocols, test stubs, CLI commands, alternate provider adapters) that a downstream consumer is not expected to import directly.
