Capability Schema does not replace world model benchmarks. It gives them a shared language.
Capability Schema defines a standard way to describe, observe, and report the capabilities of generative world models.
Benchmarks produce a number. Capability Schema produces a semantic unit — a Capability Score that is comparable across models, queryable, and traceable to evidence.
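To make "comparable, queryable, and traceable" concrete, here is a minimal sketch of what a Capability Score record could look like in code. The class name `CapabilityScore`, its fields, the `query` helper, and all model names, values, and evidence IDs are hypothetical placeholders for illustration; they are not taken from the specification or the reference implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CapabilityScore:
    """Hypothetical record; field names are assumptions, not the spec's schema."""
    capability: str                 # which Capability the score refers to
    model: str                      # which world model was observed
    value: float                    # the reported score
    evidence: Tuple[str, ...] = ()  # IDs of the runs/artifacts behind the score


# Placeholder values only; these are not measured results.
scores = [
    CapabilityScore("temporal_stability", "model_a", 0.5, ("run-001",)),
    CapabilityScore("temporal_stability", "model_b", 0.75, ("run-002", "run-003")),
]


def query(records: List[CapabilityScore], capability: str) -> List[CapabilityScore]:
    """Return all scores for one Capability, highest value first."""
    matching = [r for r in records if r.capability == capability]
    return sorted(matching, key=lambda r: r.value, reverse=True)


best = query(scores, "temporal_stability")[0]
print(best.model, best.value, best.evidence)
```

Because each record carries its own evidence identifiers, a reader can go from a ranked comparison back to the runs that produced it.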
capabilities/
└── temporal_stability.yaml
Each file defines:
- What the Capability means
- What metrics observe it
- What predictors estimate it (when ground-truth is unavailable)
- How to report it
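The four parts listed above can be pictured as four required sections of a definition file. The sketch below mirrors them as a Python dictionary and checks that none is missing. The key names (`meaning`, `metrics`, `predictors`, `reporting`) and every value are assumptions chosen for illustration; the actual YAML keys are defined by the files in `capabilities/`.

```python
from typing import Dict, List

# Hypothetical section names mirroring the four bullets above.
REQUIRED_SECTIONS = ("meaning", "metrics", "predictors", "reporting")

# Stand-in for a parsed capabilities/temporal_stability.yaml; all values are placeholders.
temporal_stability: Dict[str, object] = {
    "name": "temporal_stability",
    "meaning": "Placeholder description of what the Capability means.",
    "metrics": ["placeholder_metric"],
    "predictors": ["placeholder_predictor"],
    "reporting": {"format": "placeholder"},
}


def missing_sections(definition: Dict[str, object]) -> List[str]:
    """Return the required sections that are absent from a definition."""
    return [key for key in REQUIRED_SECTIONS if key not in definition]


print(missing_sections(temporal_stability))
print(missing_sections({"meaning": "only the meaning is given"}))
```

A check of this kind is one plausible building block for a conformance test, though the real suite lives in `conformance/` and may work differently.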
Dreamer4. Cosmos. Genie. Any future world model.
Same Capability definition. Different adapters.
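The "same definition, different adapters" idea can be sketched as one metric function written against a small adapter interface. Everything here is hypothetical: the `WorldModelAdapter` protocol, the two toy adapters, and `max_step_change`, which is a deliberately simple stand-in and not the specification's Temporal Stability metric.

```python
from typing import List, Protocol


class WorldModelAdapter(Protocol):
    """Hypothetical interface; the reference implementation's API may differ."""
    name: str

    def rollout(self, steps: int) -> List[float]:
        """Return one scalar summary per generated step."""
        ...


class ConstantAdapter:
    """Toy model whose output never changes."""
    name = "toy_constant"

    def rollout(self, steps: int) -> List[float]:
        return [1.0] * steps


class DriftingAdapter:
    """Toy model whose output drifts by 0.5 per step."""
    name = "toy_drifting"

    def rollout(self, steps: int) -> List[float]:
        return [1.0 + 0.5 * t for t in range(steps)]


def max_step_change(adapter: WorldModelAdapter, steps: int = 4) -> float:
    """Toy stand-in metric: largest change between consecutive steps."""
    frames = adapter.rollout(steps)
    return max(abs(b - a) for a, b in zip(frames, frames[1:]))


# The same metric code runs unchanged against every adapter.
report = {a.name: max_step_change(a) for a in (ConstantAdapter(), DriftingAdapter())}
print(report)
```

Only the adapter knows how to drive a particular model, so supporting a new world model means writing a new adapter and leaving the Capability definition untouched.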
# Install the reference implementation
pip install capability-schema
# Or use the spec directly — Capability Schema is a document standard,
# not a software dependency. Anyone can implement it.

See capability-schema-reference for the reference implementation.
capability-schema-spec/
├── capabilities/ ← Capability definitions (YAML)
├── conformance/ ← Conformance test suite
├── rfcs/ ← Historical RFCs and proposals
├── CHARTER.md ← Governance charter
├── GOVERNANCE.md ← Governance rules
├── PROPOSAL.md ← Project establishment report
└── README.md ← This file
See GOVERNANCE.md, which sets out three constitutional axioms and a governance path of BDFL → Editorial Board → Foundation.
- Anyone may claim they use Capability Schema.
- Only those who pass the Conformance Test may claim "Capability Schema Conformant."
Full form: "We report results using the Capability Schema Specification."
Short form: "following the Capability Schema"
Capability Schema is built on the work of others:
- MMBench2 (Hallucination in World Models is Predictable and Preventable) by Nicklas Hansen & Xiaolong Wang (UC San Diego) — the dataset, checkpoints, and hallucination taxonomy that first identified Temporal Stability as a measurable dimension. The Reference Implementation wraps MMBench2's pretrained Dreamer4 model as its first WorldModel adapter.
The Capability Schema is a document standard. These projects consume it as a runtime layer:
| Project | Role | Link |
|---|---|---|
| TraceOS | AI experiment operating system — uses Capability Schema YAML as input to its Analytics Engine via a transformation-only adapter | aaa-mvc/traceos |
| capability-schema-reference | Reference implementation wrapping Dreamer4 + MMBench2 | aaa-mvc/capability-schema-reference |
To add your project, open a PR adding a row to this table.
- Specification: CC BY 4.0
- Conformance Tests: MIT