# score-normalization

Here are 3 public repositories matching this topic...

Capability Schema Spec defines a shared semantic language for world model evaluation. Standardize capability definition, observation, and verification across models and benchmarks. Not a benchmark—a shared language. Define • Observe • Verify

  • Updated Jul 3, 2026
  • Python

Reference implementation of the Capability Schema Specification. Proves that world model capabilities can be defined, observed, and verified in practice — with real checkpoints, real simulators, and real scores. Define • Observe • Verify • Deliver

  • Updated Jul 2, 2026
  • Python
