ontesseract/spec-driven-development-skill

Spec-Driven Development Skill

Spec-Driven Development is an agent skill for driving implementation from committed behavior specs and conformance suites.

Use it when a change needs a durable contract in docs/specs, a matching validation doc, and automated or deterministic evidence that the implementation complies.

How It Works

  1. Use this skill in a chat with AI to create a spec. Include the external oracles you want to use, any references you already have, and a general or detailed outline of the requirements.
  2. Read the spec. Seriously. You may not always read all the code AI produces, but the entire purpose of this approach is to put the human in the loop where it matters.
  3. Make adjustments and iterate until the spec captures the behavior, invariants, constraints, exclusions, assumptions, and validation expectations you actually want.
  4. If specs get too large, ask AI to extract coherent sections into separate specs with their own validation docs.
  5. Implement the spec, or parts of the spec, by specifically including it, or a reference to it, in your implementation prompt.
  6. Always keep your specs in sync with your codebase. This often happens automatically with this skill, but you should do regular audits with AI to make sure nothing is stale.

Inspiration

This skill was inspired by ElectricSQL's article Configurancy: keeping systems intelligible when agents write all the code. It turns the article's configurancy idea into reusable agent guidance: explicit behavioral commitments plus enforcement that keep a system intelligible under rapid change.

System Self-Knowledge

As human and AI agents make implementation faster and cheaper, the scarce asset becomes the system's self-knowledge: the explicit record of what the system promises, which invariants must hold, why tradeoffs were made, and how behavior is verified. Durable specs and conformance suites preserve that shared understanding so agents can change code without rediscovering hidden rules or accidentally breaking coherence under rapid change.

What It Provides

  • A spec-first workflow for behavior changes.
  • A reusable spec template inside SKILL.md.
  • Lifecycle guidance for the spec, partial, and implemented statuses.
  • Conformance-suite guidance for validation docs, fixtures, commands, and completion rules.
  • Example AGENTS.md and CLAUDE.md repo instructions that point agents at spec-driven development.

Install

This repo keeps root SKILL.md as the canonical source of truth. If you copy it into tool-specific folders, refresh those copies from root SKILL.md when the skill changes.

Codex

Install this repo as a local Codex skill so the skill folder contains the root SKILL.md.

Typical shape:

skills/
└── spec-driven-development/
    ├── SKILL.md
    └── agents/openai.yaml

Then ask Codex to use spec-driven-development for spec-first implementation work.

Claude Code

Copy the canonical SKILL.md into a Claude Code skills folder:

.claude/skills/spec-driven-development/SKILL.md

You can also copy the example CLAUDE.md into a project root and adapt only the verification command reference for that repo.

OpenCode

OpenCode can load compatible skills from several project locations. Copy the canonical SKILL.md into one of these paths:

.opencode/skills/spec-driven-development/SKILL.md
.agents/skills/spec-driven-development/SKILL.md
.claude/skills/spec-driven-development/SKILL.md

You can also use the example AGENTS.md as project-level rules for OpenCode and other agents that read AGENTS.md.

Suggested Project Setup

In repos that use this skill, keep behavioral contracts under:

docs/specs/<area>/<spec-name>.md
docs/specs/<area>/validation/<spec-name>.md
docs/specs/INDEX.md

The index should track every open entry, meaning every spec whose status is spec or partial. Remove an entry when its spec becomes implemented.
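The pairing between a spec and its validation doc can be checked mechanically. The following is a minimal sketch, assuming the layout shown above; `validation_path` and `find_unpaired_specs` are hypothetical helper names and are not part of this skill.

```python
from pathlib import Path


def validation_path(spec_path: Path) -> Path:
    """Return the validation doc expected for a spec: same basename, in validation/."""
    return spec_path.parent / "validation" / spec_path.name


def find_unpaired_specs(specs_root: Path) -> list[Path]:
    """List specs under specs_root that have no matching validation doc."""
    unpaired = []
    for path in sorted(specs_root.rglob("*.md")):
        # The top-level index is not a spec.
        if path.parent == specs_root and path.name == "INDEX.md":
            continue
        # Files inside a validation/ folder are validation docs, not specs.
        if "validation" in path.relative_to(specs_root).parts[:-1]:
            continue
        if not validation_path(path).is_file():
            unpaired.append(path)
    return unpaired
```

Running `find_unpaired_specs(Path("docs/specs"))` in CI and failing on a non-empty result is one way to keep the pairs from drifting apart.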

Spec Structure

A spec is the committed reference model for a behavior surface. It should be specific enough that a future human or AI agent can predict the intended behavior without reading implementation code.

New specs should include:

  • Title, status, validation link, and authoring date.
  • Purpose - the stable behavioral contract this page exists to preserve.
  • Scope - the user-visible surface, subsystem, workflow, API, data contract, or operational behavior covered.
  • References - durable sources, with external oracles labeled only when they directly define correctness.
  • Behaviors - what the system does for a user, operator, integrator, or downstream consumer.
  • Invariants - truths that must always hold.
  • Constraints - solution boundaries, dependencies, failure envelopes, supported platforms, and compatibility limits.
  • Exclusions - anti-goals, banned approaches, untouched areas, and out-of-scope behavior.
  • Assumptions - optional upstream or environmental premises that materially reduce ambiguity.
  • State Model - states, phases, transitions, and important fields when behavior depends on them.
  • Data And Dependency Expectations - inputs, outputs, schemas, services, packages, fixtures, and fallback behavior.
  • Required Artifacts - validation docs, tests, fixtures, schemas, generated artifacts, screenshots, logs, or reports.
  • Behavioral Delta - optional structured notes about how the shared model changed.
  • Open Questions and Change Log.

The canonical slot contract is Behaviors, Invariants, Constraints, and Exclusions, with Assumptions used when needed. Existing repos can keep local headings like Non-goals or Out of scope; the important part is that the slot is present and clear.

How To Use The Core Slots

Use the core slots to separate user-facing promises from always-true rules, implementation limits, and explicit non-goals. This separation matters because each slot drives a different kind of validation.

Behaviors describe observable capabilities. They should be phrased from the point of view of a user, operator, integrator, job, API client, or downstream consumer. A good behavior can usually be demonstrated with a scenario, example request, UI walkthrough, CLI invocation, or fixture. Prefer "The system supports..." and "The system enables..." over implementation-detail phrasing.

Examples:

  • The system retries failed webhook deliveries with exponential backoff.
  • The admin UI shows the current sync state for every connected account.
  • The export endpoint returns a stable CSV header order for repeated requests.
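A behavior such as the webhook retry example can usually be pinned down with a small deterministic scenario. The sketch below is illustrative only: `backoff_delays`, its parameters, and the base, factor, and cap values are hypothetical and would come from your own spec.

```python
def backoff_delays(
    attempts: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0
) -> list[float]:
    """Return the wait in seconds before each retry, growing exponentially up to cap."""
    return [min(cap, base * factor**n) for n in range(attempts)]


# Scenario: a validation doc would commit exact numbers like these as a fixture.
schedule = backoff_delays(attempts=5, base=1.0, factor=2.0, cap=10.0)
assert schedule == [1.0, 2.0, 4.0, 8.0, 10.0]
```

Committing the expected schedule as a fixture lets a future agent predict retry timing without reading the worker code.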

Invariants describe truths the system must preserve across all supported states, inputs, and implementations. They are stronger than behaviors: if an invariant fails, the system is wrong even if a happy-path feature appears to work. Invariants often map to property tests, schema checks, runtime assertions, database constraints, monitoring signals, or failure-condition tests.

Examples:

  • The system must never process the same idempotency key twice.
  • A deleted user must not appear in search results after the deletion transaction commits.
  • Every emitted event must include a stable event ID, timestamp, and source identifier.
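An invariant such as the idempotency-key example is a natural fit for a randomized check rather than a single scenario. The sketch below uses only the standard library; `WebhookProcessor`, `invariant_holds`, and `run_random_deliveries` are hypothetical names chosen for illustration.

```python
import random


class WebhookProcessor:
    """Hypothetical handler that applies each idempotency key at most once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.applied: list[tuple[str, str]] = []

    def handle(self, key: str, payload: str) -> bool:
        """Apply the payload unless the key was already processed."""
        if key in self._seen:
            return False
        self._seen.add(key)
        self.applied.append((key, payload))
        return True


def invariant_holds(processor: WebhookProcessor) -> bool:
    """True when no idempotency key appears twice among applied deliveries."""
    keys = [key for key, _ in processor.applied]
    return len(keys) == len(set(keys))


def run_random_deliveries(seed: int, deliveries: int = 200) -> WebhookProcessor:
    """Replay a seeded stream of deliveries with many repeated keys."""
    rng = random.Random(seed)
    processor = WebhookProcessor()
    for i in range(deliveries):
        processor.handle(f"key-{rng.randint(0, 20)}", payload=f"delivery-{i}")
    return processor


for seed in range(20):
    assert invariant_holds(run_random_deliveries(seed))
```

Fixed seeds keep the check deterministic, so a failure can be reproduced from the seed alone.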

Constraints describe the allowed solution space. They explain what the implementation may or may not depend on, where code may live, which compatibility surfaces must remain stable, what latency or failure envelopes apply, and which migration or rollout limits matter. Constraints are not feature promises; they are boundaries that keep future implementations compatible with the contract.

Examples:

  • The implementation must preserve the public REST response shape for existing clients.
  • The worker may depend on the existing queue package but must not introduce a second queue backend.
  • The migration must be safe to run more than once.
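A constraint like "safe to run more than once" can be verified by simply running the migration twice. This is a minimal sketch using Python's built-in `sqlite3` module; the `accounts` table, the `sync_state` column, and the `migrate` function are hypothetical.

```python
import sqlite3


def migrate(conn: sqlite3.Connection) -> None:
    """Create the table and add the column only when they are missing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(accounts)")}
    if "sync_state" not in columns:
        conn.execute(
            "ALTER TABLE accounts ADD COLUMN sync_state TEXT NOT NULL DEFAULT 'idle'"
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # the second run must be a no-op, not an error
```

Without the column check, the second `ALTER TABLE` would fail with a duplicate-column error, which is exactly the failure this constraint rules out.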

Exclusions describe what the spec intentionally does not cover. Use this slot to name anti-goals, banned approaches, untouched areas, or future work that should not be inferred from the current contract. Clear exclusions prevent agents from widening scope just because adjacent behavior looks related.

Examples:

  • This spec does not define billing reconciliation.
  • Do not replace the existing authentication provider as part of this work.
  • This spec does not require a new admin reporting page.

Assumptions are optional premises that reduce ambiguity. Include them only when the spec relies on an upstream system, data shape, operational fact, or environment that is not guaranteed by the implementation itself. If an assumption becomes a promise the system owns, promote it to an invariant or constraint.

Examples:

  • The upstream provider sends each event with a globally unique ID.
  • The initial rollout targets a single region.
  • Existing consumers tolerate additive JSON fields.

As a rule of thumb: put "what users can observe" in Behaviors, "what must always remain true" in Invariants, "what choices are allowed" in Constraints, "what is deliberately out" in Exclusions, and "what must be true elsewhere for this contract to make sense" in Assumptions.

Every spec has a lifecycle status:

  • spec - planned, no behaviors shipped yet.
  • partial - some behaviors are shipped, but at least one behavior, invariant, or constraint is still unimplemented.
  • implemented - every behavior, invariant, and constraint is shipped and covered by the conformance suite.
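The status rules above can be expressed as a small function. This is a sketch under one simplifying assumption: every behavior, invariant, and constraint is tracked as a single flag meaning "shipped and covered by the conformance suite". `Status`, `derive_status`, and `belongs_in_index` are hypothetical names.

```python
from enum import Enum


class Status(Enum):
    SPEC = "spec"
    PARTIAL = "partial"
    IMPLEMENTED = "implemented"


def derive_status(shipped: dict[str, bool]) -> Status:
    """Map each item's shipped-and-covered flag to a lifecycle status."""
    if shipped and all(shipped.values()):
        return Status.IMPLEMENTED
    if any(shipped.values()):
        return Status.PARTIAL
    return Status.SPEC


def belongs_in_index(status: Status) -> bool:
    """Open entries (spec or partial) stay in docs/specs/INDEX.md."""
    return status is not Status.IMPLEMENTED
```

For example, `derive_status({"retry-backoff": True, "no-duplicate-keys": False})` yields `Status.PARTIAL`, so that spec keeps its row in the index.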

When a spec lands, flip its status in the same change as the implementation, update the validation doc from planned language to current commands, record the implementation reference if the repo tracks one, remove the open row from docs/specs/INDEX.md, and audit inbound links for stale "not shipped yet" language.

Conformance Suite Structure

The validation doc is the spec's conformance-suite entry point. It should live at docs/specs/<area>/validation/<spec-name>.md and share the spec basename so agents can find the pair quickly.

A strong validation doc names:

  • The external oracle, reference model, or rationale that constrains the suite.
  • Scenario matrices and fixture sets, including edge cases and failure conditions.
  • Exact commands to run locally and in CI.
  • Completion rules for when the implementation can be called conformant.
  • Required artifacts such as committed fixtures, generated reports, screenshots, logs, schemas, or traces.
  • Prediction questions a future agent should be able to answer from the spec and validation docs without reading implementation code.
  • Runtime signals for invariants that only fail under load, partial failure, production traffic, or timing-sensitive interleavings.

Prefer deterministic fixtures and explicit walkthroughs over ad hoc manual checks. If invalid states matter, encode them with schemas, enums, seed data, typed DSLs, or fixtures so the suite is harder to misunderstand.
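One way to encode invalid states so they cannot be misread is an enum plus an explicit transition table. The sketch below is an assumption-laden illustration: the `SyncState` values and the allowed transitions are hypothetical, not taken from this skill.

```python
from enum import Enum


class SyncState(Enum):
    IDLE = "idle"
    SYNCING = "syncing"
    FAILED = "failed"


# Hypothetical transition table: anything not listed here is an invalid state change.
ALLOWED: dict[SyncState, set[SyncState]] = {
    SyncState.IDLE: {SyncState.SYNCING},
    SyncState.SYNCING: {SyncState.IDLE, SyncState.FAILED},
    SyncState.FAILED: {SyncState.SYNCING},
}


def transition(current: SyncState, target: SyncState) -> SyncState:
    """Return the target state, or raise if the spec does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"invalid transition: {current.value} -> {target.value}")
    return target
```

A fixture can then list each forbidden pair, such as idle to failed, and assert that `transition` raises for it.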

The ElectricSQL article frames this as a hierarchy:

  1. External oracle - ground truth outside the system, such as Postgres, an RFC, or a shared vendor conformance suite.
  2. Reference model - the committed spec that makes the intended behavior explicit.
  3. Conformance suite - the operational contract every implementation must pass.
  4. Rationale - the tradeoffs, rejected paths, and context that explain why the contract exists.

Drift between these layers is a correctness bug. If tests pass but no longer reflect the oracle, spec, or rationale, the system has weak configurancy and agents will propagate the wrong behavior quickly.

Different behaviors need different suite shapes:

  • Deterministic scenario suites for crisp input/output contracts.
  • Fuzz or property-based testing for large combinatorial spaces.
  • Differential testing when multiple implementations should agree.
  • History-based checkers for weak consistency and ordering-sensitive systems.
  • Model checking or state exploration for concurrency interleavings.
  • Runtime assertions, metrics, canaries, and alerts for invariants that only show up after merge.
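As one illustration of these shapes, differential testing can be sketched with the standard library alone. The two dedupe functions below are hypothetical stand-ins for a slow reference implementation and a faster candidate that should agree with it.

```python
import random
from typing import Callable


def dedupe_reference(items: list[int]) -> list[int]:
    """Slow but obviously correct: keep the first occurrence of each item."""
    result: list[int] = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def dedupe_fast(items: list[int]) -> list[int]:
    """Candidate: relies on dicts preserving insertion order."""
    return list(dict.fromkeys(items))


def disagreements(
    reference: Callable[[list[int]], list[int]],
    candidate: Callable[[list[int]], list[int]],
    cases: list[list[int]],
) -> list[list[int]]:
    """Return every input on which the two implementations differ."""
    return [case for case in cases if reference(list(case)) != candidate(list(case))]


def random_cases(seed: int, count: int = 200) -> list[list[int]]:
    """Generate seeded short lists with many repeated values."""
    rng = random.Random(seed)
    return [[rng.randint(0, 5) for _ in range(rng.randint(0, 8))] for _ in range(count)]


assert disagreements(dedupe_reference, dedupe_fast, random_cases(seed=0)) == []
```

Any input returned by `disagreements` is a ready-made regression fixture to commit alongside the validation doc.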

The suite should make the implicit explicit: if an invariant matters, encode it; if a constraint applies, make it visible; if a future agent needs setup context, commit the artifact instead of relying on memory.

Repository Contents

  • SKILL.md - canonical skill instructions.
  • agents/openai.yaml - Codex-facing skill metadata.
  • AGENTS.md - generic project instruction example for Codex, OpenCode, and other agents.
  • CLAUDE.md - generic project instruction example for Claude Code.
  • LICENSE - MIT license.
