Flyhigh is a personal Codex-first engineering harness. It packages a canonical harness spec, active skills, decision memory, run artifacts, policy gates, SkillOpt-inspired skill evolution, eval suites, generated Codex/Claude/OpenCode surfaces, project installation helpers, and a Korean human dashboard.
It is not a separate agent runtime. It sits on top of Codex and remains compatibility-aware for Claude Code and similar tools.
- flyhigh-project-bootstrap: inspect a repo and create or refresh project instructions.
- flyhigh-analyze: perform read-only investigation with evidence.
- flyhigh-plan: produce PRD and test-spec artifacts before broad work.
- flyhigh-implement: make scoped edits while preserving existing changes.
- flyhigh-review: review for defects, regressions, risks, and missing tests.
- flyhigh-qa: verify claims with tests, builds, static checks, or artifact inspection.
- flyhigh-ship: prepare final validation, commit or PR readiness, and release notes.
- flyhigh-learn: update memory and propose skill improvements based on outcomes.
- flyhigh-session-handoff: summarize current state for a future session.
- Install or copy this plugin into your Codex plugin location.
- In a target project, invoke flyhigh-project-bootstrap.
- Review the generated or updated AGENTS.md.
- Use flyhigh-analyze, flyhigh-plan, flyhigh-implement, and flyhigh-qa for normal work.
- Use flyhigh-learn after meaningful tasks to keep memory current.
From this repository, bootstrap a target project directly:
scripts/bootstrap-project /path/to/target-repo
The script writes:
- AGENTS.md, or AGENTS.flyhigh.generated.md when an instructions file already exists;
- the .flyhigh/memory/ skeleton;
- .flyhigh/bootstrap-report.md.
scripts/create-run "Task title" --repo /path/to/target-repo
scripts/validate-run /path/to/target-repo/reports/runs/YYYY-MM-DD-task
scripts/policy-preflight --root /path/to/target-repo
scripts/generate-adapters
scripts/validate-adapters
scripts/run-evals
scripts/render-v1-dashboard
scripts/validate-v1
scripts/validate-spec
scripts/record-decision --id decision-id --title "..." --context "..." --decision "..." --rationale "..."
scripts/list-decisions
scripts/validate-decisions
scripts/init-skill-cycle --skill qa --id cycle-id
scripts/propose-skill-edit --id candidate-id --skill qa --operation add --target end --replacement "..."
scripts/apply-skill-patch candidate-id
scripts/score-skill-candidate candidate-id
scripts/promote-skill candidate-id
scripts/reject-skill-edit candidate-id --reason "..."
scripts/check-policy --kind command --value "rm -rf build"
scripts/explain-policy --kind file --value package.json
scripts/run-eval-suite all
scripts/generate-codex-surface
scripts/generate-claude-surface
scripts/generate-opencode-surface
scripts/install-into-project /path/to/project
scripts/render-dashboard
scripts/validate-dashboard-freshness
scripts/validate-dashboard-quality
scripts/validate-full-harness
scripts/create-issue --id FH-100 --title "..." --why "..."
scripts/update-issue FH-100 --lane doing
scripts/list-issues
scripts/record-iteration --id ITER-100 --goal "..." --summary "..."
scripts/propose-github-operation --id GH-100 --kind issue --title "..." --why "..." --body "..."
scripts/approve-github-operation GH-100 --approved-by operator
scripts/prepare-github-outbox --id GH-100
scripts/publish-github-operation GH-100 --repo /path/to/project
scripts/list-github-operations
scripts/list-github-repositories
scripts/sync-github-state
scripts/validate-github-repositories
scripts/validate-github-ops
scripts/validate-dashboard-truth
scripts/list-operator-decisions
scripts/decide OD-001 approve publish_issue --operator operator
scripts/list-operator-responses
scripts/apply-operator-decisions
scripts/validate-operator-decisions
scripts/validate-hitl-governance
scripts/review-direction
scripts/validate-direction-reviews
scripts/evaluate-merge-gate
scripts/validate-merge-gates
scripts/merge-approved-pr MG-001
scripts/harness-loop
Flyhigh tracks v1 progress from structured state:
state/v1-status.json
state/v1-decisions.jsonl
state/v1-events.jsonl
state/v1-risks.jsonl
Render the single-file dashboard with:
scripts/render-dashboard
scripts/validate-dashboard-freshness
scripts/validate-dashboard-truth
Then open:
dashboard/flyhigh.html
dashboard/flyhigh.freshness.json
scripts/render-dashboard writes freshness metadata with the tracked SSOT source hash, latest source mtime, renderer version, and render timestamp. scripts/validate-dashboard-freshness fails when dashboard HTML or metadata no longer matches the current SSOT. scripts/validate-full-harness includes this gate, and scripts/harness-loop rerenders after its final state update.
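The freshness contract described above can be sketched roughly as follows. This is an illustration only, not the code in scripts/render-dashboard: the SHA-256 choice, the field names, the RENDERER_VERSION value, and the functions build_freshness and is_fresh are all assumptions made for the example.

```python
import hashlib
import time
from pathlib import Path

# Hypothetical value; the real renderer's version string may differ.
RENDERER_VERSION = "sketch-0"


def build_freshness(sources, now=None):
    """Return freshness metadata for the given SSOT source files.

    The field names below are assumed for illustration.
    """
    digest = hashlib.sha256()
    latest_mtime = 0.0
    # Sort so the hash does not depend on the order the caller passes files in.
    for path in sorted(Path(p) for p in sources):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
        latest_mtime = max(latest_mtime, path.stat().st_mtime)
    return {
        "source_hash": digest.hexdigest(),
        "latest_source_mtime": latest_mtime,
        "renderer_version": RENDERER_VERSION,
        "rendered_at": time.time() if now is None else now,
    }


def is_fresh(metadata, sources):
    """True when the recorded hash and renderer version match the current sources."""
    current = build_freshness(sources)
    return (
        metadata.get("source_hash") == current["source_hash"]
        and metadata.get("renderer_version") == current["renderer_version"]
    )
```

Comparing a content hash rather than only modification times means that touching a file without changing it does not mark the dashboard stale, while any real edit to a tracked source does.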
Run scripts/sync-github-state before relying on GitHub/repository readiness. It observes local git remotes, gh auth, GitHub open issues, and open PRs without remote writes, then updates state/github/repositories.json and state/github/live-state.json. scripts/validate-dashboard-truth fails when the dashboard/SSOT claims repo, auth, issue, or PR facts that conflict with live observation.
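The truth check can be thought of as a comparison between what the dashboard claims and what was observed live. The sketch below assumes both sides are flat dictionaries; the keys shown (auth_ok, open_issues, open_prs) and the function find_conflicts are hypothetical and do not reflect the actual layout of state/github/live-state.json.

```python
def find_conflicts(claimed, observed):
    """List (key, claimed, observed) triples where a claim disagrees with observation.

    Keys that were never observed are skipped rather than treated as conflicts.
    """
    conflicts = []
    for key, claimed_value in claimed.items():
        if key in observed and observed[key] != claimed_value:
            conflicts.append((key, claimed_value, observed[key]))
    return conflicts


# Hypothetical example data, not real Flyhigh state.
claimed = {"auth_ok": True, "open_issues": 3, "open_prs": 1}
observed = {"auth_ok": True, "open_issues": 5, "open_prs": 1}

for key, want, got in find_conflicts(claimed, observed):
    print(f"conflict on {key}: dashboard says {want!r}, live state says {got!r}")
```

A validator built this way would exit non-zero whenever the returned list is non-empty.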
.codex-plugin/ Codex plugin manifest
skills/ Reusable task playbooks
references/ Durable standards loaded only when needed
templates/ Project artifact templates
scripts/ Local validation and helper scripts
docs/ Design package
evals/ Evaluation concepts and fixtures
examples/ Usage examples
- docs/vision.md: product thesis and non-goals.
- docs/competitive-landscape.md: how Flyhigh differs from current harnesses.
- docs/product-architecture.md: product-grade system model.
- docs/operating-model.md: day-to-day engineering lifecycle.
- docs/security-and-policy.md: safety model and future enforcement.
- docs/evaluation-program.md: eval-driven skill improvement plan.
- docs/orchestration-guidance.md: subagent, worktree, review, QA, and handoff guidance.
- docs/roadmap.md: staged path from scaffold to stable harness.
- docs/skillopt-engine.md: SkillOpt-inspired state machine and promotion flow.
- docs/dashboard.md: Korean dashboard source and rendering model.
- docs/dashboard-design-system.md: operator dashboard design system and content rules.
- docs/dashboard-refresh.md: dashboard freshness metadata and validation contract.
- docs/decision-memory.md: durable decision tracking.
- docs/policy-system.md: policy classes and commands.
- docs/eval-program.md: eval suite commands and reports.
- docs/adapter-system.md: generated surface model.
- docs/domain-projects.md: installing Flyhigh into future projects.
- docs/github-operations.md: GitHub issue/PR proposal, approval, outbox, and publish lifecycle.
- docs/github-operations-research.md: references and design consequences for GitHub operations.
- docs/merge-governance-research.md: direction review and merge gate rules.
- docs/decision-console.md: operator-facing decision prompt model for dashboard choices.
Substantial work should write artifacts under the target project:
reports/runs/YYYY-MM-DD-task-slug/
manifest.json
plan.md
events.jsonl
tool-calls.jsonl
verification.md
final.md
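As a rough illustration of what checking a run directory against this layout could look like, the sketch below reports which of the listed files are absent. It is not the implementation of scripts/validate-run, which may check more than file presence; the function name missing_artifacts is hypothetical.

```python
from pathlib import Path

# File names taken from the run layout listed above.
REQUIRED_ARTIFACTS = (
    "manifest.json",
    "plan.md",
    "events.jsonl",
    "tool-calls.jsonl",
    "verification.md",
    "final.md",
)


def missing_artifacts(run_dir):
    """Return the required artifact names that are not present as files in run_dir."""
    run_dir = Path(run_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (run_dir / name).is_file()]
```

Calling missing_artifacts on a run directory such as reports/runs/YYYY-MM-DD-task-slug/ returns an empty list when every artifact is in place.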
From this repository:
scripts/validate-plugin
scripts/validate-skills
scripts/validate-bootstrap-fixture
scripts/validate-run-fixture
scripts/validate-policy-fixture
scripts/validate-adapters
scripts/run-evals
scripts/validate-dashboard-freshness
scripts/validate-dashboard-truth
scripts/validate-v1
scripts/validate-full-harness
These scripts are dependency-light and intended to run in local shells.