Wan2GP Operator is an open source CLI operator for WanGP/Wan2GP video generation. The goal is not to make users memorize every WanGP setting. The goal is to let Codex, Claude, or another terminal agent install WanGP, compose the right settings, run headless jobs, inspect logs, and correct course when something breaks.
Running WanGP directly means fragile prompts, wrong runtime flags, wasted generations, and no consistent troubleshooting loop. Wan2GP Operator adds the missing layer: current model guidance, VRAM-aware compose, headless batch execution, auto-retry, failure diagnosis, learned compatibility state, and a music video pipeline that turns audio tracks into beat-synced AI videos.
Current public releases follow the verified upstream WanGP version. Release
v11.61 is the operator release aligned to WanGP 11.61.
- What It Does
- Installation
- Quick Start
- Commands
- AI Music Video Pipeline
- FAQ
- Licensing and Models
- Contributing
- License
| Capability | Description |
|---|---|
| Bootstrap | Guided install, GPU detection, readiness checks |
| Compose | VRAM-aware prompt-to-settings with quality presets |
| Models | Current open-source video model guidance and curated WanGP targets |
| Run | Deterministic --dry-run then --run pipeline with structured logs |
| Evolve | Learn from failures, track compatibility state over time |
| Diagnose | Failure analysis with actionable next-step commands |
| Music Video | End-to-end audio-to-video pipeline with beat-synced generation |
This is not a GUI wrapper. It is a terminal-first agent control layer that makes Wan2GP reproducible, diagnosable, and automatable. The human gives creative intent. The agent uses the operator to choose the model target, generate settings, dry-run the job, execute it, diagnose failures, and retry with safer flags when needed.
Clone the repository:
git clone https://github.com/avalonreset/wan2gp-operator.git
python scripts/install_skill.py --platform codex --scope user
python scripts/install_skill.py --platform claude --scope user
Restart Codex or Claude after installing. The installer excludes local
runtime/ folders so WanGP clones, venvs, model weights, logs, and generated
outputs stay in the active project folder instead of being copied into skill
directories.
- Python 3.11+
- A Wan2GP installation (the operator wraps it, does not replace it)
- GPU with 12GB+ VRAM recommended for quality generation
- ffmpeg and ffprobe (required for the music video pipeline)
- librosa (optional, improves beat detection accuracy)
# 1. Check hardware readiness
python scripts/wan2gp_operator.py assess
# 2. Bootstrap Wan2GP installation into ./runtime/Wan2GP
python scripts/wan2gp_operator.py bootstrap --execute
# 3. Check current model guidance
python scripts/wan2gp_operator.py models
# 4. Compose settings from a prompt
python scripts/wan2gp_operator.py compose \
--prompt "cinematic street shot, golden hour" \
--quality quality \
--duration-seconds 4
# Or target the current hot open-source audio-video model
python scripts/wan2gp_operator.py compose \
--model ltx23-dev-22b \
--quality quality \
--prompt "cinematic street shot, golden hour, natural city ambience"
# Or generate a high-quality ACE-Step 1.5 song on an RTX 4090
python scripts/wan2gp_operator.py compose \
--task music \
--model ace15-xl-lm-4b \
--quality quality \
--duration-seconds 120 \
--prompt "[Verse]\n...\n[Chorus]\n..." \
--music-caption "modern cinematic synth-pop, expressive lead vocal, polished mix"
# 5. Dry run first, then execute
python scripts/wan2gp_operator.py run \
--wan-root <WAN2GP_ROOT> \
--process settings.json \
--dry-run
python scripts/wan2gp_operator.py run \
--wan-root <WAN2GP_ROOT> \
--process settings.json \
--log-file logs/run.log
# 6. Learn from the run
python scripts/wan2gp_operator.py evolve \
--wan-root <WAN2GP_ROOT> \
--log-file logs/run.log
The compose step produces a settings.json tuned to your GPU's VRAM. The dry run validates everything before burning compute. The evolve step records what worked and what failed into .wan2gp_operator_state.json so the next run is smarter.
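The dry-run-then-execute order can also be scripted. The following is a minimal sketch rather than part of the operator: `build_run_command` and `dry_run_then_run` are hypothetical helper names, only flags shown in this README are used, and the sketch assumes the CLI exits with a non-zero code when a dry run fails, which this README does not state.

```python
import subprocess
import sys
from pathlib import Path

OPERATOR = "scripts/wan2gp_operator.py"  # script path used throughout this README


def build_run_command(wan_root, settings, dry_run=False, log_file=None):
    """Assemble the argv for the `run` subcommand (hypothetical helper)."""
    cmd = [sys.executable, OPERATOR, "run",
           "--wan-root", str(wan_root),
           "--process", str(settings)]
    if dry_run:
        cmd.append("--dry-run")
    if log_file is not None:
        cmd += ["--log-file", str(log_file)]
    return cmd


def dry_run_then_run(wan_root, settings, log_file="logs/run.log"):
    """Validate with a dry run first; start the real job only if it passes.

    Assumption: a failed dry run is signalled by a non-zero exit code.
    """
    dry = subprocess.run(build_run_command(wan_root, settings, dry_run=True))
    if dry.returncode != 0:
        return dry.returncode
    # Make sure the log directory exists before the real run writes to it.
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    real = subprocess.run(
        build_run_command(wan_root, settings, log_file=log_file))
    return real.returncode
```

Building the argument list separately from executing it keeps the command easy to inspect or log before any GPU time is spent.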
| Command | What It Does |
|---|---|
| assess | Check GPU, VRAM, and system readiness |
| bootstrap | Full guided install of Wan2GP |
| setup | Configure Wan2GP environment |
| compose | Generate run settings from a text prompt |
| models | Report current hot/practical model targets |
| plan | Preview what a run will do before executing |
| run | Execute a generation (supports --dry-run) |
| diagnose | Analyze failures and suggest fixes |
| evolve | Record outcomes and improve future runs |
| updates | Check for upstream Wan2GP releases |
| launch-ui | Start the Wan2GP web interface |
| music-video | End-to-end music video generation |
| music-analyze | Analyze audio for BPM, beats, and sections |
| music-plan | Create beat-aligned shot timeline |
| music-generate | Generate video clips from the plan |
| music-assemble | Stitch clips and mux audio into final video |
Turn one audio track into a structured, beat-synced AI music video using Wan2GP as the generation engine.
python scripts/wan2gp_operator.py music-video \
--audio "track.mp3" \
--theme "neon summer city, confident performance energy" \
--model ltx23-distilled-22b \
--wan-root <WAN2GP_ROOT> \
--execute-generation \
--evolve-on-failure
This analyzes the audio (BPM, beat grid, sections), plans shots aligned to beats, generates each clip via Wan2GP, retries failures automatically, and assembles the final video with ffmpeg.
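To make "shots aligned to beats" concrete, here is a minimal sketch of the idea. It is not the operator's planner: `beat_grid` and `plan_shots` are hypothetical names, and the sketch assumes a constant tempo and a fixed number of beats per shot, whereas the real analysis may detect variable tempo and sections.

```python
def beat_grid(bpm, duration_seconds):
    """Return beat timestamps in seconds, assuming a constant tempo."""
    if bpm <= 0 or duration_seconds <= 0:
        return []
    interval = 60.0 / bpm  # seconds between consecutive beats
    beats = []
    i = 0
    while i * interval < duration_seconds:
        beats.append(round(i * interval, 3))
        i += 1
    return beats


def plan_shots(beats, duration_seconds, beats_per_shot=8):
    """Group consecutive beats into shots that start and end on a beat.

    The last shot runs to the end of the track if fewer beats remain.
    """
    shots = []
    for start_idx in range(0, len(beats), beats_per_shot):
        end_idx = start_idx + beats_per_shot
        start = beats[start_idx]
        end = beats[end_idx] if end_idx < len(beats) else duration_seconds
        shots.append({"start": start, "end": end,
                      "duration": round(end - start, 3)})
    return shots


if __name__ == "__main__":
    grid = beat_grid(bpm=120, duration_seconds=10)
    for shot in plan_shots(grid, duration_seconds=10):
        print(shot)
```

At 120 BPM a beat lasts 0.5 seconds, so eight beats per shot gives 4-second clips, with every cut landing on a beat.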
# Analyze audio structure
python scripts/wan2gp_operator.py music-analyze --audio "track.mp3"
# Plan shots aligned to beats
python scripts/wan2gp_operator.py music-plan \
--analysis audio_analysis.json \
--theme "neon summer city"
# Generate each clip
python scripts/wan2gp_operator.py music-generate \
--plan music_video_plan.json \
--wan-root <WAN2GP_ROOT> \
--model ltx23-distilled-22b \
--execute-generation
# Assemble final video
python scripts/wan2gp_operator.py music-assemble \
--audio "track.mp3" \
--manifest generation_manifest.json
| File | Contents |
|---|---|
| audio_analysis.json | Duration, BPM, beat grid, sections |
| music_video_plan.json | Beat-aligned shot timeline and prompts |
| generation_manifest.json | Per-take compose/run/evolve results |
| music_video_master.mp4 | Final video with audio muxed in |
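The four stages hand files to each other: the analysis feeds the plan, the plan feeds generation, and the manifest feeds assembly. The sketch below chains them from Python. `staged_commands` and `run_stages` are hypothetical helpers, and the sketch assumes each stage writes its output under the file names listed in the table, in the current working directory, and exits non-zero on failure.

```python
import subprocess
import sys

OPERATOR = "scripts/wan2gp_operator.py"  # script path used throughout this README


def staged_commands(audio, theme, wan_root, model):
    """Return the four stage commands in order, using flags shown above."""
    base = [sys.executable, OPERATOR]
    return [
        base + ["music-analyze", "--audio", audio],
        base + ["music-plan", "--analysis", "audio_analysis.json",
                "--theme", theme],
        base + ["music-generate", "--plan", "music_video_plan.json",
                "--wan-root", wan_root, "--model", model,
                "--execute-generation"],
        base + ["music-assemble", "--audio", audio,
                "--manifest", "generation_manifest.json"],
    ]


def run_stages(commands):
    """Run stages in order; return the name of the first failing stage.

    Returns None when every stage exits with code 0 (assumed success signal).
    """
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            return cmd[2]  # subcommand name, e.g. "music-plan"
    return None
```

Stopping at the first failing stage leaves the earlier JSON outputs in place, so a fix can resume from that stage instead of re-analyzing the audio.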
As of 2026-05-10, the operator treats LTX-2.3 22B as the hottest general open-source audio-video target because it combines video and native synchronized audio in one model. Wan 2.2 14B remains the strongest Wan-family workhorse inside WanGP, especially for text-to-video and image-to-video without learning a node graph.
Run:
python scripts/wan2gp_operator.py models
For high-memory machines, skip tiny demo models unless debugging install health:
python scripts/wan2gp_operator.py compose --model ltx23-dev-22b --quality quality --prompt "<PROMPT>"
On RTX 4090/24GB-class development machines, use serious models but start with
stable runtime flags. The operator's default 24GB path uses LTX-2.3 distilled for
balanced iteration, sdpa, profile 3, TeaCache, and no compile until
Sage/Sage2 is validated locally. Explicit quality tests keep the stable attention
and profile defaults and can omit TeaCache.
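The 24GB defaults described above can be summarized as data. This is an illustrative sketch only: `runtime_defaults_24gb` is a hypothetical function, and the dictionary keys are made up for readability and are not the operator's actual settings schema.

```python
def runtime_defaults_24gb(quality="balanced", model="ltx23-distilled-22b"):
    """Mirror the 24GB-class defaults from the paragraph above as a dict.

    Key names are illustrative assumptions, not real settings.json fields.
    """
    defaults = {
        "model": model,        # LTX-2.3 distilled for balanced iteration
        "attention": "sdpa",   # stable attention until Sage/Sage2 is validated
        "profile": 3,
        "teacache": True,
        "compile": False,      # no compile until Sage/Sage2 is validated locally
    }
    if quality == "quality":
        # Explicit quality tests keep attention and profile, and can omit TeaCache.
        defaults["teacache"] = False
    return defaults


if __name__ == "__main__":
    print(runtime_defaults_24gb())
    print(runtime_defaults_24gb("quality", model="ltx23-dev-22b"))
```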
For faster iteration:
python scripts/wan2gp_operator.py compose --model ltx23-distilled-22b --quality balanced --prompt "<PROMPT>"
Wan2GP Operator wraps WanGP/Wan2GP and adds operational tooling: model selection, VRAM-aware settings, batch execution, failure diagnosis, and evolution tracking.
Yes. The music video pipeline analyzes your audio track (BPM, beats, sections), generates beat-synced video clips through Wan2GP, retries failed generations automatically, and assembles the final video with ffmpeg. One command or four stages, depending on how much control you want.
Use ace15-xl-lm-4b when quality matters. It maps to WanGP's ACE-Step 1.5 XL Turbo 4B DiT plus 4B LM path. Start with 120 seconds at --quality quality; 180 seconds is a reasonable stretch target after the first full run succeeds. Use --music-caption for genre, arrangement, vocal, and mix direction, and optionally lock --bpm, --keyscale, --time-signature, and --language.
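The optional locks can be added only when a value is supplied. The sketch below builds such a compose command from the flags named above. `music_compose_command` is a hypothetical helper, and the accepted value formats for --keyscale, --time-signature, and --language are not documented here, so the sketch passes values through as plain strings.

```python
import sys

OPERATOR = "scripts/wan2gp_operator.py"  # script path used throughout this README


def music_compose_command(lyrics, caption, duration_seconds=120, bpm=None,
                          keyscale=None, time_signature=None, language=None):
    """Build a `compose --task music` argv; optional locks are added if set."""
    cmd = [sys.executable, OPERATOR, "compose",
           "--task", "music",
           "--model", "ace15-xl-lm-4b",
           "--quality", "quality",
           "--duration-seconds", str(duration_seconds),
           "--prompt", lyrics,
           "--music-caption", caption]
    optional = {
        "--bpm": bpm,
        "--keyscale": keyscale,
        "--time-signature": time_signature,
        "--language": language,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]  # value format is passed through unchecked
    return cmd


if __name__ == "__main__":
    print(music_compose_command(
        "[Verse]\n...\n[Chorus]\n...",
        "modern cinematic synth-pop, expressive lead vocal, polished mix",
        bpm=120))
```

Passing the lyrics as a list element rather than through a shell string keeps the line breaks between section tags intact without any escaping.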
Yes. The repo includes SKILL.md for skill-suite installs, AGENTS.md for Codex-style agent instructions, and CLAUDE.md for Claude Code/project context. Install with:
python scripts/install_skill.py --platform codex --scope user
python scripts/install_skill.py --platform claude --scope user
ComfyUI is a node-based GUI for chaining AI models visually. Wan2GP Operator is a terminal-first CLI that wraps Wan2GP specifically. The tradeoffs: ComfyUI gives you visual flexibility across many models. Wan2GP Operator gives you reproducible, scriptable, headless batch execution with built-in failure diagnosis for Wan2GP specifically.
Wan2GP Operator is released under MIT for this repository's own code and documentation. It does not vendor, sublicense, or redistribute WanGP, model weights, datasets, LoRAs, generated artifacts, or third-party runtimes.
WanGP/Wan2GP is installed from its upstream repository:
https://github.com/deepbeepmeep/Wan2GP
Models and weights are downloaded by the user's local WanGP installation from
their upstream providers, the same way they are when using WanGP directly. Those
materials remain governed by their own licenses, terms, acceptable-use policies,
and provider requirements. Review NOTICE.md before distributing bundles,
selling services, or using third-party model outputs commercially.
Contributions are welcome. See CONTRIBUTING.md for development setup, code style, and PR workflow.
For bugs and feature requests, open an issue. For questions and ideas, use Discussions.
For security vulnerabilities, do not open a public issue. See SECURITY.md.
MIT License for this operator's own code and documentation. See LICENSE and NOTICE.md. WanGP, model weights, datasets, LoRAs, and third-party runtimes keep their own upstream licenses and terms.
