Add optional TwelveLabs Pegasus footage analysis #140
Adds PegasusAnalyzerService as an opt-in visual analysis backend for the asset indexer. When VISUAL_ANALYZER=twelvelabs and TWELVELABS_API_KEY are set, video scenes are described natively by Pegasus (no keyframe sampling) instead of the default Gemini path. Falls back to Gemini on any error. Non-breaking: default behavior is unchanged when the key is absent.
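The opt-in gate described above can be sketched as a small pure function. This is an illustration only, not the PR's code: the function name `select_visual_backend` and the exact-match comparison on the variable's value are assumptions. Only the two environment variable names come from the description.

```python
import os
from typing import Mapping, Optional


def select_visual_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "twelvelabs" only when both opt-in variables are set, else "gemini".

    Hypothetical helper: the PR does not show its selection code.
    """
    env = os.environ if env is None else env
    # Both conditions must hold; a key alone does not switch backends.
    wants_pegasus = env.get("VISUAL_ANALYZER", "") == "twelvelabs"
    has_key = bool(env.get("TWELVELABS_API_KEY", "").strip())
    return "twelvelabs" if wants_pegasus and has_key else "gemini"
```

Passing the environment as an argument keeps the check testable without network access or real credentials.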
@mohit-twelvelabs is attempting to deploy a commit to the openvideo Team on Vercel. A member of the Team first needs to authorize it.
I would prefer to add twelvelabs support here
…or-modal/src/services/twelvelabs_analyzer.py
Moves the opt-in TwelveLabs Pegasus footage analysis from the director app (pegasus-analyzer.service.ts) to the processor-modal Python service layer as TwelveLabsVisionAnalyzer, implementing the VisionAnalyzer interface alongside GeminiVisionAnalyzer per maintainer request.
- New service uses httpx against the TwelveLabs v1.3 REST API (x-api-key), matching the sibling DeepgramTranscriber/GeminiVisionAnalyzer conventions.
- Pegasus analyzes a video URL + scene window (its native strength); the frame/text interface methods lazily delegate to Gemini, so a TwelveLabs-only deployment needs no Google key.
- VideoIndexer selects the backend via VISUAL_ANALYZER=twelvelabs + a key; default Gemini path is unchanged when unset. Per-scene fallback to keyframes on any Pegasus error.
- Removes the TS Pegasus service, its spec, DI wiring, and the twelvelabs-js dependency from the director app.
- Adds a focused test (no-network parse/selection + key-gated live analyze) and documents the opt-in env vars in .env.example and README.
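The per-scene fallback mentioned in the commit message might look roughly like the sketch below. The names `describe_scenes`, `SceneWindow`, `primary`, and `fallback` are hypothetical and do not come from the PR; the two callables stand in for the Pegasus analyzer and the keyframe path, whose real signatures are not shown here.

```python
from typing import Callable, Dict, List, Tuple

SceneWindow = Tuple[float, float]  # (start_seconds, end_seconds), assumed layout
Description = Dict[str, object]


def describe_scenes(
    scenes: List[SceneWindow],
    primary: Callable[[SceneWindow], Description],
    fallback: Callable[[SceneWindow], Description],
) -> List[Description]:
    """Describe each scene with `primary`, falling back per scene on any error."""
    results: List[Description] = []
    for scene in scenes:
        try:
            results.append(primary(scene))
        except Exception:
            # Deliberately broad: one failing scene should not abort indexing
            # or force the remaining scenes onto the fallback path.
            results.append(fallback(scene))
    return results
```

Scoping the try/except to a single scene means a transient error costs only that scene's richer description, while the other scenes still go through the primary backend.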
Thanks for the steer, @xo-o — agreed that's the right home. I moved the TwelveLabs integration over to A few notes:
Happy to adjust naming or the selection seam if you'd prefer something different. Mohit (@mohit-twelvelabs, TwelveLabs)
TwelveLabs integration is being added here (#144), along with other refactorings to support additional providers.
That's great to hear, @xo-o — really glad TwelveLabs is landing in openvideo via #144, and the multi-provider refactor sounds like the right foundation. Happy to close this in favor of #144 whenever you'd like. If it's useful, a couple of things I verified while building this that might save you time in #144:
Happy to review the TwelveLabs parts of #144 or help with anything API-side; just tag me. Thanks for picking it up!
Hi! I'm Mohit, I work at TwelveLabs (@mohit-twelvelabs).
What this adds
An opt-in TwelveLabs Pegasus backend for the video footage analyzer in the asset indexer (apps/director/src/rag). Today, AssetIndexerService describes each detected scene by extracting 3 keyframes and sending them to Gemini Flash. This PR adds PegasusAnalyzerService, which instead analyzes each scene window natively with Pegasus, a video-understanding model that reasons over the actual footage (motion, temporal context, on-screen text) rather than sampled stills.

It returns the exact same { description, objects, topics, keywords } shape the visual timeline already expects, so the two backends are drop-in interchangeable.

Why it helps OpenVideo
The "Semantic Search" and "AI Director" features are only as good as the scene descriptions feeding the vector store. Keyframe sampling can miss anything that happens between frames: actions, camera moves, transient on-screen text. Pegasus watches the whole clip, which tends to produce richer, more accurate descriptions for footage-heavy content, improving downstream retrieval and auto-editing. It also removes per-scene local ffmpeg keyframe extraction from that path.

Opt-in / non-breaking
- Active only when TWELVELABS_API_KEY is set and VISUAL_ANALYZER=twelvelabs.
- … AGENTS.md.

How it was tested
- pegasus-analyzer.service.spec.ts (Vitest, matching the existing auto-caption.skill.spec.ts style): isEnabled() gating and the JSON/markdown/fallback parsing logic.
- … TWELVELABS_API_KEY is present.
- oxlint and oxfmt clean on all changed files; the new service type-checks against the twelvelabs-js SDK types.

You can grab a free API key at https://twelvelabs.io; there's a generous free tier.
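The "JSON/markdown/fallback parsing" covered by the tests is not shown in this thread. As a rough sketch of what such a parser could do, assuming the model may return bare JSON, JSON wrapped in a markdown code fence, or plain prose, the hypothetical `parse_scene_description` below normalizes all three into the { description, objects, topics, keywords } shape. The function and helper names are illustrative, not the PR's.

```python
import json
import re
from typing import Any, Dict, List

# Matches a markdown code fence (three backticks), optionally tagged "json".
_FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)


def _as_str_list(value: Any) -> List[str]:
    """Coerce a JSON array to a list of strings; anything else becomes []."""
    return [str(v) for v in value] if isinstance(value, list) else []


def parse_scene_description(raw: str) -> Dict[str, Any]:
    """Normalize a model reply into the four-field scene description shape."""
    text = raw.strip()
    fenced = _FENCE.search(text)
    candidate = fenced.group(1).strip() if fenced else text
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Fallback: treat the whole reply as a free-text description.
        return {"description": text, "objects": [], "topics": [], "keywords": []}
    return {
        "description": str(data.get("description", "")),
        "objects": _as_str_list(data.get("objects")),
        "topics": _as_str_list(data.get("topics")),
        "keywords": _as_str_list(data.get("keywords")),
    }
```

Always returning all four keys is what would let two backends stay interchangeable for the downstream timeline, even when a reply is malformed.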
pegasus-analyzer.service.spec.ts(Vitest, matching the existingauto-caption.skill.spec.tsstyle):isEnabled()gating and the JSON/markdown/fallback parsing logic.TWELVELABS_API_KEYis present.oxlintandoxfmtclean on all changed files; the new service type-checks against the[email protected]SDK types.You can grab a free API key at https://twelvelabs.io — there's a generous free tier.