Status: planned
media should be one Termlings app for AI image and video generation.
It should power:
- image generation
- video generation
- async provider-backed jobs
- file-based outputs
- brand-aware prompt defaults
- shared activity feed entries
The first provider target is Google-backed generation, for example:
- image models like Nano Banana 2
- video models like Veo 3
This should start as one app, not two.
Reasons:
- image and video generation share the same provider/auth layer
- both need the same job lifecycle
- both want the same storage model
- both should show up in the same activity and history surfaces
- future media capabilities like upscaling, editing, storyboards, thumbnails, and voiceovers fit naturally here
So the app is media, while the user-facing commands stay concrete:
termlings image ...
termlings video ...
Termlings should let agents and operators:
- generate brand-aware image assets quickly
- generate short video assets from prompts or still images
- track generation jobs and outputs in local files
- reuse outputs across workflows, ads, landing pages, and future design tools
- see media work in All activity without custom TUI plumbing
V1 should not try to be:
- a full creative suite
- a timeline editor
- a Figma replacement
- a cloud asset DAM
- a broad provider abstraction layer before one provider works well
Everything should be a job.
Images may support --wait, but internally they should still be stored as jobs so image and video use one consistent system.
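One way to read the --wait behavior above is as a thin polling loop over the stored job record. The sketch below is an illustration under assumptions, not the planned implementation: the status names beyond "queued", the `JobSnapshot` shape, and the `waitForJob`, `loadJob`, and `sleep` names are all hypothetical.

```typescript
// Hypothetical status set: the plan only shows "queued" explicitly.
type JobStatus = "queued" | "running" | "completed" | "failed" | "cancelled";

// Minimal view of a stored job; the full envelope carries more fields.
interface JobSnapshot {
  id: string;
  type: "image" | "video";
  status: JobStatus;
  error: string | null;
}

const TERMINAL: ReadonlySet<JobStatus> = new Set<JobStatus>([
  "completed",
  "failed",
  "cancelled",
]);

interface WaitOptions {
  intervalMs?: number;
  timeoutMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable so callers can avoid real delays
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Re-reads the job until it reaches a terminal status or the time budget is spent.
async function waitForJob(
  loadJob: (id: string) => Promise<JobSnapshot>,
  id: string,
  options: WaitOptions = {},
): Promise<JobSnapshot> {
  const intervalMs = options.intervalMs ?? 500;
  const timeoutMs = options.timeoutMs ?? 60_000;
  const sleep = options.sleep ?? defaultSleep;
  let waited = 0;
  for (;;) {
    const job = await loadJob(id);
    if (TERMINAL.has(job.status)) return job;
    if (waited >= timeoutMs) {
      throw new Error(`Timed out waiting for ${id} (last status: ${job.status})`);
    }
    await sleep(intervalMs);
    waited += intervalMs;
  }
}
```

Because the loop only reads job state, the same helper would work for image and video jobs alike, which is the point of storing both as jobs.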
Canonical job envelope:
{
  "id": "vid_abc123",
  "type": "video",
  "status": "queued",
  "provider": "google",
  "model": "veo-3",
  "prompt": "8 second launch teaser",
  "negativePrompt": "",
  "inputs": [
    { "kind": "image", "path": "./hero.png" }
  ],
  "options": {
    "aspect": "16:9",
    "duration": 8,
    "brand": "default"
  },
  "output": {
    "path": ".termlings/store/media/outputs/vid_abc123.mp4",
    "mime": "video/mp4"
  },
  "actorSlug": "designer",
  "createdAt": 1770000000000,
  "updatedAt": 1770000000000,
  "error": null
}
.termlings/
  store/
    media/
      jobs/
        img_abc123.json
        vid_abc123.json
      outputs/
        img_abc123.png
        vid_abc123.mp4
      cache/
        providers.json
Keep outputs file-based and stable so:
- agents can reuse them later
- the desktop app and future web UI can show history
- future automation can feed one output into the next job
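The stable layout above means every path can be derived from the job id alone. The following is a minimal sketch under that assumption: the `img`/`vid` prefixes, file extensions, and root directory are taken from the examples in this document, while the helper names (`newJobId`, `typeFromJobId`, `jobPaths`) and the `image/png` mime for images are hypothetical.

```typescript
type MediaType = "image" | "video";

// Root taken from the output path shown in the job envelope.
const MEDIA_ROOT = ".termlings/store/media";

// Prefixes and extensions mirror the example file names (img_abc123.png, vid_abc123.mp4).
const KIND: Record<MediaType, { prefix: string; ext: string; mime: string }> = {
  image: { prefix: "img", ext: "png", mime: "image/png" },
  video: { prefix: "vid", ext: "mp4", mime: "video/mp4" },
};

// Builds an id such as "vid_abc123"; the random suffix scheme is an assumption.
function newJobId(
  type: MediaType,
  suffix: string = Math.random().toString(36).slice(2, 8),
): string {
  return `${KIND[type].prefix}_${suffix}`;
}

// Recovers the media type from the id prefix, rejecting unknown prefixes.
function typeFromJobId(id: string): MediaType {
  const prefix = id.split("_")[0];
  const match = (Object.keys(KIND) as MediaType[]).find(
    (t) => KIND[t].prefix === prefix,
  );
  if (!match) throw new Error(`Unrecognized job id: ${id}`);
  return match;
}

// Derives the job file and output file locations for a given id.
function jobPaths(id: string): { job: string; output: { path: string; mime: string } } {
  const { ext, mime } = KIND[typeFromJobId(id)];
  return {
    job: `${MEDIA_ROOT}/jobs/${id}.json`,
    output: { path: `${MEDIA_ROOT}/outputs/${id}.${ext}`, mime },
  };
}

console.log(jobPaths("vid_abc123").output.path);
// prints ".termlings/store/media/outputs/vid_abc123.mp4"
```

Keeping path derivation in one place would let a later job reference an earlier output by id instead of by a hard-coded path.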
termlings image generate <prompt...>
termlings image list
termlings image show <job-id>
termlings image open <job-id>
termlings video generate <prompt...>
termlings video list
termlings video show <job-id>
termlings video open <job-id>
termlings video cancel <job-id>
termlings image schema generate
printf '%s\n' '{"prompt":"pixel-art founder dashboard hero","provider":"google","model":"nano-banana-2"}' \
| termlings image generate --stdin-json --json
printf '%s\n' '{"prompt":"landing page hero","brand":"default","image":"./public/logo.png"}' \
| termlings image generate --stdin-json --json
printf '%s\n' '{"prompt":"8 second launch teaser","provider":"google","model":"veo-3","image":"./hero.png","aspect":"16:9","duration":"8"}' \
| termlings video generate --stdin-json --json
termlings video show vid_abc123
termlings video open vid_abc123
media should integrate with .termlings/brand/brand.json.
Possible behavior:
- --brand default injects style hints into the prompt
- logo/mark references can be attached automatically when explicitly requested
- brand colors, voice, and domain can influence creative direction
This should stay optional. Brand data should enrich prompts, not become a hard dependency.
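To illustrate "enrich, not depend", here is a minimal sketch of prompt enrichment that passes the prompt through untouched when no brand data is available. The `BrandProfile` fields are a hypothetical subset inferred from the colors, voice, and domain mentioned above; the real brand.json schema is not specified here, and `enrichPrompt` is an illustrative name.

```typescript
// Hypothetical subset of .termlings/brand/brand.json; the real schema may differ.
interface BrandProfile {
  colors?: string[];
  voice?: string;
  domain?: string;
}

// Appends style hints when brand data exists; otherwise returns the prompt unchanged.
function enrichPrompt(prompt: string, brand?: BrandProfile | null): string {
  if (!brand) return prompt; // missing brand file is not an error

  const hints: string[] = [];
  if (brand.colors && brand.colors.length > 0) {
    hints.push(`palette: ${brand.colors.join(", ")}`);
  }
  if (brand.voice) hints.push(`tone: ${brand.voice}`);
  if (brand.domain) hints.push(`brand domain: ${brand.domain}`);

  return hints.length > 0
    ? `${prompt}\n\nStyle hints (${hints.join("; ")})`
    : prompt;
}

console.log(enrichPrompt("landing page hero", { colors: ["#111", "#fff"], voice: "playful" }));
```

Storing the original prompt in the job file and applying enrichment only at provider-call time would keep jobs inspectable, though that split is a design choice rather than something this plan mandates.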
media should emit shared app activity like:
- media.image.queued
- media.image.completed
- media.image.failed
- media.video.queued
- media.video.completed
- media.video.failed
These should use the shared activity system under .termlings/store/activity/.
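The event names above follow a regular media.<type>.<status> pattern, so they can be derived from the job rather than hard-coded. This is a sketch only: the `MediaActivityEvent` payload shape and the `toActivityEvent` helper are hypothetical, and the on-disk format of the shared activity store is not defined in this document.

```typescript
type MediaType = "image" | "video";
type EventStatus = "queued" | "completed" | "failed";

// Hypothetical payload; the shared activity system may require other fields.
interface MediaActivityEvent {
  name: string;
  jobId: string;
  actorSlug: string;
  at: number; // epoch milliseconds, matching createdAt/updatedAt in the job envelope
}

// Builds the event for a job transition, e.g. "media.video.completed".
function toActivityEvent(
  job: { id: string; type: MediaType; actorSlug: string },
  status: EventStatus,
  at: number = Date.now(),
): MediaActivityEvent {
  return {
    name: `media.${job.type}.${status}`,
    jobId: job.id,
    actorSlug: job.actorSlug,
    at,
  };
}

const event = toActivityEvent(
  { id: "vid_abc123", type: "video", actorSlug: "designer" },
  "completed",
  1770000000000,
);
console.log(event.name); // prints "media.video.completed"
```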
Keep provider logic behind a small adapter boundary:
type MediaProvider = {
  generateImage(input: ImageJobInput): Promise<MediaJobResult>
  generateVideo(input: VideoJobInput): Promise<MediaJobResult | MediaPendingJob>
  pollJob?(job: MediaJobRecord): Promise<MediaJobRecord>
  cancelJob?(job: MediaJobRecord): Promise<void>
}
This keeps the CLI and storage model provider-agnostic without overbuilding provider abstraction too early.
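As an illustration of how small the adapter boundary can stay, the sketch below pairs the type with an in-memory stub provider and one caller. The shapes of `ImageJobInput`, `VideoJobInput`, `MediaJobResult`, `MediaPendingJob`, and `MediaJobRecord` are not defined in this plan, so the versions here are hypothetical placeholders, as are `stubProvider` and `startVideo`. The stub makes no network calls and does not represent any real Google API.

```typescript
// Hypothetical shapes for the types the adapter refers to.
type ImageJobInput = { prompt: string };
type VideoJobInput = { prompt: string; duration?: number };
type MediaJobResult = { kind: "result"; bytes: Uint8Array; mime: string };
type MediaPendingJob = { kind: "pending"; providerJobId: string };
type MediaJobRecord = {
  id: string;
  status: "queued" | "completed" | "failed";
  providerJobId?: string;
};

type MediaProvider = {
  generateImage(input: ImageJobInput): Promise<MediaJobResult>;
  generateVideo(input: VideoJobInput): Promise<MediaJobResult | MediaPendingJob>;
  pollJob?(job: MediaJobRecord): Promise<MediaJobRecord>;
  cancelJob?(job: MediaJobRecord): Promise<void>;
};

// In-memory stand-in for a real provider, useful for exercising the CLI offline.
const stubProvider: MediaProvider = {
  async generateImage(_input: ImageJobInput): Promise<MediaJobResult> {
    // Four placeholder bytes (the start of the PNG signature), not a valid image.
    return { kind: "result", bytes: new Uint8Array([0x89, 0x50, 0x4e, 0x47]), mime: "image/png" };
  },
  async generateVideo(_input: VideoJobInput): Promise<MediaJobResult | MediaPendingJob> {
    // Video is deferred: the caller gets a handle to poll later.
    return { kind: "pending", providerJobId: "op_1" };
  },
  async pollJob(job: MediaJobRecord): Promise<MediaJobRecord> {
    return { ...job, status: "completed" };
  },
};

// Caller side: start a video and record whether the provider finished or deferred.
async function startVideo(
  provider: MediaProvider,
  id: string,
  input: VideoJobInput,
): Promise<MediaJobRecord> {
  const out = await provider.generateVideo(input);
  if (out.kind === "pending") {
    return { id, status: "queued", providerJobId: out.providerJobId };
  }
  return { id, status: "completed" };
}
```

The caller never inspects provider-specific details beyond the pending handle, so swapping the stub for a real adapter would leave the CLI and job files unchanged.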
- image jobs should feel fast and scriptable
- video jobs should be explicitly async and cancellable
- prompts and outputs should be inspectable from local files
- the app should be useful from CLI first, before any dedicated UI
- future desktop/web views should be able to read the same job files without extra translation
media should connect naturally to:
- brand for prompt enrichment
- ads for creative generation and reuse
- future design/rendering work for hero assets, OG images, motion graphics, and campaign visuals
- shared app activity for progress visibility
Keep v1 narrow:
- one app: media
- image + video generation only
- Google provider first
- file-based jobs and outputs
- brand-aware prompt enrichment
- activity events
Do not add in v1:
- editing timelines
- collaboration comments
- version trees
- cloud-only storage
- remote render workers
V1 is successful if:
- an agent can generate a useful image from one command
- an agent can generate a useful short video from one command
- outputs are stored locally and reusable
- activity shows progress clearly
- the app becomes a clean dependency for future ads and design tooling