Once your Sora 2 model is deployed, you can start generating videos. Video generation is an asynchronous process: you submit a request with your prompt and video settings, then retrieve the completed video when it's ready.

## Video generation parameters

Before crafting your prompt, understand the API parameters that control your video output:

| Parameter | Description | Supported values |
|-----------|-------------|------------------|
| **prompt** | Natural language description of your video | Text string (required) |
| **model** | The model to use | `sora-2` or `sora-2-pro` |
| **size** | Output resolution | `1280x720` (landscape), `720x1280` (portrait) |
| **seconds** | Video duration | `4`, `8`, or `12` (default: 4) |
| **input_reference** | Reference image for the first frame | JPEG, PNG, or WebP file |
| **remix_video_id** | ID of a previous video to remix | Video ID string |
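The parameter table above can be mirrored in a small client-side helper that validates settings before a request is submitted. This is an illustrative sketch, not part of any SDK; the function name and the shape of the returned dictionary are assumptions.

```python
# Illustrative helper that assembles and validates video generation
# parameters. The parameter names mirror the table above; this function
# itself is hypothetical, not an SDK call.

SUPPORTED_MODELS = {"sora-2", "sora-2-pro"}
SUPPORTED_SIZES = {"1280x720", "720x1280"}
SUPPORTED_SECONDS = {4, 8, 12}

def build_video_request(prompt, model="sora-2", size="1280x720", seconds=4):
    """Return a parameter dict for a video generation request."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if size not in SUPPORTED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if seconds not in SUPPORTED_SECONDS:
        raise ValueError(f"unsupported duration: {seconds}")
    return {"prompt": prompt, "model": model, "size": size, "seconds": seconds}
```

Validating locally gives a faster failure than waiting for the service to reject an unsupported resolution or duration.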
> [!TIP]
> The model follows instructions more reliably in shorter clips. For best results, consider generating two 4-second clips and stitching them together rather than a single 8-second clip.
## Test video generation in the playground

After deploying the Sora 2 model, you can test it using the Video playground in the Microsoft Foundry portal:

1. Navigate to your deployed Sora 2 model in the Foundry portal.
1. Select the **Playground** tab to access the video generation interface.
1. In the text box, enter a prompt describing the video you want to generate.
1. Configure video settings such as resolution and duration.
1. Select **Generate** to start video creation.

Video generation typically takes 1 to 5 minutes depending on your settings. When the AI-generated video is ready, it appears on the page.

> [!NOTE]
> The content generation APIs include a content moderation filter. If Azure OpenAI recognizes your prompt as harmful content, it won't return a generated video. For more information, see [Content filtering](/azure/ai-services/openai/concepts/content-filter).

In the video playground, you can also view cURL code samples that are prefilled according to your settings. Select the **View code** button at the top of the playground to access sample code you can use in your applications.
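Because generation is asynchronous, application code typically submits a job and then polls its status until the video is ready. The sketch below shows only the polling pattern; `fetch_status` stands in for whatever status call your SDK or REST endpoint provides, and the status strings are assumptions rather than a documented API.

```python
import time

# Illustrative polling loop for an asynchronous video job. `fetch_status`
# is a stand-in for your actual status call (SDK method or REST request);
# the status strings here are assumptions, not a documented contract.
def wait_for_video(job_id, fetch_status, poll_interval=5.0, timeout=600.0):
    """Poll until the job leaves its in-progress states, then return the final status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status not in ("queued", "in_progress"):
            return status  # for example "completed" or "failed"
        time.sleep(poll_interval)
    raise TimeoutError(f"video job {job_id} did not finish within {timeout}s")
```

A generous timeout matters here: as noted above, generation can take several minutes, so aggressive polling with a short deadline will produce spurious failures.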
## Writing effective prompts

Think of prompting like briefing a cinematographer. The more specific you are about what the shot should achieve, the more control and consistency you'll get. However, leaving some details open can lead to creative, unexpected results.

### Prompt anatomy

A clear prompt describes a shot as if you were sketching it onto a storyboard:

- **Camera framing**: Specify the shot type (wide, medium, close-up) and angle
- **Subject description**: Anchor your subject with distinctive details
- **Action**: Describe movement in beats, such as small steps, gestures, or pauses
- **Lighting and palette**: Set the mood with lighting direction and color anchors
- **Style**: Establish the aesthetic early (for example, "1970s film" or "handheld documentary")
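One way to keep prompts consistent across shots is to assemble them from these storyboard elements programmatically. The helper below is a hypothetical convention, not an API requirement; the field names and ordering are assumptions.

```python
# Illustrative helper that assembles a prompt from the storyboard
# elements above (style, framing, subject, action, lighting). The
# ordering is a convention chosen here, not something the API requires.
def compose_prompt(framing, subject, action, lighting="", style=""):
    parts = [style, framing, subject, action, lighting]
    return ". ".join(p.strip().rstrip(".") for p in parts if p) + "."

shot = compose_prompt(
    framing="Medium shot, eye level",
    subject="a cyclist in a yellow raincoat",
    action="pedals three times, brakes, and stops at the crosswalk",
    lighting="soft overcast light",
    style="Handheld documentary",
)
```

Reusing the same subject description string across shots is an easy way to apply the continuity advice later in this article.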
### Weak vs. strong prompts

| Weak prompt | Strong prompt |
|-------------|---------------|
| "A beautiful street at night" | "Wet asphalt, zebra crosswalk, neon signs reflecting in puddles" |
| "Person moves quickly" | "Cyclist pedals three times, brakes, and stops at crosswalk" |
| "Cinematic look" | "Anamorphic 2.0x lens, shallow DOF, volumetric light" |

### Example prompt

Here's an example of a well-structured prompt:

```text
In a 90s documentary-style interview, an old Swedish man sits in a study
and says, "I still remember when I was young."
```

This prompt works because:

- "90s documentary" sets the style, so the model chooses appropriate camera, lighting, and color
- "old Swedish man sits in a study" describes subject and setting while allowing creative interpretation
- The dialogue gives the model specific words to sync with the character

## Using reference images

For more control over composition and style, use the `input_reference` parameter to provide a visual reference. The model uses the image as an anchor for the first frame, while your prompt defines what happens next.

Requirements for reference images:

- The image resolution must match the target video size (`1280x720` or `720x1280`)
- Supported formats: JPEG, PNG, WebP
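Since a mismatched reference image fails the request, it can be worth checking these requirements before uploading. The sketch below assumes the image's dimensions and format have already been read (for example, with an image library); the helper itself is hypothetical, not part of the API.

```python
# Illustrative pre-flight check for a reference image. Width, height, and
# format are passed in as already-known values; this function is a
# hypothetical convenience, not an API call.
VALID_FORMATS = {"jpeg", "png", "webp"}

def check_reference_image(width, height, fmt, target_size="1280x720"):
    """Return True if the image matches the target video size and an allowed format."""
    target = tuple(int(n) for n in target_size.split("x"))
    if (width, height) != target:
        return False
    return fmt.lower() in VALID_FORMATS
```

Note the resolution must equal the `size` you request for the video, so a landscape reference can't seed a portrait generation.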
## Remixing existing videos

The remix feature lets you modify specific aspects of an existing video while preserving its core elements: scene transitions, visual layout, and overall structure. This is useful for making targeted adjustments without regenerating from scratch.

To remix a video:

1. Generate a video and note its video ID from the completed job
2. Call the remix endpoint with the original video ID and an updated prompt
3. Describe only the changes you want; keep modifications focused

For best results:

- Limit changes to one clearly articulated adjustment
- Be specific about what to change: "same shot, switch to 85mm lens" or "same lighting, new palette: teal, sand, rust"
- Narrow, precise edits retain greater fidelity to the source material
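Following the steps above, a remix request carries only the source video's ID and an updated prompt describing the single change. A minimal sketch, assuming a payload-building helper (the `remix_video_id` field mirrors the parameter from the table at the top of this article; the helper itself is hypothetical):

```python
# Illustrative remix request payload. `remix_video_id` matches the
# parameter from the table above; the helper function is hypothetical.
def build_remix_request(original_video_id, updated_prompt):
    if not original_video_id:
        raise ValueError("a source video ID is required to remix")
    return {
        "model": "sora-2",
        "remix_video_id": original_video_id,
        # Describe only the change, e.g. "same shot, switch to 85mm lens".
        "prompt": updated_prompt,
    }
```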
## Tips for better results

- **Keep it simple**: Each shot should have one clear camera move and one clear subject action
- **Use beats for timing**: Instead of "actor walks across the room," try "actor takes four steps to the window, pauses, and pulls the curtain"
- **Be consistent**: Reuse phrasing for characters across shots to maintain continuity
- **Iterate**: Small changes to camera, lighting, or action can shift outcomes dramatically; treat each generation as a creative variation

Video generation with Sora 2 is a collaborative process. You provide direction, and the model delivers creative variations. Be prepared to experiment: sometimes the second or third generation is the best one.