
Commit 1e43cae

Merge pull request #53956 from GraemeMalcolm/main
Updates to vision modules
2 parents abfe059 + 01d8a0d

26 files changed: 115 additions & 146 deletions

learn-pr/wwl-data-ai/analyze-images-with-content-understanding/includes/3-analyze-images-with-content-understanding.md

Lines changed: 38 additions & 16 deletions
@@ -63,26 +63,48 @@ Here's an example schema for analyzing product images:
 
 ## Analyze an image
 
-To analyze an image using Content Understanding, submit a POST request to the analyze endpoint with your analyzer ID and the image URL or file:
+To analyze an image using Content Understanding, you can use the Python SDK, which you can install using `pip` like this:
 
 ```bash
-curl -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
-  -H "Ocp-Apim-Subscription-Key: {key}" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "inputs": [
-      {
-        "url": "https://example.url/product-image.jpg"
-      }
-    ]
-  }'
+pip install azure-ai-contentunderstanding
 ```
 
-The response includes a result ID that you use to retrieve the analysis results once processing completes.
-
-## Understand the response
+To submit a request to the analyze endpoint with your analyzer ID and the image URL or file, you can use code similar to this example:
+
+```python
+import json  # used by json.dumps below
+import sys   # used by sys.exit below
+
+from azure.ai.contentunderstanding import ContentUnderstandingClient
+from azure.ai.contentunderstanding.models import AnalysisInput, AnalysisResult
+from azure.core.credentials import AzureKeyCredential # for key-based authentication
+from azure.identity import DefaultAzureCredential # for Entra ID authentication
+
+# Get a client
+credential = AzureKeyCredential(key)  # key-based credential; pass as the credential argument below
+client = ContentUnderstandingClient(endpoint={FOUNDRY_ENDPOINT},
+                                    credential={KEY_OR_IDENTITY},
+                                    api_version="2025-11-01")
+
+# Analyze an image file
+with open("my_image.png", "rb") as f:
+    file_bytes = f.read()
+
+try:
+    poller = client.begin_analyze(
+        analyzer_id={ANALYSER_ID},
+        inputs=[AnalysisInput(data=file_bytes)],
+    )
+    # Get results asynchronously from poller
+    result: AnalysisResult = poller.result()
+
+    # Display results
+    result_str = json.dumps(result.as_dict(), indent=2)
+    print(result_str)
+
+except Exception as ex:
+    print(f"[Unexpected Error]: {ex}")
+    sys.exit(1)
+```
 
-When analysis completes, the response includes:
+When analysis completes, the results include the extracted content:
 
 - **markdown**: A text representation of the image content, useful for search and RAG scenarios
 - **fields**: Extracted field values matching your schema, each with a confidence score
@@ -134,4 +156,4 @@ Use confidence scores to build automation workflows that route low-confidence ex
 - **Single focus**: Images with one clear subject yield better results than cluttered scenes
 - **Consistent orientation**: Upright images are processed more reliably than rotated ones
 
-Content Understanding's image analysis capabilities enable you to transform visual content into structured, actionable data for document processing, inventory management, quality inspection, and many other business scenarios.
+Content Understanding's image analysis capabilities enable you to transform visual content into structured, actionable data for document processing, inventory management, quality inspection, and many other business scenarios.
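The updated module text describes **fields** that each carry a confidence score and recommends routing low-confidence extractions for human review. That routing idea can be sketched as follows; the field names and the `{"value": ..., "confidence": ...}` shape are hypothetical placeholders for illustration, not the exact Content Understanding result schema.

```python
def route_fields(fields: dict, threshold: float = 0.8):
    """Split extracted fields into auto-accepted and human-review buckets
    based on a confidence threshold."""
    accepted, review = {}, {}
    for name, field in fields.items():
        if field.get("confidence", 0.0) >= threshold:
            accepted[name] = field["value"]
        else:
            review[name] = field["value"]
    return accepted, review

# Hypothetical extracted fields from an analysis result
fields = {
    "productName": {"value": "Contoso Widget", "confidence": 0.97},
    "serialNumber": {"value": "SN-013", "confidence": 0.42},
}
accepted, review = route_fields(fields)
print(accepted)  # → {'productName': 'Contoso Widget'}
print(review)    # → {'serialNumber': 'SN-013'}
```

In practice you would tune the threshold per field based on the accuracy you observe for your analyzer.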
Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
-In this exercise, you'll use Azure Content Understanding to analyze images. You start by exploring the prebuilt image analyzer in the Microsoft Foundry portal to see how Content Understanding extracts information from images. Then, you create a Python application that uses the Content Understanding API to analyze images programmatically and extract structured data.
+In this exercise, you'll use Azure Content Understanding to analyze images. First, you create a custom image analyzer in the Microsoft Foundry portal. Then, you create a Python application that uses the Content Understanding API to analyze images programmatically and extract structured data.
 
 > [!NOTE]
 > To complete this exercise, you need an Azure subscription. If you don't have one, you can [sign up for a free account](https://azure.microsoft.com/pricing/purchase-options/azure-account?cid=msft_learn), which includes credits for the first 30 days.
 
 Launch the exercise and follow the instructions.
 
-[![Button to launch exercise.](../media/launch-exercise.png)](https://go.microsoft.com/fwlink/?linkid=2356120&azure-portal=true)
+[![Button to launch exercise.](../media/launch-exercise.png)](https://go.microsoft.com/fwlink/?linkid=2356972&azure-portal=true)

learn-pr/wwl-data-ai/generate-images-azure-openai/1-introduction.yml

Lines changed: 1 addition & 2 deletions
@@ -6,11 +6,10 @@ metadata:
   description: Overview of image generation models
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 1
 content: |
   [!include[](includes/1-introduction.md)]
-

learn-pr/wwl-data-ai/generate-images-azure-openai/2-what-is-dall-e.yml

Lines changed: 1 addition & 2 deletions
@@ -6,11 +6,10 @@ metadata:
   description: Explore what image-generation models are.
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 2
 content: |
   [!include[](includes/2-what-is-dall-e.md)]
-

learn-pr/wwl-data-ai/generate-images-azure-openai/3-dall-e-in-openai-studio.yml

Lines changed: 1 addition & 3 deletions
@@ -6,12 +6,10 @@ metadata:
   description: Use the Images playground in Microsoft Foundry portal to explore image generation models.
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 3
 content: |
   [!include[](includes/3-dall-e-in-openai-studio.md)]
-
-

learn-pr/wwl-data-ai/generate-images-azure-openai/4-dall-e-rest-api.yml

Lines changed: 1 addition & 2 deletions
@@ -6,11 +6,10 @@ metadata:
   description: Use APIs and SDKs to consume image generation models.
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 3
 content: |
   [!include[](includes/4-dall-e-rest-api.md)]
-

learn-pr/wwl-data-ai/generate-images-azure-openai/5-exercise-use-dall-e.yml

Lines changed: 1 addition & 3 deletions
@@ -6,12 +6,10 @@ metadata:
   description: Hands-on exercise to use an image generation model in Microsoft Foundry.
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 20
 content: |
   [!include[](includes/5-exercise-use-dall-e.md)]
-
-

learn-pr/wwl-data-ai/generate-images-azure-openai/6-knowledge-check.yml

Lines changed: 23 additions & 37 deletions
@@ -6,7 +6,7 @@ metadata:
   description: Check your knowledge of image generation models.
   author: ivorb
   ms.author: berryivor
-  ms.date: 04/29/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
@@ -15,39 +15,25 @@ durationInMinutes: 3
 content: |
   quiz:
     questions:
-    - content: "You want to use a model in Microsoft Foundry to generate images. Which model should you use?"
-      choices:
-      - content: "DALL-E"
-        isCorrect: true
-        explanation: "The DALL-E model is used to generate images based on natural language prompts."
-      - content: "GPT-4o"
-        isCorrect: false
-        explanation: "The GPT-4o model is used to generate natural language text."
-      - content: "text-embedding-ada-002"
-        isCorrect: false
-        explanation: "The text-embedding-ada-002 model is used to generate embeddings (vectors that represent text tokens)."
-    - content: "Which playground in Microsoft Foundry portal should you use to test an image-generation model?"
-      choices:
-      - content: "Agents"
-        isCorrect: false
-        explanation: "The Agents playground is used to test AI agents."
-      - content: "Chat"
-        isCorrect: false
-        explanation: "The Chat playground is used to explore natural language generation models."
-      - content: "Images"
-        isCorrect: true
-        explanation: "The Images playground is used to explore image generation models."
-    - content: "In a REST request to generate images, what does the **n** parameter indicate?"
-      choices:
-      - content: "The description of the desired image."
-        isCorrect: false
-        explanation: "The image description is provided in the **prompt** parameter."
-      - content: "The number of images to be generated"
-        isCorrect: true
-        explanation: "The number of images to be generated is specified in the **n** parameter."
-      - content: "The size of the image to be generated"
-        isCorrect: false
-        explanation: "The size of the desired image is specified in the **size** parameter."
-
-
+    - content: "You want to find a model in Microsoft Foundry to generate images. Which inference task should you filter by?"
+      choices:
+      - content: "Text to image"
+        isCorrect: true
+        explanation: "Text to image models generate images based on natural language prompts."
+      - content: "Image to text"
+        isCorrect: false
+        explanation: "Image to text models generate natural language descriptions based on images."
+      - content: "Embeddings"
+        isCorrect: false
+        explanation: "Embedding models generate vector representations of text or other data, not images."
+    - content: "Which OpenAI API can you use with image-generation models?"
+      choices:
+      - content: "Video"
+        isCorrect: false
+        explanation: "The Video API is used for video-generation tasks, not image generation."
+      - content: "Image"
+        isCorrect: true
+        explanation: "The Image API is used for image-generation tasks."
+      - content: "Graphics"
+        isCorrect: false
+        explanation: "There is no OpenAI Graphics API."
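The Image API that the updated quiz refers to takes a prompt along with parameters such as `n` (the number of images) and `size`, as the earlier version of the quiz described. A minimal sketch of building such a request body follows; the function name and exact JSON schema are assumptions for illustration, not the official API reference.

```python
import json

def build_image_request(prompt: str, n: int = 1, size: str = "1024x1024") -> str:
    """Build a JSON body for a hypothetical image-generation request:
    prompt describes the desired image, n is the number of images,
    and size is the image dimensions."""
    return json.dumps({"prompt": prompt, "n": n, "size": size})

body = build_image_request("A badger wearing a tuxedo", n=2)
print(body)  # → {"prompt": "A badger wearing a tuxedo", "n": 2, "size": "1024x1024"}
```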

learn-pr/wwl-data-ai/generate-images-azure-openai/7-summary.yml

Lines changed: 1 addition & 2 deletions
@@ -6,11 +6,10 @@ metadata:
   description: Summarize what you've learned about image generation models
   author: ivorb
   ms.author: berryivor
-  ms.date: 05/08/2025
+  ms.date: 03/24/2026
   ms.topic: unit
   ms.collection:
   - wwl-ai-copilot
 durationInMinutes: 1
 content: |
   [!include[](includes/7-summary.md)]
-
Lines changed: 1 addition & 2 deletions
@@ -1,10 +1,9 @@
 With Microsoft Foundry, you can use language models to generate content based on natural language prompts. Often the generated content is in the form of natural language text, but increasingly, models can generate other kinds of content.
 
-For example, the OpenAI DALL-E image generation model can create original graphical content based on a description of a desired image.
+For example, the OpenAI *gpt-image-1* model can create original graphical content based on a description of a desired image.
 
 ![Diagram of a prompt requesting a model to create an image.](../media/image-generation.png)
 
 The ability to use AI to generate graphics has many applications, including the creation of illustrations or photorealistic images for articles or marketing collateral, generation of unique product or company logos, or any scenario where a desired image can be described.
 
 In this module, you'll learn how to develop an application that uses generative AI to generate original images.
-