Commit 9d7e4bd

Merge pull request #53460 from ivorb/new-agents
model catalog evaluation module
2 parents 4364dce + 3e34969 commit 9d7e4bd

22 files changed

Lines changed: 693 additions & 0 deletions
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
+### YamlMime:LearningPath
+uid: learn.wwl.develop-generative-ai-apps
+hidden: true
+metadata:
+  title: Develop generative AI apps on Microsoft Foundry
+  description: Learn how to develop generative AI apps in Microsoft Foundry. (AI-3016)
+  ms.date: 02/17/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: learning-path
+  ms.collection: wwl-ai-copilot
+  ms.custom: [copilot-learning-hub]
+title: Develop generative AI apps in Azure
+prerequisites: |
+  Before starting this module, you should be familiar with fundamental AI concepts and services in Azure. You should also have programming experience.
+summary: |
+  Generative artificial intelligence (AI) is becoming more accessible through comprehensive development platforms like Microsoft Foundry. Learn how to build generative AI applications that use language models to interact with your users.
+iconUrl: /training/achievements/generic-badge.svg
+levels:
+- intermediate
+roles:
+- data-scientist
+- ai-engineer
+products:
+- ai-services
+- azure-ai-foundry
+- azure-ai-foundry-sdk
+subjects:
+- artificial-intelligence
+modules:
+- learn.wwl.prepare-azure-ai-development
+- learn.wwl.model-catalog-evaluate
+- learn.wwl.ai-foundry-sdk
+- learn.wwl.finetune-model-copilot-ai-studio
+- learn.wwl.responsible-ai-studio
+trophy:
+  uid: learn.wwl.develop-generative-ai-apps.trophy
+
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: Learn why model selection and evaluation are essential for building effective generative AI applications.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 3
+content: |
+  [!include[](includes/1-introduction.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.explore-model-catalog
+title: Explore the model catalog
+metadata:
+  title: Explore the model catalog
+  description: Learn how to browse and filter models in the Microsoft Foundry portal model catalog.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 7
+content: |
+  [!include[](includes/2-explore-model-catalog.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.select-models-benchmarks
+title: Select models using benchmarks
+metadata:
+  title: Select models using benchmarks
+  description: Understand how to use quality, safety, cost, and performance benchmarks to compare and select models.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 9
+content: |
+  [!include[](includes/3-select-models-benchmarks.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.deploy-models
+title: Deploy models to endpoints
+metadata:
+  title: Deploy models to endpoints
+  description: Learn how to deploy models to different endpoint types and test them in the playground.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 8
+content: |
+  [!include[](includes/4-deploy-models.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.evaluate-performance
+title: Evaluate model performance
+metadata:
+  title: Evaluate model performance
+  description: Discover how to evaluate models using manual testing, AI-assisted metrics, and NLP metrics.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 10
+content: |
+  [!include[](includes/5-evaluate-performance.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.exercise
+title: Exercise - Select, deploy, and evaluate models
+metadata:
+  title: Exercise - Select, deploy, and evaluate models
+  description: Practice selecting, deploying, and evaluating models in Microsoft Foundry portal.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 20
+content: |
+  [!include[](includes/6-exercise.md)]
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.knowledge-check
+title: Knowledge check
+metadata:
+  title: Knowledge check
+  description: Check your understanding of model selection, deployment, and evaluation concepts.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 3
+quiz:
+  title: Check your knowledge
+  questions:
+  - content: "Which deployment type in Microsoft Foundry portal requires an Azure Marketplace subscription for models from partners and the community?"
+    choices:
+    - content: "Managed compute"
+      isCorrect: false
+      explanation: "Incorrect. Managed compute doesn't necessarily require Azure Marketplace subscriptions. It uses Azure virtual machines and is billed for hosting and inference costs."
+    - content: "Serverless API"
+      isCorrect: true
+      explanation: "Correct. Serverless API deployments for models from partners and community require Azure Marketplace subscriptions, while models sold directly by Azure don't require this subscription."
+    - content: "Provisioned"
+      isCorrect: false
+      explanation: "Incorrect. Provisioned deployments reserve dedicated capacity but don't specifically require Azure Marketplace subscriptions."
+  - content: "Which evaluation metric measures whether model responses are based on provided context rather than speculation?"
+    choices:
+    - content: "Fluency"
+      isCorrect: false
+      explanation: "Incorrect. Fluency evaluates linguistic correctness and natural language quality, not whether responses are grounded in context."
+    - content: "Groundedness"
+      isCorrect: true
+      explanation: "Correct. Groundedness determines whether responses are based on provided context rather than speculation or the model's general knowledge."
+    - content: "Coherence"
+      isCorrect: false
+      explanation: "Incorrect. Coherence assesses whether responses flow logically and maintain consistent ideas, not whether they're based on provided context."
+  - content: "What type of model should you select if your application needs to process both text and images?"
+    choices:
+    - content: "Small Language Model (SLM)"
+      isCorrect: false
+      explanation: "Incorrect. SLMs are efficient text-focused models but don't inherently process images alongside text."
+    - content: "Embedding model"
+      isCorrect: false
+      explanation: "Incorrect. Embedding models convert text into numerical representations for semantic search and similarity tasks, not for processing images."
+    - content: "Multimodal model"
+      isCorrect: true
+      explanation: "Correct. Multimodal models like GPT-4o and Phi-3-vision can process multiple data types including both text and images."
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.model-catalog-evaluate.summary
+title: Summary
+metadata:
+  title: Summary
+  description: Review key concepts for selecting, deploying, and evaluating models in Microsoft Foundry portal.
+  ms.date: 02/13/2026
+  author: ivorb
+  ms.author: berryivor
+  ms.topic: unit
+  ai-usage: ai-assisted
+durationInMinutes: 3
+content: |
+  [!include[](includes/8-summary.md)]
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+Building effective generative AI applications requires selecting the right foundation model for your specific use case. With thousands of models available, you need a structured approach to discover, compare, deploy, and validate that a model meets your requirements.
+
+Consider a scenario where you're building an AI-powered customer support chatbot for a retail company. You need to select a language model that can understand customer questions, provide accurate responses, and maintain appropriate tone and safety standards. But how do you choose from the vast catalog of available models? How do you know if a model performs well for your specific needs? And once deployed, how do you measure and improve its performance?
+
+The Microsoft Foundry portal provides a comprehensive platform for this entire workflow. You can explore over 1,900 models from providers like Microsoft, Anthropic, OpenAI, Meta, and Hugging Face. You can compare models using industry-standard benchmarks for quality, safety, cost, and performance. After selecting a model, you deploy it to an endpoint where your application can consume it. Finally, you evaluate the model's performance using both automated metrics and manual testing to ensure it meets your quality and safety requirements.
+
+In this module, you explore how to use the Microsoft Foundry portal to select, deploy, and evaluate models from the model catalog. You learn how to make informed decisions about model selection, understand different deployment options, and assess model performance using various evaluation approaches.
+
+By the end of this module, you'll be able to:
+
+- Explore and filter models in the model catalog
+- Compare models using benchmark metrics for quality, safety, cost, and performance
+- Deploy a model to an endpoint and test it in the playground
+- Evaluate model performance using manual and automated approaches
+- Understand different evaluation metrics and when to use them
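
Reviewer note: the comparison step named in the introduction (weighing quality, safety, cost, and performance benchmarks) can be illustrated outside the portal with a simple weighted ranking. The sketch below is not part of this commit and does not use any Foundry API. The model names, the normalized scores, the `WEIGHTS` values, and the `Candidate`, `score`, and `rank` helpers are all hypothetical, chosen only to show how a trade-off between the four benchmark dimensions might be made explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical benchmark summary for one model (all values normalized to 0..1)."""
    name: str
    quality: float   # higher is better
    safety: float    # higher is better
    cost: float      # lower is better
    latency: float   # lower is better (stands in for "performance")


# Assumed weights for an illustrative customer-support scenario; they sum to 1.
WEIGHTS = {"quality": 0.4, "safety": 0.3, "cost": 0.15, "latency": 0.15}


def score(c: Candidate) -> float:
    """Combine the four dimensions into one number; cost and latency are inverted."""
    return (
        WEIGHTS["quality"] * c.quality
        + WEIGHTS["safety"] * c.safety
        + WEIGHTS["cost"] * (1.0 - c.cost)
        + WEIGHTS["latency"] * (1.0 - c.latency)
    )


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest combined score."""
    return sorted(candidates, key=score, reverse=True)


# Made-up numbers, not real benchmark results.
CANDIDATES = [
    Candidate("model-a", quality=0.90, safety=0.80, cost=0.70, latency=0.60),
    Candidate("model-b", quality=0.80, safety=0.90, cost=0.30, latency=0.40),
    Candidate("model-c", quality=0.70, safety=0.70, cost=0.10, latency=0.20),
]

if __name__ == "__main__":
    for c in rank(CANDIDATES):
        print(f"{c.name}: {score(c):.3f}")
```

With these made-up numbers the highest-quality candidate does not rank first, because its cost and latency penalties outweigh its quality lead. A ranking like this only narrows the shortlist; the module's later units on deployment and evaluation still apply to whichever model is chosen.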
