Commit b45752d

model catalog evaluation module
1 parent 4f84c3b commit b45752d

21 files changed: 654 additions & 0 deletions
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.introduction
title: Introduction
metadata:
  title: Introduction
  description: Learn why model selection and evaluation are essential for building effective generative AI applications.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 3
content: |
  [!include[](includes/1-introduction.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.explore-model-catalog
title: Explore the model catalog
metadata:
  title: Explore the model catalog
  description: Learn how to browse and filter models in the Microsoft Foundry portal model catalog.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 7
content: |
  [!include[](includes/2-explore-model-catalog.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.select-models-benchmarks
title: Select models using benchmarks
metadata:
  title: Select models using benchmarks
  description: Understand how to use quality, safety, cost, and performance benchmarks to compare and select models.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 9
content: |
  [!include[](includes/3-select-models-benchmarks.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.deploy-models
title: Deploy models to endpoints
metadata:
  title: Deploy models to endpoints
  description: Learn how to deploy models to different endpoint types and test them in the playground.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 8
content: |
  [!include[](includes/4-deploy-models.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.evaluate-performance
title: Evaluate model performance
metadata:
  title: Evaluate model performance
  description: Discover how to evaluate models using manual testing, AI-assisted metrics, and NLP metrics.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 10
content: |
  [!include[](includes/5-evaluate-performance.md)]
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.exercise
title: Exercise - Select, deploy, and evaluate models
metadata:
  title: Exercise - Select, deploy, and evaluate models
  description: Practice selecting, deploying, and evaluating models in Microsoft Foundry portal.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 20
content: |
  [!include[](includes/6-exercise.md)]
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.knowledge-check
title: Knowledge check
metadata:
  title: Knowledge check
  description: Check your understanding of model selection, deployment, and evaluation concepts.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 3
quiz:
  title: Check your knowledge
  questions:
  - content: "Which deployment type in Microsoft Foundry portal requires an Azure Marketplace subscription for models from partners and the community?"
    choices:
    - content: "Managed compute"
      isCorrect: false
      explanation: "Incorrect. Managed compute doesn't necessarily require Azure Marketplace subscriptions. It uses Azure virtual machines and is billed for hosting and inference costs."
    - content: "Serverless API"
      isCorrect: true
      explanation: "Correct. Serverless API deployments for models from partners and the community require Azure Marketplace subscriptions, while models sold directly by Azure don't require this subscription."
    - content: "Provisioned"
      isCorrect: false
      explanation: "Incorrect. Provisioned deployments reserve dedicated capacity but don't specifically require Azure Marketplace subscriptions."
  - content: "Which evaluation metric measures whether model responses are based on provided context rather than speculation?"
    choices:
    - content: "Fluency"
      isCorrect: false
      explanation: "Incorrect. Fluency evaluates linguistic correctness and natural language quality, not whether responses are grounded in context."
    - content: "Groundedness"
      isCorrect: true
      explanation: "Correct. Groundedness determines whether responses are based on provided context rather than speculation or the model's general knowledge."
    - content: "Coherence"
      isCorrect: false
      explanation: "Incorrect. Coherence assesses whether responses flow logically and maintain consistent ideas, not whether they're based on provided context."
  - content: "What type of model should you select if your application needs to process both text and images?"
    choices:
    - content: "Small Language Model (SLM)"
      isCorrect: false
      explanation: "Incorrect. SLMs are efficient text-focused models but don't inherently process images alongside text."
    - content: "Embedding model"
      isCorrect: false
      explanation: "Incorrect. Embedding models convert text into numerical representations for semantic search and similarity tasks, not for processing images."
    - content: "Multimodal model"
      isCorrect: true
      explanation: "Correct. Multimodal models like GPT-4o and Phi-3-vision can process multiple data types including both text and images."
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
### YamlMime:ModuleUnit
uid: learn.wwl.model-catalog-evaluate.summary
title: Summary
metadata:
  title: Summary
  description: Review key concepts for selecting, deploying, and evaluating models in Microsoft Foundry portal.
  ms.date: 02/13/2026
  author: ivorb
  ms.author: berryivor
  ms.topic: unit
  ai-usage: ai-assisted
durationInMinutes: 3
content: |
  [!include[](includes/8-summary.md)]
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
Building effective generative AI applications requires selecting the right foundation model for your specific use case. With thousands of models available, you need a structured approach to discover, compare, and deploy models, and to validate that your choice meets your requirements.

Consider a scenario where you're building an AI-powered customer support chatbot for a retail company. You need to select a language model that can understand customer questions, provide accurate responses, and maintain appropriate tone and safety standards. But how do you choose from the vast catalog of available models? How do you know if a model performs well for your specific needs? And once deployed, how do you measure and improve its performance?

The Microsoft Foundry portal provides a comprehensive platform for this entire workflow. You can explore over 1,900 models from providers like Microsoft, Anthropic, OpenAI, Meta, and Hugging Face. You can compare models using industry-standard benchmarks for quality, safety, cost, and performance. After selecting a model, you deploy it to an endpoint where your application can consume it. Finally, you evaluate the model's performance using both automated metrics and manual testing to ensure it meets your quality and safety requirements.

In this module, you explore how to use the Microsoft Foundry portal to select, deploy, and evaluate models from the model catalog. You learn how to make informed decisions about model selection, understand different deployment options, and assess model performance using various evaluation approaches.

By the end of this module, you'll be able to:

- Explore and filter models in the model catalog
- Compare models using benchmark metrics for quality, safety, cost, and performance
- Deploy a model to an endpoint and test it in the playground
- Evaluate model performance using manual and automated approaches
- Understand different evaluation metrics and when to use them
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
The model catalog in Microsoft Foundry portal serves as your central hub for discovering and comparing AI models. With over 1,900 models available from various providers, you need effective ways to filter and find models that match your specific requirements.

## Access the model catalog

You access the model catalog from the Microsoft Foundry portal at [ai.azure.com](https://ai.azure.com). After signing in and selecting your project, choose **Model catalog** from the left navigation pane. The catalog displays model cards showing key information about each model, including the provider, capabilities, and deployment options.

:::image type="content" source="../media/model-catalog.png" alt-text="Screenshot of the model catalog in Microsoft Foundry portal.":::

## Filter models by key attributes

The model catalog provides several filters to help you narrow your search:

**Collection** filters let you browse models by provider, such as Azure OpenAI, Meta, Mistral, Cohere, or Hugging Face. This helps when you have preferences or requirements for specific model families.

**Industry** filters show models trained on industry-specific datasets. These specialized models often outperform general-purpose models in their respective domains.

**Capabilities** filters highlight unique model features. You can filter for reasoning capabilities (complex problem-solving), tool calling (API and function integration), or multimodal processing (text, images, audio).

**Deployment options** filters help you find models that support your preferred deployment type:

- Serverless API for pay-per-call flexibility
- Provisioned deployment for consistent, high-volume workloads
- Managed compute for virtual machine-based hosting
- Batch processing for cost-optimized, non-latency-sensitive jobs

**Inference tasks** and **Fine-tune tasks** filters let you find models suited for specific activities like text generation, summarization, translation, or entity extraction.

**License** filters help you identify models that align with your organization's licensing requirements and usage policies.

## Understand model types

As you explore the catalog, you encounter different categories of models designed for various use cases.

### Large Language Models and Small Language Models

**Large Language Models (LLMs)** like GPT-4, Mistral Large, and Llama 3 70B are powerful models designed for tasks requiring deep reasoning, complex content generation, and extensive context understanding. These models excel at sophisticated applications but require more computational resources.

**Small Language Models (SLMs)** like Phi-3, Mistral OSS models, and Llama 3 8B offer efficiency and cost-effectiveness while handling common natural language processing tasks. They're ideal for scenarios where speed and cost matter more than handling the most complex reasoning tasks. SLMs can run on lower-end hardware or edge devices.

### Chat completion and reasoning models

Most language models in the catalog are **chat completion** models designed to generate coherent, contextually appropriate text responses. These models power conversational interfaces and content generation applications.

For scenarios requiring higher performance in complex tasks like mathematics, coding, science, strategy, and logistics, **reasoning models** like DeepSeek-R1 provide enhanced problem-solving capabilities. These models can break down complex problems and show their reasoning process.
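
Whether you choose a chat completion model or a reasoning model, your application consumes it the same way once it's deployed to an endpoint. Here's a minimal sketch, assuming an Azure OpenAI deployment consumed through the OpenAI Python SDK (one common pattern); the endpoint, key, API version, and deployment name are placeholders for your own values:

```python
# Minimal chat completion call against a deployed model.
# All connection values below are placeholders; substitute the
# endpoint, key, and deployment name from your own deployment.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-06-01",                                   # assumed API version
)

response = client.chat.completions.create(
    model="<your-deployment-name>",  # the deployment name, not the model family
    messages=[
        {"role": "system", "content": "You are a retail customer support assistant."},
        {"role": "user", "content": "What's your return policy for online orders?"},
    ],
)

print(response.choices[0].message.content)
```

Because only the deployment name changes between candidate models, this pattern makes it easy to compare several models side by side against the same prompts.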

### Multimodal models

Beyond text-only processing, **multimodal models** like GPT-4o and Phi-3-vision can handle multiple data types including images, audio, and text. Use these models when your application needs to analyze visual content, such as document understanding, image description, or chart explanation.
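
As a sketch of what multimodal input looks like in practice, the request below sends both a text question and an image reference to a GPT-4o deployment through the OpenAI Python SDK; the connection values, deployment name, and image URL are placeholders:

```python
# Sketch: asking a multimodal model a question about an image.
# Connection values and names are placeholders, as in the previous sketch.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-gpt-4o-deployment>",  # placeholder deployment name
    messages=[
        {
            "role": "user",
            # A multimodal message mixes typed content parts: text plus an image.
            "content": [
                {"type": "text", "text": "Summarize the trend shown in this chart."},
                {"type": "image_url", "image_url": {"url": "https://example.com/sales-chart.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```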

### Specialized models

The catalog also includes task-specific models:

**Image generation models** like DALL·E 3 create visual content from text descriptions. Use these for generating marketing materials, illustrations, or design mockups.

**Embedding models** like Ada and Cohere convert text into numerical representations. These models enable semantic search, recommendation systems, and Retrieval Augmented Generation (RAG) scenarios where you need to find relevant information based on meaning rather than exact keyword matches.
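
To make the "meaning rather than keywords" point concrete, here's a minimal sketch that embeds two differently worded but related questions and scores their similarity. It assumes an embedding model deployment consumed through the OpenAI Python SDK; the connection values and deployment name are placeholders:

```python
# Sketch: semantic similarity with an embedding model.
# Connection values and the deployment name are placeholders.
import math

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key="<your-api-key>",                                   # placeholder
    api_version="2024-06-01",
)

texts = ["How do I return a product?", "What's your refund policy?"]
result = client.embeddings.create(
    model="<your-embedding-deployment>",  # placeholder deployment name
    input=texts,
)
a, b = (item.embedding for item in result.data)

# Cosine similarity: values near 1.0 mean the sentences are close in
# meaning, even though they share almost no keywords.
dot = sum(x * y for x, y in zip(a, b))
norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
print(f"Cosine similarity: {dot / norm:.3f}")
```

The same embed-and-compare loop underpins semantic search and RAG: documents are embedded ahead of time, and at query time the question's embedding is compared against them to retrieve the most relevant passages.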

### Regional and domain-specific models

Some models are optimized for specific languages, regions, or industries. When you need specialized performance in a particular domain or language, these models often outperform general-purpose alternatives. Examples include models trained on medical literature, legal documents, or specific language corpora.

## Use the search and compare features

Beyond filters, the model catalog offers search functionality to find models by name or keywords. You can open multiple model cards to compare their specifications, benchmarks, and capabilities side by side. This comparison helps you make informed decisions about which model best fits your use case, budget, and performance requirements.

When you identify promising candidates, you can view detailed benchmark results, test models in the playground, or proceed directly to deployment. The structured approach of filtering, comparing, and testing helps ensure you select the right model for your generative AI application.
