|
1 | | -The model catalog in Microsoft Foundry portal serves as your central hub for discovering and comparing AI models. With over 1,900 models available from various providers, you need effective ways to filter and find models that match your specific requirements. |
| 1 | +The Foundry Models catalog serves as your central hub for discovering and comparing AI models. With over 1,900 models available from various providers, you need effective ways to filter and find models that match your specific requirements. |
2 | 2 |
|
3 | | -## Access the model catalog |
| 3 | +The model catalog includes two broad categories of models:
4 | 4 |
|
5 | | -You access the model catalog from the Microsoft Foundry portal at [ai.azure.com](https://ai.azure.com). After signing in and selecting your project, choose **Discover** from the top navigation. The catalog displays model cards showing key information about each model, including the provider, capabilities, and deployment options. |
| 5 | +- **Foundry Models sold directly by Azure** |
6 | 6 |
|
7 | | -:::image type="content" source="../media/model-catalog.png" alt-text="Screenshot of the model catalog in Microsoft Foundry portal."::: |
8 | | - |
9 | | -## Filter models by key attributes |
| 7 | + These models are billed directly through your Azure subscription, and include Azure OpenAI models as well as models from Microsoft and other providers. |
10 | 8 |
|
11 | | -The model catalog provides several filters to help you narrow your search: |
| 9 | +- **Foundry Models from partners and community** |
12 | 10 |
|
13 | | -**Collection** filters let you browse models by provider, such as Azure OpenAI, Meta, Mistral, Cohere, or Hugging Face. This helps when you have preferences or requirements for specific model families. |
| 11 | + These models are provided by trusted partners and the community, each with its own licensing and pricing.
14 | 12 |
|
15 | | -**Industry** filters show models trained on industry-specific datasets. These specialized models often outperform general-purpose models in their respective domains. |
| 13 | +## Find models in the model catalog
16 | 14 |
|
17 | | -**Capabilities** filters highlight unique model features. You can filter for reasoning capabilities (complex problem-solving), tool calling (API and function integration), or multimodal processing (text, images, audio). |
| 15 | +The model catalog user interface in the Foundry portal provides an easy way to search for the right model for your needs. Each model has a *model card* showing its key information, including the provider, capabilities, benchmark metrics, responsible AI considerations, and deployment options.
18 | 16 |
|
19 | | -**Inference tasks** and **Fine-tune tasks** filters let you find models suited for specific activities like text generation, summarization, translation, or entity extraction. |
| 17 | +:::image type="content" source="../media/model-catalog.png" alt-text="Screenshot of the model catalog in Microsoft Foundry portal."::: |
20 | 18 |
|
21 | | -## Understand model types |
| 19 | +You can search for models by keyword, and you can filter based on the following attributes: |
22 | 20 |
|
23 | | -As you explore the catalog, you encounter different categories of models designed for various use cases. |
| 21 | +- **Collection**: Models are organized into collections, such as models that are provided directly in Azure, or models in the Hugging Face repository. |
| 22 | +- **Capabilities**: Specific model abilities, including *reasoning* (complex problem-solving), *tool calling* (API and function integration), or *multimodal processing* (text, images, audio). |
| 23 | +- **Source**: The model provider, including Azure OpenAI, Microsoft, Cohere, Mistral, Meta, Anthropic, and others. |
| 24 | +- **Inference tasks**: Specific tasks like text generation, summarization, translation, image generation, speech synthesis, or other common AI tasks.
| 25 | +- **Fine-tuning methods**: Supported techniques for fine-tuning a model. |
| 26 | +- **Industry**: Models trained on industry-specific datasets. These specialized models often outperform general-purpose models in their respective domains. |
24 | 27 |
|
25 | | -### Large Language Models and Small Language Models |
| 28 | +## Understand generative AI model types |
26 | 29 |
|
27 | | -**Large Language Models (LLMs)** like GPT-4, Mistral Large, and Llama 3 70B are powerful models designed for tasks requiring deep reasoning, complex content generation, and extensive context understanding. These models excel at sophisticated applications but require more computational resources. |
| 30 | +As you explore the catalog, you encounter different categories of models designed for various use cases. In broad terms, you can categorize language models as: |
28 | 31 |
|
29 | | -**Small Language Models (SLMs)** like Phi-3, Mistral OSS models, and Llama 3 8B offer efficiency and cost-effectiveness while handling common natural language processing tasks. They're ideal for scenarios where speed and cost matter more than handling the most complex reasoning tasks. SLMs can run on lower-end hardware or edge devices. |
| 32 | +- **Large Language Models (LLMs)** like GPT-5, Mistral Large, and Llama 3 70B that are designed for tasks requiring deep reasoning, complex content generation, and extensive context understanding. These models excel at sophisticated applications but require more computational resources. |
| 33 | +- **Small Language Models (SLMs)** like Phi-4, Mistral OSS models, and Llama 3 8B that offer efficiency and cost-effectiveness while handling common natural language processing tasks. They're ideal for scenarios where speed and cost matter more than handling the most complex reasoning tasks. SLMs can run on lower-end hardware or edge devices. |
30 | 34 |
|
31 | 35 | ### Chat completion and reasoning models |
32 | 36 |
|
33 | 37 | Most language models in the catalog are **chat completion** models designed to generate coherent, contextually appropriate text responses. These models power conversational interfaces and content generation applications. |
34 | 38 |
|
35 | 39 | For scenarios requiring higher performance in complex tasks like mathematics, coding, science, strategy, and logistics, **reasoning models** like Claude Opus 4.6 provide enhanced problem-solving capabilities. These models can break down complex problems and show their reasoning process. |
36 | 40 |
|
37 | | -### Multimodal models |
38 | | - |
39 | | -Beyond text-only processing, **multimodal models** like GPT-4o and Phi-3-vision can handle multiple data types including images, audio, and text. Use these models when your application needs to analyze visual content, such as document understanding, image description, or chart explanation. |
40 | | - |
41 | 41 | ### Specialized models |
42 | 42 |
|
43 | 43 | The catalog also includes task-specific models: |
44 | 44 |
|
45 | | -**Image generation models** like DALL·E 3 create visual content from text descriptions. Use these for generating marketing materials, illustrations, or design mockups. |
46 | | - |
47 | 45 | **Embedding models** like Ada and Cohere convert text into numerical representations. These models enable semantic search, recommendation systems, and Retrieval Augmented Generation (RAG) scenarios where you need to find relevant information based on meaning rather than exact keyword matches. |
48 | 46 |
|
49 | | -### Regional and domain-specific models |
| 47 | +**Image generation models** like GPT-image-1 create images from text descriptions. Use these for generating marketing materials, illustrations, or design mockups. |
50 | 48 |
|
51 | | -Some models are optimized for specific languages, regions, or industries. When you need specialized performance in a particular domain or language, these models often outperform general-purpose alternatives. Examples include models trained on medical literature, legal documents, or specific language corpora. |
| 49 | +**Video generation models** like Sora 2 create video content from text descriptions. |
| 50 | + |
| 51 | +**Image analysis models** like GPT-4.1 can accept *multimodal* input, including text and images, and generate natural language output based on prompts that include images for analysis.
52 | 52 |
|
53 | | -## Use the search and compare features |
| 53 | +**Text to speech models** like GPT-4o-tts can convert text-based input to synthesized speech. |
54 | 54 |
|
55 | | -Beyond filters, the model catalog offers search functionality to find models by name or keywords. You can open multiple model cards to compare their specifications, benchmarks, and capabilities side by side. This comparison helps you make informed decisions about which model best fits your use case, budget, and performance requirements. |
| 55 | +**Speech to text models** like GPT-4o-transcribe can convert audio data containing speech into text transcriptions. |
56 | 56 |
|
57 | | -When you identify promising candidates, you can view detailed benchmark results, test models in the playground, or proceed directly to deployment. The structured approach of filtering, comparing, and testing helps ensure you select the right model for your generative AI application. |
| 57 | +### Regional and domain-specific models |
| 58 | + |
| 59 | +Some models are optimized for specific languages, regions, or industries. When you need specialized performance in a particular domain or language, these models often outperform general-purpose alternatives. Examples include models trained on medical literature, legal documents, or specific language corpora. |