The *AI gateway* in Azure API Management is a set of capabilities that help you manage your AI backends effectively. Use these capabilities to secure, scale, monitor, and govern AI models, agents, and tools that back your intelligent apps and workloads.
Use the AI gateway to manage a wide range of AI endpoints, including:
:::image type="content" source="media/genai-gateway-capabilities/capabilities-summary.png" alt-text="Diagram summarizing AI gateway capabilities of Azure API Management.":::
> [!NOTE]
> The AI gateway, including [MCP server capabilities](mcp-server-overview.md), extends API Management's existing [API gateway](api-management-key-concepts.md#api-gateway); it's not a separate offering. Related governance and developer features are in [Azure API Center](../api-center/overview.md).
> [!NOTE]
> New! AI gateway can now be integrated directly into Microsoft Foundry, enabling you to govern AI models, agents, and tools from within your Foundry environment. Learn more in the [AI gateway in Microsoft Foundry](#ai-gateway-in-microsoft-foundry-preview) section.
## Why use an AI gateway?
AI adoption in organizations involves several phases:
* Building AI apps and agents that need access to AI models and services
* Operationalizing and deploying AI apps and backends to production
As AI adoption matures, especially in larger enterprises, the AI gateway helps address key challenges. It helps you:
* Authenticate and authorize access to AI services
* Load balance across multiple AI endpoints
* Monitor and log AI interactions
* Manage token usage and quotas across multiple applications
* Enable self-service for developer teams
## Traffic mediation and control
By using the AI gateway, you can:
* Quickly import and configure OpenAI-compatible or passthrough LLM endpoints as APIs
* Manage models deployed in Microsoft Foundry or providers such as Amazon Bedrock
## Scalability and performance
One of the main resources in generative AI services is *tokens*. Microsoft Foundry and other providers assign quotas for your model deployments as tokens-per-minute (TPM). You distribute these tokens across your model consumers, such as different applications, developer teams, or departments within the company.
If you have a single app connecting to an AI service backend, you can manage token consumption with a TPM limit that you set directly on the model deployment. However, when your application portfolio grows, you might have multiple apps calling single or multiple AI service endpoints. These endpoints can be pay-as-you-go or [Provisioned Throughput Units](/azure/ai-services/openai/concepts/provisioned-throughput) (PTU) instances. You need to make sure that one app doesn't use the whole TPM quota and block other apps from accessing the backends they need.
### Token rate limiting and quotas
Configure a token limit policy on your LLM APIs to manage and enforce limits per API consumer based on the usage of AI service tokens. By using this policy, you can set a TPM limit or a token quota over a specified period, such as hourly, daily, weekly, monthly, or yearly.
:::image type="content" source="media/genai-gateway-capabilities/token-rate-limiting.png" alt-text="Diagram of limiting Azure OpenAI Service tokens in API Management.":::
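
For example, a token limit in the inbound policy section of an LLM API might look like the following. This is a minimal sketch, not a complete policy: the counter key (here, the subscription ID) and all numeric values are illustrative and should reflect how you identify and budget your own API consumers.

```xml
<!-- Illustrative values: limit each subscription to 5,000 tokens per minute
     and a monthly token quota. Estimating prompt tokens lets the gateway
     reject over-limit requests before they reach the backend. -->
<llm-token-limit counter-key="@(context.Subscription.Id)"
    tokens-per-minute="5000"
    token-quota="1000000"
    token-quota-period="Monthly"
    estimate-prompt-tokens="true"
    remaining-tokens-header-name="x-remaining-tokens" />
```

Callers that exceed the limit receive a `429 Too Many Requests` response, and the remaining-tokens header lets well-behaved clients throttle themselves before hitting the limit.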

More information:

* [Deploy an API Management instance in multiple regions](api-management-howto-deploy-multi-region.md)

> [!NOTE]
> While API Management can scale gateway capacity, you also need to scale and distribute traffic to your AI backends to accommodate increased load (see the [Resiliency](#resiliency) section). For example, to take advantage of geographical distribution of your system in a multiregion configuration, deploy backend AI services in the same regions as your API Management gateways.
## Security and safety
An AI gateway secures and controls access to your AI APIs. By using the AI gateway, you can:
* Use managed identities to authenticate to Azure AI services, so you don't need API keys for authentication
* Configure OAuth authorization for AI apps and agents to access APIs or MCP servers by using API Management's credential manager
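
As a sketch of the managed identity approach, the inbound policy fragment below acquires a Microsoft Entra token for the gateway's managed identity and forwards it to an Azure AI backend instead of an API key. The resource URI and header handling are illustrative; the identity must be granted an appropriate role (such as Cognitive Services User) on the target AI resource.

```xml
<!-- Acquire a token for the gateway's managed identity and present it
     as a bearer token to the Azure AI backend, replacing API keys. -->
<authentication-managed-identity resource="https://cognitiveservices.azure.com"
    output-token-variable-name="msi-token" />
<set-header name="Authorization" exists-action="override">
    <value>@("Bearer " + (string)context.Variables["msi-token"])</value>
</set-header>
```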

More information:

* [Authenticate and authorize access to LLM APIs](api-management-authenticate-authorize-ai-apis.md)
* [About API credentials and credential manager](credentials-overview.md)
* [Enforce content safety checks on LLM requests](llm-content-safety-policy.md)
* [Secure access to MCP servers](secure-mcp-servers.md)

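
As an illustration of the content safety checks mentioned earlier, a policy can screen prompts with an Azure AI Content Safety resource before they reach the model. This is a hedged sketch: `content-safety-backend` is an assumed backend entity configured to point at your Content Safety resource, and the category thresholds are illustrative.

```xml
<!-- Block prompts that Azure AI Content Safety flags at or above the
     configured severity thresholds, and screen for prompt injection. -->
<llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
    <categories output-type="EightSeverityLevels">
        <category name="Hate" threshold="4" />
        <category name="Violence" threshold="4" />
    </categories>
</llm-content-safety>
```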
## Resiliency
## Observability and governance
API Management provides comprehensive monitoring and analytics capabilities to track token usage patterns, optimize costs, ensure compliance with your AI governance policies, and troubleshoot problems with your AI APIs. Use these capabilities to:
* Log prompts and completions to Azure Monitor.
* Track token metrics per consumer in Application Insights.
* View the built-in monitoring dashboard.
* Configure policies with custom expressions.
* Manage token quotas across applications.
For example, you can emit token metrics by using the [llm-emit-token-metric](llm-emit-token-metric-policy.md) policy and add custom dimensions you can use to filter the metric in Azure Monitor. The following example emits token metrics with dimensions for client IP address, API ID, and user ID (from a custom header):
```xml
<llm-emit-token-metric namespace="llm-metrics">
    <dimension name="Client IP" value="@(context.Request.IpAddress)" />
    <dimension name="API ID" value="@(context.Api.Id)" />
    <dimension name="User ID" value="@(context.Request.Headers.GetValueOrDefault("x-user-id", "N/A"))" />
</llm-emit-token-metric>
```

:::image type="content" source="media/genai-gateway-capabilities/emit-token-metrics.png" alt-text="Diagram of emitting token metrics using API Management.":::
Also, enable logging for LLM APIs in Azure API Management to track token usage, prompts, and completions for billing and auditing. After you enable logging, you can analyze the logs in Application Insights and use a built-in dashboard in API Management to view token consumption patterns across your AI APIs.
:::image type="content" source="media/api-management-howto-llm-logs/analytics-workbook-small.png" alt-text="Screenshot of analytics for language model APIs in the portal." lightbox="media/api-management-howto-llm-logs/analytics-workbook.png":::

More information:

* [Azure API Management policy toolkit](https://github.com/Azure/azure-api-management-policy-toolkit/)
* [API Center Copilot Studio connector](../api-center/export-to-copilot-studio.yml)

## AI gateway in Microsoft Foundry (preview)
You can now integrate AI gateway directly into Microsoft Foundry, enabling you to govern AI traffic from within your Foundry environment. When you create or associate an AI gateway instance with your Foundry resource, you can govern, secure, and monitor your Foundry resources through the gateway.
**Models**: Configure token quotas and rate limits directly in the Foundry interface for all model deployments, including Azure OpenAI and other providers.
**Agents**: Register agents running anywhere (Azure, other clouds, or on-premises) into the Foundry control plane for centralized inventory and governance. View telemetry in Foundry or Application Insights, and apply policies such as throttling or content safety.
**Tools**: Register MCP tools hosted across any environment for automatic governance and discovery. Tools appear in the Foundry inventory, ready for consumption by agents.
For advanced scenarios such as custom policies, enterprise networking, or federated gateways, access the full Azure API Management experience while maintaining continuity with Foundry-managed resources.
More information:
* [Enable AI gateway in Microsoft Foundry](/azure/ai-foundry/configuration/enable-ai-api-management-gateway-portal)
* [Register custom agents in Foundry](/azure/ai-foundry/control-plane/register-custom-agent)
* [Govern tools with AI gateway](/azure/ai-foundry/agents/how-to/tools/governance)

## Early access to AI gateway features
More information:

* [Configure service update settings for your API Management instances](configure-service-update-settings.md)
* [Blog: AI gateway in Azure API Management is now available in Microsoft Foundry](https://techcommunity.microsoft.com/blog/integrationsonazureblog/ai-gateway-in-azure-api-management-is-now-available-in-microsoft-foundry-preview/4470676)
* [Blog: Introducing AI capabilities in Azure API Management](https://techcommunity.microsoft.com/t5/azure-integration-services-blog/introducing-genai-gateway-capabilities-in-azure-api-management/ba-p/4146525)
* [Blog: Integrating Azure Content Safety with API Management](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/integrating-azure-content-safety-with-api-management-for-azure/ba-p/4202505)
* [Training: Manage your generative AI APIs](/training/modules/api-management)