
Commit 24a1e56

Merge pull request #307598 from MicrosoftDocs/main
Auto Publish – main to live - 2025-10-30 22:00 UTC
2 parents 070f50f + e73074f commit 24a1e56

172 files changed

Lines changed: 1166 additions & 487 deletions


articles/api-management/api-management-howto-cache-external.md

Lines changed: 9 additions & 6 deletions
@@ -6,7 +6,7 @@ author: dlepow
 ms.service: azure-api-management
 ms.topic: how-to
-ms.date: 09/11/2025
+ms.date: 10/27/2025
 ms.author: danlep
 ms.custom: sfi-image-nochange

@@ -44,10 +44,10 @@ To complete this tutorial, you need to:
 
 + [Create an Azure API Management instance](get-started-create-service-instance.md)
 + Understand [caching in Azure API Management](api-management-howto-cache.md)
-+ Have an [Azure Managed Redis](../redis/quickstart-create-managed-redis.md), [Azure Cache for Redis](../azure-cache-for-redis/quickstart-create-redis.md), or another Redis-compatible cache available.
++ Have an [Azure Managed Redis](../redis/quickstart-create-managed-redis.md) or another Redis-compatible cache available.
 
 > [!IMPORTANT]
-> Azure API Management uses a Redis connection string to connect to the cache. If you use Azure Cache for Redis or Azure Managed Redis, enable access key authentication in your cache to use a connection string. Currently, you can't use Microsoft Entra authentication to connect Azure API Management to Azure Cache for Redis or Azure Managed Redis.
+> Azure API Management uses a Redis connection string to connect to the cache. If you use Azure Managed Redis, enable access key authentication in your cache to use a connection string. Currently, you can't use Microsoft Entra authentication to connect Azure API Management to Azure Managed Redis.
 
 ### Redis cache for Kubernetes

@@ -57,7 +57,7 @@ For an API Management self-hosted gateway, caching requires an external cache. F
 
 Follow the steps below to add an external Redis-compatible cache in Azure API Management. You can limit the cache to a specific gateway in your API Management instance.
 
-![Screenshot that shows how to add an external Azure Cache for Redis in Azure API Management.](media/api-management-howto-cache-external/add-external-cache.png)
+![Screenshot that shows how to add an external Azure Managed Redis cache in Azure API Management.](media/api-management-howto-cache-external/add-external-cache.png)
 
 ### Use from setting

@@ -76,7 +76,7 @@ The **Use from** setting in the configuration specifies the location of your API
 > [!NOTE]
 > You can configure the same external cache for more than one API Management instance. The API Management instances can be in the same or different regions. When sharing the cache for more than one instance, you must select **Default** in the **Use from** setting.
 
-### Add an Azure Cache for Redis or Azure Managed Redis instance from the same subscription
+### Add an Azure Managed Redis instance from the same subscription
 
 1. Browse to your API Management instance in the Azure portal.
 1. In the left menu, under **Deployment + infrastructure** select **External cache**.
@@ -85,14 +85,17 @@ The **Use from** setting in the configuration specifies the location of your API
 1. In the [**Use from**](#use-from-setting) dropdown, select **Default** or specify the desired region. The **Connection string** is automatically populated.
 1. Select **Save**.
 
+> [!NOTE]
+> The default connection string is in the form `<cache-name>:10000,<cache-access-key>,ssl=True,abortConnect=False`. API Management stores the string as a secret named value. If you need to view or edit the string to rotate the access key or troubleshoot connection issues, go to the **Named values** blade.
+
 ### Add a Redis-compatible cache hosted outside of the current Azure subscription or Azure in general
 
 1. Browse to your API Management instance in the Azure portal.
 1. In the left menu, under **Deployment + infrastructure** select **External cache**.
 1. Select **+ Add**.
 1. In the **Cache instance** dropdown, select **Custom**.
 1. In the [**Use from**](#use-from-setting) dropdown, select **Default** or specify the desired region.
-1. Enter your Azure Cache for Redis, Azure Managed Redis, or Redis-compatible cache connection string in the **Connection string** field.
+1. Enter your Azure Managed Redis or Redis-compatible cache connection string in the **Connection string** field.
 1. Select **Save**.
 
 ### Add a Redis cache to a self-hosted gateway
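The connection string format mentioned in the note above follows the StackExchange.Redis convention: an `<host>:<port>` endpoint, a bare access key, and then `key=value` options. As an illustrative sketch (not part of the article, and using placeholder values rather than a real cache or key), the pieces can be pulled apart like this:

```python
def parse_redis_connection_string(conn_str: str) -> dict:
    """Split a StackExchange.Redis-style connection string
    (<host>:<port>,<access-key>,ssl=True,abortConnect=False)
    into its endpoint, access key, and options."""
    endpoint, *rest = conn_str.split(",")
    host, _, port = endpoint.partition(":")
    result = {"host": host, "port": int(port), "access_key": None, "options": {}}
    for token in rest:
        if "=" in token:
            key, _, value = token.partition("=")
            result["options"][key] = value
        else:
            # A bare token is the cache access key.
            result["access_key"] = token
    return result

# Placeholder values only; not a real cache name or key.
parsed = parse_redis_connection_string(
    "contoso-cache.eastus.redis.azure.net:10000,fakeAccessKey123,ssl=True,abortConnect=False"
)
```

A sketch like this can be handy when troubleshooting a connection string retrieved from the **Named values** blade, for example to confirm the port and `ssl` option before testing connectivity.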

articles/api-management/api-management-howto-cache.md

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ With the caching policies shown in this example, the first request to a test ope
 1. Select **Save**.
 
 > [!TIP]
-> If you're using an external cache, as described in [Use an external Azure Cache for Redis in Azure API Management](api-management-howto-cache-external.md), you might want to specify the `caching-type` attribute of the caching policies. See [API Management caching policies](api-management-policies.md#caching) for more information.
+> If you're using an external cache, as described in [Use an external Redis-compatible cache in Azure API Management](api-management-howto-cache-external.md), you might want to specify the `caching-type` attribute of the caching policies. See [API Management caching policies](api-management-policies.md#caching) for more information.
 
 ## Call an operation to test the caching

articles/api-management/api-management-howto-entra-external-id.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ Create an app registration in your Microsoft Entra ID tenant. The app registrati
    * In the **Supported account types** section, select **Accounts in this organizational directory only**.
    * In **Redirect URI**, select **Single-page application (SPA)** and enter the following URL: `https://{your-api-management-service-name}.developer.azure-api.net/signin`, where `{your-api-management-service-name}` is the name of your API Management instance.
    * Select **Register** to create the application.
-1. On the app **Overview** page, find the **Application (client) ID** and **Directory (tenant) ID** and copy theses values to a safe location. You need them later.
+1. On the app **Overview** page, find the **Application (client) ID** and **Directory (tenant) ID** and copy these values to a safe location. You need them later.
 1. In the sidebar menu, under **Manage**, select **Certificates & secrets**.
 1. From the **Certificates & secrets** page, on the **Client secrets** tab, select **+ New client secret**.
    * Enter a **Description**.

articles/api-management/azure-openai-enable-semantic-caching.md

Lines changed: 29 additions & 27 deletions
@@ -1,43 +1,43 @@
 ---
-title: Enable semantic caching for Azure OpenAI APIs in Azure API Management
-description: Prerequisites and configuration steps to enable semantic caching for Azure OpenAI APIs in Azure API Management.
+title: Enable Semantic Caching for LLM APIs in Azure API Management
+description: Prerequisites and configuration steps to enable semantic caching for Azure OpenAI and other LLM APIs in Azure API Management.
 author: dlepow
 ms.service: azure-api-management
 ms.custom:
   - build-2024
 ms.topic: how-to
-ms.date: 01/13/2025
+ms.date: 10/28/2025
 ms.update-cycle: 180-days
 ms.author: danlep
 ms.collection: ce-skilling-ai-copilot
 ---
 
-# Enable semantic caching for Azure OpenAI APIs in Azure API Management
+# Enable semantic caching for LLM APIs in Azure API Management
 
 [!INCLUDE [api-management-availability-all-tiers](../../includes/api-management-availability-all-tiers.md)]
 
-Enable semantic caching of responses to Azure OpenAI API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't the same. For background, see [Tutorial: Use Azure Cache for Redis as a semantic cache](../redis/tutorial-semantic-cache.md).
+Enable semantic caching of responses to LLM API requests to reduce bandwidth and processing requirements imposed on the backend APIs and lower latency perceived by API consumers. With semantic caching, you can return cached responses for identical prompts and also for prompts that are similar in meaning, even if the text isn't identical. For background, see [Tutorial: Use Azure Managed Redis as a semantic cache](../redis/tutorial-semantic-cache.md).
 
 > [!NOTE]
-> The configuration steps in this article enable semantic caching for Azure OpenAI APIs. These steps can be generalized to enable semantic caching for corresponding large language model (LLM) APIs available through the [Azure AI Model Inference API](/rest/api/aifoundry/modelinference/) or with OpenAI-compatible models served through third-party inference providers.
+> The configuration steps in this article show how to enable semantic caching for APIs added to API Management from Azure OpenAI in Azure AI Foundry models. You can apply similar steps to enable semantic caching for corresponding large language model (LLM) APIs available through the [Azure AI Model Inference API](/rest/api/aifoundry/modelinference/) or with OpenAI-compatible models served through third-party inference providers.
 
 ## Prerequisites
 
-* One or more Azure OpenAI in Foundry Models APIs must be added to your API Management instance. For more information, see [Add an Azure OpenAI API to Azure API Management](azure-openai-api-from-specification.md).
-* The Azure OpenAI instance must have deployments for the following:
+* Add one or more Azure OpenAI in Azure AI Foundry model deployments as APIs to your API Management instance. For more information, see [Add an Azure OpenAI API to Azure API Management](azure-openai-api-from-specification.md).
+* Create deployments for the following APIs:
 
   * Chat Completion API - Deployment used for API consumer calls
   * Embeddings API - Deployment used for semantic caching
-* The API Management instance must be configured to use managed identity authentication to the Azure OpenAI APIs. For more information, see [Authenticate and authorize access to Azure OpenAI APIs using Azure API Management](api-management-authenticate-authorize-azure-openai.md#authenticate-with-managed-identity).
-* An [Azure Managed Redis](../redis/quickstart-create-managed-redis.md) instance. The **RediSearch** module must be enabled on the Redis cache.
+* Configure the API Management instance to use managed identity authentication to the Azure OpenAI APIs. For more information, see [Authenticate and authorize access to Azure OpenAI APIs using Azure API Management](api-management-authenticate-authorize-azure-openai.md#authenticate-with-managed-identity).
+* An [Azure Managed Redis](../redis/quickstart-create-managed-redis.md) instance with the **RediSearch** module enabled on the Redis cache.
   > [!NOTE]
-  > You can only enable the **RediSearch** module when creating a new Azure Redis Enterprise or Azure Managed Redis cache. You can't add a module to an existing cache. [Learn more](../redis/redis-modules.md)
-* External cache configured in the Azure API Management instance. For steps, see [Use an external Redis-compatible cache in Azure API Management](api-management-howto-cache-external.md).
+  > You can only enable the **RediSearch** module when creating a new Azure Managed Redis cache. You can't add a module to an existing cache. [Learn more](../redis/redis-modules.md)
+* Configure the Azure Managed Redis instance as an external cache in the Azure API Management instance. For steps, see [Use an external Redis-compatible cache in Azure API Management](api-management-howto-cache-external.md).
 
 
 ## Test Chat API deployment
 
-First, test the Azure OpenAI deployment to ensure that the Chat Completion API or Chat API is working as expected. For steps, see [Import an Azure OpenAI API to Azure API Management](azure-openai-api-from-specification.md#test-the-azure-openai-api).
+First, test the Azure OpenAI deployment to make sure the Chat Completion API or Chat API works as expected. For steps, see [Import an Azure OpenAI API to Azure API Management](azure-openai-api-from-specification.md#test-the-azure-openai-api).
 
 For example, test the Azure OpenAI Chat API by sending a POST request to the API endpoint with a prompt in the request body. The response should include the completion of the prompt. Example request:

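To make the test request described in the hunk above concrete, here's a minimal Python sketch of assembling such a POST call. The gateway URL, API path, deployment name, API version, and subscription key are all placeholders, not values from the article:

```python
import json

# Placeholder values: substitute your own API Management gateway URL,
# API path suffix, deployment name, and api-version.
GATEWAY_URL = "https://contoso-apim.azure-api.net/my-openai-api"
DEPLOYMENT = "chat-deployment"
API_VERSION = "2024-10-21"

def build_chat_request(prompt: str):
    """Build the URL, headers, and JSON body for a Chat Completions test call."""
    url = (f"{GATEWAY_URL}/deployments/{DEPLOYMENT}/chat/completions"
           f"?api-version={API_VERSION}")
    headers = {
        "Content-Type": "application/json",
        # Placeholder; API Management subscription key header.
        "Ocp-Apim-Subscription-Key": "<your-subscription-key>",
    }
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    return url, headers, body

url, headers, body = build_chat_request("Hello")
# Send with an HTTP client of your choice, for example:
# requests.post(url, headers=headers, data=body)
```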
@@ -55,25 +55,25 @@ When the request succeeds, the response includes a completion for the chat messa
 
 ## Create a backend for embeddings API
 
-Configure a [backend](backends.md) resource for the embeddings API deployment with the following settings:
+Create a [backend](backends.md) resource for the embeddings API deployment with the following settings:
 
-* **Name** - A name of your choice, such as `embeddings-backend`. You use this name to reference the backend in policies.
+* **Name** - A name of your choice, such as *embeddings-backend*. You use this name to reference the backend in policies.
 * **Type** - Select **Custom URL**.
-* **Runtime URL** - The URL of the embeddings API deployment in Azure OpenAI, similar to: `https://my-aoai.openai.azure.com/openai/deployments/embeddings-deployment/embeddings`
+* **Runtime URL** - The URL of the embeddings API deployment in Azure OpenAI, similar to: `https://my-aoai.openai.azure.com/openai/deployments/embeddings-deployment/embeddings` (without query parameters).
 
 * **Authorization credentials** - Go to **Managed Identity** tab.
-  * **Client identity** - Select *System assigned identity* or type in a User assigned managed identity client ID.
+  * **Client identity** - Select *System assigned identity* or enter a user-assigned managed identity client ID.
   * **Resource ID** - Enter `https://cognitiveservices.azure.com/` for Azure OpenAI.
 
-### Test backend
+### Test embeddings backend
 
-To test the backend, create an API operation for your Azure OpenAI API:
+To test the embeddings backend, create an API operation for your Azure OpenAI API:
 
 1. On the **Design** tab of your API, select **+ Add operation**.
-1. Enter a **Display name** and optionally a **Name** for the operation.
+1. Enter a **Display name** such as *Embeddings* and optionally a **Name** for the operation.
 1. In the **Frontend** section, in **URL**, select **POST** and enter the path `/`.
 1. On the **Headers** tab, add a required header with the name `Content-Type` and value `application/json`.
-1. Select **Save**
+1. Select **Save**.
 
 Configure the following policies in the **Inbound processing** section of the API operation. In the [set-backend-service](set-backend-service-policy.md) policy, substitute the name of the backend you created.
@@ -94,7 +94,7 @@ On the **Test** tab, test the operation by adding an `api-version` query paramet
 {"input":"Hello"}
 ```
 
-If the request is successful, the response includes a vector representation of the input text:
+If the request is successful, the response includes a vector representation of the input text. Example response:
 
 ```json
 {
@@ -125,8 +125,8 @@ To enable semantic caching for Azure OpenAI APIs in Azure API Management, apply
 
 ```xml
 <azure-openai-semantic-cache-lookup
-    score-threshold="0.8"
-    embeddings-backend-id="embeddings-deployment"
+    score-threshold="0.15"
+    embeddings-backend-id="embeddings-backend"
     embeddings-backend-auth="system-assigned"
     ignore-system-messages="true"
     max-message-count="10">
@@ -151,14 +151,16 @@ To enable semantic caching for Azure OpenAI APIs in Azure API Management, apply
 
 ## Confirm caching
 
-To confirm that semantic caching is working as expected, trace a test Completion or Chat Completion operation using the test console in the portal. Confirm that the cache was used on subsequent tries by inspecting the trace. [Learn more about tracing API calls in Azure API Management](api-management-howto-api-inspector.md).
+To confirm that semantic caching works as expected, trace a test Completion or Chat Completion operation by using the test console in the portal. Confirm that the cache is used on subsequent tries by inspecting the trace. [Learn more about tracing API calls in Azure API Management](api-management-howto-api-inspector.md).
 
-For example, if the cache was used, the **Output** section includes entries similar to ones in the following screenshot:
+Adjust the `score-threshold` attribute in the lookup policy to control how closely an incoming prompt must match a cached prompt to return its stored response. A lower score threshold means that prompts must have higher semantic similarity to return cached responses. Prompts with scores above the threshold don't use the cached response.
+
+For example, if the cache is used, the **Output** section includes entries similar to the following screenshot:
 
 :::image type="content" source="media/azure-openai-enable-semantic-caching/cache-lookup.png" alt-text="Screenshot of request trace in the Azure portal.":::
 
 ## Related content
 
 * [Caching policies](api-management-policies.md#caching)
-* [Azure Cache for Redis](../azure-cache-for-redis/cache-overview.md)
+* [Azure Managed Redis](../redis/overview.md)
 * [AI gateway capabilities](genai-gateway-capabilities.md) in Azure API Management
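The `score-threshold` behavior described in this file's changes can be pictured with a small sketch. The exact scoring function used by the policy isn't spelled out in this diff, so the snippet below assumes a distance-style score where 0 means identical prompt embeddings and lower is more similar; it's illustrative only:

```python
import math

def embedding_distance(a, b):
    """A distance-style score (1 - cosine similarity): 0.0 for embeddings
    pointing the same direction, larger for less similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def is_cache_hit(prompt_vec, cached_vec, score_threshold=0.15):
    """Serve the cached response only when the new prompt's embedding
    falls within score_threshold of the cached prompt's embedding."""
    return embedding_distance(prompt_vec, cached_vec) <= score_threshold

# An identical prompt embedding scores 0.0 and hits the cache;
# an unrelated (orthogonal) one scores 1.0 and misses at threshold 0.15.
```

Under this assumption, lowering `score-threshold` (the commit changes the example from 0.8 to 0.15) tightens the match: fewer prompts qualify as "similar enough," reducing the risk of returning a cached answer for a semantically different question.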
