Commit cfab675

DocuMentor: Changes for danlep-patch-637793
1 parent b9d7bcd commit cfab675

1 file changed

Lines changed: 18 additions & 16 deletions

articles/api-management/llm-content-safety-policy.md

@@ -8,7 +8,7 @@ ms.service: azure-api-management
88
ms.collection: ce-skilling-ai-copilot
99
ms.custom:
1010
ms.topic: reference
11-
ms.date: 09/03/2025
11+
ms.date: 03/16/2026
1212
ms.update-cycle: 180-days
1313
ms.author: danlep
1414
---
@@ -17,16 +17,16 @@ ms.author: danlep
1717

1818
[!INCLUDE [api-management-availability-premium-dev-standard-basic-premiumv2-standardv2-basicv2](../../includes/api-management-availability-premium-dev-standard-basic-premiumv2-standardv2-basicv2.md)]
1919

20-
The `llm-content-safety` policy enforces content safety checks on large language model (LLM) requests (prompts) by transmitting them to the [Azure AI Content Safety](/azure/ai-services/content-safety/overview) service before sending to the backend LLM API. When the policy is enabled, and Azure AI Content Safety detects malicious content, API Management blocks the request and returns a `403` error code.
20+
The `llm-content-safety` policy enforces content safety checks on large language model (LLM) requests (prompts) or responses (completions) by sending them to the [Azure AI Content Safety](/azure/ai-services/content-safety/overview) service. When you enable the policy and Azure AI Content Safety detects malicious content, API Management blocks the request or response and returns a `403` error code.
2121

2222
> [!NOTE]
23-
> The terms _category_ and _categories_ used in API Management are synonymous with _harm category_ and _harm categories_ in the Azure AI Content Safety service. Details can be found on the [Harm categories in Azure AI Content Safety](/azure/ai-services/content-safety/concepts/harm-categories) page.
23+
> The terms _category_ and _categories_ used in API Management are synonymous with _harm category_ and _harm categories_ in the Azure AI Content Safety service. For more information, see [Harm categories in Azure AI Content Safety](/azure/ai-services/content-safety/concepts/harm-categories).
2424
2525
Use the policy in scenarios such as the following:
2626

27-
* Block requests that contain predefined categories of harmful content or hate speech
28-
* Apply custom blocklists to prevent specific content from being sent
29-
* Shield against prompts that match attack patterns
27+
* Block requests or responses that contain predefined categories of harmful content or hate speech.
28+
* Apply custom blocklists to prevent specific content from being sent or received.
29+
* Shield against prompts that match attack patterns.
3030

3131
[!INCLUDE [api-management-policy-generic-alert](../../includes/api-management-policy-generic-alert.md)]
3232

@@ -41,7 +41,7 @@ Use the policy in scenarios such as the following:
4141
## Policy statement
4242

4343
```xml
44-
<llm-content-safety backend-id="name of backend entity" shield-prompt="true | false" enforce-on-completions="true | false">
44+
<llm-content-safety backend-id="name of backend entity" shield-prompt="true | false" enforce-on-completions="true | false" window-size="integer" window-overlap-size="integer">
4545
<categories output-type="FourSeverityLevels | EightSeverityLevels">
4646
<category name="Hate | SelfHarm | Sexual | Violence" threshold="integer" />
4747
<!-- If there are multiple categories, add more category elements -->
@@ -60,8 +60,10 @@ Use the policy in scenarios such as the following:
6060
| Attribute | Description | Required | Default |
6161
| -------------- | ----------------------------------------------------------------------------------------------------- | -------- | ------- |
6262
| backend-id | Identifier (name) of the Azure AI Content Safety backend to route content-safety API calls to. Policy expressions are allowed. | Yes | N/A |
63-
| shield-prompt | If set to `true`, content is checked for user attacks. Otherwise, skip this check. Policy expressions are allowed. | No | `false` |
64-
| enforce-on-completions| If set to `true`, content safety checks are enforced on chat completions for response validation. Otherwise, skip this check. Policy expressions are allowed. | No | `false` |
63+
| shield-prompt | If set to `true`, check content for user attacks. Otherwise, skip this check. Policy expressions are allowed. | No | `false` |
64+
| enforce-on-completions| If set to `true`, enforce content safety checks on chat completions for response validation. Otherwise, skip this check. When you set the policy in the outbound section, this attribute is ignored. Policy expressions are allowed. | No | `false` |
65+
| window-size | The size of text windows in characters that the policy sends to Azure AI Content Safety for evaluation. If you don't specify a value, the entire content is sent as one window. Policy expressions are allowed. | No | N/A |
66+
| window-overlap-size | The size of overlaps in characters between text windows when the content is split by using the `window-size` attribute. If you don't specify a value, windows don't overlap. Policy expressions are allowed. | No | N/A |
6567

6668

6769
## Elements
@@ -83,24 +85,24 @@ Use the policy in scenarios such as the following:
8385
| Attribute | Description | Required | Default |
8486
| -------------- | ----------------------------------------------------------------------------------------------------- | -------- | ------- |
8587
| name | Specifies the name of this category. The attribute must have one of the following values: `Hate`, `SelfHarm`, `Sexual`, `Violence`. Policy expressions are allowed. | Yes | N/A |
86-
| threshold | Specifies the threshold value for this category at which request are blocked. Requests with content severities less than the threshold aren't blocked. The value must be between 0 (most restrictive) and 7 (least restrictive). Policy expressions are allowed. | Yes | N/A |
88+
| threshold | Specifies the threshold value for this category at which requests or responses are blocked. Requests with content severities less than the threshold aren't blocked. The value must be between 0 (most restrictive) and 7 (least restrictive). Policy expressions are allowed. | Yes | N/A |
8789

8890

8991
## Usage
9092

91-
- [**Policy sections:**](./api-management-howto-policies.md#understanding-policy-configuration) inbound
93+
- [**Policy sections:**](./api-management-howto-policies.md#understanding-policy-configuration) inbound, outbound
9294
- [**Policy scopes:**](./api-management-howto-policies.md#scopes) global, workspace, product, API
9395
- [**Gateways:**](api-management-gateways-overview.md) classic, v2, consumption, self-hosted, workspace
9496

9597
### Usage notes
9698

97-
* The policy runs on a concatenation of all text content in a completion or chat completion request.
98-
* If the request exceeds the character limit of Azure AI Content Safety, a `403` error is returned.
99-
* This policy can be used multiple times per policy definition.
99+
* Unless you specify `window-size`, the policy runs on a concatenation of all text content in a completion or chat completion request or response. If you specify `window-size`, the policy runs on windows of text content with the specified size and overlaps.
100+
* If the request or response exceeds the character limit of Azure AI Content Safety, the policy returns a `403` error.
101+
* You can use this policy multiple times per policy definition.
100102

101103
## Example
102104

103-
The following example enforces content safety checks on LLM requests using the Azure AI Content Safety service. The policy blocks requests that contain speech in the `Hate` or `Violence` category with a severity level of 4 or higher. In other words, the filter allows levels 0-3 to continue whereas levels 4-7 are blocked. Raising a category's threshold raises the tolerance and potentially decreases the number of blocked requests. Lowering the threshold lowers the tolerance and potentially increases the number of blocked requests. The `shield-prompt` attribute is set to `true` to check for adversarial attacks.
105+
The following example, when configured in the inbound section, enforces content safety checks on LLM requests by using the Azure AI Content Safety service. The policy blocks requests that contain speech in the `Hate` or `Violence` category with a severity level of 4 or higher. In other words, the filter allows levels 0-3 to continue whereas levels 4-7 are blocked. Raising a category's threshold raises the tolerance and potentially decreases the number of blocked requests. Lowering the threshold lowers the tolerance and potentially increases the number of blocked requests. The `shield-prompt` attribute is set to `true` to check for adversarial attacks.
104106

105107
```xml
106108
<policies>
@@ -117,7 +119,7 @@ The following example enforces content safety checks on LLM requests using the A
117119

118120
## Related policies
119121

120-
* [Content validation](api-management-policies.md#content-validation)
122+
* [Content validation](api-management-policies.md#content-validation) policies
121123
* [llm-token-limit](llm-token-limit-policy.md) policy
122124
* [llm-emit-token-metric](llm-emit-token-metric-policy.md) policy
123125
