---
title: Azure API Management policy reference - llm-emit-token-metric
description: Reference for the llm-emit-token-metric policy available for use in Azure API Management. Provides policy usage, settings, and examples.
services: api-management
author: dlepow
ms.service: azure-api-management
ms.topic: reference
ms.date: 04/18/2025
ms.update-cycle: 180-days
ms.author: danlep
ms.collection: ce-skilling-ai-copilot
ms.custom:
---

# Emit metrics for consumption of large language model tokens

[!INCLUDE api-management-availability-all-tiers]

The `llm-emit-token-metric` policy sends custom metrics to Application Insights about consumption of large language model (LLM) tokens through LLM APIs. Token count metrics include Total Tokens, Prompt Tokens, and Completion Tokens.

[!INCLUDE api-management-policy-generic-alert]

[!INCLUDE api-management-llm-models]

## Limits for custom metrics

[!INCLUDE api-management-custom-metrics-limits]

## Prerequisites

## Policy statement

```xml
<llm-emit-token-metric
        namespace="metric namespace" >
        <dimension name="dimension name" value="dimension value" />
        ...additional dimensions...
</llm-emit-token-metric>
```

## Attributes

| Attribute | Description | Required | Default value |
| --------- | ----------- | -------- | ------------- |
| namespace | A string. Namespace of metric. Policy expressions aren't allowed. | No | API Management |

## Elements

| Element | Description | Required |
| ------- | ----------- | -------- |
| dimension | Add one or more of these elements for each dimension included in the metric. | Yes |

### Dimension attributes

| Attribute | Description | Required | Default value |
| --------- | ----------- | -------- | ------------- |
| name | A string or policy expression. Name of the dimension. | Yes | N/A |
| value | A string or policy expression. Value of the dimension. Can only be omitted if `name` matches one of the default dimensions. If so, the value is provided according to the dimension name. | No | N/A |

[!INCLUDE api-management-emit-metric-dimensions-llm]
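
As an illustration of a dimension whose value comes from a policy expression, the following sketch records the caller's subscription ID alongside the token metrics. The namespace `MyLLM` and the dimension name are illustrative choices, not required values; `context.Subscription.Id` is a standard API Management policy expression property.

```xml
<llm-emit-token-metric namespace="MyLLM">
    <!-- Custom dimension populated from a policy expression -->
    <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
</llm-emit-token-metric>
```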

## Usage

### Usage notes

- This policy can be used multiple times per policy definition.
- You can configure at most 5 custom dimensions for this policy.
- Where available, values in the `usage` section of the response from the LLM API are used to determine token metrics.
- Certain LLM endpoints support streaming of responses. When `stream` is set to `true` in the API request to enable streaming, token metrics are estimated.

## Example

The following example sends LLM token count metrics to Application Insights along with API ID as a default dimension.

```xml
<policies>
  <inbound>
      <llm-emit-token-metric namespace="MyLLM">
          <dimension name="API ID" />
      </llm-emit-token-metric>
  </inbound>
  <outbound>
  </outbound>
</policies>
```
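
As a variation on the example above, default dimensions (where `value` is omitted) can be combined with custom dimensions populated from policy expressions. The dimension name `Client IP` is an illustrative choice; `context.Request.IpAddress` is a standard API Management policy expression property.

```xml
<policies>
  <inbound>
      <llm-emit-token-metric namespace="MyLLM">
          <!-- Default dimension: value supplied automatically based on the name -->
          <dimension name="API ID" />
          <!-- Custom dimension: value supplied by a policy expression -->
          <dimension name="Client IP" value="@(context.Request.IpAddress)" />
      </llm-emit-token-metric>
  </inbound>
  <outbound>
  </outbound>
</policies>
```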

## Related policies

[!INCLUDE api-management-policy-ref-next-steps]