Commit 88c628f

Merge pull request #313900 from vadim-kovalyov/patch-5
Add observability metrics documentation for Akri
2 parents 3e6d058 + 874796b commit 88c628f

2 files changed

Lines changed: 168 additions & 0 deletions

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
---
title: Metrics for Akri and connectors
description: Available observability metrics for Akri and connectors to monitor the health and performance of your solution.
author: vadim-kovalyov
ms.author: vakavali
ms.topic: reference
ms.date: 03/30/2026

# CustomerIntent: As an IT admin or operator, I want to be able to monitor and visualize data
# on the health of my industrial assets and edge environment.
---

# Metrics for Akri and connectors

Akri and the first-party connectors provide a set of observability metrics that you can use to monitor and analyze the health of your solution. This article lists the available metrics for Akri, the SSE connector, the REST connector, the MQTT connector, and the WASM graph runtime. The following sections group related metrics and list the name, type, description, and dimensions of each metric.

## Akri operator metrics

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_akri_operator_reconciliation_total_count | Counter | The total number of operator reconciliation attempts across Akri-managed resources. | [`result`](#result), [`resource_type`](#resource_type), [`error_type`](#error_type) |
| aio_akri_operator_connector_template_count | Counter | The total number of connector template endpoint type entries handled by the operator. | [`endpoint_type`](#endpoint_type) |
| aio_akri_operator_connector_deployment_instance_count | Counter | The total number of connector deployment instances created for a connector template. | [`template_name`](#template_name) |
| aio_akri_operator_heartbeat | Counter | Emits a periodic heartbeat from the operator service. | [`service`](#service), [`instance`](#instance) |
| aio_akri_operator_active_devices | Gauge | Reports the current number of active Device resources observed by the operator watcher. | |
| aio_akri_operator_active_assets | Gauge | Reports the current number of active Asset resources observed by the operator watcher. | |
| aio_akri_operator_connector_instance_count | Gauge | Reports the current connector instance count for each connector template. | [`connector_template_name`](#connector_template_name) |

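As an illustration of how the `result` and `resource_type` dimensions on `aio_akri_operator_reconciliation_total_count` might be used, the following minimal sketch computes a failure ratio per resource type. It assumes the counter samples have already been exported as `(labels, value)` pairs. The label values shown (`success`, `failure`, `timeout`, `Device`, `Asset`) and the function name are hypothetical, because this article doesn't list the values these dimensions take.

```python
from collections import defaultdict


def failure_ratio_by_resource(samples, success_value="success"):
    """Return {resource_type: failed / total} from counter samples.

    `samples` is an iterable of (labels, value) pairs. `success_value` is an
    assumed value of the `result` dimension, not one documented here.
    """
    total = defaultdict(float)
    failed = defaultdict(float)
    for labels, value in samples:
        resource_type = labels.get("resource_type", "unknown")
        total[resource_type] += value
        if labels.get("result") != success_value:
            failed[resource_type] += value
    # Skip resource types with no recorded attempts to avoid dividing by zero.
    return {r: failed[r] / total[r] for r in total if total[r] > 0}


# Hypothetical samples for illustration only.
samples = [
    ({"result": "success", "resource_type": "Device"}, 90.0),
    ({"result": "failure", "resource_type": "Device", "error_type": "timeout"}, 10.0),
    ({"result": "success", "resource_type": "Asset"}, 50.0),
]
ratios = failure_ratio_by_resource(samples)
```

Because the metric is a counter, the ratio is most meaningful when computed over the increase between two scrapes rather than over the raw cumulative values.
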
## Akri ADR service metrics

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_akri_adr_service_instance_count | Gauge | Reports the configured ADR service instance count from operator heartbeat emission. | [`service`](#service) |
| aio_akri_adr_service_heartbeat | Counter | Emits a periodic heartbeat from each ADR service instance. | [`service`](#service), [`instance`](#instance) |
| aio_akri_adr_service_api_invocation_count | Counter | Counts ADR service API request handling outcomes. | [`api`](#api), [`result`](#result), [`operation`](#operation) |
| aio_akri_adr_service_watcher_event_count | Counter | Counts ADR watcher event processing outcomes for Device and Asset flows. | [`result`](#result), [`target`](#target) |
| aio_akri_adr_service_watcher_event_to_publish_duration_seconds | Histogram | Measures time from watcher event receipt to telemetry publish attempt in seconds. | [`result`](#result), [`target`](#target) |
| aio_akri_api_request_duration_seconds | Histogram | Measures ADR API request handling latency in seconds. | [`api`](#api), [`result`](#result), [`operation`](#operation) |

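Histogram metrics such as `aio_akri_api_request_duration_seconds` are usually read as percentiles rather than raw bucket counts. The following sketch estimates a quantile by linear interpolation within a bucket. It assumes the histogram is available as cumulative `(upper_bound, count)` pairs in the Prometheus style. This article doesn't specify the export format or the bucket boundaries, so the data layout and the sample values are assumptions.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound; the last bound may be math.inf. This layout is an assumption.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The open-ended bucket has no upper edge to interpolate toward.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical bucket data, in seconds.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p50 = estimate_quantile(0.5, buckets)
p95 = estimate_quantile(0.95, buckets)
```

The estimate is only as precise as the bucket boundaries allow, so treat the result as an approximation.
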
## Akri Kubernetes API metrics

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_akri_k8s_api_request_count | Counter | Counts Kubernetes API requests issued by Akri components in both operator and ADR service. | [`operation`](#operation), [`resource_type`](#resource_type) |
| aio_akri_k8s_api_request_duration_seconds | Histogram | Measures Kubernetes API request latency in seconds across operator and ADR service. | [`operation`](#operation), [`resource_type`](#resource_type), [`status`](#status) |
| aio_akri_k8s_api_errors_total_count | Counter | Counts ADR service Kubernetes API errors. | |
| aio_akri_mqtt_connection_errors_total_count | Counter | Counts ADR service MQTT connection and session failures. | [`error_reason`](#error_reason) |

## SSE connector metrics

All SSE connector metrics include the common dimensions [`instance`](#instance) and [`service`](#service).

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_sse_connector_messages_received | Counter | Number of messages received from SSE sources. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_messages_dropped | Counter | Number of messages dropped. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_messages_forwarded | Counter | Number of messages successfully forwarded. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_messages_failed | Counter | Number of messages that failed to process. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_processing_latency | Histogram | Processing latency, in milliseconds, from the moment the message is received until forwarding completes. For QoS 1 MQTT messages, this can include the network round-trip time. For QoS 0 messages, the connector doesn't wait for an ACK from the target. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_errors | Counter | Number of errors that occurred. | [`instance`](#instance), [`service`](#service), [`error`](#error) |
| aio_sse_connector_bytes_in | Counter | Sum of all bytes received in each message payload. Doesn't include headers. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_bytes_out | Counter | Sum of all bytes in the payload of each message forwarded. Doesn't include headers or cloud events. | [`instance`](#instance), [`service`](#service) |
| aio_sse_connector_heartbeat | Counter | Liveness counter incremented every 30 seconds. | [`instance`](#instance), [`service`](#service) |

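The message and byte counters above are cumulative, so derived figures such as a drop ratio or an average payload size are typically computed from the difference between two scrapes. The following sketch shows one way to do that. The snapshot dictionaries, the helper names, and the sample numbers are hypothetical, and the counter-reset handling (treating a decrease as a restart from zero) is an assumption about how the connector behaves after a restart.

```python
def counter_delta(before, after, name):
    """Return the increase of a counter between two snapshots.

    If the value went down, assume the process restarted and the counter
    began again from zero (an assumption, not documented behavior).
    """
    diff = after[name] - before[name]
    return after[name] if diff < 0 else diff


def summarize(before, after, prefix="aio_sse_connector"):
    """Derive ratios from two snapshots of {metric_name: value}."""
    received = counter_delta(before, after, f"{prefix}_messages_received")
    dropped = counter_delta(before, after, f"{prefix}_messages_dropped")
    failed = counter_delta(before, after, f"{prefix}_messages_failed")
    bytes_in = counter_delta(before, after, f"{prefix}_bytes_in")
    if received == 0:
        return {"dropped_ratio": 0.0, "failed_ratio": 0.0, "avg_bytes_in": 0.0}
    return {
        "dropped_ratio": dropped / received,
        "failed_ratio": failed / received,
        "avg_bytes_in": bytes_in / received,
    }


# Hypothetical snapshots taken at two scrape times.
before = {
    "aio_sse_connector_messages_received": 100,
    "aio_sse_connector_messages_dropped": 5,
    "aio_sse_connector_messages_failed": 1,
    "aio_sse_connector_bytes_in": 10_000,
}
after = {
    "aio_sse_connector_messages_received": 300,
    "aio_sse_connector_messages_dropped": 15,
    "aio_sse_connector_messages_failed": 1,
    "aio_sse_connector_bytes_in": 60_000,
}
summary = summarize(before, after)
```
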
## REST connector metrics

All REST connector metrics include the common dimensions [`instance`](#instance) and [`service`](#service).

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_rest_connector_messages_received | Counter | Number of messages received from REST sources. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_messages_forwarded | Counter | Number of messages successfully forwarded. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_messages_failed | Counter | Number of messages that failed to process. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_processing_latency | Histogram | Processing latency, in milliseconds, from the moment the message is received until forwarding completes, as marked by the target's acknowledgment. For QoS 1 MQTT messages, this can include the network round-trip time. For QoS 0 messages, the connector doesn't wait for an ACK from the target. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_errors | Counter | Number of errors that occurred. | [`instance`](#instance), [`service`](#service), [`error`](#error) |
| aio_rest_connector_bytes_in | Counter | Sum of all bytes received in each payload of REST endpoint GET responses. Doesn't include headers. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_bytes_out | Counter | Sum of all bytes in the payload of each message forwarded. Doesn't include headers or cloud events. | [`instance`](#instance), [`service`](#service) |
| aio_rest_connector_heartbeat | Counter | Liveness counter incremented every 30 seconds. | [`instance`](#instance), [`service`](#service) |

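The connector latency histograms are documented in milliseconds, whereas the Akri `*_duration_seconds` histograms are documented in seconds. When both appear on one dashboard, it can help to normalize the units first. The following sketch infers the unit from the metric name suffix. That suffix rule is an assumption drawn from the names listed in this article, not a documented convention, and the function name is hypothetical.

```python
def latency_to_seconds(metric_name: str, value: float) -> float:
    """Convert a latency value to seconds based on the metric name suffix.

    Assumes `*_duration_seconds` metrics are in seconds and
    `*_processing_latency` metrics are in milliseconds.
    """
    if metric_name.endswith("_duration_seconds"):
        return value
    if metric_name.endswith("_processing_latency"):
        return value / 1000.0
    raise ValueError(f"unknown latency unit for {metric_name}")


rest_latency_s = latency_to_seconds("aio_rest_connector_processing_latency", 250.0)
k8s_latency_s = latency_to_seconds("aio_akri_k8s_api_request_duration_seconds", 0.25)
```
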
## MQTT connector metrics

All MQTT connector metrics include the common dimensions [`instance`](#instance) and [`service`](#service).

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_mqtt_connector_heartbeat | Counter | Liveness counter incremented every 30 seconds. | [`instance`](#instance), [`service`](#service) |

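Because the heartbeat counter is incremented every 30 seconds, a connector can be treated as stalled when the counter stops changing for several intervals. The following sketch shows one possible check. The sample layout of `(timestamp_seconds, value)` pairs, the function name, and the tolerance of three missed beats are assumptions chosen for illustration.

```python
HEARTBEAT_INTERVAL_S = 30  # increment period stated in the table above


def is_alive(samples, now, missed_beats=3):
    """Return True if the heartbeat counter changed recently enough.

    `samples` is a list of (timestamp_seconds, counter_value) pairs, oldest
    first. Any change counts as a sign of life, including a drop caused by a
    restart. `missed_beats` is a hypothetical tolerance.
    """
    window_start = now - missed_beats * HEARTBEAT_INTERVAL_S
    last_change = None
    for (_, prev_value), (ts, value) in zip(samples, samples[1:]):
        if value != prev_value:
            last_change = ts
    return last_change is not None and last_change >= window_start


# Hypothetical scrape history: the counter stops increasing after t=60.
history = [(0, 1), (30, 2), (60, 3), (90, 3), (120, 3)]
alive_at_120 = is_alive(history, now=120)
alive_at_180 = is_alive(history, now=180)
```

The same check applies to the SSE and REST heartbeat counters, which are documented with the same 30-second period.
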
## WASM graph runtime metrics

All WASM graph runtime metrics include the common dimensions [`instance`](#instance), [`service`](#service), and [`graph`](#graph).

| Metric | Type | Description | Dimensions |
|--------|------|-------------|------------|
| aio_connector_wasm_graph_processing_latency | Histogram | End-to-end graph processing latency in milliseconds. | [`instance`](#instance), [`service`](#service), [`graph`](#graph), [`wasm_status`](#wasm_status) |
| aio_connector_wasm_graphs_created | Counter | Number of WASM graph creation attempts. | [`instance`](#instance), [`service`](#service), [`graph`](#graph), [`wasm_status`](#wasm_status) |

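The `wasm_status` dimension takes a different set of values for each of the two metrics, as described in the dimension reference. The following sketch encodes those documented values and flags samples whose status doesn't match the metric. The function name and the sample label sets are hypothetical.

```python
# Allowed wasm_status values per metric, taken from the dimension reference.
ALLOWED_WASM_STATUS = {
    "aio_connector_wasm_graph_processing_latency": {"forwarded", "filtered", "failed"},
    "aio_connector_wasm_graphs_created": {"succeeded", "failed"},
}


def has_valid_wasm_status(metric_name, labels):
    """Return True if the sample's wasm_status is documented for the metric."""
    allowed = ALLOWED_WASM_STATUS.get(metric_name)
    if allowed is None:
        raise KeyError(f"{metric_name} is not a WASM graph runtime metric")
    return labels.get("wasm_status") in allowed


# Hypothetical label sets for illustration.
ok = has_valid_wasm_status(
    "aio_connector_wasm_graph_processing_latency",
    {"graph": "example-graph", "wasm_status": "filtered"},
)
mismatch = has_valid_wasm_status(
    "aio_connector_wasm_graphs_created",
    {"graph": "example-graph", "wasm_status": "filtered"},
)
```
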
## Dimension reference

### api

Identifies the ADR API or command path that handled the request (for example, `GetDevice` or `UpdateAssetStatus`).

### connector_template_name

Name of the connector template being measured.

### endpoint_type

Endpoint type declared in a connector template.

### error

An enum representing the general category of error.

### error_reason

High-level MQTT failure reason (for example, `config_error` or `broker_restart`).

### error_type

Error classification for failed reconciliations, when available.

### graph

Identifies the WASM graph instance that the metric is associated with.

### instance

Identifies the emitting service instance, typically using the pod hostname or the unique ID of the connector.

### operation

Identifies the operation being executed, such as a Kubernetes verb (`get`, `list`, `create`, `update`, `patch`, `delete`, `watch`) or a discovered resource action (`create` or `update`).

### resource_type

Identifies the resource category targeted by the operation.

### result

Identifies the outcome classification of the request or event.

### service

Identifies the logical service emitting the metric (for example, `operator`, `adr_service`, `aio_sse_connector`, `aio_rest_connector`, or `aio_mqtt_connector`).

### status

Identifies whether the measured call completed with `success` or `error`.

### target

Identifies the watcher target kind. Values are `Device` or `Asset`.

### template_name

Name of the connector template that owns the deployment instance.

### wasm_status

Status of the WASM graph operation. For processing latency, values are `forwarded` (message returned by the graph), `filtered` (message dropped without error), or `failed` (graph returned an error). For graph creation, values are `succeeded` or `failed`.

## Related content

- [Configure observability](../configure-observability-monitoring/howto-configure-observability.md)

articles/iot-operations/toc.yml

Lines changed: 3 additions & 0 deletions
@@ -399,6 +399,9 @@ items:
 - name: Connector for OPC UA
   href: reference/observability-metrics-opcua-broker.md
   displayName: metrics, observability
+- name: Akri and connectors
+  href: reference/observability-metrics-akri-connectors.md
+  displayName: metrics, observability
 - name: Layered Network Management
   href: reference/observability-metrics-layered-network.md
   displayName: metrics, observability
