| 1 | +--- |
| 2 | +title: Metrics for data flows |
| 3 | +description: Available observability metrics for data flows to monitor the health and performance of your solution. |
| 4 | +author: vadim-kovalyov |
| 5 | +ms.author: vakavali |
| 6 | +ms.topic: reference |
| 7 | +ms.date: 03/27/2026 |
| 8 | + |
| 9 | +# CustomerIntent: As an IT admin or operator, I want to be able to monitor and visualize data |
| 10 | +# on the health of my industrial assets and edge environment. |
| 11 | +--- |
| 12 | + |
| 13 | +# Metrics for data flows |
| 14 | + |
| 15 | +Data flows provide a set of observability metrics that you can use to monitor and analyze the health of your solution. This article lists the available metrics for data flows. The following sections group related sets of metrics, and list the name, type, description, and dimensions for each metric. |
| 16 | + |
| 17 | +## Common data flow metrics |
| 18 | + |
| 19 | +| Metric | Type | Description | Dimensions | |
| 20 | +|--------|------|-------------|------------| |
| 21 | +| aio_dataflow_messages_received | Counter | Number of messages received from sources. | [`source_service_type`](#source_service_type), [`category`](#category) | |
| 22 | +| aio_dataflow_messages_sent | Counter | Number of messages sent to targets. For QoS 0, the counter increments without waiting for ACK from the target. For higher QoS levels, it waits for ACK before incrementing. | [`target_service_type`](#target_service_type), [`category`](#category) | |
| 23 | +| aio_dataflow_messages_retried | Counter | Number of messages that were retried for delivery to targets. | [`target_service_type`](#target_service_type), [`category`](#category) | |
| 24 | +| aio_dataflow_messages_expired | Counter | Number of messages dropped because they expired after being received. Expiry is controlled by the `Message Expiry Interval` property set on the MQTT v5 PUBLISH packet. | [`target_service_type`](#target_service_type) | |
| 25 | +| aio_dataflow_messages_filtered | Counter | Number of messages dropped because they were filtered by the data processing rules. | [`target_service_type`](#target_service_type) | |
| 26 | +| aio_dataflow_messages_dropped_processing_errors | Counter | Number of messages dropped due to processing errors such as conversion errors or message size exceeding limits. | [`target_service_type`](#target_service_type) | |
| 27 | +| aio_dataflow_messages_dropped_when_busy | Counter | Number of messages dropped when internal queues were full. Applicable only to QoS 0 messages. | [`source_service_type`](#source_service_type), [`category`](#category) | |
| 28 | +| aio_dataflow_bytes_received | Counter | Sum of the payloads of all messages received. Doesn't include the size of message properties or publish packets. | [`source_service_type`](#source_service_type), [`category`](#category) | |
| 29 | +| aio_dataflow_bytes_sent | Counter | Sum of the payloads of all messages sent. Doesn't include the size of message properties or publish packets. | [`target_service_type`](#target_service_type), [`category`](#category) | |
| 30 | +| aio_dataflow_errors | Counter | Number of errors that occurred in sources or targets. | [`source_service_type`](#source_service_type) or [`target_service_type`](#target_service_type), [`error_code`](#error_code) | |
| 31 | +| aio_dataflow_processing_latency | Histogram | Processing latency in milliseconds from the moment the message was received until the acknowledgment was received from the target. For QoS 1 MQTT messages, this might include network round-trip time. For QoS 0, the measurement doesn't wait for an ACK from the target. | [`source_service_type`](#source_service_type), [`category`](#category) | |
| 32 | +| aio_dataflow_upload_latency | Histogram | Latency in milliseconds of sending messages to the target endpoint. | [`target_service_type`](#target_service_type), [`success`](#success), [`category`](#category) | |
| 33 | +| aio_dataflow_transformation_latency | Histogram | Latency in milliseconds from the moment a message was received until it was processed and is ready to be sent to the target. | [`source_service_type`](#source_service_type), [`target_service_type`](#target_service_type), [`success`](#success), [`category`](#category) | |
| 34 | + |
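Counter metrics in the table above only ever increase, so they're usually read as the change between two samples instead of as absolute values. The following Python snippet is a minimal sketch of that idea. It isn't part of the data flows tooling: the helper names `counter_increase` and `ratio` and all sample values are hypothetical, and treating a decrease as a restart is an assumption of this sketch.

```python
from typing import Optional


def counter_increase(previous: float, current: float) -> float:
    """Return how much a counter grew between two samples.

    Counters only go up, so a smaller current value is treated here as a
    restart that reset the counter to zero (an assumption of this sketch).
    """
    if current >= previous:
        return current - previous
    return current


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when nothing was counted."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical readings of two counters, each sampled at two points in time.
sent = counter_increase(previous=990, current=1560)   # aio_dataflow_messages_sent
retried = counter_increase(previous=40, current=70)   # aio_dataflow_messages_retried

retry_ratio = ratio(retried, sent)
print(f"sent={sent} retried={retried} retry_ratio={retry_ratio}")
```

Both counters in this example carry the `target_service_type` dimension, so in practice you would compare samples that share the same dimension values.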
| 35 | +## Operator metrics |
| 36 | + |
| 37 | +| Metric | Type | Description | Dimensions | |
| 38 | +|--------|------|-------------|------------| |
| 39 | +| aio_dataflow_active_dataflows | Gauge | Number of active data flows. | | |
| 40 | +| aio_dataflow_active_dataflow_graphs | Gauge | Number of active data flow graphs. | | |
| 41 | +| aio_dataflow_version | Counter | Reports data flow version via the `version` dimension. | `version` | |
| 42 | +| aio_dataflow_reconcile_errors | Gauge | Indicates whether the operator encountered reconciliation errors. A value of 1 means an error occurred; 0 means no errors. | | |
| 43 | + |
| 44 | +## Data flow graph metrics |
| 45 | + |
| 46 | +| Metric | Type | Description | Dimensions | |
| 47 | +|--------|------|-------------|------------| |
| 48 | +| aio_dataflow_graphs | Gauge | Number of individual graphs within the data flow graphs. | [`dataflow_id`](#dataflow_id) | |
| 49 | +| aio_dataflow_graph_modules | Gauge | Number of unique WASM modules loaded across all graph artifacts in a DataflowGraph. | [`dataflow_id`](#dataflow_id) | |
| 50 | +| aio_dataflow_graph_inputs | Gauge | Number of input topics (dataSources) across all Source nodes in a DataflowGraph. | [`dataflow_id`](#dataflow_id) | |
| 51 | +| aio_dataflow_graph_operators | Gauge | Number of operations by type in graph artifact(s) referenced by a DataflowGraph. | [`dataflow_id`](#dataflow_id), [`operator_type`](#operator_type) | |
| 52 | +| aio_dataflow_graph_errors | Counter | Number of errors encountered while downloading or parsing graphs. | [`error_code`](#error_code) | |
| 53 | +| aio_dataflow_graph_module_exit | Counter | Number of unexpected module errors that caused a module to exit. | | |
| 54 | +| aio_dataflow_graph_messages_received | Counter | Number of messages received by graph processing. | | |
| 55 | +| aio_dataflow_graph_messages_sent | Counter | Number of messages sent from graphs to a target. The counter increments without waiting for ACK from the target. | | |
| 56 | +| aio_dataflow_graph_accumulated_messages | Counter | Number of messages that were accumulated. | | |
| 57 | +| aio_dataflow_graph_accumulated_bytes | Counter | Sum of the payloads of all accumulated messages. Doesn't include the size of message properties or publish packets. | | |
| 58 | + |
| 59 | +## Dimension reference |
| 60 | + |
| 61 | +### category |
| 62 | + |
| 63 | +The [`category`](#category) dimension comes from the `metriccategory` user property of the MQTT CONNECT packet. When a client connects to the broker, it can include this user property to categorize its traffic. Dashboards can then use the dimension to differentiate traffic sources. |
| 64 | + |
| 65 | +> [!IMPORTANT] |
| 66 | +> The number of unique categories is limited to 1000. Avoid using high-cardinality values for `metriccategory` to prevent metric data loss. |
| 67 | + |
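To illustrate the cardinality limit in the note above, the following Python snippet is a rough sketch that compares two ways of choosing `metriccategory` values. The fleet, the naming schemes, and the helpers `unique_category_count` and `within_limit` are hypothetical; only the limit of 1000 comes from this article.

```python
from typing import Iterable

MAX_CATEGORIES = 1000  # limit stated in the note above


def unique_category_count(values: Iterable[str]) -> int:
    """Count the distinct category values a set of clients would report."""
    return len(set(values))


def within_limit(values: Iterable[str], limit: int = MAX_CATEGORIES) -> bool:
    """Return True when the distinct categories fit within the limit."""
    return unique_category_count(values) <= limit


# Hypothetical fleet: 5000 devices spread over 4 production lines.
devices = [(f"device-{i}", f"line-{i % 4}") for i in range(5000)]

per_device = [device_id for device_id, _ in devices]  # one category per device
per_line = [line for _, line in devices]              # one category per line

print("per device within limit:", within_limit(per_device))
print("per line within limit:", within_limit(per_line))
```

In this sketch, grouping clients by a coarse attribute such as a production line keeps the number of categories small, while using a per-device identifier exceeds the limit.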
| 68 | +### source_service_type |
| 69 | + |
| 70 | +Source endpoint service type, determined from the endpoint type and host. Possible values: |
| 71 | + |
| 72 | +- `Local Storage` |
| 73 | +- `Blob Storage` |
| 74 | +- `Fabric OneLake` |
| 75 | +- `Data Explorer` |
| 76 | +- `Fabric RTI` |
| 77 | +- `Local AIO MQTT Broker` |
| 78 | +- `Event Hubs` |
| 79 | +- `Event Grid` |
| 80 | +- `Open Telemetry` |
| 81 | +- `Unknown Kafka Broker` |
| 82 | +- `Unknown Mqtt Broker` |
| 83 | +- `Other` |
| 84 | + |
| 85 | +### target_service_type |
| 86 | + |
| 87 | +Target endpoint service type, determined from the endpoint type and host. The possible values are the same as for [`source_service_type`](#source_service_type). |
| 88 | + |
| 89 | +### error_code |
| 90 | + |
| 91 | +Type of error associated with the metric. Possible values: |
| 92 | + |
| 93 | +- `ConfigError` |
| 94 | +- `PayloadError` |
| 95 | +- `InternalError` |
| 96 | + |
| 97 | +The list also includes other error codes from health status reporting. |
| 98 | + |
| 99 | +### dataflow_id |
| 100 | + |
| 101 | +Name of the DataflowGraph custom resource that the metric is associated with. |
| 102 | + |
| 103 | +### operator_type |
| 104 | + |
| 105 | +Type of graph operation. Possible values: |
| 106 | + |
| 107 | +- `Source` |
| 108 | +- `Sink` |
| 109 | +- `Map` |
| 110 | +- `Filter` |
| 111 | +- `Branch` |
| 112 | +- `Concatenate` |
| 113 | +- `Accumulate` |
| 114 | +- `Delay` |
| 115 | + |
| 116 | +### success |
| 117 | + |
| 118 | +Boolean string (`"true"` or `"false"`) indicating whether the operation associated with the metric was successful. |
| 119 | + |
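Because `success` is a string and not a native Boolean, code that consumes it needs to compare the text explicitly. In Python, for example, `bool("false")` evaluates to `True`. The following snippet is a small sketch of strict parsing. The `parse_success` helper and the sample label sets are hypothetical and don't reflect any particular export format.

```python
def parse_success(value: str) -> bool:
    """Convert the success dimension ("true" or "false") to a bool."""
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"unexpected success value: {value!r}")


# Hypothetical label sets attached to three exported samples.
samples = [
    {"success": "true"},
    {"success": "false"},
    {"success": "true"},
]

failures = sum(1 for labels in samples if not parse_success(labels["success"]))
print("failed operations:", failures)
```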
| 120 | +## Related content |
| 121 | + |
| 122 | +- [Configure observability](../configure-observability-monitoring/howto-configure-observability.md) |