Commit 8df6ed2

Clarify descriptions for various MQTT metrics
1 parent 798630a commit 8df6ed2

1 file changed

Lines changed: 37 additions & 29 deletions

articles/iot-operations/reference/observability-metrics-mqtt-broker.md

@@ -36,20 +36,20 @@ Sessions without a `metriccategory` are tagged as `category=uncategorized`.
3636
| Metric | Type | Description | Dimensions |
3737
|--------|------|-------------|------------|
3838
| aio_broker_publishes_received | Counter | Number of incoming PUBLISH packets received from clients. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
39-
| aio_broker_publishes_sent | Counter | Number of outgoing PUBLISH packets sent to clients. Counts each delivery separately even if multiple clients receive the same payload. Does not count ack packets. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
40-
| aio_broker_payload_bytes_received | Counter | Number of payload bytes for all PUBLISH packets received. Does not include MQTT packet overhead or properties. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
41-
| aio_broker_payload_bytes_sent | Counter | Number of payload bytes for all PUBLISH packets sent. Does not include MQTT packet overhead or properties. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
39+
| aio_broker_publishes_sent | Counter | Number of outgoing PUBLISH packets sent to clients. Counts each delivery separately even if multiple clients receive the same payload. Doesn't count ack packets. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
40+
| aio_broker_payload_bytes_received | Counter | Number of payload bytes for all PUBLISH packets received. Doesn't include MQTT packet overhead or properties. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
41+
| aio_broker_payload_bytes_sent | Counter | Number of payload bytes for all PUBLISH packets sent. Doesn't include MQTT packet overhead or properties. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
4242
| aio_broker_authentication_successes | Counter | Number of successful client authentications. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
43-
| aio_broker_authentication_failures | Counter | Number of failed client authentications (failed authentication is when an error has occurred that prevented authentication check). | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
43+
| aio_broker_authentication_failures | Counter | Number of failed client authentications (failed authentication is when an error occurs that prevents authentication check). | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
4444
| aio_broker_authentication_deny | Counter | Number of denied client authentications. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
4545
| aio_broker_authorization_allow | Counter | Number of successful client authorizations. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
4646
| aio_broker_authorization_deny | Counter | Number of denied client authorizations. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
47-
| aio_broker_authorization_failures | Counter | Number of failed client authorizations (failed authorization is when an error has occurred that prevented authorization check). | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
47+
| aio_broker_authorization_failures | Counter | Number of failed client authorizations (failed authorization is when an error occurs that prevents authorization check). | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category) |
4848
| aio_broker_qos0_messages_dropped | Counter | Number of QoS 0 messages dropped due to high volume or memory limits. | [`namespace`](#namespace), [`hostname`](#hostname), [`category`](#category), [`direction`](#direction) |
4949
| aio_broker_store_retained_messages | Gauge | Retained messages currently stored. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
50-
| aio_broker_store_retained_bytes | Gauge | Bytes used by retained messages payload. Does not include metadata overhead. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
50+
| aio_broker_store_retained_bytes | Gauge | Bytes used by retained messages payload. Doesn't include metadata overhead. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
5151
| aio_broker_store_will_messages | Gauge | Will messages currently stored. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
52-
| aio_broker_store_will_bytes | Gauge | Bytes used by will messages payload. Does not include metadata overhead. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
52+
| aio_broker_store_will_bytes | Gauge | Bytes used by will messages payload. Doesn't include metadata overhead. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
5353
| aio_broker_store_expired_messages | Counter | Messages that expired before delivery. | [`namespace`](#namespace), [`backend_chain`](#backend_chain) |
5454

5555
## Memory and backpressure metrics
@@ -144,35 +144,35 @@ Metrics from the diagnostics service for monitoring broker SLO compliance.
144144

145145
| Metric | Type | Description | Dimensions |
146146
|--------|------|-------------|------------|
147-
| aio_broker_connect_route_replication_correctness | Gauge | Connect route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe did not receive the response in time. | |
148-
| aio_broker_connect_latency_route_ms | Gauge | Connect latency route (ms). | |
147+
| aio_broker_connect_route_replication_correctness | Gauge | Connect route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe didn't receive the response in time. | [`frontend`](#frontend), [`backend_chain`](#backend_chain) |
148+
| aio_broker_connect_latency_route_ms | Gauge | Connect latency value for a specific route (ms). | [`frontend`](#frontend), [`backend_chain`](#backend_chain) |
149149
| aio_broker_connect_latency_last_value_ms | Gauge | Connect latency last value (ms). | |
150-
| aio_broker_connect_latency_mu_ms | Gauge | Connect latency mean (ms). | |
150+
| aio_broker_connect_latency_mu_ms | Gauge | Connect latency mean value (ms). | |
151151
| aio_broker_connect_latency_sigma_ms | Gauge | Connect latency standard deviation (ms). | |
152-
| aio_broker_publish_route_replication_correctness | Gauge | Publish route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe did not receive the response in time. | |
153-
| aio_broker_publish_latency_route_ms | Gauge | Publish latency route (ms). | |
152+
| aio_broker_publish_route_replication_correctness | Gauge | Publish route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe didn't receive the response in time. | [`frontend`](#frontend), [`backend_chain`](#backend_chain) |
153+
| aio_broker_publish_latency_route_ms | Gauge | Publish latency value for a specific route (ms). | [`frontend`](#frontend), [`backend_chain`](#backend_chain) |
154154
| aio_broker_publish_latency_last_value_ms | Gauge | Publish latency last value (ms). | |
155-
| aio_broker_publish_latency_mu_ms | Gauge | Publish latency mean (ms). | |
155+
| aio_broker_publish_latency_mu_ms | Gauge | Publish latency mean value (ms). | |
156156
| aio_broker_publish_latency_sigma_ms | Gauge | Publish latency standard deviation (ms). | |
157-
| aio_broker_subscribe_route_replication_correctness | Gauge | Subscribe route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe did not receive the response in time. | |
158-
| aio_broker_subscribe_latency_route_ms | Gauge | Subscribe latency route (ms). | |
157+
| aio_broker_subscribe_route_replication_correctness | Gauge | Subscribe route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe didn't receive the response in time. | [`frontend`](#frontend), [`backend_chain`](#backend_chain), [`is_wildcard`](#is_wildcard) |
158+
| aio_broker_subscribe_latency_route_ms | Gauge | Subscribe latency value for a specific route (ms). | [`frontend`](#frontend), [`backend_chain`](#backend_chain), [`is_wildcard`](#is_wildcard) |
159159
| aio_broker_subscribe_latency_last_value_ms | Gauge | Subscribe latency last value (ms). | |
160-
| aio_broker_subscribe_latency_mu_ms | Gauge | Subscribe latency mean (ms). | |
160+
| aio_broker_subscribe_latency_mu_ms | Gauge | Subscribe latency mean value (ms). | |
161161
| aio_broker_subscribe_latency_sigma_ms | Gauge | Subscribe latency standard deviation (ms). | |
162-
| aio_broker_unsubscribe_route_replication_correctness | Gauge | Unsubscribe route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe did not receive the response in time. | |
163-
| aio_broker_unsubscribe_latency_route_ms | Gauge | Unsubscribe latency route (ms). | |
162+
| aio_broker_unsubscribe_route_replication_correctness | Gauge | Unsubscribe route replication correctness. 1 indicates success, 0 indicates failure. Failure means that the probe didn't receive the response in time. | [`frontend`](#frontend), [`backend_chain`](#backend_chain), [`is_wildcard`](#is_wildcard) |
163+
| aio_broker_unsubscribe_latency_route_ms | Gauge | Unsubscribe latency value for a specific route (ms). | [`frontend`](#frontend), [`backend_chain`](#backend_chain), [`is_wildcard`](#is_wildcard) |
164164
| aio_broker_unsubscribe_latency_last_value_ms | Gauge | Unsubscribe latency last value (ms). | |
165-
| aio_broker_unsubscribe_latency_mu_ms | Gauge | Unsubscribe latency mean (ms). | |
165+
| aio_broker_unsubscribe_latency_mu_ms | Gauge | Unsubscribe latency mean value (ms). | |
166166
| aio_broker_unsubscribe_latency_sigma_ms | Gauge | Unsubscribe latency standard deviation (ms). | |
167-
| aio_broker_ping_correctness | Gauge | Ping correctness. 1 indicates success, 0 indicates failure. Failure means that the probe did not receive the response in time. | |
168-
| aio_broker_ping_latency_route_ms | Gauge | Ping latency route (ms). | |
167+
| aio_broker_ping_correctness | Gauge | Ping correctness. 1 indicates success, 0 indicates failure. Failure means that the probe didn't receive the response in time. | [`frontend`](#frontend) |
168+
| aio_broker_ping_latency_route_ms | Gauge | Ping latency value for a specific route (ms). | [`frontend`](#frontend) |
169169
| aio_broker_ping_latency_last_value_ms | Gauge | Ping latency last value (ms). | |
170-
| aio_broker_ping_latency_mu_ms | Gauge | Ping latency mean (ms). | |
170+
| aio_broker_ping_latency_mu_ms | Gauge | Ping latency mean value (ms). | |
171171
| aio_broker_ping_latency_sigma_ms | Gauge | Ping latency standard deviation (ms). | |
172-
| aio_broker_message_delivery_check_total_timeouts | Gauge | Message delivery check correctness. Message delivery check validates the end-to-end delivery of a message from a publisher to the subscriber. 0 indicates success, greater than 0 indicates failure. Failure means that the subscriber probe did not receive the response in time. | |
173-
| aio_broker_message_delivery_check_latency_route_ms | Gauge | Message delivery check latency route (ms). | |
172+
| aio_broker_message_delivery_check_total_timeouts | Gauge | Message delivery check correctness. Message delivery check validates the end-to-end delivery of a message from a publisher to the subscriber. 0 indicates success, greater than 0 indicates failure. Failure means that the subscriber probe didn't receive the response in time. | |
173+
| aio_broker_message_delivery_check_latency_route_ms | Gauge | Message delivery check latency value for a specific route (ms). | |
174174
| aio_broker_message_delivery_check_latency_last_value_ms | Gauge | Message delivery check latency last value (ms). | |
175-
| aio_broker_message_delivery_check_latency_mu_ms | Gauge | Message delivery check latency mean (ms). | |
175+
| aio_broker_message_delivery_check_latency_mu_ms | Gauge | Message delivery check latency mean value (ms). | |
176176
| aio_broker_message_delivery_check_latency_sigma_ms | Gauge | Message delivery check latency standard deviation (ms). | |
177177
| aio_broker_message_delivery_check_total_messages_sent | Counter | Total messages sent for delivery check. | |
178178
| aio_broker_message_delivery_check_total_messages_received | Counter | Total messages received for delivery check. | |
@@ -183,8 +183,8 @@ Metrics for debugging and diagnostics of internal traffic flow.
183183

184184
| Metric | Type | Description | Dimensions |
185185
|--------|------|-------------|------------|
186-
| aio_broker_patch_tracker_held_patches | Gauge | Pending message ids (of any type, including internal) currently held in the message id tracker. Message tracker is used to guarantee internal message delivery and ordering. Ids in the message tracker are cleaned up periodically. In a stable state the plot should look like a saw. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id) |
187-
| aio_broker_ack_handler_pending_transactions | Gauge | Pending messages in the ack handler. Ack handler tracks the acknowledgement of messages (of any type, including internal). In a stable state the plot should look like a flat line close to zero. Spikes or high values may indicate issues with message processing or queue build up. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id) |
186+
| aio_broker_patch_tracker_held_patches | Gauge | Pending message ids (of any type, including internal) currently held in the message id tracker. Message tracker is used to guarantee internal message delivery and ordering. Ids in the message tracker are cleaned up periodically. In a stable state, the plot should look like a saw. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id) |
187+
| aio_broker_ack_handler_pending_transactions | Gauge | Pending messages in the ack handler. Ack handler tracks the acknowledgment of messages (of any type, including internal). In a stable state, the plot should look like a flat line close to zero. Spikes or high values may indicate issues with message processing or queue buildup. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id) |
188188
| aio_broker_internal_client_connected | Counter | Internal client connections. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id), [`endpoint`](#endpoint) |
189189
| aio_broker_internal_client_disconnected | Counter | Internal client disconnections. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id), [`endpoint`](#endpoint) |
190190
| aio_broker_internal_client_removed | Counter | Internal clients removed. | [`namespace`](#namespace), [`hostname`](#hostname), [`worker_id`](#worker_id), [`endpoint`](#endpoint) |
@@ -205,7 +205,7 @@ The `category` dimension comes from the MQTT Connect packet's `metriccategory` u
205205

206206
Sessions without a `metriccategory` receive `category=uncategorized`.
207207

208-
> [!NOTE]
208+
> [!IMPORTANT]
209209
> The number of unique categories is limited to 1000. Avoid using high-cardinality values for `metriccategory` to prevent metric data loss.
210210
211211
### backend_chain
@@ -218,7 +218,15 @@ The `direction` dimension indicates the direction of message flow. Values are `i
218218

219219
### worker_id
220220

221-
The `worker_id` dimension identifies which worker within a frontend or backend partition generated the metric. Worker IDs are zero-indexed.
221+
The `worker_id` dimension identifies which worker within a frontend or backend partition generated the metric. Worker identifiers are zero-indexed.
222+
223+
### frontend
224+
225+
The `frontend` dimension identifies which frontend pod handled the probe operation. The value is the pod index extracted from the frontend pod name (e.g., `0` for `aio-broker-frontend-0`).
226+
227+
### is_wildcard
228+
229+
The `is_wildcard` dimension indicates whether the subscription topic contains a wildcard pattern. Values are `true` (wildcard topic like `foo/#`) or `false` (exact topic match).
222230

223231
### is_persistent
224232

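The `frontend` and `is_wildcard` dimension descriptions added in this commit can be illustrated with a minimal Python sketch. It is not the broker's implementation, which this commit does not show. The helper names `frontend_label` and `is_wildcard_label` are hypothetical, and the sketch assumes every frontend pod name follows the `aio-broker-frontend-<index>` pattern from the example above. Wildcard detection uses the standard MQTT rule that `+` and `#` are wildcards only when they occupy a whole topic level.

```python
import re

# Assumption: frontend pod names look like "aio-broker-frontend-<index>",
# matching the single example given in the `frontend` description.
_FRONTEND_POD = re.compile(r"^aio-broker-frontend-(\d+)$")


def frontend_label(pod_name: str) -> str:
    """Hypothetical helper: return the pod index used as the `frontend` value."""
    match = _FRONTEND_POD.match(pod_name)
    if match is None:
        raise ValueError(f"not a frontend pod name: {pod_name!r}")
    return match.group(1)


def is_wildcard_label(topic_filter: str) -> str:
    """Hypothetical helper: "true" if any topic level is `+` or `#`, else "false"."""
    levels = topic_filter.split("/")
    return "true" if any(level in ("+", "#") for level in levels) else "false"
```

For example, `frontend_label("aio-broker-frontend-0")` yields `"0"`, and `is_wildcard_label("foo/#")` yields `"true"`, while an exact topic such as `foo/bar` yields `"false"`.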