Commit 6544263

Add CPU calculation formulas and capacity planning guidance
- Add formula table for calculating broker CPU requirements
- Add caution about other AIO components consuming ~200-300m CPU
- Add small cluster example showing why equal cores isn't enough
- Expand larger deployment example with step-by-step math

Based on test results confirming pods get stuck Pending when cluster CPU is at or below broker reservation.
1 parent 31990e3 commit 6544263

1 file changed: 56 additions & 5 deletions

articles/iot-operations/manage-mqtt-broker/howto-configure-availability-scale.md
````diff
@@ -5,7 +5,7 @@ author: sethmanheim
 ms.author: sethm
 ms.topic: how-to
 ms.subservice: azure-mqtt-broker
-ms.date: 05/14/2025
+ms.date: 02/20/2026
 ms.service: azure-iot-operations
 
 # CustomerIntent: As an operator, I want to understand the settings for the MQTT broker so that I can configure it for high availability and scale.
@@ -258,12 +258,59 @@ To prevent resource starvation in the cluster, the broker can be configured to [
 >
 > If you enable CPU resource limits, make sure your cluster has enough CPU resources to satisfy the broker's requests based on your cardinality configuration. See the CPU requirements below.
 
-The MQTT broker currently requests one (1.0) CPU unit per frontend worker and two (2.0) CPU units per backend worker. For more information, see [Kubernetes CPU resource units](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu).
+### Calculate CPU requirements
 
-For example, the following cardinality would request the following CPU resources:
+The MQTT broker requests CPU resources per pod based on the number of workers configured:
 
-- **For frontends**: 2 CPU units per frontend pod, totaling 6 CPU units.
-- **For backends**: 4 CPU units per backend pod (for two backend workers), times 2 (redundancy factor), times 3 (number of partitions), totaling 24 CPU units.
+- **Frontend pods**: 1.0 CPU per worker
+- **Backend pods**: 2.0 CPU per worker
+
+Use the following formulas to calculate total CPU requirements:
+
+| Component | Formula |
+|-----------|---------|
+| Frontend CPU | `replicas` × `frontend.workers` × 1.0 CPU |
+| Backend CPU | `partitions` × `redundancyFactor` × `backend.workers` × 2.0 CPU |
+| **Total broker CPU** | Frontend CPU + Backend CPU |
+
+For more information, see [Kubernetes CPU resource units](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu).
+
+> [!CAUTION]
+> The broker isn't the only component that consumes CPU on the cluster. Other Azure IoT Operations components (such as the dataflow engine, OPC UA connector, and system pods) also reserve CPU resources, typically around 200-300m in aggregate. When planning cluster capacity, make sure to account for this overhead on top of the broker's CPU requirements. If the total CPU requested by all pods exceeds the available CPU on your cluster, broker pods get stuck in a `Pending` state.
+
+#### Example: small cluster
+
+Consider a 2-node cluster with 4 CPU cores per node (8 cores total) with the following cardinality:
+
+```json
+{
+  "cardinality": {
+    "frontend": {
+      "replicas": 2,
+      "workers": 2
+    },
+    "backendChain": {
+      "partitions": 1,
+      "redundancyFactor": 2,
+      "workers": 1
+    }
+  }
+}
+```
+
+The broker requests:
+
+- **Frontend CPU**: 2 replicas × 2 workers × 1.0 = **4.0 CPU**
+- **Backend CPU**: 1 partition × 2 RF × 1 worker × 2.0 = **4.0 CPU**
+- **Total broker CPU**: **8.0 CPU**
+
+Even though the cluster has 8 cores total, this deployment fails because other Azure IoT Operations components also consume CPU (~280m). The broker pods get stuck in `Pending` state with `Insufficient cpu` errors.
+
+To resolve this, either add more nodes, increase cores per node, or reduce the broker cardinality.
+
+#### Example: larger deployment
+
+The following cardinality requests significantly more CPU resources:
 
 ```json
 {
@@ -281,6 +328,10 @@ For example, the following cardinality would request the following CPU resources
 }
 ```
 
+- **Frontend CPU**: 3 replicas × 2 workers × 1.0 = **6.0 CPU**
+- **Backend CPU**: 3 partitions × 2 RF × 2 workers × 2.0 = **24.0 CPU**
+- **Total broker CPU**: **30.0 CPU**
+
 To change this setting, set the `generateResourceLimits.cpu` field to `Enabled` or `Disabled` in the Broker resource.
 
 # [Portal](#tab/portal)
````
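The formula table added in this commit can be sanity-checked with a short script. This is a minimal sketch, not anything from the broker itself: the function name `broker_cpu` and the dict shape are mine, mirroring the `cardinality` JSON; the 1.0/2.0 CPU-per-worker constants come from the diff.

```python
# Per-worker CPU requests stated in the diff (assumption: treated as plain floats here).
FRONTEND_CPU_PER_WORKER = 1.0
BACKEND_CPU_PER_WORKER = 2.0

def broker_cpu(cardinality: dict) -> float:
    """Total CPU requested by the broker for a given cardinality config.

    Frontend CPU = replicas x frontend workers x 1.0
    Backend CPU  = partitions x redundancyFactor x backend workers x 2.0
    """
    fe = cardinality["frontend"]
    be = cardinality["backendChain"]
    frontend = fe["replicas"] * fe["workers"] * FRONTEND_CPU_PER_WORKER
    backend = be["partitions"] * be["redundancyFactor"] * be["workers"] * BACKEND_CPU_PER_WORKER
    return frontend + backend

# Small-cluster example from the diff: 2x2 frontend, 1 partition x RF 2 x 1 worker.
small = {"frontend": {"replicas": 2, "workers": 2},
         "backendChain": {"partitions": 1, "redundancyFactor": 2, "workers": 1}}
# Larger example from the diff: 3x2 frontend, 3 partitions x RF 2 x 2 workers.
large = {"frontend": {"replicas": 3, "workers": 2},
         "backendChain": {"partitions": 3, "redundancyFactor": 2, "workers": 2}}

print(broker_cpu(small))  # 8.0
print(broker_cpu(large))  # 30.0
```

Both values match the worked examples in the diff (4.0 + 4.0 = 8.0 CPU and 6.0 + 24.0 = 30.0 CPU).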
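The small-cluster example's failure mode can also be sketched as a capacity check. The `fits` helper is hypothetical (mine, not part of any Azure IoT Operations tooling), and the 0.28 CPU default models the ~280m overhead the diff attributes to other Azure IoT Operations components; it deliberately ignores per-node bin-packing and kubelet reservations, so it is an optimistic upper bound.

```python
def fits(node_cores: float, nodes: int, broker_cpu: float,
         overhead_cpu: float = 0.28) -> bool:
    """Optimistic check: does the broker's CPU request plus the ~280m overhead
    from other Azure IoT Operations pods fit in total cluster capacity?
    Real schedulers place pods per node, so a True here is necessary but
    not sufficient for scheduling to succeed."""
    return broker_cpu + overhead_cpu <= node_cores * nodes

# The diff's small cluster: 2 nodes x 4 cores, broker requests 8.0 CPU.
print(fits(node_cores=4, nodes=2, broker_cpu=8.0))  # False -> pods stay Pending
# One remedy from the diff: add a node.
print(fits(node_cores=4, nodes=3, broker_cpu=8.0))  # True
```

This mirrors the diff's conclusion: 8.0 CPU requested on 8 cores fails once the overhead is counted, and adding nodes (or cores, or reducing cardinality) resolves it.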
