Commit e4ab2de

Add upgrade known issue
1 parent 98af26d commit e4ab2de

1 file changed

Lines changed: 62 additions & 1 deletion

articles/iot-operations/troubleshoot/known-issues.md

@@ -5,7 +5,7 @@ author: dominicbetts
55
ms.author: dobett
66
ms.topic: troubleshooting-known-issue
77
ms.custom: sfi-ropc-nochange
8-
ms.date: 11/21/2025
8+
ms.date: 04/17/2026
99
---
1010

1111
# Known issues for Azure IoT Operations
@@ -14,6 +14,67 @@ This article lists the current known issues you might encounter when using Azure
1414

1515
For general troubleshooting guidance, see [Troubleshoot Azure IoT Operations](troubleshoot.md).
1616

17+
## Deployment and upgrade issues
18+
19+
This section lists current known issues with deploying and upgrading Azure IoT Operations.
20+
21+
### Upgrade to Azure IoT Operations 2603 can silently fail
22+
23+
---
24+
25+
Log signature: N/A
26+
27+
---
28+
29+
When you run `az iot ops upgrade` to upgrade to Azure IoT Operations 2603, the upgrade can silently fail to reach the cluster. You then observe the following symptoms:
30+
 
31+
- `provisioningState: Failed` on the Azure IoT Operations extension.
32+
- All on-cluster workloads remain healthy (no upgrade activity occurs).
33+
- `az iot ops upgrade` might report nothing to upgrade on subsequent attempts.
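
One way to confirm the first symptom is to query the cloud-side state of the extension. The following is a minimal sketch, not an official diagnostic: `show_extension_state` is a hypothetical helper name, and its three arguments stand in for your extension name, cluster name, and resource group.

```bash
# Hypothetical helper (not part of the Azure CLI): print the provisioning state
# that Azure Resource Manager reports for a single extension.
# Arguments: $1 extension name, $2 cluster name, $3 resource group.
show_extension_state() {
    az k8s-extension show --name "$1" \
        --cluster-name "$2" \
        --resource-group "$3" \
        --cluster-type connectedClusters \
        --query provisioningState -o tsv
}
```

Calling `show_extension_state <aio-extension-name> <cluster-name> <resource-group>` would be expected to print `Failed` when this issue applies.
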
34+
35+
#### Root cause
36+
 
37+
During the upgrade, if a dependent system extension, such as `microsoft.extensiondiagnostics`, experiences a transient Helm timeout, Azure Resource Manager marks it as **Failed**. Even if the extension eventually succeeds on-cluster, the cloud-side state remains **Failed**. This blocks the dependency chain: Azure Resource Manager never delivers the updated Azure IoT Operations or secret-store extension configuration to the cluster's config agent.
38+
 
39+
On the cluster, the blocked dependency chain shows the following signs:
40+
 
41+
- Config agent PostStatus returns `400: "Configuration spec has been modified"`
42+
- `getPendingConfigs` returns empty results
43+
- Extension manager never receives Helm upgrade instructions
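
Because the root cause is a dependent extension whose cloud-side state is stuck at **Failed**, it can help to list every extension on the cluster that Azure Resource Manager reports as failed. The following is a sketch under that assumption: `list_failed_extensions` is a hypothetical helper name, and the JMESPath filter assumes the `provisioningState` field that `az k8s-extension list` returns.

```bash
# Hypothetical helper (not part of the Azure CLI): print the names of all
# extensions whose cloud-side provisioning state is Failed.
# Arguments: $1 cluster name, $2 resource group.
list_failed_extensions() {
    az k8s-extension list \
        --cluster-name "$1" \
        --resource-group "$2" \
        --cluster-type connectedClusters \
        --query "[?provisioningState=='Failed'].name" -o tsv
}
```

A dependent system extension in this list, while its workloads look healthy on the cluster, is consistent with the root cause described in this section.
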
44+
45+
#### Workaround
46+
 
47+
To work around this issue, force Azure Resource Manager to resubmit the extension specs by running a no-op update on both the Azure IoT Operations and secret-store extensions, and then retry the upgrade:
48+
 
49+
```azurecli
az k8s-extension update --name <aio-extension-name> \
    --cluster-name <cluster-name> \
    --resource-group <resource-group> \
    --cluster-type connectedClusters \
    --configuration-settings AgentOperationTimeoutInMinutes=120

az k8s-extension update --name azure-secret-store \
    --cluster-name <cluster-name> \
    --resource-group <resource-group> \
    --cluster-type connectedClusters \
    --configuration-settings AgentOperationTimeoutInMinutes=120

az iot ops upgrade
```
64+
65+
The Azure IoT Operations extension name includes a random suffix (for example, `azure-iot-operations-cym7h`). To find the name of your extension, run:
66+
67+
```azurecli
az k8s-extension list \
    --cluster-name <cluster-name> \
    --resource-group <resource-group> \
    --cluster-type connectedClusters \
    --query "[?extensionType=='microsoft.iotoperations'].name" -o tsv
```
74+
75+
> [!IMPORTANT]
76+
> After the upgrade completes, reset `AgentOperationTimeoutInMinutes` to a lower value, such as five minutes, to avoid long waits on future operations if something else fails.
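
The reset can reuse the same `az k8s-extension update` call as the workaround with a smaller value. The following is a minimal sketch, assuming the setting is reset the same way it was raised: `reset_agent_timeout` is a hypothetical helper name, and the default of `5` is taken from the "five minutes" suggestion in the note rather than from a documented default.

```bash
# Hypothetical helper (not part of the Azure CLI): set AgentOperationTimeoutInMinutes
# on one extension. Arguments: $1 extension name, $2 cluster name, $3 resource group,
# $4 timeout in minutes (optional; assumed default of 5).
reset_agent_timeout() {
    az k8s-extension update --name "$1" \
        --cluster-name "$2" \
        --resource-group "$3" \
        --cluster-type connectedClusters \
        --configuration-settings "AgentOperationTimeoutInMinutes=${4:-5}"
}
```

Run it once for the Azure IoT Operations extension and once for `azure-secret-store`, because the workaround raised the timeout on both.
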
77+
1778
## Azure Device Registry issues
1879

1980
This section lists current known issues for the Azure Device Registry.
