ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
sections:
  - question: |
      Can I move my cluster to a different subscription, or move my subscription with my cluster to a new tenant?
    answer: |
      No. If you've moved your AKS cluster to a different subscription or the cluster's subscription to a new tenant, the cluster won't function because of missing cluster identity permissions. AKS doesn't support moving clusters across subscriptions or tenants because of this constraint. For more information, see [Operations FAQ](/azure/aks/faq#operations).
  - question: |
      What naming restrictions are enforced for AKS resources and parameters?
    answer: |
      - AKS node pool names must be all lowercase. The names must be 1-12 characters in length for Linux node pools and 1-6 characters for Windows node pools. A name must start with a letter, and the only allowed characters are letters and numbers.

      - The *admin-username*, which sets the administrator user name for Linux nodes, must start with a letter. This user name may contain only letters, numbers, hyphens, and underscores, and has a maximum length of 32 characters.

      For more information about naming conventions, see the following resources:

      - [Naming rules and restrictions for Azure resources](/azure/azure-resource-manager/management/resource-name-rules#microsoftcontainerservice)
      - [Abbreviation recommendations for Azure resources](/azure/cloud-adoption-framework/ready/azure-best-practices/resource-abbreviations#containers)
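      For example, the following Azure CLI command (a sketch; the resource group, cluster, and node pool names are placeholder assumptions) adds a Linux node pool whose name follows these rules:

      ```console
      $ az aks nodepool add --resource-group myResourceGroup --cluster-name myAKSCluster --name userpool01
      ```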
---
title: Troubleshoot the Throttled Error Code (429)
description: Learn how to resolve the Throttled error (status 429) when you try to create and deploy an Azure Kubernetes Service (AKS) cluster.
ms.date: 03/05/2025
ms.reviewer: jovieir, chiragpa, v-weizhu
ms.service: azure-kubernetes-service
#Customer intent: As an Azure Kubernetes user, I want to troubleshoot the Throttled error code (status 429) so that I can successfully create and deploy an Azure Kubernetes Service (AKS) cluster.
ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
---
# Troubleshoot the Throttled error code (429)

This article discusses how to identify and resolve the `Throttled` error (status 429) that occurs when you try to create and deploy a Microsoft Azure Kubernetes Service (AKS) cluster.

## Symptoms

When you try to create an AKS cluster, you receive the following "The PutManagedClusterHandler.PUT request limit has been exceeded" error message that shows a "SubCode" value of **Throttled** and a "Status" value of **429**:

> Category: ClientError;
>
> SubCode: Throttled;
>
> OriginalError: autorest/azure: Service returned an error. **Status=429**
>
> **Code="Throttled"**
>
> Message="The PutManagedClusterHandler.PUT request limit has been exceeded for SubID='*\<subscription-id-guid>*', please retry again in X seconds. For more information, please visit aka.ms/aks/throttling";

Request throttling can occur on various Azure components, so the error message might differ depending on the type of resource on which the issue occurs.

Resource provider throttling is independent of Azure Resource Manager (ARM) throttling and is tailored to the operations of a specific resource provider. In this scenario, the throttling is specific to the AKS resource provider and applies only to operations that are related to AKS resources.
## Cause

AKS requests are throttled. For information about how AKS limits work and the specific limits per hour, see [Throttling limits on AKS resource provider APIs](/azure/aks/quotas-skus-regions#throttling-limits-on-aks-resource-provider-apis).

## Solution

To resolve this issue, examine and modify the access patterns in the throttled subscription. The following table lists the possible access patterns and their corresponding solutions.
| Access pattern | Solution |
| -------------- | -------- |
| Automated scripts constantly run LIST operations against managedCluster resources. | Run the scripts less frequently. |
| Users attempt to deploy multiple AKS clusters in a short period of time. | Space out deployments, or use different subscriptions. |
| Users attempt to modify the same AKS cluster multiple times consecutively. | Space out operations. Ensure successful completion before initiating another one. |
| Users attempt to add, modify, or delete one or more agentPools on the same AKS cluster. | Space out operations. Ensure successful completion before initiating another one. |
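For example, for the first access pattern, a minimal sketch (not an official AKS pattern; the resource group, cluster name, retry count, and interval are placeholder assumptions) replaces a tight polling loop with spaced-out calls that back off after a failed, possibly throttled, attempt:

```bash
#!/bin/bash
# Poll the cluster provisioning state with spaced-out GET calls instead of a
# tight LIST loop, and wait longer after each failed (possibly throttled) call.
for attempt in 1 2 3 4 5; do
  if az aks show --resource-group myResourceGroup --name myAKSCluster \
       --query provisioningState --output tsv; then
    break                    # Success: stop retrying.
  fi
  sleep $((60 * attempt))    # Back off: 60s, 120s, 180s, ...
done
```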
## More information

[General troubleshooting of AKS cluster creation issues](troubleshoot-aks-cluster-creation-issues.md)

[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]
title: Troubleshoot UpgradeFailed errors due to eviction failures caused by PDBs
description: Learn how to troubleshoot UpgradeFailed errors due to eviction failures caused by Pod Disruption Budgets when you try to upgrade an Azure Kubernetes Service cluster.
ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
#Customer intent: As an Azure Kubernetes Services (AKS) user, I want to troubleshoot an Azure Kubernetes Service cluster upgrade that failed because of eviction failures caused by Pod Disruption Budgets so that I can upgrade the cluster successfully.

This article discusses how to identify and resolve UpgradeFailed errors due to eviction failures caused by Pod Disruption Budgets (PDBs) when you try to upgrade an Azure Kubernetes Service (AKS) cluster.
## Prerequisites

This article requires Azure CLI version 2.67.0 or a later version. To find the version number, run `az --version`. If you have to install or upgrade Azure CLI, see [How to install the Azure CLI](/cli/azure/install-azure-cli).

For more detailed information about the upgrade process, see the "Upgrade an AKS cluster" section in [Upgrade an Azure Kubernetes Service (AKS) cluster](/azure/aks/upgrade-cluster#upgrade-an-aks-cluster).
## Symptoms

An AKS cluster upgrade operation fails with one of the following error messages:

> (UpgradeFailed) Drain node `aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx` failed when evicting pod `<pod-name>` failed with Too Many Requests error. This is often caused by a restrictive Pod Disruption Budget (PDB) policy. See https://aka.ms/aks/debugdrainfailures. Original error: Cannot evict pod as it would violate the pod's disruption budget.. PDB debug info: `<namespace>/<pod-name>` blocked by pdb `<pdb-name>` with 0 unready pods.

> Code: UpgradeFailed
>
> Message: Drain node `aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx` failed when evicting pod `<pod-name>` failed with Too Many Requests error. This is often caused by a restrictive Pod Disruption Budget (PDB) policy. See https://aka.ms/aks/debugdrainfailures. Original error: Cannot evict pod as it would violate the pod's disruption budget.. PDB debug info: `<namespace>/<pod-name>` blocked by pdb `<pdb-name>` with 0 unready pods.
## Cause

This error might occur if a pod is protected by the Pod Disruption Budget (PDB) policy. In this situation, the pod resists being drained. After several attempts, the upgrade operation fails, and the cluster or node pool falls into a `Failed` state.

Check the `ALLOWED DISRUPTIONS` value in the PDB configuration. The value should be `1` or greater. For more information, see [Plan for availability using pod disruption budgets](/azure/aks/operator-best-practices-scheduler#plan-for-availability-using-pod-disruption-budgets). For example, you can check the workload and its PDB as follows. In this example, the `ALLOWED DISRUPTIONS` column shows that no disruption is allowed. If the `ALLOWED DISRUPTIONS` value is `0`, the pods can't be evicted, and the node drain fails during the upgrade process:
```console
$ kubectl get deployments.apps nginx
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
nginx   2/2     2            2           62s

$ kubectl get pod
NAME                     READY   STATUS    RESTARTS   AGE
nginx-7854ff8877-gbr4m   1/1     Running   0          68s
nginx-7854ff8877-gnltd   1/1     Running   0          68s

$ kubectl get pdb
NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
nginx-pdb   2               N/A               0                     24s
```
You can also check for entries in the Kubernetes events by using the `kubectl get events | grep -i drain` command. Output that resembles the following contains the message "Eviction blocked by Too Many Requests (usually a pdb)":

```console
$ kubectl get events | grep -i drain
LAST SEEN   TYPE      REASON   OBJECT                                          MESSAGE
(...)
32m         Normal    Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Draining node: aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx
2m57s       Warning   Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Eviction blocked by Too Many Requests (usually a pdb): <pod-name>
12m         Warning   Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Eviction blocked by Too Many Requests (usually a pdb): <pod-name>
32m         Warning   Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Eviction blocked by Too Many Requests (usually a pdb): <pod-name>
32m         Warning   Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Eviction blocked by Too Many Requests (usually a pdb): <pod-name>
31m         Warning   Drain    node/aks-<nodepool-name>-xxxxxxxx-vmssxxxxxx   Eviction blocked by Too Many Requests (usually a pdb): <pod-name>
```
To resolve this issue, use one of the following solutions.
## Solution 1: Enable pods to drain

1. Adjust the PDB to enable pod draining. Generally, the allowed disruption is controlled by the `Min Available / Max unavailable` or `Running pods / Replicas` parameter. You can modify the `Min Available / Max unavailable` parameter at the PDB level, or increase the number of `Running pods / Replicas`, to push the allowed disruption value to **1** or greater (see the sketch after these steps).

2. Try again to upgrade the AKS cluster to the same version that you tried to upgrade to previously. This process triggers a reconciliation.
   ```console
   $ az aks upgrade --name <aksName> --resource-group <resourceGroupName>
   Are you sure you want to perform this operation? (y/N): y
   Cluster currently in failed state. Proceeding with upgrade to existing version 1.28.3 to attempt resolution of failed cluster state.
   Since control-plane-only argument is not specified, this will upgrade the control plane AND all nodepools to version . Continue? (y/N): y
   ```
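For example, a minimal sketch of step 1, using the hypothetical `nginx` Deployment and `nginx-pdb` PDB from the Cause section (the names and the `default` namespace are assumptions), is to either relax the PDB or add a replica so that at least one disruption is allowed:

```console
# Option A: lower minAvailable so that ALLOWED DISRUPTIONS becomes 1.
$ kubectl patch pdb nginx-pdb -n default --type merge -p '{"spec":{"minAvailable":1}}'
poddisruptionbudget.policy/nginx-pdb patched

# Option B: raise the replica count above minAvailable instead.
$ kubectl scale deployment nginx -n default --replicas=3
deployment.apps/nginx scaled
```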
## Solution 2: Back up, delete, and redeploy the PDB

1. Take a backup of the PDB(s) by using the `kubectl get pdb <pdb-name> -n <pdb-namespace> -o yaml > pdb-name-backup.yaml` command, and then delete the PDB by using the `kubectl delete pdb <pdb-name> -n <pdb-namespace>` command. After the new upgrade attempt is finished, you can redeploy the PDB by applying the backup file: `kubectl apply -f pdb-name-backup.yaml` (see the sketch after these steps).

2. Try again to upgrade the AKS cluster to the same version that you tried to upgrade to previously. This process triggers a reconciliation.
   ```console
   $ az aks upgrade --name <aksName> --resource-group <resourceGroupName>
   Are you sure you want to perform this operation? (y/N): y
   Cluster currently in failed state. Proceeding with upgrade to existing version 1.28.3 to attempt resolution of failed cluster state.
   Since control-plane-only argument is not specified, this will upgrade the control plane AND all nodepools to version . Continue? (y/N): y
   ```
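For example, the back-up, delete, and redeploy sequence from step 1 might look like this, using the hypothetical `nginx-pdb` PDB from the Cause section (the `default` namespace is an assumption):

```console
$ kubectl get pdb nginx-pdb -n default -o yaml > pdb-name-backup.yaml
$ kubectl delete pdb nginx-pdb -n default
poddisruptionbudget.policy "nginx-pdb" deleted

# After the new upgrade attempt finishes, redeploy the PDB.
$ kubectl apply -f pdb-name-backup.yaml
poddisruptionbudget.policy/nginx-pdb created
```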
## Solution 3: Delete the pods that can't be drained or scale the workload down to zero (0)
1. Delete the pods that can't be drained.
   > [!NOTE]
   >
   > If the pods are created by a Deployment or StatefulSet, they're controlled by a ReplicaSet. If that's the case, you might have to delete the Deployment or StatefulSet, or scale its replicas down to zero (0). Before you do that, we recommend that you make a backup: `kubectl get <deployment.apps -or- statefulset.apps> <name> -n <namespace> -o yaml > backup.yaml`.
2. To scale down, run `kubectl scale --replicas=0 <deployment.apps -or- statefulset.apps> <name> -n <namespace>` before the reconciliation (see the sketch after these steps).

3. Try again to upgrade the AKS cluster to the same version that you tried to upgrade to previously. This process triggers a reconciliation.
   ```console
   $ az aks upgrade --name <aksName> --resource-group <resourceGroupName>
   Are you sure you want to perform this operation? (y/N): y
   Cluster currently in failed state. Proceeding with upgrade to existing version 1.28.3 to attempt resolution of failed cluster state.
   Since control-plane-only argument is not specified, this will upgrade the control plane AND all nodepools to version . Continue? (y/N): y
   ```
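For example, a sketch of the backup, scale-down, and restore sequence, again using the hypothetical `nginx` Deployment from the Cause section (the `default` namespace and the original replica count of 2 are assumptions):

```console
$ kubectl get deployment.apps nginx -n default -o yaml > backup.yaml
$ kubectl scale deployment.apps nginx -n default --replicas=0
deployment.apps/nginx scaled

# After the upgrade attempt finishes, restore the original replica count.
$ kubectl scale deployment.apps nginx -n default --replicas=2
deployment.apps/nginx scaled
```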
[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]