support/azure/azure-kubernetes/create-upgrade-delete/troubleshoot-apiserver-etcd.md (7 additions & 6 deletions)
@@ -49,6 +49,7 @@ The following table outlines the common symptoms of API server failures.
| Timeouts from the API server | Frequent timeouts that exceed the guarantees in [the AKS API server SLA](/azure/aks/free-standard-pricing-tiers#uptime-sla-terms-and-conditions). For example, `kubectl` commands time out. |
| High latencies | High latencies that make the Kubernetes SLOs fail. For example, the `kubectl` command takes more than 30 seconds to list pods. |
| API server pod in `CrashLoopBackOff` status or facing webhook call failures | Verify that you don't have a custom admission webhook (such as the [Kyverno](https://kyverno.io/docs/introduction/) policy engine) that's blocking calls to the API server. |
+| Elevated HTTP 429 responses from the API server | The API server is throttling calls. See the [Troubleshooting checklist](#troubleshooting-checklist) below. |
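If you see elevated HTTP 429 responses, a log query can show which clients are being throttled. This is a minimal sketch, assuming diagnostics are collected in resource-specific mode (the `AKSAudit` table); adjust the table name and time window for your environment:

```kusto
// Count throttled (HTTP 429) requests per client user agent over the last hour.
// Assumes resource-specific diagnostics mode (AKSAudit table).
AKSAudit
| where TimeGenerated > ago(1h)
| where tostring(ResponseStatus.code) == "429"
| summarize ThrottledCalls = count() by UserAgent, bin(TimeGenerated, 5m)
| order by ThrottledCalls desc
```

A user agent that dominates this result is a good starting point for the checklist steps that follow.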
## Troubleshooting checklist
@@ -86,8 +87,8 @@ AzureDiagnostics
> If your query returns no results, you might have selected the wrong table to query diagnostics logs. In resource-specific mode, data is written to individual tables, depending on the category of the resource. Diagnostics logs are written to the `AKSAudit` table. In Azure diagnostics mode, all data is written to the `AzureDiagnostics` table. For more information, see [Azure resource logs](/azure/azure-monitor/essentials/resource-logs).
Although it's helpful to know which clients generate the highest request volume, high request volume alone might not be a cause for concern. The response latency that clients experience is a better indicator of the actual load that each one generates on the API server.
-
-###Step 2a: Identify and chart the average latency of API server requests per user agent
+
+### Step 2: Identify and chart latency per user agent
+
+#### [Diagnose and Solve](#tab/Diagnose-and-solve)
AKS now provides a built-in analyzer, the API Server Resource Intensive Listing Detector, to help you identify agents that make resource-intensive LIST calls. These calls are a leading cause of API server and etcd performance issues.
@@ -104,7 +105,7 @@ The detector analyzes recent API server activity and highlights agents or worklo
:::image type="content" source="media/troubleshoot-apiserver-etcd/resource-intensive-listing-analyzer-2.png" alt-text="Screenshot that shows the apiserver perf detector detailed view." lightbox="media/troubleshoot-apiserver-etcd/resource-intensive-listing-analyzer-2.png":::
-#### How to interpret the detector output
+##### How to interpret the detector output
- **Summary:** Indicates whether resource-intensive LIST calls were detected and describes possible impacts on your cluster.
@@ -124,11 +125,11 @@ The analyzer also provides recommendations directly in the Azure portal. These r
>
> After you identify the offending agents and apply the recommendations, you can use [the API Priority and Fairness feature](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) to throttle or isolate problematic clients. Alternatively, refer to the "Cause 3" section of [Troubleshoot API server and etcd problems in Azure Kubernetes Services](/troubleshoot/azure/azure-kubernetes/create-upgrade-delete/troubleshoot-apiserver-etcd?tabs=resource-specific#cause-3-an-offending-client-makes-excessive-list-or-put-calls).
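The API Priority and Fairness approach mentioned above can be sketched as a pair of manifests. This example is illustrative only: the service account name `noisy-agent` and the concurrency and queue limits are assumptions, not values from this article. Tune them to your cluster before applying.

```yaml
# Illustrative sketch: throttle a hypothetical noisy client with API Priority and Fairness.
# The service account name and all limits below are assumptions, not values from this article.
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: limited-list-clients
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 5      # small share of API server concurrency
    limitResponse:
      type: Queue                    # queue excess requests instead of rejecting immediately
      queuing:
        queues: 16
        queueLengthLimit: 50
        handSize: 4
---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: limit-noisy-agent
spec:
  priorityLevelConfiguration:
    name: limited-list-clients       # route matching traffic to the limited level above
  matchingPrecedence: 500
  distinguisherMethod:
    type: ByUser
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: noisy-agent            # hypothetical offending client
        namespace: default
    resourceRules:
    - verbs: ["list", "watch"]
      apiGroups: ["*"]
      resources: ["*"]
      clusterScope: true
      namespaces: ["*"]
```

Requests from the matched service account then share the small concurrency budget of `limited-list-clients` instead of competing with system traffic.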
-###Step 2.b: Identify the average latency
+#### [Logs](#tab/logs)
To identify the average latency of API server requests per user agent, as plotted on a time chart, run the following query.