Commit 305b808

docs: added new tshoot guide
1 parent 426af90 commit 305b808

2 files changed

Lines changed: 161 additions & 0 deletions

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
---
title: Troubleshoot Performance Issues with the Managed NGINX Ingress Controller in AKS
description: Step-by-step guide to identify and resolve performance issues with the Managed NGINX Ingress Controller in Azure Kubernetes Service (AKS), including common symptoms, root cause analysis, and configuration adjustments.
ms.reviewer: claudiogodoy
ms.service: azure-kubernetes-service
ms.date: 05/24/2025
---
# Troubleshoot Performance Issues with the Managed NGINX Ingress Controller

The [Managed NGINX ingress controller](/azure/aks/app-routing) is a routing add-on that routes HTTP and HTTPS traffic to applications running on an [Azure Kubernetes Service (AKS)](https://learn.microsoft.com/en-us/azure/aks/) cluster.

When an application has performance problems, the routing layer might be the root cause. This article provides step-by-step guidance to troubleshoot performance issues with the managed NGINX ingress controller.

## Prerequisites

Before you start, make sure that the following tool is installed:

- **Kubernetes CLI (`kubectl`)**: To install it by using Azure CLI, run `az aks install-cli`.

## Common Symptoms

| Symptom | Description |
| --- | --- |
| **HTTP Gateway Errors** | Error codes such as 502 and 504 might indicate that NGINX is exhausted. |
| **High Response Time Difference** | A significant difference between your service's response time and the end-to-end response time. NGINX always adds some latency, but when the added latency is too large, NGINX might be exhausted. |

## Step 1: Verify HPA Behavior

The most common cause of NGINX performance issues is CPU exhaustion. During a load spike, watch the behavior of the [Horizontal Pod Autoscaler (HPA)](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) that scales the NGINX pods.
By default, the routing add-on creates a [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) named `app-routing-system`, which is the namespace that the commands in this article target.

1. **Get the HPA name**:

   ```console
   kubectl get hpa -n app-routing-system
   ```

2. **Watch the HPA behavior**:

   ```console
   kubectl get hpa <HPA_NAME> -n app-routing-system -w
   ```

3. **Evaluate the result**:

   ```console
   $ kubectl get hpa <HPA_NAME> -n app-routing-system -w
   NAME    REFERENCE          TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
   nginx   Deployment/nginx   cpu: 83%/70%     1         2         1          77m
   nginx   Deployment/nginx   cpu: 83%/70%     1         2         2          77m
   nginx   Deployment/nginx   cpu: 106%/70%    1         2         2          79m
   nginx   Deployment/nginx   cpu: 133%/70%    1         2         2          80m
   ```

The **TARGETS** column shows the current average CPU utilization, followed by the target utilization above which the `HPA` scales up the pods. In this example, utilization keeps rising above the 70% target even after the `HPA` scales to two replicas. When utilization stays above the target, there are two likely causes:

- The `HPA` has reached the maximum number of pods (**REPLICAS** equals **MAXPODS**).
- There are no available nodes to schedule the new pods.

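To check a row of the `HPA` output against the first possibility, it can help to reproduce the replica calculation that the Kubernetes documentation describes for the HPA: the desired replica count is `ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped between the minimum and maximum. The following is a minimal sketch of that formula, not the controller's implementation. The function name `hpa_desired_replicas` is illustrative, and the sketch ignores the controller's tolerance band and stabilization behavior.

```shell
#!/bin/sh
# Sketch of the documented HPA replica formula (illustrative helper, not part of kubectl):
#   desired = ceil(currentReplicas * currentUtilization / targetUtilization)
# The result is clamped to the range [minReplicas, maxReplicas].
hpa_desired_replicas() {
  current_replicas=$1   # REPLICAS column
  current_util=$2       # left side of TARGETS, in percent
  target_util=$3        # right side of TARGETS, in percent
  min_replicas=$4       # MINPODS column
  max_replicas=$5       # MAXPODS column

  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min_replicas" ]; then
    desired=$min_replicas
  fi
  if [ "$desired" -gt "$max_replicas" ]; then
    desired=$max_replicas
  fi
  echo "$desired"
}

# Last row of the sample output: 2 replicas at 133% against a 70% target, MAXPODS 2.
hpa_desired_replicas 2 133 70 1 2    # prints 2

# Same load with a hypothetical maximum of 10 pods.
hpa_desired_replicas 2 133 70 1 10   # prints 4
```

For the last row of the sample output, the formula asks for four replicas, but the maximum of two caps the result. That gap is the signature of an `HPA` that has reached its maximum number of pods.
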
## Step 2: Look for Pods in Pending State

If the previous step shows that the NGINX `HPA` hasn't reached the maximum number of pods, the [kube-scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler) might be unable to find available nodes for the NGINX pods.

1. **Get Pending Pods**:

   ```console
   kubectl get pod --field-selector=status.phase=Pending -n app-routing-system
   ```

> [!NOTE]
> If there are Pending pods, the cluster is probably facing a resource exhaustion problem. In this case, refer to [Troubleshoot pod scheduler errors in Azure Kubernetes Service](azure/azure-kubernetes/availability-performance/troubleshoot-pod-scheduler-errors).

## Step 3: Verify if There Are Limits Applied to the NGINX Deployment

The `HPA` calculates CPU utilization relative to the CPU that the pods request. Therefore, any misconfiguration of the NGINX [resource limits or requests](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) might cause the `HPA` to scale up more pods than necessary.

1. **Describe the NGINX Deployment**:

   ```console
   kubectl describe deploy nginx -n app-routing-system
   ```

2. **Verify Requests and Limits**:

   ```console
   $ kubectl describe deploy nginx -n app-routing-system
   Name:                   nginx
   ....
   Selector:               app=nginx
   ....
   Pod Template:
     ....
     Containers:
      controller:
       ...
       Limits:
         ...
       Requests:
         ...
   ```

## Solution

By default, the current version of the NGINX ingress controller doesn't set limits for the NGINX pods and requests `500m` of CPU. The `HPA` uses this request to calculate CPU utilization. Changing these values directly in the deployment definition isn't recommended.

If your `HPA` reaches the maximum number of pods and the deployment's requests and limits are unchanged from the defaults, adjust the scaling settings in the `NginxIngressController` custom resource. This resource is defined by a [custom resource definition (CRD)](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) called [NginxIngressController](https://github.com/Azure/aks-app-routing-operator/blob/main/config/crd/bases/approuting.kubernetes.azure.com_nginxingresscontrollers.yaml).

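Because HPA CPU utilization is a percentage of the requested CPU, the `500m` request determines how much actual CPU each percentage represents. The following minimal sketch converts a utilization percentage into average millicores per pod. The helper name `util_to_millicores` is illustrative, and the 70% and 133% values are taken from the sample output in Step 1.

```shell
#!/bin/sh
# Convert an HPA utilization percentage into average CPU usage per pod,
# given the container's CPU request in millicores (illustrative helper).
util_to_millicores() {
  request_m=$1   # CPU request in millicores, for example 500 for "500m"
  util_pct=$2    # utilization percentage from the TARGETS column
  echo $(( request_m * util_pct / 100 ))
}

# The 70% target from the sample output, applied to the default 500m request.
util_to_millicores 500 70    # prints 350

# The 133% peak from the sample output.
util_to_millicores 500 133   # prints 665
```

Under these assumptions, the `HPA` starts adding pods when the average pod uses more than about 350 millicores, which is why changing the request in the deployment would also shift the scaling behavior.
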
### Configuration Options

The following configuration options directly affect the `HPA` behavior:

| Property | Type | Description | Required | Default |
|----------------|---------|-----------------------------------------------------------------------------|----------|-----------|
| `scaling` | object | Configuration for scaling the controller. Contains the nested properties `maxReplicas`, `minReplicas`, and `threshold`. | No | - |
| `maxReplicas` | integer | Upper limit for replicas. | No | 100 |
| `minReplicas` | integer | Lower limit for replicas. | No | 2 |
| `threshold` | string | Scaling threshold that defines how aggressively to scale. Options: `rapid`, `steady`, `balanced`. | No | balanced |

### How to Apply the Configuration

Follow these steps to apply the configuration:

1. **Edit the NginxIngressController resource**:

   ```console
   kubectl edit nginxingresscontroller -n app-routing-system
   ```

2. **Add or modify the scaling configuration**:

   ```yaml
   spec:
     scaling:
       maxReplicas: 10
       minReplicas: 2
       threshold: "balanced"
   ```

3. **Save and exit** the editor to apply the changes.

4. **Verify the changes**:

   ```console
   kubectl get hpa -n app-routing-system
   ```

The `HPA` updates automatically to reflect your new configuration, and the NGINX ingress controller scales according to the specified parameters.

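If you prefer a non-interactive change, one possible alternative to `kubectl edit` is a JSON merge patch that carries the same `spec.scaling` fields. The following sketch only builds the patch document. The helper name `scaling_patch` is illustrative, and `<CONTROLLER_NAME>` in the commented command is a placeholder for the name of your `NginxIngressController` resource.

```shell
#!/bin/sh
# Build a JSON merge patch for the spec.scaling fields described in this article.
# Illustrative helper: it only prints JSON and doesn't contact the cluster.
scaling_patch() {
  min_replicas=$1
  max_replicas=$2
  threshold=$3   # one of: rapid, steady, balanced
  printf '{"spec":{"scaling":{"minReplicas":%d,"maxReplicas":%d,"threshold":"%s"}}}\n' \
    "$min_replicas" "$max_replicas" "$threshold"
}

# Print the patch that matches the YAML example (minReplicas 2, maxReplicas 10).
scaling_patch 2 10 balanced

# Possible use (assumes kubectl access to the cluster; replace the placeholder name):
#   kubectl patch nginxingresscontroller <CONTROLLER_NAME> -n app-routing-system \
#     --type merge -p "$(scaling_patch 2 10 balanced)"
```

A merge patch changes only the fields that it lists, so other settings on the resource stay as they are.
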
## Additional resources

- [Learn more about Azure Kubernetes Service (AKS) best practices](/azure/aks/best-practices)
- [Monitor your Kubernetes cluster performance with Container insights](/azure/azure-monitor/containers/container-insights-analyze)
- [NGINX ingress controller](https://github.com/kubernetes/ingress-nginx)

[!INCLUDE [Third-party information disclaimer](../../../includes/third-party-disclaimer.md)]

[!INCLUDE [Third-party contact information disclaimer](../../../includes/third-party-contact-disclaimer.md)]

[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]

support/azure/azure-kubernetes/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -152,6 +152,8 @@
   href: load-bal-ingress-c/create-unmanaged-ingress-controller.md
 - name: Troubleshoot Application Gateway Ingress Controller connectivity
   href: load-bal-ingress-c/troubleshoot-app-gateway-ingress-controller-connectivity-issues.md
+- name: Troubleshoot Performance Issues with the Managed NGINX Ingress Controller in AKS
+  href: load-bal-ingress-c/tshoot-performance-ingress.md
 
 - name: Troubleshoot Kubernetes control plane
   items:
