| 1 | +--- |
| 2 | +title: Troubleshoot Performance Issues in the Managed NGINX Ingress Controller in AKS |
| 3 | +description: Step-by-step guide to identify and resolve performance issues in the Managed NGINX Ingress Controller in AKS. |
| 4 | +ms.reviewer: claudiogodoy |
| 5 | +ms.service: azure-kubernetes-service |
| 6 | +ms.date: 05/24/2025 |
| 7 | +--- |
| 8 | +# Troubleshoot performance issues in the Managed NGINX ingress controller |
| 9 | + |
| 10 | +The [Managed NGINX ingress controller](/azure/aks/app-routing) is a routing add-on that enables the routing of HTTP and HTTPS traffic to applications that run on an [Azure Kubernetes Service (AKS)](/azure/aks/) cluster. |
| 11 | + |
| 12 | +The routing system might be the root cause of performance-related problems. This article provides step-by-step guidance to troubleshoot NGINX ingress controller performance issues. This article also discusses common symptoms, root cause analysis, and configuration adjustments. |
| 13 | + |
| 14 | +## Prerequisites |
| 15 | + |
| 16 | +Before you start, make sure that you have the following tool installed: |
| 17 | + |
| 18 | +- Kubernetes CLI (`kubectl`) |
| 19 | + |
| 20 | +To install `kubectl` by using Azure CLI, run the `az aks install-cli` command. |
| 21 | + |
| 22 | +## Symptoms |
| 23 | + |
| 24 | +You receive an HTTP gateway error code, such as `502` or `504`, or a response timeout error for requests that pass through the NGINX ingress controller. These errors can indicate that the NGINX pods are running out of resources. |
| 25 | + |
| 26 | +You encounter a significant difference between your service response time and the end-to-end response time. This difference can indicate that NGINX is adding latency because its pods are running out of resources. |
| 27 | + |
| 28 | +## Cause |
| 29 | + |
| 30 | +The most common cause of performance issues in the NGINX ingress controller is CPU exhaustion. A good troubleshooting method is to monitor [HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/) behavior during a load spike. By default, the routing add-on creates a [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) that's named `app-routing-system`. The commands in this article query the NGINX resources in that namespace. |
| 31 | + |
| 32 | +## Resolution |
| 33 | + |
| 34 | +To troubleshoot the issue, follow these steps. |
| 35 | + |
| 36 | +### Step 1: Verify horizontal pod autoscaler (HPA) behavior |
| 37 | + |
| 38 | +1. Get the HPA name: |
| 39 | + |
| 40 | + ```console |
| 41 | + kubectl get hpa -n app-routing-system |
| 42 | + ``` |
| 43 | + |
| 44 | +2. Monitor the `HPA` behavior: |
| 45 | + |
| 46 | + ```console |
| 47 | + kubectl get hpa <HPA_NAME> -n app-routing-system -w |
| 48 | + ``` |
| 49 | + |
| 50 | +3. Evaluate the results: |
| 51 | + |
| 52 | + ```console |
| 53 | + $ kubectl get hpa <HPA_NAME> -n app-routing-system -w |
| 54 | + NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE |
| 55 | + nginx Deployment/nginx cpu: 83%/70% 1 2 1 77m |
| 56 | + nginx Deployment/nginx cpu: 83%/70% 1 2 2 77m |
| 57 | + nginx Deployment/nginx cpu: 106%/70% 1 2 2 79m |
| 58 | + nginx Deployment/nginx cpu: 133%/70% 1 2 2 80m |
| 59 | + ``` |
| 60 | + |
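| 61 | +The **TARGETS** column shows the current CPU utilization followed by the target utilization at which the HPA scales up the pods (for example, `83%/70%`). In this output, utilization stays above the 70 percent target even after the replica count reaches the **MAXPODS** value of 2. This behavior has several possible causes: |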
| 62 | + |
| 63 | +- The `HPA` reached the maximum number of pods. |
| 64 | +- No nodes are available to use to schedule the pods. |
| 65 | + |
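As a rough illustration of the evaluation above, the following bash sketch reads `kubectl get hpa` table output from standard input. It reports rows in which CPU utilization is above the target while the replica count has already reached `MAXPODS`. The function name `hpa_saturated` is hypothetical, and the parsing assumes the column layout shown in the sample output: a `current%/target%` pair followed by the MINPODS, MAXPODS, and REPLICAS columns.

```shell
#!/usr/bin/env bash
# hpa_saturated (hypothetical helper): reads `kubectl get hpa` table output on stdin
# and prints one line for each row where CPU is above target and REPLICAS >= MAXPODS.
hpa_saturated() {
  awk '
    $1 == "NAME" { next }                     # skip the header row
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^[0-9]+%\/[0-9]+%$/) {      # the current/target pair, for example 83%/70%
          split($i, pair, "/")
          current  = pair[1] + 0              # "83%" converts to 83
          target   = pair[2] + 0
          maxpods  = $(i + 2) + 0             # assumed layout: MINPODS MAXPODS REPLICAS follow
          replicas = $(i + 3) + 0
          if (current > target && replicas >= maxpods)
            printf "%s saturated: cpu %d%% > %d%%, replicas %d/%d\n",
                   $1, current, target, replicas, maxpods
          break
        }
      }
    }'
}
```

For example, `kubectl get hpa -n app-routing-system | hpa_saturated` prints one line for each saturated HPA and nothing when the HPA still has room to scale. A saturated row corresponds to the first cause in the list. It doesn't rule out pending pods.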
| 66 | +### Step 2: Look for pods in a pending state |
| 67 | + |
| 68 | +If your evaluation reveals that the NGINX HPA didn't reach the maximum number of pods, the [kube-scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler) might not be able to find available nodes on which to schedule the NGINX pods. To find pending pods, run the following command: |
| 69 | + |
| 70 | +```console |
| 71 | +kubectl get pod --field-selector=status.phase=Pending -n app-routing-system |
| 72 | +``` |
| 73 | + |
| 74 | +> [!NOTE] |
| 75 | +> If there are pending pods, the cluster might experience a resource exhaustion problem. For more information, see [Troubleshoot pod scheduler errors in Azure Kubernetes Service](../availability-performance/troubleshoot-pod-scheduler-errors.md). |
| 76 | + |
| 77 | +### Step 3: Check whether limits are applied to the NGINX deployment |
| 78 | + |
| 79 | +A misconfiguration of the NGINX [resource limits or requests](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) can cause the HPA to scale up more pods than it needs. To check the limits, follow these steps: |
| 80 | + |
| 81 | +1. Describe the NGINX deployment: |
| 82 | + |
| 83 | + ```console |
| 84 | + kubectl describe deploy nginx -n app-routing-system |
| 85 | + ``` |
| 86 | + |
| 87 | +2. Verify the requests and limits: |
| 88 | + |
| 89 | + ```console |
| 90 | + $ kubectl describe deploy nginx -n app-routing-system |
| 91 | + Name: nginx |
| 92 | + .... |
| 93 | + Selector: app=nginx |
| 94 | + .... |
| 95 | + Pod Template: |
| 96 | + .... |
| 97 | + Containers: |
| 98 | + controller: |
| 99 | + ... |
| 100 | + Limits: |
| 101 | + ... |
| 102 | + Requests: |
| 103 | + ... |
| 104 | + ``` |
| 105 | + |
| 106 | +## More information |
| 107 | + |
| 108 | +By default, the current version of the NGINX ingress controller doesn't set limits for NGINX pods. Each controller pod requests `500m` of CPU, and the HPA calculates CPU utilization as a percentage of that request. We recommend that you don't change these settings directly in the deployment definition. |
| 109 | + |
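To make the relationship between the CPU request and HPA scaling concrete, the following bash sketch applies the replica formula from the Kubernetes HPA documentation: `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`, capped at the maximum replica count. The function name `desired_replicas` is hypothetical, and the 70 percent target is taken from the sample output in Step 1. The sketch is simplified: it ignores the HPA tolerance band, the minimum replica count, and stabilization windows.

```shell
#!/usr/bin/env bash
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MAX_REPLICAS
# Simplified sketch of the HPA replica calculation (hypothetical helper name).
desired_replicas() {
  local current_replicas=$1 current_util=$2 target_util=$3 max_replicas=$4
  # Integer ceiling of current_replicas * current_util / target_util.
  local desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
  # The HPA never scales beyond the configured maximum.
  if (( desired > max_replicas )); then
    desired=$max_replicas
  fi
  echo "$desired"
}

desired_replicas 2 133 70 10   # two pods at 133% of a 70% target, with room to scale
desired_replicas 2 133 70 2    # same load, but the maximum is already reached
```

With a `500m` request and a 70 percent target, scale-up starts when average usage exceeds about `350m` per pod. In the second call, the formula asks for more pods than the maximum allows, which matches the saturated output shown in Step 1.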
| 110 | +If the `HPA` reaches the maximum number of pods, and the deployment requests and limits remain unchanged, configure the [custom resource definition (CRD)](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) that's named [NginxIngressController](https://github.com/Azure/aks-app-routing-operator/blob/main/config/crd/bases/approuting.kubernetes.azure.com_nginxingresscontrollers.yaml). |
| 111 | + |
| 112 | +### Configuration options |
| 113 | + |
| 114 | +The following configuration options directly affect the `HPA` behavior. |
| 115 | + |
| 116 | +| Property | Type | Description | Required | Default | |
| 117 | +|----------------|---------|-----------------------------------------------------------------------------|----------|-----------| |
| 118 | +| `scaling` | object | Configuration for scaling the controller. Contains nested properties. | No | Not applicable | |
| 119 | +| `maxReplicas` | integer | Upper limit for replicas. | No | 100 | |
| 120 | +| `minReplicas` | integer | Lower limit for replicas. | No | 2 | |
| 121 | +| `threshold` | string | Scaling threshold that defines how aggressively to scale. Options include: `rapid`, `steady`, `balanced`. | No | balanced | |
| 122 | + |
| 123 | +### Apply the configuration |
| 124 | + |
| 125 | +1. Edit the NginxIngressController CRD: |
| 126 | + |
| 127 | + ```console |
| 128 | + kubectl edit nginxingresscontroller -n app-routing-system |
| 129 | + ``` |
| 130 | + |
| 131 | +2. Add or modify the scaling configuration: |
| 132 | + |
| 133 | + ```yaml |
| 134 | + spec: |
| 135 | + scaling: |
| 136 | + maxReplicas: 10 |
| 137 | + minReplicas: 2 |
| 138 | + threshold: "balanced" |
| 139 | + ``` |
| 140 | + |
| 141 | +3. To apply the changes, save them, and then exit the editor. |
| 142 | + |
| 143 | +4. Verify the changes: |
| 144 | + |
| 145 | + ```console |
| 146 | + kubectl get hpa -n app-routing-system |
| 147 | + ``` |
| 148 | + |
| 149 | +The HPA automatically updates, based on your new configuration. The NGINX ingress controller scales according to the specified parameters. |
| 150 | + |
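If you prefer a non-interactive change over `kubectl edit`, one possible approach is to build a JSON merge patch that has the same `spec.scaling` fields. The following bash sketch only builds and validates the patch document. The function name `scaling_patch` is hypothetical, the allowed `threshold` values come from the configuration table, and the minimum-versus-maximum check is a local sanity check rather than documented controller behavior.

```shell
#!/usr/bin/env bash
# scaling_patch MIN_REPLICAS MAX_REPLICAS THRESHOLD
# Prints a JSON merge patch for spec.scaling (hypothetical helper name).
scaling_patch() {
  local min=$1 max=$2 threshold=$3
  case "$threshold" in
    rapid|steady|balanced) ;;                 # values listed in the configuration table
    *) echo "threshold must be rapid, steady, or balanced" >&2; return 1 ;;
  esac
  if (( min > max )); then                    # local sanity check, an assumption
    echo "minReplicas must not exceed maxReplicas" >&2
    return 1
  fi
  printf '{"spec":{"scaling":{"minReplicas":%d,"maxReplicas":%d,"threshold":"%s"}}}\n' \
    "$min" "$max" "$threshold"
}
```

Assuming that the resource accepts merge patches, you could pass the output to `kubectl patch nginxingresscontroller <NAME> --type merge -p "$(scaling_patch 2 10 balanced)"`, where `<NAME>` is the name of your NginxIngressController resource. Then, verify the result by running `kubectl get hpa -n app-routing-system`.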
| 151 | +## References |
| 152 | + |
| 153 | +- [Learn more about Azure Kubernetes Service (AKS) best practices](/azure/aks/best-practices) |
| 154 | +- [Monitor your Kubernetes cluster performance with Container insights](/azure/azure-monitor/containers/container-insights-analyze) |
| 155 | +- [NGINX ingress controller](https://github.com/kubernetes/ingress-nginx) |
| 156 | + |
| 157 | +[!INCLUDE [Third-party information disclaimer](../../../includes/third-party-disclaimer.md)] |
| 158 | + |
| 159 | +[!INCLUDE [Third-party contact information disclaimer](../../../includes/third-party-contact-disclaimer.md)] |
| 160 | + |
| 161 | +[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)] |