Commit 59758d1
Update node-not-ready-then-recovers.md
Changes in the right branch following the PR: https://github.com/VictoriaNoje/SupportArticles-docs-pr/pull/1/files
Parent: 599f9cf

1 file changed

Lines changed: 12 additions & 11 deletions

File tree

support/azure/azure-kubernetes/availability-performance/node-not-ready-then-recovers.md

---
title: Node not ready but then recovers
description: Troubleshoot scenarios in which the status of an Azure Kubernetes Service (AKS) cluster node is Node Not Ready, but then the node recovers.
ms.date: 10/15/2024
ms.reviewer: rissing, chiragpa, momajed, v-leedennis
ms.service: azure-kubernetes-service
#Customer intent: As an Azure Kubernetes user, I want to prevent the Node Not Ready status for nodes that later recover so that I can avoid future errors within an Azure Kubernetes Service (AKS) cluster.
---

This article helps troubleshoot scenarios in which a node within a Microsoft Azure Kubernetes Service (AKS) cluster shows a Not Ready status but then recovers.

## Symptoms

Maintaining node readiness in Azure Kubernetes Service (AKS) clusters is crucial for ensuring application availability and performance. When a node enters a Not Ready state, it can disrupt the application's functionality, causing it to stop responding. Although the node typically recovers automatically after a short period, understanding the underlying causes and implementing effective resolutions is essential to prevent recurring issues and maintain a stable environment. This article provides a guide to troubleshooting and resolving node readiness issues in AKS clusters.
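
To confirm the symptom and gather evidence for a root cause analysis, you can review the node's status and recent events. A minimal check with kubectl (assuming your kubeconfig points at the affected cluster) might look like this:

```bash
# List nodes; an affected node shows "NotReady" in the STATUS column
# while the issue is occurring, then returns to "Ready" after it recovers.
kubectl get nodes

# Inspect the node's conditions and recent events for the transition details.
kubectl describe node <node-name>
```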

## Cause

There are several scenarios that could lead to this issue:

- The API server is unavailable, which causes the readiness probe for your deployment to fail. This failure prevents the pod from being attached to the service, so traffic isn't forwarded to the pod instance.

- Virtual machine (VM) host faults occur. To determine whether VM host faults occurred, check the following information sources:

  - [AKS diagnostics](/azure/aks/concepts-diagnostics)
  - [Azure status](https://status.azure.com/)
  - Azure notifications (for any recent outages or maintenance periods)
## Resolution

Check API server availability by running the following command:
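
```bash
# List the aggregated API services; entries whose AVAILABLE column
# isn't "True" point to an API server availability problem.
kubectl get apiservices
```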
Ensure that the readiness probe is correctly configured in the deployment YAML file.
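
As a point of reference, here's a minimal sketch of an HTTP readiness probe in a Deployment's pod spec; the endpoint path, port, and timing values are placeholders to adapt to your application:

```yaml
# Hypothetical excerpt from a Deployment manifest (under spec.template.spec.containers[]).
readinessProbe:
  httpGet:
    path: /healthz        # placeholder health endpoint
    port: 8080            # placeholder container port
  initialDelaySeconds: 5  # delay before the first probe runs
  periodSeconds: 10       # how often the kubelet probes the container
  failureThreshold: 3     # consecutive failures before the pod is marked not ready
```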
For further steps, see [Basic troubleshooting of Node Not Ready failures](node-not-ready-basic-troubleshooting.md).
## Prevention

To prevent this issue from occurring in the future, take one or more of the following actions:

- Make sure that your service tier is fully paid for.
- Reduce the number of `watch` and `get` requests to the API server.
