
Commit ab2113d

Enhance troubleshooting steps for node auto-provisioning
Updated troubleshooting documentation for Azure Kubernetes node auto-provisioning. Added information on locks and network security group rules.
1 parent 5ab640c commit ab2113d

1 file changed

Lines changed: 8 additions & 3 deletions

File tree

support/azure/azure-kubernetes/extensions/troubleshoot-node-auto-provision.md

@@ -52,12 +52,14 @@ kubectl get events | grep -i "disruption\|consolidation"
5252
- DaemonSets preventing drain
5353
- Pod disruption budgets (PDBs) are not properly set
5454
- Nodes are marked with `do-not-disrupt` annotation
55+
- Locks blocking changes
5556

5657
**Solutions**:
5758
- Add proper tolerations to pods
5859
- Review DaemonSet configurations
5960
- Adjust pod disruption budgets to allow disruption
6061
- Remove `do-not-disrupt` annotations if appropriate
62+
- Review lock configurations
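
One way to review lock configurations is to list the Azure management locks on the cluster's node resource group. The following is a minimal sketch rather than a definitive procedure. The helper name `list_node_rg_locks` is hypothetical, and the sketch assumes the relevant locks are scoped to the node resource group (locks can also be set at subscription or individual resource scope).

```shell
# Hypothetical helper: list management locks on the node resource group
# of an AKS cluster. Assumes the Azure CLI is installed and signed in.
list_node_rg_locks() {
  rg="$1"        # resource group that contains the AKS cluster
  cluster="$2"   # AKS cluster name

  # Look up the managed node resource group for the cluster.
  node_rg="$(az aks show --resource-group "$rg" --name "$cluster" \
    --query nodeResourceGroup --output tsv)" || return 1

  # List the locks for that resource group.
  az lock list --resource-group "$node_rg" --output table
}

# Usage: list_node_rg_locks <resource-group> <cluster-name>
```

A `ReadOnly` or `CanNotDelete` lock in the output is a likely candidate for blocking node changes, but whether it can be removed depends on why it was put there.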
6163

6264

6365
## Networking Issues
@@ -75,6 +77,8 @@ kubectl exec -it <pod-name> -- ping <target-ip>
7577
kubectl exec -it <pod-name> -- nslookup kubernetes.default
7678
```
7779

80+
Another option for testing node-to-node or pod-to-pod connectivity is the open-source [goldpinger](https://github.com/bloomberg/goldpinger) tool.
81+
7882
2. **Check network plugin status**:
7983
```azurecli-interactive
8084
kubectl get pods -n kube-system | grep -E "azure-cni|kube-proxy"
@@ -102,7 +106,7 @@ ls -la /etc/cni/net.d/
102106
```
103107

104108
**Understanding conflist files**:
105-
- `10-azure.conflist`: Standard Azure CNI configuration for traditional networking with node subnet
109+
- `10-azure.conflist`: Standard Azure CNI configuration for traditional networking, used by all CNI modes that don't use overlay
106110
- `15-azure-swift-overlay.conflist`: Azure CNI with overlay networking (used with Cilium or overlay mode)
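
The mapping above can be sketched as a small lookup to run against the file names found in `/etc/cni/net.d/`. The helper name `describe_conflist` is hypothetical, and it covers only the two files listed here. Any other file name is reported as unrecognized rather than guessed at.

```shell
# Hypothetical helper: map a conflist file name to the networking mode
# described in the list above.
describe_conflist() {
  case "$1" in
    10-azure.conflist)
      echo "Azure CNI, traditional networking (no overlay)" ;;
    15-azure-swift-overlay.conflist)
      echo "Azure CNI with overlay networking" ;;
    *)
      echo "unrecognized conflist: $1" ;;
  esac
}

# Usage on a node:
#   for f in /etc/cni/net.d/*.conflist; do describe_conflist "$(basename "$f")"; done
```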
107111

108112
**Inspect the configuration content**:
@@ -133,13 +137,13 @@ kubectl logs -n kube-system -l k8s-app=azure-cns --tail=100
133137
- **If CNI calls don't appear in CNS logs**: You likely have the wrong CNI installed. Verify the correct CNI plugin is deployed.
134138

135139
**Common Causes**:
136-
- Network security group rules
140+
- Network security group (NSG) rules
137141
- Incorrect subnet configuration
138142
- CNI plugin issues
139143
- DNS resolution problems
140144

141145
**Solutions**:
142-
- Review NSG rules for required traffic
146+
- Review [Network Security Group][networ-security-group-docs] rules for required traffic
143147
- Verify subnet configuration in AKSNodeClass
144148
- Restart CNI plugin pods
145149
- Check CoreDNS configuration
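
As a starting point for the NSG review, the rules attached to the node subnet can be listed with the Azure CLI. This is a minimal sketch under stated assumptions: the helper name `show_subnet_nsg_rules` is hypothetical, the NSG is assumed to be associated with the subnet (not with individual network interfaces), and only custom rules are shown, not the default rules.

```shell
# Hypothetical helper: print the custom rules of the NSG attached to a subnet.
# Assumes the Azure CLI is installed and signed in.
show_subnet_nsg_rules() {
  rg="$1"       # resource group of the virtual network
  vnet="$2"     # virtual network name
  subnet="$3"   # subnet used by the nodes

  # Resolve the resource ID of the NSG associated with the subnet, if any.
  nsg_id="$(az network vnet subnet show --resource-group "$rg" \
    --vnet-name "$vnet" --name "$subnet" \
    --query networkSecurityGroup.id --output tsv)" || return 1

  if [ -z "$nsg_id" ]; then
    echo "No NSG is associated with subnet $subnet"
    return 0
  fi

  # Show the custom security rules of that NSG.
  az network nsg show --ids "$nsg_id" \
    --query "securityRules[].{name:name, priority:priority, direction:direction, access:access}" \
    --output table
}

# Usage: show_subnet_nsg_rules <vnet-resource-group> <vnet-name> <subnet-name>
```

Looking the NSG up by resource ID avoids assuming that it lives in the same resource group as the virtual network.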
@@ -263,3 +267,4 @@ az vm list-usage --location <region> --query "[?currentValue >= limit]"
263267
[aks-firewall-requirements]: /azure/aks/limit-egress-traffic#azure-global-required-network-rules
264268
[karpenter-troubleshooting]: https://karpenter.sh/docs/troubleshooting/
265269
[karpenter-faq]: https://karpenter.sh/docs/faq/
270+
[networ-security-group-docs]: /azure/virtual-network/network-security-groups-overview
