## Cause 6: Konnectivity agent performance issues with cluster growth
As the cluster grows, the performance of Konnectivity Agents might degrade because of increased network traffic, more requests, or resource constraints.
> [!NOTE]
> This cause applies to only the `Konnectivity-agent` pods.
### Solution 6: Cluster Proportional Autoscaler for Konnectivity Agent
To manage scalability challenges in large clusters, we implement the Cluster Proportional Autoscaler for our Konnectivity Agents. This approach aligns with industry standards and best practices. It ensures optimal resource usage and enhanced performance.
**Why this change was made**
Previously, the Konnectivity agent had a fixed replica count that could create a bottleneck as the cluster grew. By implementing the Cluster Proportional Autoscaler, we enable the replica count to adjust dynamically, based on node-scaling rules, to provide optimal performance and resource usage.
**How the Cluster Proportional Autoscaler works**
The Cluster Proportional Autoscaler uses a ladder configuration to determine the number of Konnectivity agent replicas based on the cluster size. The ladder configuration is defined in the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here is an example of the ladder configuration:
```
"nodesToReplicas": [
    ...
]
```
This configuration makes sure that the number of replicas scales appropriately with the number of nodes in the cluster to provide optimal resource allocation and improved networking reliability.
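
The elided entries above follow the ladder format of the upstream `cluster-proportional-autoscaler` project, which maps node-count thresholds to replica counts. A hypothetical complete ladder might look like the following; the values are illustrative only, and the actual defaults in your cluster can differ:

```json
{
  "coresToReplicas": [],
  "nodesToReplicas": [
    [1, 2],
    [100, 3],
    [250, 4],
    [500, 5]
  ]
}
```

In this sketch, a cluster that has between 1 and 99 nodes runs 2 Konnectivity agent replicas, a cluster that has 100 to 249 nodes runs 3 replicas, and so on.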
**How to use the Cluster Proportional Autoscaler**
You can override default values by updating the `konnectivity-agent-autoscaler` configmap in the `kube-system` namespace. Here is a sample command to update the configmap:
```bash
kubectl edit configmap konnectivity-agent-autoscaler -n kube-system
```
This command opens the configmap in an editor to enable you to make the necessary changes.
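
Inside the editor, the scaling rules are JSON stored under a key of the configmap's `data` section. The following sketch shows the general shape of the manifest; the `ladder` key name follows the upstream `cluster-proportional-autoscaler` convention, and the values are hypothetical, not recommended defaults:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: konnectivity-agent-autoscaler
  namespace: kube-system
data:
  ladder: |-
    {
      "coresToReplicas": [],
      "nodesToReplicas": [
        [1, 2],
        [100, 3]
      ]
    }
```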
**What you should check**
You have to monitor for Out Of Memory (OOM) kills on the nodes because misconfiguration of the Cluster Proportional Autoscaler can cause insufficient memory allocation for the Konnectivity agents. This misconfiguration occurs for the following key reasons:
**High Memory Usage:** As the cluster grows, the memory usage of Konnectivity agents can increase significantly. This increase can occur especially during peak loads or when handling large numbers of connections. If the Cluster Proportional Autoscaler configuration does not scale the replicas appropriately, the agents might run out of memory.
**Fixed Resource Limits:** If the resource requests and limits for the Konnectivity agents are set too low, they might not have enough memory to handle the workload, leading to OOM kills. Misconfigured Cluster Proportional Autoscaler settings can exacerbate this issue by not providing enough replicas to distribute the load.
**Cluster Size and Workload Variability:** The CPU and memory that are needed by the Konnectivity agents can vary widely depending on the size of the cluster and the workload. If the Cluster Proportional Autoscaler ladder configuration is not right-sized and adaptively resized for the cluster's usage patterns, it can cause memory overcommitment and OOM kills.
To identify and troubleshoot OOM kills, follow these steps:
1. Check for OOM kills on nodes: Use the following command to check for OOM kills on your nodes:
```bash
kubectl get events --all-namespaces | grep -i 'oomkill'
```
2. Inspect node resource usage: Verify the resource usage on your nodes to make sure that they aren't running out of memory:
```bash
kubectl top nodes
```
3. Review pod resource requests and limits: Make sure that the Konnectivity agent pods have appropriate resource requests and limits set to prevent OOM kills:
```bash
kubectl get pod <pod-name> -n kube-system -o yaml | grep -A5 "resources:"
```
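
For reference, a pod spec with explicit requests and limits has a `resources` section like the following. The values here are illustrative assumptions, not recommended settings for the Konnectivity agent:

```yaml
resources:
  requests:
    cpu: 20m
    memory: 64Mi
  limits:
    memory: 128Mi
```

If the `limits.memory` value is set too low for the connection volume the agent handles, the kernel OOM-kills the container when it exceeds that limit.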