articles/redis/best-practices-server-load.md
Redis server is a single-threaded system. Long running commands can cause latency or timeouts on the client side because the server can't respond to any other requests while it's busy working on a long running command. For more information, see [Troubleshoot Azure Cache for Redis server-side issues](troubleshoot-server.md).
## Monitor Server Load and CPU
Add monitoring on server load and CPU so that you get notifications when either metric is high. Monitoring can help you understand your application constraints. Then, you can work proactively to mitigate issues. We recommend trying to keep server load under 80% to avoid negative performance effects. Sustained server load over 80% can lead to unplanned failovers.
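As a rough illustration of the 80% guidance above, the following sketch (plain Python over assumed sample values, not an Azure Monitor API) distinguishes a brief spike from sustained server load. The function name, sample values, and `min_consecutive` parameter are hypothetical:

```python
# Illustrative only: evaluate sampled Server Load percentages (0-100) against
# the 80% guidance. Collecting the samples from Azure Monitor is out of scope.

def sustained_over_threshold(samples, threshold=80.0, min_consecutive=3):
    """Return True if `samples` stays above `threshold` for at least
    `min_consecutive` consecutive readings -- a rough proxy for the
    'sustained' load that can lead to unplanned failovers."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

# A single brief spike does not count as sustained load:
print(sustained_over_threshold([70, 95, 60, 85, 75]))  # False
# Three consecutive readings above 80% do:
print(sustained_over_threshold([82, 91, 88, 79]))      # True
```

The point of the sketch is only that alerting logic should look for load that stays high across consecutive samples, not at isolated readings.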
Currently, Azure Managed Redis exposes two metrics in **Insights** under **Monitoring** on the Resource menu on the left of the portal: **CPU** and **Server Load**. Understanding what is measured by each metric is important when monitoring them.
The **CPU** metric (also known as `percentProcessorTime`) indicates the CPU usage for the node that hosts the cache. The CPU metric also includes processes that aren't strictly Redis server processes, such as background anti-malware processes. As a result, the CPU metric can sometimes spike and might not be a perfect indicator of CPU usage for the Redis server.
The **Server Load** metric reflects the Redis server's own assessment of its overall load. It's similar to the CPU metric, but measured at the cluster level.
### Recommendations for Smaller SKUs
On Azure Managed Redis SKUs backed by 2-vCPU VMs (B0–B5, X3, and M10), percentage-based metrics like **Server Load** and **CPU** are inherently more sensitive. A single short-lived background thread can consume a significant percentage of total CPU, causing metrics to appear elevated even when actual workload is light. As a result, these metrics can overestimate actual load on small SKUs and may not indicate workload saturation.
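A small numeric sketch (assumed sample values, not real telemetry) shows why percentage metrics are noisier on a 2-vCPU node: a background task that briefly saturates one of the two vCPUs contributes 50% of total capacity by itself, so **Maximum** aggregation reports a high value even though the average workload is light:

```python
# Hypothetical per-second CPU samples for one minute on a 2-vCPU node:
# mostly idle (~5%), with one 1-second background task that fully occupies
# one vCPU (one vCPU = 50% of total capacity on this node size).
samples = [5.0] * 59 + [55.0]

maximum = max(samples)
average = sum(samples) / len(samples)

print(f"Maximum: {maximum:.1f}%")   # dominated by the brief spike
print(f"Average: {average:.2f}%")   # the actual workload is light
```

On a node with many vCPUs, the same 1-second burst would move the percentage far less, which is why these metrics can overestimate load specifically on small SKUs.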
When reviewing metrics over longer time periods, such as several hours or days, we recommend:
- Using **CPU** instead of **Server Load**, because it can be viewed at the instance level, which adds more granularity
- Splitting by instance ID of the virtual machines backing the Azure Managed Redis instance
- Using **Average** aggregation instead of **Maximum** for these longer time ranges
You can still use **Maximum** aggregation over short time windows to catch brief spikes or events (such as those that might cause timeouts or failovers), while relying on **Average** over longer windows for trend analysis on small SKUs, especially when using **CPU**.
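The two aggregation styles can be sketched over the same assumed series: **Maximum** over short windows surfaces the spike, while a long-window **Average** smooths it out for trend analysis. The window size and sample values here are illustrative, not Azure Monitor defaults:

```python
# Assumed CPU samples: one reading per minute for an hour, with one brief spike.
samples = [10.0] * 30 + [90.0] + [10.0] * 29

def window_max(series, size):
    """Maximum aggregation per fixed window of `size` samples."""
    return [max(series[i:i + size]) for i in range(0, len(series), size)]

short_window_maxima = window_max(samples, size=5)   # catches the 90% spike
long_window_average = sum(samples) / len(samples)   # smooths it to a trend

print(short_window_maxima)
print(f"Hour-long average: {long_window_average:.2f}%")
```

Only one of the twelve 5-sample windows reports the spike, while the hour-long average stays close to the baseline, which mirrors the guidance above: short-window **Maximum** for spike detection, long-window **Average** for trends.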