-description: Describes how to troubleshoot rolling upgrade issues.
+description: Discusses how to troubleshoot rolling upgrade issues.
 ms.date: 12/05/2025
 manager: dcscontentpm
 audience: itpro
@@ -15,17 +15,17 @@ ms.custom:

 ## Summary

-This article provides a structured troubleshooting approach for addressing common issues encountered during rolling upgrades in Windows Server Failover Clustering (WSFC), Storage Spaces Direct, SQL Server Always On availability groups, and Hyper-V.
+This article provides a structured troubleshooting method to resolve common issues that you might encounter during rolling upgrades in Windows Server Failover Clustering (WSFC), Storage Spaces Direct, SQL Server Always On availability groups, and Hyper-V.

-Rolling upgrades are essential for maintaining and upgrading systems with minimal downtime. However, challenges like compatibility and configuration errors can impact availability and potentially cause data loss.
+Rolling upgrades are essential for maintaining and upgrading systems with minimal downtime. However, challenges such as compatibility and configuration errors can affect availability and potentially cause data loss.

 ## Prerequisites

-Before starting a rolling upgrade:
+Before you start a rolling upgrade:

 - Verify that the rolling upgrade feature is supported for your workload and operating system (OS) versions.
-- Confirm all cluster nodes are healthy using the `Get-ClusterNode` PowerShell command.
-- Ensure you have up-to-date backups, including:
+- Verify that all cluster nodes are healthy by using the `Get-ClusterNode` PowerShell command.
+- Make sure that you have up-to-date backups, including:
   - System state
   - Cluster configuration
   - User data
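
The node health check in the prerequisites above can be sketched as follows. This is a minimal example, not part of the article under review. It assumes that the FailoverClusters PowerShell module is available and that the session runs on a cluster node. The filter on the `Up` state is added here for illustration.

```powershell
# Assumes the FailoverClusters module is installed (it ships with the
# Failover Clustering management tools).
Import-Module FailoverClusters

# List every node together with its current state.
Get-ClusterNode | Select-Object Name, State

# Show only the nodes that aren't in the Up state (for example, Down or Paused).
# An empty result suggests that all nodes are reporting as healthy.
Get-ClusterNode | Where-Object { $_.State -ne 'Up' }
```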
@@ -34,82 +34,86 @@ Before starting a rolling upgrade:

 ### Address rolling upgrade failures

-1. Move core resources to another node using Failover Cluster Manager or the `Move-ClusterGroup` PowerShell command.
-2. Use `Suspend-ClusterNode -Drain` to migrate roles and resources off the node.
-3. Check cluster logs for dependencies or errors blocking the operation.
+1. Move core resources to another node by using Failover Cluster Manager or the `Move-ClusterGroup` PowerShell command.
+2. Migrate roles and resources off the node by using `Suspend-ClusterNode -Drain`.
+3. Check cluster logs for dependencies or errors that might block the operation.
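
The first two steps above might look like the following sketch. The node names `Node1` and `Node2` are hypothetical placeholders, and the example assumes that the core cluster group still has its default name, `Cluster Group`.

```powershell
# Hypothetical node names; replace them with the names in your cluster.
$nodeToUpgrade = 'Node1'
$targetNode    = 'Node2'

# Move the core cluster group (named "Cluster Group" by default) to another node.
Move-ClusterGroup -Name 'Cluster Group' -Node $targetNode

# Pause the node and drain its roles. -Wait blocks until the drain finishes.
Suspend-ClusterNode -Name $nodeToUpgrade -Drain -Wait

# Check the state and drain status that the node reports afterward.
Get-ClusterNode -Name $nodeToUpgrade | Select-Object Name, State, DrainStatus
```

If the drain doesn't finish, the cluster logs mentioned in step 3 are the next place to look.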

 ## Troubleshooting checklist

-1. **Review prerequisites**: Ensure the environment meets all prerequisites previously cited in this article.
+1. **Review prerequisites**: Make sure that the environment meets all prerequisites that are mentioned in this article.

-2. **Validate cluster status**: Run `Test-Cluster` and resolve any validation warnings or errors.
-   - Verify the current cluster functional level using `Get-Cluster | Select ClusterFunctionalLevel`.
+2. **Validate cluster status**: Run `Test-Cluster`, and then resolve any validation warnings or errors.
+   - Verify the current cluster functional level by using `Get-Cluster | Select ClusterFunctionalLevel`.
    - Validate network connectivity among all nodes.

 3. **Plan and sequence upgrades**: Document the sequence of node upgrades (one node at a time).
-   - Move cluster roles (like virtual machines (VMs), availability groups, or file shares) off the node being upgraded.
-   - Update all nodes with the latest supported patches or hotfixes for the current OS.
+   - Move cluster roles (such as virtual machines (VMs), availability groups, or file shares) off the node that's being upgraded.
+   - Install the latest supported updates or hotfixes for the current OS on all nodes.

 4. **Communicate with stakeholders**: Inform stakeholders and schedule maintenance windows.
-   - Notify monitoring teams to avoid unnecessary alerts.
+   - Notify monitoring teams in order to avoid unnecessary alerts.

-5. **Ensure application awareness**: Confirm application compatibility for workloads like SQL Server, Hyper-V, or file services.
-   - Inform application owners of planned upgrades.
+5. **Ensure application awareness**: Verify application compatibility for workloads such as SQL Server, Hyper-V, or file services.
+   - Inform application owners about planned upgrades.

 6. **Conduct pre-upgrade tests**: Review logs for Windows, applications, clusters, and storage to identify any pre-existing issues.
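
The validation items in step 2 of the checklist could be run together, as in this sketch. The connectivity loop is an assumption added for illustration: it only tests basic reachability from the local node, and it requires that ICMP is allowed between nodes.

```powershell
# Run the cluster validation tests against the local cluster.
# Review the generated report for warnings and errors.
Test-Cluster

# Check the current cluster functional level.
Get-Cluster | Select-Object Name, ClusterFunctionalLevel

# Illustrative reachability check from this node to every cluster node.
Get-ClusterNode | ForEach-Object {
    [pscustomobject]@{
        Node      = $_.Name
        Reachable = Test-Connection -ComputerName $_.Name -Count 1 -Quiet
    }
}
```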

 ## Common issues and their respective solutions

-### 1. Rolling upgrade fails to start or node can't be evicted
+### 1. Rolling upgrade doesn't start or node can't be evicted

 **Symptoms**

-You're unable to pause, drain, or remove a node from the cluster. Errors like "Node ... cannot be removed from the cluster ..." appear.
+You can't pause, drain, or remove a node from the cluster. You receive error messages such as the following example:
+
+> Node... cannot be removed from the cluster.

 **Cause**

 The node hosts core cluster resources, dependencies are misconfigured, or the cluster is unstable.

 **Solution**

-1. Move core resources to another node using Failover Cluster Manager or `Move-ClusterGroup`.
-2. Use `Suspend-ClusterNode -Drain` to move roles and resources.
-3. Ensure the node isn't the last up-to-date or quorum node.
+1. Move core resources to another node by using Failover Cluster Manager or `Move-ClusterGroup`.
+2. Move roles and resources off the node by running `Suspend-ClusterNode -Drain`.
+3. Make sure that the node isn't the last up-to-date or quorum node.
 4. Check cluster logs for blocking dependencies.
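
To see why a node can't be drained or removed, it can help to list what the node still owns and to review the quorum configuration. The following is a minimal sketch under that assumption. `Node1` is a hypothetical node name.

```powershell
# Hypothetical node name; replace it with the node that can't be removed.
$nodeName = 'Node1'

# List the cluster groups (roles) that the node still owns,
# including the core "Cluster Group" if it's hosted there.
Get-ClusterGroup |
    Where-Object { $_.OwnerNode.Name -eq $nodeName } |
    Select-Object Name, State

# Show the current quorum configuration for the cluster.
Get-ClusterQuorum
```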

-### 2. Failure adding upgraded node back to cluster
+### 2. Can't restore upgraded node to cluster

 **Symptoms**

-Errors like "A node attempted to join a failover cluster but failed due to incompatibility…" or version mismatch messages appear.
+You receive a version mismatch message or error messages such as the following example:
+
+> A node attempted to join a failover cluster but failed due to incompatibility.

 **Cause**

-Unsupported OS version mix or unpatched node.
+The cluster has an unsupported mix of OS versions, or the node isn't fully updated.

-**Solution**
+**Solution**

 1. Verify the supported OS and cluster version matrix.
-2. Patch the node to the latest cumulative update (CU).
+2. Update the node to the latest cumulative update (CU).
 3. Upgrade the OS versions sequentially (for example, 2016 → 2019 → 2022).
-4. Use `Get-ClusterLog` to identify versioning errors.
+4. Identify versioning errors by using `Get-ClusterLog`.
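
Step 4 might be carried out as in the following sketch. The destination folder, the 30-minute window, and the search term `version` are assumptions chosen for illustration. The article doesn't specify which log entries to look for.

```powershell
# Hypothetical output folder for the generated logs.
$destination = 'C:\Temp\ClusterLogs'
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Generate a log file for each node that covers the last 30 minutes, in local time.
Get-ClusterLog -Destination $destination -TimeSpan 30 -UseLocalTime

# Search the generated logs for lines that mention "version" (case-insensitive).
Select-String -Path (Join-Path $destination '*.log') -Pattern 'version' |
    Select-Object -First 20
```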

-### 3. Resource or service fails to come online
+### 3. Resource or service doesn't come online

 **Symptoms**

-Resources like VMs or file shares enter a failed or offline state post-upgrade. Common Event IDs include `1069`, `1146`, and `1230`.
+Resources such as VMs or file shares enter a failed or offline state after the upgrade. Common Event IDs include `1069`, `1146`, and `1230`.

 **Cause**

 Misconfiguration during upgrade, missing registry keys or files, or service account failures.

-**Solution**
+**Solution**

 1. Check cluster events in Failover Cluster Manager.
-[Upgrade a Windows Server failover cluster with a cluster OS rolling upgrade](/windows-server/failover-clustering/cluster-operating-system-rolling-upgrade)