Update nfApps rollback compatibility and NFDV strategy
Revised sections on nfApps compatibility with helm rollback, emphasizing the importance of publisher support and detailing the incremental NFDV approach for selective rollback. Clarified risks associated with non-compliant nfApps and provided structured examples for better understanding.
articles/operator-service-manager/safe-upgrades-nf-level-rollback.md
Lines changed: 23 additions & 25 deletions
Original file line number
Diff line number
Diff line change
@@ -130,43 +130,41 @@ example:
130
130
> * If multiple entries of `nfConfiguration` are found in the `roleOverrideValues`, then the NF reput is returned as a bad request.
131
131
132
132
## Manage nfApps that don't support rollback
133
-
Almost all publishers report some nfApps that aren't compatible with helm rollback operations. These nfApps may be sourced from third parties who don't commonly support such strict resiliency requirements. These nfApps may be related to database applications with complicated schema management requirements. In these cases, special consideration should be taken to deal with nfApps that don't support rollback.
134
-
135
-
* The strong preference is to push publishers to support helm rollback for all nfApps.
133
+
Almost all publishers have some nfApps that aren't compatible with helm rollback. These nfApps may be sourced from third parties who don't commonly support strict resiliency requirements. These nfApps may be database applications with complicated schema management requirements. Consider the following restrictions when onboarding services with nfApps that don't support rollback.
136
134
* nfApps that don't support rollback can't be skipped.
137
135
* nfApp rollback order can't change.
138
136
* Incremental-NFDV approach must be used in these situations.
139
137
140
138
### Selective rollback using incremental NFDVs
141
-
A network function’s composition often includes one or more nfApplications that can't support a helm rollback operation, such as Elastic or VoltDb. If a rollback is attempted on one of these nfApplications, the resulting nfApplication is broken. Pursuing publisher enhancements to make these nfApplications rollback compliant is the best solution. Recognizing the potential for long publisher enhancement lead times, a method to prevent execution of rollback on selective nfApplications is needed. Selectively skipping rollback requires thorough testing with the network function owners, as it results in a transient condition where multiple version permutations exist.
142
-
143
-
#### Problem Statement
144
-
At the network function level, when nfRollbackEnabled is true, and a failure occurs during an upgrade or install, a rollback is executed across all nfApps that precede the failure. This may include those that are rollback noncompliant. A selective rollback parameter is not supported. It introduces risk of an operational state that doesn't correspond to a defined NFDV. This state mismatch results in nondeterministic behavior, increases the testing surface significantly, and undermines the reliability guarantees of deployment processes. Instead we rely on NFDVs to ensure deterministic workload states that map to well-defined and tested deployment configurations.
139
+
A network function’s composition may include nfApps that don't support a helm rollback. Known examples are Elastic and VoltDb. An attempt to roll back one of these nfApps will break the nfApp. Pursuing publisher enhancements to make these nfApps rollback compliant is the best solution. A parameter to skip rollback is not supported, as it introduces the risk of a deployed state that is not defined in an NFDV. Such an undefined state leads to nondeterministic behavior, increases the testing surface area significantly, and undermines the reliability guarantees of deployments. Instead, the incremental NFDV method enables selective rollback execution while ensuring deterministic deployment states.
145
140
146
-
#### Proposed Solution
147
-
AOSM proposes that publishers should use a combination of skipUpgrade and nfRollbackEnabled configurations in CGVs, along with multiple NFDVs, to logically segment nfApplications based on rollback compatibility. This multi-NFDV strategy allows customers to bypass rollback for select charts while preserving safety for the rest. This approach is production-safe and aligns with existing AOSM mechanisms. This staged approach effectively simulates per-chart rollback behavior using NFDV-level constructs. Consider the following example where a network function is composed of 20 nfApps with five nfApps that don't support rollback.
141
+
#### Incremental NFDV approach
142
+
It's recommended that publishers use a combination of `applicationEnablement`, `skipUpgrade`, and `nfRollbackEnabled` configurations in CGVs, along with multiple NFDVs, to logically segment nfApps into sets based on rollback compatibility. This incremental NFDV strategy allows operators to break deployments down into multiple operations, bypassing rollback for select charts while preserving rollback for the rest. This approach effectively simulates per-chart rollback behavior using NFDV-level constructs. Consider the following example where a network function is composed of 20 nfApps with five nfApps that don't support rollback.
148
143
149
144
* NFDV1:
150
-
* Performs initial install of all 20 charts with version v1.0.
151
-
* In CGV1: rollbackEnabled: irrelevant (fresh install).
145
+
  * Performs the initial version 1 install.
146
+
* Contains all 20 nfApps in an enabled state.
147
+
* In CGV1: `rollbackEnabled: true`.
148
+
* On the first install, a failure deletes charts and does not use rollback.
152
149
* NFDV2:
153
-
* Contains all 20 charts but the five Helm charts without rollback support, upgraded to v2.0.
150
+
  * Performs the first step of the upgrade to version 2.
151
+
  * Contains all 20 nfApps but enables only the five nfApps without rollback support.
154
152
* In CGV2:
155
-
* Use skipUpgrade: true for the remaining 15 charts.
156
-
* Set nfRollbackEnabled: false.
157
-
* Result:
158
-
* Success: Only five charts upgrade
159
-
* Failure:
160
-
* No rollback if upgrade fails.
161
-
* Due to chart limitations, the workload is left in a nondeterministic state. No rollback is possible. To recover, there are two options:
162
-
* Upgrade with a working NFDV2
163
-
* Upgrade with NFDV1 and skipUpgrade disabled for every nfApplication
153
+
    * Use `skipUpgrade: true` for the 15 nfApps with rollback support.
154
+
* Set `nfRollbackEnabled: false`.
155
+
* On success, only five nfApps are upgraded.
156
+
* On failure, no rollback is performed.
157
+
* Due to chart limitations, the workload is left in a nondeterministic state. No rollback is possible. To recover, there are two options:
158
+
* Fix NFDV2 and try the upgrade again.
159
+
      * Downgrade to NFDV1 with `skipUpgrade: false`.
164
160
* NFDV3:
165
-
* Contains all charts but the 15 rollback-compatible charts upgraded to v2.0.
161
+
  * Performs the second step of the upgrade to version 2.
162
+
  * Contains all 20 nfApps but enables only the 15 nfApps with rollback support.
166
163
* In CGV3:
167
-
* Use skipUpgrade: true for the 5 charts already handled in NFDV2.
168
-
* Set nfRollbackEnabled: true.
169
-
* Result: Remaining 15 charts upgrade; rollback occurs on failure.
164
+
    * Use `skipUpgrade: true` for the five nfApps previously upgraded via NFDV2.
165
+
* Set `nfRollbackEnabled: true`.
166
+
    * On success, the remaining 15 nfApps are upgraded.
167
+
* On failure, a rollback occurs to restore the starting state.
170
168
171
169
> [!NOTE]
172
170
> * The five rollback-incompatible charts must not have runtime upgrade dependencies on charts in NFDV3.
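
The staged example above can be sketched as a small planning helper in Python. This is an illustrative model only, not AOSM tooling or an AOSM schema: the `Step` class, `plan_incremental_upgrade`, `skip_upgrade_flags`, and the `appNN` names are all hypothetical. It only shows how 20 nfApps split into two upgrade steps and which `skipUpgrade` and `nfRollbackEnabled` values each step would carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One upgrade step, delivered as its own NFDV (hypothetical model)."""
    nfdv: str
    upgraded: frozenset        # nfApps upgraded in this step (skipUpgrade: false)
    nf_rollback_enabled: bool  # nfRollbackEnabled value used for this step


def plan_incremental_upgrade(all_apps, no_rollback_apps):
    """Split an upgrade into two steps based on rollback compatibility."""
    no_rollback = frozenset(no_rollback_apps)
    unknown = no_rollback - frozenset(all_apps)
    if unknown:
        raise ValueError(f"unknown nfApps: {sorted(unknown)}")
    with_rollback = frozenset(all_apps) - no_rollback
    return [
        # Step 1: only the rollback-incompatible nfApps, rollback disabled.
        Step("NFDV2", no_rollback, False),
        # Step 2: only the rollback-compatible nfApps, rollback enabled.
        Step("NFDV3", with_rollback, True),
    ]


def skip_upgrade_flags(all_apps, step):
    """Return the skipUpgrade value each nfApp would need for this step."""
    return {app: app not in step.upgraded for app in all_apps}


# Placeholder names: 20 nfApps, the first five without rollback support.
apps = [f"app{i:02d}" for i in range(1, 21)]
no_rollback = apps[:5]
steps = plan_incremental_upgrade(apps, no_rollback)

for step in steps:
    skipped = sum(skip_upgrade_flags(apps, step).values())
    print(f"{step.nfdv}: upgrade {len(step.upgraded)}, skip {skipped}, "
          f"nfRollbackEnabled={step.nf_rollback_enabled}")
```

The two steps are disjoint and together cover every nfApp, so each successful step ends in a state that corresponds to a defined NFDV.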