This article describes a method to interrupt a site network service (SNS) deployment operation in a nonterminal state. This capability only supports container network functions (CNF) and is triggered by applying a tag to the network function (NF) managed resource group (MRG). The user must later remove this tag to restore future SNS operations.
12
+
13
+
This article describes a method to interrupt a site network service (SNS) deployment operation in a nonterminal state. This capability supports only container network functions (CNFs). You trigger it by applying a tag to the managed resource group (MRG) for the network function (NF). You must later remove this tag to restore future SNS operations.
13
14
14
15
## Why interrupt a service deployment operation
15
-
Azure Operator Service Manager deploys complex CNF workloads, which are composed of many individual components (helm charts). When an SNS deployment is started, each component is processed sequentially, in the order as defined in the network function design (NFD). Depending on how many components are touched in a given deployment, the SNS operation can take an extended time to complete. As an example, consider a scenario where a CNF is composed of 30 components where each component takes 5 minutes to deploy. The total run time of this operation would exceed 2 hours. Now, consider operational issues with long running deployment operations:
16
-
* Users may wish to test deployment operation only up to a certain component.
17
-
* Users may realize, after initiating the operation, that an error exists in a component configuration.
18
-
* The operation might create an unexpected negative impact on a customer facing service.
19
16
20
-
In such cases, an ability to interrupt the operation is desirable. Before the introduction of this interruption capability, the only option was to wait for the defective component to fail. With this interruption capability, long-running deployments can be proactively interrupted before reaching the defective component, minimizing delays and improving operational agility.
17
+
Azure Operator Service Manager deploys complex CNF workloads, which consist of many individual components (Helm charts). When you start an SNS deployment, each component is processed sequentially, in the order defined in the network function design (NFD). Depending on how many components are touched in a deployment, the SNS operation can take an extended time to finish.
18
+
19
+
As an example, consider a scenario where a CNF has 30 components. Each component takes 5 minutes to deploy. The total run time of this operation would exceed 2 hours. Now, consider operational issues with long-running deployment operations:
20
+
21
+
* Users might want to test the deployment operation only up to a certain component.
22
+
* Users might realize, after starting the operation, that an error exists in a component configuration.
23
+
* The operation might create an unexpected negative impact on a customer-facing service.
24
+
25
+
In such cases, an ability to interrupt the operation is desirable. Before the introduction of this interruption capability, the only option was to wait for the defective component to fail. With this interruption capability, you can proactively interrupt long-running deployments before they reach the defective component. This capability minimizes delays and improves operational agility.
21
26
22
27
## Overview of service deployment operations
23
-
During the first deployment of an SNS, the install operation creates a managed resource group (MRG) which includes the network function resource. For subsequent SNS deployments, upgrade operations use this managed-by relationship to modify the NF within the MRG. As a prerequisite, the user must have access to the NF MRG to use the interruption feature.
28
+
29
+
During the first deployment of an SNS, the installation operation creates a managed resource group (MRG) that includes the NF resource. For subsequent SNS deployments, upgrade operations use this managed-by relationship to modify the NF within the MRG. As a prerequisite for using the interruption feature, you must have access to the NF MRG.
24
30
25
31
> [!NOTE]
26
-
> The NF MRG has different default permissions, versus the SNS resource group (RG), which often restricts direct user access.
32
+
> The NF MRG has different default permissions than the SNS resource group (RG), and those defaults often restrict direct user access.
33
+
34
+
## Interrupt a service deployment operation
27
35
28
-
## Execute a service deployment operation interruption
29
-
Follow this process to execute an interruption, but note that interruption behavior differs when executed against an install operation versus an upgrade operation.
30
-
* When a user interrupts an install, the workflow only supports the pause-on-failure failure recovery method.
31
-
* When a user interrupts an upgrade, the workflow honors the configured failure recovery method, either rollback-on-failure or pause-on-failure.
36
+
Follow this process to execute an interruption. Keep in mind that interruption behavior differs when you execute it against an installation operation versus an upgrade operation:
37
+
38
+
* When you interrupt an installation, the workflow supports only the pause-on-failure failure recovery method.
39
+
* When you interrupt an upgrade, the workflow honors the configured failure recovery method. This method can be either rollback-on-failure or pause-on-failure.
32
40
33
41
### Request interruption with a tag
34
-
To interrupt a running deployment, add the tag `cancel:1` on the NF MRG. The MRG is identified by referencing the `properties.managedResourceGroupConfiguration.name` value within the SNS resource.
35
-
* The tag is a static key value pair and must be an exact match.
36
-
* The tag can be added via any supported method such as Azure portal, Azure CLI, REST SDK, etc.
37
-
* The following example shows how to add the tag using Azure CLI:
42
+
43
+
To interrupt a running deployment, add the tag `cancel:1` on the NF MRG. You can identify the MRG by referencing the `properties.managedResourceGroupConfiguration.name` value within the SNS resource.
44
+
45
+
The tag is a static key/value pair and must be an exact match. You can add it by using any supported method, such as the Azure portal, Azure CLI, or REST SDK.
46
+
47
+
The following example shows how to add the tag by using the Azure CLI:
38
48
39
49
```powershell
# Replace {resourceGroup} with the full resource ID of the NF MRG.
az tag update --resource-id {resourceGroup} --operation Merge --tags cancel=1
```
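
The lookup and tagging steps described above can be combined into one script. The following is a minimal Bash sketch, not an official procedure. The function name `request_sns_interrupt` and its argument are illustrative assumptions. The sketch also assumes that the full resource ID of the SNS is known and that the NF MRG is in the currently selected subscription.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find the NF MRG for an SNS and add the cancel tag.
# $1: full resource ID of the SNS resource.
request_sns_interrupt() {
  local sns_id="$1"
  local mrg_name mrg_id

  # Read the MRG name from the SNS resource properties.
  mrg_name=$(az resource show --ids "$sns_id" \
    --query "properties.managedResourceGroupConfiguration.name" \
    --output tsv) || return 1

  # Resolve the MRG name to its resource ID (assumes the current subscription).
  mrg_id=$(az group show --name "$mrg_name" --query id --output tsv) || return 1

  # Merge keeps any existing tags and adds cancel=1.
  az tag update --resource-id "$mrg_id" --operation Merge --tags cancel=1
}
```

Call the function with the SNS resource ID, for example `request_sns_interrupt "$SNS_ID"`. The `Merge` operation is used so that other tags already on the MRG aren't overwritten.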
42
52
43
53
### Wait for interruption to be triggered
44
-
Once the tag is applied to the NF MRG, the interruption is executed between component operations.
45
-
* The current component operation isn't interrupted and must proceed to completion.
46
-
* Before starting the next component operation, the workflow checks for the presence of the tag on the NF MRG.
47
-
* If the tag is present, any remaining components aren't executed and set to fail state.
48
-
* If the interruption is applied to an upgrade operation, the configured failure recovery method is honored.
49
-
* After failure recovery is complete, the deployment operation terminal state is set to failed.
50
54
51
-
### Monitor state of network function components
52
-
Use the NF component view to determine state of an executed interruption. Look for the `DeploymentStatusProperties` property of the last completed component to be in a state other than installing. Component view can also be used to determine component states based on configured failure recovery method.
55
+
After you apply the tag to the NF MRG, the interruption is executed between component operations. The current component operation isn't interrupted and must proceed to completion.
56
+
57
+
Before the workflow starts the next component operation, it checks for the presence of the tag on the NF MRG. If the tag is present, any remaining components aren't executed and are set to a `failed` state.
58
+
59
+
If the interruption is applied to an upgrade operation, the workflow honors the configured failure recovery method. After failure recovery finishes, the deployment operation's terminal state is set to `failed`.
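
The check-between-components behavior can be illustrated with a small simulation. This Bash sketch is only a model of the behavior described above, not the service's implementation. The function `simulate_deployment`, its arguments, and the `succeeded` label are hypothetical, and the sketch doesn't model failure recovery (rollback-on-failure or pause-on-failure).

```bash
#!/usr/bin/env bash
# Hypothetical simulation of sequential component processing.
# $1: 1-based index of the component that is running when the cancel tag
#     is applied (0 means the tag is never applied).
# Remaining arguments: component names, in NFD order.
simulate_deployment() {
  local tag_applied_during="$1"; shift
  local tag_present=0 index=0 name

  for name in "$@"; do
    index=$((index + 1))

    # The tag is checked before each component starts.
    if [ "$tag_present" -eq 1 ]; then
      # Remaining components aren't executed and are set to failed.
      echo "$name=failed"
      continue
    fi

    # The component in flight always proceeds to completion.
    echo "$name=succeeded"

    # The tag lands while this component runs, so it takes effect
    # only at the next check.
    if [ "$index" -eq "$tag_applied_during" ]; then
      tag_present=1
    fi
  done
}
```

For example, `simulate_deployment 2 a b c d` reports `a` and `b` as succeeded and `c` and `d` as failed, because the tag applied during `b` is noticed only before `c` starts.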
60
+
61
+
### Monitor the state of network function components
62
+
63
+
Use the NF component view to determine the state of an executed interruption. Look for the `DeploymentStatusProperties` property of the last completed component to be in a state other than `installing`.
64
+
65
+
You can also use the component view to determine component states based on the configured failure recovery method.
53
66
54
67
### Confirm interruption action via logs
55
-
Once the SNS deployment reaches a terminal state of failed, a notice of interruption is appended onto the operation output log.
56
68
57
-
#### Error emitted during install interruption
58
-
The following shows an example of the log emitted during a first install operation. The reference to `testapp` identifies the component that wasn't started, due to the interruption request. The string `deployment cancelled` indicates the interruption was applied to an initial install operation.
69
+
After the SNS deployment reaches a terminal state of `failed`, a notice of interruption is appended to the operation's output log.
70
+
71
+
#### Error emitted during installation interruption
72
+
73
+
The following code shows an example of the log emitted during a first installation operation. The reference to `testapp` identifies the component that wasn't started, due to the interruption request. The string `deployment cancelled` indicates that the interruption was applied to an initial installation operation.
74
+
59
75
```powershell
60
76
{
61
77
"code": "DeploymentFailed",
@@ -70,7 +86,9 @@ The following shows an example of the log emitted during a first install operati
70
86
```
71
87
72
88
#### Error emitted during upgrade interruption
73
-
The following shows an example of the log emitted during an upgrade operation. The reference to `testapp` identifies the next component that wasn't started, due to the interruption request. The string `NF update` indicates the interruption was applied to an upgrade operation.
89
+
90
+
The following code shows an example of the log emitted during an upgrade operation. The reference to `testapp` identifies the next component that wasn't started, due to the interruption request. The string `NF update` indicates that the interruption was applied to an upgrade operation.
91
+
74
92
```powershell
75
93
{
76
94
"code": "DeploymentFailed",
@@ -84,15 +102,17 @@ The following shows an example of the log emitted during an upgrade operation. T
84
102
}
85
103
```
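
The two marker strings described above can be used to tell the interruption types apart in a saved log. The following Bash sketch is a hedged illustration: the function name `classify_interruption` and its output labels are hypothetical, and it assumes that `deployment cancelled` and `NF update` don't both appear in the same log.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify an SNS operation output log read from stdin.
classify_interruption() {
  local log
  log=$(cat)

  if grep -qF "deployment cancelled" <<<"$log"; then
    # Marker for an interrupted initial installation.
    echo "install-interrupted"
  elif grep -qF "NF update" <<<"$log"; then
    # Marker for an interrupted upgrade.
    echo "upgrade-interrupted"
  else
    echo "no-interruption-marker"
  fi
}
```

Pipe a saved log into the function, for example `classify_interruption < operation-output.json`, where the file name is a placeholder.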
86
104
87
-
### Remove tag once interruption is complete
88
-
The user should remove the tag from the NF MRG to avoid unintentionally interrupting any future SNS deployment operations.
89
-
*For example, to remove the tag using Azure CLI:
105
+
### Remove a tag after interruption is complete
106
+
107
+
To avoid unintentionally interrupting any future SNS deployment operations, you should remove the tag from the NF MRG. For example, to remove the tag by using the Azure CLI, run this command:
90
108
91
109
```powershell
# Replace {resourceGroup} with the full resource ID of the NF MRG.
az tag update --resource-id {resourceGroup} --operation Delete --tags cancel=1
```
94
112
95
113
## Other considerations
96
-
When considering interrupting an SNS deployment operation, be aware of the following considerations:
97
-
* Interrupting deployments is supported only for Container Network Function (CNF) deployments.
98
-
* When the tag is added to the SNS MRG, the ongoing component action isn't interrupted and must complete before interruption is executed.
114
+
115
+
When you want to interrupt an SNS deployment operation, be aware of these considerations:
116
+
117
+
* Interrupting deployments is supported only for CNF deployments.
118
+
* When you add the tag to the NF MRG, the ongoing component action isn't interrupted and must finish before the interruption starts.