Commit 224eb2f

Merge pull request #306804 from msftadam/patch-901545
Revise onboarding and deployment best practices
2 parents 337be96 + 8902158 commit 224eb2f

2 files changed

Lines changed: 95 additions & 147 deletions

File tree

articles/operator-service-manager/best-practices-onboard-deploy.md

Lines changed: 16 additions & 119 deletions
Original file line number | Diff line number | Diff line change
@@ -13,7 +13,6 @@ Microsoft has developed many proven practices for managing network functions (NF
1313

1414
## General considerations
1515
We recommend that you first onboard and deploy your simplest NFs (one or two charts) by using the quickstarts to familiarize yourself with the overall flow. You can add necessary configuration details in subsequent iterations. As you go through the quickstarts, consider the following points:
16-
1716
- Structure your artifacts to align with planned use. Consider separating global artifacts from the artifacts that you want to vary by site or instance.
1817
- Ensure service composition of multiple NFs with a set of parameters that matches the needs of your network. For example, you might customize 100 values in a chart that has 1,000 values. Make sure that in the configuration group schema (CGS) layer (covered more extensively in sections that follow), you expose only 100.
1918
- Think early on about how you want to separate infrastructure (for example, clusters) or artifact stores and access between suppliers, in particular within a single service. Make your set of publisher resources match this model.
@@ -27,20 +26,23 @@ We recommend that you first onboard and deploy your simplest NFs (one or two cha
2726
- Lowers total operating costs by reducing the number of publisher backing resources, like ACR or Storage Accounts.
2827
- Simplifies the network service design (NSD), where it may consist of multiple NFs from multiple vendors.
2928
- After you test and approve the desired set of Azure Operator Service Manager publisher resources for production use, we recommend marking the entire set as immutable. Marking the set as immutable helps prevent accidental changes and ensure a consistent deployment experience. Consider relying on immutability capabilities to distinguish between:
30-
3129
- Resources and artifacts used in production
3230
- Resources and artifacts used for testing and development
33-
34-
You can query the state of the publisher resources and the artifact manifests to determine which ones are marked as immutable. For more information, see [Publisher Resource Preview Management feature](publisher-resource-preview-management.md).
3531

36-
Keep in mind the following logic:
37-
38-
- If the network service design version (NSDV) is marked as immutable, the CGS also must be marked as immutable. Otherwise, the deployment call fails.
39-
- If the network function definition version (NFDV) is marked as immutable, the artifact manifest also must be marked as immutable. Otherwise, the deployment call fails.
40-
- If only the artifact manifest or the CGS is marked as immutable, the deployment call succeeds regardless of whether the NFDV and NSDV are marked as immutable.
41-
- Marking an artifact manifest as immutable ensures that all artifacts listed in that manifest are also marked as immutable by enforcing necessary permissions on the artifact store. Listed artifacts typically include charts, images, and Azure Resource Manager templates (ARM templates).
32+
You can query the state of the publisher resources and the artifact manifests to determine which ones are marked as immutable. For more information, see [Publisher Resource Preview Management feature](publisher-resource-preview-management.md).
33+
34+
Keep in mind the following logic:
35+
- If the network service design version (NSDV) is marked as immutable, the CGS also must be marked as immutable. Otherwise, the deployment call fails.
36+
- If the network function definition version (NFDV) is marked as immutable, the artifact manifest also must be marked as immutable. Otherwise, the deployment call fails.
37+
- If only the artifact manifest or the CGS is marked as immutable, the deployment call succeeds regardless of whether the NFDV and NSDV are marked as immutable.
38+
- Marking an artifact manifest as immutable ensures that all artifacts listed in that manifest are also marked as immutable by enforcing necessary permissions on the artifact store. Listed artifacts typically include charts, images, and Azure Resource Manager templates (ARM templates).
4239
- Consider using agreed-upon naming conventions and governance techniques to help address any remaining gaps.
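The immutability rules listed above can be read as a small pre-deployment check. The following Python sketch is illustrative only: `ImmutabilityState` and `deployment_blockers` are hypothetical names, not part of any Azure SDK, and the sketch assumes you have already queried the immutability state of each resource by some other means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImmutabilityState:
    """Hypothetical snapshot of which publisher resources are marked immutable."""
    nsdv: bool
    cgs: bool
    nfdv: bool
    artifact_manifest: bool


def deployment_blockers(state: ImmutabilityState) -> list[str]:
    """Return the reasons a deployment call would fail under the rules above."""
    blockers = []
    # An immutable NSDV requires an immutable CGS.
    if state.nsdv and not state.cgs:
        blockers.append("NSDV is immutable but CGS is not")
    # An immutable NFDV requires an immutable artifact manifest.
    if state.nfdv and not state.artifact_manifest:
        blockers.append("NFDV is immutable but artifact manifest is not")
    # An immutable CGS or artifact manifest on its own never blocks deployment.
    return blockers
```

An empty list means none of the documented immutability rules would cause the deployment call to fail. A check like this might run in a pipeline before the SNS `PUT` call.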
4340

41+
### Publisher high availability and disaster recovery
42+
The Azure Operator Service Manager publisher is a regional service deployed across local availability zones in supported regions only. Consider the following requirements for publisher high availability and disaster recovery:
43+
- To provide geo-redundancy, make sure you have a publisher in every region where you're planning to deploy NFs. Consider using pipelines to keep publisher artifacts and resources in sync across the regions.
44+
- The publisher name must be unique for each Microsoft Entra tenant in each region.
45+
4446
## NFDG and NFDV considerations
4547
The network function definition group (NFDG) represents the smallest component that you plan to reuse independently across multiple services. All parts of an NFDG are always deployed together. These parts are called `networkFunctionApplications` items.
4648

@@ -74,7 +76,6 @@ For ARM templates that contain anything beyond the preceding list, all `PUT` cal
7476
A network service design group (NSDG) is a composite of one or more NFDGs and any infrastructure components deployed at the same time. These components might include clusters and VMs in Nexus Kubernetes or Azure Kubernetes Service (AKS). A site network service (SNS) refers to a single NSDV. Such a design provides a consistent and repeatable deployment of the network service to a site from a single SNS `PUT` call.
7577

7678
An example NSDG might consist of:
77-
7879
- Authentication Server Function (AUSF) NF
7980
- Unified data management (UDM) NF
8081
- Admin VM that supports AUSF or UDM
@@ -96,13 +97,11 @@ We recommend that you have a single SNS for the entire site, including the infra
9697

9798
We recommend that you deploy every SNS with a user-assigned managed identity rather than a system-assigned managed identity. This user-assigned managed identity must have permissions to access the NFDV and must have the role of Managed Identity Operator on itself. For more information, see [Create and assign a user-assigned managed identity](how-to-create-user-assigned-managed-identity.md).
9899

99-
## Azure Operator Service Manager resource mapping per use case
100+
## Resource scheme use-case examples
100101
The following two scenarios illustrate Azure Operator Service Manager resource mapping.
101102

102103
### Scenario: Single network function
103-
An NF with one or two application components is deployed to a Nexus Kubernetes cluster.
104-
105-
Here's the breakdown of resources:
104+
An NF with one or two application components is deployed to a Nexus Kubernetes cluster. Here's the breakdown of resources:
106105
- **NFDG**: If components can be used independently, two NFDGs (one per component). If components are always deployed together, then a single NFDG.
107106
- **NFDV**: As needed based on use cases that trigger NFDV minor or major version updates.
108107
- **NSDG**: Single. Combines the NFs and the Kubernetes cluster definitions.
@@ -113,7 +112,6 @@ Here's the breakdown of resources:
113112

114113
### Scenario: Multiple network functions
115114
Multiple NFs with some shared and independent components are deployed to a Nexus Kubernetes cluster. Here's the breakdown of resources:
116-
117115
- **NFDG**:
118116
- Single for all shared components.
119117
- Single for every independent component or NF.
@@ -142,40 +140,12 @@ The following considerations apply for VNFs:
142140
- Deployment policy, to control whether VM deployment is allowed or not
143141
- In the NFDV, you need to parameterize `deployParameters` and `templateParameters` in such a way that you can supply the unique values by using CGVs for each.
144142

145-
## Considerations for high availability and disaster recovery
146-
Azure Operator Service Manager is a regional service deployed across availability zones in regions that support them. For a list of regions where Azure Operator Service Manager is available, see [Products available by region](https://azure.microsoft.com/explore/global-infrastructure/products-by-region/?products=operator-service-manager,azure-network-function-manager&regions=all). For a list of Azure regions that have availability zones, see [Find the Azure geography that meets your needs](https://azure.microsoft.com/explore/global-infrastructure/geographies/#geographies).
147-
148-
Consider the following requirements for high availability and disaster recovery:
149-
- To provide geo-redundancy, make sure you have a publisher in every region where you're planning to deploy NFs. Consider using pipelines to keep publisher artifacts and resources in sync across the regions.
150-
- The publisher name must be unique for each Microsoft Entra tenant in each region.
151-
- If a region becomes unavailable, you can deploy (but not upgrade) an NF by using publisher resources in another region. Assuming that artifacts and resources are identical between the publishers, you need to change only the `networkServiceDesignVersionOfferingLocation` value in the SNS resource payload:
152-
153-
```
154-
<pre>
155-
resource sns 'Microsoft.HybridNetwork/sitenetworkservices@2023-09-01' = {
156-
name: snsName
157-
location: location
158-
identity: {
159-
type: 'SystemAssigned'
160-
}
161-
properties: {
162-
publisherName: publisherName
163-
publisherScope: 'Private'
164-
networkServiceDesignGroupName: nsdGroup
165-
networkServiceDesignVersionName: nsdvName
166-
<b>networkServiceDesignVersionOfferingLocation: location</b>
167-
</pre>
168-
```
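The single-field change described above can also be scripted. The following Python sketch assumes the SNS request body is held as a plain dictionary whose property names mirror the Bicep snippet. `retarget_sns_payload` and the placeholder values are hypothetical and only illustrate the idea.

```python
import copy


def retarget_sns_payload(payload: dict, fallback_region: str) -> dict:
    """Return a copy of the SNS payload that points at a publisher in another region.

    Only networkServiceDesignVersionOfferingLocation changes. This assumes that
    artifacts and resources are identical between the two publishers.
    """
    updated = copy.deepcopy(payload)
    updated["properties"]["networkServiceDesignVersionOfferingLocation"] = fallback_region
    return updated


# Placeholder values for illustration only.
sns_payload = {
    "location": "eastus",
    "properties": {
        "publisherName": "examplePublisher",
        "publisherScope": "Private",
        "networkServiceDesignGroupName": "exampleNsdGroup",
        "networkServiceDesignVersionName": "1.0.0",
        "networkServiceDesignVersionOfferingLocation": "eastus",
    },
}

failover_payload = retarget_sns_payload(sns_payload, "westus3")
```

The helper copies the payload instead of mutating it, so the original request stays available for when the primary region recovers.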
169-
170-
## Troubleshooting considerations
143+
## Deployment troubleshooting considerations
171144
During installation and upgrade, by default:
172-
173145
- The `atomic` and `wait` options are set to `true`.
174146
- The operation timeout is set to `27 minutes`.
175147

176-
During initial onboarding, only while you're still debugging and developing artifacts, we recommend that you set the `atomic` flag to `false`. This setting prevents a Helm rollback upon failure and retains any logs or errors that might otherwise be lost. The optimal way to accomplish it is in the ARM template of the NF.
177-
178-
In the ARM template, add the following section:
148+
During initial onboarding, only while you're still debugging and developing artifacts, we recommend that you set the `atomic` flag to `false`. This setting prevents a Helm rollback upon failure and retains any logs or errors that might otherwise be lost. The optimal way to accomplish it is in the ARM template of the NF. In the ARM template, add the following section:
179149

180150
```
181151
<pre>
@@ -231,76 +201,3 @@ As the first step toward cleaning up an onboarded environment, delete publisher
231201
> Be sure to delete the SNS before you delete the NFDV.
232202
233203
Azure Operator Service Manager does not delete namespaces as part of any deletion operation. As such, after all resources are deleted, some artifacts might remain on the cluster. To remove any remaining artifacts, you should delete any workload namespaces created on the cluster. Including the namespace deletion operation as part of the workflow pipeline is a recommendation to automate the action.
234-
235-
## Sequential ordering behavior for CNF applications
236-
By default, CNF applications are installed or updated based on the order in which they appear in the NFDV. For the deletion operation, the CNF applications are deleted in the specified reverse order. If you need to define a specific order of CNF applications that's different from the default, use `dependsOnProfile` to define a unique sequence for installation, update, and deletion operations.
237-
238-
### How to use dependsOnProfile
239-
You can use `dependsOnProfile` in the NFDV to control the sequence of Helm executions for CNF applications. In the example that follows:
240-
- During an installation operation, the CNF applications are deployed in the following order: `dummyApplication1`, `dummyApplication2`, `dummyApplication`.
241-
- During an update operation, the CNF applications are updated in the following order: `dummyApplication2`, `dummyApplication1`, `dummyApplication`.
242-
- During a deletion operation, the CNF applications are deleted in the following order: `dummyApplication2`, `dummyApplication1`, `dummyApplication`.
243-
244-
```json
245-
{
246-
"location": "eastus",
247-
"properties": {
248-
"networkFunctionTemplate": {
249-
"networkFunctionApplications": [
250-
{
251-
"dependsOnProfile": {
252-
"installDependsOn": [
253-
"dummyApplication1",
254-
"dummyApplication2"
255-
],
256-
"uninstallDependsOn": [
257-
"dummyApplication1"
258-
],
259-
"updateDependsOn": [
260-
"dummyApplication1"
261-
]
262-
},
263-
"name": "dummyApplication"
264-
},
265-
{
266-
"dependsOnProfile": {
267-
"installDependsOn": [
268-
],
269-
"uninstallDependsOn": [
270-
"dummyApplication2"
271-
],
272-
"updateDependsOn": [
273-
"dummyApplication2"
274-
]
275-
},
276-
"name": "dummyApplication1"
277-
},
278-
{
279-
"dependsOnProfile": null,
280-
"name": "dummyApplication2"
281-
}
282-
],
283-
"nfviType": "AzureArcKubernetes"
284-
},
285-
"networkFunctionType": "ContainerizedNetworkFunction"
286-
}
287-
}
288-
```
289-
290-
### Common errors
291-
Currently, if the `dependsOnProfile` code provided in the NFDV is invalid, the NF operation fails with a validation error. The message for the validation error appears in the operation status resource and looks similar to the following example:
292-
293-
```json
294-
{
295-
"id": "/providers/Microsoft.HybridNetwork/locations/EASTUS2EUAP/operationStatuses/ca051ddf-c8bc-4cb2-945c-a292bf7b654b*C9B39996CFCD97AB3A121AE136ED47F67BB13946C573EF90628C47628BC5EF5F",
296-
"name": "ca051ddf-c8bc-4cb2-945c-a292bf7b654b*C9B39996CFCD97AB3A121AE136ED47F67BB13946C573EF90628C47628BC5EF5F",
297-
"resourceId": "/subscriptions/aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e/resourceGroups/xinrui-publisher/providers/Microsoft.HybridNetwork/networkfunctions/testnfDependsOn02",
298-
"status": "Failed",
299-
"startTime": "2023-07-17T20:48:01.4792943Z",
300-
"endTime": "2023-07-17T20:48:10.0191285Z",
301-
"error": {
302-
"code": "DependenciesValidationFailed",
303-
"message": "CyclicDependencies: Circular dependencies detected at hellotest."
304-
}
305-
}
306-
```
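To reason about a `dependsOnProfile` before submitting an NFDV, the ordering can be approximated locally. The following Python sketch is not the service's implementation. It is a stable topological sort, under the assumption that applications with no unmet dependencies run in the order they appear in the NFDV. With that assumption it reproduces the install and update orders listed for the example, and it raises an error on a cycle, analogous to `DependenciesValidationFailed`. The function name `execution_order` is hypothetical.

```python
def execution_order(apps, depends_on):
    """Order application names so each one runs after everything it depends on.

    apps: application names in NFDV declaration order.
    depends_on: maps a name to the names that must run before it.
    """
    known = set(apps)
    for app, deps in depends_on.items():
        unknown = [d for d in deps if d not in known]
        if unknown:
            raise ValueError(f"{app} depends on unknown applications: {unknown}")

    remaining = list(apps)
    ordered = []
    while remaining:
        # Applications whose dependencies have all run, in declaration order.
        ready = [a for a in remaining
                 if all(d in ordered for d in depends_on.get(a, []))]
        if not ready:
            raise ValueError(f"circular dependencies among: {remaining}")
        ordered.extend(ready)
        remaining = [a for a in remaining if a not in ready]
    return ordered


APPS = ["dummyApplication", "dummyApplication1", "dummyApplication2"]

# installDependsOn and updateDependsOn values taken from the example NFDV.
INSTALL = {
    "dummyApplication": ["dummyApplication1", "dummyApplication2"],
    "dummyApplication1": [],
}
UPDATE = {
    "dummyApplication": ["dummyApplication1"],
    "dummyApplication1": ["dummyApplication2"],
}

install_order = execution_order(APPS, INSTALL)
update_order = execution_order(APPS, UPDATE)
```

Running a check like this in a pipeline can surface a circular dependency before the NF operation fails with a validation error.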
