Commit a13183c

Merge pull request #306544 from anaharris-ms/reliability-concept-dr-waf-align
[WIP] Reliability: WAF DR align
2 parents 02480dc + 1e52b95 commit a13183c

1 file changed

Lines changed: 13 additions & 4 deletions

articles/reliability/concept-business-continuity-high-availability-disaster-recovery.md

@@ -4,7 +4,7 @@ description: Understand business continuity, high availability, and disaster rec
 author: anaharris-ms
 ms.service: azure
 ms.topic: conceptual
-ms.date: 01/17/2025
+ms.date: 11/04/2025
 ms.author: anaharris
 ms.custom: subject-reliability
 ms.subservice: azure-reliability
@@ -42,6 +42,8 @@ A business continuity plan doesn't only take into consideration the resiliency f
 
 Business continuity planning should include the following sequential steps:
 
+1. **Criticality tier classification**. Workloads can be classified into different *criticality tiers* based on their importance to the business. Each tier has different requirements for availability, and therefore different requirements for business continuity planning. To determine your workload's criticality tier, see [Well-Architected Framework - Select your criticality tier](/azure/well-architected/design-guides/disaster-recovery#select-your-criticality-tier).
+
 1. **Risk identification**. Identify risks to a workload's availability or functionality. Possible risks could be network issues, hardware failures, human error, region outage, etc. Understand the impact of each risk.
 
 1. **Risk classification**. Classify each risk as either a common risk, which should be factored into plans for HA, or an uncommon risk, which should be part of DR planning.
@@ -83,7 +85,7 @@ Here are some examples:
 
 Business continuity plans must address both common and uncommon risks.
 
-- *Common risks* are planned and expected. For example, in a cloud environment it's common for there to be *transient failures* including brief network outages, equipment restarts due to patches, timeouts when a service is busy, and so forth. Because these events happen regularly, workloads need to be resilient to them.
+- *Common risks* are planned and expected. For example, in a cloud environment it's common for there to be *transient failures* or *blips*, including brief network outages, equipment restarts due to patches, timeouts when a service is busy, and so forth. Because these events happen regularly, workloads need to be resilient to them.
 
 A high availability strategy must consider and control for each risk of this type.
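As an illustration of what being resilient to transient failures can look like in application code, here is a minimal Python sketch of retrying with exponential backoff and jitter. It isn't part of this commit or of the article. `TransientError`, `call_with_retry`, and the default attempt count and delay are hypothetical names and values chosen for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a failure that is expected to clear on its own."""


def call_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff.

    The defaults are illustrative assumptions, not recommended values.
    The sleep function is injectable so the behavior can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Upper bound doubles each attempt; a random delay below it
            # ("full jitter") keeps many clients from retrying in lockstep.
            delay = random.uniform(0, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

Only errors that the caller has identified as transient are retried. Any other exception propagates immediately, because retrying a non-transient failure only delays the response.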

@@ -93,7 +95,8 @@ Business continuity plans must address both common and uncommon risks.
 
 High availability and disaster recovery are interrelated, and so it's important to plan strategies for both of them together.
 
-It's important to understand that risk classification depends on workload architecture and the business requirements, and some risks can be classified as HA for one workload and DR for another workload. For example, a full Azure region outage would generally be considered a DR risk to workloads in that region. But for workloads that use multiple Azure regions in an active-active configuration with full replication, redundancy, and automatic region failover, a region outage is classified as an HA risk.
+Risk classification depends on workload architecture and the business requirements, and some risks can be classified as HA for one workload and DR for another workload. For example, a full Azure region outage would generally be considered a DR risk to workloads in that region. But for workloads that use multiple Azure regions in an active-active configuration with full replication, redundancy, and automatic region failover, a region outage is classified as an HA risk.
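The region-outage example in the paragraph above can be sketched as a small decision rule. The following Python is only an illustration under simplifying assumptions. The `Workload` type, its two boolean fields, and `classify_region_outage` are hypothetical, and a real classification would also weigh the business requirements the paragraph mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical, simplified description of a workload's architecture."""
    name: str
    multi_region_active_active: bool   # assumed to imply full replication and redundancy
    automatic_region_failover: bool


def classify_region_outage(workload: Workload) -> str:
    """Return "HA" or "DR" for the risk of a full region outage.

    Assumption: the risk counts as HA only when the workload already runs
    active-active across regions and fails over without manual intervention.
    """
    if workload.multi_region_active_active and workload.automatic_region_failover:
        return "HA"
    return "DR"
```

The same risk gets a different label depending on the architecture passed in, which is the point the paragraph makes: the classification belongs to the workload, not to the risk alone.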
+
 
 #### Risk mitigation
 
@@ -283,7 +286,11 @@ Regardless of the cause of the disaster, it's important that you create a well-d
 
 DR isn't an automatic feature of Azure. However, many services do provide features and capabilities that you can use to support your DR strategies. You should review the [reliability guides for each Azure service](./overview-reliability-guidance.md) to understand how the service works and its capabilities, and then map those capabilities to your DR plan.
 
-The following sections list some common elements of a disaster recovery plan, and describe how Azure can help you to achieve them.
+A strong DR plan turns strategy into decisive action. It provides a clear roadmap for responding to disasters, minimizing downtime, and ensuring business continuity.
+
+To make this possible, every DR plan should be documented to include a clear runbook, a well-defined communication plan, and a structured escalation path. To learn more about these DR plan elements, see [Well-Architected Framework - Document your DR plan](/azure/well-architected/design-guides/disaster-recovery#document-your-dr-plan).
+
+The following sections list some common approaches in a disaster recovery plan, and describe how Azure can help you to achieve them.
 
 #### Failover and failback
 
@@ -316,6 +323,8 @@ Many Azure data and storage services support backups, such as the following:
 - Many Azure database services, including [Azure SQL Database](./reliability-sql-database.md) and [Azure Cosmos DB](/azure/reliability/reliability-cosmos-db-nosql), have an automated backup capability for your databases.
 - [Azure Key Vault](./reliability-key-vault.md) provides features to back up your secrets, certificates, and keys.
 
+To learn more about recovery strategies for backup and restore, see [Well-Architected Framework - Recovery strategy for backup and restore](/azure/well-architected/design-guides/disaster-recovery#recovery-strategy-for-backup-and-restore).
+
 #### Automated deployments
 
 To rapidly deploy and configure required resources in the event of a disaster, use infrastructure as code (IaC) assets, such as Bicep files, ARM templates, or Terraform configuration files. Using IaC reduces your recovery time and potential for error, compared to manually deploying and configuring resources.
