Commit 1b6b84b

Merge pull request #301658 from johndowns/reliability-files-1
Reliability guide: Azure Files
2 parents 8e00512 + 0e796c8 commit 1b6b84b

15 files changed

Lines changed: 276 additions & 31 deletions

articles/reliability/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -280,6 +280,8 @@ items:
280280
href: ../databox/data-box-disk-faq.yml?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json#how-can-i-recover-my-data-if-an-entire-region-fails-
281281
- name: Azure Elastic SAN
282282
href: reliability-elastic-san.md
283+
- name: Azure Files
284+
href: reliability-storage-files.md
283285
- name: Azure NetApp Files
284286
href: reliability-netapp-files.md
285287
- name: Azure Queue Storage

articles/reliability/availability-zones-enable-zone-resiliency.md

Lines changed: 1 addition & 1 deletion
@@ -152,7 +152,7 @@ This table summarizes the availability zone support for many Azure services and
152152
| [Azure Elastic SAN](reliability-elastic-san.md#availability-zone-migration) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Redeployment |
153153
| [Azure Event Hubs](./reliability-event-hubs.md) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Always zone-resilient |
154154
| [Azure ExpressRoute](/azure/expressroute/expressroute-howto-gateway-migration-portal) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Modification |
155-
| [Azure Files](migrate-storage.md) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Enablement |
155+
| [Azure Files](./reliability-storage-files.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Enablement |
156156
| [Azure Firewall](./reliability-firewall.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | Modification |
157157
| [Azure Functions](reliability-functions.md#availability-zone-migration) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | | Redeployment |
158158
| [Azure HDInsight](reliability-hdinsight.md#availability-zone-migration) | | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | Redeployment |

articles/reliability/availability-zones-service-support.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ Some Azure services are *nonregional*, which means that you don't deploy the ser
6060
| [Azure Event Grid](reliability-event-grid.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
6161
| [Azure Event Hubs](/azure/event-hubs/event-hubs-business-continuity-outages-disasters#availability-zones) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
6262
| [Azure ExpressRoute](../expressroute/designing-for-high-availability-with-expressroute.md) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
63-
| [AzureFiles](../storage/common/storage-redundancy.md#supported-azure-storage-services) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
63+
| [Azure Files](./reliability-storage-files.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
6464
| [Azure Firewall](reliability-firewall.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: |
6565
| [Azure Firewall Manager](../firewall-manager/quick-firewall-policy.md) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |
6666
| [Azure Functions](./reliability-functions.md#availability-zone-support) | :::image type="content" source="media/icon-checkmark.svg" alt-text="Yes" border="false"::: | |

articles/reliability/includes/storage/reliability-storage-availability-zone-down-experience-include.md

Lines changed: 3 additions & 1 deletion
@@ -13,7 +13,9 @@
1313

1414
If a zone becomes unavailable, Azure undertakes networking updates such as Domain Name System (DNS) repointing.
1515

16-
- **Notification:** You can monitor zone failure events by using Azure Service Health and Resource Health. Set up alerts on these services to receive notifications of zone-level issues.
16+
- **Notification:** Azure Storage doesn't notify you when a zone is down. However, you can use [Azure Resource Health](/azure/service-health/resource-health-overview) to monitor the health of your storage account. You can also use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Storage service, including any zone failures.
17+
18+
Set up alerts on these services to receive notifications of zone-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal) and [Create and configure Resource Health alerts](/azure/service-health/resource-health-alert-arm-template-guide).
1719

1820
- **Active requests:** In-flight requests might be dropped during the recovery process and should be retried. Applications should [implement retry logic](#transient-faults) to handle these temporary interruptions.
1921
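
The "Active requests" guidance above says that dropped in-flight requests should be retried. As a minimal sketch of that pattern, the following Python example retries an operation with exponential backoff and jitter. It assumes a hypothetical `TransientError` exception that stands in for whatever retryable error your storage client raises. The function and parameter names are illustrative and aren't part of any Azure SDK.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure, such as a dropped request."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait ceiling doubles after each failed attempt, capped at max_delay.
    The actual wait is a random value between 0 and that ceiling (jitter),
    so that many clients don't retry at the same instant.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

Azure client libraries generally include their own configurable retry policies, so a hand-written loop like this one is mainly useful for understanding the pattern or for wrapping calls that don't already retry.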

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
---
2+
title: Description of Azure Storage alternative multi-region deployment introduction
3+
description: Description of Azure Storage alternative multi-region deployment introduction
4+
author: anaharris-ms
5+
ms.service: azure
6+
ms.topic: include
7+
ms.date: 07/02/2024
8+
ms.author: anaharris
9+
ms.custom: include file
10+
---
11+
12+
This section provides a high-level overview of some approaches to consider, but a complete treatment of multi-region deployment topologies for Azure Storage is outside the scope of this article.

articles/reliability/includes/storage/reliability-storage-multi-region-alternative-reasons-include.md

Lines changed: 0 additions & 2 deletions
@@ -18,5 +18,3 @@ The cross-region failover capabilities of Azure Storage might be unsuitable beca
1818
- You need to fail over to a region that isn't your primary region's pair.
1919

2020
- You need an active/active configuration across regions.
21-
22-
Instead, you can design a cross-region failover solution that meets your needs. A complete treatment of deployment topologies for Azure Storage is outside the scope of this article, but you can consider a multi-region deployment model.

articles/reliability/includes/storage/reliability-storage-multi-region-configure-enable-disable-include.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
99
ms.custom: include file
1010
---
1111

12-
- **Enable geo-redundancy on an existing storage account.** To convert an existing storage account to geo-redundant storage (GRS), see [Change how a storage account is replicated](/azure/storage/common/redundancy-migration) for step-by-step conversion procedures.
12+
- **Enable geo-redundancy on an existing storage account.** To convert an existing storage account to geo-redundant storage (GRS), see [Change how a storage account is replicated](/azure/storage/common/redundancy-migration).
1313

1414
> [!WARNING]
1515
> After your account is reconfigured for geo-redundancy, it might take a significant amount of time before existing data in the new primary region is fully copied to the new secondary region.

articles/reliability/includes/storage/reliability-storage-multi-region-down-experience-include.md

Lines changed: 15 additions & 9 deletions
@@ -22,31 +22,35 @@ This section describes what to expect when a storage account is configured for g
2222
> [!WARNING]
2323
> An unplanned failover [can result in data loss](/azure/storage/common/storage-disaster-recovery-guidance#anticipate-data-loss-and-inconsistencies). Before you initiate a customer-managed failover, decide whether the restoration of service justifies the risk of data loss.
2424
25-
- **Notification:** Region failure events can be monitored through Azure Service Health and Resource Health. Set up alerts on these services to receive notifications of region-level issues.
26-
25+
- **Notification:** Azure Storage doesn't notify you when a region is down. However, you can use [Azure Resource Health](/azure/service-health/resource-health-overview) to monitor the health of your storage account. You can also use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Storage service, including any region failures.
26+
27+
Set up alerts on these services to receive notifications of region-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal) and [Create and configure Resource Health alerts](/azure/service-health/resource-health-alert-arm-template-guide).
28+
2729
- **Active requests:** During the failover process, both the primary and secondary storage account endpoints become temporarily unavailable for both reads and writes. Any active requests might be dropped, and client applications need to retry after the failover completes.
2830

29-
- **Expected data loss:** Data loss is common during an unplanned failover because of the asynchronous replication lag, which means that recent writes might not be replicated. You can check the [Last Sync Time property](/azure/storage/common/last-sync-time-get) to understand how much data might be lost during an unplanned failover. You can typically expect the data loss to be less than 15 minutes, but that time isn't guaranteed.
31+
- **Expected data loss:** Data loss is common during an unplanned failover because of the asynchronous replication lag, which means that recent writes might not be replicated. You can check the [Last Sync Time property](/azure/storage/common/last-sync-time-get) to understand how much data might be lost during an unplanned failover. Expected data loss is often referred to as the recovery point objective (RPO). You can typically expect the data loss (RPO) to be less than 15 minutes, but that time isn't guaranteed.
3032

31-
- **Expected downtime:** Failover typically completes within 60 minutes, depending on the account size and complexity.
33+
- **Expected downtime:** The amount of expected downtime is often referred to as the recovery time objective (RTO). Customer-managed failover typically completes within 60 minutes (in other words, the expected RTO is 60 minutes), depending on the account size and complexity.
3234

3335
- **Traffic rerouting:** As the failover completes, Azure automatically updates the storage account endpoints so that applications don't need to be reconfigured. If your application keeps Domain Name System (DNS) entries cached, it might be necessary to clear the cache to ensure that the application sends traffic to the new primary region.
3436

3537
- **Post-failover configuration:** After an unplanned failover completes, your storage account in the destination region uses the locally redundant storage (LRS) tier. If you need to geo-replicate it again, you need to re-enable geo-redundant storage (GRS) and wait for the data to be replicated to the new secondary region.
3638

3739
For more information about how to initiate customer-managed failover, see [How customer-managed (unplanned) failover works](/azure/storage/common/storage-failover-customer-managed-unplanned) and [Initiate a storage account failover](/azure/storage/common/storage-initiate-account-failover).
3840

39-
- **Customer-managed failover (planned):** Use a planned failover when storage remains operational in the primary region, but you need to fail over your whole solution to a secondary region for another reason.
41+
- **Customer-managed failover (planned):** Use a planned failover when storage remains operational in the primary region, but you need to fail over your whole solution to a secondary region for another reason. For example, another Azure service might be experiencing a problem and you need to switch to using a secondary region for your whole solution, or you might use a planned failover to conduct a disaster recovery drill for compliance and audit purposes.
4042

4143
- **Detection and response:** You're responsible for deciding to fail over. You typically make this decision if you need to fail over between regions even though your storage account is healthy. For example, you might trigger a failover when there's a major outage of another application component that you can't recover from in the primary region.
4244

43-
- **Notification:** Region failure events can be monitored through Azure Service Health and Resource Health. Set up alerts on these services to receive notifications of region-level issues.
45+
- **Notification:** Azure Storage doesn't notify you when a region is down. However, you can use [Azure Resource Health](/azure/service-health/resource-health-overview) to monitor the health of your storage account. You can also use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Storage service, including any region failures.
46+
47+
Set up alerts on these services to receive notifications of region-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal) and [Create and configure Resource Health alerts](/azure/service-health/resource-health-alert-arm-template-guide).
4448

4549
- **Active requests:** During the failover process, both the primary and secondary storage account endpoints become temporarily unavailable for both reads and writes. Any active requests might be dropped, and client applications need to retry after the failover completes.
4650

47-
- **Expected data loss:** No data loss is expected because the failover process waits for all data to be synchronized.
51+
- **Expected data loss:** No data loss is expected (in other words, the expected RPO is zero) because the failover process waits for all data to be synchronized.
4852

49-
- **Expected downtime:** Failover typically completes within 60 minutes, depending on the account size and complexity. During the failover process, both the primary and secondary storage account endpoints become temporarily unavailable for both reads and writes.
53+
- **Expected downtime:** Failover typically completes within 60 minutes (in other words, the expected RTO is 60 minutes), depending on the account size and complexity. During the failover process, both the primary and secondary storage account endpoints become temporarily unavailable for both reads and writes.
5054

5155
- **Traffic rerouting:** As the failover completes, Azure automatically updates the storage account endpoints so that applications don't need to be reconfigured. If your application keeps DNS entries cached, it might be necessary to clear the cache to ensure that the application sends traffic to the new primary region.
5256
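
The "Expected data loss" guidance above suggests checking the Last Sync Time property to understand how much data might be lost in an unplanned failover. As a rough sketch of that arithmetic, the following Python example assumes you already retrieved the Last Sync Time as a timezone-aware `datetime`. The function names are hypothetical, and the 15-minute default mirrors the typical figure quoted above, which isn't guaranteed.

```python
from datetime import datetime, timedelta, timezone


def potential_data_loss_window(last_sync_time, now=None):
    """Return how far the secondary region might lag behind the primary.

    Writes made after last_sync_time might not have been replicated yet,
    so they could be lost if an unplanned failover started right now.
    """
    if last_sync_time.tzinfo is None:
        raise ValueError("last_sync_time must be timezone-aware")
    if now is None:
        now = datetime.now(timezone.utc)
    # Clamp at zero in case of small clock differences.
    return max(now - last_sync_time, timedelta(0))


def exceeds_rpo_target(last_sync_time, target=timedelta(minutes=15), now=None):
    """Return True if the current replication lag is larger than the target."""
    return potential_data_loss_window(last_sync_time, now=now) > target
```

A check like this one could run before you initiate a customer-managed unplanned failover, to help you decide whether restoring service justifies the potential data loss.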

@@ -56,7 +60,9 @@ This section describes what to expect when a storage account is configured for g
5660

5761
- **Microsoft-managed failover:** In the rare case of a major disaster, where Microsoft determines that the primary region is permanently unrecoverable, Microsoft might initiate automatic failover to the secondary region. This process is managed entirely by Microsoft and requires no customer action. The amount of time that elapses before failover occurs depends on the severity of the disaster and the time required to assess the situation.
5862

59-
- **Notification:** Region failure events can be monitored through Azure Service Health and Resource Health. Set up alerts on these services to receive notifications of region-level issues.
63+
- **Notification:** Azure Storage doesn't notify you when a region is down. However, you can use [Azure Resource Health](/azure/service-health/resource-health-overview) to monitor the health of your storage account. You can also use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Storage service, including any region failures.
64+
65+
Set up alerts on these services to receive notifications of region-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal) and [Create and configure Resource Health alerts](/azure/service-health/resource-health-alert-arm-template-guide).
6066

6167
> [!IMPORTANT]
6268
> Use customer-managed failover options to develop, test, and implement your disaster recovery plans. **Don't rely on Microsoft-managed failover**, which might only be used in extreme circumstances. A Microsoft-managed failover is likely initiated for an entire region. It can't be initiated for individual storage accounts, subscriptions, or customers. Failover might occur at different times for different Azure services. We recommend that you use customer-managed failover.

articles/reliability/includes/storage/reliability-storage-multi-region-support-failover-types-include.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Azure Storage supports three types of failover for different scenarios.
1515

1616
- **Customer-managed unplanned failover:** You're responsible for initiating recovery if there's a region-wide storage failure in your primary region.
1717

18-
- **Customer-managed planned failover:** You're responsible for initiating recovery if another part of your solution has a failure in your primary region. You need to switch your whole solution over to a secondary region.
18+
- **Customer-managed planned failover:** You're responsible for initiating recovery if another part of your solution has a failure in your primary region and you need to switch your whole solution over to a secondary region. Use a planned failover when storage remains operational in the primary region but you need to fail over your whole solution to a secondary region, such as for disaster recovery drills that help you meet compliance and audit requirements.
1919

2020
- **Microsoft-managed failover:** In exceptional circumstances, Microsoft might initiate failover for all geo-redundant storage (GRS) accounts in a region. However, Microsoft-managed failover is a last resort and is expected to only be performed after an extended period of outage. You shouldn't rely on Microsoft-managed failover.
2121

articles/reliability/overview-reliability-guidance.md

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ This section provides links to reliability guidance for many Azure services. Eac
7878
|Azure Event Grid| [Reliability in Event Grid](./reliability-event-grid.md)||
7979
|Azure Event Hubs||[Best practices for insulating Azure Event Hubs applications against outages and disasters](/azure/event-hubs/event-hubs-business-continuity-outages-disasters)|
8080
|Azure ExpressRoute|| [Design for high availability with ExpressRoute](../expressroute/designing-for-high-availability-with-expressroute.md?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json) </p>[Design for disaster recovery with ExpressRoute private peering](../expressroute/designing-for-disaster-recovery-with-expressroute-privatepeering.md?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json)|
81-
|Azure Files||[Choose the right redundancy option](/azure/storage/files/files-disaster-recovery?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json#choose-the-right-redundancy-option)</p>[Disaster recovery and failover for Azure Files](/azure/storage/files/files-disaster-recovery?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json)|
81+
|Azure Files| [Reliability in Azure Files](reliability-storage-files.md)||
8282
|Azure Firewall| [Reliability in Azure Firewall](./reliability-firewall.md) ||
8383
|Azure Functions| [Reliability in Azure Functions ](reliability-functions.md)||
8484
|Azure guest configuration||[Azure guest configuration availability](../governance/machine-configuration/overview.md?toc=/azure/reliability/toc.json&bc=/azure/reliability/breadcrumb/toc.json#availability) |
