Commit 351d4e5

Merge pull request #304564 from johndowns/reliability-bastion-updates
Reliability Hub - Minor updates
2 parents 161c172 + b5e38ed commit 351d4e5

18 files changed

Lines changed: 78 additions & 57 deletions

articles/reliability/reliability-ai-search.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -81,7 +81,7 @@ When an availability zone experiences an outage, your search service continues t
8181

8282
+ **Traffic rerouting**: When a zone fails, Azure AI Search detects the failure and routes requests to active replicas in the surviving zones.
8383

84-
### Failback
84+
### Zone recovery
8585

8686
When the availability zone recovers, Azure AI Search automatically restores normal operations and begins routing traffic to available replicas across all zones, including the recovered zone.
8787

articles/reliability/reliability-aks.md

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,11 @@ There's no extra charge to enable availability zone support in AKS. You pay for
114114

115115
AKS also attempts to rebalance the pods across the healthy zones. If you choose to manually scale your node pool in a zone-down scenario, your pods might remain in the *Pending* state when there are no nodes available in the healthy zones. Scaling out in the remaining zones is also subject to the availability of quota and capacity for the VM SKU that you use.
116116

117-
- **Notification:** AKS doesn't notify you when a zone is down. You can use your node or pod health metrics to monitor the health of your nodes and pods.
117+
- **Notification:** AKS doesn't notify you when a zone is down. However, you can use [Azure Resource Health](/azure/service-health/resource-health-overview) to monitor the health of your cluster. You can also use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the AKS service, including any zone failures.
118+
119+
Set up alerts on these services to receive notifications of zone-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal) and [Create and configure Resource Health alerts](/azure/service-health/resource-health-alert-arm-template-guide).
120+
121+
You can also use your node or pod health metrics to monitor the health of your nodes and pods.
118122

119123
- **Active requests:** Any active requests might experience disruptions. Some requests can fail, and latency might increase while your workload fails over to another zone.
120124

@@ -126,7 +130,7 @@ There's no extra charge to enable availability zone support in AKS. You pay for
126130

127131
For more information, see [Zone resiliency considerations for AKS](/azure/aks/aks-zone-resiliency).
128132

129-
### Failback
133+
### Zone recovery
130134

131135
When the availability zone recovers, failback behavior depends on the component:
132136

articles/reliability/reliability-api-management.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -197,9 +197,9 @@ This section describes what to expect when API Management instances are configur
197197

198198
- *Zonal:* For zonal instances, when a zone is unavailable, your instance is unavailable. If you have a secondary instance in another availability zone, you're responsible for rerouting traffic to that secondary instance.
199199

200-
### Failback
200+
### Zone recovery
201201

202-
The failback behavior depends on the availability zone configuration that your instance uses.
202+
The zone recovery behavior depends on the availability zone configuration that your instance uses.
203203

204204
- **Automatic and zone-redundant:** For instances that are configured to use automatic availability zone support or are manually configured to use zone redundancy, when the availability zone recovers, API Management automatically restores units in the availability zone and reroutes traffic between your units as normal.
205205

@@ -294,7 +294,7 @@ This section describes what to expect when API Management instances are configur
294294

295295
- **Traffic rerouting:** If a region goes offline, API requests are automatically routed around the failed region to the next closest gateway.
296296

297-
### Failback
297+
### Region recovery
298298

299299
When the primary region recovers, API Management automatically restores units in the region and reroutes traffic between your units.
300300

articles/reliability/reliability-app-service-environment.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,7 @@ To learn how to create, enable, or disable a new zone-redundant App Service Envi
114114

115115
[!INCLUDE [Zone-down experience description](includes/app-service/reliability-zone-down-experience-include.md)]
116116

117-
### Failback
117+
### Zone recovery
118118

119119
[!INCLUDE [Failback description](includes/app-service/reliability-failback-include.md)]
120120

articles/reliability/reliability-app-service.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,7 @@ If you enable availability zones but specify a capacity of less than two, the pl
9595

9696
[!INCLUDE [Zone-down experience description](includes/app-service/reliability-zone-down-experience-include.md)]
9797

98-
### Failback
98+
### Zone recovery
9999

100100
[!INCLUDE [Failback description](includes/app-service/reliability-failback-include.md)]
101101

articles/reliability/reliability-application-gateway-v2.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -177,9 +177,9 @@ The following section describes what to expect when Application Gateway v2 is co
177177

178178
- *Zonal:* When a zone is unavailable, your gateway is unavailable. If you have a secondary gateway in another availability zone, you're responsible for rerouting traffic to that secondary gateway.
179179

180-
### Failback
180+
### Zone recovery
181181

182-
The failback behavior depends on the availability zone configuration that your gateway uses:
182+
The zone recovery behavior depends on the availability zone configuration that your gateway uses:
183183

184184
- *Zone-redundant:* When the affected availability zone recovers, Application Gateway automatically:
185185

articles/reliability/reliability-bastion.md

Lines changed: 13 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,6 @@ ms.topic: reliability-article
77
ms.custom: subject-reliability, references_regions
88
ms.service: azure-bastion
99
ms.date: 04/04/2025
10-
1110
---
1211

1312
# Reliability in Azure Bastion
@@ -79,27 +78,27 @@ There's no additional cost to use zone redundancy for Azure Bastion.
7978

8079
### Configure availability zone support
8180

82-
**New resources:** When you deploy a new Azure Bastion resource in a [region that supports availability zones](#regions-supported), you select the specific zones that you want to deploy to. For zone redundancy, you must select multiple zones.
83-
84-
>[!IMPORTANT]
85-
> You can't change the availability zone setting after you deploy your Azure Bastion resource.
81+
- **New resources:** When you deploy a new Azure Bastion resource in a [region that supports availability zones](#regions-supported), you select the specific zones that you want to deploy to. For zone redundancy, you must select multiple zones.
8682

87-
[!INCLUDE [Availability zone numbering](./includes/reliability-availability-zone-numbering-include.md)]
83+
[!INCLUDE [Availability zone numbering](./includes/reliability-availability-zone-numbering-include.md)]
8884

89-
**Migration:** It's not possible to change the availability zone configuration of an existing Azure Bastion resource. Instead, you need to create an Azure Bastion resource with the new configuration and delete the old one.
85+
- **Existing resources:** It's not possible to change the availability zone configuration of an existing Azure Bastion resource. Instead, you need to create an Azure Bastion resource with the new configuration and delete the old one.
9086

9187
### Normal operations
9288

9389
This section describes what to expect when Azure Bastion resources are configured for availability zone support and all availability zones are operational.
9490

95-
**Traffic routing between zones:** When you initiate an SSH or RDP session, it can be routed to an Azure Bastion instance in any of the availability zones you selected.
91+
- **Traffic routing between zones:** When you initiate an SSH or RDP session, it can be routed to an Azure Bastion instance in any of the availability zones you selected.
9692

97-
If you configure zone redundancy on Azure Bastion, a session might be sent to an Azure Bastion instance in an availability zone that's different from the virtual machine you're connecting to. In the following diagram, a request from the user is sent to an Azure Bastion instance in zone 2, although the virtual machine is in zone 1:
93+
If you configure zone redundancy on Azure Bastion, a session might be sent to an Azure Bastion instance in an availability zone that's different from the virtual machine you're connecting to. In the following diagram, a request from the user is sent to an Azure Bastion instance in zone 2, although the virtual machine is in zone 1:
9894

95+
<!-- Art Library Source# ConceptArt-0-000-015- -->
96+
:::image type="content" source="./media/bastion/bastion-instance-zone-traffic.png" alt-text="Diagram that shows Azure Bastion with three instances. A user request goes to an Azure Bastion instance in zone 2 and is sent to a VM in zone 1." border="false":::
9997

100-
:::image type="content" source="./media/bastion/bastion-instance-zone-traffic.png" alt-text="Diagram that shows Azure Bastion with three instances. A user request goes to an Azure Bastion instance in zone 2 and is sent to a VM in zone 1." border="false":::
98+
>[!TIP]
99+
>In most scenarios, the amount of cross-zone latency isn't significant. However, if you have unusually stringent latency requirements for your workloads, you should deploy a dedicated single-zone Azure Bastion instance in the virtual machine's availability zone. Keep in mind that this configuration doesn't provide zone redundancy, and we don't recommend it for most customers.
101100
102-
In most scenarios, the small amount of cross-zone latency isn't significant. However, if you have unusually stringent latency requirements for your Azure Bastion workloads, you should deploy a dedicated single-zone Azure Bastion instance in the virtual machine's availability zone. This configuration doesn't provide zone redundancy, and we don't recommend it for most customers.
101+
- **Data replication between zones:** Because Azure Bastion doesn't store state, there's no data to replicate between zones.
103102

104103
### Zone-down experience
105104

@@ -113,13 +112,9 @@ This section describes what to expect when an Azure Bastion resource is configur
113112

114113
- **Traffic rerouting:** When you use zone redundancy, new connections use Azure Bastion instances in the surviving availability zones. Overall, Azure Bastion remains operational.
115114

116-
### Failback
117-
118-
When the availability zone recovers, Azure Bastion:
115+
### Zone recovery
119116

120-
- Automatically restores instances in the availability zone.
121-
- Removes any temporary instances created in the other availability zones.
122-
- Reroutes traffic between your instances as normal.
117+
When the availability zone recovers, Azure Bastion automatically restores instances in the availability zone and reroutes traffic between your instances as normal.
123118

124119
### Testing for zone failures
125120

@@ -135,7 +130,7 @@ If you have a disaster recovery site in another Azure region, be sure to deploy
135130

136131
## Service-level agreement
137132

138-
The service-level agreement (SLA) for Azure Bastion describes the expected availability of the service and the conditions that must be met to achieve that availability expectation. To understand those conditions, it's important that you review the [SLA for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services).
133+
[!INCLUDE [SLA description](includes/reliability-service-level-agreement-include.md)]
139134

140135
## Related content
141136

articles/reliability/reliability-container-registry.md

Lines changed: 22 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ author: chasedmicrosoft
66
ms.topic: reliability-article
77
ms.custom: subject-reliability
88
ms.service: azure-container-registry
9-
ms.date: 07/23/2025
9+
ms.date: 08/22/2025
1010
#Customer intent: As an engineer responsible for business continuity, I want to understand the details of how Azure Container Registry works from a reliability perspective and plan disaster recovery strategies in alignment with the exact processes that Azure services follow during different kinds of situations.
1111
---
1212

@@ -73,14 +73,18 @@ For client applications that use Container Registry, implement appropriate retry
7373

7474
Zone redundancy protects your container registry against single zone failures by distributing registry data and operations across multiple availability zones within the region. Container image pull and push operations continue to function during zone outages, with automatic failover to healthy zones.
7575

76-
Zone redundancy is enabled by default for all Azure Container Registries in regions that support availability zones, making your resources more resilient automatically and at no additional cost. This enhancement applies to all SKUs including Basic and Standard and has been rolled out to both new and existing registries in supported regions.
76+
Zone redundancy is enabled by default for all registries in regions that support availability zones, making your resources more resilient automatically and at no additional cost. This enhancement applies to all service tiers, including Basic and Standard, and has been applied to both new and existing registries.
7777

78-
>[!IMPORTANT]
79-
>The Azure portal and CLI may not yet reflect the zone redundancy update accurately. The `zoneRedundancy` property in your registry’s configuration might still show as false even though zone redundancy is active for all registries in supported regions. We’re actively updating the portal and API surfaces to reflect this default behavior more transparently. All previously enabled features will continue to function as expected.
78+
> [!IMPORTANT]
79+
> The Azure portal and other tooling might not yet reflect the zone redundancy update accurately.
80+
>
81+
> The `zoneRedundancy` property in your registry’s configuration might still show as `false`, but zone redundancy is active for all registries in supported regions.
82+
>
83+
> We're actively updating the portal and API surfaces to reflect this default behavior more transparently. All previously enabled features continue to function as expected.
8084
8185
### Region support
8286

83-
Zone-redundant registries can only be deployed into [a region that supports availability zones](./regions-list.md).
87+
Zone-redundant registries can be deployed into [any region that supports availability zones](./regions-list.md).
8488

8589
### Considerations
8690

@@ -98,7 +102,7 @@ Zone redundancy is included with container registries at no extra cost.
98102

99103
- To migrate your artifacts between registries, you can [create a transfer pipeline](/azure/container-registry/container-registry-transfer-prerequisites). Alternatively, you can [import container images to a container registry](/azure/container-registry/container-registry-import-images).
100104

101-
- If your registry uses [geo-replication](#multi-region-support) in a region that supports Availablity Zones. Your replica will be zone redundant by default. For more information, see [Create a zone-redundant replica in Container Registry](/azure/container-registry/zone-redundancy-replica). After a geo-replication is created, you can only change the zone redundancy setting by deleting and recreating the replication.
105+
- If your registry uses [geo-replication](#multi-region-support) in a region that supports availability zones, the replica in that region will be zone-redundant automatically. For more information, see [Create a zone-redundant replica in Container Registry](/azure/container-registry/zone-redundancy-replica). After a geo-replication is created, you can only change the zone redundancy setting by deleting and recreating the replication.
102106

103107
- **Disable zone redundancy.** Zone redundancy can't be disabled.
104108

@@ -122,7 +126,11 @@ When a zone becomes unavailable, Container Registry automatically handles the fa
122126

123127
- **Detection and response:** The Container Registry platform automatically detects failures in an availability zone and initiates a response. The service automatically routes traffic to the remaining healthy zones. No manual intervention is required to initiate a zone failover.
124128

125-
- **Notification:** Zone failure events can be monitored through Azure Service Health and through registry availability metrics in Azure Monitor. Set up alerts on these services to receive notifications about zone-level problems.
129+
- **Notification**: Azure Container Registry doesn't notify you when a zone is down. However, you can use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Container Registry service, including any zone failures.
130+
131+
Set up alerts to receive notifications of zone-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal).
132+
133+
You can also monitor registry availability metrics in Azure Monitor.
126134

127135
- **Active requests:** When an availability zone is unavailable, any requests in progress that are connected to resources in the faulty availability zone are terminated. They need to be retried.
128136

@@ -132,7 +140,7 @@ When a zone becomes unavailable, Container Registry automatically handles the fa
132140

133141
- **Traffic rerouting:** The platform automatically reroutes traffic to healthy zones without requiring you to make any configuration changes.
134142

135-
### Failback
143+
### Zone recovery
136144

137145
When the affected availability zone recovers, Container Registry automatically distributes operations across all available zones, including the recovered zone. The service rebalances traffic and data distribution without requiring manual intervention or causing service disruption.
138146

@@ -206,7 +214,11 @@ When a region becomes unavailable, container operations can continue to use alte
206214

207215
- **Detection and response:** Container Registry monitors the health of each regional replica and is responsible for redirecting traffic to another region.
208216

209-
- **Notification:** Region health can be monitored through Azure Service Health. Set up alerts to receive notifications of region-level problems. You can also monitor registry availability metrics for each regional endpoint to detect problems.
217+
- **Notification**: Azure Container Registry doesn't notify you when a region is down. However, you can use [Azure Service Health](/azure/service-health/overview) to understand the overall health of the Azure Container Registry service, including any region failures.
218+
219+
Set up alerts to receive notifications of region-level problems. For more information, see [Create Service Health alerts in the Azure portal](/azure/service-health/alerts-activity-log-service-notifications-portal).
220+
221+
You can also monitor registry availability metrics for each regional endpoint in Azure Monitor.
210222

211223
- **Active requests:** Any active requests currently in flight to an unavailable region will fail and must be retried so that they can be directed to a healthy region.
212224

@@ -218,7 +230,7 @@ When a region becomes unavailable, container operations can continue to use alte
218230

219231
- **Traffic rerouting:** When a region becomes unavailable, container operations are automatically routed to another replica in a healthy region. Clients don't need to change the endpoint that they use to interact with the registry. Microsoft automatically handles routing, failover, and failback.
220232

221-
### Failback
233+
### Region recovery
222234

223235
When a region recovers, data plane operations automatically resume for that regional endpoint through Traffic Manager routing. The service synchronizes any changes that occur during the outage by using asynchronous replication with eventual consistency.
224236
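
The **Active requests** bullets in the Container Registry changes state that requests in flight to a failed zone or region are terminated and must be retried. As a rough illustration only, and not part of this commit, the following Python sketch shows one common way a client could retry such a request by using exponential backoff with jitter. The names `TransientRegistryError` and `retry_with_backoff`, and the default attempt count and delays, are hypothetical and aren't taken from any Azure SDK. A real client would catch whichever exception its registry library raises for a dropped connection.

```python
import random
import time


class TransientRegistryError(Exception):
    """Hypothetical error type that stands in for a failed registry request."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait before each retry is a random value between 0 and an
    exponentially growing cap (base_delay, 2x, 4x, ... up to max_delay).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientRegistryError:
            if attempt == max_attempts:
                # Out of attempts: surface the last failure to the caller.
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

To use the sketch, wrap the pull or push call in a function that raises `TransientRegistryError` on a dropped connection and pass that function as `operation`. The random jitter is a design choice that spreads retries out so that many clients don't all retry at the same moment after a zone or region fails.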
