Commit 2c88812: acrolinx and link fix
1 parent 8d06a35 commit 2c88812
7 files changed, 29 additions & 29 deletions

learn-pr/wwl-data-ai/scale-containers-azure-container-apps/includes/1-introduction.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Containerized applications require dynamic scaling to handle varying workloads while controlling costs. This module guides you through configuring automatic horizontal scaling in Azure Container Apps to build responsive, cost-efficient container deployments that adapt to real-time demand.
 
-Imagine you're a developer building an order processing service for an e-commerce platform. The application experiences predictable traffic spikes during sales events and unpredictable bursts when marketing campaigns launch. Your current deployment uses fixed resources, leading to poor response times during peak periods and wasted capacity during quiet hours. The operations team reports that costs have increased substantially because the application runs at full capacity around the clock. Meanwhile, customer complaints about slow checkout times have increased during flash sales.
+Imagine you're a developer building an order processing service for an e-commerce platform. The application experiences predictable traffic spikes during sales events and unpredictable bursts when marketing campaigns launch. Your current deployment uses fixed resources, leading to poor response times during peak periods and wasted capacity during quiet hours. The operations team reports that costs increased substantially because the application runs at full capacity around the clock. Meanwhile, customer complaints about slow checkout times increased during flash sales.
 
 Your team decides to implement automatic scaling in Azure Container Apps. You need the application to scale out rapidly when HTTP requests increase, process messages from Azure Service Bus queues during order fulfillment, and scale back to zero during idle periods to minimize costs. The platform must handle both synchronous API traffic and asynchronous background processing with different scaling behaviors. Leadership expects the solution to reduce infrastructure costs by at least 40% while maintaining response times under 200 milliseconds during peak load.

learn-pr/wwl-data-ai/scale-containers-azure-container-apps/includes/2-configure-scale-rules.md

Lines changed: 5 additions & 5 deletions
@@ -6,17 +6,17 @@ Scale definitions in Azure Container Apps consist of three components: limits, r
 
 Azure Container Apps is powered by [KEDA (Kubernetes Event-driven Autoscaling)](https://keda.sh/), which provides the underlying scaling infrastructure. When you configure scale rules, the platform translates your settings into KEDA specifications that monitor your defined triggers and adjust replica counts accordingly. Each replica is an instance of your container app that runs independently and can handle requests.
 
-The default scale behavior creates up to 10 replicas with a minimum of zero when ingress is enabled and no custom rules are defined. If ingress is disabled and you don't specify a minimum replica count or custom scale rule, your container app scales to zero and cannot restart because there is no trigger to activate it. You can configure the minimum to one or more replicas to ensure your application remains available without waiting for scale-up.
+The default scale behavior creates up to 10 replicas with a minimum of zero when ingress is enabled and no custom rules are defined. If ingress is disabled and you don't specify a minimum replica count or custom scale rule, your container app scales to zero and can't restart because there's no trigger to activate it. You can configure the minimum to one or more replicas to ensure your application remains available without waiting for scale-up.
 
 Billing in Azure Container Apps depends on replica count. When your application scales to zero, you incur no compute charges. Replicas that are running but not actively processing requests are billed at a lower idle rate. Setting a minimum replica count of one or more ensures availability but increases costs compared to scale-to-zero configurations.
 
 ## Configure HTTP scale rules
 
 HTTP scaling adjusts replica count based on concurrent HTTP requests to your container app. The platform calculates concurrent requests by counting the number of requests received in the past 15 seconds and dividing by 15. When this value exceeds your configured threshold, the platform creates additional replicas to handle the load.
 
-The default HTTP concurrency threshold is 10 requests per replica. You can adjust this value based on your application's capacity and response time requirements. Lower thresholds trigger scaling earlier, providing more headroom but potentially creating more replicas than necessary. Higher thresholds maximize utilization of each replica but may cause latency increases before new replicas are available.
+The default HTTP concurrency threshold is 10 requests per replica. You can adjust this value based on your application's capacity and response time requirements. Lower thresholds trigger scaling earlier, providing more headroom but potentially creating more replicas than necessary. Higher thresholds maximize utilization of each replica but might cause latency increases before new replicas are available.
 
-HTTP scaling is appropriate for synchronous API workloads and web applications where request volume directly correlates with resource needs. This scaling type supports scale-to-zero, meaning your application can have zero replicas when no requests arrive and automatically start replicas when traffic resumes. Container Apps jobs cannot use HTTP scaling rules because jobs do not expose HTTP endpoints.
+HTTP scaling is appropriate for synchronous API workloads and web applications where request volume directly correlates with resource needs. This scaling type supports scale-to-zero, meaning your application can have zero replicas when no requests arrive and automatically start replicas when traffic resumes. Container Apps jobs can't use HTTP scaling rules because jobs don't expose HTTP endpoints.
 
 The following command creates a container app with an HTTP scale rule that triggers scaling when concurrent requests exceed 50 per replica:
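The command itself falls outside this hunk, but the scale configuration it produces corresponds to a YAML fragment along these lines (a sketch; the rule name and replica limits are assumptions, the threshold of 50 comes from the text):

```yaml
# properties.template section of a container app definition (sketch)
scale:
  minReplicas: 0        # scale to zero when idle
  maxReplicas: 10
  rules:
  - name: http-rule     # hypothetical rule name
    http:
      metadata:
        concurrentRequests: "50"   # threshold from the surrounding text
```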

@@ -45,7 +45,7 @@ Like HTTP scaling, TCP scaling supports scale-to-zero. When all TCP connections
 
 CPU and memory scaling adjust replica count based on resource utilization across your container app replicas. These rules are implemented as KEDA custom scalers and trigger scaling when average utilization exceeds your configured percentage threshold. CPU scaling monitors processor utilization, while memory scaling monitors memory consumption.
 
-Resource-based scaling has a critical limitation: CPU and memory rules cannot scale your application to zero. The platform requires at least one running replica to measure utilization, so these scaling types always maintain a minimum of one replica regardless of your configured minimum. If you need scale-to-zero capability, combine resource scaling with HTTP or event-driven rules, or use HTTP scaling as your primary trigger.
+Resource-based scaling has a critical limitation: CPU and memory rules can't scale your application to zero. The platform requires at least one running replica to measure utilization, so these scaling types always maintain a minimum of one replica regardless of your configured minimum. If you need scale-to-zero capability, combine resource scaling with HTTP or event-driven rules, or use HTTP scaling as your primary trigger.
 
 CPU scaling is appropriate for compute-intensive workloads such as image processing, video transcoding, or machine learning inference where processor utilization directly indicates capacity needs. Memory scaling suits applications with memory-intensive operations like caching, data aggregation, or processing large datasets where memory consumption reflects workload intensity.
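A CPU rule expressed as a KEDA custom scaler in the Container Apps YAML format might look like the following sketch (the rule name and the 75% utilization value are assumptions; note these rules never scale below one replica):

```yaml
scale:
  minReplicas: 1        # resource-based rules cannot reach zero anyway
  maxReplicas: 10
  rules:
  - name: cpu-rule      # hypothetical rule name
    custom:
      type: cpu
      metadata:
        type: Utilization
        value: "75"     # assumed average CPU percentage threshold
```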

@@ -73,7 +73,7 @@ scale:
 
 The scaling algorithm in Azure Container Apps uses several timing parameters that affect how quickly your application responds to load changes. Understanding these parameters helps you configure rules that balance responsiveness with stability.
 
-The polling interval determines how frequently the platform checks your scale triggers. For custom scalers including CPU, memory, and event-driven triggers, the polling interval is 30 seconds. HTTP and TCP rules use a 15-second calculation window. This means changes in load may not trigger scaling for up to 30 seconds after they occur.
+The polling interval determines how frequently the platform checks your scale triggers. For custom scalers including CPU, memory, and event-driven triggers, the polling interval is 30 seconds. HTTP and TCP rules use a 15-second calculation window. This means changes in load might not trigger scaling for up to 30 seconds after they occur.
 
 The cool-down period is how long the platform waits after the last scaling event before considering scale-down to zero replicas. The default cool-down period is 300 seconds (five minutes). This delay prevents rapid scale-down when traffic temporarily drops and helps avoid repeated scale-up and scale-down cycles for bursty workloads.
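The 15-second window calculation and threshold comparison described in this file can be sketched in a few lines of Python (illustrative only; the request counts and threshold are hypothetical):

```python
import math

def concurrent_requests(requests_last_15s: int) -> float:
    # Container Apps derives HTTP concurrency from the number of
    # requests received in the past 15 seconds, divided by 15.
    return requests_last_15s / 15

def desired_replicas(concurrent: float, threshold: int, max_replicas: int) -> int:
    # Scale out when average concurrency exceeds the per-replica
    # threshold, capped at the configured maximum replica count.
    return min(max_replicas, math.ceil(concurrent / threshold))

# 1,500 requests in the last 15 seconds -> concurrency of 100;
# at a threshold of 50 per replica, that calls for 2 replicas.
print(desired_replicas(concurrent_requests(1500), 50, 10))  # 2
```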

learn-pr/wwl-data-ai/scale-containers-azure-container-apps/includes/3-event-driven-scaling-keda.md

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-Event-driven scaling enables your container apps to respond to external signals beyond HTTP traffic. Azure Container Apps integrates with KEDA (Kubernetes Event-driven Autoscaling) to provide scaling based on message queues, event streams, and other Azure services. This capability is essential for applications that process asynchronous workloads where scaling based on request volume alone does not reflect actual work being performed.
+Event-driven scaling enables your container apps to respond to external signals beyond HTTP traffic. Azure Container Apps integrates with KEDA (Kubernetes Event-driven Autoscaling) to provide scaling based on message queues, event streams, and other Azure services. This capability is essential for applications that process asynchronous workloads where scaling based on request volume alone doesn't reflect actual work being performed.
 
 ## Understand KEDA integration

@@ -51,7 +51,7 @@ Azure Event Hubs scaling is designed for high-throughput streaming scenarios whe
 
 The metadata parameters for Event Hubs scaling include `consumerGroup`, `unprocessedEventThreshold`, and `checkpointStrategy`. The `unprocessedEventThreshold` sets the number of unprocessed events per partition that triggers scaling. The `checkpointStrategy` specifies how the scaler determines checkpoint positions, with `blobMetadata` being the recommended approach for applications using Azure Blob Storage for checkpointing.
 
-Event Hubs partitions affect the maximum effective replica count. Since each partition can only be read by one consumer at a time within a consumer group, your application cannot benefit from more replicas than partitions. If your Event Hub has 32 partitions, setting `maxReplicas` higher than 32 provides no additional scaling benefit.
+Event Hubs partitions affect the maximum effective replica count. Since each partition can be read by one consumer at a time within a consumer group, your application can't benefit from more replicas than partitions. If your Event Hub has 32 partitions, setting `maxReplicas` higher than 32 provides no additional scaling benefit.
 
 The following YAML configuration demonstrates Event Hubs scaling with checkpoint-based lag monitoring:
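The referenced YAML is not included in this hunk; a sketch built from the metadata parameters named above might look like this (the rule name, secret name, and threshold value are assumptions):

```yaml
scale:
  minReplicas: 0
  maxReplicas: 32       # no benefit beyond the partition count
  rules:
  - name: eventhub-rule # hypothetical rule name
    custom:
      type: azure-eventhub
      metadata:
        consumerGroup: "$Default"
        unprocessedEventThreshold: "100"  # assumed per-partition threshold
        checkpointStrategy: "blobMetadata"
      auth:
      - secretRef: eventhub-connection    # assumed Container Apps secret
        triggerParameter: connection
```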

learn-pr/wwl-data-ai/scale-containers-azure-container-apps/includes/4-keda-scalers-custom-workloads.md

Lines changed: 5 additions & 5 deletions
@@ -6,13 +6,13 @@ Azure Container Apps supports any ScaledObject-based KEDA scaler, providing acce
 
 When evaluating whether a specific scaler meets your requirements, consider the authentication methods it supports, the metrics it exposes, and how those metrics translate to replica counts. Review the [KEDA scalers documentation](https://keda.sh/docs/scalers/) for detailed specifications of each scaler, including required and optional metadata parameters, supported authentication mechanisms, and example configurations.
 
-Scalers are categorized by their maintainer. Microsoft maintains Azure-native scalers with direct support. Community-maintained scalers receive contributions from the open-source community and may have varying levels of documentation and support. External scalers run as separate components and require additional deployment steps not covered by the built-in Container Apps configuration.
+Scalers are categorized by their maintainer. Microsoft maintains Azure-native scalers with direct support. Community-maintained scalers receive contributions from the open-source community and might have varying levels of documentation and support. External scalers run as separate components and require additional deployment steps not covered by the built-in Container Apps configuration.
 
 ## Configure Apache Kafka scaling
 
 Apache Kafka scaling triggers replica changes based on consumer group lag. The scaler monitors the difference between the latest offset in each partition and the committed offset of your consumer group. When lag accumulates, the scaler increases replica count to process messages faster and reduce the backlog.
 
-The key metadata parameters for Kafka scaling include `bootstrapServers`, `consumerGroup`, `topic`, and `lagThreshold`. The `lagThreshold` parameter sets the lag per partition that triggers scaling. For example, if you set `lagThreshold` to 100 and your consumer group has 500 messages of lag across partitions, the scaler calculates that 5 replicas are needed.
+The key metadata parameters for Kafka scaling include `bootstrapServers`, `consumerGroup`, `topic`, and `lagThreshold`. The `lagThreshold` parameter sets the lag per partition that triggers scaling. For example, if you set `lagThreshold` to 100 and your consumer group has 500 messages of lag across partitions, the scaler calculates that five replicas are needed.
 
 Kafka authentication typically uses SASL mechanisms. You configure credentials as Container Apps secrets and reference them in the scaler authentication settings. The scaler supports SASL/PLAIN, SASL/SCRAM, and TLS authentication depending on your Kafka cluster configuration.
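The lag arithmetic in the example above, including the one-consumer-per-partition cap that also applies to Event Hubs, can be sketched as follows (the partition count is a hypothetical value, not from the source):

```python
import math

def kafka_desired_replicas(total_lag: int, lag_threshold: int,
                           partitions: int, max_replicas: int) -> int:
    # KEDA-style replica target: total consumer-group lag divided by
    # the per-partition lagThreshold, never exceeding the partition
    # count (one consumer per partition within a group) or maxReplicas.
    by_lag = math.ceil(total_lag / lag_threshold)
    return min(by_lag, partitions, max_replicas)

# 500 messages of lag with lagThreshold=100 -> five replicas
print(kafka_desired_replicas(500, 100, partitions=8, max_replicas=10))  # 5
```

With only three partitions, the same lag would yield three replicas, since extra consumers beyond the partition count would sit idle.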

@@ -103,17 +103,17 @@ Follow these steps to convert a KEDA scaler specification:
 
 1. Configure scale limits using `--min-replicas` and `--max-replicas`. These correspond to the `minReplicaCount` and `maxReplicaCount` in KEDA ScaledObject specifications.
 
-The Container Apps format differs from native KEDA in several ways. Container Apps doesn't support the full TriggerAuthentication resource type; instead, you reference secrets directly in the scale rule. Some advanced KEDA features like external scalers or custom scaling intervals may not be available or may require different configuration approaches.
+The Container Apps format differs from native KEDA in several ways. Container Apps doesn't support the full TriggerAuthentication resource type; instead, you reference secrets directly in the scale rule. Some advanced KEDA features like external scalers or custom scaling intervals might not be available or might require different configuration approaches.
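The conversion these steps describe can be illustrated side by side (a sketch; the queue name, threshold, and secret names are hypothetical):

```yaml
# Native KEDA ScaledObject trigger:
#   triggers:
#   - type: azure-servicebus
#     metadata:
#       queueName: orders
#       messageCount: "25"
#     authenticationRef:
#       name: servicebus-trigger-auth   # TriggerAuthentication resource
#
# Equivalent Container Apps scale rule: the secret is referenced
# directly in the rule, with no TriggerAuthentication resource.
scale:
  minReplicas: 0        # minReplicaCount in KEDA
  maxReplicas: 10       # maxReplicaCount in KEDA
  rules:
  - name: servicebus-rule
    custom:
      type: azure-servicebus
      metadata:
        queueName: orders
        messageCount: "25"
      auth:
      - secretRef: servicebus-connection  # assumed secret name
        triggerParameter: connection
```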

 ## Best practices
 
 - **Start with Azure-native scalers:** Azure Service Bus, Event Hubs, and Storage Queue scalers have first-party support and are maintained by Microsoft. Use these for Azure resources before considering community-maintained alternatives.
 
-- **Test scaler behavior in staging:** Custom scalers may have unexpected polling or threshold behaviors. Validate scaling patterns in a non-production environment before deploying to production. Monitor how quickly scaling responds to load changes and verify that thresholds produce the expected replica counts.
+- **Test scaler behavior in staging:** Custom scalers might have unexpected polling or threshold behaviors. Validate scaling patterns in a non-production environment before deploying to production. Monitor how quickly scaling responds to load changes and verify that thresholds produce the expected replica counts.
 
 - **Combine scheduled and reactive scaling:** Use cron scalers to establish baseline capacity before known peak periods. Add event-driven or HTTP scalers to handle variations and unexpected spikes. This combination ensures capacity is available when needed while still responding to actual demand.
 
-- **Document scaler configurations:** Custom scaler metadata is not self-documenting. Maintain documentation that explains why specific thresholds were chosen, how authentication is configured, and what metrics drive scaling decisions. This documentation helps team members understand and maintain scaling configurations over time.
+- **Document scaler configurations:** Custom scaler metadata isn't self-documenting. Maintain documentation that explains why specific thresholds were chosen, how authentication is configured, and what metrics drive scaling decisions. This documentation helps team members understand and maintain scaling configurations over time.
 
 ## Additional resources
