Commit baf4d37

Merge pull request #53482 from MicrosoftDocs/NEW-queue-process-operations-service-bus
New queue process operations service bus - from release branch
2 parents e9e19b0 + 8409bdf commit baf4d37

17 files changed

Lines changed: 781 additions & 0 deletions
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: Introduction
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 3
+content: |
+  [!include[](includes/1-introduction.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.explore-service-bus-concepts
+title: Explore Azure Service Bus concepts and messaging in AI architectures
+metadata:
+  title: Explore Azure Service Bus Concepts and Messaging in AI Architectures
+  description: Explore Azure Service Bus concepts and messaging in AI architectures
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 10
+content: |
+  [!include[](includes/2-explore-service-bus-concepts.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.choose-queues-topics-subscriptions
+title: Choose between queues and topics with subscriptions
+metadata:
+  title: Choose Between Queues and Topics with Subscriptions
+  description: Choose between queues and topics with subscriptions
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 9
+content: |
+  [!include[](includes/3-choose-queues-topics-subscriptions.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.structure-messages-ai-workloads
+title: Structure messages for AI workloads
+metadata:
+  title: Structure Messages for AI Workloads
+  description: Structure messages for AI workloads
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 10
+content: |
+  [!include[](includes/4-structure-messages-ai-workloads.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.process-messages-reliably
+title: Process messages reliably
+metadata:
+  title: Process Messages Reliably
+  description: Process messages reliably
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 12
+content: |
+  [!include[](includes/5-process-messages-reliably.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.exercise-process-messages
+title: Exercise - Process messages with Azure Service Bus
+metadata:
+  title: Exercise - Process Messages with Azure Service Bus
+  description: Exercise - Process messages with Azure Service Bus
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 30
+content: |
+  [!include[](includes/6-exercise-process-messages.md)]
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.knowledge-check
+title: Module assessment
+metadata:
+  title: Module Assessment
+  description: Module assessment
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 5
+content: "Choose the best response for each of the following questions."
+quiz:
+  questions:
+  - content: "Your AI platform completes a document analysis, and three independent services need to react to the result: a notification service alerts the user, an audit service logs the result for compliance, and a dashboard service updates metrics. Which Service Bus entity type supports this requirement?"
+    choices:
+    - content: "A topic with three subscriptions"
+      isCorrect: true
+      explanation: "A topic distributes copies of each message to all attached subscriptions. Each subscription acts as an independent virtual queue, so the notification, audit, and dashboard services each receive and process their own copy of the message independently."
+    - content: "A queue with three competing consumers"
+      isCorrect: false
+      explanation: "A queue delivers each message to exactly one competing consumer. With three consumers on one queue, only one service would receive each message, so the other two services wouldn't receive the result."
+    - content: "Three separate queues with the sender publishing to each one"
+      isCorrect: false
+      explanation: "While sending to three separate queues would deliver messages to all three services, this approach tightly couples the sender to each consumer. Topics with subscriptions decouple the sender from the receivers, allowing you to add or remove subscribers without modifying the producer's code."
+  - content: "You're building an AI inference pipeline where losing a customer's request is unacceptable. If a worker crashes while processing a message, the message must become available to another worker. Which receive mode should you configure?"
+    choices:
+    - content: "Peek-lock mode"
+      isCorrect: true
+      explanation: "Peek-lock mode locks the message without removing it from the queue. If the worker crashes before settling the message, the lock expires and Service Bus makes the message available for another consumer. This provides at-least-once delivery, which is essential when losing requests isn't acceptable."
+    - content: "Receive-and-delete mode"
+      isCorrect: false
+      explanation: "Receive-and-delete mode removes the message from the queue immediately upon delivery. If the worker crashes before completing inference, the message is permanently lost because it's already been removed from the queue."
+    - content: "Deferred receive mode"
+      isCorrect: false
+      explanation: "Deferred receive isn't a receive mode. Deferral is a settlement operation that keeps a message in the queue but removes it from regular delivery, requiring retrieval by sequence number. The two receive modes are peek-lock and receive-and-delete."
+  - content: "A message in your inference queue consistently causes a processing error every time a worker attempts it. After 10 delivery attempts, what does Service Bus do with the message?"
+    choices:
+    - content: "Moves the message to the dead-letter queue with the reason MaxDeliveryCountExceeded"
+      isCorrect: true
+      explanation: "Service Bus tracks the delivery count for each message. When the count exceeds the queue's max_delivery_count (default 10), Service Bus automatically moves the message to the dead-letter subqueue with the reason MaxDeliveryCountExceeded. This prevents a poison message from blocking the queue indefinitely."
+    - content: "Deletes the message permanently from the queue"
+      isCorrect: false
+      explanation: "Service Bus doesn't silently delete messages that exceed the max delivery count. It preserves them in the dead-letter queue so developers can inspect the failures, diagnose the root cause, and resubmit the messages after fixing the issue."
+    - content: "Returns the message to the back of the queue for continued retry attempts"
+      isCorrect: false
+      explanation: "Service Bus doesn't continue retrying indefinitely. The max_delivery_count setting exists specifically to prevent a poison message from cycling through delivery and failure in an infinite loop. Once the count is exceeded, the message moves to the dead-letter queue."
+  - content: "Your AI pipeline needs to process 500-MB document files, but Azure Service Bus Premium tier supports messages up to 100 MB via AMQP. Which pattern addresses this constraint?"
+    choices:
+    - content: "The claim-check pattern, uploading files to Azure Blob Storage and sending only the blob URI in the message"
+      isCorrect: true
+      explanation: "The claim-check pattern separates the large payload from the message. The producer uploads the file to Azure Blob Storage and sends a small Service Bus message containing only the blob URI. The consumer retrieves the full payload from storage using the URI. This works within any tier's size limits and reduces broker throughput costs."
+    - content: "Splitting the document into five 100-MB messages and reassembling them at the consumer"
+      isCorrect: false
+      explanation: "While splitting could technically work, it adds significant complexity for message ordering, reassembly, and partial failure handling. The claim-check pattern is the recommended approach because it keeps messages small, works within any tier's limits, and avoids the complexity of splitting and reassembling payloads."
+    - content: "Encoding the document as base64 and sending it as the message body on the Premium tier"
+      isCorrect: false
+      explanation: "Base64 encoding increases the payload size by approximately 33%, making a 500-MB file even larger (~667 MB). This exceeds the Premium tier's 100-MB AMQP limit. The claim-check pattern is the correct approach for payloads that exceed Service Bus message size limits."
+  - content: "You set the correlation_id property on every Service Bus message in your AI pipeline. What is the primary purpose of this property?"
+    choices:
+    - content: "Tracking a request end-to-end across all pipeline stages, from the API through processing to result delivery"
+      isCorrect: true
+      explanation: "The correlation_id provides end-to-end request tracking. The client generates a unique identifier when submitting the original request, and this ID follows the message through every stage of the pipeline. When troubleshooting a failed inference, you can search logs, dead-letter queues, and result stores by correlation ID to trace the full lifecycle of a request."
+    - content: "Enabling duplicate detection so Service Bus discards repeated submissions of the same request"
+      isCorrect: false
+      explanation: "Duplicate detection uses the message_id property, not correlation_id. When duplicate detection is enabled on a queue, Service Bus checks the message_id within the detection window to discard duplicates. The correlation_id serves a different purpose: tracking a request across pipeline stages."
+    - content: "Routing messages to specific subscriptions based on filter rules"
+      isCorrect: false
+      explanation: "While correlation filters can match on the correlation_id property, routing isn't the primary purpose of setting correlation_id on every message. The primary purpose is end-to-end request tracking across pipeline stages. Routing decisions are typically based on application properties like model_name or priority that describe the message's content, not its tracking identifier."
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+### YamlMime:ModuleUnit
+uid: learn.wwl.queue-process-operations-service-bus.summary
+title: Summary
+metadata:
+  title: Summary
+  description: Summary
+  ms.date: 06/26/2025
+  author: jeffkoms
+  ms.author: jeffko
+  ms.topic: unit
+durationInMinutes: 2
+content: |
+  [!include[](includes/8-summary.md)]
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+AI applications require asynchronous messaging to decouple request submission from inference processing and ensure reliable delivery under variable load. This module guides you through using Azure Service Bus to queue, distribute, and reliably process AI workloads on Azure.
+
+Imagine you're a developer building a document analysis platform that uses large language models to extract structured data from uploaded contracts. Clients submit documents through a web API, and each document requires between 5 and 30 seconds of processing time depending on length and complexity. During peak hours, hundreds of documents arrive within minutes, but the inference service can only process a limited number concurrently. Without a buffer between the API and the processing layer, the API becomes unresponsive under load, clients receive timeout errors, and documents are lost when processing pods restart. Your team needs a messaging layer that absorbs traffic spikes, distributes work across multiple processors, and ensures that every document is processed at least once, even when a processor fails partway through. Some downstream services also need to react to completed analyses, such as a notification service that alerts the submitter and an audit service that logs the result for compliance. The platform must handle processing failures gracefully, routing unprocessable documents to a separate queue for investigation rather than silently dropping them. Azure Service Bus provides the queuing, publish-subscribe, and dead-letter capabilities that this architecture requires.
+
+After completing this module, you'll be able to:
+
+- Explain how Azure Service Bus decouples AI application components and identify when to apply messaging patterns such as load leveling, competing consumers, and publish-subscribe.
+- Choose between Service Bus queues and topics with subscriptions based on whether an AI workflow requires single-consumer processing or fan-out to multiple consumers.
+- Structure Service Bus messages for AI workloads, including serializing prompts and model parameters, handling large payloads with the claim-check pattern, and including correlation IDs for end-to-end request tracking.
+- Process messages reliably using peek-lock receive mode, handle poison messages through dead-letter queues, and monitor the dead-letter queue for failed inferences.
+
+> [!NOTE]
+> All code examples in this module are based on the most recent version of the `azure-servicebus` library at the time of writing. The library is updated often, so check the [Azure Service Bus Python SDK documentation](/python/api/overview/azure/servicebus-readme) for the most up-to-date information.
