**learn-pr/wwl-data-ai/ingest-streaming-data-use-azure-stream-analytics-synapse/7-knowledge-check.yml** (+24 −13)

```yaml
durationInMinutes: 3
quiz:
  questions:
  - content: "Which Azure Stream Analytics window type groups events based on periods of inactivity between consecutive events?"
    choices:
    - content: "Tumbling"
      isCorrect: false
      explanation: "Incorrect. A tumbling window groups events into fixed-size, nonoverlapping intervals regardless of gaps between events."
    - content: "Session"
      isCorrect: true
      explanation: "Correct. A session window groups events that arrive within a configurable timeout of each other, creating variable-length windows bounded by inactivity gaps."
    - content: "Snapshot"
      isCorrect: false
      explanation: "Incorrect. A snapshot window groups events that share the same timestamp using System.Timestamp()."
  - content: "You need to continuously write processed stream events to files in a data lake for later batch analytics. Which output type should you configure?"
    choices:
    - content: "Azure SQL Database"
      isCorrect: false
      explanation: "Incorrect. An Azure SQL Database output writes to a relational table, not to files in a data lake."
    - content: "Blob storage/ADLS Gen2"
      isCorrect: true
      explanation: "Correct. A Blob storage/ADLS Gen2 output writes data to files in Azure Data Lake Storage Gen2, which is suitable for batch analytics workloads."
    - content: "Power BI"
      isCorrect: false
      explanation: "Incorrect. A Power BI output writes to a streaming dataset for near real-time visualization, not to file-based storage."
  - content: "You want to forward enriched events from a Stream Analytics job to a downstream application via a message hub. Which output type should you use?"
    choices:
    - content: "Azure Event Hubs"
      isCorrect: true
      explanation: "Correct. An Azure Event Hubs output forwards events to an event hub, enabling downstream consumers such as other jobs, functions, or applications to receive the enriched stream."
    - content: "Azure SQL Database"
      isCorrect: false
      explanation: "Incorrect. An Azure SQL Database output writes to a relational table, not to a message hub."
    - content: "Blob storage/ADLS Gen2"
      isCorrect: false
      explanation: "Incorrect. A Blob storage/ADLS Gen2 output writes to files in a data lake, not to a message hub for real-time event forwarding."
```
Suppose a manufacturing company captures real-time telemetry data from factory floor sensors, and wants to monitor equipment performance, detect anomalies, and archive event data for long-term analysis. A common approach is to use a stream processing engine to continuously filter and aggregate the flow of sensor events, and route the results to one or more destinations, such as a data lake for storage, a relational database for operational reporting, or a message hub for downstream alerting systems.

Azure Stream Analytics is a fully managed, cloud-based stream processing service that enables you to build real-time analytics pipelines. It connects to streaming data sources such as Azure Event Hubs, Azure IoT Hub, and Azure Data Lake Storage, processes data using a SQL-based query language, and writes results to a wide range of output destinations.

A typical pattern for real-time data processing in Azure consists of the following sequence:

1. A real-time source of data is captured in an event ingestor, such as Azure Event Hubs or Azure IoT Hub.
2. The captured data is perpetually filtered, aggregated, or enriched by an Azure Stream Analytics query.
3. The results of the query are written to one or more output destinations, such as a data lake, a relational database, another event hub, or a real-time dashboard.

In this module, you'll learn how to configure Azure Stream Analytics jobs to process streaming data and route the results to a variety of output destinations.
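The capture/process/route pattern can be sketched as a single Stream Analytics query. This is an illustrative sketch only: the input and output aliases (`[factory-telemetry]`, `[lake-output]`) and the field names are hypothetical, not taken from the module.

```sql
-- Hypothetical sketch of the pattern: read from an event hub input,
-- filter and aggregate the stream, and write to a data lake output.
-- Aliases and field names are illustrative, not from the source.
SELECT
    DeviceID,
    AVG(Temperature) AS AvgTemperature
INTO
    [lake-output]
FROM
    [factory-telemetry] TIMESTAMP BY EventEnqueuedUtcTime
WHERE Temperature IS NOT NULL
GROUP BY DeviceID, TumblingWindow(minute, 5)
```

The `TIMESTAMP BY` clause tells the job which field to treat as event time, and the tumbling window scopes the aggregate to fixed five-minute intervals.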
Azure Stream Analytics can route the results of stream processing to multiple types of output destinations, depending on whether you need to store, analyze, forward, or visualize the data.

## Data lake storage

A common use case is to write stream processing results to a data lake hosted in Azure Data Lake Storage Gen2. Data stored in a data lake can later be processed and queried using batch analytics tools such as Apache Spark or serverless SQL engines. This approach is well suited to scenarios where you want to retain raw or lightly processed event data for historical analysis, compliance, or machine learning workloads.

## Relational database storage

When streaming results need to be available to applications or reporting tools that rely on relational data, you can write the output of a Stream Analytics job to a table in Azure SQL Database or an Azure Synapse Analytics dedicated SQL pool. This approach enables dashboards and reports to query the most recently ingested data using standard SQL.

## Real-time dashboards

For scenarios that require live visualization of streaming metrics, such as monitoring sensor readings or tracking website activity in real time, Azure Stream Analytics can write output directly to a Power BI streaming dataset. Power BI then renders the data in near real time without requiring a scheduled data refresh.

## Event forwarding

Azure Stream Analytics can also write filtered or enriched events to another Azure Event Hubs instance. This pattern is used to build multi-stage streaming pipelines, in which one Stream Analytics job performs initial filtering or enrichment and forwards the results to a downstream consumer such as another job, an Azure Function, or a custom application.
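A single Stream Analytics job can serve several of these destinations at once, because one job may define multiple outputs and route a different result set to each with its own **SELECT...INTO** statement. A minimal sketch, assuming hypothetical output aliases `[archive-output]` (a data lake output) and `[alerts-output]` (an event hub output):

```sql
-- Archive every reading to the data lake output.
-- Aliases are hypothetical, not from the source.
SELECT
    SensorID,
    ReadingValue
INTO
    [archive-output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime

-- Forward only anomalous (negative) readings to the event hub
-- output for downstream alerting.
SELECT
    SensorID,
    ReadingValue
INTO
    [alerts-output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
WHERE ReadingValue < 0
```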
**learn-pr/wwl-data-ai/ingest-streaming-data-use-azure-stream-analytics-synapse/includes/3-configure-inputs-outputs.md** (+18 −6)

All Azure Stream Analytics jobs include at least one input and output. In most cases, inputs reference sources of streaming data (though you can also define inputs for static reference data to augment the streamed event data). Outputs determine where the results of the stream processing query will be sent. In the case of data ingestion into Azure Synapse Analytics, the output usually references an Azure Data Lake Storage Gen2 container or a table in a dedicated SQL pool database.

> [!NOTE]
> Azure Stream Analytics offers two authoring experiences: the traditional SQL query editor covered in this module, and a no-code drag-and-drop editor. The no-code editor lets you build complete jobs, including inputs, transformations, and Synapse outputs, visually without writing SQL. You can access it from the **Overview** page of a Stream Analytics job in the Azure portal, or from Azure Event Hubs via **Process Data**. For more information, see [No-code stream processing in Azure Stream Analytics](/azure/stream-analytics/no-code-stream-processing).

## Streaming data inputs

...

> [!NOTE]
> For more information about streaming inputs, see [Stream data as input into Stream Analytics](/azure/stream-analytics/stream-analytics-define-inputs?azure-portal=true) in the Azure Stream Analytics documentation.

## Azure SQL Database outputs

If you need to load the results of your stream processing into a relational table, use an **Azure SQL Database** output. The output configuration specifies the server name, database name, and the existing table into which data should be written. The table must already exist, and its schema must exactly match the fields and their types produced by your query.

The recommended authentication method is **managed identity**, which eliminates password management overhead and avoids the 90-day token expiration that affects user-based authentication methods. Using managed identity also enables fully automated Stream Analytics deployments without embedded credentials. Alternatively, you can use SQL Server authentication with a username and password.

> [!NOTE]
> For more information about using an Azure SQL Database output, see [Azure SQL Database output from Azure Stream Analytics](/azure/stream-analytics/sql-database-output?azure-portal=true) in the Azure Stream Analytics documentation.

## Azure Data Lake Storage Gen2 outputs

If you need to write the results of stream processing to files in a data lake, use a **Blob storage/ADLS Gen2** output. The output configuration includes details of the storage account in which the container is defined, authentication settings to connect to it, and details of the files to be created. You can specify the file format, including CSV, JSON, Parquet, and Delta formats. You can also specify custom patterns to define the folder hierarchy in which the files are saved; for example, a pattern such as *YYYY/MM/DD* generates a folder hierarchy based on the current year, month, and day.

You can specify minimum and maximum row counts for each batch, which determines the number of output files generated (each batch creates a new file). You can also configure the *write mode* to control when the data is written for a time window: appending each row as it arrives, or writing all rows once (which ensures "exactly once" delivery).

> [!NOTE]
> For more information about using a Blob storage/ADLS Gen2 output, see [Blob storage and Azure Data Lake Gen2 output from Azure Stream Analytics](/azure/stream-analytics/blob-storage-azure-data-lake-gen2-output?azure-portal=true) in the Azure Stream Analytics documentation.

## Additional output types

Azure Stream Analytics supports a wide range of output destinations beyond data lakes and relational databases:

- **Azure Event Hubs**: forward filtered or enriched events to another event hub for downstream consumers or multi-stage pipelines.
- **Power BI**: write aggregated streaming metrics directly to a Power BI streaming dataset for near real-time visualization without a scheduled refresh.
- **Azure Cosmos DB**: write results to a globally distributed NoSQL database.
- **Azure Functions**: trigger serverless functions in response to stream events.

> [!NOTE]
> For the full list of supported output types, see [Understand outputs from Azure Stream Analytics](/azure/stream-analytics/stream-analytics-define-outputs?azure-portal=true) in the Azure Stream Analytics documentation.
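The schema-matching requirement for an Azure SQL Database output can be illustrated with a pair of hypothetical definitions: the destination table is created ahead of time, and each query field maps to a column with a matching name and compatible type. Table, column, and alias names below are illustrative, not from the source.

```sql
-- Destination table, created in the Azure SQL database before the
-- job starts. Names and types are hypothetical.
CREATE TABLE dbo.SensorReadings
(
    ReadingTime DATETIME2 NOT NULL,
    SensorID NVARCHAR(50) NOT NULL,
    ReadingValue FLOAT NOT NULL
);

-- Stream Analytics query whose output fields match the table columns.
-- AS renames a field; CAST converts it to a compatible type.
SELECT
    EventEnqueuedUtcTime AS ReadingTime,
    SensorID,
    CAST(ReadingValue AS float) AS ReadingValue
INTO
    [sql-output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
```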
**learn-pr/wwl-data-ai/ingest-streaming-data-use-azure-stream-analytics-synapse/includes/4-define-query.md** (+5 −5)

## Selecting input fields

The simplest approach to capturing event data from an input stream is to select the required field values for every event using a **SELECT...INTO** query, as shown here:

```sql
SELECT
    EventEnqueuedUtcTime AS ReadingTime,
    SensorID,
    ReadingValue
INTO
    [output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
```

> [!TIP]
> When using an **Azure SQL Database** output to write the results to a relational table, the schema of the results produced by the query must match the table into which the data is to be loaded. You can use **AS** clauses to rename fields, and cast them to alternative (compatible) data types as necessary.

## Filtering event data

...

```sql
SELECT
    SensorID,
    ReadingValue
INTO
    [output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
WHERE ReadingValue < 0
```

...

```sql
SELECT
    SensorID,
    MAX(ReadingValue) AS MaxReading
INTO
    [output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
```
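The diff ends before showing the rest of the aggregation query, but in Stream Analytics an aggregate such as `MAX` must be scoped to a temporal window with a `GROUP BY` clause. A hedged completion, assuming a 60-second tumbling window (the window type and size are illustrative, not from the source):

```sql
-- Hedged completion: the GROUP BY window below is an assumption,
-- not taken from the source diff.
SELECT
    SensorID,
    MAX(ReadingValue) AS MaxReading
INTO
    [output]
FROM
    [streaming-input] TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY SensorID, TumblingWindow(second, 60)
```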