File: articles/stream-analytics/capture-event-hub-data-parquet.md
Lines changed: 13 additions & 11 deletions
@@ -1,11 +1,12 @@
 ---
-title: Event Hubs to Azure Data Lake in Parquet format
+title: Event Hubs Data Capture to Azure Data Lake Parquet
 description: Learn how to use the node code editor to automatically capture the streaming data in Event Hubs in an Azure Data Lake Storage Gen2 account in Parquet format.
 author: xujxu
 ms.author: xujiang1
+ms.reviewer: spelluru
 ms.service: azure-stream-analytics
 ms.topic: how-to
-ms.date: 01/23/2025
+ms.date: 03/26/2026
 ms.custom:
 - mvc
 - sfi-image-nochange
@@ -20,7 +21,7 @@ This article explains how to use the no code editor to automatically capture str

 If you don't have an event hub, create one by following instructions from [Quickstart: Create an event hub](../event-hubs/event-hubs-create.md).

-If you don't have a Data Lake Storage Gen2 account, create one by following instructions from [Create a storage account](../storage/blobs/create-data-lake-storage-account.md)
+If you don't have a Data Lake Storage Gen2 account, create one by following instructions from [Create a storage account](../storage/blobs/create-data-lake-storage-account.md).
 - The data in your Event Hubs instance (event hub) must be serialized in either JSON, CSV, or Avro format. On the **Event Hubs Instance** page for your event hub, follow these steps:
 1. On the left menu, select **Data Explorer**.
 1. In the middle pane, select **Send events**.
@@ -33,8 +34,8 @@ This article explains how to use the no code editor to automatically capture str

 Use the following steps to configure a Stream Analytics job to capture data in Azure Data Lake Storage Gen2.

-1. In the Azure portal, navigate to your event hub.
-1. On the left menu, select **Process Data** under **Features**. Then, select **Start** on the **Capture data to ADLS Gen2 in Parquet format** card.
+1. In the Azure portal, go to your event hub.
+1. On the left menu, under **Features**, select **Process Data**. Then, select **Start** on the **Capture data to ADLS Gen2 in Parquet format** card.

 :::image type="content" source="./media/capture-event-hub-data-parquet/process-event-hub-data-cards.png" alt-text="Screenshot showing the Process Event Hubs data start cards." lightbox="./media/capture-event-hub-data-parquet/process-event-hub-data-cards.png" :::
 1. Enter a **name** for your Stream Analytics job, and then select **Create**.
@@ -51,25 +52,26 @@ Use the following steps to configure a Stream Analytics job to capture data in A
 1. Select the **Azure Data Lake Storage Gen2** tile to edit the configuration.
 1. On the **Azure Data Lake Storage Gen2** configuration page, follow these steps:
 1. Select the subscription, storage account name, and container from the drop-down menu.
-1. Once the subscription is selected, the authentication method and storage account key should be automatically filled in.
+1. After you select the subscription, the authentication method and storage account key are automatically filled in.
 1. Select **Parquet** for **Serialization** format.

 :::image type="content" source="./media/capture-event-hub-data-parquet/job-top-settings.png" alt-text="Screenshot showing the Data Lake Storage Gen2 configuration page." lightbox="./media/capture-event-hub-data-parquet/job-top-settings.png":::
-1. For streaming blobs, the directory path pattern is expected to be a dynamic value. It's required for the date to be a part of the file path for the blob – referenced as `{date}`. To learn about custom path patterns, see to [Azure Stream Analytics custom blob output partitioning](stream-analytics-custom-path-patterns-blob-storage-output.md).
+1. For streaming blobs, the directory path pattern is a dynamic value. The date must be part of the file path for the blob – referenced as `{date}`. To learn about custom path patterns, see [Azure Stream Analytics custom blob output partitioning](stream-analytics-custom-path-patterns-blob-storage-output.md).

 :::image type="content" source="./media/capture-event-hub-data-parquet/blob-configuration.png" alt-text="First screenshot showing the Blob window where you edit a blob's connection configuration." lightbox="./media/capture-event-hub-data-parquet/blob-configuration.png" :::
 1. Select **Connect**
 1. When the connection is established, you see fields that are present in the output data.
 1. Select **Save** on the command bar to save your configuration.

-:::image type="content" source="./media/capture-event-hub-data-parquet/save-configuration.png" alt-text="Screenshot showing the Save button selected on the command bar." :::
-1. Select **Start** on the command bar to start the streaming flow to capture data. Then in the Start Stream Analytics job window:
+:::image type="content" source="./media/capture-event-hub-data-parquet/save-configuration.png" alt-text="Screenshot showing the Save button on the command bar." :::
+1. Select **Start** on the command bar to start the streaming flow to capture data. Then in the **Start Stream Analytics job** window:
 1. Choose the output start time.
 1. Select the pricing plan.
 1. Select the number of Streaming Units (SU) that the job runs with. SU represents the computing resources that are allocated to execute a Stream Analytics job. For more information, see [Streaming Units in Azure Stream Analytics](stream-analytics-streaming-unit-consumption.md).

 :::image type="content" source="./media/capture-event-hub-data-parquet/start-job.png" alt-text="Screenshot showing the Start Stream Analytics job window where you set the output start time, streaming units, and error handling." lightbox="./media/capture-event-hub-data-parquet/start-job.png" :::
-1. You should see the Stream Analytic job in the **Stream Analytics job** tab of the **Process data** page for your event hub.
+1. Select **X** at the top-right corner to close the **Stream Analytics job** window.
+1. You see the Stream Analytic job in the **Stream Analytics job** tab of the **Process data** page for your event hub.

 :::image type="content" source="./media/capture-event-hub-data-parquet/process-data-page-jobs.png" alt-text="Screenshot showing the Stream Analytics job on the Process data page." lightbox="./media/capture-event-hub-data-parquet/process-data-page-jobs.png" :::

@@ -84,7 +86,7 @@ Use the following steps to configure a Stream Analytics job to capture data in A
 1. Verify that the Parquet files are generated in the Azure Data Lake Storage container.

 :::image type="content" source="./media/capture-event-hub-data-parquet/verify-captured-data.png" alt-text="Screenshot showing the generated Parquet files in the Azure Data Lake Storage container." lightbox="./media/capture-event-hub-data-parquet/verify-captured-data.png" :::
-1. Back on the Event Hubs instance page, select **Process data** on the left menu. Switch to the **Stream Analytics jobs** tab. Select **Open metrics** to monitor it. Add **Input metrics** to the chart using the **Add metric** on the toolbar. If you don't see the metrics in the chart, wait for a few minutes, and refresh the page.
+1. Now, on the Event Hubs instance page, select **Process data** in the left menu. Switch to the **Stream Analytics jobs** tab. Select **Open metrics** to monitor it. Add **Input metrics** to the chart using the **Add metric** on the toolbar. If you don't see the metrics in the chart, wait for a few minutes, and refresh the page.

 :::image type="content" source="./media/capture-event-hub-data-parquet/open-metrics-link.png" alt-text="Screenshot showing Open Metrics link selected." lightbox="./media/capture-event-hub-data-parquet/open-metrics-link.png" :::
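
Reviewer note (illustration only, not part of the change): the path-pattern step above says the date must appear in the blob path as `{date}`. A minimal Python sketch of how such a token could expand might look like the following. The function name `expand_path_pattern`, the `captured/` folder, and the `YYYY/MM/DD` format string are assumptions made for this example; they are not taken from the article or from the Stream Analytics service.

```python
from datetime import datetime, timezone


def expand_path_pattern(pattern: str, event_time: datetime,
                        date_format: str = "YYYY/MM/DD") -> str:
    """Replace the {date} token with a formatted UTC date (hypothetical helper)."""
    if "{date}" not in pattern:
        # Mirrors the article's requirement that the date is part of the path.
        raise ValueError("path pattern must contain the {date} token")
    # Translate the assumed YYYY/MM/DD-style tokens into strftime directives.
    strftime_format = (
        date_format.replace("YYYY", "%Y").replace("MM", "%m").replace("DD", "%d")
    )
    utc_time = event_time.astimezone(timezone.utc)
    return pattern.replace("{date}", utc_time.strftime(strftime_format))


# Hypothetical folder name and timestamp, chosen only for the example.
print(expand_path_pattern("captured/{date}",
                          datetime(2026, 3, 26, 14, 5, tzinfo=timezone.utc)))
# prints "captured/2026/03/26"
```

The sketch only shows why a dynamic value is needed: each day's events land under a different folder. The real service resolves the pattern itself.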
File: articles/stream-analytics/stream-analytics-get-started-with-azure-stream-analytics-to-process-data-from-iot-devices.md
Lines changed: 27 additions & 25 deletions
@@ -1,11 +1,13 @@
 ---
-title: Process real-time IoT data streams
+title: Real-Time IoT Data Processing With Azure Stream Analytics
 description: Shows you how to create stream processing logic to gather data from Internet of Things (IoT) devices. It uses a real-world IoT use case to demonstrate.
-author: AliciaLiMicrosoft
-ms.author: ali
+#customer intent: As a data engineer, I want to process real-time IoT data streams using Azure Stream Analytics so that I can gain actionable insights from sensor data.
+author: AliciaLiMicrosoft
+ms.author: ali
+ms.reviewer: spelluru
 ms.service: azure-stream-analytics
 ms.topic: how-to
-ms.date: 01/23/2025
+ms.date: 03/26/2026
 # Customer intent: I want to learn how to process real-time IoT data streams with Azure Stream Analytics.
 ---
 # Process real-time IoT data streams with Azure Stream Analytics
@@ -19,9 +21,9 @@ In this article, you learn how to create stream-processing logic to gather data

 ## Scenario

-Contoso, a company in the industrial automation space, has automated its manufacturing process. The machinery in this plant has sensors that are capable of emitting streams of data in real time. In this scenario, a production floor manager wants to have real-time insights from the sensor data to look for patterns and take actions on them. You can use Stream Analytics Query Language (SAQL) over the sensor data to find interesting patterns from the incoming stream of data.
+Contoso, a company in the industrial automation space, automated its manufacturing process. The machinery in this plant has sensors that emit streams of data in real time. In this scenario, a production floor manager wants to have real-time insights from the sensor data to look for patterns and take actions on them. You can use Stream Analytics Query Language (SAQL) over the sensor data to find interesting patterns from the incoming stream of data.

-In this example, the data is generated from a Texas Instruments sensor tag device. The payload of the data is in JSON format as shown in the following sample snippet:
+In this example, the data comes from a Texas Instruments sensor tag device. The payload of the data is in JSON format as shown in the following sample snippet:

 ```json
 {
@@ -38,23 +40,23 @@ For ease of use, this getting started guide provides a sample data file, which w

 ## Create a Stream Analytics job

-1. Navigate to the [Azure portal](https://portal.azure.com).
-1. On the left navigation menu, select **All services**, select **Analytics**, hover the mouse over **Stream Analytics jobs**, and then select **Create**.
+1. Go to the [Azure portal](https://portal.azure.com).
+1. On the left navigation menu, select **All services**. Under **Analytics**, select **Stream Analytics jobs**, and then select **Create**.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-02.png" alt-text="Screenshot that shows the selection of Create button for a Stream Analytics job." lightbox="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-02.png":::
-1. On the **New Stream Analytics job** page, follow these steps:
+1. On **New Stream Analytics job**, follow these steps:
 1. For **Subscription**, select your **Azure subscription**.
-1. For **Resource group**, select an existing resource group or create a resource group.
+1. For **Resource group**, select an existing resource group or create a new one.
 1. For **Name**, enter a unique name for the Stream Analytics job.
-1. Select the **Region** in which you want to deploy the Stream Analytics job. Use the same location for your resource group and all resources to increase the processing speed and reduce costs.
+1. Select the **Region** where you want to deploy the Stream Analytics job. Use the same location for your resource group and all resources to increase the processing speed and reduce costs.
 1. Select **Review + create**.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-03.png" alt-text="Screenshot that shows the New Stream Analytics job page.":::
-1. On the **Review + create** page, review settings, and select **Create**.
-1. After the deployment succeeds, select **Go to resource** to navigate to the **Stream Analytics job** page for your Stream Analytics job.
+1. On **Review + create**, review the settings, and select **Create**.
+1. After the deployment succeeds, select **Go to resource** to go to the **Stream Analytics job** page for your Stream Analytics job.

 ## Create an Azure Stream Analytics query
-After your job is created, write a query. You can test queries against sample data without connecting an input or output to your job.
+After you create your job, write a query. You can test queries against sample data without connecting an input or output to your job.

 1. Download the [HelloWorldASA-InputStream.json](https://github.com/Azure/azure-stream-analytics/blob/master/Samples/GettingStarted/HelloWorldASA-InputStream.json) from GitHub.
 1. On the **Azure Stream Analytics job** page in the Azure portal, select **Query** under **Job topology** from the left menu.
@@ -68,26 +70,26 @@ After your job is created, write a query. You can test queries against sample da
 FROM
 yourinputalias
 ```
-
+1. Select **Save** on the command bar to save your query.
 1. In the bottom pane, select **Upload sample input**, select the `HelloWorldASA-InputStream.json` file you downloaded, and select **OK**.

-:::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-05.png" alt-text="Screenshot that shows the **Query** page with **Upload sample input** selected." lightbox="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-05.png":::
-1. Notice that a preview of the data is automatically populated in the **Input preview** table.
+:::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-05.png" alt-text="Screenshot that shows the Query page with Upload sample input selected." lightbox="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-05.png":::
+1. A preview of the data automatically appears in the **Input preview** table.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/input-preview.png" alt-text="Screenshot that shows sample input data in the Input preview tab.":::

 ### Query: Archive your raw data

-The simplest form of query is a pass-through query that archives all input data to its designated output. This query is the default query populated in a new Azure Stream Analytics job.
+The simplest form of query is a pass-through query that archives all input data to its designated output. This query is the default query in a new Azure Stream Analytics job.

 1. Select **Test query** on the toolbar.
-2. View the results in the **Test results** tab in the bottom pane.
+1. View the results in the **Test results** tab in the bottom pane.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-07.png" alt-text="Screenshot that shows the sample query and its results.":::

 ### Query: Filter the data based on a condition

-Let's update the query to filter the results based on a condition. For example, the following query shows events that come from `sensorA`."
+Update the query to filter the results based on a condition. For example, the following query shows events that come from `sensorA`.

 1. Update the query with the following sample:

@@ -103,7 +105,7 @@ Let's update the query to filter the results based on a condition. For example,
 yourinputalias
 WHERE dspl='sensorA'
 ```
-2. Select **Test query** to see the results of the query.
+1. Select **Test query** to see the results of the query.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-08.png" alt-text="Screenshot that shows the query results with the filter.":::

@@ -129,7 +131,7 @@ Let's make our query more detailed. For every type of sensor, we want to monitor

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-10.png" alt-text="Screenshot that shows the query with a tumbling window.":::

-You should see results that contain only 245 rows and names of sensors where the average temperate is greater than 100. This query groups the stream of events by **dspl**, which is the sensor name, over a **Tumbling Window** of 30 seconds. Temporal queries must state how you want time to progress. By using the **TIMESTAMP BY** clause, you have specified the **OUTPUTTIME** column to associate times with all temporal calculations. For detailed information, read about [Time Management](/stream-analytics-query/time-management-azure-stream-analytics) and [Windowing functions](/stream-analytics-query/windowing-azure-stream-analytics).
+You should see results that contain only 245 rows and names of sensors where the average temperate is greater than 100. This query groups the stream of events by **dspl**, which is the sensor name, over a **Tumbling Window** of 30 seconds. Temporal queries must state how you want time to progress. By using the **TIMESTAMP BY** clause, you specified the **OUTPUTTIME** column to associate times with all temporal calculations. For detailed information, read about [Time Management](/stream-analytics-query/time-management-azure-stream-analytics) and [Windowing functions](/stream-analytics-query/windowing-azure-stream-analytics).

 ### Query: Detect absence of events

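
Reviewer note (illustration only, not part of the change): the paragraph above describes grouping events by `dspl` over a 30-second tumbling window and keeping sensors whose average temperature exceeds 100. The following Python sketch approximates that logic on an in-memory list. The field names `time` (seconds) and `temp`, the sample values, and the simple floor-division bucketing are assumptions for this example; the service's exact window-boundary handling may differ.

```python
from collections import defaultdict

WINDOW_SECONDS = 30  # window length named in the article


def tumbling_average(events, threshold=100.0):
    """Average `temp` per sensor per fixed 30-second window; keep averages above threshold."""
    buckets = defaultdict(list)
    for event in events:
        # Each event belongs to exactly one non-overlapping window.
        window_index = int(event["time"] // WINDOW_SECONDS)
        buckets[(window_index, event["dspl"])].append(event["temp"])

    results = []
    for (window_index, sensor), temps in sorted(buckets.items()):
        average = sum(temps) / len(temps)
        if average > threshold:
            results.append({
                "window_start": window_index * WINDOW_SECONDS,
                "dspl": sensor,
                "avg_temp": average,
            })
    return results


# Made-up readings, not taken from the sample file.
sample_events = [
    {"time": 0, "dspl": "sensorA", "temp": 98.0},
    {"time": 10, "dspl": "sensorA", "temp": 110.0},
    {"time": 20, "dspl": "sensorB", "temp": 90.0},
    {"time": 35, "dspl": "sensorA", "temp": 95.0},
]
print(tumbling_average(sample_events))
# prints [{'window_start': 0, 'dspl': 'sensorA', 'avg_temp': 104.0}]
```

Because the windows never overlap, every event contributes to one average only, which is why the result set is much smaller than the input.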
@@ -151,13 +153,13 @@ How can we write a query to find a lack of input events? Let's find the last tim
 DATEDIFF(second,t1,t2) BETWEEN 1 and 5
 WHERE t2.dspl IS NULL
 ```
-2. Select **Test query** to see the results of the query.
+1. Select **Test query** to see the results of the query.

 :::image type="content" source="./media/stream-analytics-get-started-with-iot-devices/stream-analytics-get-started-with-iot-devices-11.png" alt-text="Screenshot that shows the query that detects absence of events.":::


-Here we use a **LEFT OUTER** join to the same data stream (self-join). For an **INNER** join, a result is returned only when a match is found. For a **LEFT OUTER** join, if an event from the left side of the join is unmatched, a row that has NULL for all the columns of the right side is returned. This technique is useful to find an absence of events. For more information, see [JOIN](/stream-analytics-query/join-azure-stream-analytics).
+This query uses a **LEFT OUTER** join to the same data stream (self-join). For an **INNER** join, a result is returned only when a match is found. For a **LEFT OUTER** join, if an event from the left side of the join is unmatched, the query returns a row that has NULL for all the columns of the right side. This technique is useful to find an absence of events. For more information, see [JOIN](/stream-analytics-query/join-azure-stream-analytics).

 ## Conclusion

-The purpose of this article is to demonstrate how to write different Stream Analytics Query Language queries and see results in the browser. However, this article is just to get you started. Stream Analytics supports various inputs and outputs and can even use functions in Azure Machine Learning to make it a robust tool for analyzing data streams. For more information about how to write queries, read the article about [common query patterns](stream-analytics-stream-analytics-query-patterns.md).
+This article demonstrates how to write different Stream Analytics Query Language queries and see results in the browser. However, this article is just to get you started. Stream Analytics supports various inputs and outputs and can even use functions in Azure Machine Learning to make it a robust tool for analyzing data streams. For more information about how to write queries, see [common query patterns](stream-analytics-stream-analytics-query-patterns.md).
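
Reviewer note (illustration only, not part of the change): the LEFT OUTER self-join explanation above can be approximated in plain Python. The sketch keeps every "left" event that has no later event 1 to 5 seconds afterwards, which corresponds to the rows where the right side is NULL. Matching on the same `dspl` value, the `time` field in seconds, and the sample data are assumptions for this example, since the full join condition isn't visible in this diff.

```python
def events_without_followup(events, min_gap=1, max_gap=5):
    """Return events with no later event from the same sensor within [min_gap, max_gap] seconds."""
    unmatched = []
    for left in events:
        # Emulates the join condition: same sensor and a time difference of 1 to 5 seconds.
        has_match = any(
            right["dspl"] == left["dspl"]
            and min_gap <= right["time"] - left["time"] <= max_gap
            for right in events
        )
        # Emulates WHERE t2.dspl IS NULL: keep only left rows with no match.
        if not has_match:
            unmatched.append(left)
    return unmatched


# Made-up events, not taken from the sample file.
sample_events = [
    {"time": 0, "dspl": "sensorA"},
    {"time": 3, "dspl": "sensorA"},
    {"time": 20, "dspl": "sensorA"},
    {"time": 1, "dspl": "sensorB"},
]
print(events_without_followup(sample_events))
# prints [{'time': 3, 'dspl': 'sensorA'}, {'time': 20, 'dspl': 'sensorA'}, {'time': 1, 'dspl': 'sensorB'}]
```

The event at `time` 0 is dropped because the event at `time` 3 follows it within the 1 to 5 second range. The others are reported as the last reading before a gap.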