---
title: Event Hubs Data Capture to Azure Data Lake Parquet
description: Learn how to use the no code editor to automatically capture the streaming data in Event Hubs in an Azure Data Lake Storage Gen2 account in Parquet format.
author: xujxu
ms.author: xujiang1
ms.reviewer: spelluru
ms.service: azure-stream-analytics
ms.topic: how-to
ms.date: 03/26/2026
ms.custom:
---
This article explains how to use the no code editor to automatically capture streaming data in Event Hubs in an Azure Data Lake Storage Gen2 account in the Parquet format.
- An Azure Event Hubs namespace with an event hub, and an Azure Data Lake Storage Gen2 account with a container to store the captured data. These resources must be publicly accessible and can't be behind a firewall or secured in an Azure virtual network.

  If you don't have an event hub, create one by following the instructions in Quickstart: Create an event hub.

  If you don't have a Data Lake Storage Gen2 account, create one by following the instructions in Create a storage account.
- The data in your Event Hubs instance (event hub) must be serialized in JSON, CSV, or Avro format. On the Event Hubs Instance page for your event hub, follow these steps:

  1. On the left menu, select Data Explorer.
  1. In the middle pane, select Send events.
  1. In the Send events pane, for Select dataset, select Stocks data.
  1. Select Send.
:::image type="content" source="./media/capture-event-hub-data-parquet/stocks-data.png" alt-text="Screenshot showing the Generate data page to generate sample stocks data." lightbox="./media/capture-event-hub-data-parquet/stocks-data.png":::
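Events sent to the hub must be serialized in one of the supported formats. As a minimal sketch, here's what serializing a stock-quote-like event to JSON might look like. The field names are illustrative assumptions, not the exact schema of the portal's Stocks data sample:

```python
import json

# Hypothetical stock event; the field names are illustrative, not the
# exact schema of the portal's "Stocks data" sample dataset.
event = {
    "symbol": "MSFT",
    "price": 412.53,
    "timestamp": "2024-01-15T09:30:00Z",
}

# Event Hubs capture supports JSON, CSV, or Avro event bodies.
# Here the event is serialized to JSON, the default in the no code editor.
body = json.dumps(event)

# The payload round-trips back into the same fields.
decoded = json.loads(body)
print(decoded["symbol"])  # prints MSFT
```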
Use the following steps to configure a Stream Analytics job to capture data in Azure Data Lake Storage Gen2.
1. In the Azure portal, go to your event hub.
1. On the left menu, under Features, select Process Data. Then, select Start on the Capture data to ADLS Gen2 in Parquet format card.

    :::image type="content" source="./media/capture-event-hub-data-parquet/process-event-hub-data-cards.png" alt-text="Screenshot showing the Process Event Hubs data start cards." lightbox="./media/capture-event-hub-data-parquet/process-event-hub-data-cards.png":::
1. Enter a name for your Stream Analytics job, and then select Create.

    :::image type="content" source="./media/capture-event-hub-data-parquet/new-stream-analytics-job-name.png" alt-text="Screenshot showing the New Stream Analytics job window where you enter the job name.":::
1. Specify the Serialization type of your data in Event Hubs and the Authentication method that the job uses to connect to Event Hubs. For this tutorial, keep the default settings. Then select Connect.

    :::image type="content" source="./media/capture-event-hub-data-parquet/event-hub-configuration.png" alt-text="Screenshot showing the Event Hubs connection configuration." lightbox="./media/capture-event-hub-data-parquet/event-hub-configuration.png":::
1. When the connection is established successfully, you see:

    - Fields that are present in the input data. You can choose Add field, or you can select the three-dot symbol next to a field to optionally remove it, rename it, or change its type.
    - A live sample of incoming data in the Data preview table under the diagram view. It refreshes periodically. You can select Pause streaming preview to view a static view of the sample input.

    :::image type="content" source="./media/capture-event-hub-data-parquet/edit-fields.png" alt-text="Screenshot showing sample data under Data Preview." lightbox="./media/capture-event-hub-data-parquet/edit-fields.png":::
1. Select the Azure Data Lake Storage Gen2 tile to edit the configuration.
1. On the Azure Data Lake Storage Gen2 configuration page, follow these steps:

    1. Select the subscription, storage account name, and container from the drop-down menus. After you select the subscription, the authentication method and storage account key are automatically filled in.
    1. Select Parquet for Serialization format.

        :::image type="content" source="./media/capture-event-hub-data-parquet/job-top-settings.png" alt-text="Screenshot showing the Data Lake Storage Gen2 configuration page." lightbox="./media/capture-event-hub-data-parquet/job-top-settings.png":::
    1. For streaming blobs, the directory path pattern is a dynamic value. The date must be part of the file path for the blob, referenced as {date}. To learn about custom path patterns, see Azure Stream Analytics custom blob output partitioning.

        :::image type="content" source="./media/capture-event-hub-data-parquet/blob-configuration.png" alt-text="Screenshot showing the Blob window where you edit a blob's connection configuration." lightbox="./media/capture-event-hub-data-parquet/blob-configuration.png":::
    1. Select Connect.
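The {date} and {time} tokens in the path pattern resolve to a date-based folder hierarchy at write time. A rough illustration of how such tokens expand, as a simplified sketch that assumes the default YYYY/MM/DD date format and HH time format:

```python
from datetime import datetime, timezone

def expand_path_pattern(pattern: str, ts: datetime) -> str:
    """Illustrative expansion of the {date} and {time} tokens in a blob
    path pattern, assuming the default formats YYYY/MM/DD and HH."""
    return (pattern
            .replace("{date}", ts.strftime("%Y/%m/%d"))
            .replace("{time}", ts.strftime("%H")))

# An event captured at 09:30 UTC on January 15, 2024 would land under:
ts = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
print(expand_path_pattern("{date}/{time}", ts))  # 2024/01/15/09
```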
1. When the connection is established, you see the fields that are present in the output data.
1. Select Save on the command bar to save your configuration.

    :::image type="content" source="./media/capture-event-hub-data-parquet/save-configuration.png" alt-text="Screenshot showing the Save button on the command bar.":::
1. Select Start on the command bar to start the streaming flow to capture data. Then, in the Start Stream Analytics job window:

    1. Choose the output start time.
    1. Select the pricing plan.
    1. Select the number of streaming units (SUs) that the job runs with. SUs represent the computing resources that are allocated to execute a Stream Analytics job. For more information, see Streaming Units in Azure Stream Analytics.

        :::image type="content" source="./media/capture-event-hub-data-parquet/start-job.png" alt-text="Screenshot showing the Start Stream Analytics job window where you set the output start time, streaming units, and error handling." lightbox="./media/capture-event-hub-data-parquet/start-job.png":::
1. Select X at the top-right corner to close the Stream Analytics job window.
1. You see the Stream Analytics job on the Stream Analytics jobs tab of the Process data page for your event hub.

    :::image type="content" source="./media/capture-event-hub-data-parquet/process-data-page-jobs.png" alt-text="Screenshot showing the Stream Analytics job on the Process data page." lightbox="./media/capture-event-hub-data-parquet/process-data-page-jobs.png":::
1. On the Event Hubs instance page for your event hub, follow these steps:

    1. On the left menu, select Data Explorer.
    1. In the middle pane, select Send events.
    1. In the Send events pane, for Select dataset, select Stocks data.
    1. Select Send.
1. Verify that Parquet files are generated in the Azure Data Lake Storage container.

    :::image type="content" source="./media/capture-event-hub-data-parquet/verify-captured-data.png" alt-text="Screenshot showing the generated Parquet files in the Azure Data Lake Storage container." lightbox="./media/capture-event-hub-data-parquet/verify-captured-data.png":::
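Besides opening a file in a Parquet viewer, a quick local sanity check is possible because the Parquet format brackets every file with the 4-byte magic sequence PAR1. A small sketch, where the sample bytes stand in for a blob downloaded from your container:

```python
def looks_like_parquet(data: bytes) -> bool:
    """Parquet files start and end with the 4-byte magic sequence b'PAR1'."""
    return len(data) >= 8 and data[:4] == b"PAR1" and data[-4:] == b"PAR1"

# Synthetic stand-in for a downloaded capture blob; a real check would read
# the bytes of a file from your Data Lake Storage container instead.
fake_parquet = b"PAR1" + b"\x00" * 16 + b"PAR1"
print(looks_like_parquet(fake_parquet))            # True
print(looks_like_parquet(b'{"symbol": "MSFT"}'))   # False: JSON, not Parquet
```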
1. On the Event Hubs instance page, select Process data on the left menu, and then switch to the Stream Analytics jobs tab. Select Open metrics to monitor the job. To add the Input metrics to the chart, select Add metric on the toolbar. If you don't see the metrics in the chart, wait a few minutes, and then refresh the page.

    :::image type="content" source="./media/capture-event-hub-data-parquet/open-metrics-link.png" alt-text="Screenshot showing Open Metrics link selected." lightbox="./media/capture-event-hub-data-parquet/open-metrics-link.png":::
Here's an example screenshot of metrics showing input and output events.
:::image type="content" source="./media/capture-event-hub-data-parquet/job-metrics.png" alt-text="Screenshot showing metrics of the Stream Analytics job." lightbox="./media/capture-event-hub-data-parquet/job-metrics.png" :::
[!INCLUDE geo-replication-stream-analytics-job]
Now you know how to use the Stream Analytics no code editor to create a job that captures Event Hubs data to Azure Data Lake Storage Gen2 in Parquet format. Next, you can learn more about Azure Stream Analytics and how to monitor the job that you created.