Commit 8e2a87f

Merge pull request #2902 from MicrosoftDocs/main639116750230103524sync_temp
For protected branch, push strategy should use PR and merge to target branch method to work around git push error
2 parents 70e6f7a + 8a09296 commit 8e2a87f

18 files changed

Lines changed: 200 additions & 16 deletions

docs/data-engineering/materialized-lake-views/overview-materialized-lake-view.md

Lines changed: 3 additions & 0 deletions
@@ -30,6 +30,9 @@ Materialized lake views aren't the right choice for every scenario. Consider alt
 - **Non-SQL logic** such as ML inference, API calls, or complex Python processing — use Spark notebooks instead
 - **High-frequency streaming data** that requires sub-second updates — consider [Real-Time Intelligence](../../real-time-intelligence/overview.md) instead

+> [!NOTE]
+> This feature is currently not available in the South Central US region.
+
 ## Get started with materialized lake views

 To create your first materialized lake view in Microsoft Fabric, see [Get started with materialized lake views](get-started-with-materialized-lake-views.md). For a complete walkthrough that builds a medallion architecture, see [Tutorial: Build a medallion architecture with materialized lake views](tutorial.md).

docs/data-factory/dataflow-gen2-partitioned-compute.md

Lines changed: 18 additions & 16 deletions
@@ -3,7 +3,7 @@ title: Use partitioned compute in Dataflow Gen2 (Preview)
 description: Overview on how to use partitioned compute for parallel processing in Dataflow Gen2 with CI/CD.
 ms.reviewer: miescobar
 ms.topic: how-to
-ms.date: 01/28/2026
+ms.date: 04/13/2026
 ms.custom: dataflows
 ---

@@ -12,41 +12,41 @@ ms.custom: dataflows
 > [!NOTE]
 > Partitioned compute is currently in preview and only available in Dataflow Gen2 with CI/CD.

-Partitioned compute is a capability of the Dataflow Gen2 engine that allows parts of your dataflow logic to run in parallel, reducing the time to complete its evaluations.
+Partitioned compute is a capability of the Dataflow Gen2 engine that lets parts of your dataflow logic run in parallel, reducing the time to finish its evaluations.

 Partitioned compute targets scenarios where the Dataflow engine can efficiently fold operations that can partition the data source and process each partition in parallel. For example, in a scenario where you're connecting to multiple files stored in Azure Data Lake Storage Gen2, you can partition the list of files from your source, efficiently retrieve the partitioned list of files using [query folding](/power-query/query-folding-basics), use the [combine files experience](/power-query/combine-files-overview), and process all files in parallel.

 > [!NOTE]
-> Only connectors for Azure Data Lake Storage Gen2, Fabric Lakehouse, Folder, and Azure Blob Storage emit the correct script to use partitioned compute. The connector for SharePoint doesn't support it today.
+> Only the connectors for Azure Data Lake Storage Gen2, Folder, and Azure Blob Storage emit the correct script to use partitioned compute. The connectors for SharePoint and Fabric Lakehouse don't support it today.

 ## How to set partitioned compute

-In order to use this capability, you need to:
+To use this capability, follow these steps:

 - [Enable Dataflow settings](#enable-dataflow-settings)

 - [Query with partition keys](#query-with-partition-key)

 ### Enable Dataflow settings

-Inside the Home tab of the ribbon, select the **Options** button to display its dialog. Navigate to the Scale section and enable the setting that reads **Allow use of partitioned compute**.
+On the Home tab of the ribbon, select the **Options** button to open its dialog. Go to the Scale section and turn on the setting that reads **Allow use of partitioned compute**.

-:::image type="content" source="media/dataflow-gen2-partitioned-compute/partitioned-compute-setting.png" alt-text="Screenshot of the partitioned compute setting inside the scale section of the options dialog.":::
+:::image type="content" source="media/dataflow-gen2-partitioned-compute/partitioned-compute-setting.png" alt-text="Screenshot of the partitioned compute setting inside the Scale section of the Options dialog.":::

 Enabling this option has two purposes:

-- Allows your Dataflow to use partitioned compute if discovered through your query scripts
+- Lets your Dataflow use partitioned compute if it's discovered through your query scripts

 - Experiences like combine files now automatically create partition keys that can be used for partitioned compute

-You also need to enable the setting in the **Privacy** section to **Allow combining data from multiple sources**.
+You also need to turn on the setting in the **Privacy** section to **Allow combining data from multiple sources**.

 ### Query with partition key

 > [!NOTE]
 > To use partitioned compute, make sure that your query is set to be staged.

-After enabling the setting, you can use the combine files experience for a data source that uses the file system view such as Azure Data Lake Storage Gen2. When the combine files experience finalizes, you notice that your query has an **Added custom** step, which has a script similar to this:
+After turning on the setting, you can use the combine files experience for a data source that uses the file system view, such as Azure Data Lake Storage Gen2. When the combine files experience finishes, your query has an **Added custom** step with a script similar to this:

 ```M code
 let
@@ -61,21 +61,23 @@ in

 This script, and specifically the `withPartitionKey` component, drives the logic for how your Dataflow tries to partition your data and evaluate things in parallel.

-You can use the [Table.PartitionKey](/powerquery-m/table-partitionkey) function against the **Added custom** step. This function returns the partition key of the specified table. For the case above, it's the column *RelativePath*. You can get a distinct list of the values in that column to understand all the partitions that will be used during the dataflow run.
+You can use the [Table.PartitionKey](/powerquery-m/table-partitionkey) function against the **Added custom** step. This function returns the partition key of the specified table. For the case above, it's the column *RelativePath*. You can get a distinct list of the values in that column to learn all the partitions that are used during the dataflow run.

 > [!IMPORTANT]
 > The partition key column must remain in the query for partitioned compute to be applied.

 ## Considerations and recommendations

-- For scenarios where your data source doesn't support folding the transformations for your files, it's recommended that you choose partitioned compute over fast copy.
+- **Partitioned compute vs. fast copy**: If your data source doesn't support folding the transformations for your files, we recommend that you choose partitioned compute over fast copy.

-- For best performance, use this method to load data directly to staging as your destination or to a Fabric Warehouse.
+- **Lakehouse file access**: To connect to files in the Lakehouse, we recommend using the Azure Data Lake Storage Gen2 connector by passing the URL of the `Files` node.

-- Only the latest partition run is stored in the Dataflow Staging Lakehouse and retuned by the Dataflow Connector.  Consider using  a data destination to retain data for each separate partitioned.
+- **Best performance**: Use this method to load data directly to staging as your destination or to a Fabric Warehouse.

-- Use the *Sample transform file* from the **Combine files** experience to introduce transformations that should happen in every file.
+- **Data retention**: Only the latest partition run is stored in the Dataflow Staging Lakehouse and returned by the Dataflow Connector. Consider using a data destination to retain data for each separate partition.

-- Partitioned compute only supports a subset of transformations. The performance might vary depending on your source and set of transformations used.
+- **File transformations**: Use the *Sample transform file* from the **Combine files** experience to introduce transformations that should happen in every file.

-- Billing for the dataflow run is based on capacity unit (CU) consumption.
+- **Supported transformations**: Partitioned compute only supports a subset of transformations. Performance might vary depending on your source and the set of transformations used.
+
+- **Billing**: Billing for the dataflow run is based on capacity unit (CU) consumption.
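
The mechanics this file describes — group a source file list by a partition key column such as *RelativePath*, take the distinct key values as the set of partitions, and evaluate each partition in parallel — can be sketched conceptually. The sketch below is plain Python with made-up file names, purely an illustration of the idea, not the Dataflow engine's implementation or M code:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical source listing; "RelativePath" plays the role of the
# partition key column from the doc's example.
files = [
    {"RelativePath": "2024/01/", "Name": "a.csv"},
    {"RelativePath": "2024/01/", "Name": "b.csv"},
    {"RelativePath": "2024/02/", "Name": "c.csv"},
]

def partitions(rows, key):
    """Group rows by the partition key column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)

def process_partition(item):
    """Placeholder per-partition work (stands in for the folded query)."""
    key, rows = item
    return key, [row["Name"].upper() for row in rows]

groups = partitions(files, "RelativePath")
# Distinct partition key values: the partitions used during the run.
print(sorted(groups))

# One task per partition, evaluated in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(process_partition, groups.items()))
print(results)
```

The sketch also shows why the IMPORTANT note matters: if the partition key column were dropped from the rows, there would be nothing left to group by, and the whole input would collapse into a single serial evaluation.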
8 binary image files changed (screenshots; previews not shown)

0 commit comments
