
Commit 3bcf7e7

Merge pull request #53087 from weslbo/updates-dp-750-jan
DP-750 Renamed Lakeflow Declarative Pipelines to Lakeflow Spark Declarative …
2 parents 630b83a + 4bf4593

24 files changed: 58 additions & 55 deletions

learn-pr/paths/azure-databricks-data-engineer-deploy-maintain-data-pipelines-workloads/index.yml

Lines changed: 1 addition & 1 deletion

@@ -18,7 +18,7 @@ summary: |
 By the end of this learning path, you'll be able to:
 
-- Design and implement robust data pipelines using notebooks and Lakeflow Declarative Pipelines
+- Design and implement robust data pipelines using notebooks and Lakeflow Spark Declarative Pipelines
 - Create and orchestrate Lakeflow Jobs with triggers, schedules, and error handling
 - Apply version control and deploy pipelines across environments using Git and Databricks Asset Bundles
 - Monitor, troubleshoot, and optimize data workloads for reliability and performance

learn-pr/wwl-databricks/create-and-organize-objects-in-unity-catalog/includes/2-apply-naming-conventions.md

Lines changed: 1 addition & 1 deletion

@@ -90,7 +90,7 @@ Name clusters according to their purpose and environment to make resource alloca
 Structure job names using the pattern `job_{layer}_{purpose}` to align with your data transformation pipeline. Examples include `job_bronze_orders_ingestion`, `job_silver_orders_transformation`, and `job_gold_sales_aggregation`. This naming pattern makes dependencies between jobs immediately visible and helps you trace data lineage across the medallion architecture.
 
-For Lakeflow Declarative Pipelines pipelines, use the prefix `pipe_` followed by the data domain or purpose: `pipe_orders_processing`, `pipe_customer_data_cleaning`.
+For Lakeflow Spark Declarative Pipelines, use the prefix `pipe_` followed by the data domain or purpose: `pipe_orders_processing`, `pipe_customer_data_cleaning`.
 
 Name streaming pipelines to include both source and target, following patterns like `stream_{source}_to_{target}`. Examples such as `stream_kafka_to_bronze` and `stream_iot_sensor_data` make data flow explicit without requiring pipeline documentation. This convention is especially valuable when managing multiple concurrent streaming workloads.
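As an illustration only, the three naming patterns in this file can be captured in small helpers. The pattern strings come from the text; the function names are hypothetical, not part of any Databricks API:

```python
# Hypothetical helpers that encode the naming conventions above.
def job_name(layer: str, purpose: str) -> str:
    """Build a job name following the job_{layer}_{purpose} pattern."""
    return f"job_{layer}_{purpose}"

def pipeline_name(domain: str) -> str:
    """Build a declarative pipeline name using the pipe_ prefix."""
    return f"pipe_{domain}"

def stream_name(source: str, target: str) -> str:
    """Build a streaming pipeline name as stream_{source}_to_{target}."""
    return f"stream_{source}_to_{target}"

print(job_name("bronze", "orders_ingestion"))  # job_bronze_orders_ingestion
print(stream_name("kafka", "bronze"))          # stream_kafka_to_bronze
```

Encoding the convention once keeps names consistent across jobs, pipelines, and streams instead of relying on each engineer remembering the pattern.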

learn-pr/wwl-databricks/design-implement-data-pipelines/3-choose-notebook-lakeflow-pipelines.yml

Lines changed: 1 addition & 1 deletion

@@ -3,7 +3,7 @@ uid: learn.wwl.design-implement-data-pipelines.choose-notebook-vs-lakeflow-pipel
 title: Choose notebook vs Lakeflow Pipelines
 metadata:
   title: Choose Notebook vs Lakeflow Pipelines
-  description: Learn how to choose between notebooks and Lakeflow Declarative Pipelines for building data pipelines in Azure Databricks, comparing flexibility, maintainability, and use cases.
+  description: Learn how to choose between notebooks and Lakeflow Spark Declarative Pipelines for building data pipelines in Azure Databricks, comparing flexibility, maintainability, and use cases.
   ms.date: 12/07/2025
   author: weslbo
   ms.author: wedebols

learn-pr/wwl-databricks/design-implement-data-pipelines/7-create-pipeline-lakeflow-declarative.yml

Lines changed: 3 additions & 3 deletions

@@ -1,9 +1,9 @@
 ### YamlMime:ModuleUnit
 uid: learn.wwl.design-implement-data-pipelines.create-pipeline-lakeflow-declarative
-title: Create pipeline with Lakeflow Declarative Pipelines
+title: Create pipeline with Lakeflow Spark Declarative Pipelines
 metadata:
-  title: Create Pipeline with Lakeflow Declarative Pipelines
-  description: Learn how to create data pipelines using Lakeflow Declarative Pipelines in Azure Databricks, including streaming tables, materialized views, and data quality expectations.
+  title: Create Pipeline with Lakeflow Spark Declarative Pipelines
+  description: Learn how to create data pipelines using Lakeflow Spark Declarative Pipelines in Azure Databricks, including streaming tables, materialized views, and data quality expectations.
   ms.date: 12/07/2025
   author: weslbo
   ms.author: wedebols

learn-pr/wwl-databricks/design-implement-data-pipelines/8-knowledge-check.yml

Lines changed: 3 additions & 3 deletions

@@ -27,14 +27,14 @@ quiz:
 - content: "Bronze layer"
   isCorrect: true
   explanation: "Correct. The bronze layer stores ingested data with minimal transformation, preserving the raw state for auditing and potential reprocessing."
-- content: "What is the primary advantage of using Lakeflow Declarative Pipelines over notebooks for production data pipelines?"
+- content: "What is the primary advantage of using Lakeflow Spark Declarative Pipelines over notebooks for production data pipelines?"
   choices:
   - content: "Declarative pipelines allow rapid prototyping and cell-by-cell inspection"
     isCorrect: false
     explanation: "Incorrect. Notebooks are better suited for rapid prototyping and interactive exploration."
   - content: "Declarative pipelines automatically handle orchestration, incremental processing, and error recovery"
     isCorrect: true
-    explanation: "Correct. Lakeflow Declarative Pipelines reduce operational burden by automatically managing orchestration, dependency analysis, incremental processing, and retry logic."
+    explanation: "Correct. Lakeflow Spark Declarative Pipelines reduce operational burden by automatically managing orchestration, dependency analysis, incremental processing, and retry logic."
  - content: "Declarative pipelines support more external library dependencies than notebooks"
    isCorrect: false
    explanation: "Incorrect. Notebooks provide more flexibility for installing and using custom Python or Scala packages."

@@ -71,7 +71,7 @@ quiz:
 - content: "To automatically trigger a retry of the failed notebook task"
   isCorrect: false
   explanation: "Incorrect. Retry policies are configured at the job level, not triggered by dbutils.notebook.exit()."
-- content: "When should a data engineer choose streaming tables over materialized views in Lakeflow Declarative Pipelines?"
+- content: "When should a data engineer choose streaming tables over materialized views in Lakeflow Spark Declarative Pipelines?"
   choices:
   - content: "When the transformation requires complex aggregations or joins"
     isCorrect: false

learn-pr/wwl-databricks/design-implement-data-pipelines/includes/1-introduction.md

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-Building reliable data pipelines requires more than connecting data sources to destinations. You need to design workflows that handle failures gracefully, scale with growing data volumes, and remain maintainable as business requirements evolve. Azure Databricks provides multiple approaches for creating data pipelines—from flexible **notebooks** with procedural code to **Lakeflow Declarative Pipelines** that automate orchestration and data quality enforcement.
+Building reliable data pipelines requires more than connecting data sources to destinations. You need to design workflows that handle failures gracefully, scale with growing data volumes, and remain maintainable as business requirements evolve. Azure Databricks provides multiple approaches for creating data pipelines—from flexible **notebooks** with procedural code to **Lakeflow Spark Declarative Pipelines** that automate orchestration and data quality enforcement.
 
 When you design data pipelines, you make decisions that affect every downstream consumer of your data. The order of operations determines whether transformations build on validated, well-structured data. Your choice between notebooks and declarative pipelines influences how much orchestration code you write versus how much the platform manages for you. **Task dependencies** in **Lakeflow Jobs** control execution flow and enable parallel processing that reduces pipeline runtime.

learn-pr/wwl-databricks/design-implement-data-pipelines/includes/3-choose-notebook-lakeflow-pipelines.md

Lines changed: 9 additions & 9 deletions

@@ -1,14 +1,14 @@
-When you build data pipelines in Azure Databricks, you have two primary approaches: **notebooks** with procedural code and **Lakeflow Declarative Pipelines**. Each approach serves different needs, and understanding when to use each helps you deliver maintainable, efficient data solutions.
+When you build data pipelines in Azure Databricks, you have two primary approaches: **notebooks** with procedural code and **Lakeflow Spark Declarative Pipelines**. Each approach serves different needs, and understanding when to use each helps you deliver maintainable, efficient data solutions.
 
 ## Understand the two approaches
 
 Notebooks execute code **step by step**. You control every aspect of data processing—from reading sources to writing outputs. This **procedural approach** gives you full control over execution flow, error handling, and optimization decisions.
 
-Lakeflow Declarative Pipelines work differently. Instead of specifying **how** to process data, you define **what** you want as the end result. You declare your **streaming tables** and **materialized views**, and the pipeline engine handles **orchestration**, **parallelization**, and **error recovery** automatically.
+Lakeflow Spark Declarative Pipelines work differently. Instead of specifying **how** to process data, you define **what** you want as the end result. You declare your **streaming tables** and **materialized views**, and the pipeline engine handles **orchestration**, **parallelization**, and **error recovery** automatically.
 
-:::image type="content" source="../media/3-understand-notebook-pipeline-approach.png" alt-text="Diagram explaining the two approaches when it comes to choosing notebooks or Lakeflow Declarative Pipelines." border="false" lightbox="../media/3-understand-notebook-pipeline-approach.png":::
+:::image type="content" source="../media/3-understand-notebook-pipeline-approach.png" alt-text="Diagram explaining the two approaches when it comes to choosing notebooks or Lakeflow Spark Declarative Pipelines." border="false" lightbox="../media/3-understand-notebook-pipeline-approach.png":::
 
-Consider a scenario where you need to ingest sales data, join it with product information, and calculate regional aggregates. With a notebook, you write explicit read, join, and aggregation commands in sequence. With Lakeflow Declarative Pipelines, you define the final tables and their relationships—the system determines the most efficient execution plan.
+Consider a scenario where you need to ingest sales data, join it with product information, and calculate regional aggregates. With a notebook, you write explicit read, join, and aggregation commands in sequence. With Lakeflow Spark Declarative Pipelines, you define the final tables and their relationships—the system determines the most efficient execution plan.
 
 ## When notebooks fit best
 
@@ -22,17 +22,17 @@ Notebooks excel in scenarios requiring **flexibility** and **detailed control**.
 **Fine-grained performance tuning**. When you need to manually control **partitioning**, **caching strategies**, or specific **Spark configurations**, notebooks give you direct access to these optimizations.
 
-## When Lakeflow Declarative Pipelines fit best
+## When Lakeflow Spark Declarative Pipelines fit best
 
-Lakeflow Declarative Pipelines simplify **production data pipelines** by handling operational complexity automatically. Choose this approach when your pipeline needs:
+Lakeflow Spark Declarative Pipelines simplify **production data pipelines** by handling operational complexity automatically. Choose this approach when your pipeline needs:
 
 **Standardized ETL patterns**. For common ingestion and transformation workflows—reading from cloud storage, applying **schema evolution**, maintaining **slowly changing dimensions**—the declarative approach reduces thousands of lines of code to a few statements.
 
 **Built-in data quality enforcement**. Declarative pipelines include **expectations** that validate data as it flows through. You define **quality rules** directly in your pipeline definition, and the system tracks violations and can halt processing when data quality degrades.
 
 **Automatic dependency management**. The pipeline engine analyzes relationships between your tables and determines the correct **execution order**. When source data updates, the engine refreshes only the **affected downstream tables**.
 
-**Operational visibility**. Lakeflow Declarative Pipelines provide **lineage tracking**, **execution graphs**, and **monitoring dashboards** without additional configuration. Operations teams can trace data from source to target and troubleshoot issues faster.
+**Operational visibility**. Lakeflow Spark Declarative Pipelines provide **lineage tracking**, **execution graphs**, and **monitoring dashboards** without additional configuration. Operations teams can trace data from source to target and troubleshoot issues faster.
 
 ## Compare the approaches
 
@@ -57,8 +57,8 @@ Start by evaluating your specific requirements. Ask these questions:
 - What level of **operational monitoring** does your team need?
 - Who maintains this pipeline—**seasoned developers** or a broader team with varied skills?
 
-For production pipelines with standard ingestion and transformation patterns, Lakeflow Declarative Pipelines **reduce operational burden** and **improve maintainability**. You spend less time writing orchestration code and more time defining business logic.
+For production pipelines with standard ingestion and transformation patterns, Lakeflow Spark Declarative Pipelines **reduce operational burden** and **improve maintainability**. You spend less time writing orchestration code and more time defining business logic.
 
 For exploratory work, complex integrations, or pipelines requiring extensive customization, notebooks provide the **flexibility** you need. You can always refactor successful notebook prototypes into declarative pipelines once the logic stabilizes.
 
-Many teams use **both approaches together**. Notebooks handle custom preprocessing or machine learning model training, while Lakeflow Declarative Pipelines manage the core ETL workflow. This **hybrid approach** lets you use each tool where it performs best.
+Many teams use **both approaches together**. Notebooks handle custom preprocessing or machine learning model training, while Lakeflow Spark Declarative Pipelines manage the core ETL workflow. This **hybrid approach** lets you use each tool where it performs best.
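The "automatic dependency management" this file describes — the engine deriving execution order from declared table relationships — can be illustrated with a framework-free toy using the standard library. The table names and the dependency graph below are hypothetical; the real Lakeflow engine infers this graph from your streaming-table and materialized-view definitions:

```python
from graphlib import TopologicalSorter

# Toy pipeline graph: each table maps to the tables it reads from.
deps = {
    "bronze_sales": set(),
    "bronze_products": set(),
    "silver_sales": {"bronze_sales", "bronze_products"},  # join step
    "gold_regional_agg": {"silver_sales"},                # aggregation step
}

# static_order() emits each table only after all of its inputs,
# which is exactly the execution-order guarantee described above.
order = list(TopologicalSorter(deps).static_order())
print(order)  # bronze tables first, then silver, then gold
```

The same graph also explains incremental refreshes: when only `bronze_sales` changes, the engine needs to refresh just the tables downstream of it.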

learn-pr/wwl-databricks/design-implement-data-pipelines/includes/4-design-task-logic-lakeflow-job.md

Lines changed: 6 additions & 6 deletions

@@ -114,12 +114,12 @@ Azure Databricks provides **dynamic value references** that inject runtime conte
 Each task type has recommended compute options that affect both capability and cost. Your task logic design should account for compute requirements:
 
-| Task type                      | Recommended compute         |
-| ------------------------------ | --------------------------- |
-| Notebooks, Python scripts      | Serverless jobs compute     |
-| SQL queries and files          | Serverless SQL warehouse    |
-| Lakeflow Declarative Pipelines | Serverless pipeline compute |
-| JAR and Spark Submit           | Classic jobs compute        |
+| Task type                            | Recommended compute         |
+| ------------------------------------ | --------------------------- |
+| Notebooks, Python scripts            | Serverless jobs compute     |
+| SQL queries and files                | Serverless SQL warehouse    |
+| Lakeflow Spark Declarative Pipelines | Serverless pipeline compute |
+| JAR and Spark Submit                 | Classic jobs compute        |
 
 Tasks within the same job can use different compute resources. A common pattern assigns SQL tasks to a SQL warehouse while notebook-based transformations run on jobs compute.
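Since the table in this file is a pure lookup, it can be encoded directly as data. The short task-type keys below are our own choices for illustration, not Databricks API values; the compute names mirror the table:

```python
# Encodes the recommendations from the table above as a lookup.
# Keys are hypothetical short names, not official task-type identifiers.
RECOMMENDED_COMPUTE = {
    "notebook": "Serverless jobs compute",
    "python_script": "Serverless jobs compute",
    "sql": "Serverless SQL warehouse",
    "declarative_pipeline": "Serverless pipeline compute",
    "jar": "Classic jobs compute",
    "spark_submit": "Classic jobs compute",
}

def recommended_compute(task_type: str) -> str:
    """Look up the recommended compute for a task type."""
    return RECOMMENDED_COMPUTE[task_type]

print(recommended_compute("sql"))  # Serverless SQL warehouse
```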

learn-pr/wwl-databricks/design-implement-data-pipelines/includes/5-design-error-handling-pipelines.md

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ Each scenario requires a different response. Some errors warrant immediate pipel
 ## Define data quality expectations in declarative pipelines
 
-Lakeflow Declarative Pipelines provides built-in data quality constraints called **expectations**. These constraints validate records as data flows through your pipeline, giving you control over how to handle invalid data.
+Lakeflow Spark Declarative Pipelines provides built-in data quality constraints called **expectations**. These constraints validate records as data flows through your pipeline, giving you control over how to handle invalid data.
 
 :::image type="content" source="../media/5-define-data-quality-expectations.png" alt-text="Screenshot of the declarative pipeline editor, highlighting expectations." border="false" lightbox="../media/5-define-data-quality-expectations.png":::
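To make the behavior of expectations concrete, here is a deliberately framework-free sketch of the three common violation policies (track the violation, drop the row, or fail the update). The function name and policy strings are ours; the real Lakeflow syntax differs:

```python
def apply_expectation(rows, predicate, on_violation="track"):
    """Validate rows against a predicate; return (rows, violation_count).

    Policies mimic the semantics described in the text:
    - "track": keep all rows, record how many violated the rule
    - "drop":  remove violating rows from the output
    - "fail":  halt processing when any row violates the rule
    """
    violations = [r for r in rows if not predicate(r)]
    if on_violation == "fail" and violations:
        raise ValueError(f"{len(violations)} rows violated expectation")
    if on_violation == "drop":
        return [r for r in rows if predicate(r)], len(violations)
    return rows, len(violations)

orders = [{"id": 1, "amount": 30}, {"id": None, "amount": 12}]
valid_id = lambda r: r["id"] is not None

kept, bad = apply_expectation(orders, valid_id, on_violation="drop")
print(len(kept), bad)  # 1 1
```

The "fail" policy is what lets a pipeline halt processing when data quality degrades, while "track" supports the monitoring-only use case.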

learn-pr/wwl-databricks/design-implement-data-pipelines/includes/7-create-pipeline-lakeflow-declarative.md

Lines changed: 2 additions & 2 deletions

@@ -1,10 +1,10 @@
-Production data pipelines require reliability, maintainability, and clear data quality enforcement. As a data engineer, you likely spend significant time writing code to handle incremental processing, orchestrate dependencies, and validate data quality. **Lakeflow Declarative Pipelines** in Azure Databricks addresses these challenges by letting you define *what* your data should look like rather than *how* to process it step by step.
+Production data pipelines require reliability, maintainability, and clear data quality enforcement. As a data engineer, you likely spend significant time writing code to handle incremental processing, orchestrate dependencies, and validate data quality. **Lakeflow Spark Declarative Pipelines** in Azure Databricks addresses these challenges by letting you define *what* your data should look like rather than *how* to process it step by step.
 
 In this unit, you learn how to create data pipelines using the declarative approach, define streaming tables and materialized views, and apply data quality expectations to enforce constraints on your data.
 
 ## Understand the declarative approach
 
-Traditional data pipelines require you to write imperative code that specifies every processing step. You handle incremental processing logic, manage checkpoint recovery, and orchestrate dependencies between tables. With Lakeflow Declarative Pipelines, you instead declare the **desired end state**, and the framework handles the execution details.
+Traditional data pipelines require you to write imperative code that specifies every processing step. You handle incremental processing logic, manage checkpoint recovery, and orchestrate dependencies between tables. With Lakeflow Spark Declarative Pipelines, you instead declare the **desired end state**, and the framework handles the execution details.
 
 The declarative approach provides three key benefits for production pipelines:
