learn-pr/wwl-data-ai/build-data-pipeline-with-delta-live-tables/6-knowledge-check.yml (10 additions, 10 deletions)
@@ -4,7 +4,7 @@ title: Module assessment
 metadata:
   title: Module Assessment
   description: "Knowledge check"
-  ms.date: 09/12/2025
+  ms.date: 04/27/2026
   author: weslbo
   ms.author: wedebols
   ms.topic: unit
@@ -18,24 +18,24 @@ quiz:
     choices:
     - content: "Reducing the cost of storage"
       isCorrect: false
-      explanation: "Incorrect. Reducing the cost of storage isn't the primary benefit of using DLT in Azure Databricks for real-time processing."
+      explanation: "Incorrect. Reducing the cost of storage isn't the primary benefit of using Lakeflow Declarative Pipelines in Azure Databricks for real-time processing."
     - content: "Automating data pipeline management"
       isCorrect: true
       explanation: "Correct. Lakeflow Declarative Pipelines is a framework that simplifies the management of data pipelines by automating complex tasks such as error handling, monitoring, and data pipeline lineage. This automation is valuable in real-time data processing, where managing data flows efficiently and reliably is critical."
     - content: "Increasing the latency of data processing."
       isCorrect: false
-      explanation: "Incorrect. Increasing the latency of data processing isn't the primary benefit of using DLT in Azure Databricks for real-time processing."
+      explanation: "Incorrect. Increasing the latency of data processing isn't the primary benefit of using Lakeflow Declarative Pipelines in Azure Databricks for real-time processing."
   - content: "Which feature of Lakeflow Declarative Pipelines ensures data reliability and quality in real-time processing environments?"
     choices:
-    - content: "Live data"
+    - content: "Manual schema validation"
       isCorrect: false
-      explanation: "Incorrect. You can receive live data with real-time data processing with Lakeflow Declarative Pipelines but this feature doesn't ensure data reliability and quality."
-    - content: "ACID Transactions"
+      explanation: "Incorrect. Manual schema validation isn't a built-in feature of Lakeflow Declarative Pipelines and doesn't automatically ensure data reliability at runtime."
+    - content: "Data quality expectations"
       isCorrect: true
-      explanation: "Correct. Lakeflow Declarative Pipelines support ACID transactions, which are crucial for ensuring data integrity by making all operations atomic, consistent, isolated, and durable. This is important in real-time processing environments where concurrent data modifications can lead to inconsistencies without proper transaction controls."
+      explanation: "Correct. Lakeflow Declarative Pipelines provide built-in data quality expectations that validate records as data flows through the pipeline. You can configure expectations to log violations, drop invalid records, or fail the pipeline update, ensuring that only data meeting your quality rules reaches downstream tables."
     - content: "Data Lake"
       isCorrect: false
-      explanation: "Incorrect. A Data Lake is a centralized repository that houses vast volumes of structured and unstructured data from various sources but it doesn't ensure data reliability and quality."
+      explanation: "Incorrect. A Data Lake is a centralized storage repository and isn't a Lakeflow Declarative Pipelines feature that enforces data quality constraints."
   - content: "Which component of Azure Databricks enhances performance and scalability of data operations on Delta Lake?"
     choices:
     - content: "Azure Blob Storage"
@@ -44,6 +44,6 @@ quiz:
     - content: "Azure Synapse Analytics"
       isCorrect: false
       explanation: "Incorrect. Azure Synapse Analytics is an enterprise analytics service that accelerates time to insight across data warehouses and big data systems. It isn't used to enhance performance and scalability of data operations on Delta Lake."
-    - content: "Delta Engine"
+    - content: "Photon"
       isCorrect: true
-      explanation: "Correct. Delta Engine significantly enhances the performance and scalability of operations on Delta Lake in Azure Databricks. It optimizes the execution of queries by utilizing a high-performance, in-memory execution environment, which is crucial for processing large datasets efficiently."
+      explanation: "Correct. Photon is the vectorized query engine built into Databricks Runtime that significantly enhances the performance and scalability of operations on Delta Lake in Azure Databricks. It accelerates data processing by executing queries using a native, vectorized execution model, which is crucial for processing large datasets efficiently."
learn-pr/wwl-data-ai/build-data-pipeline-with-delta-live-tables/includes/3-data-ingestion-and-integration.md (17 additions, 3 deletions)
@@ -90,6 +90,9 @@ SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE');
 Optionally, you can use expectations to apply quality constraints that validate data as it flows through ETL pipelines. Expectations provide greater insight into data quality metrics and allow you to fail updates or drop records when detecting invalid records.
 
+> [!IMPORTANT]
+> The **Advanced** product edition of Lakeflow Declarative Pipelines is required to use expectations. If your pipeline includes expectations with the Core or Pro editions, you receive an error.
+
 Here's an example of a materialized view that defines a constraint clause. In this case, the constraint contains the actual logic for what is being validated: the Country_Region shouldn't be empty. When a record fails this condition, the expectation is triggered.
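The hunks below show only the tail of that materialized view (its SELECT list and FROM clause), not the constraint clause itself. For orientation, here is a minimal sketch of what such a definition can look like in Lakeflow Declarative Pipelines SQL; the view name, constraint name, column list, and the DROP ROW action are illustrative assumptions rather than the module's actual code.

```sql
-- Sketch only: a materialized view with a data quality expectation.
-- The expectation is triggered when Country_Region is missing or empty.
CREATE OR REFRESH MATERIALIZED VIEW processed_covid_data (
  CONSTRAINT valid_country_region EXPECT (Country_Region IS NOT NULL AND Country_Region != '')
  ON VIOLATION DROP ROW  -- drop the offending record
)
AS
SELECT
  Report_Date,
  Country_Region,
  Confirmed,
  Deaths,
  Recovered
FROM raw_covid_data;
```

Omitting the ON VIOLATION clause keeps invalid records and only logs the violation in the pipeline's quality metrics, while ON VIOLATION FAIL UPDATE stops the pipeline update instead.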
@@ -106,7 +109,7 @@ SELECT
   Confirmed,
   Deaths,
   Recovered
-FROM live.raw_covid_data;
+FROM raw_covid_data;
 ```
 
 Examples of constraints:
@@ -160,7 +163,7 @@ SELECT
   sum(Confirmed) as Total_Confirmed,
   sum(Deaths) as Total_Deaths,
   sum(Recovered) as Total_Recovered
-FROM live.processed_covid_data
+FROM processed_covid_data
 GROUP BY Report_Date;
 ```
@@ -177,4 +180,15 @@ Lakeflow Declarative Pipelines support tasks such as:
 - Observing the progress and status of pipeline updates.
 - Alerting on pipeline events such as the success or failure of pipeline updates.
 - Viewing metrics for streaming sources like Apache Kafka and Auto Loader.
-- Receiving email notifications when a pipeline update fails or completes successfully.
+- Receiving email notifications when a pipeline update fails or completes successfully.
+
+## Develop with the Lakeflow Pipelines Editor
+
+The **Lakeflow Pipelines Editor** is the integrated development environment for creating and iterating on pipeline source code. When you create a new pipeline, the editor provides a default folder structure with a `transformations/` directory for source code and an `explorations/` directory for ad hoc analysis notebooks. Store your pipeline source code in a Git folder to enable version control.
+
+The editor supports iterative development through several features:
+
+- **Dry run**: Validates your pipeline code without processing data, allowing you to catch syntax errors and missing dependencies before execution.
+- **Selective execution**: Run individual files or single table definitions rather than the entire pipeline, enabling faster iteration during development.
+- **Interactive DAG**: Visualize the dependency graph between your tables, select specific tables for targeted refreshes, and inspect execution metrics.
+- **Data preview**: Sample data from streaming tables and materialized views directly in the editor to verify transformation logic.
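As a rough illustration of the `transformations/` folder mentioned in the added section above, the following sketch shows one possible pipeline source file; the file name, table name, landing path, and JSON format are hypothetical and not taken from the module.

```sql
-- transformations/raw_covid_data.sql  (hypothetical file name and location)
-- A streaming table that incrementally ingests JSON files with Auto Loader.
CREATE OR REFRESH STREAMING TABLE raw_covid_data
AS SELECT *
FROM STREAM read_files(
  '/Volumes/main/covid/landing/',  -- hypothetical landing location
  format => 'json'
);
```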