Data pipelines often receive data from sources that evolve over time. New columns appear, others disappear, and the structure of incoming data changes as business requirements shift. Without proper controls, these changes can silently corrupt your data or break your pipelines entirely.

In this unit, you learn how to detect and manage schema drift: the structural changes that occur when source systems add, remove, or rename columns over time.

## Recognize schema drift challenges

While data type validation ensures values match expected types (as covered in the previous unit), schema drift addresses a different challenge: the structure of your data changes over time. A source system adds a new `phone_number` column, removes a deprecated `legacy_id` field, or renames `customer_email` to `email_address`. These structural changes happen independently of type validation.

Delta Lake's schema enforcement blocks structural mismatches by default. When incoming data contains columns not present in the target table, or when required columns are missing, the write operation fails. This fail-fast behavior protects your tables from unexpected structural changes, but you need strategies to handle legitimate schema evolution.

:::image type="content" source="../media/4-recognize-schema-drift-challenges.png" alt-text="Diagram helping you recognize schema drift challenges." border="false" lightbox="../media/4-recognize-schema-drift-challenges.png":::

Consider a streaming pipeline that processes customer data:

```sql
CREATE TABLE customers (
    customer_id INT,
    name STRING,
    email STRING
);
```

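As a sketch of the fail-fast behavior, suppose the source later starts sending an extra column. A plain insert into this table then fails under schema enforcement (`customer_feed` is a hypothetical source view that now includes `phone_number`):

```sql
-- Fails: 'phone_number' doesn't exist in the customers table,
-- so Delta Lake rejects the entire write before committing anything
INSERT INTO customers
SELECT customer_id, name, email, phone_number
FROM customer_feed;  -- hypothetical source that added a column
```
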
When your source system adds a `phone_number` column and starts sending it in the data feed, writes fail because the target table doesn't include this column. Your pipeline stops until you decide how to handle the new field: reject it, add it to the table schema, or preserve it for later analysis.

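To accept a legitimate new field, one option is to evolve the table schema explicitly before the next write. As a sketch of the "add it to the table schema" choice, using the hypothetical `phone_number` field:

```sql
-- Evolve the schema explicitly so future writes that include
-- phone_number succeed; existing rows read the new column as NULL
ALTER TABLE customers ADD COLUMNS (phone_number STRING);
```

Alternatively, Delta Lake DataFrame writes can opt into automatic schema evolution with the `mergeSchema` write option, which appends new columns to the target table instead of failing the write.
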
## Detect and respond to schema drift