Azure Databricks is a cloud-based data platform that brings together the best of **data engineering, data science, and machine learning** in a single, unified workspace. Built on top of **Apache Spark**, it allows organizations to easily process, analyze, and visualize massive amounts of data in real time.

A **data lakehouse** is a data management approach that blends the strengths of both data lakes and data warehouses. It offers scalable storage and processing, allowing organizations to handle diverse workloads—such as machine learning and business intelligence—without relying on separate, disconnected systems. By centralizing data, a lakehouse supports a single source of truth, reduces duplicate costs, and ensures that information stays up to date.
Many lakehouses follow a layered design pattern where data is gradually improved, enriched, and refined as it moves through different stages of processing. This layered approach—commonly called the **medallion architecture**—organizes data into stages that build on one another, making it easier to manage and use effectively.
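As an illustrative sketch (the paths, table names, and cleansing rules here are assumptions, not a prescribed layout), promoting data from a bronze layer to a silver layer in PySpark might look like this:

```python
# Hypothetical medallion flow: raw JSON lands in bronze, cleansed data moves to silver
bronze = spark.read.format("json").load("/landing/raw_events")
bronze.write.format("delta").mode("append").save("/lakehouse/bronze/events")

silver = (
    spark.read.format("delta").load("/lakehouse/bronze/events")
    .dropDuplicates()                      # remove duplicate records
    .filter("event_type IS NOT NULL")      # drop rows failing a basic quality rule
)
silver.write.format("delta").mode("append").save("/lakehouse/silver/events")
```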
To use Azure Databricks, you must create an Azure Databricks workspace in your Azure subscription. A workspace is an Azure Databricks deployment in a cloud service account. It provides a unified environment for working with Azure Databricks assets for a specified set of users.
You can create an Azure Databricks workspace by:

- Using the Azure portal.
- Using an Azure Resource Manager (ARM) or Bicep template.
- Using the Azure CLI or Azure PowerShell.
The workspace is available in **multiple languages**. You can change the workspace language from your user settings.
**Databricks Assistant** is an AI-powered pair programmer and support tool that helps you work more efficiently in Databricks by generating, explaining, and fixing code or queries directly in notebooks, dashboards, and files.

Azure Databricks provides capabilities for data scientists and engineers who need to collaborate on complex data processing tasks. It provides an integrated environment with Apache Spark for big data processing in a data lakehouse, and supports multiple languages including Python, R, Scala, and SQL. The platform facilitates data exploration, visualization, and the development of data pipelines.
:::image type="content" source="../media/03-azure-databricks-data-science-engineering.png" alt-text="Diagram of Databricks data ingestion & data sources screen." lightbox="../media/03-azure-databricks-data-science-engineering.png":::
Azure Databricks supports building, training, and deploying machine learning models at scale. It includes MLflow, an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment. It also supports various ML frameworks such as TensorFlow, PyTorch, and Scikit-learn, making it versatile for different ML tasks.
:::image type="content" source="../media/04-azure-databricks-machine-learning.png" alt-text="Diagram of Databricks Machine Learning screen." lightbox="../media/04-azure-databricks-machine-learning.png":::
Data analysts who primarily interact with data through SQL can use SQL warehouses in Azure Databricks. The Azure Databricks Workspace UI provides a familiar SQL editor, dashboards, and automatic visualization tools to analyze and visualize data directly within Azure Databricks. This workload is ideal for running quick ad-hoc queries and creating reports from large datasets.
:::image type="content" source="../media/05-azure-databricks-sql.png" alt-text="Diagram of DatabricksSQL Editor screen." lightbox="../media/05-azure-databricks-sql.png":::
A **workspace** in Azure Databricks is a secure, collaborative environment where you can access and organize all Databricks assets, such as notebooks, clusters, jobs, libraries, dashboards, and experiments.
You can open an Azure Databricks workspace from the Azure portal by selecting **Launch Workspace**.
In addition, workspaces are tied to **Unity Catalog** (when enabled) for centralized data governance.
**Databricks notebooks** are interactive, web-based documents that combine **runnable code, visualizations, and narrative text** in a single environment. They support multiple languages—such as Python, R, Scala, and SQL—and allow users to switch between languages within the same notebook using *magic commands*. This flexibility makes notebooks well-suited for **exploratory data analysis, data visualization, machine learning experiments, and building complex data pipelines**.
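For instance, a Python notebook cell can be switched to SQL with the `%sql` magic command. A minimal sketch (the table shown is a Databricks sample dataset; substitute your own):

```python
# Cell 1: the notebook's default language (Python)
df = spark.table("samples.nyctaxi.trips")

# Cell 2: starting a cell with a magic command switches that cell's language, for example:
# %sql
# SELECT COUNT(*) FROM samples.nyctaxi.trips
```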
Notebooks are also designed for **collaboration**: multiple users can edit and run cells simultaneously, add comments, and share insights in real time. They integrate tightly with Databricks clusters, enabling users to process large datasets efficiently, and can connect to external data sources through **Unity Catalog** for governed data access. In addition, notebooks can be version-controlled, scheduled as jobs, or exported for sharing outside the platform, making them central to both **ad-hoc exploration** and **production-grade workflows**.
Notebooks contain a collection of two types of cells: **code cells** and **Markdown cells**.
The **Databricks Runtime** is a set of customized builds of **Apache Spark** that include performance improvements and additional libraries. These runtimes make it easier to handle tasks such as **machine learning**, **graph processing**, and **genomics**, while still supporting general data processing and analytics.
Databricks provides multiple runtime versions, including **long-term support (LTS)** releases. Each release specifies the underlying Apache Spark version, its release date, and when support will end. Over time, older runtime versions move through a support lifecycle: they're fully supported at release, reach end of support on a published date, and no longer receive maintenance updates after that.
**Lakeflow Jobs** provide workflow automation and orchestration in Azure Databricks, making it possible to reliably schedule, coordinate, and run data processing tasks. Instead of running code manually, you can use jobs to automate repetitive or production-grade workloads such as ETL pipelines, machine learning training, or dashboard refreshes.
:::image type="content" source="../media/jobs.png" alt-text="Screenshot of an Azure Databricks Jobs landing page." lightbox="../media/jobs.png":::
Because they're repeatable and managed, jobs are critical for **production workloads**.
**Delta Lake** is an open-source storage framework that improves the reliability and scalability of data lakes by adding transactional features on top of cloud object storage, such as **Azure Data Lake Storage**. Traditional data lakes can suffer from issues like inconsistent data, partial writes, or difficulties managing concurrent access. Delta Lake addresses these problems by supporting:
- **ACID transactions** (atomicity, consistency, isolation, durability) for reliable reads and writes.
- **Schema enforcement and evolution** to keep table data consistent as it changes.
- **Time travel**, which lets you query or restore earlier versions of a table.
On top of this foundation, **Delta tables** provide a familiar table abstraction for reading and writing data.
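As a brief sketch (the table name is illustrative), writing a Delta table and querying an earlier version looks like this:

```python
# Save a DataFrame as a Delta table, then use time travel to read its first version
df.write.format("delta").mode("overwrite").saveAsTable("sales.orders")
previous = spark.sql("SELECT * FROM sales.orders VERSION AS OF 0")
```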
**Databricks SQL** brings **data warehousing capabilities** to the Databricks Lakehouse, allowing analysts and business users to query and visualize data stored in open formats directly in the data lake. It supports **ANSI SQL**, so anyone familiar with SQL can run queries, build reports, and create dashboards without needing to learn new languages or tools.
Databricks SQL is available only in the **Premium tier** of Azure Databricks. It includes:

- A familiar **SQL editor** for writing and running queries.
- **Dashboards** and automatic visualization tools for sharing insights.
- **SQL warehouses** that provide the compute for running queries.
All Databricks SQL queries run on **SQL warehouses** (formerly called SQL endpoints), which are scalable compute resources decoupled from storage. Different warehouse types (serverless, pro, and classic) are available depending on performance, cost, and management needs.
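Applications can also connect to a SQL warehouse directly. A minimal sketch using the `databricks-sql-connector` Python package (the hostname, HTTP path, and token are placeholders from a warehouse's connection details):

```python
from databricks import sql

# Connection values come from the SQL warehouse's "Connection details" tab (placeholders here)
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchall())
```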
Data governance is critical for ensuring that data within an organization is managed securely, efficiently, and in compliance with regulations.
In many organizations, data is distributed across databases, data warehouses, data lakes, and even multiple catalogs. It also exists in diverse formats like Parquet, CSV, and Delta Lake. Beyond structured data in tables, there’s also unstructured data in files, along with other assets such as machine learning models, notebooks, and dashboards that require management and governance. This fragmentation creates silos across sources, formats, and asset types.
Azure Databricks, combined with Unity Catalog and Microsoft Purview, provides a unified approach to governing data across these silos.
Unity Catalog provides a centralized way to manage access, discovery, lineage, audit logs, and quality monitoring across data and AI assets within Azure Databricks. It applies consistently across all workspaces in a region.
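For example (the catalog, schema, and principal names here are hypothetical), Unity Catalog objects are addressed with a three-level namespace, and access can be granted in SQL:

```python
# Read a table through Unity Catalog's three-level namespace: catalog.schema.table
df = spark.table("main.sales.orders")

# Grant a group read access to the table (group name is hypothetical)
spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `data_analysts`")
```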

In most accounts, Unity Catalog is enabled by default when you create a workspace.
Microsoft Purview is a data governance service that lets you manage and oversee data across on-premises systems, multiple clouds, and SaaS platforms. It includes features such as data discovery, classification, lineage tracking, and access governance.
When integrated with Azure Databricks and Unity Catalog, Purview can discover Lakehouse data and ingest its metadata into the Data Map. This allows you to apply consistent governance across your entire data environment, while acting as a central catalog that brings together metadata from different sources.
title: Transform data with filters and aggregations
metadata:
  title: Transform Data With Filters and Aggregations
  description: Learn how to filter, group, and aggregate data in Azure Databricks using PySpark and SQL to transform raw data into meaningful summaries.
### Filter null values
Filtering null values requires special handling in Spark DataFrames. Use the `isNull()` and `isNotNull()` functions to identify or exclude null values:
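```python
from pyspark.sql.functions import col

# Filter orders with non-null amounts (column name mirrors the SQL example below)
df_filtered = df.filter(col("order_amount").isNotNull())

# Or select only the rows where the amount is missing, for inspection
df_nulls = df.filter(col("order_amount").isNull())
```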
> [!NOTE]
> Using Python's `None` with inequality operators like `!= None` doesn't reliably filter null values in Spark DataFrames. Null comparisons in SQL semantics don't evaluate to true or false—they return null. Always use `isNull()` or `isNotNull()` for correct null handling.
In SQL, use the `IS NULL` or `IS NOT NULL` operators:

```sql
-- Filter orders with non-null amounts
SELECT *
FROM orders
WHERE order_amount IS NOT NULL;
```
For comprehensive null handling that removes entire rows containing null values, use the `dropna()` method covered in the unit on resolving duplicate and missing values:

```python
# Remove rows where order_amount is null
df_clean = df.dropna(subset=["order_amount"])
```
## Group data to organize records
Grouping organizes rows that share common values into categories. This prepares data for aggregation—once grouped, you can calculate statistics for each category.
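For instance, a minimal sketch (reusing the hypothetical orders columns from the earlier examples) groups orders by customer and totals their amounts:

```python
from pyspark.sql import functions as F

# Group rows by customer, then aggregate each group's order amounts
df_totals = (
    df.groupBy("customer_id")
      .agg(F.sum("order_amount").alias("total_amount"))
)
```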
title: Cleanse, Transform, and Load Data into Unity Catalog
description: Learn how to cleanse, transform, and load data into Unity Catalog tables in Azure Databricks by profiling data, handling duplicates and nulls, applying transformations, and using various loading strategies.
title: Choose a data ingestion tool
metadata:
  title: Choose a Data Ingestion Tool
  description: Learn how to select the appropriate data ingestion tool in Azure Databricks, including Lakeflow Connect, Auto Loader, COPY INTO, Spark Structured Streaming, JDBC/ODBC, and Azure Data Factory.
title: Design and implement a data partitioning scheme
metadata:
  title: Design and Implement a Data Partitioning Scheme
  description: Learn how to design and implement effective data partitioning schemes in Azure Databricks to optimize query performance and manage large-scale datasets.