
Commit 12e14b0

updated module with features and AI narrative
1 parent 42c7e6f commit 12e14b0

14 files changed

Lines changed: 210 additions & 103 deletions

learn-pr/wwl/get-started-lakehouses/1-introduction.yml

Lines changed: 3 additions & 2 deletions
@@ -4,10 +4,11 @@ title: Introduction
 metadata:
   title: Introduction
   description: "Introduction"
-  ms.date: 03/19/2025
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
-durationInMinutes: 1
+  ai-usage: ai-assisted
+durationInMinutes: 2
 content: |
   [!include[](includes/1-introduction.md)]
Lines changed: 6 additions & 5 deletions
@@ -1,13 +1,14 @@
 ### YamlMime:ModuleUnit
 uid: learn.wwl.get-started-lakehouses.fabric-lakehouse
-title: Explore the Microsoft Fabric lakehouse
+title: Describe lakehouse features and capabilities
 metadata:
-  title: Explore the Microsoft Fabric lakehouse
-  description: "Explore the Microsoft Fabric lakehouse"
-  ms.date: 03/19/2025
+  title: Describe Lakehouse Features and Capabilities
+  description: "Describe the features and capabilities of a lakehouse in Microsoft Fabric"
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
+  ai-usage: ai-assisted
 durationInMinutes: 5
 content: |
-  [!include[](includes/2-fabric-lakehouse.md)]
+  [!include[](includes/2-fabric-lakehouse.md)]
Lines changed: 6 additions & 5 deletions
@@ -1,13 +1,14 @@
 ### YamlMime:ModuleUnit
 uid: learn.wwl.get-started-lakehouses.work-lakehouse
-title: Work with Microsoft Fabric lakehouses
+title: Ingest and transform data in a lakehouse
 metadata:
-  title: Work with Microsoft Fabric lakehouses
-  description: "Work with Microsoft Fabric lakehouses"
-  ms.date: 03/19/2025
+  title: Ingest and Transform Data in a Lakehouse
+  description: "Learn how to create a lakehouse, ingest data, and transform data in Microsoft Fabric."
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
-durationInMinutes: 6
+  ai-usage: ai-assisted
+durationInMinutes: 7
 content: |
   [!include[](includes/3-work-lakehouse.md)]
Lines changed: 6 additions & 5 deletions
@@ -1,13 +1,14 @@
 ### YamlMime:ModuleUnit
 uid: learn.wwl.get-started-lakehouses.explore-data-lakehouse
-title: Explore and transform data in a lakehouse
+title: Query and analyze lakehouse data
 metadata:
-  title: Explore and transform data in a lakehouse
-  description: "Explore and transform data in a lakehouse"
-  ms.date: 03/19/2025
+  title: Query and Analyze Lakehouse Data
+  description: "Learn how to query and analyze data in a lakehouse using SQL and Spark notebooks."
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
-durationInMinutes: 6
+  ai-usage: ai-assisted
+durationInMinutes: 7
 content: |
   [!include[](includes/4-explore-data-lakehouse.md)]

learn-pr/wwl/get-started-lakehouses/5-exercise-lakehouse.yml

Lines changed: 2 additions & 1 deletion
@@ -4,10 +4,11 @@ title: Exercise - Create a Microsoft Fabric lakehouse
 metadata:
   title: Exercise - Create a Microsoft Fabric lakehouse
   description: "Exercise - Create a Microsoft Fabric lakehouse"
-  ms.date: 03/19/2025
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
+  ai-usage: ai-assisted
 durationInMinutes: 30
 content: |
   [!include[](includes/5-exercise-lakehouse.md)]

learn-pr/wwl/get-started-lakehouses/6-knowledge-check.yml

Lines changed: 40 additions & 7 deletions
@@ -3,17 +3,17 @@ uid: learn.wwl.get-started-lakehouses.knowledge-check
 title: Module assessment
 metadata:
   hidden_question_numbers: ["62697432_2","62697432_30","62697432_6","62697432_59","62697432_63","62697432_71","62697432_75","62697432_79","62697432_83","62697432_87","62697432_91","62697432_95","62697432_108","62697432_112","62697432_116","62697432_120","62697432_132","62697432_136","62697432_140"]
-  ai_generated_module_assessment: true
+  ai_generated_module_assessment: false
   title: Module assessment
   description: "Knowledge check"
-  ms.date: 03/19/2025
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
   module_assessment: true
-durationInMinutes: 3
+  ai-usage: ai-assisted
+durationInMinutes: 5
 quiz:
-  title: ""
   questions:
   - content: "What is a Microsoft Fabric lakehouse?"
     choices:
@@ -26,25 +26,58 @@ quiz:
     - content: "An analytical store that combines the file storage flexibility of a data lake with the SQL-based query capabilities of a data warehouse."
       isCorrect: true
       explanation: "Correct. Lakehouses combine data lake and data warehouse features."
+  - content: "What is the main difference between the lakehouse explorer and SQL analytics endpoint?"
+    choices:
+    - content: "The lakehouse explorer provides read-only access, while the SQL analytics endpoint allows data modifications."
+      isCorrect: false
+      explanation: "Incorrect. The SQL analytics endpoint provides read-only access, while the lakehouse explorer allows modifications."
+    - content: "Lakehouse explorer enables interaction with tables, files, and folders, while SQL analytics endpoint provides read-only T-SQL querying of Delta tables."
+      isCorrect: true
+      explanation: "Correct. Lakehouse explorer supports data management operations, while SQL analytics endpoint is read-only SQL access."
+    - content: "Both provide identical functionality with different user interfaces."
+      isCorrect: false
+      explanation: "Incorrect. The two modes have different capabilities and purposes."
   - content: "You want to include data in an external Azure Data Lake Store Gen2 location in your lakehouse, without the requirement to copy the data. What should you do?"
     choices:
     - content: "Create a Data pipeline that uses a Copy Data activity to load the external data into a file."
      isCorrect: false
       explanation: "Incorrect. A Copy Data activity in a pipeline copies the data."
     - content: "Create a shortcut."
       isCorrect: true
-      explanation: "Correct. A shortcut enables you to include external data in the lakehouse."
+      explanation: "Correct. A shortcut enables you to include external data in the lakehouse without copying it."
     - content: "Create a Dataflow Gen2 that extracts the data and loads it into a table."
       isCorrect: false
       explanation: "Incorrect. A dataflow that loads the data into a table would copy the data."
+  - content: "You have CSV files in your lakehouse Files area and want to create Delta tables without writing code. What should you use?"
+    choices:
+    - content: "A notebook with PySpark code"
+      isCorrect: false
+      explanation: "Incorrect. While notebooks can load data, they require writing code."
+    - content: "Load to Table"
+      isCorrect: true
+      explanation: "Correct. Load to Table is a no-code option that creates Delta tables from Parquet or CSV files directly in the lakehouse explorer."
+    - content: "The SQL analytics endpoint"
+      isCorrect: false
+      explanation: "Incorrect. The SQL analytics endpoint is read-only and can't create tables."
   - content: "You want to use Apache Spark to interactively explore data in a file in the lakehouse. What should you do?"
     choices:
    - content: "Create a notebook."
       isCorrect: true
-      explanation: "Correct. A notebook enables interactive Spark coding."
+      explanation: "Correct. A notebook enables interactive Spark coding with PySpark or Spark SQL."
     - content: "Switch to the SQL analytics endpoint mode."
       isCorrect: false
       explanation: "Incorrect. The SQL analytics endpoint mode doesn't support interactive Spark code."
     - content: "Create a Dataflow Gen2."
       isCorrect: false
-      explanation: "Incorrect. A dataflow doesn't provide an interactive Spark coding interface."
+      explanation: "Incorrect. A dataflow doesn't provide an interactive Spark coding interface."
+  - content: "What connection mode does Power BI use by default when connecting to a lakehouse semantic model?"
+    choices:
+    - content: "Import mode, which copies data into Power BI."
+      isCorrect: false
+      explanation: "Incorrect. Power BI uses Direct Lake mode by default when connecting to a Fabric lakehouse."
+    - content: "DirectQuery mode, which queries the source in real-time."
+      isCorrect: false
+      explanation: "Incorrect. While DirectQuery exists, Direct Lake is the default when connecting to a Fabric lakehouse."
+    - content: "Direct Lake mode, which reads directly from Delta Lake files without copying data."
+      isCorrect: true
+      explanation: "Correct. Direct Lake provides fast query performance while ensuring reports always reflect current lakehouse data."

learn-pr/wwl/get-started-lakehouses/7-summary.yml

Lines changed: 3 additions & 2 deletions
@@ -4,10 +4,11 @@ title: Summary
 metadata:
   title: Summary
   description: "Summary"
-  ms.date: 03/19/2025
+  ms.date: 03/04/2026
   author: angierudduck
   ms.author: anrudduc
   ms.topic: unit
-durationInMinutes: 1
+  ai-usage: ai-assisted
+durationInMinutes: 2
 content: |
   [!include[](includes/7-summary.md)]
Lines changed: 7 additions & 5 deletions
@@ -1,8 +1,10 @@
-The foundation of Microsoft Fabric is a **lakehouse**, which is built on top of the **OneLake** scalable storage layer and uses Apache Spark and SQL compute engines for big data processing. A lakehouse is a unified platform that combines:
+A **lakehouse** in Microsoft Fabric combines the flexible storage of a data lake with the analytical capabilities of a data warehouse. A lakehouse uses Apache Spark and SQL compute engines to process and analyze data at scale, and is built on the **OneLake** storage layer.
 
-- The flexible and scalable storage of a data *lake*
-- The ability to query and analyze data of a data ware*house*
+A lakehouse is a unified platform that combines:
 
-Imagine your company has been using a data warehouse to store structured data from its transactional systems, such as order history, inventory levels, and customer information. You collect unstructured data from social media, website logs, and external sources that are difficult to manage and analyze using the existing data warehouse infrastructure. Your company's new directive is to improve its decision-making capabilities by analyzing data in various formats across multiple sources, so the company chooses Microsoft Fabric.
+- The flexible and scalable storage of a data **lake**
+- The ability to query and analyze data of a data ware**house**
 
-In this module, we explore how a lakehouse in Microsoft Fabric provides a scalable and flexible data store for files and tables that you can query using SQL.
+Imagine your organization relies on a traditional data warehouse for business analytics. The warehouse handles structured transactional data well, but struggles with the growing volume of semi-structured and unstructured data from sources like application logs, IoT devices, and external feeds. Storing and processing these diverse data types requires separate systems, creating data silos and complex integration efforts. Your organization needs a unified solution that handles both structured and unstructured data while maintaining strong analytical capabilities.
+
+In this module, you learn how lakehouses address these challenges. You explore how to create a lakehouse, ingest and transform data, and query lakehouse data using SQL and Spark. You also learn how well-structured lakehouse data supports downstream analytics and AI-powered experiences across the Microsoft Fabric platform.
Lines changed: 41 additions & 21 deletions
@@ -1,39 +1,59 @@
-A **lakehouse** presents as a database and is built on top of a data lake using Delta format tables. Lakehouses combine the SQL-based analytical capabilities of a relational data warehouse and the flexibility and scalability of a data lake. Lakehouses store all data formats and can be used with various analytics tools and programming languages. As cloud-based solutions, lakehouses can scale automatically and provide high availability and disaster recovery.
+Traditional analytics architectures often force you to choose between two approaches. Data lakes offer flexibility and scalability but lack the structure and performance needed for business analytics. Data warehouses provide strong analytical capabilities but struggle with diverse data formats and can be costly to scale. **Lakehouses** bridge this gap by bringing database-like capabilities directly to your data lake, eliminating the need to maintain separate systems for different workloads.
 
 ![Diagram of a lakehouse, displaying the folder structure of a data lake and the relational capabilities of a data warehouse.](../media/lakehouse-components.png)
 
-Some benefits of a lakehouse include:
+## Understand lakehouse design
 
-- Lakehouses use Spark and SQL engines to process large-scale data and support machine learning or predictive modeling analytics.
-- Lakehouse data is organized in a *schema-on-read format*, which means you define the schema as needed rather than having a predefined schema.
-- Lakehouses support ACID (Atomicity, Consistency, Isolation, Durability) transactions through Delta Lake formatted tables for data consistency and integrity.
-- Lakehouses are a single location for data engineers, data scientists, and data analysts to access and use data.
+A lakehouse organizes data into two main areas: **Tables** and **Files**. Understanding this separation helps you design effective data workflows.
 
-A lakehouse is a great option if you want a scalable analytics solution that maintains data consistency. It's important to evaluate your specific requirements to determine which solution is the best fit.
+**Tables folder**: This folder contains Delta Lake tables that provide structured, queryable data. Tables in this folder:
 
-## Load data into a lakehouse
+- Support SQL queries through the SQL analytics endpoint
+- Enforce schemas and support ACID transactions
+- Can be accessed in Power BI for reporting
+- Benefit from automatic optimization and maintenance
 
-Fabric lakehouses are a central element for your analytics solution. You can follow the ETL (Extract, Transform, Load) process to ingest and transform data before loading to the lakehouse.
+**Files folder**: This folder stores raw or semi-structured data files in their native format. Files in this folder:
 
-You can ingest data in many common formats from various sources, including local files, databases, or APIs. You can also create Fabric **shortcuts** to data in external sources, such as Azure Data Lake Store Gen2 or OneLake. Use the Lakehouse explorer to browse files, folders, shortcuts, and tables and view their contents within the Fabric platform.
+- Support any file format (CSV, JSON, Parquet, images, documents)
+- Provide flexibility for data exploration and processing
+- Can be staged before transformation into tables
+- Don't enforce schema or support direct SQL queries
 
-Ingested data can be transformed and then loaded using either Apache Spark with notebooks or Dataflows Gen2. Use Data Factory pipelines to orchestrate your different ETL activities and land the prepared data into your lakehouse.
+This separation lets you maintain both raw data (for compliance or reprocessing) and structured tables (for analytics) within the same lakehouse. You can process files using Spark notebooks or Dataflows Gen2, then load the results into tables for querying and reporting.
 
-> [!NOTE]
-> Dataflows Gen2 are based on Power Query - a familiar tool to data analysts using Excel or Power BI that provides visual representation of transformations as an alternative to traditional programming.
+## Understand Delta Lake tables
+
+At the heart of a lakehouse are **Delta Lake tables**. Delta Lake is an open-source storage layer that brings reliability to data lakes. When you create a table in a lakehouse, the data is stored in Delta format in the underlying OneLake storage.
+
+Delta Lake tables provide several key advantages:
+
+- **ACID transactions**: Delta Lake ensures data consistency even when multiple users read and write data simultaneously.
+- **Schema enforcement**: Delta Lake validates that the data you write matches the table schema, preventing corrupt data.
+- **Time travel**: Delta Lake maintains a transaction log that lets you query previous versions of your data or roll back changes.
+- **Efficient updates and deletes**: Unlike traditional data lake files, Delta tables support efficient update and delete operations.
+
+Each Delta table consists of Parquet data files plus a transaction log that tracks all changes. This architecture enables both batch and streaming workloads to work reliably with the same data.
 
-You can use your lakehouse for many reasons, including:
+## Manage lakehouse access
 
-- Analyze using SQL.
-- Train machine learning models.
-- Perform analytics on real-time data.
-- Develop reports in Power BI.
+When you centralize data in your lakehouse, protecting that data becomes critical. Fabric provides layered access controls to secure lakehouse data at multiple levels.
 
-## Secure a lakehouse
+Use **workspace roles** for collaborators who need access to all items in the workspace. Use **item-level sharing** to grant read-only access for specific needs, such as analytics or Power BI report development.
 
-Lakehouse access is managed either through the workspace or item-level sharing. Workspaces roles should be used for collaborators because these roles grant access to all items within the workspace. Item-level sharing is best used for granting access for read-only needs, such as analytics or Power BI report development.
+For granular control, the SQL analytics endpoint supports **row-level** and **column-level security**, so you can restrict what specific users see when they query through SQL. If you organize tables into schemas, you can also apply **schema-level permissions** to control access by business domain.
 
-Fabric lakehouses also support data governance features including sensitivity labels, and can be extended by using Microsoft Purview with your Fabric tenant.
+Fabric lakehouses also support data governance features, including sensitivity labels, and can be extended by using Microsoft Purview with your Fabric tenant.
 
 > [!NOTE]
 > For more information, see the [Security in Microsoft Fabric](/fabric/security/security-overview) documentation.
+
+## Build a foundation for intelligent analytics
+
+The data you structure in a lakehouse doesn't just serve traditional reports and dashboards. Well-organized lakehouse data becomes the foundation that intelligent experiences across Microsoft Fabric depend on.
+
+When you create tables with clear schemas, consistent naming conventions, and descriptive column names, you make that data accessible to both human analysts and AI-powered tools. Fabric IQ data agents can query your lakehouse tables through the SQL analytics endpoint, translating natural language questions into SQL queries that return accurate answers. The quality of those answers depends directly on how well you structure and document your data.
+
+Copilot capabilities in Fabric also benefit from well-structured lakehouse data. Copilot in Power BI can generate reports and answer business questions when it can reason over clearly defined tables and relationships. The same lakehouse data can feed semantic models that support natural language exploration across Microsoft 365 experiences.
+
+This means the investment you make in organizing, naming, and structuring lakehouse data pays dividends beyond your immediate analytics needs. Good data engineering practices in the lakehouse create a reusable foundation for intelligent experiences across the platform.
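To make the Tables/Files split and the Delta Lake behaviors in this unit concrete, here is a minimal, hedged sketch of the flow the text describes: stage a file in Files, load it into a Delta table in Tables, then use the transaction log for time travel. It is not part of the module; table, path, and column names are illustrative placeholders, and a Fabric notebook attached to a default lakehouse (with its built-in `spark` session and relative `Files/`/`Tables/` paths) is assumed.

```python
# Sketch only (not from the module): Files -> Tables flow with Delta Lake.
# Assumes a Fabric notebook with a default lakehouse attached; names are placeholders.

# 1. Read raw CSV data staged in the Files area.
raw_df = (
    spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("Files/staging/customers.csv")
)

# 2. Save it as a managed Delta table in the Tables area. Delta enforces the
#    schema on later writes and records every change in its transaction log.
raw_df.write.format("delta").mode("overwrite").saveAsTable("customers")

# 3. Time travel: read an earlier version of the table from the transaction log.
first_version = (
    spark.read.format("delta")
    .option("versionAsOf", 0)
    .load("Tables/customers")  # Delta tables live under Tables/ in the lakehouse
)

# 4. The table is immediately queryable with Spark SQL, and (read-only) through
#    the SQL analytics endpoint and Power BI Direct Lake.
spark.sql("SELECT COUNT(*) AS customer_count FROM customers").show()
```

The `versionAsOf` read is what the unit's *time travel* bullet refers to: each write adds a new version to the table's transaction log, so earlier states remain queryable.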
