Commit e48f73c

Merge pull request #2959 from MicrosoftDocs/main639130110841909712sync_temp
For protected branch, push strategy should use PR and merge to target branch method to work around git push error
2 parents a09fa25 + d355d68 commit e48f73c

36 files changed

Lines changed: 1986 additions & 158 deletions

docs/data-engineering/migrate-synapse-hms-metadata.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -5,7 +5,7 @@ ms.reviewer: aimurg
 ms.topic: how-to
 ms.custom:
 - fabric-cat
-ms.date: 11/15/2023
+ms.date: 04/28/2026
 ---

 # Migrate Hive Metastore metadata from Azure Synapse Analytics to Fabric
@@ -104,7 +104,7 @@ Step 2 is when the actual metadata is imported from intermediate storage into th
 * **2.4) Run all notebook commands** to import catalog objects from intermediate path.

 > [!NOTE]
-> When importing multiple databases, you can (i) create one lakehouse per database (the approach used here), or (ii) move all tables from different databases to a single lakehouse. For the latter, all migrated tables could be `<lakehouse>.<db_name>_<table_name>`, and you will need to adjust the import notebook accordingly.
+> When importing multiple databases, you can (i) create one lakehouse per database (the approach used here), or (ii) move all tables from different databases to a single lakehouse. For the latter, all migrated tables could be `<lakehouse>.<db_name>_<table_name>`, and you need to adjust the import notebook accordingly.

 ### Step 3: Validate content
```

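The note above on importing multiple databases leaves the single-lakehouse renaming to the reader. As a minimal sketch of the `<db_name>_<table_name>` convention it describes (the helper name is hypothetical; the shipped import notebook may structure this differently):

```python
def consolidated_table_name(db_name: str, table_name: str) -> str:
    """Flatten a (database, table) pair from the Synapse HMS into a single
    table name for one target lakehouse, per the <db_name>_<table_name>
    convention from the note above."""
    return f"{db_name}_{table_name}"

# Tables from two source databases land side by side in one lakehouse:
print(consolidated_table_name("sales", "orders"))      # sales_orders
print(consolidated_table_name("marketing", "orders"))  # marketing_orders
```

Applying this mapping inside the import notebook keeps table origins recoverable from the name alone, at the cost of longer identifiers.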
Lines changed: 70 additions & 17 deletions
```diff
@@ -1,39 +1,92 @@
 ---
-title: Migrating from Azure Synapse Spark to Fabric
-description: Learn about migrating from Azure Synapse Spark to Microsoft Fabric, including key considerations and different migration scenarios.
+title: Overview of migrating Azure Synapse Spark to Fabric
+description: Learn how to migrate Azure Synapse Spark workloads to Microsoft Fabric, choose the right migration path, and navigate the available migration guidance.
 ms.reviewer: aimurg
 ms.topic: concept-article
 ms.custom:
 - fabric-cat
-ms.date: 11/15/2023
+ms.date: 04/28/2026
+ai-usage: ai-assisted
 ---

-# Migrating from Azure Synapse Spark to Fabric
+# Overview of migrating Azure Synapse Spark to Fabric

-Before you begin your migration, you should verify that [Fabric Data Engineering](data-engineering-overview.md) is the best solution for your workload. Fabric Data Engineering supports [lakehouse](lakehouse-overview.md), [notebook](how-to-use-notebook.md), [environment](create-and-use-environment.md), [Spark job definition (SJD)](spark-job-definition.md) and [pipeline](../data-factory/data-factory-overview.md) items, including different runtime and Spark capabilities support.
+Use this article as the starting point for migrating Azure Synapse Spark workloads to Microsoft Fabric. It helps you decide which guidance to use, what can be migrated directly, and where manual refactoring or validation is still required.

-## Key considerations
+Fabric Data Engineering supports [lakehouse](lakehouse-overview.md), [notebook](how-to-use-notebook.md), [environment](create-and-use-environment.md), [Spark job definition](spark-job-definition.md), and [pipeline](../data-factory/data-factory-overview.md) items. Most Synapse Spark migrations involve some combination of item migration, data access changes, metadata migration, code refactoring, and post-migration validation.

-The initial step in crafting a migration strategy is to assess suitability. It's worth noting that certain Fabric features related to Spark are currently in development or planning. For more details and updates, visit the [Fabric roadmap](https://aka.ms/fabricrm).
+## Before you migrate

-For Spark, see a detailed comparison [differences between Azure Synapse Spark and Fabric](comparison-between-fabric-and-azure-synapse-spark.md).
+Before you begin, confirm that Fabric Data Engineering is the right destination for your workload. Review the Spark runtime, security model, pool model, environment model, and data access patterns that your current Synapse implementation depends on.

-## Migration scenarios
+Start with these articles:

-If you determine that Fabric Data Engineering is the right choice for migrating your existing Spark workloads, the migration process can involve multiple scenarios and phases:
+- [Compare Fabric and Azure Synapse Spark: Key Differences](comparison-between-fabric-and-azure-synapse-spark.md)
+- [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md)

-* **Items**: Items migration involves the transfer of one or various items from your existing Azure Synapse workspace to Fabric. Learn more about migrating [Spark pools](migrate-synapse-spark-pools.md), [Spark configurations](migrate-synapse-spark-configurations.md), [Spark libraries](migrate-synapse-spark-libraries.md), [notebooks](migrate-synapse-notebooks.md), and [Spark job definition](migrate-synapse-spark-job-definition.md).
-* **Data and pipelines**: Using [OneLake shortcuts](../onelake/create-adls-shortcut.md), you can make ADLS Gen2 data (linked to an Azure Synapse workspace) available in Fabric lakehouse. Pipeline migration involves moving existing pipelines to Fabric, including notebook and Spark job definition pipeline activities. Learn more about [data and pipelines migration](migrate-synapse-data-pipelines.md).
-* **Metadata**: Metadata migration involves moving Spark catalog metadata (databases, tables, and partitions) from an existing Hive MetaStore (HMS) in Azure Synapse to Fabric lakehouse. Learn more about [HMS metadata migration](migrate-synapse-hms-metadata.md).
-* **Workspace**: Users can migrate an existing Azure Synapse workspace by creating a new workspace in Microsoft Fabric, including metadata. Workspace migration isn't covered in this guidance, assumption is that users need to [create a new workspace](../fundamentals/create-workspaces.md) or have an existing Fabric workspace. Learn more about [workspace roles](../fundamentals/roles-workspaces.md) in Fabric.
+If you're migrating an existing Synapse workspace, plan to create or use an existing Fabric workspace as the migration target. This article doesn't cover full workspace provisioning or non-Spark workload migration.
+
+## What can you migrate?
+
+Synapse-to-Fabric migration usually spans several workstreams.
+
+| **Migration area** | **Typical scope** | **Primary guidance** |
+|----|----|----|
+| **Planning and assessment** | Inventory Spark pools, notebooks, Spark Job Definitions, lake databases, linked services, and blockers | [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md) |
+| **Items, code refactoring, pools, configs, and libraries** | Notebooks, Spark Job Definitions, Spark pools, lake database mappings, `mssparkutils`, linked services, file paths, catalog APIs, connector auth, environments, custom pools, Spark properties, library compatibility | [Phase 2: Spark workload migration](synapse-migration-spark-workloads.md) |
+| **Hive Metastore and lake metadata** | Databases, tables, partitions, managed vs. external tables | [Phase 3: Hive Metastore and data migration](synapse-migration-hms-data.md) |
+| **Data access and pipelines** | OneLake shortcuts, ADLS Gen2 access, copy activities, pipeline migration | [Migrate data and pipelines](migrate-synapse-data-pipelines.md) |
+| **Security, validation, and cutover** | Roles, connections, governance, verification, cutover planning | [Phase 4: Security and governance migration](synapse-migration-security-validation-cutover.md) |
+
+## Choose your migration path
+
+Use the path that matches your goal.
+
+- **You need an end-to-end migration plan.** Start with the 4-phase best practices series. This is the best entry point for most production migrations.
+- **You want to move supported Spark items quickly.** Start with the [Spark Migration Assistant](synapse-to-fabric-spark-migration-assistant.md) and then use the refactoring and validation articles to close the gaps.
+- **You only need help with one area.** Use the task-specific articles for notebooks, Spark Job Definitions, pools, libraries, Hive Metastore metadata, or data/pipeline migration.
+
+## Recommended reading order
+
+For most teams, the fastest way to approach a Synapse Spark migration is:
+
+1. Review [Compare Fabric and Azure Synapse Spark: Key Differences](comparison-between-fabric-and-azure-synapse-spark.md).
+1. Read [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md).
+1. Run the [Spark Synapse to Fabric Spark Migration Assistant](synapse-to-fabric-spark-migration-assistant.md) where applicable.
+1. Refactor notebooks, Spark jobs, pools, and libraries using [Phase 2: Spark workload migration](synapse-migration-spark-workloads.md).
+1. Validate data access, metadata, security, and cutover readiness using the remaining best-practices articles.

 :::image type="content" source="media\migrate-synapse\migration-scenarios.png" alt-text="Screenshot showing the migration scenarios." lightbox="media/migrate-synapse/migration-scenarios.png":::

-Transitioning from Azure Synapse Spark to Fabric Spark requires a deep understanding of your current architecture and the differences between Azure Synapse Spark and Fabric. The first crucial step is an assessment, followed by the creation of a detailed migration plan. This plan can be customized to match your system's unique traits, phase dependencies, and workload complexities.
+Migration from Synapse Spark to Fabric is usually a copy-and-adapt process rather than a direct in-place move. You can migrate many assets quickly, but you should still expect to validate runtime behavior, replace Synapse-specific integrations, and align security, metadata, and operational patterns with Fabric.
+
+## Best practices series
+
+Use the best practices series for a structured, end-to-end migration path:
+
+- [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md)
+- [Phase 2: Spark workload migration](synapse-migration-spark-workloads.md)
+- [Phase 3: Hive Metastore and data migration](synapse-migration-hms-data.md)
+- [Phase 4: Security and governance migration](synapse-migration-security-validation-cutover.md)
+
+## Task-specific migration articles
+
+If you need targeted guidance for a specific migration task, use these articles:
+
+- [Spark Synapse to Fabric Spark Migration Assistant](synapse-to-fabric-spark-migration-assistant.md)
+- [Migrate Azure Synapse notebooks to Fabric](migrate-synapse-notebooks.md)
+- [Migrate Spark Job Definitions from Azure Synapse to Fabric](migrate-synapse-spark-job-definition.md)
+- [Migrate Spark Pools from Azure Synapse to Fabric](migrate-synapse-spark-pools.md)
+- [Migrate Spark configurations from Azure Synapse to Fabric](migrate-synapse-spark-configurations.md)
+- [Migrate Spark Libraries from Azure Synapse to Fabric](migrate-synapse-spark-libraries.md)
+- [Migrate Hive Metastore metadata](migrate-synapse-hms-metadata.md)
+- [Migrate data and pipelines](migrate-synapse-data-pipelines.md)

 ## Related content

-- [Fabric vs. Azure Synapse Spark](comparison-between-fabric-and-azure-synapse-spark.md)
-- Learn more about migration options for [Spark pools](migrate-synapse-spark-pools.md), [configurations](migrate-synapse-spark-configurations.md), [libraries](migrate-synapse-spark-libraries.md), [notebooks](migrate-synapse-notebooks.md) and [Spark job definition](migrate-synapse-spark-job-definition.md)
+- [Compare Fabric and Azure Synapse Spark: Key Differences](comparison-between-fabric-and-azure-synapse-spark.md)
+- [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md)
+- [Spark Synapse to Fabric Spark Migration Assistant](synapse-to-fabric-spark-migration-assistant.md)
+- Learn more about migration options for [Spark pools](migrate-synapse-spark-pools.md), [configurations](migrate-synapse-spark-configurations.md), [libraries](migrate-synapse-spark-libraries.md), [notebooks](migrate-synapse-notebooks.md), and [Spark job definition](migrate-synapse-spark-job-definition.md)
 - [Migrate data and pipelines](migrate-synapse-data-pipelines.md)
 - [Migrate Hive Metastore metadata](migrate-synapse-hms-metadata.md)
```
Lines changed: 102 additions & 0 deletions
```diff
@@ -0,0 +1,102 @@
+---
+title: Migrate Hive Metastore metadata and data paths to Fabric
+description: Migrate Hive Metastore objects and align data access with OneLake shortcuts and data movement options for Synapse to Fabric migration.
+ms.topic: how-to
+ms.date: 04/28/2026
+ms.reviewer: jejiang
+ai-usage: ai-assisted
+---
+
+# Phase 3: Hive Metastore and data migration
+
+This article is Phase 3 of 4 in the Azure Synapse Spark to Microsoft Fabric migration best practices series.
+
+Use this article when you're ready to migrate your Hive Metastore catalog and plan data access in Fabric. This article focuses on two decisions: how to migrate your table metadata and whether to use OneLake shortcuts (zero-copy) or move data to accessible storage.
+
+In this article, you learn how to:
+
+- Assess managed vs. external tables to determine your migration approach.
+- Export and import Hive Metastore metadata using notebook workflows.
+- Create OneLake shortcuts for zero-copy access to existing data sources.
+- Choose between shortcuts, copy pipelines, and bulk transfer tools for data movement.
+
+> [!TIP]
+> Create your target Lakehouse with schemas enabled. Lakehouse schemas allow you to organize tables into named collections (for example, sales, marketing, hr). The Spark Migration Assistant maps the default Synapse database to the `dbo` schema and additional databases to additional schemas in the same Lakehouse. Schemas are enabled by default when creating a new Lakehouse in the Fabric portal.
+
+For the full HMS migration guide, see [Migrate Hive Metastore metadata](migrate-synapse-hms-metadata.md).
+
+## Assess managed vs. external tables
+
+The critical first step is distinguishing managed from external tables in your Synapse Hive Metastore.
+
+- **External tables:** If data is in ADLS Gen2 in Delta format, create OneLake shortcuts directly to the ADLS Gen2 paths. No data movement needed.
+- **Managed tables:** Data is stored in Synapse's internal warehouse directory. You must create OneLake shortcuts to this path or copy data to an accessible ADLS Gen2 location.
+
+Synapse managed table warehouse directory path:
+
+```
+abfss://<container>@<storage>.dfs.core.windows.net/synapse/workspaces/<workspace>/warehouse
+```
```
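The managed-vs-external assessment above boils down to checking whether each table's storage location falls under the Synapse warehouse directory. In practice you would read each table's location from the catalog (for example, from `DESCRIBE EXTENDED` output); the sketch below shows only the classification step, with placeholder values standing in for `<container>`, `<storage>`, and `<workspace>`:

```python
# Assumed placeholder values for <container>, <storage>, <workspace>.
SYNAPSE_WAREHOUSE_PREFIX = (
    "abfss://mycontainer@mystorage.dfs.core.windows.net"
    "/synapse/workspaces/myworkspace/warehouse"
)

def classify_table(location: str) -> str:
    """Classify a table as 'managed' (stored under the Synapse warehouse
    directory, so it needs a shortcut or a copy) or 'external' (already in
    its own ADLS Gen2 path, so a direct shortcut works)."""
    return "managed" if location.startswith(SYNAPSE_WAREHOUSE_PREFIX) else "external"
```

Running this over the full table inventory gives the split that drives the rest of the migration plan: external tables get direct shortcuts, managed tables need the warehouse-directory shortcut or a copy.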
```diff
+
+## Migration workflow
+
+Microsoft provides export/import notebooks for Hive Metastore migration. The process has two phases.
+
+For the full HMS migration guide, see [Migrate Hive Metastore metadata](migrate-synapse-hms-metadata.md).
+
+### Phase 1: Export metadata from Synapse
+
+1. **Import the HMS export notebook** into your Azure Synapse workspace. This notebook queries and exports HMS metadata of databases, tables, and partitions to an intermediate directory in OneLake.
+
+1. **Configure parameters.** Set your Synapse workspace name, database names to export, and the target OneLake lakehouse for staging. The Spark internal catalog API is used to read catalog objects.
+
+1. **Run the export.** Execute all notebook cells. Metadata is written to the Files section of your Fabric Lakehouse in a structured folder hierarchy.
```
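The "structured folder hierarchy" that the export step produces can be pictured as one metadata file per table, grouped by database. The layout and function below are purely illustrative — the shipped export notebook defines its own structure and serialization:

```python
import json
import pathlib

def export_table_metadata(root: str, db: str, table: str, meta: dict) -> pathlib.Path:
    """Write one table's metadata into a <root>/<db>/tables/<table>.json
    hierarchy. Illustrative layout only; the real export notebook may
    organize its intermediate directory differently."""
    path = pathlib.Path(root) / db / "tables" / f"{table}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(meta))
    return path
```

In the real workflow, `root` would be a path in the Files section of the staging Lakehouse, and `meta` would carry the schema, partitioning, and location read from the Spark catalog.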
```diff
+
+### Phase 2: Import metadata into Fabric Lakehouse
+
+1. **Create shortcuts for data access.** Create a shortcut within the Files section of the Lakehouse pointing to the Synapse Spark warehouse directory. This makes managed table data accessible to Fabric.
+
+1. **Configure warehouse mappings.** For managed tables, provide `WarehouseMappings` to replace old Synapse warehouse directory paths with the shortcut paths in Fabric. All managed tables are converted to external tables during import.
+
+1. **Run the import notebook** in Fabric to create catalog objects (databases, tables, partitions) in the Lakehouse using Spark's internal catalog API.
+
+1. **Verify.** Check that all imported tables are visible in the Lakehouse Explorer UI's Tables section.
```
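The warehouse-mapping step is a prefix rewrite: every managed table whose location starts with the old Synapse warehouse path gets that prefix replaced by the shortcut path in the Lakehouse. The mapping shape and paths below are assumptions for illustration — the import notebook's actual `WarehouseMappings` parameter format may differ:

```python
# Hypothetical mapping shape; verify against the import notebook's
# actual WarehouseMappings parameter format before use.
warehouse_mappings = [
    {
        "source": "abfss://mycontainer@mystorage.dfs.core.windows.net"
                  "/synapse/workspaces/myworkspace/warehouse",
        "target": "Files/synapse-warehouse",  # shortcut in the Lakehouse Files section
    }
]

def remap_location(location: str, mappings: list) -> str:
    """Rewrite a managed table's old warehouse path to its shortcut path;
    locations outside every mapped prefix are left unchanged."""
    for m in mappings:
        if location.startswith(m["source"]):
            return m["target"] + location[len(m["source"]):]
    return location
```

Because every rewritten table now points at an explicit external location, this is also why all managed tables come out of the import as external tables.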
```diff
+
+## Limitations and considerations
+
+- The migration scripts use Spark's internal catalog API, not direct HMS database connections. This might not scale well for very large catalogs — for large environments, consider modifying the export logic to query the HMS database directly.
+
+- There's no isolation guarantee during export. If Synapse Spark compute modifies the metastore concurrently, inconsistent data might be introduced. Schedule migration during a maintenance window.
+
+- Functions aren't included in the current migration scripts.
+
+- After migration, OneLake shortcuts provide ongoing data access. If Synapse continues writing to the same ADLS Gen2 paths, Fabric sees the updated data through shortcuts automatically (data-level sync). However, new tables or schema changes in the Synapse HMS won't propagate automatically — you must re-run the migration scripts or manually create new tables in the Fabric Lakehouse.
+
+- **External Hive Metastore (Azure SQL DB / MySQL):** Some Synapse workspaces use an external HMS backed by Azure SQL Database or Azure Database for MySQL to persist catalog metadata outside the workspace and share it with HDInsight or Databricks. Fabric doesn't support connecting to an external Hive Metastore — it uses the Lakehouse catalog exclusively. If you use an external HMS, you must migrate the metadata into the Fabric Lakehouse catalog. You can do this by querying the external HMS database directly (via JDBC) to export table definitions and then recreating them in Fabric using Spark SQL or the HMS import notebooks. Note that external HMS support in Synapse is deprecated after Spark 3.4.
+
+> [!TIP]
+> For ongoing synchronization when both Synapse and Fabric are active: use OneLake shortcuts for data-level sync (automatic), and schedule periodic re-runs of the HMS export/import notebooks or build a reconciliation notebook to detect and sync new tables.
+
```
```diff
+## Data migration options
+
+You have data in ADLS Gen2 linked to your Synapse workspace that you need to make accessible in Fabric Lakehouse without unnecessary data duplication. Choose from the following approaches.
+
+- **OneLake Shortcuts (recommended, zero-copy):** Create shortcuts in Fabric Lakehouse pointing to your existing ADLS Gen2 paths. Delta format data in the Tables section auto-registers in the Lakehouse catalog. CSV/JSON/Parquet data goes in the Files section. No data movement required.
+
+- **mssparkutils fastcp:** For copying data from ADLS Gen2 to OneLake within notebooks.
+
+- **AzCopy:** Command-line utility for bulk data copy from ADLS Gen2 to OneLake.
+
+- **Data Factory Copy Activity:** Use Fabric Data Factory (or existing ADF/Synapse pipelines) to copy data to the Lakehouse.
+
+- **Azure Storage Explorer:** Visual tool for moving files from ADLS Gen2 to OneLake.
+
+> [!TIP]
+> Prefer shortcuts over data movement whenever possible. Shortcuts avoid data duplication and storage costs, and Delta tables in the Tables section are automatically discoverable in the SQL analytics endpoint and Power BI.
```
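When a bulk copy is preferable to a shortcut, the AzCopy option amounts to copying from the ADLS Gen2 URL to the Lakehouse's OneLake URL. The sketch below builds such a command; the workspace/lakehouse names are placeholders and the OneLake URL shape is an assumption to verify against the OneLake AzCopy documentation before use:

```python
def azcopy_command(adls_url: str, workspace: str, lakehouse: str, dest_folder: str) -> str:
    """Build an `azcopy copy` command from an ADLS Gen2 source to the
    Files section of a Fabric Lakehouse via the OneLake DFS endpoint.
    The URL shape below is an assumption; check the OneLake docs."""
    onelake_url = (
        f"https://onelake.dfs.fabric.microsoft.com/"
        f"{workspace}/{lakehouse}.Lakehouse/Files/{dest_folder}"
    )
    return f'azcopy copy "{adls_url}" "{onelake_url}" --recursive'

# Example (placeholder names):
print(azcopy_command(
    "https://mystorage.dfs.core.windows.net/mycontainer/warehouse",
    "myworkspace", "mylakehouse", "warehouse",
))
```

Copied data lands in the Files section; to expose it in the Tables section you still register it (or let Delta folders auto-register), which is one more reason the TIP above favors shortcuts.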
```diff
+
+## Related content
+
+- [Phase 1: Migration strategy and planning](synapse-migration-strategy-planning.md)
+- [Phase 2: Spark workload migration](synapse-migration-spark-workloads.md)
+- [Phase 3: Hive Metastore and data migration](synapse-migration-hms-data.md)
+- [Phase 4: Security and governance migration](synapse-migration-security-validation-cutover.md)
```
