---
title: Read data from semantic models and write data that semantic models can consume using Spark
description: Learn how to read from semantic models and write data that can be used in semantic models using Spark.
ms.reviewer: marcozo
ms.topic: how-to
ms.date: 03/22/2025
ms.search.form: Read write powerbi
---
> [!IMPORTANT]
> The Spark native connector for Semantic Link is in deprecation mode as of October 2025 and will be fully retired by October 2028. No major versions of the Spark native connector will be released after October 1, 2025. The connector is compatible only with Spark runtime versions up to 3.5 and doesn't support Spark runtime version 4.0 or later. To ensure continued support and access to new features, migrate to the Semantic Link Python SDK.
In this article, you learn how to read data and metadata, and evaluate measures, in semantic models using the semantic link Spark native connector in Microsoft Fabric. You also learn how to write data that semantic models can consume.
## Prerequisites

[!INCLUDE prerequisites]

- Go to the Data Science experience in [!INCLUDE product-name]:
  - From the left pane, select Workloads.
  - Select Data Science.
- Create a new notebook to copy/paste code into cells.
- Spark runtime compatibility: the Spark native connector is supported only on Spark runtimes up to version 3.5. It isn't supported on Spark runtime 4.0 or later. For new development, use the Semantic Link Python SDK.
- [!INCLUDE sempy-notebook-installation]
- Download the Customer Profitability Sample.pbix semantic model from the datasets folder of the fabric-samples repository, and save it locally.
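Because the connector supports only Spark runtimes up to version 3.5, it can help to verify the runtime version before running the samples in this article. The following is a minimal, illustrative sketch; the helper function is hypothetical and not part of any Fabric API:

```python
def is_supported_runtime(spark_version: str) -> bool:
    """Return True if the Spark version is at or below the connector's 3.5 ceiling."""
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    return (major, minor) <= (3, 5)

# In a Fabric notebook you would pass spark.version, for example:
# is_supported_runtime(spark.version)
print(is_supported_runtime("3.5.1"))  # True: supported by the native connector
print(is_supported_runtime("4.0.0"))  # False: migrate to the Semantic Link Python SDK
```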
## Upload the semantic model into your workspace

In this article, we use the Customer Profitability Sample.pbix semantic model. This semantic model represents a company that manufactures marketing materials and contains data about products, customers, and corresponding revenue for various business units.

1. From the left pane, select Workspaces, and then select the name of your workspace to open it.
1. Select Import > Report or Paginated Report > From this computer, and then select the Customer Profitability Sample.pbix file.
:::image type="content" source="media/read-write-power-bi-spark/upload-power-bi-data-to-workspace.png" alt-text="Screenshot showing the interface for uploading a semantic model into the workspace." lightbox="media/read-write-power-bi-spark/upload-power-bi-data-to-workspace.png":::
Once the upload is done, your workspace has three new artifacts: a Power BI report, a dashboard, and a semantic model named Customer Profitability Sample. You use this semantic model for the steps in this article.
:::image type="content" source="media/read-write-power-bi-spark/uploaded-artifacts-in-workspace.png" alt-text="Screenshot showing the items from the Power BI file uploaded into the workspace." lightbox="media/read-write-power-bi-spark/uploaded-artifacts-in-workspace.png":::
## Read semantic models

By default, the workspace used to access semantic models is:

- the workspace of the attached lakehouse, or
- the workspace of the notebook, if no lakehouse is attached.
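The fallback rule above can be sketched as a small helper (illustrative only; the function and its parameters aren't part of any Fabric API):

```python
from typing import Optional

def resolve_default_workspace(attached_lakehouse_workspace: Optional[str],
                              notebook_workspace: str) -> str:
    """Pick the workspace used to access semantic models:
    the attached lakehouse's workspace if one exists, else the notebook's."""
    return attached_lakehouse_workspace or notebook_workspace

print(resolve_default_workspace("Sales Lakehouse WS", "Notebook WS"))  # Sales Lakehouse WS
print(resolve_default_workspace(None, "Notebook WS"))                  # Notebook WS
```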
Microsoft Fabric exposes all tables from all semantic models in the workspace as Spark tables. All Spark SQL commands can be executed in Python, R, and Scala. The semantic link Spark native connector supports push-down of Spark predicates to the Power BI engine.
> [!NOTE]
> The Spark native connector has compatibility and support limitations. See the Important note at the top of this article for the retirement timeline and support boundaries.
> [!TIP]
> Because Power BI tables and measures are exposed as regular Spark tables, they can be joined with other Spark data sources in a single query.
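For example, a semantic model table can be joined with an ordinary Spark table in one query. This is a sketch only: the `SalesTargets` table and the join columns are hypothetical and would need to match your own data.

```sql
%%sql
SELECT c.`Customer`, t.Target
FROM pbi.`Customer Profitability Sample`.Customer AS c
JOIN SalesTargets AS t
  ON c.`Customer` = t.CustomerName
```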
- List the tables of all semantic models in the workspace, using PySpark:

  ```python
  df = spark.sql("SHOW TABLES FROM pbi")
  display(df)
  ```
- Retrieve the data from the Customer table in the Customer Profitability Sample semantic model, using SparkR:

  > [!NOTE]
  > Retrieving tables is subject to strict limitations (see the limitations later in this article) and the results might be incomplete. Use aggregate pushdown to reduce the amount of data transferred. The supported aggregates are COUNT, SUM, AVG, MIN, and MAX.

  ```R
  %%sparkr

  df = sql("SELECT * FROM pbi.`Customer Profitability Sample`.Customer")
  display(df)
  ```
- Power BI measures are available through the virtual table `_Metrics`. The following query computes the total revenue and revenue budget by region and industry:

  ```sql
  %%sql

  SELECT
      `Customer[Country/Region]`,
      `Industry[Industry]`,
      AVG(`Total Revenue`),
      AVG(`Revenue Budget`)
  FROM
      pbi.`Customer Profitability Sample`.`_Metrics`
  WHERE
      `Customer[State]` IN ('CA', 'WA')
  GROUP BY
      `Customer[Country/Region]`,
      `Industry[Industry]`
  ```
- Inspect the available measures and dimensions, using the Spark schema:

  ```python
  spark.table("pbi.`Customer Profitability Sample`._Metrics").printSchema()
  ```
- Save the data as a Delta table in your lakehouse:

  ```python
  delta_table_path = "<your delta table path>" # fill in your delta table path
  df.write.format("delta").mode("overwrite").save(delta_table_path)
  ```
## Limitations

The read access APIs have the following limitations:

- Spark runtime version: the Spark native connector isn't supported on Spark runtime 4.0 or later. Behavior on unsupported runtimes isn't guaranteed.
- Queries that run longer than 10 seconds in Analysis Services aren't supported. The indication inside Spark is `java.net.SocketTimeoutException: PowerBI service comm failed`.
- Power BI table access using Spark SQL is subject to Power BI backend limitations.
- Predicate pushdown for Spark `_Metrics` queries is limited to a single IN expression, which requires at least two elements. Extra IN expressions and unsupported predicates are evaluated in Spark after data transfer.
- Predicate pushdown for Power BI tables accessed using Spark SQL doesn't support all Spark expressions; unsupported predicates are evaluated in Spark after data transfer.
- The Spark session must be restarted to make new semantic models accessible in Spark SQL.
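The `_Metrics` pushdown rule can be illustrated with a small checker: a filter qualifies for pushdown only when it is a single IN expression with at least two elements. This is illustrative only, not connector code:

```python
def qualifies_for_pushdown(in_expressions: list) -> bool:
    """True when there is exactly one IN expression and it lists two or more
    values, matching the _Metrics pushdown rule described above."""
    return len(in_expressions) == 1 and len(in_expressions[0]) >= 2

print(qualifies_for_pushdown([["CA", "WA"]]))              # True: one IN, two elements
print(qualifies_for_pushdown([["CA"]]))                    # False: only one element
print(qualifies_for_pushdown([["CA", "WA"], ["Retail"]]))  # False: more than one IN
```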
## Prepare for migration

As the Spark native connector for Semantic Link approaches retirement, we recommend planning your migration to the Semantic Link Python SDK. The Python SDK provides continued support, new features, and compatibility with future Spark runtime versions.

To prepare for migration:

- Review your current usage of the Spark native connector across your Fabric tenants.
- Consult the Semantic Link Python SDK documentation for migration guidance and updated APIs.
- If you have questions or need migration assistance, contact the support team at [email protected].
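As a starting point, the reads shown earlier in this article have rough equivalents in the Semantic Link Python SDK (`sempy`). This sketch assumes a Fabric notebook with the semantic-link package installed; consult the SDK documentation for the authoritative API and parameters:

```python
# Requires a Microsoft Fabric notebook with the semantic-link package installed.
import sempy.fabric as fabric

# List semantic models in the current workspace (roughly: SHOW TABLES FROM pbi).
datasets = fabric.list_datasets()

# Read a table from a semantic model (roughly: SELECT * FROM pbi.`...`.Customer).
customers = fabric.read_table("Customer Profitability Sample", "Customer")

# Evaluate measures grouped by columns (roughly: the _Metrics query above).
revenue = fabric.evaluate_measure(
    "Customer Profitability Sample",
    measure=["Total Revenue", "Revenue Budget"],
    groupby_columns=["Customer[Country/Region]", "Industry[Industry]"],
)
```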