---
title: Notebook activity
description: Learn how to add a notebook activity to a pipeline and use it to invoke a notebook in Data Factory in Microsoft Fabric.
ms.reviewer: xupxhou
ms.topic: how-to
ms.custom: pipelines
ms.date: 06/16/2025
---

# Notebook activity

Use the Notebook activity to run notebooks you create in [!INCLUDE product-name] as part of your Data Factory pipelines. Notebooks let you run Apache Spark jobs to ingest, clean, or transform your data as part of your data workflows. It's easy to add a Notebook activity to your pipelines in Fabric, and this guide walks you through each step.
## Prerequisites

To get started, you must complete the following prerequisites:

[!INCLUDE basic-prerequisites]
- A notebook created in your workspace. To create a new notebook, refer to [How to create [!INCLUDE product-name] notebooks](../data-engineering/how-to-use-notebook.md).
## Add a Notebook activity to a pipeline

1. Create a new pipeline in your workspace.
1. Search for **Notebook** in the pipeline **Activities** pane, and select it to add it to the pipeline canvas.

   :::image type="content" source="media/notebook-activity/add-notebook-activity-to-pipeline.png" alt-text="Screenshot of the Fabric UI with the Activities pane and Notebook activity highlighted.":::

1. Select the new Notebook activity on the canvas if it isn't already selected.

   :::image type="content" source="media/notebook-activity/notebook-general-settings.png" alt-text="Screenshot showing the General settings tab of the Notebook activity.":::

   Refer to the General settings guidance to configure the **General** settings tab.

1. Select the **Settings** tab. Under **Connection**, select the authentication method for the notebook run and provide the required credentials.
1. Select an existing notebook from the **Notebook** dropdown, and optionally specify any parameters to pass to the notebook.

   :::image type="content" source="media/notebook-activity/notebook-connection-workspace-parameters.png" alt-text="Screenshot showing the Notebook settings tab highlighting the tab, where to choose a notebook, and where to add parameters.":::
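To illustrate how activity parameters reach the notebook, here's a minimal sketch of a notebook parameters cell. In a Fabric notebook, you designate one cell as the parameters cell; variables assigned there act as defaults, and a parameter of the same name defined on the Notebook activity overrides the default at run time. The parameter names and values below are hypothetical examples, not part of any specific notebook.

```python
# Hypothetical notebook parameters cell (in the Fabric notebook editor,
# toggle this cell as a parameters cell).
# These assignments are defaults; if the pipeline's Notebook activity
# defines a parameter with the same name, that value replaces the default
# when the pipeline runs the notebook.
input_path = "Files/raw/sales.csv"  # example string parameter
run_date = "2025-01-01"             # example date passed as a string
max_rows = 1000                     # example integer parameter
```

For example, a Notebook activity parameter named `run_date` of type String would replace the `"2025-01-01"` default for that pipeline run, while unset parameters keep their defaults.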
## Configure service principal authentication

To run a notebook with service principal authentication, complete the following steps:

1. **Create the workspace identity.**

   Enable a workspace identity in your Fabric workspace (this can take a moment). The workspace identity must be created in the same workspace as your pipeline.

   For more information, see the documentation on workspace identity.

1. **Enable the tenant-level setting.**

   Enable the following tenant setting, which is disabled by default: **Service principals can call Fabric public APIs**.

   You can enable this setting in the Fabric admin portal. For more information about this setting, see the article on enabling service principal authentication for admin APIs.

1. **Grant workspace permissions to the workspace identity.**

   Open the workspace, select **Manage access**, and assign permissions to the workspace identity. **Contributor** access is sufficient for most scenarios. If your notebook isn't in the same workspace as your pipeline, assign the workspace identity you created in your pipeline's workspace at least **Contributor** access to your notebook's workspace.

   For more information, see the documentation on giving users access to workspaces.
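Once the tenant setting is enabled and the principal has workspace access, it can call Fabric public APIs with an OAuth 2.0 client-credentials token. The sketch below only *constructs* the token request and an on-demand pipeline run request; all IDs are placeholders, the actual HTTP calls are shown as comments, and the Job Scheduler endpoint shape is an assumption based on the public Fabric REST API, so verify it against the current API reference before use.

```python
# Hedged sketch: preparing a client-credentials call to the Fabric public API.
# All IDs below are placeholders; the run endpoint follows the Fabric Job
# Scheduler API (run on-demand item job), which you should verify against
# the current REST reference.
tenant_id = "<tenant-id>"
client_id = "<app-client-id>"
client_secret = "<app-client-secret>"  # keep in a vault, never in code
workspace_id = "<workspace-id>"
pipeline_id = "<pipeline-item-id>"

# 1) Token request: OAuth 2.0 client credentials against Microsoft Entra ID,
#    scoped to the Fabric API.
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
token_body = {
    "grant_type": "client_credentials",
    "client_id": client_id,
    "client_secret": client_secret,
    "scope": "https://api.fabric.microsoft.com/.default",
}

# 2) On-demand pipeline run via the Job Scheduler API.
run_url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
    f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline"
)

# With, for example, the `requests` library:
#   token = requests.post(token_url, data=token_body).json()["access_token"]
#   requests.post(run_url, headers={"Authorization": f"Bearer {token}"})
```

If the tenant setting is disabled or the identity lacks workspace access, the second request fails with an authorization error, which is why the steps above are prerequisites.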
To minimize the time it takes to run your notebook job, you can optionally set a session tag. Setting a session tag instructs Spark to reuse any existing Spark session with that tag, minimizing startup time. Any arbitrary string value can be used for the session tag. If no session with that tag exists, a new one is created using the tag value.
:::image type="content" source="media/notebook-activity/notebook-advanced-settings.png" alt-text="Screenshot showing the Notebook settings tab highlighting the tab, where to add session tag.":::
> [!NOTE]
> To use the session tag, the **High concurrency mode for pipelines running multiple notebooks** option must be turned on. You can find this option under the **High concurrency mode** Spark settings in the **Workspace settings**.

:::image type="content" source="media/notebook-activity/turn-on-high-concurrency-mode-for-session-tags.png" alt-text="Screenshot showing the Workspace settings tab highlighting the tab, where to enable high concurrency mode for pipelines running multiple notebooks.":::
[!INCLUDE save-run-schedule-pipeline]
## Known limitations

- Using a service principal to run a notebook that contains semantic link code has functional limitations and supports only a subset of semantic link features. For details, see the supported semantic link functions. To use other capabilities, we recommend manually authenticating semantic link with a service principal.