
Commit cfe80d9

committed: updated image
1 parent cd4a951 commit cfe80d9

4 files changed

Lines changed: 10 additions & 10 deletions


learn-pr/wwl-databricks/select-and-configure-compute/5-install-libraries-for-compute.yml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ title: Install libraries for compute
 metadata:
   title: Install Libraries for Compute
   description: Learn how to install libraries on Azure Databricks compute resources using package repositories, workspace files, Unity Catalog volumes, and init scripts.
-  ms.date: 01/14/2026
+  ms.date: 01/19/2026
   author: weslbo
   ms.author: wedebols
   ms.topic: unit

learn-pr/wwl-databricks/select-and-configure-compute/includes/2-choose-appropriate-compute-type.md

Lines changed: 1 addition & 1 deletion
@@ -98,7 +98,7 @@ Different compute types suit different scenarios. The following table compares k
 
 Start your decision-making process by identifying your workload characteristics. The following diagram illustrates a decision flow to help you select the appropriate compute type:
 
-![Diagram explaining how to choose the right compute type in Azure Databricks.](../media/databricks-compute-selection.svg)
+![Diagram explaining how to choose the right compute type in Azure Databricks.](../media/databricks-compute-selection.png)
 
 Consider these questions:
 

learn-pr/wwl-databricks/select-and-configure-compute/includes/5-install-libraries-for-compute.md

Lines changed: 8 additions & 8 deletions
@@ -29,10 +29,10 @@ Maven libraries require **coordinates** in the format `groupId:artifactId:versio
 
 For R packages from CRAN, provide the package name. Unlike Python and Java libraries, CRAN installations always pull the latest version from the configured mirror. To pin specific R package versions, you need to store the package files in workspace files or volumes instead of installing from CRAN.
 
-With clusters configured in **standard access mode**, Maven coordinates and JAR file paths require **allow list approval** before installation. This security measure ensures admins review and approve libraries that run on shared compute resources.
+With clusters configured in **standard access mode**, Maven coordinates and JAR file paths require **`allowlist` approval** before installation. This security measure ensures admins review and approve libraries that run on shared compute resources.
 
 > [!NOTE]
-> To learn more about configuring and managing allow lists for libraries, see the [documentation](/azure/databricks/data-governance/unity-catalog/manage-privileges/allowlist).
+> To learn more about configuring and managing `allowlists` for libraries, see the [documentation](/azure/databricks/data-governance/unity-catalog/manage-privileges/allowlist).
 
 ## Install libraries from files
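The coordinate format in the hunk above (`groupId:artifactId:version`) can be sketched in Python. `parse_maven_coordinate` is a hypothetical helper for illustration only, not a Databricks or Maven API:

```python
def parse_maven_coordinate(coordinate: str) -> dict:
    """Split a Maven coordinate of the form groupId:artifactId:version."""
    parts = coordinate.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected groupId:artifactId:version, got {coordinate!r}")
    group_id, artifact_id, version = parts
    return {"groupId": group_id, "artifactId": artifact_id, "version": version}

# spark-avro is a real artifact; the version here is purely illustrative.
print(parse_maven_coordinate("org.apache.spark:spark-avro_2.12:3.5.0"))
```

Dropping the trailing `:version` segment (or both trailing segments) is how an admin widens an allowlist entry to every version of an artifact, or to a whole group.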

@@ -50,15 +50,15 @@ Unity Catalog volumes offer enhanced security and governance for library storage
 
 Python **requirements.txt files** work with both workspace files and volumes in Databricks Runtime 15.0 and above. These files let you define multiple package dependencies in a single file, making it easier to maintain consistent environments across clusters. Upload the requirements.txt file and install it just like any other library—Azure Databricks automatically installs all listed packages.
 
-For clusters with standard access mode, you must add library file paths to the allow list before installation. This applies to both workspace files and volumes, ensuring admins approve the libraries used on shared compute.
+For clusters with standard access mode, you must add library file paths to the `allowlist` before installation. This applies to both workspace files and volumes, ensuring admins approve the libraries used on shared compute.
 
 ## Use init scripts for advanced configuration
 
 **Init scripts** run shell commands during **cluster startup**, before the Spark driver and executors start. While Databricks **doesn't recommend** using init scripts for library installation—cluster-scoped libraries provide a better approach—init scripts prove useful for system-level **configuration** that libraries can't handle.
 
 You might use init scripts to install system packages with `apt-get`, configure environment variables, or set up monitoring agents. For example, an init script could install a specialized database driver that requires system libraries, then configure connection parameters through environment variables. The script runs every time the cluster starts, ensuring your configuration persists across restarts.
 
-Store init scripts in Unity Catalog volumes for clusters running Databricks Runtime 13.3 LTS and above. Create a shell script file, upload it to a volume, then configure the cluster to run the script by specifying its path like `/Volumes/main/engineering/scripts/setup.sh`. For standard access mode, add the init script path to the allow list before configuring the cluster.
+Store init scripts in Unity Catalog volumes for clusters running Databricks Runtime 13.3 LTS and above. Create a shell script file, upload it to a volume, then configure the cluster to run the script by specifying its path like `/Volumes/main/engineering/scripts/setup.sh`. For standard access mode, add the init script path to the `allowlist` before configuring the cluster.
 
 Init scripts execute sequentially in the order you specify. If any script returns a non-zero exit code, the cluster fails to start. This failure protection prevents clusters from running with incomplete or incorrect configuration. You can troubleshoot failed init scripts by configuring cluster log delivery and examining the init script logs.
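The sequential, fail-fast behavior described above (scripts run in order, and any non-zero exit code aborts cluster startup) can be sketched as follows. `run_init_scripts` is a hypothetical simulation, not the actual Azure Databricks startup logic, and `broken.sh` is an invented path:

```python
def run_init_scripts(script_paths, run_script):
    """Run init scripts in order; stop at the first non-zero exit code."""
    for path in script_paths:
        if run_script(path) != 0:
            # One failed script prevents the cluster from starting at all.
            return {"started": False, "failed_script": path}
    return {"started": True, "failed_script": None}

# Simulated exit codes for two scripts stored in a volume (the second fails).
exit_codes = {
    "/Volumes/main/engineering/scripts/setup.sh": 0,
    "/Volumes/main/engineering/scripts/broken.sh": 1,
}
print(run_init_scripts(list(exit_codes), exit_codes.get))
```

In the real system, the equivalent of `run_script` is the shell executing your uploaded script on each node, and the failure surfaces in the init script logs when log delivery is configured.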

@@ -68,13 +68,13 @@ Consider init scripts as a last resort for configuration needs that cluster-scop
 
 Clusters configured with **standard access mode** provide the strongest security and isolation in Azure Databricks. This mode requires explicit approval for libraries and init scripts to prevent unauthorized code execution on shared compute resources.
 
-Before installing Maven libraries or JAR files on standard access mode clusters, a **metastore admin** must add them to the allowlist. Maven coordinates go on the allowlist using the format `groupId:artifactId:version`. You can allowlist all versions of a library with `groupId:artifactId`, or all artifacts in a group with just `groupId`. For JAR files stored in volumes or object storage, allowlist the file path or directory path.
+Before installing Maven libraries or JAR files on standard access mode clusters, a **metastore admin** must add them to the `allowlist`. Maven coordinates go on the `allowlist` using the format `groupId:artifactId:version`. You can `allowlist` all versions of a library with `groupId:artifactId`, or all artifacts in a group with just `groupId`. For JAR files stored in volumes or object storage, `allowlist` the file path or directory path.
 
-Init scripts require separate allowlist entries even if stored in the same location as JAR files. When allowlisting a path, Azure Databricks uses prefix matching—adding `/Volumes/prod-libraries/` to the allowlist permits all files and subdirectories within that location. Include a trailing slash to prevent unintended prefix matches at the directory level.
+Init scripts require separate `allowlist` entries even if stored in the same location as JAR files. When allow listing a path, Azure Databricks uses prefix matching—adding `/Volumes/prod-libraries/` to the `allowlist` permits all files and subdirectories within that location. Include a trailing slash to prevent unintended prefix matches at the directory level.
 
-The allowlist only grants permission to use a path for library or init script installation. You still need appropriate data access permissions. For volumes, the installer identity must have `READ VOLUME` permission. For standard access mode, the cluster owner's identity validates these permissions during library installation.
+The `allowlist` only grants permission to use a path for library or init script installation. You still need appropriate data access permissions. For volumes, the installer identity must have `READ VOLUME` permission. For standard access mode, the cluster owner's identity validates these permissions during library installation.
 
-To configure the allowlist, metastore admins use Catalog Explorer, selecting the metastore settings and navigating to the **Allowed JARs/Init Scripts** section. This centralized control ensures that security teams can review and approve all libraries used across the organization's compute resources, maintaining governance without blocking productivity.
+To configure the `allowlist`, metastore admins use Catalog Explorer, selecting the metastore settings and navigating to the **Allowed JARs/Init Scripts** section. This centralized control ensures that security teams can review and approve all libraries used across the organization's compute resources, maintaining governance without blocking productivity.
 
 ![Screenshot of the Add allowed JARs / Init Scripts / Maven Coordinates dialog box.](../media/allow-list.png)
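The prefix matching and trailing-slash advice in the hunk above can be illustrated with a short sketch. `is_allowlisted` is a hypothetical stand-in for the check, not the actual Azure Databricks implementation:

```python
def is_allowlisted(path: str, allowed_prefixes: list[str]) -> bool:
    """Return True if the path starts with any allowed prefix."""
    return any(path.startswith(prefix) for prefix in allowed_prefixes)

allowed = ["/Volumes/prod-libraries/"]  # trailing slash, as recommended above

print(is_allowlisted("/Volumes/prod-libraries/driver.jar", allowed))       # True
print(is_allowlisted("/Volumes/prod-libraries-temp/driver.jar", allowed))  # False
```

Without the trailing slash, the prefix `/Volumes/prod-libraries` would also match a hypothetical sibling directory such as `/Volumes/prod-libraries-temp/`, which is exactly the unintended directory-level match the trailing slash prevents.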

Binary image file: 70.8 KB (diff not shown)
