---
title: Get started with Azure Blob Storage and Python
titleSuffix: Azure Storage
description: Get started developing a Python application that works with Azure Blob Storage. This article helps you set up a project and authorize access to an Azure Blob Storage endpoint.
services: storage
author: stevenmatthew
ms.author: shaas
ms.service: azure-blob-storage
ms.topic: how-to
ms.date: 10/02/2024
ai-usage: ai-assisted
ms.custom: devx-track-python, devguide-python, sfi-ropc-nochange
---

Get started with Azure Blob Storage and Python

[!INCLUDE storage-dev-guide-selector-getting-started]

This article shows you how to connect to Azure Blob Storage by using the Azure Blob Storage client library for Python. Once connected, use the developer guides to learn how your code can operate on containers, blobs, and features of the Blob Storage service.

If you're looking to start with a complete example, see Quickstart: Azure Blob Storage client library for Python.

API reference | Package (PyPI) | Library source code | Samples | Give feedback

Prerequisites

Set up your project

This section walks you through preparing a project to work with the Azure Blob Storage client library for Python.

From your project directory, install packages for the Azure Blob Storage and Azure Identity client libraries using the pip install command. The azure-identity package is needed for passwordless connections to Azure services.

pip install azure-storage-blob azure-identity
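To confirm that both packages are importable from the interpreter you plan to use, a quick check along the following lines can help. This is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical and isn't part of either client library.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that the import system can't find."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package (for example, "azure") isn't installed.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Module names provided by the azure-storage-blob and azure-identity packages.
required = ["azure.storage.blob", "azure.identity"]
print("Missing modules:", missing_modules(required) or "none")
```

If the output lists a module, rerun the pip install command in the same environment that runs your code.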

Then open your code file and add the necessary import statements. In this example, we add the following to our .py file:

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

Blob client library information:

  • azure.storage.blob: Contains the primary classes (client objects) that you can use to operate on the service, containers, and blobs.

Asynchronous programming

The Azure Blob Storage client library for Python supports both synchronous and asynchronous APIs. The asynchronous APIs are based on Python's asyncio library.

Follow these steps to use the asynchronous APIs in your project:

  • Install an async transport, such as aiohttp. You can install aiohttp along with azure-storage-blob by using an optional dependency install command. In this example, we use the following pip install command:

    pip install azure-storage-blob[aio]
  • Open your code file and add the necessary import statements. In this example, we add the following to our .py file:

    import asyncio
    
    from azure.identity.aio import DefaultAzureCredential
    from azure.storage.blob.aio import BlobServiceClient, BlobClient, ContainerClient

    The import asyncio statement is only required if your code uses the asyncio library directly. It's added here for clarity, as the examples in the developer guide articles use the asyncio library.

  • Create a client object using async with to begin working with data resources. Only the top level client needs to use async with, as other clients created from it share the same connection pool. In this example, we create a BlobServiceClient object using async with, and then create a ContainerClient object:

    async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
        container_client = blob_service_client.get_container_client(container="sample-container")

    To learn more, see the async examples in Authorize access and connect to Blob Storage.
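The steps above can be combined into a small script. The following is a sketch rather than a definitive implementation: the function names `list_container_names` and `main` are hypothetical, the account URL is a placeholder, and the imports sit inside the coroutine only so that the SDK is required when the coroutine runs rather than when the module loads.

```python
import asyncio


async def list_container_names(account_url: str) -> list:
    """Return the names of all containers in the storage account."""
    # Imported here so the SDK is only needed when the coroutine runs.
    from azure.identity.aio import DefaultAzureCredential
    from azure.storage.blob.aio import BlobServiceClient

    names = []
    # The async credential holds network resources too, so close it with async with.
    async with DefaultAzureCredential() as credential:
        # Only the top-level client needs async with.
        async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
            # list_containers() returns an async iterator of container properties.
            async for container in blob_service_client.list_containers():
                names.append(container.name)
    return names


async def main():
    # TODO: Replace <storage-account-name> with your actual storage account name
    account_url = "https://<storage-account-name>.blob.core.windows.net"
    print(await list_container_names(account_url))
```

To run the sketch from a script, call `asyncio.run(main())`.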

Blob async client library information:

  • azure.storage.blob.aio: Contains the primary classes that you can use to operate on the service, containers, and blobs asynchronously.

Authorize access and connect to Blob Storage

To connect an app to Blob Storage, create an instance of the BlobServiceClient class. This object is your starting point to interact with data resources at the storage account level. You can use it to operate on the storage account and its containers. You can also use the service client to create container clients or blob clients, depending on the resource you need to work with.

To learn more about creating and managing client objects, including best practices, see Create and manage client objects that interact with data resources.
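As a rough illustration of how the client types relate, the sketch below derives a container client and a blob client from an existing service client. The helper name `get_resource_clients` and the container and blob names in the usage note are hypothetical; the function works with either the synchronous or the asynchronous `BlobServiceClient`, because neither call performs network I/O.

```python
def get_resource_clients(blob_service_client, container_name, blob_name):
    """Return (container_client, blob_client) derived from a service client."""
    # A container client operates on one container and the blobs inside it.
    container_client = blob_service_client.get_container_client(container=container_name)
    # A blob client operates on a single blob within that container.
    blob_client = container_client.get_blob_client(blob=blob_name)
    return container_client, blob_client
```

For example, `get_resource_clients(blob_service_client, "sample-container", "sample-blob.txt")` returns clients scoped to that container and blob. Creating a client doesn't create the underlying resource.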

You can authorize a BlobServiceClient object by using a Microsoft Entra authorization token, an account access key, or a shared access signature (SAS). For optimal security, Microsoft recommends using Microsoft Entra ID with managed identities to authorize requests against blob data. For more information, see Authorize access to blobs using Microsoft Entra ID.

To authorize with Microsoft Entra ID, you need to use a security principal. Which type of security principal you need depends on where your app runs. Use the following table as a guide:

| Where the app runs | Security principal | Guidance |
| --- | --- | --- |
| Local machine (developing and testing) | Service principal | To learn how to register the app, set up a Microsoft Entra group, assign roles, and configure environment variables, see Authorize access using developer service principals |
| Local machine (developing and testing) | User identity | To learn how to set up a Microsoft Entra group, assign roles, and sign in to Azure, see Authorize access using developer credentials |
| Hosted in Azure | Managed identity | To learn how to enable managed identity and assign roles, see Authorize access from Azure-hosted apps using a managed identity |
| Hosted outside of Azure (for example, on-premises apps) | Service principal | To learn how to register the app, assign roles, and configure environment variables, see Authorize access from on-premises apps using an application service principal |
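For the service principal rows, the environment variables in question are typically `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, and `AZURE_CLIENT_SECRET` when the principal authenticates with a client secret. The following sketch checks that they're set before the app starts; the helper name `missing_service_principal_vars` is hypothetical, and other authentication methods (such as certificates) use different variables.

```python
import os

# Variables read from the environment when a service principal
# authenticates with a client secret.
SERVICE_PRINCIPAL_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(environ=os.environ):
    """Return the service principal variables that are unset or empty."""
    return [name for name in SERVICE_PRINCIPAL_VARS if not environ.get(name)]


missing = missing_service_principal_vars()
if missing:
    print("Not set:", ", ".join(missing))
```

Passing a dictionary as `environ` makes the check easy to exercise without touching the real environment.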

Authorize access using DefaultAzureCredential

An easy and secure way to authorize access and connect to Blob Storage is to obtain an OAuth token by creating a DefaultAzureCredential instance. You can then use that credential to create a BlobServiceClient object.

The following example creates a BlobServiceClient object using DefaultAzureCredential:

:::code language="python" source="~/azure-storage-snippets/blobs/howto/python/blob-devguide-py/blob_devguide_auth.py" id="Snippet_get_service_client_DAC":::
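As a rough illustration of the same pattern (not the referenced snippet itself), the sketch below builds the account URL from a storage account name and passes a `DefaultAzureCredential` instance to the client. The helper names are hypothetical, and the `blob.core.windows.net` suffix assumes the public Azure cloud.

```python
def build_account_url(account_name: str) -> str:
    """Return the Blob Storage endpoint for an account in the public Azure cloud."""
    return f"https://{account_name}.blob.core.windows.net"


def get_blob_service_client(account_name: str):
    """Create a BlobServiceClient authorized with DefaultAzureCredential."""
    # Imported here so the SDK is only needed when a client is created.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    credential = DefaultAzureCredential()
    return BlobServiceClient(build_account_url(account_name), credential=credential)
```

Creating the client doesn't send a request. A token is requested the first time the client calls the service, so authorization errors surface on the first operation rather than at construction.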

If your project uses asynchronous APIs, instantiate BlobServiceClient using async with:

# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.windows.net"
credential = DefaultAzureCredential()

async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
    # Work with data resources in the storage account
    ...

To use a shared access signature (SAS) token, provide the token as a string and initialize a BlobServiceClient object. If your account URL includes the SAS token, omit the credential parameter.

:::code language="python" source="~/azure-storage-snippets/blobs/howto/python/blob-devguide-py/blob_devguide_auth.py" id="Snippet_get_service_client_SAS":::
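The two options described above, passing the token as the credential or embedding it in the account URL, can be sketched as follows. The helper names `append_sas_token` and `get_client_from_sas_url` are hypothetical, and the sketch assumes the token is a query string that may or may not start with `?`.

```python
def append_sas_token(account_url: str, sas_token: str) -> str:
    """Return account_url with sas_token attached as its query string."""
    # Tokens are sometimes copied with a leading "?"; strip it to avoid "??".
    return f"{account_url.rstrip('/')}?{sas_token.lstrip('?')}"


def get_client_from_sas_url(account_url: str, sas_token: str):
    """Create a BlobServiceClient from a URL that already contains the SAS token."""
    # Imported here so the SDK is only needed when a client is created.
    from azure.storage.blob import BlobServiceClient

    # The URL carries the token, so the credential parameter is omitted.
    return BlobServiceClient(append_sas_token(account_url, sas_token))
```

Treat a URL that contains a SAS token as a secret: anyone who has it can use the permissions the token grants until it expires.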

If your project uses asynchronous APIs, instantiate BlobServiceClient using async with:

# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.windows.net"

# Replace <sas_token_str> with your actual SAS token
sas_token: str = "<sas_token_str>"

async with BlobServiceClient(account_url, credential=sas_token) as blob_service_client:
    # Work with data resources in the storage account
    ...

To learn more about generating and managing SAS tokens, see the following articles:

Note

For scenarios where shared access signatures (SAS) are used, Microsoft recommends using a user delegation SAS. A user delegation SAS is secured with Microsoft Entra credentials instead of the account key.

To use a storage account shared key, provide the key as a string and initialize a BlobServiceClient object.

:::code language="python" source="~/azure-storage-snippets/blobs/howto/python/blob-devguide-py/blob_devguide_auth.py" id="Snippet_get_service_client_account_key":::

You can also create a BlobServiceClient object using a connection string.

:::code language="python" source="~/azure-storage-snippets/blobs/howto/python/blob-devguide-py/blob_devguide_auth.py" id="Snippet_get_service_client_connection_string":::
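A connection string is a semicolon-separated list of `key=value` settings, such as `AccountName` and `AccountKey`. The sketch below, with hypothetical helper names, splits one into its settings (useful for checking which account a string points to) and creates a client with `BlobServiceClient.from_connection_string`.

```python
def parse_connection_string(conn_str: str) -> dict:
    """Split a storage connection string into a dict of its settings."""
    settings = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue  # Ignore empty segments, such as after a trailing ";".
        # Split on the first "=" only, because account keys can end with "=".
        key, _, value = segment.partition("=")
        settings[key.strip()] = value
    return settings


def get_client_from_connection_string(conn_str: str):
    """Create a BlobServiceClient from a connection string."""
    # Imported here so the SDK is only needed when a client is created.
    from azure.storage.blob import BlobServiceClient

    return BlobServiceClient.from_connection_string(conn_str)
```

A connection string usually contains the account key, so store it outside your source code (for example, in an environment variable) and avoid logging it.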

If your project uses asynchronous APIs, instantiate BlobServiceClient using async with:

import os

# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.windows.net"

# Read the account key from an environment variable instead of hard-coding it
shared_access_key = os.getenv("AZURE_STORAGE_ACCESS_KEY")

async with BlobServiceClient(account_url, credential=shared_access_key) as blob_service_client:
    # Work with data resources in the storage account
    ...

For information about how to obtain account keys and best practice guidelines for properly managing and safeguarding your keys, see Manage storage account access keys.

Important

Use the account access key with caution. If the key is lost or accidentally placed in an insecure location, your service may become vulnerable. Anyone who has the access key can authorize requests against the storage account and effectively has access to all of its data. DefaultAzureCredential provides enhanced security features and is the recommended approach for managing authorization to Azure services.


Build your app

As you build apps to work with data resources in Azure Blob Storage, your code primarily interacts with three resource types: storage accounts, containers, and blobs. To learn more about these resource types, how they relate to one another, and how apps interact with resources, see Understand how apps interact with Blob Storage data resources.
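One way to picture how the three resource types nest is through their URLs: a blob's URL extends its container's URL, which extends the account endpoint. The sketch below uses only the standard library; the function name `resource_urls` is hypothetical, and the endpoint suffix assumes the public Azure cloud.

```python
from urllib.parse import quote


def resource_urls(account_name: str, container_name: str, blob_name: str) -> dict:
    """Return the URLs of a storage account, a container in it, and a blob in that container."""
    account_url = f"https://{account_name}.blob.core.windows.net"
    container_url = f"{account_url}/{quote(container_name)}"
    # quote() keeps "/" intact, so virtual directory paths in blob names are preserved.
    blob_url = f"{container_url}/{quote(blob_name)}"
    return {"account": account_url, "container": container_url, "blob": blob_url}


print(resource_urls("mystorageacct", "sample-container", "reports/2024 summary.txt")["blob"])
```

In practice, the client objects build these URLs for you; the sketch is only meant to show the relationship between the resources.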

The following guides show you how to access data and perform specific actions using the Azure Storage client library for Python:

| Guide | Description |
| --- | --- |
| Configure a retry policy | Implement retry policies for client operations. |
| Copy blobs | Copy a blob from one location to another. |
| Create a container | Create blob containers. |
| Create a user delegation SAS | Create a user delegation SAS for a container or blob. |
| Create and manage blob leases | Establish and manage a lock on a blob. |
| Create and manage container leases | Establish and manage a lock on a container. |
| Delete and restore blobs | Delete blobs and restore soft-deleted blobs. |
| Delete and restore containers | Delete containers and restore soft-deleted containers. |
| Download blobs | Download blobs by using strings, streams, and file paths. |
| Find blobs using tags | Set and retrieve tags, and use tags to find blobs. |
| List blobs | List blobs in different ways. |
| List containers | List containers in an account and the various options available to customize a listing. |
| Manage properties and metadata (blobs) | Get and set properties and metadata for blobs. |
| Manage properties and metadata (containers) | Get and set properties and metadata for containers. |
| Performance tuning for data transfers | Optimize performance for data transfer operations. |
| Set or change a blob's access tier | Set or change the access tier for a block blob. |
| Upload blobs | Learn how to upload blobs by using strings, streams, file paths, and other methods. |