-
Notifications
You must be signed in to change notification settings - Fork 21.8k
Expand file tree
/
Copy pathcalculate-blob-count-size.yml
More file actions
84 lines (70 loc) · 4.08 KB
/
calculate-blob-count-size.yml
File metadata and controls
84 lines (70 loc) · 4.08 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
### YamlMime:HowTo
metadata:
title: Calculate blob count and size using Azure Storage inventory
description: Learn how to calculate the count and total size of blobs per container.
author: normesta
ms.author: normesta
ms.date: 3/20/2024
ms.service: azure-blob-storage
ms.topic: how-to
ms.custom:
- subject-rbac-steps
- ge-structured-content-pilot
title: |
Calculate blob count and total size per container using Azure Storage inventory
introduction: |
This article uses the Azure Blob Storage inventory feature and Azure Synapse to calculate the blob count and total size of blobs per container. These values are useful when optimizing blob usage per container.
procedureSection:
- title: |
Enable inventory reports
summary: |
The first step in this method is to [enable inventory reports](blob-inventory.md#enabling-inventory-reports) on your storage account. You may have to wait up to 24 hours after enabling inventory reports for your first report to be generated.
When you have an inventory report to analyze, grant yourself read access to the container where the report CSV file resides by assigning yourself the **Storage Blob Data Reader** role. Be sure to use the email address of the account you're using to run the report. To learn how to assign an Azure role to a user with Azure role-based access control (Azure RBAC), follow the instructions provided in [Assign Azure roles using the Azure portal](/azure/role-based-access-control/role-assignments-portal).
> [!NOTE]
> To calculate the blob size from the inventory report, make sure to include the **Content-Length** schema field in your rule definition.
## Create an Azure Synapse workspace
Next, [create an Azure Synapse workspace](../../synapse-analytics/get-started-create-workspace.md) where you will execute a SQL query to report the inventory results.
## Create the SQL query
After you create your Azure Synapse workspace, do the following steps.
steps:
- |
Navigate to [https://web.azuresynapse.net](https://web.azuresynapse.net).
- |
Select the **Develop** tab on the left edge.
- |
Select the large plus sign (+) to add an item.
- |
Select **SQL script**.
:::image type="content" source="media/calculate-blob-count-size/synapse-sql-script.png" alt-text="Screenshot of select SQL script to create a new query.":::
- title: |
Run the SQL query
summary: |
Follow the below steps:
steps:
- |
Add the following SQL query in your Azure Synapse workspace to [read the inventory CSV file](../../synapse-analytics/sql/query-single-csv-file.md#read-a-csv-file).
For the `bulk` parameter, use the URL of the inventory report CSV file that you want to analyze.
```sql
SELECT LEFT([Name], CHARINDEX('/', [Name]) - 1) AS Container,
COUNT(*) As TotalBlobCount,
SUM([Content-Length]) As TotalBlobSize
FROM OPENROWSET(
bulk '<URL to your inventory CSV file>',
format='csv', parser_version='2.0', header_row=true
) AS Source
GROUP BY LEFT([Name], CHARINDEX('/', [Name]) - 1)
```
- |
Name your SQL query in the properties pane on the right.
- |
Publish your SQL query by pressing CTRL+S or selecting the **Publish all** button.
- |
Select the **Run** button to execute the SQL query. The blob count and total size per container are reported in the **Results** pane.
:::image type="content" source="media/calculate-blob-count-size/output.png" alt-text="Screenshot of output from running the script to calculate blob count and total size.":::
relatedContent:
- text: Use Azure Storage blob inventory to manage blob data
url: blob-inventory.md
- text: Calculate container statistics by using Databricks
url: storage-blob-calculate-container-statistics-databricks.md
- text: Calculate the total billing size of a blob container
url: ../scripts/storage-blobs-container-calculate-billing-size-powershell.md