
Commit 9ac6baa

Merge pull request #314881 from umamaheshmsft/discovery-storage-container-docs
Add MS Learn docs: managed identity and storage containers (concept + how-to)
2 parents d9cc287 + f4935ff commit 9ac6baa

6 files changed

Lines changed: 671 additions & 0 deletions

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
1+
---
2+
title: Managed identities in Microsoft Discovery
3+
description: Understand how Microsoft Discovery uses user-assigned managed identities (UAMI) for authentication across workspaces, supercomputers, and bookshelves.
4+
author: umamm
5+
ms.author: umamm
6+
ms.service: azure
7+
ms.topic: concept-article
8+
ms.date: 04/17/2026
9+
---
10+
11+
# Managed identities in Microsoft Discovery
12+
13+
Microsoft Discovery uses **user-assigned managed identities (UAMI)** to authenticate against Azure resources on your behalf. Rather than managing secrets or connection strings, you create a managed identity, grant it the necessary Azure roles, and provide its resource ID when you create Discovery resources. The Discovery platform then uses that identity to access storage accounts, container registries, AI services, and managed resource group resources.
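
The resource ID mentioned above follows the standard Azure Resource Manager layout for user-assigned managed identities. The following is a minimal sketch of building and validating that ID. The helper names (`uami_resource_id`, `parse_uami_resource_id`) and the sample values are illustrative assumptions, not part of any Discovery SDK.

```python
import re

# ARM resource ID layout for a user-assigned managed identity.
_UAMI_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.ManagedIdentity/userAssignedIdentities/(?P<name>[^/]+)$",
    re.IGNORECASE,
)


def uami_resource_id(subscription_id: str, resource_group: str, name: str) -> str:
    """Build the ARM resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{name}"
    )


def parse_uami_resource_id(resource_id: str) -> dict:
    """Split a UAMI resource ID into its parts, or raise ValueError."""
    match = _UAMI_ID.match(resource_id)
    if match is None:
        raise ValueError(f"not a user-assigned managed identity ID: {resource_id!r}")
    return match.groupdict()
```

Validating the ID before you submit a create request can catch a mistyped provider segment early, which matters because some identity bindings can't be changed later.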
14+
15+
## Why user-assigned managed identities
16+
17+
Microsoft Discovery requires user-assigned (not system-assigned) managed identities for the following reasons:
18+
19+
| Reason | Explanation |
20+
|--------|------------|
21+
| **Customer ownership** | You create, manage, and control the lifecycle of the identity in your own subscription. |
22+
| **Shared across resources** | A single UAMI can be reused across a workspace, supercomputer, and storage operations, reducing management overhead. |
23+
| **Pre-provisioned role assignments** | You assign roles before resource creation, so the Discovery service has the permissions it needs from the start. |
24+
| **Immutable binding** | The workspace identity and supercomputer cluster identity are bound at creation time and can't be changed later, ensuring a consistent security posture. The supercomputer's kubelet and workload identities can be updated after creation. |
25+
26+
> [!IMPORTANT]
27+
> The workspace identity and supercomputer cluster identity are **immutable** after creation - you can't change them once provisioned. The supercomputer's kubelet and workload identities can be updated. Plan your identity strategy before creating resources.
28+
29+
## How Discovery uses your identity
30+
31+
When a workspace or supercomputer is created, the Discovery control plane:
32+
33+
1. **Reads your UAMI** - Validates the identity exists and the service can operate on it.
34+
2. **Assigns itself Managed Identity Operator** - The Discovery service principal gets the Managed Identity Operator role on your UAMI so it can use the identity for managed resource operations.
35+
3. **Uses the UAMI at runtime** - Tool runs on the supercomputer use the identity to pull container images and access blob storage. Agents use it to interact with Azure OpenAI and storage.
36+
37+
## Identity slots per resource type
38+
39+
Different Discovery resources use managed identities in different ways.
40+
41+
### Workspace
42+
43+
A workspace requires a single UAMI provided through the `workspaceIdentity` property. The Discovery service uses your UAMI to:
44+
45+
- **Identify your workspace** - the UAMI is the security principal that binds the workspace to your subscription's resources.
46+
- **Read and write data** in your Azure Blob Storage accounts through storage containers.
47+
- **Pull container images** from Azure Container Registry when running tools on a supercomputer.
48+
49+
The Discovery service provisions and operates the resources in the managed resource group (Cosmos DB, AI services, search indexes, Azure OpenAI) using its own service principals, not your UAMI. Your UAMI doesn't need roles on the managed resource group.
50+
51+
### Supercomputer
52+
53+
A supercomputer uses three identity slots, all of which can reference the same UAMI for simplicity, or separate UAMIs for least-privilege:
54+
55+
| Slot | Purpose |
56+
|------|---------|
57+
| **Cluster identity** | Used by the AKS control plane to manage cluster-level resources such as networking and load balancers. |
58+
| **Kubelet identity** | Used at the node level to pull container images from Azure Container Registry and access Azure resources. |
59+
| **Workload identity** | Used as federated credentials by pods running tools and agents on the supercomputer. |
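
As a rough illustration of the three slots, the following sketch models them as an immutable record. The class and method names are hypothetical and don't correspond to the Discovery API surface. The sketch only encodes two points from this article: all slots can share one UAMI, and the kubelet and workload identities can be updated while the cluster identity stays fixed.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SupercomputerIdentities:
    """Hypothetical model of the three identity slots (UAMI resource IDs)."""

    cluster: str
    kubelet: str
    workload: str

    @classmethod
    def shared(cls, uami_id: str) -> "SupercomputerIdentities":
        """Use one UAMI for every slot (simplest setup)."""
        return cls(cluster=uami_id, kubelet=uami_id, workload=uami_id)

    def distinct_identities(self) -> set:
        """Return the set of UAMIs that need role assignments."""
        return {self.cluster, self.kubelet, self.workload}

    def with_updates(
        self, kubelet: Optional[str] = None, workload: Optional[str] = None
    ) -> "SupercomputerIdentities":
        """Return a copy with updated kubelet/workload slots.

        The cluster slot is deliberately not a parameter because it is
        bound at creation time.
        """
        return replace(
            self,
            kubelet=kubelet if kubelet is not None else self.kubelet,
            workload=workload if workload is not None else self.workload,
        )
```

Starting from `SupercomputerIdentities.shared(...)` and later calling `with_updates(kubelet=...)` mirrors moving from a single shared identity toward least privilege without recreating the cluster.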
60+
61+
### Bookshelf
62+
63+
A bookshelf references your UAMI through its `workloadIdentities` property. The Discovery service uses its own service principals to provision and operate the bookshelf managed resource group (AI Search, SQL, AI Services). The service also creates a system-managed identity inside the bookshelf MRG for internal resource-to-resource authentication.
64+
65+
## Required role assignments
66+
67+
You must assign the following built-in roles to your UAMI **before** creating Discovery resources. Assign these at the **resource group** scope.
68+
69+
| Role | Role definition ID | Purpose |
70+
|------|-------------------|---------|
71+
| Microsoft Discovery Platform Contributor (Preview) | `01288891-85ee-45a7-b367-9db3b752fc65` | Manage Discovery resources (workspaces, projects, agents, tools). |
72+
| Storage Blob Data Contributor | `ba92f5b4-2d11-453d-a403-e96b0029c9fe` | Read and write blobs in Azure Storage accounts. |
73+
| AcrPull | `7f951dda-4ed3-4680-a7ca-43fe172d538d` | Pull container images from Azure Container Registry. |
74+
75+
For additional roles needed in specialized scenarios, see [Configure managed identities](how-to-configure-managed-identity.md#additional-roles-for-specific-scenarios).
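
To make the table concrete, the following sketch builds the three role assignment request bodies at resource group scope, in the shape used by the Azure role assignments REST API (`roleDefinitionId`, `principalId`, `principalType`). The function name, the returned dictionary layout, and the deterministic naming scheme are assumptions for illustration. The sketch doesn't send any requests.

```python
import uuid

# Role definition IDs taken from the table above.
CORE_ROLES = {
    "Microsoft Discovery Platform Contributor (Preview)": "01288891-85ee-45a7-b367-9db3b752fc65",
    "Storage Blob Data Contributor": "ba92f5b4-2d11-453d-a403-e96b0029c9fe",
    "AcrPull": "7f951dda-4ed3-4680-a7ca-43fe172d538d",
}


def role_assignment_requests(subscription_id: str, resource_group: str, principal_id: str) -> list:
    """Build one role assignment request per core role.

    principal_id is the object (principal) ID of the UAMI.
    """
    scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    requests = []
    for role_name, role_guid in CORE_ROLES.items():
        # Assumption: derive the assignment name deterministically so that
        # rerunning the script targets the same assignment instead of a new one.
        assignment_name = uuid.uuid5(uuid.NAMESPACE_URL, f"{scope}|{principal_id}|{role_guid}")
        requests.append(
            {
                "role": role_name,
                "path": f"{scope}/providers/Microsoft.Authorization/roleAssignments/{assignment_name}",
                "body": {
                    "properties": {
                        "roleDefinitionId": (
                            f"/subscriptions/{subscription_id}"
                            f"/providers/Microsoft.Authorization/roleDefinitions/{role_guid}"
                        ),
                        "principalId": principal_id,
                        # Managed identities are represented as service principals.
                        "principalType": "ServicePrincipal",
                    }
                },
            }
        )
    return requests
```

Each entry's `path` is what you would send a `PUT` to (with an `api-version` query parameter), and `body` is the JSON payload.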
76+
77+
## End-to-end identity flow across Discovery resources
78+
79+
When you deploy a complete Discovery stack, the platform creates three managed resource groups (workspace MRG, bookshelf MRG, supercomputer MRG), each containing Azure resources managed by the service.
80+
81+
### What the service manages automatically
82+
83+
When you create a workspace, bookshelf, or supercomputer, the Discovery service automatically:
84+
85+
- Creates role assignments on the managed resource group so that the service can provision and operate MRG resources (AI Foundry, Cosmos DB, AI Search, Storage, Key Vault, AKS).
86+
- Assigns **Managed Identity Operator** on your UAMI so the service can use it for MRG deployments.
87+
- Creates a **system-managed identity** inside each workspace and bookshelf MRG for internal resource-to-resource authentication (Container Apps, Foundry, SQL).
88+
89+
You don't need to create or manage any of these identities or role assignments - they're fully lifecycle-managed by the service.
90+
91+
### What you're responsible for
92+
93+
You're responsible for:
94+
95+
- **Creating your UAMI** and assigning the three core roles (Discovery Platform Contributor, Storage Blob Data Contributor, AcrPull) before creating Discovery resources.
96+
- **Providing the UAMI resource ID** when creating a workspace or supercomputer.
97+
98+
99+
### Your UAMI at runtime
100+
101+
Your UAMI is the identity that agents and tools use at runtime:
102+
103+
| Operation | Azure resource accessed | Required role |
104+
|-----------|----------------------|---------------|
105+
| Read/write data in storage containers | Azure Blob Storage | Storage Blob Data Contributor |
106+
| Pull tool container images | Azure Container Registry | AcrPull |
107+
| Manage Discovery resources | Discovery RP | Microsoft Discovery Platform Contributor (Preview) |
108+
| Operate AKS cluster networking | Virtual Network subnets | Network Contributor (supercomputer cluster identity) |
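
The table above can be read as a lookup from operation to required role. The following sketch, under the assumption that you already have the list of role names assigned to the UAMI, reports which runtime operations would be blocked. The operation keys and the function name are illustrative only.

```python
# Operation -> role name, mirroring the table above.
REQUIRED_ROLE = {
    "read_write_storage": "Storage Blob Data Contributor",
    "pull_tool_images": "AcrPull",
    "manage_discovery_resources": "Microsoft Discovery Platform Contributor (Preview)",
    "operate_cluster_networking": "Network Contributor",
}


def blocked_operations(assigned_roles) -> list:
    """Return the operations whose required role isn't in assigned_roles."""
    assigned = set(assigned_roles)
    return sorted(op for op, role in REQUIRED_ROLE.items() if role not in assigned)
```

For example, a UAMI that has only the two data-plane roles would report the Discovery management and cluster networking operations as blocked.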
109+
110+
For the supercomputer, your UAMI is used in three slots:
111+
112+
- **Cluster identity** - AKS control plane uses it to manage load balancers and networking.
113+
- **Kubelet identity** - Node-level agent uses it to pull images from ACR and access Azure resources.
114+
- **Workload identity** - Federated credentials used by pods running tools and agents.
115+
116+
For guidance on choosing between a single shared UAMI and separate identities per function, see [Managed identity best practice recommendations](/entra/identity/managed-identities-azure-resources/managed-identity-best-practice-recommendations).
117+
118+
## Limitations
119+
120+
- The UAMI must be in the **same region** as the Discovery resource that uses it.
121+
- The workspace identity and supercomputer cluster identity can't be changed after resource creation - you must delete and recreate the resource. The supercomputer's kubelet and workload identities can be updated via PATCH.
122+
- Role assignment propagation can take up to 10 minutes. Create role assignments before creating Discovery resources.
123+
- The Discovery service requires **Managed Identity Operator** on your UAMI. If this role assignment fails during resource creation (for example, due to Azure Policy restrictions), provisioning of the resource fails.
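
Because role assignment propagation can take up to 10 minutes, a deployment script might poll before it creates Discovery resources. The following is a minimal sketch of such a wait loop with exponential backoff. The `is_visible` callback is a placeholder you would implement yourself (for example, by attempting a harmless read with the UAMI). The function name, the default delays, and the injectable `sleep` and `clock` parameters are assumptions for illustration.

```python
import time
from typing import Callable


def wait_for_role_assignment(
    is_visible: Callable[[], bool],
    timeout_s: float = 600.0,  # 10 minutes, matching the limit noted above
    initial_delay_s: float = 5.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_visible() until it returns True or the timeout elapses."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        if is_visible():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.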
124+
125+
## Related content
126+
127+
- [Configure managed identities for Microsoft Discovery](how-to-configure-managed-identity.md) - Step-by-step instructions for creating a UAMI and assigning roles.
128+
- [Role assignments in Microsoft Discovery](concept-role-assignments.md) - Built-in Discovery roles and persona-based assignment guidance.
129+
- [Azure Blob Storage in Microsoft Discovery](concept-storage-account.md) - Storage account requirements including identity access.
130+
- [Quickstart: Deploy infrastructure using Azure portal](quickstart-infrastructure-portal.md) - End-to-end setup including UAMI creation.
131+
- [What are managed identities for Azure resources?](/entra/identity/managed-identities-azure-resources/overview) - Azure platform documentation on managed identities.
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
---
2+
title: Storage containers and storage assets in Microsoft Discovery
3+
description: Understand how storage containers and storage assets organize data in Microsoft Discovery workspaces, including the relationship to Azure Blob Storage and how agents use them.
4+
author: umamm
5+
ms.author: umamm
6+
ms.service: azure
7+
ms.topic: concept-article
8+
ms.date: 04/17/2026
9+
---
10+
11+
# Storage containers and storage assets in Microsoft Discovery
12+
13+
Microsoft Discovery uses **storage containers** and **storage assets** to organize data for workspaces. A storage container creates a logical reference to an Azure Blob Storage account (or Azure NetApp Files volume), while storage assets point to specific blob paths within that account. Together, they provide the data layer that agents, tools, and investigations use to read input and write output.
14+
15+
16+
## Key concepts
17+
18+
| Concept | What it is | Azure resource type |
19+
|---------|-----------|-------------------|
20+
| **Storage container** | A workspace child resource that references an Azure Blob Storage account. | `Microsoft.Discovery/storageContainers` |
21+
| **Storage asset** | A child resource of a storage container that references a specific blob path. | `Microsoft.Discovery/storageContainers/storageAssets` |
22+
| **Storage account** | The underlying Azure Blob Storage account that holds the actual data. | `Microsoft.Storage/storageAccounts` |
23+
24+
## Resource hierarchy
25+
26+
Storage containers and storage assets exist in a two-level hierarchy at the resource group level, referenced by workspaces through projects:
27+
28+
:::image type="content" source="media/concept-storage-containers-assets/storage-hierarchy.jpg" alt-text="Diagram showing storage container and storage asset hierarchy within a resource group, with workspace and project references." lightbox="media/concept-storage-containers-assets/storage-hierarchy.jpg":::
29+
30+
A workspace can have multiple storage containers, each pointing to a different storage account. A storage container can have multiple storage assets, each pointing to a different blob path within the same storage account.
31+
32+
## How storage containers work
33+
34+
### Registration, not data movement
35+
36+
Creating a storage container **registers** an existing Azure Blob Storage account with your workspace - it doesn't move or copy data. The actual blobs remain in your storage account under your control. Discovery simply creates a reference so that workspace resources (projects, agents, tools) can find and access the data.
37+
38+
### Supported storage kinds
39+
40+
Discovery supports the following storage kinds:
41+
42+
| Kind | Value | Description |
43+
|------|-------|-------------|
44+
| Azure Blob Storage | `AzureStorageBlob` | Standard Azure Blob Storage account. Hierarchical namespace (Data Lake Storage Gen2) isn't required. |
45+
| Azure NetApp Files | `AzureNetAppFiles` | Azure NetApp Files volume for high-performance file storage. Specify the `netAppVolumeId` instead of `storageAccountId`. |
46+
47+
Azure Blob Storage is the most common choice for general-purpose data ingestion and output. Azure NetApp Files is available for workloads that require high-throughput NFS-based file access.
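
The key difference between the two kinds is which ID property you supply. The following sketch selects the property by kind. The `kind` values and the `storageAccountId` and `netAppVolumeId` property names come from the table above, while the function name and the shape of the returned dictionary are assumptions and might not match the actual request schema.

```python
# Kind value -> name of the property that carries the backend's resource ID.
_ID_PROPERTY_BY_KIND = {
    "AzureStorageBlob": "storageAccountId",
    "AzureNetAppFiles": "netAppVolumeId",
}


def storage_backend_reference(kind: str, backend_resource_id: str) -> dict:
    """Return a kind-specific backend reference for a storage container."""
    try:
        id_property = _ID_PROPERTY_BY_KIND[kind]
    except KeyError:
        supported = ", ".join(sorted(_ID_PROPERTY_BY_KIND))
        raise ValueError(f"unsupported kind {kind!r}; expected one of: {supported}") from None
    return {"kind": kind, id_property: backend_resource_id}
```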
48+
49+
### Authentication
50+
51+
The storage container uses the workspace's **user-assigned managed identity (UAMI)** to authenticate against the storage account. The UAMI must have the **Storage Blob Data Contributor** role on the storage account before you create the storage container. For more information, see [Managed identities in Microsoft Discovery](concept-managed-identities.md).
52+
53+
## How storage assets work
54+
55+
A storage asset represents a specific blob path within the storage account referenced by its parent storage container. Storage assets are the primary way data enters and exits Discovery investigations.
56+
57+
### Input data
58+
59+
When you create a storage asset with a `path` pointing to an existing blob or blob prefix, tools and agents running on the supercomputer can read that data. For example, you might create a storage asset pointing to a blob prefix containing CSV files for an experiment.
60+
61+
### Output data
62+
63+
When agents and tools produce output (research reports, datasets, analysis results), the platform creates storage assets automatically in the investigation's storage container. Each output file gets a storage asset with a unique path. See [Files and storage assets](concept-files-storage-assets.md) for details on how agents write files.
64+
65+
### Addressing
66+
67+
Storage assets are addressed using `discovery://` URIs within the platform:
68+
69+
```
70+
discovery://resources/{storageContainerName}/paths/{blobPath}
71+
```
72+
73+
Agents use built-in tools like **GetResourceContext** and **PreviewResource** to discover and read files through these URIs. The platform translates `discovery://` URIs to the physical blob storage location at runtime.
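
To illustrate the URI layout, the following sketch builds and splits `discovery://` URIs using plain string operations. It assumes that container names don't contain `/` and that no percent-encoding is applied. Neither assumption is confirmed by this article, and the function names are hypothetical.

```python
_PREFIX = "discovery://resources/"
_SEPARATOR = "/paths/"


def build_discovery_uri(storage_container_name: str, blob_path: str) -> str:
    """Compose a discovery:// URI from a container name and blob path."""
    return f"{_PREFIX}{storage_container_name}{_SEPARATOR}{blob_path.lstrip('/')}"


def parse_discovery_uri(uri: str) -> tuple:
    """Split a discovery:// URI into (storage_container_name, blob_path)."""
    if not uri.startswith(_PREFIX):
        raise ValueError(f"not a discovery resource URI: {uri!r}")
    # Split on the first '/paths/' so blob paths may contain that text later on.
    name, separator, blob_path = uri[len(_PREFIX):].partition(_SEPARATOR)
    if not separator or not name or not blob_path:
        raise ValueError(f"malformed discovery resource URI: {uri!r}")
    return name, blob_path
```

For example, `parse_discovery_uri("discovery://resources/raw-data/paths/experiments/run1/input.csv")` yields the container name `raw-data` and the blob path `experiments/run1/input.csv`.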
74+
75+
## How Discovery resources use storage
76+
77+
| Resource | How it uses storage |
78+
|----------|-------------------|
79+
| **Project** | References one or more storage containers as its data sources. Tools and investigations within the project read from and write to these containers. |
80+
| **Tool runs** | Mount storage containers as input and output data volumes. Input mounts provide read access; output mounts collect results. |
81+
| **Agents** | Use built-in file tools (WriteResource, PreviewResource) to create and read storage assets during investigations. |
82+
| **Investigations** | Accumulate storage assets as tasks complete. The root task collects all output file references through upward propagation. |
83+
| **Bookshelf** | Uses storage to ingest documents for knowledge-base indexing. Source data for indexing comes from storage assets. |
84+
85+
## Storage container lifecycle
86+
87+
| Phase | What happens |
88+
|-------|-------------|
89+
| **Create** | You register a storage account by creating a storage container (`PUT`). Discovery validates that the storage account exists and that the UAMI has access. |
90+
| **Use** | Projects, tools, and agents read from and write to the storage account through the registered container. |
91+
| **Update** | Storage container properties (like the storage account reference) are immutable after creation. To change the backing storage account, delete and recreate the container. |
92+
| **Delete** | Deleting a storage container removes the Discovery reference. **The underlying Azure Blob Storage account and its data are not deleted.** Delete child storage assets before deleting the parent container. |
93+
94+
> [!IMPORTANT]
95+
> Deleting a storage container doesn't delete your data. The actual blobs remain in your Azure Storage account. Only the Discovery resource reference is removed.
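
Because child storage assets must be deleted before their parent container, a cleanup script needs to order its delete calls. The following sketch computes that order as relative resource paths derived from the resource types listed under Key concepts. The function name and the path format are illustrative assumptions, and the sketch doesn't issue any requests.

```python
def deletion_order(storage_container_name: str, storage_asset_names) -> list:
    """Return relative resource paths in the order they should be deleted."""
    container_path = f"storageContainers/{storage_container_name}"
    asset_paths = [
        f"{container_path}/storageAssets/{asset_name}"
        for asset_name in storage_asset_names
    ]
    # Children first, parent last. The backing storage account is never listed,
    # because deleting the container doesn't delete the underlying data.
    return asset_paths + [container_path]
```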
96+
97+
## Networking considerations
98+
99+
The storage account backing a storage container must be network-accessible to the Discovery platform:
100+
101+
- The storage account must allow access from the **virtual network subnets** used by the supercomputer, workspace, and agents.
102+
- If you restrict public access on the storage account, configure **virtual network rules** or **private endpoints** so the platform can reach it.
103+
- For CORS requirements (needed for Discovery Studio file browsing), see [Azure Blob Storage in Microsoft Discovery](concept-storage-account.md#cors-configuration).
104+
105+
## Limitations
106+
107+
- Storage container properties (storage store reference, kind) are **immutable** after creation.
108+
- A storage container can reference only one storage backend (storage account or NetApp volume). To access multiple backends, create multiple storage containers.
109+
- Storage assets reference blob paths, not individual blobs. A single path can contain multiple blobs (prefix-based access).
110+
- The workspace UAMI must have the appropriate role (Storage Blob Data Contributor for Azure Blob, or the equivalent for NetApp Files) on the storage backend before the storage container is created.
111+
112+
## Related content
113+
114+
- [Manage storage containers and storage assets](how-to-manage-storage-containers.md) - Step-by-step guide for creating, listing, and deleting storage containers and assets.
115+
- [Azure Blob Storage in Microsoft Discovery](concept-storage-account.md) - Storage account prerequisites including networking, CORS, and identity access.
116+
- [Files and storage assets](concept-files-storage-assets.md) - How agents produce and consume files during investigations.
117+
- [Managed identities in Microsoft Discovery](concept-managed-identities.md) - Identity requirements for storage access.
