Commit 7a7cf93

Merge pull request #307119 from cachai2/NIMs
pamela feedback
2 parents 31cb1c4 + 3ef9635 commit 7a7cf93

1 file changed

Lines changed: 17 additions & 29 deletions

articles/container-apps/serverless-gpu-nim.md

@@ -48,7 +48,8 @@ This tutorial uses a premium instance of Azure Container Registry to improve col
4848
CONTAINERAPPS_ENVIRONMENT="my-environment-name"
4949
GPU_TYPE="Consumption-GPU-NC24-A100"
5050
CONTAINER_APP_NAME="llama3-nim"
51-
CONTAINER_AND_TAG="llama-3.1-8b-instruct:latest"
51+
CONTAINER_AND_TAG="meta/llama-3.1-8b-instruct:latest"
52+
NGC_SECRET="<Your NVIDIA NGC API Key>"
5253
```
5354

5455
[!INCLUDE [container-apps-create-resource-group.md](../../includes/container-apps-create-resource-group.md)]
@@ -67,53 +68,37 @@ This tutorial uses a premium instance of Azure Container Registry to improve col
6768
--sku premium
6869
```
6970
70-
## Pull, tag, and push your image
71+
## Import the NVIDIA NIM image into your Azure Container Registry
7172
72-
Next, pull the image from NVIDIA GPU Cloud and push to Azure Container Registry.
73+
Next, import the image from NVIDIA GPU Cloud to Azure Container Registry.
7374
7475
> [!NOTE]
7576
> NVIDIA NIMs each have their own hardware requirements. Make sure the GPU type you select supports the [NIM](https://build.nvidia.com/models?filters=nimType%3Anim_type_run_anywhere&q=llama) of your choice. The Llama3 NIM used in this tutorial can run on NVIDIA A100 GPUs.
7677
77-
1. Authenticate to the NVIDIA container registry.
78-
79-
```bash
80-
docker login nvcr.io
81-
```
82-
83-
After you run this command, the sign in process prompts you to enter a username. Enter **$oauthtoken** for your user name value.
84-
85-
Then you're prompted for a password. Enter your NVIDIA NGC API key here. Once authenticated to the NVIDIA registry, you can authenticate to the Azure registry.
86-
8778
1. Authenticate to Azure Container Registry.
8879
8980
```bash
9081
az acr login --name $ACR_NAME
9182
```
9283
93-
1. Pull the Llama3 NIM image.
94-
95-
```azurecli
96-
docker pull nvcr.io/nim/meta/$CONTAINER_AND_TAG
97-
```
98-
99-
1. Tag the image.
100-
101-
```azurecli
102-
docker tag nvcr.io/nim/meta/$CONTAINER_AND_TAG $ACR_NAME.azurecr.io/$CONTAINER_AND_TAG
103-
```
104-
10584
1. Push the image to Azure Container Registry.
10685
10786
```azurecli
108-
docker push $ACR_NAME.azurecr.io/$CONTAINER_AND_TAG
87+
az acr import \
88+
--name $ACR_NAME \
89+
--source nvcr.io/nim/$CONTAINER_AND_TAG \
90+
--image $CONTAINER_AND_TAG \
91+
--username '$oauthtoken' \
92+
--password $NGC_SECRET
93+
10994
```
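After the import finishes, it can help to confirm that the image landed in the registry. The following is a minimal sketch, assuming the variables defined earlier in the tutorial are still set in your shell. The `REPOSITORY` and `TAG` variables are introduced here for illustration only and aren't part of the tutorial. The sketch splits `$CONTAINER_AND_TAG` with shell parameter expansion and, if the Azure CLI is available and `$ACR_NAME` is set, lists the tags stored for that repository.

```bash
# Fall back to the tutorial's example value if the variable isn't set in this shell.
CONTAINER_AND_TAG="${CONTAINER_AND_TAG:-meta/llama-3.1-8b-instruct:latest}"

# Hypothetical helper variables: split "repository:tag" at the last colon.
REPOSITORY="${CONTAINER_AND_TAG%:*}"
TAG="${CONTAINER_AND_TAG##*:}"
echo "Expecting repository '$REPOSITORY' with tag '$TAG'"

# List the tags only when the Azure CLI is installed and a registry name is set.
if command -v az >/dev/null 2>&1 && [ -n "${ACR_NAME:-}" ]; then
  az acr repository show-tags \
    --name "$ACR_NAME" \
    --repository "$REPOSITORY" \
    --output table
fi
```

If the tag you imported doesn't appear in the output, the import most likely didn't complete.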
11095
11196
## Enable artifact streaming (recommended but optional)
11297
113-
When your container app runs, it pulls the container from your container registry. When you have larger images like in the case of AI workloads, this image pull may take some time. By enabling artifact streaming, you reduce the time needed, and your container app can take a long time to start if you don't enable artifact streaming. Use the following steps to enable artifact streaming.
98+
When your container app runs, it pulls the container from your container registry. For larger images, such as those used in AI workloads, this image pull can take some time. With artifact streaming enabled, your container app loads the essential parts of the image first, which reduces the time needed to start your container. Use the following steps to enable artifact streaming.
11499
115100
> [!NOTE]
116-
> The following commands can take a few minutes to complete.
101+
> The following commands can take a long time to complete.
117102
118103
1. Enable artifact streaming on your container registry.
119104
@@ -136,7 +121,7 @@ When your container app runs, it pulls the container from your container registr
136121
137122
Next you create a container app with the NVIDIA GPU Cloud API key.
138123
139-
1. Create the container app.
124+
1. Create the container app environment.
140125
141126
```azurecli
142127
az containerapp env create \
@@ -177,6 +162,9 @@ Next you create a container app with the NVIDIA GPU Cloud API key.
177162
178163
This command returns the URL of your container app. Set this value aside in a text editor for use in a following command.
179164
165+
> [!NOTE]
166+
> Some NIMs have longer startup times. To account for this, you can configure a [health probe](./health-probes.md) or set your container app's min-replica count with `--min-replicas 1` to keep a replica running at all times.
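As a rough sketch of the second option in the note, the following updates an existing container app so that one replica stays running. It assumes `$CONTAINER_APP_NAME` from earlier in the tutorial and a `$RESOURCE_GROUP` variable, which isn't shown in this diff. The `MIN_REPLICAS` variable is hypothetical and added only for illustration.

```bash
# Fall back to the tutorial's example app name if the variable isn't set in this shell.
CONTAINER_APP_NAME="${CONTAINER_APP_NAME:-llama3-nim}"

# Hypothetical variable: the minimum number of replicas to keep running.
MIN_REPLICAS=1
echo "Setting minimum replicas for '$CONTAINER_APP_NAME' to $MIN_REPLICAS"

# Run the update only when the Azure CLI is installed and a resource group is set.
# RESOURCE_GROUP is an assumption; it isn't defined in the lines shown here.
if command -v az >/dev/null 2>&1 && [ -n "${RESOURCE_GROUP:-}" ]; then
  az containerapp update \
    --name "$CONTAINER_APP_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --min-replicas "$MIN_REPLICAS"
fi
```

Keeping a replica running avoids the cold start, at the cost of the GPU replica staying allocated while idle.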
167+
180168
## Verify the application works
181169

182170
You can verify a successful deployment by sending a `POST` request to your application.
