Commit 3813ccf

Update troubleshoot-deployment-new-vm-linux.md
1 parent ce89814 commit 3813ccf

1 file changed

Lines changed: 34 additions & 30 deletions

support/azure/virtual-machines/linux/troubleshoot-deployment-new-vm-linux.md

@@ -20,7 +20,7 @@ ms.reviewer: srijangupta, scotro, jarrettr
2020

2121
## Symptoms
2222

23-
A typical provisioning failure scenario occurs after you create a custom image, then deploy a VM from it, you then experience unto 40mins where the VM status is showing `creating`, and you see this error message:
23+
A typical provisioning failure scenario occurs after you create a custom image and then deploy a virtual machine (VM) from that image. When this failure occurs, the VM status is shown as `creating` for up to 40 minutes, and you receive one of the following error messages:
2424

2525
```output
2626
Provisioning state Provisioning failed.
@@ -29,8 +29,6 @@ The VM may still finish provisioning successfully. Please check provisioning sta
2929
Also, make sure the image has been properly prepared (generalized). * Instructions for Windows: https://azure.microsoft.com/documentation/articles/virtual-machines-windows-upload-image/ * Instructions for Linux: https://azure.microsoft.com/documentation/articles/virtual-machines-linux-capture-image/.
3030
```
3131

32-
Or:
33-
3432
```output
3533
Deployment failed. Correlation ID: f9dcb33a-4e6e-45c5-9c9d-b29dd73da2e0. {
3634
"status": "Failed",
@@ -47,25 +45,25 @@ Deployment failed. Correlation ID: f9dcb33a-4e6e-45c5-9c9d-b29dd73da2e0. {
4745
}
4846
```
4947

50-
You then see the VM state marked as `failed`.
48+
When this problem occurs, the VM state is shown as `failed`.
5149

52-
## Why do provisioning failures occur?
50+
## Why provisioning failures occur
5351

54-
Commonly, provisioning failures can happen for multiple reasons, such as:
52+
Commonly, provisioning failures occur for multiple reasons, such as:
5553

56-
- Missing provisioning /incorrectly configured agent
54+
- Missing or incorrectly configured provisioning agent
5755

58-
You will need to ensure an agent is present and is working correctly, you should be using [cloud-init](/azure/virtual-machines/linux/using-cloud-init) or if your image will not support this, you can review these [steps](/azure/virtual-machines/linux/no-agent).
56+
Verify that an agent exists and is working correctly by using [cloud-init](/azure/virtual-machines/linux/using-cloud-init). If your image doesn't support this configuration, review [these steps](/azure/virtual-machines/linux/no-agent).
5957

6058
- Incorrect image configuration
6159

62-
We have guidance on how images should be set up with cloud-init and other [Azure image requirements](/azure/virtual-machines/linux/create-upload-generic), please check this.
60+
For guidance to set up images by using cloud-init, see [Azure image requirements](/azure/virtual-machines/linux/create-upload-generic).
6361

6462
## Troubleshoot provisioning failures
6563

66-
To identify the reason for failed provisioning you will need to start with the serial log, this is available to you by deploying the VM with Azure Boot diagnostics.
64+
To identify the reason for failed provisioning, start by examining the serial log. This log is made available by deploying the VM to use Azure Boot diagnostics.
6765

68-
You will need to deploy a new VM with [boot diagnostics enabled](/cli/azure/vm/boot-diagnostics) for the VM with the failing image to access provisioning events in the serial log.
66+
To access provisioning events in the serial log, you must deploy a new VM from the failing image and have [boot diagnostics enabled](/cli/azure/vm/boot-diagnostics).
6967

7068
```azurecli
7169
# create resource group
@@ -92,17 +90,23 @@ az vm create \
9290
--boot-diagnostics-storage $storageacct
9391
```
9492

95-
To view the serial log, you can go to the Portal, or run the command below to download the *serialConsoleLogBlobUri* log:
93+
To view the serial log, go to the Azure portal, or run the following command to download the *serialConsoleLogBlobUri* log:
9694

9795
```azurecli
9896
az vm boot-diagnostics get-boot-log-uris --name $vmName --resource-group $resourceGroup
9997
```
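
After the serial log is saved locally, the entries that matter for provisioning can be pulled out with a few `grep` calls. The following is a minimal sketch, not part of the original article: the function name `scan_serial_log` and the file name `serial.log` are illustrative assumptions, and the search strings are taken from the log samples shown in this article.

```shell
# Sketch: scan a saved serial log for a few provisioning-related markers.
# scan_serial_log and serial.log are hypothetical names used for illustration.

scan_serial_log() {
  log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi
  # Kernel release and version (appears at the beginning of the log).
  grep -m1 'Linux version' "$log_file" || true
  # Kernel command-line options.
  grep -i -m1 'command line:' "$log_file" || true
  # Provisioning failures reported by the agent, if any.
  grep -i 'Provisioning failed' "$log_file" || true
  return 0
}

# Example usage (assumes the Azure CLI is signed in and the variables are set):
#   az vm boot-diagnostics get-boot-log --name "$vmName" \
#     --resource-group "$resourceGroup" > serial.log
#   scan_serial_log serial.log
```

The function prints only the lines it finds, so an empty result for the last search suggests that the agent did not log a provisioning failure in the serial output.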
10098

10199
## Understanding the serial log for system events and provisioning events
102100

103-
When the VM is created for the first time, cloud-init will start up and try to mount an ISO, establish network connectivity, set the properties passed during the VM creation, mount the ephemeral disk (on supported VM sizes), and signal back to the Azure platform that the initial OS config has completed.
101+
When the VM is created, cloud-init starts up and tries to take the following actions:
102+
103+
- Mount an ISO
104+
- Establish network connectivity
105+
- Set the properties that are passed during VM creation
106+
- Mount the ephemeral disk (on supported VM sizes)
107+
- Notify the Azure platform that the initial OS config is completed
104108

105-
| System Events and Key Information | Serial Log | Notes |
109+
| System events and key information | Serial log | Notes |
106110
|---|---|---|
107111
| Kernel release and kernel version | `[ 0.000000] Linux version 5.4.0-1031-azure (buildd@lcy01-amd64-021) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #32~18.04.1-Ubuntu SMP Tue Oct 6 10:03:22 UTC 2020 (Ubuntu 5.4.0-1031.32~18.04.1-azure 5.4.65)` | Appears at the beginning of the serial log. |
108112
| Kernel command-line options | `[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=UUID=8c0a4742-2f51-40b4-b659-357cfb0bb2a3 ro console=tty1 console=ttyS0 earlyprintk=ttyS0`<br>`[ 0.503399] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1031-azure root=UUID=8c0a4742-2f51-40b4-b659-357cfb0bb2a3 ro console=tty1 console=ttyS0 earlyprintk=ttyS0` | Appears at the beginning of the serial log. Search for `command line:`. |
@@ -146,15 +150,15 @@ When the VM is created for the first time, cloud-init will start up and try to m
146150
"UDF driver Blocklisted 2020/09/11 19:16:40.240016 ERROR Daemon Provisioning failed: [ProtocolError] [CopyOvfEnv] Error mounting dvd: [OSUtilError] Failed to mount dvd deviceInner error: [mount -o ro -t udf,iso9660 /dev/sr0 /mnt/cdrom/secure] returned 32: mount: /mnt/cdrom/secure: wrong fs type, bad option, bad superblock on /dev/sr0, missing codepage or helper program, or other error."
147151
```
148152

149-
**Cause**: The UDF driver is not loaded in the kernel, this is required for the VM to provision, see [image requirements](/azure/virtual-machines/linux/create-upload-generic).
153+
**Cause**: The UDF driver is not loaded in the kernel. Loading is required for the VM to provision. See [image requirements](/azure/virtual-machines/linux/create-upload-generic).
150154

151-
When a VM is first provisioned on Azure, the Azure host presents a 'provisioning cdrom iso disk' to the VM. This provisioning disk is usually presented to the VM through /dev/sr0. Within the provisioning disk, there is a provisioning manifest which contains a VM's provisioning information. The in-VM provisioning agent is expected to mount the provisioning disk, read the provisioning manifest, and provision the VM accordingly.
155+
When a VM is first provisioned on Azure, the Azure host presents a 'provisioning cdrom iso disk' to the VM. This provisioning disk is usually presented to the VM through /dev/sr0. Within the provisioning disk, there is a provisioning manifest that contains a VM's provisioning information. The in-VM provisioning agent is expected to mount the provisioning disk, read the provisioning manifest, and provision the VM accordingly.
152156

153-
Since the provisioning disk is a `cdrom iso disk`, the Linux UDF driver is required by the kernel in order to successfully mount this disk. This is referenced in Microsoft [documentation on Linux images](/azure/virtual-machines/linux/create-upload-generic). For this VM, logs indicate that the provisioning disk failed to mount, which caused VM provisioning to fail. The most likely reason is due to missing or blocked UDF drivers.
157+
Because the provisioning disk is a `cdrom iso disk`, the Linux UDF driver is required by the kernel in order to successfully mount this disk. This is referenced in Microsoft [documentation for Linux images](/azure/virtual-machines/linux/create-upload-generic). For this VM, logs indicate that the provisioning disk didn't mount and VM provisioning failed. The most likely reason is missing or blocked UDF drivers.
154158

155-
**Solution**: Ensure the UDF driver is configured to be loaded in the kernel.
159+
**Solution**: Make sure that the UDF driver is configured to be loaded in the kernel.
156160

157-
A common way for UDF drivers to be blocked is through configs within `/etc/modprobe.d/`. Please work with the customer/image owner to ensure that Linux UDF drivers are present and not blocked. Please consult [this article on blocking/unblocking kernel drivers](https://linux.die.net/man/5/modprobe.d).
161+
A common method for UDF drivers to be blocked is through configurations within `/etc/modprobe.d/`. Work with the image owner to make sure that Linux UDF drivers are present and not blocked. Refer to [this article about blocking and unblocking kernel drivers](https://linux.die.net/man/5/modprobe.d).
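
One way to look for a blocked UDF driver is to search the `modprobe.d` configuration for `blacklist` or `install ... /bin/true` entries. The following is a minimal sketch under stated assumptions: the function name `find_module_blocks` is hypothetical, and it checks only the two most common blocking patterns in a single directory, so a clean result does not prove that the driver can load.

```shell
# Sketch: list modprobe.d lines that blacklist or disable a kernel module.
# find_module_blocks is a hypothetical helper name used for illustration.
# Arguments: module name, optional configuration directory
# (defaults to /etc/modprobe.d).

find_module_blocks() {
  module="$1"
  conf_dir="${2:-/etc/modprobe.d}"
  # Match uncommented "blacklist <module>" and
  # "install <module> /bin/true" or "/bin/false" entries.
  grep -rEs "^[[:space:]]*(blacklist[[:space:]]+${module}|install[[:space:]]+${module}[[:space:]]+/bin/(true|false))([[:space:]]|\$)" "$conf_dir"
}

# Example usage on the image or on a VM built from it:
#   find_module_blocks udf
```

The function returns a nonzero exit status when it finds no matching entry. Running `modinfo udf` and `lsmod | grep udf` on the same system can additionally show whether the module is installed and currently loaded.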
158162

159163
### Unicode characters in VM tags issue
160164

@@ -166,9 +170,9 @@ A common way for UDF drivers to be blocked is through configs within `/etc/modpr
166170
AttributeError: 'module' object has no attribute 'JSONDecodeError'
167171
```
168172

169-
**Cause**: This happens because VM tags have non-ascii characters and the version of cloud-init is older than 20.3.
173+
**Cause**: This problem occurs because VM tags have non-ASCII characters, and the version of cloud-init is earlier than 20.3.
170174

171-
**Solution**: Either use or ensure your image supports cloud-init 20.3 or newer, or remove non-ascii characters from the VM tags.
175+
**Solution**: Make sure that your image uses cloud-init 20.3 or a later version, or remove non-ASCII characters from the VM tags.
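
To check whether an image meets the 20.3 threshold, the installed cloud-init version can be compared against it. The following is a minimal sketch: the function names `parse_cloud_init_version` and `version_ge` are hypothetical, the parser assumes that the output of `cloud-init --version` contains the version as the first dotted number, and the comparison relies on `sort -V` from GNU coreutils.

```shell
# Sketch: compare a cloud-init version string against a minimum version.
# parse_cloud_init_version and version_ge are hypothetical helper names.

# Extracts the first dotted number (for example 20.3) from a version string.
parse_cloud_init_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1
}

# Returns 0 when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Example usage (some older releases print the version to stderr):
#   installed="$(parse_cloud_init_version "$(cloud-init --version 2>&1)")"
#   if version_ge "$installed" 20.3; then
#     echo "cloud-init $installed meets the 20.3 minimum"
#   else
#     echo "cloud-init $installed is earlier than 20.3"
#   fi
```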
172176

173177
### Password with unicode characters
174178

@@ -182,9 +186,9 @@ File "/usr/lib/python2.7/site-packages/cloudinit/sources/DataSourceAzure.py", li
182186
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-11: ordinal not in range(128)
183187
```
184188

185-
**Cause**: This happens because the provided password has unsupported characters (non-ascii).
189+
**Cause**: This problem occurs because the provided password includes unsupported (non-ASCII) characters.
186190

187-
**Solution**: Provide a password that only has ascii characters.
191+
**Solution**: Provide a password that includes only ASCII characters.
188192

189193
### Dhclient permission
190194

@@ -196,18 +200,18 @@ Exit code: -
196200
Reason: [Errno 13] Permission denied: b'/var/tmp/cloud-init/cloud-init-dhcp-yd8mvxud/dhclient'
197201
```
198202

199-
**Cause**: Older versions of cloud-init (before version 20.3) perform DHCP by copying and executing `dhclient` within `/var/tmp`. If `/var/tmp` is mounted as `noexec` (no execution) by the VM, then DHCP will fail due to `dhclient` not having permissions to execute within `/var/tmp`.
203+
**Cause**: Older versions of cloud-init (earlier than version 20.3) perform DHCP by copying and running `dhclient` within `/var/tmp`. If `/var/tmp` is mounted as `noexec` (no execution) by the VM, then DHCP will fail because `dhclient` doesn't have permissions to run within `/var/tmp`.
200204

201-
Cloud-init versions >= 20.3 contain a fix which falls back and executes `dhclient` "as-is" (by not copying and executing it in `/var/tmp` if there are permissions issues).
205+
Cloud-init version 20.3 and later versions contain a fix: if there are permissions issues, cloud-init falls back to running `dhclient` "as-is", without copying it to `/var/tmp`.
202206

203-
**Solution**: For VMs running cloud-init older than version 20.3, configure the VM so that `/var/tmp` is not mounted as `noexec`. Alternatively, upgrade the VM's cloud-init package to a version >= 20.3.
207+
**Solution**: For VMs that run cloud-init earlier than version 20.3, configure the VM so that `/var/tmp` is not mounted as `noexec`. Alternatively, upgrade the VM's cloud-init package to version 20.3 or a later version.
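
Whether `/var/tmp` is affected can be checked by inspecting the mount options of the file system that holds it. The following is a minimal sketch: the function names `has_noexec` and `check_dir_exec` are hypothetical, and the lookup assumes that `findmnt` from util-linux is available on the VM.

```shell
# Sketch: report whether a directory sits on a file system mounted noexec.
# has_noexec and check_dir_exec are hypothetical helper names.

# Returns 0 when a comma-separated mount option string contains noexec.
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Looks up the mount options for the file system that holds the directory.
check_dir_exec() {
  dir="$1"
  opts="$(findmnt -n -o OPTIONS --target "$dir")" || return 2
  if has_noexec "$opts"; then
    echo "$dir: mounted noexec"
  else
    echo "$dir: execution allowed"
  fi
}

# Example usage:
#   check_dir_exec /var/tmp
```

If the directory is reported as `noexec`, the mount options usually come from `/etc/fstab` or a systemd mount unit in the image, which is where the change would need to be made.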
204208

205209
> [!NOTE]
206-
> The `dhclient` permission issue has been resolved in cloud-init 22.4 and later versions. For more information, see [cloud-init issues 3956](https://github.com/canonical/cloud-init/issues/3956).
210+
> The `dhclient` permission issue was resolved in cloud-init 22.4 and later versions. For more information, see [cloud-init issues 3956](https://github.com/canonical/cloud-init/issues/3956).
207211
208212
## Getting more logs
209213

210-
If you find that you need more logs from the VM to understand the issues, you maybe can SSH into the VM using the [serial console](/azure/virtual-machines/troubleshooting/serial-console-linux) using a user that is baked into the image. If you do not have a user baked in, then you can either recreate the image with a user, or use the [AZ VM Repair tool](/cli/azure/vm/repair#az-vm-repair-create) which will mount the OS disk of the VM that failed to provision, to another VM.
214+
If you need more logs from the VM to understand the issues, sign in to the VM through the [serial console](/azure/virtual-machines/troubleshooting/serial-console-linux) as a user that's baked into the image. If you don't have a user baked in, you can either re-create the image to include a user, or use the [AZ VM Repair tool](/cli/azure/vm/repair#az-vm-repair-create) to mount the OS disk of the VM that didn't provision to another VM.
211215

212216
```azurecli
213217
az vm repair create \
@@ -225,15 +229,15 @@ When you have access to the cloud-init logs, review the [cloud-init troubleshoot
225229

226230
## Collect activity logs
227231

228-
To start troubleshooting, collect the activity logs to identify the error associated with the issue. The following links contain detailed information on the process to follow.
232+
To start troubleshooting, collect the activity logs to identify the error that's associated with the issue. The following links contain detailed information about the process to follow.
229233

230234
[View deployment operations](/azure/azure-resource-manager/templates/deployment-history)
231235

232236
[View activity logs to manage Azure resources](/azure/azure-resource-manager/management/view-activity-logs)
233237

234238
## Getting Support
235239

236-
If you have referred to the guidance, and still cannot troubleshoot your issue, you can open a support case. When doing so, please select right product and support topic, doing this will engage the correct support team.
240+
If you referred to the guidance but still can't troubleshoot the problem, contact Microsoft Support. Select the appropriate product and support topic to engage the correct support team.
237241

238242
Selecting the case product:
239243
