|
| 1 | +--- |
| 2 | +title: Troubleshoot non-boot scenarios after enabling Azure Disk Encryption in the OS disk on Linux VMs |
| 3 | +description: Resolve issues when a Linux VM is not booting after enabling Azure Disk Encryption |
| 4 | +author: elicorme |
| 5 | +ms.author: elcorral |
| 6 | +ms.date: 04/01/2025 |
| 7 | +ms.reviewer: divargas |
| 8 | +ms.service: azure-virtual-machines |
| 9 | +ms.custom: linux-related-content |
| 10 | +ms.topic: troubleshooting |
| 11 | +ms.collection: linux |
| 12 | +--- |
| 13 | + |
| 14 | +# How to fix issues related to VMs not booting after enabling Azure Disk Encryption |
| 15 | + |
| 16 | +**Applies to:** :heavy_check_mark: Linux VMs |
| 17 | + |
| 18 | +When deploying Azure Disk Encryption (ADE), various essential settings related to the boot process and system components are modified by editing files. If ADE fails or is interrupted, the virtual machine is likely to get stuck in emergency mode or become unusable, especially when the OS disk is the one being encrypted. |
| 19 | + |
| 20 | +This article lists the most common scenarios in which a VM fails to boot after ADE is deployed and describes how to approach each one. |
| 21 | + |
| 22 | +Remember that in all cases, you should [take a snapshot](https://learn.microsoft.com/azure/virtual-machines/linux/snapshot-copy-managed-disk) and/or create a backup before disks are encrypted. |
| 23 | + |
| 24 | +Backups ensure that a recovery option is possible if an unexpected failure occurs during encryption. For more information about how to back up and restore encrypted VMs, see the [Azure Backup](https://learn.microsoft.com/azure/backup/backup-azure-vms-encryption) article. |
| 25 | + |
| 26 | +## Common issues related to non-boot scenarios on machines using Azure Disk Encryption |
| 27 | + |
| 28 | +For many non-boot scenarios, pay close attention to the extension logs shown in either the serial console or the extension log file, which is normally located at `/var/log/azure/Microsoft.Azure.Security.AzureDiskEncryptionForLinux/extension.log`. |
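As a starting point for reviewing the extension log, the following is a minimal sketch of a helper that prints log lines that look like failures. The function name `ade_log_errors` and the search patterns (`error`, `traceback`, `failed`) are assumptions made for illustration; they are not an official list of ADE messages. The default path is the log location mentioned above.

```bash
# Hypothetical helper: print lines that look like errors in an ADE extension log.
# The search patterns are assumptions, not an official list of ADE messages.
ade_log_errors() {
    log_file="${1:-/var/log/azure/Microsoft.Azure.Security.AzureDiskEncryptionForLinux/extension.log}"
    if [ ! -r "$log_file" ]; then
        echo "Cannot read $log_file" >&2
        return 1
    fi
    # -n prints line numbers, -i ignores case, -E enables alternation.
    # grep exits with a non-zero status when nothing matches.
    grep -n -i -E 'error|traceback|failed' "$log_file"
}
```

On a rescue VM, pass the path of the log inside the mounted disk as the first argument instead of relying on the default.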
| 29 | + |
| 30 | +## <a id="initram-miss"> </a> ADE modules missing in the initramfs image |
| 31 | + |
| 32 | +If the OS disk is using LVM and you see a message like this: |
| 33 | + |
| 34 | + ```bash |
| 35 | + Warning: /dev/mapper/rootvg-rootlv does not exist |
| 36 | + ... |
| 37 | + Entering emergency mode. Exit the shell to continue. |
| 38 | + dracut:/# |
| 39 | + ``` |
| 40 | + |
| 41 | +the required modules were probably not added to the initial RAM disk image. In that case, try one of the following options: |
| 42 | + |
| 43 | +* [Restore from backup](https://learn.microsoft.com/azure/backup/restore-azure-encrypted-virtual-machines) and attempt the encryption again |
| 44 | +* Use either the Azure CLI extension [az vm repair](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#method1) or the [manual method](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#method2) to create a rescue VM, attach and unlock the OS disk of the failed Linux machine to that rescue VM |
| 45 | + * Inside the failed disk, execute the following commands. Replace the kernel and extension versions to match your environment. |
| 46 | + |
| 47 | + RHEL 8,9 |
| 48 | + |
| 49 | + ```bash |
| 50 | + # cp /var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-X.X.X.X/main/oscrypto/91adeOnline /usr/lib/dracut/modules.d/ |
| 51 | + |
| 52 | + # dracut -f -v /boot/initramfs-X.XX.X-XXX.XX.X.x86_64.img <KERNEL VERSION> |
| 53 | + ``` |
| 54 | + |
| 55 | + Ubuntu 20 |
| 56 | + |
| 57 | + > [!NOTE] |
| 58 | + > This procedure could apply to non-boot scenarios after upgrading from Ubuntu 18 to Ubuntu 20. Review the scenario to confirm if it applies. |
| 59 | + |
| 60 | + Copy the following files from the extension configuration directory to the initramfs scripts directory: |
| 61 | + |
| 62 | + ```bash |
| 63 | + # cd /var/lib/waagent/Microsoft.Azure.Security.AzureDiskEncryptionForLinux-X.x.x.xx/main/oscrypto/ubuntu_2004/encryptscripts |
| 64 | + # cp crypt-ade-boot /usr/share/initramfs-tools/scripts/init-premount/ |
| 65 | + # cp crypt-ade-hook /usr/share/initramfs-tools/hooks/ |
| 66 | + ``` |
| 67 | + |
| 68 | + Once the file `crypt-ade-boot` is copied, look up the OS partition identifier under `/dev/disk/by-partuuid/`. It replaces the `ROOTPARTUUID` variable in the `cryptsetup` line shown afterward. |
| 69 | + |
| 70 | + ```bash |
| 71 | + Example: |
| 72 | + # ls -l /dev/disk/by-partuuid/ | grep -w <partition containing the OS> |
| 73 | + lrwxrwxrwx 1 root root 10 May 18 17:33 ef61c3c3-50bb-40f0-8124-4cbe8cb2a380 -> ../../sda1 |
| 74 | + ``` |
| 75 | + |
| 76 | + Replace the `ROOTPARTUUID` variable below with the value obtained in the previous step. Remember to use the UUID from your own environment. |
| 77 | + |
| 78 | + ```bash |
| 79 | + cryptsetup luksOpen /dev/disk/by-partuuid/ROOTPARTUUID osencrypt --header /boot/luks/osluksheader -d /mnt/azure_bek_disk/LinuxPassPhraseFileName |
| 80 | + ``` |
| 81 | + |
| 82 | + Regenerate the initramfs image: |
| 83 | + |
| 84 | + ```bash |
| 85 | + update-initramfs -u -k all |
| 86 | + ``` |
| 87 | + |
| 88 | + An output similar to the one below is expected: |
| 89 | + |
| 90 | + ```bash |
| 91 | + update-initramfs: Generating /boot/initrd.img-5.15.0-1038-azure |
| 92 | + cryptsetup: WARNING: target 'osencrypt' not found in /etc/crypttab |
| 93 | + + PREREQS=udev |
| 94 | + + mount -a |
| 95 | + + cryptsetup luksOpen /dev/disk/by-partuuid/ef61c3c3-50bb-40f0-8124-4cbe8cb2a380 osencrypt --header /boot/luks/osluksheader -d /mnt/azure_bek_disk/LinuxPassPhraseFileName |
| 96 | + Device osencrypt already exists. |
| 97 | + + exit 0 |
| 98 | + ``` |
| 99 | + |
| 100 | + * Swap the failed OS disk with the one containing the fix. |
| 101 | + * Review the extension and console logs to ensure the encryption process finished successfully. |
| 102 | + |
| 103 | +## Interrupted encryption |
| 104 | + |
| 105 | +The troubleshooting steps depend on where the encryption process was interrupted. Keep in mind that in some scenarios the only option is to [restore from backup](https://learn.microsoft.com/azure/backup/restore-azure-encrypted-virtual-machines). |
| 106 | + |
| 107 | +* Review the console logs and look for any error messages. Extension deployment problems normally appear as Python errors. |
| 108 | + |
| 109 | +* Ensure all the [extension pre-requisites](https://learn.microsoft.com/azure/virtual-machines/linux/disk-encryption-overview#additional-vm-requirements) are met. |
| 110 | + |
| 111 | +* If required, work on a rescue VM and analyze the failed disk. For the operating system disk ensure that: |
| 112 | + * The required partitions are in place and the data is healthy. |
| 113 | + * The [operating system LUKS header file](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#identify-the-header-file) is called `osluksheader` and is stored separately under the `/boot` partition. If the disk was encrypted and this file is missing or corrupted, there is no way to recover the virtual machine unless you have a working backup. |
| 114 | + * The initramfs contains the required ADE modules. If the modules are missing, follow the steps in [ADE modules missing in the initramfs image](#initram-miss). |
| 115 | + * The BEK VOLUME contains the [ADE key file](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#identify-the-ade-key-file). |
| 116 | +* If the key file is missing, create a test machine and encrypt it (volume type DATA) with the same encryption settings that were used for the faulty VM. Once it is encrypted, look for the ADE key file in the BEK volume of the test VM. |
| 117 | + 1. Copy the ADE key file |
| 118 | + 2. Start the faulty machine |
| 119 | + 3. While in emergency mode, [identify the ADE key file](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/) and [identify the header file](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#identify-the-header-file). Then, based on the disk layout (LVM or raw), [unlock the disk manually](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair#unlock-by-files). |
| 120 | + 4. Let the machine boot. |
| 121 | + 5. If the ADE key file is still missing, and the BEK volume is mounted, manually create a file called `/mnt/azure_bek_disk/LinuxPassPhraseFileName` with the ADE key file contents. |
| 122 | + 6. Reboot the machine |
| 123 | + 7. Redeploy the machine. |
| 124 | + |
| 125 | +## Not enough space in the boot partition (Ubuntu) |
| 126 | + |
| 127 | +> [!NOTE] |
| 128 | +> Ubuntu 24 images now come with a separate 1-GB `boot` partition. |
| 129 | + |
| 130 | +ADE needs a separate partition for `/boot`. For that reason, during the extension deployment it creates `/boot` as a separate partition and restores the original files to it. At the end of the process, a new initial RAM disk file is created. If there is not enough space, this step fails. This scenario is particularly complex because there are many variants and, for now, [resizing the OS disk](https://learn.microsoft.com/azure/virtual-machines/linux/how-to-resize-encrypted-lvm#scenarios) is not supported when the OS disk is using ADE. |
| 131 | +At the time of writing, only Ubuntu images may go through this boot split process. |
| 132 | + |
| 133 | +To avoid this issue, check the following items before encrypting: |
| 134 | + |
| 135 | +* Delete old kernels not in use. |
| 136 | +* Ensure only the necessary files are under `/boot`. |
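To see how much room is left before encrypting, a small sketch such as the following can report the free space on the file system that holds `/boot`. The function name `boot_free_mb` is hypothetical, and no minimum value is implied here; the source does not state how much space the new initial RAM disk file needs.

```bash
# Hypothetical helper: print the free space, in whole MiB, of the file system
# that contains the given path (defaults to /boot).
boot_free_mb() {
    target="${1:-/boot}"
    # -P forces one POSIX-format line per file system, -k reports 1024-byte blocks.
    # Column 4 is the available space; the first line is the header.
    df -Pk "$target" | awk 'NR == 2 { print int($4 / 1024) }'
}
```

Running `boot_free_mb` before and after removing unused kernels shows how much space was reclaimed.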
| 137 | + |
| 138 | +## VFAT kernel module disabled |
| 139 | + |
| 140 | +The VFAT kernel module is required to mount the BEK volume. If the module is not enabled, the ADE key file is not available, and therefore the disk cannot be unlocked. |
| 141 | + |
| 142 | +Before continuing with the encryption, [enable the VFAT module](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/vfat-disabled-boot-issues#ade-encrypted-vm-is-unable-to-access-root-volume). |
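One way to check whether VFAT has been disabled is to search the modprobe configuration for entries that override or block the module. The sketch below assumes the module was disabled through an `install vfat ...` or `blacklist vfat` line, which is a common hardening pattern but not necessarily the only mechanism. The function name `vfat_disabled_in` is hypothetical.

```bash
# Hypothetical helper: list modprobe configuration lines that disable vfat.
# Assumes the module was disabled with "install vfat ..." or "blacklist vfat".
# Exits with status 0 (and prints the matches) when such a line is found.
vfat_disabled_in() {
    conf_dir="${1:-/etc/modprobe.d}"
    # Commented-out lines are skipped because the pattern is anchored
    # to the first non-blank character of the line.
    grep -r -n -E '^[[:space:]]*(install[[:space:]]+vfat[[:space:]]|blacklist[[:space:]]+vfat)' "$conf_dir"
}
```

On a rescue VM, pass the `modprobe.d` directory of the mounted disk as the argument.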
| 143 | + |
| 144 | +## Problems related to missing packages |
| 145 | + |
| 146 | +The ADE extension will install the required packages in case they are not installed by default. |
| 147 | +If for some reason this installation step fails, the encryption will also fail. |
| 148 | + |
| 149 | +To identify why packages were not installed, review the extension logs from the console. Locate a message like this: |
| 150 | + |
| 151 | +`[Info] Installing pre-requisites` |
| 152 | + |
| 153 | +Then, ensure all the packages were successfully installed. Visit [Package management](https://learn.microsoft.com/azure/virtual-machines/linux/disk-encryption-isolated-network#package-management) for a full list of the required packages based on the Linux distro. |
| 154 | + |
| 155 | +If there are errors related to package installation, identify which package failed and why it failed. |
| 156 | +Ensure the VM has access to the package repositories. Go to [Azure Disk Encryption on an isolated network](https://learn.microsoft.com/azure/virtual-machines/linux/disk-encryption-isolated-network) in case the VM is under special network requirements. |
| 157 | + |
| 158 | +## Missing parameters in the GRUB configuration |
| 159 | + |
| 160 | +During the encryption process, the extension adds two parameters to the kernel options in the file `/etc/default/grub`. They are related to the boot and root partition UUIDs: |
| 161 | + |
| 162 | +`rd.luks.ade.partuuid` and `rd.luks.ade.bootuuid` |
| 163 | + |
| 164 | +These parameters must be present and set to the corresponding UUIDs. If this is not the case, [offline troubleshooting](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair) is required to add the parameters manually. The UUIDs can be obtained in a `chroot` environment by running the command `blkid`. |
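The presence check can be sketched as follows. This is an illustrative helper, not part of the extension: the function name `check_ade_grub_params` is hypothetical, and it only confirms that each parameter name appears in the file. It does not validate the UUID values, which still need to be compared against the `blkid` output.

```bash
# Hypothetical helper: report whether both ADE kernel parameters appear
# in a GRUB defaults file. It does not validate the UUID values.
check_ade_grub_params() {
    grub_file="${1:-/etc/default/grub}"
    missing=0
    for param in rd.luks.ade.partuuid rd.luks.ade.bootuuid; do
        # -F treats the parameter name as a fixed string, not a regex.
        if grep -q -F "$param" "$grub_file"; then
            echo "found: $param"
        else
            echo "missing: $param"
            missing=1
        fi
    done
    # Exit status 1 signals that at least one parameter is absent.
    return "$missing"
}
```

Inside a `chroot` on the rescue VM the default path applies; otherwise, pass the path to the file on the mounted disk.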
| 165 | + |
| 166 | +## Missing or corrupted osluksheader file |
| 167 | + |
| 168 | +LUKS stores its encryption metadata in a special section at the beginning of the encrypted partition called the LUKS header. |
| 169 | +This header contains some critical information such as the cipher and mode, hash function, and key slots. |
| 170 | +The actual encrypting of the partition is done using a master key. |
| 171 | + |
| 172 | +When ADE is used on the OS disk, the header is stored in a file named `osluksheader` under the `/boot` partition. If this file is corrupted or missing, the only way to retrieve it is from a backup. Use the [offline troubleshooting](https://learn.microsoft.com/troubleshoot/azure/virtual-machines/linux/unlock-encrypted-linux-disk-offline-repair) method to mount the `boot` partition of the affected disk and restore the `osluksheader` file from the backup. |
| 173 | + |
| 174 | +## Resources |
| 175 | + |
| 176 | +* [Azure Disk Encryption for Linux VMs](https://learn.microsoft.com/azure/virtual-machines/linux/disk-encryption-overview) |
| 177 | +* [Azure Disk Encryption troubleshooting](https://docs.microsoft.com/azure/virtual-machines/linux/disk-encryption-troubleshooting) |
| 178 | +* [Azure Disk Encryption frequently asked questions](https://docs.microsoft.com/azure/virtual-machines/linux/disk-encryption-faq) |