Commit 5d0b437

adding sudo and general doc updates
Adding the sudo action, reorg doc for clarity and to address comments
1 parent 1399fbf commit 5d0b437

1 file changed

Lines changed: 66 additions & 54 deletions

support/azure/virtual-machines/linux/repair-linux-vm-using-ALAR.md

@@ -3,8 +3,7 @@ title: Repair a Linux VM automatically with the help of ALAR
33
description: This article describes how to automatically repair a non bootable VM with the Azure Linux Auto Repair (ALAR) scripts.
44
services: virtual-machines-linux
55
documentationcenter: ''
6-
author: malachma
7-
manager: noambi
6+
author: pagienge
87
editor: v-jsitser
98
tags: virtual-machines
109
ms.custom: sap:VM Admin - Linux (Guest OS), linux-related-content
@@ -13,8 +12,8 @@ ms.topic: troubleshooting
1312
ms.workload: infrastructure-services
1413
ms.tgt_pltfrm: vm-linux
1514
ms.devlang: azurecli
16-
ms.date: 09/24/2024
17-
ms.author: malachma
15+
ms.date: 10/31/2025
16+
ms.author: pagienge
1817
---
1918

2019
# Use Azure Linux Auto Repair (ALAR) to fix a Linux VM
@@ -27,13 +26,61 @@ ALAR utilizes the VM repair extension that's described in [Repair a Linux VM by
2726

2827
ALAR covers the following repair scenarios:
2928

30-
- Malformed /etc/fstab
31-
syntax error
32-
missing disk
33-
- Damaged initrd or missing initrd line in the /boot/grub/grub.cfg
34-
- Last installed kernel isn't bootable
35-
- Serial console and GRUB serial are incorrectly configured or are missing
36-
- GRUB/EFI installation or configuration damaged
29+
- No-boot scenarios
30+
- Malformed /etc/fstab
31+
- syntax error
32+
- missing disk
33+
- Damaged initrd or missing initrd line in the /boot/grub/grub.cfg
34+
- Last installed kernel isn't bootable
35+
- GRUB/EFI installation or configuration damaged
36+
- disk space / auditd forced shutdowns
37+
- Configuration issues
38+
- Serial console and GRUB serial are incorrectly configured or are missing
39+
- sudo misconfiguration
40+
41+
## How to use ALAR
42+
43+
The ALAR scripts use the `run` command of the [az vm repair](/cli/azure/vm/repair) extension and its `--run-id` option. The value of the `--run-id` option for the automated recovery is `linux-alar2`. To fix a Linux VM by using an ALAR script, follow these steps:
44+
45+
> [!NOTE]
46+
> The VM Contributor role doesn't provide enough permissions to run these scripted operations, as they require permissions to read, write, and delete resources in the resource group that includes the target VM. Therefore, a role such as Contributor or Owner at the resource group level is required.
47+
48+
1. Create a rescue VM:
49+
50+
```azurecli-interactive
51+
az vm repair create --verbose --resource-group <RG-NAME> --name <VM-NAME>
52+
```
53+
54+
- Three parameters currently prompt for values if they aren't given on the command line. Add these parameters and their values to the command for non-interactive execution:
55+
- `--repair-username <RESCUE-USERNAME>`
56+
- `--repair-password <RESCUE-PASS>`
57+
- `--associate-public-ip`
58+
- See the [az vm repair](/cli/azure/vm/repair) documentation for more options that control the creation of the repair VM.
59+
60+
2. Run the `linux-alar2` script, along with parameters for one or more of the ALAR actions on the rescue VM:
61+
62+
```azurecli-interactive
63+
az vm repair run --verbose --resource-group <RG-NAME> --name <VM-NAME> --run-id linux-alar2 --parameters <action1,action2,...> --run-on-repair
64+
```
65+
66+
Valid action names are listed in the "The ALAR actions" section.
67+
68+
3. Swap the copy of the OS disk back to the original VM and delete the temporary resources:
69+
70+
```azurecli-interactive
71+
az vm repair restore --verbose --resource-group <RG-NAME> --name <VM-NAME>
72+
```
73+
74+
> [!NOTE]
75+
> The original and new disks won't be deleted.
76+
77+
The example commands use the following placeholders:
78+
79+
- `RG-NAME`: The name of the resource group containing the broken VM.
80+
- `VM-NAME`: The name of the broken VM.
81+
- `RESCUE-USERNAME`: The user created on the repair VM for login. It's the equivalent of the user created on a new VM in the Azure portal.
82+
- `RESCUE-PASS`: The password for `RESCUE-USERNAME`, enclosed in single quotes. For example: `'password!234'`.
83+
- `action1,action2`, etc.: One or more of the defined actions to apply to the broken VM. See the complete list of actions below and in the [ALAR GitHub README](https://github.com/Azure/ALAR). Multiple actions run consecutively. To pass more than one, separate them with commas and no spaces, such as `fstab,sudo`.
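
As a minimal sketch, the comma-separated action list for `--parameters` could be assembled in Bash before the `linux-alar2` script is run. The `join_actions` helper and the `my-rg` and `my-vm` values are hypothetical placeholders, not part of ALAR or the Azure CLI. The script only prints the command so that it can be reviewed before it's executed.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of ALAR): joins its arguments with commas
# and no spaces, which is the form the --parameters value takes.
join_actions() {
  local IFS=','
  printf '%s' "$*"
}

# Placeholder values; replace them with your own resource group and VM name.
RG_NAME="my-rg"
VM_NAME="my-vm"

# Build the action list from individual action names.
ACTIONS="$(join_actions fstab sudo)"

# Print the command instead of running it, so it can be checked first.
echo "az vm repair run --verbose --resource-group ${RG_NAME} --name ${VM_NAME} --run-id linux-alar2 --parameters ${ACTIONS} --run-on-repair"
```

The printed command contains `--parameters fstab,sudo`. Printing first is a deliberate choice in this sketch: the repair operations change resources in the resource group, so reviewing the exact command before running it is a low-cost safeguard.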
3784
3885
## The ALAR actions
3986
@@ -43,12 +90,6 @@ This action strips off any lines in the */etc/fstab* file that aren't needed to
4390
4491
For more information about issues with a malformed */etc/fstab* file, see [Troubleshoot Linux VM starting issues because fstab errors](./linux-virtual-machine-cannot-start-fstab-errors.md).
4592
46-
### kernel
47-
48-
This action changes the default kernel. The script replaces the broken kernel with the previously installed version.
49-
50-
For more information about messages that might be logged on the serial console for kernel-related startup events, see [How to recover an Azure Linux virtual machine from kernel-related boot issues](kernel-related-boot-issues.md).
51-
5293
### initrd
5394
5495
This action can be used to fix an initrd or initramfs that is either corrupted or incorrectly created.
@@ -64,6 +105,12 @@ In both cases, the following information is logged before the error entries are
64105
65106
![Unpacking failed](media/repair-linux-vm-using-ALAR/unpacking-failed.png)
66107
108+
### kernel
109+
110+
This action changes the default kernel by replacing the default/broken kernel with a previously installed version.
111+
112+
For more information about messages that might be logged on the serial console for kernel-related startup events, see [How to recover an Azure Linux virtual machine from kernel-related boot issues](kernel-related-boot-issues.md).
113+
67114
### serialconsole
68115
69116
This action corrects an incorrect or malformed serial console configuration for the Linux kernel or GRUB. We recommend that you run this action in the following cases:
@@ -81,46 +128,11 @@ This action can be used to reinstall the required software to boot from a GEN2 V
81128
82129
### auditd
83130
84-
If your VM shuts down immediately upon startup due to the audit daemon configuration, use this action. This action modifies the audit daemon configuration (in the */etc/audit/auditd.conf* file) by changing the `HALT` value configured for any `action` parameters to `SYSLOG`, which doesn't force the system to shut down. In a Logical Volume Manager (LVM) environment, if the logical volume that contains the audit logs is full and there's available space in the volume group, the logical volume will also be extended by 10% of the current size. However, if you're not using an LVM environment or there's no available space, only the configuration file is altered.
131+
If your VM shuts down immediately upon startup due to the audit daemon configuration, use this action. This action modifies the audit daemon configuration (in the */etc/audit/auditd.conf* file) by changing the `HALT` value configured for any `action` parameters to `SYSLOG`, which doesn't force the system to shut down. In a Logical Volume Manager (LVM) environment, if the logical volume that contains the audit logs is full and there's available space in the volume group, the logical volume will also be extended by 10% of the current size. However, if you're not using an LVM environment or there's no available space, only the `auditd` configuration file is altered.
85132
86133
> [!IMPORTANT]
87-
> This action will change the VM's security posture by altering the audit daemon configuration so that the VM shutdown issue can be resolved. Once the VM is running and accessible, you need to revert the audit daemon configuration to the original state. For this purpose, a backup of the *auditd.conf* file is created in */etc/audit* by the ALAR action.
88-
89-
## How to use ALAR
90-
91-
The ALAR scripts use the repair extension `run` command and its `--run-id` option. The value of the `--run-id` option for the automated recovery is `linux-alar2`. To fix a Linux VM by using an ALAR script, follow these steps:
92-
93-
> [!NOTE]
94-
> The VM Contributor role doesn't provide enough permissions to run the scripts, as they require permissions to read, write, and delete resources in the resource group that includes the target VM. Therefore roles such as Contributor or Owner at the resource group level is required.
134+
> This action will change the VM's security posture by altering the audit daemon configuration so that the VM shutdown issue can be resolved. Once the VM is running and accessible, you need to evaluate the configuration and potentially revert it to the original state. For this purpose, a backup of the *auditd.conf* file is created in */etc/audit* by the ALAR action.
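
To help with that evaluation, the following Bash sketch lists the `action` settings in an audit daemon configuration file so that values changed to `SYSLOG` are easy to spot. The `list_audit_actions` function is a hypothetical helper, not part of ALAR. It assumes that settings are written as `name = value` and that the relevant parameter names end in `_action`.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of ALAR): prints every "*_action" setting
# found in the auditd configuration file passed as the first argument.
list_audit_actions() {
  grep -Ei '^[[:space:]]*[a-z_]+_action[[:space:]]*=' "$1"
}

# Example usage on the repaired VM (path taken from the text above):
#   list_audit_actions /etc/audit/auditd.conf
```

Comparing this output with the same listing from the backup file in */etc/audit* shows which settings the action altered.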
95135
96-
1. Create a rescue VM:
97-
98-
```azurecli-interactive
99-
az vm repair create --verbose -g RG-NAME -n VM-NAME --repair-username RESCUE-UID --repair-password RESCUE-PASS --copy-disk-name DISK-COPY
100-
```
101-
2. Run a script with one of the ALAR actions on the rescue VM:
102-
103-
```azurecli-interactive
104-
az vm repair run --verbose -g RG-NAME -n VM-NAME --run-id linux-alar2 --parameters ACTION --run-on-repair
105-
```
106-
3. Swap the OS disks and delete the temporary resources:
107-
108-
```azurecli-interactive
109-
az vm repair restore --verbose -g RG-NAME -n VM-NAME
110-
```
111-
112-
> [!NOTE]
113-
> The original and new disks won't be deleted.
114-
115-
Here are explanations for the parameters in the commands above:
116-
117-
- `RG-NAME`: The name of the resource group containing the broken VM.
118-
- `VM-NAME`: The name of the broken VM.
119-
- `RESCUE-UID`: The user created on the repair VM for login. It's the equivalent of the user created on a new VM in the Azure portal.
120-
- `RESCUE-PASS`: The password for `RESCUE-UID`, enclosed in single quotes. For example: `'password!234'`.
121-
- `DISK-COPY`: The name of the OS disk copy that will be created from the broken VM.
122-
- `ACTION`: A scripted task to run, such as `initrd` or `fstab`.
123-
You can pass over single or multiple recovery operations. For multiple operations, delineate them using commas without spaces, such as `fstab,initrd`.
124136
125137
## Limitation
126138
