Commit 32d572e

leitao authored and htejun committed
workqueue: add CONFIG_BOOTPARAM_WQ_STALL_PANIC option
Add a kernel config option to set the default value of workqueue.panic_on_stall, similar to CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and CONFIG_BOOTPARAM_HUNG_TASK_PANIC.

This allows setting the number of workqueue stalls before triggering a kernel panic at build time, which is useful for high-availability systems that need consistent panic-on-stall, in other words, those servers which run with CONFIG_BOOTPARAM_*_PANIC=y already.

The default remains 0 (disabled). Setting it to 1 will panic on the first stall, and higher values will panic after that many stall warnings. The value can still be overridden at runtime via the workqueue.panic_on_stall boot parameter or sysfs.

Signed-off-by: Breno Leitao <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
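The threshold semantics described in the commit message (0 disables, 1 panics on the first stall, N panics after N stall warnings) can be modelled in userspace. The following is a minimal sketch under that reading, not the kernel's implementation; `struct stall_policy` and `stall_policy_record()` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the panic-on-stall threshold. */
struct stall_policy {
	unsigned int panic_on_stall;	/* 0 disables the panic */
	unsigned int stalls_seen;	/* stall warnings counted so far */
};

/*
 * Record one stall warning. Returns true when the model would panic,
 * i.e. when the counted stalls reach the configured threshold.
 */
static bool stall_policy_record(struct stall_policy *p)
{
	if (p->panic_on_stall == 0)
		return false;	/* disabled: never panic, do not count */

	p->stalls_seen++;
	return p->stalls_seen >= p->panic_on_stall;
}
```

With a threshold of 3, the first two calls return false and the third returns true; with a threshold of 0, every call returns false.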
1 parent 51cd2d2 commit 32d572e

3 files changed

Lines changed: 26 additions & 2 deletions

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 2 additions & 1 deletion
@@ -8336,7 +8336,8 @@ Kernel parameters
 			CONFIG_WQ_WATCHDOG. It sets the number times of the
 			stall to trigger panic.
 
-			The default is 0, which disables the panic on stall.
+			The default is set by CONFIG_BOOTPARAM_WQ_STALL_PANIC,
+			which is 0 (disabled) if not configured.
 
 	workqueue.cpu_intensive_thresh_us=
 			Per-cpu work items which run for longer than this

kernel/workqueue.c

Lines changed: 1 addition & 1 deletion
@@ -7568,7 +7568,7 @@ static struct timer_list wq_watchdog_timer;
 static unsigned long wq_watchdog_touched = INITIAL_JIFFIES;
 static DEFINE_PER_CPU(unsigned long, wq_watchdog_touched_cpu) = INITIAL_JIFFIES;
 
-static unsigned int wq_panic_on_stall;
+static unsigned int wq_panic_on_stall = CONFIG_BOOTPARAM_WQ_STALL_PANIC;
 module_param_named(panic_on_stall, wq_panic_on_stall, uint, 0644);
 
 /*
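The one-line change above makes the Kconfig value the compile-time default while leaving the parameter writable. A rough userspace sketch of that layering follows: the `#ifndef` fallback stands in for the macro that Kconfig would normally generate, and `apply_boot_arg()` is a hypothetical helper that imitates a `workqueue.panic_on_stall=N` override. It is not how the kernel parses parameters.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the Kconfig-generated macro (assumption: 0 when unset). */
#ifndef CONFIG_BOOTPARAM_WQ_STALL_PANIC
#define CONFIG_BOOTPARAM_WQ_STALL_PANIC 0
#endif

/* Build-time default, still overridable afterwards. */
static unsigned int wq_panic_on_stall = CONFIG_BOOTPARAM_WQ_STALL_PANIC;

/*
 * Hypothetical helper: apply "workqueue.panic_on_stall=N" if arg matches.
 * Returns 0 on success, -1 if arg is a different option or N is invalid.
 * On failure the current value is left untouched.
 */
static int apply_boot_arg(const char *arg)
{
	static const char prefix[] = "workqueue.panic_on_stall=";
	const size_t len = sizeof(prefix) - 1;
	unsigned long val;
	char *end;

	if (strncmp(arg, prefix, len) != 0)
		return -1;
	if (!isdigit((unsigned char)arg[len]))
		return -1;	/* reject empty, signed, or non-numeric values */

	errno = 0;
	val = strtoul(arg + len, &end, 10);
	if (errno != 0 || *end != '\0' || val > UINT_MAX)
		return -1;

	wq_panic_on_stall = (unsigned int)val;
	return 0;
}
```

Compiling with `-DCONFIG_BOOTPARAM_WQ_STALL_PANIC=1` changes the initial value, and a later call such as `apply_boot_arg("workqueue.panic_on_stall=3")` replaces it, which mirrors the build-time default and runtime override described in the commit message.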

lib/Kconfig.debug

Lines changed: 23 additions & 0 deletions
@@ -1297,6 +1297,29 @@ config WQ_WATCHDOG
 	  state. This can be configured through kernel parameter
 	  "workqueue.watchdog_thresh" and its sysfs counterpart.
 
+config BOOTPARAM_WQ_STALL_PANIC
+	int "Panic on Nth workqueue stall"
+	default 0
+	range 0 100
+	depends on WQ_WATCHDOG
+	help
+	  Set the number of workqueue stalls to trigger a kernel panic.
+	  A workqueue stall occurs when a worker pool doesn't make forward
+	  progress on a pending work item for over 30 seconds (configurable
+	  using the workqueue.watchdog_thresh parameter).
+
+	  If n = 0, the kernel will not panic on stall. If n > 0, the kernel
+	  will panic after n stall warnings.
+
+	  The panic can be used in combination with panic_timeout,
+	  to cause the system to reboot automatically after a
+	  stall has been detected. This feature is useful for
+	  high-availability systems that have uptime guarantees and
+	  where a stall must be resolved ASAP.
+
+	  This setting can be overridden at runtime via the
+	  workqueue.panic_on_stall kernel parameter.
+
 config WQ_CPU_INTENSIVE_REPORT
 	bool "Report per-cpu work items which hog CPU for too long"
 	depends on DEBUG_KERNEL
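Since the parameter is declared with mode 0644, its effective value can be inspected at runtime. The sketch below reads an unsigned integer from a sysfs-style file; `read_uint_param()` is a hypothetical helper, and the path `/sys/module/workqueue/parameters/panic_on_stall` mentioned afterwards is an assumption based on the usual layout for built-in module parameters, not something stated in the commit.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one unsigned integer from a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_uint_param(const char *path, unsigned int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%u", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

On a kernel built with CONFIG_WQ_WATCHDOG, calling `read_uint_param("/sys/module/workqueue/parameters/panic_on_stall", &n)` would be expected to return the build-time default unless the value was overridden on the command line or written through sysfs.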
