sched: disable preemption around blk_flush_plug in sched_submit_work#765

Open
blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1084708=>linus-master

blktests-ci Bot commented Apr 23, 2026

Pull request for series with
subject: sched: disable preemption around blk_flush_plug in sched_submit_work
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1084708

blktests-ci Bot commented Apr 23, 2026

Upstream branch: 6596a02
series: https://patchwork.kernel.org/project/linux-block/list/?series=1084708
version: 1

blktests-ci Bot commented Apr 23, 2026

Upstream branch: 507bd4b
series: https://patchwork.kernel.org/project/linux-block/list/?series=1084708
version: 1

blktests-ci Bot force-pushed the series/1084708=>linus-master branch from ca74bd9 to 2f53b0b on April 23, 2026 17:00
blktests-ci Bot force-pushed the linus-master_base branch from 6a0b974 to 59ca59b on April 24, 2026 00:56

blktests-ci Bot commented Apr 24, 2026

Upstream branch: dd6c438
series: https://patchwork.kernel.org/project/linux-block/list/?series=1084708
version: 1

blktests-ci Bot force-pushed the series/1084708=>linus-master branch from 2f53b0b to 1250dc6 on April 24, 2026 00:58
blktests-ci Bot force-pushed the linus-master_base branch 2 times, most recently from 94f0438 to 857ada9 on April 24, 2026 07:54

blktests-ci Bot commented Apr 24, 2026

Upstream branch: dd6c438
series: https://patchwork.kernel.org/project/linux-block/list/?series=1084708
version: 1

On preemptible kernels, a three-way deadlock can occur involving
blk_mq_freeze_queue() and blk_mq_dispatch_list():

- Task A holds a filesystem lock (e.g., f2fs io_rwsem) and enters
  __bio_queue_enter(), waiting for mq_freeze_depth == 0
- Task B holds mq_freeze_depth=1 (elevator_change) and waits for
  q_usage_counter to reach zero in blk_mq_freeze_queue_wait()
- Task C is going to sleep waiting for the filesystem lock. Before
  sleeping, schedule() calls sched_submit_work() -> blk_flush_plug()
  -> blk_mq_dispatch_list(), which acquires q_usage_counter via
  percpu_ref_get(). If Task C gets preempted before percpu_ref_put(),
  it will not be scheduled back because the task is already in
  uninterruptible sleep state (TASK_UNINTERRUPTIBLE). This means it
  holds the percpu_ref indefinitely, preventing freeze from completing.

This is fundamentally an ABBA deadlock between queue freeze and the
filesystem lock, exposed by preemption creating an artificial hold
on q_usage_counter during the plug flush.
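
The dependency cycle described above can be sketched as a tiny wait-for
graph. This is a userspace illustration only, not kernel code: the
task enum, has_wait_cycle() and deadlock_model() are hypothetical names
introduced here, and each task is assumed to wait on at most one other
task.

```c
#include <stdbool.h>

/* Hypothetical labels for the three tasks in the description. */
enum { TASK_A, TASK_B, TASK_C, NR_TASKS, TASK_NONE = -1 };

/*
 * waits_on[t] names the task that must make progress before t can
 * continue, or TASK_NONE if t is not blocked on another task.
 * Returns true if following the edges from 'start' leads back to it.
 */
static bool has_wait_cycle(const int waits_on[NR_TASKS], int start)
{
    int cur = start;

    for (int steps = 0; steps < NR_TASKS; steps++) {
        cur = waits_on[cur];
        if (cur == TASK_NONE)
            return false;
        if (cur == start)
            return true;
    }
    return false;
}

/*
 * Models the scenario: the cycle only closes when Task C was
 * preempted while still holding its q_usage_counter reference.
 */
static bool deadlock_model(bool c_preempted_holding_ref)
{
    int waits_on[NR_TASKS];

    /* A holds the filesystem lock and waits for mq_freeze_depth == 0. */
    waits_on[TASK_A] = TASK_B;
    /* B waits for q_usage_counter to reach zero. */
    waits_on[TASK_B] = c_preempted_holding_ref ? TASK_C : TASK_NONE;
    /* C sleeps until the filesystem lock held by A is released. */
    waits_on[TASK_C] = TASK_A;

    return has_wait_cycle(waits_on, TASK_A);
}
```

In this model, deadlock_model(true) reports a cycle A -> B -> C -> A,
while deadlock_model(false) does not, because the freeze can complete
once Task C has dropped its reference.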

Fix by disabling preemption around blk_flush_plug() in
sched_submit_work(). The _notrace variants are used since this runs
in scheduler context. preempt_enable_no_resched_notrace() is correct
because we are already on the schedule() path, which is about to call
__schedule() and pick the next task.
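
The actual diff is not shown on this page. As a rough userspace model
of the idea (not the kernel change itself), the following sketch
brackets a simulated plug flush with stand-ins for the preempt
helpers. Everything here is an assumption for illustration: the
functions only borrow the kernel names, and try_preempt(),
blk_flush_plug_model() and sched_submit_work_model() are hypothetical.

```c
#include <stdbool.h>

static int preempt_depth;        /* models the task's preempt count */
static bool ref_held;            /* models a held q_usage_counter ref */
static bool preempted_with_ref;  /* preemption hit while ref was held */

/* Userspace stand-ins that only share names with the kernel helpers. */
static void preempt_disable_notrace(void) { preempt_depth++; }
static void preempt_enable_no_resched_notrace(void) { preempt_depth--; }

/* A preemption request only takes effect when the depth is zero. */
static void try_preempt(void)
{
    if (preempt_depth == 0 && ref_held)
        preempted_with_ref = true;
}

/* Models the window between percpu_ref_get() and percpu_ref_put(). */
static void blk_flush_plug_model(void)
{
    ref_held = true;
    try_preempt();   /* worst case: preemption requested mid-dispatch */
    ref_held = false;
}

/* Returns true if the task was preempted while holding the ref. */
static bool sched_submit_work_model(bool apply_fix)
{
    preempted_with_ref = false;

    if (apply_fix)
        preempt_disable_notrace();
    blk_flush_plug_model();
    if (apply_fix)
        preempt_enable_no_resched_notrace();

    return preempted_with_ref;
}
```

With apply_fix set, the simulated preemption request is ignored for
the whole flush, so the reference is always dropped before the task
can be switched out; without it, the model records a preemption while
the reference is still held.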

Fixes: 73c1010 ("block: initial patch for on-stack per-task plugging")
Reported-by: Michael Wu <[email protected]>
Tested-by: Michael Wu <[email protected]>
Link: https://lore.kernel.org/linux-block/[email protected]/
Signed-off-by: Ming Lei <[email protected]>

blktests-ci Bot force-pushed the series/1084708=>linus-master branch from 1250dc6 to 621292f on April 24, 2026 08:04