
blk-mq: recover from stale cached request in blk_mq_submit_bio#767

Open
blktests-ci[bot] wants to merge 1 commit into linus-master_base from series/1085314=>linus-master

blktests-ci Bot commented Apr 24, 2026

Pull request for series with
subject: blk-mq: recover from stale cached request in blk_mq_submit_bio
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1085314

When submitting a bio to blk-mq, if the task sleeps after peeking a
cached request but before popping it, the plug is flushed, and the flush
calls blk_mq_free_plug_rqs(), freeing the cached_rqs.

The code already warned of this possibility and deliberately popped the
request before other known blocking calls, but it didn't handle a
blocking GFP_NOIO allocation. Allocating the split bio and allocating
the integrity payload are two such cases that can block under high
memory pressure. blk_mq_submit_bio() then continues with the peeked
request that was just freed and re-initialized, so the driver receives
that request with a NULL'ed mq_hctx and inevitably panics.
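
The ordering problem described above can be illustrated with a small
userspace model. This is only a sketch and not the kernel code or the
patch itself: every `model_*` name is hypothetical, "freeing" a request
is modeled by clearing its `hctx` field, and the pointer re-check in
`model_submit_checked()` is one assumed way to detect staleness, not
necessarily the recovery this series implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_rq {
	int hctx;               /* stands in for rq->mq_hctx; 0 means NULL'ed */
	struct model_rq *next;
};

struct model_plug {
	struct model_rq *cached_rqs;    /* head of the per-task request cache */
};

/* Peek: look at the head without removing it from the plug. */
static struct model_rq *model_peek(struct model_plug *plug)
{
	return plug->cached_rqs;
}

/* Pop: detach the head so a later flush can no longer reach it. */
static struct model_rq *model_pop(struct model_plug *plug)
{
	struct model_rq *rq = plug->cached_rqs;

	if (rq) {
		plug->cached_rqs = rq->next;
		rq->next = NULL;
	}
	return rq;
}

/* Models the plug flush on sleep: every cached request is "freed"
 * (here: re-initialized by clearing hctx) and the cache is emptied. */
static void model_flush(struct model_plug *plug)
{
	struct model_rq *rq;

	while ((rq = model_pop(plug)) != NULL)
		rq->hctx = 0;
}

/* Models a GFP_NOIO allocation that may sleep under memory pressure. */
static void model_blocking_alloc(struct model_plug *plug, bool sleeps)
{
	if (sleeps)
		model_flush(plug);
}

/* Buggy ordering: keep using the peeked pointer across the blocking
 * call. Returns the hctx value the driver would end up seeing. */
static int model_submit_buggy(struct model_plug *plug, bool sleeps)
{
	struct model_rq *rq = model_peek(plug);

	model_blocking_alloc(plug, sleeps);
	return rq ? rq->hctx : -1;
}

/* Assumed recovery: after the blocking call, re-check that the peeked
 * request is still the cache head; if not, treat it as stale and
 * return -1 so the caller would allocate a fresh request instead. */
static int model_submit_checked(struct model_plug *plug, bool sleeps)
{
	struct model_rq *rq = model_peek(plug);

	model_blocking_alloc(plug, sleeps);
	if (!rq || model_peek(plug) != rq)
		return -1;
	model_pop(plug);
	return rq->hctx;
}
```

In this model, calling `model_submit_buggy()` with `sleeps` set hands
back a request whose `hctx` has been cleared, which corresponds to the
NULL mq_hctx dereference in the trace below, while
`model_submit_checked()` notices that the cache head changed and falls
back instead of using the stale pointer.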

Relevant kernel messages when this condition is hit, where the "WARNING"
is the harbinger of the panic that follows:

 ------------[ cut here ]------------
 WARNING: CPU: 4 PID: 80820 at block/blk-mq.c:3071 blk_mq_submit_bio+0x2cf/0x5b0
...
 BUG: kernel NULL pointer dereference, address: 0000000000000100
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 6b367b067 P4D 6b367b067 PUD 6bb5eb067 PMD 0
 Oops: Oops: 0000 [#1] SMP
 CPU: 4 UID: 36666 PID: 80820 Comm: IOUringThread0 Kdump: loaded Tainted: G S      W           6.16.1-0_fbk3_0_gd6c130b80483 #1 NONE
 Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
 Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A23 12/08/2020
 RIP: 0010:nvme_queue_rqs+0x93/0x180
 Code: 44 24 48 00 00 00 00 4d 85 f6 74 19 49 8b 44 24 10 4c 3b b0 00 01 00 00 74 0b 4c 89 f7 4c 89 fe e8 c2 62 cd ff 49 8b 44 24 10 <4c> 8b b0 00 01 00 00 49 f7 46 78 01 00 00 00 74 1a 49 8b 3e 8b 8f
 RSP: 0018:ffffc9004a8f34a8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff888c12b92920 RCX: 0000000000000020
 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888c12b92920
 RBP: 0000000000000000 R08: 000000000000000b R09: ffffffffffffffff
 R10: 0000000000092800 R11: 000000000002e000 R12: ffff88828e325c00
 R13: 0000000000000000 R14: 0000000000000000 R15: ffffc9004a8f34a8
 FS:  00007f7cf30e6640(0000) GS:ffff8890fa908000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000100 CR3: 00000006bb387006 CR4: 00000000007726f0
 PKRU: 55555554
 Call Trace:
  <TASK>
  blk_mq_dispatch_queue_requests+0x46/0x120
  blk_mq_flush_plug_list+0x38/0x130
  blk_add_rq_to_plug+0xa2/0x160
  blk_mq_submit_bio+0x3ab/0x5b0
  __submit_bio+0x3a/0x260
  submit_bio_noacct_nocheck+0xc6/0x2b0
  btrfs_submit_bbio+0x14d/0x520
  ? btrfs_get_extent+0x43f/0x640
  submit_extent_folio+0x31f/0x340
  btrfs_do_readpage+0x2d7/0xac0
  btrfs_readahead+0x142/0x200
  ? clear_state_bit+0x520/0x520
  read_pages+0x57/0x200
  ? folio_alloc_noprof+0x10c/0x310
  page_cache_ra_unbounded+0x28c/0x480
  ? asm_sysvec_call_function+0x16/0x20
  ? blk_cgroup_congested+0xa/0x50
  ? page_cache_sync_ra+0x41/0x2d0
  filemap_get_pages+0x347/0xd50
  filemap_read+0xd3/0x500
  ? 0xffffffff81000000
  __io_read+0x111/0x440
  io_read+0x23/0x90
  __io_issue_sqe+0x40/0x120
  io_issue_sqe+0x3f/0x3a0
  io_submit_sqes+0x2bd/0x790
  __se_sys_io_uring_enter+0x100/0xc10
  ? eventfd_read+0x100/0x1f0
  ? futex_wake+0x1b9/0x260
  ? syscall_trace_enter+0x34/0x1d0
  do_syscall_64+0x6a/0x250
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
 RIP: 0033:0x95d621e
 Code: 8b 12 f6 c2 01 75 2a 84 c9 74 22 44 0f b6 d1 41 09 c2 8b bf cc 00 00 00 41 b9 08 00 00 00 b8 aa 01 00 00 31 d2 45 31 c0 0f 05 <48> 89 c6 89 f0 5d c3 83 c8 02 eb d5 cc cc cc cc cc cc 55 48 89 e5
 RSP: 002b:00007f7cf30dc630 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
 RAX: ffffffffffffffda RBX: 00007f7d18907000 RCX: 00000000095d621e
 RDX: 0000000000000000 RSI: 0000000000000006 RDI: 00000000000004cf
 RBP: 00007f7cf30dc630 R08: 0000000000000000 R09: 0000000000000008
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
 R13: 0000000000000001 R14: 00007f7d18907320 R15: 0000000000000009
  </TASK>

Fixes: b0077e2 ("blk-mq: make sure active queue usage is held for bio_integrity_prep()")
Fixes: 7b4f36c ("block: ensure we hold a queue reference when using queue limits")
Signed-off-by: Keith Busch <[email protected]>

blktests-ci Bot commented Apr 24, 2026

Upstream branch: dd6c438
series: https://patchwork.kernel.org/project/linux-block/list/?series=1085314
version: 1
