Commit 42081e3

ChaoShi001 authored and kawasaki committed
nvme: downgrade WARN in nvme_setup_rw to pr_debug
When an NVMe namespace is configured with embedded metadata (flbas bit 4
set, NVME_NS_FLBAS_META_EXT) but no Protection Information (dps=0) and no
NVME_NS_METADATA_SUPPORTED, nvme_setup_rw() fires WARN_ON_ONCE on any
request that reaches it with REQ_INTEGRITY unset.

The WARN was observed repeatedly during NVMe fuzz testing with a
FEMU-based fuzzer that performs semantic mutation of Identify Namespace
responses. The trigger requires three conditions to align:

(a) a namespace transitions through the EXT_LBAS non-PI state
    (head->ms != 0, features & NVME_NS_EXT_LBAS,
    !(features & NVME_NS_METADATA_SUPPORTED)),
(b) nvme_init_integrity() returns false through the early-exit branch at
    core.c:1834 without populating bi->metadata_size, leaving the disk
    without an integrity profile (blk_get_integrity() returns NULL), and
(c) a request that was admitted to the block layer before the namespace
    update reaches nvme_setup_rw() after it.

The admission gap arises in two places. First, the plug-list flush path:
a process with dirty pages queued in a plug before the namespace update
flushes them on file close (blk_finish_plug -> blk_mq_dispatch ->
nvme_setup_rw), bypassing any capacity-zero gate. Second, the cached-rq
path: blk_mq_submit_bio() at blk-mq.c:3155 may find a cached request; if
so, the bio_queue_enter() freeze-serialization guard at
blk-mq.c:3174-3176 is skipped and the bio is dispatched immediately.

In both cases the bio was submitted without REQ_INTEGRITY (because
blk_get_integrity() returned NULL at dispatch time, so
bio_integrity_action() returned 0 and bio_integrity_prep() was not
called), and it reaches nvme_setup_rw() for a namespace where
head->ms != 0. The existing BLK_STS_NOTSUPP return correctly handles this
dispatch; the WARN_ON_ONCE is a false positive.

The WARN was reproduced six times over four days of fuzzing (April 2026).
A representative crash shows the plug-flush path:

  nvme0n1: detected capacity change from 2097152 to 0
  WARNING: drivers/nvme/host/core.c:1042 at nvme_setup_rw+0x768/0xfd0
  PID: 785 (systemd-udevd)
  Call Trace:
   nvme_setup_cmd / nvme_queue_rq / blk_mq_dispatch_rq_list
   blk_mq_flush_plug_list / blk_finish_plug / blkdev_writepages
   sync_blockdev / bdev_release / __fput / sys_close

Replace WARN_ON_ONCE with pr_debug_ratelimited so the condition is logged
at debug level without splat. The BLK_STS_NOTSUPP return is preserved;
I/O to the transitioning namespace is still rejected.

An alternative approach that addresses the root cause at the
integrity-profile level is proposed in patch 2/2: populate
bi->metadata_size for EXT_LBAS non-PI namespaces in nvme_init_integrity()
so that bio_integrity_action() returns non-zero, bio_integrity_prep()
sets REQ_INTEGRITY, and nvme_setup_rw() never reaches this branch. Both
patches are sent as RFC for maintainer guidance on the preferred
direction.

Tested: Compiled on linux-kcov-debug (6.19.0+, KASAN/DEBUG_LIST).
Boot-tested under FEMU with NVME_MALICIOUS_RESPONDER=1
NVME_SEMANTIC_DATA_MUTATOR=1; ran 4 concurrent dd processes plus 500
rescan_controller cycles. No WARN, BUG, or Oops observed.

Found by FuzzNvme (Syzkaller with FEMU fuzzing framework).

Acked-by: Sungwoo Kim <[email protected]>
Acked-by: Dave Tian <[email protected]>
Acked-by: Weidong Zhu <[email protected]>
Signed-off-by: Chao Shi <[email protected]>
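
The three trigger conditions in the message can be read as a single predicate. The following is a minimal userspace sketch of that reading, not kernel code: the `model_` types, the flag values, and the two boolean inputs are hypothetical stand-ins introduced here for illustration, and condition (c) is reduced to a flag saying the request was admitted before the namespace update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the feature bits named in the message.
 * The numeric values are assumptions made for this sketch. */
enum model_ns_features {
	MODEL_NS_EXT_LBAS           = 1u << 0,
	MODEL_NS_METADATA_SUPPORTED = 1u << 1,
};

/* Hypothetical snapshot of the namespace fields the message refers to. */
struct model_ns_head {
	uint16_t ms;       /* metadata bytes per LBA (head->ms) */
	uint8_t pi_type;   /* 0 models "no Protection Information" (dps=0) */
	unsigned features; /* bitmask of model_ns_features */
};

/* Condition (a): extended LBAs carrying metadata, without PI and
 * without metadata support. */
bool model_ext_lbas_non_pi(const struct model_ns_head *h)
{
	return h->ms != 0 &&
	       (h->features & MODEL_NS_EXT_LBAS) &&
	       !(h->features & MODEL_NS_METADATA_SUPPORTED) &&
	       h->pi_type == 0;
}

/* Conditions (a), (b) and (c) together: the namespace state, a disk
 * with no integrity profile, and a request admitted before the update. */
bool model_warn_would_fire(const struct model_ns_head *h,
			   bool disk_has_integrity_profile,
			   bool admitted_before_update)
{
	return model_ext_lbas_non_pi(h) &&
	       !disk_has_integrity_profile &&
	       admitted_before_update;
}

/* Small self-check showing how the predicate is meant to be used. */
void model_selfcheck(void)
{
	struct model_ns_head h = {
		.ms = 8, .pi_type = 0, .features = MODEL_NS_EXT_LBAS,
	};

	assert(model_warn_would_fire(&h, false, true));
	/* With an integrity profile present, the request would carry
	 * REQ_INTEGRITY and the branch is not reached. */
	assert(!model_warn_would_fire(&h, true, true));
}
```

In this model, dropping any one of the three inputs makes the predicate false, which matches the message's claim that all three conditions have to align.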
1 parent 857ada9 commit 42081e3

1 file changed

Lines changed: 5 additions & 1 deletion

drivers/nvme/host/core.c

@@ -1039,8 +1039,12 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		 * namespace capacity to zero to prevent any I/O.
 		 */
 		if (!blk_integrity_rq(req)) {
-			if (WARN_ON_ONCE(!nvme_ns_has_pi(ns->head)))
+			if (!nvme_ns_has_pi(ns->head)) {
+				pr_debug_ratelimited("nvme: %s: metadata (ms=%u) without PI or integrity request, returning NOTSUPP\n",
+						     ns->disk->disk_name,
+						     ns->head->ms);
 				return BLK_STS_NOTSUPP;
+			}
 			control |= NVME_RW_PRINFO_PRACT;
 			nvme_set_ref_tag(ns, cmnd, req);
 		}
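
To make the behavioural change in this hunk easier to check, here is a small userspace sketch of the patched branch. It is a model under stated assumptions, not the driver code: the `model_` names are hypothetical, the booleans stand in for `blk_integrity_rq()` and `nvme_ns_has_pi()`, the PRACT bit value is illustrative, the enclosing `ms != 0` check is taken from the commit message rather than from the lines shown in the hunk, and the `nvme_set_ref_tag()` call is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_status { MODEL_STS_OK, MODEL_STS_NOTSUPP };

/* Illustrative bit standing in for NVME_RW_PRINFO_PRACT. */
#define MODEL_PRINFO_PRACT (1u << 13)

/* Hypothetical inputs to the metadata branch. */
struct model_req {
	uint16_t ms;       /* namespace metadata size (head->ms) */
	bool has_pi;       /* stands in for nvme_ns_has_pi() */
	bool integrity_rq; /* stands in for blk_integrity_rq() */
};

/*
 * Models the patched branch: a request without integrity data on a
 * metadata namespace is rejected with NOTSUPP when the namespace has
 * no PI, and a debug message is counted instead of a WARN splat.
 * With PI, the controller is asked to generate it (PRACT).
 */
enum model_status model_setup_rw_meta(const struct model_req *r,
				      unsigned *control,
				      unsigned *debug_msgs)
{
	if (r->ms != 0 && !r->integrity_rq) {
		if (!r->has_pi) {
			(*debug_msgs)++; /* pr_debug_ratelimited() in the patch */
			return MODEL_STS_NOTSUPP;
		}
		*control |= MODEL_PRINFO_PRACT;
	}
	return MODEL_STS_OK;
}

/* Small self-check: the rejection path logs once and leaves control
 * untouched. */
void model_branch_selfcheck(void)
{
	struct model_req r = { .ms = 8, .has_pi = false, .integrity_rq = false };
	unsigned control = 0, msgs = 0;

	assert(model_setup_rw_meta(&r, &control, &msgs) == MODEL_STS_NOTSUPP);
	assert(msgs == 1 && control == 0);
}
```

The point of the sketch is that the return value on the rejection path is the same before and after the patch; only the reporting side effect changes from a one-time WARN to a rate-limited debug message.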
