block: introduce pi_size field in blk_integrity #6
Closed
blktests-ci[bot] wants to merge 10 commits into for-next_base from
Conversation
* io_uring-6.16:
  MAINTAINERS: remove myself from io_uring
  io_uring/net: only consider msg_inq if larger than 1
  io_uring/zcrx: fix area release on registration failure
  io_uring/zcrx: init id for xa_find

* block-6.16:
  selftests: ublk: cover PER_IO_DAEMON in more stress tests
  Documentation: ublk: document UBLK_F_PER_IO_DAEMON
  selftests: ublk: add stress test for per io daemons
  selftests: ublk: add functional test for per io daemons
  selftests: ublk: kublk: decouple ublk_queues from ublk server threads
  selftests: ublk: kublk: move per-thread data out of ublk_queue
  selftests: ublk: kublk: lift queue initialization out of thread
  selftests: ublk: kublk: tie sqe allocation to io instead of queue
  selftests: ublk: kublk: plumb q_id in io_uring user_data
  ublk: have a per-io daemon instead of a per-queue daemon
  md/md-bitmap: remove parameter slot from bitmap_create()
  md/md-bitmap: cleanup bitmap_ops->startwrite()
  md/dm-raid: remove max_write_behind setting limit
  md/md-bitmap: fix dm-raid max_write_behind setting
  md/raid1,raid10: don't handle IO error for REQ_RAHEAD and REQ_NOWAIT
  loop: add file_start_write() and file_end_write()
  bcache: reserve more RESERVE_BTREE buckets to prevent allocator hang
  bcache: remove unused constants
  bcache: fix NULL pointer in cache_set_flush()

* io_uring-6.16:
  io_uring/kbuf: limit legacy provided buffer lists to USHRT_MAX

* block-6.16:
  block: drop direction param from bio_integrity_copy_user()

* block-6.16:
  selftests: ublk: kublk: improve behavior on init failure
  block: flip iter directions in blk_rq_integrity_map_user()

* io_uring-6.16:
  io_uring/futex: mark wait requests as inflight
  io_uring/futex: get rid of struct io_futex addr union

* block-6.16:
  nvme: spelling fixes
  nvme-tcp: fix I/O stalls on congested sockets
  nvme-tcp: sanitize request list handling
  nvme-tcp: remove tag set when second admin queue config fails
  nvme: enable vectored registered bufs for passthrough cmds
  nvme: fix implicit bool to flags conversion
  nvme: fix command limits status code
Introduce a new pi_size field in struct blk_integrity to explicitly
represent the size (in bytes) of the protection information (PI) tuple.
This is a prep patch.

Signed-off-by: Anuj Gupta <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Add a new ioctl, FS_IOC_GETPICAP, to query protection information (PI)
capabilities. This ioctl returns information about the file's integrity
profile. This is useful for userspace applications to understand a file's
end-to-end data protection support and configure the I/O accordingly.

For now this interface is only supported by block devices. However, the
design and placement of this ioctl in the generic FS ioctl space allows
us to extend it to work over files as well. This may be useful when
filesystems start supporting PI-aware layouts.

A new structure, struct fs_pi_cap, is introduced, which contains the
following fields:

1. fpc_flags: bitmask of capability flags.
2. fpc_interval: the data block interval (in bytes) for which the protection information is generated.
3. fpc_csum_type: type of checksum used.
4. fpc_metadata_size: size (in bytes) of the metadata associated with each interval.
5. fpc_pi_size: size (in bytes) of the PI associated with each interval.
6. fpc_tag_size: size (in bytes) of tag information.
7. pi_offset: offset of the protection information tuple within the metadata.
8. fpc_ref_tag_size: size (in bytes) of the reference tag.
9. fpc_storage_tag_size: size (in bytes) of the storage tag.
10. fpc_rsvd: reserved for future use.

The internal logic to fetch the capability is encapsulated in a helper
function, blk_get_pi_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP if
CONFIG_BLK_DEV_INTEGRITY is not enabled.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Kanchan Joshi <[email protected]>
Author
Upstream branch: 38f4878
a73409f to d74f66f (Compare)
Author
Upstream branch: f4ca523
Author
Github failed to update this PR after force push. Close it.
blktests-ci Bot
pushed a commit
that referenced
this pull request
Jul 23, 2025
…/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.16, take #6

- Fix use of u64_replace_bits() in adjusting the guest's view of
  MDCR_EL2.HPMN.
blktests-ci Bot
pushed a commit
that referenced
this pull request
Aug 2, 2025
perf script tests fail with a segmentation fault as below:
92: perf script tests:
--- start ---
test child forked, pid 103769
DB test
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB /tmp/perf-test-script.7rbftEpOzX/perf.data (9 samples) ]
/usr/libexec/perf-core/tests/shell/script.sh: line 35:
103780 Segmentation fault (core dumped)
perf script -i "${perfdatafile}" -s "${db_test}"
--- Cleaning up ---
---- end(-1) ----
92: perf script tests : FAILED!
Backtrace pointed to:
#0 0x0000000010247dd0 in maps.machine ()
#1 0x00000000101d178c in db_export.sample ()
#2 0x00000000103412c8 in python_process_event ()
#3 0x000000001004eb28 in process_sample_event ()
#4 0x000000001024fcd0 in machines.deliver_event ()
#5 0x000000001025005c in perf_session.deliver_event ()
#6 0x00000000102568b0 in __ordered_events__flush.part.0 ()
#7 0x0000000010251618 in perf_session.process_events ()
#8 0x0000000010053620 in cmd_script ()
#9 0x00000000100b5a28 in run_builtin ()
#10 0x00000000100b5f94 in handle_internal_command ()
#11 0x0000000010011114 in main ()
Further investigation reveals that this occurs in the `perf script tests`,
because it uses the `db_test.py` script. This script sets `perf_db_export_mode = True`.
With `perf_db_export_mode` enabled, if a sample originates from a hypervisor,
perf doesn't set maps for the "[H]" sample. Consequently, `al->maps` remains NULL
when `maps__machine(al->maps)` is called from `db_export__sample`.
Since al->maps can be NULL for hypervisor samples, use thread->maps instead,
because even for a hypervisor sample the machine should exist.
If we don't have a machine for some reason, return -1 to avoid the segmentation fault.
Reported-by: Disha Goel <[email protected]>
Signed-off-by: Aditya Bodkhe <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Tested-by: Disha Goel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Adrian Hunter <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Aug 2, 2025
Without the change `perf` hangs on character devices. On my system
it's enough to run a system-wide sampler for a few seconds to get the
hangup:
$ perf record -a -g --call-graph=dwarf
$ perf report
# hung
`strace` shows that the hangup happens on a read from the character device
`/dev/dri/renderD128`:
$ strace -y -f -p 2780484
strace: Process 2780484 attached
pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached
Its call trace descends into `elfutils`:
$ gdb -p 2780484
(gdb) bt
#0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
at ../sysdeps/unix/sysv/linux/pread64.c:25
#1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
#2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
at util/dso.h:537
#6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
#7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
#8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
best_effort=best_effort@entry=false) at util/thread.h:152
#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
max_stack=127, symbols=true) at util/machine.c:2920
#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
at util/machine.c:2970
#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
file_path=0x105ffbf0 "perf.data") at util/session.c:1419
#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
--Type <RET> for more, q to quit, c to continue without paging--
quit
prog=prog@entry=0x7fff9df81220) at util/session.c:2132
#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
at util/session.c:2181
#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
at perf.c:351
#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556
The hangup happens because nothing in `perf` or `elfutils` checks if a
mapped file is easily readable.
The change conservatively skips all non-regular files.
Signed-off-by: Sergei Trofimovich <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Aug 2, 2025
Symbolize stack traces by creating a live machine. Add this
functionality to dump_stack and switch dump_stack users to use
it. Switch TUI to use it. Add stack traces to the child test function
which can be useful to diagnose blocked code.
Example output:
```
$ perf test -vv PERF_RECORD_
...
7: PERF_RECORD_* events & perf_sample fields:
7: PERF_RECORD_* events & perf_sample fields : Running (1 active)
^C
Signal (2) while running tests.
Terminating tests with the same signal
Internal test harness failure. Completing any started tests:
: 7: PERF_RECORD_* events & perf_sample fields:
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#4 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#5 0x7fc12ff02d68 in __sleep sleep.c:55
#6 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#7 0x55788c620fb0 in run_test_child builtin-test.c:0
#8 0x55788c5bd18d in start_command run-command.c:127
#9 0x55788c621ef3 in __cmd_test builtin-test.c:0
#10 0x55788c6225bf in cmd_test ??:0
#11 0x55788c5afbd0 in run_builtin perf.c:0
#12 0x55788c5afeeb in handle_internal_command perf.c:0
#13 0x55788c52b383 in main ??:0
#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#16 0x55788c52b9d1 in _start ??:0
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45
#3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26
#4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36
#5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0
#6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#10 0x7fc12ff02d68 in __sleep sleep.c:55
#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#12 0x55788c620fb0 in run_test_child builtin-test.c:0
#13 0x55788c5bd18d in start_command run-command.c:127
#14 0x55788c621ef3 in __cmd_test builtin-test.c:0
#15 0x55788c6225bf in cmd_test ??:0
#16 0x55788c5afbd0 in run_builtin perf.c:0
#17 0x55788c5afeeb in handle_internal_command perf.c:0
#18 0x55788c52b383 in main ??:0
#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#21 0x55788c52b9d1 in _start ??:0
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Signed-off-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Aug 2, 2025
Calling perf top with branch filters enabled on Intel CPUs
with branch counters logging (a.k.a. LBR event logging [1]) support
results in a segfault.
$ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
...
Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffafff76c0 (LWP 949003)]
perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
653 *width = env->cpu_pmu_caps ? env->br_cntr_width :
(gdb) bt
#0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
#1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
#2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
#3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
#4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
#5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
#6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
#7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
#8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
#9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
#10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
#11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
#12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
#13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
#14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
The cause is that perf_env__find_br_cntr_info tries to access a
null pointer pmu_caps in the perf_env struct. A similar issue exists
for homogeneous core systems which use the cpu_pmu_caps structure.
Fix this by populating cpu_pmu_caps and pmu_caps structures with
values from sysfs when calling perf top with branch stack sampling
enabled.
[1], LBR event logging introduced here:
https://lore.kernel.org/all/[email protected]/
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Thomas Falcon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Aug 31, 2025
These iterations require the read lock, otherwise RCU lockdep will splat:

=============================
WARNING: suspicious RCU usage
6.17.0-rc3-00014-g31419c045d64 #6 Tainted: G O
-----------------------------
drivers/base/power/main.c:1333 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
5 locks held by rtcwake/547:
 #0: 00000000643ab418 (sb_writers#6){.+.+}-{0:0}, at: file_start_write+0x2b/0x3a
 #1: 0000000067a0ca88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x181/0x24b
 #2: 00000000631eac40 (kn->active#3){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x191/0x24b
 #3: 00000000609a1308 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0xaf/0x30b
 #4: 0000000060c0fdb0 (device_links_srcu){.+.+}-{0:0}, at: device_links_read_lock+0x75/0x98

stack backtrace:
CPU: 0 UID: 0 PID: 547 Comm: rtcwake Tainted: G O 6.17.0-rc3-00014-g31419c045d64 #6 VOLUNTARY
Tainted: [O]=OOT_MODULE
Stack:
 223721b3a80 6089eac6 00000001 00000001
 ffffff00 6089eac6 00000535 6086e528
 721b3ac0 6003c294 00000000 60031fc0
Call Trace:
 [<600407ed>] show_stack+0x10e/0x127
 [<6003c294>] dump_stack_lvl+0x77/0xc6
 [<6003c2fd>] dump_stack+0x1a/0x20
 [<600bc2f8>] lockdep_rcu_suspicious+0x116/0x13e
 [<603d8ea1>] dpm_async_suspend_superior+0x117/0x17e
 [<603d980f>] device_suspend+0x528/0x541
 [<603da24b>] dpm_suspend+0x1a2/0x267
 [<603da837>] dpm_suspend_start+0x5d/0x72
 [<600ca0c9>] suspend_devices_and_enter+0xab/0x736
 [...]

Add the fourth argument to the iteration to annotate this and avoid the splat.

Fixes: 0679963 ("PM: sleep: Make async suspend handle suppliers like parents")
Fixes: ed18738 ("PM: sleep: Make async resume handle consumers like children")
Signed-off-by: Johannes Berg <[email protected]>
Link: https://patch.msgid.link/20250826134348.aba79f6e6299.I9ecf55da46ccf33778f2c018a82e1819d815b348@changeid
Signed-off-by: Rafael J. Wysocki <[email protected]>
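The "fourth argument" is the optional lockdep-condition parameter accepted by the RCU list iterators. A kernel-only sketch of the annotated iteration (illustrative, not the exact hunk from the patch; not buildable outside the kernel tree):

```c
/* The fourth argument to list_for_each_entry_rcu() tells RCU lockdep
 * which lock, other than rcu_read_lock(), makes this traversal safe --
 * here the SRCU read side taken via device_links_read_lock(). */
list_for_each_entry_rcu(link, &dev->links.suppliers, node,
                        device_links_read_lock_held()) {
        /* ... walk the supplier links ... */
}
```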
blktests-ci Bot
pushed a commit
that referenced
this pull request
Sep 16, 2025
Commit 0e2f80a("fs/dax: ensure all pages are idle prior to filesystem unmount") introduced the WARN_ON_ONCE to capture whether the filesystem has removed all DAX entries or not and applied the fix to xfs and ext4. Apply the missed fix on erofs to fix the runtime warning: [ 5.266254] ------------[ cut here ]------------ [ 5.266274] WARNING: CPU: 6 PID: 3109 at mm/truncate.c:89 truncate_folio_batch_exceptionals+0xff/0x260 [ 5.266294] Modules linked in: [ 5.266999] CPU: 6 UID: 0 PID: 3109 Comm: umount Tainted: G S 6.16.0+ #6 PREEMPT(voluntary) [ 5.267012] Tainted: [S]=CPU_OUT_OF_SPEC [ 5.267017] Hardware name: Dell Inc. OptiPlex 5000/05WXFV, BIOS 1.5.1 08/24/2022 [ 5.267024] RIP: 0010:truncate_folio_batch_exceptionals+0xff/0x260 [ 5.267076] Code: 00 00 41 39 df 7f 11 eb 78 83 c3 01 49 83 c4 08 41 39 df 74 6c 48 63 f3 48 83 fe 1f 0f 83 3c 01 00 00 43 f6 44 26 08 01 74 df <0f> 0b 4a 8b 34 22 4c 89 ef 48 89 55 90 e8 ff 54 1f 00 48 8b 55 90 [ 5.267083] RSP: 0018:ffffc900013f36c8 EFLAGS: 00010202 [ 5.267095] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 5.267101] RDX: ffffc900013f3790 RSI: 0000000000000000 RDI: ffff8882a1407898 [ 5.267108] RBP: ffffc900013f3740 R08: 0000000000000000 R09: 0000000000000000 [ 5.267113] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 5.267119] R13: ffff8882a1407ab8 R14: ffffc900013f3888 R15: 0000000000000001 [ 5.267125] FS: 00007aaa8b437800(0000) GS:ffff88850025b000(0000) knlGS:0000000000000000 [ 5.267132] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.267138] CR2: 00007aaa8b3aac10 CR3: 000000024f764000 CR4: 0000000000f52ef0 [ 5.267144] PKRU: 55555554 [ 5.267150] Call Trace: [ 5.267154] <TASK> [ 5.267181] truncate_inode_pages_range+0x118/0x5e0 [ 5.267193] ? 
save_trace+0x54/0x390 [ 5.267296] truncate_inode_pages_final+0x43/0x60 [ 5.267309] evict+0x2a4/0x2c0 [ 5.267339] dispose_list+0x39/0x80 [ 5.267352] evict_inodes+0x150/0x1b0 [ 5.267376] generic_shutdown_super+0x41/0x180 [ 5.267390] kill_block_super+0x1b/0x50 [ 5.267402] erofs_kill_sb+0x81/0x90 [erofs] [ 5.267436] deactivate_locked_super+0x32/0xb0 [ 5.267450] deactivate_super+0x46/0x60 [ 5.267460] cleanup_mnt+0xc3/0x170 [ 5.267475] __cleanup_mnt+0x12/0x20 [ 5.267485] task_work_run+0x5d/0xb0 [ 5.267499] exit_to_user_mode_loop+0x144/0x170 [ 5.267512] do_syscall_64+0x2b9/0x7c0 [ 5.267523] ? __lock_acquire+0x665/0x2ce0 [ 5.267535] ? __lock_acquire+0x665/0x2ce0 [ 5.267560] ? lock_acquire+0xcd/0x300 [ 5.267573] ? find_held_lock+0x31/0x90 [ 5.267582] ? mntput_no_expire+0x97/0x4e0 [ 5.267606] ? mntput_no_expire+0xa1/0x4e0 [ 5.267625] ? mntput+0x24/0x50 [ 5.267634] ? path_put+0x1e/0x30 [ 5.267647] ? do_faccessat+0x120/0x2f0 [ 5.267677] ? do_syscall_64+0x1a2/0x7c0 [ 5.267686] ? from_kgid_munged+0x17/0x30 [ 5.267703] ? from_kuid_munged+0x13/0x30 [ 5.267711] ? __do_sys_getuid+0x3d/0x50 [ 5.267724] ? do_syscall_64+0x1a2/0x7c0 [ 5.267732] ? irqentry_exit+0x77/0xb0 [ 5.267743] ? clear_bhb_loop+0x30/0x80 [ 5.267752] ? 
clear_bhb_loop+0x30/0x80 [ 5.267765] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 5.267772] RIP: 0033:0x7aaa8b32a9fb [ 5.267781] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 e9 83 0d 00 f7 d8 [ 5.267787] RSP: 002b:00007ffd7c4c9468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 5.267796] RAX: 0000000000000000 RBX: 00005a61592a8b00 RCX: 00007aaa8b32a9fb [ 5.267802] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005a61592b2080 [ 5.267806] RBP: 00007ffd7c4c9540 R08: 00007aaa8b403b20 R09: 0000000000000020 [ 5.267812] R10: 0000000000000001 R11: 0000000000000246 R12: 00005a61592a8c00 [ 5.267817] R13: 0000000000000000 R14: 00005a61592b2080 R15: 00005a61592a8f10 [ 5.267849] </TASK> [ 5.267854] irq event stamp: 4721 [ 5.267859] hardirqs last enabled at (4727): [<ffffffff814abf50>] __up_console_sem+0x90/0xa0 [ 5.267873] hardirqs last disabled at (4732): [<ffffffff814abf35>] __up_console_sem+0x75/0xa0 [ 5.267884] softirqs last enabled at (3044): [<ffffffff8132adb3>] kernel_fpu_end+0x53/0x70 [ 5.267895] softirqs last disabled at (3042): [<ffffffff8132b5f4>] kernel_fpu_begin_mask+0xc4/0x120 [ 5.267905] ---[ end trace 0000000000000000 ]--- Fixes: bde708f ("fs/dax: always remove DAX page-cache entries when breaking layouts") Signed-off-by: Yuezhang Mo <[email protected]> Reviewed-by: Friendy Su <[email protected]> Reviewed-by: Daniel Palmer <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Sep 25, 2025
Currently, if CCW request creation fails with -EINVAL, the DASD driver
returns BLK_STS_IOERR to the block layer. This can happen, for example,
when a user-space application such as QEMU passes a misaligned buffer,
but the original cause of the error is masked as a generic I/O error.

This patch changes the behavior so that -EINVAL is returned as
BLK_STS_INVAL, allowing user space to properly detect alignment issues
instead of interpreting them as I/O errors.

Reviewed-by: Stefan Haberland <[email protected]>
Cc: [email protected] #6.11+
Signed-off-by: Jaehoon Kim <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Mar 10, 2026
This leak will cause a hang when tearing down the SCSI host. For example,
iscsid hangs with the following call trace:

[130120.652718] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
PID: 2528 TASK: ffff9d0408974e00 CPU: 3 COMMAND: "iscsid"
 #0 [ffffb5b9c134b9e0] __schedule at ffffffff860657d4
 #1 [ffffb5b9c134ba28] schedule at ffffffff86065c6f
 #2 [ffffb5b9c134ba40] schedule_timeout at ffffffff86069fb0
 #3 [ffffb5b9c134bab0] __wait_for_common at ffffffff8606674f
 #4 [ffffb5b9c134bb10] scsi_remove_host at ffffffff85bfe84b
 #5 [ffffb5b9c134bb30] iscsi_sw_tcp_session_destroy at ffffffffc03031c4 [iscsi_tcp]
 #6 [ffffb5b9c134bb48] iscsi_if_recv_msg at ffffffffc0292692 [scsi_transport_iscsi]
 #7 [ffffb5b9c134bb98] iscsi_if_rx at ffffffffc02929c2 [scsi_transport_iscsi]
 #8 [ffffb5b9c134bbf0] netlink_unicast at ffffffff85e551d6
 #9 [ffffb5b9c134bc38] netlink_sendmsg at ffffffff85e554ef

Fixes: 8fe4ce5 ("scsi: core: Fix a use-after-free")
Cc: [email protected]
Signed-off-by: Junxiao Bi <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Mar 10, 2026
A malicious user program can request large user-memory pinning via ioctl
with a large metadata_len. However, this does not guarantee that all
requested memory will be pinned. Pinning may partially succeed and return
the number of bytes that were actually pinned, which may not match the
requested size. In this case, only the addresses of the pinned pages are
valid.

The current implementation does not handle partial pinning and incorrectly
assumes that all pages in the range [0, nr_vecs) are valid. This can lead
to a null-pointer dereference because pages[n] may refer to an unpinned
memory range.

To fix this, add a check to verify that all requested pages are
successfully pinned. Pinning all pages is required to copy user data.

KASAN splat:
Syzkaller hit 'general protection fault in bio_integrity_map_user' bug.

nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: 2/0/0 default/read/poll queues
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 UID: 0 PID: 280 Comm: syz-executor294 Not tainted 6.11.0-dirty #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:_compound_head home/wukong/fuzznvme/linux/./include/linux/page-flags.h:240 [inline]
RIP: 0010:bvec_from_pages home/wukong/fuzznvme/linux/block/bio-integrity.c:290 [inline]
RIP: 0010:bio_integrity_map_user+0x5a3/0x11e0 home/wukong/fuzznvme/linux/block/bio-integrity.c:345
Code: 4c 89 e0 48 c1 e8 03 80 3c 30 00 0f 85 4b 0a 00 00 48 be 00 00 00 00 00 fc ff df 49 8b 1c 24 48 8d 7b 08 48 89 f8 48 c1 e8 03 <80> 3c 30 00 0f 85 35 0a 00 00 48 8b 43 08 31 ff 49 89 c5 48 89 44
RSP: 0018:ffffc900010cf4f0 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 000000000000f761
RDX: ffff888006cae600 RSI: dffffc0000000000 RDI: 0000000000000008
RBP: ffffc900010cf7d0 R08: ffff888006cae600 R09: ffffed1000e0db95
R10: ffff88800706dcaf R11: ffff888006a31a00 R12: ffff888006a31a08
R13: 0000000000000740 R14: ffff888006a31a00 R15: 0000000000000001
FS: 0000555587e483c0(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002f8c0 CR3: 000000000719a000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 nvme_map_user_request+0x4b6/0x5e0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:149
 nvme_submit_user_cmd+0x2e8/0x3c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:185
 nvme_user_cmd.constprop.0+0x35b/0x540 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:325
 nvme_ns_ioctl+0x11e/0x1c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:570
 nvme_ioctl+0x147/0x1d0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:605
 blkdev_ioctl+0x28c/0x6c0 home/wukong/fuzznvme/linux/block/ioctl.c:676
 vfs_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:51 [inline]
 __do_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:907 [inline]
 __se_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:893 [inline]
 __x64_sys_ioctl+0x1bc/0x230 home/wukong/fuzznvme/linux/fs/ioctl.c:893
 x64_sys_call+0x1209/0x20d0 home/wukong/fuzznvme/linux/./arch/x86/include/generated/asm/syscalls_64.h:17
 do_syscall_x64 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x6f/0x110 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f5b3d8e98bd
Code: c3 e8 a7 1f 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd1c0e988 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000000f4240 RCX: 00007f5b3d8e98bd
RDX: 000000002003f680 RSI: 00000000c0484e43 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007f5b3d93eb4d R09: 00007f5b3d93eb4d
R10: 00007f5b3d93eb4d R11: 0000000000000246 R12: 0000000000000001
R13: 00007fffd1c0ebe8 R14: 00007fffd1c0e9b0 R15: 00007fffd1c0e9a0
 </TASK>
Modules linked in:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#2] PREEMPT SMP KASAN PTI

Fixes: 492c5d4 ("block: bio-integrity: directly map user buffers")
Acked-by: Chao Shi <[email protected]>
Acked-by: Weidong Zhu <[email protected]>
Acked-by: Dave Tian <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
blktests-ci Bot
pushed a commit
that referenced
this pull request
Mar 10, 2026
Quiesce and resume is a mechanism to suspend operations on DASD devices.
In the context of a controlled copy pair swap operation, the quiesce
operation is usually issued before the actual swap and a resume
afterwards.

During the swap operation, the underlying device is exchanged. Therefore,
the quiesce flag must be moved to the secondary device to ensure a
consistent quiesce state after the swap. The secondary device itself
cannot be suspended separately because there is no separate block device
representation for it.

Fixes: 413862c ("s390/dasd: add copy pair swap capability")
Cc: [email protected] #6.1
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
blktests-ci Bot pushed a commit that referenced this pull request on Mar 10, 2026

Quiesce and resume is a mechanism to suspend operations on DASD devices. In the context of a controlled copy pair swap operation, the quiesce operation is usually issued before the actual swap and a resume afterwards. During the swap operation, the underlying device is exchanged. Therefore, the quiesce flag must be moved to the secondary device to ensure a consistent quiesce state after the swap. The secondary device itself cannot be suspended separately because there is no separate block device representation for it.

Fixes: 413862c ("s390/dasd: add copy pair swap capability")
Cc: [email protected] #6.1
Reviewed-by: Jan Hoeppner <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
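The quiesce-state handover described in the commit message can be modeled with a small sketch. This is not the s390 DASD driver code: the struct and the swap helper below are illustrative assumptions about the mechanism, namely that the quiesce flag must follow the swap because the secondary has no block device of its own to suspend separately.

```c
#include <assert.h>

/* Illustrative model only, not the DASD driver API. */
struct dasd_dev {
    int quiesced;    /* 1 = IO suspended on this device */
};

/* Hypothetical swap helper: exchange the underlying devices and carry
 * the quiesce flag over to the secondary, so the pair is in a
 * consistent quiesce state after the swap. */
static void copy_pair_swap(struct dasd_dev *primary, struct dasd_dev *secondary)
{
    secondary->quiesced = primary->quiesced;
    primary->quiesced = 0;
    /* ... actual device exchange elided ... */
}
```

Without the flag transfer, a quiesce issued before the swap would silently be lost afterwards, which is the inconsistency the patch avoids.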
blktests-ci Bot pushed a commit that referenced this pull request on Mar 10, 2026

During online processing for a DASD device, an IO operation is started to determine the format of the device. CDL format contains specifically sized blocks at the beginning of the disk. For a PPRC secondary device no real IO operation is possible, so this request cannot be started and this step is skipped during online processing of secondary devices. This is generally fine, since the secondary is a copy of the primary device.

In the case of an additional partition detection run after a swap operation, however, the format information is needed to properly drive the partition detection IO. Currently the information is not passed, leading to IO errors during partition detection and a wrongly detected partition table, which in turn might lead to data corruption on the disk.

Fix this by passing the format information from the primary to the secondary device.

Fixes: 413862c ("s390/dasd: add copy pair swap capability")
Cc: [email protected] #6.1
Reviewed-by: Jan Hoeppner <[email protected]>
Acked-by: Eduard Shishkin <[email protected]>
Signed-off-by: Stefan Haberland <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
blktests-ci Bot pushed a commit that referenced this pull request on Mar 11, 2026

A malicious user program can request large user-memory pinning via ioctl with a large metadata_len. However, this does not guarantee that all requested memory will be pinned. Pinning may partially succeed and return the number of bytes that were actually pinned, which may not match the requested size. In this case, only the addresses of the pinned pages are valid.

The current implementation does not handle partial pinning and incorrectly assumes that all pages in the range [0, nr_vecs) are valid. This can lead to a null-pointer dereference because pages[n] may refer to an unpinned memory range.

To fix this, add a check to verify that all requested pages are successfully pinned. Pinning all pages is required to copy user data.

KASAN splat: Syzkaller hit 'general protection fault in bio_integrity_map_user' bug.

nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: 2/0/0 default/read/poll queues
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 UID: 0 PID: 280 Comm: syz-executor294 Not tainted 6.11.0-dirty #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:_compound_head home/wukong/fuzznvme/linux/./include/linux/page-flags.h:240 [inline]
RIP: 0010:bvec_from_pages home/wukong/fuzznvme/linux/block/bio-integrity.c:290 [inline]
RIP: 0010:bio_integrity_map_user+0x5a3/0x11e0 home/wukong/fuzznvme/linux/block/bio-integrity.c:345
Code: 4c 89 e0 48 c1 e8 03 80 3c 30 00 0f 85 4b 0a 00 00 48 be 00 00 00 00 00 fc ff df 49 8b 1c 24 48 8d 7b 08 48 89 f8 48 c1 e8 03 <80> 3c 30 00 0f 85 35 0a 00 00 48 8b 43 08 31 ff 49 89 c5 48 89 44
RSP: 0018:ffffc900010cf4f0 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 000000000000f761
RDX: ffff888006cae600 RSI: dffffc0000000000 RDI: 0000000000000008
RBP: ffffc900010cf7d0 R08: ffff888006cae600 R09: ffffed1000e0db95
R10: ffff88800706dcaf R11: ffff888006a31a00 R12: ffff888006a31a08
R13: 0000000000000740 R14: ffff888006a31a00 R15: 0000000000000001
FS:  0000555587e483c0(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002f8c0 CR3: 000000000719a000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 nvme_map_user_request+0x4b6/0x5e0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:149
 nvme_submit_user_cmd+0x2e8/0x3c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:185
 nvme_user_cmd.constprop.0+0x35b/0x540 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:325
 nvme_ns_ioctl+0x11e/0x1c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:570
 nvme_ioctl+0x147/0x1d0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:605
 blkdev_ioctl+0x28c/0x6c0 home/wukong/fuzznvme/linux/block/ioctl.c:676
 vfs_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:51 [inline]
 __do_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:907 [inline]
 __se_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:893 [inline]
 __x64_sys_ioctl+0x1bc/0x230 home/wukong/fuzznvme/linux/fs/ioctl.c:893
 x64_sys_call+0x1209/0x20d0 home/wukong/fuzznvme/linux/./arch/x86/include/generated/asm/syscalls_64.h:17
 do_syscall_x64 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x6f/0x110 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f5b3d8e98bd
Code: c3 e8 a7 1f 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd1c0e988 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000000f4240 RCX: 00007f5b3d8e98bd
RDX: 000000002003f680 RSI: 00000000c0484e43 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007f5b3d93eb4d R09: 00007f5b3d93eb4d
R10: 00007f5b3d93eb4d R11: 0000000000000246 R12: 0000000000000001
R13: 00007fffd1c0ebe8 R14: 00007fffd1c0e9b0 R15: 00007fffd1c0e9a0
 </TASK>
Modules linked in:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#2] PREEMPT SMP KASAN PTI

Fixes: 492c5d4 ("block: bio-integrity: directly map user buffers")
Acked-by: Chao Shi <[email protected]>
Acked-by: Weidong Zhu <[email protected]>
Acked-by: Dave Tian <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
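The partial-pinning hazard can be demonstrated with a small userspace model. This is a sketch, not the kernel's bio_integrity_map_user(): the toy page struct, copy_meta(), and the page-count arguments are illustrative stand-ins for the pinned-pages array and for the all-or-nothing check the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Stand-in for a pinned user page. In the model, unpinned slots in
 * pages[] are NULL, mirroring the bug where pages[n] beyond the pinned
 * count was invalid. */
struct toy_page {
    char data[TOY_PAGE_SIZE];
};

/* Copy metadata out of the pinned pages. The first check models the
 * fix: if fewer pages were pinned than requested, bail out instead of
 * dereferencing invalid entries in pages[0..nr_vecs). */
static int copy_meta(struct toy_page **pages, size_t nr_vecs,
                     size_t nr_pinned, char *dst)
{
    if (nr_pinned < nr_vecs)
        return -1;    /* partial pin: treat as failure */
    for (size_t i = 0; i < nr_vecs; i++)
        memcpy(dst + i * TOY_PAGE_SIZE, pages[i]->data, TOY_PAGE_SIZE);
    return 0;
}
```

Before the check, a request for 3 pages of which only 2 were pinned would dereference the NULL third entry; with it, the partial pin is rejected up front.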
blktests-ci Bot pushed a commit that referenced this pull request on Mar 18, 2026

SMB2_write() places the write payload in iov[1..n] as part of rq_iov. smb3_init_transform_rq() pointer-shares rq_iov, so crypt_message() encrypts iov[1] in place, replacing the original plaintext with ciphertext. On a replayable error, the retry sends the same iov[1], which now contains ciphertext instead of the original data, resulting in corruption.

The corruption is most likely to be observed when connections are unstable, as reconnects trigger write retries that re-send the already-encrypted data. This affects SFU mknod, MF symlinks, etc. On kernels before 6.10 (prior to the netfs conversion), sync writes also used this path and were similarly affected. The async write path is not affected, as it uses rq_iter, which gets deep-copied.

Fix by moving the write payload into rq_iter via iov_iter_kvec(), so smb3_init_transform_rq() deep-copies it before encryption.

Cc: [email protected] #6.3+
Acked-by: Henrique Carvalho <[email protected]>
Acked-by: Shyam Prasad N <[email protected]>
Acked-by: Paulo Alcantara (Red Hat) <[email protected]>
Signed-off-by: Bharath SM <[email protected]>
Signed-off-by: Steve French <[email protected]>
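The retry corruption can be shown with a toy model. This is not the SMB client code: xor_transform() stands in for the real encryption, and the two send paths model pointer-sharing the iov versus the deep copy that routing the payload through rq_iter achieves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for crypt_message(): any in-place transform destroys the
 * plaintext it is given. */
static void xor_transform(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= 0x5a;
}

/* Buggy path: the transform runs directly on the caller's buffer
 * (pointer-shared iov), so a retry would re-send ciphertext. */
static void send_shared(char *payload, size_t len)
{
    xor_transform(payload, len);
    /* ... transmit payload ... */
}

/* Fixed path: encrypt a private copy (what the deep copy via rq_iter
 * provides), leaving the caller's plaintext intact for retries. */
static void send_copied(const char *payload, size_t len, char *scratch)
{
    memcpy(scratch, payload, len);
    xor_transform(scratch, len);
    /* ... transmit scratch ... */
}
```

After send_shared() the original buffer holds ciphertext, so retrying it writes garbage; after send_copied() the original plaintext is still available for a correct retry.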
RBX: 00000000000f4240 RCX: 00007f5b3d8e98bd RDX: 000000002003f680 RSI: 00000000c0484e43 RDI: 0000000000000003 RBP: 0000000000000000 R08: 00007f5b3d93eb4d R09: 00007f5b3d93eb4d R10: 00007f5b3d93eb4d R11: 0000000000000246 R12: 0000000000000001 R13: 00007fffd1c0ebe8 R14: 00007fffd1c0e9b0 R15: 00007fffd1c0e9a0 </TASK> Modules linked in: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#2] PREEMPT SMP KASAN PTI Fixes: 492c5d4 (block: bio-integrity: directly map user buffers) Acked-by: Chao Shi <[email protected]> Acked-by: Weidong Zhu <[email protected]> Acked-by: Dave Tian <[email protected]> Signed-off-by: Sungwoo Kim <[email protected]>
blktests-ci Bot pushed a commit that referenced this pull request on Mar 18, 2026
blktests-ci Bot pushed a commit that referenced this pull request on Mar 22, 2026
blktests-ci Bot pushed a commit that referenced this pull request on Mar 23, 2026
blktests-ci Bot pushed a commit that referenced this pull request on Mar 24, 2026
blktests-ci Bot pushed a commit that referenced this pull request on Mar 25, 2026
As reported by syzbot [0], NBD can trigger a deadlock during
memory reclaim.
This occurs when a process holds lock_sock() on a backend TCP
socket and triggers a memory allocation that leads to fs reclaim.
If it eventually calls into NBD to send data or shut down the
socket, NBD will attempt to acquire the same lock_sock(),
resulting in the deadlock.
While NBD sets sk->sk_allocation to GFP_NOIO before calling
sendmsg(), this does not prevent the issue in some paths where
GFP_KERNEL is used directly under lock_sock().
To resolve this, let's use lock_sock_try() for TCP sendmsg() and
shutdown().
For sock_sendmsg(), if lock_sock_try() fails, -ERESTARTSYS is
returned, allowing the request to be retried later (e.g., via
was_interrupted() logic).
For the sock_sendmsg() call used for NBD_CMD_DISC, and for
kernel_sock_shutdown(), the operation may be skipped if the lock
cannot be acquired.
However, this is not expected to occur in practice because the
backend TCP socket should not be touched by userspace once it is
handed over to NBD.
Note that sock_recvmsg() does not require this special handling
because it is only called from the workqueue context.
Also note that AF_UNIX sockets continue to use sock_sendmsg()
and kernel_sock_shutdown() because unix_stream_sendmsg() and
unix_shutdown() do not acquire lock_sock().
[0]:
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G L
syz.7.2282/12353 is trying to acquire lock:
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: might_alloc include/linux/sched/mm.h:317 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_pre_alloc_hook mm/slub.c:4489 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_alloc_node mm/slub.c:4843 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
but task is already holding lock:
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_close+0x1d/0x110 net/ipv4/tcp.c:3349
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (sk_lock-AF_INET6){+.+.}-{0:0}:
lock_sock_nested+0x41/0xf0 net/core/sock.c:3780
lock_sock include/net/sock.h:1709 [inline]
inet_shutdown+0x67/0x410 net/ipv4/af_inet.c:919
nbd_mark_nsock_dead+0xae/0x5c0 drivers/block/nbd.c:318
sock_shutdown+0x16b/0x200 drivers/block/nbd.c:411
nbd_clear_sock drivers/block/nbd.c:1427 [inline]
nbd_config_put+0x1eb/0x750 drivers/block/nbd.c:1451
nbd_genl_connect+0xaf8/0x1a40 drivers/block/nbd.c:2248
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #5 (&nsock->tx_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_handle_cmd drivers/block/nbd.c:1143 [inline]
nbd_queue_rq+0x428/0x1080 drivers/block/nbd.c:1207
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #4 (&cmd->lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_queue_rq+0xba/0x1080 drivers/block/nbd.c:1199
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #3 (set->srcu){.+.+}-{0:0}:
srcu_lock_sync include/linux/srcu.h:199 [inline]
__synchronize_srcu+0xa1/0x2a0 kernel/rcu/srcutree.c:1505
blk_mq_wait_quiesce_done block/blk-mq.c:284 [inline]
blk_mq_wait_quiesce_done block/blk-mq.c:281 [inline]
blk_mq_quiesce_queue block/blk-mq.c:304 [inline]
blk_mq_quiesce_queue+0x149/0x1c0 block/blk-mq.c:299
elevator_switch+0x17b/0x7e0 block/elevator.c:576
elevator_change+0x352/0x530 block/elevator.c:681
elevator_set_default+0x29e/0x360 block/elevator.c:754
blk_register_queue+0x412/0x590 block/blk-sysfs.c:946
__add_disk+0x73f/0xe40 block/genhd.c:528
add_disk_fwnode+0x118/0x5c0 block/genhd.c:597
add_disk include/linux/blkdev.h:785 [inline]
nbd_dev_add+0x77a/0xb10 drivers/block/nbd.c:1984
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #2 (&q->elevator_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
elevator_change+0x1bc/0x530 block/elevator.c:679
elevator_set_none+0x92/0xf0 block/elevator.c:769
blk_mq_elv_switch_none block/blk-mq.c:5110 [inline]
__blk_mq_update_nr_hw_queues block/blk-mq.c:5155 [inline]
blk_mq_update_nr_hw_queues+0x4c1/0x15f0 block/blk-mq.c:5220
nbd_start_device+0x1a6/0xbd0 drivers/block/nbd.c:1489
nbd_genl_connect+0xff2/0x1a40 drivers/block/nbd.c:2239
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #1 (&q->q_usage_counter(io)#49){++++}-{0:0}:
blk_alloc_queue+0x610/0x790 block/blk-core.c:461
blk_mq_alloc_queue+0x174/0x290 block/blk-mq.c:4429
__blk_mq_alloc_disk+0x29/0x120 block/blk-mq.c:4476
nbd_dev_add+0x492/0xb10 drivers/block/nbd.c:1954
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #0 (fs_reclaim){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain kernel/locking/lockdep.c:3908 [inline]
__lock_acquire+0x14b8/0x2630 kernel/locking/lockdep.c:5237
lock_acquire kernel/locking/lockdep.c:5868 [inline]
lock_acquire+0x1cf/0x380 kernel/locking/lockdep.c:5825
__fs_reclaim_acquire mm/page_alloc.c:4348 [inline]
fs_reclaim_acquire+0xc4/0x100 mm/page_alloc.c:4362
might_alloc include/linux/sched/mm.h:317 [inline]
slab_pre_alloc_hook mm/slub.c:4489 [inline]
slab_alloc_node mm/slub.c:4843 [inline]
kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
__alloc_skb+0x140/0x710 net/core/skbuff.c:702
alloc_skb include/linux/skbuff.h:1383 [inline]
tcp_send_active_reset+0x8b/0xa60 net/ipv4/tcp_output.c:3862
__tcp_close+0x41e/0x1110 net/ipv4/tcp.c:3223
tcp_close+0x28/0x110 net/ipv4/tcp.c:3350
inet_release+0xed/0x200 net/ipv4/af_inet.c:443
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:479
__sock_release+0xb3/0x260 net/socket.c:662
sock_close+0x1c/0x30 net/socket.c:1455
__fput+0x3ff/0xb40 fs/file_table.c:469
task_work_run+0x150/0x240 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
exit_to_user_mode_loop+0x100/0x4a0 kernel/entry/common.c:98
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
do_syscall_64+0x67c/0xf80 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
other info that might help us debug this:
Chain exists of:
fs_reclaim --> &nsock->tx_lock --> sk_lock-AF_INET6
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_INET6);
lock(&nsock->tx_lock);
lock(sk_lock-AF_INET6);
lock(fs_reclaim);
*** DEADLOCK ***
Fixes: fd8383f ("nbd: convert to blkmq")
Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]>
blktests-ci Bot pushed a commit that referenced this pull request on Mar 25, 2026
As reported by syzbot [0], NBD can trigger a deadlock during
memory reclaim.
This occurs when a process holds lock_sock() on a backend TCP
socket and triggers a memory allocation that leads to fs reclaim.
If it eventually calls into NBD to send data or shut down the
socket, NBD will attempt to acquire the same lock_sock(),
resulting in the deadlock.
While NBD sets sk->sk_allocation to GFP_NOIO before calling
sendmsg(), this does not prevent the issue in some paths where
GFP_KERNEL is used directly under lock_sock().
To resolve this, let's use lock_sock_try() for TCP sendmsg() and
shutdown().
For sock_sendmsg(), if lock_sock_try() fails, -ERESTARTSYS is
returned, allowing the request to be retried later (e.g., via
was_interrupted() logic).
For the sock_sendmsg() of NBD_CMD_DISC and for kernel_sock_shutdown(),
the operation might be skipped if the lock cannot be acquired.
However, this is not expected to occur in practice because the
backend TCP socket should not be touched by userspace once it is
handed over to NBD.
Note that sock_recvmsg() does not require this special handling
because it is only called from the workqueue context.
Also note that AF_UNIX sockets continue to use sock_sendmsg()
and kernel_sock_shutdown() because unix_stream_sendmsg() and
unix_shutdown() do not acquire lock_sock().
[0]:
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G L
syz.7.2282/12353 is trying to acquire lock:
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: might_alloc include/linux/sched/mm.h:317 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_pre_alloc_hook mm/slub.c:4489 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_alloc_node mm/slub.c:4843 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
but task is already holding lock:
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_close+0x1d/0x110 net/ipv4/tcp.c:3349
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (sk_lock-AF_INET6){+.+.}-{0:0}:
lock_sock_nested+0x41/0xf0 net/core/sock.c:3780
lock_sock include/net/sock.h:1709 [inline]
inet_shutdown+0x67/0x410 net/ipv4/af_inet.c:919
nbd_mark_nsock_dead+0xae/0x5c0 drivers/block/nbd.c:318
sock_shutdown+0x16b/0x200 drivers/block/nbd.c:411
nbd_clear_sock drivers/block/nbd.c:1427 [inline]
nbd_config_put+0x1eb/0x750 drivers/block/nbd.c:1451
nbd_genl_connect+0xaf8/0x1a40 drivers/block/nbd.c:2248
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #5 (&nsock->tx_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_handle_cmd drivers/block/nbd.c:1143 [inline]
nbd_queue_rq+0x428/0x1080 drivers/block/nbd.c:1207
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #4 (&cmd->lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_queue_rq+0xba/0x1080 drivers/block/nbd.c:1199
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #3 (set->srcu){.+.+}-{0:0}:
srcu_lock_sync include/linux/srcu.h:199 [inline]
__synchronize_srcu+0xa1/0x2a0 kernel/rcu/srcutree.c:1505
blk_mq_wait_quiesce_done block/blk-mq.c:284 [inline]
blk_mq_wait_quiesce_done block/blk-mq.c:281 [inline]
blk_mq_quiesce_queue block/blk-mq.c:304 [inline]
blk_mq_quiesce_queue+0x149/0x1c0 block/blk-mq.c:299
elevator_switch+0x17b/0x7e0 block/elevator.c:576
elevator_change+0x352/0x530 block/elevator.c:681
elevator_set_default+0x29e/0x360 block/elevator.c:754
blk_register_queue+0x412/0x590 block/blk-sysfs.c:946
__add_disk+0x73f/0xe40 block/genhd.c:528
add_disk_fwnode+0x118/0x5c0 block/genhd.c:597
add_disk include/linux/blkdev.h:785 [inline]
nbd_dev_add+0x77a/0xb10 drivers/block/nbd.c:1984
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #2 (&q->elevator_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
elevator_change+0x1bc/0x530 block/elevator.c:679
elevator_set_none+0x92/0xf0 block/elevator.c:769
blk_mq_elv_switch_none block/blk-mq.c:5110 [inline]
__blk_mq_update_nr_hw_queues block/blk-mq.c:5155 [inline]
blk_mq_update_nr_hw_queues+0x4c1/0x15f0 block/blk-mq.c:5220
nbd_start_device+0x1a6/0xbd0 drivers/block/nbd.c:1489
nbd_genl_connect+0xff2/0x1a40 drivers/block/nbd.c:2239
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #1 (&q->q_usage_counter(io)#49){++++}-{0:0}:
blk_alloc_queue+0x610/0x790 block/blk-core.c:461
blk_mq_alloc_queue+0x174/0x290 block/blk-mq.c:4429
__blk_mq_alloc_disk+0x29/0x120 block/blk-mq.c:4476
nbd_dev_add+0x492/0xb10 drivers/block/nbd.c:1954
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #0 (fs_reclaim){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain kernel/locking/lockdep.c:3908 [inline]
__lock_acquire+0x14b8/0x2630 kernel/locking/lockdep.c:5237
lock_acquire kernel/locking/lockdep.c:5868 [inline]
lock_acquire+0x1cf/0x380 kernel/locking/lockdep.c:5825
__fs_reclaim_acquire mm/page_alloc.c:4348 [inline]
fs_reclaim_acquire+0xc4/0x100 mm/page_alloc.c:4362
might_alloc include/linux/sched/mm.h:317 [inline]
slab_pre_alloc_hook mm/slub.c:4489 [inline]
slab_alloc_node mm/slub.c:4843 [inline]
kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
__alloc_skb+0x140/0x710 net/core/skbuff.c:702
alloc_skb include/linux/skbuff.h:1383 [inline]
tcp_send_active_reset+0x8b/0xa60 net/ipv4/tcp_output.c:3862
__tcp_close+0x41e/0x1110 net/ipv4/tcp.c:3223
tcp_close+0x28/0x110 net/ipv4/tcp.c:3350
inet_release+0xed/0x200 net/ipv4/af_inet.c:443
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:479
__sock_release+0xb3/0x260 net/socket.c:662
sock_close+0x1c/0x30 net/socket.c:1455
__fput+0x3ff/0xb40 fs/file_table.c:469
task_work_run+0x150/0x240 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
exit_to_user_mode_loop+0x100/0x4a0 kernel/entry/common.c:98
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
do_syscall_64+0x67c/0xf80 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
other info that might help us debug this:
Chain exists of:
fs_reclaim --> &nsock->tx_lock --> sk_lock-AF_INET6
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_INET6);
lock(&nsock->tx_lock);
lock(sk_lock-AF_INET6);
lock(fs_reclaim);
*** DEADLOCK ***
Fixes: fd8383f ("nbd: convert to blkmq")
Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]>
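The retry semantics described in the commit message above can be sketched in a userspace analogy, with pthread_mutex_trylock standing in for the socket try-lock; send_locked(), the payload argument, and the ERESTARTSYS define are illustrative for this sketch, not NBD code.

```c
#include <errno.h>
#include <pthread.h>

/* ERESTARTSYS is kernel-internal; defined here only for the sketch. */
#define ERESTARTSYS 512

/* Userspace sketch of the commit's pattern: never sleep on a
 * contended lock from the transmit path. If the try-lock fails,
 * return -ERESTARTSYS so the block layer can requeue the request,
 * as the was_interrupted() logic described above allows. */
static int send_locked(pthread_mutex_t *sk_lock, int payload)
{
    if (pthread_mutex_trylock(sk_lock) != 0)
        return -ERESTARTSYS;      /* contended: retry later */
    /* ... transmit 'payload' while the lock is held ... */
    pthread_mutex_unlock(sk_lock);
    return payload;               /* bytes "sent" in this sketch */
}
```

If the lock is free the send proceeds; if another context holds it, the caller backs off instead of blocking, which is exactly what breaks the reclaim-vs-lock_sock() cycle in the splat.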
blktests-ci Bot pushed a commit that referenced this pull request on Mar 25, 2026
A malicious user program can request large user-memory pinning via ioctl
with a large metadata_len. However, this does not guarantee that all
requested memory will be pinned. Pinning may partially succeed and return
the number of bytes that were actually pinned, which may not match the
requested size. In this case, only the addresses of the pinned pages are
valid.
The current implementation does not handle partial pinning and incorrectly
assumes that all pages in the range [0, nr_vecs) are valid. This can lead
to a null-pointer dereference because pages[n] may refer to an unpinned
memory range.
To fix this, add a check to verify that all requested pages are
successfully pinned. Pinning all pages is required to copy user data.
KASAN splat:
Syzkaller hit 'general protection fault in bio_integrity_map_user' bug.
nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: Command: 80f60320000000000300000000c9ffffb38ab5410000000070693aa0ffffffffb00e619dffffffff80f6032000000000b38ab5410000000070fc38a0ffffffff
nvme nvme0: 2/0/0 default/read/poll queues
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 UID: 0 PID: 280 Comm: syz-executor294 Not tainted 6.11.0-dirty #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:_compound_head home/wukong/fuzznvme/linux/./include/linux/page-flags.h:240 [inline]
RIP: 0010:bvec_from_pages home/wukong/fuzznvme/linux/block/bio-integrity.c:290 [inline]
RIP: 0010:bio_integrity_map_user+0x5a3/0x11e0 home/wukong/fuzznvme/linux/block/bio-integrity.c:345
Code: 4c 89 e0 48 c1 e8 03 80 3c 30 00 0f 85 4b 0a 00 00 48 be 00 00 00 00 00 fc ff df 49 8b 1c 24 48 8d 7b 08 48 89 f8 48 c1 e8 03 <80> 3c 30 00 0f 85 35 0a 00 00 48 8b 43 08 31 ff 49 89 c5 48 89 44
RSP: 0018:ffffc900010cf4f0 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 000000000000f761
RDX: ffff888006cae600 RSI: dffffc0000000000 RDI: 0000000000000008
RBP: ffffc900010cf7d0 R08: ffff888006cae600 R09: ffffed1000e0db95
R10: ffff88800706dcaf R11: ffff888006a31a00 R12: ffff888006a31a08
R13: 0000000000000740 R14: ffff888006a31a00 R15: 0000000000000001
FS:  0000555587e483c0(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002f8c0 CR3: 000000000719a000 CR4: 00000000000006f0
Call Trace:
<TASK>
nvme_map_user_request+0x4b6/0x5e0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:149
nvme_submit_user_cmd+0x2e8/0x3c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:185
nvme_user_cmd.constprop.0+0x35b/0x540 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:325
nvme_ns_ioctl+0x11e/0x1c0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:570
nvme_ioctl+0x147/0x1d0 home/wukong/fuzznvme/linux/drivers/nvme/host/ioctl.c:605
blkdev_ioctl+0x28c/0x6c0 home/wukong/fuzznvme/linux/block/ioctl.c:676
vfs_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:51 [inline]
__do_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:907 [inline]
__se_sys_ioctl home/wukong/fuzznvme/linux/fs/ioctl.c:893 [inline]
__x64_sys_ioctl+0x1bc/0x230 home/wukong/fuzznvme/linux/fs/ioctl.c:893
x64_sys_call+0x1209/0x20d0 home/wukong/fuzznvme/linux/./arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x6f/0x110 home/wukong/fuzznvme/linux/arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f5b3d8e98bd
Code: c3 e8 a7 1f 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd1c0e988 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000000f4240 RCX: 00007f5b3d8e98bd
RDX: 000000002003f680 RSI: 00000000c0484e43 RDI: 0000000000000003
RBP: 0000000000000000 R08: 00007f5b3d93eb4d R09: 00007f5b3d93eb4d
R10: 00007f5b3d93eb4d R11: 0000000000000246 R12: 0000000000000001
R13: 00007fffd1c0ebe8 R14: 00007fffd1c0e9b0 R15: 00007fffd1c0e9a0
</TASK>
Modules linked in:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#2] PREEMPT SMP KASAN PTI
Fixes: 492c5d4 ("block: bio-integrity: directly map user buffers")
Acked-by: Chao Shi <[email protected]>
Acked-by: Weidong Zhu <[email protected]>
Acked-by: Dave Tian <[email protected]>
Signed-off-by: Sungwoo Kim <[email protected]>
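The check this fix adds can be distilled as follows; check_pinned() is a hypothetical stand-in for the pinning path, taking the count the pin helper returned rather than performing the pin itself.

```c
#include <errno.h>

/* Sketch of the added check: a pin helper may succeed only partially
 * and return how many pages it actually pinned. Anything short of the
 * full request must be treated as failure (and unwound by the caller),
 * so later code never dereferences an unpinned pages[] slot. */
static int check_pinned(long pinned, long nr_vecs)
{
    if (pinned < 0)
        return (int)pinned;   /* the pin itself failed: propagate */
    if (pinned != nr_vecs)
        return -EINVAL;       /* partial pin: all-or-nothing required */
    return 0;                 /* every entry in [0, nr_vecs) is valid */
}
```

With this gate in place, the bvec-building loop only ever runs over fully pinned pages, closing the null-pointer dereference in the splat above.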
blktests-ci Bot pushed a commit that referenced this pull request on Mar 27, 2026
The devm_free_irq() and devm_request_irq() functions should not be
executed in an atomic context.
During device suspend, all userspace processes and most kernel threads
are frozen. Additionally, we flush all tx/rx status, disable all macb
interrupts, and halt rx operations. Therefore, it is safe to split the
region protected by bp->lock into two independent sections, allowing
devm_free_irq() and devm_request_irq() to run in a non-atomic context.
This modification resolves the following lockdep warning:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:591
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 501, name: rtcwake
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 0
7 locks held by rtcwake/501:
#0: ffff0008038c3408 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0xf8/0x368
#1: ffff0008049a5e88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0xbc/0x1c8
#2: ffff00080098d588 (kn->active#70){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xcc/0x1c8
#3: ffff800081c84888 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x1ec/0x290
#4: ffff0008009ba0f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x118/0x4f0
#5: ffff800081d00458 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48
#6: ffff0008031fb9e0 (&bp->lock){-.-.}-{3:3}, at: macb_suspend+0x144/0x558
irq event stamp: 8682
hardirqs last enabled at (8681): [<ffff8000813c7d7c>] _raw_spin_unlock_irqrestore+0x44/0x88
hardirqs last disabled at (8682): [<ffff8000813c7b58>] _raw_spin_lock_irqsave+0x38/0x98
softirqs last enabled at (7322): [<ffff8000800f1b4c>] handle_softirqs+0x52c/0x588
softirqs last disabled at (7317): [<ffff800080010310>] __do_softirq+0x20/0x2c
CPU: 1 UID: 0 PID: 501 Comm: rtcwake Not tainted 7.0.0-rc3-next-20260310-yocto-standard+ #125 PREEMPT
Hardware name: ZynqMP ZCU102 Rev1.1 (DT)
Call trace:
show_stack+0x24/0x38 (C)
__dump_stack+0x28/0x38
dump_stack_lvl+0x64/0x88
dump_stack+0x18/0x24
__might_resched+0x200/0x218
__might_sleep+0x38/0x98
__mutex_lock_common+0x7c/0x1378
mutex_lock_nested+0x38/0x50
free_irq+0x68/0x2b0
devm_irq_release+0x24/0x38
devres_release+0x40/0x80
devm_free_irq+0x48/0x88
macb_suspend+0x298/0x558
device_suspend+0x218/0x4f0
dpm_suspend+0x244/0x3a0
dpm_suspend_start+0x50/0x78
suspend_devices_and_enter+0xec/0x560
pm_suspend+0x194/0x290
state_store+0x110/0x158
kobj_attr_store+0x1c/0x30
sysfs_kf_write+0xa8/0xd0
kernfs_fop_write_iter+0x11c/0x1c8
vfs_write+0x248/0x368
ksys_write+0x7c/0xf8
__arm64_sys_write+0x28/0x40
invoke_syscall+0x4c/0xe8
el0_svc_common+0x98/0xf0
do_el0_svc+0x28/0x40
el0_svc+0x54/0x1e0
el0t_64_sync_handler+0x84/0x130
el0t_64_sync+0x198/0x1a0
Fixes: 558e35c ("net: macb: WoL support for GEM type of Ethernet controller")
Cc: [email protected]
Reviewed-by: Théo Lebrun <[email protected]>
Signed-off-by: Kevin Hao <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
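The restructuring described above can be modeled in userspace, with a pthread mutex standing in for bp->lock and a step counter for the suspend work; struct dev_model and suspend_split() are illustrative names, not driver code.

```c
#include <pthread.h>

/* Model of splitting one locked region into two: quiesce the hardware
 * under the lock, drop the lock so the sleeping devm_free_irq()-style
 * work may run, then retake the lock for the rest of the suspend
 * sequence. */
struct dev_model {
    pthread_mutex_t lock;   /* stands in for bp->lock */
    int steps;              /* completed suspend steps */
};

static void suspend_split(struct dev_model *d)
{
    pthread_mutex_lock(&d->lock);
    d->steps++;             /* 1: flush tx/rx, mask interrupts */
    pthread_mutex_unlock(&d->lock);

    d->steps++;             /* 2: sleeping IRQ rework, lock not held */

    pthread_mutex_lock(&d->lock);
    d->steps++;             /* 3: finish suspend under the lock */
    pthread_mutex_unlock(&d->lock);
}
```

The key property is that step 2, which may sleep, executes with no lock held, which is what silences the "sleeping function called from invalid context" warning.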
blktests-ci Bot pushed a commit that referenced this pull request on Mar 27, 2026
…nd napi_tx is false
A UAF issue occurs when the virtio_net driver is configured with
napi_tx=N and the device's IFF_XMIT_DST_RELEASE flag is cleared
(e.g., during the configuration of tc route filter rules).
When IFF_XMIT_DST_RELEASE is removed from the net_device, the network
stack expects the driver to hold the reference to skb->dst until the
packet is fully transmitted and freed. In virtio_net with napi_tx=N,
skbs may remain in the virtio transmit ring for an extended period.
If the network namespace is destroyed while these skbs are still
pending, the corresponding dst_ops structure has been freed. When a
subsequent packet is transmitted, free_old_xmit() is triggered to
clean up old skbs. It then calls dst_release() on the skb's stale
dst_entry. Since the dst_ops (referenced by the dst_entry) has
already been freed, a use-after-free kernel paging request occurs.
Fix it by adding skb_dst_drop(skb) in start_xmit() to explicitly
release the dst reference before the skb is queued in virtio_net.
Call Trace:
Unable to handle kernel paging request at virtual address ffff80007e150000
CPU: 2 UID: 0 PID: 6236 Comm: ping Kdump: loaded Not tainted 7.0.0-rc1+ #6 PREEMPT
...
percpu_counter_add_batch+0x3c/0x158 lib/percpu_counter.c:98 (P)
dst_release+0xe0/0x110 net/core/dst.c:177
skb_release_head_state+0xe8/0x108 net/core/skbuff.c:1177
sk_skb_reason_drop+0x54/0x2d8 net/core/skbuff.c:1255
dev_kfree_skb_any_reason+0x64/0x78 net/core/dev.c:3469
napi_consume_skb+0x1c4/0x3a0 net/core/skbuff.c:1527
__free_old_xmit+0x164/0x230 drivers/net/virtio_net.c:611 [virtio_net]
free_old_xmit drivers/net/virtio_net.c:1081 [virtio_net]
start_xmit+0x7c/0x530 drivers/net/virtio_net.c:3329 [virtio_net]
...
Reproduction Steps:
NETDEV="enp3s0"
config_qdisc_route_filter() {
    tc qdisc del dev $NETDEV root
    tc qdisc add dev $NETDEV root handle 1: prio
    tc filter add dev $NETDEV parent 1:0 \
        protocol ip prio 100 route to 100 flowid 1:1
    ip route add 192.168.1.100/32 dev $NETDEV realm 100
}
test_ns() {
    ip netns add testns
    ip link set $NETDEV netns testns
    ip netns exec testns ifconfig $NETDEV 10.0.32.46/24
    ip netns exec testns ping -c 1 10.0.32.1
    ip netns del testns
}
config_qdisc_route_filter
test_ns
sleep 2
test_ns
Fixes: f2fc6a5 ("[NETNS][IPV6] route6 - move ip6_dst_ops inside the network namespace")
Cc: [email protected]
Signed-off-by: xietangxin <[email protected]>
Reviewed-by: Xuan Zhuo <[email protected]>
Fixes: 0287587 ("net: better IFF_XMIT_DST_RELEASE support")
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
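The effect of dropping the dst reference before the skb is parked in the tx ring can be modeled with a minimal refcount sketch; struct dst_model, struct skb_model, and skb_dst_drop_model() mirror the idea only, not kernel layout.

```c
#include <stddef.h>

/* Model of the fix: release the skb's dst reference before the skb
 * is queued, so nothing dangling is touched when the skb is finally
 * freed long after its netns (and the dst_ops behind the dst) are
 * gone. */
struct dst_model { int refcnt; };
struct skb_model { struct dst_model *dst; };

static void skb_dst_drop_model(struct skb_model *skb)
{
    if (skb->dst) {
        skb->dst->refcnt--;   /* release the reference now ... */
        skb->dst = NULL;      /* ... so a late free touches nothing */
    }
}
```

After this drop, the deferred free in free_old_xmit() has no dst left to release, so it cannot chase a pointer into a destroyed namespace.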
blktests-ci Bot pushed a commit that referenced this pull request on Mar 27, 2026
As reported by syzbot [0], NBD can trigger a deadlock during
memory reclaim.
This occurs when a process holds lock_sock() on a backend TCP
socket and triggers a memory allocation that leads to fs reclaim.
If it eventually calls into NBD to send data or shut down the
socket, NBD will attempt to acquire the same lock_sock(),
resulting in the deadlock.
While NBD sets sk->sk_allocation to GFP_NOIO before calling
sendmsg(), this does not prevent the issue in some paths where
GFP_KERNEL is used directly under lock_sock().
To resolve this, let's use lock_sock_try() for TCP sendmsg() and
shutdown().
For sock_sendmsg(), if lock_sock_try() fails, -ERESTARTSYS is
returned, allowing the request to be retried later (e.g., via
was_interrupted() logic).
For the sock_sendmsg() of NBD_CMD_DISC and for kernel_sock_shutdown(),
the operation might be skipped if the lock cannot be acquired.
However, this is not expected to occur in practice because the
backend TCP socket should not be touched by userspace once it is
handed over to NBD.
Note that sock_recvmsg() does not require this special handling
because it is only called from the workqueue context.
Also note that AF_UNIX sockets continue to use sock_sendmsg()
and kernel_sock_shutdown() because unix_stream_sendmsg() and
unix_shutdown() do not acquire lock_sock().
[0]:
WARNING: possible circular locking dependency detected
syzkaller #0 Tainted: G L
syz.7.2282/12353 is trying to acquire lock:
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: might_alloc include/linux/sched/mm.h:317 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_pre_alloc_hook mm/slub.c:4489 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: slab_alloc_node mm/slub.c:4843 [inline]
ffffffff8e9aa700 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
but task is already holding lock:
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
ffff88806f972a20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: tcp_close+0x1d/0x110 net/ipv4/tcp.c:3349
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (sk_lock-AF_INET6){+.+.}-{0:0}:
lock_sock_nested+0x41/0xf0 net/core/sock.c:3780
lock_sock include/net/sock.h:1709 [inline]
inet_shutdown+0x67/0x410 net/ipv4/af_inet.c:919
nbd_mark_nsock_dead+0xae/0x5c0 drivers/block/nbd.c:318
sock_shutdown+0x16b/0x200 drivers/block/nbd.c:411
nbd_clear_sock drivers/block/nbd.c:1427 [inline]
nbd_config_put+0x1eb/0x750 drivers/block/nbd.c:1451
nbd_genl_connect+0xaf8/0x1a40 drivers/block/nbd.c:2248
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #5 (&nsock->tx_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_handle_cmd drivers/block/nbd.c:1143 [inline]
nbd_queue_rq+0x428/0x1080 drivers/block/nbd.c:1207
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #4 (&cmd->lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
nbd_queue_rq+0xba/0x1080 drivers/block/nbd.c:1199
blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
__blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
__blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
__blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
blk_finish_plug block/blk-core.c:1257 [inline]
__submit_bio+0x584/0x6c0 block/blk-core.c:649
__submit_bio_noacct_mq block/blk-core.c:722 [inline]
submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
submit_bh fs/buffer.c:2826 [inline]
block_read_full_folio+0x264/0x8e0 fs/buffer.c:2444
filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
read_mapping_folio include/linux/pagemap.h:1028 [inline]
read_part_sector+0xd1/0x370 block/partitions/core.c:723
adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
check_partition block/partitions/core.c:142 [inline]
blk_add_partitions block/partitions/core.c:590 [inline]
bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
blkdev_get_whole+0x187/0x290 block/bdev.c:764
bdev_open+0x2c7/0xe40 block/bdev.c:973
blkdev_open+0x34e/0x4f0 block/fops.c:697
do_dentry_open+0x6d8/0x1660 fs/open.c:949
vfs_open+0x82/0x3f0 fs/open.c:1081
do_open fs/namei.c:4671 [inline]
path_openat+0x208c/0x31a0 fs/namei.c:4830
do_file_open+0x20e/0x430 fs/namei.c:4859
do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
do_sys_open fs/open.c:1372 [inline]
__do_sys_openat fs/open.c:1388 [inline]
__se_sys_openat fs/open.c:1383 [inline]
__x64_sys_openat+0x12d/0x210 fs/open.c:1383
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #3 (set->srcu){.+.+}-{0:0}:
srcu_lock_sync include/linux/srcu.h:199 [inline]
__synchronize_srcu+0xa1/0x2a0 kernel/rcu/srcutree.c:1505
blk_mq_wait_quiesce_done block/blk-mq.c:284 [inline]
blk_mq_wait_quiesce_done block/blk-mq.c:281 [inline]
blk_mq_quiesce_queue block/blk-mq.c:304 [inline]
blk_mq_quiesce_queue+0x149/0x1c0 block/blk-mq.c:299
elevator_switch+0x17b/0x7e0 block/elevator.c:576
elevator_change+0x352/0x530 block/elevator.c:681
elevator_set_default+0x29e/0x360 block/elevator.c:754
blk_register_queue+0x412/0x590 block/blk-sysfs.c:946
__add_disk+0x73f/0xe40 block/genhd.c:528
add_disk_fwnode+0x118/0x5c0 block/genhd.c:597
add_disk include/linux/blkdev.h:785 [inline]
nbd_dev_add+0x77a/0xb10 drivers/block/nbd.c:1984
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #2 (&q->elevator_lock){+.+.}-{4:4}:
__mutex_lock_common kernel/locking/mutex.c:614 [inline]
__mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
elevator_change+0x1bc/0x530 block/elevator.c:679
elevator_set_none+0x92/0xf0 block/elevator.c:769
blk_mq_elv_switch_none block/blk-mq.c:5110 [inline]
__blk_mq_update_nr_hw_queues block/blk-mq.c:5155 [inline]
blk_mq_update_nr_hw_queues+0x4c1/0x15f0 block/blk-mq.c:5220
nbd_start_device+0x1a6/0xbd0 drivers/block/nbd.c:1489
nbd_genl_connect+0xff2/0x1a40 drivers/block/nbd.c:2239
genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
__sys_sendmsg+0x170/0x220 net/socket.c:2678
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #1 (&q->q_usage_counter(io)#49){++++}-{0:0}:
blk_alloc_queue+0x610/0x790 block/blk-core.c:461
blk_mq_alloc_queue+0x174/0x290 block/blk-mq.c:4429
__blk_mq_alloc_disk+0x29/0x120 block/blk-mq.c:4476
nbd_dev_add+0x492/0xb10 drivers/block/nbd.c:1954
nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
do_one_initcall+0x11d/0x760 init/main.c:1382
do_initcall_level init/main.c:1444 [inline]
do_initcalls init/main.c:1460 [inline]
do_basic_setup init/main.c:1479 [inline]
kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
kernel_init+0x1f/0x1e0 init/main.c:1582
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
-> #0 (fs_reclaim){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain kernel/locking/lockdep.c:3908 [inline]
__lock_acquire+0x14b8/0x2630 kernel/locking/lockdep.c:5237
lock_acquire kernel/locking/lockdep.c:5868 [inline]
lock_acquire+0x1cf/0x380 kernel/locking/lockdep.c:5825
__fs_reclaim_acquire mm/page_alloc.c:4348 [inline]
fs_reclaim_acquire+0xc4/0x100 mm/page_alloc.c:4362
might_alloc include/linux/sched/mm.h:317 [inline]
slab_pre_alloc_hook mm/slub.c:4489 [inline]
slab_alloc_node mm/slub.c:4843 [inline]
kmem_cache_alloc_node_noprof+0x53/0x6f0 mm/slub.c:4918
__alloc_skb+0x140/0x710 net/core/skbuff.c:702
alloc_skb include/linux/skbuff.h:1383 [inline]
tcp_send_active_reset+0x8b/0xa60 net/ipv4/tcp_output.c:3862
__tcp_close+0x41e/0x1110 net/ipv4/tcp.c:3223
tcp_close+0x28/0x110 net/ipv4/tcp.c:3350
inet_release+0xed/0x200 net/ipv4/af_inet.c:443
inet6_release+0x4f/0x70 net/ipv6/af_inet6.c:479
__sock_release+0xb3/0x260 net/socket.c:662
sock_close+0x1c/0x30 net/socket.c:1455
__fput+0x3ff/0xb40 fs/file_table.c:469
task_work_run+0x150/0x240 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
exit_to_user_mode_loop+0x100/0x4a0 kernel/entry/common.c:98
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
do_syscall_64+0x67c/0xf80 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
other info that might help us debug this:
Chain exists of:
fs_reclaim --> &nsock->tx_lock --> sk_lock-AF_INET6
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(sk_lock-AF_INET6);
                               lock(&nsock->tx_lock);
                               lock(sk_lock-AF_INET6);
  lock(fs_reclaim);
*** DEADLOCK ***
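The cycle lockdep reports above can be illustrated outside the kernel. A minimal sketch (not kernel code, and not lockdep's actual implementation) of the dependency-graph check: each "held lock A while acquiring lock B" event adds an A -> B edge, and a new edge that closes a cycle is the would-be deadlock. The lock names below are taken from the chain in this report:

```python
# Sketch of lockdep-style lock-order tracking: record each
# held -> acquiring edge and reject edges that close a cycle.
from collections import defaultdict

edges = defaultdict(set)  # lock name -> set of locks acquired while it is held

def record(held, acquiring):
    """Record a held -> acquiring dependency; return False if it closes a cycle."""
    edges[held].add(acquiring)
    # DFS from 'acquiring' looking for a path back to 'held'.
    stack, seen = [acquiring], set()
    while stack:
        node = stack.pop()
        if node == held:
            return False  # cycle found: this order can deadlock
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges[node])
    return True

# Edges from the chain in this splat:
assert record("fs_reclaim", "nsock->tx_lock")
assert record("nsock->tx_lock", "sk_lock-AF_INET6")
# The #0 trace closes the cycle: GFP_KERNEL allocation (fs_reclaim)
# performed while sk_lock-AF_INET6 is held.
assert not record("sk_lock-AF_INET6", "fs_reclaim")
```

In the splat, that closing edge is the `__alloc_skb()` call under the socket lock in `tcp_send_active_reset()`, reached via memory reclaim writing back through nbd.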
Fixes: fd8383f ("nbd: convert to blkmq")
Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Pull request for series with
subject: block: introduce pi_size field in blk_integrity
version: 2
url: https://patchwork.kernel.org/project/linux-block/list/?series=968998