nvme/068: add new test for nvme subsystem-reset test #229
yizhanglinux wants to merge 1 commit into linux-blktests:master
Signed-off-by: Yi Zhang <[email protected]>
Here is the output of the test:
}

_require_test_dev_support_subsystem_reset() {
	if ! nvme show-regs "$TEST_DEV" -H | grep -q "NSSRS.*Yes"; then
Same comment as on the already closed PR. :)
nvme show-regs is likely not to work: https://github.com/linux-nvme/nvme-cli/wiki/FAQ#nvme-show-regs-devnvme0-returns-nvme0-failed-to-map
I think this is something the kernel needs to expose via the sysfs interface.
Yes, I just tried a kernel with CONFIG_IO_STRICT_DEVMEM enabled, and the command failed:
# nvme show-regs /dev/nvme0
NVMe status: Invalid Command Opcode: A reserved coded value or an unsupported value in the command opcode field(0x4001)
# cat /boot/config-7.0.0-0.rc2.21.fc45.x86_64 | grep IO_STRI
CONFIG_IO_STRICT_DEVMEM=y
What we could do is add a sysfs entry that exposes all the registers; nvme show-regs would use it when available and otherwise fall back to raw memory access.
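One detail of the check under review: because `grep` is the last command in the pipeline, a failing `nvme show-regs` (as in the CONFIG_IO_STRICT_DEVMEM case above) looks the same as a controller that does not report NSSRS. Below is a minimal sketch that keeps the two cases apart. It assumes that `nvme show-regs` exits non-zero when it cannot read the registers; `classify_nssrs` is a hypothetical helper name, not part of blktests or nvme-cli.

```shell
# classify_nssrs CMD [ARGS...]
# Hypothetical helper: runs the given register-dump command and reports
# whether NVM Subsystem Reset is supported.
#   prints "supported"   and returns 0 if the output matches "NSSRS.*Yes"
#   prints "unsupported" and returns 1 if the command worked but did not match
#   prints "unreadable"  and returns 2 if the command itself failed
#                        (assumption: it exits non-zero on failure)
classify_nssrs() {
	local out

	# Capture the output separately so the command's own exit status
	# is not hidden behind grep's, as it would be in a plain pipeline.
	if ! out=$("$@" 2>/dev/null); then
		echo "unreadable"
		return 2
	fi

	if printf '%s\n' "$out" | grep -q "NSSRS.*Yes"; then
		echo "supported"
		return 0
	fi

	echo "unsupported"
	return 1
}
```

In the test this could be called as `classify_nssrs nvme show-regs "$TEST_DEV" -H`, so that the skip reason can say whether the registers were unreadable or the feature is really missing.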
	fi

	# Wait NVMe disk reinitialized
	sleep 10
At least on the PPC platform, a fixed sleep 10 would not work. Running the subsystem-reset command on PPC causes loss of communication with the NVMe adapter. So on a PPC system, any I/O that was in flight when subsystem-reset is executed, or any new I/O submitted afterwards, eventually times out after 30 seconds. The nvme timeout handler then attempts to read a PCIe/MMIO config space register, which triggers EEH, and EEH then recovers the communication link to the NVMe adapter. So in theory, in the worst case, it takes more than 30 seconds for the link to be restored and the device to come back online after subsystem-reset.
I'd suggest the following steps:
- Start I/O (maybe using fio)
- Execute nvme subsystem-reset
- Add sleep 35 (30 seconds for the I/O request timeout plus an additional 5 seconds as a cushion)
Or, even better, you could first read the timeout value from /sys/block/<blk-dev>/queue/io_timeout (which is in ms) and adjust the sleep accordingly.
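A rough sketch of deriving the wait from io_timeout instead of hard-coding it. `reset_wait_seconds` is a hypothetical helper; the 30000 ms fallback and the 5 second cushion are taken from the numbers mentioned above, and the path is passed in as an argument so the arithmetic can be tried without a real device.

```shell
# reset_wait_seconds TIMEOUT_FILE [CUSHION_SECONDS]
# Hypothetical helper: prints how many seconds to wait after
# subsystem-reset, based on an io_timeout file whose value is in ms.
reset_wait_seconds() {
	local timeout_file="$1"
	local cushion="${2:-5}"
	local ms=30000	# fallback if the file cannot be read (assumption)

	if [ -r "$timeout_file" ]; then
		ms=$(cat "$timeout_file")
	fi

	# Round the timeout up to whole seconds, then add the cushion.
	echo $(( (ms + 999) / 1000 + cushion ))
}
```

In the test it might be used as `sleep "$(reset_wait_seconds "/sys/block/${TEST_DEV##*/}/queue/io_timeout")"`, assuming `TEST_DEV` is a block device node such as `/dev/nvme0n1`.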
I suggest setting the default timeout to something short, like 5s, so the test does not spend most of its time waiting for timeouts.
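One possible shape for this, as a sketch only: temporarily write a shorter value to io_timeout, run the reset step, and restore the previous value afterwards. `with_io_timeout` is a hypothetical helper, and it assumes the io_timeout file is writable by the test.

```shell
# with_io_timeout TIMEOUT_FILE MS CMD [ARGS...]
# Hypothetical helper: sets the timeout file to MS (milliseconds),
# runs CMD, restores the previous value, and returns CMD's exit status.
with_io_timeout() {
	local timeout_file="$1"
	local ms="$2"
	local old rc=0
	shift 2

	old=$(cat "$timeout_file")
	echo "$ms" > "$timeout_file"

	# Remember the command's status but always fall through to the restore.
	"$@" || rc=$?

	echo "$old" > "$timeout_file"
	return "$rc"
}
```

For example, `with_io_timeout "/sys/block/${TEST_DEV##*/}/queue/io_timeout" 5000 run_reset_step`, where `run_reset_step` stands in for whatever function performs the reset and the wait.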