libnvme segfault during blktests nvme/003 fc transport #3104

@kawasaki

When I tested the latest libnvme and nvme-cli, I observed that blktests test case nvme/003 fails for the fc transport.

nvme/003 (tr=fc) (test if we're sending keep-alives to a discovery controller) [failed]
    runtime  10.435s  ...  10.385s
    --- tests/nvme/003.out      2024-05-09 16:48:17.745065811 +0900
    +++ /home/shin/Blktests/blktests/results/nodev_tr_fc/nvme/003.out.bad       2026-02-20 16:13:48.546035453 +0900
    @@ -1,3 +1,3 @@
     Running nvme/003
    -disconnected 1 controller(s)
    +disconnected 2 controller(s)
     Test complete

dmesg

[89463.214967][T218430] run blktests nvme/003 at 2026-02-20 16:18:39
[89463.326168][T218477] loop0: detected capacity change from 0 to 2097152
[89463.342311][T218480] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[89463.366460][T218484] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[89463.409843][T211598] nvmet: Created discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[89463.420179][T218488] nvme nvme5: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[89473.578561][T218505] nvme nvme5: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[89489.442516][T218622] run blktests nvme/003 at 2026-02-20 16:19:05
[89489.535263][T218670] loop0: detected capacity change from 0 to 2097152
[89489.554982][T218673] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[89489.625862][T190232] nvme nvme5: NVME-FC{0}: create association : host wwpn 0x20001100aa000001  rport wwpn 0x20001100ab000001: NQN "nqn.2014-08.org.nvmexpress.discovery"
[89489.629893][T213747] nvmet_fc: {0:0}: Association created
[89489.631665][T195922] nvmet: Created discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[89489.636739][T190232] nvme nvme5: NVME-FC{0}: controller connect complete
[89489.638590][T218694] nvme nvme5: NVME-FC{0}: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[89489.696322][T192710] nvme nvme6: NVME-FC{1}: create association : host wwpn 0x20001100aa000001  rport wwpn 0x20001100ab000001: NQN "nqn.2014-08.org.nvmexpress.discovery"
[89489.702879][T213747] nvmet_fc: {0:1}: Association created
[89489.705234][T190232] nvmet: Created discovery controller 2 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:81edead3-7664-4eda-801e-a1b6966e8e9d.
[89489.710923][T192710] nvme nvme6: NVME-FC{1}: controller connect complete
[89489.713167][T218696] nvme nvme6: NVME-FC{1}: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", hostnqn: nqn.2014-08.org.nvmexpress:uuid:81edead3-7664-4eda-801e-a1b6966e8e9d
[89489.717846][T218696] nvme[218696]: segfault at 30 ip 00007f8e3e506729 sp 00007fff9ee2fb90 error 4 in libnvme.so.1.16.0[13729,7f8e3e4f3000+1a000] likely on CPU 3 (core 3, socket 0)
[89489.721180][T218696] Code: f0 00 00 00 48 c7 83 e0 00 00 00 00 00 00 00 e8 2d d5 fe ff 48 c7 83 f0 00 00 00 00 00 00 00 5b c3 55 48 89 fd 53 48 83 ec 18 <48> 8b 5f 30 48 85 db 74 0d 48 8b 5b 30 48 85 db 74 04 48 8b 5b 20
[89499.742717][T218729] nvme nvme5: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[89499.765065][T218729] nvme nvme6: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[89499.768158][T190232] nvmet_fc: {0:0}: Association deleted
[89499.777287][T190232] nvmet_fc: {0:0}: Association freed
[89499.780138][T192710] nvmet_fc: {0:1}: Association deleted
[89499.780160][T211103] (NULL device *): Disconnect LS failed: No Association
[89499.789536][T192710] nvmet_fc: {0:1}: Association freed
[89499.790864][T212241] (NULL device *): Disconnect LS failed: No Association
[89499.871937][T218742] nvme_fc: nvme_fc_create_ctrl: nn-0x10001100ab000001:pn-0x20001100ab000001 - nn-0x10001100aa000001:pn-0x20001100aa000001 combination not found
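
For readers who want to pick apart the segfault line in the log above, here is a minimal Python sketch that extracts its fields. The regular expression and the `decode_segfault` helper are illustrative assumptions written for this one line, not part of blktests, libnvme, or nvme-cli. It only checks that the instruction pointer minus the mapping base equals the offset the kernel printed in brackets; resolving that offset to a symbol would additionally require the library's debug information and the file offset of the mapping.

```python
import re

# The segfault line copied from the dmesg output above (timestamp and CPU note dropped).
LINE = (
    "nvme[218696]: segfault at 30 ip 00007f8e3e506729 sp 00007fff9ee2fb90 "
    "error 4 in libnvme.so.1.16.0[13729,7f8e3e4f3000+1a000]"
)

# Hypothetical pattern for the "[offset,base+size]" form seen in this log.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .*? error (?P<err>\d+)"
    r" in (?P<obj>\S+?)\[(?P<off>[0-9a-f]+),(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the numeric fields of an x86 user-space segfault line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),       # address that was accessed
        "offset_in_mapping": ip - base,         # ip relative to the mapping start
        "reported_offset": int(m["off"], 16),   # offset the kernel printed
        "error_code": int(m["err"]),            # x86 page-fault error code
    }


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        print(key, hex(value) if isinstance(value, int) else value)
```

The faulting address 0x30 together with error code 4 (a user-mode read of a page that is not present) is consistent with reading a field at offset 0x30 through a NULL pointer. The bytes marked `<48> 8b 5f 30` in the Code line decode to `mov 0x30(%rdi),%rbx`, which fits that reading, although the actual root cause still needs confirmation against the libnvme source.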

The failure was not observed with older libnvme/nvme-cli (1.15/2.15). I tried to bisect nvme-cli while pinning libnvme to git hash d65b44cd. The failure was observed with nvme-cli git hash dcf6799 and was not observed with git hash 149aeb0. There are 6 commits between these two, but the nvme-cli build appears to fail for each of them, so I could not narrow the range further.

I hope that libnvme/nvme-cli experts can resolve this failure more quickly than I could on my own.
