Skip tests if scsi_debug module is already loaded and in use #218

Closed
disgoel wants to merge 1 commit into linux-blktests:master from disgoel:fix-scsi-debug

@disgoel
Contributor

@disgoel disgoel commented Nov 26, 2025

Several tests across block/, scsi/, dm/, md/, zbd/, nvme/ require exclusive access to the scsi_debug module because they load, unload or reconfigure it. When scsi_debug is already loaded by the environment (e.g., by another driver or a previous setup), these tests fail with:

modprobe: FATAL: Module scsi_debug is in use.
Unloading scsi_debug failed
# lsmod | grep scsi_debug
scsi_debug            327680  4

Instead of modifying the common rc files, which would over-skip unrelated tests, this patch adds _module_not_in_use scsi_debug only to the tests that actually depend on exclusive access to scsi_debug.
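
For context, here is a minimal bash sketch of a refcnt-based check of the kind this patch relies on. It is not the _module_not_in_use implementation from common/rc. The function name module_in_use and the SYSFS_ROOT override are hypothetical, added so the logic can be exercised against a fake directory tree without root. It assumes the use count shown in the third column of lsmod is exposed at /sys/module/<name>/refcnt, the attribute discussed later in this thread.

```shell
# Hypothetical helper, not part of blktests.
# SYSFS_ROOT is an assumption that lets the check run against a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Return 0 (true) if the module's use count is greater than zero.
# Return 1 if the count is zero or the refcnt attribute does not exist
# (module not loaded, or driver built into the kernel).
module_in_use() {
	local refcnt_file="${SYSFS_ROOT}/module/$1/refcnt"
	local refcnt

	[[ -r "$refcnt_file" ]] || return 1
	refcnt="$(<"$refcnt_file")"
	[[ "$refcnt" =~ ^[0-9]+$ ]] && (( refcnt > 0 ))
}
```

A test could then skip itself when `module_in_use scsi_debug` succeeds, instead of failing later at unload time.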

@kawasaki
Collaborator

kawasaki commented Jan 4, 2026

@disgoel Thanks for the idea. It is interesting to me, and I would like to think about it. When scsi_debug is a loadable module, it probably makes sense to require that it is not loaded before each test case runs. I wonder how the check can be done when scsi_debug is built-in.

I took a look at the commit, but it seems to be missing the implementation of the _module_not_in_use() function, doesn't it?

@disgoel
Contributor Author

disgoel commented Jan 9, 2026

> I took a look at the commit, but it seems to be missing the implementation of the _module_not_in_use() function, doesn't it?

The function already exists.
https://github.com/linux-blktests/blktests/blob/master/common/rc#L95

@disgoel
Contributor Author

disgoel commented Jan 9, 2026

> @disgoel Thanks for the idea. It is interesting to me, and I would like to think about it. When scsi_debug is a loadable module, it probably makes sense to require that it is not loaded before each test case runs. I wonder how the check can be done when scsi_debug is built-in.

For built-in scsi_debug, the situation is actually more constrained than the modular case. Since the driver cannot be unloaded or reset, any existing scsi_debug devices mean the tests cannot obtain exclusive control. In that case, skipping the tests is still the correct and safest behavior.

The intent of the check is not to distinguish built-in vs modular scsi_debug, but to ensure exclusive access. If that exclusivity cannot be guaranteed (either because the module is already in use or because the driver is built-in and active), the test should be skipped.

@kawasaki
Collaborator

kawasaki commented Jan 13, 2026

> > I took a look at the commit, but it seems to be missing the implementation of the _module_not_in_use() function, doesn't it?
>
> The function already exists. http://github.com/linux-blktests/blktests/blob/master/common/rc#L95

Ah, I overlooked that function. Sorry. However, I'm not sure if that function really works for the purpose of this PR. The function checks /sys/module/scsi_debug/refcnt. But this refcnt value shows how many other modules depend on scsi_debug. IIUC, regardless of the number of scsi_debug devices, the refcnt value is zero. To check usage of scsi_debug, I think we need to check /sys/bus/pseudo/drivers/scsi_debug/add_host.

@kawasaki
Collaborator

kawasaki commented Jan 13, 2026

I read your first explanation again and noticed that I had missed one point: the refcnt can be greater than zero when a scsi_debug device is opened by a user application. So your code change will catch the case where the user creates a scsi_debug device and an application keeps it open before blktests runs. And this really happens on your test system. Now I see your point.

_module_not_in_use() checks /sys/module/scsi_debug/refcnt. This sysfs attribute does not exist when scsi_debug is built-in, so this approach will not work for built-in scsi_debug. Still, your change adds value for the loadable scsi_debug case.

As I noted in the comment above, another idea is to refer to the add_host attribute. However, when scsi_debug is loaded with default parameters, or scsi_debug is built-in, add_host has the value "1", because scsi_debug creates the first device automatically. When add_host is 1, we cannot tell whether the device was created automatically or by the user. If the value is 2 or larger, we can be sure that the user created their own device.

So, I can think of two approaches:

  1. Just add the check for the loadable scsi_debug case, following your idea to use _module_not_in_use() and refer to the refcnt value. This is simpler, but it cannot cover some cases: built-in scsi_debug is not covered, and even when scsi_debug is loadable, it cannot cover the case where the user created a scsi_debug device that no application has opened.
  2. Check both "refcnt == 0" and "add_host < 2". This is not perfect, but it can cover the built-in scsi_debug case. It can also cover the case where the user created scsi_debug devices but no application has opened them.
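
Approach 2 could be sketched roughly as follows (bash). The function names read_sysfs_int and scsi_debug_looks_unused and the SYSFS_ROOT override are hypothetical; the two sysfs paths are the ones named in this thread. Treating a missing refcnt file (built-in driver) as zero and a missing add_host file as zero hosts are assumptions of this sketch, not confirmed behavior of any blktests helper.

```shell
# SYSFS_ROOT is an assumption that lets the check run against a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the integer stored in a sysfs attribute, or 0 if the file is missing
# or does not hold a plain non-negative integer.
read_sysfs_int() {
	local value=0
	[[ -r "$1" ]] && value="$(<"$1")"
	[[ "$value" =~ ^[0-9]+$ ]] || value=0
	echo "$value"
}

# Return 0 only if both conditions of approach 2 hold:
# refcnt == 0 and add_host < 2.
scsi_debug_looks_unused() {
	local refcnt add_host

	refcnt="$(read_sysfs_int "${SYSFS_ROOT}/module/scsi_debug/refcnt")"
	add_host="$(read_sysfs_int "${SYSFS_ROOT}/bus/pseudo/drivers/scsi_debug/add_host")"
	(( refcnt == 0 && add_host < 2 ))
}
```

As noted above, add_host equal to 1 stays ambiguous: the sketch accepts it even though that single host might have been created by the user.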

I think either way is okay; both add value to blktests.

@kawasaki
Collaborator

kawasaki commented Jan 13, 2026

Assuming approach 1), it is rather odd to call _module_not_in_use() in requires() of each test case. Once _have_scsi_debug() is specified, it should imply "_module_not_in_use scsi_debug". So I suggest modifying _have_scsi_debug to call "_module_not_in_use scsi_debug".

Some test cases call "_have_module scsi_debug" instead of "_have_scsi_debug". For those test cases, I suggest introducing a new helper function "_have_loadable_scsi_debug" in common/scsi_debug. This helper function can call both "_have_module scsi_debug" and "_module_not_in_use scsi_debug". The test cases can then call "_have_loadable_scsi_debug" instead of "_have_module scsi_debug".
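
The shape of the suggested helper might look like the bash sketch below. Only the composition in _have_loadable_scsi_debug comes from this comment. The two functions it calls are simplified stand-ins for the blktests helpers of the same names, and FAKE_AVAILABLE and FAKE_BUSY are hypothetical variables that exist only so the block runs outside the test suite.

```shell
# Stand-ins for the blktests helpers of the same names. They consult two
# hypothetical space-separated lists instead of the real system state.
FAKE_AVAILABLE="scsi_debug"
FAKE_BUSY=""

_have_module() {
	[[ " $FAKE_AVAILABLE " == *" $1 "* ]]
}

_module_not_in_use() {
	[[ " $FAKE_BUSY " != *" $1 "* ]]
}

# Sketch of the proposed helper: the test may run only if scsi_debug is
# available as a module and nothing else is currently using it.
_have_loadable_scsi_debug() {
	_have_module scsi_debug && _module_not_in_use scsi_debug
}

# A test case would then declare its requirement in one place.
requires() {
	_have_loadable_scsi_debug
}
```

With this arrangement, individual test files name a single requirement and the exclusivity check lives in one helper.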

I think this simplification can also be done when approach 2) is chosen.

@disgoel
Contributor Author

disgoel commented Feb 24, 2026

@kawasaki sorry for the delay.

I have refactored the patch to centralize the logic in common/scsi_debug as requested:

  • Centralized Check: Updated _have_scsi_debug to include _module_not_in_use. This ensures all tests skip gracefully if the module is busy, without needing to modify individual test files.
  • Added "Approach 2": Included a check for add_host > 1 to detect "dirty" environments where extra hosts are already configured.
  • New Helper: Added _have_loadable_scsi_debug for tests that explicitly require unloading/reloading, ensuring they skip if the driver is built-in.
  • Robustness: Switched to _have_module to better verify that the module can be managed by the test suite.

Verification: Confirmed that tests now correctly skip when the module is busy (refcnt > 0) or has extra hosts, and pass when the environment is clean.

@disgoel disgoel force-pushed the fix-scsi-debug branch 2 times, most recently from 71b68c7 to 8c18df0 on February 25, 2026 06:12
Collaborator

@kawasaki kawasaki left a comment

@disgoel Thank you for updating the patch. I agree with this fix concept. Please find my review comments and see if they make sense to you.

For your reference, I share a code change which reflects my review comments (link). I hope it helps to clarify my intents.

Also, I think some more test cases can be modified to use _have_loadable_scsi_debug or _have_scsi_debug. I created another code change to try it out (link). Please take a look at it as well.

4 comment threads on common/scsi_debug (outdated)
@kawasaki
Collaborator

kawasaki commented Mar 1, 2026

@disgoel Thanks again for your effort. I think this work is close to being merged. Before the merge, I would like to post your patch to the linux-block mailing list to get wider reviews and build consensus. I can post the patch to the list on your behalf, with your authorship. If this is not okay with you, please let me know.

@disgoel disgoel force-pushed the fix-scsi-debug branch 2 times, most recently from a2e0e51 to 637f9a8 on March 16, 2026 10:06
@disgoel
Contributor Author

disgoel commented Mar 16, 2026

@kawasaki Thank you for the detailed review and the reference code. I have updated the PR to address the requested changes. I also updated the suggested test cases (block/009, md/002, etc.) to use the appropriate helper.

I am completely fine with you posting the patch to the linux-block mailing list on my behalf with my authorship. I appreciate the help in getting this merged!

kawasaki pushed a commit to kawasaki/blktests that referenced this pull request Mar 20, 2026
Several tests across block/, scsi/, dm/, md/, zbd/, nvme/ require
exclusive access to the scsi_debug module because they load, unload or
reconfigure it. When scsi_debug is loadable and already loaded by the
environment (e.g., by another driver or a previous setup), these tests
fail with:

  modprobe: FATAL: Module scsi_debug is in use.
  Unloading scsi_debug failed

  scsi_debug            327680  4

To prevent the failures, check if scsi_debug is already loaded. If so,
skip the test cases which use scsi_debug. Instead of modifying common rc
files, which would overskip unrelated tests, add a "_module_not_in_use
scsi_debug" call to the new helper function _have_loadable_scsi_debug()
and call it only for the tests that actually depend on exclusive access
to scsi_debug.

Also, when scsi_debug is built-in and additional hosts have already
been created by the environment, the tests may break those hosts. To
cover this built-in scsi_debug scenario, check the number of added hosts
in _have_scsi_debug(). If the number of hosts is larger than 1, skip the
test cases.

Link: linux-blktests#218
Signed-off-by: Disha Goel <[email protected]>
[Shin'ichiro: improved commit message]
Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
@kawasaki
Collaborator

@disgoel Thank you for updating the patch. While I did the final check of the patch, I noticed that it does not work as expected. Agrrr.

  1. I overlooked that /sys/module/scsi_debug/refcnt does not behave as I expected: it stays at zero even after a scsi_debug host is created. I should have noticed that at the beginning. My bad...
  2. It does not work for test cases that create a scsi_debug device as a "fallback", e.g. scsi/009.
  3. It does not work well when another helper function has loaded scsi_debug, e.g. the "_have_module_param scsi_debug inq_vector" call in scsi/005.
  4. The srp group test cases, zbd/012 and _have_scsi_debug_group_number_stats() also use loadable scsi_debug. They should check _have_loadable_scsi_debug.

I created some patches to address these problems, which are available in the branch here. I will do some more testing and will post these as a series to linux-block.

Your patch is here, in the series. I'm thinking of folding in the following two changes [1][2].

[1] kawasaki@11656fd
[2] kawasaki@ec0853f

Especially, [1] is the key difference. Could you check these changes?

@disgoel
Contributor Author

disgoel commented Mar 23, 2026

> Your patch is here, in the series. I'm thinking of folding in the following two changes [1][2].
>
> [1] kawasaki@11656fd
> [2] kawasaki@ec0853f
>
> Especially, [1] is the key difference. Could you check these changes?

@kawasaki I have updated the PR with the final refactoring. I've switched to using the ${MODULES_TO_UNLOAD[*]} array check as suggested, which correctly identifies if scsi_debug was loaded by the test harness or pre-loaded by the system.
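
The idea behind the array check can be sketched in bash as follows. This is not the code from the patch. The function name module_preloaded and the SYSFS_ROOT override are hypothetical, and using the presence of /sys/module/<name> as the "module exists" test is an assumption of this sketch (that directory may also exist for a built-in driver). MODULES_TO_UNLOAD is declared locally only to keep the block self-contained; in blktests it is maintained by the harness.

```shell
# SYSFS_ROOT is an assumption that lets the check run against a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Declared here only for self-containment; the harness owns this array.
MODULES_TO_UNLOAD=()

# Return 0 if module $1 is present in sysfs but is not listed in
# MODULES_TO_UNLOAD, i.e. it was loaded before the harness touched it.
module_preloaded() {
	local name="$1" m

	[[ -d "${SYSFS_ROOT}/module/${name}" ]] || return 1
	for m in "${MODULES_TO_UNLOAD[@]}"; do
		[[ "$m" == "$name" ]] && return 1
	done
	return 0
}
```

Comparing array elements one by one avoids a substring match on the joined array, which could confuse module names that share a prefix.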

I'm ready for these changes to be incorporated into your series for the mailing list. Thanks for the collaborative effort!

Several tests across block/, scsi/, dm/, md/, zbd/, nvme/ require
exclusive access to the scsi_debug module because they load, unload
or reconfigure it. When scsi_debug is loadable and already loaded
by the environment (e.g., by another driver or a previous setup),
these tests fail with:

    modprobe: FATAL: Module scsi_debug is in use.
    Unloading scsi_debug failed

    scsi_debug            327680  4

To prevent these failures, this patch introduces a new helper function
_have_loadable_scsi_debug(). It verifies if the module is already loaded
by checking the ${MODULES_TO_UNLOAD[*]} array. If the module exists but
is not in the array, it indicates the module was loaded before the test
started, and the test is skipped.

Additionally, for cases where scsi_debug is built-in, the environment may
have already created additional hosts. To prevent the tests from disrupting
these hosts, _have_scsi_debug() now checks the add_host attribute. If the
number of hosts is greater than 1, the test is skipped.

Signed-off-by: Disha Goel <[email protected]>
@kawasaki
Collaborator

@disgoel Thanks for the response! I posted the patch series with your patch to linux-block list [3].

[3] https://lore.kernel.org/linux-block/[email protected]/

kawasaki pushed a commit that referenced this pull request Apr 4, 2026
Several tests across block/, scsi/, dm/, md/, zbd/, nvme/ require
exclusive access to the scsi_debug module because they load, unload
or reconfigure it. When scsi_debug is loadable and already loaded
by the environment (e.g., by another driver or a previous setup),
these tests fail with:

    modprobe: FATAL: Module scsi_debug is in use.
    Unloading scsi_debug failed

    scsi_debug            327680  4

To prevent these failures, this patch introduces a new helper function
_have_loadable_scsi_debug(). It verifies if the module is already loaded
by checking the ${MODULES_TO_UNLOAD[*]} array. If the module exists but
is not in the array, it indicates the module was loaded before the test
started, and the test is skipped.

Additionally, for cases where scsi_debug is built-in, the environment may
have already created additional hosts. To prevent the tests from disrupting
these hosts, _have_scsi_debug() now checks the add_host attribute. If the
number of hosts is greater than 1, the test is skipped.

Link: #218
Signed-off-by: Disha Goel <[email protected]>
Signed-off-by: Shin'ichiro Kawasaki <[email protected]>
@kawasaki
Collaborator

kawasaki commented Apr 6, 2026

Now the change is in the master branch. Let me close this case.

@kawasaki kawasaki closed this Apr 6, 2026