Improve write performance for zoned UFS devices#59
Closed
blktests-ci[bot] wants to merge 20 commits intofor-next_basefrom
Closed
Improve write performance for zoned UFS devices#59blktests-ci[bot] wants to merge 20 commits intofor-next_basefrom
blktests-ci[bot] wants to merge 20 commits intofor-next_basefrom
Conversation
* block-6.16: block: fix module reference leak in mq-deadline I/O scheduler
* for-6.17/io_uring: (39 commits) io_uring: fix breakage in EXPERT menu io_uring/cmd: remove struct io_uring_cmd_data btrfs/ioctl: store btrfs_uring_encoded_data in io_btrfs_cmd io_uring/cmd: introduce IORING_URING_CMD_REISSUE flag io_uring/zcrx: account area memory io_uring: export io_[un]account_mem io_uring/net: Support multishot receive len cap io_uring: deduplicate wakeup handling io_uring/net: cast min_not_zero() type io_uring/poll: cleanup apoll freeing io_uring/net: allow multishot receive per-invocation cap io_uring/net: move io_sr_msg->retry_flags to io_sr_msg->flags io_uring/net: use passed in 'len' in io_recv_buf_select() io_uring/zcrx: prepare fallback for larger pages io_uring/zcrx: assert area type in io_zcrx_iov_page io_uring/zcrx: allocate sgtable for umem areas io_uring/zcrx: introduce io_populate_area_dma io_uring/zcrx: return error from io_zcrx_map_area_* io_uring/zcrx: always pass page to io_zcrx_copy_chunk io_uring/rw: cast rw->flags assignment to rwf_t ...
* for-6.17/block: (77 commits) dm: split write BIOs on zone boundaries when zone append is not emulated block: use chunk_sectors when evaluating stacked atomic write limits dm-stripe: limit chunk_sectors to the stripe size md/raid10: set chunk_sectors limit md/raid0: set chunk_sectors limit block: sanitize chunk_sectors for atomic write limits ilog2: add max_pow_of_two_factor() block: fix blk_zone_append_update_request_bio() kernel-doc ublk: remove unused req argument from ublk_sub_req_ref() selftests: ublk: add utils.h selftests: ublk: add helper ublk_handle_uring_cmd() for handle ublk command selftests: ublk: improve flags naming selftests: ublk: remove ublk queue self-defined flags selftests: ublk: pass 'ublk_thread *' to more common helpers selftests: ublk: pass 'ublk_thread *' to ->queue_io() and ->tgt_io_done() selftests: ublk: remove `tag` parameter of ->tgt_io_done() ublk: pass 'const struct ublk_io *' to ublk_[un]map_io() ublk: remove ublk_commit_and_fetch() ublk: add helper ublk_check_fetch_buf() ublk: store auto buffer register data into `struct ublk_io` ...
* for-6.17/io_uring: io_uring/zcrx: fix leaking pages on sg init fail io_uring/zcrx: don't leak pages on account failure io_uring/zcrx: fix null ifq on area destruction
* for-6.17/block: nvme-pci: try function level reset on init failure nvmet: pci-epf: Do not complete commands twice if nvmet_req_init() fails nvme-tcp: log TLS handshake failures at error level docs: nvme: fix grammar in nvme-pci-endpoint-target.rst nvme: fix typo in status code constant for self-test in progress nvmet: remove redundant assignment of error code in nvmet_ns_enable() nvme: fix incorrect variable in io cqes error message nvme: fix multiple spelling and grammar issues in host drivers md/raid10: fix set but not used variable in sync_request_write() md: allow removing faulty rdev during resync md/raid5: unset WQ_CPU_INTENSIVE for raid5 unbound workqueue md: remove/add redundancy group only in level change md: Don't clear MD_CLOSING until mddev is freed md: call del_gendisk in control path
* for-6.17/block: sunvdc: Balance device refcount in vdc_port_mpgroup_check
* for-6.17/block: cdrom: Call cdrom_mrw_exit from cdrom_release function
Some storage controllers preserve the request order per hardware queue. Some but not all device mapper drivers preserve the bio order. Introduce the feature flag BLK_FEAT_ORDERED_HWQ to allow block drivers and stacked drivers to indicate that the order of write commands is preserved per hardware queue and hence that serialization of writes per zone is not required if all pending writes are submitted to the same hardware queue. Cc: Damien Le Moal <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Nitesh Shetty <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Zoned writes may be requeued. This happens if a block driver returns BLK_STS_RESOURCE, to handle SCSI unit attentions or by the SCSI error handler after error handling has finished. A later patch enables write pipelining and increases the number of pending writes per zone. If multiple writes are pending per zone, write requests may be requeued in another order than submitted. Restore the request order if requests are requeued. Add RQF_DONTPREP to RQF_NOMERGE_FLAGS because this patch may cause RQF_DONTPREP requests to be sent to the code that checks whether a request can be merged. RQF_DONTPREP requests must not be merged. Cc: Christoph Hellwig <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Yu Kuai <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Software that submits zoned writes, e.g. a filesystem, may submit zoned writes from multiple CPUs as long as the zoned writes are serialized per zone. Submitting bios from different CPUs may cause bio reordering if e.g. different bios reach the storage device through different queues. Prepare for preserving the order of pipelined zoned writes per zone by adding the 'rq_cpu` argument to blk_zone_plug_bio(). This argument tells blk_zone_plug_bio() from which CPU a cached request has been allocated. The cached request will only be used if it matches the CPU from which zoned writes are being submitted for the zone associated with the bio. Cc: Christoph Hellwig <[email protected]> Cc: Damien Le Moal <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Split an if-statement and also the comment above that if-statement. This patch prepares for moving code from disk_zone_wplug_add_bio() into its caller. No functionality has been changed. Signed-off-by: Bart Van Assche <[email protected]>
Move the following code into the only caller of disk_zone_wplug_add_bio(): - The code for clearing the REQ_NOWAIT flag. - The code that sets the BLK_ZONE_WPLUG_PLUGGED flag. - The disk_zone_wplug_schedule_bio_work() call. No functionality has been changed. This patch prepares for zoned write pipelining by removing the code from disk_zone_wplug_add_bio() that does not apply to all zoned write pipelining bio processing cases. Signed-off-by: Bart Van Assche <[email protected]>
Prepare for submitting multiple bios from inside a single blk_zone_wplug_bio_work() call. No functionality has been changed. Signed-off-by: Bart Van Assche <[email protected]>
Support pipelining of zoned writes if the block driver preserves the write order per hardware queue. Track per zone to which software queue writes have been queued. If zoned writes are pipelined, submit new writes to the same software queue as the writes that are already in progress. This prevents reordering by submitting requests for the same zone to different software or hardware queues. Cc: Christoph Hellwig <[email protected]> Cc: Damien Le Moal <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
If zoned writes (REQ_OP_WRITE) for a sequential write required zone have a starting LBA that differs from the write pointer, e.g. because a prior write triggered a unit attention condition, then the storage device will respond with an UNALIGNED WRITE COMMAND error. Retry commands that failed with an unaligned write error. Reviewed-by: Damien Le Moal <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
If the write order is preserved, increase the number of retries for write commands sent to a sequential zone to the maximum number of outstanding commands because in the worst case the number of times reordered zoned writes have to be retried is (number of outstanding writes per sequential zone) - 1. Cc: Damien Le Moal <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Zone write locking is not used for zoned devices if the block driver reports that it preserves the order of write commands. Make it easier to test not using zone write locking by adding support for setting the driver_preserves_write_order flag. Acked-by: Douglas Gilbert <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Allow user space software, e.g. a blktests test, to inject unaligned write errors. Acked-by: Douglas Gilbert <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Cc: Martin K. Petersen <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
From the UFSHCI 4.0 specification, about the MCQ mode: "Command Submission 1. Host SW writes an Entry to SQ 2. Host SW updates SQ doorbell tail pointer Command Processing 3. After fetching the Entry, Host Controller updates SQ doorbell head pointer 4. Host controller sends COMMAND UPIU to UFS device" In other words, in MCQ mode, UFS controllers are required to forward commands to the UFS device in the order these commands have been received from the host. This patch improves performance as follows on a test setup with UFSHCI 4.0 controller: - When not using an I/O scheduler: 2.3x more IOPS for small writes. - With the mq-deadline scheduler: 2.0x more IOPS for small writes. Reviewed-by: Avri Altman <[email protected]> Cc: Bao D. Nguyen <[email protected]> Cc: Can Guo <[email protected]> Cc: Martin K. Petersen <[email protected]> Signed-off-by: Bart Van Assche <[email protected]>
Author
|
Upstream branch: a8fa173 |
7bf6dad to
0fb21b4
Compare
Author
|
Upstream branch: 301d58c |
Author
|
Upstream branch: 301d58c |
Author
|
Github failed to update this PR after force push. Close it. |
blktests-ci Bot
pushed a commit
that referenced
this pull request
Mar 18, 2026
…abled When the Remote System Update (RSU) isn't enabled in the First Stage Boot Loader (FSBL), the driver encounters a NULL pointer dereference when excute svc_normal_to_secure_thread() thread, resulting in a kernel panic: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 Mem abort info: ... Data abort info: ... [0000000000000008] user address but active_mm is swapper Internal error: Oops: 0000000096000004 [#1] SMP Modules linked in: CPU: 0 UID: 0 PID: 79 Comm: svc_smc_hvc_thr Not tainted 6.19.0-rc8-yocto-standard+ #59 PREEMPT Hardware name: SoCFPGA Stratix 10 SoCDK (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : svc_normal_to_secure_thread+0x38c/0x990 lr : svc_normal_to_secure_thread+0x144/0x990 ... Call trace: svc_normal_to_secure_thread+0x38c/0x990 (P) kthread+0x150/0x210 ret_from_fork+0x10/0x20 Code: 97cfc11 f9400260 aa1403e1 f9400400 (f9400402) ---[ end trace 0000000000000000 ]--- The issue occurs because rsu_send_async_msg() fails when RSU is not enabled in firmware, causing the channel to be freed via stratix10_svc_free_channel(). However, the probe function continues execution and registers svc_normal_to_secure_thread(), which subsequently attempts to access the already-freed channel, triggering the NULL pointer dereference. Fix this by properly cleaning up the async client and returning early on failure, preventing the thread from being used with an invalid channel. Fixes: 1584753 ("firmware: stratix10-rsu: Migrate RSU driver to use stratix10 asynchronous framework.") Cc: [email protected] Signed-off-by: Liwei Song <[email protected]> Signed-off-by: Dinh Nguyen <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull request for series with
subject: Improve write performance for zoned UFS devices
version: 21
url: https://patchwork.kernel.org/project/linux-block/list/?series=983507