The 'request' member of struct scsi_cmnd is superfluous. The struct request
and struct scsi_cmnd data structures are adjacent and hence the request
pointer can be derived easily from a scsi_cmnd pointer. Introduce a helper
function that performs that conversion in a type-safe way. This patch is
the first step towards removing the request member from struct
scsi_cmnd. Making that change has the following advantages:
- This is a performance optimization since adding an offset to a pointer
takes less time than dereferencing a pointer.
- struct scsi_cmnd becomes smaller.
Link: https://lore.kernel.org/r/20210809230355.8186-2-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host control mode, eviction is perceived as an extreme measure. There
are several conditions that both the entering and exiting regions should
meet, so that eviction will take place.
The common case however, is that those conditions are rarely met, so it is
normal that the act of eviction fails. Therefore, do not report an error
in host control mode if eviction fails.
Link: https://lore.kernel.org/r/20210808090024.21721-5-avri.altman@wdc.com
Fixes: 6c59cb501b (scsi: ufs: ufshpb: Make eviction depend on region's reads)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
'num_inflight_map_req' should not be negative. It is incremented and
decremented without any protection, allowing it theoretically to be
negative, should some weird unbalanced count occur.
Verify that the those calls are properly serialized.
Link: https://lore.kernel.org/r/20210808090024.21721-4-avri.altman@wdc.com
Fixes: 33845a2d84 (scsi: ufs: ufshpb: Limit the number of in-flight map requests)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In HPB2.0, if pre_req_min_tr_len < transfer_len < pre_req_max_tr_len, the
driver is expected to send a HPB-WRITE-BUFFER companion to HPB-READ.
The upper bound should fit into a single byte, regardless of bMAX_
DATA_SIZE_FOR_HPB_SINGLE_CMD which being an attribute (u32) can be
significantly larger.
To further illustrate the issue, consider the following scenario:
- SCSI_DEFAULT_MAX_SECTORS is 1024 limiting the I/O chunks to 512KB
- The OEM changes scsi_host_template .max_sectors to be 2048 which allows
for 1MB requests: transfer_len = 256
- pre_req_max_tr_len = HPB_MULTI_CHUNK_HIGH = 256
- ufshpb_is_supported_chunk() returns true (256 <= 256)
- WARN_ON_ONCE(256 > 256) doesn't warn
- ufshpb_set_hpb_read_to_upiu() casts transfer_len to u8: transfer_len = 0
- The command is failing with ILLEGAL REQUEST
Link: https://lore.kernel.org/r/20210808090024.21721-3-avri.altman@wdc.com
Fixes: 41d8a9333c (scsi: ufs: ufshpb: Add HPB 2.0 support)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Managed device links are deleted by device_del(). However it is possible to
add a device link to a consumer before device_add(), and then discovering
an error prevents the device from being used. In that case normally
references to the device would be dropped and the device would be deleted.
However the device link holds a reference to the device, so the device link
and device remain indefinitely (unless the supplier is deleted).
For UFSHCD, if a LUN fails to probe (e.g. absent BOOT WLUN), the device
will not have been registered but can still have a device link holding a
reference to the device. The unwanted device link will prevent runtime
suspend indefinitely.
Amend device link removal to accept removal of a link with an unregistered
consumer device (suggested by Rafael), and fix UFSHCD by explicitly
deleting the device link when SCSI destroys the SCSI device.
Link: https://lore.kernel.org/r/a1c9bac8-b560-b662-f0aa-58c7e000cbbd@intel.com
Fixes: b294ff3e34 ("scsi: ufs: core: Enable power management for wlun")
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Similarly to AHCI, introduce the device sysfs attribute
sas_ncq_prio_supported to advertise if a SATA device supports the NCQ
priority feature. Without this new attribute, the user can only discover if
a SATA device supports NCQ priority by trying to enable the feature use
with the sas_ncq_prio_enable sysfs device attribute, which fails when the
device does not support high prioity commands.
Link: https://lore.kernel.org/r/20210807041859.579409-11-damien.lemoal@wdc.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the mpt3sas driver sets the default queue depth based on the
physical interface of the attached device:
- SAS : 254
- SATA: 32
- NVMe: 128
The IOC firmware provides a recommended queue depth for each device through
SAS IO Unit Page1 for SAS/SATA and PCIe IO Unit Page 1 for NVMe devices.
If the host sets the queue depth greater than the firmware recommended
value, then the IOC places the I/Os above the recommended queue depth in an
internal pending queue. This consumes outstanding host-credit/resources,
thereby leading to potential starvation of other devices.
To avoid this, use the device depth recommended by the IOC firmware.
Link: https://lore.kernel.org/r/20210809072639.21228-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable the driver to work in non-IRQ mode, i.e. there will not be any MSI-X
vectors associated with queues dedicated to polling. The IOC hardware is
single submission queue and multiple reply queue. However, using the shared
host tagset support it is possible to simulate multiple hardware queues.
When poll_queues are enabled through the module parameter, the driver will
allocate extra reply queues without an MSI-X association. All I/O
completion on these queues will be done through the iopoll interface.
Link: https://lore.kernel.org/r/20210727081212.2742-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When running command pipelining for WRITE direction commands (e.g. tape
device write), userspace sends cmd completion to cmd ring before processing
write data. In that case userspace has to copy data before sending
completion, because cmd completion also implicitly releases the data buffer
in data area.
The new feature KEEP_BUF allows userspace to optionally keep the buffer
after completion by setting new bit TCMU_UFLAG_KEEP_BUF in
tcmu_cmd_entry_hdr->uflags. In that case buffer has to be released
explicitly by writing the cmd_id to new action item free_kept_buf.
All kept buffers are released during reset_ring and if userspace closes uio
device (tcmu_release).
Link: https://lore.kernel.org/r/20210713175021.20103-1-bostroesser@gmail.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Clearing a unit attention synchronously from inside the UFS error handler
may trigger the following deadlock:
- ufshcd_err_handler() calls ufshcd_err_handling_unprepare() and the
latter function calls ufshcd_clear_ua_wluns().
- ufshcd_clear_ua_wluns() submits a REQUEST SENSE command and that command
activates the SCSI error handler.
- The SCSI error handler calls ufshcd_host_reset_and_restore().
- ufshcd_host_reset_and_restore() executes the following code:
ufshcd_schedule_eh_work(hba); flush_work(&hba->eh_work);
This sequence results in a deadlock (circular wait). Fix this by requesting
sense data asynchronously.
Link: https://lore.kernel.org/r/20210722033439.26550-16-bvanassche@acm.org
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Cc: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make the following changes in ufshcd_abort():
- Return FAILED instead of SUCCESS if the abort handler notices that a
SCSI command has already been completed. Returning SUCCESS in this case
triggers a use-after-free and may trigger a kernel crash.
- Fix the code for aborting SCSI commands submitted to a WLUN.
The current approach for aborting SCSI commands that have been submitted to
a WLUN and that timed out is as follows:
- Report to the SCSI core that the command has completed successfully.
Let the block layer free any data buffers associated with the command.
- Mark the command as outstanding in 'outstanding_reqs'.
- If the block layer tries to reuse the tag associated with the aborted
command, busy-wait until the tag is freed.
This approach can result in:
- Memory corruption if the controller accesses the data buffer after the
block layer has freed the associated data buffers.
- A race condition if ufshcd_queuecommand() or ufshcd_exec_dev_cmd()
checks the bit that corresponds to an aborted command in
'outstanding_reqs' after it has been cleared and before it is reset.
- High energy consumption if ufshcd_queuecommand() repeatedly returns
SCSI_MLQUEUE_HOST_BUSY.
Fix this by reporting to the SCSI error handler that aborting a SCSI
command failed if the SCSI command was submitted to a WLUN.
Link: https://lore.kernel.org/r/20210722033439.26550-15-bvanassche@acm.org
Fixes: 7a7e66c65d ("scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work()")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Cc: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>