Commit Graph

1296460 Commits

Author SHA1 Message Date
Ryusuke Konishi
0f3819e8c4 nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
According to the C standard 3.4.3p3, the result of signed integer overflow
is undefined.  The macro nilfs_cnt32_ge(), which compares two sequence
numbers, uses signed integer subtraction that can overflow, and therefore
the result of the calculation may differ from what is expected due to
undefined behavior in different environments.

Similar to an earlier change to the jiffies-related comparison macros in
commit 5a581b367b ("jiffies: Avoid undefined behavior from signed
overflow"), avoid this potential issue by changing the definition of the
macro to perform the subtraction as unsigned integers, then cast the
result to a signed integer for comparison.

Link: https://lkml.kernel.org/r/20130727225828.GA11864@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/20240702183512.6390-1-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3 ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Jeff Johnson
8547d1150f math: rational: add missing MODULE_DESCRIPTION() macro
With ARCH=sh, make allmodconfig && make W=1 C=1 reports: WARNING: modpost:
missing MODULE_DESCRIPTION() in lib/math/rational.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240702-md-sh-lib-math-v1-1-93f4ac4fa8fd@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Jeff Johnson
bee6c683de lib/zlib: add missing MODULE_DESCRIPTION() macro
With ARCH=csky, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/zlib_deflate/zlib_deflate.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Link: https://lkml.kernel.org/r/20240613-md-csky-lib-zlib_deflate-v1-1-83504d9a27d6@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Guo Ren <guoren@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:11 -07:00
Jeff Johnson
c61d7259a6 fs: ufs: add MODULE_DESCRIPTION()
Fix make W=1 warning:

WARNING: modpost: missing MODULE_DESCRIPTION() in fs/ufs/ufs.o

Link: https://lkml.kernel.org/r/20240510-ufs-md-v1-1-85eaff8c6beb@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
Hsin Chang Yu
cedb08caac lib/rbtree.c: fix the example typo
Replace the "Sr" with "sr", the example is wrong if sl and N don't have
child nodes, so sr should be red node.

Link: https://lkml.kernel.org/r/20240628142229.69419-1-zxcvb600870024@gmail.com
Signed-off-by: Hsin Chang Yu <zxcvb600870024@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
lei lu
255547c6bb ocfs2: add bounds checking to ocfs2_check_dir_entry()
This adds sanity checks for ocfs2_dir_entry to make sure all members of
ocfs2_dir_entry don't stray beyond valid memory region.

Link: https://lkml.kernel.org/r/20240626104433.163270-1-llfamsec@gmail.com
Signed-off-by: lei lu <llfamsec@gmail.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
Yang Li
937b2972ce fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
This commit adds kernel-doc style comments with complete parameter
descriptions for the function ocfs2_prepare_orphan_dir.

Link: https://lkml.kernel.org/r/20240322063718.88183-1-yang.lee@linux.alibaba.com
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:10 -07:00
Oleg Nesterov
1e3fa25fca coredump: simplify zap_process()
After commit 0258b5fd7c ("coredump: Limit coredumps to a single thread
group") zap_process() doesn't need the "task_struct *start" arg,
zap_threads() can pass "signal_struct *signal" instead.

This simplifies the code and allows to use __for_each_thread() which
is slightly more efficient.

Link: https://lkml.kernel.org/r/20240625140311.GA20787@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04 23:43:09 -07:00
Damien Le Moal
2f20872ed4 block: Remove blk_alloc_zone_bitmap()
Remove the helper function blk_alloc_zone_bitmap() and replace its
single call site with a call to bitmap_alloc(). To be consistent with
this change, use bitmap_free() to free a disk convnetional zone bitmap.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-6-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:42:04 -06:00
Damien Le Moal
f2a7bea237 block: Remove REQ_OP_ZONE_RESET_ALL emulation
Now that device mapper can handle resetting all zones of a mapped zoned
device using REQ_OP_ZONE_RESET_ALL, all zoned block device drivers
support this operation. With this, the request queue feature
BLK_FEAT_ZONE_RESETALL is not necessary and the emulation code in
blk-zone.c can be removed.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-5-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:42:04 -06:00
Damien Le Moal
81e7706345 dm: handle REQ_OP_ZONE_RESET_ALL
This commit implements processing of the REQ_OP_ZONE_RESET_ALL operation
for zoned mapped devices. Given that this operation always has a BIO
sector of 0 and a 0 size, processing through the regular BIO
__split_and_process_bio() function does not work because this function
would always select the first target. Instead, handling of this
operation is implemented using the function __send_zone_reset_all().

Similarly to the __send_empty_flush() function, the new
__send_zone_reset_all() function manually goes through all targets of a
mapped device table doing the following:
1) If the target can natively support REQ_OP_ZONE_RESET_ALL,
   __send_duplicate_bios() is used to forward the reset all operation to
   the target. This case is handled with the
   __send_zone_reset_all_native() function.
2) For other targets, the function __send_zone_reset_all_emulated() is
   executed to emulate the execution of REQ_OP_ZONE_RESET_ALL using
   regular REQ_OP_ZONE_RESET operations.

Targets that can natively support REQ_OP_ZONE_RESET_ALL are identified
using the new target field zone_reset_all_supported. This boolean is set
to true in for targets that have reliable zone limits, that is, targets
that map all sequential write required zones of their zoned device(s).
Setting this field is handled in dm_set_zones_restrictions() and
device_get_zone_resource_limits().

For targets with unreliable zone limits, REQ_OP_ZONE_RESET_ALL must be
emulated (case 2 above). This is implemented with
__send_zone_reset_all_emulated() and is similar to the block layer
function blkdev_zone_reset_all_emulated(): first a report zones is done
for the zones of the target to identify zones that need reset, that is,
any sequential write required zone that is not already empty. This is
done using a bitmap and the function dm_zone_get_reset_bitmap() which
sets to 1 the bit corresponding to a zone that needs reset. Next, this
zone bitmap is inspected and a clone BIO modified to use the
REQ_OP_ZONE_RESET operation issued for any zone with its bit set in the
zone bitmap.

This implementation is more efficient than what the block layer does
with blkdev_zone_reset_all_emulated(), which is always used for DM zoned
devices currently: as we can natively use REQ_OP_ZONE_RESET_ALL on
targets mapping all sequential write required zones, resetting all zones
of a zoned mapped device can be much faster compared to always emulating
this operation using regular per-zone reset. In the worst case, this
implementation is as-efficient as the block layer emulation. This
reduction in the time it takes to reset all zones of a zoned mapped
device depends directly on the mapped device targets mapping (reliable
zone limits or not).

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-4-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:42:04 -06:00
Damien Le Moal
ae7e965b36 dm: Refactor is_abnormal_io()
Use a single switch-case to simplify is_abnormal_io() and make this
function more readable and easier to modify.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-3-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:42:04 -06:00
Damien Le Moal
f4d5dc33c8 null_blk: Introduce the zone_full parameter
Allow creating a zoned null_blk device with the initial state of its
sequential write required zones to be FULL. This is convenient to avoid
having to first write these zones to perform read performance evaluation
or test zone management operations such as zone reset (and zone reset
all).

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240704052816.623865-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:42:04 -06:00
Christoph Hellwig
dd54fd4e17 loop: remove the unused inode variable in loop_configure
Remove the inode variable now that the last user is gone.

Fixes: a17ece76bc ("loop: regularize upgrading the block size for direct I/O")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240705053114.2042976-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-05 00:27:56 -06:00
Viresh Kumar
dfd3e8b90b cpufreq: pcc: Remove empty exit() callback
The exit() callback is optional, remove the empty one.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2024-07-05 11:42:00 +05:30
Viresh Kumar
fa90377278 cpufreq: loongson2: Remove empty exit() callback
The exit() callback is optional, remove the empty one.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2024-07-05 11:41:46 +05:30
Viresh Kumar
bf8a44c07b cpufreq: nforce2: Remove empty exit() callback
The exit() callback is optional, remove the empty one.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
2024-07-05 11:41:30 +05:30
Martin K. Petersen
6cd48c8f62 Merge patch series "mpi3mr: Support PCI Error Recovery"
Sumit Saxena <sumit.saxena@broadcom.com> says:

This patch series contains the changes done in the driver to support
PCI error recovery. It is rework of older patch series from Ranjan
Kumar, see [1].

[1] https://lore.kernel.org/all/20231214205900.270488-1-ranjan.kumar@broadcom.com/

Link: https://lore.kernel.org/r/20240627101735.18286-1-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:38:16 -04:00
Sumit Saxena
cf82b9e866 scsi: mpi3mr: Driver version update
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-4-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Sumit Saxena
1c342b0548 scsi: mpi3mr: Prevent PCI writes from driver during PCI error recovery
Prevent interaction with the hardware while the error recovery in progress.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Sumit Saxena
30bafe1774 scsi: mpi3mr: Support PCI Error Recovery callback handlers
PCI Error recovery support is required to recover the controller upon
detection of PCI errors. Add support for the PCI error recovery callback
handlers in mpi3mr driver.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-2-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Martin K. Petersen
34438552c9 Merge patch series "Update lpfc to revision 14.4.0.3"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.3

This patch set contains bug fixes related to discovery, submission of
mailbox commands, and proper endianness conversions.

The patches were cut against Martin's 6.11/scsi-queue tree.

Link: https://lore.kernel.org/r/20240628172011.25921-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:25:40 -04:00
Justin Tee
41972df1a5 scsi: lpfc: Update lpfc version to 14.4.0.3
Update lpfc version to 14.4.0.3.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:52 -04:00
Justin Tee
8bc7c61764 scsi: lpfc: Revise lpfc_prep_embed_io routine with proper endian macro usages
On big endian architectures, it is possible to run into a memory out of
bounds pointer dereference when FCP targets are zoned.

In lpfc_prep_embed_io, the memcpy(ptr, fcp_cmnd, sgl->sge_len) is
referencing a little endian formatted sgl->sge_len value.  So, the memcpy
can cause big endian systems to crash.

Redefine the *sgl ptr as a struct sli4_sge_le to make it clear that we are
referring to a little endian formatted data structure.  And, update the
routine with proper le32_to_cpu macro usages.

Fixes: af20bb73ac ("scsi: lpfc: Add support for 32 byte CDBs")
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:52 -04:00
Justin Tee
f65f31ac12 scsi: lpfc: Fix incorrect request len mbox field when setting trunking via sysfs
When setting trunk modes through sysfs, the SLI_CONFIG mailbox command's
command payload length is incorrectly hardcoded to 12 bytes.  SLI_CONFIG's
payload length field should be specified large enough to encompass both the
submailbox command header and the submailbox request itself.

Thus, replace the hardcoded 12 bytes with a clearer calculation by way of
sizeof(struct lpfc_mbx_set_trunk_mode) - sizeof(struct lpfc_sli4_cfg_mhdr).

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
ede596b143 scsi: lpfc: Handle mailbox timeouts in lpfc_get_sfp_info
The MBX_TIMEOUT return code is not handled in lpfc_get_sfp_info and the
routine unconditionally frees submitted mailbox commands regardless of
return status.  The issue is that for MBX_TIMEOUT cases, when firmware
returns SFP information at a later time, that same mailbox memory region
references previously freed memory in its cmpl routine.

Fix by adding checks for the MBX_TIMEOUT return code.  During mailbox
resource cleanup, check the mbox flag to make sure that the wait did not
timeout.  If the MBOX_WAKE flag is not set, then do not free the resources
because it will be freed when firmware completes the mailbox at a later
time in its cmpl routine.

Also, increase the timeout from 30 to 60 seconds to accommodate boot
scripts requiring longer timeouts.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
15e21dc6d6 scsi: lpfc: Fix handling of fully recovered fabric node in dev_loss callbk
In rare cases when a fabric node is recovered after a link bounce and
before dev_loss_tmo callbk is reached, the driver may leave the fabric node
in an inconsistent state with the NLP_IN_DEV_LOSS flag perpetually set.

In lpfc_dev_loss_tmo_callbk, a check is added for a recovered fabric node.
If the node is recovered, then don't queue the lpfc_dev_loss_tmo_handler
work. In lpfc_dev_loss_tmo_handler, the path taken for the recovered fabric
nodes is updated to clear the NLP_IN_DEV_LOSS flag.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
aeaf117cc7 scsi: lpfc: Relax PRLI issue conditions after GID_FT response
If previously in REG_LOGIN_ISSUE state, then remove the requirement that
PLOGI must have been received from the remote port before issuing a PRLI.
After GID_FT completes, it does not matter whether the driver itself sent a
PLOGI or received one.  The fact that we're in REG_LOGIN_ISSUE state simply
means that the next state should be issuing the PRLI to continue discovery
of the remote port.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
9609385dd9 scsi: lpfc: Allow DEVICE_RECOVERY mode after RSCN receipt if in PRLI_ISSUE state
Certain vendor specific targets initially register with the fabric as an
initiator function first and then re-register as a target function
afterwards.

The timing of the target function re-registration can cause a race
condition such that the driver is stuck assuming the remote port as an
initiator function and never discovers the target's hosted LUNs.

Expand the nlp_state qualifier to also include NLP_STE_PRLI_ISSUE because
the state means that PRLI was issued but we have not quite reached
MAPPED_NODE state yet.  If we received an RSCN in the PRLI_ISSUE state,
then we should restart discovery again by going into DEVICE_RECOVERY.

Fixes: dded1dc31a ("scsi: lpfc: Modify when a node should be put in device recovery mode during RSCN")
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
e999ef1542 scsi: lpfc: Cancel ELS WQE instead of issuing abort when SLI port is inactive
During SLI port errata events, there should be no expectation that
submitted outstanding WQEs will return back CQEs.  In these situations, the
driver should not rely on receiving CQEs from the SLI port to signal WQE
resource clean up.

Put an sli_flag LPFC_SLI_ACTIVE check in lpfc_els_flush_cmd() when walking
the txcmplq.  The sli_flag check helps determine whether to issue an abort
or driver based cancel on outstanding WQEs.  If !LPFC_SLI_ACTIVE, then
there's no point to issue anything to the SLI port.  Instead, let the
driver based cancel logic clean up the submitted WQE resources.

Also, enhance some abort log messages that help with future debugging.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Damien Le Moal
7a6bbc2829 scsi: sd: Do not repeat the starting disk message
The SCSI disk message "Starting disk" to signal resuming of a suspended
disk is printed in both sd_resume() and sd_resume_common() which results
in this message being printed twice when resuming from e.g. autosuspend:

$ echo 5000 > /sys/block/sda/device/power/autosuspend_delay_ms
$ echo auto > /sys/block/sda/device/power/control

[ 4962.438293] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 4962.501121] sd 0:0:0:0: [sda] Stopping disk

$ echo on > /sys/block/sda/device/power/control

[ 4972.805851] sd 0:0:0:0: [sda] Starting disk
[ 4980.558806] sd 0:0:0:0: [sda] Starting disk

Fix this double print by removing the call to sd_printk() from sd_resume()
and moving the call to sd_printk() in sd_resume_common() earlier in the
function, before the check using sd_do_start_stop().  Doing so, the message
is printed once regardless if sd_resume_common() actually executes
sd_start_stop_device() (i.e. SCSI device case) or not (libsas and libata
managed ATA devices case).

Fixes: 0c76106cb9 ("scsi: sd: Fix TCG OPAL unlock on system resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240701215326.128067-1-dlemoal@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:05:46 -04:00
Peter Wang
74736103fb scsi: ufs: core: Fix ufshcd_abort_one racing issue
When ufshcd_abort_one is racing with the completion ISR, the completed tag
of the request's mq_hctx pointer will be set to NULL by ISR.  Return
success when request is completed by ISR because ufshcd_abort_one does not
need to do anything.

The racing flow is:

Thread A
ufshcd_err_handler					step 1
	...
	ufshcd_abort_one
		ufshcd_try_to_abort_task
			ufshcd_cmd_inflight(true)	step 3
		ufshcd_mcq_req_to_hwq
			blk_mq_unique_tag
				rq->mq_hctx->queue_num	step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)			step 2
	scsi_done
		...
		__blk_mq_free_request
			rq->mq_hctx = NULL;		step 4

Below is KE back trace.
  ufshcd_try_to_abort_task: cmd at tag 41 not pending in the device.
  ufshcd_try_to_abort_task: cmd at tag=41 is cleared.
  Aborting tag 41 / CDB 0x28 succeeded
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
  pc : [0xffffffddd7a79bf8] blk_mq_unique_tag+0x8/0x14
  lr : [0xffffffddd6155b84] ufshcd_mcq_req_to_hwq+0x1c/0x40 [ufs_mediatek_mod_ise]
   do_mem_abort+0x58/0x118
   el1_abort+0x3c/0x5c
   el1h_64_sync_handler+0x54/0x90
   el1h_64_sync+0x68/0x6c
   blk_mq_unique_tag+0x8/0x14
   ufshcd_err_handler+0xae4/0xfa8 [ufs_mediatek_mod_ise]
   process_one_work+0x208/0x4fc
   worker_thread+0x228/0x438
   kthread+0x104/0x1d4
   ret_from_fork+0x10/0x20

Fixes: 93e6c0e19d ("scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode")
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20240628070030.30929-3-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:04:15 -04:00
Peter Wang
9307a998cb scsi: ufs: core: Fix ufshcd_clear_cmd racing issue
When ufshcd_clear_cmd is racing with the completion ISR, the completed tag
of the request's mq_hctx pointer will be set to NULL by the ISR.  And
ufshcd_clear_cmd's call to ufshcd_mcq_req_to_hwq will get NULL pointer KE.
Return success when the request is completed by ISR because sq does not
need cleanup.

The racing flow is:

Thread A
ufshcd_err_handler					step 1
	ufshcd_try_to_abort_task
		ufshcd_cmd_inflight(true)		step 3
		ufshcd_clear_cmd
			...
			ufshcd_mcq_req_to_hwq
			blk_mq_unique_tag
				rq->mq_hctx->queue_num	step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)			step 2
	scsi_done
		...
		__blk_mq_free_request
			rq->mq_hctx = NULL;		step 4

Below is KE back trace:

  ufshcd_try_to_abort_task: cmd pending in the device. tag = 6
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
   pc : [0xffffffd589679bf8] blk_mq_unique_tag+0x8/0x14
   lr : [0xffffffd5862f95b4] ufshcd_mcq_sq_cleanup+0x6c/0x1cc [ufs_mediatek_mod_ise]
   Workqueue: ufs_eh_wq_0 ufshcd_err_handler [ufs_mediatek_mod_ise]
   Call trace:
    dump_backtrace+0xf8/0x148
    show_stack+0x18/0x24
    dump_stack_lvl+0x60/0x7c
    dump_stack+0x18/0x3c
    mrdump_common_die+0x24c/0x398 [mrdump]
    ipanic_die+0x20/0x34 [mrdump]
    notify_die+0x80/0xd8
    die+0x94/0x2b8
    __do_kernel_fault+0x264/0x298
    do_page_fault+0xa4/0x4b8
    do_translation_fault+0x38/0x54
    do_mem_abort+0x58/0x118
    el1_abort+0x3c/0x5c
    el1h_64_sync_handler+0x54/0x90
    el1h_64_sync+0x68/0x6c
    blk_mq_unique_tag+0x8/0x14
    ufshcd_clear_cmd+0x34/0x118 [ufs_mediatek_mod_ise]
    ufshcd_try_to_abort_task+0x2c8/0x5b4 [ufs_mediatek_mod_ise]
    ufshcd_err_handler+0xa7c/0xfa8 [ufs_mediatek_mod_ise]
    process_one_work+0x208/0x4fc
    worker_thread+0x228/0x438
    kthread+0x104/0x1d4
    ret_from_fork+0x10/0x20

Fixes: 8d72903489 ("scsi: ufs: mcq: Add supporting functions for MCQ abort")
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20240628070030.30929-2-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:04:14 -04:00
Terrence Adams
76a20140ef scsi: pm8001: Update log level when reading config table
Reading the main config table occurs as a part of initialization in
pm80xx_chip_init(). Because of this it makes more sense to have it be a
part of the INIT logging.

Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-3-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:53:13 -04:00
Igor Pylypiv
e4f949ef15 scsi: pm80xx: Set phy->enable_completion only when we wait for it
pm8001_phy_control() populates the enable_completion pointer with a stack
address, sends a PHY_LINK_RESET / PHY_HARD_RESET, waits 300 ms, and
returns. The problem arises when a phy control response comes late.  After
300 ms the pm8001_phy_control() function returns and the passed
enable_completion stack address is no longer valid. Late phy control
response invokes complete() on a dangling enable_completion pointer which
leads to a kernel crash.

Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-2-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:53:13 -04:00
Kyoungrul Kim
7cbff570db scsi: ufs: core: Remove SCSI host only if added
If host tries to remove ufshcd driver from a UFS device it would cause a
kernel panic if ufshcd_async_scan fails during ufshcd_probe_hba before
adding a SCSI host with scsi_add_host and MCQ is enabled since SCSI host
has been defered after MCQ configuration introduced by commit 0cab4023ec
("scsi: ufs: core: Defer adding host to SCSI if MCQ is supported").

To guarantee that SCSI host is removed only if it has been added, set the
scsi_host_added flag to true after adding a SCSI host and check whether it
is set or not before removing it.

Signed-off-by: Kyoungrul Kim <k831.kim@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Link: https://lore.kernel.org/r/20240627085104epcms2p5897a3870ea5c6416aa44f94df6c543d7@epcms2p5
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:51:31 -04:00
Ram Prakash Gupta
ed7dac86f1 scsi: ufs: qcom: Enable suspending clk scaling on no request
Enable suspending clk scaling on no request for Qualcomm SoC.

Signed-off-by: Ram Prakash Gupta <quic_rampraka@quicinc.com>
Link: https://lore.kernel.org/r/20240627083756.25340-3-quic_rampraka@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:48:33 -04:00
Ram Prakash Gupta
50183ac2cf scsi: ufs: core: Suspend clk scaling on no request
Currently UFS clk scaling is getting suspended only when the clks are
scaled down. When high load is generated, a huge amount of latency is added
due to scaling up the clk and completing the request post that.

Suspending the scaling in its existing state when high load is generated
improves the random performance KPI by 28%. So suspending the scaling when
there are no requests. And the clk would be put in low scaled state when
the actual request load is low.

Make this change optional by having the check enabled using vops since for
some devices suspending without bringing the clk in low scaled state might
have impact on power consumption of the SoC.

Signed-off-by: Ram Prakash Gupta <quic_rampraka@quicinc.com>
Link: https://lore.kernel.org/r/20240627083756.25340-2-quic_rampraka@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:48:33 -04:00
Tomas Henzl
de24085328 scsi: mpi3mr: Correct a test in mpi3mr_sas_port_add()
The test for a possible shift overflow is not correct. Fix it by replacing
the '>' with a '>='.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20240627074827.13672-1-thenzl@redhat.com
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:40:32 -04:00
Christian Eggers
0005b2dc43 dsa: lan9303: Fix mapping between DSA port number and PHY address
The 'phy' parameter supplied to lan9303_phy_read/_write was sometimes a
DSA port number and sometimes a PHY address. This isn't a problem as
long as they are equal.  But if the external phy_addr_sel_strap pin is
wired to 'high', the PHY addresses change from 0-1-2 to 1-2-3 (CPU,
slave0, slave1).  In this case, lan9303_phy_read/_write must translate
between DSA port numbers and the corresponding PHY address.

Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240703145718.19951-1-ceggers@arri.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 19:19:21 -07:00
Rob Herring (Arm)
390b14b5e9 dt-bindings: net: Define properties at top-level
Convention is DT schemas should define all properties at the top-level
and not inside of if/then schemas. That minimizes the if/then schemas
and is more future proof.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://patch.msgid.link/20240703195827.1670594-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 19:18:11 -07:00
Aleksandr Mishin
85099c7ce4 wifi: rtw89: Fix array index mistake in rtw89_sta_info_get_iter()
In rtw89_sta_info_get_iter() 'status->he_gi' is compared to array size.
But then 'rate->he_gi' is used as array index instead of 'status->he_gi'.
This can lead to go beyond array boundaries in case of 'rate->he_gi' is
not equal to 'status->he_gi' and is bigger than array size. Looks like
"copy-paste" mistake.

Fix this mistake by replacing 'rate->he_gi' with 'status->he_gi'.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: e3ec7017f6 ("rtw89: add Realtek 802.11ax driver")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240703210510.11089-1-amishin@t-argos.ru
2024-07-05 10:03:48 +08:00
Fredrik Lönnegren
a1e7eafd12 wifi: rtlwifi: fix default typo
Change 'defult' to 'default' in comments in several rtlwifi drivers.

Signed-off-by: Fredrik Lönnegren <fredrik@frelon.se>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240703070627.135328-1-fredrik@frelon.se
2024-07-05 09:59:46 +08:00
Zong-Zhe Yang
8095364696 wifi: rtw89: unify the selection logic of RFK table when MCC
Driver will notify FW the target index of RFK table to use at some
moments. When MCC (multi-channel concurrent), the correctness of the
notification is especially important.

We now unify the selection logic of RFK table as below among chips.
1. check each table if it matches target channel
2. check all tables if any is idle by iterating active channels
3. replace the first table if all are busy unexpectedly

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240702124452.18747-2-pkshih@realtek.com
2024-07-05 09:51:34 +08:00
Dan Carpenter
4baf1cc544 power: supply: cros_charge-control: Fix signedness bug in charge_behaviour_store()
The C standard is vague about the signedness of enums, but in this case
here, they are treated as unsigned so the error handling does not work.
Use an int type to fix this.

Fixes: c6ed48ef52 ("power: supply: add ChromeOS EC based charge control driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/ZoWKEs4mCqeLyTOB@stanley.mountain
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
2024-07-05 01:51:33 +00:00
Zong-Zhe Yang
1e71be6a34 wifi: rtw89: mac: parse MRC C2H failure report
MRC (multi-role concurrency) has a C2H event for status report. Newer
FW will report some kinds of failures. We parse them now and show by
debug log.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240702124452.18747-1-pkshih@realtek.com
2024-07-05 09:50:44 +08:00
Ping-Ke Shih
52bc83ad2e wifi: rtw89: 8852bx: add extra handles of BTC for 8852BT in 8852b_common
For 8852BT, the initial settings of BT-coexistence is a little bit
different, so add the extra handles.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240701014619.7300-2-pkshih@realtek.com
2024-07-05 09:48:10 +08:00
Ping-Ke Shih
0e557c5c1a wifi: rtw89: 8852bx: move BTC common code from 8852b to 8852b_common
The BT coexistence part of 8852B and 8852BT are similar, so move shared
code into common module.

Don't change logic for existing RTL8852BE.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240701014619.7300-1-pkshih@realtek.com
2024-07-05 09:47:57 +08:00
Marcin Ślusarz
adc539784c wifi: rtw88: usb: schedule rx work after everything is set up
Right now it's possible to hit NULL pointer dereference in
rtw_rx_fill_rx_status on hw object and/or its fields because
initialization routine can start getting USB replies before
rtw_dev is fully setup.

The stack trace looks like this:

rtw_rx_fill_rx_status
rtw8821c_query_rx_desc
rtw_usb_rx_handler
...
queue_work
rtw_usb_read_port_complete
...
usb_submit_urb
rtw_usb_rx_resubmit
rtw_usb_init_rx
rtw_usb_probe

So while we do the async stuff rtw_usb_probe continues and calls
rtw_register_hw, which does all kinds of initialization (e.g.
via ieee80211_register_hw) that rtw_rx_fill_rx_status relies on.

Fix this by moving the first usb_submit_urb after everything
is set up.

For me, this bug manifested as:
[    8.893177] rtw_8821cu 1-1:1.2: band wrong, packet dropped
[    8.910904] rtw_8821cu 1-1:1.2: hw->conf.chandef.chan NULL in rtw_rx_fill_rx_status
because I'm using Larry's backport of rtw88 driver with the NULL
checks in rtw_rx_fill_rx_status.

Link: https://lore.kernel.org/linux-wireless/CA+shoWQ7P49jhQasofDcTdQhiuarPTjYEDa--NiVVx494WcuQw@mail.gmail.com/
Signed-off-by: Marcin Ślusarz <mslusarz@renau.com>
Cc: Tim K <tpkuester@gmail.com>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20240528110246.477321-1-marcin.slusarz@gmail.com
2024-07-05 09:38:18 +08:00
Dan Carpenter
99bf7c2ecc hwmon: (ltc2991) re-order conditions to fix off by one bug
LTC2991_T_INT_CH_NR is 4.  The st->temp_en[] array has LTC2991_MAX_CHANNEL
(4) elements.  Thus if "channel" is equal to LTC2991_T_INT_CH_NR then we
have read one element beyond the end of the array.  Flip the conditions
around so that we check if "channel" is valid before using it as an array
index.

Fixes: 2b9ea4262a ("hwmon: Add driver for ltc2991")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/Zoa9Y_UMY4_ROfhF@stanley.mountain
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-07-04 18:14:50 -07:00