linux

Author	SHA1	Message	Date
Linus Torvalds	d31e86ef63	arm64: access_ok() optimization The TBI setup on arm64 is very strange: HW is set up to always do TBI, but the kernel enforcement for system calls is purely a software contract, and user space is supposed to mask off the top bits before the system call. Except all the actual brk/mmap/etc() system calls then mask it in kernel space anyway, and accept any TBI address. This basically unifies things and makes access_ok() also ignore it. This is an ABI change, but the current situation is very odd, and this change avoids the current mess and makes the kernel more permissive, and as such is unlikely to break anything. The way forward - for some possible future situation when people want to use more bits - is probably to introduce a new "I actually want the full 64-bit address space" prctl. But we should make sure that the software and hardware rules actually match at that point. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-06-19 12:33:38 -07:00
Linus Torvalds	7fd298d4b3	arm64: start using 'asm goto' for put_user() This generates noticeably better code since we don't need to test the error register etc, the exception just jumps to the error handling directly. Unlike get_user(), there's no need to worry about old compilers. All supported compilers support the regular non-output 'asm goto', as pointed out by Nathan Chancellor. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-06-19 12:33:38 -07:00
Linus Torvalds	86a6a68feb	arm64: start using 'asm goto' for get_user() when available This generates noticeably better code with compilers that support it, since we don't need to test the error register etc, the exception just jumps to the error handling directly. Note that this also marks SW_TTBR0_PAN incompatible with KCSAN support, since KCSAN wants to save and restore the user access state. KCSAN and SW_TTBR0_PAN were probably always incompatible, but it became obvious only when implementing the unsafe user access functions. At that point the default empty user_access_save/restore() functions weren't provided by the default fallback functions. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-06-19 12:33:38 -07:00
Linus Torvalds	6ba59ff422	Linux 6.10-rc4	2024-06-16 13:40:16 -07:00
Linus Torvalds	6456c4256d	Merge tag 'parisc-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fix from Helge Deller: "On parisc we have suffered since years from random segfaults which seem to have been triggered due to cache inconsistencies. Those segfaults happened more often on machines with PA8800 and PA8900 CPUs, which have much bigger caches than the earlier machines. Dave Anglin has worked over the last few weeks to fix this bug. His patch has been successfully tested by various people on various machines and with various kernels (6.6, 6.8 and 6.9), and the debian buildd servers haven't shown a single random segfault with this patch. Since the cache handling has been reworked, the patch is slightly bigger than I would like in this stage, but the greatly improved stability IMHO justifies the inclusion now" * tag 'parisc-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Try to fix random segmentation faults in package builds	2024-06-16 11:50:16 -07:00
Linus Torvalds	4301487e6b	Merge tag 'i2c-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two fixes to correctly report i2c functionality, ensuring that I2C_FUNC_SLAVE is reported when a device operates solely as a slave interface" * tag 'i2c-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: designware: Fix the functionality flags of the slave-only interface i2c: at91: Fix the functionality flags of the slave-only interface	2024-06-16 11:37:38 -07:00
Linus Torvalds	b5beaa4474	Merge tag 'usb-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are some small USB and Thunderbolt driver fixes for 6.10-rc4. Included in here are: - thunderbolt debugfs bugfix - USB typec bugfixes - kcov usb bugfix - xhci bugfixes - usb-storage bugfix - dt-bindings bugfix - cdc-wdm log message spam bugfix All of these, except for the last cdc-wdm log level change, have been in linux-next for a while with no reported problems. The cdc-wdm bugfix has been tested by syzbot and proved to fix the reported cpu lockup issues when the log is constantly spammed by a broken device" * tag 'usb-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages xhci: Handle TD clearing for multiple streams case xhci: Apply broken streams quirk to Etron EJ188 xHCI host xhci: Apply reset resume quirk to Etron EJ188 xHCI host xhci: Set correct transferred length for cancelled bulk transfers usb-storage: alauda: Check whether the media is initialized usb: typec: ucsi: Ack also failed Get Error commands kcov, usb: disable interrupts in kcov_remote_start_usb_softirq dt-bindings: usb: realtek,rts5411: Add missing "additionalProperties" on child nodes usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected usb: typec: ucsi: glink: increase max ports for x1e80100 Revert "usb: chipidea: move ci_ulpi_init after the phy initialization" thunderbolt: debugfs: Fix margin debugfs node creation condition	2024-06-16 11:20:26 -07:00
Linus Torvalds	6efc63a843	Merge tag 'tty-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes that resolve som reported problems. Included in here are: - n_tty lookahead buffer bugfix - WARN_ON() removal where it was not needed - 8250_dw driver bugfixes - 8250_pxa bugfix - sc16is7xx Kconfig fixes for reported build issues All of these have been in linux-next for over a week with no reported problems" * tag 'tty-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: drop debugging WARN_ON_ONCE() from uart_write() serial: sc16is7xx: re-add Kconfig SPI or I2C dependency serial: sc16is7xx: rename Kconfig CONFIG_SERIAL_SC16IS7XX_CORE serial: port: Don't block system suspend even if bytes are left to xmit serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level serial: 8250_dw: Revert "Move definitions to the shared header" serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw tty: n_tty: Fix buffer offsets when lookahead is used	2024-06-16 11:05:47 -07:00
Linus Torvalds	d3e6dc4ff0	Merge tag 'staging-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fix from Greg KH: "Here is a single staging driver fix, for the vc04 driver. It resolves a reported problem that showed up in the merge window set of changes. It's been in linux-next for over a week with no reported problems" * tag 'staging-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: vchiq_debugfs: Fix NPD in vchiq_dump_state	2024-06-16 10:57:05 -07:00
Linus Torvalds	e12fa4dd64	Merge tag 'driver-core-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and sysfs fixes from Greg KH: "Here are three small changes for 6.10-rc4 that resolve reported problems, and finally drop an unused api call. These are: - removal of devm_device_add_groups(), all the callers of this are finally gone after the 6.10-rc1 merge (changes came in through different trees), so it's safe to remove. - much reported sysfs build error fixed up for systems that did not have sysfs enabled - driver core sync issue fix for a many reported issue over the years that no one really paid much attention to, until Dirk finally tracked down the real issue and made the "obviously correct and simple" fix for it. All of these have been in linux-next for over a week with no reported problems" * tag 'driver-core-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: drivers: core: synchronize really_probe() and dev_uevent() sysfs: Unbreak the build around sysfs_bin_attr_simple_read() driver core: remove devm_device_add_groups()	2024-06-16 10:43:04 -07:00
Linus Torvalds	33f855cbb7	Merge tag 'char-misc-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a number of small char/misc and iio driver fixes for 6.10-rc4. Included in here are the following: - iio driver fixes for a bunch of reported problems. - mei driver fixes for a number of reported issues. - amiga parport driver build fix. - .editorconfig fix that was causing lots of unintended whitespace changes to happen to files when they were being edited. Unless we want to sweep the whole tree and remove all trailing whitespace at once, this is needed for the .editorconfig file to be able to be used at all. This change is required because the original submitters never touched older files in the tree. - jfs bugfix for a buffer overflow The jfs bugfix is in here as I didn't know where else to put it, and it's been ignored for a while as the filesystem seems to be abandoned and I'm tired of seeing the same issue reported in multiple places. All of these have been in linux-next with no reported issues" * tag 'char-misc-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (25 commits) .editorconfig: remove trim_trailing_whitespace option jfs: xattr: fix buffer overflow for invalid xattr misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe() parport: amiga: Mark driver struct with __refdata to prevent section mismatch mei: vsc: Fix wrong invocation of ACPI SID method mei: vsc: Don't stop/restart mei device during system suspend/resume mei: me: release irq in mei_me_pci_resume error path mei: demote client disconnect warning on suspend to debug iio: inkern: fix channel read regression iio: imu: inv_mpu6050: stabilized timestamping in interrupt iio: adc: ad7173: Fix sampling frequency setting iio: adc: ad7173: Clear append status bit iio: imu: inv_icm42600: delete unneeded update watermark call iio: imu: inv_icm42600: stabilized timestamp in interrupt iio: invensense: fix odr switching to same value iio: adc: ad7173: Remove index from temp channel iio: adc: ad7173: Add ad7173_device_info names iio: adc: ad7173: fix buffers enablement for ad7176-2 iio: temperature: mlx90635: Fix ERR_PTR dereference in mlx90635_probe() ...	2024-06-16 10:29:37 -07:00
Linus Torvalds	e8b0264d6f	Merge tag 'ata-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: "Fix a bug where the SCSI Removable Media Bit (RMB) was incorrectly set for hot-plug capable (and eSATA) ports. The RMB bit means that the media is removable (e.g. floppy or CD-ROM), not that the device server is removable. If the RMB bit is set, SCSI will set the removable media sysfs attribute. If the removable media sysfs attribute is set on a device, GNOME/udisks will automatically mount the device on boot. We only want to set the SCSI RMB bit (and thus the removable media sysfs attribute) for devices where the ATA removable media device bit is set" * tag 'ata-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-scsi: Set the RMB bit only for removable media devices	2024-06-16 10:20:18 -07:00
Linus Torvalds	e39388e430	Merge tag 'edac_urgent_for_v6.10_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: - Fix two issues with MI300 address translation logic * tag 'edac_urgent_for_v6.10_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: RAS/AMD/ATL: Use system settings for MI300 DRAM to normalized address translation RAS/AMD/ATL: Fix MI300 bank hash	2024-06-16 10:11:11 -07:00
Linus Torvalds	be2fa8865c	Merge tag 'firewire-fixes-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixes from Takashi Sakamoto: - Update tracepoints events introduced in v6.10-rc1 so that it includes the numeric identifier of host card in which the event happens - replace wiki URL with the current website URL in Kconfig * tag 'firewire-fixes-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: core: record card index in bus_reset_handle tracepoints event firewire: core: record card index in tracepoinrts events derived from bus_reset_arrange_template firewire: core: record card index in async_phy_inbound tracepoints event firewire: core: record card index in async_phy_outbound_complete tracepoints event firewire: core: record card index in async_phy_outbound_initiate tracepoints event firewire: core: record card index in tracepoinrts events derived from async_inbound_template firewire: core: record card index in tracepoinrts events derived from async_outbound_initiate_template firewire: core: record card index in tracepoinrts events derived from async_outbound_complete_template firewire: fix website URL in Kconfig	2024-06-16 09:58:02 -07:00
Hans de Goede	fcf2a9970e	leds: class: Revert: "If no default trigger is given, make hw_control trigger the default trigger" Commit `66601a29bb` ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") causes ledtrig-netdev to get set as default trigger on various network LEDs. This causes users to hit a pre-existing AB-BA deadlock issue in ledtrig-netdev between the LED-trigger locks and the rtnl mutex, resulting in hung tasks in kernels >= 6.9. Solving the deadlock is non trivial, so for now revert the change to set the hw_control trigger as default trigger, so that ledtrig-netdev no longer gets activated automatically for various network LEDs. The netdev trigger is not needed because the network LEDs are usually under hw-control and the netdev trigger tries to leave things that way so setting it as the active trigger for the LED class device is a no-op. Fixes: `66601a29bb` ("leds: class: If no default trigger is given, make hw_control trigger the default trigger") Reported-by: Genes Lists <lists@sapience.com> Closes: https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@sapience.com/ Reported-by: Johannes Wüller <johanneswueller@gmail.com> Closes: https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Lee Jones <lee@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-06-16 09:33:28 -07:00
Wolfram Sang	7e9bb0cb50	Merge tag 'i2c-host-fixes-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current Two fixes from Jean aim to correctly report i2c functionality, specifically ensuring that I2C_FUNC_SLAVE is reported when a device operates solely as a slave interface.	2024-06-16 12:48:30 +02:00
Yazen Ghannam	ba437905b4	RAS/AMD/ATL: Use system settings for MI300 DRAM to normalized address translation The currently used normalized address format is not applicable to all MI300 systems. This leads to incorrect results during address translation. Drop the fixed layout and construct the normalized address from system settings. Fixes: `87a6123753` ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-2-2f11547a178c@amd.com	2024-06-16 11:22:57 +02:00
Linus Torvalds	a3e18a5405	Merge tag 'xfs-6.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fix from Chandan Babu: "Ensure xfs incore superblock's allocated inode counter, free inode counter, and free data block counter are all zero or positive when they are copied over from xfs_mount->m_[icount,ifree,fdblocks] respectively" * tag 'xfs-6.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: make sure sb_fdblocks is non-negative	2024-06-15 12:03:32 -07:00
Linus Torvalds	62e1f3b3fd	Merge tag '6.10-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: "Two small smb3 server fixes: - set xatttr fix - pathname parsing check fix" * tag '6.10-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix missing use of get_write in in smb2_set_ea() ksmbd: move leading slash check to smb2_get_name()	2024-06-15 12:00:25 -07:00
Linus Torvalds	08a6b55aa0	Merge tag 'x86-urgent-2024-06-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Fix the 8 bytes get_user() logic on x86-32 - Fix build bug that creates weird & mistaken target directory under arch/x86/ * tag 'x86-urgent-2024-06-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Don't add the EFI stub to targets, again x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking	2024-06-15 11:03:05 -07:00
Linus Torvalds	41d707222e	Merge tag 'timers-urgent-2024-06-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix boot-time warning in tick_setup_device()" * tag 'timers-urgent-2024-06-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device()	2024-06-15 10:54:24 -07:00
Takashi Sakamoto	893098b2af	firewire: core: record card index in bus_reset_handle tracepoints event The bus reset event occurs in the bus managed by one of 1394 OHCI controller in Linux system, however the existing tracepoints events has the lack of data about it to distinguish the issued hardware from the others. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-9-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:26 +09:00
Takashi Sakamoto	7507dbc46b	firewire: core: record card index in tracepoinrts events derived from bus_reset_arrange_template The asynchronous transmission of phy packet is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-8-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	abbb4bd96d	firewire: core: record card index in async_phy_inbound tracepoints event The asynchronous transmission of phy packet is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-7-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	810f2aa835	firewire: core: record card index in async_phy_outbound_complete tracepoints event The asynchronous transmission of phy packet is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-6-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	3cb44a72a3	firewire: core: record card index in async_phy_outbound_initiate tracepoints event The asynchronous transaction is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-5-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	65ec7ebefe	firewire: core: record card index in tracepoinrts events derived from async_inbound_template The asynchronous transaction is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-4-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	64e02b64fb	firewire: core: record card index in tracepoinrts events derived from async_outbound_initiate_template The asynchronous transaction is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-3-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	e7da16abf0	firewire: core: record card index in tracepoinrts events derived from async_outbound_complete_template The asynchronous transaction is initiated on one of 1394 OHCI controller, however the existing tracepoints events has the lack of data about it. This commit adds card_index member into event structure to store the index of host controller in use, and prints it. Link: https://lore.kernel.org/r/20240613131440.431766-2-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:17 +09:00
Takashi Sakamoto	e789523fe2	firewire: fix website URL in Kconfig The wiki in kernel.org is no longer updated. This commit replaces the website URL with the latest one. Link: https://lore.kernel.org/r/20240613090343.416198-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-06-15 14:59:04 +09:00
Linus Torvalds	44ef20baed	Merge tag 's390-6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - A couple of fixes for regressions resulting from the uncoupling of physical vs virtual kernel address spaces: fix the mapping of the kernel image using large pages; enforce alignment checks on physical addresses before creating large pages - Update defconfigs * tag 's390-6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Restore mapping of kernel image using large pages s390/mm: Allow large pages only for aligned physical addresses s390: Update defconfigs	2024-06-14 19:27:02 -07:00
Linus Torvalds	d4332da0f2	Merge tag 'drm-fixes-2024-06-15' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Weekly fixes. Seems a little quieter than usual, but still a bunch of stuff across the board. Mostly xe, some exynos and nouveau fixes. core: - Werror Kconfig fix panel: - add orientation quirk for Aya Neo KUN - fix runtime warning on panel/bridge release nouveau: - remove unused struct - fix wq crash on cards with no display amdgpu: - fix bo release clear page warning xe: - update MAINTAINERS - Use correct forcewake assertions - Assert that VRAM provisioning is only done on DGFX - Flush render caches before user-fence signalling on all engines - Move the disable_c6 call since it was sometimes never called exynos: - fix regression with fallback mode - fix EDID related memory leak - remove redundant code komeda: - fix debugfs conditional compilations - check pointer error value renesas: - atomic shutdown fix mediatek: - atomic shutdown fix" * tag 'drm-fixes-2024-06-15' of https://gitlab.freedesktop.org/drm/kernel: arm/komeda: Remove all CONFIG_DEBUG_FS conditional compilations drm/xe: move disable_c6 call drm/xe: flush engine buffers before signalling user fence on all engines drm/xe/pf: Assert LMEM provisioning is done only on DGFX drm/xe/xe_gt_idle: use GT forcewake domain assertion drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time drm: renesas: shmobile: Call drm_atomic_helper_shutdown() at shutdown time drm/nouveau: remove unused struct 'init_exec' drm/nouveau: don't attempt to schedule hpd_work on headless cards drm/amdgpu: Fix the BO release clear memory warning drm/bridge/panel: Fix runtime warning on panel bridge release drm/komeda: check for error-valued pointer drm: panel-orientation-quirks: Add quirk for Aya Neo KUN drm/exynos/vidi: fix memory leak in .get_modes() drm/exynos: dp: drop driver owner initialization drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found drm: have config DRM_WERROR depend on !WERROR MAINTAINERS: Update Xe driver maintainers MAINTAINERS: update Xe driver maintainers	2024-06-14 18:57:28 -07:00
Linus Torvalds	68132b3536	Merge tag 'vfio-v6.10-rc4' of https://github.com/awilliam/linux-vfio Pull VFIO fixes from Alex Williamson: "Fix long standing lockdep issue of using remap_pfn_range() from the vfio-pci fault handler for mapping device MMIO. Commit `ba168b52bf` ("mm: use rwsem assertion macros for mmap_lock") now exposes this as a warning forcing this to be addressed. remap_pfn_range() was used here to efficiently map the entire vma, but it really never should have been used in the fault handler and doesn't handle concurrency, which introduced complex locking. We also needed to track vmas mapping the device memory in order to zap those vmas when the memory is disabled resulting in a vma list. Instead of all that mess, setup an address space on the device fd such that we can use unmap_mapping_range() for zapping to avoid the tracking overhead and use the standard vmf_insert_pfn() to insert mappings on fault. For now we'll iterate the vma and opportunistically try to insert mappings for the entire vma. This aligns with typical use cases, but hopefully in the future we can drop the iterative approach and make use of huge_fault instead, once vmf_insert_pfn{pud,pmd}() learn to handle pfnmaps" * tag 'vfio-v6.10-rc4' of https://github.com/awilliam/linux-vfio: vfio/pci: Insert full vma on mmap'd MMIO fault vfio/pci: Use unmap_mapping_range() vfio: Create vfio_fs_type with inode per device	2024-06-14 18:46:53 -07:00
Dave Airlie	9f0a86492a	Merge tag 'drm-misc-fixes-2024-06-14' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc4: - Kconfig fix for WERROR. - Add panel quirk for Aya Neo KUN - Small bugfixes in komeda, bridge/panel, amdgpu, nouveau. - Remove unused nouveau struct. - Call drm_atomic_helper_shutdown for shmobile and mediatek on shutdown. - Remove DEBUGFS ifdefs from komeda. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/941c0552-3614-4af1-b04a-0a62c99fd7fb@linux.intel.com	2024-06-15 06:52:56 +10:00
Linus Torvalds	c286c21ff9	Merge tag 'block-6.10-20240614' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Discard double free on error conditions (Chunguang) - Target Fixes (Daniel) - Namespace detachment regression fix (Keith) - Fix for an issue with flush requests and queuelist reuse (Chengming) - nbd sparse annotation fixes (Christoph) - unmap and free bio mapped data via submitter (Anuj) - loop discard/fallocate unsupported fix (Cyril) - Fix for the zoned write plugging added in this release (Damien) - sed-opal wrong address fix (Su) * tag 'block-6.10-20240614' of git://git.kernel.dk/linux: loop: Disable fallocate() zero and discard if not supported nvme: fix namespace removal list nbd: Remove __force casts nvmet: always initialize cqe.result nvmet-passthru: propagate status from id override functions nvme: avoid double free special payload block: unmap and free user mapped integrity via submitter block: fix request.queuelist usage in flush block: Optimize disk zone resource cleanup block: sed-opal: avoid possible wrong address reference in read_sed_opal_key()	2024-06-14 11:41:50 -07:00
Linus Torvalds	ac3cb72aea	Merge tag 'io_uring-6.10-20240614' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: "Two fixes from Pavel headed to stable: - Ensure that the task state is correct before attempting to grab a mutex - Split cancel sequence flag into a separate variable, as it can get set by someone not owning the request (but holding the ctx lock)" * tag 'io_uring-6.10-20240614' of git://git.kernel.dk/linux: io_uring: fix cancellation overwriting req->flags io_uring/rsrc: don't lock while !TASK_RUNNING	2024-06-14 11:17:24 -07:00
Linus Torvalds	0b320c8601	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three obvious driver fixes and two core fixes. The two core fixes are to disable Command Duration Limits by default to fix an inconsistency in SATA and some USB devices. The other is to change the default read size for block zero to follow the device preference (some USB bridges preferring 16 byte commands don't have a translation for READ(10) and thus don't scan properly)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mpi3mr: Fix ATA NCQ priority support scsi: ufs: core: Quiesce request queues before checking pending cmds scsi: core: Disable CDL by default scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory scsi: sd: Use READ(16) when reading block zero on large capacity disks	2024-06-14 10:25:29 -07:00
Linus Torvalds	11100273f2	Merge tag 'iommu-fix-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fix from Joerg Roedel: "A single patch that fixes a regression which several people reported: - AMD-Vi: Fix regression causing panics" * tag 'iommu-fix-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix panic accessing amd_iommu_enable_faulting	2024-06-14 10:06:29 -07:00
Linus Torvalds	0cac73eb38	Merge tag 'pm-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Restore the behavior of the no_turbo sysfs attribute in the intel_pstate driver which allowed users to make the driver start using turbo P-states if they have been enabled on the fly by the firmware after OS initialization (Rafael Wysocki)" * tag 'pm-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Check turbo_is_disabled() in store_no_turbo()	2024-06-14 09:52:51 -07:00
Linus Torvalds	94df82fe5b	Merge tag 'acpi-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a recent regression in the ACPI EC driver and make system suspend work on multiple platforms where StorageD3Enable _DSD is missing in the ACPI tables. Specifics: - Make the ACPI EC driver directly evaluate an "orphan" _REG method under the EC device, if present, which stopped being evaluated after the driver had started to install its EC address space handler at the root of the ACPI namespace (Rafael Wysocki) - Make more devices put NVMe storage devices into D3 at suspend to work around missing StorageD3Enable _DSD in the BIOS (Mario Limonciello)" * tag 'acpi-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: EC: Evaluate orphan _REG under EC device ACPI: x86: Force StorageD3Enable on more products	2024-06-14 09:39:14 -07:00
Linus Torvalds	cee84c0b00	Merge tag 'thermal-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "These fix three issues introduced recently, two related to defects in ACPI tables supplied by the platform firmware and one cause by a thermal core change that went too far: - Prevent the thermal core from failing the registration of a cooling device if its .get_cur_state() reports an incorrect state to start with which may happen for fans handled through firmware-supplied AML in ACPI tables (Rafael Wysocki) - Make the ACPI thermal zone driver initialize all trip points with temperature of 0 centigrade and below as invalid because such trip point temperatures do not make sense on systems with ACPI thermal control and they cause performance regressions due to permanent thermal mitigations to occur (Rafael Wysocki) - Restore passive polling management in the Step-Wise thermal governor that uses it to ensure that all cooling devices used for thermal mitigation will go back to their initial states eventually (Rafael Wysocki)" * tag 'thermal-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: gov_step_wise: Restore passive polling management thermal: ACPI: Invalidate trip points with temperature of 0 or below thermal: core: Do not fail cdev registration because of invalid initial state	2024-06-14 09:28:56 -07:00
Rafael J. Wysocki	04f82fbb86	Merge branch acpi-x86 Merge a fix for a suspend issue related to storage handling on multiple systems based on AMD hardware: - Make more devices put NVMe storage devices into D3 at suspend to work around missing StorageD3Enable _DSD in the BIOS (Mario Limonciello). * branch acpi-x86: ACPI: x86: Force StorageD3Enable on more products	2024-06-14 14:27:16 +02:00
Cyril Hrubis	5f75e081ab	loop: Disable fallocate() zero and discard if not supported If fallcate is implemented but zero and discard operations are not supported by the filesystem the backing file is on we continue to fill dmesg with errors from the blk_mq_end_request() since each time we call fallocate() on the loop device the EOPNOTSUPP error from lo_fallocate() ends up propagated into the block layer. In the end syscall succeeds since the blkdev_issue_zeroout() falls back to writing zeroes which makes the errors even more misleading and confusing. How to reproduce: 1. make sure /tmp is mounted as tmpfs 2. dd if=/dev/zero of=/tmp/disk.img bs=1M count=100 3. losetup /dev/loop0 /tmp/disk.img 4. mkfs.ext2 /dev/loop0 5. dmesg \|tail [710690.898214] operation not supported error, dev loop0, sector 204672 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.898279] operation not supported error, dev loop0, sector 522 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.898603] operation not supported error, dev loop0, sector 16906 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.898917] operation not supported error, dev loop0, sector 32774 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.899218] operation not supported error, dev loop0, sector 49674 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.899484] operation not supported error, dev loop0, sector 65542 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.899743] operation not supported error, dev loop0, sector 82442 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.900015] operation not supported error, dev loop0, sector 98310 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.900276] operation not supported error, dev loop0, sector 115210 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 [710690.900546] operation not supported error, dev loop0, sector 131078 op 0x9:(WRITE_ZEROES) flags 0x8000800 phys_seg 0 prio class 0 This patch changes the lo_fallocate() to clear the flags for zero and discard operations if we get EOPNOTSUPP from the backing file fallocate callback, that way we at least stop spewing errors after the first unsuccessful try. CC: Jan Kara <jack@suse.cz> Signed-off-by: Cyril Hrubis <chrubis@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20240613163817.22640-1-chrubis@suse.cz Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-14 06:21:25 -06:00
Damien Le Moal	a6a75edc86	ata: libata-scsi: Set the RMB bit only for removable media devices The SCSI Removable Media Bit (RMB) should only be set for removable media, where the device stays and the media changes, e.g. CD-ROM or floppy. The ATA removable media device bit is obsoleted since ATA-8 ACS (2006), but before that it was used to indicate that the device can have its media removed (while the device stays). Commit `8a3e33cf92` ("ata: ahci: find eSATA ports and flag them as removable") introduced a change to set the RMB bit if the port has either the eSATA bit or the hot-plug capable bit set. The reasoning was that the author wanted his eSATA ports to get treated like a USB stick. This is however wrong. See "20-082r23SPC-6: Removable Medium Bit Expectations" which has since been integrated to SPC, which states that: """ Reports have been received that some USB Memory Stick device servers set the removable medium (RMB) bit to one. The rub comes when the medium is actually removed, because... The device server is removed concurrently with the medium removal. If there is no device server, then there is no device server that is waiting to have removable medium inserted. Sufficient numbers of SCSI analysts see such a device: - not as a device that supports removable medium; but - as a removable, hot pluggable device. """ The definition of the RMB bit in the SPC specification has since been clarified to match this. Thus, a USB stick should not have the RMB bit set (and neither shall an eSATA nor a hot-plug capable port). Commit `dc8b4afc4a` ("ata: ahci: don't mark HotPlugCapable Ports as external/removable") then changed so that the RMB bit is only set for the eSATA bit (and not for the hot-plug capable bit), because of a lot of bug reports of SATA devices were being automounted by udisks. However, treating eSATA and hot-plug capable ports differently is not correct. From the AHCI 1.3.1 spec: Hot Plug Capable Port (HPCP): When set to '1', indicates that this port's signal and power connectors are externally accessible via a joint signal and power connector for blindmate device hot plug. So a hot-plug capable port is an external port, just like commit `45b96d65ec` ("ata: ahci: a hotplug capable port is an external port") claims. In order to not violate the SPC specification, modify the SCSI INQUIRY data to only set the RMB bit if the ATA device can have its media removed. This fixes a reported problem where GNOME/udisks was automounting devices connected to hot-plug capable ports. Fixes: `45b96d65ec` ("ata: ahci: a hotplug capable port is an external port") Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Tested-by: Thomas Weißschuh <linux@weissschuh.net> Reported-by: Thomas Weißschuh <linux@weissschuh.net> Closes: https://lore.kernel.org/linux-ide/c0de8262-dc4b-4c22-9fac-33432e5bddd3@t-8ch.de/ Signed-off-by: Damien Le Moal <dlemoal@kernel.org> [cassel: wrote commit message] Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-06-14 14:18:46 +02:00
Maxime Ripard	14731a640e	Merge drm/drm-fixes into drm-misc-fixes Roll -rc3 and current drm/fixes in. This will also unstuck our for-next branch. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-06-14 09:55:46 +02:00
pengfuyuan	41f590e31c	arm/komeda: Remove all CONFIG_DEBUG_FS conditional compilations Since the debugfs functions have no-op stubs for CONFIG_DEBUG_FS=n, the compiler will optimize the rest away since they are no longer referenced. The benefit of removing the conditional compilation is that the build is actually tested for both CONFIG_DEBUG_FS configuration values. Assuming most developers have it enabled, CONFIG_DEBUG_FS=n is not tested much and may fail the build due to the conditional compilation. Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: pengfuyuan <pengfuyuan@kylinos.cn> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606120842.1377267-1-pengfuyuan@kylinos.cn Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-06-14 08:57:28 +02:00
Alan Stern	22f0081286	USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages The syzbot fuzzer found that the interrupt-URB completion callback in the cdc-wdm driver was taking too long, and the driver's immediate resubmission of interrupt URBs with -EPROTO status combined with the dummy-hcd emulation to cause a CPU lockup: cdc_wdm 1-1:1.0: nonzero urb status received: -71 cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625] CPU#0 Utilization every 4s during lockup: #1: 98% system, 0% softirq, 3% hardirq, 0% idle #2: 98% system, 0% softirq, 3% hardirq, 0% idle #3: 98% system, 0% softirq, 3% hardirq, 0% idle #4: 98% system, 0% softirq, 3% hardirq, 0% idle #5: 98% system, 1% softirq, 3% hardirq, 0% idle Modules linked in: irq event stamp: 73096 hardirqs last enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline] hardirqs last enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994 hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline] hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551 softirqs last enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline] softirqs last enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582 softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588 CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G W 6.10.0-rc2-syzkaller-g8867bbd4a056 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Testing showed that the problem did not occur if the two error messages -- the first two lines above -- were removed; apparently adding material to the kernel log takes a surprisingly large amount of time. In any case, the best approach for preventing these lockups and to avoid spamming the log with thousands of error messages per second is to ratelimit the two dev_err() calls. Therefore we replace them with dev_err_ratelimited(). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Suggested-by: Greg KH <gregkh@linuxfoundation.org> Reported-and-tested-by: syzbot+5f996b83575ef4058638@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/ Reported-and-tested-by: syzbot+1b2abad17596ad03dcff@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/ Fixes: `9908a32e94` ("USB: remove err() macro from usb class drivers") Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowland.harvard.edu/ Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/29855215-52f5-4385-b058-91f42c2bee18@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-14 08:47:59 +02:00
Pavel Begunkov	f4a1254f2a	io_uring: fix cancellation overwriting req->flags Only the current owner of a request is allowed to write into req->flags. Hence, the cancellation path should never touch it. Add a new field instead of the flag, move it into the 3rd cache line because it should always be initialised. poll_refs can move further as polling is an involved process anyway. It's a minimal patch, in the future we can and should find a better place for it and remove now unused REQ_F_CANCEL_SEQ. Fixes: `521223d7c2` ("io_uring/cancel: don't default to setting req->work.cancel_seq") Cc: stable@vger.kernel.org Reported-by: Li Shi <sl1589472800@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6827b129f8f0ad76fa9d1f0a773de938b240ffab.1718323430.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-13 19:25:28 -06:00
Dave Airlie	f1909e8597	Merge tag 'drm-xe-fixes-2024-06-13' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Core Changes: - Xe Maintainers update to MAINTAINERS file. Driver Changes: - Use correct forcewake assertions. - Assert that VRAM provisioning is only done on DGFX. - Flush render caches before user-fence signalling on all engines. - Move the disable_c6 call since it was sometimes never called. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZmrXV0FoBb8M0c6J@fedora	2024-06-14 11:08:06 +10:00
Dave Airlie	ae1e782362	Merge tag 'exynos-drm-fixes-for-v6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Regression fix - Fix an regression issue by adding 640x480 fallback mode for Exynos HDMI driver. Bug fix - Fix a memory leak by ensuring the duplicated EDID is properly freed in the get_modes function. Code cleanup - Remove redundant driver owner initialization since platform_driver_register() sets it automatically. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240610073839.37430-1-inki.dae@samsung.com	2024-06-14 10:12:27 +10:00
Jens Axboe	e3e53683cc	Merge tag 'nvme-6.10-2024-06-13' of git://git.infradead.org/nvme into block-6.10 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.10 - Discard double free on error conditions (Chunguang) - Target Fixes (Daniel) - Namespace detachment regression fix (Keith)" * tag 'nvme-6.10-2024-06-13' of git://git.infradead.org/nvme: nvme: fix namespace removal list nvmet: always initialize cqe.result nvmet-passthru: propagate status from id override functions nvme: avoid double free special payload	2024-06-13 14:19:57 -06:00
Keith Busch	ff0ffe5b7c	nvme: fix namespace removal list This function wants to move a subset of a list from one element to the tail into another list. It also needs to use the srcu synchronize instead of the regular rcu version. Do this one element at a time because that's the only to do it. Fixes: `be647e2c76` ("nvme: use srcu for iterating namespace list") Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-06-13 11:47:40 -07:00
Linus Torvalds	d20f6b3d74	Merge tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Slim pickings this time, probably a combination of summer, DevConf.cz, and the end of first half of the year at corporations. Current release - regressions: - Revert "igc: fix a log entry using uninitialized netdev", it traded lack of netdev name in a printk() for a crash Previous releases - regressions: - Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - geneve: fix incorrectly setting lengths of inner headers in the skb, confusing the drivers and causing mangled packets - sched: initialize noop_qdisc owner to avoid false-positive recursion detection (recursing on CPU 0), which bubbles up to user space as a sendmsg() error, while noop_qdisc should silently drop - netdevsim: fix backwards compatibility in nsim_get_iflink() Previous releases - always broken: - netfilter: ipset: fix race between namespace cleanup and gc in the list:set type" * tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits) bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() af_unix: Read with MSG_PEEK loops if the first unread byte is OOB bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response gve: Clear napi->skb before dev_kfree_skb_any() ionic: fix use after netif_napi_del() Revert "igc: fix a log entry using uninitialized netdev" net: bridge: mst: fix suspicious rcu usage in br_mst_set_state net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state net/ipv6: Fix the RT cache flush via sysctl using a previous delay net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters gve: ignore nonrelevant GSO type bits when processing TSO headers net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP netfilter: Use flowlabel flow key when re-routing mangled packets netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type netfilter: nft_inner: validate mandatory meta and payload tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() mailmap: map Geliang's new email address mptcp: pm: update add_addr counters after connect mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID mptcp: ensure snd_una is properly initialized on connect ...	2024-06-13 11:11:53 -07:00
Linus Torvalds	fd88e181d8	Merge tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client fixes from Trond Myklebust: "Bugfixes: - NFSv4.2: Fix a memory leak in nfs4_set_security_label - NFSv2/v3: abort nfs_atomic_open_v23 if the name is too long. - NFS: Add appropriate memory barriers to the sillyrename code - Propagate readlink errors in nfs_symlink_filler - NFS: don't invalidate dentries on transient errors - NFS: fix unnecessary synchronous writes in random write workloads - NFSv4.1: enforce rootpath check when deciding whether or not to trunk Other: - Change email address for Trond Myklebust due to email server concerns" * tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: add barriers when testing for NFS_FSDATA_BLOCKED SUNRPC: return proper error from gss_wrap_req_priv NFSv4.1 enforce rootpath check in fs_location query NFS: abort nfs_atomic_open_v23 if name is too long. nfs: don't invalidate dentries on transient errors nfs: Avoid flushing many pages with NFS_FILE_SYNC nfs: propagate readlink errors in nfs_symlink_filler MAINTAINERS: Change email address for Trond Myklebust NFSv4: Fix memory leak in nfs4_set_security_label	2024-06-13 11:07:32 -07:00
Linus Torvalds	3572597ca8	Merge tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fixes from Mike Rapoport: "Fix validation of NUMA coverage. memblock_validate_numa_coverage() was checking for a unset node ID using NUMA_NO_NODE, but x86 used MAX_NUMNODES when no node ID was specified by buggy firmware. Update memblock to substitute MAX_NUMNODES with NUMA_NO_NODE in memblock_set_node() and use NUMA_NO_NODE in x86::numa_init()" * tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node() memblock: make memblock_set_node() also warn about use of MAX_NUMNODES	2024-06-13 10:09:29 -07:00
Aleksandr Mishin	a9b9741854	bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() In case of token is released due to token->state == BNXT_HWRM_DEFERRED, released token (set to NULL) is used in log messages. This issue is expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But this error code is returned by recent firmware. So some firmware may not return it. This may lead to NULL pointer dereference. Adjust this issue by adding token pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `8fa4219dba` ("bnxt_en: add dynamic debug support for HWRM messages") Suggested-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 08:05:46 -07:00
Rao Shoaib	a6736a0add	af_unix: Read with MSG_PEEK loops if the first unread byte is OOB Read with MSG_PEEK flag loops if the first byte to read is an OOB byte. commit `22dd70eb2c` ("af_unix: Don't peek OOB data without MSG_OOB.") addresses the loop issue but does not address the issue that no data beyond OOB byte can be read. >>> from socket import * >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM) >>> c1.send(b'a', MSG_OOB) 1 >>> c1.send(b'b') 1 >>> c2.recv(1, MSG_PEEK \| MSG_DONTWAIT) b'b' >>> from socket import * >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM) >>> c2.setsockopt(SOL_SOCKET, SO_OOBINLINE, 1) >>> c1.send(b'a', MSG_OOB) 1 >>> c1.send(b'b') 1 >>> c2.recv(1, MSG_PEEK \| MSG_DONTWAIT) b'a' >>> c2.recv(1, MSG_PEEK \| MSG_DONTWAIT) b'a' >>> c2.recv(1, MSG_DONTWAIT) b'a' >>> c2.recv(1, MSG_PEEK \| MSG_DONTWAIT) b'b' >>> Fixes: `314001f0bf` ("af_unix: Add OOB support") Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240611084639.2248934-1-Rao.Shoaib@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 08:03:55 -07:00
Michael Chan	7d9df38c9c	bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response Firmware interface 1.10.2.118 has increased the size of HWRM_PORT_PHY_QCFG response beyond the maximum size that can be forwarded. When the VF's link state is not the default auto state, the PF will need to forward the response back to the VF to indicate the forced state. This regression may cause the VF to fail to initialize. Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum 96 bytes. The SPEEDS2_SUPPORTED flag needs to be cleared because the new speeds2 fields are beyond the legacy structure. Also modify bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96 bytes to make this failure more obvious. Fixes: `84a911db83` ("bnxt_en: Update firmware interface to 1.10.2.118") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240612231736.57823-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 07:50:16 -07:00
Greg Kroah-Hartman	7da9dfdd5a	.editorconfig: remove trim_trailing_whitespace option Some editors (like the vim variants), when seeing "trim_whitespace" decide to do just that for all of the whitespace in the file you are saving, even if it is not on a line that you have modified. This plays havoc with diffs and is NOT something that should be intended. As the "only trim whitespace on modified lines" is not part of the editorconfig standard yet, just delete these lines from the .editorconfig file so that we don't end up with diffs that are automatically rejected by maintainers for containing things they shouldn't. Cc: Danny Lin <danny@kdrag0n.dev> Cc: Íñigo Huguet <ihuguet@redhat.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Fixes: `5a602de997` ("Add .editorconfig file for basic formatting") Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/r/2024061137-jawless-dipped-e789@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-13 16:47:52 +02:00
Ziwei Xiao	6f4d93b78a	gve: Clear napi->skb before dev_kfree_skb_any() gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer. Fix this by clearing napi->skb before the skb is freed. Fixes: `9b8dd5e5ea` ("gve: DQO: Add RX path") Cc: stable@vger.kernel.org Reported-by: Shailend Chand <shailend@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 07:37:35 -07:00
Taehee Yoo	79f18a41dd	ionic: fix use after netif_napi_del() When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: `0f3154e6bc` ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 07:30:06 -07:00
Sasha Neftin	8eef5c3cea	Revert "igc: fix a log entry using uninitialized netdev" This reverts commit `86167183a1`. igc_ptp_init() needs to be called before igc_reset(), otherwise kernel crash could be observed. Following the corresponding discussion [1] and [2] revert this commit. Link: https://lore.kernel.org/all/8fb634f8-7330-4cf4-a8ce-485af9c0a61a@intel.com/ [1] Link: https://lore.kernel.org/all/87o78rmkhu.fsf@intel.com/ [2] Fixes: `86167183a1` ("igc: fix a log entry using uninitialized netdev") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240611162456.961631-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 07:24:52 -07:00
Riana Tauro	2470b141bf	drm/xe: move disable_c6 call disable c6 called in guc_pc_fini_hw is unreachable. GuC PC init returns earlier if skip_guc_pc is true and never registers the finish call thus making disable_c6 unreachable. move this call to gt idle. v2: rebase v3: add fixes tag (Himal) Fixes: `975e4a3795` ("drm/xe: Manually setup C6 when skip_guc_pc is set") Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606100842.956072-3-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `6800e63cf9`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-06-13 12:35:13 +02:00
Andrzej Hajda	b5e3a9b83f	drm/xe: flush engine buffers before signalling user fence on all engines Tests show that user fence signalling requires kind of write barrier, otherwise not all writes performed by the workload will be available to userspace. It is already done for render and compute, we need it also for the rest: video, gsc, copy. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240605-fix_user_fence_posted-v3-2-06e7932f784a@intel.com (cherry picked from commit `3ad7d18c5d`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-06-13 11:36:23 +02:00
Michal Wajdeczko	cd554e1e11	drm/xe/pf: Assert LMEM provisioning is done only on DGFX The Local Memory (aka VRAM) is only available on DGFX platforms. We shouldn't attempt to provision VFs with LMEM or attempt to update the LMTT on non-DGFX platforms. Add missing asserts that would enforce that and fix release code that could crash on iGFX due to uninitialized LMTT. Fixes: `0698ff57bf` ("drm/xe/pf: Update the LMTT when freeing VF GT config") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240607153155.1592-1-michal.wajdeczko@intel.com (cherry picked from commit `b321cb83a3`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-06-13 11:33:07 +02:00
Riana Tauro	7c877115da	drm/xe/xe_gt_idle: use GT forcewake domain assertion The rc6 registers used in disable_c6 function belong to the GT forcewake domain. Hence change the forcewake assertion to check GT forcewake domain. v2: add fixes tag (Himal) Fixes: `975e4a3795` ("drm/xe: Manually setup C6 when skip_guc_pc is set") Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606100842.956072-2-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `21b7085546`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-06-13 11:33:07 +02:00
Rafael J. Wysocki	0e6b6dedf1	ACPI: EC: Evaluate orphan _REG under EC device After starting to install the EC address space handler at the ACPI namespace root, if there is an "orphan" _REG method in the EC device's scope, it will not be evaluated any more. This breaks EC operation regions on some systems, like Asus gu605. To address this, use a wrapper around an existing ACPICA function to look for an "orphan" _REG method in the EC device scope and evaluate it if present. Fixes: `60fa6ae6e6` ("ACPI: EC: Install address space handler at the namespace root") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218945 Reported-by: VitaliiT <vitaly.torshyn@gmail.com> Tested-by: VitaliiT <vitaly.torshyn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-13 11:28:54 +02:00
Dimitri Sivanich	12243a8115	iommu/amd: Fix panic accessing amd_iommu_enable_faulting This fixes a bug introduced by commit `d74169ceb0` ("iommu/vt-d: Allocate DMAR fault interrupts locally"). The panic happens when amd_iommu_enable_faulting is called from CPUHP_AP_ONLINE_DYN context. Fixes: `d74169ceb0` ("iommu/vt-d: Allocate DMAR fault interrupts locally") Signed-off-by: Dimitri Sivanich <sivanich@hpe.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/ZljHE/R4KLzGU6vx@hpe.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-13 11:16:05 +02:00
Benjamin Segall	b2747f108b	x86/boot: Don't add the EFI stub to targets, again This is a re-commit of `da05b143a3` ("x86/boot: Don't add the EFI stub to targets") after the tagged patch incorrectly reverted it. vmlinux-objs-y is added to targets, with an assumption that they are all relative to $(obj); adding a $(objtree)/drivers/... path causes the build to incorrectly create a useless arch/x86/boot/compressed/drivers/... directory tree. Fix this just by using a different make variable for the EFI stub. Fixes: `cb8bda8ad4` ("x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S") Signed-off-by: Ben Segall <bsegall@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: stable@vger.kernel.org # v6.1+ Link: https://lore.kernel.org/r/xm267ceukksz.fsf@bsegall.svl.corp.google.com	2024-06-13 10:32:36 +02:00
Jakub Kicinski	b60b1bdc18	Merge branch 'net-bridge-mst-fix-suspicious-rcu-usage-warning' Nikolay Aleksandrov says: ==================== net: bridge: mst: fix suspicious rcu usage warning This set fixes a suspicious RCU usage warning triggered by syzbot[1] in the bridge's MST code. After I converted br_mst_set_state to RCU, I forgot to update the vlan group dereference helper. Fix it by using the proper helper, in order to do that we need to pass the vlan group which is already obtained correctly by the callers for their respective context. Patch 01 is a requirement for the fix in patch 02. Note I did consider rcu_dereference_rtnl() but the churn is much bigger and in every part of the bridge. We can do that as a cleanup in net-next. [1] https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe ============================= WARNING: suspicious RCU usage 6.10.0-rc2-syzkaller-00235-g8a92980606e3 #0 Not tainted ----------------------------- net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 4 locks held by syz-executor.1/5374: #0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock include/linux/mmap_lock.h:144 [inline] #0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: __mm_populate+0x1b0/0x460 mm/gup.c:2111 #1: ffffc90000a18c00 ((&p->forward_delay_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x650 kernel/time/timer.c:1789 #2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: br_forward_delay_timer_expired+0x50/0x440 net/bridge/br_stp_timer.c:86 #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline] #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline] #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: br_mst_set_state+0x171/0x7a0 net/bridge/br_mst.c:105 stack backtrace: CPU: 1 PID: 5374 Comm: syz-executor.1 Not tainted 6.10.0-rc2-syzkaller-00235-g8a92980606e3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712 nbp_vlan_group net/bridge/br_private.h:1599 [inline] br_mst_set_state+0x29e/0x7a0 net/bridge/br_mst.c:106 br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47 br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88 call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792 expire_timers kernel/time/timer.c:1843 [inline] __run_timers kernel/time/timer.c:2417 [inline] __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428 run_timer_base kernel/time/timer.c:2437 [inline] run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 </IRQ> <TASK> ==================== Link: https://lore.kernel.org/r/20240609103654.914987-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 18:24:26 -07:00
Nikolay Aleksandrov	546ceb1dfd	net: bridge: mst: fix suspicious rcu usage in br_mst_set_state I converted br_mst_set_state to RCU to avoid a vlan use-after-free but forgot to change the vlan group dereference helper. Switch to vlan group RCU deref helper to fix the suspicious rcu usage warning. Fixes: `3a7c1661ae` ("net: bridge: mst: fix vlan use-after-free") Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20240609103654.914987-3-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 18:24:24 -07:00
Nikolay Aleksandrov	36c92936e8	net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state Pass the already obtained vlan group pointer to br_mst_vlan_set_state() instead of dereferencing it again. Each caller has already correctly dereferenced it for their context. This change is required for the following suspicious RCU dereference fix. No functional changes intended. Fixes: `3a7c1661ae` ("net: bridge: mst: fix vlan use-after-free") Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20240609103654.914987-2-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 18:24:24 -07:00
Petr Pavlu	14a20e5b4a	net/ipv6: Fix the RT cache flush via sysctl using a previous delay The net.ipv6.route.flush system parameter takes a value which specifies a delay used during the flush operation for aging exception routes. The written value is however not used in the currently requested flush and instead utilized only in the next one. A problem is that ipv6_sysctl_rtcache_flush() first reads the old value of net->ipv6.sysctl.flush_delay into a local delay variable and then calls proc_dointvec() which actually updates the sysctl based on the provided input. Fix the problem by switching the order of the two operations. Fixes: `4990509f19` ("[NETNS][IPV6]: Make sysctls route per namespace.") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 17:51:35 -07:00
Linus Torvalds	2ccbdf43d5	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux Pull ARM and clkdev fixes from Russell King: - Fix clkdev - erroring out on long strings causes boot failures, so don't do this. Still warn about the over-sized strings (which will never match and thus their registration with clkdev is useless) - Fix for ftrace with frame pointer unwinder with recent GCC changing the way frames are stacked. * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9405/1: ftrace: Don't assume stack frames are contiguous in memory clkdev: don't fail clkdev_alloc() if over-sized	2024-06-12 16:58:05 -07:00
Jakub Kicinski	d92589f8fd	Merge tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 fixes insufficient sanitization of netlink attributes for the inner expression which can trigger nul-pointer dereference, from Davide Ornaghi. Patch #2 address a report that there is a race condition between namespace cleanup and the garbage collection of the list:set type. This patch resolves this issue with other minor issues as well, from Jozsef Kadlecsik. Patch #3 ip6_route_me_harder() ignores flowlabel/dsfield when ip dscp has been mangled, this unbreaks ip6 dscp set $v, from Florian Westphal. All of these patches address issues that are present in several releases. * tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: Use flowlabel flow key when re-routing mangled packets netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type netfilter: nft_inner: validate mandatory meta and payload ==================== Link: https://lore.kernel.org/r/20240611220323.413713-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 16:29:00 -07:00
Linus Torvalds	0b4989ebe8	Merge tag 'bcachefs-2024-06-12' of https://evilpiepirate.org/git/bcachefs Pull bcachefs fixes from Kent Overstreet: - fix kworker explosion, due to calling submit_bio() (which can block) from a multithreaded workqueue - fix error handling in btree node scan - forward compat fix: kill an old debug assert - key cache shrinker fixes This is a partial fix for stalls doing multithreaded creates - there were various O(n^2) issues the key cache shrinker was hitting [1]. There's more work coming here; I'm working on a patch to delete the key cache lock, which initial testing shows to be a pretty drastic performance improvement - assorted syzbot fixes Link: https://lore.kernel.org/linux-bcachefs/CAGudoHGenxzk0ZqPXXi1_QDbfqQhGHu+wUwzyS6WmfkUZ1HiXA@mail.gmail.com/ [1] * tag 'bcachefs-2024-06-12' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix rcu_read_lock() leak in drop_extra_replicas bcachefs: Add missing bch_inode_info.ei_flags init bcachefs: Add missing synchronize_srcu_expedited() call when shutting down bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket() bcachefs: Replace bucket_valid() asserts in bucket lookup with proper checks bcachefs: Fix snapshot_create_lock lock ordering bcachefs: Fix refcount leak in check_fix_ptrs() bcachefs: Leave a buffer in the btree key cache to avoid lock thrashing bcachefs: Fix reporting of freed objects from key cache shrinker bcachefs: set sb->s_shrinker->seeks = 0 bcachefs: increase key cache shrinker batch size bcachefs: Enable automatic shrinking for rhashtables bcachefs: fix the display format for show-super bcachefs: fix stack frame size in fsck.c bcachefs: Delete incorrect BTREE_ID_NR assertion bcachefs: Fix incorrect error handling found_btree_node_is_readable() bcachefs: Split out btree_write_submit_wq	2024-06-12 15:08:23 -07:00
Alex Williamson	d71a989cf5	vfio/pci: Insert full vma on mmap'd MMIO fault In order to improve performance of typical scenarios we can try to insert the entire vma on fault. This accelerates typical cases, such as when the MMIO region is DMA mapped by QEMU. The vfio_iommu_type1 driver will fault in the entire DMA mapped range through fixup_user_fault(). In synthetic testing, this improves the time required to walk a PCI BAR mapping from userspace by roughly 1/3rd. This is likely an interim solution until vmf_insert_pfn_{pmd,pud}() gain support for pfnmaps. Suggested-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/all/Zl6XdUkt%2FzMMGOLF@yzhao56-desk.sh.intel.com/ Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Link: https://lore.kernel.org/r/20240607035213.2054226-1-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2024-06-12 15:40:39 -06:00
Christoph Hellwig	957df9af72	nbd: Remove __force casts Make it again possible for sparse to verify that blk_status_t and Unix error codes are used in the proper context by making nbd_send_cmd() return a blk_status_t instead of an integer. No functionality has been changed. Signed-off-by: Christoph Hellwig <hch@lst.de> [ bvanassche: added description and made two small formatting changes ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20240604221531.327131-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 14:47:07 -06:00
Pavel Begunkov	54559642b9	io_uring/rsrc: don't lock while !TASK_RUNNING There is a report of io_rsrc_ref_quiesce() locking a mutex while not TASK_RUNNING, which is due to forgetting restoring the state back after io_run_task_work_sig() and attempts to break out of the waiting loop. do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff815d2494>] prepare_to_wait+0xa4/0x380 kernel/sched/wait.c:237 WARNING: CPU: 2 PID: 397056 at kernel/sched/core.c:10099 __might_sleep+0x114/0x160 kernel/sched/core.c:10099 RIP: 0010:__might_sleep+0x114/0x160 kernel/sched/core.c:10099 Call Trace: <TASK> __mutex_lock_common kernel/locking/mutex.c:585 [inline] __mutex_lock+0xb4/0x940 kernel/locking/mutex.c:752 io_rsrc_ref_quiesce+0x590/0x940 io_uring/rsrc.c:253 io_sqe_buffers_unregister+0xa2/0x340 io_uring/rsrc.c:799 __io_uring_register io_uring/register.c:424 [inline] __do_sys_io_uring_register+0x5b9/0x2400 io_uring/register.c:613 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd8/0x270 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6f/0x77 Reported-by: Li Shi <sl1589472800@gmail.com> Fixes: `4ea15b56f0` ("io_uring/rsrc: use wq for quiescing") Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/77966bc104e25b0534995d5dbb152332bc8f31c0.1718196953.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 13:02:12 -06:00
Daniel Wagner	cd0c1b8e04	nvmet: always initialize cqe.result The spec doesn't mandate that the first two double words (aka results) for the command queue entry need to be set to 0 when they are not used (not specified). Though, the target implemention returns 0 for TCP and FC but not for RDMA. Let's make RDMA behave the same and thus explicitly initializing the result field. This prevents leaking any data from the stack. Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-06-12 11:00:08 -07:00
Daniel Wagner	d76584e53f	nvmet-passthru: propagate status from id override functions The id override functions return a status which is not propagated to the caller. Fixes: `c1fef73f79` ("nvmet: add passthru code to process commands") Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-06-12 11:00:08 -07:00
Chunguang Xu	e5d574ab37	nvme: avoid double free special payload If a discard request needs to be retried, and that retry may fail before a new special payload is added, a double free will result. Clear the RQF_SPECIAL_LOAD when the request is cleaned. Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-06-12 10:56:50 -07:00
Anuj Gupta	e038ee6189	block: unmap and free user mapped integrity via submitter The user mapped intergity is copied back and unpinned by bio_integrity_free which is a low-level routine. Do it via the submitter rather than doing it in the low-level block layer code, to split the submitter side from the consumer side of the bio. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240610111144.14647-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 11:00:50 -06:00
Chengming Zhou	d0321c812d	block: fix request.queuelist usage in flush Friedrich Weber reported a kernel crash problem and bisected to commit `81ada09cc2` ("blk-flush: reuse rq queuelist in flush state machine"). The root cause is that we use "list_move_tail(&rq->queuelist, pending)" in the PREFLUSH/POSTFLUSH sequences. But rq->queuelist.next == xxx since it's popped out from plug->cached_rq in __blk_mq_alloc_requests_batch(). We don't initialize its queuelist just for this first request, although the queuelist of all later popped requests will be initialized. Fix it by changing to use "list_add_tail(&rq->queuelist, pending)" so rq->queuelist doesn't need to be initialized. It should be ok since rq can't be on any list when PREFLUSH or POSTFLUSH, has no move actually. Please note the commit `81ada09cc2` ("blk-flush: reuse rq queuelist in flush state machine") also has another requirement that no drivers would touch rq->queuelist after blk_mq_end_request() since we will reuse it to add rq to the post-flush pending list in POSTFLUSH. If this is not true, we will have to revert that commit IMHO. This updated version adds "list_del_init(&rq->queuelist)" in flush rq callback since the dm layer may submit request of a weird invalid format (REQ_FSEQ_PREFLUSH \| REQ_FSEQ_POSTFLUSH), which causes double list_add if without this "list_del_init(&rq->queuelist)". The weird invalid format problem should be fixed in dm layer. Reported-by: Friedrich Weber <f.weber@proxmox.com> Closes: https://lore.kernel.org/lkml/14b89dfb-505c-49f7-aebb-01c54451db40@proxmox.com/ Closes: https://lore.kernel.org/lkml/c9d03ff7-27c5-4ebd-b3f6-5a90d96f35ba@proxmox.com/ Fixes: `81ada09cc2` ("blk-flush: reuse rq queuelist in flush state machine") Cc: Christoph Hellwig <hch@lst.de> Cc: ming.lei@redhat.com Cc: bvanassche@acm.org Tested-by: Friedrich Weber <f.weber@proxmox.com> Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240608143115.972486-1-chengming.zhou@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 10:58:11 -06:00
Damien Le Moal	1933192a91	block: Optimize disk zone resource cleanup For zoned block devices using zone write plugging, an rcu_barrier() call is needed in disk_free_zone_resources() to synchronize freeing of zone write plugs and the destrution of the mempool used to allocate the plugs. The barrier call does slow down a little teardown of zoned block devices but should not affect teardown of regular block devices or zoned block devices that do not use zone write plugging (e.g. zoned DM devices that do not require zone append emulation). Modify disk_free_zone_resources() to return early if we do not have a mempool to start with, that is, if the device does not use zone write plugging. This avoids the costly rcu_barrier() and speeds up disk teardown. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240607002126.104227-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 10:56:45 -06:00
Su Hui	9b1ebce6a1	block: sed-opal: avoid possible wrong address reference in read_sed_opal_key() Clang static checker (scan-build) warning: block/sed-opal.c:line 317, column 3 Value stored to 'ret' is never read. Fix this problem by returning the error code when keyring_search() failed. Otherwise, 'key' will have a wrong value when 'kerf' stores the error code. Fixes: `3bfeb61256` ("block: sed-opal: keyring support for SED keys") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20240611073659.429582-1-suhui@nfschina.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-12 10:53:20 -06:00
Andy Shevchenko	cea2a26553	mailmap: Add my outdated addresses to the map file There is a couple of outdated addresses that are still visible in the Git history, add them to .mailmap. While at it, replace one in the comment. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-06-12 09:28:04 -07:00
Jean Delvare	cbf3fb5b29	i2c: designware: Fix the functionality flags of the slave-only interface When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities. Fixes: `5b6d721b26` ("i2c: designware: enable SLAVE in platform module") Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Luis Oliveira <lolivei@synopsys.com> Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Jan Dabros <jsd@semihalf.com> Cc: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-06-12 17:07:34 +01:00
Jean Delvare	d6d5645e5f	i2c: at91: Fix the functionality flags of the slave-only interface When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities. Fixes: `9d3ca54b55` ("i2c: at91: added slave mode support") Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Juergen Fitschen <me@jue.yt> Cc: Ludovic Desroches <ludovic.desroches@microchip.com> Cc: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Cc: Andi Shyti <andi.shyti@kernel.org> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-06-12 17:07:33 +01:00
Rafael J. Wysocki	350cbb5d2f	cpufreq: intel_pstate: Check turbo_is_disabled() in store_no_turbo() After recent changes in intel_pstate, global.turbo_disabled is only set at the initialization time and never changed. However, it turns out that on some systems the "turbo disabled" bit in MSR_IA32_MISC_ENABLE, the initial state of which is reflected by global.turbo_disabled, can be flipped later and there should be a way to take that into account (other than checking that MSR every time the driver runs which is costly and useless overhead on the vast majority of systems). For this purpose, notice that before the changes in question, store_no_turbo() contained a turbo_is_disabled() check that was used for updating global.turbo_disabled if the "turbo disabled" bit in MSR_IA32_MISC_ENABLE had been flipped and that functionality can be restored. Then, users will be able to reset global.turbo_disabled by writing 0 to no_turbo which used to work before on systems with flipping "turbo disabled" bit. This guarantees the driver state to remain in sync, but READ_ONCE() annotations need to be added in two places where global.turbo_disabled is accessed locklessly, so modify the driver to make that happen. Fixes: `0940f1a801` ("cpufreq: intel_pstate: Do not update global.turbo_disabled after initialization") Closes: https://lore.kernel.org/linux-pm/bf3ebf1571a4788e97daf861eb493c12d42639a3.camel@xry111.site Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reported-by: Xi Ruoyao <xry111@xry111.site> Tested-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-12 14:11:50 +02:00
Douglas Anderson	c38896ca63	drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown time Based on grepping through the source code this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. Among other things, this means that if a panel is in use that it won't be cleanly powered off at system shutdown time. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown/restart comes straight out of the kernel doc "driver instance overview" in drm_drv.c. This driver users the component model and shutdown happens in the base driver. The "drvdata" for this driver will always be valid if shutdown() is called and as of commit `2a07396828` ("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a noop") we don't need to confirm that "drm" is non-NULL. Suggested-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Fei Shao <fshao@chromium.org> Tested-by: Fei Shao <fshao@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240611102744.v2.1.I2b014f90afc4729b6ecc7b5ddd1f6dedcea4625b@changeid	2024-06-12 09:54:23 +02:00
Douglas Anderson	0320ca14c6	drm: renesas: shmobile: Call drm_atomic_helper_shutdown() at shutdown time Based on grepping through the source code, this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. This is important because drm_atomic_helper_shutdown() will cause panels to get disabled cleanly which may be important for their power sequencing. Future changes will remove any custom powering off in individual panel drivers so the DRM drivers need to start getting this right. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown comes straight out of the kernel doc "driver instance overview" in drm_drv.c. [geert: shmob_drm_remove() already calls drm_atomic_helper_shutdown] Suggested-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230901164111.RFT.15.Iaf638a1d4c8b3c307a6192efabb4cbb06b195f15@changeid [geert: s/drm_helper_force_disable_all/drm_atomic_helper_shutdown/] Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/17c6a5a668e5975f871b77fb1fca6711a0799d9e.1718176895.git.geert+renesas@glider.be	2024-06-12 09:54:07 +02:00
Hector Martin	5ceac4402f	xhci: Handle TD clearing for multiple streams case When multiple streams are in use, multiple TDs might be in flight when an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for each, to ensure everything is reset properly and the caches cleared. Change the logic so that any N>1 TDs found active for different streams are deferred until after the first one is processed, calling xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to queue another command until we are done with all of them. Also change the error/"should never happen" paths to ensure we at least clear any affected TDs, even if we can't issue a command to clear the hardware cache, and complain loudly with an xhci_warn() if this ever happens. This problem case dates back to commit `e9df17eb14` ("USB: xhci: Correct assumptions about number of rings per endpoint.") early on in the XHCI driver's life, when stream support was first added. It was then identified but not fixed nor made into a warning in commit `674f8438c1` ("xhci: split handling halted endpoints into two steps"), which added a FIXME comment for the problem case (without materially changing the behavior as far as I can tell, though the new logic made the problem more obvious). Then later, in commit `94f339147f` ("xhci: Fix failure to give back some cached cancelled URBs."), it was acknowledged again. [Mathias: commit `94f339147f` ("xhci: Fix failure to give back some cached cancelled URBs.") was a targeted regression fix to the previously mentioned patch. Users reported issues with usb stuck after unmounting/disconnecting UAS devices. This rolled back the TD clearing of multiple streams to its original state.] Apparently the commit author was aware of the problem (yet still chose to submit it): It was still mentioned as a FIXME, an xhci_dbg() was added to log the problem condition, and the remaining issue was mentioned in the commit description. The choice of making the log type xhci_dbg() for what is, at this point, a completely unhandled and known broken condition is puzzling and unfortunate, as it guarantees that no actual users would see the log in production, thereby making it nigh undebuggable (indeed, even if you turn on DEBUG, the message doesn't really hint at there being a problem at all). It took me months of random xHC crashes to finally find a reliable repro and be able to do a deep dive debug session, which could all have been avoided had this unhandled, broken condition been actually reported with a warning, as it should have been as a bug intentionally left in unfixed (never mind that it shouldn't have been left in at all). > Another fix to solve clearing the caches of all stream rings with > cancelled TDs is needed, but not as urgent. 3 years after that statement and 14 years after the original bug was introduced, I think it's finally time to fix it. And maybe next time let's not leave bugs unfixed (that are actually worse than the original bug), and let's actually get people to review kernel commits please. Fixes xHC crashes and IOMMU faults with UAS devices when handling errors/faults. Easiest repro is to use `hdparm` to mark an early sector (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. At least in the case of JMicron controllers, the read errors end up having to cancel two TDs (for two queued requests to different streams) and the one that didn't get cleared properly ends up faulting the xHC entirely when it tries to access DMA pages that have since been unmapped, referred to by the stale TDs. This normally happens quickly (after two or three loops). After this fix, I left the `cat` in a loop running overnight and experienced no xHC failures, with all read errors recovered properly. Repro'd and tested on an Apple M1 Mac Mini (dwc3 host). On systems without an IOMMU, this bug would instead silently corrupt freed memory, making this a security bug (even on systems with IOMMUs this could silently corrupt memory belonging to other USB devices on the same controller, so it's still a security bug). Given that the kernel autoprobes partition tables, I'm pretty sure a malicious USB device pretending to be a UAS device and reporting an error with the right timing could deliberately trigger a UAF and write to freed memory, with no user action. [Mathias: Commit message and code comment edit, original at:] https://lore.kernel.org/linux-usb/20240524-xhci-streams-v1-1-6b1f13819bea@marcan.st/ Fixes: `e9df17eb14` ("USB: xhci: Correct assumptions about number of rings per endpoint.") Fixes: `94f339147f` ("xhci: Fix failure to give back some cached cancelled URBs.") Fixes: `674f8438c1` ("xhci: split handling halted endpoints into two steps") Cc: stable@vger.kernel.org Cc: security@kernel.org Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Hector Martin <marcan@marcan.st> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240611120610.3264502-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-12 09:43:36 +02:00
Kuangyi Chiang	91f7a1524a	xhci: Apply broken streams quirk to Etron EJ188 xHCI host As described in commit `8f873c1ff4` ("xhci: Blacklist using streams on the Etron EJ168 controller"), EJ188 have the same issue as EJ168, where Streams do not work reliable on EJ188. So apply XHCI_BROKEN_STREAMS quirk to EJ188 as well. Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240611120610.3264502-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-12 09:43:36 +02:00
Kuangyi Chiang	17bd54555c	xhci: Apply reset resume quirk to Etron EJ188 xHCI host As described in commit `c877b3b2ad` ("xhci: Add reset on resume quirk for asrock p67 host"), EJ188 have the same issue as EJ168, where completely dies on resume. So apply XHCI_RESET_ON_RESUME quirk to EJ188 as well. Cc: stable@vger.kernel.org Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240611120610.3264502-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-12 09:43:36 +02:00
Mathias Nyman	f0260589b4	xhci: Set correct transferred length for cancelled bulk transfers The transferred length is set incorrectly for cancelled bulk transfer TDs in case the bulk transfer ring stops on the last transfer block with a 'Stop - Length Invalid' completion code. length essentially ends up being set to the requested length: urb->actual_length = urb->transfer_buffer_length Length for 'Stop - Length Invalid' cases should be the sum of all TRB transfer block lengths up to the one the ring stopped on, _excluding_ the one stopped on. Fix this by always summing up TRB lengths for 'Stop - Length Invalid' bulk cases. This issue was discovered by Alan Stern while debugging https://bugzilla.kernel.org/show_bug.cgi?id=218890, but does not solve that bug. Issue is older than 4.10 kernel but fix won't apply to those due to major reworks in that area. Tested-by: Pierre Tomon <pierretom+12@ik.me> Cc: stable@vger.kernel.org # v4.10+ Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240611120610.3264502-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-12 09:43:36 +02:00
Namjae Jeon	2bfc4214c6	ksmbd: fix missing use of get_write in in smb2_set_ea() Fix an issue where get_write is not used in smb2_set_ea(). Fixes: `6fc0a265e1` ("ksmbd: fix potential circular locking issue in smb2_set_ea()") Cc: stable@vger.kernel.org Reported-by: Wang Zhaolong <wangzhaolong1@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-06-11 23:43:09 -05:00
Namjae Jeon	1cdeca6a72	ksmbd: move leading slash check to smb2_get_name() If the directory name in the root of the share starts with character like 镜(0x955c) or Ṝ(0x1e5c), it (and anything inside) cannot be accessed. The leading slash check must be checked after converting unicode to nls string. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-06-11 23:43:09 -05:00
Xiaolei Wang	be27b89652	net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters The current cbs parameter depends on speed after uplinking, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula. Fixes: `1f705bc61a` ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:58:18 -07:00
Joshua Washington	1b9f756344	gve: ignore nonrelevant GSO type bits when processing TSO headers TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type \|= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: `a57e5de476` ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrei Vagin <avagin@gmail.com> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:42:35 -07:00
Jakub Kicinski	f6b2f578df	Merge tag 'for-net-2024-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: fix not using correct handle - L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - L2CAP: fix connection setup in l2cap_connect * tag 'for-net-2024-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: fix connection setup in l2cap_connect Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ Bluetooth: hci_sync: Fix not using correct handle ==================== Link: https://lore.kernel.org/r/20240610135803.920662-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:40:27 -07:00
Kory Maincent	144ba8580b	net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP as reported by checkpatch script. Fixes: `18ff0bcda6` ("ethtool: add interface to interact with Ethernet Power Equipment") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240610083426.740660-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:33:37 -07:00
Damien Le Moal	90e6f08915	scsi: mpi3mr: Fix ATA NCQ priority support The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to the HBA if a read or write command directed at an ATA device should be translated to an NCQ read/write command with the high prioiryt bit set when the request uses the RT priority class and the user has enabled NCQ priority through sysfs. However, unlike the mpt3sas driver, the mpi3mr driver does not define the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never actually set and NCQ Priority cannot ever be used. Fix this by defining these missing atributes to allow a user to check if an ATA device supports NCQ priority and to enable/disable the use of NCQ priority. To do this, lift the function scsih_ncq_prio_supp() out of the mpt3sas driver and make it the generic SCSI SAS transport function sas_ata_ncq_prio_supported(). Nothing in that function is hardware specific, so this function can be used in both the mpt3sas driver and the mpi3mr driver. Reported-by: Scott McCoy <scott.mccoy@wdc.com> Fixes: `023ab2a9b4` ("scsi: mpi3mr: Add support for queue command processing") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240611083435.92961-1-dlemoal@kernel.org Reviewed-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-06-11 21:40:23 -04:00
Ziqi Chen	77691af484	scsi: ufs: core: Quiesce request queues before checking pending cmds In ufshcd_clock_scaling_prepare(), after SCSI layer is blocked, ufshcd_pending_cmds() is called to check whether there are pending transactions or not. And only if there are no pending transactions can we proceed to kickstart the clock scaling sequence. ufshcd_pending_cmds() traverses over all SCSI devices and calls sbitmap_weight() on their budget_map. sbitmap_weight() can be broken down to three steps: 1. Calculate the nr outstanding bits set in the 'word' bitmap. 2. Calculate the nr outstanding bits set in the 'cleared' bitmap. 3. Subtract the result from step 1 by the result from step 2. This can lead to a race condition as outlined below: Assume there is one pending transaction in the request queue of one SCSI device, say sda, and the budget token of this request is 0, the 'word' is 0x1 and the 'cleared' is 0x0. 1. When step 1 executes, it gets the result as 1. 2. Before step 2 executes, block layer tries to dispatch a new request to sda. Since the SCSI layer is blocked, the request cannot pass through SCSI but the block layer would do budget_get() and budget_put() to sda's budget map regardless, so the 'word' has become 0x3 and 'cleared' has become 0x2 (assume the new request got budget token 1). 3. When step 2 executes, it gets the result as 1. 4. When step 3 executes, it gets the result as 0, meaning there is no pending transactions, which is wrong. Thread A Thread B ufshcd_pending_cmds() __blk_mq_sched_dispatch_requests() \| \| sbitmap_weight(word) \| \| scsi_mq_get_budget() \| \| \| scsi_mq_put_budget() \| \| sbitmap_weight(cleared) ... When this race condition happens, the clock scaling sequence is started with transactions still in flight, leading to subsequent hibernate enter failure, broken link, task abort and back to back error recovery. Fix this race condition by quiescing the request queues before calling ufshcd_pending_cmds() so that block layer won't touch the budget map when ufshcd_pending_cmds() is working on it. In addition, remove the SCSI layer blocking/unblocking to reduce redundancies and latencies. Fixes: `8d077ede48` ("scsi: ufs: Optimize the command queueing code") Co-developed-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Can Guo <quic_cang@quicinc.com> Signed-off-by: Ziqi Chen <quic_ziqichen@quicinc.com> Link: https://lore.kernel.org/r/1717754818-39863-1-git-send-email-quic_ziqichen@quicinc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-06-11 21:22:33 -04:00
Damien Le Moal	52912ca87e	scsi: core: Disable CDL by default For SCSI devices supporting the Command Duration Limits feature set, the user can enable/disable this feature use through the sysfs device attribute "cdl_enable". This attribute modification triggers a call to scsi_cdl_enable() to enable and disable the feature for ATA devices and set the scsi device cdl_enable field to the user provided bool value. For SCSI devices supporting CDL, the feature set is always enabled and scsi_cdl_enable() is reduced to setting the cdl_enable field. However, for ATA devices, a drive may spin-up with the CDL feature enabled by default. But the SCSI device cdl_enable field is always initialized to false (CDL disabled), regardless of the actual device CDL feature state. For ATA devices managed by libata (or libsas), libata-core always disables the CDL feature set when the device is attached, thus syncing the state of the CDL feature on the device and of the SCSI device cdl_enable field. However, for ATA devices connected to a SAS HBA, the CDL feature is not disabled on scan for ATA devices that have this feature enabled by default, leading to an inconsistent state of the feature on the device with the SCSI device cdl_enable field. Avoid this inconsistency by adding a call to scsi_cdl_enable() in scsi_cdl_check() to make sure that the device-side state of the CDL feature set always matches the scsi device cdl_enable field state. This implies that CDL will always be disabled for ATA devices connected to SAS HBAs, which is consistent with libata/libsas initialization of the device. Reported-by: Scott McCoy <scott.mccoy@wdc.com> Fixes: `1b22cfb141` ("scsi: core: Allow enabling and disabling command duration limits") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240607012507.111488-1-dlemoal@kernel.org Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-06-11 21:14:54 -04:00
John David Anglin	72d95924ee	parisc: Try to fix random segmentation faults in package builds PA-RISC systems with PA8800 and PA8900 processors have had problems with random segmentation faults for many years. Systems with earlier processors are much more stable. Systems with PA8800 and PA8900 processors have a large L2 cache which needs per page flushing for decent performance when a large range is flushed. The combined cache in these systems is also more sensitive to non-equivalent aliases than the caches in earlier systems. The majority of random segmentation faults that I have looked at appear to be memory corruption in memory allocated using mmap and malloc. My first attempt at fixing the random faults didn't work. On reviewing the cache code, I realized that there were two issues which the existing code didn't handle correctly. Both relate to cache move-in. Another issue is that the present bit in PTEs is racy. 1) PA-RISC caches have a mind of their own and they can speculatively load data and instructions for a page as long as there is a entry in the TLB for the page which allows move-in. TLBs are local to each CPU. Thus, the TLB entry for a page must be purged before flushing the page. This is particularly important on SMP systems. In some of the flush routines, the flush routine would be called and then the TLB entry would be purged. This was because the flush routine needed the TLB entry to do the flush. 2) My initial approach to trying the fix the random faults was to try and use flush_cache_page_if_present for all flush operations. This actually made things worse and led to a couple of hardware lockups. It finally dawned on me that some lines weren't being flushed because the pte check code was racy. This resulted in random inequivalent mappings to physical pages. The __flush_cache_page tmpalias flush sets up its own TLB entry and it doesn't need the existing TLB entry. As long as we can find the pte pointer for the vm page, we can get the pfn and physical address of the page. We can also purge the TLB entry for the page before doing the flush. Further, __flush_cache_page uses a special TLB entry that inhibits cache move-in. When switching page mappings, we need to ensure that lines are removed from the cache. It is not sufficient to just flush the lines to memory as they may come back. This made it clear that we needed to implement all the required flush operations using tmpalias routines. This includes flushes for user and kernel pages. After modifying the code to use tmpalias flushes, it became clear that the random segmentation faults were not fully resolved. The frequency of faults was worse on systems with a 64 MB L2 (PA8900) and systems with more CPUs (rp4440). The warning that I added to flush_cache_page_if_present to detect pages that couldn't be flushed triggered frequently on some systems. Helge and I looked at the pages that couldn't be flushed and found that the PTE was either cleared or for a swap page. Ignoring pages that were swapped out seemed okay but pages with cleared PTEs seemed problematic. I looked at routines related to pte_clear and noticed ptep_clear_flush. The default implementation just flushes the TLB entry. However, it was obvious that on parisc we need to flush the cache page as well. If we don't flush the cache page, stale lines will be left in the cache and cause random corruption. Once a PTE is cleared, there is no way to find the physical address associated with the PTE and flush the associated page at a later time. I implemented an updated change with a parisc specific version of ptep_clear_flush. It fixed the random data corruption on Helge's rp4440 and rp3440, as well as on my c8000. At this point, I realized that I could restore the code where we only flush in flush_cache_page_if_present if the page has been accessed. However, for this, we also need to flush the cache when the accessed bit is cleared in ptep_clear_flush_young to keep things synchronized. The default implementation only flushes the TLB entry. Other changes in this version are: 1) Implement parisc specific version of ptep_get. It's identical to default but needed in arch/parisc/include/asm/pgtable.h. 2) Revise parisc implementation of ptep_test_and_clear_young to use ptep_get (READ_ONCE). 3) Drop parisc implementation of ptep_get_and_clear. We can use default. 4) Revise flush_kernel_vmap_range and invalidate_kernel_vmap_range to use full data cache flush. 5) Move flush_cache_vmap and flush_cache_vunmap to cache.c. Handle VM_IOREMAP case in flush_cache_vmap. At this time, I don't know whether it is better to always flush when the PTE present bit is set or when both the accessed and present bits are set. The later saves flushing pages that haven't been accessed, but we need to flush in ptep_clear_flush_young. It also needs a page table lookup to find the PTE pointer. The lpa instruction only needs a page table lookup when the PTE entry isn't in the TLB. We don't atomically handle setting and clearing the _PAGE_ACCESSED bit. If we miss an update, we may miss a flush and the cache may get corrupted. Whether the current code is effectively atomic depends on process control. When CONFIG_FLUSH_PAGE_ACCESSED is set to zero, the page will eventually be flushed when the PTE is cleared or in flush_cache_page_if_present. The _PAGE_ACCESSED bit is not used, so the problem is avoided. The flush method can be selected using the CONFIG_FLUSH_PAGE_ACCESSED define in cache.c. The default is 0. I didn't see a large difference in performance. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> # v6.6+ Signed-off-by: Helge Deller <deller@gmx.de>	2024-06-12 01:57:05 +02:00
Kees Cook	8c860ed825	x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking When reworking the range checking for get_user(), the get_user_8() case on 32-bit wasn't zeroing the high register. (The jump to bad_get_user_8 was accidentally dropped.) Restore the correct error handling destination (and rename the jump to using the expected ".L" prefix). While here, switch to using a named argument ("size") for the call template ("%c4" to "%c[size]") as already used in the other call templates in this file. Found after moving the usercopy selftests to KUnit: # usercopy_test_invalid: EXPECTATION FAILED at lib/usercopy_kunit.c:278 Expected val_u64 == 0, but val_u64 == -60129542144 (0xfffffff200000000) Closes: https://lore.kernel.org/all/CABVgOSn=tb=Lj9SxHuT4_9MTjjKVxsq-ikdXC4kGHO4CfKVmGQ@mail.gmail.com Fixes: `b19b74bc99` ("x86/mm: Rework address range check in get_user() and put_user()") Reported-by: David Gow <davidgow@google.com> Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Tested-by: David Gow <davidgow@google.com> Link: https://lore.kernel.org/all/20240610210213.work.143-kees%40kernel.org	2024-06-11 16:08:43 -07:00
Kent Overstreet	f2736b9c79	bcachefs: Fix rcu_read_lock() leak in drop_extra_replicas Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-11 18:59:08 -04:00
Dr. David Alan Gilbert	017ed5e70c	drm/nouveau: remove unused struct 'init_exec' 'init_exec' is unused since commit `cb75d97e9c` ("drm/nouveau: implement devinit subdev, and new init table parser") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: Danilo Krummrich <dakr@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517232617.230767-1-linux@treblig.org	2024-06-11 23:20:05 +02:00
Linus Torvalds	2ef5971ff3	Merge tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "Misc: - Restore debugfs behavior of ignoring unknown mount options - Fix kernel doc for netfs_wait_for_oustanding_io() - Fix struct statx comment after new addition for this cycle - Fix a check in find_next_fd() iomap: - Fix data zeroing behavior when an extent spans the block that contains i_size - Restore i_size increasing in iomap_write_end() for now to avoid stale data exposure on xfs with a realtime device Cachefiles: - Remove unneeded fdtable.h include - Improve trace output for cachefiles_obj_{get,put}_ondemand_fd() - Remove requests from the request list to prevent accessing already freed requests - Fix UAF when issuing restore command while the daemon is still alive by adding an additional reference count to requests - Fix UAF by grabbing a reference during xarray lookup with xa_lock() held - Simplify error handling in cachefiles_ondemand_daemon_read() - Add consistency checks read and open requests to avoid crashes - Add a spinlock to protect ondemand_id variable which is used to determine whether an anonymous cachefiles fd has already been closed - Make on-demand reads killable allowing to handle broken cachefiles daemon better - Flush all requests after the kernel has been marked dead via CACHEFILES_DEAD to avoid hung-tasks - Ensure that closed requests are marked as such to avoid reusing them with a reopen request - Defer fd_install() until after copy_to_user() succeeded and thereby get rid of having to use close_fd() - Ensure that anonymous cachefiles on-demand fds are reused while they are valid to avoid pinning already freed cookies" * tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: Fix iomap_adjust_read_range for plen calculation iomap: keep on increasing i_size in iomap_write_end() cachefiles: remove unneeded include of <linux/fdtable.h> fs/file: fix the check in find_next_fd() cachefiles: make on-demand read killable cachefiles: flush all requests after setting CACHEFILES_DEAD cachefiles: Set object to close if ondemand_id < 0 in copen cachefiles: defer exposing anon_fd until after copy_to_user() succeeds cachefiles: never get a new anonymous fd if ondemand_id is valid cachefiles: add spin_lock for cachefiles_ondemand_info cachefiles: add consistency check for copen/cread cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() cachefiles: remove requests from xarray during flushing requests cachefiles: add output string to cachefiles_obj_[get\|put]_ondemand_fd statx: Update offset commentary for struct statx netfs: fix kernel doc for nets_wait_for_outstanding_io() debugfs: continue to ignore unknown mount options	2024-06-11 12:04:21 -07:00
Rafael J. Wysocki	b684682698	thermal: gov_step_wise: Restore passive polling management Consider a thermal zone with one passive trip point, a cooling device with 3 states (0, 1, 2) bound to it, passive polling enabled (nonzero passive_delay_jiffies) and no regular polling (polling_delay_jiffies equal to 0) that is managed by the Step-Wise governor. Suppose that the initial state of the cooling device is 0 and the zone temperature is below the trip point to start with. When the trip point is crossed, tz->passive is incremented by the thermal core and the governor's .manage() callback is invoked. It sets 'throttle' to 'true' for the trip in question and get_target_state() returns 1 for the instance corresponding to the cooling device (say that 'upper' and 'lower' are set to 2 and 0 for it, respectively), so its state changes to 1. Passive polling is still active for the zone, so next time the temperature is updated, the governor's .manage() callback will be invoked again. If the temperature is still rising, it will change the state of the cooling device to 2. Now suppose that next time the zone temperature is updated, it falls below the trip point, so tz->passive is decremented for the zone (say it becomes 0 then) and the governor's .manage() callbacks runs. It finds that the temperature trend for the zone is 'falling' and 'throttle' will be set to 'false' for the trip in question, so the cooling device's state will be changed to 1. However, because tz->polling is 0 for the zone, the governor's .manage() callback may not be invoked again for a long time and the cooling device's state will not be reset back to 0. This can happen because commit `042a3d80f1` ("thermal: core: Move passive polling management to the core") removed passive polling management from the Step-Wise governor. Before that change, thermal_zone_trip_update() would bump up tz->passive when changing the target state for a thermal instance from "no target" to a specific value and it would drop tz->passive when changing it back to "no target" which would cause passive polling to be active for the zone until the governor has reset the states of all cooling devices. In particular, in the example above tz->passive would be incremented when changing the state of the cooling device from 0 to 1 and then it would be still nonzero when the state of the cooling device was changed from 2 to 1. To prevent this problem from occurring, restore the passive polling management in the Step-Wise governor by partially reverting the commit in question and update the comment in the restored code to explain its role more clearly. Fixes: `042a3d80f1` ("thermal: core: Move passive polling management to the core") Closes: https://lore.kernel.org/linux-pm/ZmVfcEOxmjUHZTSX@hovoldconsulting.com Reported-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-11 21:00:44 +02:00
Florian Westphal	6f8f132cc7	netfilter: Use flowlabel flow key when re-routing mangled packets 'ip6 dscp set $v' in an nftables outpute route chain has no effect. While nftables does detect the dscp change and calls the reroute hook. But ip6_route_me_harder never sets the dscp/flowlabel: flowlabel/dsfield routing rules are ignored and no reroute takes place. Thanks to Yi Chen for an excellent reproducer script that I used to validate this change. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: Yi Chen <yiche@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Jozsef Kadlecsik	4e7aaa6b82	netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type Lion Ackermann reported that there is a race condition between namespace cleanup in ipset and the garbage collection of the list:set type. The namespace cleanup can destroy the list:set type of sets while the gc of the set type is waiting to run in rcu cleanup. The latter uses data from the destroyed set which thus leads use after free. The patch contains the following parts: - When destroying all sets, first remove the garbage collectors, then wait if needed and then destroy the sets. - Fix the badly ordered "wait then remove gc" for the destroy a single set case. - Fix the missing rcu locking in the list:set type in the userspace test case. - Use proper RCU list handlings in the list:set type. The patch depends on `c1193d9bbb` (netfilter: ipset: Add list flush to cancel_gc). Fixes: `97f7cf1cd8` (netfilter: ipset: fix performance regression in swap operation) Reported-by: Lion Ackermann <nnamrec@gmail.com> Tested-by: Lion Ackermann <nnamrec@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Davide Ornaghi	c4ab9da85b	netfilter: nft_inner: validate mandatory meta and payload Check for mandatory netlink attributes in payload and meta expression when used embedded from the inner expression, otherwise NULL pointer dereference is possible from userspace. Fixes: `a150d122b6` ("netfilter: nft_meta: add inner match support") Fixes: `3a07327d10` ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Alexander Gordeev	693d41f7c9	s390/mm: Restore mapping of kernel image using large pages Since physical and virtual kernel address spaces are uncoupled the kernel image is not mapped using large segment pages anymore, which is a regression. Put the kernel image at the same large segment page offset in physical memory as in virtual memory. Such approach preserves the existing number of bits of entropy used for randomization of the kernel location in virtual memory when KASLR is on. As result, the kernel is mapped using large segment pages. Fixes: `c98d2ecae0` ("s390/mm: Uncouple physical vs virtual address spaces") Reported-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2024-06-11 16:20:40 +02:00
Alexander Gordeev	d8073dc6bc	s390/mm: Allow large pages only for aligned physical addresses Do not allow creation of large pages against physical addresses, which itself are not aligned on the correct boundary. Failure to do so might lead to referencing wrong memory as result of the way DAT works. Fixes: `c98d2ecae0` ("s390/mm: Uncouple physical vs virtual address spaces") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2024-06-11 16:20:40 +02:00
Heiko Carstens	b01b8151ef	s390: Update defconfigs Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2024-06-11 16:20:40 +02:00
Vasily Khoruzhick	b96a225377	drm/nouveau: don't attempt to schedule hpd_work on headless cards If the card doesn't have display hardware, hpd_work and hpd_lock are left uninitialized which causes BUG when attempting to schedule hpd_work on runtime PM resume. Fix it by adding headless flag to DRM and skip any hpd if it's set. Fixes: `ae1aadb1eb` ("nouveau: don't fail driver load if no display hw present.") Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/337 Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Reviewed-by: Ben Skeggs <bskeggs@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240607221032.25918-1-anarsoul@gmail.com	2024-06-11 12:04:59 +02:00
Eric Dumazet	36534d3c54	tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() Due to timer wheel implementation, a timer will usually fire after its schedule. For instance, for HZ=1000, a timeout between 512ms and 4s has a granularity of 64ms. For this range of values, the extra delay could be up to 63ms. For TCP, this means that tp->rcv_tstamp may be after inet_csk(sk)->icsk_timeout whenever the timer interrupt finally triggers, if one packet came during the extra delay. We need to make sure tcp_rtx_probe0_timed_out() handles this case. Fixes: `e89688e3e9` ("net: tcp: fix unexcepted socket die when snd_wnd is 0") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Menglong Dong <imagedong@tencent.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240607125652.1472540-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:50:10 -07:00
Jakub Kicinski	70b3c88cec	Merge branch 'mptcp-various-fixes' Matthieu Baerts says: ==================== mptcp: various fixes The different patches here are some unrelated fixes for MPTCP: - Patch 1 ensures 'snd_una' is initialised on connect in case of MPTCP fallback to TCP followed by retransmissions before the processing of any other incoming packets. A fix for v5.9+. - Patch 2 makes sure the RmAddr MIB counter is incremented, and only once per ID, upon the reception of a RM_ADDR. A fix for v5.10+. - Patch 3 doesn't update 'add addr' related counters if the connect() was not possible. A fix for v5.7+. - Patch 4 updates the mailmap file to add Geliang's new email address. ==================== Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-0-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:14 -07:00
Geliang Tang	74acb250e1	mailmap: map Geliang's new email address Just like my other email addresses, map my new one to kernel.org account too. My new email address uses "last name, first name" format, which is different from my other email addresses. This mailmap is also used to indicate that it is actually the same person. Suggested-by: Mat Martineau <martineau@kernel.org> Suggested-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-4-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:11 -07:00
YonglongLi	40eec1795c	mptcp: pm: update add_addr counters after connect The creation of new subflows can fail for different reasons. If no subflow have been created using the received ADD_ADDR, the related counters should not be updated, otherwise they will never be decremented for events related to this ID later on. For the moment, the number of accepted ADD_ADDR is only decremented upon the reception of a related RM_ADDR, and only if the remote address ID is currently being used by at least one subflow. In other words, if no subflow can be created with the received address, the counter will not be decremented. In this case, it is then important not to increment pm.add_addr_accepted counter, and not to modify pm.accept_addr bit. Note that this patch does not modify the behaviour in case of failures later on, e.g. if the MP Join is dropped or rejected. The "remove invalid addresses" MP Join subtest has been modified to validate this case. The broadcast IP address is added before the "valid" address that will be used to successfully create a subflow, and the limit is decreased by one: without this patch, it was not possible to create the last subflow, because: - the broadcast address would have been accepted even if it was not usable: the creation of a subflow to this address results in an error, - the limit of 2 accepted ADD_ADDR would have then been reached. Fixes: `01cacb00b3` ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
YonglongLi	6a09788c1a	mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID The RmAddr MIB counter is supposed to be incremented once when a valid RM_ADDR has been received. Before this patch, it could have been incremented as many times as the number of subflows connected to the linked address ID, so it could have been 0, 1 or more than 1. The "RmSubflow" is incremented after a local operation. In this case, it is normal to tied it with the number of subflows that have been actually removed. The "remove invalid addresses" MP Join subtest has been modified to validate this case. A broadcast IP address is now used instead: the client will not be able to create a subflow to this address. The consequence is that when receiving the RM_ADDR with the ID attached to this broadcast IP address, no subflow linked to this ID will be found. Fixes: `7a7e52e38a` ("mptcp: add RM_ADDR related mibs") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
Paolo Abeni	8031b58c3a	mptcp: ensure snd_una is properly initialized on connect This is strictly related to commit `fb7a0d3348` ("mptcp: ensure snd_nxt is properly initialized on connect"). It turns out that syzkaller can trigger the retransmit after fallback and before processing any other incoming packet - so that snd_una is still left uninitialized. Address the issue explicitly initializing snd_una together with snd_nxt and write_seq. Suggested-by: Mat Martineau <martineau@kernel.org> Fixes: `8fd738049a` ("mptcp: fallback in case of simultaneous connect") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
Johannes Berg	44180feacc	net/sched: initialize noop_qdisc owner When the noop_qdisc owner isn't initialized, then it will be 0, so packets will erroneously be regarded as having been subject to recursion as long as only CPU 0 queues them. For non-SMP, that's all packets, of course. This causes a change in what's reported to userspace, normally noop_qdisc would drop packets silently, but with this change the syscall returns -ENOBUFS if RECVERR is also set on the socket. Fix this by initializing the owner field to -1, just like it would be for dynamically allocated qdiscs by qdisc_alloc(). Fixes: `0f022d32c3` ("net/sched: Fix mirred deadlock on device recursion") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240607175340.786bfb938803.I493bf8422e36be4454c08880a8d3703cea8e421a@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:36:49 -07:00
Kent Overstreet	7124a8982b	bcachefs: Add missing bch_inode_info.ei_flags init Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 20:50:14 -04:00
Oleg Nesterov	07c54cc598	tick/nohz_full: Don't abuse smp_call_function_single() in tick_setup_device() After the recent commit `5097cbcb38` ("sched/isolation: Prevent boot crash when the boot CPU is nohz_full") the kernel no longer crashes, but there is another problem. In this case tick_setup_device() calls tick_take_do_timer_from_boot() to update tick_do_timer_cpu and this triggers the WARN_ON_ONCE(irqs_disabled) in smp_call_function_single(). Kill tick_take_do_timer_from_boot() and just use WRITE_ONCE(), the new comment explains why this is safe (thanks Thomas!). Fixes: `08ae95f4fd` ("nohz_full: Allow the boot CPU to be nohz_full") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240528122019.GA28794@redhat.com Link: https://lore.kernel.org/all/20240522151742.GA10400@redhat.com	2024-06-10 20:18:13 +02:00
Arunpravin Paneer Selvam	31849bf07e	drm/amdgpu: Fix the BO release clear memory warning This happens when the amdgpu_bo_release_notify running before amdgpu_ttm_set_buffer_funcs_status set the buffer funcs to enabled. check the buffer funcs enablement before calling the fill buffer memory. v2:(Christian) - Apply it only for GEM buffers and since GEM buffers are only allocated/freed while the driver is loaded we never run into the issue to clear with buffer funcs disabled. v3:(Mario) - drop the stable tag as this will presumably go into a -fixes PR for 6.10 Log snip: ERROR Trying to clear memory with ring turned off. RIP: 0010:amdgpu_bo_release_notify+0x201/0x220 [amdgpu] Fixes: `a68c7eaa7a` ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Tested-by: Richard Gong <richard.gong@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240610180401.9540-1-Arunpravin.PaneerSelvam@amd.com	2024-06-10 23:46:33 +05:30
Kent Overstreet	b799220092	bcachefs: Add missing synchronize_srcu_expedited() call when shutting down We use the polling interface to srcu for tracking pending frees; when shutting down we don't need to wait for an srcu barrier to free them, but SRCU still gets confused if we shutdown with an outstanding grace period. Reported-by: syzbot+6a038377f0a594d7d44e@syzkaller.appspotmail.com Reported-by: syzbot+0ece6edfd05ed20e32d9@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	9432e90df1	bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket() Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	9c4acd19bb	bcachefs: Replace bucket_valid() asserts in bucket lookup with proper checks The bucket_gens array and gc_buckets array known their own size; we should be using those members, and returning an error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	e0cb5722e1	bcachefs: Fix snapshot_create_lock lock ordering ====================================================== WARNING: possible circular locking dependency detected 6.10.0-rc2-ktest-00018-gebd1d148b278 #144 Not tainted ------------------------------------------------------ fio/1345 is trying to acquire lock: ffff88813e200ab8 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_truncate+0x76/0xf0 but task is already holding lock: ffff888105a1fa38 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: do_truncate+0x7b/0xc0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}: down_write+0x3d/0xd0 bch2_write_iter+0x1c0/0x10f0 vfs_write+0x24a/0x560 __x64_sys_pwrite64+0x77/0xb0 x64_sys_call+0x17e5/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #1 (sb_writers#10){.+.+}-{0:0}: mnt_want_write+0x4a/0x1d0 filename_create+0x69/0x1a0 user_path_create+0x38/0x50 bch2_fs_file_ioctl+0x315/0xbf0 __x64_sys_ioctl+0x297/0xaf0 x64_sys_call+0x10cb/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 -> #0 (&c->snapshot_create_lock){++++}-{3:3}: __lock_acquire+0x1445/0x25b0 lock_acquire+0xbd/0x2b0 down_read+0x40/0x180 bch2_truncate+0x76/0xf0 bchfs_truncate+0x240/0x3f0 bch2_setattr+0x7b/0xb0 notify_change+0x322/0x4b0 do_truncate+0x8b/0xc0 do_ftruncate+0x110/0x270 __x64_sys_ftruncate+0x43/0x80 x64_sys_call+0x1373/0x1ab0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x4b/0x53 other info that might help us debug this: Chain exists of: &c->snapshot_create_lock --> sb_writers#10 --> &sb->s_type->i_mutex_key#13 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sb->s_type->i_mutex_key#13); lock(sb_writers#10); lock(&sb->s_type->i_mutex_key#13); rlock(&c->snapshot_create_lock); * DEADLOCK * Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	f9035b0ce6	bcachefs: Fix refcount leak in check_fix_ptrs() fsck_err() does a goto fsck_err on error; factor out check_fix_ptr() so that our error label can drop our device ref. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	bf2b356afd	bcachefs: Leave a buffer in the btree key cache to avoid lock thrashing Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	2760bfe388	bcachefs: Fix reporting of freed objects from key cache shrinker We count objects as freed when we move them to the srcu-pending lists because we're doing the equivalent of a kfree_srcu(); the only difference is managing the pending list ourself means we can allocate from the pending list. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	9ac3e660ca	bcachefs: set sb->s_shrinker->seeks = 0 inodes and dentries are still present in the btree node cache, in much more compact form Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	bc65e98e68	bcachefs: increase key cache shrinker batch size Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	5ae67abcdf	bcachefs: Enable automatic shrinking for rhashtables Since the key cache shrinker walks the rhashtable, a mostly empty rhashtable leads to really nasty reclaim performance issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Hongbo Li	26447d224a	bcachefs: fix the display format for show-super There are three keys displayed in non-uniform format. Let's fix them. [Before] ``` Label: testbcachefs Version: 1.9: (unknown version) Version upgrade complete: 0.0: (unknown version) ``` [After] ``` Label: testbcachefs Version: 1.9: (unknown version) Version upgrade complete: 0.0: (unknown version) ``` Fixes: `7423330e30` ("bcachefs: prt_printf() now respects \r\n\t") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	dab1870439	bcachefs: fix stack frame size in fsck.c fsck.c always runs top of the stack so we're not too concerned here; noinline_for_stack is sufficient Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	04f635ede8	bcachefs: Delete incorrect BTREE_ID_NR assertion for forwards compat we now explicitly allow mounting and using filesystems with unknown btrees, and we have to walk them for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:16 -04:00
Kent Overstreet	1c8cc24eef	bcachefs: Fix incorrect error handling found_btree_node_is_readable() error handling here is slightly odd, which is why we were accidently calling evict() on an error pointer Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:15 -04:00
Kent Overstreet	161f73c2c7	bcachefs: Split out btree_write_submit_wq Split the workqueues for btree read completions and btree write submissions; we don't want concurrency control on btree read completions, but we do want concurrency control on write submissions, else blocking in submit_bio() will cause a ton of kworkers to be allocated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-10 13:17:15 -04:00
Pauli Virtanen	c695439d19	Bluetooth: fix connection setup in l2cap_connect The amp_id argument of l2cap_connect() was removed in commit `84a4bb6548` ("Bluetooth: HCI: Remove HCI_AMP support") It was always called with amp_id == 0, i.e. AMP_ID_BREDR == 0x00 (ie. non-AMP controller). In the above commit, the code path for amp_id != 0 was preserved, although it should have used the amp_id == 0 one. Restore the previous behavior of the non-AMP code path, to fix problems with L2CAP connections. Fixes: `84a4bb6548` ("Bluetooth: HCI: Remove HCI_AMP support") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:30 -04:00
Luiz Augusto von Dentz	806a5198c0	Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ This removes the bogus check for max > hcon->le_conn_max_interval since the later is just the initial maximum conn interval not the maximum the stack could support which is really 3200=4000ms. In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values of the following fields in IXIT that would cause hci_check_conn_params to fail: TSPX_conn_update_int_min TSPX_conn_update_int_max TSPX_conn_update_peripheral_latency TSPX_conn_update_supervision_timeout Link: https://github.com/bluez/bluez/issues/847 Fixes: `e4b019515f` ("Bluetooth: Enforce validation on max value of connection interval") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:27 -04:00
Luiz Augusto von Dentz	86fbd9f63a	Bluetooth: hci_sync: Fix not using correct handle When setting up an advertisement the code shall always attempt to use the handle set by the instance since it may not be equal to the instance ID. Fixes: `e77f43d531` ("Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:25 -04:00
David S. Miller	93792130a9	Merge branch 'geneve-fixes' Tariq Toukan says: ==================== geneve fixes This small patchset by Gal provides bug fixes to the geneve tunnels flows. Patch 1 fixes an incorrect value returned by the inner network header offset helper. Patch 2 fixes an issue inside the mlx5e tunneling flow. It 'happened' to be harmless so far, before applying patch 1. Series generated against: commit `d30d0e49da` ("Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:09 +01:00
Gal Pressman	791b4089e3	net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets Move the vxlan_features_check() call to after we verified the packet is a tunneled VXLAN packet. Without this, tunneled UDP non-VXLAN packets (for ex. GENENVE) might wrongly not get offloaded. In some cases, it worked by chance as GENEVE header is the same size as VXLAN, but it is obviously incorrect. Fixes: `e3cfc7e6b7` ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:09 +01:00
Gal Pressman	c6ae073f59	geneve: Fix incorrect inner network header offset when innerprotoinherit is set When innerprotoinherit is set, the tunneled packets do not have an inner Ethernet header. Change 'maclen' to not always assume the header length is ETH_HLEN, as there might not be a MAC header. This resolves issues with drivers (e.g. mlx5, in mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset to be correct, and use it for TX offloads. Fixes: `d8a6213d70` ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:08 +01:00
Andy Shevchenko	d029edefed	net dsa: qca8k: fix usages of device_get_named_child_node() The documentation for device_get_named_child_node() mentions this important point: " The caller is responsible for calling fwnode_handle_put() on the returned fwnode pointer. " Add fwnode_handle_put() to avoid leaked references. Fixes: `1e264f9d29` ("net: dsa: qca8k: add LEDs basic support") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:12:14 +01:00
Eric Dumazet	d37fe4255a	tcp: fix race in tcp_v6_syn_recv_sock() tcp_v6_syn_recv_sock() calls ip6_dst_store() before inet_sk(newsk)->pinet6 has been set up. This means ip6_dst_store() writes over the parent (listener) np->dst_cookie. This is racy because multiple threads could share the same parent and their final np->dst_cookie could be wrong. Move ip6_dst_store() call after inet_sk(newsk)->pinet6 has been changed and after the copy of parent ipv6_pinfo. Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:11:22 +01:00
Adam Miotk	ce62600c4d	drm/bridge/panel: Fix runtime warning on panel bridge release Device managed panel bridge wrappers are created by calling to drm_panel_bridge_add_typed() and registering a release handler for clean-up when the device gets unbound. Since the memory for this bridge is also managed and linked to the panel device, the release function should not try to free that memory. Moreover, the call to devm_kfree() inside drm_panel_bridge_remove() will fail in this case and emit a warning because the panel bridge resource is no longer on the device resources list (it has been removed from there before the call to release handlers). Fixes: `67022227ff` ("drm/bridge: Add a devm_ allocator for panel bridge.") Signed-off-by: Adam Miotk <adam.miotk@arm.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240610102739.139852-1-adam.miotk@arm.com	2024-06-10 13:22:05 +02:00
Amjad Ouled-Ameur	b880018edd	drm/komeda: check for error-valued pointer komeda_pipeline_get_state() may return an error-valued pointer, thus check the pointer for negative or null value before dereferencing. Fixes: `502932a03f` ("drm/komeda: Add the initial scaler support for CORE") Signed-off-by: Amjad Ouled-Ameur <amjad.ouled-ameur@arm.com> Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240610102056.40406-1-amjad.ouled-ameur@arm.com	2024-06-10 13:20:37 +02:00
Russell King (Oracle)	594ce0b8a9	Merge topic branches 'clkdev' and 'fixes' into for-linus	2024-06-10 12:03:21 +01:00
Ard Biesheuvel	e3cf20e5c6	ARM: 9405/1: ftrace: Don't assume stack frames are contiguous in memory The frame pointer unwinder relies on a standard layout of the stack frame, consisting of (in downward order) Calling frame: PC <---------+ LR \| SP \| FP \| .. locals .. \| Callee frame: \| PC \| LR \| SP \| FP ----------+ where after storing its previous value on the stack, FP is made to point at the location of PC in the callee stack frame, using the canonical prologue: mov ip, sp stmdb sp!, {fp, ip, lr, pc} sub fp, ip, #4 The ftrace code assumes that this activation record is pushed first, and that any stack space for locals is allocated below this. Strict adherence to this would imply that the caller's value of SP at the time of the function call can always be obtained by adding 4 to FP (which points to PC in the callee frame). However, recent versions of GCC appear to deviate from this rule, and so the only reliable way to obtain the caller's value of SP is to read it from the activation record. Since this involves a read from memory rather than simple arithmetic, we need to use the uaccess API here which protects against inadvertent data aborts resulting from attempts to dereference bogus FP values. The plain uaccess API is ftrace instrumented itself, so to avoid unbounded recursion, use the __get_kernel_nofault() primitive directly. Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76 Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/ Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reported-by: Justin Chen <justin.chen@broadcom.com> Tested-by: Thorsten Scherer <t.scherer@eckelmann.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2024-06-10 12:00:27 +01:00
David Wei	5add2f7288	netdevsim: fix backwards compatibility in nsim_get_iflink() The default ndo_get_iflink() implementation returns the current ifindex of the netdev. But the overridden nsim_get_iflink() returns 0 if the current nsim is not linked, breaking backwards compatibility for userspace that depend on this behaviour. Fix the problem by returning the current ifindex if not linked to a peer. Fixes: `8debcf5832` ("netdevsim: add ndo_get_iflink() implementation") Reported-by: Yu Watanabe <watanabe.yu@gmail.com> Suggested-by: Yu Watanabe <watanabe.yu@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 11:51:04 +01:00
Tobias Jakobi	f74fb5df42	drm: panel-orientation-quirks: Add quirk for Aya Neo KUN Similar to the other Aya Neo devices this one features again a portrait screen, here with a native resolution of 1600x2560. Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240310220401.895591-1-tjakobi@math.uni-bielefeld.de	2024-06-10 12:19:25 +02:00
Wengang Wang	58f880711f	xfs: make sure sb_fdblocks is non-negative A user with a completely full filesystem experienced an unexpected shutdown when the filesystem tried to write the superblock during runtime. kernel shows the following dmesg: [ 8.176281] XFS (dm-4): Metadata corruption detected at xfs_sb_write_verify+0x60/0x120 [xfs], xfs_sb block 0x0 [ 8.177417] XFS (dm-4): Unmount and run xfs_repair [ 8.178016] XFS (dm-4): First 128 bytes of corrupted metadata buffer: [ 8.178703] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 01 90 00 00 XFSB............ [ 8.179487] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 8.180312] 00000020: cf 12 dc 89 ca 26 45 29 92 e6 e3 8d 3b b8 a2 c3 .....&E)....;... [ 8.181150] 00000030: 00 00 00 00 01 00 00 06 00 00 00 00 00 00 00 80 ................ [ 8.182003] 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................ [ 8.182004] 00000050: 00 00 00 01 00 64 00 00 00 00 00 04 00 00 00 00 .....d.......... [ 8.182004] 00000060: 00 00 64 00 b4 a5 02 00 02 00 00 08 00 00 00 00 ..d............. [ 8.182005] 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 17 00 00 19 ................ [ 8.182008] XFS (dm-4): Corruption of in-memory data detected. Shutting down filesystem [ 8.182010] XFS (dm-4): Please unmount the filesystem and rectify the problem(s) When xfs_log_sb writes super block to disk, b_fdblocks is fetched from m_fdblocks without any lock. As m_fdblocks can experience a positive -> negative -> positive changing when the FS reaches fullness (see xfs_mod_fdblocks). So there is a chance that sb_fdblocks is negative, and because sb_fdblocks is type of unsigned long long, it reads super big. And sb_fdblocks being bigger than sb_dblocks is a problem during log recovery, xfs_validate_sb_write() complains. Fix: As sb_fdblocks will be re-calculated during mount when lazysbcount is enabled, We just need to make xfs_validate_sb_write() happy -- make sure sb_fdblocks is not nenative. This patch also takes care of other percpu counters in xfs_log_sb. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-06-10 11:38:12 +05:30
Jani Nikula	38e3825631	drm/exynos/vidi: fix memory leak in .get_modes() The duplicated EDID is never freed. Fix it. Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2024-06-10 15:05:43 +09:00
Yazen Ghannam	fe8a08973a	RAS/AMD/ATL: Fix MI300 bank hash Apply the SID bits to the correct offset in the Bank value. Do this in the temporary value so they don't need to be masked off later. Fixes: `87a6123753` ("RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240607-mi300-dram-xl-fix-v1-1-2f11547a178c@amd.com	2024-06-10 07:56:33 +02:00
Krzysztof Kozlowski	1f3512cdf8	drm/exynos: dp: drop driver owner initialization Core in platform_driver_register() already sets the .owner, so driver does not need to. Whatever is set here will be anyway overwritten by main driver calling platform_driver_register(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2024-06-10 13:50:53 +09:00
Marek Szyprowski	799d4b3924	drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found When reading EDID fails and driver reports no modes available, the DRM core adds an artificial 1024x786 mode to the connector. Unfortunately some variants of the Exynos HDMI (like the one in Exynos4 SoCs) are not able to drive such mode, so report a safe 640x480 mode instead of nothing in case of the EDID reading failure. This fixes the following issue observed on Trats2 board since commit `13d5b04036` ("drm/exynos: do not return negative values from .get_modes()"): [drm] Exynos DRM: using 11c00000.fimd device for DMA mapping operations exynos-drm exynos-drm: bound 11c00000.fimd (ops fimd_component_ops) exynos-drm exynos-drm: bound 12c10000.mixer (ops mixer_component_ops) exynos-dsi 11c80000.dsi: [drm:samsung_dsim_host_attach] Attached s6e8aa0 device (lanes:4 bpp:24 mode-flags:0x10b) exynos-drm exynos-drm: bound 11c80000.dsi (ops exynos_dsi_component_ops) exynos-drm exynos-drm: bound 12d00000.hdmi (ops hdmi_component_ops) [drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1 exynos-hdmi 12d00000.hdmi: [drm:hdmiphy_enable.part.0] ERROR PLL could not reach steady state panel-samsung-s6e8aa0 11c80000.dsi.0: ID: 0xa2, 0x20, 0x8c exynos-mixer 12c10000.mixer: timeout waiting for VSYNC ------------[ cut here ]------------ WARNING: CPU: 1 PID: 11 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8 [CRTC:70:crtc-1] vblank wait timed out Modules linked in: CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.9.0-rc5-next-20240424 #14913 Hardware name: Samsung Exynos (Flattened Device Tree) Workqueue: events_unbound deferred_probe_work_func Call trace: unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x68/0x88 dump_stack_lvl from __warn+0x7c/0x1c4 __warn from warn_slowpath_fmt+0x11c/0x1a8 warn_slowpath_fmt from drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8 drm_atomic_helper_wait_for_vblanks.part.0 from drm_atomic_helper_commit_tail_rpm+0x7c/0x8c drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184 commit_tail from drm_atomic_helper_commit+0x168/0x190 drm_atomic_helper_commit from drm_atomic_commit+0xb4/0xe0 drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x27c drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40 drm_client_modeset_commit from __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0xc4 __drm_fb_helper_restore_fbdev_mode_unlocked from drm_fb_helper_set_par+0x2c/0x3c drm_fb_helper_set_par from fbcon_init+0x3d8/0x550 fbcon_init from visual_init+0xc0/0x108 visual_init from do_bind_con_driver+0x1b8/0x3a4 do_bind_con_driver from do_take_over_console+0x140/0x1ec do_take_over_console from do_fbcon_takeover+0x70/0xd0 do_fbcon_takeover from fbcon_fb_registered+0x19c/0x1ac fbcon_fb_registered from register_framebuffer+0x190/0x21c register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x350/0x574 __drm_fb_helper_initial_config_and_unlock from exynos_drm_fbdev_client_hotplug+0x6c/0xb0 exynos_drm_fbdev_client_hotplug from drm_client_register+0x58/0x94 drm_client_register from exynos_drm_bind+0x160/0x190 exynos_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8 try_to_bring_up_aggregate_device from __component_add+0xb0/0x170 __component_add from mixer_probe+0x74/0xcc mixer_probe from platform_probe+0x5c/0xb8 platform_probe from really_probe+0xe0/0x3d8 really_probe from __driver_probe_device+0x9c/0x1e4 __driver_probe_device from driver_probe_device+0x30/0xc0 driver_probe_device from __device_attach_driver+0xa8/0x120 __device_attach_driver from bus_for_each_drv+0x80/0xcc bus_for_each_drv from __device_attach+0xac/0x1fc __device_attach from bus_probe_device+0x8c/0x90 bus_probe_device from deferred_probe_work_func+0x98/0xe0 deferred_probe_work_func from process_one_work+0x240/0x6d0 process_one_work from worker_thread+0x1a0/0x3f4 worker_thread from kthread+0x104/0x138 kthread from ret_from_fork+0x14/0x28 Exception stack(0xf0895fb0 to 0xf0895ff8) ... irq event stamp: 82357 hardirqs last enabled at (82363): [<c01a96e8>] vprintk_emit+0x308/0x33c hardirqs last disabled at (82368): [<c01a969c>] vprintk_emit+0x2bc/0x33c softirqs last enabled at (81614): [<c0101644>] __do_softirq+0x320/0x500 softirqs last disabled at (81609): [<c012dfe0>] __irq_exit_rcu+0x130/0x184 ---[ end trace 0000000000000000 ]--- exynos-drm exynos-drm: [drm] ERROR flip_done timed out exynos-drm exynos-drm: [drm] ERROR [CRTC:70:crtc-1] commit wait timed out exynos-drm exynos-drm: [drm] ERROR flip_done timed out exynos-drm exynos-drm: [drm] ERROR [CONNECTOR:74:HDMI-A-1] commit wait timed out exynos-drm exynos-drm: [drm] ERROR flip_done timed out exynos-drm exynos-drm: [drm] ERROR [PLANE:56:plane-5] commit wait timed out exynos-mixer 12c10000.mixer: timeout waiting for VSYNC Cc: stable@vger.kernel.org Fixes: `13d5b04036` ("drm/exynos: do not return negative values from .get_modes()") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2024-06-10 13:50:50 +09:00
Linus Torvalds	83a7eefedc	Linux 6.10-rc3	2024-06-09 14:19:43 -07:00
Linus Torvalds	b8481381d4	Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update copies of kernel headers, which resulted in support for the new 'mseal' syscall, SUBVOL statx return mask bit, RISC-V and PPC prctls, fcntl's DUPFD_QUERY, POSTED_MSI_NOTIFICATION IRQ vector, 'map_shadow_stack' syscall for x86-32. - Revert perf.data record memory allocation optimization that ended up causing a regression, work is being done to re-introduce it in the next merge window. - Fix handling of minimal vmlinux.h file used with BPF's CO-RE when interrupting the build. * tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event" tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL tools headers UAPI: Update i915_drm.h with the kernel sources tools headers UAPI: Sync kvm headers with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION perf beauty: Update copy of linux/socket.h with the kernel sources tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY tools headers UAPI: Sync linux/prctl.h with the kernel sources tools include UAPI: Sync linux/stat.h with the kernel sources	2024-06-09 09:04:51 -07:00
Linus Torvalds	637c2dfcd9	Merge tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fixes from Borislav Petkov: - Convert PCI core error codes to proper error numbers since latter get propagated all the way up to the module loading functions * tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/igen6: Convert PCIBIOS_* return codes to errnos EDAC/amd64: Convert PCIBIOS_* return codes to errnos	2024-06-09 08:49:13 -07:00
Sagar Cheluvegowda	0579f27249	net: stmmac: dwmac-qcom-ethqos: Configure host DMA width Commit `070246e467` ("net: stmmac: Fix for mismatched host/device DMA address width") added support in the stmmac driver for platform drivers to indicate the host DMA width, but left it up to authors of the specific platforms to indicate if their width differed from the addr64 register read from the MAC itself. Qualcomm's EMAC4 integration supports only up to 36 bit width (as opposed to the addr64 register indicating 40 bit width). Let's indicate that in the platform driver to avoid a scenario where the driver will allocate descriptors of size that is supported by the CPU which in our case is 36 bit, but as the addr64 register is still capable of 40 bits the device will use two descriptors as one address. Fixes: `8c4d92e82d` ("net: stmmac: dwmac-qcom-ethqos: add support for emac4 on sa8775p platforms") Signed-off-by: Sagar Cheluvegowda <quic_scheluve@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-09 15:52:52 +01:00
Linus Torvalds	771ed66105	Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fix from Stephen Boyd: "One fix for the SiFive PRCI clocks so that the device boots again. This driver was registering clkdev lookups that were always going to be useless. This wasn't a problem until clkdev started returning an error in these cases, causing this driver to fail probe, and thus boot to fail because clks are essential for most drivers. The fix is simple, don't use clkdev because this is a DT based system where clkdev isn't used" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: sifive: Do not register clkdevs for PRCI clocks	2024-06-08 19:14:02 -07:00
Linus Torvalds	c5dbc2ed00	Merge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: "Two small smb3 client fixes: - fix deadlock in umount - minor cleanup due to netfs change" * tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Don't advance the I/O iterator before terminating subrequest smb: client: fix deadlock in smb2_find_smb_tcon()	2024-06-08 19:07:18 -07:00
Linus Torvalds	061d1af7b0	Merge tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - fix potential read out of bounds in hid-asus (Andrew Ballance) - fix endian-conversion on little endian systems in intel-ish-hid (Arnd Bergmann) - A couple of new input event codes (Aseda Aboagye) - errors handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo (Christophe JAILLET), hid-logitech-dj (José Expósito) - current leakage fix while the device is in suspend on a i2c-hid laptop (Johan Hovold) - other assorted smaller fixes and device ID / quirk entry additions * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for ELAN touchscreens 2F2C and 4116 HID: i2c-hid: elan: fix reset suspend current leakage dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema input: Add support for "Do Not Disturb" input: Add event code for accessibility key hid: asus: asus_report_fixup: fix potential read out of bounds HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro HID: intel-ish-hid: fix endian-conversion HID: nintendo: Fix an error handling path in nintendo_hid_probe() HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() HID: core: remove unnecessary WARN_ON() in implement() HID: nvidia-shield: Add missing check for input_ff_create_memless HID: intel-ish-hid: Fix build error for COMPILE_TEST	2024-06-08 10:48:11 -07:00
Linus Torvalds	329f70c5be	Merge tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix the initial state of the save button in 'make gconfig' - Improve the Kconfig documentation - Fix a Kconfig bug regarding property visibility - Fix build breakage for systems where 'sed' is not installed in /bin - Fix a false warning about missing MODULE_DESCRIPTION() * tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh kconfig: remove wrong expr_trans_bool() kconfig: doc: document behavior of 'select' and 'imply' followed by 'if' kconfig: doc: fix a typo in the note about 'imply' kconfig: gconf: give a proper initial state to the Save button kconfig: remove unneeded code for user-supplied values being out of range	2024-06-08 10:12:33 -07:00
Linus Torvalds	1e7ccdd325	Merge tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - fixes for the new ipu6 driver (and related fixes to mei csi driver) - fix a double debugfs remove logic at mgb4 driver - a documentation fix * tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: intel/ipu6: add csi2 port sanity check in notifier bound media: intel/ipu6: update the maximum supported csi2 port number to 6 media: mei: csi: Warn less verbosely of a missing device fwnode media: mei: csi: Put the IPU device reference media: intel/ipu6: fix the buffer flags caused by wrong parentheses media: intel/ipu6: Fix an error handling path in isys_probe() media: intel/ipu6: Move isys_remove() close to isys_probe() media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove() media: Documentation: v4l: Fix ACTIVE route flag media: mgb4: Fix double debugfs remove	2024-06-08 09:57:09 -07:00
Linus Torvalds	36714d69b1	Merge tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: - Fix possible memory leak the riscv-intc irqchip driver load failures - Fix boot crash in the sifive-plic irqchip driver caused by recently changed boot initialization order - Fix race condition in the gic-v3-its irqchip driver * tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() irqchip/sifive-plic: Chain to parent IRQ after handlers are ready irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails	2024-06-08 09:44:50 -07:00
Linus Torvalds	7cedb020d5	Merge tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Miscellaneous fixes: - Fix kexec() crash if call depth tracking is enabled - Fix SMN reads on inaccessible registers on certain AMD systems" * tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd_nb: Check for invalid SMN reads x86/kexec: Fix bug with call depth tracking	2024-06-08 09:36:08 -07:00
Linus Torvalds	7cec2e16cb	Merge tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix race between perf_event_free_task() and perf_event_release_kernel() that can result in missed wakeups and hung tasks" * tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix missing wakeup when waiting for context reference	2024-06-08 09:26:59 -07:00
Linus Torvalds	bbc5332b8c	Merge tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking doc fix from Ingo Molnar: "Fix typos in the kerneldoc of some of the atomic APIs" * tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc	2024-06-08 09:03:46 -07:00
Linus Torvalds	dc772f8237	Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "14 hotfixes, 6 of which are cc:stable. All except the nilfs2 fix affect MM and all are singletons - see the chagelogs for details" * tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors mm: fix xyz_noprof functions calling profiled functions codetag: avoid race at alloc_slab_obj_exts mm/hugetlb: do not call vma_add_reservation upon ENOMEM mm/ksm: fix ksm_zero_pages accounting mm/ksm: fix ksm_pages_scanned accounting kmsan: do not wipe out origin when doing partial unpoisoning vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr() mm: page_alloc: fix highatomic typing in multi-block buddies nilfs2: fix potential kernel bug due to lack of writeback flag waiting memcg: remove the lockdep assert from __mod_objcg_mlstate() mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n mm: drop the 'anon_' prefix for swap-out mTHP counters	2024-06-07 17:01:10 -07:00
Linus Torvalds	e60721bf3c	Merge tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - interrupt handling and Kconfig fixes for gpio-tqmx86 - add a buffer for storing output values in gpio-tqmx86 as reading back the registers always returns the input values - add missing MODULE_DESCRIPTION()s to several GPIO drivers * tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: add missing MODULE_DESCRIPTION() macros gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type gpio: tqmx86: store IRQ trigger type and unmask status separately gpio: tqmx86: introduce shadow register for GPIO output value gpio: tqmx86: fix typo in Kconfig label	2024-06-07 16:54:57 -07:00
Linus Torvalds	602079a0a1	Merge tag 'block-6.10-20240607' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Fix for null_blk block size validation (Andreas) - NVMe pull request via Keith: - Use reserved tags for special fabrics operations (Chunguang) - Persistent Reservation status masking fix (Weiwen) * tag 'block-6.10-20240607' of git://git.kernel.dk/linux: null_blk: fix validation of block size nvme: fix nvme_pr_* status code parsing nvme-fabrics: use reserved tag for reg read/write command	2024-06-07 16:45:48 -07:00
Linus Torvalds	e33915892d	Merge tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Fix a locking order issue with setting max async thread workers (Hagar) - Fix for a NULL pointer dereference for failed async flagged requests using ring provided buffers. This doesn't affect the current kernel, but it does affect older kernels, and is being queued up for 6.10 just to make the stable process easier (me) - Fix for NAPI timeout calculations for how long to busy poll, and subsequently how much to sleep post that if a wait timeout is passed in (me) - Fix for a regression in this release cycle, where we could end up using a partially unitialized match value for io-wq (Su) * tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux: io_uring: fix possible deadlock in io_register_iowq_max_workers() io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue() io_uring/napi: fix timeout calculation io_uring: check for non-NULL file pointer in io_file_can_poll()	2024-06-07 16:43:07 -07:00
Linus Torvalds	07978330e6	Merge tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix handling of folio private changes. The private value holds pointer to our extent buffer structure representing a metadata range. Release and create of the range was not properly synchronized when updating the private bit which ended up in double folio_put, leading to all sorts of breakage - fix a crash, reported as duplicate key in metadata, but caused by a race of fsync and size extending write. Requires prealloc target range + fsync and other conditions (log tree state, timing) - fix leak of qgroup extent records after transaction abort * tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: protect folio::private when attaching extent buffer folios btrfs: fix leak of qgroup extent records after transaction abort btrfs: fix crash on racing fsync and size-extending write into prealloc	2024-06-07 15:13:12 -07:00
Linus Torvalds	eecba7c070	Merge tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix an occasional memory overwrite caused by a fix added in 6.10 * tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix loop termination condition in gss_free_in_token_pages()	2024-06-07 15:07:57 -07:00
Linus Torvalds	0a02756d91	Merge tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - Another fix to avoid allocating pages that overlap with ERR_PTR, which manifests on rv32 - A revert for the badaccess patch I incorrectly picked up an early version of * tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: Revert "riscv: mm: accelerate pagefault when badaccess" riscv: fix overlap of allocated page and PTR_ERR	2024-06-07 14:47:38 -07:00
Linus Torvalds	8d6b029e15	Merge tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Do not create PT_LOAD program header for the kenel image when the virtual memory informaton in OS_INFO data is not available. That fixes stand-alone dump failures against kernels that do not provide the virtual memory informaton - Add KVM s390 shared zeropage selftest * tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: KVM: s390x: selftests: Add shared zeropage test s390/crash: Do not use VM info if os_info does not have it	2024-06-07 14:44:53 -07:00
Linus Torvalds	8d437867ba	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - Fix spurious CPU hotplug warning message from SETEND emulation code - Fix the build when GCC wasn't inlining our I/O accessor internals * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/io: add constant-argument check arm64: armv8_deprecated: Fix warning in isndep cpuhp starting process	2024-06-07 14:36:57 -07:00
Linus Torvalds	96e09b8f81	Merge tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - Default silead touchscreen driver to 10 fingers and drop 10 finger setting from all DMI quirks. More of a cleanup then a pure fix, but since the DMI quirks always get updated through the fixes branch this avoids conflicts. - Kconfig fix for randconfig builds - dell-smbios: Fix wrong token data in sysfs - amd-hsmp: Fix driver poking unsupported hw when loaded manually * tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/hsmp: Check HSMP support on AMD family of processors platform/x86: dell-smbios: Simplify error handling platform/x86: dell-smbios: Fix wrong token data in sysfs platform/x86: yt2-1380: add CONFIG_EXTCON dependency platform/x86: touchscreen_dmi: Use 2-argument strscpy() platform/x86: touchscreen_dmi: Drop "silead,max-fingers" property Input: silead - Always support 10 fingers	2024-06-07 14:13:46 -07:00
Linus Torvalds	f24b46ea10	Merge tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "Core: - Make iommu-dma code recognize 'force_aperture' again - Fix for potential NULL-ptr dereference from iommu_sva_bind_device() return value AMD IOMMU fixes: - Fix lockdep splat for invalid wait context - Add feature bit check before enabling PPR - Make workqueue name fit into buffer - Fix memory leak in sysfs code" * tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix Invalid wait context issue iommu/amd: Check EFR[EPHSup] bit before enabling PPR iommu/amd: Fix workqueue name iommu: Return right value in iommu_sva_bind_device() iommu/dma: Fix domain init iommu/amd: Fix sysfs leak in iommu init	2024-06-07 13:34:53 -07:00
Linus Torvalds	e693c5026c	Merge tag 'ata-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - Fix a regression for the PATA MacIO driver were it would fail to probe because of the recent changes of initializing the limits in SCSI core * tag 'ata-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: pata_macio: Fix max_segment_size with PAGE_SIZE == 64K	2024-06-07 12:47:20 -07:00
Linus Torvalds	2e32d58075	Merge tag 'drm-fixes-2024-06-07' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Weekly fixes: vmwgfx leads the way this week, with minor changes in xe and amdgpu and a couple of other small fixes. Seems quiet enough. xe: - Update the LMTT when freeing VF GT config amdgpu: - Fix shutdown issues on some SMU 13.x platforms - Silence some UBSAN flexible array warnings panel: - sitronix-st7789v: handle of_drm_get_panel_orientation failing error vmwgfx: - filter modes greater than available graphics memory - fix 3D vs STDU enable - remove STDU logic from mode valid - logging fix - memcmp pointers fix - remove unused struct - screen target lifetime fix komeda: - unused struct removal" * tag 'drm-fixes-2024-06-07' of https://gitlab.freedesktop.org/drm/kernel: drm/vmwgfx: Don't memcmp equivalent pointers drm/vmwgfx: remove unused struct 'vmw_stdu_dma' drm/vmwgfx: Don't destroy Screen Target when CRTC is enabled but inactive drm/vmwgfx: Standardize use of kibibytes when logging drm/vmwgfx: Remove STDU logic from generic mode_valid function drm/vmwgfx: 3D disabled should not effect STDU memory limits drm/vmwgfx: Filter modes which exceed graphics memory drm/amdgpu/pptable: Fix UBSAN array-index-out-of-bounds drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms drm/xe/pf: Update the LMTT when freeing VF GT config drm/panel: sitronix-st7789v: Add check for of_drm_get_panel_orientation drm/komeda: remove unused struct 'gamma_curve_segment'	2024-06-07 12:35:56 -07:00
Greg Kroah-Hartman	1092c4126b	Merge tag 'thunderbolt-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next Mika writes: thunderbolt: Fix for v6.10-rc3 This includes one USB4/Thunderbolt fix for v6.10-rc3: - Fix lane margining debugfs node creation condition. This has been in linux-next with no reported issues.	2024-06-07 21:08:20 +02:00
Greg Kroah-Hartman	8f40af3197	Merge tag 'iio-fixes-for-6.10a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: 1st set of IIO fixes for the 6.10 cycle. The usual mixed bag of old and new driver bugs plus one core issue that highlighted we have some documentation issues that we need to fix as a follow up action. core in kernel interface - Wrong return value documentation didn't help with an error in a cleanup. Result is that thermal is failing to read the temperature. adi,ad3552r - Fix DT binding output range sign error. adi,ad5592r - Wrong scaling on temperature channel. adi,ad7173 - Driver assumed all supported devices had input buffers. Make sure not to enable them on the ad7176-2 which doesn't. - Add some missing device names for recently added models. - Drop an unneeded zero index on the single temperature channel. - Clear buffered capture specific control bit when returning to on demand sampling which otherwise no longer works. - Make sampling frequency per channel rather than just setting it for the first channel. adi,ad9467 - Capital S for sign of channel whereas ABI is lowercase. bosch,bmi323 - Make sure to release the trigger even on error paths in the trigger handler as otherwise there is no path to recover. bosch,bmp280 - Avoid an overflow in calculating the temperature. invensense,timestamp helper - Fix case where ODR is being switched to the existing ODR an update that never finishes. - Fix an issue with the timestamp being updated whilst still handling previous interrupt (icm42600 and mpu6050) invensense,icm42600 - Don't update the watermark parameters twice. melexis,mlx90635 - Fix variable returned as error code. * tag 'iio-fixes-for-6.10a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: inkern: fix channel read regression iio: imu: inv_mpu6050: stabilized timestamping in interrupt iio: adc: ad7173: Fix sampling frequency setting iio: adc: ad7173: Clear append status bit iio: imu: inv_icm42600: delete unneeded update watermark call iio: imu: inv_icm42600: stabilized timestamp in interrupt iio: invensense: fix odr switching to same value iio: adc: ad7173: Remove index from temp channel iio: adc: ad7173: Add ad7173_device_info names iio: adc: ad7173: fix buffers enablement for ad7176-2 iio: temperature: mlx90635: Fix ERR_PTR dereference in mlx90635_probe() iio: imu: bmi323: Fix trigger notification in case of error iio: dac: ad5592r: fix temperature channel scaling value iio: pressure: bmp280: Fix BMP580 temperature reading dt-bindings: iio: dac: fix ad354xr output range iio: adc: ad9467: fix scan type sign	2024-06-07 21:05:39 +02:00
Mario Limonciello	e79a10652b	ACPI: x86: Force StorageD3Enable on more products A Rembrandt-based HP thin client is reported to have problems where the NVME disk isn't present after resume from s2idle. This is because the NVME disk wasn't put into D3 at suspend, and that happened because the StorageD3Enable _DSD was missing in the BIOS. As AMD's architecture requires that the NVME is in D3 for s2idle, adjust the criteria for force_storage_d3 to match all Zen SoCs when the FADT advertises low power idle support. This will ensure that any future products with this BIOS deficiency don't need to be added to the allow list of overrides. Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-07 20:24:07 +02:00
Louis Dalibard	a3a5a37efb	HID: Ignore battery for ELAN touchscreens 2F2C and 4116 At least ASUS Zenbook 14 (2023) and ASUS Zenbook 14 Pro (2023) are affected. The touchscreen reports a battery status of 0% and jumps to 1% when a stylus is used. The device ID was added and the battery ignore quirk was enabled for it. [jkosina@suse.com: reformatted changelog a bit] Signed-off-by: Louis Dalibard <ontake@ontake.dev> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-07 17:02:56 +02:00
Jani Nikula	0c76053e3f	drm: have config DRM_WERROR depend on !WERROR If WERROR is already enabled, there's no point in enabling DRM_WERROR or asking users about it. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Closes: https://lore.kernel.org/r/CAHk-=whxT8D_0j=bjtrvj-O=VEOjn6GW8GK4j2V+BiDUntZKAQ@mail.gmail.com Fixes: `f89632a9e5` ("drm: Add CONFIG_DRM_WERROR") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240516083343.1375687-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-06-07 16:28:34 +03:00
Aleksandr Mishin	c44711b786	liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value, but then it is unconditionally passed to skb_add_rx_frag() which looks strange and could lead to null pointer dereference. lio_vf_rep_copy_packet() call trace looks like: octeon_droq_process_packets octeon_droq_fast_process_packets octeon_droq_dispatch_pkt octeon_create_recv_info ...search in the dispatch_list... ->disp_fn(rdisp->rinfo, ...) lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...) In this path there is no code which sets pg_info->page to NULL. So this check looks unneeded and doesn't solve potential problem. But I guess the author had reason to add a check and I have no such card and can't do real test. In addition, the code in the function liquidio_push_packet() in liquidio/lio_core.c does exactly the same. Based on this, I consider the most acceptable compromise solution to adjust this issue by moving skb_add_rx_frag() into conditional scope. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `1f233f3279` ("liquidio: switchdev support for LiquidIO NIC") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 14:22:19 +01:00
Rafael J. Wysocki	7f18bd49cb	thermal: ACPI: Invalidate trip points with temperature of 0 or below It is reported that commit `9502108876` ("thermal: core: Drop trips_disabled bitmask") causes the maximum frequency of CPUs to drop further down with every system sleep-wake cycle on Intel Core i7-4710HQ. This turns out to be due to a trip point whose temperature is equal to 0 degrees Celsius which is acted on every time the system wakes from sleep. Before commit `9502108876` this trip point would be disabled wia the trips_disabled bitmask, but now it is treated as a valid one. Since ACPI thermal control is generally about protection against overheating, trip points with temperature of 0 centigrade or below are not particularly useful there, so initialize them all as invalid which fixes the problem at hand. Fixes: `9502108876` ("thermal: core: Drop trips_disabled bitmask") Closes: https://lore.kernel.org/linux-pm/3f71747b-f852-4ee0-b384-cf46b2aefa3f@gmx.com Reported-by: Tibor Billes <tbilles@gmx.com> Tested-by: Tibor Billes <tbilles@gmx.com> Cc: 6.7+ <stable@vger.kernel.org> # 6.7+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-07 13:52:17 +02:00
Rafael J. Wysocki	1af89dedc8	thermal: core: Do not fail cdev registration because of invalid initial state It is reported that commit `31a0fa0019` ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") causes the ACPI fan driver to fail probing on some systems which turns out to be due to the _FST control method returning an invalid value until _FSL is first evaluated for the given fan. If this happens, the .get_cur_state() cooling device callback returns an error and __thermal_cooling_device_register() fails as uses that callback after commit `31a0fa0019`. Arguably, _FST should not return an invalid value even if it is evaluated before _FSL, so this may be regarded as a platform firmware issue, but at the same time it is not a good enough reason for failing the cooling device registration where the initial cooling device state is only needed to initialize a thermal debug facility. Accordingly, modify __thermal_cooling_device_register() to avoid calling thermal_debug_cdev_add() instead of returning an error if the initial .get_cur_state() callback invocation fails. Fixes: `31a0fa0019` ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@collabora.com Reported-by: Laura Nao <laura.nao@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Laura Nao <laura.nao@collabora.com>	2024-06-07 13:51:51 +02:00
David S. Miller	dbfb886465	Merge branch 'hns3-fixes' Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver There are some bugfix for the HNS3 ethernet driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Jie Wang	968fde8384	net: hns3: add cond_resched() to hns3 ring buffer init process Currently hns3 ring buffer init process would hold cpu too long with big Tx/Rx ring depth. This could cause soft lockup. So this patch adds cond_resched() to the process. Then cpu can break to run other tasks instead of busy looping. Fixes: `a723fb8efe` ("net: hns3: refine for set ring parameters") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Yonglong Liu	12cda92021	net: hns3: fix kernel crash problem in concurrent scenario When link status change, the nic driver need to notify the roce driver to handle this event, but at this time, the roce driver may uninit, then cause kernel crash. To fix the problem, when link status change, need to check whether the roce registered, and when uninit, need to wait link update finish. Fixes: `45e92b7e4e` ("net: hns3: add calling roce callback function when link status change") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Udit Kumar	b472b996a4	dt-bindings: net: dp8386x: Add MIT license along with GPL-2.0 Modify license to include dual licensing as GPL-2.0-only OR MIT license for TI specific phy header files. This allows for Linux kernel files to be used in other Operating System ecosystems such as Zephyr or FreeBSD. While at this, update the GPL-2.0 to be GPL-2.0-only to be in sync with latest SPDX conventions (GPL-2.0 is deprecated). While at this, update the TI copyright year to sync with current year to indicate license change. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Trent Piepho <tpiepho@impinj.com> Cc: Wadim Egorov <w.egorov@phytec.de> Cc: Kip Broadhurst <kbroadhurst@ti.com> Signed-off-by: Udit Kumar <u-kumar1@ti.com> Acked-by: Wadim Egorov <w.egorov@phytec.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:16:22 +01:00
Johan Hovold	0eafc58f21	HID: i2c-hid: elan: fix reset suspend current leakage The Elan eKTH5015M touch controller found on the Lenovo ThinkPad X13s shares the VCC33 supply with other peripherals that may remain powered during suspend (e.g. when enabled as wakeup sources). The reset line is also wired so that it can be left deasserted when the supply is off. This is important as it avoids holding the controller in reset for extended periods of time when it remains powered, which can lead to increased power consumption, and also avoids leaking current through the X13s reset circuitry during suspend (and after driver unbind). Use the new 'no-reset-on-power-off' devicetree property to determine when reset needs to be asserted on power down. Notably this also avoids wasting power on machine variants without a touchscreen for which the driver would otherwise exit probe with reset asserted. Fixes: `bd3cba00dc` ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens") Cc: <stable@vger.kernel.org> # 6.0 Cc: Douglas Anderson <dianders@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240507144821.12275-5-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:18:11 +02:00
Johan Hovold	e538d4b85b	dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property When the power supply is shared with other peripherals the reset line can be wired in such a way that it can remain deasserted regardless of whether the supply is on or not. This is important as it can be used to avoid holding the controller in reset for extended periods of time when it remains powered, something which can lead to increased power consumption. Leaving reset deasserted also avoids leaking current through the reset circuitry pull-up resistors. Add a new 'no-reset-on-power-off' devicetree property which can be used by the OS to determine when reset needs to be asserted on power down. Note that this property can also be used when the supply cannot be turned off by the OS at all. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240507144821.12275-4-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:16:55 +02:00
Johan Hovold	07fc16fa55	dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M Add a compatible string for the Elan eKTH5015M touch controller. Judging from the current binding and commit `bd3cba00dc` ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens"), eKTH5015M appears to be compatible with eKTH6915. Notably the power-on sequence is the same. While at it, drop a redundant label from the example. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240507144821.12275-3-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:16:55 +02:00
Johan Hovold	8d3ae46c64	dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema The Ilitek ILI2901 touch screen controller was apparently incorrectly added to the Elan eKTH6915 schema simply because it also has a reset gpio and is currently managed by the Elan driver in Linux. The two controllers are not related even if an unfortunate wording in the commit message adding the Ilitek compatible made it sound like they were. Add a dedicated schema for the ILI2901 which does not specify the I2C address (which is likely 0x41 rather than 0x10 as for other Ilitek touch controllers) to avoid cluttering the Elan schema with unrelated devices and to make it easier to find the correct schema when adding further Ilitek controllers. Fixes: `d74ac6f60a` ("dt-bindings: HID: i2c-hid: elan: Introduce Ilitek ili2901") Cc: Zhengqiao Xia <xiazhengqiao@huaqin.corp-partner.google.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240507144821.12275-2-johan+linaro@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:16:55 +02:00
Aseda Aboagye	22d6d060ac	input: Add support for "Do Not Disturb" HUTRR94 added support for a new usage titled "System Do Not Disturb" which toggles a system-wide Do Not Disturb setting. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-gUHE70s7wCAoB@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:08:07 +02:00
Aseda Aboagye	0c7dd00de0	input: Add event code for accessibility key HUTRR116 added support for a new usage titled "System Accessibility Binding" which toggles a system-wide bound accessibility UI or command. This commit simply adds a new event code for the usage. Signed-off-by: Aseda Aboagye <aaboagye@chromium.org> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/Zl-e97O9nvudco5z@google.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:08:07 +02:00
Andrew Ballance	89e1ee118d	hid: asus: asus_report_fixup: fix potential read out of bounds syzbot reported a potential read out of bounds in asus_report_fixup. this patch adds checks so that a read out of bounds will not occur Signed-off-by: Andrew Ballance <andrewjballance@gmail.com> Reported-by: <syzbot+07762f019fd03d01f04c@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=07762f019fd03d01f04c Fixes: `59d2f5b739` ("HID: asus: fix more n-key report descriptors if n-key quirked") Link: https://lore.kernel.org/r/20240602085023.1720492-1-andrewjballance@gmail.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2024-06-07 11:07:40 +02:00
Jeff Johnson	64054eb716	gpio: add missing MODULE_DESCRIPTION() macros On x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-gw-pld.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-mc33880.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpio/gpio-pcf857x.o Add the missing invocations of the MODULE_DESCRIPTION() macro, including the one missing in gpio-pl061.c, which is not built for x86. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240606-md-drivers-gpio-v1-1-cb42d240ca5c@quicinc.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-06-07 10:05:21 +02:00
David Howells	a88d609036	cifs: Don't advance the I/O iterator before terminating subrequest There's now no need to make sure subreq->io_iter is advanced to match subreq->transferred before calling one of the netfs subrequest termination functions as the check has been removed netfslib and the iterator is reset prior to retrying a subreq. Fixes: `3ee1a1fc39` ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French <stfrench@microsoft.com>	2024-06-07 01:05:26 -05:00
Enzo Matsumiya	02c418774f	smb: client: fix deadlock in smb2_find_smb_tcon() Unlock cifs_tcp_ses_lock before calling cifs_put_smb_ses() to avoid such deadlock. Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-06-07 01:05:07 -05:00
Csókás, Bence	e96b293315	net: sfp: Always call `sfp_sm_mod_remove()` on remove If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will not be run. As a consequence, `sfp_hwmon_remove()` is not getting run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()` itself checks `sfp->sm_mod_state` anyways, so this check was not really needed in the first place. Fixes: `d2e816c029` ("net: sfp: handle module remove outside state machine") Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 17:34:03 -07:00
Masahiro Yamada	9185afeac2	modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o Building with W=1 incorrectly emits the following warning: WARNING: modpost: missing MODULE_DESCRIPTION() in vmlinux.o This check should apply only to modules. Fixes: `1fffe7a34c` ("script: modpost: emit a warning when the description is missing") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>	2024-06-07 08:42:31 +09:00
Richard Acayan	96c965667b	kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh In commit `b18b047002` ("kbuild: change scripts/mksysmap into sed script"), the mksysmap script was transformed into a sed script, made directly executable with "#!/bin/sed -f". Apparently, the path to sed is different on NixOS. The shebang can't use the env command, otherwise the "sed -f" command would be treated as a single argument. This can be solved with the -S flag, but that is a GNU extension. Explicitly use sed instead of relying on the executable shebang to fix NixOS builds without breaking build environments using Busybox. Fixes: `b18b047002` ("kbuild: change scripts/mksysmap into sed script") Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-06-07 08:42:14 +09:00
Dave Airlie	eb55943aab	Merge tag 'drm-misc-next-fixes-2024-06-07' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-next-fixes for v6.10-rc3: - Single unused struct removal that should have been in -fixes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/0251b6ae-bffa-44b2-b698-955712c25a27@linux.intel.com	2024-06-07 08:40:58 +10:00
Dave Airlie	26033424ed	Merge tag 'drm-misc-fixes-2024-06-07' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc3: - Robustness fixes for vmwgfx. - Error check for of_drm_get_panel_orientation failing in sitronix-st7789v. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d5645d00-a8cf-47d9-a2a0-4ff55842fc7d@linux.intel.com	2024-06-07 08:37:25 +10:00
Dave Airlie	2d42183110	Merge tag 'amd-drm-fixes-6.10-2024-06-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-06-06: amdgpu: - Fix shutdown issues on some SMU 13.x platforms - Silence some UBSAN flexible array warnings Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606192348.3620805-1-alexander.deucher@amd.com	2024-06-07 08:22:59 +10:00
Linus Torvalds	8a92980606	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The core change is to detect unusually large number of VPD pages (caused by device manufacturers having an endiannes issue) and reject them rather than trying to parse a huge non-existent array. The remaining fixes are in drivers the most user visible of which is the ALUA state transition recognition (leads to intermittent I/O errors in some situations otherwise)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort() scsi: core: Handle devices which return an unusually large VPD page count scsi: mpt3sas: Add missing kerneldoc parameter descriptions scsi: qedf: Set qed_slowpath_params to zero before use scsi: qedf: Wait for stag work during unload scsi: qedf: Don't process stag work during unload and recovery scsi: sr: Fix unintentional arithmetic wraparound scsi: core: alua: I/O errors for ALUA state transitions scsi: mpi3mr: Use proper format specifier in mpi3mr_sas_port_add()	2024-06-06 14:40:51 -07:00
Linus Torvalds	d91e656262	Merge tag 'pci-v6.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Revert lockdep checking on locking that protects device resets from user-space config accesses; it exposed issues for which fixes are in the works but are too risky for this cycle (Dan Williams) * tag 'pci-v6.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Revert the cfg_access_lock lockdep mechanism	2024-06-06 14:28:11 -07:00
Thomas Hellström	a9f9b30e17	MAINTAINERS: Update Xe driver maintainers Add Rodrigo Vivi as an Xe driver maintainer. v2: - Cc also Lucas De Marchi (Rodrigo vivi) - Remove a blank line in commit the commit message (Lucas De Marchi) Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240602190959.2981-1-thomas.hellstrom@linux.intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-06-06 12:47:47 -07:00
Oded Gabbay	bb93148ca8	MAINTAINERS: update Xe driver maintainers Because I left Intel, I'm removing myself from the list of Xe driver maintainers. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240515162222.12958-3-ogabbay@kernel.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `8de6625daf`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-06-06 12:47:47 -07:00
Qu Wenruo	f3a5367c67	btrfs: protect folio::private when attaching extent buffer folios [BUG] Since v6.8 there are rare kernel crashes reported by various people, the common factor is bad page status error messages like this: BUG: Bad page state in process kswapd0 pfn:d6e840 page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c pfn:0xd6e840 aops:btree_aops ino:1 flags: 0x17ffffe0000008(uptodate\|node=0\|zone=2\|lastcpupid=0x3fffff) page_type: 0xffffffff() raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0 raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: non-NULL mapping [CAUSE] Commit `09e6cef19c` ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") changes the sequence when allocating a new extent buffer. Previously we always called grab_extent_buffer() under mapping->i_private_lock, to ensure the safety on modification on folio::private (which is a pointer to extent buffer for regular sectorsize). This can lead to the following race: Thread A is trying to allocate an extent buffer at bytenr X, with 4 4K pages, meanwhile thread B is trying to release the page at X + 4K (the second page of the extent buffer at X). Thread A \| Thread B -----------------------------------+------------------------------------- \| btree_release_folio() \| \| This is for the page at X + 4K, \| \| Not page X. \| \| alloc_extent_buffer() \| \|- release_extent_buffer() \|- filemap_add_folio() for the \| \| \|- atomic_dec_and_test(eb->refs) \| page at bytenr X (the first \| \| \| \| page). \| \| \| \| Which returned -EEXIST. \| \| \| \| \| \| \| \|- filemap_lock_folio() \| \| \| \| Returned the first page locked. \| \| \| \| \| \| \| \|- grab_extent_buffer() \| \| \| \| \|- atomic_inc_not_zero() \| \| \| \| \| Returned false \| \| \| \| \|- folio_detach_private() \| \| \|- folio_detach_private() for X \| \|- folio_test_private() \| \| \|- folio_test_private() \| Returned true \| \| \| Returned true \|- folio_put() \| \|- folio_put() Now there are two puts on the same folio at folio X, leading to refcount underflow of the folio X, and eventually causing the BUG_ON() on the page->mapping. The condition is not that easy to hit: - The release must be triggered for the middle page of an eb If the release is on the same first page of an eb, page lock would kick in and prevent the race. - folio_detach_private() has a very small race window It's only between folio_test_private() and folio_clear_private(). That's exactly when mapping->i_private_lock is used to prevent such race, and commit `09e6cef19c` ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") screwed that up. At that time, I thought the page lock would kick in as filemap_release_folio() also requires the page to be locked, but forgot the filemap_release_folio() only locks one page, not all pages of an extent buffer. [FIX] Move all the code requiring i_private_lock into attach_eb_folio_to_filemap(), so that everything is done with proper lock protection. Furthermore to prevent future problems, add an extra lockdep_assert_locked() to ensure we're holding the proper lock. To reproducer that is able to hit the race (takes a few minutes with instrumented code inserting delays to alloc_extent_buffer()): #!/bin/sh drop_caches () { while(true); do echo 3 > /proc/sys/vm/drop_caches echo 1 > /proc/sys/vm/compact_memory done } run_tar () { while(true); do for x in `seq 1 80` ; do tar cf /dev/zero /mnt > /dev/null & done wait done } mkfs.btrfs -f -d single -m single /dev/vda mount -o noatime /dev/vda /mnt # create 200,000 files, 1K each ./simoop -n 200000 -E -f 1k /mnt drop_caches & (run_tar) Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/linux-btrfs/CAHk-=wgt362nGfScVOOii8cgKn2LVVHeOvOA7OBwg1OwbuJQcw@mail.gmail.com/ Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Link: https://lore.kernel.org/lkml/CABXGCsPktcHQOvKTbPaTwegMExije=Gpgci5NW=hqORo-s7diA@mail.gmail.com/ Reported-by: Toralf Förster <toralf.foerster@gmx.de> Link: https://lore.kernel.org/linux-btrfs/e8b3311c-9a75-4903-907f-fc0f7a3fe423@gmx.de/ Reported-by: syzbot+f80b066392366b4af85e@syzkaller.appspotmail.com Fixes: `09e6cef19c` ("btrfs: refactor alloc_extent_buffer() to allocate-then-attach method") CC: stable@vger.kernel.org # 6.8+ CC: Chris Mason <clm@fb.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-06-06 21:42:22 +02:00
Jan Beulich	3ac36aa730	x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node() memblock_set_node() warns about using MAX_NUMNODES, see `e0eec24e2e` ("memblock: make memblock_set_node() also warn about use of MAX_NUMNODES") for details. Reported-by: Narasimhan V <Narasimhan.V@amd.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org [bp: commit message] Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240603141005.23261-1-bp@kernel.org Link: https://lore.kernel.org/r/abadb736-a239-49e4-ab42-ace7acdd4278@suse.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>	2024-06-06 22:20:39 +03:00
Linus Torvalds	d30d0e49da	Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from BPF and big collection of fixes for WiFi core and drivers. Current release - regressions: - vxlan: fix regression when dropping packets due to invalid src addresses - bpf: fix a potential use-after-free in bpf_link_free() - xdp: revert support for redirect to any xsk socket bound to the same UMEM as it can result in a corruption - virtio_net: - add missing lock protection when reading return code from control_buf - fix false-positive lockdep splat in DIM - Revert "wifi: wilc1000: convert list management to RCU" - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config Previous releases - regressions: - rtnetlink: make the "split" NLM_DONE handling generic, restore the old behavior for two cases where we started coalescing those messages with normal messages, breaking sloppily-coded userspace - wifi: - cfg80211: validate HE operation element parsing - cfg80211: fix 6 GHz scan request building - mt76: mt7615: add missing chanctx ops - ath11k: move power type check to ASSOC stage, fix connecting to 6 GHz AP - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS - iwlwifi: mvm: fix a crash on 7265 Previous releases - always broken: - ncsi: prevent multi-threaded channel probing, a spec violation - vmxnet3: disable rx data ring on dma allocation failure - ethtool: init tsinfo stats if requested, prevent unintentionally reporting all-zero stats on devices which don't implement any - dst_cache: fix possible races in less common IPv6 features - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED - ax25: fix two refcounting bugs - eth: ionic: fix kernel panic in XDP_TX action Misc: - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB" * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) selftests: net: lib: set 'i' as local selftests: net: lib: avoid error removing empty netns name selftests: net: lib: support errexit with busywait net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() ipv6: fix possible race in __fib6_drop_pcpu_from() af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. af_unix: Annotate data-races around sk->sk_sndbuf. af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). af_unix: Annotate data-race of sk->sk_state in unix_accept(). af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). af_unix: Annodate data-races around sk->sk_state for writers. af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer. ...	2024-06-06 09:55:27 -07:00
Linus Torvalds	2faf6332c5	Merge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo Pull tomoyo fixlet from Tetsuo Handa: "Single patch to update project links, no behavior changes" * tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: update project links	2024-06-06 09:48:57 -07:00
Linus Torvalds	a34adf6010	Merge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Ensure that .discard sections are really discarded in the EFI zboot image build - Return proper error numbers from efi-pstore - Add __nocfi annotations to EFI runtime wrappers * tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Add missing __nocfi annotations to runtime wrappers efi: pstore: Return proper errors on UEFI failures efi/libstub: zboot.lds: Discard .discard sections	2024-06-06 09:39:36 -07:00
Jakub Kicinski	27bc865408	Merge branch 'selftests-net-lib-small-fixes' Matthieu Baerts says: ==================== selftests: net: lib: small fixes While looking at using 'lib.sh' for the MPTCP selftests [1], we found some small issues with 'lib.sh'. Here they are: - Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is supported in some functions, not all. A fix for v6.8+. - Patch 2: avoid confusing error messages linked to the cleaning part when the netns setup fails. A fix for v6.8+. - Patch 3: set a variable as local to avoid accidentally changing the value of a another one with the same name on the caller side. A fix for v6.10-rc1+. Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.org/T/ [1] ==================== Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-0-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	84a8bc3ec2	selftests: net: lib: set 'i' as local Without this, the 'i' variable declared before could be overridden by accident, e.g. for i in "${@}"; do __ksft_status_merge "${i}" ## 'i' has been modified foo "${i}" ## using 'i' with an unexpected value done After a quick look, it looks like 'i' is currently not used after having been modified in __ksft_status_merge(), but still, better be safe than sorry. I saw this while modifying the same file, not because I suspected an issue somewhere. Fixes: `596c8819cb` ("selftests: forwarding: Have RET track kselftest framework constants") Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	79322174bc	selftests: net: lib: avoid error removing empty netns name If there is an error to create the first netns with 'setup_ns()', 'cleanup_ns()' will be called with an empty string as first parameter. The consequences is that 'cleanup_ns()' will try to delete an invalid netns, and wait 20 seconds if the netns list is empty. Instead of just checking if the name is not empty, convert the string separated by spaces to an array. Manipulating the array is cleaner, and calling 'cleanup_ns()' with an empty array will be a no-op. Fixes: `25ae948b44` ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	41b02ea4c0	selftests: net: lib: support errexit with busywait If errexit is enabled ('set -e'), loopy_wait -- or busywait and others using it -- will stop after the first failure. Note that if the returned status of loopy_wait is checked, and even if errexit is enabled, Bash will not stop at the first error. Fixes: `25ae948b44` ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Michael Ellerman	09fe2bfa6b	ata: pata_macio: Fix max_segment_size with PAGE_SIZE == 64K The pata_macio driver advertises a max_segment_size of 0xff00, because the hardware doesn't cope with requests >= 64K. However the SCSI core requires max_segment_size to be at least PAGE_SIZE, which is a problem for pata_macio when the kernel is built with 64K pages. In older kernels the SCSI core would just increase the segment size to be equal to PAGE_SIZE, however since the commit tagged below it causes a warning and the device fails to probe: WARNING: CPU: 0 PID: 26 at block/blk-settings.c:202 .blk_validate_limits+0x2f8/0x35c CPU: 0 PID: 26 Comm: kworker/u4:1 Not tainted 6.10.0-rc1 #1 Hardware name: PowerMac7,2 PPC970 0x390202 PowerMac ... NIP .blk_validate_limits+0x2f8/0x35c LR .blk_alloc_queue+0xc0/0x2f8 Call Trace: .blk_alloc_queue+0xc0/0x2f8 .blk_mq_alloc_queue+0x60/0xf8 .scsi_alloc_sdev+0x208/0x3c0 .scsi_probe_and_add_lun+0x314/0x52c .__scsi_add_device+0x170/0x1a4 .ata_scsi_scan_host+0x2bc/0x3e4 .async_port_probe+0x6c/0xa0 .async_run_entry_fn+0x60/0x1bc .process_one_work+0x228/0x510 .worker_thread+0x360/0x530 .kthread+0x134/0x13c .start_kernel_thread+0x10/0x14 ... scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured Although the hardware can't cope with a 64K segment, the driver already deals with that internally by splitting large requests in pata_macio_qc_prep(). That is how the driver has managed to function until now on 64K kernels. So fix the driver to advertise a max_segment_size of 64K, which avoids the warning and keeps the SCSI core happy. Fixes: `afd53a3d85` ("scsi: core: Initialize scsi midlayer limits before allocating the queue") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/ce2bf6af-4382-4fe1-b392-cc6829f5ceb2@roeck-us.net/ Reported-by: Doru Iorgulescu <doru.iorgulescu1@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218858 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-06-06 14:53:34 +02:00
Su Hui	0dcc53abf5	net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() Clang static checker (scan-build) warning: net/ethtool/ioctl.c:line 2233, column 2 Called function pointer is null (null dereference). Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix this typo error. Fixes: `201ed315f9` ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 13:34:33 +02:00
Masahiro Yamada	77a92660d8	kconfig: remove wrong expr_trans_bool() expr_trans_bool() performs an incorrect transformation. [Test Code] config MODULES def_bool y modules config A def_bool y select C if B != n config B def_tristate m config C tristate [Result] CONFIG_MODULES=y CONFIG_A=y CONFIG_B=m CONFIG_C=m This output is incorrect because CONFIG_C=y is expected. Documentation/kbuild/kconfig-language.rst clearly explains the function of the '!=' operator: If the values of both symbols are equal, it returns 'n', otherwise 'y'. Therefore, the statement: select C if B != n should be equivalent to: select C if y Or, more simply: select C Hence, the symbol C should be selected by the value of A, which is 'y'. However, expr_trans_bool() wrongly transforms it to: select C if B Therefore, the symbol C is selected by (A && B), which is 'm'. The comment block of expr_trans_bool() correctly explains its intention: * bool FOO!=n => FOO ^^^^ If FOO is bool, FOO!=n can be simplified into FOO. This is correct. However, the actual code performs this transformation when FOO is tristate: if (e->left.sym->type == S_TRISTATE) { ^^^^^^^^^^ While it can be fixed to S_BOOLEAN, there is no point in doing so because expr_tranform() already transforms FOO!=n to FOO when FOO is bool. (see the "case E_UNEQUAL" part) expr_trans_bool() is wrong and unnecessary. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org>	2024-06-06 20:09:10 +09:00
Eric Dumazet	b01e1c0307	ipv6: fix possible race in __fib6_drop_pcpu_from() syzbot found a race in __fib6_drop_pcpu_from() [1] If compiler reads more than once (*ppcpu_rt), second read could read NULL, if another cpu clears the value in rt6_get_pcpu_route(). Add a READ_ONCE() to prevent this race. Also add rcu_read_lock()/rcu_read_unlock() because we rely on RCU protection while dereferencing pcpu_rt. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Workqueue: netns cleanup_net RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984 Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48 RSP: 0018:ffffc900040df070 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16 RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091 RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8 R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline] fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline] fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038 fib6_del_route net/ipv6/ip6_fib.c:1998 [inline] fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043 fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205 fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175 fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255 __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271 rt6_sync_down_dev net/ipv6/route.c:4906 [inline] rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911 addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855 addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778 notifier_call_chain+0xb9/0x410 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline] call_netdevice_notifiers net/core/dev.c:2044 [inline] dev_close_many+0x333/0x6a0 net/core/dev.c:1585 unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193 unregister_netdevice_many net/core/dev.c:11276 [inline] default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759 ops_exit_list+0x128/0x180 net/core/net_namespace.c:178 cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640 process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231 process_scheduled_works kernel/workqueue.c:3312 [inline] worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: `d52d3997f8` ("ipv6: Create percpu rt6_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 13:05:54 +02:00
Masahiro Yamada	45c7f555bf	kconfig: doc: document behavior of 'select' and 'imply' followed by 'if' Documentation/kbuild/kconfig-language.rst explains the behavior of 'select' as follows: reverse dependencies can be used to force a lower limit of another symbol. The value of the current menu symbol is used as the minimal value <symbol> can be set to. This is not true when the 'select' property is followed by 'if'. [Test Code] config MODULES def_bool y modules config A def_tristate y select C if B config B def_tristate m config C tristate [Result] CONFIG_MODULES=y CONFIG_A=y CONFIG_B=m CONFIG_C=m If "the value of A is used as the minimal value C can be set to", C must be 'y'. The actual behavior is "C is selected by (A && B)". The lower limit of C is downgraded due to B being 'm'. This behavior is kind of weird, and this has arisen several times in the mailing list. I do not know whether it is a bug or intended behavior. Anyway, it is not feasible to change it now because many Kconfig files are written based on this behavior. The same applies to 'imply'. Document this (but reserve the possibility for a future change). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>	2024-06-06 20:05:15 +09:00
Masahiro Yamada	bf83266a1e	kconfig: doc: fix a typo in the note about 'imply' This sentence does not make sense due to a typo. Fix it. Fixes: `def2fbffe6` ("kconfig: allow symbols implied by y to become m") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>	2024-06-06 20:03:16 +09:00
Masahiro Yamada	46edf4372e	kconfig: gconf: give a proper initial state to the Save button Currently, the initial state of the "Save" button is always active. If none of the CONFIG options are changed while loading the .config file, the "Save" button should be greyed out. This can be fixed by calling conf_read() after widget initialization. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-06-06 20:03:16 +09:00
Masahiro Yamada	c181689bc4	kconfig: remove unneeded code for user-supplied values being out of range This is a leftover from commit `ce1fc9345a` ("kconfig: do not clear SYMBOL_DEF_USER when the value is out of range"). This code is now redundant because if a user-supplied value is out of range, the value adjusted by sym_validate_range() differs, and conf_unsaved has already been incremented a few lines above. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-06-06 20:03:14 +09:00
Paolo Abeni	411c0ea696	Merge branch 'af_unix-fix-lockless-access-of-sk-sk_state-and-others-fields' Kuniyuki Iwashima says: ==================== af_unix: Fix lockless access of sk->sk_state and others fields. The patch 1 fixes a bug where SOCK_DGRAM's sk->sk_state is changed to TCP_CLOSE even if the socket is connect()ed to another socket. The rest of this series annotates lockless accesses to the following fields. * sk->sk_state * sk->sk_sndbuf * net->unx.sysctl_max_dgram_qlen * sk->sk_receive_queue.qlen * sk->sk_shutdown Note that with this series there is skb_queue_empty() left in unix_dgram_disconnected() that needs to be changed to lockless version, and unix_peer(other) access there should be protected by unix_state_lock(). This will require some refactoring, so another series will follow. Changes: v2: * Patch 1: Fix wrong double lock v1: https://lore.kernel.org/netdev/20240603143231.62085-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20240604165241.44758-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:18 +02:00
Kuniyuki Iwashima	efaf24e30e	af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock(). Let's use READ_ONCE() to read sk->sk_shutdown. Fixes: `e4e541a848` ("sock-diag: Report shutdown for inet and unix sockets (v2)") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	5d915e584d	af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). We can dump the socket queue length via UNIX_DIAG by specifying UDIAG_SHOW_RQLEN. If sk->sk_state is TCP_LISTEN, we return the recv queue length, but here we do not hold recvq lock. Let's use skb_queue_len_lockless() in sk_diag_show_rqlen(). Fixes: `c9da99e647` ("unix_diag: Fixup RQLEN extension report") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	83690b82d2	af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock() checks the length of the peer socket's recvq under unix_state_lock(). However, unix_stream_read_generic() calls skb_unlink() after releasing the lock. Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks skb without unix_state_lock(). Thues, unix_state_lock() does not protect qlen. Let's use skb_queue_empty_lockless() in unix_release_sock(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	45d872f0e6	af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). Once sk->sk_state is changed to TCP_LISTEN, it never changes. unix_accept() takes advantage of this characteristics; it does not hold the listener's unix_state_lock() and only acquires recvq lock to pop one skb. It means unix_state_lock() does not prevent the queue length from changing in unix_stream_connect(). Thus, we need to use unix_recvq_full_lockless() to avoid data-race. Now we remove unix_recvq_full() as no one uses it. Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in unix_recvq_full_lockless() because of the following reasons: (1) For SOCK_DGRAM, it is a written-once field in unix_create1() (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the listener's unix_state_lock() in unix_listen(), and we hold the lock in unix_stream_connect() Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	bd9f2d0573	af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be changed concurrently. Let's use READ_ONCE() in unix_create1(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	b0632e53e0	af_unix: Annotate data-races around sk->sk_sndbuf. sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's not used in af_unix.c. Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(), unix_dgram_sendmsg(), and unix_stream_sendmsg(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	0aa3be7b3e	af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read locklessly. Let's use READ_ONCE() there. Note that the result could be inconsistent if the socket is dumped during the state change. This is common for other SOCK_DIAG and similar interfaces. Fixes: `c9da99e647` ("unix_diag: Fixup RQLEN extension report") Fixes: `2aac7a2cb0` ("unix_diag: Pending connections IDs NLA") Fixes: `45a96b9be6` ("unix_diag: Dumping all sockets core") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	af4c733b6b	af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). unix_stream_read_skb() is called from sk->sk_data_ready() context where unix_state_lock() is not held. Let's use READ_ONCE() there. Fixes: `77462de14a` ("af_unix: Add read_sock for stream socket types") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	8a34d4e8d9	af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). The following functions read sk->sk_state locklessly and proceed only if the state is TCP_ESTABLISHED. * unix_stream_sendmsg * unix_stream_read_generic * unix_seqpacket_sendmsg * unix_seqpacket_recvmsg Let's use READ_ONCE() there. Fixes: `a05d2ad1c1` ("af_unix: Only allow recv on connected seqpacket sockets.") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	1b536948e8	af_unix: Annotate data-race of sk->sk_state in unix_accept(). Once sk->sk_state is changed to TCP_LISTEN, it never changes. unix_accept() takes the advantage and reads sk->sk_state without holding unix_state_lock(). Let's use READ_ONCE() there. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	a9bf9c7dc6	af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). As small optimisation, unix_stream_connect() prefetches the client's sk->sk_state without unix_state_lock() and checks if it's TCP_CLOSE. Later, sk->sk_state is checked again under unix_state_lock(). Let's use READ_ONCE() for the first check and TCP_CLOSE directly for the second check. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	eb0718fb3e	af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). unix_poll() and unix_dgram_poll() read sk->sk_state locklessly and calls unix_writable() which also reads sk->sk_state without holding unix_state_lock(). Let's use READ_ONCE() in unix_poll() and unix_dgram_poll() and pass it to unix_writable(). While at it, we remove TCP_SYN_SENT check in unix_dgram_poll() as that state does not exist for AF_UNIX socket since the code was added. Fixes: `1586a5877d` ("af_unix: do not report POLLOUT on listeners") Fixes: `3c73419c09` ("af_unix: fix 'poll for write'/ connected DGRAM sockets") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	3a0f38eb28	af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). ioctl(SIOCINQ) calls unix_inq_len() that checks sk->sk_state first and returns -EINVAL if it's TCP_LISTEN. Then, for SOCK_STREAM sockets, unix_inq_len() returns the number of bytes in recvq. However, unix_inq_len() does not hold unix_state_lock(), and the concurrent listen() might change the state after checking sk->sk_state. If the race occurs, 0 is returned for the listener, instead of -EINVAL, because the length of skb with embryo is 0. We could hold unix_state_lock() in unix_inq_len(), but it's overkill given the result is true for pre-listen() TCP_CLOSE state. So, let's use READ_ONCE() for sk->sk_state in unix_inq_len(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	942238f973	af_unix: Annodate data-races around sk->sk_state for writers. sk->sk_state is changed under unix_state_lock(), but it's read locklessly in many places. This patch adds WRITE_ONCE() on the writer side. We will add READ_ONCE() to the lockless readers in the following patches. Fixes: `83301b5367` ("af_unix: Set TCP_ESTABLISHED for datagram sockets too") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Kuniyuki Iwashima	26bfb8b570	af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer. When a SOCK_DGRAM socket connect()s to another socket, the both sockets' sk->sk_state are changed to TCP_ESTABLISHED so that we can register them to BPF SOCKMAP. When the socket disconnects from the peer by connect(AF_UNSPEC), the state is set back to TCP_CLOSE. Then, the peer's state is also set to TCP_CLOSE, but the update is done locklessly and unconditionally. Let's say socket A connect()ed to B, B connect()ed to C, and A disconnects from B. After the first two connect()s, all three sockets' sk->sk_state are TCP_ESTABLISHED: $ ss -xa Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess u_dgr ESTAB 0 0 @A 641 * 642 u_dgr ESTAB 0 0 @B 642 * 643 u_dgr ESTAB 0 0 @C 643 * 0 And after the disconnect, B's state is TCP_CLOSE even though it's still connected to C and C's state is TCP_ESTABLISHED. $ ss -xa Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess u_dgr UNCONN 0 0 @A 641 * 0 u_dgr UNCONN 0 0 @B 642 * 643 u_dgr ESTAB 0 0 @C 643 * 0 In this case, we cannot register B to SOCKMAP. So, when a socket disconnects from the peer, we should not set TCP_CLOSE to the peer if the peer is connected to yet another socket, and this must be done under unix_state_lock(). Note that we use WRITE_ONCE() for sk->sk_state as there are many lockless readers. These data-races will be fixed in the following patches. Fixes: `83301b5367` ("af_unix: Set TCP_ESTABLISHED for datagram sockets too") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:14 +02:00
Aleksandr Mishin	b0c9a26435	net: wwan: iosm: Fix tainted pointer delete is case of region creation fail In case of region creation fail in ipc_devlink_create_region(), previously created regions delete process starts from tainted pointer which actually holds error code value. Fix this bug by decreasing region index before delete. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `4dcd183fbd` ("net: wwan: iosm: devlink registration") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240604082500.20769-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 10:15:14 +02:00
Ian Forbes	5703fc058e	drm/vmwgfx: Don't memcmp equivalent pointers These pointers are frequently the same and memcmp does not compare the pointers before comparing their contents so this was wasting cycles comparing 16 KiB of memory which will always be equal. Fixes: `bb6780aa5a` ("drm/vmwgfx: Diff cursors when using cmds") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240328190716.27367-1-ian.forbes@broadcom.com	2024-06-05 22:38:40 -04:00
Jakub Kicinski	5899c88513	Merge branch 'intel-wired-lan-driver-updates-2024-05-29-ice-igc' Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2024-05-29 (ice, igc) This series includes fixes for the ice driver as well as a fix for the igc driver. Jacob fixes two issues in the ice driver with reading the NVM for providing firmware data via devlink info. First, fix an off-by-one error when reading the Preserved Fields Area, resolving an infinite loop triggered on some NVMs which lack certain data in the NVM. Second, fix the reading of the NVM Shadow RAM on newer E830 and E825-C devices which have a variable sized CSS header rather than assuming this header is always the same fixed size as in the E810 devices. Larysa fixes three issues with the ice driver XDP logic that could occur if the number of queues is changed after enabling an XDP program. First, the af_xdp_zc_qps bitmap is removed and replaced by simpler logic to track whether queues are in zero-copy mode. Second, the reset and .ndo_bpf flows are distinguished to avoid potential races with a PF reset occuring simultaneously to .ndo_bpf callback from userspace. Third, the logic for mapping XDP queues to vectors is fixed so that XDP state is restored for XDP queues after a reconfiguration. Sasha fixes reporting of Energy Efficient Ethernet support via ethtool in the igc driver. v1: https://lore.kernel.org/r/20240530-net-2024-05-30-intel-net-fixes-v1-0-8b11c8c9bff8@intel.com ==================== Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-0-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:28:06 -07:00
Sasha Neftin	7d67d11fbe	igc: Fix Energy Efficient Ethernet support declaration The commit `01cf893bf0` ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") removed SUPPORTED_Autoneg field but left inappropriate ethtool_keee structure initialization. When "ethtool --show <device>" (get_eee) invoke, the 'ethtool_keee' structure was accidentally overridden. Remove the 'ethtool_keee' overriding and add EEE declaration as per IEEE specification that allows reporting Energy Efficient Ethernet capabilities. Examples: Before fix: ethtool --show-eee enp174s0 EEE settings for enp174s0: EEE status: not supported After fix: EEE settings for enp174s0: EEE status: disabled Tx LPI: disabled Supported EEE link modes: 100baseT/Full 1000baseT/Full 2500baseT/Full Fixes: `01cf893bf0` ("net: intel: i40e/igc: Remove setting Autoneg in EEE capabilities") Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-6-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:56 -07:00
Larysa Zaremba	f3df404425	ice: map XDP queues to vectors in ice_vsi_map_rings_to_vectors() ice_pf_dcb_recfg() re-maps queues to vectors with ice_vsi_map_rings_to_vectors(), which does not restore the previous state for XDP queues. This leads to no AF_XDP traffic after rebuild. Map XDP queues to vectors in ice_vsi_map_rings_to_vectors(). Also, move the code around, so XDP queues are mapped independently only through .ndo_bpf(). Fixes: `6624e780a5` ("ice: split ice_vsi_setup into smaller functions") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-5-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:56 -07:00
Larysa Zaremba	744d197162	ice: add flag to distinguish reset from .ndo_bpf in XDP rings config Commit `6624e780a5` ("ice: split ice_vsi_setup into smaller functions") has placed ice_vsi_free_q_vectors() after ice_destroy_xdp_rings() in the rebuild process. The behaviour of the XDP rings config functions is context-dependent, so the change of order has led to ice_destroy_xdp_rings() doing additional work and removing XDP prog, when it was supposed to be preserved. Also, dependency on the PF state reset flags creates an additional, fortunately less common problem: * PFR is requested e.g. by tx_timeout handler * .ndo_bpf() is asked to delete the program, calls ice_destroy_xdp_rings(), but reset flag is set, so rings are destroyed without deleting the program * ice_vsi_rebuild tries to delete non-existent XDP rings, because the program is still on the VSI * system crashes With a similar race, when requested to attach a program, ice_prepare_xdp_rings() can actually skip setting the program in the VSI and nevertheless report success. Instead of reverting to the old order of function calls, add an enum argument to both ice_prepare_xdp_rings() and ice_destroy_xdp_rings() in order to distinguish between calls from rebuild and .ndo_bpf(). Fixes: `efc2214b60` ("ice: Add support for XDP") Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-4-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:56 -07:00
Larysa Zaremba	adbf5a4234	ice: remove af_xdp_zc_qps bitmap Referenced commit has introduced a bitmap to distinguish between ZC and copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this for us. The bitmap would be especially useful when restoring previous state after rebuild, if only it was not reallocated in the process. This leads to e.g. xdpsock dying after changing number of queues. Instead of preserving the bitmap during the rebuild, remove it completely and distinguish between ZC and copy-mode queues based on the presence of a device associated with the pool. Fixes: `e102db780e` ("ice: track AF_XDP ZC enabled queues in bitmap") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:56 -07:00
Jacob Keller	cfa747a66e	ice: fix reads from NVM Shadow RAM on E830 and E825-C devices The ice driver reads data from the Shadow RAM portion of the NVM during initialization, including data used to identify the NVM image and device, such as the ETRACK ID used to populate devlink dev info fw.bundle. Currently it is using a fixed offset defined by ICE_CSS_HEADER_LENGTH to compute the appropriate offset. This worked fine for E810 and E822 devices which both have CSS header length of 330 words. Other devices, including both E825-C and E830 devices have different sizes for their CSS header. The use of a hard coded value results in the driver reading from the wrong block in the NVM when attempting to access the Shadow RAM copy. This results in the driver reporting the fw.bundle as 0x0 in both the devlink dev info and ethtool -i output. The first E830 support was introduced by commit `ba20ecb1d1` ("ice: Hook up 4 E830 devices by adding their IDs") and the first E825-C support was introducted by commit `f64e189442` ("ice: introduce new E825C devices family") The NVM actually contains the CSS header length embedded in it. Remove the hard coded value and replace it with logic to read the length from the NVM directly. This is more resilient against all existing and future hardware, vs looking up the expected values from a table. It ensures the driver will read from the appropriate place when determining the ETRACK ID value used for populating the fw.bundle_id and for reporting in ethtool -i. The CSS header length for both the active and inactive flash bank is stored in the ice_bank_info structure to avoid unnecessary duplicate work when accessing multiple words of the Shadow RAM. Both banks are read in the unlikely event that the header length is different for the NVM in the inactive bank, rather than being different only by the overall device family. Fixes: `ba20ecb1d1` ("ice: Hook up 4 E830 devices by adding their IDs") Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-2-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:55 -07:00
Jacob Keller	03e4a092be	ice: fix iteration of TLVs in Preserved Fields Area The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value structures in the Preserved Fields Area (PFA) of the NVM. This is used by the driver to access data such as the Part Board Assembly identifier. The function uses simple logic to iterate over the PFA. First, the pointer to the PFA in the NVM is read. Then the total length of the PFA is read from the first word. A pointer to the first TLV is initialized, and a simple loop iterates over each TLV. The pointer is moved forward through the NVM until it exceeds the PFA area. The logic seems sound, but it is missing a key detail. The Preserved Fields Area length includes one additional final word. This is documented in the device data sheet as a dummy word which contains 0xFFFF. All NVMs have this extra word. If the driver tries to scan for a TLV that is not in the PFA, it will read past the size of the PFA. It reads and interprets the last dummy word of the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA as a length. The PFA resides within the Shadow RAM portion of the NVM, which is relatively small. All of its offsets are within a 16-bit size. The PFA pointer and TLV pointer are stored by the driver as 16-bit values. In almost all cases, the word following the PFA will be such that interpreting it as a length will result in 16-bit arithmetic overflow. Once overflowed, the new next_tlv value is now below the maximum offset of the PFA. Thus, the driver will continue to iterate the data as TLVs. In the worst case, the driver hits on a sequence of reads which loop back to reading the same offsets in an endless loop. To fix this, we need to correct the loop iteration check to account for this extra word at the end of the PFA. This alone is sufficient to resolve the known cases of this issue in the field. However, it is plausible that an NVM could be misconfigured or have corrupt data which results in the same kind of overflow. Protect against this by using check_add_overflow when calculating both the maximum offset of the TLVs, and when calculating the next_tlv offset at the end of each loop iteration. This ensures that the driver will not get stuck in an infinite loop when scanning the PFA. Fixes: `e961b679fb` ("ice: add board identifier info to devlink .info_get") Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:27:55 -07:00
Dr. David Alan Gilbert	b91e05f1fc	drm/vmwgfx: remove unused struct 'vmw_stdu_dma' 'vmw_stdu_dma' is unused since commit `39985eea5a` ("drm/vmwgfx: Abstract placement selection") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517232858.230860-1-linux@treblig.org	2024-06-05 22:20:25 -04:00
Ryusuke Konishi	7373a51e79	nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors The error handling in nilfs_empty_dir() when a directory folio/page read fails is incorrect, as in the old ext2 implementation, and if the folio/page cannot be read or nilfs_check_folio() fails, it will falsely determine the directory as empty and corrupt the file system. In addition, since nilfs_empty_dir() does not immediately return on a failed folio/page read, but continues to loop, this can cause a long loop with I/O if i_size of the directory's inode is also corrupted, causing the log writer thread to wait and hang, as reported by syzbot. Fix these issues by making nilfs_empty_dir() immediately return a false value (0) if it fails to get a directory folio/page. Link: https://lkml.kernel.org/r/20240604134255.7165-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+c8166c541d3971bf6c87@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c8166c541d3971bf6c87 Fixes: `2ba466d74e` ("nilfs2: directory entry operations") Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:27 -07:00
Suren Baghdasaryan	9415983599	mm: fix xyz_noprof functions calling profiled functions Grepping /proc/allocinfo for "noprof" reveals several xyz_noprof functions, which means internally they are calling profiled functions. This should never happen as such calls move allocation charge from a higher level location where it should be accounted for into these lower level helpers. Fix this by replacing profiled function calls with noprof ones. Link: https://lkml.kernel.org/r/20240531205350.3973009-1-surenb@google.com Fixes: `b951aaff50` ("mm: enable page allocation tagging") Fixes: `e26d8769da` ("mempool: hook up to memory allocation profiling") Fixes: `88ae5fb755` ("mm: vmalloc: enable memory allocation profiling") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:26 -07:00
Thadeu Lima de Souza Cascardo	3f0c44c8c2	codetag: avoid race at alloc_slab_obj_exts When CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled, the following warning may be noticed: [ 48.299584] ------------[ cut here ]------------ [ 48.300092] alloc_tag was not set [ 48.300528] WARNING: CPU: 2 PID: 1361 at include/linux/alloc_tag.h:130 alloc_tagging_slab_free_hook+0x84/0xc7 [ 48.301305] Modules linked in: [ 48.301553] CPU: 2 PID: 1361 Comm: systemd-udevd Not tainted 6.10.0-rc1-00003-gac8755535862 #176 [ 48.302196] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 48.302752] RIP: 0010:alloc_tagging_slab_free_hook+0x84/0xc7 [ 48.303169] Code: 8d 1c c4 48 85 db 74 4d 48 83 3b 00 75 1e 80 3d 65 02 86 04 00 75 15 48 c7 c7 11 48 1d 85 c6 05 55 02 86 04 01 e8 64 44 a5 ff <0f> 0b 48 8b 03 48 85 c0 74 21 48 83 f8 01 74 14 48 8b 50 20 48 f7 [ 48.304411] RSP: 0018:ffff8880111b7d40 EFLAGS: 00010282 [ 48.304916] RAX: 0000000000000000 RBX: ffff88800fcc9008 RCX: 0000000000000000 [ 48.305455] RDX: 0000000080000000 RSI: ffff888014060000 RDI: ffffed1002236f97 [ 48.305979] RBP: 0000000000001100 R08: fffffbfff0aa73a1 R09: 0000000000000000 [ 48.306473] R10: ffffffff814515e5 R11: 0000000000000003 R12: ffff88800fcc9000 [ 48.306943] R13: ffff88800b2e5cc0 R14: ffff8880111b7d90 R15: 0000000000000000 [ 48.307529] FS: 00007faf5d1908c0(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000 [ 48.308223] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.308710] CR2: 000058fb220c9118 CR3: 00000000110cc000 CR4: 0000000000750ef0 [ 48.309274] PKRU: 55555554 [ 48.309804] Call Trace: [ 48.310029] <TASK> [ 48.310290] ? show_regs+0x84/0x8d [ 48.310722] ? alloc_tagging_slab_free_hook+0x84/0xc7 [ 48.311298] ? __warn+0x13b/0x2ff [ 48.311580] ? alloc_tagging_slab_free_hook+0x84/0xc7 [ 48.311987] ? report_bug+0x2ce/0x3ab [ 48.312292] ? handle_bug+0x8c/0x107 [ 48.312563] ? exc_invalid_op+0x34/0x6f [ 48.312842] ? asm_exc_invalid_op+0x1a/0x20 [ 48.313173] ? this_cpu_in_panic+0x1c/0x72 [ 48.313503] ? alloc_tagging_slab_free_hook+0x84/0xc7 [ 48.313880] ? putname+0x143/0x14e [ 48.314152] kmem_cache_free+0xe9/0x214 [ 48.314454] putname+0x143/0x14e [ 48.314712] do_unlinkat+0x413/0x45e [ 48.315001] ? __pfx_do_unlinkat+0x10/0x10 [ 48.315388] ? __check_object_size+0x4d7/0x525 [ 48.315744] ? __sanitizer_cov_trace_pc+0x20/0x4a [ 48.316167] ? __sanitizer_cov_trace_pc+0x20/0x4a [ 48.316757] ? getname_flags+0x4ed/0x500 [ 48.317261] __x64_sys_unlink+0x42/0x4a [ 48.317741] do_syscall_64+0xe2/0x149 [ 48.318171] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 48.318602] RIP: 0033:0x7faf5d8850ab [ 48.318891] Code: fd ff ff e8 27 dd 01 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b8 5f 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 57 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 41 2d 0e 00 f7 d8 [ 48.320649] RSP: 002b:00007ffc44982b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000057 [ 48.321182] RAX: ffffffffffffffda RBX: 00005ba344a44680 RCX: 00007faf5d8850ab [ 48.321667] RDX: 0000000000000000 RSI: 00005ba344a44430 RDI: 00007ffc44982b40 [ 48.322139] RBP: 00007ffc44982c00 R08: 0000000000000000 R09: 0000000000000007 [ 48.322598] R10: 00005ba344a44430 R11: 0000000000000246 R12: 0000000000000000 [ 48.323071] R13: 00007ffc44982b40 R14: 0000000000000000 R15: 0000000000000000 [ 48.323596] </TASK> This is due to a race when two objects are allocated from the same slab, which did not have an obj_exts allocated for. In such a case, the two threads will notice the NULL obj_exts and after one assigns slab->obj_exts, the second one will happily do the exchange if it reads this new assigned value. In order to avoid that, verify that the read obj_exts does not point to an allocated obj_exts before doing the exchange. Link: https://lkml.kernel.org/r/20240527183007.1595037-1-cascardo@igalia.com Fixes: `09c46563ff` ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:26 -07:00
Oscar Salvador	8daf9c702e	mm/hugetlb: do not call vma_add_reservation upon ENOMEM sysbot reported a splat [1] on __unmap_hugepage_range(). This is because vma_needs_reservation() can return -ENOMEM if allocate_file_region_entries() fails to allocate the file_region struct for the reservation. Check for that and do not call vma_add_reservation() if that is the case, otherwise region_abort() and region_del() will see that we do not have any file_regions. If we detect that vma_needs_reservation() returned -ENOMEM, we clear the hugetlb_restore_reserve flag as if this reservation was still consumed, so free_huge_folio() will not increment the resv count. [1] https://lore.kernel.org/linux-mm/0000000000004096100617c58d54@google.com/T/#ma5983bc1ab18a54910da83416b3f89f3c7ee43aa Link: https://lkml.kernel.org/r/20240528205323.20439-1-osalvador@suse.de Fixes: `df7a6d1f64` ("mm/hugetlb: restore the reservation if needed") Signed-off-by: Oscar Salvador <osalvador@suse.de> Reported-and-tested-by: syzbot+d3fe2dc5ffe9380b714b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/0000000000004096100617c58d54@google.com/ Cc: Breno Leitao <leitao@debian.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:26 -07:00
Chengming Zhou	c2dc78b86e	mm/ksm: fix ksm_zero_pages accounting We normally ksm_zero_pages++ in ksmd when page is merged with zero page, but ksm_zero_pages-- is done from page tables side, where there is no any accessing protection of ksm_zero_pages. So we can read very exceptional value of ksm_zero_pages in rare cases, such as -1, which is very confusing to users. Fix it by changing to use atomic_long_t, and the same case with the mm->ksm_zero_pages. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-2-34bb358fdc13@linux.dev Fixes: `e2942062e0` ("ksm: count all zero pages placed by KSM") Fixes: `6080d19f07` ("ksm: add ksm zero pages for each process") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:26 -07:00
Chengming Zhou	730cdc2c72	mm/ksm: fix ksm_pages_scanned accounting Patch series "mm/ksm: fix some accounting problems", v3. We encountered some abnormal ksm_pages_scanned and ksm_zero_pages during some random tests. 1. ksm_pages_scanned unchanged even ksmd scanning has progress. 2. ksm_zero_pages maybe -1 in some rare cases. This patch (of 2): During testing, I found ksm_pages_scanned is unchanged although the scan_get_next_rmap_item() did return valid rmap_item that is not NULL. The reason is the scan_get_next_rmap_item() will return NULL after a full scan, so ksm_do_scan() just return without accounting of the ksm_pages_scanned. Fix it by just putting ksm_pages_scanned accounting in that loop, and it will be accounted more timely if that loop would last for a long time. Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-0-34bb358fdc13@linux.dev Link: https://lkml.kernel.org/r/20240528-b4-ksm-counters-v3-1-34bb358fdc13@linux.dev Fixes: `b348b5fe2b` ("mm/ksm: add pages scanned metric") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: xu xin <xu.xin16@zte.com.cn> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Stefan Roesch <shr@devkernel.io> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:25 -07:00
Alexander Potapenko	2ef3cec44c	kmsan: do not wipe out origin when doing partial unpoisoning As noticed by Brian, KMSAN should not be zeroing the origin when unpoisoning parts of a four-byte uninitialized value, e.g.: char a[4]; kmsan_unpoison_memory(a, 1); This led to false negatives, as certain poisoned values could receive zero origins, preventing those values from being reported. To fix the problem, check that kmsan_internal_set_shadow_origin() writes zero origins only to slots which have zero shadow. Link: https://lkml.kernel.org/r/20240528104807.738758-1-glider@google.com Fixes: `f80be4571b` ("kmsan: add KMSAN runtime core") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Link: https://lore.kernel.org/lkml/20240524232804.1984355-1-bjohannesmeyer@gmail.com/T/ Reviewed-by: Marco Elver <elver@google.com> Tested-by: Brian Johannesmeyer <bjohannesmeyer@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:25 -07:00
Cong Wang	0105eaabb2	vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr() After commit `2c9e5d4a00` ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of") CONFIG_BPF_JIT does not depend on CONFIG_MODULES any more and bpf jit also uses the [MODULES_VADDR, MODULES_END] memory region. But is_vmalloc_or_module_addr() still checks CONFIG_MODULES, which then returns false for a bpf jit memory region when CONFIG_MODULES is not defined. It leads to the following kernel BUG: [ 1.567023] ------------[ cut here ]------------ [ 1.567883] kernel BUG at mm/vmalloc.c:745! [ 1.568477] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 1.569367] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.9.0+ #448 [ 1.570247] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 1.570786] RIP: 0010:vmalloc_to_page+0x48/0x1ec [ 1.570786] Code: 0f 00 00 e8 eb 1a 05 00 b8 37 00 00 00 48 ba fe ff ff ff ff 1f 00 00 4c 03 25 76 49 c6 02 48 c1 e0 28 48 01 e8 48 39 d0 76 02 <0f> 0b 4c 89 e7 e8 bf 1a 05 00 49 8b 04 24 48 a9 9f ff ff ff 0f 84 [ 1.570786] RSP: 0018:ffff888007787960 EFLAGS: 00010212 [ 1.570786] RAX: 000036ffa0000000 RBX: 0000000000000640 RCX: ffffffff8147e93c [ 1.570786] RDX: 00001ffffffffffe RSI: dffffc0000000000 RDI: ffffffff840e32c8 [ 1.570786] RBP: ffffffffa0000000 R08: 0000000000000000 R09: 0000000000000000 [ 1.570786] R10: ffff888007787a88 R11: ffffffff8475d8e7 R12: ffffffff83e80ff8 [ 1.570786] R13: 0000000000000640 R14: 0000000000000640 R15: 0000000000000640 [ 1.570786] FS: 0000000000000000(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000 [ 1.570786] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.570786] CR2: ffff888006a01000 CR3: 0000000003e80000 CR4: 0000000000350ef0 [ 1.570786] Call Trace: [ 1.570786] <TASK> [ 1.570786] ? __die_body+0x1b/0x58 [ 1.570786] ? die+0x31/0x4b [ 1.570786] ? do_trap+0x9d/0x138 [ 1.570786] ? vmalloc_to_page+0x48/0x1ec [ 1.570786] ? do_error_trap+0xcd/0x102 [ 1.570786] ? vmalloc_to_page+0x48/0x1ec [ 1.570786] ? vmalloc_to_page+0x48/0x1ec [ 1.570786] ? handle_invalid_op+0x2f/0x38 [ 1.570786] ? vmalloc_to_page+0x48/0x1ec [ 1.570786] ? exc_invalid_op+0x2b/0x41 [ 1.570786] ? asm_exc_invalid_op+0x16/0x20 [ 1.570786] ? vmalloc_to_page+0x26/0x1ec [ 1.570786] ? vmalloc_to_page+0x48/0x1ec [ 1.570786] __text_poke+0xb6/0x458 [ 1.570786] ? __pfx_text_poke_memcpy+0x10/0x10 [ 1.570786] ? __pfx___mutex_lock+0x10/0x10 [ 1.570786] ? __pfx___text_poke+0x10/0x10 [ 1.570786] ? __pfx_get_random_u32+0x10/0x10 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] text_poke_copy_locked+0x70/0x84 [ 1.570786] text_poke_copy+0x32/0x4f [ 1.570786] bpf_arch_text_copy+0xf/0x27 [ 1.570786] bpf_jit_binary_pack_finalize+0x26/0x5a [ 1.570786] bpf_int_jit_compile+0x576/0x8ad [ 1.570786] ? __pfx_bpf_int_jit_compile+0x10/0x10 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] ? __kmalloc_node_track_caller+0x2b5/0x2e0 [ 1.570786] bpf_prog_select_runtime+0x7c/0x199 [ 1.570786] bpf_prepare_filter+0x1e9/0x25b [ 1.570786] ? __pfx_bpf_prepare_filter+0x10/0x10 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] ? _find_next_bit+0x29/0x7e [ 1.570786] bpf_prog_create+0xb8/0xe0 [ 1.570786] ptp_classifier_init+0x75/0xa1 [ 1.570786] ? __pfx_ptp_classifier_init+0x10/0x10 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] ? register_pernet_subsys+0x36/0x42 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] sock_init+0x99/0xa3 [ 1.570786] ? __pfx_sock_init+0x10/0x10 [ 1.570786] do_one_initcall+0x104/0x2c4 [ 1.570786] ? __pfx_do_one_initcall+0x10/0x10 [ 1.570786] ? parameq+0x25/0x2d [ 1.570786] ? rcu_is_watching+0x1c/0x3c [ 1.570786] ? trace_kmalloc+0x81/0xb2 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] ? __kmalloc+0x29c/0x2c7 [ 1.570786] ? srso_return_thunk+0x5/0x5f [ 1.570786] do_initcalls+0xf9/0x123 [ 1.570786] kernel_init_freeable+0x24f/0x289 [ 1.570786] ? __pfx_kernel_init+0x10/0x10 [ 1.570786] kernel_init+0x19/0x13a [ 1.570786] ret_from_fork+0x24/0x41 [ 1.570786] ? __pfx_kernel_init+0x10/0x10 [ 1.570786] ret_from_fork_asm+0x1a/0x30 [ 1.570786] </TASK> [ 1.570819] ---[ end trace 0000000000000000 ]--- [ 1.571463] RIP: 0010:vmalloc_to_page+0x48/0x1ec [ 1.572111] Code: 0f 00 00 e8 eb 1a 05 00 b8 37 00 00 00 48 ba fe ff ff ff ff 1f 00 00 4c 03 25 76 49 c6 02 48 c1 e0 28 48 01 e8 48 39 d0 76 02 <0f> 0b 4c 89 e7 e8 bf 1a 05 00 49 8b 04 24 48 a9 9f ff ff ff 0f 84 [ 1.574632] RSP: 0018:ffff888007787960 EFLAGS: 00010212 [ 1.575129] RAX: 000036ffa0000000 RBX: 0000000000000640 RCX: ffffffff8147e93c [ 1.576097] RDX: 00001ffffffffffe RSI: dffffc0000000000 RDI: ffffffff840e32c8 [ 1.577084] RBP: ffffffffa0000000 R08: 0000000000000000 R09: 0000000000000000 [ 1.578077] R10: ffff888007787a88 R11: ffffffff8475d8e7 R12: ffffffff83e80ff8 [ 1.578810] R13: 0000000000000640 R14: 0000000000000640 R15: 0000000000000640 [ 1.579823] FS: 0000000000000000(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000 [ 1.580992] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.581869] CR2: ffff888006a01000 CR3: 0000000003e80000 CR4: 0000000000350ef0 [ 1.582800] Kernel panic - not syncing: Fatal exception [ 1.583765] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fix this by checking CONFIG_EXECMEM instead. Link: https://lkml.kernel.org/r/20240528160838.102223-1-xiyou.wangcong@gmail.com Fixes: `2c9e5d4a00` ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:25 -07:00
Johannes Weiner	7cc5a5d650	mm: page_alloc: fix highatomic typing in multi-block buddies Christoph reports a page allocator splat triggered by xfstests: generic/176 214s ... [ 1204.507931] run fstests generic/176 at 2024-05-27 12:52:30 XFS (nvme0n1): Mounting V5 Filesystem cd936307-415f-48a3-b99d-a2d52ae1f273 XFS (nvme0n1): Ending clean mount XFS (nvme1n1): Mounting V5 Filesystem ab3ee1a4-af62-4934-9a6a-6c2fde321850 XFS (nvme1n1): Ending clean mount XFS (nvme1n1): Unmounting Filesystem ab3ee1a4-af62-4934-9a6a-6c2fde321850 XFS (nvme1n1): Mounting V5 Filesystem 7099b02d-9c58-4d1d-be1d-2cc472d12cd9 XFS (nvme1n1): Ending clean mount ------------[ cut here ]------------ page type is 3, passed migratetype is 1 (nr=512) WARNING: CPU: 0 PID: 509870 at mm/page_alloc.c:645 expand+0x1c5/0x1f0 Modules linked in: i2c_i801 crc32_pclmul i2c_smbus [last unloaded: scsi_debug] CPU: 0 PID: 509870 Comm: xfs_io Not tainted 6.10.0-rc1+ #2437 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:expand+0x1c5/0x1f0 Code: 05 16 70 bf 02 01 e8 ca fc ff ff 8b 54 24 34 44 89 e1 48 c7 c7 80 a2 28 83 48 89 c6 b8 01 00 3 RSP: 0018:ffffc90003b2b968 EFLAGS: 00010082 RAX: 0000000000000000 RBX: ffffffff83fa9480 RCX: 0000000000000000 RDX: 0000000000000005 RSI: 0000000000000027 RDI: 00000000ffffffff RBP: 00000000001f2600 R08: 00000000fffeffff R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff83676200 R12: 0000000000000009 R13: 0000000000000200 R14: 0000000000000001 R15: ffffea0007c98000 FS: 00007f72ca3d5780(0000) GS:ffff8881f9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f72ca1fff38 CR3: 00000001aa0c6002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x7b/0x120 ? expand+0x1c5/0x1f0 ? report_bug+0x191/0x1c0 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? expand+0x1c5/0x1f0 ? expand+0x1c5/0x1f0 __rmqueue_pcplist+0x3a9/0x730 get_page_from_freelist+0x7a0/0xf00 __alloc_pages_noprof+0x153/0x2e0 __folio_alloc_noprof+0x10/0xa0 __filemap_get_folio+0x16b/0x370 iomap_write_begin+0x496/0x680 While trying to service a movable allocation (page type 1), the page allocator runs into a two-pageblock buddy on the movable freelist whose second block is typed as highatomic (page type 3). This inconsistency is caused by the highatomic reservation system operating on single pageblocks, while MAX_ORDER can be bigger than that - in this configuration, pageblock_order is 9 while MAX_PAGE_ORDER is 10. The test case is observed to make several adjacent order-3 requests with __GFP_DIRECT_RECLAIM cleared, which marks the surrounding block as highatomic. Upon freeing, the blocks merge into an order-10 buddy. When the highatomic pool is drained later on, this order-10 buddy gets moved back to the movable list, but only the first pageblock is marked movable again. A subsequent expand() of this buddy warns about the tail being of a different type. This is a long-standing bug that's surfaced by the recent block type warnings added to the allocator. The consequences seem mostly benign, it just results in odd behavior: the highatomic tail blocks are not properly drained, instead they end up on the movable list first, then go back to the highatomic list after an alloc-free cycle. To fix this, make the highatomic reservation code aware that allocations/buddies can be larger than a pageblock. While it's an old quirk, the recently added type consistency warnings seem to be the most prominent consequence of it. Set the Fixes: tag accordingly to highlight this backporting dependency. Link: https://lkml.kernel.org/r/20240530114203.GA1222079@cmpxchg.org Fixes: `e0932b6c1f` ("mm: page_alloc: consolidate free page accounting") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Tested-by: Christoph Hellwig <hch@lst.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:25 -07:00
Ryusuke Konishi	a4ca369ca2	nilfs2: fix potential kernel bug due to lack of writeback flag waiting Destructive writes to a block device on which nilfs2 is mounted can cause a kernel bug in the folio/page writeback start routine or writeback end routine (__folio_start_writeback in the log below): kernel BUG at mm/page-writeback.c:3070! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI ... RIP: 0010:__folio_start_writeback+0xbaa/0x10e0 Code: 25 ff 0f 00 00 0f 84 18 01 00 00 e8 40 ca c6 ff e9 17 f6 ff ff e8 36 ca c6 ff 4c 89 f7 48 c7 c6 80 c0 12 84 e8 e7 b3 0f 00 90 <0f> 0b e8 1f ca c6 ff 4c 89 f7 48 c7 c6 a0 c6 12 84 e8 d0 b3 0f 00 ... Call Trace: <TASK> nilfs_segctor_do_construct+0x4654/0x69d0 [nilfs2] nilfs_segctor_construct+0x181/0x6b0 [nilfs2] nilfs_segctor_thread+0x548/0x11c0 [nilfs2] kthread+0x2f0/0x390 ret_from_fork+0x4b/0x80 ret_from_fork_asm+0x1a/0x30 </TASK> This is because when the log writer starts a writeback for segment summary blocks or a super root block that use the backing device's page cache, it does not wait for the ongoing folio/page writeback, resulting in an inconsistent writeback state. Fix this issue by waiting for ongoing writebacks when putting folios/pages on the backing device into writeback state. Link: https://lkml.kernel.org/r/20240530141556.4411-1-konishi.ryusuke@gmail.com Fixes: `9ff05123e3` ("nilfs2: segment constructor") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:24 -07:00
Sebastian Andrzej Siewior	36eef400c2	memcg: remove the lockdep assert from __mod_objcg_mlstate() The assert was introduced in the commit cited below as an insurance that the semantic is the same after the local_irq_save() has been removed and the function has been made static. The original requirement to disable interrupt was due the modification of per-CPU counters which require interrupts to be disabled because the counter update operation is not atomic and some of the counters are updated from interrupt context. All callers of __mod_objcg_mlstate() acquire a lock (memcg_stock.stock_lock) which disables interrupts on !PREEMPT_RT and the lockdep assert is satisfied. On PREEMPT_RT the interrupts are not disabled and the assert triggers. The safety of the counter update is already ensured by VM_WARN_ON_IRQS_ENABLED() which is part of __mod_memcg_lruvec_state() and does not require yet another check. Remove the lockdep assert from __mod_objcg_mlstate(). Link: https://lkml.kernel.org/r/20240528141341.rz_rytN_@linutronix.de Fixes: `91882c1617` ("memcg: simple cleanup of stats update functions") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:24 -07:00
Barry Song	6434e69814	mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes We are passing a huge nr to __clear_young_dirty_ptes() right now. While we should pass the number of pages, we are actually passing CONT_PTE_SIZE. This is causing lots of crashes of MADV_FREE, panic oops could vary everytime. Link: https://lkml.kernel.org/r/20240524005444.135417-1-21cnbao@gmail.com Fixes: `89e86854fb` ("mm/arm64: override clear_young_dirty_ptes() batch helper") Signed-off-by: Barry Song <v-songbaohua@oppo.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Lance Yang <ioworker0@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Chris Li <chrisl@kernel.org> Cc: Barry Song <21cnbao@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Jeff Xie <xiehuan09@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Zach O'Keefe <zokeefe@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:24 -07:00
Barry Song	94d46bf179	mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n if CONFIG_SYSFS is not enabled in config, we get the below error, All errors (new ones prefixed by >>): s390-linux-ld: mm/memory.o: in function `count_mthp_stat': >> include/linux/huge_mm.h:285:(.text+0x191c): undefined reference to `mthp_stats' s390-linux-ld: mm/huge_memory.o:(.rodata+0x10): undefined reference to `mthp_stats' vim +285 include/linux/huge_mm.h 279 280 static inline void count_mthp_stat(int order, enum mthp_stat_item item) 281 { 282 if (order <= 0 \|\| order > PMD_ORDER) 283 return; 284 > 285 this_cpu_inc(mthp_stats.stats[order][item]); 286 } 287 Link: https://lkml.kernel.org/r/20240523210045.40444-1-21cnbao@gmail.com Fixes: `ec33687c67` ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback counters") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405231728.tCAogiSI-lkp@intel.com/ Signed-off-by: Barry Song <v-songbaohua@oppo.com> Tested-by: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:24 -07:00
Baolin Wang	0d648dd5c8	mm: drop the 'anon_' prefix for swap-out mTHP counters The mTHP swap related counters: 'anon_swpout' and 'anon_swpout_fallback' are confusing with an 'anon_' prefix, since the shmem can swap out non-anonymous pages. So drop the 'anon_' prefix to keep consistent with the old swap counter names. This is needed in 6.10-rcX to avoid having an inconsistent ABI out in the field. Link: https://lkml.kernel.org/r/7a8989c13299920d7589007a30065c3e2c19f0e0.1716431702.git.baolin.wang@linux.alibaba.com Fixes: `d0f048ac39` ("mm: add per-order mTHP anon_swpout and anon_swpout_fallback counters") Fixes: `42248b9d34` ("mm: add docs for per-order mTHP counters and transhuge_page ABI") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Suggested-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-06-05 19:19:23 -07:00
Ian Forbes	7ef91dcba1	drm/vmwgfx: Don't destroy Screen Target when CRTC is enabled but inactive drm_crtc_helper_funcs::atomic_disable can be called even when the CRTC is still enabled. This can occur when the mode changes or the CRTC is set as inactive. In the case where the CRTC is being set as inactive we only want to blank the screen. The Screen Target should remain intact as long as the mode has not changed and CRTC is enabled. This fixes a bug with GDM where locking the screen results in a permanent black screen because the Screen Target is no longer defined. Fixes: `7b0062036c` ("drm/vmwgfx: Implement virtual crc generation") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240531203358.26677-1-ian.forbes@broadcom.com	2024-06-05 22:11:42 -04:00
Ian Forbes	a54a200f3d	drm/vmwgfx: Standardize use of kibibytes when logging Use the same standard abbreviation KiB instead of incorrect variants. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-5-ian.forbes@broadcom.com	2024-06-05 22:10:49 -04:00
Ian Forbes	dde1de06bd	drm/vmwgfx: Remove STDU logic from generic mode_valid function STDU has its own mode_valid function now so this logic can be removed from the generic version. Fixes: `935f795045` ("drm/vmwgfx: Refactor drm connector probing for display modes") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-4-ian.forbes@broadcom.com	2024-06-05 22:10:48 -04:00
Ian Forbes	fb5e19d2dd	drm/vmwgfx: 3D disabled should not effect STDU memory limits This limit became a hard cap starting with the change referenced below. Surface creation on the device will fail if the requested size is larger than this limit so altering the value arbitrarily will expose modes that are too large for the device's hard limits. Fixes: `7ebb47c9f9` ("drm/vmwgfx: Read new register for GB memory when available") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-3-ian.forbes@broadcom.com	2024-06-05 22:10:47 -04:00
Ian Forbes	4268269331	drm/vmwgfx: Filter modes which exceed graphics memory SVGA requires individual surfaces to fit within graphics memory (max_mob_pages) which means that modes with a final buffer size that would exceed graphics memory must be pruned otherwise creation will fail. Additionally llvmpipe requires its buffer height and width to be a multiple of its tile size which is 64. As a result we have to anticipate that llvmpipe will round up the mode size passed to it by the compositor when it creates buffers and filter modes where this rounding exceeds graphics memory. This fixes an issue where VMs with low graphics memory (< 64MiB) configured with high resolution mode boot to a black screen because surface creation fails. Fixes: `d947d1b71d` ("drm/vmwgfx: Add and connect connector helper function") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-2-ian.forbes@broadcom.com	2024-06-05 22:10:46 -04:00
Jakub Kicinski	886bf9172d	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-06-05 We've added 8 non-merge commits during the last 6 day(s) which contain a total of 9 files changed, 34 insertions(+), 35 deletions(-). The main changes are: 1) Fix a potential use-after-free in bpf_link_free when the link uses dealloc_deferred to free the link object but later still tests for presence of link->ops->dealloc, from Cong Wang. 2) Fix BPF test infra to set the run context for rawtp test_run callback where syzbot reported a crash, from Jiri Olsa. 3) Fix bpf_session_cookie BTF_ID in the special_kfunc_set list to exclude it for the case of !CONFIG_FPROBE, also from Jiri Olsa. 4) Fix a Coverity static analysis report to not close() a link_fd of -1 in the multi-uprobe feature detector, from Andrii Nakryiko. 5) Revert support for redirect to any xsk socket bound to the same umem as it can result in corrupted ring state which can lead to a crash when flushing rings. A different approach will be pursued for bpf-next to address it safely, from Magnus Karlsson. 6) Fix inet_csk_accept prototype in test_sk_storage_tracing.c which caused BPF CI failure after the last tree fast forwarding, from Andrii Nakryiko. 7) Fix a coccicheck warning in BPF devmap that iterator variable cannot be NULL, from Thorsten Blum. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: Revert "xsk: Document ability to redirect to any socket bound to the same umem" Revert "xsk: Support redirect to any socket bound to the same umem" bpf: Set run context for rawtp test_run callback bpf: Fix a potential use-after-free in bpf_link_free() bpf, devmap: Remove unnecessary if check in for loop libbpf: don't close(-1) in multi-uprobe feature detector bpf: Fix bpf_session_cookie BTF_ID in special_kfunc_set list selftests/bpf: fix inet_csk_accept prototype in test_sk_storage_tracing.c ==================== Link: https://lore.kernel.org/r/20240605091525.22628-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 19:03:08 -07:00
Dave Airlie	1cfa043fc0	Merge tag 'drm-xe-fixes-2024-06-04' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - drm/xe/pf: Update the LMTT when freeing VF GT config Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zl8uFrQp0YjTtX4p@fedora	2024-06-06 11:38:51 +10:00
Breno Leitao	4254dfeda8	scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory There is a potential out-of-bounds access when using test_bit() on a single word. The test_bit() and set_bit() functions operate on long values, and when testing or setting a single word, they can exceed the word boundary. KASAN detects this issue and produces a dump: BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965 For full log, please look at [1]. Make the allocation at least the size of sizeof(unsigned long) so that set_bit() and test_bit() have sufficient room for read/write operations without overwriting unallocated memory. [1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/ Fixes: `c696f7b83e` ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path") Cc: stable@vger.kernel.org Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240605085530.499432-1-leitao@debian.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-06-05 20:15:41 -04:00
Martin K. Petersen	7926d51f73	scsi: sd: Use READ(16) when reading block zero on large capacity disks Commit `321da3dc1f` ("scsi: sd: usb_storage: uas: Access media prior to querying device properties") triggered a read to LBA 0 before attempting to inquire about device characteristics. This was done because some protocol bridge devices will return generic values until an attached storage device's media has been accessed. Pierre Tomon reported that this change caused problems on a large capacity external drive connected via a bridge device. The bridge in question does not appear to implement the READ(10) command. Issue a READ(16) instead of READ(10) when a device has been identified as preferring 16-byte commands (use_16_for_rw heuristic). Link: https://bugzilla.kernel.org/show_bug.cgi?id=218890 Link: https://lore.kernel.org/r/70dd7ae0-b6b1-48e1-bb59-53b7c7f18274@rowland.harvard.edu Link: https://lore.kernel.org/r/20240605022521.3960956-1-martin.petersen@oracle.com Fixes: `321da3dc1f` ("scsi: sd: usb_storage: uas: Access media prior to querying device properties") Cc: stable@vger.kernel.org Reported-by: Pierre Tomon <pierretom+12@ik.me> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Pierre Tomon <pierretom+12@ik.me> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-06-05 20:08:06 -04:00
Karol Kolacinski	323a359f9b	ptp: Fix error message on failed pin verification On failed verification of PTP clock pin, error message prints channel number instead of pin index after "pin", which is incorrect. Fix error message by adding channel number to the message and printing pin number instead of channel number. Fixes: `6092315dfd` ("ptp: introduce programmable pins.") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20240604120555.16643-1-karol.kolacinski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 16:01:16 -07:00
Eric Dumazet	f921a58ae2	net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP If one TCA_TAPRIO_ATTR_PRIOMAP attribute has been provided, taprio_parse_mqprio_opt() must validate it, or userspace can inject arbitrary data to the kernel, the second time taprio_change() is called. First call (with valid attributes) sets dev->num_tc to a non zero value. Second call (with arbitrary mqprio attributes) returns early from taprio_parse_mqprio_opt() and bad things can happen. Fixes: `a3d43c0d56` ("taprio: Add support adding an admin schedule") Reported-by: Noam Rathaus <noamr@ssd-disclosure.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20240604181511.769870-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 15:54:51 -07:00
Linus Torvalds	2df0193e62	Merge tag 'thermal-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "Fix issues related to the handling of invalid trip points in the thermal core and in the thermal debug code that have been overlooked by some recent thermal control core changes" * tag 'thermal-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid thermal: core: Introduce thermal_trip_crossed() thermal/debugfs: Allow tze_seq_show() to print statistics for invalid trips thermal/debugfs: Print initial trip temperature and hysteresis in tze_seq_show()	2024-06-05 15:28:20 -07:00
Linus Torvalds	553352597d	Merge tag 'acpi-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix the ACPI EC and AC drivers, the ACPI APEI error injection driver and build issues related to the dev_is_pnp() macro referring to pnp_bus_type that is not exported to modules. Specifics: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf) - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams) - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko) - Update the ACPI AC driver to use power_supply_changed() to let the power supply core handle configuration changes properly (Thomas Weißschuh)" * tag 'acpi-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: AC: Properly notify powermanagement core about changes PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error ACPI: APEI: EINJ: Fix einj_dev release leak	2024-06-05 15:19:15 -07:00
Linus Torvalds	64c6a36d79	Merge tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the intel_pstate and amd-pstate cpufreq drivers and the cpupower utility. Specifics: - Fix a recently introduced unchecked HWP MSR access in the intel_pstate driver (Srinivas Pandruvada) - Add missing conversion from MHz to KHz to amd_pstate_set_boost() to address sysfs inteface inconsistency and fix P-state frequency reporting on AMD Family 1Ah CPUs in the cpupower utility (Dhananjay Ugwekar) - Get rid of an excess global header file used by the amd-pstate cpufreq driver (Arnd Bergmann)" * tag 'pm-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Fix unchecked HWP MSR access cpufreq: amd-pstate: Fix the inconsistency in max frequency units cpufreq: amd-pstate: remove global header file tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs	2024-06-05 15:12:35 -07:00
Aleksandr Mishin	229bedbf62	net/mlx5: Fix tainted pointer delete is case of flow rules creation fail In case of flow rule creation fail in mlx5_lag_create_port_sel_table(), instead of previously created rules, the tainted pointer is deleted deveral times. Fix this bug by using correct flow rules pointers. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `352899f384` ("net/mlx5: Lag, use buckets in hash mode") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240604100552.25201-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 12:51:25 -07:00
Yazen Ghannam	c625dabbf1	x86/amd_nb: Check for invalid SMN reads AMD Zen-based systems use a System Management Network (SMN) that provides access to implementation-specific registers. SMN accesses are done indirectly through an index/data pair in PCI config space. The PCI config access may fail and return an error code. This would prevent the "read" value from being updated. However, the PCI config access may succeed, but the return value may be invalid. This is in similar fashion to PCI bad reads, i.e. return all bits set. Most systems will return 0 for SMN addresses that are not accessible. This is in line with AMD convention that unavailable registers are Read-as-Zero/Writes-Ignored. However, some systems will return a "PCI Error Response" instead. This value, along with an error code of 0 from the PCI config access, will confuse callers of the amd_smn_read() function. Check for this condition, clear the return value, and set a proper error code. Fixes: `ddfe43cdc0` ("x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403164244.471141-1-yazen.ghannam@amd.com	2024-06-05 21:23:34 +02:00
Linus Torvalds	19ca0d8a43	Merge tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A fix for fast fsync that needs to handle errors during writes after some COW failure so it does not lead to an inconsistent state" * tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: ensure fast fsync waits for ordered extents after a write failure	2024-06-05 11:28:25 -07:00
Linus Torvalds	e20b269d73	Merge tag 'bcachefs-2024-06-05' of https://evilpiepirate.org/git/bcachefs Pull bcachefs fixes from Kent Overstreet: "Just a few small fixes" * tag 'bcachefs-2024-06-05' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix trans->locked assert bcachefs: Rereplicate now moves data off of durability=0 devices bcachefs: Fix GFP_KERNEL allocation in break_cycle()	2024-06-05 11:25:41 -07:00
Jens Axboe	27d024235b	Merge tag 'nvme-6.10-2024-06-05' of git://git.infradead.org/nvme into block-6.10 Pull NVMe fixes from Keith: "nvme fixes Linux 6.10 - Use reserved tags for special fabrics operations (Chunguang) - Persistent Reservation status masking fix (Weiwen)" * tag 'nvme-6.10-2024-06-05' of git://git.infradead.org/nvme: nvme: fix nvme_pr_* status code parsing nvme-fabrics: use reserved tag for reg read/write command	2024-06-05 12:13:00 -06:00
Andreas Hindborg	c462ecd659	null_blk: fix validation of block size Block size should be between 512 and PAGE_SIZE and be a power of 2. The current check does not validate this, so update the check. Without this patch, null_blk would Oops due to a null pointer deref when loaded with bs=1536 [1]. Link: https://lore.kernel.org/all/87wmn8mocd.fsf@metaspace.dk/ Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240603192645.977968-1-nmi@metaspace.dk [axboe: remove unnecessary braces and != 0 check] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-05 12:12:54 -06:00
Tasos Sahanidis	c6c4dd5401	drm/amdgpu/pptable: Fix UBSAN array-index-out-of-bounds Flexible arrays used [1] instead of []. Replace the former with the latter to resolve multiple UBSAN warnings observed on boot with a BONAIRE card. In addition, use the __counted_by attribute where possible to hint the length of the arrays to the compiler and any sanitizers. Signed-off-by: Tasos Sahanidis <tasos@tasossah.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-06-05 13:43:34 -04:00
Mario Limonciello	267cace556	drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms commit `cd94d1b182` ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues that were reported since commit `31729e8c21` ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") but caused issues for some people. Adjust the workaround flow to properly only apply in the S4 case: -> For shutdown go through SMU_MSG_PrepareMp1ForUnload -> For S4 go through SMU_MSG_GfxDeviceDriverReset and SMU_MSG_PrepareMp1ForUnload Reported-and-tested-by: lectrode <electrodexsnet@gmail.com> Closes: https://github.com/void-linux/void-packages/issues/50417 Cc: stable@vger.kernel.org Fixes: `cd94d1b182` ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-06-05 13:41:56 -04:00
Linus Torvalds	558dc49aac	Merge tag 'i2c-for-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "This should have been my second pull request during the merge window but one dependency in the drm subsystem fell through the cracks and was only applied for rc2. Now we can finally remove I2C_CLASS_SPD" * tag 'i2c-for-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: Remove I2C_CLASS_SPD i2c: synquacer: Remove a clk reference from struct synquacer_i2c	2024-06-05 10:32:20 -07:00
Linus Torvalds	208d9b65c0	Merge tag 'tpmdd-next-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "The bug fix for tpm_tis_core_init() is not that critical but still makes sense to get into release for the sake of better quality. I included the Intel CPU model define change mainly to help Tony just a bit, as for this subsystem it cannot realistically speaking cause any possible harm" * tag 'tpmdd-next-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Switch to new Intel CPU model defines tpm_tis: Do not flush uninitialized work	2024-06-05 10:29:13 -07:00
Filipe Manana	fb33eb2ef0	btrfs: fix leak of qgroup extent records after transaction abort Qgroup extent records are created when delayed ref heads are created and then released after accounting extents at btrfs_qgroup_account_extents(), called during the transaction commit path. If a transaction is aborted we free the qgroup records by calling btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs(), unless we don't have delayed references. We are incorrectly assuming that no delayed references means we don't have qgroup extents records. We can currently have no delayed references because we ran them all during a transaction commit and the transaction was aborted after that due to some error in the commit path. So fix this by ensuring we btrfs_qgroup_destroy_extent_records() at btrfs_destroy_delayed_refs() even if we don't have any delayed references. Reported-by: syzbot+0fecc032fa134afd49df@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/0000000000004e7f980619f91835@google.com/ Fixes: `81f7eb00ff` ("btrfs: destroy qgroup extent records on transaction abort") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-06-05 18:06:54 +02:00
Omar Sandoval	9d274c19a7	btrfs: fix crash on racing fsync and size-extending write into prealloc We have been seeing crashes on duplicate keys in btrfs_set_item_key_safe(): BTRFS critical (device vdb): slot 4 key (450 108 8192) new key (450 108 8192) ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:2620! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 3139 Comm: xfs_io Kdump: loaded Not tainted 6.9.0 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:btrfs_set_item_key_safe+0x11f/0x290 [btrfs] With the following stack trace: #0 btrfs_set_item_key_safe (fs/btrfs/ctree.c:2620:4) #1 btrfs_drop_extents (fs/btrfs/file.c:411:4) #2 log_one_extent (fs/btrfs/tree-log.c:4732:9) #3 btrfs_log_changed_extents (fs/btrfs/tree-log.c:4955:9) #4 btrfs_log_inode (fs/btrfs/tree-log.c:6626:9) #5 btrfs_log_inode_parent (fs/btrfs/tree-log.c:7070:8) #6 btrfs_log_dentry_safe (fs/btrfs/tree-log.c:7171:8) #7 btrfs_sync_file (fs/btrfs/file.c:1933:8) #8 vfs_fsync_range (fs/sync.c:188:9) #9 vfs_fsync (fs/sync.c:202:9) #10 do_fsync (fs/sync.c:212:9) #11 __do_sys_fdatasync (fs/sync.c:225:9) #12 __se_sys_fdatasync (fs/sync.c:223:1) #13 __x64_sys_fdatasync (fs/sync.c:223:1) #14 do_syscall_x64 (arch/x86/entry/common.c:52:14) #15 do_syscall_64 (arch/x86/entry/common.c:83:7) #16 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121) So we're logging a changed extent from fsync, which is splitting an extent in the log tree. But this split part already exists in the tree, triggering the BUG(). This is the state of the log tree at the time of the crash, dumped with drgn (https://github.com/osandov/drgn/blob/main/contrib/btrfs_tree.py) to get more details than btrfs_print_leaf() gives us: >>> print_extent_buffer(prog.crashed_thread().stack_trace()[0]["eb"]) leaf 33439744 level 0 items 72 generation 9 owner 18446744073709551610 leaf 33439744 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da item 0 key (450 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 7 transid 9 size 8192 nbytes 8473563889606862198 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 204 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417704.983333333 (2024-05-22 15:41:44) mtime 1716417704.983333333 (2024-05-22 15:41:44) otime 17592186044416.000000000 (559444-03-08 01:40:16) item 1 key (450 INODE_REF 256) itemoff 16110 itemsize 13 index 195 namelen 3 name: 193 item 2 key (450 XATTR_ITEM 1640047104) itemoff 16073 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 3 key (450 EXTENT_DATA 0) itemoff 16020 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 4096 ram 12288 extent compression 0 (none) item 4 key (450 EXTENT_DATA 4096) itemoff 15967 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 4096 nr 8192 item 5 key (450 EXTENT_DATA 8192) itemoff 15914 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 ... So the real problem happened earlier: notice that items 4 (4k-12k) and 5 (8k-12k) overlap. Both are prealloc extents. Item 4 straddles i_size and item 5 starts at i_size. Here is the state of the filesystem tree at the time of the crash: >>> root = prog.crashed_thread().stack_trace()[2]["inode"].root >>> ret, nodes, slots = btrfs_search_slot(root, BtrfsKey(450, 0, 0)) >>> print_extent_buffer(nodes[0]) leaf 30425088 level 0 items 184 generation 9 owner 5 leaf 30425088 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da ... item 179 key (450 INODE_ITEM 0) itemoff 4907 itemsize 160 generation 7 transid 7 size 4096 nbytes 12288 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 6 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417703.220000000 (2024-05-22 15:41:43) mtime 1716417703.220000000 (2024-05-22 15:41:43) otime 1716417703.220000000 (2024-05-22 15:41:43) item 180 key (450 INODE_REF 256) itemoff 4894 itemsize 13 index 195 namelen 3 name: 193 item 181 key (450 XATTR_ITEM 1640047104) itemoff 4857 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 182 key (450 EXTENT_DATA 0) itemoff 4804 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 8192 ram 12288 extent compression 0 (none) item 183 key (450 EXTENT_DATA 8192) itemoff 4751 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 Item 5 in the log tree corresponds to item 183 in the filesystem tree, but nothing matches item 4. Furthermore, item 183 is the last item in the leaf. btrfs_log_prealloc_extents() is responsible for logging prealloc extents beyond i_size. It first truncates any previously logged prealloc extents that start beyond i_size. Then, it walks the filesystem tree and copies the prealloc extent items to the log tree. If it hits the end of a leaf, then it calls btrfs_next_leaf(), which unlocks the tree and does another search. However, while the filesystem tree is unlocked, an ordered extent completion may modify the tree. In particular, it may insert an extent item that overlaps with an extent item that was already copied to the log tree. This may manifest in several ways depending on the exact scenario, including an EEXIST error that is silently translated to a full sync, overlapping items in the log tree, or this crash. This particular crash is triggered by the following sequence of events: - Initially, the file has i_size=4k, a regular extent from 0-4k, and a prealloc extent beyond i_size from 4k-12k. The prealloc extent item is the last item in its B-tree leaf. - The file is fsync'd, which copies its inode item and both extent items to the log tree. - An xattr is set on the file, which sets the BTRFS_INODE_COPY_EVERYTHING flag. - The range 4k-8k in the file is written using direct I/O. i_size is extended to 8k, but the ordered extent is still in flight. - The file is fsync'd. Since BTRFS_INODE_COPY_EVERYTHING is set, this calls copy_inode_items_to_log(), which calls btrfs_log_prealloc_extents(). - btrfs_log_prealloc_extents() finds the 4k-12k prealloc extent in the filesystem tree. Since it starts before i_size, it skips it. Since it is the last item in its B-tree leaf, it calls btrfs_next_leaf(). - btrfs_next_leaf() unlocks the path. - The ordered extent completion runs, which converts the 4k-8k part of the prealloc extent to written and inserts the remaining prealloc part from 8k-12k. - btrfs_next_leaf() does a search and finds the new prealloc extent 8k-12k. - btrfs_log_prealloc_extents() copies the 8k-12k prealloc extent into the log tree. Note that it overlaps with the 4k-12k prealloc extent that was copied to the log tree by the first fsync. - fsync calls btrfs_log_changed_extents(), which tries to log the 4k-8k extent that was written. - This tries to drop the range 4k-8k in the log tree, which requires adjusting the start of the 4k-12k prealloc extent in the log tree to 8k. - btrfs_set_item_key_safe() sees that there is already an extent starting at 8k in the log tree and calls BUG(). Fix this by detecting when we're about to insert an overlapping file extent item in the log tree and truncating the part that would overlap. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-06-05 18:06:30 +02:00
Linus Torvalds	71d7b52cc3	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "This is dominated by a couple large series for ARM and x86 respectively, but apart from that things are calm. ARM: - Large set of FP/SVE fixes for pKVM, addressing the fallout from the per-CPU data rework and making sure that the host is not involved in the FP/SVE switching any more - Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH is completely supported - Fix for the respective priorities of Failed PAC, Illegal Execution state and Instruction Abort exceptions - Fix the handling of AArch32 instruction traps failing their condition code, which was broken by the introduction of ESR_EL2.ISS2 - Allow vcpus running in AArch32 state to be restored in System mode - Fix AArch32 GPR restore that would lose the 64 bit state under some conditions RISC-V: - No need to use mask when hart-index-bits is 0 - Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext() x86: - Fixes and debugging help for the #VE sanity check. Also disable it by default, even for CONFIG_DEBUG_KERNEL, because it was found to trigger spuriously (most likely a processor erratum as the exact symptoms vary by generation). - Avoid WARN() when two NMIs arrive simultaneously during an NMI-disabled situation (GIF=0 or interrupt shadow) when the processor supports virtual NMI. While generally KVM will not request an NMI window when virtual NMIs are supported, in this case it does have to single-step over the interrupt shadow or enable the STGI intercept, in order to deliver the latched second NMI. - Drop support for hand tuning APIC timer advancement from userspace. Since we have adaptive tuning, and it has proved to work well, drop the module parameter for manual configuration and with it a few stupid bugs that it had" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (32 commits) KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attr KVM: arm64: Ensure that SME controls are disabled in protected mode KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM KVM: arm64: Specialize handling of host fpsimd state on trap KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helper KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state KVM: arm64: Reintroduce __sve_save_state KVM: x86: Drop support for hand tuning APIC timer advancement from userspace KVM: SEV-ES: Delegate LBR virtualization to the processor KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent KVM: SEV-ES: Prevent MSR access post VMSA encryption RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function RISC-V: KVM: No need to use mask when hart-index-bit is 0 KVM: arm64: nv: Expose BTI and CSV_frac to a guest hypervisor KVM: arm64: nv: Fix relative priorities of exceptions generated by ERETAx KVM: arm64: AArch32: Fix spurious trapping of conditional instructions KVM: arm64: Allow AArch32 PSTATE.M to be restored as System mode ...	2024-06-05 08:43:41 -07:00
Ritesh Harjani (IBM)	f5ceb1bbc9	iomap: Fix iomap_adjust_read_range for plen calculation If the extent spans the block that contains i_size, we need to handle both halves separately so that we properly zero data in the page cache for blocks that are entirely outside of i_size. But this is needed only when i_size is within the current folio under processing. "orig_pos + length > isize" can be true for all folios if the mapped extent length is greater than the folio size. That is making plen to break for every folio instead of only the last folio. So use orig_plen for checking if "orig_pos + orig_plen > isize". Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/a32e5f9a4fcfdb99077300c4020ed7ae61d6e0f9.1715067055.git.ritesh.list@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> cc: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-06-05 17:27:03 +02:00
Zhang Yi	0841ea4a3b	iomap: keep on increasing i_size in iomap_write_end() Commit '943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")' breaks xfs with realtime device on generic/561, the problem is when unaligned truncate down a xfs realtime inode with rtextsize > 1 fs block, xfs only zero out the EOF block but doesn't zero out the tail blocks that aligned to rtextsize, so if we don't increase i_size in iomap_write_end(), it could expose stale data after we do an append write beyond the aligned EOF block. xfs should zero out the tail blocks when truncate down, but before we finish that, let's fix the issue by just revert the changes in iomap_write_end(). Fixes: `943bc0882c` ("iomap: don't increase i_size if it's not a write operation") Reported-by: Chandan Babu R <chandanbabu@kernel.org> Link: https://lore.kernel.org/linux-xfs/0b92a215-9d9b-3788-4504-a520778953c2@huaweicloud.com Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20240603112222.2109341-1-yi.zhang@huaweicloud.com Tested-by: Chandan Babu R <chandanbabu@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-06-05 17:23:39 +02:00
Rafael J. Wysocki	9b7e7ff0fe	Merge branch 'pm-cpufreq' Merge cpufreq fixes for 6.10-rc3: - Fix a recently introduced unchecked HWP MSR access in the intel_pstate driver (Srinivas Pandruvada). - Add missing conversion from MHz to KHz to amd_pstate_set_boost() to address sysfs inteface inconsistency (Dhananjay Ugwekar). - Get rid of an excess global header file used by the amd-pstate cpufreq driver (Arnd Bergmann). * pm-cpufreq: cpufreq: intel_pstate: Fix unchecked HWP MSR access cpufreq: amd-pstate: Fix the inconsistency in max frequency units cpufreq: amd-pstate: remove global header file	2024-06-05 17:11:47 +02:00
David Hildenbrand	01c51a32dc	KVM: s390x: selftests: Add shared zeropage test Let's test that we can have shared zeropages in our process as long as storage keys are not getting used, that shared zeropages are properly unshared (replaced by anonymous pages) once storage keys are enabled, and that no new shared zeropages are populated after storage keys were enabled. We require the new pagemap interface to detect the shared zeropage. On an old kernel (zeropages always disabled): # ./s390x/shared_zeropage_test TAP version 13 1..3 not ok 1 Shared zeropages should be enabled ok 2 Shared zeropage should be gone ok 3 Shared zeropages should be disabled # Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0 On a fixed kernel: # ./s390x/shared_zeropage_test TAP version 13 1..3 ok 1 Shared zeropages should be enabled ok 2 Shared zeropage should be gone ok 3 Shared zeropages should be disabled # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0 Testing of UFFDIO_ZEROPAGE can be added later. [ agordeev: Fixed checkpatch complaint, added ucall_common.h include ] Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/r/20240412084329.30315-1-david@redhat.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2024-06-05 17:03:24 +02:00
Alexander Gordeev	d38e48563c	s390/crash: Do not use VM info if os_info does not have it The virtual memory information stored in os_info area is required for creation of the kernel image PT_LOAD program header for kernels since commit a2ec5bec56dd ("s390/mm: uncouple physical vs virtual address spaces"). By contrast, if such information in os_info is absent the PT_LOAD program header should not be created. Currently the proper PT_LOAD program header is created for kernels that contain the virtual memory information, but for kernels without one an invalid header of zero size is created. That in turn leads to stand-alone dump failures. Use OS_INFO_KASLR_OFFSET variable to check whether os_info is present or not (same as crash and makedumpfile tools do) and based on that create or do not create the kernel image PT_LOAD program header. Fixes: `f4cac27dc0` ("s390/crash: Use old os_info to create PT_LOAD headers") Tested-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Acked-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2024-06-05 17:03:24 +02:00
Rafael J. Wysocki	1bfc0835d4	Merge branches 'acpi-ec', 'acpi-apei' and 'pnp' Merge ACPI EC driver fixes, an ACPI APEI fix and PNP fixes for 6.10-rc3: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf). - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams). - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko). * acpi-ec: ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error * acpi-apei: ACPI: APEI: EINJ: Fix einj_dev release leak * pnp: PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules	2024-06-05 16:58:09 +02:00
Kent Overstreet	319fef29e9	bcachefs: Fix trans->locked assert in bch2_move_data_btree, we might start with the trans unlocked from a previous loop iteration - we need a trans_begin() before iter_init(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-05 10:44:08 -04:00
Kent Overstreet	fdccb24352	bcachefs: Rereplicate now moves data off of durability=0 devices This fixes an issue where setting a device to durability=0 after it's been used makes it impossible to remove. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-05 10:44:08 -04:00
Kent Overstreet	9a64e1bfd8	bcachefs: Fix GFP_KERNEL allocation in break_cycle() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-05 10:44:08 -04:00
Namhyung Kim	ca9680821d	perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build Ingo reported that he was seeing these when hitting Control+C during a perf tools build: Makefile.perf:1149: *** Missing bpftool input for generating vmlinux.h. Stop. The failure happens when you don't have vmlinux.h or vmlinux with BTF. ifeq ($(VMLINUX_H),) ifeq ($(VMLINUX_BTF),) $(error Missing bpftool input for generating vmlinux.h) endif endif VMLINUX_BTF can be empty if you didn't build a kernel or it doesn't have a BTF section and the current kernel also has no BTF. This is totally ok. But VMLINUX_H should be set to the minimal version in the source tree (unless you overwrite it manually) when you don't pass GEN_VMLINUX_H=1 (which requires VMLINUX_BTF should not be empty). The problem is that it's defined in Makefile.config which is not included for `make clean`. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/CAM9d7ch5HTr+k+_GpbMrX0HUo5BZ11byh1xq0Two7B7RQACuNw@mail.gmail.com Link: http://lore.kernel.org/lkml/ZjssGrj+abyC6mYP@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-06-05 11:33:00 -03:00
Arnaldo Carvalho de Melo	5b3cde1988	Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event" This reverts commit `7d1405c71d`. This causes segfaults in some cases, as reported by Milian: ``` sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e raw_syscalls:sys_enter ls ... [ perf record: Woken up 3 times to write data ] malloc(): invalid next size (unsorted) Aborted ``` Backtrace with GDB + debuginfod: ``` malloc(): invalid next size (unsorted) Thread 1 "perf" received signal SIGABRT, Aborted. __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/ raise.c:26 #3 0x00007ffff6e384c3 in __GI_abort () at abort.c:79 #4 0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850 "malloc(): invalid next size (unsorted)") at malloc.c:5772 #6 0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0 <main_arena>, bytes=bytes@entry=368) at malloc.c:4081 #7 0x00007ffff6eb877e in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3754 #8 0x000055555569bdb6 in perf_session.do_write_header () #9 0x00005555555a373a in __cmd_record.constprop.0 () #10 0x00005555555a6846 in cmd_record () #11 0x000055555564db7f in run_builtin () #12 0x000055555558ed77 in main () ``` Valgrind memcheck: ``` ==45136== Invalid write of size 8 ==45136== at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf) ==45136== by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ==45136== Syscall param write(buf) points to unaddressable byte(s) ==45136== at 0x575953D: __libc_write (write.c:26) ==45136== by 0x575953D: write (write.c:24) ==45136== by 0x35761F: ion (in /usr/bin/perf) ==45136== by 0x357778: writen (in /usr/bin/perf) ==45136== by 0x1548F7: record__write (in /usr/bin/perf) ==45136== by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ----- Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/ Reported-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@kernel.org # 6.8+ Link: https://lore.kernel.org/lkml/Zl9ksOlHJHnKM70p@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-06-05 11:12:36 -03:00
Carlos Llamas	f92a59f6d1	locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc For ${atomic}_sub_and_test() the @i parameter is the value to subtract, not add. Fix the typo in the kerneldoc template and generate the headers with this update. Fixes: `ad8110706f` ("locking/atomic: scripts: generate kerneldoc comments") Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com	2024-06-05 15:52:34 +02:00
Haifeng Xu	74751ef5c1	perf/core: Fix missing wakeup when waiting for context reference In our production environment, we found many hung tasks which are blocked for more than 18 hours. Their call traces are like this: [346278.191038] __schedule+0x2d8/0x890 [346278.191046] schedule+0x4e/0xb0 [346278.191049] perf_event_free_task+0x220/0x270 [346278.191056] ? init_wait_var_entry+0x50/0x50 [346278.191060] copy_process+0x663/0x18d0 [346278.191068] kernel_clone+0x9d/0x3d0 [346278.191072] __do_sys_clone+0x5d/0x80 [346278.191076] __x64_sys_clone+0x25/0x30 [346278.191079] do_syscall_64+0x5c/0xc0 [346278.191083] ? syscall_exit_to_user_mode+0x27/0x50 [346278.191086] ? do_syscall_64+0x69/0xc0 [346278.191088] ? irqentry_exit_to_user_mode+0x9/0x20 [346278.191092] ? irqentry_exit+0x19/0x30 [346278.191095] ? exc_page_fault+0x89/0x160 [346278.191097] ? asm_exc_page_fault+0x8/0x30 [346278.191102] entry_SYSCALL_64_after_hwframe+0x44/0xae The task was waiting for the refcount become to 1, but from the vmcore, we found the refcount has already been 1. It seems that the task didn't get woken up by perf_event_release_kernel() and got stuck forever. The below scenario may cause the problem. Thread A Thread B ... ... perf_event_free_task perf_event_release_kernel ... acquire event->child_mutex ... get_ctx ... release event->child_mutex acquire ctx->mutex ... perf_free_event (acquire/release event->child_mutex) ... release ctx->mutex wait_var_event acquire ctx->mutex acquire event->child_mutex # move existing events to free_list release event->child_mutex release ctx->mutex put_ctx ... ... In this case, all events of the ctx have been freed, so we couldn't find the ctx in free_list and Thread A will miss the wakeup. It's thus necessary to add a wakeup after dropping the reference. Fixes: `1cf8dfe8a6` ("perf/core: Fix race between close() and fork()") Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240513103948.33570-1-haifeng.xu@shopee.com	2024-06-05 15:52:33 +02:00
David S. Miller	f8f0de9d58	Merge branch 'mlx5-fixes' Tariq Toukan says: ==================== mlx5 core fixes 20240603 This small patchset provides two bug fixes from the team to the mlx5 core driver. Series generated against: commit `33700a0c9b` ("net/tcp: Don't consider TCP_CLOSE in TCP_AO_ESTABLISHED") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 14:07:17 +01:00
Shay Drory	c8b3f38d2d	net/mlx5: Always stop health timer during driver removal Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continue with driver teardown. This may lead to a UAF bug, which results in page fault Oops[1], since the health timer invokes after resources were freed. Hence, stop the health monitor even if teardown_hca fails. [1] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: cleanup mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup BUG: unable to handle page fault for address: ffffa26487064230 PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 RIP: 0010:ioread32be+0x34/0x60 RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x175/0x180 ? asm_exc_page_fault+0x26/0x30 ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? ioread32be+0x34/0x60 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] poll_health+0x42/0x230 [mlx5_core] ? __next_timer_interrupt+0xbc/0x110 ? __pfx_poll_health+0x10/0x10 [mlx5_core] call_timer_fn+0x21/0x130 ? __pfx_poll_health+0x10/0x10 [mlx5_core] __run_timers+0x222/0x2c0 run_timer_softirq+0x1d/0x40 __do_softirq+0xc9/0x2c8 __irq_exit_rcu+0xa6/0xc0 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:cpuidle_enter_state+0xcc/0x440 ? cpuidle_enter_state+0xbd/0x440 cpuidle_enter+0x2d/0x40 do_idle+0x20d/0x270 cpu_startup_entry+0x2a/0x30 rest_init+0xd0/0xd0 arch_call_rest_init+0xe/0x30 start_kernel+0x709/0xa90 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x96/0xa0 secondary_startup_64_no_verify+0x18f/0x19b ---[ end trace 0000000000000000 ]--- Fixes: `9b98d395b8` ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 14:07:16 +01:00
Moshe Shemesh	33afbfcc10	net/mlx5: Stop waiting for PCI if pci channel is offline In case pci channel becomes offline the driver should not wait for PCI reads during health dump and recovery flow. The driver has timeout for each of these loops trying to read PCI, so it would fail anyway. However, in case of recovery waiting till timeout may cause the pci error_detected() callback fail to meet pci_dpc_recovered() wait timeout. Fixes: `b3bd076f75` ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 14:07:16 +01:00
Frank Wunderlich	c57e558194	net: ethernet: mtk_eth_soc: handle dma buffer size soc specific The mainline MTK ethernet driver suffers long time from rarly but annoying tx queue timeouts. We think that this is caused by fixed dma sizes hardcoded for all SoCs. We suspect this problem arises from a low level of free TX DMADs, the TX Ring alomost full. The transmit timeout is caused by the Tx queue not waking up. The Tx queue stops when the free counter is less than ring->thres, and it will wake up once the free counter is greater than ring->thres. If the CPU is too late to wake up the Tx queues, it may cause a transmit timeout. Therefore, we increased the TX and RX DMADs to improve this error situation. Use the dma-size implementation from SDK in a per SoC manner. In difference to SDK we have no RSS feature yet, so all RX/TX sizes should be raised from 512 to 2048 byte except fqdma on mt7988 to avoid the tx timeout issue. Fixes: `656e705243` ("net-next: mediatek: add support for MT7623 ethernet") Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 14:04:44 +01:00
Arnd Bergmann	5c40e428ae	arm64/io: add constant-argument check In some configurations __const_iowrite32_copy() does not get inlined and gcc runs into the BUILD_BUG(): In file included from <command-line>: In function '__const_memcpy_toio_aligned32', inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:203:3, inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:199:20: include/linux/compiler_types.h:487:45: error: call to '__compiletime_assert_538' declared with attribute error: BUILD_BUG failed 487 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ^ include/linux/compiler_types.h:468:25: note: in definition of macro '__compiletime_assert' 468 \| prefix ## suffix(); \ \| ^~~~~~ include/linux/compiler_types.h:487:9: note: in expansion of macro '_compiletime_assert' 487 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 \| #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) \| ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG' 59 \| #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed") \| ^~~~~~~~~~~~~~~~ arch/arm64/include/asm/io.h:193:17: note: in expansion of macro 'BUILD_BUG' 193 \| BUILD_BUG(); \| ^~~~~~~~~ Move the check for constant arguments into the inline function to ensure it is still constant if the compiler decides against inlining it, and mark them as __always_inline to override the logic that sometimes leads to the compiler not producing the simplified output. Note that either the __always_inline annotation or the check for a constant value are sufficient here, but combining the two looks cleaner as it also avoids the macro. With clang-8 and older, the macro was still needed, but all versions of gcc and clang can reliably perform constant folding here. Fixes: `ead79118da` ("arm64/io: Provide a WC friendly __iowriteXX_copy()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240604210006.668912-1-arnd@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2024-06-05 13:30:58 +01:00
Jakub Kicinski	5b4b62a169	rtnetlink: make the "split" NLM_DONE handling generic Jaroslav reports Dell's OMSA Systems Management Data Engine expects NLM_DONE in a separate recvmsg(), both for rtnl_dump_ifinfo() and inet_dump_ifaddr(). We already added a similar fix previously in commit `460b0d33cf` ("inet: bring NLM_DONE out to a separate recv() again") Instead of modifying all the dump handlers, and making them look different than modern for_each_netdev_dump()-based dump handlers - put the workaround in rtnetlink code. This will also help us move the custom rtnl-locking from af_netlink in the future (in net-next). Note that this change is not touching rtnl_dump_all(). rtnl_dump_all() is different kettle of fish and a potential problem. We now mix families in a single recvmsg(), but NLM_DONE is not coalesced. Tested: ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_addr.yaml \ --dump getaddr --json '{"ifa-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \ --dump getroute --json '{"rtm-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_link.yaml \ --dump getlink Fixes: `3e41af9076` ("rtnetlink: use xarray iterator to implement rtnl_dump_ifinfo()") Fixes: `cdb2f80f1c` ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Link: https://lore.kernel.org/all/CAK8fFZ7MKoFSEzMBDAOjoUt+vTZRRQgLDNXEOfdCCXSoXXKE0g@mail.gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 12:34:54 +01:00
David S. Miller	e137596ec1	Merge branch 'tcp-mptcp-close-wait' Jason Xing says: ==================== tcp/mptcp: count CLOSE-WAIT for CurrEstab Taking CLOSE-WAIT sockets into CurrEstab counters is in accordance with RFC 1213, as suggested by Eric and Neal. v5 Link: https://lore.kernel.org/all/20240531091753.75930-1-kerneljasonxing@gmail.com/ 1. add more detailed comment (Matthieu) v4 Link: https://lore.kernel.org/all/20240530131308.59737-1-kerneljasonxing@gmail.com/ 1. correct the Fixes: tag in patch [2/2]. (Eric) Previous discussion Link: https://lore.kernel.org/all/20240529033104.33882-1-kerneljasonxing@gmail.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 12:32:47 +01:00
Jason Xing	9633e9377e	mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTAB Like previous patch does in TCP, we need to adhere to RFC 1213: "tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT." So let's consider CLOSE-WAIT sockets. The logic of counting When we increment the counter? a) Only if we change the state to ESTABLISHED. When we decrement the counter? a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT, say, on the client side, changing from ESTABLISHED to FIN-WAIT-1. b) if the socket leaves CLOSE-WAIT, say, on the server side, changing from CLOSE-WAIT to LAST-ACK. Fixes: `d9cd27b8cd` ("mptcp: add CurrEstab MIB counter support") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 12:32:47 +01:00
Jason Xing	a46d0ea5c9	tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB According to RFC 1213, we should also take CLOSE-WAIT sockets into consideration: "tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT." After this, CurrEstab counter will display the total number of ESTABLISHED and CLOSE-WAIT sockets. The logic of counting When we increment the counter? a) if we change the state to ESTABLISHED. b) if we change the state from SYN-RECEIVED to CLOSE-WAIT. When we decrement the counter? a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT, say, on the client side, changing from ESTABLISHED to FIN-WAIT-1. b) if the socket leaves CLOSE-WAIT, say, on the server side, changing from CLOSE-WAIT to LAST-ACK. Please note: there are two chances that old state of socket can be changed to CLOSE-WAIT in tcp_fin(). One is SYN-RECV, the other is ESTABLISHED. So we have to take care of the former case. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 12:32:46 +01:00
Tao Su	db574f2f96	KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attr Drop the second snapshot of mmu_invalidate_seq in kvm_faultin_pfn(). Before checking the mismatch of private vs. shared, mmu_invalidate_seq is saved to fault->mmu_seq, which can be used to detect an invalidation related to the gfn occurred, i.e. KVM will not install a mapping in page table if fault->mmu_seq != mmu_invalidate_seq. Currently there is a second snapshot of mmu_invalidate_seq, which may not be same as the first snapshot in kvm_faultin_pfn(), i.e. the gfn attribute may be changed between the two snapshots, but the gfn may be mapped in page table without hindrance. Therefore, drop the second snapshot as it has no obvious benefits. Fixes: `f6adeae81f` ("KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn()") Signed-off-by: Tao Su <tao1.su@linux.intel.com> Message-ID: <20240528102234.2162763-1-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-05 06:45:06 -04:00
Paolo Bonzini	45ce0314bf	Merge tag 'kvmarm-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.10, take #1 - Large set of FP/SVE fixes for pKVM, addressing the fallout from the per-CPU data rework and making sure that the host is not involved in the FP/SVE switching any more - Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH is copletely supported - Fix for the respective priorities of Failed PAC, Illegal Execution state and Instruction Abort exceptions - Fix the handling of AArch32 instruction traps failing their condition code, which was broken by the introduction of ESR_EL2.ISS2 - Allow vpcus running in AArch32 state to be restored in System mode - Fix AArch32 GPR restore that would lose the 64 bit state under some conditions	2024-06-05 06:32:18 -04:00
Wei Li	14951beaec	arm64: armv8_deprecated: Fix warning in isndep cpuhp starting process The function run_all_insn_set_hw_mode() is registered as startup callback of 'CPUHP_AP_ARM64_ISNDEP_STARTING', it invokes set_hw_mode() methods of all emulated instructions. As the STARTING callbacks are not expected to fail, if one of the set_hw_mode() fails, e.g. due to el0 mixed-endian is not supported for 'setend', it will report a warning: ``` CPU[2] cannot support the emulation of setend CPU 2 UP state arm64/isndep:starting (136) failed (-22) CPU2: Booted secondary processor 0x0000000002 [0x414fd0c1] ``` To fix it, add a check for INSN_UNAVAILABLE status and skip the process. Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Huisong Li <lihuisong@huawei.com> Link: https://lore.kernel.org/r/20240423093501.3460764-1-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2024-06-05 11:16:55 +01:00
Hangbin Liu	712115a24b	selftests: hsr: add missing config for CONFIG_BRIDGE hsr_redbox.sh test need to create bridge for testing. Add the missing config CONFIG_BRIDGE in config file. Fixes: `eafbf0574e` ("test: hsr: Extend the hsr_redbox.sh to have more SAN devices connected") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 10:52:04 +01:00
Daniel Borkmann	1cd4bc987a	vxlan: Fix regression when dropping packets due to invalid src addresses Commit `f58f45c1e5` ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit `f58f45c1e5` vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in `f58f45c1e5` for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: `f58f45c1e5` ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 10:51:10 +01:00
Hangyu Hua	affc18fdc6	net: sched: sch_multiq: fix possible OOB write in multiq_tune() q->bands will be assigned to qopt->bands to execute subsequent code logic after kmalloc. So the old q->bands should not be used in kmalloc. Otherwise, an out-of-bounds write will occur. Fixes: `c2999f7fb0` ("net: sched: multiq: don't call qdisc_put() while holding tree lock") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 10:50:19 +01:00
Taehee Yoo	491aee894a	ionic: fix kernel panic in XDP_TX action In the XDP_TX path, ionic driver sends a packet to the TX path with rx page and corresponding dma address. After tx is done, ionic_tx_clean() frees that page. But RX ring buffer isn't reset to NULL. So, it uses a freed page, which causes kernel panic. BUG: unable to handle page fault for address: ffff8881576c110c PGD 773801067 P4D 773801067 PUD 87f086067 PMD 87efca067 PTE 800ffffea893e060 Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN NOPTI CPU: 1 PID: 25 Comm: ksoftirqd/1 Not tainted 6.9.0+ #11 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f Code: 00 53 41 55 41 56 41 57 b8 01 00 00 00 48 8b 5f 08 4c 8b 77 00 4c 89 f7 48 83 c7 0e 48 39 d8 RSP: 0018:ffff888104e6fa28 EFLAGS: 00010283 RAX: 0000000000000002 RBX: ffff8881576c1140 RCX: 0000000000000002 RDX: ffffffffc0051f64 RSI: ffffc90002d33048 RDI: ffff8881576c110e RBP: ffff888104e6fa88 R08: 0000000000000000 R09: ffffed1027a04a23 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881b03a21a8 R13: ffff8881589f800f R14: ffff8881576c1100 R15: 00000001576c1100 FS: 0000000000000000(0000) GS:ffff88881ae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8881576c110c CR3: 0000000767a90000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x254/0x790 ? __pfx_page_fault_oops+0x10/0x10 ? __pfx_is_prefetch.constprop.0+0x10/0x10 ? search_bpf_extables+0x165/0x260 ? fixup_exception+0x4a/0x970 ? exc_page_fault+0xcb/0xe0 ? asm_exc_page_fault+0x22/0x30 ? 0xffffffffc0051f64 ? bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f ? do_raw_spin_unlock+0x54/0x220 ionic_rx_service+0x11ab/0x3010 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_clean+0x29b/0xc60 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_tx_clean+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_cq_service+0x25d/0xa00 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_cq_service+0x69/0x150 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_txrx_napi+0x11a/0x540 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] __napi_poll.constprop.0+0xa0/0x440 net_rx_action+0x7e7/0xc30 ? __pfx_net_rx_action+0x10/0x10 Fixes: `8eeed8373e` ("ionic: Add XDP_TX support") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 10:49:27 +01:00
Tristram Ha	0a8d3f2e3e	net: phy: Micrel KSZ8061: fix errata solution not taking effect problem KSZ8061 needs to write to a MMD register at driver initialization to fix an errata. This worked in 5.0 kernel but not in newer kernels. The issue is the main phylib code no longer resets PHY at the very beginning. Calling phy resuming code later will reset the chip if it is already powered down at the beginning. This wipes out the MMD register write. Solution is to implement a phy resume function for KSZ8061 to take care of this problem. Fixes: `232ba3a51c` ("net: phy: Micrel KSZ8061: link failure after cable connect") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 10:03:15 +01:00
Wen Gu	fb0aa0781a	net/smc: avoid overwriting when adjusting sock bufsizes When copying smc settings to clcsock, avoid setting clcsock's sk_sndbuf to sysctl_tcp_wmem[1], since this may overwrite the value set by tcp_sndbuf_expand() in TCP connection establishment. And the other setting sk_{snd\|rcv}buf to sysctl value in smc_adjust_sock_bufsizes() can also be omitted since the initialization of smc sock and clcsock has set sk_{snd\|rcv}buf to smc.sysctl_{w\|r}mem or ipv4_sysctl_tcp_{w\|r}mem[1]. Fixes: `30c3c4a449` ("net/smc: Use correct buffer sizes when switching between TCP and SMC") Link: https://lore.kernel.org/r/5eaf3858-e7fd-4db8-83e8-3d7a3e0e9ae2@linux.alibaba.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>, too. Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 09:42:57 +01:00
Subbaraya Sundeep	8b0f741094	octeontx2-af: Always allocate PF entries from low prioriy zone PF mcam entries has to be at low priority always so that VF can install longest prefix match rules at higher priority. This was taken care currently but when priority allocation wrt reference entry is requested then entries are allocated from mid-zone instead of low priority zone. Fix this and always allocate entries from low priority zone for PFs. Fixes: `7df5b4b260` ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-05 09:40:02 +01:00
Ard Biesheuvel	99280413a5	efi: Add missing __nocfi annotations to runtime wrappers The EFI runtime wrappers are a sandbox for calling into EFI runtime services, which are invoked using indirect calls. When running with kCFI enabled, the compiler will require the target of any indirect call to be type annotated. Given that the EFI runtime services prototypes and calling convention are governed by the EFI spec, not the Linux kernel, adding such type annotations for firmware routines is infeasible, and so the compiler must be informed that prototype validation should be omitted. Add the __nocfi annotation at the appropriate places in the EFI runtime wrapper code to achieve this. Note that this currently only affects 32-bit ARM, given that other architectures that support both kCFI and EFI use an asm wrapper to call EFI runtime services, and this hides the indirect call from the compiler. Fixes: `1a4fec49ef` ("ARM: 9392/2: Support CLANG CFI") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-06-05 10:18:58 +02:00
Magnus Karlsson	03e38d315f	Revert "xsk: Document ability to redirect to any socket bound to the same umem" This reverts commit `968595a936`. Reported-by: Yuval El-Hanany <YuvalE@radware.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com Link: https://lore.kernel.org/bpf/20240604122927.29080-3-magnus.karlsson@gmail.com	2024-06-05 09:43:05 +02:00
Magnus Karlsson	7fcf26b315	Revert "xsk: Support redirect to any socket bound to the same umem" This reverts commit `2863d665ea`. This patch introduced a potential kernel crash when multiple napi instances redirect to the same AF_XDP socket. By removing the queue_index check, it is possible for multiple napi instances to access the Rx ring at the same time, which will result in a corrupted ring state which can lead to a crash when flushing the rings in __xsk_flush(). This can happen when the linked list of sockets to flush gets corrupted by concurrent accesses. A quick and small fix is not possible, so let us revert this for now. Reported-by: Yuval El-Hanany <YuvalE@radware.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com Link: https://lore.kernel.org/bpf/20240604122927.29080-2-magnus.karlsson@gmail.com	2024-06-05 09:42:30 +02:00
Jiri Olsa	d0d1df8ba1	bpf: Set run context for rawtp test_run callback syzbot reported crash when rawtp program executed through the test_run interface calls bpf_get_attach_cookie helper or any other helper that touches task->bpf_ctx pointer. Setting the run context (task->bpf_ctx pointer) for test_run callback. Fixes: `7adfc6c9b3` ("bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value") Reported-by: syzbot+3ab78ff125b7979e45f9@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://syzkaller.appspot.com/bug?extid=3ab78ff125b7979e45f9 Link: https://lore.kernel.org/bpf/20240604150024.359247-1-jolsa@kernel.org	2024-06-05 09:41:33 +02:00
Tony Luck	f071d02eca	tpm: Switch to new Intel CPU model defines New CPU #defines encode vendor and family as well as model. Link: https://lore.kernel.org/all/20240520224620.9480-4-tony.luck@intel.com/ Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-06-05 04:55:04 +03:00
Jan Beulich	0ea00e249c	tpm_tis: Do not flush uninitialized work tpm_tis_core_init() may fail before tpm_tis_probe_irq_single() is called, in which case tpm_tis_remove() unconditionally calling flush_work() is triggering a warning for .func still being NULL. Cc: stable@vger.kernel.org # v6.5+ Fixes: `481c2d1462` ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-06-05 01:18:01 +03:00
Linus Torvalds	51214520ad	Merge tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix regression in 'interrupt-map' handling affecting Apple M1 mini (at least) - Fix binding example warning in stm32 st,mlahb binding - Fix schema error in Allwinner platform binding causing lots of spurious warnings - Add missing MODULE_DESCRIPTION() to DT kunit tests * tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: property: Fix fw_devlink handling of interrupt-map of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw() dt-bindings: arm: stm32: st,mlahb: Drop spurious "reg" property from example dt-bindings: arm: sunxi: Fix incorrect '-' usage of: of_test: add MODULE_DESCRIPTION()	2024-06-04 14:08:44 -07:00
Arnaldo Carvalho de Melo	dc6abbbde4	tools headers arm64: Sync arm64's cputype.h with the kernel sources To get the changes in: `0ce85db6c2` ("arm64: cputype: Add Neoverse-V3 definitions") `02a0a04676` ("arm64: cputype: Add Cortex-X4 definitions") `f4d9d9dcc7` ("arm64: Add Neoverse-V2 part") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_NEOVERSE_V[23] and MIDR_NEOVERSE_V1 is used in arm-spe.c, so probably we need to add those and perhaps MIDR_CORTEX_X4 to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/Zl8cYk0Tai2fs7aM@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-06-04 16:46:40 -03:00
Linus Torvalds	32f88d65f0	Merge tag 'linux_kselftest-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "Fixes to build warnings in several tests and fixes to ftrace tests" * tag 'linux_kselftest-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/futex: don't pass a const char* to asprintf(3) selftests/futex: don't redefine .PHONY targets (all, clean) selftests/tracing: Fix event filter test to retry up to 10 times selftests/futex: pass _GNU_SOURCE without a value to the compiler selftests/overlayfs: Fix build error on ppc64 selftests/openat2: Fix build warnings on ppc64 selftests: cachestat: Fix build warnings on ppc64 tracing/selftests: Fix kprobe event name test for .isra. functions selftests/ftrace: Update required config selftests/ftrace: Fix to check required event file kselftest/alsa: Ensure _GNU_SOURCE is defined	2024-06-04 10:34:13 -07:00
Ard Biesheuvel	290be0a402	Merge branch 'efi/next' into efi/urgent	2024-06-04 19:31:03 +02:00
Dan Williams	c9d52fb313	PCI: Revert the cfg_access_lock lockdep mechanism While the experiment did reveal that there are additional places that are missing the lock during secondary bus reset, one of the places that needs to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep annotation. Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is currently dependent on the fact that the device_lock() is marked lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that annotation, pci_bus_lock() would need to use something like a new pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the topology, and a hope that the depth of a PCI tree never exceeds the max value for a lockdep subclass. The alternative to ripping out the lockdep coverage would be to deploy a dynamic lock key for every PCI device. Unfortunately, there is evidence that increasing the number of keys that lockdep needs to track to be per-PCI-device is prohibitively expensive for something like the cfg_access_lock. The main motivation for adding the annotation in the first place was to catch unlocked secondary bus resets, not necessarily catch lock ordering problems between cfg_access_lock and other locks. Solve that narrower problem with follow-on patches, and just due to targeted revert for now. Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com Fixes: `7e89efc6e9` ("PCI: Lock upstream bridge for pci_reset_function()") Reported-by: Imre Deak <imre.deak@intel.com> Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Cc: Jani Saarinen <jani.saarinen@intel.com>	2024-06-04 12:10:05 -05:00
Dirk Behme	c0a40097f0	drivers: core: synchronize really_probe() and dev_uevent() Synchronize the dev->driver usage in really_probe() and dev_uevent(). These can run in different threads, what can result in the following race condition for dev->driver uninitialization: Thread #1: ========== really_probe() { ... probe_failed: ... device_unbind_cleanup(dev) { ... dev->driver = NULL; // <= Failed probe sets dev->driver to NULL ... } ... } Thread #2: ========== dev_uevent() { ... if (dev->driver) // If dev->driver is NULLed from really_probe() from here on, // after above check, the system crashes add_uevent_var(env, "DRIVER=%s", dev->driver->name); ... } really_probe() holds the lock, already. So nothing needs to be done there. dev_uevent() is called with lock held, often, too. But not always. What implies that we can't add any locking in dev_uevent() itself. So fix this race by adding the lock to the non-protected path. This is the path where above race is observed: dev_uevent+0x235/0x380 uevent_show+0x10c/0x1f0 <= Add lock here dev_attr_show+0x3a/0xa0 sysfs_kf_seq_show+0x17c/0x250 kernfs_seq_show+0x7c/0x90 seq_read_iter+0x2d7/0x940 kernfs_fop_read_iter+0xc6/0x310 vfs_read+0x5bc/0x6b0 ksys_read+0xeb/0x1b0 __x64_sys_read+0x42/0x50 x64_sys_call+0x27ad/0x2d30 do_syscall_64+0xcd/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Similar cases are reported by syzkaller in https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a But these are regarding the initialization of dev->driver dev->driver = drv; As this switches dev->driver to non-NULL these reports can be considered to be false-positives (which should be "fixed" by this commit, as well, though). The same issue was reported and tried to be fixed back in 2015 in https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/ already. Fixes: `239378f16a` ("Driver core: add uevent vars for devices of a class") Cc: stable <stable@kernel.org> Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com Cc: Ashish Sangwan <a.sangwan@samsung.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 18:14:51 +02:00
Russell King (Oracle)	616501eccb	clkdev: don't fail clkdev_alloc() if over-sized Don't fail clkdev_alloc() if the strings are over-sized. In this case, the entry will not match during lookup, so its useless. However, since code fails if we return NULL leading to boot failure, return a dummy entry with the connection and device IDs set to "bad". Leave the warning so these problems can be found, and the useless wasteful clkdev registrations removed. Reported-by: Ron Economos <re@w6rz.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: `8d532528ff` ("clkdev: report over-sized strings when creating clkdev entries") Closes: https://lore.kernel.org/linux-clk/7eda7621-0dde-4153-89e4-172e4c095d01@roeck-us.net. Link: https://lore.kernel.org/r/28114882-f8d7-21bf-4536-a186e8d7a22a@w6rz.net Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>	2024-06-04 17:12:01 +01:00
Greg Kroah-Hartman	7c55b78818	jfs: xattr: fix buffer overflow for invalid xattr When an xattr size is not what is expected, it is printed out to the kernel log in hex format as a form of debugging. But when that xattr size is bigger than the expected size, printing it out can cause an access off the end of the buffer. Fix this all up by properly restricting the size of the debug hex dump in the kernel log. Reported-by: syzbot+9dfe490c8176301c1d06@syzkaller.appspotmail.com Cc: Dave Kleikamp <shaggy@kernel.org> Link: https://lore.kernel.org/r/2024051433-slider-cloning-98f9@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 18:09:03 +02:00
Yongzhi Liu	77427e3d5c	misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() There is a memory leak (forget to free allocated buffers) in a memory allocation failure path. Fix it to jump to the correct error handling code. Fixes: `393fc2f594` ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com> Reviewed-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com> Link: https://lore.kernel.org/r/20240523121434.21855-4-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 18:08:39 +02:00
Yongzhi Liu	086c6cbcc5	misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe() When auxiliary_device_add() returns error and then calls auxiliary_device_uninit(), callback function gp_auxiliary_device_release() calls ida_free() and kfree(aux_device_wrapper) to free memory. We should't call them again in the error handling path. Fix this by skipping the redundant cleanup functions. Fixes: `393fc2f594` ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com> Link: https://lore.kernel.org/r/20240523121434.21855-3-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 18:08:39 +02:00
Uwe Kleine-König	73fedc31fe	parport: amiga: Mark driver struct with __refdata to prevent section mismatch As described in the added code comment, a reference to .exit.text is ok for drivers registered via module_platform_driver_probe(). Make this explicit to prevent the following section mismatch warning WARNING: modpost: drivers/parport/parport_amiga: section mismatch in reference: amiga_parallel_driver+0x8 (section: .data) -> amiga_parallel_remove (section: .exit.text) that triggers on an allmodconfig W=1 build. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20240513075206.2337310-2-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 18:08:31 +02:00
Hans de Goede	af076156ec	mei: vsc: Fix wrong invocation of ACPI SID method When using an initializer for a union only one of the union members must be initialized. The initializer for the acpi_object union variable passed as argument to the SID ACPI method was initializing both the type and the integer members of the union. Unfortunately rather then complaining about this gcc simply ignores the first initializer and only used the second integer.value = 1 initializer. Leaving type set to 0 which leads to the argument being skipped by acpi acpi_ns_evaluate() resulting in: ACPI Warning: \_SB.PC00.SPI1.SPFD.CVFD.SID: Insufficient arguments - Caller passed 0, method requires 1 (20240322/nsarguments-232) Fix this by initializing only the integer struct part of the union and initializing both members of the integer struct. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Fixes: `566f5ca976` ("mei: Add transport driver for IVSC device") Reviewed-by: Wentong Wu <wentong.wu@intel.com> Link: https://lore.kernel.org/r/20240603205050.505389-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 17:46:00 +02:00
Wentong Wu	9b5e045029	mei: vsc: Don't stop/restart mei device during system suspend/resume The dynamically created mei client device (mei csi) is used as one V4L2 sub device of the whole video pipeline, and the V4L2 connection graph is built by software node. The mei_stop() and mei_restart() will delete the old mei csi client device and create a new mei client device, which will cause the software node information saved in old mei csi device lost and the whole video pipeline will be broken. Removing mei_stop()/mei_restart() during system suspend/resume can fix the issue above and won't impact hardware actual power saving logic. Fixes: `f6085a96c9` ("mei: vsc: Unregister interrupt handler for system suspend") Cc: stable@vger.kernel.org # for 6.8+ Reported-by: Hao Yao <hao.yao@intel.com> Signed-off-by: Wentong Wu <wentong.wu@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Jason Chen <jason.z.chen@intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240527123835.522384-1-wentong.wu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 17:45:35 +02:00
Tomas Winkler	283cb234ef	mei: me: release irq in mei_me_pci_resume error path The mei_me_pci_resume doesn't release irq on the error path, in case mei_start() fails. Cc: <stable@kernel.org> Fixes: `33ec082631` ("mei: revamp mei reset state machine") Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240604090728.1027307-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 17:11:48 +02:00
Alexander Usyskin	1db5322b7e	mei: demote client disconnect warning on suspend to debug Change level for the "not connected" client message in the write callback from error to debug. The MEI driver currently disconnects all clients upon system suspend. This behavior is by design and user-space applications with open connections before the suspend are expected to handle errors upon resume, by reopening their handles, reconnecting, and retrying their operations. However, the current driver implementation logs an error message every time a write operation is attempted on a disconnected client. Since this is a normal and expected flow after system resume logging this as an error can be misleading. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240530091415.725247-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 17:11:44 +02:00
Michal Wajdeczko	0698ff57bf	drm/xe/pf: Update the LMTT when freeing VF GT config The LMTT must be updated whenever we change the VF LMEM configuration. We missed that step when freeing the whole VF GT config, which could result in stale PTE in LMTT or LMTT PT object leaks. Fix that. Fixes: `ac6598aed1` ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240527115408.1064-1-michal.wajdeczko@intel.com (cherry picked from commit `c063cce7df`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-06-04 16:31:24 +02:00
Jeff Johnson	9c8f05cf1d	HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-logitech-hidpp.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-04 16:16:37 +02:00
Arnd Bergmann	9e438fe31e	HID: intel-ish-hid: fix endian-conversion The newly added file causes a ton of sparse warnings about the incorrect use of __le32 and similar types: drivers/hid/intel-ish-hid/ishtp/loader.h:41:23: error: invalid bitfield specifier for type restricted __le32. drivers/hid/intel-ish-hid/ishtp/loader.h:42:27: error: invalid bitfield specifier for type restricted __le32. drivers/hid/intel-ish-hid/ishtp/loader.h:43:24: error: invalid bitfield specifier for type restricted __le32. drivers/hid/intel-ish-hid/ishtp/loader.h:44:24: error: invalid bitfield specifier for type restricted __le32. drivers/hid/intel-ish-hid/ishtp/loader.h:45:22: error: invalid bitfield specifier for type restricted __le32. drivers/hid/intel-ish-hid/ishtp/loader.c:172:33: warning: restricted __le32 degrades to integer drivers/hid/intel-ish-hid/ishtp/loader.c:178:50: warning: incorrect type in assignment (different base types) drivers/hid/intel-ish-hid/ishtp/loader.c:178:50: expected restricted __le32 [usertype] length drivers/hid/intel-ish-hid/ishtp/loader.c:178:50: got unsigned long drivers/hid/intel-ish-hid/ishtp/loader.c:179:50: warning: incorrect type in assignment (different base types) drivers/hid/intel-ish-hid/ishtp/loader.c:179:50: expected restricted __le32 [usertype] fw_off drivers/hid/intel-ish-hid/ishtp/loader.c:179:50: got unsigned int [usertype] offset drivers/hid/intel-ish-hid/ishtp/loader.c:180:17: warning: cast from restricted __le32 drivers/hid/intel-ish-hid/ishtp/loader.c:183:24: warning: invalid assignment: += drivers/hid/intel-ish-hid/ishtp/loader.c:183:24: left side has type unsigned int drivers/hid/intel-ish-hid/ishtp/loader.c:183:24: right side has type restricted __le32 Add the necessary conversions and use temporary variables where appropriate to avoid converting back. Fixes: `579a267e46` ("HID: intel-ish-hid: Implement loading firmware from host feature") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Zhang Lixu <lixu.zhang@intel.com> Tested-by: Zhang Lixu <lixu.zhang@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-04 16:16:37 +02:00
Christophe JAILLET	655a8a7684	HID: nintendo: Fix an error handling path in nintendo_hid_probe() joycon_leds_create() has a ida_alloc() call. So if an error occurs after it, a corresponding ida_free() call is needed, as already done in the .remove function. This is not 100% perfect, because if ida_alloc() fails, then 'ctlr->player_id' will forced to be U32_MAX, and an error will be logged when ida_free() is called. Considering that this can't happen in real life, no special handling is done to handle it. Fixes: `5307de63d7` ("HID: nintendo: use ida for LED player id") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Silvan Jegen <s.jegen@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-04 16:16:37 +02:00
José Expósito	ce3af2ee95	HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() Fix a memory leak on logi_dj_recv_send_report() error path. Fixes: `6f20d32612` ("HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-04 16:16:37 +02:00
Fuad Tabba	afb91f5f8a	KVM: arm64: Ensure that SME controls are disabled in protected mode KVM (and pKVM) do not support SME guests. Therefore KVM ensures that the host's SME state is flushed and that SME controls for enabling access to ZA storage and for streaming are disabled. pKVM needs to protect against a buggy/malicious host. Ensure that it wouldn't run a guest when protected mode is enabled should any of the SME controls be enabled. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-10-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	a69283ae1d	KVM: arm64: Refactor CPACR trap bit setting/clearing to use ELx format When setting/clearing CPACR bits for EL0 and EL1, use the ELx format of the bits, which covers both. This makes the code clearer, and reduces the chances of accidentally missing a bit. No functional change intended. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	1696fc2174	KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVM Now that we have introduced finalize_init_hyp_mode(), lets consolidate the initializing of the host_data fpsimd_state and sve state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240603122852.3923848-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	b5b9955617	KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM When running in protected mode we don't want to leak protected guest state to the host, including whether a guest has used fpsimd/sve. Therefore, eagerly restore the host state on guest exit when running in protected mode, which happens only if the guest has used fpsimd/sve. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	66d5b53e20	KVM: arm64: Allocate memory mapped at hyp for host sve state in pKVM Protected mode needs to maintain (save/restore) the host's sve state, rather than relying on the host kernel to do that. This is to avoid leaking information to the host about guests and the type of operations they are performing. As a first step towards that, allocate memory mapped at hyp, per cpu, for the host sve state. The following patch will use this memory to save/restore the host state. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	e511e08a9f	KVM: arm64: Specialize handling of host fpsimd state on trap In subsequent patches, n/vhe will diverge on saving the host fpsimd/sve state when taking a guest fpsimd/sve trap. Add a specialized helper to handle it. No functional change intended. Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	6d8fb3cbf7	KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helper The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be toggled in different parts of the code, but the exact bits and their polarity differ between these two formats and the mode (vhe/nvhe/hvhe). To reduce the amount of duplicated code and the chance of getting the wrong bit/polarity or missing a field, abstract the set/clear of CPTR_EL2 bits behind a helper. Since (h)VHE is the way of the future, use the CPACR_EL1 format, which is a subset of the VHE CPTR_EL2, as a reference. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:33 +01:00
Fuad Tabba	45f4ea9bcf	KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state Since the prototypes for __sve_save_state/__sve_restore_state at hyp were added, the underlying macro has acquired a third parameter for saving/restoring ffr. Fix the prototypes to account for the third parameter, and restore the ffr for the guest since it is saved. Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240603122852.3923848-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:32 +01:00
Fuad Tabba	87bb39ed40	KVM: arm64: Reintroduce __sve_save_state Now that the hypervisor is handling the host sve state in protected mode, it needs to be able to save it. This reverts commit `e66425fc9b` ("KVM: arm64: Remove unused __sve_save_state"). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240603122852.3923848-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-06-04 15:06:32 +01:00
Lukas Wunner	44a45be57f	sysfs: Unbreak the build around sysfs_bin_attr_simple_read() Günter reports build breakage for m68k "m5208evb_defconfig" plus CONFIG_BLK_DEV_INITRD=y caused by commit `66bc1a1733` ("treewide: Use sysfs_bin_attr_simple_read() helper"). The defconfig disables CONFIG_SYSFS, so sysfs_bin_attr_simple_read() is not compiled into the kernel. But init/initramfs.c references that function in the initializer of a struct bin_attribute. Add an empty static inline to avoid the build breakage. Fixes: `66bc1a1733` ("treewide: Use sysfs_bin_attr_simple_read() helper") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/r/e12b0027-b199-4de7-b83d-668171447ccc@roeck-us.net Signed-off-by: Lukas Wunner <lukas@wunner.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/05f4290439a58730738a15b0c99cd8576c4aa0d9.1716461752.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:56:45 +02:00
Greg Kroah-Hartman	9711873506	driver core: remove devm_device_add_groups() There is no more in-kernel users of this function, and no driver should ever be using it, so remove it from the kernel. Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Link: https://lore.kernel.org/r/20230704131715.44454-8-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:53:36 +02:00
Hagar Hemdan	73254a297c	io_uring: fix possible deadlock in io_register_iowq_max_workers() The io_register_iowq_max_workers() function calls io_put_sq_data(), which acquires the sqd->lock without releasing the uring_lock. Similar to the commit `009ad9f0c6` ("io_uring: drop ctx->uring_lock before acquiring sqd->lock"), this can lead to a potential deadlock situation. To resolve this issue, the uring_lock is released before calling io_put_sq_data(), and then it is re-acquired after the function call. This change ensures that the locks are acquired in the correct order, preventing the possibility of a deadlock. Suggested-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Link: https://lore.kernel.org/r/20240604130527.3597-1-hagarhem@amazon.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-04 07:39:17 -06:00
Su Hui	91215f70ea	io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue() Clang static checker (scan-build) warning: o_uring/io-wq.c:line 1051, column 3 The expression is an uninitialized value. The computed value will also be garbage. 'match.nr_pending' is used in io_acct_cancel_pending_work(), but it is not fully initialized. Change the order of assignment for 'match' to fix this problem. Fixes: `42abc95f05` ("io-wq: decouple work_list protection from the big wqe->lock") Signed-off-by: Su Hui <suhui@nfschina.com> Link: https://lore.kernel.org/r/20240604121242.2661244-1-suhui@nfschina.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-04 07:39:00 -06:00
Shichao Lai	16637fea00	usb-storage: alauda: Check whether the media is initialized The member "uzonesize" of struct alauda_info will remain 0 if alauda_init_media() fails, potentially causing divide errors in alauda_read_data() and alauda_write_lba(). - Add a member "media_initialized" to struct alauda_info. - Change a condition in alauda_check_media() to ensure the first initialization. - Add an error check for the return value of alauda_init_media(). Fixes: `e80b0fade0` ("[PATCH] USB Storage: add alauda support") Reported-by: xingwei lee <xrivendell7@gmail.com> Reported-by: yue sun <samsun1006219@gmail.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Shichao Lai <shichaorai@gmail.com> Link: https://lore.kernel.org/r/20240526012745.2852061-1-shichaorai@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:36:28 +02:00
Heikki Krogerus	8bdf8a42bc	usb: typec: ucsi: Ack also failed Get Error commands It is possible that also the GET_ERROR command fails. If that happens, the command completion still needs to be acknowledged. Otherwise the interface will be stuck until it's reset. Reported-by: Ammy Yi <ammy.yi@intel.com> Fixes: `bdc62f2bae` ("usb: typec: ucsi: Simplified registration and I/O API") Cc: stable@vger.kernel.org Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240531104653.1303519-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:36:12 +02:00
Andrey Konovalov	f85d39dd7e	kcov, usb: disable interrupts in kcov_remote_start_usb_softirq After commit `8fea0c8fda` ("usb: core: hcd: Convert from tasklet to BH workqueue"), usb_giveback_urb_bh() runs in the BH workqueue with interrupts enabled. Thus, the remote coverage collection section in usb_giveback_urb_bh()-> __usb_hcd_giveback_urb() might be interrupted, and the interrupt handler might invoke __usb_hcd_giveback_urb() again. This breaks KCOV, as it does not support nested remote coverage collection sections within the same context (neither in task nor in softirq). Update kcov_remote_start/stop_usb_softirq() to disable interrupts for the duration of the coverage collection section to avoid nested sections in the softirq context (in addition to such in the task context, which are already handled). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Closes: https://lore.kernel.org/linux-usb/0f4d1964-7397-485b-bc48-11c01e2fcbca@I-love.SAKURA.ne.jp/ Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2 Suggested-by: Alan Stern <stern@rowland.harvard.edu> Fixes: `8fea0c8fda` ("usb: core: hcd: Convert from tasklet to BH workqueue") Cc: stable@vger.kernel.org Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20240527173538.4989-1-andrey.konovalov@linux.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:34:44 +02:00
Rob Herring (Arm)	e4228cfd09	dt-bindings: usb: realtek,rts5411: Add missing "additionalProperties" on child nodes All nodes need an explicit additionalProperties or unevaluatedProperties unless a $ref has one that's false. As that is not the case with usb-device.yaml, "additionalProperties" is needed here. Fixes: `c44d9dab31` ("dt-bindings: usb: Add downstream facing ports to realtek binding") Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20240523194500.2958192-1-robh@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:34:28 +02:00
Kyle Tso	fc8fb9eea9	usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state Similar to what fixed in Commit `a6fe37f428` ("usb: typec: tcpm: Skip hard reset when in error recovery"), the handling of the received Hard Reset has to be skipped during TOGGLING state. [ 4086.021288] VBUS off [ 4086.021295] pending state change SNK_READY -> SNK_UNATTACHED @ 650 ms [rev2 NONE_AMS] [ 4086.022113] VBUS VSAFE0V [ 4086.022117] state change SNK_READY -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.022447] VBUS off [ 4086.022450] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.023060] VBUS VSAFE0V [ 4086.023064] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS] [ 4086.023070] disable BIST MODE TESTDATA [ 4086.023766] disable vbus discharge ret:0 [ 4086.023911] Setting usb_comm capable false [ 4086.028874] Setting voltage/current limit 0 mV 0 mA [ 4086.028888] polarity 0 [ 4086.030305] Requesting mux state 0, usb-role 0, orientation 0 [ 4086.033539] Start toggling [ 4086.038496] state change SNK_UNATTACHED -> TOGGLING [rev2 NONE_AMS] // This Hard Reset is unexpected [ 4086.038499] Received hard reset [ 4086.038501] state change TOGGLING -> HARD_RESET_START [rev2 HARD_RESET] Fixes: `f0690a25a1` ("staging: typec: USB Type-C Port Manager (tcpm)") Cc: stable@vger.kernel.org Signed-off-by: Kyle Tso <kyletso@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240520154858.1072347-1-kyletso@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:34:12 +02:00
Amit Sunil Dhamne	e7e921918d	usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps There could be a potential use-after-free case in tcpm_register_source_caps(). This could happen when: * new (say invalid) source caps are advertised * the existing source caps are unregistered * tcpm_register_source_caps() returns with an error as usb_power_delivery_register_capabilities() fails This causes port->partner_source_caps to hold on to the now freed source caps. Reset port->partner_source_caps value to NULL after unregistering existing source caps. Fixes: `230ecdf71a` ("usb: typec: tcpm: unregister existing source caps before re-registration") Cc: stable@vger.kernel.org Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: Ondrej Jirman <megi@xff.cz> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240514220134.2143181-1-amitsd@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:33:55 +02:00
John Ernberg	8475ffcfb3	USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected If no other USB HCDs are selected when compiling a small pure virutal machine, the Xen HCD driver cannot be built. Fix it by traversing down host/ if CONFIG_USB_XEN_HCD is selected. Fixes: `494ed3997d` ("usb: Introduce Xen pvUSB frontend (xen hcd)") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: John Ernberg <john.ernberg@actia.se> Link: https://lore.kernel.org/r/20240517114345.1190755-1-john.ernberg@actia.se Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:33:38 +02:00
Johan Hovold	fc3568f142	usb: typec: ucsi: glink: increase max ports for x1e80100 The new X Elite (x1e80100) platform has three ports so increase the maximum so that all ports can be registered. Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20240603100007.10236-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 15:33:19 +02:00
Jens Axboe	415ce0ea55	io_uring/napi: fix timeout calculation Not quite sure what __io_napi_adjust_timeout() was attemping to do, it's adjusting both the NAPI timeout and the general overall timeout, and calculating a value that is never used. The overall timeout is a super set of the NAPI timeout, and doesn't need adjusting. The only thing we really need to care about is that the NAPI timeout doesn't exceed the overall timeout. If a user asked for a timeout of eg 5 usec and NAPI timeout is 10 usec, then we should not spin for 10 usec. While in there, sanitize the time checking a bit. If we have a negative value in the passed in timeout, discard it. Round up the value as well, so we don't end up with a NAPI timeout for the majority of the wait, with only a tiny sleep value at the end. Hence the only case we need to care about is if the NAPI timeout is larger than the overall timeout. If it is, cap the NAPI timeout at what the overall timeout is. Cc: stable@vger.kernel.org Fixes: `8d0c12a80c` ("io-uring: add napi busy poll support") Reported-by: Lewis Baker <lewissbaker@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-04 07:32:45 -06:00
Peter Chen	718d4a63c0	Revert "usb: chipidea: move ci_ulpi_init after the phy initialization" This reverts commit `22ffd399e6`. People report this commit causes the driver defer probed, and never back to work[1][2]. [1] https://lore.kernel.org/lkml/20240407011913.GA168730@nchen-desktop/T/#mc2b93bc11a8b01ec7cd0d0bf6b0b03951d9ef751 [2] https://lore.kernel.org/lkml/20240407011913.GA168730@nchen-desktop/T/#me87d9a2a76c07619d83b3879ea14780da89fbbbf Cc: Michael Grzeschik <m.grzeschik@pengutronix.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Wouter Franken <wouter@franken-peeters.be> Signed-off-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240517023648.3459188-1-peter.chen@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:51:09 +02:00
Tetsuo Handa	ae01e52da2	serial: drop debugging WARN_ON_ONCE() from uart_write() syzbot is reporting lockdep warning upon int disc = 7; ioctl(open("/dev/ttyS3", O_RDONLY), TIOCSETD, &disc); sequence. Do like what commit `5f1149d2f4` ("serial: drop debugging WARN_ON_ONCE() from uart_put_char()") does. Reported-by: syzbot+f78380e4eae53c64125c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/d775ae2d-a2ac-439e-8e2b-134749f60f30@I-love.SAKURA.ne.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:10:43 +02:00
Hugo Villeneuve	7a2e8e30ad	serial: sc16is7xx: re-add Kconfig SPI or I2C dependency Commit `d492164381` ("serial: sc16is7xx: split into core and I2C/SPI parts (core)") removed Kconfig SPI_MASTER or I2C dependency for SERIAL_SC16IS7XX (core). This removal was done because I inadvertently misinterpreted some review comments. Because of that, the driver question now pops up if both I2C and SPI_MASTER are disabled. Re-add Kconfig SPI_MASTER or I2C dependency to fix the problem. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: `d492164381` ("serial: sc16is7xx: split into core and I2C/SPI parts (core)") Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20240603152601.3689319-3-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:10:28 +02:00
Hugo Villeneuve	4e534ff4b6	serial: sc16is7xx: rename Kconfig CONFIG_SERIAL_SC16IS7XX_CORE Commit `d492164381` ("serial: sc16is7xx: split into core and I2C/SPI parts (core)") renamed SERIAL_SC16IS7XX_CORE by SERIAL_SC16IS7XX. This means that some configs should have been updated when I submitted the original patch, but unfortunately they were not. Geert mentioned for example: arch/mips/configs/cu1??0-neo_defconfig Rename SERIAL_SC16IS7XX to SERIAL_SC16IS7XX_CORE so that existing configs will still work correctly. Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: `d492164381` ("serial: sc16is7xx: split into core and I2C/SPI parts (core)") Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20240603152601.3689319-2-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:10:28 +02:00
Douglas Anderson	ca84cd379b	serial: port: Don't block system suspend even if bytes are left to xmit Recently, suspend testing on sc7180-trogdor based devices has started to sometimes fail with messages like this: port a88000.serial:0.0: PM: calling pm_runtime_force_suspend+0x0/0xf8 @ 28934, parent: a88000.serial:0 port a88000.serial:0.0: PM: dpm_run_callback(): pm_runtime_force_suspend+0x0/0xf8 returns -16 port a88000.serial:0.0: PM: pm_runtime_force_suspend+0x0/0xf8 returned -16 after 33 usecs port a88000.serial:0.0: PM: failed to suspend: error -16 I could reproduce these problems by logging in via an agetty on the debug serial port (which was _not_ used for kernel console) and running: cat /var/log/messages ...and then (via an SSH session) forcing a few suspend/resume cycles. Tracing through the code and doing some printf()-based debugging shows that the -16 (-EBUSY) comes from the recently added serial_port_runtime_suspend(). The idea of the serial_port_runtime_suspend() function is to prevent the port from being _runtime_ suspended if it still has bytes left to transmit. Having bytes left to transmit isn't a reason to block _system_ suspend, though. If a serdev device in the kernel needs to block system suspend it should block its own suspend and it can use serdev_device_wait_until_sent() to ensure bytes are sent. The DEFINE_RUNTIME_DEV_PM_OPS() used by the serial_port code means that the system suspend function will be pm_runtime_force_suspend(). In pm_runtime_force_suspend() we can see that before calling the runtime suspend function we'll call pm_runtime_disable(). This should be a reliable way to detect that we're called from system suspend and that we shouldn't look for busyness. Fixes: `43066e3222` ("serial: port: Don't suspend if the port is still busy") Cc: stable@vger.kernel.org Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240531080914.v3.1.I2395e66cf70c6e67d774c56943825c289b9c13e4@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:09:47 +02:00
Doug Brown	5208e7ced5	serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level The FIFO is 64 bytes, but the FCR is configured to fire the TX interrupt when the FIFO is half empty (bit 3 = 0). Thus, we should only write 32 bytes when a TX interrupt occurs. This fixes a problem observed on the PXA168 that dropped a bunch of TX bytes during large transmissions. Fixes: `ab28f51c77` ("serial: rewrite pxa2xx-uart to use 8250_core") Signed-off-by: Doug Brown <doug@schmorgal.com> Link: https://lore.kernel.org/r/20240519191929.122202-1-doug@schmorgal.com Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:08:09 +02:00
Andy Shevchenko	2c94512055	serial: 8250_dw: Revert "Move definitions to the shared header" This reverts commit `d9666dfb31`. The container of the struct dw8250_port_data is private to the actual driver. In particular, 8250_lpss and 8250_dw use different data types that are assigned to the UART port private_data. Hence, it must not be used outside the specific driver. Fix the mistake made in the past by moving the respective definitions to the specific driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240514190730.2787071-3-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:07:58 +02:00
Andy Shevchenko	87d80bfbd5	serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw The container of the struct dw8250_port_data is private to the actual driver. In particular, 8250_lpss and 8250_dw use different data types that are assigned to the UART port private_data. Hence, it must not be used outside the specific driver. Currently the only cpr_val is required by the common code, make it be available via struct dw8250_port_data. This fixes the UART breakage on Intel Galileo boards. Fixes: `593dea000b` ("serial: 8250: dw: Allow to use a fallback CPR value if not synthesized") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240514190730.2787071-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:07:58 +02:00
Ilpo Järvinen	b19ab7ee2c	tty: n_tty: Fix buffer offsets when lookahead is used When lookahead has "consumed" some characters (la_count > 0), n_tty_receive_buf_standard() and n_tty_receive_buf_closing() for characters beyond the la_count are given wrong cp/fp offsets which leads to duplicating and losing some characters. If la_count > 0, correct buffer pointers and make count consistent too (the latter is not strictly necessary to fix the issue but seems more logical to adjust all variables immediately to keep state consistent). Reported-by: Vadym Krevs <vkrevs@yahoo.com> Fixes: `6bb6fa6908` ("tty: Implement lookahead to process XON/XOFF timely") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218834 Tested-by: Vadym Krevs <vkrevs@yahoo.com> Cc: stable@vger.kernel.org Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240514140429.12087-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 14:07:27 +02:00
Vasant Hegde	526606b0a1	iommu/amd: Fix Invalid wait context issue With commit `c4cb231111` ("iommu/amd: Add support for enable/disable IOPF") we are hitting below issue. This happens because in IOPF enablement path it holds spin lock with irq disable and then tries to take mutex lock. dmesg: ----- [ 0.938739] ============================= [ 0.938740] [ BUG: Invalid wait context ] [ 0.938742] 6.10.0-rc1+ #1 Not tainted [ 0.938745] ----------------------------- [ 0.938746] swapper/0/1 is trying to lock: [ 0.938748] ffffffff8c9f01d8 (&port_lock_key){....}-{3:3}, at: serial8250_console_write+0x78/0x4a0 [ 0.938767] other info that might help us debug this: [ 0.938768] context-{5:5} [ 0.938769] 7 locks held by swapper/0/1: [ 0.938772] #0: ffff888101a91310 (&group->mutex){+.+.}-{4:4}, at: bus_iommu_probe+0x70/0x160 [ 0.938790] #1: ffff888101d1f1b8 (&domain->lock){....}-{3:3}, at: amd_iommu_attach_device+0xa5/0x700 [ 0.938799] #2: ffff888101cc3d18 (&dev_data->lock){....}-{3:3}, at: amd_iommu_attach_device+0xc5/0x700 [ 0.938806] #3: ffff888100052830 (&iommu->lock){....}-{2:2}, at: amd_iommu_iopf_add_device+0x3f/0xa0 [ 0.938813] #4: ffffffff8945a340 (console_lock){+.+.}-{0:0}, at: _printk+0x48/0x50 [ 0.938822] #5: ffffffff8945a390 (console_srcu){....}-{0:0}, at: console_flush_all+0x58/0x4e0 [ 0.938867] #6: ffffffff82459f80 (console_owner){....}-{0:0}, at: console_flush_all+0x1f0/0x4e0 [ 0.938872] stack backtrace: [ 0.938874] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc1+ #1 [ 0.938877] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.39 04/16/2019 Fix above issue by re-arranging code in attach device path: - move device PASID/IOPF enablement outside lock in AMD IOMMU driver. This is safe as core layer holds group->mutex lock before calling iommu_ops->attach_dev. Reported-by: Borislav Petkov <bp@alien8.de> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: `c4cb231111` ("iommu/amd: Add support for enable/disable IOPF") Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240530084801.10758-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 14:00:59 +02:00
Vasant Hegde	48dc345a23	iommu/amd: Check EFR[EPHSup] bit before enabling PPR Check for EFR[EPHSup] bit before enabling PPR. This bit must be set to enable PPR. Reported-by: Borislav Petkov <bp@alien8.de> Fixes: `c4cb231111` ("iommu/amd: Add support for enable/disable IOPF") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218900 Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Jean-Christophe Guillain <jean-christophe@guillain.net> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240530071118.10297-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 13:59:52 +02:00
Vasant Hegde	998a0a362b	iommu/amd: Fix workqueue name Workqueue name length is crossing WQ_NAME_LEN limit. Fix it by changing name format. New format : "iopf_queue/amdvi-<iommu-devid>" kernel warning: [ 11.146912] workqueue: name exceeds WQ_NAME_LEN. Truncating to: iopf_queue/amdiommu-0xc002-iopf Reported-by: Borislav Petkov <bp@alien8.de> Fixes: `61928bab9d` ("iommu/amd: Define per-IOMMU iopf_queue") Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240529113900.5798-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 13:58:38 +02:00
Lu Baolu	89e8a2366e	iommu: Return right value in iommu_sva_bind_device() iommu_sva_bind_device() should return either a sva bond handle or an ERR_PTR value in error cases. Existing drivers (idxd and uacce) only check the return value with IS_ERR(). This could potentially lead to a kernel NULL pointer dereference issue if the function returns NULL instead of an error pointer. In reality, this doesn't cause any problems because iommu_sva_bind_device() only returns NULL when the kernel is not configured with CONFIG_IOMMU_SVA. In this case, iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) will return an error, and the device drivers won't call iommu_sva_bind_device() at all. Fixes: `26b25a2b98` ("iommu: Bind process address spaces to devices") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240528042528.71396-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 13:54:31 +02:00
Robin Murphy	cc8d89d063	iommu/dma: Fix domain init Despite carefully rewording the kerneldoc to describe the new direct interaction with dma_range_map, it seems I managed to confuse myself in removing the redundant force_aperture check and ended up making the code not do that at all. This led to dma_range_maps inadvertently being able to set iovad->start_pfn = 0, and all the nonsensical chaos which ensues from there. Restore the correct behaviour of constraining base_pfn to the domain aperture regardless of dma_range_map, and not trying to apply dma_range_map constraints to the basic IOVA domain since they will be properly handled with reserved regions later. Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Fixes: `ad4750b07d` ("iommu/dma: Make limit checks self-contained") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/721fa6baebb0924aa40db0b8fb86bcb4538434af.1716232484.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 13:54:09 +02:00
Kun(llfl)	a295ec52c8	iommu/amd: Fix sysfs leak in iommu init During the iommu initialization, iommu_init_pci() adds sysfs nodes. However, these nodes aren't remove in free_iommu_resources() subsequently. Fixes: `39ab9555c2` ("iommu: Add sysfs bindings for struct iommu_device") Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/c8e0d11c6ab1ee48299c288009cf9c5dae07b42d.1715215003.git.llfl@linux.alibaba.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-04 13:50:15 +02:00
Stefan Wahren	c3552ab19a	staging: vchiq_debugfs: Fix NPD in vchiq_dump_state The commit `42a2f6664e` ("staging: vc04_services: Move global g_state to vchiq_state") falsely assumed that the debugfs entry vchiq/state was created with vchiq_instance as data. This causes now a NULL pointer derefence while trying to dump the vchiq state. So fix this by passing vchiq_state as data, because this is the relevant part here. Fixes: `42a2f6664e` ("staging: vc04_services: Move global g_state to vchiq_state") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Umang Jain <umang.jain@ideasonboard.com> Link: https://lore.kernel.org/r/20240524151542.19415-1-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-04 13:36:44 +02:00
Jakub Kicinski	a535d59432	net: tls: fix marking packets as decrypted For TLS offload we mark packets with skb->decrypted to make sure they don't escape the host without getting encrypted first. The crypto state lives in the socket, so it may get detached by a call to skb_orphan(). As a safety check - the egress path drops all packets with skb->decrypted and no "crypto-safe" socket. The skb marking was added to sendpage only (and not sendmsg), because tls_device injected data into the TCP stack using sendpage. This special case was missed when sendpage got folded into sendmsg. Fixes: `c5c37af6ec` ("tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240530232607.82686-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-04 12:58:50 +02:00
Ilpo Järvinen	f8367a74ae	EDAC/igen6: Convert PCIBIOS_* return codes to errnos errcmd_enable_error_reporting() uses pci_{read,write}_config_word() that return PCIBIOS_* codes. The return code is then returned all the way into the probe function igen6_probe() that returns it as is. The probe functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from errcmd_enable_error_reporting(). Fixes: `10590a9d4f` ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-2-ilpo.jarvinen@linux.intel.com	2024-06-04 11:29:52 +02:00
Ilpo Järvinen	3ec8ebd8a5	EDAC/amd64: Convert PCIBIOS_* return codes to errnos gpu_get_node_map() uses pci_read_config_dword() that returns PCIBIOS_* codes. The return code is then returned all the way into the module init function amd64_edac_init() that returns it as is. The module init functions, however, should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from gpu_get_node_map(). For consistency, convert also the other similar cases which return PCIBIOS_* codes even if they do not have any bugs at the moment. Fixes: `4251566ebc` ("EDAC/amd64: Cache and use GPU node map") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527132236.13875-1-ilpo.jarvinen@linux.intel.com	2024-06-04 11:24:16 +02:00
Nikita Zhandarovich	4aa2dcfbad	HID: core: remove unnecessary WARN_ON() in implement() Syzkaller hit a warning [1] in a call to implement() when trying to write a value into a field of smaller size in an output report. Since implement() already has a warn message printed out with the help of hid_warn() and value in question gets trimmed with: ... value &= m; ... WARN_ON may be considered superfluous. Remove it to suppress future syzkaller triggers. [1] WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 implement drivers/hid/hid-core.c:1451 [inline] WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863 Modules linked in: CPU: 0 PID: 5084 Comm: syz-executor424 Not tainted 6.9.0-rc7-syzkaller-00183-gcf87f46fd34d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 RIP: 0010:implement drivers/hid/hid-core.c:1451 [inline] RIP: 0010:hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863 ... Call Trace: <TASK> __usbhid_submit_report drivers/hid/usbhid/hid-core.c:591 [inline] usbhid_submit_report+0x43d/0x9e0 drivers/hid/usbhid/hid-core.c:636 hiddev_ioctl+0x138b/0x1f00 drivers/hid/usbhid/hiddev.c:726 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:904 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... Fixes: `95d1c8951e` ("HID: simplify implement() a bit") Reported-by: <syzbot+5186630949e3c55f0799@syzkaller.appspotmail.com> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-06-04 09:49:35 +02:00
Jakub Kicinski	d630180260	Merge tag 'wireless-2024-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.10-rc3 The first fixes for v6.10. And we have a big one, I suspect the biggest wireless pull request we ever had. There are fixes all over, both in stack and drivers. Likely the most important here are mt76 not working on mt7615 devices, ath11k not being able to connect to 6 GHz networks and rtlwifi suffering from packet loss. But of course there's much more. * tag 'wireless-2024-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (37 commits) wifi: rtlwifi: Ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS wifi: mt76: mt7615: add missing chanctx ops wifi: wilc1000: document SRCU usage instead of SRCU Revert "wifi: wilc1000: set atomic flag on kmemdup in srcu critical section" Revert "wifi: wilc1000: convert list management to RCU" wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan() wifi: mac80211: correctly parse Spatial Reuse Parameter Set element wifi: mac80211: fix Spatial Reuse element size check wifi: iwlwifi: mvm: don't read past the mfuart notifcation wifi: iwlwifi: mvm: Fix scan abort handling with HW rfkill wifi: iwlwifi: mvm: check n_ssids before accessing the ssids wifi: iwlwifi: mvm: properly set 6 GHz channel direct probe option wifi: iwlwifi: mvm: handle BA session teardown in RF-kill wifi: iwlwifi: mvm: Handle BIGTK cipher in kek_kck cmd wifi: iwlwifi: mvm: remove stale STA link data during restart wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef wifi: iwlwifi: mvm: set properly mac header wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 wifi: iwlwifi: mvm: d3: fix WoWLAN command version lookup wifi: iwlwifi: mvm: fix a crash on 7265 ... ==================== Link: https://lore.kernel.org/r/20240603115129.9494CC2BD10@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:52:24 -07:00
Jeff Johnson	c6cab01d7e	lib/test_rhashtable: add missing MODULE_DESCRIPTION() macro make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_rhashtable.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240531-md-lib-test_rhashtable-v1-1-cd6d4138f1b6@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:51:18 -07:00
Jakub Kicinski	d730a42ca6	Merge branch 'dst_cache-fix-possible-races' Eric Dumazet says: ==================== dst_cache: fix possible races This series is inspired by various undisclosed syzbot reports hinting at corruptions in dst_cache structures. It seems at least four users of dst_cache are racy against BH reentrancy. Last patch is adding a DEBUG_NET check to catch future misuses. ==================== Link: https://lore.kernel.org/r/20240531132636.2637995-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:14 -07:00
Eric Dumazet	2fe6fb36c7	net: dst_cache: add two DEBUG_NET warnings After fixing four different bugs involving dst_cache users, it might be worth adding a check about BH being blocked by dst_cache callers. DEBUG_NET_WARN_ON_ONCE(!in_softirq()); It is not fatal, if we missed valid case where no BH deadlock is to be feared, we might change this. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:09 -07:00
Eric Dumazet	cf28ff8e4c	ila: block BH in ila_output() As explained in commit `1378817486` ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. ila_output() is called from lwtunnel_output() possibly from process context, and under rcu_read_lock(). We might be interrupted by a softirq, re-enter ila_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable(). Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:09 -07:00
Eric Dumazet	c0b98ac1cc	ipv6: sr: block BH in seg6_output_core() and seg6_input_core() As explained in commit `1378817486` ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in seg6_output_core() is not good enough, because seg6_output_core() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter seg6_output_core() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Apply a similar change in seg6_input_core(). Fixes: `fa79581ea6` ("ipv6: sr: fix several BUGs when preemption is enabled") Fixes: `6c8702c60b` ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Lebrun <dlebrun@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:08 -07:00
Eric Dumazet	db0090c6eb	net: ipv6: rpl_iptunnel: block BH in rpl_output() and rpl_input() As explained in commit `1378817486` ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in rpl_output() is not good enough, because rpl_output() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter rpl_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Apply a similar change in rpl_input(). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexander Aring <aahringo@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:08 -07:00
Eric Dumazet	2fe40483ec	ipv6: ioam: block BH from ioam6_output() As explained in commit `1378817486` ("tipc: block BH before using dst_cache"), net/core/dst_cache.c helpers need to be called with BH disabled. Disabling preemption in ioam6_output() is not good enough, because ioam6_output() is called from process context, lwtunnel_output() only uses rcu_read_lock(). We might be interrupted by a softirq, re-enter ioam6_output() and corrupt dst_cache data structures. Fix the race by using local_bh_disable() instead of preempt_disable(). Fixes: `8cb3bf8bff` ("ipv6: ioam: Add support for the ip6ip6 encapsulation") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Justin Iurman <justin.iurman@uliege.be> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240531132636.2637995-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:50:08 -07:00
Matthias Stocker	ffbe335b8d	vmxnet3: disable rx data ring on dma allocation failure When vmxnet3_rq_create() fails to allocate memory for rq->data_ring.base, the subsequent call to vmxnet3_rq_destroy_all_rxdataring does not reset rq->data_ring.desc_size for the data ring that failed, which presumably causes the hypervisor to reference it on packet reception. To fix this bug, rq->data_ring.desc_size needs to be set to 0 to tell the hypervisor to disable this feature. [ 95.436876] kernel BUG at net/core/skbuff.c:207! [ 95.439074] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 95.440411] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.9.3-dirty #1 [ 95.441558] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 [ 95.443481] RIP: 0010:skb_panic+0x4d/0x4f [ 95.444404] Code: 4f 70 50 8b 87 c0 00 00 00 50 8b 87 bc 00 00 00 50 ff b7 d0 00 00 00 4c 8b 8f c8 00 00 00 48 c7 c7 68 e8 be 9f e8 63 58 f9 ff <0f> 0b 48 8b 14 24 48 c7 c1 d0 73 65 9f e8 a1 ff ff ff 48 8b 14 24 [ 95.447684] RSP: 0018:ffffa13340274dd0 EFLAGS: 00010246 [ 95.448762] RAX: 0000000000000089 RBX: ffff8fbbc72b02d0 RCX: 000000000000083f [ 95.450148] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f [ 95.451520] RBP: 000000000000002d R08: 0000000000000000 R09: ffffa13340274c60 [ 95.452886] R10: ffffffffa04ed468 R11: 0000000000000002 R12: 0000000000000000 [ 95.454293] R13: ffff8fbbdab3c2d0 R14: ffff8fbbdbd829e0 R15: ffff8fbbdbd809e0 [ 95.455682] FS: 0000000000000000(0000) GS:ffff8fbeefd80000(0000) knlGS:0000000000000000 [ 95.457178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.458340] CR2: 00007fd0d1f650c8 CR3: 0000000115f28000 CR4: 00000000000406f0 [ 95.459791] Call Trace: [ 95.460515] <IRQ> [ 95.461180] ? __die_body.cold+0x19/0x27 [ 95.462150] ? die+0x2e/0x50 [ 95.462976] ? do_trap+0xca/0x110 [ 95.463973] ? do_error_trap+0x6a/0x90 [ 95.464966] ? skb_panic+0x4d/0x4f [ 95.465901] ? exc_invalid_op+0x50/0x70 [ 95.466849] ? skb_panic+0x4d/0x4f [ 95.467718] ? asm_exc_invalid_op+0x1a/0x20 [ 95.468758] ? skb_panic+0x4d/0x4f [ 95.469655] skb_put.cold+0x10/0x10 [ 95.470573] vmxnet3_rq_rx_complete+0x862/0x11e0 [vmxnet3] [ 95.471853] vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3] [ 95.473185] __napi_poll+0x2b/0x160 [ 95.474145] net_rx_action+0x2c6/0x3b0 [ 95.475115] handle_softirqs+0xe7/0x2a0 [ 95.476122] __irq_exit_rcu+0x97/0xb0 [ 95.477109] common_interrupt+0x85/0xa0 [ 95.478102] </IRQ> [ 95.478846] <TASK> [ 95.479603] asm_common_interrupt+0x26/0x40 [ 95.480657] RIP: 0010:pv_native_safe_halt+0xf/0x20 [ 95.481801] Code: 22 d7 e9 54 87 01 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 93 ba 3b 00 fb f4 <e9> 2c 87 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 [ 95.485563] RSP: 0018:ffffa133400ffe58 EFLAGS: 00000246 [ 95.486882] RAX: 0000000000004000 RBX: ffff8fbbc1d14064 RCX: 0000000000000000 [ 95.488477] RDX: ffff8fbeefd80000 RSI: ffff8fbbc1d14000 RDI: 0000000000000001 [ 95.490067] RBP: ffff8fbbc1d14064 R08: ffffffffa0652260 R09: 00000000000010d3 [ 95.491683] R10: 0000000000000018 R11: ffff8fbeefdb4764 R12: ffffffffa0652260 [ 95.493389] R13: ffffffffa06522e0 R14: 0000000000000001 R15: 0000000000000000 [ 95.495035] acpi_safe_halt+0x14/0x20 [ 95.496127] acpi_idle_do_entry+0x2f/0x50 [ 95.497221] acpi_idle_enter+0x7f/0xd0 [ 95.498272] cpuidle_enter_state+0x81/0x420 [ 95.499375] cpuidle_enter+0x2d/0x40 [ 95.500400] do_idle+0x1e5/0x240 [ 95.501385] cpu_startup_entry+0x29/0x30 [ 95.502422] start_secondary+0x11c/0x140 [ 95.503454] common_startup_64+0x13e/0x141 [ 95.504466] </TASK> [ 95.505197] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables vsock_loopback vmw_vsock_virtio_transport_common qrtr vmw_vsock_vmci_transport vsock sunrpc binfmt_misc pktcdvd vmw_balloon pcspkr vmw_vmci i2c_piix4 joydev loop dm_multipath nfnetlink zram crct10dif_pclmul crc32_pclmul vmwgfx crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 vmxnet3 sha1_ssse3 drm_ttm_helper vmw_pvscsi ttm ata_generic pata_acpi serio_raw scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables fuse [ 95.516536] ---[ end trace 0000000000000000 ]--- Fixes: `6f4833383e` ("net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()") Signed-off-by: Matthias Stocker <mstocker@barracuda.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Ronak Doshi <ronak.doshi@broadcom.com> Link: https://lore.kernel.org/r/20240531103711.101961-1-mstocker@barracuda.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-03 18:49:33 -07:00
Linus Torvalds	2ab7951410	Merge tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dave Jiang: - Compile fix for cxl-test from missing linux/vmalloc.h - Fix for memregion leaks in devm_cxl_add_region() * tag 'cxl-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix memregion leaks in devm_cxl_add_region() cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c	2024-06-03 14:42:41 -07:00
Johan Hovold	78f0dfa64c	iio: inkern: fix channel read regression A recent "cleanup" broke IIO channel read outs and thereby thermal mitigation on the Lenovo ThinkPad X13s by returning zero instead of the expected IIO value type in iio_read_channel_processed_scale(): thermal thermal_zone12: failed to read out thermal zone (-22) Fixes: `3092bde731` ("iio: inkern: move to the cleanup.h magic") Cc: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240530074416.13697-1-johan+linaro@kernel.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-03 20:29:31 +01:00
Jean-Baptiste Maneyrol	8844ed0a6e	iio: imu: inv_mpu6050: stabilized timestamping in interrupt Use IRQ ONESHOT flag to ensure the timestamp is not updated in the hard handler during the thread handler. And use a fixed value of 1 sample that correspond to this first timestamp. This way we can ensure the timestamp is always corresponding to the value used by the timestamping mechanism. Otherwise, it is possible that between FIFO count read and FIFO processing the timestamp is overwritten in the hard handler. Fixes: `111e1abd00` ("iio: imu: inv_mpu6050: use the common inv_sensors timestamp module") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240527150117.608792-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-03 19:05:56 +01:00
Dumitru Ceclan	182bc496dc	iio: adc: ad7173: Fix sampling frequency setting This patch fixes two issues regarding the sampling frequency setting: -The attribute was set as per device, not per channel. As such, when setting the sampling frequency, the configuration was always done for the slot 0, and the correct configuration was applied on the next channel configuration call by the LRU mechanism. -The LRU implementation does not take into account external settings of the slot registers. When setting the sampling frequency directly to a slot register in write_raw(), there is no guarantee that other channels were not also using that slot and now incorrectly retain their config as live. Set the sampling frequency attribute as separate in the channel templates. Do not set the sampling directly to the slot register in write_raw(), just mark the config as not live and let the LRU mechanism handle it. As the reg variable is no longer used, remove it. Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240530-ad7173-fixes-v3-5-b85f33079e18@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-03 19:05:45 +01:00
Dumitru Ceclan	18befe4a28	iio: adc: ad7173: Clear append status bit The previous value of the append status bit was not cleared before setting the new value. This caused the bit to remain set after enabling buffered mode for multiple channels and not permit further buffered reads from a single channel after the fact. Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240530-ad7173-fixes-v3-4-b85f33079e18@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-03 19:05:19 +01:00
Arnaldo Carvalho de Melo	d6283b160a	tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL To pick the changes from: `2a82bb0294` ("statx: stx_subvol") This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlnK2Fmx_gahzwZI@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-06-03 14:44:28 -03:00
Paolo Bonzini	b50788f7cd	Merge tag 'kvm-riscv-fixes-6.10-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv fixes for 6.10, take #1 - No need to use mask when hart-index-bits is 0 - Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()	2024-06-03 13:18:18 -04:00
Paolo Bonzini	b3233c737e	Merge branch 'kvm-fixes-6.10-1' into HEAD * Fixes and debugging help for the #VE sanity check. Also disable it by default, even for CONFIG_DEBUG_KERNEL, because it was found to trigger spuriously (most likely a processor erratum as the exact symptoms vary by generation). * Avoid WARN() when two NMIs arrive simultaneously during an NMI-disabled situation (GIF=0 or interrupt shadow) when the processor supports virtual NMI. While generally KVM will not request an NMI window when virtual NMIs are supported, in this case it does have to single-step over the interrupt shadow or enable the STGI intercept, in order to deliver the latched second NMI. * Drop support for hand tuning APIC timer advancement from userspace. Since we have adaptive tuning, and it has proved to work well, drop the module parameter for manual configuration and with it a few stupid bugs that it had.	2024-06-03 13:18:08 -04:00
Sean Christopherson	89a58812c4	KVM: x86: Drop support for hand tuning APIC timer advancement from userspace Remove support for specifying a static local APIC timer advancement value, and instead present a read-only boolean parameter to let userspace enable or disable KVM's dynamic APIC timer advancement. Realistically, it's all but impossible for userspace to specify an advancement that is more precise than what KVM's adaptive tuning can provide. E.g. a static value needs to be tuned for the exact hardware and kernel, and if KVM is using hrtimers, likely requires additional tuning for the exact configuration of the entire system. Dropping support for a userspace provided value also fixes several flaws in the interface. E.g. KVM interprets a negative value other than -1 as a large advancement, toggling between a negative and positive value yields unpredictable behavior as vCPUs will switch from dynamic to static advancement, changing the advancement in the middle of VM creation can result in different values for vCPUs within a VM, etc. Those flaws are mostly fixable, but there's almost no justification for taking on yet more complexity (it's minimal complexity, but still non-zero). The only arguments against using KVM's adaptive tuning is if a setup needs a higher maximum, or if the adjustments are too reactive, but those are arguments for letting userspace control the absolute max advancement and the granularity of each adjustment, e.g. similar to how KVM provides knobs for halt polling. Link: https://lore.kernel.org/all/20240520115334.852510-1-zhoushuling@huawei.com Cc: Shuling Zhou <zhoushuling@huawei.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240522010304.1650603-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-03 13:08:05 -04:00
Ravi Bangoria	b7e4be0a22	KVM: SEV-ES: Delegate LBR virtualization to the processor As documented in APM[1], LBR Virtualization must be enabled for SEV-ES guests. Although KVM currently enforces LBRV for SEV-ES guests, there are multiple issues with it: o MSR_IA32_DEBUGCTLMSR is still intercepted. Since MSR_IA32_DEBUGCTLMSR interception is used to dynamically toggle LBRV for performance reasons, this can be fatal for SEV-ES guests. For ex SEV-ES guest on Zen3: [guest ~]# wrmsr 0x1d9 0x4 KVM: entry failed, hardware error 0xffffffff EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000 Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests. No additional save/restore logic is required since MSR_IA32_DEBUGCTLMSR is of swap type A. o KVM will disable LBRV if userspace sets MSR_IA32_DEBUGCTLMSR before the VMSA is encrypted. Fix this by moving LBRV enablement code post VMSA encryption. [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653 Fixes: `376c6d2850` ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Co-developed-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-4-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-03 13:07:18 -04:00
Ravi Bangoria	d922056215	KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent As documented in APM[1], LBR Virtualization must be enabled for SEV-ES guests. So, prevent SEV-ES guests when LBRV support is missing. [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653 Fixes: `376c6d2850` ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-3-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-03 13:06:48 -04:00
Nikunj A Dadhania	27bd5fdc24	KVM: SEV-ES: Prevent MSR access post VMSA encryption KVM currently allows userspace to read/write MSRs even after the VMSA is encrypted. This can cause unintentional issues if MSR access has side- effects. For ex, while migrating a guest, userspace could attempt to migrate MSR_IA32_DEBUGCTLMSR and end up unintentionally disabling LBRV on the target. Fix this by preventing access to those MSRs which are context switched via the VMSA, once the VMSA is encrypted. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Message-ID: <20240531044644.768-2-ravi.bangoria@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-06-03 13:06:48 -04:00
Linus Torvalds	f06ce44145	Merge tag 'loongarch-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Some bootloader interface fixes, a dts fix, and a trivial cleanup" * tag 'loongarch-fixes-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Fix GMAC's phy-mode definitions in dts LoongArch: Override higher address bits in JUMP_VIRT_ADDR LoongArch: Fix entry point in kernel image header LoongArch: Add all CPUs enabled by fdt to NUMA node 0 LoongArch: Fix built-in DTB detection LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init()	2024-06-03 09:27:45 -07:00
Hagar Hemdan	b97e8a2f71	irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() its_vlpi_prop_update() calls lpi_write_config() which obtains the mapping information for a VLPI without lock held. So it could race with its_vlpi_unmap(). Since all calls from its_irq_set_vcpu_affinity() require the same lock to be held, hoist the locking there instead of sprinkling the locking all over the place. This bug was discovered using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. [ tglx: Use guard() instead of goto ] Fixes: `015ec0386a` ("irqchip/gic-v3-its: Add VLPI configuration handling") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240531162144.28650-1-hagarhem@amazon.com	2024-06-03 18:20:00 +02:00
Cong Wang	2884dc7d08	bpf: Fix a potential use-after-free in bpf_link_free() After commit `1a80dbcb2d`, bpf_link can be freed by link->ops->dealloc_deferred, but the code still tests and uses link->ops->dealloc afterward, which leads to a use-after-free as reported by syzbot. Actually, one of them should be sufficient, so just call one of them instead of both. Also add a WARN_ON() in case of any problematic implementation. Fixes: `1a80dbcb2d` ("bpf: support deferring bpf_link dealloc to after RCU grace period") Reported-by: syzbot+1989ee16d94720836244@syzkaller.appspotmail.com Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240602182703.207276-1-xiyou.wangcong@gmail.com	2024-06-03 18:16:19 +02:00
Srinivas Pandruvada	1e24c31351	cpufreq: intel_pstate: Fix unchecked HWP MSR access Fix unchecked MSR access error for processors with no HWP support. On such processors, maximum frequency can be changed by the system firmware using ACPI event ACPI_PROCESSOR_NOTIFY_HIGEST_PERF_CHANGED. This results in accessing HWP MSR 0x771. Call Trace: <TASK> generic_exec_single+0x58/0x120 smp_call_function_single+0xbf/0x110 rdmsrl_on_cpu+0x46/0x60 intel_pstate_get_hwp_cap+0x1b/0x70 intel_pstate_update_limits+0x2a/0x60 acpi_processor_notify+0xb7/0x140 acpi_ev_notify_dispatch+0x3b/0x60 HWP MSR 0x771 can be only read on a CPU which supports HWP and enabled. Hence intel_pstate_get_hwp_cap() can only be called when hwp_active is true. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Closes: https://lore.kernel.org/linux-pm/20240529155740.Hq2Hw7be@linutronix.de/ Fixes: `e8217b4bec` ("cpufreq: intel_pstate: Update the maximum CPU frequency consistently") Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-06-03 18:00:23 +02:00
David Kaplan	93c1800b37	x86/kexec: Fix bug with call depth tracking The call to cc_platform_has() triggers a fault and system crash if call depth tracking is active because the GS segment has been reset by load_segments() and GS_BASE is now 0 but call depth tracking uses per-CPU variables to operate. Call cc_platform_has() earlier in the function when GS is still valid. [ bp: Massage. ] Fixes: `5d8213864a` ("x86/retbleed: Add SKL return thunk") Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20240603083036.637-1-bp@kernel.org	2024-06-03 17:19:03 +02:00
Thorsten Blum	2317dc2c22	bpf, devmap: Remove unnecessary if check in for loop The iterator variable dst cannot be NULL and the if check can be removed. Remove it and fix the following Coccinelle/coccicheck warning reported by itnull.cocci: ERROR: iterator variable bound on line 762 cannot be NULL Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240529101900.103913-2-thorsten.blum@toblux.com	2024-06-03 17:09:23 +02:00
Palmer Dabbelt	e2c79b4c5c	Revert "riscv: mm: accelerate pagefault when badaccess" I accidentally picked up an earlier version of this patch, which had already landed via mm. The patch I picked up contains a bug, which I kept as I thought it was a fix. So let's just revert it. This reverts commit `4c6c002042`. Fixes: `4c6c002042` ("riscv: mm: accelerate pagefault when badaccess") Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240530164451.21336-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-06-03 07:41:13 -07:00
Nam Cao	994af1825a	riscv: fix overlap of allocated page and PTR_ERR On riscv32, it is possible for the last page in virtual address space (0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that shouldn't happen. There is already some code to ensure memblock won't allocate the last page. However, buddy allocator is left unchecked. Fix this by reserving physical memory that would be mapped at virtual addresses greater than 0xfffff000. Reported-by: Björn Töpel <bjorn@kernel.org> Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.to.us Fixes: `76d2a0493a` ("RISC-V: Init and Halt Code") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: <stable@vger.kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://lore.kernel.org/r/20240425115201.3044202-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-06-03 07:41:09 -07:00
Tetsuo Handa	c6144a2116	tomoyo: update project links TOMOYO project has moved to SourceForge.net . Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>	2024-06-03 22:43:11 +09:00
Gao Xiang	3d117494e2	cachefiles: remove unneeded include of <linux/fdtable.h> close_fd() has been killed, let's get rid of unneeded <linux/fdtable.h> as Al Viro pointed out [1]. [1] https://lore.kernel.org/r/20240603034055.GI1629371@ZenIV Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240603062344.818290-1-hsiangkao@linux.alibaba.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-06-03 15:39:17 +02:00
Chuck Lever	4a77c3dead	SUNRPC: Fix loop termination condition in gss_free_in_token_pages() The in_token->pages[] array is not NULL terminated. This results in the following KASAN splat: KASAN: maybe wild-memory-access in range [0x04a2013400000008-0x04a201340000000f] Fixes: `bafa6b4d95` ("SUNRPC: Fix gss_free_in_token_pages()") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-06-03 09:07:55 -04:00
Matthias Schiffer	90dd7de4ef	gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type The TQMx86 GPIO controller only supports falling and rising edge triggers, but not both. Fix this by implementing a software both-edge mode that toggles the edge type after every interrupt. Fixes: `b868db94a6` ("gpio: tqmx86: Add GPIO from for this IO controller") Co-developed-by: Gregor Herburger <gregor.herburger@tq-group.com> Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com> Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Link: https://lore.kernel.org/r/515324f0491c4d44f4ef49f170354aca002d81ef.1717063994.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-06-03 14:13:13 +02:00
Matthias Schiffer	08af509efd	gpio: tqmx86: store IRQ trigger type and unmask status separately irq_set_type() should not implicitly unmask the IRQ. All accesses to the interrupt configuration register are moved to a new helper tqmx86_gpio_irq_config(). We also introduce the new rule that accessing irq_type must happen while locked, which will become significant for fixing EDGE_BOTH handling. Fixes: `b868db94a6` ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Link: https://lore.kernel.org/r/6aa4f207f77cb58ef64ffb947e91949b0f753ccd.1717063994.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-06-03 14:13:13 +02:00
Matthias Schiffer	9d6a811b52	gpio: tqmx86: introduce shadow register for GPIO output value The TQMx86 GPIO controller uses the same register address for input and output data. Reading the register will always return current inputs rather than the previously set outputs (regardless of the current direction setting). Therefore, using a RMW pattern does not make sense when setting output values. Instead, the previously set output register value needs to be stored as a shadow register. As there is no reliable way to get the current output values from the hardware, also initialize all channels to 0, to ensure that stored and actual output values match. This should usually not have any effect in practise, as the TQMx86 UEFI sets all outputs to 0 during boot. Also prepare for extension of the driver to more than 8 GPIOs by using DECLARE_BITMAP. Fixes: `b868db94a6` ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/d0555933becd45fa92a85675d26e4d59343ddc01.1717063994.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-06-03 14:13:13 +02:00
Gregor Herburger	8c219e52ca	gpio: tqmx86: fix typo in Kconfig label Fix description for GPIO_TQMX86 from QTMX86 to TQMx86. Fixes: `b868db94a6` ("gpio: tqmx86: Add GPIO from for this IO controller") Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com> Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/e0e38c9944ad6d281d9a662a45d289b88edc808e.1717063994.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-06-03 14:13:13 +02:00
Samuel Holland	e306a894bd	irqchip/sifive-plic: Chain to parent IRQ after handlers are ready Now that the PLIC uses a platform driver, the driver is probed later in the boot process, where interrupts from peripherals might already be pending. As a result, plic_handle_irq() may be called as early as the call to irq_set_chained_handler() completes. But this call happens before the per-context handler is completely set up, so there is a window where plic_handle_irq() can see incomplete per-context state and crash. Avoid this by delaying the call to irq_set_chained_handler() until all handlers from all PLICs are initialized. Fixes: `8ec99b0331` ("irqchip/sifive-plic: Convert PLIC driver into a platform driver") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Anup Patel <anup@brainfault.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240529215458.937817-1-samuel.holland@sifive.com Closes: https://lore.kernel.org/r/CAMuHMdVYFFR7K5SbHBLY-JHhb7YpgGMS_hnRWm8H0KD-wBo+4A@mail.gmail.com/	2024-06-03 13:53:12 +02:00
Tristram Ha	6149db4997	net: phy: micrel: fix KSZ9477 PHY issues after suspend/resume When the PHY is powered up after powered down most of the registers are reset, so the PHY setup code needs to be done again. In addition the interrupt register will need to be setup again so that link status indication works again. Fixes: `26dd2974c5` ("net: phy: micrel: Move KSZ9477 errata fixes to PHY driver") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-03 12:06:29 +01:00
Sunil V L	0110c4b110	irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails When riscv_intc_init_common() fails, the firmware node allocated is not freed. Add the missing free(). Fixes: `7023b9d83f` ("irqchip/riscv-intc: Add ACPI support") Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anup Patel <anup@brainfault.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240527081113.616189-1-sunilvl@ventanamicro.com	2024-06-03 12:29:35 +02:00
Suma Hegde	77f1972bdc	platform/x86/amd/hsmp: Check HSMP support on AMD family of processors HSMP interface is supported only on few x86 processors from AMD. Accessing HSMP registers on rest of the platforms might cause unexpected behaviour. So add a check. Also unavailability of this interface on rest of the processors is not an error. Hence, use pr_info() instead of the pr_err() to log the message. Signed-off-by: Suma Hegde <suma.hegde@amd.com> Reviewed-by: Naveen Krishna Chatradhi <naveenkrishna.chatradhi@amd.com> Link: https://lore.kernel.org/r/20240603081512.142909-1-suma.hegde@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-06-03 11:57:28 +02:00
Armin Wolf	306aec7eea	platform/x86: dell-smbios: Simplify error handling When the allocation of value_name fails, the error handling code uses two gotos for error handling, which is not necessary. Simplify the error handling in this case by only using a single goto. Tested on a Dell Inspiron 3505. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240528204903.445546-2-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-06-03 11:54:29 +02:00
Armin Wolf	1981b296f8	platform/x86: dell-smbios: Fix wrong token data in sysfs When reading token data from sysfs on my Inspiron 3505, the token locations and values are wrong. This happens because match_attribute() blindly assumes that all entries in da_tokens have an associated entry in token_attrs. This however is not true as soon as da_tokens[] contains zeroed token entries. Those entries are being skipped when initialising token_attrs, breaking the core assumption of match_attribute(). Fix this by defining an extra struct for each pair of token attributes and use container_of() to retrieve token information. Tested on a Dell Inspiron 3050. Fixes: `33b9ca1e53` ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240528204903.445546-1-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-06-03 11:53:54 +02:00
Arnd Bergmann	078fc56f5c	platform/x86: yt2-1380: add CONFIG_EXTCON dependency This driver uses the extcon subsystem and fails to build when it cannot call into that subsystem: x86_64-linux-ld: vmlinux.o: in function `yt2_1380_fc_worker': lenovo-yoga-tab2-pro-1380-fastcharger.c:(.text+0xa9d819): undefined reference to `extcon_get_state' x86_64-linux-ld: lenovo-yoga-tab2-pro-1380-fastcharger.c:(.text+0xa9d853): undefined reference to `extcon_get_state' x86_64-linux-ld: vmlinux.o: in function `yt2_1380_fc_serdev_probe': lenovo-yoga-tab2-pro-1380-fastcharger.c:(.text+0xa9da22): undefined reference to `extcon_get_extcon_dev' x86_64-linux-ld: lenovo-yoga-tab2-pro-1380-fastcharger.c:(.text+0xa9dc0c): undefined reference to `devm_extcon_register_notifier_all' Add a Kconfig dependency to make it it always builds correctly. Fixes: `b2ed33e8d4` ("platform/x86: Add lenovo-yoga-tab2-pro-1380-fastcharger driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240528115940.3169455-1-arnd@kernel.org Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-06-03 11:43:15 +02:00
Andy Shevchenko	55624db051	platform/x86: touchscreen_dmi: Use 2-argument strscpy() Use 2-argument strscpy(), which is not only shorter but also provides an additional check that destination buffer is an array. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240602090244.1666360-8-andy.shevchenko@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-06-03 11:12:38 +02:00
Hans de Goede	84b26f509c	platform/x86: touchscreen_dmi: Drop "silead,max-fingers" property The silead touchscreen driver now defaults to 10 fingers, so it is no longer necessary to have a "silead,max-fingers=10" property for each silead touchscreen model. Drop this property from all the configs. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240525193854.39130-3-hdegoede@redhat.com	2024-06-03 11:10:46 +02:00
Hans de Goede	38a38f5a36	Input: silead - Always support 10 fingers When support for Silead touchscreens was orginal added some touchscreens with older firmware versions only supported 5 fingers and this was made the default requiring the setting of a "silead,max-fingers=10" uint32 device-property for all touchscreen models which do support 10 fingers. There are very few models with the old 5 finger fw, so in practice the setting of the "silead,max-fingers=10" is boilerplate which needs to be copy and pasted to every touchscreen config. Reporting that 10 fingers are supported on devices which only support 5 fingers doesn't cause any problems for userspace in practice, since at max 4 finger gestures are supported anyways. Drop the max_fingers configuration and simply always assume 10 fingers. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/20240525193854.39130-2-hdegoede@redhat.com	2024-06-03 11:10:08 +02:00
Chen Ni	629f2b4e05	drm/panel: sitronix-st7789v: Add check for of_drm_get_panel_orientation Add check for the return value of of_drm_get_panel_orientation() and return the error if it fails in order to catch the error. Fixes: `b27c0f6d20` ("drm/panel: sitronix-st7789v: add panel orientation support") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Michael Riesch <michael.riesch@wolfvision.net> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240528030832.2529471-1-nichen@iscas.ac.cn Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240528030832.2529471-1-nichen@iscas.ac.cn	2024-06-03 10:46:36 +02:00
Huacai Chen	eb36e520f4	LoongArch: Fix GMAC's phy-mode definitions in dts The GMAC of Loongson chips cannot insert the correct 1.5-2ns delay. So we need the PHY to insert internal delays for both transmit and receive data lines from/to the PHY device. Fix this by changing the "phy-mode" from "rgmii" to "rgmii-id" in dts. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Jiaxun Yang	1098efd299	LoongArch: Override higher address bits in JUMP_VIRT_ADDR In JUMP_VIRT_ADDR we are performing an or calculation on address value directly from pcaddi. This will only work if we are currently running from direct 1:1 mapping addresses or firmware's DMW is configured exactly same as kernel. Still, we should not rely on such assumption. Fix by overriding higher bits in address comes from pcaddi, so we can get rid of or operator. Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Jiaxun Yang	beb2800074	LoongArch: Fix entry point in kernel image header Currently kernel entry in head.S is in DMW address range, firmware is instructed to jump to this address after loading the kernel image. However kernel should not make any assumption on firmware's DMW setting, thus the entry point should be a physical address falls into direct translation region. Fix by converting entry address to physical and amend entry calculation logic in libstub accordingly. BTW, use ABSOLUTE() to calculate variables to make Clang/LLVM happy. Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Jiaxun Yang	3de9c42d02	LoongArch: Add all CPUs enabled by fdt to NUMA node 0 NUMA enabled kernel on FDT based machine fails to boot because CPUs are all in NUMA_NO_NODE and mm subsystem won't accept that. Fix by adding them to default NUMA node at FDT parsing phase and move numa_add_cpu(0) to a later point. Cc: stable@vger.kernel.org Fixes: `88d4d957ed` ("LoongArch: Add FDT booting support from efi system table") Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Jiaxun Yang	b56f67a6c7	LoongArch: Fix built-in DTB detection fdt_check_header(__dtb_start) will always success because kernel provides a dummy dtb, and by coincidence __dtb_start clashed with entry of this dummy dtb. The consequence is fdt passed from firmware will never be taken. Fix by trying to utilise __dtb_start only when CONFIG_BUILTIN_DTB is enabled. Cc: stable@vger.kernel.org Fixes: `7b937cc243` ("of: Create of_root if no dtb provided by firmware") Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Tiezhu Yang	6c3ca6654a	LoongArch: Remove CONFIG_ACPI_TABLE_UPGRADE in platform_init() Both acpi_table_upgrade() and acpi_boot_table_init() are defined as empty functions under !CONFIG_ACPI_TABLE_UPGRADE and !CONFIG_ACPI in include/linux/acpi.h, there are no implicit declaration errors with various configs. #ifdef CONFIG_ACPI_TABLE_UPGRADE void acpi_table_upgrade(void); #else static inline void acpi_table_upgrade(void) { } #endif #ifdef CONFIG_ACPI ... void acpi_boot_table_init (void); ... #else /* !CONFIG_ACPI / ... static inline void acpi_boot_table_init(void) { } ... #endif / !CONFIG_ACPI */ As Huacai suggested, CONFIG_ACPI_TABLE_UPGRADE is ugly and not necessary here, just remove it. At the same time, just keep CONFIG_ACPI to prevent potential build errors in future, and give a signal to indicate the code is ACPI-specific. For the same reason, we also put acpi_table_upgrade() under CONFIG_ACPI. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-06-03 15:45:53 +08:00
Wolfram Sang	c4aff1d1ec	Merge tag 'i2c-host-6.10-pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current Removed the SPD class of i2c devices from the device core. Additionally, a cleanup in the Synquacer code removes the pclk from the global structure, as it is used only in the probe. Therefore, it is now declared locally.	2024-06-03 08:51:53 +02:00
Linus Torvalds	c3f38fa61a	Linux 6.10-rc2	2024-06-02 15:44:56 -07:00
Linus Torvalds	58d89ee81a	Merge tag 'ata-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Niklas Cassel: - Add a quirk for three different devices that have shown issues with LPM (link power management). These devices appear to not implement LPM properly, since we see command timeouts when enabling LPM. The quirk disables LPM for these problematic devices. (Me) - Do not apply the Intel PCS quirk on Alder Lake. The quirk is not needed and was originally added by mistake when LPM support was enabled for this AHCI controller. Enabling the quirk when not needed causes the the controller to not be able to detect the connected devices on some platforms. * tag 'ata-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-core: Add ATA_HORKAGE_NOLPM for Apacer AS340 ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD ata: libata-core: Add ATA_HORKAGE_NOLPM for Crucial CT240BX500SSD1 ata: ahci: Do not apply Intel PCS quirk on Intel Alder Lake	2024-06-02 13:30:53 -07:00
Linus Torvalds	a693b9c95a	Merge tag 'x86-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Miscellaneous topology parsing fixes: - Fix topology parsing regression on older CPUs in the new AMD/Hygon parser - Fix boot crash on odd Intel Quark and similar CPUs that do not fill out cpuinfo_x86::x86_clflush_size and zero out cpuinfo_x86::x86_cache_alignment as a result. Provide 32 bytes as a general fallback value. - Fix topology enumeration on certain rare CPUs where the BIOS locks certain CPUID leaves and the kernel unlocked them late, which broke with the new topology parsing code. Factor out this unlocking logic and move it earlier in the parsing sequence" * tag 'x86-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/intel: Unlock CPUID before evaluating anything x86/cpu: Provide default cache line size if not enumerated x86/topology/amd: Evaluate SMT in CPUID leaf 0x8000001e only on family 0x17 and greater	2024-06-02 09:32:34 -07:00
Linus Torvalds	3fca58ffad	Merge tag 'sched-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Export a symbol to make life easier for instrumentation/debugging" * tag 'sched-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/x86: Export 'percpu arch_freq_scale'	2024-06-02 09:23:35 -07:00
Linus Torvalds	efa8f11a7e	Merge tag 'perf-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events fix from Ingo Molnar: "Add missing MODULE_DESCRIPTION() lines" * tag 'perf-urgent-2024-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add missing MODULE_DESCRIPTION() lines perf/x86/rapl: Add missing MODULE_DESCRIPTION() line	2024-06-02 09:20:37 -07:00
Linus Torvalds	00a8c352dd	Merge tag 'hardening-v6.10-rc2-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - scsi: mpt3sas: Avoid possible run-time warning with long manufacturer strings - mailmap: update entry for Kees Cook - kunit/fortify: Remove __kmalloc_node() test * tag 'hardening-v6.10-rc2-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kunit/fortify: Remove __kmalloc_node() test mailmap: update entry for Kees Cook scsi: mpt3sas: Avoid possible run-time warning with long manufacturer strings	2024-06-02 09:15:28 -07:00
Jean-Baptiste Maneyrol	245f3b149e	iio: imu: inv_icm42600: delete unneeded update watermark call Update watermark will be done inside the hwfifo_set_watermark callback just after the update_scan_mode. It is useless to do it here. Fixes: `7f85e42a6c` ("iio: imu: inv_icm42600: add buffer support in iio devices") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240527210008.612932-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-02 13:16:02 +01:00
Jean-Baptiste Maneyrol	d7bd473632	iio: imu: inv_icm42600: stabilized timestamp in interrupt Use IRQF_ONESHOT flag to ensure the timestamp is not updated in the hard handler during the thread handler. And compute and use the effective watermark value that correspond to this first timestamp. This way we can ensure the timestamp is always corresponding to the value used by the timestamping mechanism. Otherwise, it is possible that between FIFO count read and FIFO processing the timestamp is overwritten in the hard handler. Fixes: `ec74ae9fd3` ("iio: imu: inv_icm42600: add accurate timestamping") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240529154717.651863-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-06-02 11:56:40 +01:00
Linus Torvalds	83814698cf	Merge tag 'powerpc-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Enforce full ordering for ATOMIC operations with BPF_FETCH - Fix uaccess build errors seen with GCC 13/14 - Fix build errors on ppc32 due to ARCH_HAS_KERNEL_FPU_SUPPORT - Drop error message from lparcfg guest name lookup Thanks to Christophe Leroy, Guenter Roeck, Nathan Lynch, Naveen N Rao, Puranjay Mohan, and Samuel Holland. * tag 'powerpc-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Limit ARCH_HAS_KERNEL_FPU_SUPPORT to PPC64 powerpc/uaccess: Use YZ asm constraint for ld powerpc/uaccess: Fix build errors seen with GCC 13/14 powerpc/pseries/lparcfg: drop error message from guest name lookup powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH	2024-06-01 17:34:35 -07:00
Linus Torvalds	54bec8ed57	Merge tag 'firewire-fixes-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Takashi Sakamoto: "After merging a commit `1fffe7a34c` ("script: modpost: emit a warning when the description is missing"), MODULE_DESCRIPTOR seems to be mandatory for kernel modules. In FireWire subsystem, the most of practical kernel modules have the field, while KUnit test modules do not. A single patch is applied to fix them" * tag 'firewire-fixes-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: add missing MODULE_DESCRIPTION() to test modules	2024-06-01 17:05:00 -07:00
Dmitry Safonov	33700a0c9b	net/tcp: Don't consider TCP_CLOSE in TCP_AO_ESTABLISHED TCP_CLOSE may or may not have current/rnext keys and should not be considered "established". The fast-path for TCP_CLOSE is SKB_DROP_REASON_TCP_CLOSE. This is what tcp_rcv_state_process() does anyways. Add an early drop path to not spend any time verifying segment signatures for sockets in TCP_CLOSE state. Cc: stable@vger.kernel.org # v6.7 Fixes: `0a3a809089` ("net/tcp: Verify inbound TCP-AO signed segments") Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/20240529-tcp_ao-sk_state-v1-1-d69b5d323c52@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 16:27:26 -07:00
DelphineCCChiu	e85e271dec	net/ncsi: Fix the multi thread manner of NCSI driver Currently NCSI driver will send several NCSI commands back to back without waiting the response of previous NCSI command or timeout in some state when NIC have multi channel. This operation against the single thread manner defined by NCSI SPEC(section 6.3.2.3 in DSP0222_1.1.1) According to NCSI SPEC(section 6.2.13.1 in DSP0222_1.1.1), we should probe one channel at a time by sending NCSI commands (Clear initial state, Get version ID, Get capabilities...), than repeat this steps until the max number of channels which we got from NCSI command (Get capabilities) has been probed. Fixes: `e6f44ed6d0` ("net/ncsi: Package and channel management") Signed-off-by: DelphineCCChiu <delphine_cc_chiu@wiwynn.com> Link: https://lore.kernel.org/r/20240529065856.825241-1-delphine_cc_chiu@wiwynn.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 16:21:44 -07:00
Jason Xing	8105378c0c	net: rps: fix error when CONFIG_RFS_ACCEL is off John Sperbeck reported that if we turn off CONFIG_RFS_ACCEL, the 'head' is not defined, which will trigger compile error. So I move the 'head' out of the CONFIG_RFS_ACCEL scope. Fixes: `84b6823cd9` ("net: rps: protect last_qtail with rps_input_queue_tail_save() helper") Reported-by: John Sperbeck <jsperbeck@google.com> Closes: https://lore.kernel.org/all/20240529203421.2432481-1-jsperbeck@google.com/ Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240530032717.57787-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 16:02:08 -07:00
Duoming Zhou	166fcf86cd	ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() The object "ax25_dev" is managed by reference counting. Thus it should not be directly released by kfree(), replace with ax25_dev_put(). Fixes: `d01ffb9eee` ("ax25: add refcount in ax25_dev to avoid UAF bugs") Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20240530051733.11416-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:49:42 -07:00
Lars Kellogg-Stedman	3c34fb0bd4	ax25: Fix refcount imbalance on inbound connections When releasing a socket in ax25_release(), we call netdev_put() to decrease the refcount on the associated ax.25 device. However, the execution path for accepting an incoming connection never calls netdev_hold(). This imbalance leads to refcount errors, and ultimately to kernel crashes. A typical call trace for the above situation will start with one of the following errors: refcount_t: decrement hit 0; leaking memory. refcount_t: underflow; use-after-free. And will then have a trace like: Call Trace: <TASK> ? show_regs+0x64/0x70 ? __warn+0x83/0x120 ? refcount_warn_saturate+0xb2/0x100 ? report_bug+0x158/0x190 ? prb_read_valid+0x20/0x30 ? handle_bug+0x3e/0x70 ? exc_invalid_op+0x1c/0x70 ? asm_exc_invalid_op+0x1f/0x30 ? refcount_warn_saturate+0xb2/0x100 ? refcount_warn_saturate+0xb2/0x100 ax25_release+0x2ad/0x360 __sock_release+0x35/0xa0 sock_close+0x19/0x20 [...] On reboot (or any attempt to remove the interface), the kernel gets stuck in an infinite loop: unregister_netdevice: waiting for ax0 to become free. Usage count = 0 This patch corrects these issues by ensuring that we call netdev_hold() and ax25_dev_hold() for new connections in ax25_accept(). This makes the logic leading to ax25_accept() match the logic for ax25_bind(): in both cases we increment the refcount, which is ultimately decremented in ax25_release(). Fixes: `9fd75b66b8` ("ax25: Fix refcount leaks caused by ax25_cb_del()") Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com> Tested-by: Duoming Zhou <duoming@zju.edu.cn> Tested-by: Dan Cross <crossd@gmail.com> Tested-by: Chris Maness <christopher.maness@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20240529210242.3346844-2-lars@oddbit.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:49:26 -07:00
Jakub Kicinski	45c0a209dc	Merge branch 'virtio_net-fix-lock-warning-and-unrecoverable-state' Heng Qi says: ==================== virtio_net: fix lock warning and unrecoverable state Patch 1 describes and fixes an issue where dim cannot return to normal state in certain scenarios. Patch 2 attempts to resolve lockdep's complaints that holding many nested locks. ==================== Link: https://lore.kernel.org/r/20240528134116.117426-1-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:14:13 -07:00
Heng Qi	d1f0bd01bc	virtio_net: fix a spurious deadlock issue When the following snippet is run, lockdep will report a deadlock[1]. /* Acquire all queues dim_locks / for (i = 0; i < vi->max_queue_pairs; i++) mutex_lock(&vi->rq[i].dim_lock); There's no deadlock here because the vq locks are always taken in the same order, but lockdep can not figure it out. So refactoring the code to alleviate the problem. [1] ======================================================== WARNING: possible recursive locking detected 6.9.0-rc7+ #319 Not tainted -------------------------------------------- ethtool/962 is trying to acquire lock: but task is already holding lock: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&vi->rq[i].dim_lock); lock(&vi->rq[i].dim_lock); DEADLOCK * May be due to missing lock nesting notation 3 locks held by ethtool/962: #0: ffffffff82dbaab0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 #1: ffffffff82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: ethnl_default_set_doit+0xbe/0x1e0 stack backtrace: CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x79/0xb0 check_deadlock+0x130/0x220 __lock_acquire+0x861/0x990 lock_acquire.part.0+0x72/0x1d0 ? lock_acquire+0xf8/0x130 __mutex_lock+0x71/0xd50 virtnet_set_coalesce+0x151/0x190 __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 ethnl_set_coalesce+0x34/0x90 ethnl_default_set_doit+0xdd/0x1e0 genl_family_rcv_msg_doit+0xdc/0x130 genl_family_rcv_msg+0x154/0x230 ? __pfx_ethnl_default_set_doit+0x10/0x10 genl_rcv_msg+0x4b/0xa0 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x5a/0x110 genl_rcv+0x28/0x40 netlink_unicast+0x1af/0x280 netlink_sendmsg+0x20e/0x460 __sys_sendto+0x1fe/0x210 ? find_held_lock+0x2b/0x80 ? do_user_addr_fault+0x3a2/0x8a0 ? __lock_release+0x5e/0x160 ? do_user_addr_fault+0x3a2/0x8a0 ? lock_release+0x72/0x140 ? do_user_addr_fault+0x3a7/0x8a0 __x64_sys_sendto+0x29/0x30 do_syscall_64+0x78/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: `4d4ac2ecec` ("virtio_net: Add a lock for per queue RX coalesce") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20240528134116.117426-3-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:14:06 -07:00
Heng Qi	9e0945b190	virtio_net: fix possible dim status unrecoverable When the dim worker is scheduled, if it no longer needs to issue commands, dim may not be able to return to the working state later. For example, the following single queue scenario: 1. The dim worker of rxq0 is scheduled, and the dim status is changed to DIM_APPLY_NEW_PROFILE; 2. dim is disabled or parameters have not been modified; 3. virtnet_rx_dim_work exits directly; Then, even if net_dim is invoked again, it cannot work because the state is not restored to DIM_START_MEASURE. Fixes: `6208799553` ("virtio-net: support rx netdim") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240528134116.117426-2-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:14:06 -07:00
Vadim Fedorenko	89e281ebff	ethtool: init tsinfo stats if requested Statistic values should be set to ETHTOOL_STAT_NOT_SET even if the device doesn't support statistics. Otherwise zeros will be returned as if they are proper values: host# ethtool -I -T lo Time stamping parameters for lo: Capabilities: software-transmit software-receive software-system-clock PTP Hardware Clock: none Hardware Transmit Timestamp Modes: none Hardware Receive Filter Modes: none Statistics: tx_pkts: 0 tx_lost: 0 tx_err: 0 Fixes: `0e9c127729` ("ethtool: add interface to read Tx hardware timestamping statistics") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Link: https://lore.kernel.org/r/20240530040814.1014446-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:11:09 -07:00
Peter Geis	7679935b8b	MAINTAINERS: remove Peter Geis The Motorcomm PHY driver is now maintained by the OEM. The driver has expanded far beyond my original purpose, and I do not have the hardware to test against the new portions of it. Therefore I am removing myself as a maintainer of the driver. Signed-off-by: Peter Geis <pgwipeout@gmail.com> Link: https://lore.kernel.org/r/20240529185635.538072-1-pgwipeout@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:10:35 -07:00
Heng Qi	30636258a7	virtio_net: fix missing lock protection on control_buf access Refactored the handling of control_buf to be within the cvq_lock critical section, mitigating race conditions between reading device responses and new command submissions. Fixes: `6f45ab3e04` ("virtio_net: Add a lock for the command VQ.") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240530034143.19579-1-hengqi@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 15:09:56 -07:00
Linus Torvalds	89be4025b0	Merge tag '6.10-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: "Two small smb3 fixes: - Fix socket creation with sfu mount option (spotted by test generic/423) - Minor cleanup: fix missing description in two files" * tag '6.10-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix creating sockets when using sfu mount options fs: smb: common: add missing MODULE_DESCRIPTION() macros	2024-06-01 14:35:57 -07:00
Jens Axboe	5fc16fa5f1	io_uring: check for non-NULL file pointer in io_file_can_poll() In earlier kernels, it was possible to trigger a NULL pointer dereference off the forced async preparation path, if no file had been assigned. The trace leading to that looks as follows: BUG: kernel NULL pointer dereference, address: 00000000000000b0 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022 RIP: 0010:io_buffer_select+0xc3/0x210 Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246 RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700 RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020 R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8 R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000 FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x14d/0x420 ? do_user_addr_fault+0x61/0x6a0 ? exc_page_fault+0x6c/0x150 ? asm_exc_page_fault+0x22/0x30 ? io_buffer_select+0xc3/0x210 __io_import_iovec+0xb5/0x120 io_readv_prep_async+0x36/0x70 io_queue_sqe_fallback+0x20/0x260 io_submit_sqes+0x314/0x630 __do_sys_io_uring_enter+0x339/0xbc0 ? __do_sys_io_uring_register+0x11b/0xc50 ? vm_mmap_pgoff+0xce/0x160 do_syscall_64+0x5f/0x180 entry_SYSCALL_64_after_hwframe+0x46/0x4e RIP: 0033:0x55e0a110a67e Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6 because the request is marked forced ASYNC and has a bad file fd, and hence takes the forced async prep path. Current kernels with the request async prep cleaned up can no longer hit this issue, but for ease of backporting, let's add this safety check in here too as it really doesn't hurt. For both cases, this will inevitably end with a CQE posted with -EBADF. Cc: stable@vger.kernel.org Fixes: `a76c0b31ee` ("io_uring: commit non-pollable provided mapped buffers upfront") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-06-01 12:25:35 -06:00
Linus Torvalds	ec9eeb89e6	Merge tag 'kbuild-fixes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix a Kconfig bug regarding comparisons to 'm' or 'n' - Replace missed $(srctree)/$(src) - Fix unneeded kallsyms step 3 - Remove incorrect "compatible" properties from image nodes in image.fit - Improve gen_kheaders.sh - Fix 'make dt_binding_check' - Clean up unnecessary code * tag 'kbuild-fixes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: dt-bindings: kbuild: Fix dt_binding_check on unconfigured build kheaders: use `command -v` to test for existence of `cpio` kheaders: explicitly define file modes for archived headers scripts/make_fit: Drop fdt image entry compatible string kbuild: remove a stale comment about cleaning in link-vmlinux.sh kbuild: fix short log for AS in link-vmlinux.sh kbuild: change scripts/mksysmap into sed script kbuild: avoid unneeded kallsyms step 3 kbuild: scripts/gdb: Replace missed $(srctree)/$(src) w/ $(src) kconfig: remove redundant check in expr_join_or() kconfig: fix comparison to constant symbols, 'm', 'n' kconfig: remove unused expr_is_no()	2024-06-01 09:33:55 -07:00
Linus Torvalds	bbeb1219ee	Merge tag 'xfs-6.10-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Chandan Babu: - Fix a livelock by dropping an xfarray sortinfo folio when an error is encountered - During extended attribute operations, Initialize transaction reservation computation based on attribute operation code - Relax symbolic link's ondisk verification code to allow symbolic links with short remote targets - Prevent soft lockups when unmapping file ranges and also during remapping blocks during a reflink operation - Fix compilation warnings when XFS is built with W=1 option * tag 'xfs-6.10-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Add cond_resched to block unmap range and reflink remap path xfs: don't open-code u64_to_user_ptr xfs: allow symlinks with short remote targets xfs: fix xfs_init_attr_trans not handling explicit operation codes xfs: drop xfarray sortinfo folio on error xfs: Stop using __maybe_unused in xfs_alloc.c xfs: Clear W=1 warning in xfs_iwalk_run_callbacks()	2024-06-01 08:59:04 -07:00
Linus Torvalds	f26ee67a0f	Merge tag 'tty-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty fix from Greg KH: "Here is a single revert for a much-reported regression in 6.10-rc1 when it comes to a few older architectures. Turns out that the VT ioctls don't work the same across all cpu types because of some old compatibility requrements for stuff like alpha and powerpc. So revert the change that attempted to have them use the _IO() macros and go back to the known-working values instead. This has NOT been in linux-next but has had many reports that it fixes the issue with 6.10-rc1" * tag 'tty-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "VT: Use macros to define ioctls"	2024-06-01 08:53:39 -07:00
Linus Torvalds	d9aab0b1c9	Merge tag 'landlock-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fix from Mickaël Salaün: "This fixes a wrong path walk triggered by syzkaller" * tag 'landlock-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add layout1.refer_mount_root landlock: Fix d_parent walk	2024-06-01 08:28:24 -07:00
Bitterblue Smith	819bda58e7	wifi: rtlwifi: Ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS Since commit `0a44dfc070` ("wifi: mac80211: simplify non-chanctx drivers") ieee80211_hw_config() is no longer called with changed = ~0. rtlwifi relied on ~0 in order to ignore the default retry limits of 4/7, preferring 48/48 in station mode and 7/7 in AP/IBSS. RTL8192DU has a lot of packet loss with the default limits from mac80211. Fix it by ignoring IEEE80211_CONF_CHANGE_RETRY_LIMITS completely, because it's the simplest solution. Link: https://lore.kernel.org/linux-wireless/cedd13d7691f4692b2a2fa5a24d44a22@realtek.com/ Cc: stable@vger.kernel.org # 6.9.x Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/1fabb8e4-adf3-47ae-8462-8aea963bc2a5@gmail.com	2024-06-01 13:15:26 +03:00
Johannes Berg	40cecacabc	wifi: mt76: mt7615: add missing chanctx ops Here's another one I missed during the initial conversion, fix that. Cc: stable@vger.kernel.org Reported-by: Rene Petersen <renepetersen@posteo.de> Fixes: `0a44dfc070` ("wifi: mac80211: simplify non-chanctx drivers") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218895 Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528142308.3f7db1821e68.I531135d7ad76331a50244d6d5288e14aa9668390@changeid	2024-06-01 13:00:59 +03:00
Alexis Lothoré	596c195680	wifi: wilc1000: document SRCU usage instead of SRCU Commit `f236464f1d` ("wifi: wilc1000: convert list management to RCU") attempted to convert SRCU to RCU usage, assuming it was not really needed. The runtime issues that arose after merging it showed that there are code paths involving sleeping functions, and removing those would need some heavier driver rework. Add some documentation about SRCU need to make sure that any future developer do not miss some use cases if tempted to convert back again to RCU. Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-3-bce096e0798c@bootlin.com	2024-06-01 12:59:30 +03:00
Alexis Lothoré	3596717a6f	Revert "wifi: wilc1000: set atomic flag on kmemdup in srcu critical section" This reverts commit `35aee01ff4` Commit `35aee01ff4` ("wifi: wilc1000: set atomic flag on kmemdup in srcu critical section") was preparatory to the SRCU to RCU conversion done by commit `f236464f1d` ("wifi: wilc1000: convert list management to RCU"). This conversion brought issues and so has been reverted, so the atomic flag is not needed anymore. Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-2-bce096e0798c@bootlin.com	2024-06-01 12:59:30 +03:00
Alexis Lothoré	ebfb5e8fc8	Revert "wifi: wilc1000: convert list management to RCU" This reverts commit `f236464f1d` Commit `f236464f1d` ("wifi: wilc1000: convert list management to RCU") replaced SRCU with RCU, aiming to simplify RCU usage in the driver. No documentation or commit history hinted about why SRCU has been preferred in original design, so it has been assumed to be safe to do this conversion. Unfortunately, some static analyzers raised warnings, confirmed by runtime checker, not long after the merge. At least three different issues arose when switching to RCU: - wilc_wlan_txq_filter_dup_tcp_ack is executed in a RCU read critical section yet calls wait_for_completion_timeout - wilc_wfi_init_mon_interface calls kmalloc and register_netdevice while manipulating a vif retrieved from vif list - set_channel sends command to chip (and so, also waits for a completion) while holding a vif retrieved from vif list (so, in RCU read critical section) Some of those issues are not trivial to fix and would need bigger driver rework. Fix those issues by reverting the SRCU to RCU conversion commit Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-wireless/3b46ec7c-baee-49fd-b760-3bc12fb12eaf@moroto.mountain/ Fixes: `f236464f1d` ("wifi: wilc1000: convert list management to RCU") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://msgid.link/20240528-wilc_revert_srcu_to_rcu-v1-1-bce096e0798c@bootlin.com	2024-06-01 12:59:29 +03:00
Kalle Valo	10bc8558b5	Merge tag 'ath-current-20240531' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git fixes for 6.10. Two fixes for user reported regressions in ath11k. One dependency fix and one error path fix.	2024-06-01 12:57:28 +03:00
Greg Kroah-Hartman	7bc4244c88	Revert "VT: Use macros to define ioctls" This reverts commit `8c467f3300`. Turns out this breaks many architectures as the vt ioctls do not all match up everywhere due to historical reasons, so the original commit is invalid for many values. Reported-by: Nick Bowler <nbowler@draconx.ca> Reported-by: Arnd Bergmann <arnd@kernel.org> Reported-by: Jiri Slaby <jirislaby@kernel.org> Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Reported-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Gladkov <legion@kernel.org> Link: https://lore.kernel.org/r/ad4e561c-1d49-4f25-882c-7a36c6b1b5c0@draconx.ca Link: https://lore.kernel.org/r/0da9785e-ba44-4718-9d08-4e96c1ba7ab2@kernel.org Link: https://lore.kernel.org/all/34d848f4-670b-4493-bf21-130ef862521b@xenosoft.de/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-01 07:28:21 +02:00
Linus Torvalds	cc8ed4d0a8	Merge tag 'drm-fixes-2024-06-01' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "This is the weekly fixes. Lots of small fixes across the board, one BUG_ON fix in shmem seems most important, otherwise amdgpu, i915, xe mostly with small fixes to all the other drivers. shmem: - fix BUG_ON in COW handling - warn when trying to pin imported objects buddy: - fix page size handling dma-buf: - sw-sync: Don't interfere with IRQ handling - fix kthreads-handling error path i915: - fix a race in audio component by registering it later - make DPT object unshrinkable to avoid shrinking when framebuffer has not shrunk - fix CCS id calculation to fix a perf regression - fix selftest caching mode - fix FIELD_PREP compiler warnings - fix indefinite wait for GT wakeref release - revert overeager multi-gt pm reference removal xe: - pcode polling timeout change - fix for deadlocks for faulting VMs - error-path lock imbalance fix amdgpu: - RAS fix - fix colorspace property for MST connectors - fix for PCIe DPM - silence UBSAN warning - GPUVM robustness fix - partition fix - drop deprecated I2C_CLASS_SPD amdkfd: - revert unused changes for certain 11.0.3 devices - simplify APU VRAM handling lima: - fix dma_resv-related deadlock in object pin msm: - remove build-time dependency on Python 3.9 nouveau: - nvif: Fix possible integer overflow panel: - lg-sw43408: Select DP helpers; Declare backlight ops as static - sitronix-st7789v: Various fixes for jt240mhqs_hwt_ek_e3 panel panfrost: - fix dma_resv-related deadlock in object pin" * tag 'drm-fixes-2024-06-01' of https://gitlab.freedesktop.org/drm/kernel: (35 commits) drm/msm: remove python 3.9 dependency for compiling msm drm/panel: sitronix-st7789v: fix display size for jt240mhqs_hwt_ek_e3 panel drm/panel: sitronix-st7789v: tweak timing for jt240mhqs_hwt_ek_e3 panel drm/panel: sitronix-st7789v: fix timing for jt240mhqs_hwt_ek_e3 panel drm/amd/pm: remove deprecated I2C_CLASS_SPD support from newly added SMU_14_0_2 drm/amdgpu: Make CPX mode auto default in NPS4 drm/amdkfd: simplify APU VRAM handling Revert "drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices" drm/amdgpu: fix dereference null return value for the function amdgpu_vm_pt_parent drm/amdgpu: silence UBSAN warning drm/amdgpu: Adjust logic in amdgpu_device_partner_bandwidth() drm/i915: Fix audio component initialization drm/i915/dpt: Make DPT object unshrinkable drm/i915/gt: Fix CCS id's calculation for CCS mode setting drm/panel/lg-sw43408: mark sw43408_backlight_ops as static drm/i915/selftests: Set always_coherent to false when reading from CPU drm/panel/lg-sw43408: select CONFIG_DRM_DISPLAY_DP_HELPER drm/i915/guc: avoid FIELD_PREP warning drm/i915/gt: Disarm breadcrumbs if engines are already idle Revert "drm/i915: Remove extra multi-gt pm-references" ...	2024-05-31 16:26:48 -07:00
Linus Torvalds	1b907b83ae	Merge tag 'hwmon-for-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - sttcs: Fix property spelling - intel-m10-bmc-hwmon: Fix multiplier for N6000 board power sensor - ltc2992: Fix memory leak - dell-smm: Add Dell G15 5511 to fan control whitelist * tag 'hwmon-for-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (shtc1) Fix property misspelling hwmon: (intel-m10-bmc-hwmon) Fix multiplier for N6000 board power sensor hwmon: (ltc2992) Fix memory leak in ltc2992_parse_dt() hwmon: (dell-smm) Add Dell G15 5511 to fan control whitelist	2024-05-31 16:24:11 -07:00
Linus Torvalds	b7087cb35a	Merge tag 'mailbox-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox Pull mailbox fix from Jassi Brar: - zynqmp-ipi: fix linker error on some configurations * tag 'mailbox-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox: mailbox: zynqmp-ipi: drop irq_to_desc() call	2024-05-31 16:21:00 -07:00
Linus Torvalds	d5931dd0de	Merge tag 'spi-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A series of fixes that came in since the merge window, the main thing being the fixes Andy did for DMA sync where we were calling into the DMA API in suprising ways and causing issues as a result, the main thing being confusing the IOMMU code. We've also got some fairly important fixes for the stm32 driver, it supports a wide range of hardware and some optimisations that were done recently have broken on some systems, and a fix to prevent glitched signals on the bus in the cadence driver" * tag 'spi-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: stm32: Don't warn about spurious interrupts spi: Assign dummy scatterlist to unidirectional transfers spi: cadence: Ensure data lines set to low during dummy-cycle period spi: stm32: Revert change that enabled controller before asserting CS spi: Check if transfer is mapped before calling DMA sync APIs spi: Don't mark message DMA mapped when no transfer in it is	2024-05-31 16:17:40 -07:00
Linus Torvalds	28add42dc2	Merge tag 'regulator-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One fix that came in since -rc1, fixing misuse of a local variable in the DT parsing code in the RTQ2208 driver" * tag 'regulator-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: rtq2208: Fix invalid memory access when devm_of_regulator_put_matches is called	2024-05-31 16:12:54 -07:00
Linus Torvalds	b7c05622da	Merge tag 'regmap-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "The I2C bus was not taking account of the register and any padding bytes when handling maximum write sizes supported by an I2C adaptor, this patch from Jim Wylder fixes that" * tag 'regmap-fix-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap-i2c: Subtract reg size from max_write	2024-05-31 16:09:27 -07:00
Linus Torvalds	0f9a75179d	Merge tag 'block-6.10-20240530' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - NVMe fixes via Keith: - Removing unused fields (Kanchan) - Large folio offsets support (Kundan) - Multipath NUMA node initialiazation fix (Nilay) - Multipath IO stats accounting fixes (Keith) - Circular lockdep fix (Keith) - Target race condition fix (Sagi) - Target memory leak fix (Sagi) - bcache fixes - null_blk fixes (Damien) - Fix regression in io.max due to throttle low removal (Waiman) - DM limit table fixes (Christoph) - SCSI and block limit fixes (Christoph) - zone fixes (Damien) - Misc fixes (Christoph, Hannes, hexue) * tag 'block-6.10-20240530' of git://git.kernel.dk/linux: (25 commits) blk-throttle: Fix incorrect display of io.max block: Fix zone write plugging handling of devices with a runt zone block: Fix validation of zoned device with a runt zone null_blk: Do not allow runt zone with zone capacity smaller then zone size nvmet: fix a possible leak when destroy a ctrl during qp establishment nvme: use srcu for iterating namespace list bcache: code cleanup in __bch_bucket_alloc_set() bcache: call force_wake_up_gc() if necessary in check_should_bypass() bcache: allow allocator to invalidate bucket in gc block: check for max_hw_sectors underflow block: stack max_user_sectors sd: also set max_user_sectors when setting max_sectors null_blk: Print correct max open zones limit in null_init_zoned_dev() block: delete redundant function declaration null_blk: Fix return value of nullb_device_power_store() dm: make dm_set_zones_restrictions work on the queue limits dm: remove dm_check_zoned dm: move setting zoned_enabled to dm_table_set_restrictions block: remove blk_queue_max_integrity_segments nvme: adjust multiples of NVME_CTRL_PAGE_SIZE in offset ...	2024-05-31 15:31:27 -07:00
Linus Torvalds	6d541d6672	Merge tag 'io_uring-6.10-20240530' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: "A couple of minor fixes for issues introduced in the 6.10 merge window: - Ensure that all read/write ops have an appropriate cleanup handler set (Breno) - Regression for applications still doing multiple mmaps even if FEAT_SINGLE_MMAP is set (me) - Move kmsg inquiry setting above any potential failure point, avoiding a spurious NONEMPTY flag setting on early error (me)" * tag 'io_uring-6.10-20240530' of git://git.kernel.dk/linux: io_uring/net: assign kmsg inq/flags before buffer selection io_uring/rw: Free iovec before cleaning async data io_uring: don't attempt to mmap larger than what the user asks for	2024-05-31 15:22:58 -07:00
Andrii Nakryiko	7d0b3953f6	libbpf: don't close(-1) in multi-uprobe feature detector Guard close(link_fd) with extra link_fd >= 0 check to prevent close(-1). Detected by Coverity static analysis. Fixes: `04d939a2ab` ("libbpf: detect broken PID filtering logic for multi-uprobe") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240529231212.768828-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-31 14:56:51 -07:00
Jiri Olsa	aeb8fe0283	bpf: Fix bpf_session_cookie BTF_ID in special_kfunc_set list The bpf_session_cookie is unavailable for !CONFIG_FPROBE as reported by Sebastian [1]. To fix that we remove CONFIG_FPROBE ifdef for session kfuncs, which is fine, because there's filter for session programs. Then based on bpf_trace.o dependency: obj-$(CONFIG_BPF_EVENTS) += bpf_trace.o we add bpf_session_cookie BTF_ID in special_kfunc_set list dependency on CONFIG_BPF_EVENTS. [1] https://lore.kernel.org/bpf/20240531071557.MvfIqkn7@linutronix.de/T/#m71c6d5ec71db2967288cb79acedc15cc5dbfeec5 Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Suggested-by: Alexei Starovoitov <ast@kernel.org> Fixes: `5c919acef8` ("bpf: Add support for kprobe session cookie") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240531194500.2967187-1-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-31 14:54:48 -07:00
Andrii Nakryiko	62da3acd28	selftests/bpf: fix inet_csk_accept prototype in test_sk_storage_tracing.c Recent kernel change ([0]) changed inet_csk_accept() prototype. Adapt progs/test_sk_storage_tracing.c to take that into account. [0] `92ef0fd55a` ("net: change proto and proto_ops accept type") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240528223218.3445297-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-31 14:54:25 -07:00
Alex Williamson	aac6db75a9	vfio/pci: Use unmap_mapping_range() With the vfio device fd tied to the address space of the pseudo fs inode, we can use the mm to track all vmas that might be mmap'ing device BARs, which removes our vma_list and all the complicated lock ordering necessary to manually zap each related vma. Note that we can no longer store the pfn in vm_pgoff if we want to use unmap_mapping_range() to zap a selective portion of the device fd corresponding to BAR mappings. This also converts our mmap fault handler to use vmf_insert_pfn() because we no longer have a vma_list to avoid the concurrency problem with io_remap_pfn_range(). The goal is to eventually use the vm_ops huge_fault handler to avoid the additional faulting overhead, but vmf_insert_pfn_{pmd,pud}() need to learn about pfnmaps first. Also, Jason notes that a race exists between unmap_mapping_range() and the fops mmap callback if we were to call io_remap_pfn_range() to populate the vma on mmap. Specifically, mmap_region() does call_mmap() before it does vma_link_file() which gives a window where the vma is populated but invisible to unmap_mapping_range(). Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240530045236.1005864-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2024-05-31 15:15:51 -06:00
Alex Williamson	b7c5e64fec	vfio: Create vfio_fs_type with inode per device By linking all the device fds we provide to userspace to an address space through a new pseudo fs, we can use tools like unmap_mapping_range() to zap all vmas associated with a device. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240530045236.1005864-2-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2024-05-31 15:15:51 -06:00
Weiwen Hu	b1a1fdd709	nvme: fix nvme_pr_* status code parsing Fix the parsing if extra status bits (e.g. MORE) is present. Fixes: `7fb42780d0` ("nvme: Convert NVMe errors to PR errors") Signed-off-by: Weiwen Hu <huweiwen@linux.alibaba.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-31 13:50:59 -07:00
Kees Cook	99a6087dfd	kunit/fortify: Remove __kmalloc_node() test __kmalloc_node() is considered an "internal" function to the Slab, so drop it from explicit testing. Link: https://lore.kernel.org/r/20240531185703.work.588-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>	2024-05-31 13:47:41 -07:00
John Hubbard	4bf15b1c65	selftests/futex: don't pass a const char* to asprintf(3) When building with clang, via: make LLVM=1 -C tools/testing/selftests ...clang issues this warning: futex_requeue_pi.c:403:17: warning: passing 'const char ' to parameter of type 'char ' discards qualifiers in nested pointer types [-Wincompatible-pointer-types-discards-qualifiers] This warning fires because test_name is passed into asprintf(3), which then changes it. Fix this by simply removing the const qualifier. This is a local automatic variable in a very short function, so there is not much need to use the compiler to enforce const-ness at this scope. [1] https://lore.kernel.org/all/20240329-selftests-libmk-llvm-rfc-v1-1-2f9ed7d1c49f@valentinobst.de/ Fixes: `f17d8a87ec` ("selftests: fuxex: Report a unique test name per run of futex_requeue_pi") Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-31 14:37:10 -06:00
John Hubbard	32c75ad4a7	selftests/futex: don't redefine .PHONY targets (all, clean) The .PHONY targets "all" and "clean" are both already defined in the file that is included in the very next line: ../lib.mk. Remove this duplicate code. Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-31 14:37:04 -06:00
Chunguang Xu	7dc3bfcb4c	nvme-fabrics: use reserved tag for reg read/write command In some scenarios, if too many commands are issued by nvme command in the same time by user tasks, this may exhaust all tags of admin_q. If a reset (nvme reset or IO timeout) occurs before these commands finish, reconnect routine may fail to update nvme regs due to insufficient tags, which will cause kernel hang forever. In order to workaround this issue, maybe we can let reg_read32()/reg_read64()/reg_write32() use reserved tags. This maybe safe for nvmf: 1. For the disable ctrl path, we will not issue connect command 2. For the enable ctrl / fw activate path, since connect and reg_xx() are called serially. So the reserved tags may still be enough while reg_xx() use reserved tags. Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-31 13:26:15 -07:00
Linus Torvalds	b050496579	Merge tag 'dma-mapping-6.10-2024-05-31' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - dma-mapping benchmark error handling fixes (Fedor Pchelkin) - correct a config symbol reference in the DMA API documentation (Lukas Bulwahn) * tag 'dma-mapping-6.10-2024-05-31' of git://git.infradead.org/users/hch/dma-mapping: Documentation/core-api: correct reference to SWIOTLB_DYNAMIC dma-mapping: benchmark: handle NUMA_NO_NODE correctly dma-mapping: benchmark: fix node id validation dma-mapping: benchmark: avoid needless copy_to_user if benchmark fails dma-mapping: benchmark: fix up kthread-related error handling	2024-05-31 12:14:55 -07:00
Linus Torvalds	7d88cc8ecc	Merge tag 'sound-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Lots of small fixes: - A race fix for debugfs handling in ALSA core - A series of corrections for MIDI2 core format conversions - ASoC Intel fixes for 16 bit DMIC config - Updates for missing module parameters in ASoC code - HD-audio quirk, Cirrus codec fix, etc minor fixes" * tag 'sound-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (26 commits) ALSA: seq: ump: Fix swapped song position pointer data ASoC: SOF: ipc4-topology: Adjust the params based on DAI formats ASoC: SOF: ipc4-topology: Improve readability of sof_ipc4_prepare_dai_copier() ASoC: SOF: ipc4-topology/pcm: Rename sof_ipc4_copier_is_single_format() ASoC: SOF: ipc4-topology: Print out the channel count in sof_ipc4_dbg_audio_format ASoC: SOF: ipc4-topology: Add support for NHLT with 16-bit only DMIC blob ALSA: seq: Fix yet another spot for system message conversion ALSA: ump: Set default protocol when not given explicitly ALSA: ump: Don't accept an invalid UMP protocol number ASoC: SOF: ipc4-topology: Fix input format query of process modules without base extension ASoC: Intel: sof-sdw: fix missing SPI_MASTER dependency ALSA: pcm: fix typo in comment ALSA: ump: Don't clear bank selection after sending a program change ALSA: seq: Fix incorrect UMP type for system messages ALSA/hda: intel-dsp-config: reduce log verbosity ALSA: seq: Don't clear bank selection at event -> UMP MIDI2 conversion ALSA: seq: Fix missing bank setup between MIDI1/MIDI2 UMP conversion ASoC: SOF: add missing MODULE_DESCRIPTION() ASoC: SOF: reorder MODULE_ definitions ASoC: SOF: AMD: group all module related information ...	2024-05-31 12:11:44 -07:00
Linus Torvalds	87895a6402	Merge tag 'platform-drivers-x86-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: - a use-after-free bugfix - Kconfig fixes for randconfig builds - allow setting touchscreen_dmi quirks from the cmdline for debugging - touchscreen_dmi quirks for two new laptop/tablet models * tag 'platform-drivers-x86-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: touchscreen_dmi: Add info for the EZpad 6s Pro platform/x86: touchscreen_dmi: Add info for GlobalSpace SolT IVW 11.6" tablet platform/x86: touchscreen_dmi: Add support for setting touchscreen properties from cmdline platform/x86: thinkpad_acpi: Select INPUT_SPARSEKMAP in Kconfig platform/x86: x86-android-tablets: Add "select LEDS_CLASS" platform/x86: ISST: fix use-after-free in tpmi_sst_dev_remove()	2024-05-31 12:03:28 -07:00
Linus Torvalds	c6cc9799b4	Merge tag 'riscv-for-linus-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix to avoid pt_regs aliasing with idle thread stacks on secondary harts. - HAVE_ARCH_HUGE_VMAP is enabled on XIP kernels, which fixes boot issues on XIP systems with huge pages. - An update to the uABI documentation clarifying that only scalar misaligned accesses were grandfathered in as supported, as the vector extension did not exist at the time the uABI was frozen. - A fix for the recently-added byte/half atomics to avoid losing the fully ordered decorations. * tag 'riscv-for-linus-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix fully ordered LR/SC xchg[8\|16]() implementations Documentation: RISC-V: uabi: Only scalar misaligned loads are supported riscv: enable HAVE_ARCH_HUGE_VMAP for XIP kernel riscv: prevent pt_regs corruption for secondary idle threads	2024-05-31 11:52:06 -07:00
Linus Torvalds	ff9bce3d06	Merge tag 'bcachefs-2024-05-30' of https://evilpiepirate.org/git/bcachefs Pull bcachefs fixes from Kent Overstreet: "Assorted odds and ends... - two downgrade fixes - a couple snapshot deletion and repair fixes, thanks to noradtux for finding these and providing the image to debug them - a couple assert fixes - convert to folio helper, from Matthew - some improved error messages - bit of code reorganization (just moving things around); doing this while things are quiet so I'm not rebasing fixes past reorgs - don't return -EROFS on inconsistency error in recovery, this confuses util-linux and has it retry the mount - fix failure to return error on misaligned dio write; reported as an issue with coreutils shred" * tag 'bcachefs-2024-05-30' of https://evilpiepirate.org/git/bcachefs: (21 commits) bcachefs: Fix failure to return error on misaligned dio write bcachefs: Don't return -EROFS from mount on inconsistency error bcachefs: Fix uninitialized var warning bcachefs: Split out sb-errors_format.h bcachefs: Split out journal_seq_blacklist_format.h bcachefs: Split out replicas_format.h bcachefs: Split out disk_groups_format.h bcachefs: split out sb-downgrade_format.h bcachefs: split out sb-members_format.h bcachefs: Better fsck error message for key version bcachefs: btree_gc can now handle unknown btrees bcachefs: add missing MODULE_DESCRIPTION() bcachefs: Fix setting of downgrade recovery passes/errors bcachefs: Run check_key_has_snapshot in snapshot_delete_keys() bcachefs: Refactor delete_dead_snapshots() bcachefs: Fix locking assert bcachefs: Fix lookup_first_inode() when inode_generations are present bcachefs: Plumb bkey into __btree_err() bcachefs: Use copy_folio_from_iter_atomic() bcachefs: Fix sb-downgrade validation ...	2024-05-31 11:45:41 -07:00
Thomas Gleixner	0c2f6d0461	x86/topology/intel: Unlock CPUID before evaluating anything Intel CPUs have a MSR bit to limit CPUID enumeration to leaf two. If this bit is set by the BIOS then CPUID evaluation including topology enumeration does not work correctly as the evaluation code does not try to analyze any leaf greater than two. This went unnoticed before because the original topology code just repeated evaluation several times and managed to overwrite the initial limited information with the correct one later. The new evaluation code does it once and therefore ends up with the limited and wrong information. Cure this by unlocking CPUID right before evaluating anything which depends on the maximum CPUID leaf being greater than two instead of rereading stuff after unlock. Fixes: `22d63660c3` ("x86/cpu: Use common topology code for Intel") Reported-by: Peter Schneider <pschneider1968@googlemail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/fd3f73dc-a86f-4bcf-9c60-43556a21eb42@googlemail.com	2024-05-31 20:25:56 +02:00
Arnd Bergmann	d551ce15d0	mailbox: zynqmp-ipi: drop irq_to_desc() call irq_to_desc() is not exported to loadable modules, so this driver now fails to link in some configurations: ERROR: modpost: "irq_to_desc" [drivers/mailbox/zynqmp-ipi-mailbox.ko] undefined! I can't see a purpose for this call, since the return value is unused and probably left over from some code refactoring. Address the link failure by just removing the line. Fixes: `6ffb163534` ("mailbox: zynqmp: handle SGI for shared IPI") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>	2024-05-31 12:39:15 -05:00
Kees Cook	4e173c825b	mailmap: update entry for Kees Cook I'm tired of gmail breaking DKIM. Switch everything over to my @kernel.org alias instead. Signed-off-by: Kees Cook <kees@kernel.org>	2024-05-31 08:58:36 -07:00
Kees Cook	adb77bba9c	scsi: mpt3sas: Avoid possible run-time warning with long manufacturer strings The prior strscpy() replacement of strncpy() here expected the manufacture_reply strings to be NUL-terminated, but it is possible they are not, as the code pattern here shows, e.g., edev->vendor_id being exactly 1 character larger than manufacture_reply->vendor_id, and the replaced strncpy() was copying only up to the size of the source character array. Replace this with memtostr(), which is the unambiguous way to convert a maybe not-NUL-terminated character array into a NUL-terminated string. Fixes: `b7e9712a02` ("scsi: mpt3sas: Replace deprecated strncpy() with strscpy()") Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Marco Patalano <mpatalan@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240410023155.2100422-3-keescook@chromium.org Signed-off-by: Kees Cook <kees@kernel.org>	2024-05-31 08:58:20 -07:00
Steve French	518549c120	cifs: fix creating sockets when using sfu mount options When running fstest generic/423 with sfu mount option, it was being skipped due to inability to create sockets: generic/423 [not run] cifs does not support mknod/mkfifo which can also be easily reproduced with their af_unix tool: ./src/af_unix /mnt1/socket-two bind: Operation not permitted Fix sfu mount option to allow creating and reporting sockets. Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2024-05-31 10:55:15 -05:00
Mickaël Salaün	0055f53aac	selftests/landlock: Add layout1.refer_mount_root Add tests to check error codes when linking or renaming a mount root directory. This previously triggered a kernel warning, but it is fixed with the previous commit. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20240516181935.1645983-3-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>	2024-05-31 16:41:54 +02:00
Mickaël Salaün	88da52ccd6	landlock: Fix d_parent walk The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when trying to link a root mount point. This cannot work in practice because this directory is mounted, but the VFS check is done after the call to security_path_link(). Do not use source directory's d_parent when the source directory is the mount point. Cc: Günther Noack <gnoack@google.com> Cc: Paul Moore <paul@paul-moore.com> Cc: stable@vger.kernel.org Reported-by: syzbot+bf4903dc7e12b18ebc87@syzkaller.appspotmail.com Fixes: `b91c3e4ea7` ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Closes: https://lore.kernel.org/r/000000000000553d3f0618198200@google.com Link: https://lore.kernel.org/r/20240516181935.1645983-2-mic@digikod.net [mic: Fix commit message] Signed-off-by: Mickaël Salaün <mic@digikod.net>	2024-05-31 16:41:52 +02:00
Masami Hiramatsu (Google)	0f42bdf59b	selftests/tracing: Fix event filter test to retry up to 10 times Commit `eb50d0f250` ("selftests/ftrace: Choose target function for filter test from samples") choose the target function from samples, but sometimes this test failes randomly because the target function does not hit at the next time. So retry getting samples up to 10 times. Fixes: `eb50d0f250` ("selftests/ftrace: Choose target function for filter test from samples") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-31 08:35:43 -06:00
Niklas Cassel	3cb648c4dd	ata: libata-core: Add ATA_HORKAGE_NOLPM for Apacer AS340 Commit `7627a0edef` ("ata: ahci: Drop low power policy board type") dropped the board_ahci_low_power board type, and instead enables LPM if: -The AHCI controller reports that it supports LPM (Partial/Slumber), and -CONFIG_SATA_MOBILE_LPM_POLICY != 0, and -The port is not defined as external in the per port PxCMD register, and -The port is not defined as hotplug capable in the per port PxCMD register. Partial and Slumber LPM states can either be initiated by HIPM or DIPM. For HIPM (host initiated power management) to get enabled, both the AHCI controller and the drive have to report that they support HIPM. For DIPM (device initiated power management) to get enabled, only the drive has to report that it supports DIPM. However, the HBA will reject device requests to enter LPM states which the HBA does not support. The problem is that Apacer AS340 drives do not handle low power modes correctly. The problem was most likely not seen before because no one had used this drive with a AHCI controller with LPM enabled. Add a quirk so that we do not enable LPM for this drive, since we see command timeouts if we do (even though the drive claims to support DIPM). Fixes: `7627a0edef` ("ata: ahci: Drop low power policy board type") Cc: stable@vger.kernel.org Reported-by: Tim Teichmann <teichmanntim@outlook.de> Closes: https://lore.kernel.org/linux-ide/87bk4pbve8.ffs@tglx/ Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-05-31 15:14:06 +02:00
Niklas Cassel	4738803693	ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD Commit `7627a0edef` ("ata: ahci: Drop low power policy board type") dropped the board_ahci_low_power board type, and instead enables LPM if: -The AHCI controller reports that it supports LPM (Partial/Slumber), and -CONFIG_SATA_MOBILE_LPM_POLICY != 0, and -The port is not defined as external in the per port PxCMD register, and -The port is not defined as hotplug capable in the per port PxCMD register. Partial and Slumber LPM states can either be initiated by HIPM or DIPM. For HIPM (host initiated power management) to get enabled, both the AHCI controller and the drive have to report that they support HIPM. For DIPM (device initiated power management) to get enabled, only the drive has to report that it supports DIPM. However, the HBA will reject device requests to enter LPM states which the HBA does not support. The problem is that AMD Radeon S3 SSD drives do not handle low power modes correctly. The problem was most likely not seen before because no one had used this drive with a AHCI controller with LPM enabled. Add a quirk so that we do not enable LPM for this drive, since we see command timeouts if we do (even though the drive claims to support both HIPM and DIPM). Fixes: `7627a0edef` ("ata: ahci: Drop low power policy board type") Cc: stable@vger.kernel.org Reported-by: Doru Iorgulescu <doru.iorgulescu1@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218832 Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-05-31 15:11:51 +02:00
Niklas Cassel	86aaa7e9d6	ata: libata-core: Add ATA_HORKAGE_NOLPM for Crucial CT240BX500SSD1 Commit `7627a0edef` ("ata: ahci: Drop low power policy board type") dropped the board_ahci_low_power board type, and instead enables LPM if: -The AHCI controller reports that it supports LPM (Partial/Slumber), and -CONFIG_SATA_MOBILE_LPM_POLICY != 0, and -The port is not defined as external in the per port PxCMD register, and -The port is not defined as hotplug capable in the per port PxCMD register. Partial and Slumber LPM states can either be initiated by HIPM or DIPM. For HIPM (host initiated power management) to get enabled, both the AHCI controller and the drive have to report that they support HIPM. For DIPM (device initiated power management) to get enabled, only the drive has to report that it supports DIPM. However, the HBA will reject device requests to enter LPM states which the HBA does not support. The problem is that Crucial CT240BX500SSD1 drives do not handle low power modes correctly. The problem was most likely not seen before because no one had used this drive with a AHCI controller with LPM enabled. Add a quirk so that we do not enable LPM for this drive, since we see command timeouts if we do (even though the drive claims to support DIPM). Fixes: `7627a0edef` ("ata: ahci: Drop low power policy board type") Cc: stable@vger.kernel.org Reported-by: Aarrayy <lp610mh@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218832 Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-05-31 15:10:41 +02:00
Dr. David Alan Gilbert	539d33b578	drm/komeda: remove unused struct 'gamma_curve_segment' 'gamma_curve_segment' looks like it has never been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240516133724.251750-1-linux@treblig.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>	2024-05-31 12:09:51 +01:00
Aapo Vienamo	985cfe501b	thunderbolt: debugfs: Fix margin debugfs node creation condition The margin debugfs node controls the "Enable Margin Test" field of the lane margining operations. This field selects between either low or high voltage margin values for voltage margin test or left or right timing margin values for timing margin test. According to the USB4 specification, whether or not the "Enable Margin Test" control applies, depends on the values of the "Independent High/Low Voltage Margin" or "Independent Left/Right Timing Margin" capability fields for voltage and timing margin tests respectively. The pre-existing condition enabled the debugfs node also in the case where both low/high or left/right margins are returned, which is incorrect. This change only enables the debugfs node in question, if the specific required capability values are met. Signed-off-by: Aapo Vienamo <aapo.vienamo@linux.intel.com> Fixes: `d0f1e0c2a6` ("thunderbolt: Add support for receiver lane margining") Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-05-31 13:13:39 +03:00
Phil Auld	d40605a682	sched/x86: Export 'percpu arch_freq_scale' Commit: `7bc263840b` ("sched/topology: Consolidate and clean up access to a CPU's max compute capacity") removed rq->cpu_capacity_orig in favor of using arch_scale_freq_capacity() calls. Export the underlying percpu symbol on x86 so that external trace point helper modules can be made to work again. Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240530181548.2039216-1-pauld@redhat.com	2024-05-31 11:48:42 +02:00
Jeff Johnson	dc8e5dfb52	perf/x86/intel: Add missing MODULE_DESCRIPTION() lines Fix the 'make W=1 C=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/intel/intel-uncore.o WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/intel/intel-cstate.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lore.kernel.org/r/20240530-md-arch-x86-events-intel-v1-1-8252194ed20a@quicinc.com	2024-05-31 11:41:15 +02:00
Jeff Johnson	0a44078f2b	perf/x86/rapl: Add missing MODULE_DESCRIPTION() line Fix the warning from 'make C=1 W=1': WARNING: modpost: missing MODULE_DESCRIPTION() in arch/x86/events/rapl.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lore.kernel.org/r/20240530-md-arch-x86-events-v1-1-e45ffa8af99f@quicinc.com	2024-05-31 11:41:04 +02:00
Jan Beulich	e0eec24e2e	memblock: make memblock_set_node() also warn about use of MAX_NUMNODES On an (old) x86 system with SRAT just covering space above 4Gb: ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug the commit referenced below leads to this NUMA configuration no longer being refused by a CONFIG_NUMA=y kernel (previously NUMA: nodes only cover 6144MB of your 8185MB e820 RAM. Not used. No NUMA configuration found Faking a node at [mem 0x0000000000000000-0x000000027fffffff] was seen in the log directly after the message quoted above), because of memblock_validate_numa_coverage() checking for NUMA_NO_NODE (only). This in turn led to memblock_alloc_range_nid()'s warning about MAX_NUMNODES triggering, followed by a NULL deref in memmap_init() when trying to access node 64's (NODE_SHIFT=6) node data. To compensate said change, make memblock_set_node() warn on and adjust a passed in value of MAX_NUMNODES, just like various other functions already do. Fixes: `ff6c3d81f2` ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1c8a058c-5365-4f27-a9f1-3aeb7fb3e7b2@suse.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>	2024-05-31 12:36:28 +03:00
Takashi Iwai	310fa3ec28	ALSA: seq: ump: Fix swapped song position pointer data At converting between the legacy event and UMP, the parameters for MIDI Song Position Pointer are incorrectly stored. It should have been LSB -> MSB order while it stored in MSB -> LSB order. This patch corrects the ordering. Fixes: `e9e02819a9` ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240531075110.3250-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-31 09:51:44 +02:00
Quan Zhou	c66f3b40b1	RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function In the function kvm_riscv_vcpu_set_reg_isa_ext, the original code used incorrect reg_subtype labels KVM_REG_RISCV_SBI_MULTI_EN/DIS. These have been corrected to KVM_REG_RISCV_ISA_MULTI_EN/DIS respectively. Although they are numerically equivalent, the actual processing will not result in errors, but it may lead to ambiguous code semantics. Fixes: `613029442a` ("RISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions") Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/ff1c6771a67d660db94372ac9aaa40f51e5e0090.1716429371.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>	2024-05-31 10:40:39 +05:30
Yong-Xuan Wang	2d707b4e37	RISC-V: KVM: No need to use mask when hart-index-bit is 0 When the maximum hart number within groups is 1, hart-index-bit is set to 0. Consequently, there is no need to restore the hart ID from IMSIC addresses and hart-index-bit settings. Currently, QEMU and kvmtool do not pass correct hart-index-bit values when the maximum hart number is a power of 2, thereby avoiding this issue. Corresponding patches for QEMU and kvmtool will also be dispatched. Fixes: `89d01306e3` ("RISC-V: KVM: Implement device interface for AIA irqchip") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20240415064905.25184-1-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-05-31 09:47:08 +05:30
Dave Airlie	a2ce3f7752	Merge tag 'drm-misc-fixes-2024-05-30' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-buf: - sw-sync: Don't interfere with IRQ handling - Fix kthreads-handling error path gem-shmem: - Warn when trying to pin imported objects lima: - Fix dma_resv-related deadlock in object pin msm: - Remove build-time dependency on Python 3.9 nouveau: - nvif: Fix possible integer overflow panel: - lg-sw43408: Select DP helpers; Declare backlight ops as static - sitronix-st7789v: Various fixes for jt240mhqs_hwt_ek_e3 panel panfrost: - Fix dma_resv-related deadlock in object pin Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240530192307.GA14809@localhost.localdomain	2024-05-31 11:51:20 +10:00
Waiman Long	0a751df456	blk-throttle: Fix incorrect display of io.max Commit `bf20ab538c` ("blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW") attempts to revert the code change introduced by commit `cd5ab1b0fc` ("blk-throttle: add .low interface"). However, it leaves behind the bps_conf[] and iops_conf[] fields in the throtl_grp structure which aren't set anywhere in the new blk-throttle.c code but are still being used by tg_prfill_limit() to display the limits in io.max. Now io.max always displays the following values if a block queue is used: <m>:<n> rbps=0 wbps=0 riops=0 wiops=0 Fix this problem by removing bps_conf[] and iops_conf[] and use bps[] and iops[] instead to complete the revert. Fixes: `bf20ab538c` ("blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW") Reported-by: Justin Forbes <jforbes@redhat.com> Closes: https://github.com/containers/podman/issues/22701#issuecomment-2120627789 Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240530134547.970075-1-longman@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 19:44:29 -06:00
Marc Zyngier	e7985f4360	of: property: Fix fw_devlink handling of interrupt-map Commit `d976c6f4b3` ("of: property: Add fw_devlink support for interrupt-map property") tried to do what it says on the tin, but failed on a couple of points: - it confuses bytes and cells. Not a huge deal, except when it comes to pointer arithmetic - it doesn't really handle anything but interrupt-maps that have their parent #address-cells set to 0 The combinations of the two leads to some serious fun on my M1 box, with plenty of WARN-ON() firing all over the shop, and amusing values being generated for interrupt specifiers. Having 2 versions of parsing code for "interrupt-map" was a bad idea. Now that the common parsing parts have been refactored into of_irq_parse_imap_parent(), rework the code here to use it instead and fix the pointer arithmetic. Note that the dependency will be a bit different than the original code when the interrupt-map points to another interrupt-map. In this case, the original code would resolve to the final interrupt controller. Now the dependency is the parent interrupt-map (which itself should have a dependency to the parent). It is possible that a node with an interrupt-map has no driver. Fixes: `d976c6f4b3` ("of: property: Add fw_devlink support for interrupt-map property") Signed-off-by: Marc Zyngier <maz@kernel.org> Co-developed-by: Rob Herring (Arm) <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20240529-dt-interrupt-map-fix-v2-2-ef86dc5bcd2a@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-05-30 19:43:47 -05:00
Rob Herring (Arm)	935df1bd40	of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw() Factor out the parsing of interrupt-map interrupt parent phandle and its arg cells to a separate function, of_irq_parse_imap_parent(), so that it can be used in other parsing scenarios (e.g. fw_devlink). There was a refcount leak on non-matching entries when iterating thru "interrupt-map" which is fixed. Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20240529-dt-interrupt-map-fix-v2-1-ef86dc5bcd2a@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-05-30 19:43:19 -05:00
Chanwoo Lee	d53b681ce9	scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort() An error unrelated to ufshcd_try_to_abort_task is being logged and can cause confusion. Modify ufshcd_mcq_abort() to print the result of the abort failure. For readability, return immediately instead of 'goto'. Fixes: `f1304d4420` ("scsi: ufs: mcq: Added ufshcd_mcq_abort()") Signed-off-by: Chanwoo Lee <cw9316.lee@samsung.com> Link: https://lore.kernel.org/r/20240524015904.1116005-1-cw9316.lee@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-30 20:40:48 -04:00
Dave Airlie	bb61cf46b6	Merge tag 'amd-drm-fixes-6.10-2024-05-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-05-30: amdgpu: - RAS fix - Fix colorspace property for MST connectors - Fix for PCIe DPM - Silence UBSAN warning - GPUVM robustness fix - Partition fix - Drop deprecated I2C_CLASS_SPD amdkfd: - Revert unused changes for certain 11.0.3 devices - Simplify APU VRAM handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240530202316.2246826-1-alexander.deucher@amd.com	2024-05-31 08:38:15 +10:00
Dave Airlie	c301c3d2ac	Merge tag 'drm-xe-fixes-2024-05-30' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - One pcode polling timeout change - One fix for deadlocks for faulting VMs - One error-path lock imbalance fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZlisNHzgoq9nVg6g@fedora	2024-05-31 08:33:13 +10:00
Dave Airlie	cfd36ae37c	Merge tag 'drm-intel-fixes-2024-05-30' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes drm/i915 fixes for v6.10-rc2: - Fix a race in audio component by registering it later - Make DPT object unshrinkable to avoid shrinking when framebuffer has not shrunk - Fix CCS id calculation to fix a perf regression - Fix selftest caching mode - Fix FIELD_PREP compiler warnings - Fix indefinite wait for GT wakeref release - Revert overeager multi-gt pm reference removal Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87a5k7iwod.fsf@intel.com	2024-05-31 08:18:18 +10:00
Damien Le Moal	29459c3eaa	block: Fix zone write plugging handling of devices with a runt zone A zoned device may have a last sequential write required zone that is smaller than other zones. However, all tests to check if a zone write plug write offset exceeds the zone capacity use the same capacity value stored in the gendisk zone_capacity field. This is incorrect for a zoned device with a last runt (smaller) zone. Add the new field last_zone_capacity to struct gendisk to store the capacity of the last zone of the device. blk_revalidate_seq_zone() and blk_revalidate_conv_zone() are both modified to get this value when disk_zone_is_last() returns true. Similarly to zone_capacity, the value is first stored using the last_zone_capacity field of struct blk_revalidate_zone_args. Once zone revalidation of all zones is done, this is used to set the gendisk last_zone_capacity field. The checks to determine if a zone is full or if a sector offset in a zone exceeds the zone capacity in disk_should_remove_zone_wplug(), disk_zone_wplug_abort_unaligned(), blk_zone_write_plug_init_request(), and blk_zone_wplug_prepare_bio() are modified to use the new helper functions disk_zone_is_full() and disk_zone_wplug_is_full(). disk_zone_is_full() uses the zone index to determine if the zone being tested is the last one of the disk and uses the either the disk zone_capacity or last_zone_capacity accordingly. Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240530054035.491497-4-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 15:03:52 -06:00
Damien Le Moal	cd63999368	block: Fix validation of zoned device with a runt zone Commit `ecfe43b11b` ("block: Remember zone capacity when revalidating zones") introduced checks to ensure that the capacity of the zones of a zoned device is constant for all zones. However, this check ignores the possibility that a zoned device has a smaller last zone with a size not equal to the capacity of other zones. Such device correspond in practice to an SMR drive with a smaller last zone and all zones with a capacity equal to the zone size, leading to the last zone capacity being different than the capacity of other zones. Correctly handle such device by fixing the check for the constant zone capacity in blk_revalidate_seq_zone() using the new helper function disk_zone_is_last(). This helper function is also used in blk_revalidate_zone_cb() when checking the zone size. Fixes: `ecfe43b11b` ("block: Remember zone capacity when revalidating zones") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240530054035.491497-3-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 15:03:52 -06:00
Damien Le Moal	b164316808	null_blk: Do not allow runt zone with zone capacity smaller then zone size A zoned device with a smaller last zone together with a zone capacity smaller than the zone size does make any sense as that does not correspond to any possible setup for a real device: 1) For ZNS and zoned UFS devices, all zones are always the same size. 2) For SMR HDDs, all zones always have the same capacity. In other words, if we have a smaller last runt zone, then this zone capacity should always be equal to the zone size. Add a check in null_init_zoned_dev() to prevent a configuration to have both a smaller zone size and a zone capacity smaller than the zone size. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240530054035.491497-2-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 15:03:52 -06:00
NeilBrown	99bc9f2eb3	NFS: add barriers when testing for NFS_FSDATA_BLOCKED dentry->d_fsdata is set to NFS_FSDATA_BLOCKED while unlinking or renaming-over a file to ensure that no open succeeds while the NFS operation progressed on the server. Setting dentry->d_fsdata to NFS_FSDATA_BLOCKED is done under ->d_lock after checking the refcount is not elevated. Any attempt to open the file (through that name) will go through lookp_open() which will take ->d_lock while incrementing the refcount, we can be sure that once the new value is set, __nfs_lookup_revalidate() will see the new value and will block. We don't have any locking guarantee that when we set ->d_fsdata to NULL, the wait_var_event() in __nfs_lookup_revalidate() will notice. wait/wake primitives do NOT provide barriers to guarantee order. We must use smp_load_acquire() in wait_var_event() to ensure we look at an up-to-date value, and must use smp_store_release() before wake_up_var(). This patch adds those barrier functions and factors out block_revalidate() and unblock_revalidate() far clarity. There is also a hypothetical bug in that if memory allocation fails (which never happens in practice) we might leave ->d_fsdata locked. This patch adds the missing call to unblock_revalidate(). Reported-and-tested-by: Richard Kojedzinszky <richard+debian+bugreport@kojedz.in> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071501 Fixes: `3c59366c20` ("NFS: don't unhash dentry during unlink/rename") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-30 16:17:12 -04:00
Chen Hanxiao	33c94d7e3c	SUNRPC: return proper error from gss_wrap_req_priv don't return 0 if snd_buf->len really greater than snd_buf->buflen Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Fixes: `0c77668ddb` ("SUNRPC: Introduce trace points in rpc_auth_gss.ko") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-30 16:12:43 -04:00
Olga Kornievskaia	28568c906c	NFSv4.1 enforce rootpath check in fs_location query In commit `4ca9f31a2b` ("NFSv4.1 test and add 4.1 trunking transport"), we introduce the ability to query the NFS server for possible trunking locations of the existing filesystem. However, we never checked the returned file system path for these alternative locations. According to the RFC, the server can say that the filesystem currently known under "fs_root" of fs_location also resides under these server locations under the following "rootpath" pathname. The client cannot handle trunking a filesystem that reside under different location under different paths other than what the main path is. This patch enforces the check that fs_root path and rootpath path in fs_location reply is the same. Fixes: `4ca9f31a2b` ("NFSv4.1 test and add 4.1 trunking transport") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-30 16:12:43 -04:00
NeilBrown	296f4ce81d	NFS: abort nfs_atomic_open_v23 if name is too long. An attempt to open a file with a name longer than NFS3_MAXNAMLEN will trigger a WARN_ON_ONCE in encode_filename3() because nfs_atomic_open_v23() doesn't have the test on ->d_name.len that nfs_atomic_open() has. So add that test. Reported-by: James Clark <james.clark@arm.com> Closes: https://lore.kernel.org/all/20240528105249.69200-1-james.clark@arm.com/ Fixes: `7c6c5249f0` ("NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-30 16:12:11 -04:00
Jens Axboe	18414a4a2e	io_uring/net: assign kmsg inq/flags before buffer selection syzbot reports that recv is using an uninitialized value: ===================================================== BUG: KMSAN: uninit-value in io_req_cqe_overflow io_uring/io_uring.c:810 [inline] BUG: KMSAN: uninit-value in io_req_complete_post io_uring/io_uring.c:937 [inline] BUG: KMSAN: uninit-value in io_issue_sqe+0x1f1b/0x22c0 io_uring/io_uring.c:1763 io_req_cqe_overflow io_uring/io_uring.c:810 [inline] io_req_complete_post io_uring/io_uring.c:937 [inline] io_issue_sqe+0x1f1b/0x22c0 io_uring/io_uring.c:1763 io_wq_submit_work+0xa17/0xeb0 io_uring/io_uring.c:1860 io_worker_handle_work+0xc04/0x2000 io_uring/io-wq.c:597 io_wq_worker+0x447/0x1410 io_uring/io-wq.c:651 ret_from_fork+0x6d/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was stored to memory at: io_req_set_res io_uring/io_uring.h:215 [inline] io_recv_finish+0xf10/0x1560 io_uring/net.c:861 io_recv+0x12ec/0x1ea0 io_uring/net.c:1175 io_issue_sqe+0x429/0x22c0 io_uring/io_uring.c:1751 io_wq_submit_work+0xa17/0xeb0 io_uring/io_uring.c:1860 io_worker_handle_work+0xc04/0x2000 io_uring/io-wq.c:597 io_wq_worker+0x447/0x1410 io_uring/io-wq.c:651 ret_from_fork+0x6d/0x90 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Uninit was created at: slab_post_alloc_hook mm/slub.c:3877 [inline] slab_alloc_node mm/slub.c:3918 [inline] __do_kmalloc_node mm/slub.c:4038 [inline] __kmalloc+0x6e4/0x1060 mm/slub.c:4052 kmalloc include/linux/slab.h:632 [inline] io_alloc_async_data+0xc0/0x220 io_uring/io_uring.c:1662 io_msg_alloc_async io_uring/net.c:166 [inline] io_recvmsg_prep_setup io_uring/net.c:725 [inline] io_recvmsg_prep+0xbe8/0x1a20 io_uring/net.c:806 io_init_req io_uring/io_uring.c:2135 [inline] io_submit_sqe io_uring/io_uring.c:2182 [inline] io_submit_sqes+0x1135/0x2f10 io_uring/io_uring.c:2335 __do_sys_io_uring_enter io_uring/io_uring.c:3246 [inline] __se_sys_io_uring_enter+0x40f/0x3c80 io_uring/io_uring.c:3183 __x64_sys_io_uring_enter+0x11f/0x1a0 io_uring/io_uring.c:3183 x64_sys_call+0x2c0/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:427 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f which appears to be io_recv_finish() reading kmsg->msg.msg_inq to decide if it needs to set IORING_CQE_F_SOCK_NONEMPTY or not. If the recv is entered with buffer selection, but no buffer is available, then we jump error path which calls io_recv_finish() without having assigned kmsg->msg_inq. This might cause an errant setting of the NONEMPTY flag for a request get gets errored with -ENOBUFS. Reported-by: syzbot+b1647099e82b3b349fbf@syzkaller.appspotmail.com Fixes: `4a3223f7bf` ("io_uring/net: switch io_recv() to using io_async_msghdr") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 14:04:37 -06:00
Takashi Iwai	e1e287e6f9	Merge tag 'asoc-fix-v6.10-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.10 Several serieses of fixes that have come in since the merge window, mostly for Intel systems. The biggest thing is some updates from Peter which fix support for a series of Intel laptops which have been found to use 16 bit rather than 32 bit DMIC configuration blobs in their firmware descriptions. We also have a bunch of fixes for module annotations, and some smaller single patch fixes.	2024-05-30 21:26:19 +02:00
John Hubbard	cb708ab9f5	selftests/futex: pass _GNU_SOURCE without a value to the compiler It's slightly better to set _GNU_SOURCE in the source code, but if one must do it via the compiler invocation, then the best way to do so is this: $(CC) -D_GNU_SOURCE= ...because otherwise, if this form is used: $(CC) -D_GNU_SOURCE ...then that leads the compiler to set a value, as if you had passed in: $(CC) -D_GNU_SOURCE=1 That, in turn, leads to warnings under both gcc and clang, like this: futex_requeue_pi.c:20: warning: "_GNU_SOURCE" redefined Fix this by using the "-D_GNU_SOURCE=" form. Reviewed-by: Edward Liaw <edliaw@google.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-30 13:10:51 -06:00
Abhinav Kumar	bb19535880	drm/msm: remove python 3.9 dependency for compiling msm Since commit `5acf491196` ("drm/msm: import gen_header.py script from Mesa"), compilation is broken on machines having python versions older than 3.9 due to dependency on argparse.BooleanOptionalAction. Switch to use simple bool for the validate flag to remove the dependency. Fixes: `5acf491196` ("drm/msm: import gen_header.py script from Mesa") Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240507230440.3384949-1-quic_abhinavk@quicinc.com	2024-05-30 18:49:23 +02:00
Alexandre Ghiti	1d84afaf02	riscv: Fix fully ordered LR/SC xchg[8\|16]() implementations The fully ordered versions of xchg[8\|16]() using LR/SC lack the necessary memory barriers to guarantee the order. Fix this by matching what is already implemented in the fully ordered versions of cmpxchg() using LR/SC. Suggested-by: Andrea Parri <parri.andrea@gmail.com> Reported-by: Andrea Parri <parri.andrea@gmail.com> Closes: https://lore.kernel.org/linux-riscv/ZlYbupL5XgzgA0MX@andrea/T/#u Fixes: `a8ed2b7a2c` ("riscv/cmpxchg: Implement xchg for variables of size 1 and 2") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20240530145546.394248-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-05-30 09:43:14 -07:00
Palmer Dabbelt	982a7eb97b	Documentation: RISC-V: uabi: Only scalar misaligned loads are supported We're stuck supporting scalar misaligned loads in userspace because they were part of the ISA at the time we froze the uABI. That wasn't the case for vector misaligned accesses, so depending on them unconditionally is a userspace bug. All extant vector hardware traps on these misaligned accesses. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240524185600.5919-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-05-30 09:42:53 -07:00
Nam Cao	7bed516174	riscv: enable HAVE_ARCH_HUGE_VMAP for XIP kernel HAVE_ARCH_HUGE_VMAP also works on XIP kernel, so remove its dependency on !XIP_KERNEL. This also fixes a boot problem for XIP kernel introduced by the commit in "Fixes:". This commit used huge page mapping for vmemmap, but huge page vmap was not enabled for XIP kernel. Fixes: `ff172d4818` ("riscv: Use hugepage mappings for vmemmap") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: <stable@vger.kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240526110104.470429-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-05-30 09:42:52 -07:00
Sergey Matyukevich	a638b0461b	riscv: prevent pt_regs corruption for secondary idle threads Top of the kernel thread stack should be reserved for pt_regs. However this is not the case for the idle threads of the secondary boot harts. Their stacks overlap with their pt_regs, so both may get corrupted. Similar issue has been fixed for the primary hart, see `c7cdd96eca` ("riscv: prevent stack corruption by reserving task_pt_regs(p) early"). However that fix was not propagated to the secondary harts. The problem has been noticed in some CPU hotplug tests with V enabled. The function smp_callin stored several registers on stack, corrupting top of pt_regs structure including status field. As a result, kernel attempted to save or restore inexistent V context. Fixes: `9a2451f186` ("RISC-V: Avoid using per cpu array for ordered booting") Fixes: `2875fe0561` ("RISC-V: Add cpu_ops and modify default booting method") Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240523084327.2013211-1-geomatsi@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-05-30 09:42:51 -07:00
Marc Zyngier	47eb2d68d1	KVM: arm64: nv: Expose BTI and CSV_frac to a guest hypervisor Now that we expose PAC to NV guests, we can also expose BTI (as the two as joined at the hip, due to some of the PAC instructions being landing pads). While we're at it, also propagate CSV_frac, which requires no particular emulation. Fixes: `f4f6a95bac` ("KVM: arm64: nv: Advertise support for PAuth") Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240528100632.1831995-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-30 17:36:22 +01:00
Marc Zyngier	41011e2de3	KVM: arm64: nv: Fix relative priorities of exceptions generated by ERETAx ERETAx can fail in multiple ways: (1) ELR_EL2 points lalaland (2) we get a PAC failure (3) SPSR_EL2 has the wrong mode (1) is easy, as we just let the CPU do its thing and deliver an Instruction Abort. However, (2) and (3) are interesting, because the PAC failure priority is way below that of the Illegal Execution State exception. Which means that if we have detected a PAC failure (and that we have FPACCOMBINE), we must be careful to give priority to the Illegal Execution State exception, should one be pending. Solving this involves hoisting the SPSR calculation earlier and testing for the IL bit before injecting the FPAC exception. In the extreme case of a ERETAx returning to an invalid mode and failing its PAC check, we end up with an Instruction Abort (due to the new PC being mangled by the failed Auth) and PSTATE.IL being set. Which matches the requirements of the architecture. Whilst we're at it, remove a stale comment that states the obvious and only confuses the reader. Fixes: `213b3d1ea1` ("KVM: arm64: nv: Handle ERETA[AB] instructions") Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240528100632.1831995-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-30 17:36:22 +01:00
Guenter Roeck	52a2c70c3e	hwmon: (shtc1) Fix property misspelling The property name is "sensirion,low-precision", not "sensicon,low-precision". Cc: Chris Ruehl <chris.ruehl@gtsys.com.hk> Fixes: `be7373b60d` ("hwmon: shtc1: add support for device tree bindings") Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-05-30 09:05:19 -07:00
Peter Colberg	027a44fedd	hwmon: (intel-m10-bmc-hwmon) Fix multiplier for N6000 board power sensor The Intel N6000 BMC outputs the board power value in milliwatt, whereas the hwmon sysfs interface must provide power values in microwatt. Fixes: `e1983220ae` ("hwmon: intel-m10-bmc-hwmon: Add N6000 sensors") Signed-off-by: Peter Colberg <peter.colberg@intel.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Link: https://lore.kernel.org/r/20240521181246.683833-1-peter.colberg@intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-05-30 09:05:06 -07:00
Linus Torvalds	d8ec19857b	Merge tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - gro: initialize network_offset in network layer - tcp: reduce accepted window in NEW_SYN_RECV state Current release - new code bugs: - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized - eth: ice: check for unregistering correct number of devlink params Previous releases - regressions: - bpf: Allow delete from sockmap/sockhash only if update is allowed - sched: taprio: extend minimum interval restriction to entire cycle too - netfilter: ipset: add list flush to cancel_gc - ipv4: fix address dump when IPv4 is disabled on an interface - sock_map: avoid race between sock_map_close and sk_psock_put - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules Previous releases - always broken: - core: fix __dst_negative_advice() race - bpf: - fix multi-uprobe PID filtering logic - fix pkt_type override upon netkit pass verdict - netfilter: tproxy: bail out if IP has been disabled on the device - af_unix: annotate data-race around unix_sk(sk)->addr - eth: mlx5e: fix UDP GSO for encapsulated packets - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers - eth: i40e: fully suspend and resume IO operations in EEH case - eth: octeontx2-pf: free send queue buffers incase of leaf to inner - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound" * tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) netdev: add qstat for csum complete ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound net: ena: Fix redundant device NUMA node override ice: check for unregistering correct number of devlink params ice: fix 200G PHY types to link speed mapping i40e: Fully suspend and resume IO operations in EEH case i40e: factoring out i40e_suspend/i40e_resume e1000e: move force SMBUS near the end of enable_ulp function net: dsa: microchip: fix RGMII error in KSZ DSA driver ipv4: correctly iterate over the target netns in inet_dump_ifaddr() net: fix __dst_negative_advice() race nfc/nci: Add the inconsistency check between the input data length and count MAINTAINERS: dwmac: starfive: update Maintainer net/sched: taprio: extend minimum interval restriction to entire cycle too net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support net: ti: icssg-prueth: Fix start counter for ft1 filter sock_map: avoid race between sock_map_close and sk_psock_put ...	2024-05-30 08:33:04 -07:00
Dave Hansen	2a38e4ca30	x86/cpu: Provide default cache line size if not enumerated tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH] will end up reporting cache_line_size()==0 and bad things happen. Fill in a default on those to avoid the problem. Long Story: The kernel dies a horrible death if c->x86_cache_alignment (aka. cache_line_size() is 0. Normally, this value is populated from c->x86_clflush_size. Right now the code is set up to get c->x86_clflush_size from two places. First, modern CPUs get it from CPUID. Old CPUs that don't have leaf 0x80000008 (or CPUID at all) just get some sane defaults from the kernel in get_cpu_address_sizes(). The vast majority of CPUs that have leaf 0x80000008 also get ->x86_clflush_size from CPUID. But there are oddballs. Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size: cpuid(0x00000001, &tfms, &misc, &junk, &cap0); if (cap0 & (1<<19)) c->x86_clflush_size = ((misc >> 8) & 0xff) * 8; So they: land in get_cpu_address_sizes() and see that CPUID has level 0x80000008 and jump into the side of the if() that does not fill in c->x86_clflush_size. That assigns a 0 to c->x86_cache_alignment, and hilarity ensues in code like: buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()), GFP_KERNEL); To fix this, always provide a sane value for ->x86_clflush_size. Big thanks to Andy Shevchenko for finding and reporting this and also providing a first pass at a fix. But his fix was only partial and only worked on the Quark CPUs. It would not, for instance, have worked on the QEMU config. 1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel/GenuineIntel0000590_Clanton_03_CPUID.txt 2. You can also get this behavior if you use "-cpu 486,+clzero" in QEMU. [ dhansen: remove 'vp_bits_from_cpuid' reference in changelog because bpetkov brutally murdered it recently. ] Fixes: `fbf6449f84` ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach") Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Jörn Heusipp <osmanx@heusipp.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240516173928.3960193-1-andriy.shevchenko@linux.intel.com/ Link: https://lore.kernel.org/lkml/5e31cad3-ad4d-493e-ab07-724cfbfaba44@heusipp.de/ Link: https://lore.kernel.org/all/20240517200534.8EC5F33E%40davehans-spike.ostc.intel.com	2024-05-30 08:29:45 -07:00
Bingbu Cao	ffb9072bce	media: intel/ipu6: add csi2 port sanity check in notifier bound Invalid csi2 port will break the isys notifier bound ops as it is trying to access an invalid csi2 sub-device instance based on the port. It will trigger a mc warning, and it will cause the sensor driver to unbound an inexistent isys csi2 and crash. Adding a csi2 port sanity check, return error to avoid such case. Fixes: `f50c4ca0a8` ("media: intel/ipu6: add the main input system driver") Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> [Sakari Ailus: Fix spelling of "nports" field.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-30 16:34:59 +02:00
Bingbu Cao	54880795b4	media: intel/ipu6: update the maximum supported csi2 port number to 6 IPU6EP on Meteor Lake SoC supports maximum 6 csi2 ports instead of 4. Fixes: `25fedc0219` ("media: intel/ipu6: add Intel IPU6 PCI device driver") Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-30 16:34:59 +02:00
Sakari Ailus	cc864821c7	media: mei: csi: Warn less verbosely of a missing device fwnode The check for having device fwnode was meant to be a sanity check but this also happens if the ACPI DSDT has graph port nodes on sensor device(s) but not on the IVSC device. Use a more meaningful warning message to tell about this. Fixes: `33116eb12c` ("media: ivsc: csi: Use IPU bridge") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-30 16:34:58 +02:00
Sakari Ailus	328af04b1a	media: mei: csi: Put the IPU device reference The mei csi's probe function obtains a reference to the IPU device but never puts that reference. Do that now. Fixes: `33116eb12c` ("media: ivsc: csi: Use IPU bridge") Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-30 16:34:58 +02:00
Breno Leitao	e112311615	io_uring/rw: Free iovec before cleaning async data kmemleak shows that there is a memory leak in io_uring read operation, where a buffer is allocated at iovec import, but never de-allocated. The memory is allocated at io_async_rw->free_iovec, but, then io_async_rw is kfreed, taking the allocated memory with it. I saw this happening when the read operation fails with -11 (EAGAIN). This is the kmemleak splat. unreferenced object 0xffff8881da591c00 (size 256): ... backtrace (crc 7a15bdee): [<00000000256f2de4>] __kmalloc+0x2d6/0x410 [<000000007a9f5fc7>] iovec_from_user.part.0+0xc6/0x160 [<00000000cecdf83a>] __import_iovec+0x50/0x220 [<00000000d1d586a2>] __io_import_iovec+0x13d/0x220 [<0000000054ee9bd2>] io_prep_rw+0x186/0x340 [<00000000a9c0372d>] io_prep_rwv+0x31/0x120 [<000000001d1170b9>] io_prep_readv+0xe/0x30 [<0000000070b8eb67>] io_submit_sqes+0x1bd/0x780 [<00000000812496d4>] __do_sys_io_uring_enter+0x3ed/0x5b0 [<0000000081499602>] do_syscall_64+0x5d/0x170 [<00000000de1c5a4d>] entry_SYSCALL_64_after_hwframe+0x76/0x7e This occurs because the async data cleanup functions are not set for read/write operations. As a result, the potentially allocated iovec in the rw async data is not freed before the async data is released, leading to a memory leak. With this following patch, kmemleak does not show the leaked memory anymore, and all liburing tests pass. Fixes: `a9165b83c1` ("io_uring/rw: always setup io_async_rw for read/write requests") Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240530142340.1248216-1-leitao@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-30 08:33:01 -06:00
Thomas Gleixner	34bf6bae32	x86/topology/amd: Evaluate SMT in CPUID leaf 0x8000001e only on family 0x17 and greater The new AMD/HYGON topology parser evaluates the SMT information in CPUID leaf 0x8000001e unconditionally while the original code restricted it to CPUs with family 0x17 and greater. This breaks family 0x15 CPUs which advertise that leaf and have a non-zero value in the SMT section. The machine boots, but the scheduler complains loudly about the mismatch of the core IDs: WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:6482 sched_cpu_starting+0x183/0x250 WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2408 build_sched_domains+0x76b/0x12b0 Add the condition back to cure it. [ bp: Make it actually build because grandpa is not concerned with trivial stuff. :-P ] Fixes: `f7fb3b2dd9` ("x86/cpu: Provide an AMD/HYGON specific topology parser") Closes: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/56 Reported-by: Tim Teichmann <teichmanntim@outlook.de> Reported-by: Christian Heusel <christian@heusel.eu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Tim Teichmann <teichmanntim@outlook.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/7skhx6mwe4hxiul64v6azhlxnokheorksqsdbp7qw6g2jduf6c@7b5pvomauugk	2024-05-30 15:58:55 +02:00
Mark Brown	c85578e730	ASoC: SOF: ipc4-topology: Fix nhlt configuration blob Merge series from Peter Ujfalusi <peter.ujfalusi@linux.intel.com>: The existing logic to pick a DMIC blob is based on several historical assumptions that the NHLT in BIOS always contains 32-bits per sample type (first patch, [1]). The other issue with the existing logic is that it was designed to care only about the bit depth of the format and fails to find the existing and correct blob when rate/channels are different on the FE side compared to what we should be using on the DAI side (we have components in path which can change rate/channel count). These issues have not been observed in past but with new MTL based (Windows) laptops and new topologies to enhance the audio quality, we started to see weird issues around how our assumptions of vendors failed. Since some NHLT blob handling cleanup has been done for 6.10, this series will complete that work to cover even cases that we don't anticipate to see. [1] https://github.com/thesofproject/linux/issues/4973	2024-05-30 14:33:14 +01:00
Gerald Loacker	b62c150c3b	drm/panel: sitronix-st7789v: fix display size for jt240mhqs_hwt_ek_e3 panel This is a portrait mode display. Change the dimensions accordingly. Fixes: `0fbbe96bfa` ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-3-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-3-e4821802443d@wolfvision.net	2024-05-30 14:57:33 +02:00
Gerald Loacker	2ba5058263	drm/panel: sitronix-st7789v: tweak timing for jt240mhqs_hwt_ek_e3 panel Use the default timing parameters to get a refresh rate of about 60 Hz for a clock of 6 MHz. Fixes: `0fbbe96bfa` ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-2-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-2-e4821802443d@wolfvision.net	2024-05-30 14:57:33 +02:00
Gerald Loacker	0e5895ff7f	drm/panel: sitronix-st7789v: fix timing for jt240mhqs_hwt_ek_e3 panel Flickering was observed when using partial mode. Moving the vsync to the same position as used by the default sitronix-st7789v timing resolves this issue. Fixes: `0fbbe96bfa` ("drm/panel: sitronix-st7789v: add jasonic jt240mhqs-hwt-ek-e3 support") Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com> Signed-off-by: Gerald Loacker <gerald.loacker@wolfvision.net> Link: https://lore.kernel.org/r/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-1-e4821802443d@wolfvision.net Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240409-bugfix-jt240mhqs_hwt_ek_e3-timing-v2-1-e4821802443d@wolfvision.net	2024-05-30 14:57:32 +02:00
Samuel Holland	be2fc65d66	powerpc: Limit ARCH_HAS_KERNEL_FPU_SUPPORT to PPC64 When building a 32-bit kernel, some toolchains do not allow mixing soft float and hard float object files: LD vmlinux.o powerpc64le-unknown-linux-musl-ld: lib/test_fpu_impl.o uses hard float, arch/powerpc/kernel/udbg.o uses soft float powerpc64le-unknown-linux-musl-ld: failed to merge target specific data of file lib/test_fpu_impl.o make[2]: * [scripts/Makefile.vmlinux_o:62: vmlinux.o] Error 1 make[1]: * [Makefile:1152: vmlinux_o] Error 2 make: *** [Makefile:240: __sub-make] Error 2 This is not an issue when building a 64-bit kernel. To unbreak the build, limit ARCH_HAS_KERNEL_FPU_SUPPORT to 64-bit kernels. This is okay because the only real user of this option, amdgpu, was previously limited to PPC64 anyway; see commit `a28e4b672f` ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT"). Fixes: `01db473e1a` ("powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405250851.Z4daYSWG-lkp@intel.com/ Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/lkml/eeffaec3-df63-4e55-ab7a-064a65c00efa@roeck-us.net/ Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240529162852.1209-1-samuel.holland@sifive.com	2024-05-30 22:57:27 +10:00
Michael Ellerman	50934945d5	powerpc/uaccess: Use YZ asm constraint for ld The 'ld' instruction requires a 4-byte aligned displacement because it is a DS-form instruction. But the "m" asm constraint doesn't enforce that. Add a special case of __get_user_asm2_goto() so that the "YZ" constraint can be used for "ld". The "Z" constraint is documented in the GCC manual PowerPC machine constraints, and specifies a "memory operand accessed with indexed or indirect addressing". "Y" is not documented in the manual but specifies a "memory operand for a DS-form instruction". Using both allows the compiler to generate a DS-form "ld" or X-form "ldx" as appropriate. The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because the "Y" constraint does not guarantee 4-byte alignment when prefixed instructions are enabled. No build errors have been reported due to this, but the possibility is there depending on compiler code generation decisions. Fixes: `c20beffeec` ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240529123029.146953-2-mpe@ellerman.id.au	2024-05-30 22:57:27 +10:00
Michael Ellerman	2d43cc701b	powerpc/uaccess: Fix build errors seen with GCC 13/14 Building ppc64le_defconfig with GCC 14 fails with assembler errors: CC fs/readdir.o /tmp/ccdQn0mD.s: Assembler messages: /tmp/ccdQn0mD.s:212: Error: operand out of domain (18 is not a multiple of 4) /tmp/ccdQn0mD.s:226: Error: operand out of domain (18 is not a multiple of 4) ... [6 lines] /tmp/ccdQn0mD.s:1699: Error: operand out of domain (18 is not a multiple of 4) A snippet of the asm shows: # ../fs/readdir.c:210: unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end); ld 9,0(29) # MEM[(u64 )name_38(D) + _88 1], MEM[(u64 )name_38(D) + _88 1] # 210 "../fs/readdir.c" 1 1: std 9,18(8) # put_user # __pus_addr_52, MEM[(u64 )name_38(D) + _88 * 1] The 'std' instruction requires a 4-byte aligned displacement because it is a DS-form instruction, and as the assembler says, 18 is not a multiple of 4. A similar error is seen with GCC 13 and CONFIG_UBSAN_SIGNED_WRAP=y. The fix is to change the constraint on the memory operand to put_user(), from "m" which is a general memory reference to "YZ". The "Z" constraint is documented in the GCC manual PowerPC machine constraints, and specifies a "memory operand accessed with indexed or indirect addressing". "Y" is not documented in the manual but specifies a "memory operand for a DS-form instruction". Using both allows the compiler to generate a DS-form "std" or X-form "stdx" as appropriate. The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because the "Y" constraint does not guarantee 4-byte alignment when prefixed instructions are enabled. Unfortunately clang doesn't support the "Y" constraint so that has to be behind an ifdef. Although the build error is only seen with GCC 13/14, that appears to just be luck. The constraint has been incorrect since it was first added. Fixes: `c20beffeec` ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()") Cc: stable@vger.kernel.org # v5.10+ Suggested-by: Kewen Lin <linkw@gcc.gnu.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240529123029.146953-1-mpe@ellerman.id.au	2024-05-30 22:57:27 +10:00
Nathan Lynch	12870ae381	powerpc/pseries/lparcfg: drop error message from guest name lookup It's not an error or exceptional situation when the hosting environment does not expose a name for the LP/guest via RTAS or the device tree. This happens with qemu when run without the '-name' option. The message also lacks a newline. Remove it. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: `eddaa9a402` ("powerpc/pseries: read the lpar name from the firmware") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240524-lparcfg-updates-v2-1-62e2e9d28724@linux.ibm.com	2024-05-30 22:57:26 +10:00
Peter Ujfalusi	b65456b7b3	ASoC: SOF: ipc4-topology: Adjust the params based on DAI formats Currently we only check the bit depth value among to DAI formats, but other parameters might be constant, like number of channels and/or rate. In capture we use the fe params as a reference to find the format and blob which should be used, but in the path we can have components which can handle expanding/narrowing number of channels or do a resample. In these cases the topology is expected to have 'fixed' parameter for channels/rates/bit depth and the conversion to the fe format is going to be done within the path. In practice this patch fixes issues like: All DMIC formats are fixed four channels We have a component which converts the four channel to stereo FE is opened with 2 channel Even if we have the correct bit depth format and blob (for four channel) we will still be looking for stereo configurations, which will fail. Note: the adjustment of params have switched order with the checking of single bit depth (needed for the NHLT blob fallback support). This change is non function, just that if the sof_ipc4_narrow_params_to_format() would fail, there is no point of checking the single bit depth. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://msgid.link/r/20240530111918.21974-6-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-30 12:33:32 +01:00
Peter Ujfalusi	2fcad03eab	ASoC: SOF: ipc4-topology: Improve readability of sof_ipc4_prepare_dai_copier() Remove the duplicated code paths to check for single bit depth and to update the params with storing the parameters needed by the function and have a single code section. No functional change but the code is easier to follow. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://msgid.link/r/20240530111918.21974-5-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-30 12:33:31 +01:00
Peter Ujfalusi	3b64fd2f83	ASoC: SOF: ipc4-topology/pcm: Rename sof_ipc4_copier_is_single_format() Rename the sof_ipc4_copier_is_single_format() to sof_ipc4_copier_is_single_bitdepth() to clear the confusion of the use of 'format' when we are querying information on the bit depth. Format is used to describe a combination of parameters (rate, channels, sample format / bit depth). Rename the flags used to store the result at the same time. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://msgid.link/r/20240530111918.21974-4-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-30 12:33:30 +01:00
Peter Ujfalusi	2a865c9c3f	ASoC: SOF: ipc4-topology: Print out the channel count in sof_ipc4_dbg_audio_format Print out the number of channels for the format explicitly instead of having the reader to understand how to interpret the ch_map and ch_cfg values. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://msgid.link/r/20240530111918.21974-3-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-30 12:33:29 +01:00
Peter Ujfalusi	49cb894d56	ASoC: SOF: ipc4-topology: Add support for NHLT with 16-bit only DMIC blob The ACPI NHLT table always had 32-bit DMIC blob even if 16-bit was also present and taken as a 'rule' which obviously got broken and there is at least one device on the market which ships with only 16-bit DMIC configuration blob. This corner case has never been supported and it is going to need topology updates for DMIC copier to support multiple formats. As for the kernel side: if the copier supports multiple formats and the preferred 32-bit DMIC blob is not found then we will try to get a 16-bit DMIC configuration and look for a 16-bit copier config. Fixes: `f9209644ae` ("ASoC: SOF: ipc4-topology: Correct DAI copier config and NHLT blob request") Link: https://github.com/thesofproject/linux/issues/4973 Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://msgid.link/r/20240530111918.21974-2-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-30 12:33:28 +01:00
Jakub Kicinski	13c7c941e7	netdev: add qstat for csum complete Recent commit `0cfe71f45f` ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: `0cfe71f45f` ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-30 12:15:56 +02:00
Takashi Iwai	700fe6fd09	ALSA: seq: Fix yet another spot for system message conversion We fixed the incorrect UMP type for system messages in the recent commit, but it missed one place in system_ev_to_ump_midi1(). Fix it now. Fixes: `e9e02819a9` ("ALSA: seq: Automatic conversion of UMP events") Fixes: c2bb79613fed ("ALSA: seq: Fix incorrect UMP type for system messages") Link: https://lore.kernel.org/r/20240530101044.17524-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-30 12:11:20 +02:00
Yue Haibing	b3dc6e8003	ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path. WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ #279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? __pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343) __qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555) __irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) The warning triggers as this: packet_sendmsg packet_snd //skb->sk is packet sk __dev_queue_xmit __dev_xmit_skb //q->enqueue is not NULL __qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out __ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this. Fixes: `2ad7bf3638` ("ipvlan: Initial check-in of the IPVLAN driver.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240529095633.613103-1-yuehaibing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-30 12:05:52 +02:00
Paolo Abeni	e889eb17f4	Merge tag 'nf-24-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 syzbot reports that nf_reinject() could be called without rcu_read_lock() when flushing pending packets at nfnetlink queue removal, from Eric Dumazet. Patch #2 flushes ipset list:set when canceling garbage collection to reference to other lists to fix a race, from Jozsef Kadlecsik. Patch #3 restores q-in-q matching with nft_payload by reverting `f6ae9f120d` ("netfilter: nft_payload: add C-VLAN support"). Patch #4 fixes vlan mangling in skbuff when vlan offload is present in skbuff, without this patch nft_payload corrupts packets in this case. Patch #5 fixes possible nul-deref in tproxy no IP address is found in netdevice, reported by syzbot and patch from Florian Westphal. Patch #6 removes a superfluous restriction which prevents loose fib lookups from input and forward hooks, from Eric Garver. My assessment is that patches #1, #2 and #5 address possible kernel crash, anything else in this batch fixes broken features. netfilter pull request 24-05-29 * tag 'nf-24-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_fib: allow from forward/input without iif selector netfilter: tproxy: bail out if IP has been disabled on the device netfilter: nft_payload: skbuff vlan metadata mangle support netfilter: nft_payload: restore vlan q-in-q match support netfilter: ipset: Add list flush to cancel_gc netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu() ==================== Link: https://lore.kernel.org/r/20240528225519.1155786-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-30 10:14:56 +02:00
Yuntao Wang	ed8c7fbdfe	fs/file: fix the check in find_next_fd() The maximum possible return value of find_next_zero_bit(fdt->full_fds_bits, maxbit, bitbit) is maxbit. This return value, multiplied by BITS_PER_LONG, gives the value of bitbit, which can never be greater than maxfd, it can only be equal to maxfd at most, so the following check 'if (bitbit > maxfd)' will never be true. Moreover, when bitbit equals maxfd, it indicates that there are no unused fds, and the function can directly return. Fix this check. Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev> Link: https://lore.kernel.org/r/20240529160656.209352-1-yuntao.wang@linux.dev Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-30 09:11:47 +02:00
Shay Agroskin	2dc8b1e717	net: ena: Fix redundant device NUMA node override The driver overrides the NUMA node id of the device regardless of whether it knows its correct value (often setting it to -1 even though the node id is advertised in 'struct device'). This can lead to suboptimal configurations. This patch fixes this behavior and makes the shared memory allocation functions use the NUMA node id advertised by the underlying device. Fixes: `1738cd3ed3` ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20240528170912.1204417-1-shayagr@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 19:01:34 -07:00
Jakub Kicinski	602d9591a7	Merge branch 'intel-wired-lan-driver-updates-2024-05-28-e1000e-i40e-ice' Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2024-05-28 (e1000e, i40e, ice) [part] This series includes a variety of fixes that have been accumulating on the Intel Wired LAN dev-queue. Hui Wang provides a fix for suspend/resume on e1000e due to failure to correctly setup the SMBUS in enable_ulp(). Thinh Tran provides a fix for EEH I/O suspend/resume on i40e to ensure that I/O operations can continue after a resume. To avoid duplicate code, the common logic is factored out of i40e_suspend and i40e_resume. Paul Greenwalt provides a fix to correctly map the 200G PHY types to link speeds in the ice driver. Dave Ertman provides a fix correcting devlink parameter unregistration in the event that the driver loads in safe mode and some of the parameters were not registered. ==================== Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-0-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:57:03 -07:00
Dave Ertman	a51c9b1c9a	ice: check for unregistering correct number of devlink params On module load, the ice driver checks for the lack of a specific PF capability to determine if it should reduce the number of devlink params to register. One situation when this test returns true is when the driver loads in safe mode. The same check is not present on the unload path when devlink params are unregistered. This results in the driver triggering a WARN_ON in the kernel devlink code. The current check and code path uses a reduction in the number of elements reported in the list of params. This is fragile and not good for future maintaining. Change the parameters to be held in two lists, one always registered and one dependent on the check. Add a symmetrical check in the unload path so that the correct parameters are unregistered as well. Fixes: `109eb29172` ("ice: Add tx_scheduling_layers devlink param") CC: Lukasz Czapnik <lukasz.czapnik@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-8-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:57:02 -07:00
Paul Greenwalt	2a6d8f2de2	ice: fix 200G PHY types to link speed mapping Commit `24407a01e5` ("ice: Add 200G speed/phy type use") added support for 200G PHY speeds, but did not include the mapping of 200G PHY types to link speed. As a result the driver is returning UNKNOWN link speed when setting 200G ethtool advertised link modes. To fix this add 200G PHY types to link speed mapping to ice_get_link_speed_based_on_phy_type(). Fixes: `24407a01e5` ("ice: Add 200G speed/phy type use") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-5-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:57:02 -07:00
Thinh Tran	c80b6538d3	i40e: Fully suspend and resume IO operations in EEH case When EEH events occurs, the callback functions in the i40e, which are managed by the EEH driver, will completely suspend and resume all IO operations. - In the PCI error detected callback, replaced i40e_prep_for_reset() with i40e_io_suspend(). The change is to fully suspend all I/O operations - In the PCI error slot reset callback, replaced pci_enable_device_mem() with pci_enable_device(). This change enables both I/O and memory of the device. - In the PCI error resume callback, replaced i40e_handle_reset_warning() with i40e_io_resume(). This change allows the system to resume I/O operations Fixes: `a5f3d2c17b` ("powerpc/pseries/pci: Add MSI domains") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Robert Thomas <rob.thomas@ibm.com> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-3-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:56:30 -07:00
Thinh Tran	218ed820d3	i40e: factoring out i40e_suspend/i40e_resume Two new functions, i40e_io_suspend() and i40e_io_resume(), have been introduced. These functions were factored out from the existing i40e_suspend() and i40e_resume() respectively. This factoring was done due to concerns about the logic of the I40E_SUSPENSED state, which caused the device to be unable to recover. The functions are now used in the EEH handling for device suspend/resume callbacks. The function i40e_enable_mc_magic_wake() has been moved ahead of i40e_io_suspend() to ensure it is declared before being used. Tested-by: Robert Thomas <rob.thomas@ibm.com> Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-2-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:56:30 -07:00
Hui Wang	bfd546a552	e1000e: move force SMBUS near the end of enable_ulp function The commit `861e808602` ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") introduces a regression on PCH_MTP_I219_LM18 (PCIID: 0x8086550A). Without the referred commit, the ethernet works well after suspend and resume, but after applying the commit, the ethernet couldn't work anymore after the resume and the dmesg shows that the NIC link changes to 10Mbps (1000Mbps originally): [ 43.305084] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx Without the commit, the force SMBUS code will not be executed if "return 0" or "goto out" is executed in the enable_ulp(), and in my case, the "goto out" is executed since FWSM_FW_VALID is set. But after applying the commit, the force SMBUS code will be ran unconditionally. Here move the force SMBUS code back to enable_ulp() and put it immediately ahead of hw->phy.ops.release(hw), this could allow the longest settling time as possible for interface in this function and doesn't change the original code logic. The issue was found on a Lenovo laptop with the ethernet hw as below: 00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a] (rev 20). And this patch is verified (cable plug and unplug, system suspend and resume) on Lenovo laptops with ethernet hw: [8086:550a], [8086:550b], [8086:15bb], [8086:15be], [8086:1a1f], [8086:1a1c] and [8086:0dc7]. Fixes: `861e808602` ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") Signed-off-by: Hui Wang <hui.wang@canonical.com> Acked-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-1-dc8593d2bbc6@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:56:30 -07:00
Jens Axboe	1521dc2410	Merge tag 'nvme-6.10-2024-05-29' of git://git.infradead.org/nvme into block-6.10 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.10 - Removing unused fields (Kanchan) - Large folio offsets support (Kundan) - Multipath NUMA node initialiazation fix (Nilay) - Multipath IO stats accounting fixes (Keith) - Circular lockdep fix (Keith) - Target race condition fix (Sagi) - Target memory leak fix (Sagi)" * tag 'nvme-6.10-2024-05-29' of git://git.infradead.org/nvme: nvmet: fix a possible leak when destroy a ctrl during qp establishment nvme: use srcu for iterating namespace list nvme: adjust multiples of NVME_CTRL_PAGE_SIZE in offset nvme: remove sgs and sws nvmet: fix ns enable/disable possible hang nvme-multipath: fix io accounting on failover nvme: fix multipath batched completion accounting nvme-multipath: find NUMA path only for online numa-node	2024-05-29 19:54:33 -06:00
Tristram Ha	278d65ccda	net: dsa: microchip: fix RGMII error in KSZ DSA driver The driver should return RMII interface when XMII is running in RMII mode. Fixes: `0ab7f6bf16` ("net: dsa: microchip: ksz9477: use common xmii function") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Acked-by: Jerry Ray <jerry.ray@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1716932066-3342-1-git-send-email-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:44:36 -07:00
Alexander Mikhalitsyn	b8c8abefc0	ipv4: correctly iterate over the target netns in inet_dump_ifaddr() A recent change to inet_dump_ifaddr had the function incorrectly iterate over net rather than tgt_net, resulting in the data coming for the incorrect network namespace. Fixes: `cdb2f80f1c` ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Stéphane Graber <stgraber@stgraber.org> Closes: https://github.com/lxc/incus/issues/892 Bisected-by: Stéphane Graber <stgraber@stgraber.org> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Tested-by: Stéphane Graber <stgraber@stgraber.org> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240528203030.10839-1-aleksandr.mikhalitsyn@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 18:43:42 -07:00
Eric Dumazet	92f1655aa2	net: fix __dst_negative_advice() race __dst_negative_advice() does not enforce proper RCU rules when sk->dst_cache must be cleared, leading to possible UAF. RCU rules are that we must first clear sk->sk_dst_cache, then call dst_release(old_dst). Note that sk_dst_reset(sk) is implementing this protocol correctly, while __dst_negative_advice() uses the wrong order. Given that ip6_negative_advice() has special logic against RTF_CACHE, this means each of the three ->negative_advice() existing methods must perform the sk_dst_reset() themselves. Note the check against NULL dst is centralized in __dst_negative_advice(), there is no need to duplicate it in various callbacks. Many thanks to Clement Lecigne for tracking this issue. This old bug became visible after the blamed commit, using UDP sockets. Fixes: `a87cb3e48e` ("net: Facility to report route quality of connected sockets") Reported-by: Clement Lecigne <clecigne@google.com> Diagnosed-by: Clement Lecigne <clecigne@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240528114353.1794151-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-29 17:34:49 -07:00
Javier Carrasco	a94ff8e50c	hwmon: (ltc2992) Fix memory leak in ltc2992_parse_dt() A new error path was added to the fwnode_for_each_available_node() loop in ltc2992_parse_dt(), which leads to an early return that requires a call to fwnode_handle_put() to avoid a memory leak in that case. Add the missing fwnode_handle_put() in the error path from a zero value shunt resistor. Cc: stable@vger.kernel.org Fixes: `10b0290204` ("hwmon: (ltc2992) Avoid division by zero") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20240523-fwnode_for_each_available_child_node_scoped-v2-1-701f3a03f2fb@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-05-29 15:34:46 -07:00
Armin Wolf	fa0bc8f297	hwmon: (dell-smm) Add Dell G15 5511 to fan control whitelist A user reported that he needs to disable BIOS fan control on his Dell G15 5511 in order to be able to control the fans. Closes: https://github.com/Wer-Wolf/i8kutils/issues/5 Signed-off-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20240522210809.294488-1-W_Armin@gmx.de Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-05-29 15:31:12 -07:00
Heiner Kallweit	67c7d4fa26	drm/amd/pm: remove deprecated I2C_CLASS_SPD support from newly added SMU_14_0_2 Support for I2C_CLASS_SPD is currently being removed from the kernel. Only remaining step is to remove the definition of I2C_CLASS_SPD. Setting I2C_CLASS_SPD in a driver is a no-op meanwhile, so remove it here. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:08:08 -04:00
Rajneesh Bhardwaj	a9bc5a19e4	drm/amdgpu: Make CPX mode auto default in NPS4 On GFXIP9.4.3, make CPX mode as the default compute mode if the node is setup in NPS4 memory partition mode. This change is only applicable for dGPU, for APU, continue to use TPX mode. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:06:48 -04:00
Alex Deucher	1f327dfc84	drm/amdkfd: simplify APU VRAM handling With commit `89773b8559` ("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs") big and small APU "VRAM" handling in KFD was unified. Since AMD_IS_APU is set for both big and small APUs, we can simplify the checks in the code. v2: clean up a few more places (Lang) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:06:15 -04:00
Alex Deucher	dd2b75fd9a	Revert "drm/amdkfd: fix gfx_target_version for certain 11.0.3 devices" This reverts commit `28ebbb4981`. Revert this commit as apparently the LLVM code to take advantage of this never landed. Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Feifei Xu <feifei.xu@amd.com>	2024-05-29 17:06:04 -04:00
Jesse Zhang	a0cf36546c	drm/amdgpu: fix dereference null return value for the function amdgpu_vm_pt_parent The pointer parent may be NULLed by the function amdgpu_vm_pt_parent. To make the code more robust, check the pointer parent. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:03:20 -04:00
Alex Deucher	05d9e24ddb	drm/amdgpu: silence UBSAN warning Convert a variable sized array from [1] to []. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:02:41 -04:00
Alex Deucher	ba46b3bda2	drm/amdgpu: Adjust logic in amdgpu_device_partner_bandwidth() Use current speed/width on devices which don't support dynamic PCIe switching. Fixes: `466a7d1153` ("drm/amd: Use the first non-dGPU PCI device for BW limits") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3289 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-29 17:01:49 -04:00
Kent Overstreet	7b038b564b	bcachefs: Fix failure to return error on misaligned dio write This was reported as an error when running coreutils shred. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-29 16:40:30 -04:00
Samuel Holland	2607133196	clk: sifive: Do not register clkdevs for PRCI clocks These clkdevs were unnecessary, because systems using this driver always look up clocks using the devicetree. And as Russell King points out[1], since the provided device name was truncated, lookups via clkdev would never match. Recently, commit `8d532528ff` ("clkdev: report over-sized strings when creating clkdev entries") caused clkdev registration to fail due to the truncation, and this now prevents the driver from probing. Fix the driver by removing the clkdev registration. Link: https://lore.kernel.org/linux-clk/ZkfYqj+OcAxd9O2t@shell.armlinux.org.uk/ [1] Fixes: `30b8e27e3b` ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block") Fixes: `8d532528ff` ("clkdev: report over-sized strings when creating clkdev entries") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/linux-clk/7eda7621-0dde-4153-89e4-172e4c095d01@roeck-us.net/ Suggested-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20240528001432.1200403-1-samuel.holland@sifive.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-05-29 12:31:02 -07:00
Michael Ellerman	e8b8c5264d	selftests/overlayfs: Fix build error on ppc64 Fix build error on ppc64: dev_in_maps.c: In function ‘get_file_dev_and_inode’: dev_in_maps.c:60:59: error: format ‘%llu’ expects argument of type ‘long long unsigned int ’, but argument 7 has type ‘__u64 ’ {aka ‘long unsigned int *’} [-Werror=format=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:26:40 -06:00
Michael Ellerman	84b6df4c49	selftests/openat2: Fix build warnings on ppc64 Fix warnings like: openat2_test.c: In function ‘test_openat2_flags’: openat2_test.c:303:73: warning: format ‘%llX’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:26:11 -06:00
Michael Ellerman	bc4d5f5d2d	selftests: cachestat: Fix build warnings on ppc64 Fix warnings like: test_cachestat.c: In function ‘print_cachestat’: test_cachestat.c:30:38: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:24:44 -06:00
Steven Rostedt (Google)	23a4b108ac	tracing/selftests: Fix kprobe event name test for .isra. functions The kprobe_eventname.tc test checks if a function with .isra. can have a kprobe attached to it. It loops through the kallsyms file for all the functions that have the .isra. name, and checks if it exists in the available_filter_functions file, and if it does, it uses it to attach a kprobe to it. The issue is that kprobes can not attach to functions that are listed more than once in available_filter_functions. With the latest kernel, the function that is found is: rapl_event_update.isra.0 # grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions rapl_event_update.isra.0 rapl_event_update.isra.0 It is listed twice. This causes the attached kprobe to it to fail which in turn fails the test. Instead of just picking the function function that is found in available_filter_functions, pick the first one that is listed only once in available_filter_functions. Cc: stable@vger.kernel.org Fixes: `604e354823` ("selftests/ftrace: Select an existing function in kprobe_eventname test") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:24:31 -06:00
Masami Hiramatsu (Google)	7ea794604b	selftests/ftrace: Update required config Update required config options for running all tests. This also sorts the config entries alphabetically. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:24:14 -06:00
Masami Hiramatsu (Google)	f6c3c83db1	selftests/ftrace: Fix to check required event file The dynevent/test_duplicates.tc test case uses `syscalls/sys_enter_openat` event for defining eprobe on it. Since this `syscalls` events depend on CONFIG_FTRACE_SYSCALLS=y, if it is not set, the test will fail. Add the event file to `required` line so that the test will return `unsupported` result. Fixes: `297e1dcdca` ("selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:24:07 -06:00
Mark Brown	2032e61e24	kselftest/alsa: Ensure _GNU_SOURCE is defined The pcmtest driver tests use the kselftest harness which requires that _GNU_SOURCE is defined but nothing causes it to be defined. Since the KHDR_INCLUDES Makefile variable has had the required define added let's use that, this should provide some futureproofing. Fixes: `daef47b89e` ("selftests: Compile kselftest headers with -D_GNU_SOURCE") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-29 12:23:57 -06:00
Rob Herring (Arm)	321e4fa68c	dt-bindings: arm: stm32: st,mlahb: Drop spurious "reg" property from example "reg" is not documented nor used for st,mlahb, so drop it from the example to fix the warning: Documentation/devicetree/bindings/arm/stm32/st,mlahb.example.dtb: ahb@38000000: Unevaluated properties are not allowed ('reg' was unexpected) from schema $id: http://devicetree.org/schemas/arm/stm32/st,mlahb.yaml# Since "reg" is dropped, the unit-address must be as well. Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240523154208.2457864-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-05-29 13:22:14 -05:00
Rob Herring (Arm)	84081a8853	dt-bindings: arm: sunxi: Fix incorrect '-' usage Commit `6bc6bf8a94` ("dt-bindings: arm: sunxi: document Anbernic RG35XX handheld gaming device variants") mistakenly added '-' on each line which created empty (i.e. description only) schemas matching anything. This causes validation to fail on all the root node compatibles as there are multiple oneOf clauses passing. Fixes: `6bc6bf8a94` ("dt-bindings: arm: sunxi: document Anbernic RG35XX handheld gaming device variants") Reviewed-by: Ryan Walklin <ryan@testtoast.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240503154402.967632-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-05-29 13:22:14 -05:00
Uwe Kleine-König	95d7c452a2	spi: stm32: Don't warn about spurious interrupts The dev_warn to notify about a spurious interrupt was introduced with the reasoning that these are unexpected. However spurious interrupts tend to trigger continously and the error message on the serial console prevents that the core's detection of spurious interrupts kicks in (which disables the irq) and just floods the console. Fixes: `c64e7efe46` ("spi: stm32: make spurious and overrun interrupts visible") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://msgid.link/r/20240521105241.62400-2-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-29 19:12:09 +01:00
Takashi Iwai	bc42ca002d	ALSA: ump: Set default protocol when not given explicitly When an inquiry of the current protocol via UMP Stream Configuration message fails by some reason, we may leave the current protocol undefined, which may lead to unexpected behavior. Better to assume a valid protocol found in the protocol capability bits instead. For a device that doesn't support the UMP v1.2 feature, it won't reach to this code path, and USB MIDI GTB descriptor would be used for determining the protocol, instead. Link: https://lore.kernel.org/r/20240529164723.18309-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-29 18:49:00 +02:00
Takashi Iwai	ac0d71ee53	ALSA: ump: Don't accept an invalid UMP protocol number When a UMP Stream Configuration message is received, the driver tries to switch the protocol, but there was no sanity check of the protocol, hence it can pass an invalid value. Add the check and bail out if a wrong value is passed. Fixes: `a798076837` ("ALSA: ump: Add helper to change MIDI protocol") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240529164723.18309-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-29 18:48:51 +02:00
Linus Torvalds	4a4be1ad3a	Revert "vfs: Delete the associated dentry when deleting a file" This reverts commit `681ce86235`. We gave it a try, but it turns out the kernel test robot did in fact find performance regressions for it, so we'll have to look at the more involved alternative fixes for Yafang Shao's Elasticsearch load issue. There were several alternatives discussed, they just weren't as simple as this first attempt. The report is of a -7.4% regression of filebench.sum_operations/s, which appears significant enough to trigger my "this patch may get reverted if somebody finds a performance regression on some other load" rule. So it's still the case that we should end up deleting dentries more aggressively - or just be better at pruning them later - but it needs a bit more finesse than this simple thing. Link: https://lore.kernel.org/all/202405291318.4dfbb352-oliver.sang@intel.com/ Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-05-29 09:39:34 -07:00
Linus Torvalds	397a83ab97	Merge tag '9p-for-6.10-rc2' of https://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Two fixes headed to stable trees: - a trace event was dumping uninitialized values - a missing lock that was thought to have exclusive access, and it turned out not to" * tag '9p-for-6.10-rc2' of https://github.com/martinetd/linux: 9p: add missing locking around taking dentry fid list net/9p: fix uninit-value in p9_client_rpc()	2024-05-29 09:25:15 -07:00
Rob Herring (Arm)	1b1c9f0fd3	dt-bindings: kbuild: Fix dt_binding_check on unconfigured build The 'dt_binding_check' target shouldn't depend on the kernel configuration, but it has since commit `604a57ba97` ("dt-bindings: kbuild: Add separate target/dependency for processed-schema.json"). That is because CHECK_DT_BINDING make variable was dropped, but scripts/dtc/Makefile was missed. The CHECK_DTBS variable can be used instead. Reported-by: Francesco Dolcini <francesco.dolcini@toradex.com> Fixes: `604a57ba97` ("dt-bindings: kbuild: Add separate target/dependency for processed-schema.json") Signed-off-by: "Rob Herring (Arm)" <robh@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-30 01:15:58 +09:00
Miguel Ojeda	6e58e01735	kheaders: use `command -v` to test for existence of `cpio` Commit `13e1df0928` ("kheaders: explicitly validate existence of cpio command") added an explicit check for `cpio` using `type`. However, `type` in `dash` (which is used in some popular distributions and base images as the shell script runner) prints the missing message to standard output, and thus no error is printed: $ bash -c 'type missing >/dev/null' bash: line 1: type: missing: not found $ dash -c 'type missing >/dev/null' $ For instance, this issue may be seen by loongarch builders, given its defconfig enables CONFIG_IKHEADERS since commit `9cc1df421f` ("LoongArch: Update Loongson-3 default config file"). Therefore, use `command -v` instead to have consistent behavior, and take the chance to provide a more explicit error. Fixes: `13e1df0928` ("kheaders: explicitly validate existence of cpio command") Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-30 01:13:20 +09:00
Matthias Maennich	3bd27a847a	kheaders: explicitly define file modes for archived headers Build environments might be running with different umask settings resulting in indeterministic file modes for the files contained in kheaders.tar.xz. The file itself is served with 444, i.e. world readable. Archive the files explicitly with 744,a+X to improve reproducibility across build environments. --mode=0444 is not suitable as directories need to be executable. Also, 444 makes it hard to delete all the readonly files after extraction. Cc: stable@vger.kernel.org Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-30 01:13:20 +09:00
Linus Torvalds	db163660b0	Merge tag 'v6.10-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a new run-time warning triggered by tpm" * tag 'v6.10-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: core - Remove add_early_randomness	2024-05-29 09:12:58 -07:00
Jens Axboe	06fe9b1df1	io_uring: don't attempt to mmap larger than what the user asks for If IORING_FEAT_SINGLE_MMAP is ignored, as can happen if an application uses an ancient liburing or does setup manually, then 3 mmap's are required to map the ring into userspace. The kernel will still have collapsed the mappings, however userspace may ask for mapping them individually. If so, then we should not use the full number of ring pages, as it may exceed the partial mapping. Doing so will yield an -EFAULT from vm_insert_pages(), as we pass in more pages than what the application asked for. Cap the number of pages to match what the application asked for, for the particular mapping operation. Reported-by: Lucas Mülling <lmulling@proton.me> Link: https://github.com/axboe/liburing/issues/1157 Fixes: `3ab1db3c60` ("io_uring: get rid of remap_pfn_range() for mapping rings/sqes") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-29 09:53:14 -06:00
Andy Shevchenko	9dedabe95b	spi: Assign dummy scatterlist to unidirectional transfers Commit `8cc3bad9d9` ("spi: Remove unneded check for orig_nents") introduced a regression: unmapped data could now be passed to the DMA APIs, resulting in null pointer dereferences. Commit `9f788ba457` ("spi: Don't mark message DMA mapped when no transfer in it is") and commit `da560097c0` ("spi: Check if transfer is mapped before calling DMA sync APIs") addressed the problem, but only partially. Unidirectional transactions will still result in null pointer dereference. To prevent that from happening, assign a dummy scatterlist when no data is mapped, so that the DMA API can be called and not result in a null pointer dereference. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Closes: https://lore.kernel.org/r/8ae675b5-fcf9-4c9b-b06a-4462f70e1322@linaro.org Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Closes: https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano Closes: https://lore.kernel.org/all/4748499f-789c-45a8-b50a-2dd09f4bac8c@notapiano Fixes: `8cc3bad9d9` ("spi: Remove unneded check for orig_nents") Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> [nfraprado: wrote the commit message] Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Link: https://msgid.link/r/20240529-dma-oops-dummy-v1-1-bb43aacfb11b@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-29 16:47:53 +01:00
Mark Brown	ba2e8323d7	ASoC: SOF: add missing MODULE_DESCRIPTION Merge series from Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>: 'make W=1' now reports missing MODULE_DESCRIPTION lines. This patchset cleans-up all the module definitions and adds MODULE_DESCRIPTION lines as needed.	2024-05-29 15:46:05 +01:00
Dmitry Antipov	92ecbb3ac6	wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan() When testing the previous patch with CONFIG_UBSAN_BOUNDS, I've noticed the following: UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:372:4 index 0 is out of range for type 'struct ieee80211_channel *[]' CPU: 0 PID: 1435 Comm: wpa_supplicant Not tainted 6.9.0+ #1 Hardware name: LENOVO 20UN005QRT/20UN005QRT <...BIOS details...> Call Trace: <TASK> dump_stack_lvl+0x2d/0x90 __ubsan_handle_out_of_bounds+0xe7/0x140 ? timerqueue_add+0x98/0xb0 ieee80211_prep_hw_scan+0x2db/0x480 [mac80211] ? __kmalloc+0xe1/0x470 __ieee80211_start_scan+0x541/0x760 [mac80211] rdev_scan+0x1f/0xe0 [cfg80211] nl80211_trigger_scan+0x9b6/0xae0 [cfg80211] ...<the rest is not too useful...> Since '__ieee80211_start_scan()' leaves 'hw_scan_req->req.n_channels' uninitialized, actual boundaries of 'hw_scan_req->req.channels' can't be checked in 'ieee80211_prep_hw_scan()'. Although an initialization of 'hw_scan_req->req.n_channels' introduces some confusion around allocated vs. used VLA members, this shouldn't be a problem since everything is correctly adjusted soon in 'ieee80211_prep_hw_scan()'. Cleanup 'kmalloc()' math in '__ieee80211_start_scan()' by using the convenient 'struct_size()' as well. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://msgid.link/20240517153332.18271-2-dmantipov@yandex.ru [improve (imho) indentation a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:58:54 +02:00
Lingbo Kong	a26d8dc522	wifi: mac80211: correctly parse Spatial Reuse Parameter Set element Currently, the way of parsing Spatial Reuse Parameter Set element is incorrect and some members of struct ieee80211_he_obss_pd are not assigned. To address this issue, it must be parsed in the order of the elements of Spatial Reuse Parameter Set defined in the IEEE Std 802.11ax specification. The diagram of the Spatial Reuse Parameter Set element (IEEE Std 802.11ax -2021-9.4.2.252). ------------------------------------------------------------------------- \| \| \| \| \|Non-SRG\| SRG \| SRG \| SRG \| SRG \| \|Element\|Length\| Element \| SR \|OBSS PD\|OBSS PD\|OBSS PD\| BSS \|Partial\| \| ID \| \| ID \|Control\| Max \| Min \| Max \|Color \| BSSID \| \| \| \|Extension\| \| Offset\| Offset\|Offset \|Bitmap\|Bitmap \| ------------------------------------------------------------------------- Fixes: `1ced169cc1` ("mac80211: allow setting spatial reuse parameters from bss_conf") Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> Link: https://msgid.link/20240516021854.5682-3-quic_lingbok@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:35:12 +02:00
Lingbo Kong	0c2fd18f7e	wifi: mac80211: fix Spatial Reuse element size check Currently, the way to check the size of Spatial Reuse IE data in the ieee80211_parse_extension_element() is incorrect. This is because the len variable in the ieee80211_parse_extension_element() function is equal to the size of Spatial Reuse IE data minus one and the value of returned by the ieee80211_he_spr_size() function is equal to the length of Spatial Reuse IE data. So the result of the len >= ieee80211_he_spr_size(data) statement always false. To address this issue and make it consistent with the logic used elsewhere with ieee80211_he_oper_size(), change the "len >= ieee80211_he_spr_size(data)" to “len >= ieee80211_he_spr_size(data) - 1”. Fixes: `9d0480a7c0` ("wifi: mac80211: move element parsing to a new file") Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> Link: https://msgid.link/20240516021854.5682-2-quic_lingbok@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:34:46 +02:00
Emmanuel Grumbach	4bb95f4535	wifi: iwlwifi: mvm: don't read past the mfuart notifcation In case the firmware sends a notification that claims it has more data than it has, we will read past that was allocated for the notification. Remove the print of the buffer, we won't see it by default. If needed, we can see the content with tracing. This was reported by KFENCE. Fixes: `bdccdb854f` ("iwlwifi: mvm: support MFUART dump in case of MFUART assert") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aebf2794f553d9b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:14 +02:00
Ilan Peer	e6dd2936ce	wifi: iwlwifi: mvm: Fix scan abort handling with HW rfkill When HW rfkill is toggled to disable the RF, the flow to stop scan is called. When trying to send the command to abort the scan, since HW rfkill is toggled, the command is not sent due to rfkill being asserted, and -ERFKILL is returned from iwl_trans_send_cmd(), but this is silently ignored in iwl_mvm_send_cmd() and thus the scan abort flow continues to wait for scan complete notification and fails. Since it fails, the UID to type mapping is not cleared, and thus a warning is later fired when trying to stop the interface. To fix this, modify the UMAC scan abort flow to force sending the scan abort command even when in rfkill, so stop the FW from accessing the radio etc. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.8cbe2f8c1a97.Iffe235c12a919dafec88eef399eb1f7bae2c5bdb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:14 +02:00
Miri Korenblit	60d62757df	wifi: iwlwifi: mvm: check n_ssids before accessing the ssids In some versions of cfg80211, the ssids poinet might be a valid one even though n_ssids is 0. Accessing the pointer in this case will cuase an out-of-bound access. Fix this by checking n_ssids first. Fixes: `c1a7515393` ("iwlwifi: mvm: add adaptive dwell support") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20240513132416.6e4d1762bf0d.I5a0e6cc8f02050a766db704d15594c61fe583d45@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Ayala Beker	989830d1cf	wifi: iwlwifi: mvm: properly set 6 GHz channel direct probe option Ensure that the 6 GHz channel is configured with a valid direct BSSID, avoiding any invalid or multicast BSSID addresses. Signed-off-by: Ayala Beker <ayala.beker@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.91a631a0fe60.I2ea2616af9b8a2eaf959b156c69cf65a2f1204d4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Johannes Berg	4d08c0b335	wifi: iwlwifi: mvm: handle BA session teardown in RF-kill When entering RF-kill, mac80211 tears down BA sessions, but due to RF-kill the commands aren't sent to the device. As a result, there can be frames pending on the reorder buffer or perhaps even received while doing so, leading to warnings. Avoid the warnings by doing the BA session teardown normally even in RF-kill, which also requires queue sync. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.0762cd80fb3d.I43c5877f3b546159b2db4f36d6d956b333c41cf0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Yedidya Benshimol	08b16d1b59	wifi: iwlwifi: mvm: Handle BIGTK cipher in kek_kck cmd The BIGTK cipher field was added to the kek_kck_material_cmd but wasn't assigned. Fix that by differentiating between the IGTK/BIGTK keys and assign the ciphers fields accordingly. Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.7fd0b22b7267.Ie9b581652b74bd7806980364d59e1b2e78e682c0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Benjamin Berg	cc3ba78f20	wifi: iwlwifi: mvm: remove stale STA link data during restart If pre-recovery mac80211 tried to disable a link but this disablement failed, then there might be a mismatch between mac80211 assuming the link has been disabled and the driver still having the data around. During recover itself, that is not a problem, but should the link be activated again at a later point, iwlwifi will refuse the activation as it detects the inconsistent state. Solve this corner-case by iterating the station in the restart cleanup handler. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.d2fd60338055.I840d4fdce5fd49fe69896d928b071067e3730259@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Shahar S Matityahu	87821b67de	wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef The driver should call iwl_dbg_tlv_free even if debugfs is not defined since ini mode does not depend on debugfs ifdef. Fixes: `68f6f492c4` ("iwlwifi: trans: support loading ini TLVs from external file") Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.c8e3723f55b0.I5e805732b0be31ee6b83c642ec652a34e974ff10@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Mordechay Goodstein	0f2e9f6f21	wifi: iwlwifi: mvm: set properly mac header In the driver we only use skb_put* for adding data to the skb, hence data never moves and skb_reset_mac_haeder would set mac_header to the first time data was added and not to mac80211 header, fix this my using the actual len of bytes added for setting the mac header. Fixes: `3f7a9d577d` ("wifi: iwlwifi: mvm: simplify by using SKB MAC header pointer") Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.12f2de2909c3.I72a819b96f2fe55bde192a8fd31a4b96c301aa73@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:13 +02:00
Johannes Berg	4a7aace289	wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 We don't actually support >64 even for HE devices, so revert back to 64. This fixes an issue where the session is refused because the queue is configured differently from the actual session later. Fixes: `514c30696f` ("iwlwifi: add support for IEEE802.11ax") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Liad Kaufman <liad.kaufman@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.52f7b4cf83aa.If47e43adddf7fe250ed7f5571fbb35d8221c7c47@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:11 +02:00
Yedidya Benshimol	b7ffca9931	wifi: iwlwifi: mvm: d3: fix WoWLAN command version lookup After moving from commands to notificaitons in the d3 resume flow, removing the WOWLAN_GET_STATUSES and REPLY_OFFLOADS_QUERY_CMD causes the return of the default value when looking up their version. Returning zero here results in the driver sending the not supported NON_QOS_TX_COUNTER_CMD. Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com> Reviewed-by: Gregory Greenman <gregory.greenman@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.8cabfd580614.If3a0db9851f56041f8f5360959354abd5379224a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:30:09 +02:00
Emmanuel Grumbach	788e4c75f8	wifi: iwlwifi: mvm: fix a crash on 7265 Since IWL_FW_CMD_VER_UNKNOWN = 99, then my change to consider cmd_ver >= 7 instead of cmd_ver = 7 included also firmwares that don't advertise the command version at all. This made us send a command with a bad size and because of that, the firmware hit a BAD_COMMAND immediately after handling the REDUCE_TX_POWER_CMD command. Fixes: `8f892e225f` ("wifi: iwlwifi: mvm: support iwl_dev_tx_power_cmd_v8") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240512072733.eb20ff5050d3.Ie4fc6f5496cd296fd6ff20d15e98676f28a3cccd@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:26:54 +02:00
Shaul Triebitz	98b7017ddb	wifi: iwlwifi: mvm: always set the TWT IE offset In beacon template version 14, make sure to always set the TWT IE offset before sending the beacon template command, also in the debugfs inject_beacon_ie path. If the TWT IE does not exist, the offset will be set to zero. Fixes: `bf0212fd8f` ("wifi: iwlwifi: mvm: add beacon template version 14") Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240512152312.eb27175c345a.If30ef24aba10fe47fd42a7a9703eb8903035e294@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:26:05 +02:00
Miri Korenblit	92158790ce	wifi: iwlwifi: mvm: don't initialize csa_work twice The initialization of this worker moved to iwl_mvm_mac_init_mvmvif but we removed only from the pre-MLD version of the add_interface callback. Remove it also from the MLD version. Fixes: `0bcc215598` ("wifi: iwlwifi: mvm: init vif works only once") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20240512152312.4f15b41604f0.Iec912158e5a706175531d3736d77d25adf02fba4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:26:05 +02:00
Aditya Kumar Singh	8ecc4d7a7c	wifi: mac80211: pass proper link id for channel switch started notification Original changes[1] posted is having proper changes. However, at the same time, there was chandef puncturing changes which had a conflict with this. While applying, two errors crept in - a) Whitespace error. b) Link ID being passed to channel switch started notifier function is 0. However proper link ID is present in the function. Fix these now. [1] https://lore.kernel.org/all/20240130140918.1172387-5-quic_adisi@quicinc.com/ Fixes: `1a96bb4e8a` ("wifi: mac80211: start and finalize channel switch on link basis") Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com> Link: https://msgid.link/20240509032555.263933-1-quic_adisi@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:25:36 +02:00
Johannes Berg	f7a8b10bfd	wifi: cfg80211: fix 6 GHz scan request building The 6 GHz scan request struct allocated by cfg80211_scan_6ghz() is meant to be formed this way: [base struct][channels][ssids][6ghz_params] It is allocated with [channels] as the maximum number of channels supported by the driver in the 6 GHz band, since allocation is before knowing how many there will be. However, the inner pointers are set incorrectly: initially, the 6 GHz scan parameters pointer is set: [base struct][channels] ^ scan_6ghz_params and later the SSID pointer is set to the end of the actually _used_ channels. [base struct][channels] ^ ssids If many APs were to be discovered, and many channels used, and there were many SSIDs, then the SSIDs could overlap the 6 GHz parameters. Additionally, the request->ssids for most of the function points to the original request still (given the struct copy) but is used normally, which is confusing. Clear this up, by actually using the allocated space for 6 GHz parameters _after_ the SSIDs, and set up the SSIDs initially so they are used more clearly. Just like in nl80211.c, set them only if there actually are SSIDs though. Finally, also copy the elements (ie/ie_len) so they're part of the same request, not pointing to the old request. Co-developed-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20240510113738.4190692ef4ee.I0cb19188be17a8abd029805e3373c0a7777c214c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:25:25 +02:00
Johannes Berg	177c6ae972	wifi: mac80211: handle tasklet frames before stopping The code itself doesn't want to handle frames from the driver if it's already stopped, but if the tasklet was queued before and runs after the stop, then all bets are off. Flush queues before actually stopping, RX should be off at this point since all the interfaces are removed already, etc. Reported-by: syzbot+8830db5d3593b5546d2e@syzkaller.appspotmail.com Link: https://msgid.link/20240515135318.b05f11385c9a.I41c1b33a2e1814c3a7ef352cd7f2951b91785617@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:25:10 +02:00
Johannes Berg	02c665f048	wifi: mac80211: apply mcast rate only if interface is up If the interface isn't enabled, don't apply multicast rate changes immediately. Reported-by: syzbot+de87c09cc7b964ea2e23@syzkaller.appspotmail.com Link: https://msgid.link/20240515133410.d6cffe5756cc.I47b624a317e62bdb4609ff7fa79403c0c444d32d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:24:52 +02:00
Lin Ma	ab904521f4	wifi: cfg80211: pmsr: use correct nla_get_uX functions The commit `9bb7e0f24e` ("cfg80211: add peer measurement with FTM initiator API") defines four attributes NL80211_PMSR_FTM_REQ_ATTR_ {NUM_BURSTS_EXP}/{BURST_PERIOD}/{BURST_DURATION}/{FTMS_PER_BURST} in following ways. static const struct nla_policy nl80211_pmsr_ftm_req_attr_policy[NL80211_PMSR_FTM_REQ_ATTR_MAX + 1] = { ... [NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP] = NLA_POLICY_MAX(NLA_U8, 15), [NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD] = { .type = NLA_U16 }, [NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION] = NLA_POLICY_MAX(NLA_U8, 15), [NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST] = NLA_POLICY_MAX(NLA_U8, 31), ... }; That is, those attributes are expected to be NLA_U8 and NLA_U16 types. However, the consumers of these attributes in `pmsr_parse_ftm` blindly all use `nla_get_u32`, which is incorrect and causes functionality issues on little-endian platforms. Hence, fix them with the correct `nla_get_u8` and `nla_get_u16` functions. Fixes: `9bb7e0f24e` ("cfg80211: add peer measurement with FTM initiator API") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://msgid.link/20240521075059.47999-1-linma@zju.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:23:54 +02:00
Remi Pommarel	642f89daa3	wifi: cfg80211: Lock wiphy in cfg80211_get_station Wiphy should be locked before calling rdev_get_station() (see lockdep assert in ieee80211_get_station()). This fixes the following kernel NULL dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Mem abort info: ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000 [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000 Internal error: Oops: 0000000096000006 [#1] SMP Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705 Hardware name: RPT (r1) (DT) Workqueue: bat_events batadv_v_elp_throughput_metric_update pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core] lr : sta_set_sinfo+0xcc/0xbd4 sp : ffff000007b43ad0 x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98 x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000 x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000 x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000 x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000 x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90 x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000 Call trace: ath10k_sta_statistics+0x10/0x2dc [ath10k_core] sta_set_sinfo+0xcc/0xbd4 ieee80211_get_station+0x2c/0x44 cfg80211_get_station+0x80/0x154 batadv_v_elp_get_throughput+0x138/0x1fc batadv_v_elp_throughput_metric_update+0x1c/0xa4 process_one_work+0x1ec/0x414 worker_thread+0x70/0x46c kthread+0xdc/0xe0 ret_from_fork+0x10/0x20 Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814) This happens because STA has time to disconnect and reconnect before batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In this situation, ath10k_sta_state() can be in the middle of resetting arsta data when the work queue get chance to be scheduled and ends up accessing it. Locking wiphy prevents that. Fixes: `7406353d43` ("cfg80211: implement cfg80211_get_station cfg80211 API") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Reviewed-by: Nicolas Escande <nico.escande@gmail.com> Acked-by: Antonio Quartulli <a@unstable.cc> Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:23:41 +02:00
Johannes Berg	e296c95eac	wifi: cfg80211: fully move wiphy work to unbound workqueue Previously I had moved the wiphy work to the unbound system workqueue, but missed that when it restarts and during resume it was still using the normal system workqueue. Fix that. Fixes: `91d20ab9d9` ("wifi: cfg80211: use system_unbound_wq for wiphy work") Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240522124126.7ca959f2cbd3.I3e2a71ef445d167b84000ccf934ea245aef8d395@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:23:33 +02:00
Johannes Berg	4dc3a3893d	wifi: cfg80211: validate HE operation element parsing Validate that the HE operation element has the correct length before parsing it. Cc: stable@vger.kernel.org Fixes: `645f3d8512` ("wifi: cfg80211: handle UHB AP and STA power type") Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240523120533.677025eb4a92.I44c091029ef113c294e8fe8b9bf871bf5dbeeb27@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:20:11 +02:00
Remi Pommarel	44c06bbde6	wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup() The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to synchronizes with ieee80211_tx_h_unicast_ps_buf() which is called from softirq context. However using only spin_lock() to get sta->ps_lock in ieee80211_sta_ps_deliver_wakeup() does not prevent softirq to execute on this same CPU, to run ieee80211_tx_h_unicast_ps_buf() and try to take this same lock ending in deadlock. Below is an example of rcu stall that arises in such situation. rcu: INFO: rcu_sched self-detected stall on CPU rcu: 2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996 rcu: (t=42586894 jiffies g=2057 q=362405 ncpus=4) CPU: 2 PID: 719 Comm: wpa_supplicant Tainted: G W 6.4.0-02158-g1b062f552873 #742 Hardware name: RPT (r1) (DT) pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : queued_spin_lock_slowpath+0x58/0x2d0 lr : invoke_tx_handlers_early+0x5b4/0x5c0 sp : ffff00001ef64660 x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8 x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000 x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000 x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000 x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80 x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440 x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880 x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8 Call trace: queued_spin_lock_slowpath+0x58/0x2d0 ieee80211_tx+0x80/0x12c ieee80211_tx_pending+0x110/0x278 tasklet_action_common.constprop.0+0x10c/0x144 tasklet_action+0x20/0x28 _stext+0x11c/0x284 ____do_softirq+0xc/0x14 call_on_irq_stack+0x24/0x34 do_softirq_own_stack+0x18/0x20 do_softirq+0x74/0x7c __local_bh_enable_ip+0xa0/0xa4 _ieee80211_wake_txqs+0x3b0/0x4b8 __ieee80211_wake_queue+0x12c/0x168 ieee80211_add_pending_skbs+0xec/0x138 ieee80211_sta_ps_deliver_wakeup+0x2a4/0x480 ieee80211_mps_sta_status_update.part.0+0xd8/0x11c ieee80211_mps_sta_status_update+0x18/0x24 sta_apply_parameters+0x3bc/0x4c0 ieee80211_change_station+0x1b8/0x2dc nl80211_set_station+0x444/0x49c genl_family_rcv_msg_doit.isra.0+0xa4/0xfc genl_rcv_msg+0x1b0/0x244 netlink_rcv_skb+0x38/0x10c genl_rcv+0x34/0x48 netlink_unicast+0x254/0x2bc netlink_sendmsg+0x190/0x3b4 ____sys_sendmsg+0x1e8/0x218 ___sys_sendmsg+0x68/0x8c __sys_sendmsg+0x44/0x84 __arm64_sys_sendmsg+0x20/0x28 do_el0_svc+0x6c/0xe8 el0_svc+0x14/0x48 el0t_64_sync_handler+0xb0/0xb4 el0t_64_sync+0x14c/0x150 Using spin_lock_bh()/spin_unlock_bh() instead prevents softirq to raise on the same CPU that is holding the lock. Fixes: `1d147bfa64` ("mac80211: fix AP powersave TX vs. wakeup race") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Link: https://msgid.link/8e36fe07d0fbc146f89196cd47a53c8a0afe84aa.1716910344.git.repk@triplefau.lt Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:19:55 +02:00
Nicolas Escande	6f6291f09a	wifi: mac80211: mesh: init nonpeer_pm to active by default in mesh sdata With a ath9k device I can see that: iw phy phy0 interface add mesh0 type mp ip link set mesh0 up iw dev mesh0 scan Will start a scan with the Power Management bit set in the Frame Control Field. This is because we set this bit depending on the nonpeer_pm variable of the mesh iface sdata and when there are no active links on the interface it remains to NL80211_MESH_POWER_UNKNOWN. As soon as links starts to be established, it wil switch to NL80211_MESH_POWER_ACTIVE as it is the value set by befault on the per sta nonpeer_pm field. As we want no power save by default, (as expressed with the per sta ini values), lets init it to the expected default value of NL80211_MESH_POWER_ACTIVE. Also please note that we cannot change the default value from userspace prior to establishing a link as using NL80211_CMD_SET_MESH_CONFIG will not work before NL80211_CMD_JOIN_MESH has been issued. So too late for our initial scan. Signed-off-by: Nicolas Escande <nico.escande@gmail.com> Link: https://msgid.link/20240527141759.299411-1-nico.escande@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:19:45 +02:00
Nicolas Escande	b7d7f11a29	wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects The hwmp code use objects of type mesh_preq_queue, added to a list in ieee80211_if_mesh, to keep track of mpath we need to resolve. If the mpath gets deleted, ex mesh interface is removed, the entries in that list will never get cleaned. Fix this by flushing all corresponding items of the preq_queue in mesh_path_flush_pending(). This should take care of KASAN reports like this: unreferenced object 0xffff00000668d800 (size 128): comm "kworker/u8:4", pid 67, jiffies 4295419552 (age 1836.444s) hex dump (first 32 bytes): 00 1f 05 09 00 00 ff ff 00 d5 68 06 00 00 ff ff ..........h..... 8e 97 ea eb 3e b8 01 00 00 00 00 00 00 00 00 00 ....>........... backtrace: [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c [<00000000049bd418>] kmalloc_trace+0x34/0x80 [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8 [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4 [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764 [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4 [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440 [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4 [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508 [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c [<00000000b36425d1>] worker_thread+0x9c/0x634 [<0000000005852dd5>] kthread+0x1bc/0x1c4 [<000000005fccd770>] ret_from_fork+0x10/0x20 unreferenced object 0xffff000009051f00 (size 128): comm "kworker/u8:4", pid 67, jiffies 4295419553 (age 1836.440s) hex dump (first 32 bytes): 90 d6 92 0d 00 00 ff ff 00 d8 68 06 00 00 ff ff ..........h..... 36 27 92 e4 02 e0 01 00 00 58 79 06 00 00 ff ff 6'.......Xy..... backtrace: [<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c [<00000000049bd418>] kmalloc_trace+0x34/0x80 [<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8 [<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c [<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4 [<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764 [<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4 [<000000004c86e916>] dev_hard_start_xmit+0x174/0x440 [<0000000023495647>] __dev_queue_xmit+0xe24/0x111c [<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4 [<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508 [<00000000adc3cd94>] process_one_work+0x4b8/0xa1c [<00000000b36425d1>] worker_thread+0x9c/0x634 [<0000000005852dd5>] kthread+0x1bc/0x1c4 [<000000005fccd770>] ret_from_fork+0x10/0x20 Fixes: `050ac52cbe` ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol") Signed-off-by: Nicolas Escande <nico.escande@gmail.com> Link: https://msgid.link/20240528142605.1060566-1-nico.escande@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-05-29 15:17:03 +02:00
Witold Sadowski	4a69c1264f	spi: cadence: Ensure data lines set to low during dummy-cycle period During dummy-cycles xSPI will switch GPIO into Hi-Z mode. In that dummy period voltage on data lines will slowly drop, what can cause unintentional modebyte transmission. Value send to SPI memory chip will depend on last address, and clock frequency. To prevent unforeseen consequences of that behaviour, force send single modebyte(0x00). Modebyte will be send only if number of dummy-cycles is not equal to 0. Code must also reduce dummycycle byte count by one - as one byte is send as modebyte. Signed-off-by: Witold Sadowski <wsadowski@marvell.com> Link: https://msgid.link/r/20240529074037.1345882-2-wsadowski@marvell.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-29 13:43:02 +01:00
Peter Ujfalusi	ffa077b2f6	ASoC: SOF: ipc4-topology: Fix input format query of process modules without base extension If a process module does not have base config extension then the same format applies to all of it's inputs and the process->base_config_ext is NULL, causing NULL dereference when specifically crafted topology and sequences used. Fixes: `648fea1284` ("ASoC: SOF: ipc4-topology: set copier output format for process module") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Cc: stable@vger.kernel.org Link: https://msgid.link/r/20240529121201.14687-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-29 13:34:44 +01:00
Puranjay Mohan	b1e7cee961	powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH The Linux Kernel Memory Model [1][2] requires RMW operations that have a return value to be fully ordered. BPF atomic operations with BPF_FETCH (including BPF_XCHG and BPF_CMPXCHG) return a value back so they need to be JITed to fully ordered operations. POWERPC currently emits relaxed operations for these. We can show this by running the following litmus-test: PPC SB+atomic_add+fetch { 0:r0=x; (* dst reg assuming offset is 0 ) 0:r1=2; ( src reg ) 0:r2=1; 0:r4=y; ( P0 writes to this, P1 reads this ) 0:r5=z; ( P1 writes to this, P0 reads this *) 0:r6=0; 1:r2=1; 1:r4=y; 1:r5=z; } P0 \| P1 ; stw r2, 0(r4) \| stw r2,0(r5) ; \| ; loop:lwarx r3, r6, r0 \| ; mr r8, r3 \| ; add r3, r3, r1 \| sync ; stwcx. r3, r6, r0 \| ; bne loop \| ; mr r1, r8 \| ; \| ; lwa r7, 0(r5) \| lwa r7,0(r4) ; ~exists(0:r7=0 /\ 1:r7=0) Witnesses Positive: 9 Negative: 3 Condition ~exists (0:r7=0 /\ 1:r7=0) Observation SB+atomic_add+fetch Sometimes 3 9 This test shows that the older store in P0 is reordered with a newer load to a different address. Although there is a RMW operation with fetch between them. Adding a sync before and after RMW fixes the issue: Witnesses Positive: 9 Negative: 0 Condition ~exists (0:r7=0 /\ 1:r7=0) Observation SB+atomic_add+fetch Never 0 9 [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt [2] https://www.kernel.org/doc/Documentation/atomic_t.txt Fixes: `aea7ef8a82` ("powerpc/bpf/32: add support for BPF_ATOMIC bitwise operations") Fixes: `2d9206b227` ("powerpc/bpf/32: Add instructions for atomic_[cmp]xchg") Fixes: `dbe6e2456f` ("powerpc/bpf/64: add support for atomic fetch operations") Fixes: `1e82dfaa78` ("powerpc/bpf/64: Add instructions for atomic_[cmp]xchg") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Naveen N Rao <naveen@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240513100248.110535-1-puranjay@kernel.org	2024-05-29 22:12:42 +10:00
Edward Adam Davis	068648aab7	nfc/nci: Add the inconsistency check between the input data length and count write$nci(r0, &(0x7f0000000740)=ANY=[@ANYBLOB="610501"], 0xf) Syzbot constructed a write() call with a data length of 3 bytes but a count value of 15, which passed too little data to meet the basic requirements of the function nci_rf_intf_activated_ntf_packet(). Therefore, increasing the comparison between data length and count value to avoid problems caused by inconsistent data length and count. Reported-and-tested-by: syzbot+71bfed2b2bcea46c98f2@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-29 13:08:31 +01:00
Minda Chen	e9022b31db	MAINTAINERS: dwmac: starfive: update Maintainer Update the maintainer of starfive dwmac driver. Signed-off-by: Minda Chen <minda.chen@starfivetech.com> Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-29 12:25:02 +01:00
Christian Brauner	a82c13d299	Merge patch series "cachefiles: some bugfixes and cleanups for ondemand requests" libaokun@huaweicloud.com <libaokun@huaweicloud.com> says: We've been testing ondemand mode for cachefiles since January, and we're almost done. We hit a lot of issues during the testing period, and this patch set fixes some of the issues related to ondemand requests. The patches have passed internal testing without regression. The following is a brief overview of the patches, see the patches for more details. Patch 1-5: Holding reference counts of reqs and objects on read requests to avoid malicious restore leading to use-after-free. Patch 6-10: Add some consistency checks to copen/cread/get_fd to avoid malicious copen/cread/close fd injections causing use-after-free or hung. Patch 11: When cache is marked as CACHEFILES_DEAD, flush all requests, otherwise the kernel may be hung. since this state is irreversible, the daemon can read open requests but cannot copen. Patch 12: Allow interrupting a read request being processed by killing the read process as a way of avoiding hung in some special cases. fs/cachefiles/daemon.c \| 3 +- fs/cachefiles/internal.h \| 5 + fs/cachefiles/ondemand.c \| 217 ++++++++++++++++++++++-------- include/trace/events/cachefiles.h \| 8 +- 4 files changed, 176 insertions(+), 57 deletions(-) * patches from https://lore.kernel.org/r/20240522114308.2402121-1-libaokun@huaweicloud.com: cachefiles: make on-demand read killable cachefiles: flush all requests after setting CACHEFILES_DEAD cachefiles: Set object to close if ondemand_id < 0 in copen cachefiles: defer exposing anon_fd until after copy_to_user() succeeds cachefiles: never get a new anonymous fd if ondemand_id is valid cachefiles: add spin_lock for cachefiles_ondemand_info cachefiles: add consistency check for copen/cread cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() cachefiles: remove requests from xarray during flushing requests cachefiles: add output string to cachefiles_obj_[get\|put]_ondemand_fd Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:40 +02:00
Baokun Li	bc9dde6155	cachefiles: make on-demand read killable Replacing wait_for_completion() with wait_for_completion_killable() in cachefiles_ondemand_send_req() allows us to kill processes that might trigger a hunk_task if the daemon is abnormal. But now only CACHEFILES_OP_READ is killable, because OP_CLOSE and OP_OPEN is initiated from kworker context and the signal is prohibited in these kworker. Note that when the req in xas changes, i.e. xas_load(&xas) != req, it means that a process will complete the current request soon, so wait again for the request to be completed. In addition, add the cachefiles_ondemand_finish_req() helper function to simplify the code. Suggested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-13-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:31 +02:00
Baokun Li	85e833cd72	cachefiles: flush all requests after setting CACHEFILES_DEAD In ondemand mode, when the daemon is processing an open request, if the kernel flags the cache as CACHEFILES_DEAD, the cachefiles_daemon_write() will always return -EIO, so the daemon can't pass the copen to the kernel. Then the kernel process that is waiting for the copen triggers a hung_task. Since the DEAD state is irreversible, it can only be exited by closing /dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark the cache as CACHEFILES_DEAD, if in ondemand mode, flush all requests to avoid the above hungtask. We may still be able to read some of the cached data before closing the fd of /dev/cachefiles. Note that this relies on the patch that adds reference counting to the req, otherwise it may UAF. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-12-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:31 +02:00
Zizhi Wo	4f8703fb34	cachefiles: Set object to close if ondemand_id < 0 in copen If copen is maliciously called in the user mode, it may delete the request corresponding to the random id. And the request may have not been read yet. Note that when the object is set to reopen, the open request will be done with the still reopen state in above case. As a result, the request corresponding to this object is always skipped in select_req function, so the read request is never completed and blocks other process. Fix this issue by simply set object to close if its id < 0 in copen. Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-11-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:30 +02:00
Baokun Li	4b4391e77a	cachefiles: defer exposing anon_fd until after copy_to_user() succeeds After installing the anonymous fd, we can now see it in userland and close it. However, at this point we may not have gotten the reference count of the cache, but we will put it during colse fd, so this may cause a cache UAF. So grab the cache reference count before fd_install(). In addition, by kernel convention, fd is taken over by the user land after fd_install(), and the kernel should not call close_fd() after that, i.e., it should call fd_install() after everything is ready, thus fd_install() is called after copy_to_user() succeeds. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Suggested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-10-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:30 +02:00
Baokun Li	4988e35e95	cachefiles: never get a new anonymous fd if ondemand_id is valid Now every time the daemon reads an open request, it gets a new anonymous fd and ondemand_id. With the introduction of "restore", it is possible to read the same open request more than once, and therefore an object can have more than one anonymous fd. If the anonymous fd is not unique, the following concurrencies will result in an fd leak: t1 \| t2 \| t3 ------------------------------------------------------------ cachefiles_ondemand_init_object cachefiles_ondemand_send_req REQ_A = kzalloc(sizeof(*req) + data_len) wait_for_completion(&REQ_A->done) cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req cachefiles_ondemand_get_fd load->fd = fd0 ondemand_id = object_id0 ------ restore ------ cachefiles_ondemand_restore // restore REQ_A cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req cachefiles_ondemand_get_fd load->fd = fd1 ondemand_id = object_id1 process_open_req(REQ_A) write(devfd, ("copen %u,%llu", msg->msg_id, size)) cachefiles_ondemand_copen xa_erase(&cache->reqs, id) complete(&REQ_A->done) kfree(REQ_A) process_open_req(REQ_A) // copen fails due to no req // daemon close(fd1) cachefiles_ondemand_fd_release // set object closed -- umount -- cachefiles_withdraw_cookie cachefiles_ondemand_clean_object cachefiles_ondemand_init_close_req if (!cachefiles_ondemand_object_is_open(object)) return -ENOENT; // The fd0 is not closed until the daemon exits. However, the anonymous fd holds the reference count of the object and the object holds the reference count of the cookie. So even though the cookie has been relinquished, it will not be unhashed and freed until the daemon exits. In fscache_hash_cookie(), when the same cookie is found in the hash list, if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then the new cookie waits for the old cookie to be unhashed, while the old cookie is waiting for the leaked fd to be closed, if the daemon does not exit in time it will trigger a hung task. To avoid this, allocate a new anonymous fd only if no anonymous fd has been allocated (ondemand_id == 0) or if the previously allocated anonymous fd has been closed (ondemand_id == -1). Moreover, returns an error if ondemand_id is valid, letting the daemon know that the current userland restore logic is abnormal and needs to be checked. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-9-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:30 +02:00
Baokun Li	0a79004083	cachefiles: add spin_lock for cachefiles_ondemand_info The following concurrency may cause a read request to fail to be completed and result in a hung: t1 \| t2 --------------------------------------------------------- cachefiles_ondemand_copen req = xa_erase(&cache->reqs, id) // Anon fd is maliciously closed. cachefiles_ondemand_fd_release xa_lock(&cache->reqs) cachefiles_ondemand_set_object_close(object) xa_unlock(&cache->reqs) cachefiles_ondemand_set_object_open // No one will ever close it again. cachefiles_ondemand_daemon_read cachefiles_ondemand_select_req // Get a read req but its fd is already closed. // The daemon can't issue a cread ioctl with an closed fd, then hung. So add spin_lock for cachefiles_ondemand_info to protect ondemand_id and state, thus we can avoid the above problem in cachefiles_ondemand_copen() by using ondemand_id to determine if fd has been closed. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-8-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:30 +02:00
Baokun Li	a26dc49df3	cachefiles: add consistency check for copen/cread This prevents malicious processes from completing random copen/cread requests and crashing the system. Added checks are listed below: * Generic, copen can only complete open requests, and cread can only complete read requests. * For copen, ondemand_id must not be 0, because this indicates that the request has not been read by the daemon. * For cread, the object corresponding to fd and req should be the same. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-7-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:30 +02:00
Baokun Li	3e6d704f02	cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read() The err_put_fd label is only used once, so remove it to make the code more readable. In addition, the logic for deleting error request and CLOSE request is merged to simplify the code. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-6-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:29 +02:00
Baokun Li	da4a827416	cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() We got the following issue in a fuzz test of randomly issuing the restore command: ================================================================== BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0xb41/0xb60 Read of size 8 at addr ffff888122e84088 by task ondemand-04-dae/963 CPU: 13 PID: 963 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #564 Call Trace: kasan_report+0x93/0xc0 cachefiles_ondemand_daemon_read+0xb41/0xb60 vfs_read+0x169/0xb50 ksys_read+0xf5/0x1e0 Allocated by task 116: kmem_cache_alloc+0x140/0x3a0 cachefiles_lookup_cookie+0x140/0xcd0 fscache_cookie_state_machine+0x43c/0x1230 [...] Freed by task 792: kmem_cache_free+0xfe/0x390 cachefiles_put_object+0x241/0x480 fscache_cookie_state_machine+0x5c8/0x1230 [...] ================================================================== Following is the process that triggers the issue: mount \| daemon_thread1 \| daemon_thread2 ------------------------------------------------------------ cachefiles_withdraw_cookie cachefiles_ondemand_clean_object(object) cachefiles_ondemand_send_req REQ_A = kzalloc(sizeof(*req) + data_len) wait_for_completion(&REQ_A->done) cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req msg->object_id = req->object->ondemand->ondemand_id ------ restore ------ cachefiles_ondemand_restore xas_for_each(&xas, req, ULONG_MAX) xas_set_mark(&xas, CACHEFILES_REQ_NEW) cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req copy_to_user(_buffer, msg, n) xa_erase(&cache->reqs, id) complete(&REQ_A->done) ------ close(fd) ------ cachefiles_ondemand_fd_release cachefiles_put_object cachefiles_put_object kmem_cache_free(cachefiles_object_jar, object) REQ_A->object->ondemand->ondemand_id // object UAF !!! When we see the request within xa_lock, req->object must not have been freed yet, so grab the reference count of object before xa_unlock to avoid the above issue. Fixes: `0a7e54c195` ("cachefiles: resend an open request if the read request's object is closed") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-5-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:29 +02:00
Baokun Li	de3e26f9e5	cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() We got the following issue in a fuzz test of randomly issuing the restore command: ================================================================== BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0 Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962 CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542 Call Trace: kasan_report+0x94/0xc0 cachefiles_ondemand_daemon_read+0x609/0xab0 vfs_read+0x169/0xb50 ksys_read+0xf5/0x1e0 Allocated by task 626: __kmalloc+0x1df/0x4b0 cachefiles_ondemand_send_req+0x24d/0x690 cachefiles_create_tmpfile+0x249/0xb30 cachefiles_create_file+0x6f/0x140 cachefiles_look_up_object+0x29c/0xa60 cachefiles_lookup_cookie+0x37d/0xca0 fscache_cookie_state_machine+0x43c/0x1230 [...] Freed by task 626: kfree+0xf1/0x2c0 cachefiles_ondemand_send_req+0x568/0x690 cachefiles_create_tmpfile+0x249/0xb30 cachefiles_create_file+0x6f/0x140 cachefiles_look_up_object+0x29c/0xa60 cachefiles_lookup_cookie+0x37d/0xca0 fscache_cookie_state_machine+0x43c/0x1230 [...] ================================================================== Following is the process that triggers the issue: mount \| daemon_thread1 \| daemon_thread2 ------------------------------------------------------------ cachefiles_ondemand_init_object cachefiles_ondemand_send_req REQ_A = kzalloc(sizeof(req) + data_len) wait_for_completion(&REQ_A->done) cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req cachefiles_ondemand_get_fd copy_to_user(_buffer, msg, n) process_open_req(REQ_A) ------ restore ------ cachefiles_ondemand_restore xas_for_each(&xas, req, ULONG_MAX) xas_set_mark(&xas, CACHEFILES_REQ_NEW); cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req write(devfd, ("copen %u,%llu", msg->msg_id, size)); cachefiles_ondemand_copen xa_erase(&cache->reqs, id) complete(&REQ_A->done) kfree(REQ_A) cachefiles_ondemand_get_fd(REQ_A) fd = get_unused_fd_flags file = anon_inode_getfile fd_install(fd, file) load = (void )REQ_A->msg.data; load->fd = fd; // load UAF !!! This issue is caused by issuing a restore command when the daemon is still alive, which results in a request being processed multiple times thus triggering a UAF. So to avoid this problem, add an additional reference count to cachefiles_req, which is held while waiting and reading, and then released when the waiting and reading is over. Note that since there is only one reference count for waiting, we need to avoid the same request being completed multiple times, so we can only complete the request if it is successfully removed from the xarray. Fixes: `e73fa11a35` ("cachefiles: add restore command to recover inflight ondemand read requests") Suggested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-4-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:29 +02:00
Baokun Li	0fc75c5940	cachefiles: remove requests from xarray during flushing requests Even with CACHEFILES_DEAD set, we can still read the requests, so in the following concurrency the request may be used after it has been freed: mount \| daemon_thread1 \| daemon_thread2 ------------------------------------------------------------ cachefiles_ondemand_init_object cachefiles_ondemand_send_req REQ_A = kzalloc(sizeof(*req) + data_len) wait_for_completion(&REQ_A->done) cachefiles_daemon_read cachefiles_ondemand_daemon_read // close dev fd cachefiles_flush_reqs complete(&REQ_A->done) kfree(REQ_A) xa_lock(&cache->reqs); cachefiles_ondemand_select_req req->msg.opcode != CACHEFILES_OP_READ // req use-after-free !!! xa_unlock(&cache->reqs); xa_destroy(&cache->reqs) Hence remove requests from cache->reqs when flushing them to avoid accessing freed requests. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-3-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:29 +02:00
Baokun Li	cc5ac966f2	cachefiles: add output string to cachefiles_obj_[get\|put]_ondemand_fd This lets us see the correct trace output. Fixes: `c838305450` ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-2-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 13:03:29 +02:00
Pierre-Louis Bossart	b062938fd9	ASoC: Intel: sof-sdw: fix missing SPI_MASTER dependency The addition of the Cirrus Logic 'sidecar' amps adds a dependency on SPI_MASTER. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for SND_SOC_CS35L56_SPI Depends on [n]: SOUND [=y] && SND [=y] && SND_SOC [=y] && SPI_MASTER [=n] && (SOUNDWIRE [=y] \|\| !SOUNDWIRE [=y]) Selected by [y]: - SND_SOC_INTEL_SOUNDWIRE_SOF_MACH [=y] && SOUND [=y] && SND [=y] && SND_SOC [=y] && SND_SOC_INTEL_MACH [=y] && SND_SOC_SOF_INTEL_SOUNDWIRE [=y] && I2C [=y] && ACPI [=y] && (MFD_INTEL_LPSS [=y] \|\| COMPILE_TEST [=n]) && (SND_SOC_INTEL_USER_FRIENDLY_LONG_NAMES [=y] \|\| COMPILE_TEST [=n]) && SOUNDWIRE [=y] Fixes: `b831b4dca4` ("ASoC: intel: sof_sdw: Add support for cs42l43-cs35l56 sidecar amps") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405140758.o2HY4nYD-lkp@intel.com/ Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://msgid.link/r/20240527191940.30107-1-pierre-louis.bossart@linux.intel.com Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-29 11:07:48 +01:00
Alexandre Belloni	6d40dbc758	ALSA: pcm: fix typo in comment Fix the typo in the comment for SNDRV_PCM_RATE_KNOT Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20240528191850.63314-1-alexandre.belloni@bootlin.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-29 10:40:36 +02:00
John Garry	ed7ee6a69f	statx: Update offset commentary for struct statx In commit `2a82bb0294` ("statx: stx_subvol"), a new member was added to struct statx, but the offset comment was not correct. Update it. Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240529081725.3769290-1-john.g.garry@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-29 10:40:19 +02:00
Takashi Iwai	fe85f6e607	ALSA: ump: Don't clear bank selection after sending a program change The current code clears the bank selection MSB/LSB after sending a program change, but this can be wrong, as many apps may not send the full bank selection with both MSB and LSB but sending only one. Better to keep the previous bank set. Fixes: `0b5288f5fe` ("ALSA: ump: Add legacy raw MIDI support") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240529083823.5778-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-29 10:39:50 +02:00
Takashi Iwai	edb3277619	ALSA: seq: Fix incorrect UMP type for system messages When converting a legacy system message to a UMP packet, it forgot to modify the UMP type field but keeping the default type (either type 2 or 4). Correct to the right type for system messages. Fixes: `e9e02819a9` ("ALSA: seq: Automatic conversion of UMP events") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20240529083800.5742-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-29 10:39:40 +02:00
Imre Deak	75800e2e42	drm/i915: Fix audio component initialization After registering the audio component in i915_audio_component_init() the audio driver may call i915_audio_component_get_power() via the component ops. This could program AUD_FREQ_CNTRL with an uninitialized value if the latter function is called before display.audio.freq_cntrl gets initialized. The get_power() function also does a modeset which in the above case happens too early before the initialization step and triggers the "Reject display access from task" error message added by the Fixes: commit below. Fix the above issue by registering the audio component only after the initialization step. Fixes: `87c1694533` ("drm/i915: save AUD_FREQ_CNTRL state at audio domain suspend") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10291 Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521143022.3784539-1-imre.deak@intel.com (cherry picked from commit `fdd0b80172`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:48 +03:00
Vidya Srinivas	43e2b37e2a	drm/i915/dpt: Make DPT object unshrinkable In some scenarios, the DPT object gets shrunk but the actual framebuffer did not and thus its still there on the DPT's vm->bound_list. Then it tries to rewrite the PTEs via a stale CPU mapping. This causes panic. Cc: stable@vger.kernel.org Reported-by: Shawn Lee <shawn.c.lee@intel.com> Fixes: `0dc987b699` ("drm/i915/display: Add smem fallback allocation for dpt") Signed-off-by: Vidya Srinivas <vidya.srinivas@intel.com> [vsyrjala: Add TODO comment] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240520165634.1162470-1-vidya.srinivas@intel.com (cherry picked from commit `51064d471c`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:43 +03:00
Andi Shyti	ee01b6a386	drm/i915/gt: Fix CCS id's calculation for CCS mode setting The whole point of the previous fixes has been to change the CCS hardware configuration to generate only one stream available to the compute users. We did this by changing the info.engine_mask that is set during device probe, reset during the detection of the fused engines, and finally reset again when choosing the CCS mode. We can't use the engine_mask variable anymore, as with the current configuration, it imposes only one CCS no matter what the hardware configuration is. Before changing the engine_mask for the third time, save it and use it for calculating the CCS mode. After the previous changes, the user reported a performance drop to around 1/4. We have tested that the compute operations, with the current patch, have improved by the same factor. Fixes: `6db31251bb` ("drm/i915/gt: Enable only one CCS for compute workload") Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Gnattu OC <gnattuoc@me.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Tested-by: Jian Ye <jian.ye@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Tested-by: Gnattu OC <gnattuoc@me.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com (cherry picked from commit `a09d2327a9`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:38 +03:00
Dmitry Baryshkov	8c318cb70c	drm/panel/lg-sw43408: mark sw43408_backlight_ops as static Fix sparse warning regarding symbol 'sw43408_backlight_ops' not being declared. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404200739.hbWZvOhR-lkp@intel.com/ Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Fixes: `069a6c0e94` ("drm: panel: Add LG sw43408 panel driver") Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240528-panel-sw43408-fix-v4-2-330b42445bcc@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-05-29 11:35:32 +03:00
Nirmoy Das	659a3062c7	drm/i915/selftests: Set always_coherent to false when reading from CPU Commit `8d4ba9fc1c` ("drm/i915/selftests: Pick correct caching mode.") was not complete as for non LLC sharing platforms cpu read can happen from LLC which probably doesn't have the latest changes made by GPU. Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Fixes: `8d4ba9fc1c` ("drm/i915/selftests: Pick correct caching mode.") Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240516151403.2875-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `007ed70831`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:32 +03:00
Dmitry Baryshkov	33defcacd2	drm/panel/lg-sw43408: select CONFIG_DRM_DISPLAY_DP_HELPER This panel driver uses DSC PPS functions and as such depends on the DRM_DISPLAY_DP_HELPER. Select this symbol to make required functions available to the driver. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404200800.kYsRYyli-lkp@intel.com/ Fixes: `069a6c0e94` ("drm: panel: Add LG sw43408 panel driver") Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240528-panel-sw43408-fix-v4-1-330b42445bcc@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-05-29 11:35:32 +03:00
Arnd Bergmann	d4f36db623	drm/i915/guc: avoid FIELD_PREP warning With gcc-7 and earlier, there are lots of warnings like In file included from <command-line>:0:0: In function '__guc_context_policy_add_priority.isra.66', inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3, inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2: include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ... drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP' FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) \| \ ^~~~~~~~~~ Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning. Fixes: `77b6f79df6` ("drm/i915/guc: Update to GuC version 69.0.3") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com (cherry picked from commit `364e039827`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:26 +03:00
Chris Wilson	70cb9188ff	drm/i915/gt: Disarm breadcrumbs if engines are already idle The breadcrumbs use a GT wakeref for guarding the interrupt, but are disarmed during release of the engine wakeref. This leaves a hole where we may attach a breadcrumb just as the engine is parking (after it has parked its breadcrumbs), execute the irq worker with some signalers still attached, but never be woken again. That issue manifests itself in CI with IGT runner timeouts while tests are waiting indefinitely for release of all GT wakerefs. <6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats <7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5 <7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4 <7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3 <7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2 <7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off <7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6 <7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02 <4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT. ... <6> [299.356526] sysrq: Show State ... <6> [299.373964] task:i915_selftest state:D stack:11784 pid:5578 tgid:5578 ppid:873 flags:0x00004002 <6> [299.373967] Call Trace: <6> [299.373968] <TASK> <6> [299.373970] __schedule+0x3bb/0xda0 <6> [299.373974] schedule+0x41/0x110 <6> [299.373976] intel_wakeref_wait_for_idle+0x82/0x100 [i915] <6> [299.374083] ? __pfx_var_wake_function+0x10/0x10 <6> [299.374087] live_engine_busy_stats+0x9b/0x500 [i915] <6> [299.374173] __i915_subtests+0xbe/0x240 [i915] <6> [299.374277] ? __pfx___intel_gt_live_setup+0x10/0x10 [i915] <6> [299.374369] ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915] <6> [299.374456] intel_engine_live_selftests+0x1c/0x30 [i915] <6> [299.374547] __run_selftests+0xbb/0x190 [i915] <6> [299.374635] i915_live_selftests+0x4b/0x90 [i915] <6> [299.374717] i915_pci_probe+0x10d/0x210 [i915] At the end of the interrupt worker, if there are no more engines awake, disarm the breadcrumb and go to sleep. Fixes: `9d5612ca16` ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission") Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.12+ Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit `fbad43ecca`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:15 +03:00
Janusz Krzysztofik	647535760a	Revert "drm/i915: Remove extra multi-gt pm-references" This reverts commit `1f33dc0c11`. There was a patch supposed to fix an issue of illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle, reported by CI on 2-GT Meteor Lake. As a solution, an extra wakeref for a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit `f56fe3e917` ("drm/i915: Fix a VMA UAF for multi-gt platform"). However, that fix occurred insufficient -- the issue was still reported by CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then potentially before completion of the request and deactivation of its associated VMAs. Moreover, CI reports indicated that single-GT platforms also suffered sporadically from the same race. Since that issue was fixed by another commit `f3c71b2ded` ("drm/i915/vma: Fix UAF on destroy against retire race"), the changes introduced by that insufficient fix were dropped as no longer useful. However, that series resulted in another VMA UAF scenario now being triggered in CI. <4> [260.290809] ------------[ cut here ]------------ <4> [260.290988] list_del corruption. prev->next should be ffff888118c5d990, but was ffff888118c5a510. (prev=ffff888118c5a510) <4> [260.291004] WARNING: CPU: 2 PID: 1143 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xb7/0xe0 .. <4> [260.291055] CPU: 2 PID: 1143 Comm: kms_plane Not tainted 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1 <4> [260.291058] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024 <4> [260.291060] RIP: 0010:__list_del_entry_valid_or_report+0xb7/0xe0 ... <4> [260.291087] Call Trace: <4> [260.291089] <TASK> <4> [260.291124] i915_vma_reopen+0x43/0x80 [i915] <4> [260.291298] eb_lookup_vmas+0x9cb/0xcc0 [i915] <4> [260.291579] i915_gem_do_execbuffer+0xc9a/0x26d0 [i915] <4> [260.291883] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915] ... <4> [260.292301] </TASK> ... <4> [260.292506] ---[ end trace 0000000000000000 ]--- <4> [260.292782] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ca3: 0000 [#1] PREEMPT SMP NOPTI <4> [260.303575] CPU: 2 PID: 1143 Comm: kms_plane Tainted: G W 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1 <4> [260.313851] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024 <4> [260.326359] RIP: 0010:eb_validate_vmas+0x114/0xd80 [i915] ... <4> [260.428756] Call Trace: <4> [260.431192] <TASK> <4> [639.283393] i915_gem_do_execbuffer+0xd05/0x26d0 [i915] <4> [639.305245] i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915] ... <4> [639.411134] </TASK> ... <4> [639.449979] ---[ end trace 0000000000000000 ]--- We defer actually closing, unbinding and destroying a VMA until next idle point, or until the object is freed in the meantime. By postponing the unbind, we allow for the VMA to be reopened by the client, avoiding the work required to rebind the VMA. Starting from commit `b0647a5e79` ("drm/i915: Avoid live-lock with i915_vma_parked()"), we assume that as long as a GT is held idle, no VMA would be reopened while we destroy them. That assumption is no longer true in multi-GT configurations, where a VMA we reopen may be handled by a GT different from the one that we already keep active via its engine while we set up an execbuf request. Restoring the extra GT0 PM wakeref removed from i915_gem_do_execbuffer() processing path seems to fix this issue. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10608 Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Nirmoy Das <nirmoy.das@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> Fixes: `1f33dc0c11` ("drm/i915: Remove extra multi-gt pm-references") Link: https://patchwork.freedesktop.org/patch/msgid/20240506180253.96858-2-janusz.krzysztofik@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `749670a58d`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-05-29 11:35:08 +03:00
Chen-Yu Tsai	e06a698ae6	scripts/make_fit: Drop fdt image entry compatible string According to the FIT image source file format document found in U-boot [1] and the split-out FIT image specification [2], under "'/images' node" -> "Conditionally mandatory property", the "compatible" property is described as "compatible method for loading image", i.e., not the compatible string embedded in the FDT or used for matching. Drop the compatible string from the fdt image entry node. While at it also fix up a typo in the document section of output_dtb. [1] U-boot source "doc/usage/fit/source_file_format.rst", or on the website: https://docs.u-boot.org/en/latest/usage/fit/source_file_format.html [2] https://github.com/open-source-firmware/flat-image-tree/blob/main/source/chapter2-source-file-format.rst Fixes: `7a23b027ec` ("arm64: boot: Support Flat Image Tree") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Simon Glass <sjg@chromium.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:41:21 +09:00
Masahiro Yamada	3c562a70cf	kbuild: remove a stale comment about cleaning in link-vmlinux.sh Remove the left-over of commit `51eb95e2da` ("kbuild: Don't remove link-vmlinux temporary files on exit/signal"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	3430f65d61	kbuild: fix short log for AS in link-vmlinux.sh In convention, short logs print the output file, not the input file. Let's change the suffix for 'AS' since it assembles .S into .o. [Before] LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.S LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.S LD vmlinux [After] LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.o LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.o LD vmlinux Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	b18b047002	kbuild: change scripts/mksysmap into sed script The previous commit removed the subshell execution from scripts/mksysmap, which is now simple enough to become a sed script. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	04b8cb0945	kbuild: avoid unneeded kallsyms step 3 Since commit `951bcae6c5` ("kallsyms: Avoid weak references for kallsyms symbols"), the kallsyms step 3 always occurs. You can compare the build logs. [Before `951bcae6c5`] $ git checkout 951bcae6c5a0^ $ make defconfig all [ snip ] LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.S LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.S LD vmlinux [After `951bcae6c5`] $ git checkout `951bcae6c5` $ make defconfig all [ snip ] LD .tmp_vmlinux.kallsyms1 NM .tmp_vmlinux.kallsyms1.syms KSYMS .tmp_vmlinux.kallsyms1.S AS .tmp_vmlinux.kallsyms1.S LD .tmp_vmlinux.kallsyms2 NM .tmp_vmlinux.kallsyms2.syms KSYMS .tmp_vmlinux.kallsyms2.S AS .tmp_vmlinux.kallsyms2.S LD .tmp_vmlinux.kallsyms3 # should not happen NM .tmp_vmlinux.kallsyms3.syms # should not happen KSYMS .tmp_vmlinux.kallsyms3.S # should not happen AS .tmp_vmlinux.kallsyms3.S # should not happen LD vmlinux The resulting vmlinux is correct, but it always requires an additional linking step. The symbols produced by kallsyms are excluded from kallsyms itself because they were previously missing in step 1. With those symbols excluded, the symbol lists matched between step 1 and step 2, eliminating the need for step 3. Now, this has a negative effect. Since `951bcae6c5`, the PROVIDE() directives provide the fallback definitions, which are not trimmed from the sysbol list in step 1 because ${kallsymso_prev} is empty at this point. In step 2, ${kallsymso_prev} is set, and the kallsyms_* symbols are trimmed from the symbol list. Due to the table size difference between step 1 and step 2 (the former is larger due to the presence of kallsyms_), step 3 is triggered. Now that the kallsyms_ symbols are always linked, let's stop omitting them from kallsyms. This avoids unnecessary step 3. Fixes: `951bcae6c5` ("kallsyms: Avoid weak references for kallsyms symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Douglas Anderson	659bbf7e1b	kbuild: scripts/gdb: Replace missed $(srctree)/$(src) w/ $(src) Recently we went through the source tree and replaced $(srctree)/$(src) w/ $(src). However, the gdb scripts Makefile had a hidden $(srctree)/$(src) that looked like this: $(abspath $(srctree))/$(src) Because we missed that then my installed kernel had symlinks that looked like this: __init__.py -> ${INSTALL_DIR}/$(INSTALL_DIR}/scripts/gdb/linux/__init__.py Let's also replace the midden $(abspath $(srctree))/$(src) with $(src). Now: __init__.py -> $(INSTALL_DIR}/scripts/gdb/linux/__init__.py Fixes: `b1992c3772` ("kbuild: use $(src) instead of $(srctree)/$(src) for source directory") Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	31894d35b5	kconfig: remove redundant check in expr_join_or() The check for 'sym1 == sym2' is redundant here because it has already been done a few lines above: if (sym1 != sym2) return NULL; Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	aabdc960a2	kconfig: fix comparison to constant symbols, 'm', 'n' Currently, comparisons to 'm' or 'n' result in incorrect output. [Test Code] config MODULES def_bool y modules config A def_tristate m config B def_bool A > n CONFIG_B is unset, while CONFIG_B=y is expected. The reason for the issue is because Kconfig compares the tristate values as strings. Currently, the .type fields in the constant symbol definitions, symbol_{yes,mod,no} are unspecified, i.e., S_UNKNOWN. When expr_calc_value() evaluates 'A > n', it checks the types of 'A' and 'n' to determine how to compare them. The left-hand side, 'A', is a tristate symbol with a value of 'm', which corresponds to a numeric value of 1. (Internally, 'y', 'm', and 'n' are represented as 2, 1, and 0, respectively.) The right-hand side, 'n', has an unknown type, so it is treated as the string "n" during the comparison. expr_calc_value() compares two values numerically only when both can have numeric values. Otherwise, they are compared as strings. symbol numeric value ASCII code ------------------------------------- y 2 0x79 m 1 0x6d n 0 0x6e 'm' is greater than 'n' if compared numerically (since 1 is greater than 0), but smaller than 'n' if compared as strings (since the ASCII code 0x6d is smaller than 0x6e). Specifying .type=S_TRISTATE for symbol_{yes,mod,no} fixes the above test code. Doing so, however, would cause a regression to the following test code. [Test Code 2] config MODULES def_bool n modules config A def_tristate n config B def_bool A = m You would get CONFIG_B=y, while CONFIG_B should not be set. The reason is because sym_get_string_value() turns 'm' into 'n' when the module feature is disabled. Consequently, expr_calc_value() evaluates 'A = n' instead of 'A = m'. This oddity has been hidden because the type of 'm' was previously S_UNKNOWN instead of S_TRISTATE. sym_get_string_value() should not tweak the string because the tristate value has already been correctly calculated. There is no reason to return the string "n" where its tristate value is mod. Fixes: `31847b67be` ("kconfig: allow use of relations other than (in)equality") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Masahiro Yamada	a607468b52	kconfig: remove unused expr_is_no() This has not been used since commit `e911503085` ("Kconfig: Remove bad inference rules expr_eliminate_dups2()"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-05-29 16:40:03 +09:00
Adrián Larumbe	3b8407e81e	drm/gem-shmem: Add import attachment warning to locked pin function Commit `ec144244a4` ("drm/gem-shmem: Acquire reservation lock in GEM pin/unpin callbacks") moved locking DRM object's dma reservation to drm_gem_shmem_object_pin, and made drm_gem_shmem_pin_locked public, so we need to make sure the not-imported check warning is also added to the latter. Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Fixes: `a780278472` ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()") Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-4-adrian.larumbe@collabora.com	2024-05-29 09:30:44 +02:00
Adrián Larumbe	8c2f5dd0c3	drm/lima: Fix dma_resv deadlock at drm object pin time Commit `a780278472` ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()") moved locking the DRM object's dma reservation to drm_gem_pin(), but Lima's pin callback kept calling drm_gem_shmem_pin, which also tries to lock the same dma_resv, leading to a double lock situation. As was already done for Panfrost in the previous commit, fix it by replacing drm_gem_shmem_pin() with its locked variant. Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Fixes: `a780278472` ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()") Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Val Packett <val@packett.cool> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-3-adrian.larumbe@collabora.com	2024-05-29 09:30:39 +02:00
Adrián Larumbe	e57f2187cc	drm/panfrost: Fix dma_resv deadlock at drm object pin time When Panfrost must pin an object that is being prepared a dma-buf attachment for on behalf of another driver, the core drm gem object pinning code already takes a lock on the object's dma reservation. However, Panfrost GEM object's pinning callback would eventually try taking the lock on the same dma reservation when delegating pinning of the object onto the shmem subsystem, which led to a deadlock. This can be shown by enabling CONFIG_DEBUG_WW_MUTEX_SLOWPATH, which throws the following recursive locking situation: weston/3440 is trying to acquire lock: ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_pin+0x34/0xb8 [drm_shmem_helper] but task is already holding lock: ffff000000e235a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_pin+0x2c/0x80 [drm] Fix it by replacing drm_gem_shmem_pin with its locked version, as the lock had already been taken by drm_gem_pin(). Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Steven Price <steven.price@arm.com> Fixes: `a780278472` ("drm/gem: Acquire reservation lock in drm_gem_{pin/unpin}()") Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240523113236.432585-2-adrian.larumbe@collabora.com	2024-05-29 09:30:38 +02:00
Vladimir Oltean	fb66df20a7	net/sched: taprio: extend minimum interval restriction to entire cycle too It is possible for syzbot to side-step the restriction imposed by the blamed commit in the Fixes: tag, because the taprio UAPI permits a cycle-time different from (and potentially shorter than) the sum of entry intervals. We need one more restriction, which is that the cycle time itself must be larger than N * ETH_ZLEN bit times, where N is the number of schedule entries. This restriction needs to apply regardless of whether the cycle time came from the user or was the implicit, auto-calculated value, so we move the existing "cycle == 0" check outside the "if "(!new->cycle_time)" branch. This way covers both conditions and scenarios. Add a selftest which illustrates the issue triggered by syzbot. Fixes: `b5b73b26b3` ("taprio: Fix allowing too small intervals") Reported-by: syzbot+a7d2b1d5d1af83035567@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/0000000000007d66bc06196e7c66@google.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20240527153955.553333-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:46:41 -07:00
Vladimir Oltean	e634134180	net/sched: taprio: make q->picos_per_byte available to fill_sched_entry() In commit `b5b73b26b3` ("taprio: Fix allowing too small intervals"), a comparison of user input against length_to_duration(q, ETH_ZLEN) was introduced, to avoid RCU stalls due to frequent hrtimers. The implementation of length_to_duration() depends on q->picos_per_byte being set for the link speed. The blamed commit in the Fixes: tag has moved this too late, so the checks introduced above are ineffective. The q->picos_per_byte is zero at parse_taprio_schedule() -> parse_sched_list() -> parse_sched_entry() -> fill_sched_entry() time. Move the taprio_set_picos_per_byte() call as one of the first things in taprio_change(), before the bulk of the netlink attribute parsing is done. That's because it is needed there. Add a selftest to make sure the issue doesn't get reintroduced. Fixes: `09dbdf28f9` ("net/sched: taprio: fix calculation of maximum gate durations") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240527153955.553333-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 19:46:41 -07:00
Martin K. Petersen	4fedb1f095	Merge branch '6.10/scsi-queue' into 6.10/scsi-fixes Pull in remaining commits from 6.10/scsi-queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-28 21:29:03 -04:00
Kent Overstreet	83208cbf2f	bcachefs: Don't return -EROFS from mount on inconsistency error We were accidentally returning -EROFS during recovery on filesystem inconsistency - since this is what the journal returns on emergency shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 19:23:03 -04:00
Li Zhijian	49ba7b515c	cxl/region: Fix memregion leaks in devm_cxl_add_region() Move the mode verification to __create_region() before allocating the memregion to avoid the memregion leaks. Fixes: `6e09926418` ("cxl/region: Add volatile region creation support") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240507053421.456439-1-lizhijian@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-05-28 16:09:17 -07:00
Dave Jiang	d555105271	cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not include linux/vmalloc.h. Kernel v6.10 made changes that causes the currently included headers not depend on vmalloc.h and therefore mem.c can no longer compile. Add linux/vmalloc.h to fix compile issue. CC [M] tools/testing/cxl/test/mem.o tools/testing/cxl/test/mem.c: In function ‘label_area_release’: tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration] 1428 \| vfree(lsa); \| ^~~~~ \| kvfree tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’: tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration] 1466 \| mdata->lsa = vmalloc(LSA_SIZE); \| ^~~~~~~ \| kmalloc Fixes: `7d3eb23c4c` ("tools/testing/cxl: Introduce a mock memory device + driver") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-05-28 16:08:04 -07:00
Eric Garver	e8ded22ef0	netfilter: nft_fib: allow from forward/input without iif selector This removes the restriction of needing iif selector in the forward/input hooks for fib lookups when requested result is oif/oifname. Removing this restriction allows "loose" lookups from the forward hooks. Fixes: `be8be04e5d` ("netfilter: nft_fib: reverse path filter for policy-based routing on iif") Signed-off-by: Eric Garver <eric@garver.life> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-29 00:37:51 +02:00
Florian Westphal	21a673bddc	netfilter: tproxy: bail out if IP has been disabled on the device syzbot reports: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] [..] RIP: 0010:nf_tproxy_laddr4+0xb7/0x340 net/ipv4/netfilter/nf_tproxy_ipv4.c:62 Call Trace: nft_tproxy_eval_v4 net/netfilter/nft_tproxy.c:56 [inline] nft_tproxy_eval+0xa9a/0x1a00 net/netfilter/nft_tproxy.c:168 __in_dev_get_rcu() can return NULL, so check for this. Reported-and-tested-by: syzbot+b94a6818504ea90d7661@syzkaller.appspotmail.com Fixes: `cc6eb43385` ("tproxy: use the interface primary IP address as a default value for --on-ip") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-29 00:37:51 +02:00
Pablo Neira Ayuso	33c563ebf8	netfilter: nft_payload: skbuff vlan metadata mangle support Userspace assumes vlan header is present at a given offset, but vlan offload allows to store this in metadata fields of the skbuff. Hence mangling vlan results in a garbled packet. Handle this transparently by adding a parser to the kernel. If vlan metadata is present and payload offset is over 12 bytes (source and destination mac address fields), then subtract vlan header present in vlan metadata, otherwise mangle vlan metadata based on offset and length, extracting data from the source register. This is similar to: `8cfd23e674` ("netfilter: nft_payload: work around vlan header stripping") to deal with vlan payload mangling. Fixes: `7ec3f7b47b` ("netfilter: nft_payload: add packet mangling support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-29 00:35:32 +02:00
Kent Overstreet	8528bde1b6	bcachefs: Fix uninitialized var warning Can't actually be used uninitialized, but gcc was being silly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 18:21:51 -04:00
Kent Overstreet	759bb4eabc	bcachefs: Split out sb-errors_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:33:45 -04:00
Kent Overstreet	5c16c57488	bcachefs: Split out journal_seq_blacklist_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:32:03 -04:00
Kent Overstreet	24998050b6	bcachefs: Split out replicas_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:32:03 -04:00
Kent Overstreet	1cdcc6e3c2	bcachefs: Split out disk_groups_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:32:03 -04:00
Kent Overstreet	4c5eef0c50	bcachefs: split out sb-downgrade_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:32:03 -04:00
Kent Overstreet	016c22e410	bcachefs: split out sb-members_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 17:32:03 -04:00
Maarten Lankhorst	f73a058be5	Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes v6.10-rc1 is released, forward from v6.9 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2024-05-28 22:21:34 +02:00
Rafael J. Wysocki	1ae088232b	Merge tag 'linux-cpupower-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux into pm-tools Merge cpupower utility fix for 6.10-rc2 from Shuah Khan: "This cpupower fixes update for Linux 6.10-rc2 consists of one single fix to cpupower's P-State frequency calculation and reporting with AMD Family 1Ah+ processors, when using the acpi-cpufreq driver." * tag 'linux-cpupower-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs	2024-05-28 22:07:00 +02:00
Dhananjay Ugwekar	e4731baaf2	cpufreq: amd-pstate: Fix the inconsistency in max frequency units The nominal frequency in cpudata is maintained in MHz whereas all other frequencies are in KHz. This means we have to convert nominal frequency value to KHz before we do any interaction with other frequency values. In amd_pstate_set_boost(), this conversion from MHz to KHz is missed, fix that. Tested on a AMD Zen4 EPYC server Before: $ cat /sys/devices/system/cpu/cpufreq/policy/scaling_max_freq \| uniq 2151 $ cat /sys/devices/system/cpu/cpufreq/policy/cpuinfo_min_freq \| uniq 400000 $ cat /sys/devices/system/cpu/cpufreq/policy/scaling_cur_freq \| uniq 2151 409422 After: $ cat /sys/devices/system/cpu/cpufreq/policy/scaling_max_freq \| uniq 2151000 $ cat /sys/devices/system/cpu/cpufreq/policy/cpuinfo_min_freq \| uniq 400000 $ cat /sys/devices/system/cpu/cpufreq/policy/scaling_cur_freq \| uniq 2151000 1799527 Fixes: `ec437d71db` ("cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors") Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: Peter Jung <ptr1337@cachyos.org> Cc: 5.17+ <stable@vger.kernel.org> # 5.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-28 22:03:11 +02:00
Arnd Bergmann	779b8a14af	cpufreq: amd-pstate: remove global header file When extra warnings are enabled, gcc points out a global variable definition in a header: In file included from drivers/cpufreq/amd-pstate-ut.c:29: include/linux/amd-pstate.h:123:27: error: 'amd_pstate_mode_string' defined but not used [-Werror=unused-const-variable=] 123 \| static const char * const amd_pstate_mode_string[] = { \| ^~~~~~~~~~~~~~~~~~~~~~ This header is only included from two files in the same directory, and one of them uses only a single definition from it, so clean it up by moving most of the contents into the driver that uses them, and making shared bits a local header file. Fixes: `36c5014e54` ("cpufreq: amd-pstate: optimize driver working mode selection in amd_pstate_param()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-28 21:59:39 +02:00
Thomas Weißschuh	ac62f52138	ACPI: AC: Properly notify powermanagement core about changes The powermanagement core does various actions when a powersupply changes. It calls into notifiers, LED triggers, other power supplies and emits an uevent. To make sure that all these actions happen properly call power_supply_changed(). Reported-by: Rajas Paranjpe <paranjperajas@gmail.com> Closes: https://github.com/MrChromebox/firmware/issues/420#issuecomment-2132251318 Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-28 21:56:40 +02:00
Arnaldo Carvalho de Melo	2f523f29d3	tools headers UAPI: Update i915_drm.h with the kernel sources Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-28 16:54:31 -03:00
Andy Shevchenko	edcde848c0	PNP: Hide pnp_bus_type from the non-PNP code The pnp_bus_type is defined only when CONFIG_PNP=y, while being not guarded by ifdeffery in the header. Moreover, it's not used outside of the PNP code. Move it to the internal header to make sure no-one will try to (ab)use it. Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-28 21:53:51 +02:00
Andy Shevchenko	c7a5096781	PNP: Make dev_is_pnp() to be a function and export it for modules Since we have a dev_is_pnp() macro that utilises the address of the pnp_bus_type variable, the users, which can be compiled as modules, will fail to build. Convert the macro to be a function and export it to the modules to prevent build breakage. Reported-by: Woody Suwalski <terraluna977@gmail.com> Closes: https://lore.kernel.org/r/cc8a93b2-2504-9754-e26c-5d5c3bd1265c@gmail.com Fixes: `2a49b45cd0` ("PNP: Add dev_is_pnp() macro") Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-28 21:53:51 +02:00
Arnaldo Carvalho de Melo	88e520512a	tools headers UAPI: Sync kvm headers with the kernel sources To pick the changes in: `4af663c2f6` ("KVM: SEV: Allow per-guest configuration of GHCB protocol version") `4f5defae70` ("KVM: SEV: introduce KVM_SEV_INIT2 operation") `26c44aa9e0` ("KVM: SEV: define VM types for SEV and SEV-ES") `ac5c48027b` ("KVM: SEV: publish supported VMSA features") `651d61bc8b` ("KVM: PPC: Fix documentation for ppc mmu caps") That don't change functionality in tools/perf, as no new ioctl is added for the 'perf trace' scripts to harvest. This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joel Stanley <joel@jms.id.au> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Roth <michael.roth@amd.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/lkml/ZlYxAdHjyAkvGtMW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-28 16:49:36 -03:00
Arnaldo Carvalho de Melo	ac4b069035	tools arch x86: Sync the msr-index.h copy with the kernel sources To pick up the changes from these csets: `53bc516ade` ("x86/msr: Move ARCH_CAP_XAPIC_DISABLE bit definition to its rightful place") That patch just move definitions around, so this just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://lore.kernel.org/lkml/ZlYe8jOzd1_DyA7X@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-28 15:14:32 -03:00
Linus Torvalds	e0cce98fe2	Merge tag 'tpmdd-next-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "This fixes two unaddressed review comments for the HMAC encryption patch set. They are cosmetic but we are better off, if such unnecessary glitches do not exist in the release. The important part is enabling the HMAC encryption by default only on x86-64 because that is the only sufficiently tested arch. Finally, there is a bug fix for SPI transfer buffer allocation, which did not take into account the SPI header size" * tag 'tpmdd-next-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: Enable TCG_TPM2_HMAC by default only for X86_64 tpm: Rename TPM2_OA_TMPL to TPM2_OA_NULL_KEY and make it local tpm: Open code tpm_buf_parameters() tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer	2024-05-28 10:40:52 -07:00
Linus Torvalds	8d6bc6a2b1	Merge tag 'probes-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - uprobes: prevent mutex_lock() under rcu_read_lock(). Recent changes moved uprobe_cpu_buffer preparation which involves mutex_lock(), under __uprobe_trace_func() which is called inside rcu_read_lock(). Fix it by moving uprobe_cpu_buffer preparation outside of __uprobe_trace_func() - kprobe-events: handle the error case of btf_find_struct_member() * tag 'probes-fixes-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: fix error check in parse_btf_field() uprobes: prevent mutex_lock() under rcu_read_lock()	2024-05-28 10:17:40 -07:00
Jeff Johnson	a0fc1a053b	of: of_test: add MODULE_DESCRIPTION() Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240524-md-of-of_test-v1-1-6ebd078d620f@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-05-28 12:10:00 -05:00
Sagi Grimberg	c758b77d4a	nvmet: fix a possible leak when destroy a ctrl during qp establishment In nvmet_sq_destroy we capture sq->ctrl early and if it is non-NULL we know that a ctrl was allocated (in the admin connect request handler) and we need to release pending AERs, clear ctrl->sqs and sq->ctrl (for nvme-loop primarily), and drop the final reference on the ctrl. However, a small window is possible where nvmet_sq_destroy starts (as a result of the client giving up and disconnecting) concurrently with the nvme admin connect cmd (which may be in an early stage). But before kill_and_confirm of sq->ref (i.e. the admin connect managed to get an sq live reference). In this case, sq->ctrl was allocated however after it was captured in a local variable in nvmet_sq_destroy. This prevented the final reference drop on the ctrl. Solve this by re-capturing the sq->ctrl after all inflight request has completed, where for sure sq->ctrl reference is final, and move forward based on that. This issue was observed in an environment with many hosts connecting multiple ctrls simoutanuosly, creating a delay in allocating a ctrl leading up to this race window. Reported-by: Alex Turin <alex@vastdata.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-28 10:01:52 -07:00
Keith Busch	be647e2c76	nvme: use srcu for iterating namespace list The nvme pci driver synchronizes with all the namespace queues during a reset to ensure that there's no pending timeout work. Meanwhile the timeout work potentially iterates those same namespaces to freeze their queues. Each of those namespace iterations use the same read lock. If a write lock should somehow get between the synchronize and freeze steps, then forward progress is deadlocked. We had been relying on the nvme controller state machine to ensure the reset work wouldn't conflict with timeout work. That guarantee may be a bit fragile to rely on, so iterate the namespace lists without taking potentially circular locks, as reported by lockdep. Link: https://lore.kernel.org/all/20220930001943.zdbvolc3gkekfmcv@shindev/ Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-28 09:43:32 -07:00
Kent Overstreet	f1d4fed13f	bcachefs: Better fsck error message for key version Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	088d0de812	bcachefs: btree_gc can now handle unknown btrees Compatibility fix - we no longer have a separate table for which order gc walks btrees in, and special case the stripes btree directly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Jeff Johnson	b4131076c1	bcachefs: add missing MODULE_DESCRIPTION() Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/bcachefs/mean_and_variance_test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	247c056bde	bcachefs: Fix setting of downgrade recovery passes/errors bch2_check_version_downgrade() was setting c->sb.version, which bch2_sb_set_downgrade() expects to be at the previous version; and it shouldn't even have been set directly because c->sb.version is updated by write_super(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	08f50005e0	bcachefs: Run check_key_has_snapshot in snapshot_delete_keys() delete_dead_snapshots now runs before the main fsck.c passes which check for keys for invalid snapshots; thus, it needs those checks as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	82af5ceb5d	bcachefs: Refactor delete_dead_snapshots() Consolidate per-key work into delete_dead_snapshots_process_key(), so we now walk all keys once, not twice. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	218e5e0c2a	bcachefs: Fix locking assert We now track whether a transaction is locked, and verify that we don't have nodes locked when the transaction isn't locked; reorder relocks to not pop the new assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	9e1a66e668	bcachefs: Fix lookup_first_inode() when inode_generations are present This function is used for finding the hash seed (which is the same in all versions of an inode in different snapshots): ff an inode has been deleted in a child snapshot we need to iterate until we find a live version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:26 -04:00
Kent Overstreet	1292bc2ebf	bcachefs: Plumb bkey into __btree_err() It can be useful to know the exact byte offset within a btree node where an error occured. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-28 11:29:23 -04:00
Dhananjay Ugwekar	43cad521c6	tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs Update cpupower's P-State frequency calculation and reporting with AMD Family 1Ah+ processors, when using the acpi-cpufreq driver. This is due to a change in the PStateDef MSR layout in AMD Family 1Ah+. Tested on 4th and 5th Gen AMD EPYC system Signed-off-by: Ananth Narayan <Ananth.Narayan@amd.com> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-05-28 09:22:57 -06:00
Filipe Manana	f13e01b89d	btrfs: ensure fast fsync waits for ordered extents after a write failure If a write path in COW mode fails, either before submitting a bio for the new extents or an actual IO error happens, we can end up allowing a fast fsync to log file extent items that point to unwritten extents. This is because dropping the extent maps happens when completing ordered extents, at btrfs_finish_one_ordered(), and the completion of an ordered extent is executed in a work queue. This can result in a fast fsync to start logging file extent items based on existing extent maps before the ordered extents complete, therefore resulting in a log that has file extent items that point to unwritten extents, resulting in a corrupt file if a crash happens after and the log tree is replayed the next time the fs is mounted. This can happen for both direct IO writes and buffered writes. For example consider a direct IO write, in COW mode, that fails at btrfs_dio_submit_io() because btrfs_extract_ordered_extent() returned an error: 1) We call btrfs_finish_ordered_extent() with the 'uptodate' parameter set to false, meaning an error happened; 2) That results in marking the ordered extent with the BTRFS_ORDERED_IOERR flag; 3) btrfs_finish_ordered_extent() queues the completion of the ordered extent - so that btrfs_finish_one_ordered() will be executed later in a work queue. That function will drop extent maps in the range when it's executed, since the extent maps point to unwritten locations (signaled by the BTRFS_ORDERED_IOERR flag); 4) After calling btrfs_finish_ordered_extent() we keep going down the write path and unlock the inode; 5) After that a fast fsync starts and locks the inode; 6) Before the work queue executes btrfs_finish_one_ordered(), the fsync task sees the extent maps that point to the unwritten locations and logs file extent items based on them - it does not know they are unwritten, and the fast fsync path does not wait for ordered extents to complete, which is an intentional behaviour in order to reduce latency. For the buffered write case, here's one example: 1) A fast fsync begins, and it starts by flushing delalloc and waiting for the writeback to complete by calling filemap_fdatawait_range(); 2) Flushing the dellaloc created a new extent map X; 3) During the writeback some IO error happened, and at the end io callback (end_bbio_data_write()) we call btrfs_finish_ordered_extent(), which sets the BTRFS_ORDERED_IOERR flag in the ordered extent and queues its completion; 4) After queuing the ordered extent completion, the end io callback clears the writeback flag from all pages (or folios), and from that moment the fast fsync can proceed; 5) The fast fsync proceeds sees extent map X and logs a file extent item based on extent map X, resulting in a log that points to an unwritten data extent - because the ordered extent completion hasn't run yet, it happens only after the logging. To fix this make btrfs_finish_ordered_extent() set the inode flag BTRFS_INODE_NEEDS_FULL_SYNC in case an error happened for a COW write, so that a fast fsync will wait for ordered extent completion. Note that this issues of using extent maps that point to unwritten locations can not happen for reads, because in read paths we start by locking the extent range and wait for any ordered extents in the range to complete before looking for extent maps. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-05-28 16:35:12 +02:00
Arnaldo Carvalho de Melo	da42b5229b	tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall But also to wire up shadow stacks on 32-bit x86, picking up those changes from these csets: `ff388fe5c4` ("mseal: wire up mseal syscall") `2883f01ec3` ("x86/shstk: Enable shadow stacks for x32") This makes 'perf trace' support it, now its possible, for instance to do: # perf trace -e mseal --max-stack=16 Here is an example with the 'sendmmsg' syscall: root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1 0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT\|NOSIGNAL) = 1 syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) [0x117ce7] (/usr/lib64/libc.so.6 (deleted)) root@x1:~# To do a system wide tracing of the new 'mseal' syscall with a backtrace of at most 16 entries. This addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H J Lu <hjl.tools@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-28 11:10:00 -03:00
MD Danish Anwar	56a5cf538c	net: ti: icssg-prueth: Fix start counter for ft1 filter The start counter for FT1 filter is wrongly set to 0 in the driver. FT1 is used for source address violation (SAV) check and source address starts at Byte 6 not Byte 0. Fix this by changing start counter to ETH_ALEN in icssg_ft1_set_mac_addr(). Fixes: `e9b4ece7d7` ("net: ti: icssg-prueth: Add Firmware config and classification APIs.") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20240527063015.263748-1-danishanwar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 15:29:52 +02:00
Coly Li	74d4ce92e0	bcache: code cleanup in __bch_bucket_alloc_set() In __bch_bucket_alloc_set() the lines after lable 'err:' indeed do nothing useful after multiple cache devices are removed from bcache code. This cleanup patch drops the useless code to save a bit CPU cycles. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-4-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:55:59 -06:00
Coly Li	05356938a4	bcache: call force_wake_up_gc() if necessary in check_should_bypass() If there are extreme heavy write I/O continuously hit on relative small cache device (512GB in my testing), it is possible to make counter c->gc_stats.in_use continue to increase and exceed CUTOFF_CACHE_ADD. If 'c->gc_stats.in_use > CUTOFF_CACHE_ADD' happens, all following write requests will bypass the cache device because check_should_bypass() returns 'true'. Because all writes bypass the cache device, counter c->sectors_to_gc has no chance to be negative value, and garbage collection thread won't be waken up even the whole cache becomes clean after writeback accomplished. The aftermath is that all write I/Os go directly into backing device even the cache device is clean. To avoid the above situation, this patch uses a quite conservative way to fix: if 'c->gc_stats.in_use > CUTOFF_CACHE_ADD' happens, only wakes up garbage collection thread when the whole cache device is clean. Before the fix, the writes-always-bypass situation happens after 10+ hours write I/O pressure on 512GB Intel optane memory which acts as cache device. After this fix, such situation doesn't happen after 36+ hours testing. Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-3-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:55:59 -06:00
Dongsheng Yang	a14a68b769	bcache: allow allocator to invalidate bucket in gc Currently, if the gc is running, when the allocator found free_inc is empty, allocator has to wait the gc finish. Before that, the IO is blocked. But actually, there would be some buckets is reclaimable before gc, and gc will never mark this kind of bucket to be unreclaimable. So we can put these buckets into free_inc in gc running to avoid IO being blocked. Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn> Signed-off-by: Coly Li <colyli@suse.de> Link: https://lore.kernel.org/r/20240528120914.28705-2-colyli@suse.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:55:59 -06:00
Hannes Reinecke	e993db2d6e	block: check for max_hw_sectors underflow The logical block size need to be smaller than the max_hw_sector setting, otherwise we can't even transfer a single LBA. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:55:23 -06:00
Christoph Hellwig	e528bede6f	block: stack max_user_sectors The max_user_sectors is one of the three factors determining the actual max_sectors limit for READ/WRITE requests. Because of that it needs to be stacked at least for the device mapper multi-path case where requests are directly inserted on the lower device. For SCSI disks this is important because the sd driver actually sets it's own advisory limit that is lower than max_hw_sectors based on the block limits VPD page. While this is a bit odd an unusual, the same effect can happen if a user or udev script tweaks the value manually. Fixes: `4f563a6473` ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:54:36 -06:00
Christoph Hellwig	bafea1c58b	sd: also set max_user_sectors when setting max_sectors sd can set a max_sectors value that is lower than the max_hw_sectors limit based on the block limits VPD page. While this is rather unusual, it used to work until the max_user_sectors field was split out to cleanly deal with conflicting hardware and user limits when the hardware limit changes. Also set max_user_sectors to ensure the limit can properly be stacked. Fixes: `4f563a6473` ("block: add a max_user_discard_sectors queue limit") Reported-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20240523182618.602003-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:54:36 -06:00
Damien Le Moal	233e27b4d2	null_blk: Print correct max open zones limit in null_init_zoned_dev() When changing the maximum number of open zones, print that number instead of the total number of zones. Fixes: `dc4d137ee3` ("null_blk: add support for max open/active zone limit for zoned devices") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240528062852.437599-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-28 06:54:21 -06:00
Christian Brauner	db003a28e0	netfs: fix kernel doc for nets_wait_for_outstanding_io() The @inode parameter wasn't documented leading to new doc build warnings. Fixes: `f89ea63f1c` ("netfs, 9p: Fix race between umount and async request completion") Link: https://lore.kernel.org/r/20240528133050.7e09d78e@canb.auug.org.au Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-28 14:34:15 +02:00
Christian Brauner	0c07c273a5	debugfs: continue to ignore unknown mount options Wolfram reported that debugfs remained empty on some of his boards triggering the message "debugfs: Unknown parameter 'auto'". The root of the issue is that we ignored unknown mount options in the old mount api but we started rejecting unknown mount options in the new mount api. Continue to ignore unknown mount options to not regress userspace. Fixes: `a20971c187` ("vfs: Convert debugfs to use the new mount API") Link: https://lore.kernel.org/r/20240527100618.np2wqiw5mz7as3vk@ninjato Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-28 14:32:42 +02:00
Alina Yu	72b6a2d650	regulator: rtq2208: Fix invalid memory access when devm_of_regulator_put_matches is called In this patch, a software bug has been fixed. rtq2208_ldo_match is no longer a local variable. It prevents invalid memory access when devm_of_regulator_put_matches is called. Signed-off-by: Alina Yu <alina_yu@richtek.com> Link: https://msgid.link/r/4ce8c4f16f1cf3aa4e5f36c0694dd3c5ccf3cd1c.1716870419.git.alina_yu@richtek.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-28 13:22:54 +01:00
Pierre-Louis Bossart	e662c90a6d	ALSA/hda: intel-dsp-config: reduce log verbosity The information on PCI class/subclass was interesting in the Skylake timeframe, since the DSP was only enabled on a limited number of platforms. Now most Intel platforms do enable the DSP, so the information is less interesting to log. When a DSP driver is used, the common helper may be called multiple times due to deferred probes, but there's no reason to print the same information multiple times. Using dev_info_once() covers all the existing usages for internal cards with DSPs. External cards don't rely on DSPs so far. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20240527193808.165652-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-28 13:07:51 +02:00
Takashi Iwai	a200df7deb	ALSA: seq: Don't clear bank selection at event -> UMP MIDI2 conversion The current code to convert from a legacy sequencer event to UMP MIDI2 clears the bank selection at each time the program change is submitted. This is confusing and may lead to incorrect bank values tranmitted to the destination in the end. Drop the line to clear the bank info and keep the provided values. Fixes: `e9e02819a9` ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-28 13:07:20 +02:00
Takashi Iwai	8a42886cae	ALSA: seq: Fix missing bank setup between MIDI1/MIDI2 UMP conversion When a UMP packet is converted between MIDI1 and MIDI2 protocols, the bank selection may be lost. The conversion from MIDI1 to MIDI2 needs the encoding of the bank into UMP_MSG_STATUS_PROGRAM bits, while the conversion from MIDI2 to MIDI1 needs the extraction from that instead. This patch implements the missing bank selection mechanism in those conversions. Fixes: `e9e02819a9` ("ALSA: seq: Automatic conversion of UMP events") Link: https://lore.kernel.org/r/20240527151852.29036-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-28 13:07:08 +02:00
Jarkko Sakkinen	d3e43a8fa4	tpm: Enable TCG_TPM2_HMAC by default only for X86_64 Given the not fully root caused performance issues on non-x86 platforms, enable the feature by default only for x86-64. That is the platform it brings the most value and has gone most of the QA. Can be reconsidered later and can be obviously opt-in enabled too on any arch. Link: https://lore.kernel.org/linux-integrity/bf67346ef623ff3c452c4f968b7d900911e250c3.camel@gmail.com/#t Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-05-28 13:14:29 +03:00
Jarkko Sakkinen	f09fc6cee0	tpm: Rename TPM2_OA_TMPL to TPM2_OA_NULL_KEY and make it local Rename and document TPM2_OA_TMPL, as originally requested in the patch set review, but left unaddressed without any appropriate reasoning. The new name is TPM2_OA_NULL_KEY, has a documentation and is local only to tpm2-sessions.c. Link: https://lore.kernel.org/linux-integrity/ddbeb8111f48a8ddb0b8fca248dff6cc9d7079b2.camel@HansenPartnership.com/ Link: https://lore.kernel.org/linux-integrity/CZCKTWU6ZCC9.2UTEQPEVICYHL@suppilovahvero/ Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-05-28 13:14:23 +03:00
Thadeu Lima de Souza Cascardo	4b4647add7	sock_map: avoid race between sock_map_close and sk_psock_put sk_psock_get will return NULL if the refcount of psock has gone to 0, which will happen when the last call of sk_psock_put is done. However, sk_psock_drop may not have finished yet, so the close callback will still point to sock_map_close despite psock being NULL. This can be reproduced with a thread deleting an element from the sock map, while the second one creates a socket, adds it to the map and closes it. That will trigger the WARN_ON_ONCE: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Modules linked in: CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701 Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02 RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293 RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000 RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0 RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3 R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840 R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870 FS: 000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0 Call Trace: <TASK> unix_release+0x87/0xc0 net/unix/af_unix.c:1048 __sock_release net/socket.c:659 [inline] sock_close+0xbe/0x240 net/socket.c:1421 __fput+0x42b/0x8a0 fs/file_table.c:422 __do_sys_close fs/open.c:1556 [inline] __se_sys_close fs/open.c:1541 [inline] __x64_sys_close+0x7f/0x110 fs/open.c:1541 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb37d618070 Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070 RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Use sk_psock, which will only check that the pointer is not been set to NULL yet, which should only happen after the callbacks are restored. If, then, a reference can still be gotten, we may call sk_psock_stop and cancel psock->work. As suggested by Paolo Abeni, reorder the condition so the control flow is less convoluted. After that change, the reproducer does not trigger the WARN_ON_ONCE anymore. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reported-by: syzbot+07a2e4a1a57118ef7355@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=07a2e4a1a57118ef7355 Fixes: `aadb2bb83f` ("sock_map: Fix a potential use-after-free in sock_map_close()") Fixes: `5b4a79ba65` ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself") Cc: stable@vger.kernel.org Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20240524144702.1178377-1-cascardo@igalia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 12:05:19 +02:00
Jarkko Sakkinen	f3d7ba9e1b	tpm: Open code tpm_buf_parameters() With only single call site, this makes no sense (slipped out of the radar during the review). Open code and document the action directly to the site, to make it more readable. Fixes: `1b6d7f9eb1` ("tpm: add session encryption protection to tpm2_get_random()") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-05-28 13:03:57 +03:00
Matthew R. Ochs	195aba96b8	tpm_tis_spi: Account for SPI header when allocating TPM SPI xfer buffer The TPM SPI transfer mechanism uses MAX_SPI_FRAMESIZE for computing the maximum transfer length and the size of the transfer buffer. As such, it does not account for the 4 bytes of header that prepends the SPI data frame. This can result in out-of-bounds accesses and was confirmed with KASAN. Introduce SPI_HDRSIZE to account for the header and use to allocate the transfer buffer. Fixes: `a86a42ac2b` ("tpm_tis_spi: Add hardware wait polling") Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Tested-by: Carol Soto <csoto@nvidia.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>	2024-05-28 13:03:57 +03:00
Niranjana Vishwanathapura	6c5cd0807c	drm/xe: Properly handle alloc_guc_id() failure Release the submission_state lock if alloc_guc_id() fails. v2: Add Fixes tag and CC stable kernel Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240521201711.4934-1-niranjana.vishwanathapura@intel.com (cherry picked from commit `40672b792a`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-05-28 08:53:45 +02:00
Matthew Brost	c8ea2c31f5	drm/xe: Only use reserved BCS instances for usm migrate exec queue The GuC context scheduling queue is 2 entires deep, thus it is possible for a migration job to be stuck behind a fault if migration exec queue shares engines with user jobs. This can deadlock as the migrate exec queue is required to service page faults. Avoid deadlock by only using reserved BCS instances for usm migrate exec queue. Fixes: `a043fbab7a` ("drm/xe/pvc: Use fast copy engines as migrate engine on PVC") Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415190453.696553-2-matthew.brost@intel.com Reviewed-by: Brian Welty <brian.welty@intel.com> (cherry picked from commit `04f4a70a18`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-05-28 08:53:35 +02:00
Himal Prasad Ghimiray	77b79df026	drm/xe: Change pcode timeout to 50msec while polling again Polling is initially attempted with timeout_base_ms enabled for preemption, and if it exceeds this timeframe, another attempt is made without preemption, allowing an additional 50 ms before timing out. v2 - Rebase v3 - Move warnings to separate patch (Lucas) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Fixes: `7dc9b92dcf` ("drm/xe: Remove i915_utils dependency from xe_pcode.") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240508152216.3263109-2-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `c81858eb52`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-05-28 08:53:24 +02:00
Bingbu Cao	c19fa08c14	media: intel/ipu6: fix the buffer flags caused by wrong parentheses The buffer flags is set by wrong due to wrong parentheses, the FL_INCOMING flag is never taken an account. Fix it by wrapping the ternary conditional operation with parentheses. Fixes: `3c1dfb5a69` ("media: intel/ipu6: input system video nodes and buffer queues") Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-28 08:00:14 +02:00
Christophe JAILLET	ab0ed48101	media: intel/ipu6: Fix an error handling path in isys_probe() If an error occurs after a successful alloc_fw_msg_bufs() call, some resources should be released as already done in the remove function. Add a new free_fw_msg_bufs() function that releases what has been allocated by alloc_fw_msg_bufs(). Also use this new function in isys_remove() to avoid some code duplication. Fixes: `f50c4ca0a8` ("media: intel/ipu6: add the main input system driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-28 08:00:14 +02:00
Christophe JAILLET	266b44ec9a	media: intel/ipu6: Move isys_remove() close to isys_probe() In preparation to fixing a leak in isys_probe(), move isys_remove(). The fix will introduce a new function that will also be called from isys_remove(). The code needs to be rearranged to avoid a forward declaration. Having the .remove function close to the .probe function is also more standard. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-28 08:00:14 +02:00
Christophe JAILLET	fe61b2906b	media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove() pcim_iomap_regions() and pcim_enable_device() are used in the probe. So the corresponding managed resources don't need to be freed explicitly in the remove function. Remove the incorrect pci_release_regions() and pci_disable_device() calls. Fixes: `25fedc0219` ("media: intel/ipu6: add Intel IPU6 PCI device driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-28 08:00:14 +02:00
Sakari Ailus	fd7ccfb112	media: Documentation: v4l: Fix ACTIVE route flag The documentation in one occasion mentions the VIDIOC_SUBDEV_STREAM_FL_ACTIVE flag. This was meant to be V4L2_SUBDEV_STREAM_FL_ACTIVE as it's a flag, not an IOCTL. Fix it. Fixes: `cd2c75454d` ("media: Documentation: Document S_ROUTING behaviour") Reported-by: Samuel Wein PhD <sam@samwein.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>	2024-05-28 08:00:14 +02:00
Thorsten Blum	c519cf9b74	docs: netdev: Fix typo in Signed-off-by tag s/of/off/ Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Fixes: `e110ba6592` ("docs: netdev: add note about Changes Requested and revising commit messages") Link: https://lore.kernel.org/r/20240527103618.265801-2-thorsten.blum@toblux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:15:22 -07:00
Jakub Kicinski	6e15774d92	Merge branch 'selftests-mptcp-mark-unstable-subtests-as-flaky' Matthieu Baerts says: ==================== selftests: mptcp: mark unstable subtests as flaky Some subtests can be unstable, failing once every X runs. Fixing them can take time: there could be an issue in the kernel or in the subtest, and it is then important to do a proper analysis, not to hide real bugs. To avoid creating noises on the different CIs where tests are more unstable than on our side, some subtests have been marked as flaky. As a result, errors with these subtests (if any) are ignored. Note that the MPTCP CI will continue to track these flaky subtests. All these unstable subtests are also tracked by our bug tracker. These are fixes for the -net tree, because the instabilities are visible there. The first patch introducing the flake support has no 'Fixes' tags, mainly because it requires recent and important refactoring done in all MPTCP selftests. Backporting that to old versions where the flaky tests have been introduced would be too difficult, and probably not worth it. The other patches, adding MPTCP_LIB_SUBTEST_FLAKY=1, have a Fixes tag, simply to ease the backport of the future fixes removing them along with the proper fix. ==================== Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-0-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:13:01 -07:00
Matthieu Baerts (NGI0)	38af56e666	selftests: mptcp: join: mark 'fail' tests as flaky These tests are rarely unstable. It depends on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is sometimes present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: `b6e074e171` ("selftests: mptcp: add infinite map testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/491 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-4-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:12:51 -07:00
Matthieu Baerts (NGI0)	8c06ac2178	selftests: mptcp: join: mark 'fastclose' tests as flaky These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel, and if a debug kernel config is being used. It looks like this issue is often present with the NetDev CI. While this is being investigated, the tests are marked as flaky not to create noises on such CIs. Fixes: `01542c9bf9` ("selftests: mptcp: add fastclose testcase") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/324 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-3-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:12:50 -07:00
Matthieu Baerts (NGI0)	cc73a6577a	selftests: mptcp: simult flows: mark 'unbalanced' tests as flaky These tests are flaky since their introduction. This might be less or not visible depending on the CI running the tests, especially if it is also busy doing other tasks in parallel. A first analysis shown that the transfer can be slowed down when there are some re-injections at the MPTCP level. Such re-injections can of course happen, and disturb the transfer, but it looks strange to have them in this lab. That could be caused by the kernel having access to less CPU cycles -- e.g. when other activities are executed in parallel -- or by a misinterpretation on the MPTCP packet scheduler side. While this is being investigated, the tests are marked as flaky not to create noises in other CIs. Fixes: `219d04992b` ("mptcp: push pending frames when subflow has free space") Link: https://github.com/multipath-tcp/mptcp_net-next/issues/475 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-2-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:12:50 -07:00
Matthieu Baerts (NGI0)	5597613fb3	selftests: mptcp: lib: support flaky subtests Some subtests can be unstable, failing once every X runs. Fixing them can take time: there could be an issue in the kernel or in the subtest, and it is then important to do a proper analysis, not to hide real bugs. To avoid creating noises on the different CIs, it is important to have a simple way to mark subtests as flaky, and ignore the errors. This is what this patch introduces: subtests can be marked as flaky by setting MPTCP_LIB_SUBTEST_FLAKY env var to 1, e.g. MPTCP_LIB_SUBTEST_FLAKY=1 <run flaky subtest> The subtest will be executed, and errors (if any) will be ignored. It is still good to run these subtests, as it exercises code, and the results can still be useful for the on-going investigations. Note that the MPTCP CI will continue to track these flaky subtests by setting SELFTESTS_MPTCP_LIB_OVERRIDE_FLAKY env var to 1, and a ticket has to be created before marking subtests as flaky. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240524-upstream-net-20240524-selftests-mptcp-flaky-v1-1-a352362f3f8e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:12:50 -07:00
Jakub Kicinski	7a8cc96ebe	Merge branch 'intel-wired-lan-driver-updates-2024-05-23-ice-idpf' Jacob Keller says: ==================== Intel Wired LAN Driver Updates 2024-05-23 (ice, idpf) This series contains two fixes which finished up testing. First, Alexander fixes an issue in idpf caused by enabling NAPI and interrupts prior to actually allocating the Rx buffers. Second, Jacob fixes the ice driver VSI VLAN counting logic to ensure that addition and deletion of VLANs properly manages the total VSI count. ==================== Link: https://lore.kernel.org/r/20240523-net-2024-05-23-intel-net-fixes-v1-0-17a923e0bb5f@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:11:46 -07:00
Jacob Keller	82617b9a04	ice: fix accounting if a VLAN already exists The ice_vsi_add_vlan() function is used to add a VLAN filter for the target VSI. This function prepares a filter in the switch table for the given VSI. If it succeeds, the vsi->num_vlan counter is incremented. It is not considered an error to add a VLAN which already exists in the switch table, so the function explicitly checks and ignores -EEXIST. The vsi->num_vlan counter is still incremented. This seems incorrect, as it means we can double-count in the case where the same VLAN is added twice by the caller. The actual table will have one less filter than the count. The ice_vsi_del_vlan() function similarly checks and handles the -ENOENT condition for when deleting a filter that doesn't exist. This flow only decrements the vsi->num_vlan if it actually deleted a filter. The vsi->num_vlan counter is used only in a few places, primarily related to tracking the number of non-zero VLANs. If the vsi->num_vlans gets out of sync, then ice_vsi_num_non_zero_vlans() will incorrectly report more VLANs than are present, and ice_vsi_has_non_zero_vlans() could return true potentially in cases where there are only VLAN 0 filters left. Fix this by only incrementing the vsi->num_vlan in the case where we actually added an entry, and not in the case where the entry already existed. Fixes: `a1ffafb0b4` ("ice: Support configuring the device to Double VLAN Mode") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240523-net-2024-05-23-intel-net-fixes-v1-2-17a923e0bb5f@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:11:43 -07:00
Alexander Lobakin	d514c8b542	idpf: don't enable NAPI and interrupts prior to allocating Rx buffers Currently, idpf enables NAPI and interrupts prior to allocating Rx buffers. This may lead to frame loss (there are no buffers to place incoming frames) and even crashes on quick ifup-ifdown. Interrupts must be enabled only after all the resources are here and available. Split interrupt init into two phases: initialization and enabling, and perform the second only after the queues are fully initialized. Note that we can't just move interrupt initialization down the init process, as the queues must have correct a ::q_vector pointer set and NAPI already added in order to allocate buffers correctly. Also, during the deinit process, disable HW interrupts first and only then disable NAPI. Otherwise, there can be a HW event leading to napi_schedule(), but the NAPI will already be unavailable. Fixes: `d4d5587182` ("idpf: initialize interrupts and enable vport") Reported-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240523-net-2024-05-23-intel-net-fixes-v1-1-17a923e0bb5f@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:11:43 -07:00
Alexander Lobakin	266aa3b481	page_pool: fix &page_pool_params kdoc issues After the tagged commit, @netdev got documented twice and the kdoc script didn't notice that. Remove the second description added later and move the initial one according to the field position. After merging commit `5f8e4007c1` ("kernel-doc: fix struct_group_tagged() parsing"), kdoc requires to describe struct groups as well. &page_pool_params has 2 struct groups which generated new warnings, describe them to resolve this. Fixes: `403f11ac9a` ("page_pool: don't use driver-set flags field directly") Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20240524112859.2757403-1-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 17:00:22 -07:00
Horatiu Vultur	4fb679040d	net: micrel: Fix lan8841_config_intr after getting out of sleep mode When the interrupt is enabled, the function lan8841_config_intr tries to clear any pending interrupts by reading the interrupt status, then checks the return value for errors and then continue to enable the interrupt. It has been seen that once the system gets out of sleep mode, the interrupt status has the value 0x400 meaning that the PHY detected that the link was in low power. That is correct value but the problem is that the check is wrong. We try to check for errors but we return an error also in this case which is not an error. Therefore fix this by returning only when there is an error. Fixes: `a8f1a19d27` ("net: micrel: Add support for lan8841 PHY") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20240524085350.359812-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:58:39 -07:00
Xiaolei Wang	bf0497f53c	net:fec: Add fec_enet_deinit() When fec_probe() fails or fec_drv_remove() needs to release the fec queue and remove a NAPI context, therefore add a function corresponding to fec_enet_init() and call fec_enet_deinit() which does the opposite to release memory and remove a NAPI context. Fixes: `59d0f74656` ("net: fec: init multi queue date structure") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240524050528.4115581-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:55:32 -07:00
Rob Herring (Arm)	12f86b9af9	dt-bindings: net: pse-pd: ti,tps23881: Fix missing "additionalProperties" constraints The child nodes are missing "additionalProperties" constraints which means any undocumented properties or child nodes are allowed. Add the constraints and all the undocumented properties exposed by the fix. Fixes: `f562202fed` ("dt-bindings: net: pse-pd: Add bindings for TPS23881 PSE controller") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240523171750.2837331-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:49:16 -07:00
Rob Herring (Arm)	0fe53c0ab0	dt-bindings: net: pse-pd: microchip,pd692x0: Fix missing "additionalProperties" constraints The child nodes are missing "additionalProperties" constraints which means any undocumented properties or child nodes are allowed. Add the constraints, and fix the fallout of wrong manager node regex and missing properties. Fixes: `9c1de033af` ("dt-bindings: net: pse-pd: Add bindings for PD692x0 PSE controller") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240523171732.2836880-1-robh@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:48:55 -07:00
Eric Dumazet	f4dca95fc0	tcp: reduce accepted window in NEW_SYN_RECV state Jason commit made checks against ACK sequence less strict and can be exploited by attackers to establish spoofed flows with less probes. Innocent users might use tcp_rmem[1] == 1,000,000,000, or something more reasonable. An attacker can use a regular TCP connection to learn the server initial tp->rcv_wnd, and use it to optimize the attack. If we make sure that only the announced window (smaller than 65535) is used for ACK validation, we force an attacker to use 65537 packets to complete the 3WHS (assuming server ISN is unknown) Fixes: `378979e94e` ("tcp: remove 64 KByte limit for initial tp->rcv_wnd value") Link: https://datatracker.ietf.org/meeting/119/materials/slides-119-tcpm-ghost-acks-00 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240523130528.60376-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:47:23 -07:00
Willem de Bruijn	be008726d0	net: gro: initialize network_offset in network layer Syzkaller was able to trigger kernel BUG at net/core/gro.c:424 ! RIP: 0010:gro_pull_from_frag0 net/core/gro.c:424 [inline] RIP: 0010:gro_try_pull_from_frag0 net/core/gro.c:446 [inline] RIP: 0010:dev_gro_receive+0x242f/0x24b0 net/core/gro.c:571 Due to using an incorrect NAPI_GRO_CB(skb)->network_offset. The referenced commit sets this offset to 0 in skb_gro_reset_offset. That matches the expected case in dev_gro_receive: pp = INDIRECT_CALL_INET(ptype->callbacks.gro_receive, ipv6_gro_receive, inet_gro_receive, &gro_list->list, skb); But syzkaller injected an skb with protocol ETH_P_TEB into an ip6gre device (by writing the IP6GRE encapsulated version to a TAP device). The result was a first call to eth_gro_receive, and thus an extra ETH_HLEN in network_offset that should not be there. First issue hit is when computing offset from network header in ipv6_gro_pull_exthdrs. Initialize both offsets in the network layer gro_receive. This pairs with all reads in gro_receive, which use skb_gro_receive_network_offset(). Fixes: `186b1ea73a` ("net: gro: use cb instead of skb->network_header") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> CC: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240523141434.1752483-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:46:59 -07:00
Ido Schimmel	7b05ab85e2	ipv4: Fix address dump when IPv4 is disabled on an interface Cited commit started returning an error when user space requests to dump the interface's IPv4 addresses and IPv4 is disabled on the interface. Restore the previous behavior and do not return an error. Before cited commit: # ip address show dev dummy1 10: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether e2:40:68:98:d0:18 brd ff:ff:ff:ff:ff:ff inet6 fe80::e040:68ff:fe98:d018/64 scope link proto kernel_ll valid_lft forever preferred_lft forever # ip link set dev dummy1 mtu 67 # ip address show dev dummy1 10: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 67 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether e2:40:68:98:d0:18 brd ff:ff:ff:ff:ff:ff After cited commit: # ip address show dev dummy1 10: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether 32:2d:69:f2:9c:99 brd ff:ff:ff:ff:ff:ff inet6 fe80::302d:69ff:fef2:9c99/64 scope link proto kernel_ll valid_lft forever preferred_lft forever # ip link set dev dummy1 mtu 67 # ip address show dev dummy1 RTNETLINK answers: No such device Dump terminated With this patch: # ip address show dev dummy1 10: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether de:17:56:bb:57:c0 brd ff:ff:ff:ff:ff:ff inet6 fe80::dc17:56ff:febb:57c0/64 scope link proto kernel_ll valid_lft forever preferred_lft forever # ip link set dev dummy1 mtu 67 # ip address show dev dummy1 10: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 67 qdisc noqueue state UNKNOWN group default qlen 1000 link/ether de:17:56:bb:57:c0 brd ff:ff:ff:ff:ff:ff I fixed the exact same issue for IPv6 in commit `c04f7dfe6e` ("ipv6: Fix address dump when IPv6 is disabled on an interface"), but noted [1] that I am not doing the change for IPv4 because I am not aware of a way to disable IPv4 on an interface other than unregistering it. I clearly missed the above case. [1] https://lore.kernel.org/netdev/20240321173042.2151756-1-idosch@nvidia.com/ Fixes: `cdb2f80f1c` ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Carolina Jubran <cjubran@nvidia.com> Reported-by: Yamen Safadi <ysafadi@nvidia.com> Tested-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240523110257.334315-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:46:05 -07:00
Jakub Kicinski	2786ae339e	Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-05-27 We've added 15 non-merge commits during the last 7 day(s) which contain a total of 18 files changed, 583 insertions(+), 55 deletions(-). The main changes are: 1) Fix broken BPF multi-uprobe PID filtering logic which filtered by thread while the promise was to filter by process, from Andrii Nakryiko. 2) Fix the recent influx of syzkaller reports to sockmap which triggered a locking rule violation by performing a map_delete, from Jakub Sitnicki. 3) Fixes to netkit driver in particular on skb->pkt_type override upon pass verdict, from Daniel Borkmann. 4) Fix an integer overflow in resolve_btfids which can wrongly trigger build failures, from Friedrich Vock. 5) Follow-up fixes for ARC JIT reported by static analyzers, from Shahab Vahedi. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Cover verifier checks for mutating sockmap/sockhash Revert "bpf, sockmap: Prevent lock inversion deadlock in map delete elem" bpf: Allow delete from sockmap/sockhash only if update is allowed selftests/bpf: Add netkit test for pkt_type selftests/bpf: Add netkit tests for mac address netkit: Fix pkt_type override upon netkit pass verdict netkit: Fix setting mac address in l2 mode ARC, bpf: Fix issues reported by the static analyzers selftests/bpf: extend multi-uprobe tests with USDTs selftests/bpf: extend multi-uprobe tests with child thread case libbpf: detect broken PID filtering logic for multi-uprobe bpf: remove unnecessary rcu_read_{lock,unlock}() in multi-uprobe attach logic bpf: fix multi-uprobe PID filtering logic bpf: Fix potential integer overflow in resolve_btfids MAINTAINERS: Add myself as reviewer of ARM64 BPF JIT ==================== Link: https://lore.kernel.org/r/20240527203551.29712-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-27 16:26:30 -07:00
Pierre-Louis Bossart	3ff78451b8	ASoC: SOF: add missing MODULE_DESCRIPTION() MODULE_DESCRIPTION() was optional until it became mandatory and flagged as an error by 'make W=1'. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://msgid.link/r/20240527194414.166156-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 21:19:27 +01:00
Pierre-Louis Bossart	06a2315da0	ASoC: SOF: reorder MODULE_ definitions Follow the arbitrary Intel convention order to allow for easier grep. MODULE_LICENSE MODULE_DESCRIPTION MODULE_IMPORT Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://msgid.link/r/20240527194414.166156-4-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 21:19:26 +01:00
Pierre-Louis Bossart	b88056df4f	ASoC: SOF: AMD: group all module related information The module information is spread across files, group in a single location. For maintenability and alignment, the arbitrary Intel convention is used with the following order: MODULE_LICENSE MODULE_DESCRIPTION MODULE_IMPORT Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://msgid.link/r/20240527194414.166156-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 21:19:25 +01:00
Pierre-Louis Bossart	e30a942861	ASoC: SOF: stream-ipc: remove unnecessary MODULE_LICENSE This file is part of the snd-sof module, there's no reason to re-add the MODULE_LICENSE here. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://msgid.link/r/20240527194414.166156-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 21:19:24 +01:00
hexue	30a0e3135f	block: delete redundant function declaration blk_stats_alloc_enable was used for block hybrid poll, the related function definition was removed by patch: commit `54bdd67d0f` ("blk-mq: remove hybrid polling") but the function declaration was not deleted. Signed-off-by: hexue <xue01.he@samsung.com> Link: https://lore.kernel.org/r/20240527084533.1485210-1-xue01.he@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 13:58:06 -06:00
Damien Le Moal	d9ff882b54	null_blk: Fix return value of nullb_device_power_store() When powering on a null_blk device that is not already on, the return value ret that is initialized to be count is reused to check the return value of null_add_dev(), leading to nullb_device_power_store() to return null_add_dev() return value (0 on success) instead of "count". So make sure to set ret to be equal to count when there are no errors. Fixes: `a2db328b08` ("null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20240527043445.235267-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 13:56:58 -06:00
Jakub Sitnicki	a63bf55616	selftests/bpf: Cover verifier checks for mutating sockmap/sockhash Verifier enforces that only certain program types can mutate sock{map,hash} maps, that is update it or delete from it. Add test coverage for these checks so we don't regress. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240527-sockmap-verify-deletes-v1-3-944b372f2101@cloudflare.com	2024-05-27 19:34:26 +02:00
Jakub Sitnicki	3b9ce0491a	Revert "bpf, sockmap: Prevent lock inversion deadlock in map delete elem" This reverts commit `ff91059932`. This check is no longer needed. BPF programs attached to tracepoints are now rejected by the verifier when they attempt to delete from a sockmap/sockhash maps. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240527-sockmap-verify-deletes-v1-2-944b372f2101@cloudflare.com	2024-05-27 19:34:25 +02:00
Jakub Sitnicki	98e948fb60	bpf: Allow delete from sockmap/sockhash only if update is allowed We have seen an influx of syzkaller reports where a BPF program attached to a tracepoint triggers a locking rule violation by performing a map_delete on a sockmap/sockhash. We don't intend to support this artificial use scenario. Extend the existing verifier allowed-program-type check for updating sockmap/sockhash to also cover deleting from a map. From now on only BPF programs which were previously allowed to update sockmap/sockhash can delete from these map types. Fixes: `ff91059932` ("bpf, sockmap: Prevent lock inversion deadlock in map delete elem") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Reported-by: syzbot+ec941d6e24f633a59172@syzkaller.appspotmail.com Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+ec941d6e24f633a59172@syzkaller.appspotmail.com Acked-by: John Fastabend <john.fastabend@gmail.com> Closes: https://syzkaller.appspot.com/bug?extid=ec941d6e24f633a59172 Link: https://lore.kernel.org/bpf/20240527-sockmap-verify-deletes-v1-1-944b372f2101@cloudflare.com	2024-05-27 19:33:40 +02:00
Marc Zyngier	c92e8b9eac	KVM: arm64: AArch32: Fix spurious trapping of conditional instructions We recently upgraded the view of ESR_EL2 to 64bit, in keeping with the requirements of the architecture. However, the AArch32 emulation code was left unaudited, and the (already dodgy) code that triages whether a trap is spurious or not (because the condition code failed) broke in a subtle way: If ESR_EL2.ISS2 is ever non-zero (unlikely, but hey, this is the ARM architecture we're talking about), the hack that tests the top bits of ESR_EL2.EC will break in an interesting way. Instead, use kvm_vcpu_trap_get_class() to obtain the EC, and list all the possible ECs that can fail a condition code check. While we're at it, add SMC32 to the list, as it is explicitly listed as being allowed to trap despite failing a condition code check (as described in the HCR_EL2.TSC documentation). Fixes: `0b12620fdd` ("KVM: arm64: Treat ESR_EL2 as a 64-bit register") Cc: stable@vger.kernel.org Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240524141956.1450304-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-27 17:46:09 +01:00
Marc Zyngier	dfe6d190f3	KVM: arm64: Allow AArch32 PSTATE.M to be restored as System mode It appears that we don't allow a vcpu to be restored in AArch32 System mode, as we never included it in the list of valid modes. Just add it to the list of allowed modes. Fixes: `0d854a60b1` ("arm64: KVM: enable initialization of a 32bit vcpu") Cc: stable@vger.kernel.org Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240524141956.1450304-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-27 17:45:35 +01:00
Marc Zyngier	947051e361	KVM: arm64: Fix AArch32 register narrowing on userspace write When userspace writes to one of the core registers, we make sure to narrow the corresponding GPRs if PSTATE indicates an AArch32 context. The code tries to check whether the context is EL0 or EL1 so that it narrows the correct registers. But it does so by checking the full PSTATE instead of PSTATE.M. As a consequence, and if we are restoring an AArch32 EL0 context in a 64bit guest, and that PSTATE has any bit set outside of PSTATE.M, we narrow all registers instead of only the first 15, destroying the 64bit state. Obviously, this is not something the guest is likely to enjoy. Correctly masking PSTATE to only evaluate PSTATE.M fixes it. Fixes: `90c1f934ed` ("KVM: arm64: Get rid of the AArch32 register mapping code") Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Cc: stable@vger.kernel.org Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240524141956.1450304-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>	2024-05-27 17:45:21 +01:00
Arnaldo Carvalho de Melo	001821b0e7	perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION To pick up the change in: `f5a3562ec9` ("x86/irq: Reserve a per CPU IDT vector for posted MSIs") That picks up this new vector: $ cp arch/x86/include/asm/irq_vectors.h tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h $ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh > after $ diff -u before after --- before 2024-05-27 12:50:47.708863932 -0300 +++ after 2024-05-27 12:51:15.335113123 -0300 @@ -1,6 +1,7 @@ static const char x86_irq_vectors[] = { [0x02] = "NMI", [0x80] = "IA32_SYSCALL", + [0xeb] = "POSTED_MSI_NOTIFICATION", [0xec] = "LOCAL_TIMER", [0xed] = "HYPERV_STIMER0", [0xee] = "HYPERV_REENLIGHTENMENT", $ Now those will be known when pretty printing the irq_vectors: tracepoints. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/ZlS34M0x30EFVhbg@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-27 13:42:18 -03:00
Arnaldo Carvalho de Melo	a3eed53bee	perf beauty: Update copy of linux/socket.h with the kernel sources To pick up the fixes in: `0645fbe760` ("net: have do_accept() take a struct proto_accept_arg argument") That just changes a function prototype, not touching things used by the perf scrape scripts such as: $ tools/perf/trace/beauty/sockaddr.sh \| head -5 static const char *socket_families[] = { [0] = "UNSPEC", [1] = "LOCAL", [2] = "INET", [3] = "AX25", $ This addresses this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlSrceExgjrUiDb5@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-27 12:49:18 -03:00
Arnaldo Carvalho de Melo	1437a9f06f	tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY There is no scrape script yet for those, but the warning pointed out we need to update the array with the F_LINUX_SPECIFIC_BASE entries, do it. Now 'perf trace' can decode that cmd and also use it in filter, as in: root@number:~# perf trace -e syscalls:*enter_fcntl --filter 'cmd != SETFL && cmd != GETFL' 0.000 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7fffdc6a8a50) 0.013 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a8aa0) 0.090 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a88e0) ^Croot@number:~# This picks up the changes in: `c62b758bae` ("fcntl: add F_DUPFD_QUERY fcntl()") Addressing this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlSqNQH9mFw2bmjq@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-27 12:44:09 -03:00
Ritesh Harjani (IBM)	b0c6bcd58d	xfs: Add cond_resched to block unmap range and reflink remap path An async dio write to a sparse file can generate a lot of extents and when we unlink this file (using rm), the kernel can be busy in umapping and freeing those extents as part of transaction processing. Similarly xfs reflink remapping path can also iterate over a million extent entries in xfs_reflink_remap_blocks(). Since we can busy loop in these two functions, so let's add cond_resched() to avoid softlockup messages like these. watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435] CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S L 6.9.0-rc5-0-default #1 Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290 LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290 Call Trace: xfs_alloc_get_rec+0x54/0x1b0 (unreliable) xfs_alloc_compute_aligned+0x5c/0x144 xfs_alloc_ag_vextent_size+0x238/0x8d4 xfs_alloc_fix_freelist+0x540/0x694 xfs_free_extent_fix_freelist+0x84/0xe0 __xfs_free_extent+0x74/0x1ec xfs_extent_free_finish_item+0xcc/0x214 xfs_defer_finish_one+0x194/0x388 xfs_defer_finish_noroll+0x1b4/0x5c8 xfs_defer_finish+0x2c/0xc4 xfs_bunmapi_range+0xa4/0x100 xfs_itruncate_extents_flags+0x1b8/0x2f4 xfs_inactive_truncate+0xe0/0x124 xfs_inactive+0x30c/0x3e0 xfs_inodegc_worker+0x140/0x234 process_scheduled_works+0x240/0x57c worker_thread+0x198/0x468 kthread+0x138/0x140 start_kernel_thread+0x14/0x18 run fstests generic/175 at 2024-02-02 04:40:21 [ C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679] CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0 NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs] LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs] Call Trace: 0xc000000014107c00 (unreliable) __xfs_btree_updkeys+0x8c/0x2c0 [xfs] xfs_btree_update_keys+0x150/0x170 [xfs] xfs_btree_lshift+0x534/0x660 [xfs] xfs_btree_make_block_unfull+0x19c/0x240 [xfs] xfs_btree_insrec+0x4e4/0x630 [xfs] xfs_btree_insert+0x104/0x2d0 [xfs] xfs_rmap_insert+0xc4/0x260 [xfs] xfs_rmap_map_shared+0x228/0x630 [xfs] xfs_rmap_finish_one+0x2d4/0x350 [xfs] xfs_rmap_update_finish_item+0x44/0xc0 [xfs] xfs_defer_finish_noroll+0x2e4/0x740 [xfs] __xfs_trans_commit+0x1f4/0x400 [xfs] xfs_reflink_remap_extent+0x2d8/0x650 [xfs] xfs_reflink_remap_blocks+0x154/0x320 [xfs] xfs_file_remap_range+0x138/0x3a0 [xfs] do_clone_file_range+0x11c/0x2f0 vfs_clone_file_range+0x60/0x1c0 ioctl_file_clone+0x78/0x140 sys_ioctl+0x934/0x1270 system_call_exception+0x158/0x320 system_call_vectored_common+0x15c/0x2ec Cc: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Disha Goel<disgoel@linux.ibm.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 20:50:35 +05:30
Arnaldo Carvalho de Melo	0efc88e444	tools headers UAPI: Sync linux/prctl.h with the kernel sources To pick the changes in: `628d701f2d` ("powerpc/dexcr: Add DEXCR prctl interface") `6b9391b581` ("riscv: Include riscv_set_icache_flush_ctx prctl") That adds some PowerPC and a RISC-V specific prctl options: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/perf/trace/beauty/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after --- before 2024-05-27 12:14:21.358032781 -0300 +++ after 2024-05-27 12:14:32.364530185 -0300 @@ -65,6 +65,9 @@ [68] = "GET_MEMORY_MERGE", [69] = "RISCV_V_SET_CONTROL", [70] = "RISCV_V_GET_CONTROL", + [71] = "RISCV_SET_ICACHE_FLUSH_CTX", + [72] = "PPC_GET_DEXCR", + [73] = "PPC_SET_DEXCR", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", $ That now will be used to decode the syscall option and also to compose filters, for instance: [root@five ~]# perf trace -e syscalls:sys_enter_prctl --filter option==SET_NAME 0.000 Isolated Servi/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23f13b7aee) 0.032 DOM Worker/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23deb25670) 7.920 :3474328/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fbb10) 7.935 StreamT~s #374/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fb970) 8.400 Isolated Servi/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24bab10) 8.418 StreamT~s #374/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24ba970) ^C[root@five ~]# This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/lkml/ZlSklGWp--v_Ije7@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-27 12:20:02 -03:00
Linus Torvalds	2bfcfd584f	Merge tag 'pmdomain-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fix from Ulf Hansson: - Fix regression in gpcv2 PM domain for i.MX8 * tag 'pmdomain-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: imx: gpcv2: Add delay after power up handshake	2024-05-27 08:18:31 -07:00
Christoph Hellwig	c8c1f7012b	dm: make dm_set_zones_restrictions work on the queue limits Don't stuff the values directly into the queue without any synchronization, but instead delay applying the queue limits in the caller and let dm_set_zones_restrictions work on the limit structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20240527123634.1116952-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 09:16:39 -06:00
Christoph Hellwig	5e7a4bbcc3	dm: remove dm_check_zoned Fold it into the only caller in preparation to changes in the queue limits setup. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20240527123634.1116952-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 09:16:39 -06:00
Christoph Hellwig	d9780064b1	dm: move setting zoned_enabled to dm_table_set_restrictions Keep it together with the rest of the zoned code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20240527123634.1116952-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 09:16:39 -06:00
Christoph Hellwig	80e4e17ac9	block: remove blk_queue_max_integrity_segments This is unused now that all the atomic queue limit conversions are merged. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240521221606.393040-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-05-27 09:16:22 -06:00
Arnaldo Carvalho de Melo	e5c7bd4e5c	tools include UAPI: Sync linux/stat.h with the kernel sources To get the changes in: `2a82bb0294` ("statx: stx_subvol") To pick up this change and support it: $ tools/perf/trace/beauty/statx_mask.sh > before $ cp include/uapi/linux/stat.h tools/perf/trace/beauty/include/uapi/linux/stat.h $ tools/perf/trace/beauty/statx_mask.sh > after $ diff -u before after --- before 2024-05-22 13:39:49.742470571 -0300 +++ after 2024-05-22 13:39:59.157883101 -0300 @@ -14,4 +14,5 @@ [ilog2(0x00001000) + 1] = "MNT_ID", [ilog2(0x00002000) + 1] = "DIOALIGN", [ilog2(0x00004000) + 1] = "MNT_ID_UNIQUE", + [ilog2(0x00008000) + 1] = "SUBVOL", }; $ Now we'll see it like we see these: # perf trace -e statx 0.000 ( 0.015 ms): systemd-userwo/3982299 statx(dfd: 6, filename: ".", mask: TYPE\|INO\|MNT_ID, buffer: 0x7ffd8945e850) = 0 <SNIP> 180.559 ( 0.007 ms): (ostnamed)/3982957 statx(dfd: 4, filename: "sys", flags: SYMLINK_NOFOLLOW\|NO_AUTOMOUNT\|STATX_DONT_SYNC, mask: TYPE, buffer: 0x7fff13161190) = 0 180.918 ( 0.011 ms): (ostnamed)/3982957 statx(dfd: CWD, filename: "/run/systemd/mount-rootfs/sys/kernel/security", flags: SYMLINK_NOFOLLOW\|NO_AUTOMOUNT\|STATX_DONT_SYNC, mask: MNT_ID, buffer: 0x7fff13161120) = 0 180.956 ( 0.010 ms): (ostnamed)/3982957 statx(dfd: CWD, filename: "/run/systemd/mount-rootfs/sys/fs/cgroup", flags: SYMLINK_NOFOLLOW\|NO_AUTOMOUNT\|STATX_DONT_SYNC, mask: MNT_ID, buffer: 0x7fff13161120) = 0 <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zk5nO9yT0oPezUoo@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-27 12:13:51 -03:00
Linus Torvalds	e4c07ec89e	Merge tag 'vfs-6.10-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix io_uring based write-through after converting cifs to use the netfs library - Fix aio error handling when doing write-through via netfs library - Fix performance regression in iomap when used with non-large folio mappings - Fix signalfd error code - Remove obsolete comment in signalfd code - Fix async request indication in netfs_perform_write() by raising BDP_ASYNC when IOCB_NOWAIT is set - Yield swap device immediately to prevent spurious EBUSY errors - Don't cross a .backup mountpoint from backup volumes in afs to avoid infinite loops - Fix a race between umount and async request completion in 9p after 9p was converted to use the netfs library * tag 'vfs-6.10-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs, 9p: Fix race between umount and async request completion afs: Don't cross .backup mountpoint from backup volume swap: yield device immediately netfs: Fix setting of BDP_ASYNC from iocb flags signalfd: drop an obsolete comment signalfd: fix error return code iomap: fault in smaller chunks for non-large folio mappings filemap: add helper mapping_max_folio_size() netfs: Fix AIO error handling when doing write-through netfs: Fix io_uring based write-through	2024-05-27 08:09:12 -07:00
Lukas Bulwahn	82d71b53d7	Documentation/core-api: correct reference to SWIOTLB_DYNAMIC Commit `c93f261dfc` ("Documentation/core-api: add swiotlb documentation") accidentally refers to CONFIG_DYNAMIC_SWIOTLB in one place, while the config is actually called CONFIG_SWIOTLB_DYNAMIC. Correct the reference to the intended config option. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Reviewed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-05-27 16:52:09 +02:00
Fedor Pchelkin	6cb05d89fd	dma-buf: handle testing kthreads creation failure kthread creation may possibly fail inside race_signal_callback(). In such a case stop the already started threads, put the already taken references to them and return with error code. Found by Linux Verification Center (linuxtesting.org). Fixes: `2989f64510` ("dma-buf: Add selftests for dma-fence") Cc: stable@vger.kernel.org Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: T.J. Mercier <tjmercier@google.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522181308.841686-1-pchelkin@ispras.ru Signed-off-by: Christian König <christian.koenig@amd.com>	2024-05-27 16:09:22 +02:00
Charles Keepax	d5d2a5dacb	MAINTAINERS: Remove James Schulman from Cirrus audio maintainers James no longer works for Cirrus Logic, remove him from the list of maintainers for the Cirrus audio CODEC drivers. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://msgid.link/r/20240527101326.440345-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 13:59:42 +01:00
Charles Keepax	8d34c12e87	ASoC: wm_adsp: Add missing MODULE_DESCRIPTION() wm_adsp is built as a separate module and as such should include a MODULE_DESCRIPTION() macro. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://msgid.link/r/20240527100237.430240-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 13:59:41 +01:00
Charles Keepax	797c525e85	ASoC: cs42l43: Only restrict 44.1kHz for the ASP The SoundWire interface can always support 44.1kHz using flow controlled mode, and whether the ASP is in master mode should obviously only affect the ASP. Update cs42l43_startup() to only restrict the rates for the ASP DAI. Fixes: `fc918cbe87` ("ASoC: cs42l43: Add support for the cs42l43") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://msgid.link/r/20240527100840.439832-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 13:59:40 +01:00
Carlos López	e569eb3497	tracing/probes: fix error check in parse_btf_field() btf_find_struct_member() might return NULL or an error via the ERR_PTR() macro. However, its caller in parse_btf_field() only checks for the NULL condition. Fix this by using IS_ERR() and returning the error up the stack. Link: https://lore.kernel.org/all/20240527094351.15687-1-clopez@suse.de/ Fixes: `c440adfbe3` ("tracing/probes: Support BTF based data structure field access") Signed-off-by: Carlos López <clopez@suse.de> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-05-27 21:32:35 +09:00
David Howells	f89ea63f1c	netfs, 9p: Fix race between umount and async request completion There's a problem in 9p's interaction with netfslib whereby a crash occurs because the 9p_fid structs get forcibly destroyed during client teardown (without paying attention to their refcounts) before netfslib has finished with them. However, it's not a simple case of deferring the clunking that p9_fid_put() does as that requires the p9_client record to still be present. The problem is that netfslib has to unlock pages and clear the IN_PROGRESS flag before destroying the objects involved - including the fid - and, in any case, nothing checks to see if writeback completed barring looking at the page flags. Fix this by keeping a count of outstanding I/O requests (of any type) and waiting for it to quiesce during inode eviction. Reported-by: syzbot+df038d463cca332e8414@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005be0aa061846f8d6@google.com/ Reported-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000b86c5e06130da9c6@google.com/ Reported-by: syzbot+1527696d41a634cc1819@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000041f960618206d7e@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/755891.1716560771@warthog.procyon.org.uk Tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Reviewed-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Hillf Danton <hdanton@sina.com> cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Reported-and-tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-27 13:12:13 +02:00
Guenter Roeck	779aa4d747	drm/nouveau/nvif: Avoid build error due to potential integer overflows Trying to build parisc:allmodconfig with gcc 12.x or later results in the following build error. drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd': drivers/gpu/drm/nouveau/nvif/object.c:161:9: error: 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict] 161 \| memcpy(data, args->mthd.data, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor': drivers/gpu/drm/nouveau/nvif/object.c:298:17: error: 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict] 298 \| memcpy(data, args->new.data, size); gcc assumes that 'sizeof(*args) + size' can overflow, which would result in the problem. The problem is not new, only it is now no longer a warning but an error since W=1 has been enabled for the drm subsystem and since Werror is enabled for test builds. Rearrange arithmetic and use check_add_overflow() for validating the allocation size to avoid the overflow. While at it, split assignments out of if conditions. Fixes: `a61ddb4393` ("drm: enable (most) W=1 warnings by default across the subsystem") Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Joe Perches <joe@perches.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524134817.1369993-1-linux@roeck-us.net	2024-05-27 13:10:30 +02:00
Rafael J. Wysocki	ae2170d6ea	thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid When a trip point becomes invalid after being crossed on the way up, it is involved in a mitigation episode that needs to be adjusted to compensate for the trip going away. For this reason, introduce thermal_zone_trip_down() as a wrapper around thermal_trip_crossed() and make thermal_zone_set_trip_temp() call it if the new temperature of the trip at hand is equal to THERMAL_TEMP_INVALID and it has been crossed on the way up to trigger all of the necessary adjustments in user space, the thermal debug code and the zone governor. Fixes: `8c69a777e4` ("thermal: core: Fix the handling of invalid trip points") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 13:00:00 +02:00
Rafael J. Wysocki	cb573eec60	thermal: core: Introduce thermal_trip_crossed() Add a helper function called thermal_trip_crossed() to be invoked by __thermal_zone_device_update() in order to notify user space, the thermal debug code and the zone governor about trip crossing. Subsequently, this will also be used in the case when a trip point becomes invalid after being crossed on the way up. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 13:00:00 +02:00
Rafael J. Wysocki	5a599e10e5	thermal/debugfs: Allow tze_seq_show() to print statistics for invalid trips Commit `a6258fde8d` ("thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats") modified tze_seq_show() to skip invalid trips, but it overlooked the fact that a trip may become invalid during a mitigation eposide involving it, in which case its statistics should still be reported. For this reason, remove the invalid trip temperature check from the main loop in tze_seq_show(). The trips that have never been valid will still be skipped after this change because there are no statistics to report for them. Fixes: `a6258fde8d` ("thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 13:00:00 +02:00
Rafael J. Wysocki	9e69acc1de	thermal/debugfs: Print initial trip temperature and hysteresis in tze_seq_show() The temperature and hysteresis of a trip point may change during a mitigation episode it is involved in (it may even become invalid altogether), so in order to avoid possible confusion related to that, store the temperature and hysteresis of trip points at the time they are crossed on the way up and print those values instead of their current temperature and hysteresis. Fixes: `7ef01f228c` ("thermal/debugfs: Add thermal debugfs information for mitigation episodes") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 13:00:00 +02:00
Parthiban Veerasooran	52a2f06083	net: usb: smsc95xx: fix changing LED_SEL bit value updated from EEPROM LED Select (LED_SEL) bit in the LED General Purpose IO Configuration register is used to determine the functionality of external LED pins (Speed Indicator, Link and Activity Indicator, Full Duplex Link Indicator). The default value for this bit is 0 when no EEPROM is present. If a EEPROM is present, the default value is the value of the LED Select bit in the Configuration Flags of the EEPROM. A USB Reset or Lite Reset (LRST) will cause this bit to be restored to the image value last loaded from EEPROM, or to be set to 0 if no EEPROM is present. While configuring the dual purpose GPIO/LED pins to LED outputs in the LED General Purpose IO Configuration register, the LED_SEL bit is changed as 0 and resulting the configured value from the EEPROM is cleared. The issue is fixed by using read-modify-write approach. Fixes: `f293501c61` ("smsc95xx: configure LED outputs") Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Woojung Huh <woojung.huh@microchip.com> Link: https://lore.kernel.org/r/20240523085314.167650-1-Parthiban.Veerasooran@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 12:48:23 +02:00
Armin Wolf	c4bd7f1d78	ACPI: EC: Avoid returning AE_OK on errors in address space handler If an error code other than EINVAL, ENODEV or ETIME is returned by acpi_ec_read() / acpi_ec_write(), then AE_OK is incorrectly returned by acpi_ec_space_handler(). Fix this by only returning AE_OK on success, and return AE_ERROR otherwise. Signed-off-by: Armin Wolf <W_Armin@gmx.de> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 12:43:29 +02:00
Armin Wolf	f6f172dc6a	ACPI: EC: Abort address space access upon error When a multi-byte address space access is requested, acpi_ec_read()/ acpi_ec_write() is being called multiple times. Abort such operations if a single call to acpi_ec_read() / acpi_ec_write() fails, as the data read from / written to the EC might be incomplete. Signed-off-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 12:43:29 +02:00
Dan Williams	7ff6c798ec	ACPI: APEI: EINJ: Fix einj_dev release leak The platform driver conversion of EINJ mistakenly used platform_device_del() to unwind platform_device_register_full() at module exit. This leads to a small leak of one 'struct platform_device' instance per module load/unload cycle. Switch to platform_device_unregister() which performs both device_del() and final put_device(). Fixes: `5621fafaac` ("EINJ: Migrate to a platform driver") Cc: 6.9+ <stable@vger.kernel.org> # 6.9+ Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-05-27 12:35:37 +02:00
Darrick J. Wong	95b19e2f4e	xfs: don't open-code u64_to_user_ptr Don't open-code what the kernel already provides. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:55:52 +05:30
Darrick J. Wong	38de567906	xfs: allow symlinks with short remote targets An internal user complained about log recovery failing on a symlink ("Bad dinode after recovery") with the following (excerpted) format: core.magic = 0x494e core.mode = 0120777 core.version = 3 core.format = 2 (extents) core.nlinkv2 = 1 core.nextents = 1 core.size = 297 core.nblocks = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) u3.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,12,1,0] This is a symbolic link with a 297-byte target stored in a disk block, which is to say this is a symlink with a remote target. The forkoff is 0, which is to say that there's 512 - 176 == 336 bytes in the inode core to store the data fork. Eventually, testing of generic/388 failed with the same inode corruption message during inode recovery. In writing a debugging patch to call xfs_dinode_verify on dirty inode log items when we're committing transactions, I observed that xfs/298 can reproduce the problem quite quickly. xfs/298 creates a symbolic link, adds some extended attributes, then deletes them all. The test failure occurs when the final removexattr also deletes the attr fork because that does not convert the remote symlink back into a shortform symlink. That is how we trip this test. The only reason why xfs/298 only triggers with the debug patch added is that it deletes the symlink, so the final iflush shows the inode as free. I wrote a quick fstest to emulate the behavior of xfs/298, except that it leaves the symlinks on the filesystem after inducing the "corrupt" state. Kernels going back at least as far as 4.18 have written out symlink inodes in this manner and prior to `1eb70f54c4` they did not object to reading them back in. Because we've been writing out inodes this way for quite some time, the only way to fix this is to relax the check for symbolic links. Directories don't have this problem because di_size is bumped to blocksize during the sf->data conversion. Fixes: `1eb70f54c4` ("xfs: validate inode fork size against fork format") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:55:52 +05:30
Darrick J. Wong	97835e6866	xfs: fix xfs_init_attr_trans not handling explicit operation codes When we were converting the attr code to use an explicit operation code instead of keying off of attr->value being null, we forgot to change the code that initializes the transaction reservation. Split the function into two helpers that handle the !remove and remove cases, then fix both callsites to handle this correctly. Fixes: `c27411d4c6` ("xfs: make attr removal an explicit operation") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:55:52 +05:30
Darrick J. Wong	2b3f004d3d	xfs: drop xfarray sortinfo folio on error Chandan Babu reports the following livelock in xfs/708: run fstests xfs/708 at 2024-05-04 15:35:29 XFS (loop16): EXPERIMENTAL online scrub feature in use. Use at your own risk! XFS (loop5): Mounting V5 Filesystem e96086f0-a2f9-4424-a1d5-c75d53d823be XFS (loop5): Ending clean mount XFS (loop5): Quotacheck needed: Please wait. XFS (loop5): Quotacheck: Done. XFS (loop5): EXPERIMENTAL online scrub feature in use. Use at your own risk! INFO: task xfs_io:143725 blocked for more than 122 seconds. Not tainted 6.9.0-rc4+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:xfs_io state:D stack:0 pid:143725 tgid:143725 ppid:117661 flags:0x00004006 Call Trace: <TASK> __schedule+0x69c/0x17a0 schedule+0x74/0x1b0 io_schedule+0xc4/0x140 folio_wait_bit_common+0x254/0x650 shmem_undo_range+0x9d5/0xb40 shmem_evict_inode+0x322/0x8f0 evict+0x24e/0x560 __dentry_kill+0x17d/0x4d0 dput+0x263/0x430 __fput+0x2fc/0xaa0 task_work_run+0x132/0x210 get_signal+0x1a8/0x1910 arch_do_signal_or_restart+0x7b/0x2f0 syscall_exit_to_user_mode+0x1c2/0x200 do_syscall_64+0x72/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e The shmem code is trying to drop all the folios attached to a shmem file and gets stuck on a locked folio after a bnobt repair. It looks like the process has a signal pending, so I started looking for places where we lock an xfile folio and then deal with a fatal signal. I found a bug in xfarray_sort_scan via code inspection. This function is called to set up the scanning phase of a quicksort operation, which may involve grabbing a locked xfile folio. If we exit the function with an error code, the caller does not call xfarray_sort_scan_done to put the xfile folio. If _sort_scan returns an error code while si->folio is set, we leak the reference and never unlock the folio. Therefore, change xfarray_sort to call _scan_done on exit. This is safe to call multiple times because it sets si->folio to NULL and ignores a NULL si->folio. Also change _sort_scan to use an intermediate variable so that we never pollute si->folio with an errptr. Fixes: `232ea05277` ("xfs: enable sorting of xfile-backed arrays") Reported-by: Chandan Babu R <chandanbabu@kernel.org> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:55:52 +05:30
John Garry	b33874fb7f	xfs: Stop using __maybe_unused in xfs_alloc.c In both xfs_alloc_cur_finish() and xfs_alloc_ag_vextent_exact(), local variable @afg is tagged as __maybe_unused. Otherwise an unused variable warning would be generated for when building with W=1 and CONFIG_XFS_DEBUG unset. In both cases, the variable is unused as it is only referenced in an ASSERT() call, which is compiled out (in this config). It is generally a poor programming style to use __maybe_unused for variables. The ASSERT() call is to verify that agbno of the end of the extent is within bounds for both functions. @afg is used as an intermediate variable to find the AG length. However xfs_verify_agbext() already exists to verify a valid extent range. The arguments for calling xfs_verify_agbext() are already available, so use that instead. An advantage of using xfs_verify_agbext() is that it verifies that both the start and the end of the extent are within the bounds of the AG and catches overflows. Suggested-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:54:24 +05:30
John Garry	d7ba701da6	xfs: Clear W=1 warning in xfs_iwalk_run_callbacks() For CONFIG_XFS_DEBUG unset, xfs_iwalk_run_callbacks() generates the following warning for when building with W=1: fs/xfs/xfs_iwalk.c: In function ‘xfs_iwalk_run_callbacks’: fs/xfs/xfs_iwalk.c:354:42: error: variable ‘irec’ set but not used [-Werror=unused-but-set-variable] 354 \| struct xfs_inobt_rec_incore *irec; \| ^~~~ cc1: all warnings being treated as errors Drop @irec, as it is only an intermediate variable. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>	2024-05-27 15:54:24 +05:30
Hariprasad Kelam	1684842147	Octeontx2-pf: Free send queue buffers incase of leaf to inner There are two type of classes. "Leaf classes" that are the bottom of the class hierarchy. "Inner classes" that are neither the root class nor leaf classes. QoS rules can only specify leaf classes as targets for traffic. Root / \ / \ 1 2 /\ / \ 4 5 classes 1,4 and 5 are leaf classes. class 2 is a inner class. When a leaf class made as inner, or vice versa, resources associated with send queue (send queue buffers and transmit schedulers) are not getting freed. Fixes: `5e6808b4c6` ("octeontx2-pf: Add support for HTB offload") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240523073626.4114-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 11:55:47 +02:00
Kuniyuki Iwashima	51d1b25a72	af_unix: Read sk->sk_hash under bindlock during bind(). syzkaller reported data-race of sk->sk_hash in unix_autobind() [0], and the same ones exist in unix_bind_bsd() and unix_bind_abstract(). The three bind() functions prefetch sk->sk_hash locklessly and use it later after validating that unix_sk(sk)->addr is NULL under unix_sk(sk)->bindlock. The prefetched sk->sk_hash is the hash value of unbound socket set in unix_create1() and does not change until bind() completes. There could be a chance that sk->sk_hash changes after the lockless read. However, in such a case, non-NULL unix_sk(sk)->addr is visible under unix_sk(sk)->bindlock, and bind() returns -EINVAL without using the prefetched value. The KCSAN splat is false-positive, but let's silence it by reading sk->sk_hash under unix_sk(sk)->bindlock. [0]: BUG: KCSAN: data-race in unix_autobind / unix_autobind write to 0xffff888034a9fb88 of 4 bytes by task 4468 on cpu 0: __unix_set_addr_hash net/unix/af_unix.c:331 [inline] unix_autobind+0x47a/0x7d0 net/unix/af_unix.c:1185 unix_dgram_connect+0x7e3/0x890 net/unix/af_unix.c:1373 __sys_connect_file+0xd7/0xe0 net/socket.c:2048 __sys_connect+0x114/0x140 net/socket.c:2065 __do_sys_connect net/socket.c:2075 [inline] __se_sys_connect net/socket.c:2072 [inline] __x64_sys_connect+0x40/0x50 net/socket.c:2072 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e read to 0xffff888034a9fb88 of 4 bytes by task 4465 on cpu 1: unix_autobind+0x28/0x7d0 net/unix/af_unix.c:1134 unix_dgram_connect+0x7e3/0x890 net/unix/af_unix.c:1373 __sys_connect_file+0xd7/0xe0 net/socket.c:2048 __sys_connect+0x114/0x140 net/socket.c:2065 __do_sys_connect net/socket.c:2075 [inline] __se_sys_connect net/socket.c:2072 [inline] __x64_sys_connect+0x40/0x50 net/socket.c:2072 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e value changed: 0x000000e4 -> 0x000001e3 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 4465 Comm: syz-executor.0 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: `afd20b9290` ("af_unix: Replace the big lock with small locks.") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240522154218.78088-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 11:46:56 +02:00
Kuniyuki Iwashima	97e1db06c7	af_unix: Annotate data-race around unix_sk(sk)->addr. Once unix_sk(sk)->addr is assigned under net->unx.table.locks and unix_sk(sk)->bindlock, (unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and unix_sk(sk)->addr is never changed. unix_getname() and unix_copy_addr() access the two fields locklessly, and commit `ae3b564179` ("missing barriers in some of unix_sock ->addr and ->path accesses") added smp_store_release() and smp_load_acquire() pairs. In other functions, we still read unix_sk(sk)->addr locklessly to check if the socket is bound, and KCSAN complains about it. [0] Given these functions have no dependency for (unix_sk(sk)->addr) and unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. Note that it is safe to access unix_sk(sk)->addr locklessly if the socket is found in the hash table. For example, the lockless read of otheru->addr in unix_stream_connect() is safe. Note also that newu->addr there is of the child socket that is still not accessible from userspace, and smp_store_release() publishes the address in case the socket is accept()ed and unix_getname() / unix_copy_addr() is called. [0]: BUG: KCSAN: data-race in unix_bind / unix_listen write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: __unix_set_addr_hash net/unix/af_unix.c:329 [inline] unix_bind_bsd net/unix/af_unix.c:1241 [inline] unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 __sys_bind+0x194/0x1e0 net/socket.c:1847 __do_sys_bind net/socket.c:1858 [inline] __se_sys_bind net/socket.c:1856 [inline] __x64_sys_bind+0x40/0x50 net/socket.c:1856 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: unix_listen+0x72/0x180 net/unix/af_unix.c:734 __sys_listen+0xdc/0x160 net/socket.c:1881 __do_sys_listen net/socket.c:1890 [inline] __se_sys_listen net/socket.c:1888 [inline] __x64_sys_listen+0x2e/0x40 net/socket.c:1888 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240522154002.77857-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 11:46:31 +02:00
Geliang Tang	21a22ed618	selftests: hsr: Fix "File exists" errors for hsr_ping The hsr_ping test reports the following errors: INFO: preparing interfaces for HSRv0. INFO: Initial validation ping. INFO: Longer ping test. INFO: Cutting one link. INFO: Delay the link and drop a few packages. INFO: All good. INFO: preparing interfaces for HSRv1. RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists Error: ipv4: Address already assigned. Error: ipv6: address already assigned. Error: ipv4: Address already assigned. Error: ipv6: address already assigned. Error: ipv4: Address already assigned. Error: ipv6: address already assigned. INFO: Initial validation ping. That is because the cleanup code for the 2nd round test before "setup_hsr_interfaces 1" is removed incorrectly in commit `680fda4f67` ("test: hsr: Remove script code already implemented in lib.sh"). This patch fixes it by re-setup the namespaces using setup_ns ns1 ns2 ns3 command before "setup_hsr_interfaces 1". It deletes previous namespaces and create new ones. Fixes: `680fda4f67` ("test: hsr: Remove script code already implemented in lib.sh") Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/6485d3005f467758d49f0f313c8c009759ba6b05.1716374462.git.tanggeliang@kylinos.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 11:44:42 +02:00
hmtheboy154	3050052613	platform/x86: touchscreen_dmi: Add info for the EZpad 6s Pro The "EZpad 6s Pro" uses the same touchscreen as the "EZpad 6 Pro B", unlike the "Ezpad 6 Pro" which has its own touchscreen. Signed-off-by: hmtheboy154 <buingoc67@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240527091447.248849-3-hdegoede@redhat.com	2024-05-27 11:43:03 +02:00
hmtheboy154	7c8639aa41	platform/x86: touchscreen_dmi: Add info for GlobalSpace SolT IVW 11.6" tablet This is a tablet created by GlobalSpace Technologies Limited which uses an Intel Atom x5-Z8300, 4GB of RAM & 64GB of storage. Link: https://web.archive.org/web/20171102141952/http://globalspace.in/11.6-device.html Signed-off-by: hmtheboy154 <buingoc67@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240527091447.248849-2-hdegoede@redhat.com	2024-05-27 11:43:03 +02:00
Hans de Goede	0b178b0267	platform/x86: touchscreen_dmi: Add support for setting touchscreen properties from cmdline On x86/ACPI platforms touchscreens mostly just work without needing any device/model specific configuration. But in some cases (mostly with Silead and Goodix touchscreens) it is still necessary to manually specify various touchscreen-properties on a per model basis. touchscreen_dmi is a special place for DMI quirks for this, but it can be challenging for users to figure out the right property values, especially for Silead touchscreens where non of these can be read back from the touchscreen-controller. ATM users can only test touchscreen properties by editing touchscreen_dmi.c and then building a completely new kernel which makes it unnecessary difficult for users to test and submit properties when necessary for their laptop / tablet model. Add support for specifying properties on the kernel commandline to allow users to easily figure out the right settings. See the added documentation in kernel-parameters.txt for the commandline syntax. Cc: Gregor Riepl <onitake@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240523143601.47555-1-hdegoede@redhat.com	2024-05-27 11:42:57 +02:00
Martin Tůma	825fc49497	media: mgb4: Fix double debugfs remove Fixes an error where debugfs_remove_recursive() is called first on a parent directory and then again on a child which causes a kernel panic. Signed-off-by: Martin Tůma <martin.tuma@digiteqautomotive.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: `0ab13674a9` ("media: pci: mgb4: Added Digiteq Automotive MGB4 driver") Cc: <stable@vger.kernel.org> [hverkuil: added Fixes/Cc tags]	2024-05-27 11:33:56 +02:00
Steven Rostedt (Google)	5d059bf2b1	platform/x86: thinkpad_acpi: Select INPUT_SPARSEKMAP in Kconfig Now that drivers/platform/x86/thinkpad_acpi.c uses sparse_keymap_report_event(), it must select INPUT_SPARSEKMAP in its Kconfig option otherwise the build fails with: ld: vmlinux.o: in function `tpacpi_input_send_key': thinkpad_acpi.c:(.text+0xd4d27f): undefined reference to `sparse_keymap_report_event' ld: vmlinux.o: in function `hotkey_init': thinkpad_acpi.c:(.init.text+0x66cb6): undefined reference to `sparse_keymap_setup' Fixes: `42f7b965de` ("platform/x86: thinkpad_acpi: Switch to using sparse-keymap helpers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240522074813.379b9fc2@gandalf.local.home Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-05-27 11:30:34 +02:00
Hans de Goede	4d6ef1be24	platform/x86: x86-android-tablets: Add "select LEDS_CLASS" Since the x86-android-tablets now calls devm_led_classdev_register_ext() it needs to select LEDS_CLASS as well as LEDS_CLASS' NEW_LEDS dependency. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405182256.FsKBjIzG-lkp@intel.com/ Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240521094741.273397-1-hdegoede@redhat.com	2024-05-27 11:30:15 +02:00
Harshit Mogalapalli	a4edf675ba	platform/x86: ISST: fix use-after-free in tpmi_sst_dev_remove() In tpmi_sst_dev_remove(), tpmi_sst is dereferenced after being freed. Fix this by reordering the kfree() post the dereference. Fixes: `9d1d36268f` ("platform/x86: ISST: Support partitioned systems") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20240517144946.289615-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2024-05-27 11:25:33 +02:00
Roded Zats	e8021b94b0	enic: Validate length of nl attributes in enic_set_vf_port enic_set_vf_port assumes that the nl attribute IFLA_PORT_PROFILE is of length PORT_PROFILE_MAX and that the nl attributes IFLA_PORT_INSTANCE_UUID, IFLA_PORT_HOST_UUID are of length PORT_UUID_MAX. These attributes are validated (in the function do_setlink in rtnetlink.c) using the nla_policy ifla_port_policy. The policy defines IFLA_PORT_PROFILE as NLA_STRING, IFLA_PORT_INSTANCE_UUID as NLA_BINARY and IFLA_PORT_HOST_UUID as NLA_STRING. That means that the length validation using the policy is for the max size of the attributes and not on exact size so the length of these attributes might be less than the sizes that enic_set_vf_port expects. This might cause an out of bands read access in the memcpys of the data of these attributes in enic_set_vf_port. Fixes: `f8bd909183` ("net: Add ndo_{set\|get}_vf_port support for enic dynamic vnics") Signed-off-by: Roded Zats <rzats@paloaltonetworks.com> Link: https://lore.kernel.org/r/20240522073044.33519-1-rzats@paloaltonetworks.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-27 11:18:01 +02:00
Jean-Baptiste Maneyrol	95444b9eeb	iio: invensense: fix odr switching to same value ODR switching happens in 2 steps, update to store the new value and then apply when the ODR change flag is received in the data. When switching to the same ODR value, the ODR change flag is never happening, and frequency switching is blocked waiting for the never coming apply. Fix the issue by preventing update to happen when switching to same ODR value. Fixes: `0ecc363cce` ("iio: make invensense timestamp module generic") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://lore.kernel.org/r/20240524124851.567485-1-inv.git-commit@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Dumitru Ceclan	f00dd89530	iio: adc: ad7173: Remove index from temp channel Temperature channel is unique per device, index is not needed. This is breaking userspace: Include fixes tag to be released within the same rc cycle. Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240521-ad7173-fixes-v1-3-8161cc7f3ad1@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Dumitru Ceclan	3450ee7e80	iio: adc: ad7173: Add ad7173_device_info names Add missing names from the device info struct for 3 models to ensure consistency with the rest of the models. Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240521-ad7173-fixes-v1-2-8161cc7f3ad1@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Dumitru Ceclan	ab6f0ab178	iio: adc: ad7173: fix buffers enablement for ad7176-2 AD7176-2 does not feature input buffers and marks corespondent register bits as read only. Enable buffers only on supported models. Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Dumitru Ceclan <dumitru.ceclan@analog.com> Link: https://lore.kernel.org/r/20240521-ad7173-fixes-v1-1-8161cc7f3ad1@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Harshit Mogalapalli	a23c14b062	iio: temperature: mlx90635: Fix ERR_PTR dereference in mlx90635_probe() When devm_regmap_init_i2c() fails, regmap_ee could be error pointer, instead of checking for IS_ERR(regmap_ee), regmap is checked which looks like a copy paste error. Fixes: `a1d1ba5e1c` ("iio: temperature: mlx90635 MLX90635 IR Temperature sensor") Reviewed-by: Crt Mori<cmo@melexis.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20240513203427.3208696-1-harshit.m.mogalapalli@oracle.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Vasileios Amoiridis	bedb2ccb56	iio: imu: bmi323: Fix trigger notification in case of error In case of error in the bmi323_trigger_handler() function, the function exits without calling the iio_trigger_notify_done() which is responsible for informing the attached trigger that the process is done and in case there is a .reenable(), to call it. Fixes: `8a636db3aa` ("iio: imu: Add driver for BMI323 IMU") Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com> Link: https://lore.kernel.org/r/20240508155407.139805-1-vassilisamir@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Marc Ferland	279428df88	iio: dac: ad5592r: fix temperature channel scaling value The scale value for the temperature channel is (assuming Vref=2.5 and the datasheet): 376.7897513 When calculating both val and val2 for the temperature scale we use (3767897513/25) and multiply it by Vref (here I assume 2500mV) to obtain: 2500 * (3767897513/25) ==> 376789751300 Finally we divide with remainder by 10^9 to get: val = 376 val2 = 789751300 However, we return IIO_VAL_INT_PLUS_MICRO (should have been NANO) as the scale type. So when converting the raw temperature value to the 'processed' temperature value we will get (assuming raw=810, offset=-753): processed = (raw + offset) * scale_val = (810 + -753) * 376 = 21432 processed += div((raw + offset) * scale_val2, 10^6) += div((810 + -753) * 789751300, 10^6) += 45015 ==> 66447 ==> 66.4 Celcius instead of the expected 21.5 Celsius. Fix this issue by changing IIO_VAL_INT_PLUS_MICRO to IIO_VAL_INT_PLUS_NANO. Fixes: `56ca9db862` ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs") Signed-off-by: Marc Ferland <marc.ferland@sonatest.com> Link: https://lore.kernel.org/r/20240501150554.1871390-1-marc.ferland@sonatest.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Adam Rizkalla	0f0f630661	iio: pressure: bmp280: Fix BMP580 temperature reading Fix overflow issue when storing BMP580 temperature reading and properly preserve sign of 24-bit data. Signed-off-by: Adam Rizkalla <ajarizzo@gmail.com> Tested-By: Vasileios Amoiridis <vassilisamir@gmail.com> Acked-by: Angel Iglesias <ang.iglesiasg@gmail.com> Link: https://lore.kernel.org/r/Zin2udkXRD0+GrML@adam-asahi.lan Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
Angelo Dureghello	72d0a20fab	dt-bindings: iio: dac: fix ad354xr output range Fix output range, as per datasheet must be -2.5 to 7.5. Signed-off-by: Angelo Dureghello <adureghello@baylibre.com> Fixes: `b0a96c5f59` ("dt-bindings: iio: dac: Add adi,ad3552r.yaml") Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20240503185528.2043127-1-adureghello@baylibre.org Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:20 +01:00
David Lechner	8a01ef749b	iio: adc: ad9467: fix scan type sign According to the IIO documentation, the sign in the scan type should be lower case. The ad9467 driver was incorrectly using upper case. Fix by changing to lower case. Fixes: `4606d0f4b0` ("iio: adc: ad9467: add support for AD9434 high-speed ADC") Fixes: `ad67971202` ("iio: adc: ad9467: add support AD9467 ADC") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20240503-ad9467-fix-scan-type-sign-v1-1-c7a1a066ebb9@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-05-27 09:49:19 +01:00
Jason Nader	9e2f46cd87	ata: ahci: Do not apply Intel PCS quirk on Intel Alder Lake Commit `b8b8b4e0c0` ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list") added Intel Alder Lake to the ahci_pci_tbl. Because of the way that the Intel PCS quirk was implemented, having an explicit entry in the ahci_pci_tbl caused the Intel PCS quirk to be applied. (The quirk was not being applied if there was no explict entry.) Thus, entries that were added to the ahci_pci_tbl also got the Intel PCS quirk applied. The quirk was cleaned up in commit `7edbb60592` ("ahci: clean up intel_pcs_quirk"), such that it is clear which entries that actually applies the Intel PCS quirk. Newer Intel AHCI controllers do not need the Intel PCS quirk, and applying it when not needed actually breaks some platforms. Do not apply the Intel PCS quirk for Intel Alder Lake. This is in line with how things worked before commit `b8b8b4e0c0` ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list"), such that certain platforms using Intel Alder Lake will work once again. Cc: stable@vger.kernel.org # 6.7 Fixes: `b8b8b4e0c0` ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list") Signed-off-by: Jason Nader <dev@kayoway.com> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2024-05-27 10:07:37 +02:00
Luke D. Jones	2be46155d7	ALSA: hda/realtek: Adjust G814JZR to use SPI init for amp The 2024 ASUS ROG G814J model is much the same as the 2023 model and the 2023 16" version. We can use the same Cirrus Amp quirk. Fixes: `811dd426a9` ("ALSA: hda/realtek: Add quirks for Asus ROG 2024 laptops using CS35L41") Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20240526091032.114545-1-luke@ljones.dev Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-27 09:29:19 +02:00
Takashi Iwai	495000a386	ALSA: core: Remove debugfs at disconnection The card-specific debugfs entries are removed at the last stage of card free phase, and it's performed after synchronization of the closes of all opened fds. This works fine for most cases, but it can be potentially problematic for a hotplug device like USB-audio. Due to the nature of snd_card_free_when_closed(), the card free isn't called immediately after the driver removal for a hotplug device, but it's left until the last fd is closed. It implies that the card debugfs entries also remain. Meanwhile, when a new device is inserted before the last close and the very same card slot is assigned, the driver tries to create the card debugfs root again on the very same path. This conflicts with the remaining entry, and results in the kernel warning such as: debugfs: Directory 'card0' with parent 'sound' already present! with the missing debugfs entry afterwards. For avoiding such conflicts, remove debugfs entries at the device disconnection phase instead. The jack kctl debugfs entries get removed in snd_jack_dev_disconnect() instead of each kctl private_free. Fixes: `2d670ea2bd` ("ALSA: jack: implement software jack injection via debugfs") Link: https://lore.kernel.org/r/20240524151256.32521-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-05-27 09:28:21 +02:00
Jeff Johnson	9ee267a293	fs: smb: common: add missing MODULE_DESCRIPTION() macros Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_arc4.o WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_md4.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-05-27 00:44:23 -05:00
Dave Airlie	3e049b6b8f	Merge tag 'drm-misc-fixes-2024-05-23' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: buddy: - stop using PAGE_SIZE shmem-helper: - avoid kernel panic in mmap() tests: - buddy: fix PAGE_SIZE dependency Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240523184745.GA11363@localhost.localdomain	2024-05-27 13:47:14 +10:00
Matthew Wilcox (Oracle)	b82b6eeefd	bcachefs: Use copy_folio_from_iter_atomic() copy_page_from_iter_atomic() will be removed at some point. Also fixup a comment for folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-26 22:30:09 -04:00
Jim Wylder	611b7eb19d	regmap-i2c: Subtract reg size from max_write Currently, when an adapter defines a max_write_len quirk, the data will be chunked into data sizes equal to the max_write_len quirk value. But the payload will be increased by the size of the register address before transmission. The resulting value always ends up larger than the limit set by the quirk. Avoid this error by setting regmap's max_write to the quirk's max_write_len minus the number of bytes for the register and padding. This allows the chunking to work correctly for this limited case without impacting other use-cases. Signed-off-by: Jim Wylder <jwylder@google.com> Link: https://msgid.link/r/20240523211437.2839942-1-jwylder@google.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-27 01:30:33 +01:00
Jeff Johnson	f94b77709e	firewire: add missing MODULE_DESCRIPTION() to test modules Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/firewire/uapi-test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/firewire/packet-serdes-test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240523-md-firewire-uapi-test-v1-1-6be5adcc3aed@quicinc.com Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2024-05-27 07:34:58 +09:00
Kent Overstreet	9242a34b76	bcachefs: Fix sb-downgrade validation Superblock downgrade entries are only two byte aligned, but section sizes are 8 byte aligned, which means we have to be careful about overrun checks; an entry that crosses the end of the section is allowed (and ignored) as long as it has zero errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-26 12:44:12 -04:00
Kent Overstreet	d509cadc3a	bcachefs: Fix debug assert Reported-by: syzbot+a8074a75b8d73328751e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-26 12:40:30 -04:00
Herbert Xu	67ec8cdf29	hwrng: core - Remove add_early_randomness A potential deadlock was reported with the config file at https://web.archive.org/web/20240522052129/https://0x0.st/XPN_.txt In this particular configuration, the deadlock doesn't exist because the warning triggered at a point before modules were even available. However, the deadlock can be real because any module loaded would invoke async_synchronize_full. The issue is spurious for software crypto algorithms which aren't themselves involved in async probing. However, it would be hard to avoid for a PCI crypto driver using async probing. In this particular call trace, the problem is easily avoided because the only reason the module is being requested during probing is the add_early_randomness call in the hwrng core. This feature is vestigial since there is now a kernel thread dedicated to doing exactly this. So remove add_early_randomness as it is no longer needed. Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reported-by: Eric Biggers <ebiggers@kernel.org> Fixes: `1b6d7f9eb1` ("tpm: add session encryption protection to tpm2_get_random()") Link: https://lore.kernel.org/r/119dc5ed-f159-41be-9dda-1a056f29888d@notapiano/ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-05-26 18:32:16 +08:00
Daniel Borkmann	95348e463e	selftests/bpf: Add netkit test for pkt_type Add a test case to assert that the skb->pkt_type which was set from the BPF program is retained from the netkit xmit side to the peer's device at tcx ingress location. # ./vmtest.sh -- ./test_progs -t netkit [...] ./test_progs -t netkit [ 1.140780] bpf_testmod: loading out-of-tree module taints kernel. [ 1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz [ 1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns [ 1.290384] clocksource: Switched to clocksource tsc #345 tc_netkit_basic:OK #346 tc_netkit_device:OK #347 tc_netkit_multi_links:OK #348 tc_netkit_multi_opts:OK #349 tc_netkit_neigh_links:OK #350 tc_netkit_pkt_type:OK Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20240524163619.26001-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:53:11 -07:00
Daniel Borkmann	998ffeb273	selftests/bpf: Add netkit tests for mac address This adds simple tests around setting MAC addresses in the different netkit modes. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20240524163619.26001-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:48:57 -07:00
Daniel Borkmann	3998d18426	netkit: Fix pkt_type override upon netkit pass verdict When running Cilium connectivity test suite with netkit in L2 mode, we found that compared to tcx a few tests were failing which pushed traffic into an L7 proxy sitting in host namespace. The problem in particular is around the invocation of eth_type_trans() in netkit. In case of tcx, this is run before the tcx ingress is triggered inside host namespace and thus if the BPF program uses the bpf_skb_change_type() helper the newly set type is retained. However, in case of netkit, the late eth_type_trans() invocation overrides the earlier decision from the BPF program which eventually leads to the test failure. Instead of eth_type_trans(), split out the relevant parts, meaning, reset of mac header and call to eth_skb_pkt_type() before the BPF program is run in order to have the same behavior as with tcx, and refactor a small helper called eth_skb_pull_mac() which is run in case it's passed up the stack where the mac header must be pulled. With this all connectivity tests pass. Fixes: `35dfaad718` ("netkit, bpf: Add bpf programmable net device") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20240524163619.26001-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:48:57 -07:00
Daniel Borkmann	d6fe532b74	netkit: Fix setting mac address in l2 mode When running Cilium connectivity test suite with netkit in L2 mode, we found that it is expected to be able to specify a custom MAC address for the devices, in particular, cilium-cni obtains the specified MAC address by querying the endpoint and sets the MAC address of the interface inside the Pod. Thus, fix the missing support in netkit for L2 mode. Fixes: `35dfaad718` ("netkit, bpf: Add bpf programmable net device") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20240524163619.26001-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:48:57 -07:00
Shahab Vahedi	dd6a403795	ARC, bpf: Fix issues reported by the static analyzers Also updated couple of comments along the way. One of the issues reported was indeed a bug in the code: memset(ctx, 0, sizeof(ctx)) // original line memset(ctx, 0, sizeof(*ctx)) // fixed line That was a nice catch. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405222314.UG5F2NHn-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202405232036.Xqoc3b0J-lkp@intel.com/ Signed-off-by: Shahab Vahedi <shahab@synopsys.com> Link: https://lore.kernel.org/r/20240525035628.1026-1-list+bpf@vahedi.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:47:21 -07:00
Alexei Starovoitov	590016ad83	Merge branch 'fix-bpf-multi-uprobe-pid-filtering-logic' Andrii Nakryiko says: ==================== Fix BPF multi-uprobe PID filtering logic It turns out that current implementation of multi-uprobe PID filtering logic is broken. It filters by thread, while the promise is filtering by process. Patch #1 fixes the logic trivially. The rest is testing and mitigations that are necessary for libbpf to not break users of USDT programs. v1->v2: - fix selftest in last patch (CI); - use semicolon in patch #3 (Jiri). ==================== Link: https://lore.kernel.org/r/20240521163401.3005045-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:03 -07:00
Andrii Nakryiko	198034a87d	selftests/bpf: extend multi-uprobe tests with USDTs Validate libbpf's USDT-over-multi-uprobe logic by adding USDTs to existing multi-uprobe tests. This checks correct libbpf fallback to singular uprobes (when run on older kernels with buggy PID filtering). We reuse already established child process and child thread testing infrastructure, so additions are minimal. These test fail on either older kernels or older version of libbpf that doesn't detect PID filtering problems. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:02 -07:00
Andrii Nakryiko	70342420a1	selftests/bpf: extend multi-uprobe tests with child thread case Extend existing multi-uprobe tests to test that PID filtering works correctly. We already have child process tests, but we need also child thread tests. This patch adds spawn_thread() helper to start child thread, wait for it to be ready, and then instruct it to trigger desired uprobes. Additionally, we extend BPF-side code to track thread ID, not just process ID. Also we detect whether extraneous triggerings with unexpected process IDs happened, and validate that none of that happened in practice. These changes prove that fixed PID filtering logic for multi-uprobe works as expected. These tests fail on old kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:02 -07:00
Andrii Nakryiko	04d939a2ab	libbpf: detect broken PID filtering logic for multi-uprobe Libbpf is automatically (and transparently to user) detecting multi-uprobe support in the kernel, and, if supported, uses multi-uprobes to improve USDT attachment speed. USDTs can be attached system-wide or for the specific process by PID. In the latter case, we rely on correct kernel logic of not triggering USDT for unrelated processes. As such, on older kernels that do support multi-uprobes, but still have broken PID filtering logic, we need to fall back to singular uprobes. Unfortunately, whether user is using PID filtering or not is known at the attachment time, which happens after relevant BPF programs were loaded into the kernel. Also unfortunately, we need to make a call whether to use multi-uprobes or singular uprobe for SEC("usdt") programs during BPF object load time, at which point we have no information about possible PID filtering. The distinction between single and multi-uprobes is small, but important for the kernel. Multi-uprobes get BPF_TRACE_UPROBE_MULTI attach type, and kernel internally substitiute different implementation of some of BPF helpers (e.g., bpf_get_attach_cookie()) depending on whether uprobe is multi or singular. So, multi-uprobes and singular uprobes cannot be intermixed. All the above implies that we have to make an early and conservative call about the use of multi-uprobes. And so this patch modifies libbpf's existing feature detector for multi-uprobe support to also check correct PID filtering. If PID filtering is not yet fixed, we fall back to singular uprobes for USDTs. This extension to feature detection is simple thanks to kernel's -EINVAL addition for pid < 0. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:02 -07:00
Andrii Nakryiko	4a8f635a60	bpf: remove unnecessary rcu_read_{lock,unlock}() in multi-uprobe attach logic get_pid_task() internally already calls rcu_read_lock() and rcu_read_unlock(), so there is no point to do this one extra time. This is a drive-by improvement and has no correctness implications. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:02 -07:00
Andrii Nakryiko	46ba0e49b6	bpf: fix multi-uprobe PID filtering logic Current implementation of PID filtering logic for multi-uprobes in uprobe_prog_run() is filtering down to exact thread, while the intent for PID filtering it to filter by process instead. The check in uprobe_prog_run() also differs from the analogous one in uprobe_multi_link_filter() for some reason. The latter is correct, checking task->mm, not the task itself. Fix the check in uprobe_prog_run() to perform the same task->mm check. While doing this, we also update get_pid_task() use to use PIDTYPE_TGID type of lookup, given the intent is to get a representative task of an entire process. This doesn't change behavior, but seems more logical. It would hold task group leader task now, not any random thread task. Last but not least, given multi-uprobe support is half-broken due to this PID filtering logic (depending on whether PID filtering is important or not), we need to make it easy for user space consumers (including libbpf) to easily detect whether PID filtering logic was already fixed. We do it here by adding an early check on passed pid parameter. If it's negative (and so has no chance of being a valid PID), we return -EINVAL. Previous behavior would eventually return -ESRCH ("No process found"), given there can't be any process with negative PID. This subtle change won't make any practical change in behavior, but will allow applications to detect PID filtering fixes easily. Libbpf fixes take advantage of this in the next patch. Cc: stable@vger.kernel.org Acked-by: Jiri Olsa <jolsa@kernel.org> Fixes: `b733eeade4` ("bpf: Add pid filter support for uprobe_multi link") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-25 10:46:02 -07:00
Marc Dionne	29be9100ac	afs: Don't cross .backup mountpoint from backup volume Don't cross a mountpoint that explicitly specifies a backup volume (target is <vol>.backup) when starting from a backup volume. It it not uncommon to mount a volume's backup directly in the volume itself. This can cause tools that are not paying attention to get into a loop mounting the volume onto itself as they attempt to traverse the tree, leading to a variety of problems. This doesn't prevent the general case of loops in a sequence of mountpoints, but addresses a common special case in the same way as other afs clients. Reported-by: Jan Henrik Sylvester <jan.henrik.sylvester@uni-hamburg.de> Link: http://lists.infradead.org/pipermail/linux-afs/2024-May/008454.html Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Link: http://lists.infradead.org/pipermail/linux-afs/2024-February/008074.html Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/768760.1716567475@warthog.procyon.org.uk Reviewed-by: Jeffrey Altman <jaltman@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-25 14:02:40 +02:00
Scott Mayhew	0c8c7c5597	nfs: don't invalidate dentries on transient errors This is a slight variation on a patch previously proposed by Neil Brown that never got merged. Prior to commit `5ceb9d7fda` ("NFS: Refactor nfs_lookup_revalidate()"), any error from nfs_lookup_verify_inode() other than -ESTALE would result in nfs_lookup_revalidate() returning that error (-ESTALE is mapped to zero). Since that commit, all errors result in nfs_lookup_revalidate() returning zero, resulting in dentries being invalidated where they previously were not (particularly in the case of -ERESTARTSYS). Fix it by passing the actual error code to nfs_lookup_revalidate_done(), and leaving the decision on whether to map the error code to zero or one to nfs_lookup_revalidate_done(). A simple reproducer is to run the following python code in a subdirectory of an NFS mount (not in the root of the NFS mount): ---8<--- import os import multiprocessing import time if __name__=="__main__": multiprocessing.set_start_method("spawn") count = 0 while True: try: os.getcwd() pool = multiprocessing.Pool(10) pool.close() pool.terminate() count += 1 except Exception as e: print(f"Failed after {count} iterations") print(e) break ---8<--- Prior to commit `5ceb9d7fda`, the above code would run indefinitely. After commit `5ceb9d7fda`, it fails almost immediately with -ENOENT. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-24 12:31:11 -04:00
Jan Kara	a527c3ba41	nfs: Avoid flushing many pages with NFS_FILE_SYNC When we are doing WB_SYNC_ALL writeback, nfs submits write requests with NFS_FILE_SYNC flag to the server (which then generally treats it as an O_SYNC write). This helps to reduce latency for single requests but when submitting more requests, additional fsyncs on the server side hurt latency. NFS generally avoids this additional overhead by not setting NFS_FILE_SYNC if desc->pg_moreio is set. However this logic doesn't always work. When we do random 4k writes to a huge file and then call fsync(2), each page writeback is going to be sent with NFS_FILE_SYNC because after preparing one page for writeback, we start writing back next, nfs_do_writepage() will call nfs_pageio_cond_complete() which finds the page is not contiguous with previously prepared IO and submits is without setting desc->pg_moreio. Hence NFS_FILE_SYNC is used resulting in poor performance. Fix the problem by setting desc->pg_moreio in nfs_pageio_cond_complete() before submitting outstanding IO. This improves throughput of fsync-after-random-writes on my test SSD from ~70MB/s to ~250MB/s. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-24 12:26:14 -04:00
Kundan Kumar	1bd293fcf3	nvme: adjust multiples of NVME_CTRL_PAGE_SIZE in offset bio_vec start offset may be relatively large particularly when large folio gets added to the bio. A bigger offset will result in avoiding the single-segment mapping optimization and end up using expensive mempool_alloc further. Rather than using absolute value, adjust bv_offset by NVME_CTRL_PAGE_SIZE while checking if segment can be fitted into one/two PRP entries. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kundan Kumar <kundan.kumar@samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-24 08:59:16 -07:00
Kanchan Joshi	64e3d02b43	nvme: remove sgs and sws sgs/sws are unused, so remove these from nvme_ns_head structure. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-24 08:57:40 -07:00
Friedrich Vock	44382b3ed6	bpf: Fix potential integer overflow in resolve_btfids err is a 32-bit integer, but elf_update returns an off_t, which is 64-bit at least on 64-bit platforms. If symbols_patch is called on a binary between 2-4GB in size, the result will be negative when cast to a 32-bit integer, which the code assumes means an error occurred. This can wrongly trigger build failures when building very large kernel images. Fixes: `fbbb68de80` ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240514070931.199694-1-friedrich.vock@gmx.de	2024-05-24 17:12:12 +02:00
Tetsuo Handa	b794918961	dma-buf/sw-sync: don't enable IRQ from sync_print_obj() Since commit `a6aa8fca4d` ("dma-buf/sw-sync: Reduce irqsave/irqrestore from known context") by error replaced spin_unlock_irqrestore() with spin_unlock_irq() for both sync_debugfs_show() and sync_print_obj() despite sync_print_obj() is called from sync_debugfs_show(), lockdep complains inconsistent lock state warning. Use plain spin_{lock,unlock}() for sync_print_obj(), for sync_debugfs_show() is already using spin_{lock,unlock}_irq(). Reported-by: syzbot <syzbot+a225ee3df7e7f9372dbe@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=a225ee3df7e7f9372dbe Fixes: `a6aa8fca4d` ("dma-buf/sw-sync: Reduce irqsave/irqrestore from known context") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/c2e46020-aaa6-4e06-bf73-f05823f913f0@I-love.SAKURA.ne.jp Signed-off-by: Christian König <christian.koenig@amd.com>	2024-05-24 16:30:05 +02:00
David S. Miller	0b4f5add9f	Merge branch 'mlx5-fixes' Tariq Toukan says: ==================== mlx5 fixes 24-05-22 This patchset provides bug fixes to mlx5 core and Eth drivers. Series generated against: commit `9c91c7fadb` ("net: mana: Fix the extra HZ in mana_hwc_send_request") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:08 +01:00
Gal Pressman	83fea49f27	net/mlx5e: Fix UDP GSO for encapsulated packets When the skb is encapsulated, adjust the inner UDP header instead of the outer one, and account for UDP header (instead of TCP) in the inline header size calculation. Fixes: `689adf0d48` ("net/mlx5e: Add UDP GSO support") Reported-by: Jason Baron <jbaron@akamai.com> Closes: https://lore.kernel.org/netdev/c42961cb-50b9-4a9a-bd43-87fe48d88d29@akamai.com/ Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Boris Pismenny <borisp@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:08 +01:00
Carolina Jubran	5c74195d5d	net/mlx5e: Use rx_missed_errors instead of rx_dropped for reporting buffer exhaustion Previously, the driver incorrectly used rx_dropped to report device buffer exhaustion. According to the documentation, rx_dropped should not be used to count packets dropped due to buffer exhaustion, which is the purpose of rx_missed_errors. Use rx_missed_errors as intended for counting packets dropped due to buffer exhaustion. Fixes: `269e6b3af3` ("net/mlx5e: Report additional error statistics in get stats ndo") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:08 +01:00
Rahul Rameshbabu	f55cd31287	net/mlx5e: Do not use ptp structure for tx ts stats when not initialized The ptp channel instance is only initialized when ptp traffic is first processed by the driver. This means that there is a window in between when port timestamping is enabled and ptp traffic is sent where the ptp channel instance is not initialized. Accessing statistics during this window will lead to an access violation (NULL + member offset). Check the validity of the instance before attempting to query statistics. BUG: unable to handle page fault for address: 0000000000003524 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 109dfc067 P4D 109dfc067 PUD 1064ef067 PMD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 420 Comm: ethtool Not tainted 6.9.0-rc2-rrameshbabu+ #245 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.3-1-1 04/01/204 RIP: 0010:mlx5e_stats_ts_get+0x4c/0x130 <snip> Call Trace: <TASK> ? show_regs+0x60/0x70 ? __die+0x24/0x70 ? page_fault_oops+0x15f/0x430 ? do_user_addr_fault+0x2c9/0x5c0 ? exc_page_fault+0x63/0x110 ? asm_exc_page_fault+0x27/0x30 ? mlx5e_stats_ts_get+0x4c/0x130 ? mlx5e_stats_ts_get+0x20/0x130 mlx5e_get_ts_stats+0x15/0x20 <snip> Fixes: `3579032c08` ("net/mlx5e: Implement ethtool hardware timestamping statistics") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:08 +01:00
Rahul Rameshbabu	9a52f6d44f	net/mlx5e: Fix IPsec tunnel mode offload feature check Remove faulty check disabling checksum offload and GSO for offload of simple IPsec tunnel L4 traffic. Comment previously describing the deleted code incorrectly claimed the check prevented double tunnel (or three layers of ip headers). Fixes: `f1267798c9` ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:08 +01:00
Rahul Rameshbabu	16d66a4fa8	net/mlx5: Use mlx5_ipsec_rx_status_destroy to correctly delete status rules rx_create no longer allocates a modify_hdr instance that needs to be cleaned up. The mlx5_modify_header_dealloc call will lead to a NULL pointer dereference. A leak in the rules also previously occurred since there are now two rules populated related to status. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 109907067 P4D 109907067 PUD 116890067 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 484 Comm: ip Not tainted 6.9.0-rc2-rrameshbabu+ #254 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.3-1-1 04/01/2014 RIP: 0010:mlx5_modify_header_dealloc+0xd/0x70 <snip> Call Trace: <TASK> ? show_regs+0x60/0x70 ? __die+0x24/0x70 ? page_fault_oops+0x15f/0x430 ? free_to_partial_list.constprop.0+0x79/0x150 ? do_user_addr_fault+0x2c9/0x5c0 ? exc_page_fault+0x63/0x110 ? asm_exc_page_fault+0x27/0x30 ? mlx5_modify_header_dealloc+0xd/0x70 rx_create+0x374/0x590 rx_add_rule+0x3ad/0x500 ? rx_add_rule+0x3ad/0x500 ? mlx5_cmd_exec+0x2c/0x40 ? mlx5_create_ipsec_obj+0xd6/0x200 mlx5e_accel_ipsec_fs_add_rule+0x31/0xf0 mlx5e_xfrm_add_state+0x426/0xc00 <snip> Fixes: `94af50c0a9` ("net/mlx5e: Unify esw and normal IPsec status table creation/destruction") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:07 +01:00
Gal Pressman	1b9f86c6d5	net/mlx5: Fix MTMP register capability offset in MCAM register The MTMP register (0x900a) capability offset is off-by-one, move it to the right place. Fixes: `1f507e80c7` ("net/mlx5: Expose NIC temperature via hardware monitoring kernel API") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:07 +01:00
Tariq Toukan	fca3b47918	net/mlx5: Do not query MPIR on embedded CPU function A proper query to MPIR needs to set the correct value in the depth field. On embedded CPU this value is not necessarily zero. As there is no real use case for multi-PF netdev on the embedded CPU of the smart NIC, block this option. This fixes the following failure: ACCESS_REG(0x805) op_mod(0x1) failed, status bad system state(0x4), syndrome (0x685f19), err(-5) Fixes: `678eb44805` ("net/mlx5: SD, Implement basic query and instantiation") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:07 +01:00
Maher Sanalla	51ef9305b8	net/mlx5: Lag, do bond only if slaves agree on roce state Currently, the driver does not enforce that lag bond slaves must have matching roce capabilities. Yet, in mlx5_do_bond(), the driver attempts to enable roce on all vports of the bond slaves, causing the following syndrome when one slave has no roce fw support: mlx5_cmd_out_err:809:(pid 25427): MODIFY_NIC_VPORT_CONTEXT(0×755) op_mod(0×0) failed, status bad parameter(0×3), syndrome (0xc1f678), err(-22) Thus, create HW lag only if bond's slaves agree on roce state, either all slaves have roce support resulting in a roce lag bond, or none do, resulting in a raw eth bond. Fixes: `7907f23adc` ("net/mlx5: Implement RoCE LAG feature") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 13:27:07 +01:00
Christian Brauner	712182b67e	swap: yield device immediately Otherwise we can cause spurious EBUSY issues when trying to mount the rootfs later on. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218845 Reported-by: Petri Kaukasoina <petri.kaukasoina@tuni.fi> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:08 +02:00
David Howells	c596bea145	netfs: Fix setting of BDP_ASYNC from iocb flags Fix netfs_perform_write() to set BDP_ASYNC if IOCB_NOWAIT is set rather than if IOCB_SYNC is not set. It reflects asynchronicity in the sense of not waiting rather than synchronicity in the sense of not returning until the op is complete. Without this, generic/590 fails on cifs in strict caching mode with a complaint that one of the writes fails with EAGAIN. The test can be distilled down to: mount -t cifs /my/share /mnt -ostuff xfs_io -i -c 'falloc 0 8191M -c fsync -f /mnt/file xfs_io -i -c 'pwrite -b 1M -W 0 8191M' /mnt/file Fixes: `c38f4e96e6` ("netfs: Provide func to copy data to pagecache for buffered write") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/316306.1716306586@warthog.procyon.org.uk Reviewed-by: Jens Axboe <axboe@kernel.dk> cc: Jeff Layton <jlayton@kernel.org> cc: Enzo Matsumiya <ematsumiya@suse.de> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: netfs@lists.linux.dev cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:07 +02:00
Fedor Pchelkin	65bea99537	signalfd: drop an obsolete comment Commit `fbe38120eb` ("signalfd: convert to ->read_iter()") removed the call to anon_inode_getfd() by splitting fd setup into two parts. Drop the comment referencing the internal details of that function. Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://lore.kernel.org/r/20240520090819.76342-2-pchelkin@ispras.ru Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:07 +02:00
Fedor Pchelkin	f826bc9d6f	signalfd: fix error return code If anon_inode_getfile() fails, return appropriate error code. This looks like a single typo: the similar code changes in timerfd and userfaultfd are okay. Found by Linux Verification Center (linuxtesting.org). Fixes: `fbe38120eb` ("signalfd: convert to ->read_iter()") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://lore.kernel.org/r/20240520090819.76342-1-pchelkin@ispras.ru Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:07 +02:00
Xu Yang	4e527d5841	iomap: fault in smaller chunks for non-large folio mappings Since commit (`5d8edfb900` "iomap: Copy larger chunks from userspace"), iomap will try to copy in larger chunks than PAGE_SIZE. However, if the mapping doesn't support large folio, only one page of maximum 4KB will be created and 4KB data will be writen to pagecache each time. Then, next 4KB will be handled in next iteration. This will cause potential write performance problem. If chunk is 2MB, total 512 pages need to be handled finally. During this period, fault_in_iov_iter_readable() is called to check iov_iter readable validity. Since only 4KB will be handled each time, below address space will be checked over and over again: start end - buf, buf+2MB buf+4KB, buf+2MB buf+8KB, buf+2MB ... buf+2044KB buf+2MB Obviously the checking size is wrong since only 4KB will be handled each time. So this will get a correct chunk to let iomap work well in non-large folio case. With this change, the write speed will be stable. Tested on ARM64 device. Before: - dd if=/dev/zero of=/dev/sda bs=400K count=10485 (334 MB/s) - dd if=/dev/zero of=/dev/sda bs=800K count=5242 (278 MB/s) - dd if=/dev/zero of=/dev/sda bs=1600K count=2621 (204 MB/s) - dd if=/dev/zero of=/dev/sda bs=2200K count=1906 (170 MB/s) - dd if=/dev/zero of=/dev/sda bs=3000K count=1398 (150 MB/s) - dd if=/dev/zero of=/dev/sda bs=4500K count=932 (139 MB/s) After: - dd if=/dev/zero of=/dev/sda bs=400K count=10485 (339 MB/s) - dd if=/dev/zero of=/dev/sda bs=800K count=5242 (330 MB/s) - dd if=/dev/zero of=/dev/sda bs=1600K count=2621 (332 MB/s) - dd if=/dev/zero of=/dev/sda bs=2200K count=1906 (333 MB/s) - dd if=/dev/zero of=/dev/sda bs=3000K count=1398 (333 MB/s) - dd if=/dev/zero of=/dev/sda bs=4500K count=932 (333 MB/s) Fixes: `5d8edfb900` ("iomap: Copy larger chunks from userspace") Cc: stable@vger.kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240521114939.2541461-2-xu.yang_2@nxp.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:07 +02:00
Xu Yang	79c1374548	filemap: add helper mapping_max_folio_size() Add mapping_max_folio_size() to get the maximum folio size for this pagecache mapping. Fixes: `5d8edfb900` ("iomap: Copy larger chunks from userspace") Cc: stable@vger.kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240521114939.2541461-1-xu.yang_2@nxp.com Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:06 +02:00
David Howells	2c6b531020	netfs: Fix AIO error handling when doing write-through If an error occurs whilst we're doing an AIO write in write-through mode, we may end up calling ->ki_complete() and returning an error from ->write_iter(). This can result in either a UAF (the ->ki_complete() func pointer may get overwritten, for example) or a refcount underflow in io_submit() as ->ki_complete is called twice. Fix this by making netfs_end_writethrough() - and thus netfs_perform_write() - unconditionally return -EIOCBQUEUED if we're doing an AIO write and wait for completion if we're not. Fixes: `288ace2f57` ("netfs: New writeback implementation") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/295052.1716298587@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: Enzo Matsumiya <ematsumiya@suse.de> cc: netfs@lists.linux.dev cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:06 +02:00
David Howells	9b038d004c	netfs: Fix io_uring based write-through This can be triggered by mounting a cifs filesystem with a cache=strict mount option and then, using the fsx program from xfstests, doing: ltp/fsx -A -d -N 1000 -S 11463 -P /tmp /cifs-mount/foo \ --replay-ops=gen112-fsxops Where gen112-fsxops holds: fallocate 0x6be7 0x8fc5 0x377d3 copy_range 0x9c71 0x77e8 0x2edaf 0x377d3 write 0x2776d 0x8f65 0x377d3 The problem is that netfs_io_request::len is being used for two purposes and ends up getting set to the amount of data we transferred, not the amount of data the caller asked to be transferred (for various reasons, such as mmap'd writes, we might end up rounding out the data written to the server to include the entire folio at each end). Fix this by keeping the amount we were asked to write in ->len and using ->submitted to track what we issued ops for. Then, when we come to calling ->ki_complete(), ->len is the right size. This also required netfs_cleanup_dio_write() to change since we're no longer advancing wreq->len. Use wreq->transferred instead as we might have done a short read. With this, the generic/112 xfstest passes if cifs is forced to put all non-DIO opens into write-through mode. Fixes: `288ace2f57` ("netfs: New writeback implementation") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/295086.1716298663@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <stfrench@microsoft.com> cc: Enzo Matsumiya <ematsumiya@suse.de> cc: netfs@lists.linux.dev cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-05-24 13:34:06 +02:00
Mathieu Othacehe	128d54fbcb	net: phy: micrel: set soft_reset callback to genphy_soft_reset for KSZ8061 Following a similar reinstate for the KSZ8081 and KSZ9031. Older kernels would use the genphy_soft_reset if the PHY did not implement a .soft_reset. The KSZ8061 errata described here: https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8061-Errata-DS80000688B.pdf and worked around with `232ba3a51c` ("net: phy: Micrel KSZ8061: link failure after cable connect") is back again without this soft reset. Fixes: `6e2d85ec05` ("net: phy: Stop with excessive soft reset") Tested-by: Karim Ben Houcine <karim.benhoucine@landisgyr.com> Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 12:30:38 +01:00
Matt Jan	06e785aeb9	connector: Fix invalid conversion in cn_proc.h The implicit conversion from unsigned int to enum proc_cn_event is invalid, so explicitly cast it for compilation in a C++ compiler. /usr/include/linux/cn_proc.h: In function 'proc_cn_event valid_event(proc_cn_event)': /usr/include/linux/cn_proc.h:72:17: error: invalid conversion from 'unsigned int' to 'proc_cn_event' [-fpermissive] 72 \| ev_type &= PROC_EVENT_ALL; \| ^ \| \| \| unsigned int Signed-off-by: Matt Jan <zoo868e@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-05-24 10:36:55 +01:00
Martin K. Petersen	d09c05aa35	scsi: core: Handle devices which return an unusually large VPD page count Peter Schneider reported that a system would no longer boot after updating to 6.8.4. Peter bisected the issue and identified commit `b5fc07a5fb` ("scsi: core: Consult supported VPD page list prior to fetching page") as being the culprit. Turns out the enclosure device in Peter's system reports a byteswapped page length for VPD page 0. It reports "02 00" as page length instead of "00 02". This causes us to attempt to access 516 bytes (page length + header) of information despite only 2 pages being present. Limit the page search scope to the size of our VPD buffer to guard against devices returning a larger page count than requested. Link: https://lore.kernel.org/r/20240521023040.2703884-1-martin.petersen@oracle.com Fixes: `b5fc07a5fb` ("scsi: core: Consult supported VPD page list prior to fetching page") Cc: stable@vger.kernel.org Reported-by: Peter Schneider <pschneider1968@googlemail.com> Closes: https://lore.kernel.org/all/eec6ebbf-061b-4a7b-96dc-ea748aa4d035@googlemail.com/ Tested-by: Peter Schneider <pschneider1968@googlemail.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-23 20:35:32 -04:00
Andrii Nakryiko	699646734a	uprobes: prevent mutex_lock() under rcu_read_lock() Recent changes made uprobe_cpu_buffer preparation lazy, and moved it deeper into __uprobe_trace_func(). This is problematic because __uprobe_trace_func() is called inside rcu_read_lock()/rcu_read_unlock() block, which then calls prepare_uprobe_buffer() -> uprobe_buffer_get() -> mutex_lock(&ucb->mutex), leading to a splat about using mutex under non-sleepable RCU: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 98231, name: stress-ng-sigq preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 ... Call Trace: <TASK> dump_stack_lvl+0x3d/0xe0 __might_resched+0x24c/0x270 ? prepare_uprobe_buffer+0xd5/0x1d0 __mutex_lock+0x41/0x820 ? ___perf_sw_event+0x206/0x290 ? __perf_event_task_sched_in+0x54/0x660 ? __perf_event_task_sched_in+0x54/0x660 prepare_uprobe_buffer+0xd5/0x1d0 __uprobe_trace_func+0x4a/0x140 uprobe_dispatcher+0x135/0x280 ? uprobe_dispatcher+0x94/0x280 uprobe_notify_resume+0x650/0xec0 ? atomic_notifier_call_chain+0x21/0x110 ? atomic_notifier_call_chain+0xf8/0x110 irqentry_exit_to_user_mode+0xe2/0x1e0 asm_exc_int3+0x35/0x40 RIP: 0033:0x7f7e1d4da390 Code: 33 04 00 0f 1f 80 00 00 00 00 f3 0f 1e fa b9 01 00 00 00 e9 b2 fc ff ff 66 90 f3 0f 1e fa 31 c9 e9 a5 fc ff ff 0f 1f 44 00 00 <cc> 0f 1e fa b8 27 00 00 00 0f 05 c3 0f 1f 40 00 f3 0f 1e fa b8 6e RSP: 002b:00007ffd2abc3608 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000076d325f1 RCX: 0000000000000000 RDX: 0000000076d325f1 RSI: 000000000000000a RDI: 00007ffd2abc3690 RBP: 000000000000000a R08: 00017fb700000000 R09: 00017fb700000000 R10: 00017fb700000000 R11: 0000000000000246 R12: 0000000000017ff2 R13: 00007ffd2abc3610 R14: 0000000000000000 R15: 00007ffd2abc3780 </TASK> Luckily, it's easy to fix by moving prepare_uprobe_buffer() to be called slightly earlier: into uprobe_trace_func() and uretprobe_trace_func(), outside of RCU locked section. This still keeps this buffer preparation lazy and helps avoid the overhead when it's not needed. E.g., if there is only BPF uprobe handler installed on a given uprobe, buffer won't be initialized. Note, the other user of prepare_uprobe_buffer(), __uprobe_perf_func(), is not affected, as it doesn't prepare buffer under RCU read lock. Link: https://lore.kernel.org/all/20240521053017.3708530-1-andrii@kernel.org/ Fixes: `1b8f85defb` ("uprobes: prepare uprobe args buffer lazily") Reported-by: Breno Leitao <leitao@debian.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-05-24 07:44:44 +09:00
Mario Limonciello	8195979d2d	drm/amd/display: Enable colorspace property for MST connectors MST colorspace property support was disabled due to a series of warnings that came up when the device was plugged in since the properties weren't made at device creation. Create the properties in advance instead. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: `69a9596102` ("drm/amd/display: Temporary Disable MST DP Colorspace Property"). Reported-and-tested-by: Tyler Schneider <tyler.schneider@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3353 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-23 18:14:14 -04:00
Hawking Zhang	ec58991054	drm/amdgpu: correct hbm field in boot status hbm filed takes bit 13 and bit 14 in boot status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-05-23 18:13:23 -04:00
Sagi Grimberg	f97914e35f	nvmet: fix ns enable/disable possible hang When disabling an nvmet namespace, there is a period where the subsys->lock is released, as the ns disable waits for backend IO to complete, and the ns percpu ref to be properly killed. The original intent was to avoid taking the subsystem lock for a prolong period as other processes may need to acquire it (for example new incoming connections). However, it opens up a window where another process may come in and enable the ns, (re)intiailizing the ns percpu_ref, causing the disable sequence to hang. Solve this by taking the global nvmet_config_sem over the entire configfs enable/disable sequence. Fixes: `a07b4970f4` ("nvmet: add a generic NVMe target") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-23 13:44:42 -07:00
Keith Busch	a2e4c5f5f6	nvme-multipath: fix io accounting on failover There are io stats accounting that needs to be handled, so don't call blk_mq_end_request() directly. Use the existing nvme_end_req() helper that already handles everything. Fixes: `d4d957b53d` ("nvme-multipath: support io stats on the mpath device") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-23 13:44:42 -07:00
Keith Busch	2fe7b42246	nvme: fix multipath batched completion accounting Batched completions were missing the io stats accounting and bio trace events. Move the common code to a helper and call it from the batched and non-batched functions. Fixes: `d4d957b53d` ("nvme-multipath: support io stats on the mpath device") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-23 13:44:35 -07:00
Sean Christopherson	b4bd556467	KVM: SVM: WARN on vNMI + NMI window iff NMIs are outright masked When requesting an NMI window, WARN on vNMI support being enabled if and only if NMIs are actually masked, i.e. if the vCPU is already handling an NMI. KVM's ABI for NMIs that arrive simultanesouly (from KVM's point of view) is to inject one NMI and pend the other. When using vNMI, KVM pends the second NMI simply by setting V_NMI_PENDING, and lets the CPU do the rest (hardware automatically sets V_NMI_BLOCKING when an NMI is injected). However, if KVM can't immediately inject an NMI, e.g. because the vCPU is in an STI shadow or is running with GIF=0, then KVM will request an NMI window and trigger the WARN (but still function correctly). Whether or not the GIF=0 case makes sense is debatable, as the intent of KVM's behavior is to provide functionality that is as close to real hardware as possible. E.g. if two NMIs are sent in quick succession, the probability of both NMIs arriving in an STI shadow is infinitesimally low on real hardware, but significantly larger in a virtual environment, e.g. if the vCPU is preempted in the STI shadow. For GIF=0, the argument isn't as clear cut, because the window where two NMIs can collide is much larger in bare metal (though still small). That said, KVM should not have divergent behavior for the GIF=0 case based on whether or not vNMI support is enabled. And KVM has allowed simultaneous NMIs with GIF=0 for over a decade, since commit `7460fb4a34` ("KVM: Fix simultaneous NMIs"). I.e. KVM's GIF=0 handling shouldn't be modified without a really good reason to do so, and if KVM's behavior were to be modified, it should be done irrespective of vNMI support. Fixes: `fa4c027a79` ("KVM: x86: Add support for SVM's Virtual NMI") Cc: stable@vger.kernel.org Cc: Santosh Shukla <Santosh.Shukla@amd.com> Cc: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240522021435.1684366-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:34:44 -04:00
Sean Christopherson	76d5363c20	KVM: x86: Force KVM_WERROR if the global WERROR is enabled Force KVM_WERROR if the global WERROR is enabled to avoid pestering the user about a Kconfig that will ultimately be ignored. Force KVM_WERROR instead of making it mutually exclusive with WERROR to avoid generating a .config builds KVM with -Werror, but has KVM_WERROR=n. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240517180341.974251-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:33:31 -04:00
Sean Christopherson	6af6142e3a	KVM: x86: Disable KVM_INTEL_PROVE_VE by default Disable KVM's "prove #VE" support by default, as it provides no functional value, and even its sanity checking benefits are relatively limited. I.e. it should be fully opt-in even on debug kernels, especially since EPT Violation #VE suppression appears to be buggy on some CPUs. Opportunistically add a line in the help text to make it abundantly clear that KVM_INTEL_PROVE_VE should never be enabled in a production environment. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:33:15 -04:00
Sean Christopherson	a5dc0c9b55	KVM: VMX: Enumerate EPT Violation #VE support in /proc/cpuinfo Don't suppress printing EPT_VIOLATION_VE in /proc/cpuinfo, knowing whether or not KVM_INTEL_PROVE_VE actually does anything is extremely valuable. A privileged user can get at the information by reading the raw MSR, but the whole point of the VMX flags is to avoid needing to glean information from raw MSR reads. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:28:49 -04:00
Sean Christopherson	bca99c0356	KVM: x86/mmu: Print SPTEs on unexpected #VE Print the SPTEs that correspond to the faulting GPA on an unexpected EPT Violation #VE to help the user debug failures, e.g. to pinpoint which SPTE didn't have SUPPRESS_VE set. Opportunistically assert that the underlying exit reason was indeed an EPT Violation, as the CPU has really gone off the rails if a #VE occurs due to a completely unexpected exit reason. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:28:45 -04:00
Sean Christopherson	743f177336	KVM: VMX: Dump VMCS on unexpected #VE Dump the VMCS on an unexpected #VE, otherwise it's practically impossible to figure out why the #VE occurred. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:28:04 -04:00
Sean Christopherson	837d557aba	KVM: x86/mmu: Add sanity checks that KVM doesn't create EPT #VE SPTEs Assert that KVM doesn't set a SPTE to a value that could trigger an EPT Violation #VE on a non-MMIO SPTE, e.g. to help detect bugs even without KVM_INTEL_PROVE_VE enabled, and to help debug actual #VE failures. Note, this will run afoul of TDX support, which needs to reflect emulated MMIO accesses into the guest as #VEs (which was the whole point of adding EPT Violation #VE support in KVM). The obvious fix for that is to exempt MMIO SPTEs, but that's annoyingly difficult now that is_mmio_spte() relies on a per-VM value. However, resolving that conundrum is a future problem, whereas getting KVM_INTEL_PROVE_VE healthy is a current problem. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:27:26 -04:00
Sean Christopherson	9031b42139	KVM: nVMX: Always handle #VEs in L0 (never forward #VEs from L2 to L1) Always handle #VEs, e.g. due to prove EPT Violation #VE failures, in L0, as KVM does not expose any #VE capabilities to L1, i.e. any and all #VEs are KVM's responsibility. Fixes: `8131cf5b4f` ("KVM: VMX: Introduce test mode related to EPT violation VE") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:27:26 -04:00
Sean Christopherson	d1b32ecdc8	KVM: nVMX: Initialize #VE info page for vmcs02 when proving #VE support Point vmcs02.VE_INFORMATION_ADDRESS at the vCPU's #VE info page when initializing vmcs02, otherwise KVM will run L2 with EPT Violation #VE enabled and a VE info address pointing at pfn 0. Fixes: `8131cf5b4f` ("KVM: VMX: Introduce test mode related to EPT violation VE") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:27:25 -04:00
Sean Christopherson	40e8a6901a	KVM: VMX: Don't kill the VM on an unexpected #VE Don't terminate the VM on an unexpected #VE, as it's extremely unlikely the #VE is fatal to the guest, and even less likely that it presents a danger to the host. Simply resume the guest on "failure", as the #VE info page's BUSY field will prevent converting any more EPT Violations to #VEs for the vCPU (at least, that's what the BUSY field is supposed to do). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20240518000430.1118488-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:27:17 -04:00
Isaku Yamahata	803482f472	KVM: x86/mmu: Use SHADOW_NONPRESENT_VALUE for atomic zap in TDP MMU Use SHADOW_NONPRESENT_VALUE when zapping TDP MMU SPTEs with mmu_lock held for read, tdp_mmu_zap_spte_atomic() was simply missed during the initial development. Fixes: `7f01cab849` ("KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE") Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> [sean: write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240518000430.1118488-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-05-23 12:24:39 -04:00
Daniel Vetter	36f53d622a	Merge tag 'drm-misc-fixes-2024-05-16' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Short summary of fixes pull: nouveau: - use tile_mode and pte_kind for VM_BIND bo allocations Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240516072658.GA8395@linux.fritz.box	2024-05-23 17:07:13 +02:00
Mark Brown	3aac9f4885	soi: Don't call DMA sync API when not needed Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>: A couple of fixes to avoid calling DMA sync API when it's not needed. This doesn't stop from discussing if IOMMU code is doing the right thing, i.e. dereferences SG list when orig_nents == 0, but this is a separate story.	2024-05-23 15:16:57 +01:00
Fedor Pchelkin	e64746e74f	dma-mapping: benchmark: handle NUMA_NO_NODE correctly cpumask_of_node() can be called for NUMA_NO_NODE inside do_map_benchmark() resulting in the following sanitizer report: UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28 index -1 is out of range for type 'cpumask [64][1]' CPU: 1 PID: 990 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #29 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) ubsan_epilogue (lib/ubsan.c:232) __ubsan_handle_out_of_bounds (lib/ubsan.c:429) cpumask_of_node (arch/x86/include/asm/topology.h:72) [inline] do_map_benchmark (kernel/dma/map_benchmark.c:104) map_benchmark_ioctl (kernel/dma/map_benchmark.c:246) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Use cpumask_of_node() in place when binding a kernel thread to a cpuset of a particular node. Note that the provided node id is checked inside map_benchmark_ioctl(). It's just a NUMA_NO_NODE case which is not handled properly later. Found by Linux Verification Center (linuxtesting.org). Fixes: `65789daa80` ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-05-23 15:06:48 +02:00
Fedor Pchelkin	1ff05e723f	dma-mapping: benchmark: fix node id validation While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 #37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: `65789daa80` ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-05-23 15:06:47 +02:00
Fedor Pchelkin	f7c9ccaadf	dma-mapping: benchmark: avoid needless copy_to_user if benchmark fails If do_map_benchmark() has failed, there is nothing useful to copy back to userspace. Suggested-by: Barry Song <21cnbao@gmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-05-23 15:06:47 +02:00
Fedor Pchelkin	bb9025f443	dma-mapping: benchmark: fix up kthread-related error handling kthread creation failure is invalidly handled inside do_map_benchmark(). The put_task_struct() calls on the error path are supposed to balance the get_task_struct() calls which only happen after all the kthreads are successfully created. Rollback using kthread_stop() for already created kthreads in case of such failure. In normal situation call kthread_stop_put() to gracefully stop kthreads and put their task refcounts. This should be done for all started kthreads. Found by Linux Verification Center (linuxtesting.org). Fixes: `65789daa80` ("dma-mapping: add benchmark support for streaming DMA APIs") Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2024-05-23 15:06:47 +02:00
Baochen Qiang	6e16782d6b	wifi: ath11k: move power type check to ASSOC stage when connecting to 6 GHz AP With commit `bc8a0fac86` ("wifi: mac80211: don't set bss_conf in parsing") ath11k fails to connect to 6 GHz AP. This is because currently ath11k checks AP's power type in ath11k_mac_op_assign_vif_chanctx() which would be called in AUTH stage. However with above commit power type is not available until ASSOC stage. As a result power type check fails and therefore connection fails. Fix this by moving power type check to ASSOC stage, also move regulatory rules update there because it depends on power type. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30 Fixes: `bc8a0fac86` ("wifi: mac80211: don't set bss_conf in parsing") Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240424064019.4847-1-quic_bqiang@quicinc.com	2024-05-23 15:45:52 +03:00
Carl Huang	ed281c6ab6	wifi: ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs WCN6750 firmware crashes because of num_vdevs changed from 4 to 17 in ath11k_init_wmi_config_qca6390() as the ab->hw_params.num_vdevs is 17. This is caused by commit `f019f4dff2` ("wifi: ath11k: support 2 station interfaces") which assigns ab->hw_params.num_vdevs directly to config->num_vdevs in ath11k_init_wmi_config_qca6390(), therefore WCN6750 firmware crashes as it can't support such a big num_vdevs. Fix it by assign 3 to num_vdevs in hw_params for WCN6750 as 3 is sufficient too. Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3 Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.1.0.1-01371-QCAMSLSWPLZ-1 Fixes: `f019f4dff2` ("wifi: ath11k: support 2 station interfaces") Reported-by: Luca Weiss <luca.weiss@fairphone.com> Tested-by: Luca Weiss <luca.weiss@fairphone.com> Closes: https://lore.kernel.org/r/D15TIIDIIESY.D1EKKJLZINMA@fairphone.com/ Signed-off-by: Carl Huang <quic_cjhuang@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240520030757.2209395-1-quic_cjhuang@quicinc.com	2024-05-23 15:43:40 +03:00
Chen Ni	0a3f9f7fc5	HID: nvidia-shield: Add missing check for input_ff_create_memless Add check for the return value of input_ff_create_memless() and return the error if it fails in order to catch the error. Fixes: `09308562d4` ("HID: nvidia-shield: Initial driver implementation with Thunderstrike support") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-05-23 14:17:22 +02:00
Zhang Lixu	2360497238	HID: intel-ish-hid: Fix build error for COMPILE_TEST kernel test robot reported build error due to a pointer type mismatch: .../ishtp/loader.c:172:8: error: incompatible pointer types passing '__le64 ' (aka 'unsigned long long ') to parameter of type 'dma_addr_t ' (aka 'unsigned int ') The issue arises because the driver, which is primarily intended for x86-64, is also built for i386 when COMPILE_TEST is enabled. Resolve type mismatch by using a temporary dma_addr_t variable to hold the DMA address. Populate this temporary variable in dma_alloc_coherent() function, and then convert and store the address in the fragment->fragment_tbl[i].ddr_adrs field in the correct format. Similarly, convert the ddr_adrs field back to dma_addr_t when freeing the DMA buffer with dma_free_coherent(). Fixes: `579a267e46` ("HID: intel-ish-hid: Implement loading firmware from host feature") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405201313.SAStVPrT-lkp@intel.com/ Signed-off-by: Zhang Lixu <lixu.zhang@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>	2024-05-23 14:12:48 +02:00
Heiner Kallweit	e61bcf42d2	i2c: Remove I2C_CLASS_SPD Remove this class after all users have been gone. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-05-23 13:38:15 +02:00
Christophe JAILLET	e6722ea6b9	i2c: synquacer: Remove a clk reference from struct synquacer_i2c 'pclk' is only used locally in the probe. Remove it from the 'synquacer_i2c' structure. Also remove a useless debug message. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-05-23 13:38:15 +02:00
Uwe Kleine-König	a827ad9b3c	spi: stm32: Revert change that enabled controller before asserting CS On stm32mp157 enabling the controller before asserting CS makes the hardware trigger spurious interrupts in a tight loop and the transfers fail. Revert the commit that swapped the order of enable and CS. This reintroduces the problem that swapping was supposed to fix, which however is less grave. Reported-by: Leonard Göhrs <l.goehrs@pengutronix.de> Link: https://lore.kernel.org/all/39033ed7-3e57-4339-80b4-fc8919e26aa7@pengutronix.de/ Fixes: `52b62e7a5d` ("spi: stm32: enable controller before asserting CS") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://msgid.link/r/20240523103326.792907-2-u.kleine-koenig@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-23 12:35:09 +01:00
Andy Shevchenko	da560097c0	spi: Check if transfer is mapped before calling DMA sync APIs The resent update to remove the orig_nents checks revealed that not all DMA sync backends can cope with the unallocated SG list, while supplying orig_nents == 0 (the commit `861370f49c` ("iommu/dma: force bouncing if the size is not cacheline-aligned"), for example, makes that happen for the IOMMU case). It means we have to check if the buffers are DMA mapped before trying to sync them. Re-introduce that check in a form of calling ->can_dma() in the same way as it's done in the DMA mapping loop for the SPI transfers. Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Closes: https://lore.kernel.org/r/8ae675b5-fcf9-4c9b-b06a-4462f70e1322@linaro.org Closes: https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano Fixes: `8cc3bad9d9` ("spi: Remove unneded check for orig_nents") Suggested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://msgid.link/r/20240522171018.3362521-3-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-23 12:32:54 +01:00
Andy Shevchenko	9f788ba457	spi: Don't mark message DMA mapped when no transfer in it is There is no need to set the DMA mapped flag of the message if it has no mapped transfers. Moreover, it may give the code a chance to take the wrong paths, i.e. to exercise DMA related APIs on unmapped data. Make __spi_map_msg() to bail earlier on the above mentioned cases. Fixes: `99adef310f` ("spi: Provide core support for DMA mapping transfers") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://msgid.link/r/20240522171018.3362521-2-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-05-23 12:32:52 +01:00
Dominique Martinet	c898afdc15	9p: add missing locking around taking dentry fid list Fix a use-after-free on dentry's d_fsdata fid list when a thread looks up a fid through dentry while another thread unlinks it: UAF thread: refcount_t: addition on 0; use-after-free. p9_fid_get linux/./include/net/9p/client.h:262 v9fs_fid_find+0x236/0x280 linux/fs/9p/fid.c:129 v9fs_fid_lookup_with_uid linux/fs/9p/fid.c:181 v9fs_fid_lookup+0xbf/0xc20 linux/fs/9p/fid.c:314 v9fs_vfs_getattr_dotl+0xf9/0x360 linux/fs/9p/vfs_inode_dotl.c:400 vfs_statx+0xdd/0x4d0 linux/fs/stat.c:248 Freed by: p9_fid_destroy (inlined) p9_client_clunk+0xb0/0xe0 linux/net/9p/client.c:1456 p9_fid_put linux/./include/net/9p/client.h:278 v9fs_dentry_release+0xb5/0x140 linux/fs/9p/vfs_dentry.c:55 v9fs_remove+0x38f/0x620 linux/fs/9p/vfs_inode.c:518 vfs_unlink+0x29a/0x810 linux/fs/namei.c:4335 The problem is that d_fsdata was not accessed under d_lock, because d_release() normally is only called once the dentry is otherwise no longer accessible but since we also call it explicitly in v9fs_remove that lock is required: move the hlist out of the dentry under lock then unref its fids once they are no longer accessible. Fixes: `154372e67d` ("fs/9p: fix create-unlink-getattr idiom") Cc: stable@vger.kernel.org Reported-by: Meysam Firouzi Reported-by: Amirmohammad Eftekhar Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Message-ID: <20240521122947.1080227-1-asmadeus@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-05-23 20:29:09 +09:00
Guilherme G. Piccoli	7c23b186ab	efi: pstore: Return proper errors on UEFI failures Right now efi-pstore either returns 0 (success) or -EIO; but we do have a function to convert UEFI errors in different standard error codes, helping to narrow down potential issues more accurately. So, let's use this helper here. Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-05-23 09:09:39 +02:00
Nathan Chancellor	5134acb15d	efi/libstub: zboot.lds: Discard .discard sections When building ARCH=loongarch defconfig + CONFIG_UNWINDER_ORC=y using LLVM, there is a warning from ld.lld when linking the EFI zboot image due to the use of unreachable() in number() in vsprintf.c: ld.lld: warning: drivers/firmware/efi/libstub/lib.a(vsprintf.stub.o):(.discard.unreachable+0x0): has non-ABS relocation R_LARCH_32_PCREL against symbol '' If the compiler cannot eliminate the default case for any reason, the .discard.unreachable section will remain in the final binary but the entire point of any section prefixed with .discard is that it is only used at compile time, so it can be discarded via /DISCARD/ in a linker script. The asm-generic vmlinux.lds.h includes .discard and .discard.* in the COMMON_DISCARDS macro but that is not used for zboot.lds, as it is not a kernel image linker script. Add .discard and .discard.* to /DISCARD/ in zboot.lds, so that any sections meant to be discarded at link time are not included in the final zboot image. This issue is not specific to LoongArch, it is just the first architecture to select CONFIG_OBJTOOL, which defines annotate_unreachable() as an asm statement to add the .discard.unreachable section, and use the EFI stub. Closes: https://github.com/ClangBuiltLinux/linux/issues/2023 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-05-23 09:02:39 +02:00
Sagi Grimberg	134d0b3f24	nfs: propagate readlink errors in nfs_symlink_filler There is an inherent race where a symlink file may have been overriden (by a different client) between lookup and readlink, resulting in a spurious EIO error returned to userspace. Fix this by propagating back ESTALE errors such that the vfs will retry the lookup/get_link (similar to nfs4_file_open) at least once. Cc: Dan Aloni <dan.aloni@vastdata.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-22 19:25:00 -04:00
Trond Myklebust	6cbe14f42b	MAINTAINERS: Change email address for Trond Myklebust Ensure that patches are sent to an email server that won't corrupt them. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-22 11:27:32 -04:00
Dmitry Mastykin	aad11473f8	NFSv4: Fix memory leak in nfs4_set_security_label We leak nfs_fattr and nfs4_label every time we set a security xattr. Signed-off-by: Dmitry Mastykin <mastichi@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-05-22 11:27:04 -04:00
Xu Kuohai	8d00547ea8	MAINTAINERS: Add myself as reviewer of ARM64 BPF JIT I am working on ARM64 BPF JIT for a while, hence add myself as reviewer. Signed-off-by: Xu Kuohai <xukuohai@huaweicloud.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/r/20240516020928.156125-1-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-21 11:00:30 -07:00
Nilay Shroff	d3a043733f	nvme-multipath: find NUMA path only for online numa-node In current native multipath design when a shared namespace is created, we loop through each possible numa-node, calculate the NUMA distance of that node from each nvme controller and then cache the optimal IO path for future reference while sending IO. The issue with this design is that we may refer to the NUMA distance table for an offline node which may not be populated at the time and so we may inadvertently end up finding and caching a non-optimal path for IO. Then latter when the corresponding numa-node becomes online and hence the NUMA distance table entry for that node is created, ideally we should re-calculate the multipath node distance for the newly added node however that doesn't happen unless we rescan/reset the controller. So essentially, we may keep using non-optimal IO path for a node which is made online after namespace is created. This patch helps fix this issue ensuring that when a shared namespace is created, we calculate the multipath node distance for each online numa-node instead of each possible numa-node. Then latter when a node becomes online and we receive any IO on that newly added node, we would calculate the multipath node distance for newly added node but this time NUMA distance table would have been already populated for newly added node. Hence we would be able to correctly calculate the multipath node distance and choose the optimal path for the IO. Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-05-21 06:43:08 -07:00
Wachowski, Karol	39bc27bd68	drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE) Lack of check for copy-on-write (COW) mapping in drm_gem_shmem_mmap allows users to call mmap with PROT_WRITE and MAP_PRIVATE flag causing a kernel panic due to BUG_ON in vmf_insert_pfn_prot: BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags)); Return -EINVAL early if COW mapping is detected. This bug affects all drm drivers using default shmem helpers. It can be reproduced by this simple example: void *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset); ptr[0] = 0; Fixes: `2194a63a81` ("drm: Add library for shmem backed GEM objects") Cc: Noralf Trønnes <noralf@tronnes.org> Cc: Eric Anholt <eric@anholt.net> Cc: Rob Herring <robh@kernel.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240520100514.925681-1-jacek.lawrynowicz@linux.intel.com	2024-05-21 14:38:51 +02:00
Nikita Zhandarovich	25460d6f39	net/9p: fix uninit-value in p9_client_rpc() Syzbot with the help of KMSAN reported the following error: BUG: KMSAN: uninit-value in trace_9p_client_res include/trace/events/9p.h:146 [inline] BUG: KMSAN: uninit-value in p9_client_rpc+0x1314/0x1340 net/9p/client.c:754 trace_9p_client_res include/trace/events/9p.h:146 [inline] p9_client_rpc+0x1314/0x1340 net/9p/client.c:754 p9_client_create+0x1551/0x1ff0 net/9p/client.c:1031 v9fs_session_init+0x1b9/0x28e0 fs/9p/v9fs.c:410 v9fs_mount+0xe2/0x12b0 fs/9p/vfs_super.c:122 legacy_get_tree+0x114/0x290 fs/fs_context.c:662 vfs_get_tree+0xa7/0x570 fs/super.c:1797 do_new_mount+0x71f/0x15e0 fs/namespace.c:3352 path_mount+0x742/0x1f20 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount+0x725/0x810 fs/namespace.c:3875 __x64_sys_mount+0xe4/0x150 fs/namespace.c:3875 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 Uninit was created at: __alloc_pages+0x9d6/0xe70 mm/page_alloc.c:4598 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] alloc_slab_page mm/slub.c:2175 [inline] allocate_slab mm/slub.c:2338 [inline] new_slab+0x2de/0x1400 mm/slub.c:2391 ___slab_alloc+0x1184/0x33d0 mm/slub.c:3525 __slab_alloc mm/slub.c:3610 [inline] __slab_alloc_node mm/slub.c:3663 [inline] slab_alloc_node mm/slub.c:3835 [inline] kmem_cache_alloc+0x6d3/0xbe0 mm/slub.c:3852 p9_tag_alloc net/9p/client.c:278 [inline] p9_client_prepare_req+0x20a/0x1770 net/9p/client.c:641 p9_client_rpc+0x27e/0x1340 net/9p/client.c:688 p9_client_create+0x1551/0x1ff0 net/9p/client.c:1031 v9fs_session_init+0x1b9/0x28e0 fs/9p/v9fs.c:410 v9fs_mount+0xe2/0x12b0 fs/9p/vfs_super.c:122 legacy_get_tree+0x114/0x290 fs/fs_context.c:662 vfs_get_tree+0xa7/0x570 fs/super.c:1797 do_new_mount+0x71f/0x15e0 fs/namespace.c:3352 path_mount+0x742/0x1f20 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount+0x725/0x810 fs/namespace.c:3875 __x64_sys_mount+0xe4/0x150 fs/namespace.c:3875 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75 If p9_check_errors() fails early in p9_client_rpc(), req->rc.tag will not be properly initialized. However, trace_9p_client_res() ends up trying to print it out anyway before p9_client_rpc() finishes. Fix this issue by assigning default values to p9_fcall fields such as 'tag' and (just in case KMSAN unearths something new) 'id' during the tag allocation stage. Reported-and-tested-by: syzbot+ff14db38f56329ef68df@syzkaller.appspotmail.com Fixes: `348b59012e` ("net/9p: Convert net/9p protocol dumps to tracepoints") Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Cc: stable@vger.kernel.org Message-ID: <20240408141039.30428-1-n.zhandarovich@fintech.ru> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2024-05-21 21:27:28 +09:00
Shengjiu Wang	e8dc41afca	pmdomain: imx: gpcv2: Add delay after power up handshake AudioMix BLK-CTRL on i.MX8MP encountered an accessing register issue after power up. [ 2.181035] Kernel panic - not syncing: Asynchronous SError Interrupt [ 2.181038] CPU: 1 PID: 48 Comm: kworker/u16:2 Not tainted 6.9.0-rc5-next-20240424-00003-g21cec88845c6 #171 [ 2.181047] Hardware name: NXP i.MX8MPlus EVK board (DT) [ 2.181050] Workqueue: events_unbound deferred_probe_work_func [ 2.181064] Call trace: [...] [ 2.181142] arm64_serror_panic+0x6c/0x78 [ 2.181149] do_serror+0x3c/0x70 [ 2.181157] el1h_64_error_handler+0x30/0x48 [ 2.181164] el1h_64_error+0x64/0x68 [ 2.181171] clk_imx8mp_audiomix_runtime_resume+0x34/0x44 [ 2.181183] __genpd_runtime_resume+0x30/0x80 [ 2.181195] genpd_runtime_resume+0x110/0x244 [ 2.181205] __rpm_callback+0x48/0x1d8 [ 2.181213] rpm_callback+0x68/0x74 [ 2.181224] rpm_resume+0x468/0x6c0 [ 2.181234] __pm_runtime_resume+0x50/0x94 [ 2.181243] pm_runtime_get_suppliers+0x60/0x8c [ 2.181258] __driver_probe_device+0x48/0x12c [ 2.181268] driver_probe_device+0xd8/0x15c [ 2.181278] __device_attach_driver+0xb8/0x134 [ 2.181290] bus_for_each_drv+0x84/0xe0 [ 2.181302] __device_attach+0x9c/0x188 [ 2.181312] device_initial_probe+0x14/0x20 [ 2.181323] bus_probe_device+0xac/0xb0 [ 2.181334] deferred_probe_work_func+0x88/0xc0 [ 2.181344] process_one_work+0x150/0x290 [ 2.181357] worker_thread+0x2f8/0x408 [ 2.181370] kthread+0x110/0x114 [ 2.181381] ret_from_fork+0x10/0x20 [ 2.181391] SMP: stopping secondary CPUs According to comments in power up handshake: /* request the ADB400 to power up / if (domain->bits.hskreq) { regmap_update_bits(domain->regmap, domain->regs->hsk, domain->bits.hskreq, domain->bits.hskreq); / * ret = regmap_read_poll_timeout(domain->regmap, domain->regs->hsk, reg_val, * (reg_val & domain->bits.hskack), 0, * USEC_PER_MSEC); * Technically we need the commented code to wait handshake. But that needs * the BLK-CTL module BUS clk-en bit being set. * * There is a separate BLK-CTL module and we will have such a driver for it, * that driver will set the BUS clk-en bit and handshake will be triggered * automatically there. Just add a delay and suppose the handshake finish * after that. */ } The BLK-CTL module needs to add delay to wait for a handshake request finished. For some BLK-CTL module (eg. AudioMix on i.MX8MP) doesn't have BUS clk-en bit, it is better to add delay in this driver, as the BLK-CTL module doesn't need to care about how it is powered up. regmap_read_bypassed() is to make sure the above write IO transaction already reaches target before udelay(). Fixes: `1496dd413b` ("clk: imx: imx8mp: Add pm_runtime support for power saving") Reported-by: Francesco Dolcini <francesco@dolcini.it> Closes: https://lore.kernel.org/all/66293535.170a0220.21fe.a2e7@mx.google.com/ Suggested-by: Frank Li <frank.li@nxp.com> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Tested-by: Adam Ford <aford173@gmail.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Link: https://lore.kernel.org/r/1715396125-3724-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-05-21 12:21:01 +02:00
Pablo Neira Ayuso	aff5c01fa1	netfilter: nft_payload: restore vlan q-in-q match support Revert `f6ae9f120d` ("netfilter: nft_payload: add C-VLAN support"). `f41f72d09e` ("netfilter: nft_payload: simplify vlan header handling") already allows to match on inner vlan tags by subtract the vlan header size to the payload offset which has been popped and stored in skbuff metadata fields. Fixes: `f6ae9f120d` ("netfilter: nft_payload: add C-VLAN support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-20 20:38:42 +02:00
Alexander Maltsev	c1193d9bbb	netfilter: ipset: Add list flush to cancel_gc Flushing list in cancel_gc drops references to other lists right away, without waiting for RCU to destroy list. Fixes race when referenced ipsets can't be destroyed while referring list is scheduled for destroy. Fixes: `97f7cf1cd8` ("netfilter: ipset: fix performance regression in swap operation") Signed-off-by: Alexander Maltsev <keltar.gw@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-20 20:38:42 +02:00
Dmitry Baryshkov	21ae74e1bf	wifi: ath10k: fix QCOM_RPROC_COMMON dependency If ath10k_snoc is built-in, while Qualcomm remoteprocs are built as modules, compilation fails with: /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_init': drivers/net/wireless/ath/ath10k/snoc.c:1534: undefined reference to `qcom_register_ssr_notifier' /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_deinit': drivers/net/wireless/ath/ath10k/snoc.c:1551: undefined reference to `qcom_unregister_ssr_notifier' Add corresponding dependency to ATH10K_SNOC Kconfig entry so that it's built as module if QCOM_RPROC_COMMON is built as module too. Fixes: `747ff7d3d7` ("ath10k: Don't always treat modem stop events as crashes") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240511-ath10k-snoc-dep-v1-1-9666e3af5c27@linaro.org	2024-05-20 14:48:34 +03:00
Eric Dumazet	dc21c6cc3d	netfilter: nfnetlink_queue: acquire rcu_read_lock() in instance_destroy_rcu() syzbot reported that nf_reinject() could be called without rcu_read_lock() : WARNING: suspicious RCU usage 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Not tainted net/netfilter/nfnetlink_queue.c:263 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by syz-executor.4/13427: #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline] #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2190 [inline] #0: ffffffff8e334f60 (rcu_callback){....}-{0:0}, at: rcu_core+0xa86/0x1830 kernel/rcu/tree.c:2471 #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: nfqnl_flush net/netfilter/nfnetlink_queue.c:405 [inline] #1: ffff88801ca92958 (&inst->lock){+.-.}-{2:2}, at: instance_destroy_rcu+0x30/0x220 net/netfilter/nfnetlink_queue.c:172 stack backtrace: CPU: 0 PID: 13427 Comm: syz-executor.4 Not tainted 6.9.0-rc7-syzkaller-02060-g5c1672705a1a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712 nf_reinject net/netfilter/nfnetlink_queue.c:323 [inline] nfqnl_reinject+0x6ec/0x1120 net/netfilter/nfnetlink_queue.c:397 nfqnl_flush net/netfilter/nfnetlink_queue.c:410 [inline] instance_destroy_rcu+0x1ae/0x220 net/netfilter/nfnetlink_queue.c:172 rcu_do_batch kernel/rcu/tree.c:2196 [inline] rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2471 handle_softirqs+0x2d6/0x990 kernel/softirq.c:554 __do_softirq kernel/softirq.c:588 [inline] invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637 irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043 </IRQ> <TASK> Fixes: `9872bec773` ("[NETFILTER]: nfnetlink: use RCU for queue instances hash") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-05-20 11:38:42 +02:00
Matthew Auld	520fb7f183	drm/tests/buddy: stop using PAGE_SIZE Gives the wrong impression that min page-size has to be tied to the CPU PAGE_SIZE. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240229105112.250077-4-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2024-05-17 12:57:39 +02:00
Matthew Auld	117bbc0e43	drm/buddy: stop using PAGE_SIZE The drm_buddy minimum page-size requirements should be distinct from the CPU PAGE_SIZE. Only restriction is that the minimum page-size is at least 4K. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240229105112.250077-3-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com>	2024-05-17 12:56:42 +02:00
Breno Leitao	637c435f08	wifi: ath11k: Fix error path in ath11k_pcic_ext_irq_config If one of the dummy allocation fails in ath11k_pcic_ext_irq_config(), the previous allocated devices might leak due to returning without deallocating the devices. Instead of returning on the error path, deallocate all the previously allocated net_devices and then return. Fixes: `bca592ead8` ("wifi: ath11k: allocate dummy net_device dynamically") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240508185902.70975-1-leitao@debian.org	2024-05-17 09:54:32 +03:00
Deming Wang	e4f5f8298c	scsi: mpt3sas: Add missing kerneldoc parameter descriptions Add missing kerneldoc parameter descriptions to _scsih_set_debug_level(). Signed-off-by: Deming Wang <wangdeming@inspur.com> Link: https://lore.kernel.org/r/20240513115956.1576-1-wangdeming@inspur.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 10:38:09 -04:00
Saurav Kashyap	6c3bb589de	scsi: qedf: Set qed_slowpath_params to zero before use Zero qed_slowpath_params before use. Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240515091101.18754-4-skashyap@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 10:26:53 -04:00
Saurav Kashyap	78e88472b6	scsi: qedf: Wait for stag work during unload If stag work is already scheduled and unload is called, it can lead to issues as unload cleans up the work element. Wait for stag work to get completed before cleanup during unload. Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240515091101.18754-3-skashyap@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 10:26:53 -04:00
Saurav Kashyap	51071f0831	scsi: qedf: Don't process stag work during unload and recovery Stag work can cause issues during unload and recovery, hence don't process it. Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240515091101.18754-2-skashyap@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 10:26:53 -04:00
Justin Stitt	9fad9d560a	scsi: sr: Fix unintentional arithmetic wraparound Running syzkaller with the newly reintroduced signed integer overflow sanitizer produces this report: [ 65.194362] ------------[ cut here ]------------ [ 65.197752] UBSAN: signed-integer-overflow in ../drivers/scsi/sr_ioctl.c:436:9 [ 65.203607] -2147483648 * 177 cannot be represented in type 'int' [ 65.207911] CPU: 2 PID: 10416 Comm: syz-executor.1 Not tainted 6.8.0-rc2-00035-gb3ef86b5a957 #1 [ 65.213585] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 65.219923] Call Trace: [ 65.221556] <TASK> [ 65.223029] dump_stack_lvl+0x93/0xd0 [ 65.225573] handle_overflow+0x171/0x1b0 [ 65.228219] sr_select_speed+0xeb/0xf0 [ 65.230786] ? __pm_runtime_resume+0xe6/0x130 [ 65.233606] sr_block_ioctl+0x15d/0x1d0 ... Historically, the signed integer overflow sanitizer did not work in the kernel due to its interaction with `-fwrapv` but this has since been changed [1] in the newest version of Clang. It was re-enabled in the kernel with Commit `557f8c582a` ("ubsan: Reintroduce signed overflow sanitizer"). Firstly, let's change the type of "speed" to unsigned long as sr_select_speed()'s only caller passes in an unsigned long anyways. $ git grep '\.select_speed' \| drivers/scsi/sr.c: .select_speed = sr_select_speed, ... \| static int cdrom_ioctl_select_speed(struct cdrom_device_info *cdi, \| unsigned long arg) \| { \| ... \| return cdi->ops->select_speed(cdi, arg); \| } Next, let's add an extra check to make sure we don't exceed 0xffff/177 (350) since 0xffff is the max speed. This has two benefits: 1) we deal with integer overflow before it happens and 2) we properly respect the max speed of 0xffff. There are some "magic" numbers here but I did not want to change more than what was necessary. Link: https://github.com/llvm/llvm-project/pull/82432 [1] Closes: https://github.com/KSPP/linux/issues/357 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20240508-b4-b4-sio-sr_select_speed-v2-1-00b68f724290@google.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 10:05:24 -04:00
Martin Wilck	10157b1fc1	scsi: core: alua: I/O errors for ALUA state transitions When a host is configured with a few LUNs and I/O is running, injecting FC faults repeatedly leads to path recovery problems. The LUNs have 4 paths each and 3 of them come back active after say an FC fault which makes 2 of the paths go down, instead of all 4. This happens after several iterations of continuous FC faults. Reason here is that we're returning an I/O error whenever we're encountering sense code 06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION) instead of retrying. [mwilck: The original patch was developed by Rajashekhar M A and Hannes Reinecke. I moved the code to alua_check_sense() as suggested by Mike Christie [1]. Evan Milne had raised the question whether pg->state should be set to transitioning in the UA case [2]. I believe that doing this is correct. SCSI_ACCESS_STATE_TRANSITIONING by itself doesn't cause I/O errors. Our handler schedules an RTPG, which will only result in an I/O error condition if the transitioning timeout expires.] [1] https://lore.kernel.org/all/0bc96e82-fdda-4187-148d-5b34f81d4942@oracle.com/ [2] https://lore.kernel.org/all/CAGtn9r=kicnTDE2o7Gt5Y=yoidHYD7tG8XdMHEBJTBraVEoOCw@mail.gmail.com/ Co-developed-by: Rajashekhar M A <rajs@netapp.com> Co-developed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin Wilck <martin.wilck@suse.com> Link: https://lore.kernel.org/r/20240514140344.19538-1-mwilck@suse.com Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 09:46:13 -04:00
Nathan Chancellor	9f365cb8bb	scsi: mpi3mr: Use proper format specifier in mpi3mr_sas_port_add() When building for a 32-bit platform such as ARM or i386, for which size_t is unsigned int, there is a warning due to using an unsigned long format specifier: drivers/scsi/mpi3mr/mpi3mr_transport.c:1370:11: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat] 1369 \| ioc_warn(mrioc, "skipping port %u, max allowed value is %lu\n", \| ~~~ \| %u 1370 \| i, sizeof(mr_sas_port->phy_mask) * 8); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use the proper format specifier for size_t, %zu, to resolve the warning for all platforms. Fixes: `3668651def` ("scsi: mpi3mr: Sanitise num_phys") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20240514-mpi3mr-fix-wformat-v1-1-f1ad49217e5e@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-05-15 09:41:39 -04:00
Mohamed Ahmed	aed9a1a4f7	drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations Allow PTE kind and tile mode on BO create with VM_BIND, and add a GETPARAM to indicate this change. This is needed to support modifiers in NVK and ensure correctness when dealing with the nouveau GL driver. The userspace modifiers implementation this is for can be found here: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24795 Fixes: `b88baab828` ("drm/nouveau: implement new VM_BIND uAPI") Signed-off-by: Mohamed Ahmed <mohamedahmedegypt2001@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240509204352.7597-1-mohamedahmedegypt2001@gmail.com	2024-05-13 22:17:58 +02:00

863 changed files with 9341 additions and 5218 deletions

									
										3

.editorconfig
									
												View File
												
				@@ -5,7 +5,6 @@ root = true

				[{*.{awk,c,dts,dtsi,dtso,h,mk,s,S},Kconfig,Makefile,Makefile.*}]

				charset = utf-8

				end_of_line = lf

				trim_trailing_whitespace = true

				insert_final_newline = true

				indent_style = tab

				indent_size = 8

				@@ -13,7 +12,6 @@ indent_size = 8

				[*.{json,py,rs}]

				charset = utf-8

				end_of_line = lf

				trim_trailing_whitespace = true

				insert_final_newline = true

				indent_style = space

				indent_size = 4

				@@ -26,7 +24,6 @@ indent_size = 8

				[*.yaml]

				charset = utf-8

				end_of_line = lf

				trim_trailing_whitespace = unset

				insert_final_newline = true

				indent_style = space

				indent_size = 2

12

.mailmap

View File

@@ -72,6 +72,8 @@ Andrey Ryabinin <ryabinin.a.a@gmail.com> <aryabinin@virtuozzo.com>
 Andrzej Hajda <andrzej.hajda@intel.com> <a.hajda@samsung.com>
 André Almeida <andrealmeid@igalia.com> <andrealmeid@collabora.com>
 Andy Adamson <andros@citi.umich.edu>
 Andy Shevchenko <andy@kernel.org> <andy@smile.org.ua>
 Andy Shevchenko <andy@kernel.org> <ext-andriy.shevchenko@nokia.com>
 Anilkumar Kolli <quic_akolli@quicinc.com> <akolli@codeaurora.org>
 Anirudh Ghayal <quic_aghayal@quicinc.com> <aghayal@codeaurora.org>
 Antoine Tenart <atenart@kernel.org> <antoine.tenart@bootlin.com>
@@ -217,6 +219,7 @@ Geliang Tang <geliang@kernel.org> <geliang.tang@suse.com>
 Geliang Tang <geliang@kernel.org> <geliangtang@xiaomi.com>
 Geliang Tang <geliang@kernel.org> <geliangtang@gmail.com>
 Geliang Tang <geliang@kernel.org> <geliangtang@163.com>
 Geliang Tang <geliang@kernel.org> <tanggeliang@kylinos.cn>
 Georgi Djakov <djakov@kernel.org> <georgi.djakov@linaro.org>
 Gerald Schaefer <gerald.schaefer@linux.ibm.com> <geraldsc@de.ibm.com>
 Gerald Schaefer <gerald.schaefer@linux.ibm.com> <gerald.schaefer@de.ibm.com>
@@ -337,10 +340,11 @@ Kalyan Thota <quic_kalyant@quicinc.com> <kalyan_t@codeaurora.org>
 Karthikeyan Periyasamy <quic_periyasa@quicinc.com> <periyasa@codeaurora.org>
 Kathiravan T <quic_kathirav@quicinc.com> <kathirav@codeaurora.org>
 Kay Sievers <kay.sievers@vrfy.org>
 Kees Cook <keescook@chromium.org> <kees.cook@canonical.com>
 Kees Cook <keescook@chromium.org> <keescook@google.com>
 Kees Cook <keescook@chromium.org> <kees@outflux.net>
 Kees Cook <keescook@chromium.org> <kees@ubuntu.com>
 Kees Cook <kees@kernel.org> <kees.cook@canonical.com>
 Kees Cook <kees@kernel.org> <keescook@chromium.org>
 Kees Cook <kees@kernel.org> <keescook@google.com>
 Kees Cook <kees@kernel.org> <kees@outflux.net>
 Kees Cook <kees@kernel.org> <kees@ubuntu.com>
 Keith Busch <kbusch@kernel.org> <keith.busch@intel.com>
 Keith Busch <kbusch@kernel.org> <keith.busch@linux.intel.com>
 Kenneth W Chen <kenneth.w.chen@intel.com>

									
										35

Documentation/admin-guide/LSM/tomoyo.rst
									
												View File
												
				@@ -9,8 +9,8 @@ TOMOYO is a name-based MAC extension (LSM module) for the Linux kernel.

				LiveCD-based tutorials are available at

				http://tomoyo.sourceforge.jp/1.8/ubuntu12.04-live.html

				http://tomoyo.sourceforge.jp/1.8/centos6-live.html

				https://tomoyo.sourceforge.net/1.8/ubuntu12.04-live.html

				https://tomoyo.sourceforge.net/1.8/centos6-live.html

				Though these tutorials use non-LSM version of TOMOYO, they are useful for you

				to know what TOMOYO is.

				@@ -21,45 +21,32 @@ How to enable TOMOYO?

				Build the kernel with ``CONFIG_SECURITY_TOMOYO=y`` and pass ``security=tomoyo`` on

				kernel's command line.

				Please see http://tomoyo.osdn.jp/2.5/ for details.

				Please see https://tomoyo.sourceforge.net/2.6/ for details.

				Where is documentation?

				=======================

				User <-> Kernel interface documentation is available at

				https://tomoyo.osdn.jp/2.5/policy-specification/index.html .

				https://tomoyo.sourceforge.net/2.6/policy-specification/index.html .

				Materials we prepared for seminars and symposiums are available at

				https://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .

				https://sourceforge.net/projects/tomoyo/files/docs/ .

				Below lists are chosen from three aspects.

				What is TOMOYO?

				  TOMOYO Linux Overview

				    https://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf

				    https://sourceforge.net/projects/tomoyo/files/docs/lca2009-takeda.pdf

				  TOMOYO Linux: pragmatic and manageable security for Linux

				    https://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf

				    https://sourceforge.net/projects/tomoyo/files/docs/freedomhectaipei-tomoyo.pdf

				  TOMOYO Linux: A Practical Method to Understand and Protect Your Own Linux Box

				    https://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf

				    https://sourceforge.net/projects/tomoyo/files/docs/PacSec2007-en-no-demo.pdf

				What can TOMOYO do?

				  Deep inside TOMOYO Linux

				    https://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf

				    https://sourceforge.net/projects/tomoyo/files/docs/lca2009-kumaneko.pdf

				  The role of "pathname based access control" in security.

				    https://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf

				    https://sourceforge.net/projects/tomoyo/files/docs/lfj2008-bof.pdf

				History of TOMOYO?

				  Realities of Mainlining

				    https://osdn.jp/projects/tomoyo/docs/lfj2008.pdf

				What is future plan?

				====================

				We believe that inode based security and name based security are complementary

				and both should be used together. But unfortunately, so far, we cannot enable

				multiple LSM modules at the same time. We feel sorry that you have to give up

				SELinux/SMACK/AppArmor etc. when you want to use TOMOYO.

				We hope that LSM becomes stackable in future. Meanwhile, you can use non-LSM

				version of TOMOYO, available at http://tomoyo.osdn.jp/1.8/ .

				LSM version of TOMOYO is a subset of non-LSM version of TOMOYO. We are planning

				to port non-LSM version's functionalities to LSM versions.

				    https://sourceforge.net/projects/tomoyo/files/docs/lfj2008.pdf

22

Documentation/admin-guide/kernel-parameters.txt

View File

@@ -1921,6 +1921,28 @@
 				Format:
 				<bus_id>,<clkrate>
 	i2c_touchscreen_props= [HW,ACPI,X86]
 			Set device-properties for ACPI-enumerated I2C-attached
 			touchscreen, to e.g. fix coordinates of upside-down
 			mounted touchscreens. If you need this option please
 			submit a drivers/platform/x86/touchscreen_dmi.c patch
 			adding a DMI quirk for this.
 			Format:
 			<ACPI_HW_ID>:<prop_name>=<val>[:prop_name=val][:...]
 			Where <val> is one of:
 			Omit "=<val>" entirely	Set a boolean device-property
 			Unsigned number		Set a u32 device-property
 			Anything else		Set a string device-property
 			Examples (split over multiple lines):
 			i2c_touchscreen_props=GDIX1001:touchscreen-inverted-x:
 			touchscreen-inverted-y
 			i2c_touchscreen_props=MSSL1680:touchscreen-size-x=1920:
 			touchscreen-size-y=1080:touchscreen-inverted-y:
 			firmware-name=gsl1680-vendor-model.fw:silead,home-button
 	i8042.debug	[HW] Toggle i8042 debug mode
 	i8042.unmask_kbd_data
 			[HW] Enable printing of interrupt data from the KBD port

									
										4

Documentation/admin-guide/mm/transhuge.rst
									
												View File
												
				@@ -467,11 +467,11 @@ anon_fault_fallback_charge

					instead falls back to using huge pages with lower orders or

					small pages even though the allocation was successful.

				anon_swpout

				swpout

					is incremented every time a huge page is swapped out in one

					piece without splitting.

				anon_swpout_fallback

				swpout_fallback

					is incremented if a huge page has to be split before swapout.

					Usually because failed to allocate some continuous swap space

					for the huge page.

									
										4

Documentation/arch/riscv/uabi.rst
									
												View File
												
				@@ -65,4 +65,6 @@ the extension, or may have deliberately removed it from the listing.

				Misaligned accesses

				-------------------

				Misaligned accesses are supported in userspace, but they may perform poorly.

				Misaligned scalar accesses are supported in userspace, but they may perform

				poorly.  Misaligned vector accesses are only supported if the Zicclsm extension

				is supported.

									
										4

Documentation/cdrom/cdrom-standard.rst
									
												View File
												
				@@ -217,7 +217,7 @@ current *struct* is::

						int (*media_changed)(struct cdrom_device_info *, int);

						int (*tray_move)(struct cdrom_device_info *, int);

						int (*lock_door)(struct cdrom_device_info *, int);

						int (*select_speed)(struct cdrom_device_info *, int);

						int (*select_speed)(struct cdrom_device_info *, unsigned long);

						int (*get_last_session) (struct cdrom_device_info *,

									 struct cdrom_multisession *);

						int (*get_mcn)(struct cdrom_device_info *, struct cdrom_mcn *);

				@@ -396,7 +396,7 @@ action need be taken, and the return value should be 0.

				::

					int select_speed(struct cdrom_device_info *cdi, int speed)

					int select_speed(struct cdrom_device_info *cdi, unsigned long speed)

				Some CD-ROM drives are capable of changing their head-speed. There

				are several reasons for changing the speed of a CD-ROM drive. Badly

									
										2

Documentation/core-api/swiotlb.rst
									
												View File
												
				@@ -192,7 +192,7 @@ alignment larger than PAGE_SIZE.

				Dynamic swiotlb

				---------------

				When CONFIG_DYNAMIC_SWIOTLB is enabled, swiotlb can do on-demand expansion of

				When CONFIG_SWIOTLB_DYNAMIC is enabled, swiotlb can do on-demand expansion of

				the amount of memory available for allocation as bounce buffers. If a bounce

				buffer request fails due to lack of available space, an asynchronous background

				task is kicked off to allocate memory from general system memory and turn it

									
										3

Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml
									
												View File
												
				@@ -54,11 +54,10 @@ unevaluatedProperties: false

				examples:

				  - |

				    mlahb: ahb@38000000 {

				    ahb {

				      compatible = "st,mlahb", "simple-bus";

				      #address-cells = <1>;

				      #size-cells = <1>;

				      reg = <0x10000000 0x40000>;

				      ranges;

				      dma-ranges = <0x00000000 0x38000000 0x10000>,

				                   <0x10000000 0x10000000 0x60000>,

									
										6

Documentation/devicetree/bindings/arm/sunxi.yaml
									
												View File
												
				@@ -57,17 +57,17 @@ properties:

				          - const: allwinner,sun8i-v3s

				      - description: Anbernic RG35XX (2024)

				      - items:

				        items:

				          - const: anbernic,rg35xx-2024

				          - const: allwinner,sun50i-h700

				      - description: Anbernic RG35XX Plus

				      - items:

				        items:

				          - const: anbernic,rg35xx-plus

				          - const: allwinner,sun50i-h700

				      - description: Anbernic RG35XX H

				      - items:

				        items:

				          - const: anbernic,rg35xx-h

				          - const: allwinner,sun50i-h700

									
										2

Documentation/devicetree/bindings/iio/dac/adi,ad3552r.yaml
									
												View File
												
				@@ -139,7 +139,7 @@ allOf:

				                Voltage output range of the channel as <minimum, maximum>

				                Required connections:

				                  Rfb1x for: 0 to 2.5 V; 0 to 3V; 0 to 5 V;

				                  Rfb2x for: 0 to 10 V; 2.5 to 7.5V; -5 to 5 V;

				                  Rfb2x for: 0 to 10 V; -2.5 to 7.5V; -5 to 5 V;

				              oneOf:

				                - items:

				                    - const: 0

									
										19

Documentation/devicetree/bindings/input/elan,ekth6915.yaml
									
												View File
												
				@@ -18,9 +18,12 @@ allOf:

				properties:

				  compatible:

				    enum:

				      - elan,ekth6915

				      - ilitek,ili2901

				    oneOf:

				      - items:

				          - enum:

				              - elan,ekth5015m

				          - const: elan,ekth6915

				      - const: elan,ekth6915

				  reg:

				    const: 0x10

				@@ -33,6 +36,12 @@ properties:

				  reset-gpios:

				    description: Reset GPIO; not all touchscreens using eKTH6915 hook this up.

				  no-reset-on-power-off:

				    type: boolean

				    description:

				      Reset line is wired so that it can (and should) be left deasserted when

				      the power supply is off.

				  vcc33-supply:

				    description: The 3.3V supply to the touchscreen.

				@@ -58,8 +67,8 @@ examples:

				      #address-cells = <1>;

				      #size-cells = <0>;

				      ap_ts: touchscreen@10 {

				        compatible = "elan,ekth6915";

				      touchscreen@10 {

				        compatible = "elan,ekth5015m", "elan,ekth6915";

				        reg = <0x10>;

				        interrupt-parent = <&tlmm>;

									
										66

Documentation/devicetree/bindings/input/ilitek,ili2901.yaml
									
										Normal file
									
												View File
												
				@@ -0,0 +1,66 @@

				# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)

				%YAML 1.2

				---

				$id: http://devicetree.org/schemas/input/ilitek,ili2901.yaml#

				$schema: http://devicetree.org/meta-schemas/core.yaml#

				title: Ilitek ILI2901 touchscreen controller

				maintainers:

				  - Jiri Kosina <jkosina@suse.com>

				description:

				  Supports the Ilitek ILI2901 touchscreen controller.

				  This touchscreen controller uses the i2c-hid protocol with a reset GPIO.

				allOf:

				  - $ref: /schemas/input/touchscreen/touchscreen.yaml#

				properties:

				  compatible:

				    enum:

				      - ilitek,ili2901

				  reg:

				    maxItems: 1

				  interrupts:

				    maxItems: 1

				  panel: true

				  reset-gpios:

				    maxItems: 1

				  vcc33-supply: true

				  vccio-supply: true

				required:

				  - compatible

				  - reg

				  - interrupts

				  - vcc33-supply

				additionalProperties: false

				examples:

				  - |

				    #include <dt-bindings/gpio/gpio.h>

				    #include <dt-bindings/interrupt-controller/irq.h>

				    i2c {

				      #address-cells = <1>;

				      #size-cells = <0>;

				      touchscreen@41 {

				        compatible = "ilitek,ili2901";

				        reg = <0x41>;

				        interrupt-parent = <&tlmm>;

				        interrupts = <9 IRQ_TYPE_LEVEL_LOW>;

				        reset-gpios = <&tlmm 8 GPIO_ACTIVE_LOW>;

				        vcc33-supply = <&pp3300_ts>;

				      };

				    };

									
										11

Documentation/devicetree/bindings/net/pse-pd/microchip,pd692x0.yaml
									
												View File
												
				@@ -24,6 +24,7 @@ properties:

				  managers:

				    type: object

				    additionalProperties: false

				    description:

				      List of the PD69208T4/PD69204T4/PD69208M PSE managers. Each manager

				      have 4 or 8 physical ports according to the chip version. No need to

				@@ -47,8 +48,9 @@ properties:

				      - "#size-cells"

				    patternProperties:

				      "^manager@0[0-9a-b]$":

				      "^manager@[0-9a-b]$":

				        type: object

				        additionalProperties: false

				        description:

				          PD69208T4/PD69204T4/PD69208M PSE manager exposing 4 or 8 physical

				          ports.

				@@ -69,9 +71,14 @@ properties:

				        patternProperties:

				          '^port@[0-7]$':

				            type: object

				            additionalProperties: false

				            properties:

				              reg:

				                maxItems: 1

				            required:

				              - reg

				            additionalProperties: false

				        required:

				          - reg

									
										18

Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml
									
												View File
												
				@@ -29,13 +29,31 @@ properties:

				      of the ports conversion matrix that establishes relationship between

				      the logical ports and the physical channels.

				    type: object

				    additionalProperties: false

				    properties:

				      "#address-cells":

				        const: 1

				      "#size-cells":

				        const: 0

				    patternProperties:

				      '^channel@[0-7]$':

				        type: object

				        additionalProperties: false

				        properties:

				          reg:

				            maxItems: 1

				        required:

				          - reg

				    required:

				      - "#address-cells"

				      - "#size-cells"

				unevaluatedProperties: false

				required:

									
										1

Documentation/devicetree/bindings/usb/realtek,rts5411.yaml
									
												View File
												
				@@ -65,6 +65,7 @@ patternProperties:

				    description: The hard wired USB devices

				    type: object

				    $ref: /schemas/usb/usb-device.yaml

				    additionalProperties: true

				required:

				  - peer-hub

									
										12

Documentation/kbuild/kconfig-language.rst
									
												View File
												
				@@ -150,6 +150,12 @@ applicable everywhere (see syntax).

					That will limit the usefulness but on the other hand avoid

					the illegal configurations all over.

					If "select" <symbol> is followed by "if" <expr>, <symbol> will be

					selected by the logical AND of the value of the current menu symbol

					and <expr>. This means, the lower limit can be downgraded due to the

					presence of "if" <expr>. This behavior may seem weird, but we rely on

					it. (The future of this behavior is undecided.)

				- weak reverse dependencies: "imply" <symbol> ["if" <expr>]

				  This is similar to "select" as it enforces a lower limit on another

				@@ -184,7 +190,7 @@ applicable everywhere (see syntax).

				  ability to hook into a secondary subsystem while allowing the user to

				  configure that subsystem out without also having to unset these drivers.

				  Note: If the combination of FOO=y and BAR=m causes a link error,

				  Note: If the combination of FOO=y and BAZ=m causes a link error,

				  you can guard the function call with IS_REACHABLE()::

					foo_init()

				@@ -202,6 +208,10 @@ applicable everywhere (see syntax).

					imply BAR

					imply BAZ

				  Note: If "imply" <symbol> is followed by "if" <expr>, the default of <symbol>

				  will be the logical AND of the value of the current menu symbol and <expr>.

				  (The future of this behavior is undecided.)

				- limiting menu display: "visible if" <expr>

				  This attribute is only applicable to menu blocks, if the condition is

									
										4

Documentation/netlink/specs/netdev.yaml
									
												View File
												
				@@ -349,6 +349,10 @@ attribute-sets:

				          Number of packets dropped due to transient lack of resources, such as

				          buffer space, host descriptors etc.

				        type: uint

				      -

				        name: rx-csum-complete

				        doc: Number of packets that were marked as CHECKSUM_COMPLETE.

				        type: uint

				      -

				        name: rx-csum-unnecessary

				        doc: Number of packets that were marked as CHECKSUM_UNNECESSARY.

									
										31

Documentation/networking/af_xdp.rst
									
												View File
												
				@@ -329,24 +329,23 @@ XDP_SHARED_UMEM option and provide the initial socket's fd in the

				sxdp_shared_umem_fd field as you registered the UMEM on that

				socket. These two sockets will now share one and the same UMEM.

				In this case, it is possible to use the NIC's packet steering

				capabilities to steer the packets to the right queue. This is not

				possible in the previous example as there is only one queue shared

				among sockets, so the NIC cannot do this steering as it can only steer

				between queues.

				There is no need to supply an XDP program like the one in the previous

				case where sockets were bound to the same queue id and

				device. Instead, use the NIC's packet steering capabilities to steer

				the packets to the right queue. In the previous example, there is only

				one queue shared among sockets, so the NIC cannot do this steering. It

				can only steer between queues.

				In libxdp (or libbpf prior to version 1.0), you need to use the

				xsk_socket__create_shared() API as it takes a reference to a FILL ring

				and a COMPLETION ring that will be created for you and bound to the

				shared UMEM. You can use this function for all the sockets you create,

				or you can use it for the second and following ones and use

				xsk_socket__create() for the first one. Both methods yield the same

				result.

				In libbpf, you need to use the xsk_socket__create_shared() API as it

				takes a reference to a FILL ring and a COMPLETION ring that will be

				created for you and bound to the shared UMEM. You can use this

				function for all the sockets you create, or you can use it for the

				second and following ones and use xsk_socket__create() for the first

				one. Both methods yield the same result.

				Note that a UMEM can be shared between sockets on the same queue id

				and device, as well as between queues on the same device and between

				devices at the same time. It is also possible to redirect to any

				socket as long as it is bound to the same umem with XDP_SHARED_UMEM.

				devices at the same time.

				XDP_USE_NEED_WAKEUP bind flag

				-----------------------------

				@@ -823,10 +822,6 @@ A: The short answer is no, that is not supported at the moment. The

				   switch, or other distribution mechanism, in your NIC to direct

				   traffic to the correct queue id and socket.

				   Note that if you are using the XDP_SHARED_UMEM option, it is

				   possible to switch traffic between any socket bound to the same

				   umem.

				Q: My packets are sometimes corrupted. What is wrong?

				A: Care has to be taken not to feed the same buffer in the UMEM into

									
										2

Documentation/process/maintainer-netdev.rst
									
												View File
												
				@@ -227,7 +227,7 @@ preferably including links to previous postings, for example::

				  The amount of mooing will depend on packet rate so should match

				  the diurnal cycle quite well.

				  Signed-of-by: Joe Defarmer <joe@barn.org>

				  Signed-off-by: Joe Defarmer <joe@barn.org>

				  ---

				  v3:

				    - add a note about time-of-day mooing fluctuation to the commit message

									
										2

Documentation/userspace-api/media/v4l/dev-subdev.rst
									
												View File
												
				@@ -582,7 +582,7 @@ depending on the hardware. In all cases, however, only routes that have the

				Devices generating the streams may allow enabling and disabling some of the

				routes or have a fixed routing configuration. If the routes can be disabled, not

				declaring the routes (or declaring them without

				``VIDIOC_SUBDEV_STREAM_FL_ACTIVE`` flag set) in ``VIDIOC_SUBDEV_S_ROUTING`` will

				``V4L2_SUBDEV_STREAM_FL_ACTIVE`` flag set) in ``VIDIOC_SUBDEV_S_ROUTING`` will

				disable the routes. ``VIDIOC_SUBDEV_S_ROUTING`` will still return such routes

				back to the user in the routes array, with the ``V4L2_SUBDEV_STREAM_FL_ACTIVE``

				flag unset.

12

MAINTAINERS

View File

@@ -1107,7 +1107,6 @@ L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/admin-guide/pm/amd-pstate.rst
 F:	drivers/cpufreq/amd-pstate*
 F:	include/linux/amd-pstate.h
 F:	tools/power/x86/amd_pstate_tracer/amd_pstate_trace.py
 AMD PTDMA DRIVER
@@ -3854,6 +3853,7 @@ BPF JIT for ARM64
 M:	Daniel Borkmann <daniel@iogearbox.net>
 M:	Alexei Starovoitov <ast@kernel.org>
 M:	Puranjay Mohan <puranjay@kernel.org>
 R:	Xu Kuohai <xukuohai@huaweicloud.com>
 L:	bpf@vger.kernel.org
 S:	Supported
 F:	arch/arm64/net/
@@ -5187,7 +5187,6 @@ F:	Documentation/devicetree/bindings/media/i2c/chrontel,ch7322.yaml
 F:	drivers/media/cec/i2c/ch7322.c
 CIRRUS LOGIC AUDIO CODEC DRIVERS
 M:	James Schulman <james.schulman@cirrus.com>
 M:	David Rhodes <david.rhodes@cirrus.com>
 M:	Richard Fitzgerald <rf@opensource.cirrus.com>
 L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
@@ -11035,8 +11034,8 @@ F:	include/uapi/drm/i915_drm.h
 INTEL DRM XE DRIVER (Lunar Lake and newer)
 M:	Lucas De Marchi <lucas.demarchi@intel.com>
 M:	Oded Gabbay <ogabbay@kernel.org>
 M:	Thomas Hellström <thomas.hellstrom@linux.intel.com>
 M:	Rodrigo Vivi <rodrigo.vivi@intel.com>
 L:	intel-xe@lists.freedesktop.org
 S:	Supported
 W:	https://drm.pages.freedesktop.org/intel-docs/
@@ -15238,7 +15237,6 @@ F:	drivers/staging/most/
 F:	include/linux/most.h
 MOTORCOMM PHY DRIVER
 M:	Peter Geis <pgwipeout@gmail.com>
 M:	Frank <Frank.Sae@motor-comm.com>
 L:	netdev@vger.kernel.org
 S:	Maintained
@@ -15827,7 +15825,7 @@ F:	drivers/nfc/virtual_ncidev.c
 F:	tools/testing/selftests/nci/
 NFS, SUNRPC, AND LOCKD CLIENTS
 M:	Trond Myklebust <trond.myklebust@hammerspace.com>
 M:	Trond Myklebust <trondmy@kernel.org>
 M:	Anna Schumaker <anna@kernel.org>
 L:	linux-nfs@vger.kernel.org
 S:	Maintained
@@ -21316,7 +21314,7 @@ F:	arch/riscv/boot/dts/starfive/
 STARFIVE DWMAC GLUE LAYER
 M:	Emil Renner Berthing <kernel@esmil.dk>
 M:	Samin Guo <samin.guo@starfivetech.com>
 M:	Minda Chen <minda.chen@starfivetech.com>
 S:	Maintained
 F:	Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
 F:	drivers/net/ethernet/stmicro/stmmac/dwmac-starfive.c
@@ -22679,7 +22677,7 @@ L:	tomoyo-users-en@lists.osdn.me (subscribers-only, for users in English)
 L:	tomoyo-dev@lists.osdn.me (subscribers-only, for developers in Japanese)
 L:	tomoyo-users@lists.osdn.me (subscribers-only, for users in Japanese)
 S:	Maintained
 W:	https://tomoyo.osdn.jp/
 W:	https://tomoyo.sourceforge.net/
 F:	security/tomoyo/
 TOPSTAR LAPTOP EXTRAS DRIVER

									
										2

Makefile
									
												View File
												
				@@ -2,7 +2,7 @@

				VERSION = 6

				PATCHLEVEL = 10

				SUBLEVEL = 0

				EXTRAVERSION = -rc1

				EXTRAVERSION = -rc4

				NAME = Baby Opossum Posse

				# *DOCUMENTATION*

									
										2

arch/arc/net/bpf_jit.h
									
												View File
												
				@@ -39,7 +39,7 @@

				/************** Functions that the back-end must provide **************/

				/* Extension for 32-bit operations. */

				inline u8 zext(u8 *buf, u8 rd);

				u8 zext(u8 *buf, u8 rd);

				/***** Moves *****/

				u8 mov_r32(u8 *buf, u8 rd, u8 rs, u8 sign_ext);

				u8 mov_r32_i32(u8 *buf, u8 reg, s32 imm);

									
										10

arch/arc/net/bpf_jit_arcv2.c
									
												View File
												
				@@ -62,7 +62,7 @@ enum {

				 *   If/when we decide to add ARCv2 instructions that do use register pairs,

				 *   the mapping, hopefully, doesn't need to be revisited.

				 */

				const u8 bpf2arc[][2] = {

				static const u8 bpf2arc[][2] = {

					/* Return value from in-kernel function, and exit value from eBPF */

					[BPF_REG_0] = {ARC_R_8, ARC_R_9},

					/* Arguments from eBPF program to in-kernel function */

				@@ -1302,7 +1302,7 @@ static u8 arc_b(u8 *buf, s32 offset)

				/************* Packers (Deal with BPF_REGs) **************/

				inline u8 zext(u8 *buf, u8 rd)

				u8 zext(u8 *buf, u8 rd)

				{

					if (rd != BPF_REG_FP)

						return arc_movi_r(buf, REG_HI(rd), 0);

				@@ -2235,6 +2235,7 @@ u8 gen_swap(u8 *buf, u8 rd, u8 size, u8 endian, bool force, bool do_zext)

							break;

						default:

							/* The caller must have handled this. */

							break;

						}

					} else {

						/*

				@@ -2253,6 +2254,7 @@ u8 gen_swap(u8 *buf, u8 rd, u8 size, u8 endian, bool force, bool do_zext)

							break;

						default:

							/* The caller must have handled this. */

							break;

						}

					}

				@@ -2517,7 +2519,7 @@ u8 arc_epilogue(u8 *buf, u32 usage, u16 frame_size)

				#define JCC64_NR_OF_JMPS 3	/* Number of jumps in jcc64 template. */

				#define JCC64_INSNS_TO_END 3	/* Number of insn. inclusive the 2nd jmp to end. */

				#define JCC64_SKIP_JMP 1	/* Index of the "skip" jump to "end". */

				const struct {

				static const struct {

					/*

					 * "jit_off" is common between all "jmp[]" and is coupled with

					 * "cond" of each "jmp[]" instance. e.g.:

				@@ -2883,7 +2885,7 @@ u8 gen_jmp_64(u8 *buf, u8 rd, u8 rs, u8 cond, u32 curr_off, u32 targ_off)

				 * The "ARC_CC_SET" becomes "CC_unequal" because of the "tst"

				 * instruction that precedes the conditional branch.

				 */

				const u8 arcv2_32_jmps[ARC_CC_LAST] = {

				static const u8 arcv2_32_jmps[ARC_CC_LAST] = {

					[ARC_CC_UGT] = CC_great_u,

					[ARC_CC_UGE] = CC_great_eq_u,

					[ARC_CC_ULT] = CC_less_u,

									
										22

arch/arc/net/bpf_jit_core.c
									
												View File
												
				@@ -159,7 +159,7 @@ static void jit_dump(const struct jit_context *ctx)

				/* Initialise the context so there's no garbage. */

				static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)

				{

					memset(ctx, 0, sizeof(ctx));

					memset(ctx, 0, sizeof(*ctx));

					ctx->orig_prog = prog;

				@@ -167,7 +167,7 @@ static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)

					ctx->prog = bpf_jit_blind_constants(prog);

					if (IS_ERR(ctx->prog))

						return PTR_ERR(ctx->prog);

					ctx->blinded = (ctx->prog == ctx->orig_prog ? false : true);

					ctx->blinded = (ctx->prog != ctx->orig_prog);

					/* If the verifier doesn't zero-extend, then we have to do it. */

					ctx->do_zext = !ctx->prog->aux->verifier_zext;

				@@ -1182,12 +1182,12 @@ static int jit_prepare(struct jit_context *ctx)

				}

				/*

				 * All the "handle_*()" functions have been called before by the

				 * "jit_prepare()". If there was an error, we would know by now.

				 * Therefore, no extra error checking at this point, other than

				 * a sanity check at the end that expects the calculated length

				 * (jit.len) to be equal to the length of generated instructions

				 * (jit.index).

				 * jit_compile() is the real compilation phase. jit_prepare() is

				 * invoked before jit_compile() as a dry-run to make sure everything

				 * will go OK and allocate the necessary memory.

				 *

				 * In the end, jit_compile() checks if it has produced the same number

				 * of instructions as jit_prepare() would.

				 */

				static int jit_compile(struct jit_context *ctx)

				{

				@@ -1407,9 +1407,9 @@ static struct bpf_prog *do_extra_pass(struct bpf_prog *prog)

				/*

				 * This function may be invoked twice for the same stream of BPF

				 * instructions. The "extra pass" happens, when there are "call"s

				 * involved that their addresses are not known during the first

				 * invocation.

				 * instructions. The "extra pass" happens, when there are

				 * (re)locations involved that their addresses are not known

				 * during the first run.

				 */

				struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)

				{

									
										17

arch/arm/kernel/ftrace.c
									
												View File
												
				@@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,

					unsigned long old;

					if (unlikely(atomic_read(&current->tracing_graph_pause)))

				err_out:

						return;

					if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {

						/* FP points one word below parent's top of stack */

						frame_pointer += 4;

						/*

						 * Usually, the stack frames are contiguous in memory but cases

						 * have been observed where the next stack frame does not live

						 * at 'frame_pointer + 4' as this code used to assume.

						 *

						 * Instead, dereference the field in the stack frame that

						 * stores the SP of the calling frame: to avoid unbounded

						 * recursion, this cannot involve any ftrace instrumented

						 * functions, so use the __get_kernel_nofault() primitive

						 * directly.

						 */

						__get_kernel_nofault(&frame_pointer,

								     (unsigned long *)(frame_pointer - 8),

								     unsigned long, err_out);

					} else {

						struct stackframe frame = {

							.fp = frame_pointer,

1

arch/arm64/Kconfig

View File

@@ -1649,6 +1649,7 @@ config RODATA_FULL_DEFAULT_ENABLED
 config ARM64_SW_TTBR0_PAN
 	bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
 	depends on !KCSAN
 	help
 	  Enabling this option prevents the kernel from accessing
 	  user-space memory directly by pointing TTBR0_EL1 to a reserved

									
										3

arch/arm64/include/asm/asm-extable.h
									
												View File
												
				@@ -112,6 +112,9 @@

				#define _ASM_EXTABLE_KACCESS_ERR(insn, fixup, err)			\

					_ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, err, wzr)

				#define _ASM_EXTABLE_KACCESS(insn, fixup)				\

					_ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, wzr, wzr)

				#define _ASM_EXTABLE_LOAD_UNALIGNED_ZEROPAD(insn, fixup, data, addr)		\

					__DEFINE_ASM_GPR_NUMS							\

					__ASM_EXTABLE_RAW(#insn, #fixup,					\

									
										6

arch/arm64/include/asm/el2_setup.h
									
												View File
												
				@@ -146,7 +146,7 @@

				/* Coprocessor traps */

				.macro __init_el2_cptr

					__check_hvhe .LnVHE_\@, x1

					mov	x0, #(CPACR_EL1_FPEN_EL1EN | CPACR_EL1_FPEN_EL0EN)

					mov	x0, #CPACR_ELx_FPEN

					msr	cpacr_el1, x0

					b	.Lskip_set_cptr_\@

				.LnVHE_\@:

				@@ -277,7 +277,7 @@

					// (h)VHE case

					mrs	x0, cpacr_el1			// Disable SVE traps

					orr	x0, x0, #(CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)

					orr	x0, x0, #CPACR_ELx_ZEN

					msr	cpacr_el1, x0

					b	.Lskip_set_cptr_\@

				@@ -298,7 +298,7 @@

					// (h)VHE case

					mrs	x0, cpacr_el1			// Disable SME traps

					orr	x0, x0, #(CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN)

					orr	x0, x0, #CPACR_ELx_SMEN

					msr	cpacr_el1, x0

					b	.Lskip_set_cptr_sme_\@

									
										36

arch/arm64/include/asm/io.h
									
												View File
												
				@@ -153,8 +153,9 @@ extern void __memset_io(volatile void __iomem *, int, size_t);

				 * emit the large TLP from the CPU.

				 */

				static inline void __const_memcpy_toio_aligned32(volatile u32 __iomem *to,

										 const u32 *from, size_t count)

				static __always_inline void

				__const_memcpy_toio_aligned32(volatile u32 __iomem *to, const u32 *from,

							      size_t count)

				{

					switch (count) {

					case 8:

				@@ -196,24 +197,22 @@ static inline void __const_memcpy_toio_aligned32(volatile u32 __iomem *to,

				void __iowrite32_copy_full(void __iomem *to, const void *from, size_t count);

				static inline void __const_iowrite32_copy(void __iomem *to, const void *from,

									  size_t count)

				static __always_inline void

				__iowrite32_copy(void __iomem *to, const void *from, size_t count)

				{

					if (count == 8 || count == 4 || count == 2 || count == 1) {

					if (__builtin_constant_p(count) &&

					    (count == 8 || count == 4 || count == 2 || count == 1)) {

						__const_memcpy_toio_aligned32(to, from, count);

						dgh();

					} else {

						__iowrite32_copy_full(to, from, count);

					}

				}

				#define __iowrite32_copy __iowrite32_copy

				#define __iowrite32_copy(to, from, count)                  \

					(__builtin_constant_p(count) ?                     \

						 __const_iowrite32_copy(to, from, count) : \

						 __iowrite32_copy_full(to, from, count))

				static inline void __const_memcpy_toio_aligned64(volatile u64 __iomem *to,

										 const u64 *from, size_t count)

				static __always_inline void

				__const_memcpy_toio_aligned64(volatile u64 __iomem *to, const u64 *from,

							      size_t count)

				{

					switch (count) {

					case 8:

				@@ -255,21 +254,18 @@ static inline void __const_memcpy_toio_aligned64(volatile u64 __iomem *to,

				void __iowrite64_copy_full(void __iomem *to, const void *from, size_t count);

				static inline void __const_iowrite64_copy(void __iomem *to, const void *from,

									  size_t count)

				static __always_inline void

				__iowrite64_copy(void __iomem *to, const void *from, size_t count)

				{

					if (count == 8 || count == 4 || count == 2 || count == 1) {

					if (__builtin_constant_p(count) &&

					    (count == 8 || count == 4 || count == 2 || count == 1)) {

						__const_memcpy_toio_aligned64(to, from, count);

						dgh();

					} else {

						__iowrite64_copy_full(to, from, count);

					}

				}

				#define __iowrite64_copy(to, from, count)                  \

					(__builtin_constant_p(count) ?                     \

						 __const_iowrite64_copy(to, from, count) : \

						 __iowrite64_copy_full(to, from, count))

				#define __iowrite64_copy __iowrite64_copy

				/*

				 * I/O memory mapping functions.

									
										6

arch/arm64/include/asm/kvm_arm.h
									
												View File
												
				@@ -305,6 +305,12 @@

								 GENMASK(19, 14) |	\

								 BIT(11))

				#define CPTR_VHE_EL2_RES0	(GENMASK(63, 32) |	\

								 GENMASK(27, 26) |	\

								 GENMASK(23, 22) |	\

								 GENMASK(19, 18) |	\

								 GENMASK(15, 0))

				/* Hyp Debug Configuration Register bits */

				#define MDCR_EL2_E2TB_MASK	(UL(0x3))

				#define MDCR_EL2_E2TB_SHIFT	(UL(24))

									
										71

arch/arm64/include/asm/kvm_emulate.h
									
												View File
												
				@@ -557,6 +557,68 @@ static __always_inline void kvm_incr_pc(struct kvm_vcpu *vcpu)

						vcpu_set_flag((v), e);					\

					} while (0)

				#define __build_check_all_or_none(r, bits)				\

					BUILD_BUG_ON(((r) & (bits)) && ((r) & (bits)) != (bits))

				#define __cpacr_to_cptr_clr(clr, set)					\

					({								\

						u64 cptr = 0;						\

													\

						if ((set) & CPACR_ELx_FPEN)				\

							cptr |= CPTR_EL2_TFP;				\

						if ((set) & CPACR_ELx_ZEN)				\

							cptr |= CPTR_EL2_TZ;				\

						if ((set) & CPACR_ELx_SMEN)				\

							cptr |= CPTR_EL2_TSM;				\

						if ((clr) & CPACR_ELx_TTA)				\

							cptr |= CPTR_EL2_TTA;				\

						if ((clr) & CPTR_EL2_TAM)				\

							cptr |= CPTR_EL2_TAM;				\

						if ((clr) & CPTR_EL2_TCPAC)				\

							cptr |= CPTR_EL2_TCPAC;				\

													\

						cptr;							\

					})

				#define __cpacr_to_cptr_set(clr, set)					\

					({								\

						u64 cptr = 0;						\

													\

						if ((clr) & CPACR_ELx_FPEN)				\

							cptr |= CPTR_EL2_TFP;				\

						if ((clr) & CPACR_ELx_ZEN)				\

							cptr |= CPTR_EL2_TZ;				\

						if ((clr) & CPACR_ELx_SMEN)				\

							cptr |= CPTR_EL2_TSM;				\

						if ((set) & CPACR_ELx_TTA)				\

							cptr |= CPTR_EL2_TTA;				\

						if ((set) & CPTR_EL2_TAM)				\

							cptr |= CPTR_EL2_TAM;				\

						if ((set) & CPTR_EL2_TCPAC)				\

							cptr |= CPTR_EL2_TCPAC;				\

													\

						cptr;							\

					})

				#define cpacr_clear_set(clr, set)					\

					do {								\

						BUILD_BUG_ON((set) & CPTR_VHE_EL2_RES0);		\

						BUILD_BUG_ON((clr) & CPACR_ELx_E0POE);			\

						__build_check_all_or_none((clr), CPACR_ELx_FPEN);	\

						__build_check_all_or_none((set), CPACR_ELx_FPEN);	\

						__build_check_all_or_none((clr), CPACR_ELx_ZEN);	\

						__build_check_all_or_none((set), CPACR_ELx_ZEN);	\

						__build_check_all_or_none((clr), CPACR_ELx_SMEN);	\

						__build_check_all_or_none((set), CPACR_ELx_SMEN);	\

													\

						if (has_vhe() || has_hvhe())				\

							sysreg_clear_set(cpacr_el1, clr, set);		\

						else							\

							sysreg_clear_set(cptr_el2,			\

									 __cpacr_to_cptr_clr(clr, set),	\

									 __cpacr_to_cptr_set(clr, set));\

					} while (0)

				static __always_inline void kvm_write_cptr_el2(u64 val)

				{

					if (has_vhe() || has_hvhe())

				@@ -570,17 +632,16 @@ static __always_inline u64 kvm_get_reset_cptr_el2(struct kvm_vcpu *vcpu)

					u64 val;

					if (has_vhe()) {

						val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |

						       CPACR_EL1_ZEN_EL1EN);

						val = (CPACR_ELx_FPEN | CPACR_EL1_ZEN_EL1EN);

						if (cpus_have_final_cap(ARM64_SME))

							val |= CPACR_EL1_SMEN_EL1EN;

					} else if (has_hvhe()) {

						val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN);

						val = CPACR_ELx_FPEN;

						if (!vcpu_has_sve(vcpu) || !guest_owns_fp_regs())

							val |= CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN;

							val |= CPACR_ELx_ZEN;

						if (cpus_have_final_cap(ARM64_SME))

							val |= CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN;

							val |= CPACR_ELx_SMEN;

					} else {

						val = CPTR_NVHE_EL2_RES1;

									
										25

arch/arm64/include/asm/kvm_host.h
									
												View File
												
				@@ -76,6 +76,7 @@ static inline enum kvm_mode kvm_get_mode(void) { return KVM_MODE_NONE; };

				DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);

				extern unsigned int __ro_after_init kvm_sve_max_vl;

				extern unsigned int __ro_after_init kvm_host_sve_max_vl;

				int __init kvm_arm_init_sve(void);

				u32 __attribute_const__ kvm_target_cpu(void);

				@@ -521,6 +522,20 @@ struct kvm_cpu_context {

					u64 *vncr_array;

				};

				struct cpu_sve_state {

					__u64 zcr_el1;

					/*

					 * Ordering is important since __sve_save_state/__sve_restore_state

					 * relies on it.

					 */

					__u32 fpsr;

					__u32 fpcr;

					/* Must be SVE_VQ_BYTES (128 bit) aligned. */

					__u8 sve_regs[];

				};

				/*

				 * This structure is instantiated on a per-CPU basis, and contains

				 * data that is:

				@@ -534,7 +549,15 @@ struct kvm_cpu_context {

				 */

				struct kvm_host_data {

					struct kvm_cpu_context host_ctxt;

					struct user_fpsimd_state *fpsimd_state;	/* hyp VA */

					/*

					 * All pointers in this union are hyp VA.

					 * sve_state is only used in pKVM and if system_supports_sve().

					 */

					union {

						struct user_fpsimd_state *fpsimd_state;

						struct cpu_sve_state *sve_state;

					};

					/* Ownership of the FP regs */

					enum {

									
										4

arch/arm64/include/asm/kvm_hyp.h
									
												View File
												
				@@ -111,7 +111,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);

				void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);

				void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);

				void __sve_restore_state(void *sve_pffr, u32 *fpsr);

				void __sve_save_state(void *sve_pffr, u32 *fpsr, int save_ffr);

				void __sve_restore_state(void *sve_pffr, u32 *fpsr, int restore_ffr);

				u64 __guest_enter(struct kvm_vcpu *vcpu);

				@@ -142,5 +143,6 @@ extern u64 kvm_nvhe_sym(id_aa64smfr0_el1_sys_val);

				extern unsigned long kvm_nvhe_sym(__icache_flags);

				extern unsigned int kvm_nvhe_sym(kvm_arm_vmid_bits);

				extern unsigned int kvm_nvhe_sym(kvm_host_sve_max_vl);

				#endif /* __ARM64_KVM_HYP_H__ */

									
										9

arch/arm64/include/asm/kvm_pkvm.h
									
												View File
												
				@@ -128,4 +128,13 @@ static inline unsigned long hyp_ffa_proxy_pages(void)

					return (2 * KVM_FFA_MBOX_NR_PAGES) + DIV_ROUND_UP(desc_max, PAGE_SIZE);

				}

				static inline size_t pkvm_host_sve_state_size(void)

				{

					if (!system_supports_sve())

						return 0;

					return size_add(sizeof(struct cpu_sve_state),

							SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl)));

				}

				#endif	/* __ARM64_KVM_PKVM_H__ */

									
										192

arch/arm64/include/asm/uaccess.h
									
												View File
												
				@@ -30,23 +30,20 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);

				/*

				 * Test whether a block of memory is a valid user space address.

				 * Returns 1 if the range is valid, 0 otherwise.

				 *

				 * This is equivalent to the following test:

				 * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX

				 * We only care that the address cannot reach the kernel mapping, and

				 * that an invalid address will fault.

				 */

				static inline int access_ok(const void __user *addr, unsigned long size)

				static inline int access_ok(const void __user *p, unsigned long size)

				{

					/*

					 * Asynchronous I/O running in a kernel thread does not have the

					 * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag

					 * the user address before checking.

					 */

					if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&

					    (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))

						addr = untagged_addr(addr);

					unsigned long addr = (unsigned long)p;

					return likely(__access_ok(addr, size));

					/* Only bit 55 of the address matters */

					addr |= addr+size;

					addr = (addr >> 55) & 1;

					size >>= 55;

					return !(addr | size);

				}

				#define access_ok access_ok

				@@ -184,29 +181,40 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)

				 * The "__xxx_error" versions set the third argument to -EFAULT if an error

				 * occurs, and leave it unchanged on success.

				 */

				#define __get_mem_asm(load, reg, x, addr, err, type)			\

				#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT

				#define __get_mem_asm(load, reg, x, addr, label, type)			\

					asm_goto_output(						\

					"1:	" load "	" reg "0, [%1]\n"			\

					_ASM_EXTABLE_##type##ACCESS_ERR(1b, %l2, %w0)			\

					: "=r" (x)							\

					: "r" (addr) : : label)

				#else

				#define __get_mem_asm(load, reg, x, addr, label, type) do {		\

					int __gma_err = 0;						\

					asm volatile(							\

					"1:	" load "	" reg "1, [%2]\n"			\

					"2:\n"								\

					_ASM_EXTABLE_##type##ACCESS_ERR_ZERO(1b, 2b, %w0, %w1)		\

					: "+r" (err), "=r" (x)						\

					: "r" (addr))

					: "+r" (__gma_err), "=r" (x)					\

					: "r" (addr));							\

					if (__gma_err) goto label; } while (0)

				#endif

				#define __raw_get_mem(ldr, x, ptr, err, type)					\

				#define __raw_get_mem(ldr, x, ptr, label, type)					\

				do {										\

					unsigned long __gu_val;							\

					switch (sizeof(*(ptr))) {						\

					case 1:									\

						__get_mem_asm(ldr "b", "%w", __gu_val, (ptr), (err), type);	\

						__get_mem_asm(ldr "b", "%w", __gu_val, (ptr), label, type);	\

						break;								\

					case 2:									\

						__get_mem_asm(ldr "h", "%w", __gu_val, (ptr), (err), type);	\

						__get_mem_asm(ldr "h", "%w", __gu_val, (ptr), label, type);	\

						break;								\

					case 4:									\

						__get_mem_asm(ldr, "%w", __gu_val, (ptr), (err), type);		\

						__get_mem_asm(ldr, "%w", __gu_val, (ptr), label, type);		\

						break;								\

					case 8:									\

						__get_mem_asm(ldr, "%x",  __gu_val, (ptr), (err), type);	\

						__get_mem_asm(ldr, "%x",  __gu_val, (ptr), label, type);	\

						break;								\

					default:								\

						BUILD_BUG();							\

				@@ -219,27 +227,34 @@ do {										\

				 * uaccess_ttbr0_disable(). As `x` and `ptr` could contain blocking functions,

				 * we must evaluate these outside of the critical section.

				 */

				#define __raw_get_user(x, ptr, err)					\

				#define __raw_get_user(x, ptr, label)					\

				do {									\

					__typeof__(*(ptr)) __user *__rgu_ptr = (ptr);			\

					__typeof__(x) __rgu_val;					\

					__chk_user_ptr(ptr);						\

													\

					uaccess_ttbr0_enable();						\

					__raw_get_mem("ldtr", __rgu_val, __rgu_ptr, err, U);		\

					uaccess_ttbr0_disable();					\

													\

					(x) = __rgu_val;						\

					do {								\

						__label__ __rgu_failed;					\

						uaccess_ttbr0_enable();					\

						__raw_get_mem("ldtr", __rgu_val, __rgu_ptr, __rgu_failed, U);	\

						uaccess_ttbr0_disable();				\

						(x) = __rgu_val;					\

						break;							\

					__rgu_failed:							\

						uaccess_ttbr0_disable();				\

						goto label;						\

					} while (0);							\

				} while (0)

				#define __get_user_error(x, ptr, err)					\

				do {									\

					__label__ __gu_failed;						\

					__typeof__(*(ptr)) __user *__p = (ptr);				\

					might_fault();							\

					if (access_ok(__p, sizeof(*__p))) {				\

						__p = uaccess_mask_ptr(__p);				\

						__raw_get_user((x), __p, (err));			\

						__raw_get_user((x), __p, __gu_failed);			\

					} else {							\

					__gu_failed:							\

						(x) = (__force __typeof__(x))0; (err) = -EFAULT;	\

					}								\

				} while (0)

				@@ -262,40 +277,42 @@ do {									\

				do {									\

					__typeof__(dst) __gkn_dst = (dst);				\

					__typeof__(src) __gkn_src = (src);				\

					int __gkn_err = 0;						\

					do { 								\

						__label__ __gkn_label;					\

													\

					__mte_enable_tco_async();					\

					__raw_get_mem("ldr", *((type *)(__gkn_dst)),			\

						      (__force type *)(__gkn_src), __gkn_err, K);	\

					__mte_disable_tco_async();					\

													\

					if (unlikely(__gkn_err))					\

						__mte_enable_tco_async();				\

						__raw_get_mem("ldr", *((type *)(__gkn_dst)),		\

						      (__force type *)(__gkn_src), __gkn_label, K);	\

						__mte_disable_tco_async();				\

						break;							\

					__gkn_label:							\

						__mte_disable_tco_async();				\

						goto err_label;						\

					} while (0);							\

				} while (0)

				#define __put_mem_asm(store, reg, x, addr, err, type)			\

					asm volatile(							\

					"1:	" store "	" reg "1, [%2]\n"			\

				#define __put_mem_asm(store, reg, x, addr, label, type)			\

					asm goto(							\

					"1:	" store "	" reg "0, [%1]\n"			\

					"2:\n"								\

					_ASM_EXTABLE_##type##ACCESS_ERR(1b, 2b, %w0)			\

					: "+r" (err)							\

					: "rZ" (x), "r" (addr))

					_ASM_EXTABLE_##type##ACCESS(1b, %l2)				\

					: : "rZ" (x), "r" (addr) : : label)

				#define __raw_put_mem(str, x, ptr, err, type)					\

				#define __raw_put_mem(str, x, ptr, label, type)					\

				do {										\

					__typeof__(*(ptr)) __pu_val = (x);					\

					switch (sizeof(*(ptr))) {						\

					case 1:									\

						__put_mem_asm(str "b", "%w", __pu_val, (ptr), (err), type);	\

						__put_mem_asm(str "b", "%w", __pu_val, (ptr), label, type);	\

						break;								\

					case 2:									\

						__put_mem_asm(str "h", "%w", __pu_val, (ptr), (err), type);	\

						__put_mem_asm(str "h", "%w", __pu_val, (ptr), label, type);	\

						break;								\

					case 4:									\

						__put_mem_asm(str, "%w", __pu_val, (ptr), (err), type);		\

						__put_mem_asm(str, "%w", __pu_val, (ptr), label, type);		\

						break;								\

					case 8:									\

						__put_mem_asm(str, "%x", __pu_val, (ptr), (err), type);		\

						__put_mem_asm(str, "%x", __pu_val, (ptr), label, type);		\

						break;								\

					default:								\

						BUILD_BUG();							\

				@@ -307,25 +324,34 @@ do {										\

				 * uaccess_ttbr0_disable(). As `x` and `ptr` could contain blocking functions,

				 * we must evaluate these outside of the critical section.

				 */

				#define __raw_put_user(x, ptr, err)					\

				#define __raw_put_user(x, ptr, label)					\

				do {									\

					__label__ __rpu_failed;						\

					__typeof__(*(ptr)) __user *__rpu_ptr = (ptr);			\

					__typeof__(*(ptr)) __rpu_val = (x);				\

					__chk_user_ptr(__rpu_ptr);					\

													\

					uaccess_ttbr0_enable();						\

					__raw_put_mem("sttr", __rpu_val, __rpu_ptr, err, U);		\

					uaccess_ttbr0_disable();					\

					do {								\

						uaccess_ttbr0_enable();					\

						__raw_put_mem("sttr", __rpu_val, __rpu_ptr, __rpu_failed, U);	\

						uaccess_ttbr0_disable();				\

						break;							\

					__rpu_failed:							\

						uaccess_ttbr0_disable();				\

						goto label;						\

					} while (0);							\

				} while (0)

				#define __put_user_error(x, ptr, err)					\

				do {									\

					__label__ __pu_failed;						\

					__typeof__(*(ptr)) __user *__p = (ptr);				\

					might_fault();							\

					if (access_ok(__p, sizeof(*__p))) {				\

						__p = uaccess_mask_ptr(__p);				\

						__raw_put_user((x), __p, (err));			\

						__raw_put_user((x), __p, __pu_failed);			\

					} else	{							\

					__pu_failed:							\

						(err) = -EFAULT;					\

					}								\

				} while (0)

				@@ -348,15 +374,18 @@ do {									\

				do {									\

					__typeof__(dst) __pkn_dst = (dst);				\

					__typeof__(src) __pkn_src = (src);				\

					int __pkn_err = 0;						\

													\

					__mte_enable_tco_async();					\

					__raw_put_mem("str", *((type *)(__pkn_src)),			\

						      (__force type *)(__pkn_dst), __pkn_err, K);	\

					__mte_disable_tco_async();					\

													\

					if (unlikely(__pkn_err))					\

					do {								\

						__label__ __pkn_err;					\

						__mte_enable_tco_async();				\

						__raw_put_mem("str", *((type *)(__pkn_src)),		\

							      (__force type *)(__pkn_dst), __pkn_err, K);	\

						__mte_disable_tco_async();				\

						break;							\

					__pkn_err:							\

						__mte_disable_tco_async();				\

						goto err_label;						\

					} while (0);							\

				} while(0)

				extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n);

				@@ -381,6 +410,51 @@ extern unsigned long __must_check __arch_copy_to_user(void __user *to, const voi

					__actu_ret;							\

				})

				static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len)

				{

					if (unlikely(!access_ok(ptr,len)))

						return 0;

					uaccess_ttbr0_enable();

					return 1;

				}

				#define user_access_begin(a,b)	user_access_begin(a,b)

				#define user_access_end()	uaccess_ttbr0_disable()

				#define unsafe_put_user(x, ptr, label) \

					__raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), label, U)

				#define unsafe_get_user(x, ptr, label) \

					__raw_get_mem("ldtr", x, uaccess_mask_ptr(ptr), label, U)

				/*

				 * KCSAN uses these to save and restore ttbr state.

				 * We do not support KCSAN with ARM64_SW_TTBR0_PAN, so

				 * they are no-ops.

				 */

				static inline unsigned long user_access_save(void) { return 0; }

				static inline void user_access_restore(unsigned long enabled) { }

				/*

				 * We want the unsafe accessors to always be inlined and use

				 * the error labels - thus the macro games.

				 */

				#define unsafe_copy_loop(dst, src, len, type, label)				\

					while (len >= sizeof(type)) {						\

						unsafe_put_user(*(type *)(src),(type __user *)(dst),label);	\

						dst += sizeof(type);						\

						src += sizeof(type);						\

						len -= sizeof(type);						\

					}

				#define unsafe_copy_to_user(_dst,_src,_len,label)			\

				do {									\

					char __user *__ucu_dst = (_dst);				\

					const char *__ucu_src = (_src);					\

					size_t __ucu_len = (_len);					\

					unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label);	\

					unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label);	\

					unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label);	\

					unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label);	\

				} while (0)

				#define INLINE_COPY_TO_USER

				#define INLINE_COPY_FROM_USER

									
										3

arch/arm64/kernel/armv8_deprecated.c
									
												View File
												
				@@ -462,6 +462,9 @@ static int run_all_insn_set_hw_mode(unsigned int cpu)

					for (int i = 0; i < ARRAY_SIZE(insn_emulations); i++) {

						struct insn_emulation *insn = insn_emulations[i];

						bool enable = READ_ONCE(insn->current_mode) == INSN_HW;

						if (insn->status == INSN_UNAVAILABLE)

							continue;

						if (insn->set_hw_mode && insn->set_hw_mode(enable)) {

							pr_warn("CPU[%u] cannot support the emulation of %s",

								cpu, insn->name);

									
										12

arch/arm64/kernel/mte.c
									
												View File
												
				@@ -582,12 +582,9 @@ subsys_initcall(register_mte_tcf_preferred_sysctl);

				size_t mte_probe_user_range(const char __user *uaddr, size_t size)

				{

					const char __user *end = uaddr + size;

					int err = 0;

					char val;

					__raw_get_user(val, uaddr, err);

					if (err)

						return size;

					__raw_get_user(val, uaddr, efault);

					uaddr = PTR_ALIGN(uaddr, MTE_GRANULE_SIZE);

					while (uaddr < end) {

				@@ -595,12 +592,13 @@ size_t mte_probe_user_range(const char __user *uaddr, size_t size)

						 * A read is sufficient for mte, the caller should have probed

						 * for the pte write permission if required.

						 */

						__raw_get_user(val, uaddr, err);

						if (err)

							return end - uaddr;

						__raw_get_user(val, uaddr, efault);

						uaddr += MTE_GRANULE_SIZE;

					}

					(void)val;

					return 0;

				efault:

					return end - uaddr;

				}

									
										76

arch/arm64/kvm/arm.c
									
												View File
												
				@@ -1931,6 +1931,11 @@ static unsigned long nvhe_percpu_order(void)

					return size ? get_order(size) : 0;

				}

				static size_t pkvm_host_sve_state_order(void)

				{

					return get_order(pkvm_host_sve_state_size());

				}

				/* A lookup table holding the hypervisor VA for each vector slot */

				static void *hyp_spectre_vector_selector[BP_HARDEN_EL2_SLOTS];

				@@ -2310,12 +2315,20 @@ static void __init teardown_subsystems(void)

				static void __init teardown_hyp_mode(void)

				{

					bool free_sve = system_supports_sve() && is_protected_kvm_enabled();

					int cpu;

					free_hyp_pgds();

					for_each_possible_cpu(cpu) {

						free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));

						free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order());

						if (free_sve) {

							struct cpu_sve_state *sve_state;

							sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;

							free_pages((unsigned long) sve_state, pkvm_host_sve_state_order());

						}

					}

				}

				@@ -2398,6 +2411,58 @@ static int __init kvm_hyp_init_protection(u32 hyp_va_bits)

					return 0;

				}

				static int init_pkvm_host_sve_state(void)

				{

					int cpu;

					if (!system_supports_sve())

						return 0;

					/* Allocate pages for host sve state in protected mode. */

					for_each_possible_cpu(cpu) {

						struct page *page = alloc_pages(GFP_KERNEL, pkvm_host_sve_state_order());

						if (!page)

							return -ENOMEM;

						per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state = page_address(page);

					}

					/*

					 * Don't map the pages in hyp since these are only used in protected

					 * mode, which will (re)create its own mapping when initialized.

					 */

					return 0;

				}

				/*

				 * Finalizes the initialization of hyp mode, once everything else is initialized

				 * and the initialziation process cannot fail.

				 */

				static void finalize_init_hyp_mode(void)

				{

					int cpu;

					if (system_supports_sve() && is_protected_kvm_enabled()) {

						for_each_possible_cpu(cpu) {

							struct cpu_sve_state *sve_state;

							sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;

							per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =

								kern_hyp_va(sve_state);

						}

					} else {

						for_each_possible_cpu(cpu) {

							struct user_fpsimd_state *fpsimd_state;

							fpsimd_state = &per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->host_ctxt.fp_regs;

							per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->fpsimd_state =

								kern_hyp_va(fpsimd_state);

						}

					}

				}

				static void pkvm_hyp_init_ptrauth(void)

				{

					struct kvm_cpu_context *hyp_ctxt;

				@@ -2566,6 +2631,10 @@ static int __init init_hyp_mode(void)

							goto out_err;

						}

						err = init_pkvm_host_sve_state();

						if (err)

							goto out_err;

						err = kvm_hyp_init_protection(hyp_va_bits);

						if (err) {

							kvm_err("Failed to init hyp memory protection\n");

				@@ -2730,6 +2799,13 @@ static __init int kvm_arm_init(void)

					if (err)

						goto out_subs;

					/*

					 * This should be called after initialization is done and failure isn't

					 * possible anymore.

					 */

					if (!in_hyp_mode)

						finalize_init_hyp_mode();

					kvm_arm_initialised = true;

					return 0;

									
										21

arch/arm64/kvm/emulate-nested.c
									
												View File
												
				@@ -2181,16 +2181,23 @@ void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu)

					if (forward_traps(vcpu, HCR_NV))

						return;

					spsr = vcpu_read_sys_reg(vcpu, SPSR_EL2);

					spsr = kvm_check_illegal_exception_return(vcpu, spsr);

					/* Check for an ERETAx */

					esr = kvm_vcpu_get_esr(vcpu);

					if (esr_iss_is_eretax(esr) && !kvm_auth_eretax(vcpu, &elr)) {

						/*

						 * Oh no, ERETAx failed to authenticate.  If we have

						 * FPACCOMBINE, deliver an exception right away.  If we

						 * don't, then let the mangled ELR value trickle down the

						 * Oh no, ERETAx failed to authenticate.

						 *

						 * If we have FPACCOMBINE and we don't have a pending

						 * Illegal Execution State exception (which has priority

						 * over FPAC), deliver an exception right away.

						 *

						 * Otherwise, let the mangled ELR value trickle down the

						 * ERET handling, and the guest will have a little surprise.

						 */

						if (kvm_has_pauth(vcpu->kvm, FPACCOMBINE)) {

						if (kvm_has_pauth(vcpu->kvm, FPACCOMBINE) && !(spsr & PSR_IL_BIT)) {

							esr &= ESR_ELx_ERET_ISS_ERETA;

							esr |= FIELD_PREP(ESR_ELx_EC_MASK, ESR_ELx_EC_FPAC);

							kvm_inject_nested_sync(vcpu, esr);

				@@ -2201,17 +2208,11 @@ void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu)

					preempt_disable();

					kvm_arch_vcpu_put(vcpu);

					spsr = __vcpu_sys_reg(vcpu, SPSR_EL2);

					spsr = kvm_check_illegal_exception_return(vcpu, spsr);

					if (!esr_iss_is_eretax(esr))

						elr = __vcpu_sys_reg(vcpu, ELR_EL2);

					trace_kvm_nested_eret(vcpu, elr, spsr);

					/*

					 * Note that the current exception level is always the virtual EL2,

					 * since we set HCR_EL2.NV bit only when entering the virtual EL2.

					 */

					*vcpu_pc(vcpu) = elr;

					*vcpu_cpsr(vcpu) = spsr;

									
										11

arch/arm64/kvm/fpsimd.c
									
												View File
												
				@@ -90,6 +90,13 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)

							fpsimd_save_and_flush_cpu_state();

						}

					}

					/*

					 * If normal guests gain SME support, maintain this behavior for pKVM

					 * guests, which don't support SME.

					 */

					WARN_ON(is_protected_kvm_enabled() && system_supports_sme() &&

						read_sysreg_s(SYS_SVCR));

				}

				/*

				@@ -161,9 +168,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)

					if (has_vhe() && system_supports_sme()) {

						/* Also restore EL0 state seen on entry */

						if (vcpu_get_flag(vcpu, HOST_SME_ENABLED))

							sysreg_clear_set(CPACR_EL1, 0,

									 CPACR_EL1_SMEN_EL0EN |

									 CPACR_EL1_SMEN_EL1EN);

							sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_SMEN);

						else

							sysreg_clear_set(CPACR_EL1,

									 CPACR_EL1_SMEN_EL0EN,

									
										3

arch/arm64/kvm/guest.c
									
												View File
												
				@@ -251,6 +251,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

						case PSR_AA32_MODE_SVC:

						case PSR_AA32_MODE_ABT:

						case PSR_AA32_MODE_UND:

						case PSR_AA32_MODE_SYS:

							if (!vcpu_el1_is_32bit(vcpu))

								return -EINVAL;

							break;

				@@ -276,7 +277,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

					if (*vcpu_cpsr(vcpu) & PSR_MODE32_BIT) {

						int i, nr_reg;

						switch (*vcpu_cpsr(vcpu)) {

						switch (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {

						/*

						 * Either we are dealing with user mode, and only the

						 * first 15 registers (+ PC) must be narrowed to 32bit.

									
										18

arch/arm64/kvm/hyp/aarch32.c
									
												View File
												
				@@ -50,9 +50,23 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu)

					u32 cpsr_cond;

					int cond;

					/* Top two bits non-zero?  Unconditional. */

					if (kvm_vcpu_get_esr(vcpu) >> 30)

					/*

					 * These are the exception classes that could fire with a

					 * conditional instruction.

					 */

					switch (kvm_vcpu_trap_get_class(vcpu)) {

					case ESR_ELx_EC_CP15_32:

					case ESR_ELx_EC_CP15_64:

					case ESR_ELx_EC_CP14_MR:

					case ESR_ELx_EC_CP14_LS:

					case ESR_ELx_EC_FP_ASIMD:

					case ESR_ELx_EC_CP10_ID:

					case ESR_ELx_EC_CP14_64:

					case ESR_ELx_EC_SVC32:

						break;

					default:

						return true;

					}

					/* Is condition field valid? */

					cond = kvm_vcpu_get_condition(vcpu);

									
										6

arch/arm64/kvm/hyp/fpsimd.S
									
												View File
												
				@@ -25,3 +25,9 @@ SYM_FUNC_START(__sve_restore_state)

					sve_load 0, x1, x2, 3

					ret

				SYM_FUNC_END(__sve_restore_state)

				SYM_FUNC_START(__sve_save_state)

					mov	x2, #1

					sve_save 0, x1, x2, 3

					ret

				SYM_FUNC_END(__sve_save_state)

									
										36

arch/arm64/kvm/hyp/include/hyp/switch.h
									
												View File
												
				@@ -316,10 +316,24 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)

				{

					sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);

					__sve_restore_state(vcpu_sve_pffr(vcpu),

							    &vcpu->arch.ctxt.fp_regs.fpsr);

							    &vcpu->arch.ctxt.fp_regs.fpsr,

							    true);

					write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);

				}

				static inline void __hyp_sve_save_host(void)

				{

					struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);

					sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);

					write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);

					__sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),

							 &sve_state->fpsr,

							 true);

				}

				static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu);

				/*

				 * We trap the first access to the FP/SIMD to save the host context and

				 * restore the guest context lazily.

				@@ -330,7 +344,6 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)

				{

					bool sve_guest;

					u8 esr_ec;

					u64 reg;

					if (!system_supports_fpsimd())

						return false;

				@@ -353,24 +366,15 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)

					/* Valid trap.  Switch the context: */

					/* First disable enough traps to allow us to update the registers */

					if (has_vhe() || has_hvhe()) {

						reg = CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN;

						if (sve_guest)

							reg |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;

						sysreg_clear_set(cpacr_el1, 0, reg);

					} else {

						reg = CPTR_EL2_TFP;

						if (sve_guest)

							reg |= CPTR_EL2_TZ;

						sysreg_clear_set(cptr_el2, reg, 0);

					}

					if (sve_guest || (is_protected_kvm_enabled() && system_supports_sve()))

						cpacr_clear_set(0, CPACR_ELx_FPEN | CPACR_ELx_ZEN);

					else

						cpacr_clear_set(0, CPACR_ELx_FPEN);

					isb();

					/* Write out the host state if it's in the registers */

					if (host_owns_fp_regs())

						__fpsimd_save_state(*host_data_ptr(fpsimd_state));

						kvm_hyp_save_fpsimd_host(vcpu);

					/* Restore the guest state */

					if (sve_guest)

									
										1

arch/arm64/kvm/hyp/include/nvhe/pkvm.h
									
												View File
												
				@@ -59,7 +59,6 @@ static inline bool pkvm_hyp_vcpu_is_protected(struct pkvm_hyp_vcpu *hyp_vcpu)

				}

				void pkvm_hyp_vm_table_init(void *tbl);

				void pkvm_host_fpsimd_state_init(void);

				int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva,

						   unsigned long pgd_hva);

									
										84

arch/arm64/kvm/hyp/nvhe/hyp-main.c
									
												View File
												
				@@ -23,20 +23,80 @@ DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);

				void __kvm_hyp_host_forward_smc(struct kvm_cpu_context *host_ctxt);

				static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)

				{

					__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);

					/*

					 * On saving/restoring guest sve state, always use the maximum VL for

					 * the guest. The layout of the data when saving the sve state depends

					 * on the VL, so use a consistent (i.e., the maximum) guest VL.

					 */

					sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);

					__sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);

					write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);

				}

				static void __hyp_sve_restore_host(void)

				{

					struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);

					/*

					 * On saving/restoring host sve state, always use the maximum VL for

					 * the host. The layout of the data when saving the sve state depends

					 * on the VL, so use a consistent (i.e., the maximum) host VL.

					 *

					 * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length

					 * supported by the system (or limited at EL3).

					 */

					write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);

					__sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),

							    &sve_state->fpsr,

							    true);

					write_sysreg_el1(sve_state->zcr_el1, SYS_ZCR);

				}

				static void fpsimd_sve_flush(void)

				{

					*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;

				}

				static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)

				{

					if (!guest_owns_fp_regs())

						return;

					cpacr_clear_set(0, CPACR_ELx_FPEN | CPACR_ELx_ZEN);

					isb();

					if (vcpu_has_sve(vcpu))

						__hyp_sve_save_guest(vcpu);

					else

						__fpsimd_save_state(&vcpu->arch.ctxt.fp_regs);

					if (system_supports_sve())

						__hyp_sve_restore_host();

					else

						__fpsimd_restore_state(*host_data_ptr(fpsimd_state));

					*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;

				}

				static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)

				{

					struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;

					fpsimd_sve_flush();

					hyp_vcpu->vcpu.arch.ctxt	= host_vcpu->arch.ctxt;

					hyp_vcpu->vcpu.arch.sve_state	= kern_hyp_va(host_vcpu->arch.sve_state);

					hyp_vcpu->vcpu.arch.sve_max_vl	= host_vcpu->arch.sve_max_vl;

					/* Limit guest vector length to the maximum supported by the host.  */

					hyp_vcpu->vcpu.arch.sve_max_vl	= min(host_vcpu->arch.sve_max_vl, kvm_host_sve_max_vl);

					hyp_vcpu->vcpu.arch.hw_mmu	= host_vcpu->arch.hw_mmu;

					hyp_vcpu->vcpu.arch.hcr_el2	= host_vcpu->arch.hcr_el2;

					hyp_vcpu->vcpu.arch.mdcr_el2	= host_vcpu->arch.mdcr_el2;

					hyp_vcpu->vcpu.arch.cptr_el2	= host_vcpu->arch.cptr_el2;

					hyp_vcpu->vcpu.arch.iflags	= host_vcpu->arch.iflags;

				@@ -54,10 +114,11 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)

					struct vgic_v3_cpu_if *host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;

					unsigned int i;

					fpsimd_sve_sync(&hyp_vcpu->vcpu);

					host_vcpu->arch.ctxt		= hyp_vcpu->vcpu.arch.ctxt;

					host_vcpu->arch.hcr_el2		= hyp_vcpu->vcpu.arch.hcr_el2;

					host_vcpu->arch.cptr_el2	= hyp_vcpu->vcpu.arch.cptr_el2;

					host_vcpu->arch.fault		= hyp_vcpu->vcpu.arch.fault;

				@@ -79,6 +140,17 @@ static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)

						struct pkvm_hyp_vcpu *hyp_vcpu;

						struct kvm *host_kvm;

						/*

						 * KVM (and pKVM) doesn't support SME guests for now, and

						 * ensures that SME features aren't enabled in pstate when

						 * loading a vcpu. Therefore, if SME features enabled the host

						 * is misbehaving.

						 */

						if (unlikely(system_supports_sme() && read_sysreg_s(SYS_SVCR))) {

							ret = -EINVAL;

							goto out;

						}

						host_kvm = kern_hyp_va(host_vcpu->kvm);

						hyp_vcpu = pkvm_load_hyp_vcpu(host_kvm->arch.pkvm.handle,

									      host_vcpu->vcpu_idx);

				@@ -405,11 +477,7 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)

						handle_host_smc(host_ctxt);

						break;

					case ESR_ELx_EC_SVE:

						if (has_hvhe())

							sysreg_clear_set(cpacr_el1, 0, (CPACR_EL1_ZEN_EL1EN |

											CPACR_EL1_ZEN_EL0EN));

						else

							sysreg_clear_set(cptr_el2, CPTR_EL2_TZ, 0);

						cpacr_clear_set(0, CPACR_ELx_ZEN);

						isb();

						sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);

						break;

									
										17

arch/arm64/kvm/hyp/nvhe/pkvm.c
									
												View File
												
				@@ -18,6 +18,8 @@ unsigned long __icache_flags;

				/* Used by kvm_get_vttbr(). */

				unsigned int kvm_arm_vmid_bits;

				unsigned int kvm_host_sve_max_vl;

				/*

				 * Set trap register values based on features in ID_AA64PFR0.

				 */

				@@ -63,7 +65,7 @@ static void pvm_init_traps_aa64pfr0(struct kvm_vcpu *vcpu)

					/* Trap SVE */

					if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE), feature_ids)) {

						if (has_hvhe())

							cptr_clear |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;

							cptr_clear |= CPACR_ELx_ZEN;

						else

							cptr_set |= CPTR_EL2_TZ;

					}

				@@ -247,17 +249,6 @@ void pkvm_hyp_vm_table_init(void *tbl)

					vm_table = tbl;

				}

				void pkvm_host_fpsimd_state_init(void)

				{

					unsigned long i;

					for (i = 0; i < hyp_nr_cpus; i++) {

						struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);

						host_data->fpsimd_state = &host_data->host_ctxt.fp_regs;

					}

				}

				/*

				 * Return the hyp vm structure corresponding to the handle.

				 */

				@@ -586,6 +577,8 @@ unlock:

					if (ret)

						unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu));

					hyp_vcpu->vcpu.arch.cptr_el2 = kvm_get_reset_cptr_el2(&hyp_vcpu->vcpu);

					return ret;

				}

									
										25

arch/arm64/kvm/hyp/nvhe/setup.c
									
												View File
												
				@@ -67,6 +67,28 @@ static int divide_memory_pool(void *virt, unsigned long size)

					return 0;

				}

				static int pkvm_create_host_sve_mappings(void)

				{

					void *start, *end;

					int ret, i;

					if (!system_supports_sve())

						return 0;

					for (i = 0; i < hyp_nr_cpus; i++) {

						struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);

						struct cpu_sve_state *sve_state = host_data->sve_state;

						start = kern_hyp_va(sve_state);

						end = start + PAGE_ALIGN(pkvm_host_sve_state_size());

						ret = pkvm_create_mappings(start, end, PAGE_HYP);

						if (ret)

							return ret;

					}

					return 0;

				}

				static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size,

								 unsigned long *per_cpu_base,

								 u32 hyp_va_bits)

				@@ -125,6 +147,8 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size,

							return ret;

					}

					pkvm_create_host_sve_mappings();

					/*

					 * Map the host sections RO in the hypervisor, but transfer the

					 * ownership from the host to the hypervisor itself to make sure they

				@@ -300,7 +324,6 @@ void __noreturn __pkvm_init_finalise(void)

						goto out;

					pkvm_hyp_vm_table_init(vm_table_base);

					pkvm_host_fpsimd_state_init();

				out:

					/*

					 * We tail-called to here from handle___pkvm_init() and will not return,

									
										24

arch/arm64/kvm/hyp/nvhe/switch.c
									
												View File
												
				@@ -48,15 +48,14 @@ static void __activate_traps(struct kvm_vcpu *vcpu)

					val |= has_hvhe() ? CPACR_EL1_TTA : CPTR_EL2_TTA;

					if (cpus_have_final_cap(ARM64_SME)) {

						if (has_hvhe())

							val &= ~(CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN);

							val &= ~CPACR_ELx_SMEN;

						else

							val |= CPTR_EL2_TSM;

					}

					if (!guest_owns_fp_regs()) {

						if (has_hvhe())

							val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |

								 CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);

							val &= ~(CPACR_ELx_FPEN | CPACR_ELx_ZEN);

						else

							val |= CPTR_EL2_TFP | CPTR_EL2_TZ;

				@@ -182,6 +181,25 @@ static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)

						kvm_handle_pvm_sysreg(vcpu, exit_code));

				}

				static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)

				{

					/*

					 * Non-protected kvm relies on the host restoring its sve state.

					 * Protected kvm restores the host's sve state as not to reveal that

					 * fpsimd was used by a guest nor leak upper sve bits.

					 */

					if (unlikely(is_protected_kvm_enabled() && system_supports_sve())) {

						__hyp_sve_save_host();

						/* Re-enable SVE traps if not supported for the guest vcpu. */

						if (!vcpu_has_sve(vcpu))

							cpacr_clear_set(CPACR_ELx_ZEN, 0);

					} else {

						__fpsimd_save_state(*host_data_ptr(fpsimd_state));

					}

				}

				static const exit_handler_fn hyp_exit_handlers[] = {

					[0 ... ESR_ELx_EC_MAX]		= NULL,

					[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15_32,

									
										12

arch/arm64/kvm/hyp/vhe/switch.c
									
												View File
												
				@@ -93,8 +93,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)

					val = read_sysreg(cpacr_el1);

					val |= CPACR_ELx_TTA;

					val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN |

						 CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN);

					val &= ~(CPACR_ELx_ZEN | CPACR_ELx_SMEN);

					/*

					 * With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to

				@@ -109,9 +108,9 @@ static void __activate_traps(struct kvm_vcpu *vcpu)

					if (guest_owns_fp_regs()) {

						if (vcpu_has_sve(vcpu))

							val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;

							val |= CPACR_ELx_ZEN;

					} else {

						val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN);

						val &= ~CPACR_ELx_FPEN;

						__activate_traps_fpsimd32(vcpu);

					}

				@@ -262,6 +261,11 @@ static bool kvm_hyp_handle_eret(struct kvm_vcpu *vcpu, u64 *exit_code)

					return true;

				}

				static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)

				{

					__fpsimd_save_state(*host_data_ptr(fpsimd_state));

				}

				static const exit_handler_fn hyp_exit_handlers[] = {

					[0 ... ESR_ELx_EC_MAX]		= NULL,

					[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15_32,

									
										6

arch/arm64/kvm/nested.c
									
												View File
												
				@@ -58,8 +58,10 @@ static u64 limit_nv_id_reg(u32 id, u64 val)

						break;

					case SYS_ID_AA64PFR1_EL1:

						/* Only support SSBS */

						val &= NV_FTR(PFR1, SSBS);

						/* Only support BTI, SSBS, CSV2_frac */

						val &= (NV_FTR(PFR1, BT)	|

							NV_FTR(PFR1, SSBS)	|

							NV_FTR(PFR1, CSV2_frac));

						break;

					case SYS_ID_AA64MMFR0_EL1:

									
										3

arch/arm64/kvm/reset.c
									
												View File
												
				@@ -32,6 +32,7 @@

				/* Maximum phys_shift supported for any VM on this host */

				static u32 __ro_after_init kvm_ipa_limit;

				unsigned int __ro_after_init kvm_host_sve_max_vl;

				/*

				 * ARMv8 Reset Values

				@@ -51,6 +52,8 @@ int __init kvm_arm_init_sve(void)

				{

					if (system_supports_sve()) {

						kvm_sve_max_vl = sve_max_virtualisable_vl();

						kvm_host_sve_max_vl = sve_max_vl();

						kvm_nvhe_sym(kvm_host_sve_max_vl) = kvm_host_sve_max_vl;

						/*

						 * The get_sve_reg()/set_sve_reg() ioctl interface will need

									
										4

arch/arm64/mm/contpte.c
									
												View File
												
				@@ -376,7 +376,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,

					 * clearing access/dirty for the whole block.

					 */

					unsigned long start = addr;

					unsigned long end = start + nr;

					unsigned long end = start + nr * PAGE_SIZE;

					if (pte_cont(__ptep_get(ptep + nr - 1)))

						end = ALIGN(end, CONT_PTE_SIZE);

				@@ -386,7 +386,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,

						ptep = contpte_align_down(ptep);

					}

					__clear_young_dirty_ptes(vma, start, ptep, end - start, flags);

					__clear_young_dirty_ptes(vma, start, ptep, (end - start) / PAGE_SIZE, flags);

				}

				EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);

4

arch/loongarch/boot/dts/loongson-2k0500-ref.dts

View File

@@ -44,14 +44,14 @@
 &gmac0 {
 	status = "okay";
 	phy-mode = "rgmii";
 	phy-mode = "rgmii-id";
 	bus_id = <0x0>;
 };
 &gmac1 {
 	status = "okay";
 	phy-mode = "rgmii";
 	phy-mode = "rgmii-id";
 	bus_id = <0x1>;
 };

4

arch/loongarch/boot/dts/loongson-2k1000-ref.dts

View File

@@ -43,7 +43,7 @@
 &gmac0 {
 	status = "okay";
 	phy-mode = "rgmii";
 	phy-mode = "rgmii-id";
 	phy-handle = <&phy0>;
 	mdio {
 		compatible = "snps,dwmac-mdio";
@@ -58,7 +58,7 @@
 &gmac1 {
 	status = "okay";
 	phy-mode = "rgmii";
 	phy-mode = "rgmii-id";
 	phy-handle = <&phy1>;
 	mdio {
 		compatible = "snps,dwmac-mdio";

2

arch/loongarch/boot/dts/loongson-2k2000-ref.dts

View File

@@ -92,7 +92,7 @@
 &gmac2 {
 	status = "okay";
 	phy-mode = "rgmii";
 	phy-mode = "rgmii-id";
 	phy-handle = <&phy2>;
 	mdio {
 		compatible = "snps,dwmac-mdio";

									
										1

arch/loongarch/include/asm/numa.h
									
												View File
												
				@@ -56,6 +56,7 @@ extern int early_cpu_to_node(int cpu);

				static inline void early_numa_add_cpu(int cpuid, s16 node)	{ }

				static inline void numa_add_cpu(unsigned int cpu)		{ }

				static inline void numa_remove_cpu(unsigned int cpu)		{ }

				static inline void set_cpuid_to_node(int cpuid, s16 node)	{ }

				static inline int early_cpu_to_node(int cpu)

				{

									
										2

arch/loongarch/include/asm/stackframe.h
									
												View File
												
				@@ -42,7 +42,7 @@

					.macro JUMP_VIRT_ADDR temp1 temp2

					li.d	\temp1, CACHE_BASE

					pcaddi	\temp2, 0

					or	\temp1, \temp1, \temp2

					bstrins.d  \temp1, \temp2, (DMW_PABITS - 1), 0

					jirl	zero, \temp1, 0xc

					.endm

									
										2

arch/loongarch/kernel/head.S
									
												View File
												
				@@ -22,7 +22,7 @@

				_head:

					.word	MZ_MAGIC		/* "MZ", MS-DOS header */

					.org	0x8

					.dword	kernel_entry		/* Kernel entry point */

					.dword	_kernel_entry		/* Kernel entry point (physical address) */

					.dword	_kernel_asize		/* Kernel image effective size */

					.quad	PHYS_LINK_KADDR		/* Kernel image load offset from start of RAM */

					.org	0x38			/* 0x20 ~ 0x37 reserved */

									
										6

arch/loongarch/kernel/setup.c
									
												View File
												
				@@ -282,7 +282,7 @@ static void __init fdt_setup(void)

						return;

					/* Prefer to use built-in dtb, checking its legality first. */

					if (!fdt_check_header(__dtb_start))

					if (IS_ENABLED(CONFIG_BUILTIN_DTB) && !fdt_check_header(__dtb_start))

						fdt_pointer = __dtb_start;

					else

						fdt_pointer = efi_fdt_pointer(); /* Fallback to firmware dtb */

				@@ -351,10 +351,8 @@ void __init platform_init(void)

					arch_reserve_vmcore();

					arch_reserve_crashkernel();

				#ifdef CONFIG_ACPI_TABLE_UPGRADE

					acpi_table_upgrade();

				#endif

				#ifdef CONFIG_ACPI

					acpi_table_upgrade();

					acpi_gbl_use_default_register_widths = false;

					acpi_boot_table_init();

				#endif

									
										5

arch/loongarch/kernel/smp.c
									
												View File
												
				@@ -273,7 +273,6 @@ static void __init fdt_smp_setup(void)

						if (cpuid == loongson_sysconf.boot_cpu_id) {

							cpu = 0;

							numa_add_cpu(cpu);

						} else {

							cpu = cpumask_next_zero(-1, cpu_present_mask);

						}

				@@ -283,6 +282,9 @@ static void __init fdt_smp_setup(void)

						set_cpu_present(cpu, true);

						__cpu_number_map[cpuid] = cpu;

						__cpu_logical_map[cpu] = cpuid;

						early_numa_add_cpu(cpu, 0);

						set_cpuid_to_node(cpuid, 0);

					}

					loongson_sysconf.nr_cpus = num_processors;

				@@ -468,6 +470,7 @@ void smp_prepare_boot_cpu(void)

					set_cpu_possible(0, true);

					set_cpu_online(0, true);

					set_my_cpu_offset(per_cpu_offset(0));

					numa_add_cpu(0);

					rr_node = first_node(node_online_map);

					for_each_possible_cpu(cpu) {

									
										10

arch/loongarch/kernel/vmlinux.lds.S
									
												View File
												
				@@ -6,6 +6,7 @@

				#define PAGE_SIZE _PAGE_SIZE

				#define RO_EXCEPTION_TABLE_ALIGN	4

				#define PHYSADDR_MASK			0xffffffffffff /* 48-bit */

				/*

				 * Put .bss..swapper_pg_dir as the first thing in .bss. This will

				@@ -142,10 +143,11 @@ SECTIONS

				#ifdef CONFIG_EFI_STUB

					/* header symbols */

					_kernel_asize = _end - _text;

					_kernel_fsize = _edata - _text;

					_kernel_vsize = _end - __initdata_begin;

					_kernel_rsize = _edata - __initdata_begin;

					_kernel_entry = ABSOLUTE(kernel_entry & PHYSADDR_MASK);

					_kernel_asize = ABSOLUTE(_end - _text);

					_kernel_fsize = ABSOLUTE(_edata - _text);

					_kernel_vsize = ABSOLUTE(_end - __initdata_begin);

					_kernel_rsize = ABSOLUTE(_edata - __initdata_begin);

				#endif

					.gptab.sdata : {

									
										15

arch/parisc/include/asm/cacheflush.h
									
												View File
												
				@@ -31,18 +31,17 @@ void flush_cache_all_local(void);

				void flush_cache_all(void);

				void flush_cache_mm(struct mm_struct *mm);

				void flush_kernel_dcache_page_addr(const void *addr);

				#define flush_kernel_dcache_range(start,size) \

					flush_kernel_dcache_range_asm((start), (start)+(size));

				/* The only way to flush a vmap range is to flush whole cache */

				#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1

				void flush_kernel_vmap_range(void *vaddr, int size);

				void invalidate_kernel_vmap_range(void *vaddr, int size);

				#define flush_cache_vmap(start, end)		flush_cache_all()

				void flush_cache_vmap(unsigned long start, unsigned long end);

				#define flush_cache_vmap_early(start, end)	do { } while (0)

				#define flush_cache_vunmap(start, end)		flush_cache_all()

				void flush_cache_vunmap(unsigned long start, unsigned long end);

				void flush_dcache_folio(struct folio *folio);

				#define flush_dcache_folio flush_dcache_folio

				@@ -77,17 +76,11 @@ void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr,

				void flush_cache_range(struct vm_area_struct *vma,

						unsigned long start, unsigned long end);

				/* defined in pacache.S exported in cache.c used by flush_anon_page */

				void flush_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);

				#define ARCH_HAS_FLUSH_ANON_PAGE

				void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr);

				#define ARCH_HAS_FLUSH_ON_KUNMAP

				static inline void kunmap_flush_on_unmap(const void *addr)

				{

					flush_kernel_dcache_page_addr(addr);

				}

				void kunmap_flush_on_unmap(const void *addr);

				#endif /* _PARISC_CACHEFLUSH_H */

									
										27

arch/parisc/include/asm/pgtable.h
									
												View File
												
				@@ -448,14 +448,17 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)

					return pte;

				}

				static inline pte_t ptep_get(pte_t *ptep)

				{

					return READ_ONCE(*ptep);

				}

				#define ptep_get ptep_get

				static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)

				{

					pte_t pte;

					if (!pte_young(*ptep))

						return 0;

					pte = *ptep;

					pte = ptep_get(ptep);

					if (!pte_young(pte)) {

						return 0;

					}

				@@ -463,17 +466,10 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned

					return 1;

				}

				int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);

				pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);

				struct mm_struct;

				static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)

				{

					pte_t old_pte;

					old_pte = *ptep;

					set_pte(ptep, __pte(0));

					return old_pte;

				}

				static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)

				{

					set_pte(ptep, pte_wrprotect(*ptep));

				@@ -511,7 +507,8 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,

				#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN

				#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG

				#define __HAVE_ARCH_PTEP_GET_AND_CLEAR

				#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH

				#define __HAVE_ARCH_PTEP_CLEAR_FLUSH

				#define __HAVE_ARCH_PTEP_SET_WRPROTECT

				#define __HAVE_ARCH_PTE_SAME

									
										421

arch/parisc/kernel/cache.c
									
												View File
												
				@@ -20,6 +20,7 @@

				#include <linux/sched.h>

				#include <linux/sched/mm.h>

				#include <linux/syscalls.h>

				#include <linux/vmalloc.h>

				#include <asm/pdc.h>

				#include <asm/cache.h>

				#include <asm/cacheflush.h>

				@@ -31,20 +32,31 @@

				#include <asm/mmu_context.h>

				#include <asm/cachectl.h>

				#define PTR_PAGE_ALIGN_DOWN(addr) PTR_ALIGN_DOWN(addr, PAGE_SIZE)

				/*

				 * When nonzero, use _PAGE_ACCESSED bit to try to reduce the number

				 * of page flushes done flush_cache_page_if_present. There are some

				 * pros and cons in using this option. It may increase the risk of

				 * random segmentation faults.

				 */

				#define CONFIG_FLUSH_PAGE_ACCESSED	0

				int split_tlb __ro_after_init;

				int dcache_stride __ro_after_init;

				int icache_stride __ro_after_init;

				EXPORT_SYMBOL(dcache_stride);

				/* Internal implementation in arch/parisc/kernel/pacache.S */

				void flush_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);

				EXPORT_SYMBOL(flush_dcache_page_asm);

				void purge_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);

				void flush_icache_page_asm(unsigned long phys_addr, unsigned long vaddr);

				/* Internal implementation in arch/parisc/kernel/pacache.S */

				void flush_data_cache_local(void *);  /* flushes local data-cache only */

				void flush_instruction_cache_local(void); /* flushes local code-cache only */

				static void flush_kernel_dcache_page_addr(const void *addr);

				/* On some machines (i.e., ones with the Merced bus), there can be

				 * only a single PxTLB broadcast at a time; this must be guaranteed

				 * by software. We need a spinlock around all TLB flushes to ensure

				@@ -321,6 +333,18 @@ __flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr,

				{

					if (!static_branch_likely(&parisc_has_cache))

						return;

					/*

					 * The TLB is the engine of coherence on parisc.  The CPU is

					 * entitled to speculate any page with a TLB mapping, so here

					 * we kill the mapping then flush the page along a special flush

					 * only alias mapping. This guarantees that the page is no-longer

					 * in the cache for any process and nor may it be speculatively

					 * read in (until the user or kernel specifically accesses it,

					 * of course).

					 */

					flush_tlb_page(vma, vmaddr);

					preempt_disable();

					flush_dcache_page_asm(physaddr, vmaddr);

					if (vma->vm_flags & VM_EXEC)

				@@ -328,46 +352,44 @@ __flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr,

					preempt_enable();

				}

				static void flush_user_cache_page(struct vm_area_struct *vma, unsigned long vmaddr)

				static void flush_kernel_dcache_page_addr(const void *addr)

				{

					unsigned long flags, space, pgd, prot;

				#ifdef CONFIG_TLB_PTLOCK

					unsigned long pgd_lock;

				#endif

					unsigned long vaddr = (unsigned long)addr;

					unsigned long flags;

					vmaddr &= PAGE_MASK;

					/* Purge TLB entry to remove translation on all CPUs */

					purge_tlb_start(flags);

					pdtlb(SR_KERNEL, addr);

					purge_tlb_end(flags);

					/* Use tmpalias flush to prevent data cache move-in */

					preempt_disable();

					/* Set context for flush */

					local_irq_save(flags);

					prot = mfctl(8);

					space = mfsp(SR_USER);

					pgd = mfctl(25);

				#ifdef CONFIG_TLB_PTLOCK

					pgd_lock = mfctl(28);

				#endif

					switch_mm_irqs_off(NULL, vma->vm_mm, NULL);

					local_irq_restore(flags);

					flush_user_dcache_range_asm(vmaddr, vmaddr + PAGE_SIZE);

					if (vma->vm_flags & VM_EXEC)

						flush_user_icache_range_asm(vmaddr, vmaddr + PAGE_SIZE);

					flush_tlb_page(vma, vmaddr);

					/* Restore previous context */

					local_irq_save(flags);

				#ifdef CONFIG_TLB_PTLOCK

					mtctl(pgd_lock, 28);

				#endif

					mtctl(pgd, 25);

					mtsp(space, SR_USER);

					mtctl(prot, 8);

					local_irq_restore(flags);

					flush_dcache_page_asm(__pa(vaddr), vaddr);

					preempt_enable();

				}

				static void flush_kernel_icache_page_addr(const void *addr)

				{

					unsigned long vaddr = (unsigned long)addr;

					unsigned long flags;

					/* Purge TLB entry to remove translation on all CPUs */

					purge_tlb_start(flags);

					pdtlb(SR_KERNEL, addr);

					purge_tlb_end(flags);

					/* Use tmpalias flush to prevent instruction cache move-in */

					preempt_disable();

					flush_icache_page_asm(__pa(vaddr), vaddr);

					preempt_enable();

				}

				void kunmap_flush_on_unmap(const void *addr)

				{

					flush_kernel_dcache_page_addr(addr);

				}

				EXPORT_SYMBOL(kunmap_flush_on_unmap);

				void flush_icache_pages(struct vm_area_struct *vma, struct page *page,

						unsigned int nr)

				{

				@@ -375,13 +397,16 @@ void flush_icache_pages(struct vm_area_struct *vma, struct page *page,

					for (;;) {

						flush_kernel_dcache_page_addr(kaddr);

						flush_kernel_icache_page(kaddr);

						flush_kernel_icache_page_addr(kaddr);

						if (--nr == 0)

							break;

						kaddr += PAGE_SIZE;

					}

				}

				/*

				 * Walk page directory for MM to find PTEP pointer for address ADDR.

				 */

				static inline pte_t *get_ptep(struct mm_struct *mm, unsigned long addr)

				{

					pte_t *ptep = NULL;

				@@ -410,6 +435,41 @@ static inline bool pte_needs_flush(pte_t pte)

						== (_PAGE_PRESENT | _PAGE_ACCESSED);

				}

				/*

				 * Return user physical address. Returns 0 if page is not present.

				 */

				static inline unsigned long get_upa(struct mm_struct *mm, unsigned long addr)

				{

					unsigned long flags, space, pgd, prot, pa;

				#ifdef CONFIG_TLB_PTLOCK

					unsigned long pgd_lock;

				#endif

					/* Save context */

					local_irq_save(flags);

					prot = mfctl(8);

					space = mfsp(SR_USER);

					pgd = mfctl(25);

				#ifdef CONFIG_TLB_PTLOCK

					pgd_lock = mfctl(28);

				#endif

					/* Set context for lpa_user */

					switch_mm_irqs_off(NULL, mm, NULL);

					pa = lpa_user(addr);

					/* Restore previous context */

				#ifdef CONFIG_TLB_PTLOCK

					mtctl(pgd_lock, 28);

				#endif

					mtctl(pgd, 25);

					mtsp(space, SR_USER);

					mtctl(prot, 8);

					local_irq_restore(flags);

					return pa;

				}

				void flush_dcache_folio(struct folio *folio)

				{

					struct address_space *mapping = folio_flush_mapping(folio);

				@@ -458,50 +518,23 @@ void flush_dcache_folio(struct folio *folio)

						if (addr + nr * PAGE_SIZE > vma->vm_end)

							nr = (vma->vm_end - addr) / PAGE_SIZE;

						if (parisc_requires_coherency()) {

							for (i = 0; i < nr; i++) {

								pte_t *ptep = get_ptep(vma->vm_mm,

											addr + i * PAGE_SIZE);

								if (!ptep)

									continue;

								if (pte_needs_flush(*ptep))

									flush_user_cache_page(vma,

											addr + i * PAGE_SIZE);

								/* Optimise accesses to the same table? */

								pte_unmap(ptep);

							}

						} else {

							/*

							 * The TLB is the engine of coherence on parisc:

							 * The CPU is entitled to speculate any page

							 * with a TLB mapping, so here we kill the

							 * mapping then flush the page along a special

							 * flush only alias mapping. This guarantees that

							 * the page is no-longer in the cache for any

							 * process and nor may it be speculatively read

							 * in (until the user or kernel specifically

							 * accesses it, of course)

							 */

							for (i = 0; i < nr; i++)

								flush_tlb_page(vma, addr + i * PAGE_SIZE);

							if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))

						if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))

									!= (addr & (SHM_COLOUR - 1))) {

								for (i = 0; i < nr; i++)

									__flush_cache_page(vma,

										addr + i * PAGE_SIZE,

										(pfn + i) * PAGE_SIZE);

								/*

								 * Software is allowed to have any number

								 * of private mappings to a page.

								 */

								if (!(vma->vm_flags & VM_SHARED))

									continue;

								if (old_addr)

									pr_err("INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n",

										old_addr, addr, vma->vm_file);

								if (nr == folio_nr_pages(folio))

									old_addr = addr;

							}

							for (i = 0; i < nr; i++)

								__flush_cache_page(vma,

									addr + i * PAGE_SIZE,

									(pfn + i) * PAGE_SIZE);

							/*

							 * Software is allowed to have any number

							 * of private mappings to a page.

							 */

							if (!(vma->vm_flags & VM_SHARED))

								continue;

							if (old_addr)

								pr_err("INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n",

									old_addr, addr, vma->vm_file);

							if (nr == folio_nr_pages(folio))

								old_addr = addr;

						}

						WARN_ON(++count == 4096);

					}

				@@ -591,35 +624,28 @@ extern void purge_kernel_dcache_page_asm(unsigned long);

				extern void clear_user_page_asm(void *, unsigned long);

				extern void copy_user_page_asm(void *, void *, unsigned long);

				void flush_kernel_dcache_page_addr(const void *addr)

				{

					unsigned long flags;

					flush_kernel_dcache_page_asm(addr);

					purge_tlb_start(flags);

					pdtlb(SR_KERNEL, addr);

					purge_tlb_end(flags);

				}

				EXPORT_SYMBOL(flush_kernel_dcache_page_addr);

				static void flush_cache_page_if_present(struct vm_area_struct *vma,

					unsigned long vmaddr, unsigned long pfn)

					unsigned long vmaddr)

				{

				#if CONFIG_FLUSH_PAGE_ACCESSED

					bool needs_flush = false;

					pte_t *ptep;

					pte_t *ptep, pte;

					/*

					 * The pte check is racy and sometimes the flush will trigger

					 * a non-access TLB miss. Hopefully, the page has already been

					 * flushed.

					 */

					ptep = get_ptep(vma->vm_mm, vmaddr);

					if (ptep) {

						needs_flush = pte_needs_flush(*ptep);

						pte = ptep_get(ptep);

						needs_flush = pte_needs_flush(pte);

						pte_unmap(ptep);

					}

					if (needs_flush)

						flush_cache_page(vma, vmaddr, pfn);

						__flush_cache_page(vma, vmaddr, PFN_PHYS(pte_pfn(pte)));

				#else

					struct mm_struct *mm = vma->vm_mm;

					unsigned long physaddr = get_upa(mm, vmaddr);

					if (physaddr)

						__flush_cache_page(vma, vmaddr, PAGE_ALIGN_DOWN(physaddr));

				#endif

				}

				void copy_user_highpage(struct page *to, struct page *from,

				@@ -629,7 +655,7 @@ void copy_user_highpage(struct page *to, struct page *from,

					kfrom = kmap_local_page(from);

					kto = kmap_local_page(to);

					flush_cache_page_if_present(vma, vaddr, page_to_pfn(from));

					__flush_cache_page(vma, vaddr, PFN_PHYS(page_to_pfn(from)));

					copy_page_asm(kto, kfrom);

					kunmap_local(kto);

					kunmap_local(kfrom);

				@@ -638,16 +664,17 @@ void copy_user_highpage(struct page *to, struct page *from,

				void copy_to_user_page(struct vm_area_struct *vma, struct page *page,

						unsigned long user_vaddr, void *dst, void *src, int len)

				{

					flush_cache_page_if_present(vma, user_vaddr, page_to_pfn(page));

					__flush_cache_page(vma, user_vaddr, PFN_PHYS(page_to_pfn(page)));

					memcpy(dst, src, len);

					flush_kernel_dcache_range_asm((unsigned long)dst, (unsigned long)dst + len);

					flush_kernel_dcache_page_addr(PTR_PAGE_ALIGN_DOWN(dst));

				}

				void copy_from_user_page(struct vm_area_struct *vma, struct page *page,

						unsigned long user_vaddr, void *dst, void *src, int len)

				{

					flush_cache_page_if_present(vma, user_vaddr, page_to_pfn(page));

					__flush_cache_page(vma, user_vaddr, PFN_PHYS(page_to_pfn(page)));

					memcpy(dst, src, len);

					flush_kernel_dcache_page_addr(PTR_PAGE_ALIGN_DOWN(src));

				}

				/* __flush_tlb_range()

				@@ -681,32 +708,10 @@ int __flush_tlb_range(unsigned long sid, unsigned long start,

				static void flush_cache_pages(struct vm_area_struct *vma, unsigned long start, unsigned long end)

				{

					unsigned long addr, pfn;

					pte_t *ptep;

					unsigned long addr;

					for (addr = start; addr < end; addr += PAGE_SIZE) {

						bool needs_flush = false;

						/*

						 * The vma can contain pages that aren't present. Although

						 * the pte search is expensive, we need the pte to find the

						 * page pfn and to check whether the page should be flushed.

						 */

						ptep = get_ptep(vma->vm_mm, addr);

						if (ptep) {

							needs_flush = pte_needs_flush(*ptep);

							pfn = pte_pfn(*ptep);

							pte_unmap(ptep);

						}

						if (needs_flush) {

							if (parisc_requires_coherency()) {

								flush_user_cache_page(vma, addr);

							} else {

								if (WARN_ON(!pfn_valid(pfn)))

									return;

								__flush_cache_page(vma, addr, PFN_PHYS(pfn));

							}

						}

					}

					for (addr = start; addr < end; addr += PAGE_SIZE)

						flush_cache_page_if_present(vma, addr);

				}

				static inline unsigned long mm_total_size(struct mm_struct *mm)

				@@ -757,21 +762,19 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned

						if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled()))

							return;

						flush_tlb_range(vma, start, end);

						flush_cache_all();

						if (vma->vm_flags & VM_EXEC)

							flush_cache_all();

						else

							flush_data_cache();

						return;

					}

					flush_cache_pages(vma, start, end);

					flush_cache_pages(vma, start & PAGE_MASK, end);

				}

				void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn)

				{

					if (WARN_ON(!pfn_valid(pfn)))

						return;

					if (parisc_requires_coherency())

						flush_user_cache_page(vma, vmaddr);

					else

						__flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));

					__flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));

				}

				void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr)

				@@ -779,34 +782,133 @@ void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned lon

					if (!PageAnon(page))

						return;

					if (parisc_requires_coherency()) {

						if (vma->vm_flags & VM_SHARED)

							flush_data_cache();

						else

							flush_user_cache_page(vma, vmaddr);

					__flush_cache_page(vma, vmaddr, PFN_PHYS(page_to_pfn(page)));

				}

				int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long addr,

							   pte_t *ptep)

				{

					pte_t pte = ptep_get(ptep);

					if (!pte_young(pte))

						return 0;

					set_pte(ptep, pte_mkold(pte));

				#if CONFIG_FLUSH_PAGE_ACCESSED

					__flush_cache_page(vma, addr, PFN_PHYS(pte_pfn(pte)));

				#endif

					return 1;

				}

				/*

				 * After a PTE is cleared, we have no way to flush the cache for

				 * the physical page. On PA8800 and PA8900 processors, these lines

				 * can cause random cache corruption. Thus, we must flush the cache

				 * as well as the TLB when clearing a PTE that's valid.

				 */

				pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long addr,

						       pte_t *ptep)

				{

					struct mm_struct *mm = (vma)->vm_mm;

					pte_t pte = ptep_get_and_clear(mm, addr, ptep);

					unsigned long pfn = pte_pfn(pte);

					if (pfn_valid(pfn))

						__flush_cache_page(vma, addr, PFN_PHYS(pfn));

					else if (pte_accessible(mm, pte))

						flush_tlb_page(vma, addr);

					return pte;

				}

				/*

				 * The physical address for pages in the ioremap case can be obtained

				 * from the vm_struct struct. I wasn't able to successfully handle the

				 * vmalloc and vmap cases. We have an array of struct page pointers in

				 * the uninitialized vmalloc case but the flush failed using page_to_pfn.

				 */

				void flush_cache_vmap(unsigned long start, unsigned long end)

				{

					unsigned long addr, physaddr;

					struct vm_struct *vm;

					/* Prevent cache move-in */

					flush_tlb_kernel_range(start, end);

					if (end - start >= parisc_cache_flush_threshold) {

						flush_cache_all();

						return;

					}

					flush_tlb_page(vma, vmaddr);

					preempt_disable();

					flush_dcache_page_asm(page_to_phys(page), vmaddr);

					preempt_enable();

				}

					if (WARN_ON_ONCE(!is_vmalloc_addr((void *)start))) {

						flush_cache_all();

						return;

					}

					vm = find_vm_area((void *)start);

					if (WARN_ON_ONCE(!vm)) {

						flush_cache_all();

						return;

					}

					/* The physical addresses of IOREMAP regions are contiguous */

					if (vm->flags & VM_IOREMAP) {

						physaddr = vm->phys_addr;

						for (addr = start; addr < end; addr += PAGE_SIZE) {

							preempt_disable();

							flush_dcache_page_asm(physaddr, start);

							flush_icache_page_asm(physaddr, start);

							preempt_enable();

							physaddr += PAGE_SIZE;

						}

						return;

					}

					flush_cache_all();

				}

				EXPORT_SYMBOL(flush_cache_vmap);

				/*

				 * The vm_struct has been retired and the page table is set up. The

				 * last page in the range is a guard page. Its physical address can't

				 * be determined using lpa, so there is no way to flush the range

				 * using flush_dcache_page_asm.

				 */

				void flush_cache_vunmap(unsigned long start, unsigned long end)

				{

					/* Prevent cache move-in */

					flush_tlb_kernel_range(start, end);

					flush_data_cache();

				}

				EXPORT_SYMBOL(flush_cache_vunmap);

				/*

				 * On systems with PA8800/PA8900 processors, there is no way to flush

				 * a vmap range other than using the architected loop to flush the

				 * entire cache. The page directory is not set up, so we can't use

				 * fdc, etc. FDCE/FICE don't work to flush a portion of the cache.

				 * L2 is physically indexed but FDCE/FICE instructions in virtual

				 * mode output their virtual address on the core bus, not their

				 * real address. As a result, the L2 cache index formed from the

				 * virtual address will most likely not be the same as the L2 index

				 * formed from the real address.

				 */

				void flush_kernel_vmap_range(void *vaddr, int size)

				{

					unsigned long start = (unsigned long)vaddr;

					unsigned long end = start + size;

					if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&

					    (unsigned long)size >= parisc_cache_flush_threshold) {

						flush_tlb_kernel_range(start, end);

						flush_data_cache();

					flush_tlb_kernel_range(start, end);

					if (!static_branch_likely(&parisc_has_dcache))

						return;

					/* If interrupts are disabled, we can only do local flush */

					if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled())) {

						flush_data_cache_local(NULL);

						return;

					}

					flush_kernel_dcache_range_asm(start, end);

					flush_tlb_kernel_range(start, end);

					flush_data_cache();

				}

				EXPORT_SYMBOL(flush_kernel_vmap_range);

				@@ -818,15 +920,18 @@ void invalidate_kernel_vmap_range(void *vaddr, int size)

					/* Ensure DMA is complete */

					asm_syncdma();

					if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&

					    (unsigned long)size >= parisc_cache_flush_threshold) {

						flush_tlb_kernel_range(start, end);

						flush_data_cache();

					flush_tlb_kernel_range(start, end);

					if (!static_branch_likely(&parisc_has_dcache))

						return;

					/* If interrupts are disabled, we can only do local flush */

					if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled())) {

						flush_data_cache_local(NULL);

						return;

					}

					purge_kernel_dcache_range_asm(start, end);

					flush_tlb_kernel_range(start, end);

					flush_data_cache();

				}

				EXPORT_SYMBOL(invalidate_kernel_vmap_range);

2

arch/powerpc/Kconfig

View File

@@ -137,7 +137,7 @@ config PPC
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_HUGEPD			if HUGETLB_PAGE
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_KERNEL_FPU_SUPPORT	if PPC_FPU
 	select ARCH_HAS_KERNEL_FPU_SUPPORT	if PPC64 && PPC_FPU
 	select ARCH_HAS_MEMBARRIER_CALLBACKS
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_MEMREMAP_COMPAT_ALIGN	if PPC_64S_HASH_MMU

									
										27

arch/powerpc/include/asm/uaccess.h
									
												View File
												
				@@ -92,9 +92,25 @@ __pu_failed:							\

						: label)

				#endif

				#ifdef CONFIG_CC_IS_CLANG

				#define DS_FORM_CONSTRAINT "Z<>"

				#else

				#define DS_FORM_CONSTRAINT "YZ<>"

				#endif

				#ifdef __powerpc64__

				#ifdef CONFIG_PPC_KERNEL_PREFIXED

				#define __put_user_asm2_goto(x, ptr, label)			\

					__put_user_asm_goto(x, ptr, label, "std")

				#else

				#define __put_user_asm2_goto(x, addr, label)			\

					asm goto ("1: std%U1%X1 %0,%1	# put_user\n"		\

						EX_TABLE(1b, %l2)				\

						:						\

						: "r" (x), DS_FORM_CONSTRAINT (*addr)		\

						:						\

						: label)

				#endif // CONFIG_PPC_KERNEL_PREFIXED

				#else /* __powerpc64__ */

				#define __put_user_asm2_goto(x, addr, label)			\

					asm goto(					\

				@@ -165,8 +181,19 @@ do {								\

				#endif

				#ifdef __powerpc64__

				#ifdef CONFIG_PPC_KERNEL_PREFIXED

				#define __get_user_asm2_goto(x, addr, label)			\

					__get_user_asm_goto(x, addr, label, "ld")

				#else

				#define __get_user_asm2_goto(x, addr, label)			\

					asm_goto_output(					\

						"1:	ld%U1%X1 %0, %1	# get_user\n"		\

						EX_TABLE(1b, %l2)				\

						: "=r" (x)					\

						: DS_FORM_CONSTRAINT (*addr)			\

						:						\

						: label)

				#endif // CONFIG_PPC_KERNEL_PREFIXED

				#else /* __powerpc64__ */

				#define __get_user_asm2_goto(x, addr, label)			\

					asm_goto_output(					\

									
										12

arch/powerpc/net/bpf_jit_comp32.c
									
												View File
												
				@@ -900,6 +900,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code

							/* Get offset into TMP_REG */

							EMIT(PPC_RAW_LI(tmp_reg, off));

							/*

							 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'

							 * before and after the operation.

							 *

							 * This is a requirement in the Linux Kernel Memory Model.

							 * See __cmpxchg_u32() in asm/cmpxchg.h as an example.

							 */

							if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))

								EMIT(PPC_RAW_SYNC());

							tmp_idx = ctx->idx * 4;

							/* load value from memory into r0 */

							EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));

				@@ -953,6 +962,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code

							/* For the BPF_FETCH variant, get old data into src_reg */

							if (imm & BPF_FETCH) {

								/* Emit 'sync' to enforce full ordering */

								if (IS_ENABLED(CONFIG_SMP))

									EMIT(PPC_RAW_SYNC());

								EMIT(PPC_RAW_MR(ret_reg, ax_reg));

								if (!fp->aux->verifier_zext)

									EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */

									
										12

arch/powerpc/net/bpf_jit_comp64.c
									
												View File
												
				@@ -846,6 +846,15 @@ emit_clear:

							/* Get offset into TMP_REG_1 */

							EMIT(PPC_RAW_LI(tmp1_reg, off));

							/*

							 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'

							 * before and after the operation.

							 *

							 * This is a requirement in the Linux Kernel Memory Model.

							 * See __cmpxchg_u64() in asm/cmpxchg.h as an example.

							 */

							if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))

								EMIT(PPC_RAW_SYNC());

							tmp_idx = ctx->idx * 4;

							/* load value from memory into TMP_REG_2 */

							if (size == BPF_DW)

				@@ -908,6 +917,9 @@ emit_clear:

							PPC_BCC_SHORT(COND_NE, tmp_idx);

							if (imm & BPF_FETCH) {

								/* Emit 'sync' to enforce full ordering */

								if (IS_ENABLED(CONFIG_SMP))

									EMIT(PPC_RAW_SYNC());

								EMIT(PPC_RAW_MR(ret_reg, _R0));

								/*

								 * Skip unnecessary zero-extension for 32-bit cmpxchg.

									
										4

arch/powerpc/platforms/pseries/lparcfg.c
									
												View File
												
				@@ -371,8 +371,8 @@ static int read_dt_lpar_name(struct seq_file *m)

				static void read_lpar_name(struct seq_file *m)

				{

					if (read_rtas_lpar_name(m) && read_dt_lpar_name(m))

						pr_err_once("Error can't get the LPAR name");

					if (read_rtas_lpar_name(m))

						read_dt_lpar_name(m);

				}

				#define SPLPAR_MAXLENGTH 1026*(sizeof(char))

2

arch/riscv/Kconfig

View File

@@ -106,7 +106,7 @@ config RISCV
 	select HAS_IOPORT if MMU
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
 	select HAVE_ARCH_HUGE_VMAP if MMU && 64BIT && !XIP_KERNEL
 	select HAVE_ARCH_HUGE_VMAP if MMU && 64BIT
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE if !XIP_KERNEL
 	select HAVE_ARCH_KASAN if MMU && 64BIT

									
										22

arch/riscv/include/asm/cmpxchg.h
									
												View File
												
				@@ -10,7 +10,7 @@

				#include <asm/fence.h>

				#define __arch_xchg_masked(prepend, append, r, p, n)			\

				#define __arch_xchg_masked(sc_sfx, prepend, append, r, p, n)		\

				({									\

					u32 *__ptr32b = (u32 *)((ulong)(p) & ~0x3);			\

					ulong __s = ((ulong)(p) & (0x4 - sizeof(*p))) * BITS_PER_BYTE;	\

				@@ -25,7 +25,7 @@

					       "0:	lr.w %0, %2\n"					\

					       "	and  %1, %0, %z4\n"				\

					       "	or   %1, %1, %z3\n"				\

					       "	sc.w %1, %1, %2\n"				\

					       "	sc.w" sc_sfx " %1, %1, %2\n"			\

					       "	bnez %1, 0b\n"					\

					       append							\

					       : "=&r" (__retx), "=&r" (__rc), "+A" (*(__ptr32b))	\

				@@ -46,7 +46,8 @@

						: "memory");						\

				})

				#define _arch_xchg(ptr, new, sfx, prepend, append)			\

				#define _arch_xchg(ptr, new, sc_sfx, swap_sfx, prepend,			\

						   sc_append, swap_append)				\

				({									\

					__typeof__(ptr) __ptr = (ptr);					\

					__typeof__(*(__ptr)) __new = (new);				\

				@@ -55,15 +56,15 @@

					switch (sizeof(*__ptr)) {					\

					case 1:								\

					case 2:								\

						__arch_xchg_masked(prepend, append,			\

						__arch_xchg_masked(sc_sfx, prepend, sc_append,		\

								   __ret, __ptr, __new);		\

						break;							\

					case 4:								\

						__arch_xchg(".w" sfx, prepend, append,			\

						__arch_xchg(".w" swap_sfx, prepend, swap_append,	\

							      __ret, __ptr, __new);			\

						break;							\

					case 8:								\

						__arch_xchg(".d" sfx, prepend, append,			\

						__arch_xchg(".d" swap_sfx, prepend, swap_append,	\

							      __ret, __ptr, __new);			\

						break;							\

					default:							\

				@@ -73,16 +74,17 @@

				})

				#define arch_xchg_relaxed(ptr, x)					\

					_arch_xchg(ptr, x, "", "", "")

					_arch_xchg(ptr, x, "", "", "", "", "")

				#define arch_xchg_acquire(ptr, x)					\

					_arch_xchg(ptr, x, "", "", RISCV_ACQUIRE_BARRIER)

					_arch_xchg(ptr, x, "", "", "",					\

						   RISCV_ACQUIRE_BARRIER, RISCV_ACQUIRE_BARRIER)

				#define arch_xchg_release(ptr, x)					\

					_arch_xchg(ptr, x, "", RISCV_RELEASE_BARRIER, "")

					_arch_xchg(ptr, x, "", "", RISCV_RELEASE_BARRIER, "", "")

				#define arch_xchg(ptr, x)						\

					_arch_xchg(ptr, x, ".aqrl", "", "")

					_arch_xchg(ptr, x, ".rl", ".aqrl", "", RISCV_FULL_BARRIER, "")

				#define xchg32(ptr, x)							\

				({									\

									
										2

arch/riscv/kernel/cpu_ops_sbi.c
									
												View File
												
				@@ -72,7 +72,7 @@ static int sbi_cpu_start(unsigned int cpuid, struct task_struct *tidle)

					/* Make sure tidle is updated */

					smp_mb();

					bdata->task_ptr = tidle;

					bdata->stack_ptr = task_stack_page(tidle) + THREAD_SIZE;

					bdata->stack_ptr = task_pt_regs(tidle);

					/* Make sure boot data is updated */

					smp_mb();

					hsm_data = __pa(bdata);

									
										3

arch/riscv/kernel/cpu_ops_spinwait.c
									
												View File
												
				@@ -34,8 +34,7 @@ static void cpu_update_secondary_bootdata(unsigned int cpuid,

					/* Make sure tidle is updated */

					smp_mb();

					WRITE_ONCE(__cpu_spinwait_stack_pointer[hartid],

						   task_stack_page(tidle) + THREAD_SIZE);

					WRITE_ONCE(__cpu_spinwait_stack_pointer[hartid], task_pt_regs(tidle));

					WRITE_ONCE(__cpu_spinwait_task_pointer[hartid], tidle);

				}

									
										7

arch/riscv/kvm/aia_device.c
									
												View File
												
				@@ -237,10 +237,11 @@ static gpa_t aia_imsic_ppn(struct kvm_aia *aia, gpa_t addr)

				static u32 aia_imsic_hart_index(struct kvm_aia *aia, gpa_t addr)

				{

					u32 hart, group = 0;

					u32 hart = 0, group = 0;

					hart = (addr >> (aia->nr_guest_bits + IMSIC_MMIO_PAGE_SHIFT)) &

						GENMASK_ULL(aia->nr_hart_bits - 1, 0);

					if (aia->nr_hart_bits)

						hart = (addr >> (aia->nr_guest_bits + IMSIC_MMIO_PAGE_SHIFT)) &

						       GENMASK_ULL(aia->nr_hart_bits - 1, 0);

					if (aia->nr_group_bits)

						group = (addr >> aia->nr_group_shift) &

							GENMASK_ULL(aia->nr_group_bits - 1, 0);

									
										4

arch/riscv/kvm/vcpu_onereg.c
									
												View File
												
				@@ -724,9 +724,9 @@ static int kvm_riscv_vcpu_set_reg_isa_ext(struct kvm_vcpu *vcpu,

					switch (reg_subtype) {

					case KVM_REG_RISCV_ISA_SINGLE:

						return riscv_vcpu_set_isa_ext_single(vcpu, reg_num, reg_val);

					case KVM_REG_RISCV_SBI_MULTI_EN:

					case KVM_REG_RISCV_ISA_MULTI_EN:

						return riscv_vcpu_set_isa_ext_multi(vcpu, reg_num, reg_val, true);

					case KVM_REG_RISCV_SBI_MULTI_DIS:

					case KVM_REG_RISCV_ISA_MULTI_DIS:

						return riscv_vcpu_set_isa_ext_multi(vcpu, reg_num, reg_val, false);

					default:

						return -ENOENT;

									
										4

arch/riscv/mm/fault.c
									
												View File
												
				@@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)

					if (unlikely(access_error(cause, vma))) {

						vma_end_read(vma);

						count_vm_vma_lock_event(VMA_LOCK_SUCCESS);

						tsk->thread.bad_cause = SEGV_ACCERR;

						bad_area_nosemaphore(regs, code, addr);

						tsk->thread.bad_cause = cause;

						bad_area_nosemaphore(regs, SEGV_ACCERR, addr);

						return;

					}

									
										21

arch/riscv/mm/init.c
									
												View File
												
				@@ -250,18 +250,19 @@ static void __init setup_bootmem(void)

						kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;

					/*

					 * memblock allocator is not aware of the fact that last 4K bytes of

					 * the addressable memory can not be mapped because of IS_ERR_VALUE

					 * macro. Make sure that last 4k bytes are not usable by memblock

					 * if end of dram is equal to maximum addressable memory.  For 64-bit

					 * kernel, this problem can't happen here as the end of the virtual

					 * address space is occupied by the kernel mapping then this check must

					 * be done as soon as the kernel mapping base address is determined.

					 * Reserve physical address space that would be mapped to virtual

					 * addresses greater than (void *)(-PAGE_SIZE) because:

					 *  - This memory would overlap with ERR_PTR

					 *  - This memory belongs to high memory, which is not supported

					 *

					 * This is not applicable to 64-bit kernel, because virtual addresses

					 * after (void *)(-PAGE_SIZE) are not linearly mapped: they are

					 * occupied by kernel mapping. Also it is unrealistic for high memory

					 * to exist on 64-bit platforms.

					 */

					if (!IS_ENABLED(CONFIG_64BIT)) {

						max_mapped_addr = __pa(~(ulong)0);

						if (max_mapped_addr == (phys_ram_end - 1))

							memblock_set_current_limit(max_mapped_addr - 4096);

						max_mapped_addr = __va_to_pa_nodebug(-PAGE_SIZE);

						memblock_reserve(max_mapped_addr, (phys_addr_t)-max_mapped_addr);

					}

					min_low_pfn = PFN_UP(phys_ram_base);

									
										27

arch/s390/boot/startup.c
									
												View File
												
				@@ -384,7 +384,7 @@ static void fixup_vmlinux_info(void)

				void startup_kernel(void)

				{

					unsigned long kernel_size = vmlinux.image_size + vmlinux.bss_size;

					unsigned long nokaslr_offset_phys = mem_safe_offset();

					unsigned long nokaslr_offset_phys, kaslr_large_page_offset;

					unsigned long amode31_lma = 0;

					unsigned long max_physmem_end;

					unsigned long asce_limit;

				@@ -393,6 +393,12 @@ void startup_kernel(void)

					fixup_vmlinux_info();

					setup_lpp();

					/*

					 * Non-randomized kernel physical start address must be _SEGMENT_SIZE

					 * aligned (see blow).

					 */

					nokaslr_offset_phys = ALIGN(mem_safe_offset(), _SEGMENT_SIZE);

					safe_addr = PAGE_ALIGN(nokaslr_offset_phys + kernel_size);

					/*

				@@ -425,10 +431,25 @@ void startup_kernel(void)

					save_ipl_cert_comp_list();

					rescue_initrd(safe_addr, ident_map_size);

					if (kaslr_enabled())

						__kaslr_offset_phys = randomize_within_range(kernel_size, THREAD_SIZE, 0, ident_map_size);

					/*

					 * __kaslr_offset_phys must be _SEGMENT_SIZE aligned, so the lower

					 * 20 bits (the offset within a large page) are zero. Copy the last

					 * 20 bits of __kaslr_offset, which is THREAD_SIZE aligned, to

					 * __kaslr_offset_phys.

					 *

					 * With this the last 20 bits of __kaslr_offset_phys and __kaslr_offset

					 * are identical, which is required to allow for large mappings of the

					 * kernel image.

					 */

					kaslr_large_page_offset = __kaslr_offset & ~_SEGMENT_MASK;

					if (kaslr_enabled()) {

						unsigned long end = ident_map_size - kaslr_large_page_offset;

						__kaslr_offset_phys = randomize_within_range(kernel_size, _SEGMENT_SIZE, 0, end);

					}

					if (!__kaslr_offset_phys)

						__kaslr_offset_phys = nokaslr_offset_phys;

					__kaslr_offset_phys |= kaslr_large_page_offset;

					kaslr_adjust_vmlinux_info(__kaslr_offset_phys);

					physmem_reserve(RR_VMLINUX, __kaslr_offset_phys, kernel_size);

					deploy_kernel((void *)__kaslr_offset_phys);

									
										12

arch/s390/boot/vmem.c
									
												View File
												
				@@ -261,21 +261,27 @@ static unsigned long _pa(unsigned long addr, unsigned long size, enum populate_m

				static bool large_allowed(enum populate_mode mode)

				{

					return (mode == POPULATE_DIRECT) || (mode == POPULATE_IDENTITY);

					return (mode == POPULATE_DIRECT) || (mode == POPULATE_IDENTITY) || (mode == POPULATE_KERNEL);

				}

				static bool can_large_pud(pud_t *pu_dir, unsigned long addr, unsigned long end,

							  enum populate_mode mode)

				{

					unsigned long size = end - addr;

					return machine.has_edat2 && large_allowed(mode) &&

					       IS_ALIGNED(addr, PUD_SIZE) && (end - addr) >= PUD_SIZE;

					       IS_ALIGNED(addr, PUD_SIZE) && (size >= PUD_SIZE) &&

					       IS_ALIGNED(_pa(addr, size, mode), PUD_SIZE);

				}

				static bool can_large_pmd(pmd_t *pm_dir, unsigned long addr, unsigned long end,

							  enum populate_mode mode)

				{

					unsigned long size = end - addr;

					return machine.has_edat1 && large_allowed(mode) &&

					       IS_ALIGNED(addr, PMD_SIZE) && (end - addr) >= PMD_SIZE;

					       IS_ALIGNED(addr, PMD_SIZE) && (size >= PMD_SIZE) &&

					       IS_ALIGNED(_pa(addr, size, mode), PMD_SIZE);

				}

				static void pgtable_pte_populate(pmd_t *pmd, unsigned long addr, unsigned long end,

									
										1

arch/s390/boot/vmlinux.lds.S
									
												View File
												
				@@ -109,6 +109,7 @@ SECTIONS

				#ifdef CONFIG_KERNEL_UNCOMPRESSED

					. = ALIGN(PAGE_SIZE);

					. += AMODE31_SIZE;		/* .amode31 section */

					. = ALIGN(1 << 20);		/* _SEGMENT_SIZE */

				#else

					. = ALIGN(8);

				#endif

43

arch/s390/configs/debug_defconfig

View File

@@ -43,7 +43,6 @@ CONFIG_PROFILING=y
 CONFIG_KEXEC=y
 CONFIG_KEXEC_FILE=y
 CONFIG_KEXEC_SIG=y
 CONFIG_CRASH_DUMP=y
 CONFIG_LIVEPATCH=y
 CONFIG_MARCH_Z13=y
 CONFIG_NR_CPUS=512
@@ -51,6 +50,7 @@ CONFIG_NUMA=y
 CONFIG_HZ_100=y
 CONFIG_CERT_STORE=y
 CONFIG_EXPOLINE=y
 # CONFIG_EXPOLINE_EXTERN is not set
 CONFIG_EXPOLINE_AUTO=y
 CONFIG_CHSC_SCH=y
 CONFIG_VFIO_CCW=m
@@ -76,6 +76,7 @@ CONFIG_MODULE_FORCE_UNLOAD=y
 CONFIG_MODULE_UNLOAD_TAINT_TRACKING=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_MODULE_SIG_SHA256=y
 CONFIG_BLK_DEV_THROTTLING=y
 CONFIG_BLK_WBT=y
 CONFIG_BLK_CGROUP_IOLATENCY=y
@@ -100,7 +101,6 @@ CONFIG_MEMORY_HOTPLUG=y
 CONFIG_MEMORY_HOTREMOVE=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CMA_DEBUG=y
 CONFIG_CMA_DEBUGFS=y
 CONFIG_CMA_SYSFS=y
 CONFIG_CMA_AREAS=7
@@ -119,6 +119,7 @@ CONFIG_UNIX_DIAG=m
 CONFIG_XFRM_USER=m
 CONFIG_NET_KEY=m
 CONFIG_SMC_DIAG=m
 CONFIG_SMC_LO=y
 CONFIG_INET=y
 CONFIG_IP_MULTICAST=y
 CONFIG_IP_ADVANCED_ROUTER=y
@@ -133,7 +134,6 @@ CONFIG_IP_MROUTE=y
 CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
 CONFIG_IP_PIMSM_V1=y
 CONFIG_IP_PIMSM_V2=y
 CONFIG_SYN_COOKIES=y
 CONFIG_NET_IPVTI=m
 CONFIG_INET_AH=m
 CONFIG_INET_ESP=m
@@ -167,6 +167,7 @@ CONFIG_BRIDGE_NETFILTER=m
 CONFIG_NETFILTER_NETLINK_HOOK=m
 CONFIG_NF_CONNTRACK=m
 CONFIG_NF_CONNTRACK_SECMARK=y
 CONFIG_NF_CONNTRACK_ZONES=y
 CONFIG_NF_CONNTRACK_PROCFS=y
 CONFIG_NF_CONNTRACK_EVENTS=y
 CONFIG_NF_CONNTRACK_TIMEOUT=y
@@ -183,17 +184,39 @@ CONFIG_NF_CONNTRACK_SIP=m
 CONFIG_NF_CONNTRACK_TFTP=m
 CONFIG_NF_CT_NETLINK=m
 CONFIG_NF_CT_NETLINK_TIMEOUT=m
 CONFIG_NF_CT_NETLINK_HELPER=m
 CONFIG_NETFILTER_NETLINK_GLUE_CT=y
 CONFIG_NF_TABLES=m
 CONFIG_NF_TABLES_INET=y
 CONFIG_NF_TABLES_NETDEV=y
 CONFIG_NFT_NUMGEN=m
 CONFIG_NFT_CT=m
 CONFIG_NFT_FLOW_OFFLOAD=m
 CONFIG_NFT_CONNLIMIT=m
 CONFIG_NFT_LOG=m
 CONFIG_NFT_LIMIT=m
 CONFIG_NFT_MASQ=m
 CONFIG_NFT_REDIR=m
 CONFIG_NFT_NAT=m
 CONFIG_NFT_TUNNEL=m
 CONFIG_NFT_QUEUE=m
 CONFIG_NFT_QUOTA=m
 CONFIG_NFT_REJECT=m
 CONFIG_NFT_COMPAT=m
 CONFIG_NFT_HASH=m
 CONFIG_NFT_FIB_INET=m
 CONFIG_NETFILTER_XTABLES_COMPAT=y
 CONFIG_NFT_XFRM=m
 CONFIG_NFT_SOCKET=m
 CONFIG_NFT_OSF=m
 CONFIG_NFT_TPROXY=m
 CONFIG_NFT_SYNPROXY=m
 CONFIG_NFT_DUP_NETDEV=m
 CONFIG_NFT_FWD_NETDEV=m
 CONFIG_NFT_FIB_NETDEV=m
 CONFIG_NFT_REJECT_NETDEV=m
 CONFIG_NF_FLOW_TABLE_INET=m
 CONFIG_NF_FLOW_TABLE=m
 CONFIG_NF_FLOW_TABLE_PROCFS=y
 CONFIG_NETFILTER_XT_SET=m
 CONFIG_NETFILTER_XT_TARGET_AUDIT=m
 CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
@@ -206,8 +229,10 @@ CONFIG_NETFILTER_XT_TARGET_HMARK=m
 CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
 CONFIG_NETFILTER_XT_TARGET_LOG=m
 CONFIG_NETFILTER_XT_TARGET_MARK=m
 CONFIG_NETFILTER_XT_TARGET_NETMAP=m
 CONFIG_NETFILTER_XT_TARGET_NFLOG=m
 CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
 CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
 CONFIG_NETFILTER_XT_TARGET_TEE=m
 CONFIG_NETFILTER_XT_TARGET_TPROXY=m
 CONFIG_NETFILTER_XT_TARGET_TRACE=m
@@ -216,6 +241,7 @@ CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
 CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
 CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
 CONFIG_NETFILTER_XT_MATCH_BPF=m
 CONFIG_NETFILTER_XT_MATCH_CGROUP=m
 CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
 CONFIG_NETFILTER_XT_MATCH_COMMENT=m
 CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
@@ -230,6 +256,7 @@ CONFIG_NETFILTER_XT_MATCH_DSCP=m
 CONFIG_NETFILTER_XT_MATCH_ESP=m
 CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
 CONFIG_NETFILTER_XT_MATCH_HELPER=m
 CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
 CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
 CONFIG_NETFILTER_XT_MATCH_IPVS=m
 CONFIG_NETFILTER_XT_MATCH_LENGTH=m
@@ -247,6 +274,7 @@ CONFIG_NETFILTER_XT_MATCH_QUOTA=m
 CONFIG_NETFILTER_XT_MATCH_RATEEST=m
 CONFIG_NETFILTER_XT_MATCH_REALM=m
 CONFIG_NETFILTER_XT_MATCH_RECENT=m
 CONFIG_NETFILTER_XT_MATCH_SOCKET=m
 CONFIG_NETFILTER_XT_MATCH_STATE=m
 CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
 CONFIG_NETFILTER_XT_MATCH_STRING=m
@@ -302,7 +330,6 @@ CONFIG_IP_NF_TARGET_ECN=m
 CONFIG_IP_NF_TARGET_TTL=m
 CONFIG_IP_NF_RAW=m
 CONFIG_IP_NF_SECURITY=m
 CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_NFT_FIB_IPV6=m
@@ -373,7 +400,6 @@ CONFIG_NET_ACT_POLICE=m
 CONFIG_NET_ACT_GACT=m
 CONFIG_GACT_PROB=y
 CONFIG_NET_ACT_MIRRED=m
 CONFIG_NET_ACT_IPT=m
 CONFIG_NET_ACT_NAT=m
 CONFIG_NET_ACT_PEDIT=m
 CONFIG_NET_ACT_SIMP=m
@@ -462,6 +488,7 @@ CONFIG_DM_VERITY=m
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
 CONFIG_DM_SWITCH=m
 CONFIG_DM_INTEGRITY=m
 CONFIG_DM_VDO=m
 CONFIG_NETDEVICES=y
 CONFIG_BONDING=m
 CONFIG_DUMMY=m
@@ -574,7 +601,6 @@ CONFIG_WATCHDOG=y
 CONFIG_WATCHDOG_NOWAYOUT=y
 CONFIG_SOFT_WATCHDOG=m
 CONFIG_DIAG288_WATCHDOG=m
 # CONFIG_DRM_DEBUG_MODESET_LOCK is not set
 CONFIG_FB=y
 # CONFIG_FB_DEVICE is not set
 CONFIG_FRAMEBUFFER_CONSOLE=y
@@ -645,7 +671,6 @@ CONFIG_MSDOS_FS=m
 CONFIG_VFAT_FS=m
 CONFIG_EXFAT_FS=m
 CONFIG_NTFS_FS=m
 CONFIG_NTFS_RW=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
 CONFIG_TMPFS_POSIX_ACL=y
@@ -663,6 +688,7 @@ CONFIG_SQUASHFS_XZ=y
 CONFIG_SQUASHFS_ZSTD=y
 CONFIG_ROMFS_FS=m
 CONFIG_NFS_FS=m
 CONFIG_NFS_V2=m
 CONFIG_NFS_V3_ACL=y
 CONFIG_NFS_V4=m
 CONFIG_NFS_SWAP=y
@@ -879,6 +905,5 @@ CONFIG_RBTREE_TEST=y
 CONFIG_INTERVAL_TREE_TEST=m
 CONFIG_PERCPU_TEST=m
 CONFIG_ATOMIC64_SELFTEST=y
 CONFIG_STRING_SELFTEST=y
 CONFIG_TEST_BITOPS=m
 CONFIG_TEST_BPF=m

40

arch/s390/configs/defconfig

View File

@@ -41,7 +41,6 @@ CONFIG_PROFILING=y
 CONFIG_KEXEC=y
 CONFIG_KEXEC_FILE=y
 CONFIG_KEXEC_SIG=y
 CONFIG_CRASH_DUMP=y
 CONFIG_LIVEPATCH=y
 CONFIG_MARCH_Z13=y
 CONFIG_NR_CPUS=512
@@ -49,6 +48,7 @@ CONFIG_NUMA=y
 CONFIG_HZ_100=y
 CONFIG_CERT_STORE=y
 CONFIG_EXPOLINE=y
 # CONFIG_EXPOLINE_EXTERN is not set
 CONFIG_EXPOLINE_AUTO=y
 CONFIG_CHSC_SCH=y
 CONFIG_VFIO_CCW=m
@@ -71,6 +71,7 @@ CONFIG_MODULE_FORCE_UNLOAD=y
 CONFIG_MODULE_UNLOAD_TAINT_TRACKING=y
 CONFIG_MODVERSIONS=y
 CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_MODULE_SIG_SHA256=y
 CONFIG_BLK_DEV_THROTTLING=y
 CONFIG_BLK_WBT=y
 CONFIG_BLK_CGROUP_IOLATENCY=y
@@ -110,6 +111,7 @@ CONFIG_UNIX_DIAG=m
 CONFIG_XFRM_USER=m
 CONFIG_NET_KEY=m
 CONFIG_SMC_DIAG=m
 CONFIG_SMC_LO=y
 CONFIG_INET=y
 CONFIG_IP_MULTICAST=y
 CONFIG_IP_ADVANCED_ROUTER=y
@@ -124,7 +126,6 @@ CONFIG_IP_MROUTE=y
 CONFIG_IP_MROUTE_MULTIPLE_TABLES=y
 CONFIG_IP_PIMSM_V1=y
 CONFIG_IP_PIMSM_V2=y
 CONFIG_SYN_COOKIES=y
 CONFIG_NET_IPVTI=m
 CONFIG_INET_AH=m
 CONFIG_INET_ESP=m
@@ -158,6 +159,7 @@ CONFIG_BRIDGE_NETFILTER=m
 CONFIG_NETFILTER_NETLINK_HOOK=m
 CONFIG_NF_CONNTRACK=m
 CONFIG_NF_CONNTRACK_SECMARK=y
 CONFIG_NF_CONNTRACK_ZONES=y
 CONFIG_NF_CONNTRACK_PROCFS=y
 CONFIG_NF_CONNTRACK_EVENTS=y
 CONFIG_NF_CONNTRACK_TIMEOUT=y
@@ -174,17 +176,39 @@ CONFIG_NF_CONNTRACK_SIP=m
 CONFIG_NF_CONNTRACK_TFTP=m
 CONFIG_NF_CT_NETLINK=m
 CONFIG_NF_CT_NETLINK_TIMEOUT=m
 CONFIG_NF_CT_NETLINK_HELPER=m
 CONFIG_NETFILTER_NETLINK_GLUE_CT=y
 CONFIG_NF_TABLES=m
 CONFIG_NF_TABLES_INET=y
 CONFIG_NF_TABLES_NETDEV=y
 CONFIG_NFT_NUMGEN=m
 CONFIG_NFT_CT=m
 CONFIG_NFT_FLOW_OFFLOAD=m
 CONFIG_NFT_CONNLIMIT=m
 CONFIG_NFT_LOG=m
 CONFIG_NFT_LIMIT=m
 CONFIG_NFT_MASQ=m
 CONFIG_NFT_REDIR=m
 CONFIG_NFT_NAT=m
 CONFIG_NFT_TUNNEL=m
 CONFIG_NFT_QUEUE=m
 CONFIG_NFT_QUOTA=m
 CONFIG_NFT_REJECT=m
 CONFIG_NFT_COMPAT=m
 CONFIG_NFT_HASH=m
 CONFIG_NFT_FIB_INET=m
 CONFIG_NETFILTER_XTABLES_COMPAT=y
 CONFIG_NFT_XFRM=m
 CONFIG_NFT_SOCKET=m
 CONFIG_NFT_OSF=m
 CONFIG_NFT_TPROXY=m
 CONFIG_NFT_SYNPROXY=m
 CONFIG_NFT_DUP_NETDEV=m
 CONFIG_NFT_FWD_NETDEV=m
 CONFIG_NFT_FIB_NETDEV=m
 CONFIG_NFT_REJECT_NETDEV=m
 CONFIG_NF_FLOW_TABLE_INET=m
 CONFIG_NF_FLOW_TABLE=m
 CONFIG_NF_FLOW_TABLE_PROCFS=y
 CONFIG_NETFILTER_XT_SET=m
 CONFIG_NETFILTER_XT_TARGET_AUDIT=m
 CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
@@ -197,8 +221,10 @@ CONFIG_NETFILTER_XT_TARGET_HMARK=m
 CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
 CONFIG_NETFILTER_XT_TARGET_LOG=m
 CONFIG_NETFILTER_XT_TARGET_MARK=m
 CONFIG_NETFILTER_XT_TARGET_NETMAP=m
 CONFIG_NETFILTER_XT_TARGET_NFLOG=m
 CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
 CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
 CONFIG_NETFILTER_XT_TARGET_TEE=m
 CONFIG_NETFILTER_XT_TARGET_TPROXY=m
 CONFIG_NETFILTER_XT_TARGET_TRACE=m
@@ -207,6 +233,7 @@ CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
 CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
 CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
 CONFIG_NETFILTER_XT_MATCH_BPF=m
 CONFIG_NETFILTER_XT_MATCH_CGROUP=m
 CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
 CONFIG_NETFILTER_XT_MATCH_COMMENT=m
 CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
@@ -221,6 +248,7 @@ CONFIG_NETFILTER_XT_MATCH_DSCP=m
 CONFIG_NETFILTER_XT_MATCH_ESP=m
 CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
 CONFIG_NETFILTER_XT_MATCH_HELPER=m
 CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
 CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
 CONFIG_NETFILTER_XT_MATCH_IPVS=m
 CONFIG_NETFILTER_XT_MATCH_LENGTH=m
@@ -238,6 +266,7 @@ CONFIG_NETFILTER_XT_MATCH_QUOTA=m
 CONFIG_NETFILTER_XT_MATCH_RATEEST=m
 CONFIG_NETFILTER_XT_MATCH_REALM=m
 CONFIG_NETFILTER_XT_MATCH_RECENT=m
 CONFIG_NETFILTER_XT_MATCH_SOCKET=m
 CONFIG_NETFILTER_XT_MATCH_STATE=m
 CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
 CONFIG_NETFILTER_XT_MATCH_STRING=m
@@ -293,7 +322,6 @@ CONFIG_IP_NF_TARGET_ECN=m
 CONFIG_IP_NF_TARGET_TTL=m
 CONFIG_IP_NF_RAW=m
 CONFIG_IP_NF_SECURITY=m
 CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
 CONFIG_NFT_FIB_IPV6=m
@@ -363,7 +391,6 @@ CONFIG_NET_ACT_POLICE=m
 CONFIG_NET_ACT_GACT=m
 CONFIG_GACT_PROB=y
 CONFIG_NET_ACT_MIRRED=m
 CONFIG_NET_ACT_IPT=m
 CONFIG_NET_ACT_NAT=m
 CONFIG_NET_ACT_PEDIT=m
 CONFIG_NET_ACT_SIMP=m
@@ -452,6 +479,7 @@ CONFIG_DM_VERITY=m
 CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
 CONFIG_DM_SWITCH=m
 CONFIG_DM_INTEGRITY=m
 CONFIG_DM_VDO=m
 CONFIG_NETDEVICES=y
 CONFIG_BONDING=m
 CONFIG_DUMMY=m
@@ -630,7 +658,6 @@ CONFIG_MSDOS_FS=m
 CONFIG_VFAT_FS=m
 CONFIG_EXFAT_FS=m
 CONFIG_NTFS_FS=m
 CONFIG_NTFS_RW=y
 CONFIG_PROC_KCORE=y
 CONFIG_TMPFS=y
 CONFIG_TMPFS_POSIX_ACL=y
@@ -649,6 +676,7 @@ CONFIG_SQUASHFS_XZ=y
 CONFIG_SQUASHFS_ZSTD=y
 CONFIG_ROMFS_FS=m
 CONFIG_NFS_FS=m
 CONFIG_NFS_V2=m
 CONFIG_NFS_V3_ACL=y
 CONFIG_NFS_V4=m
 CONFIG_NFS_SWAP=y

5

arch/s390/configs/zfcpdump_defconfig

View File

@@ -9,25 +9,22 @@ CONFIG_BPF_SYSCALL=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_CC_OPTIMIZE_FOR_SIZE=y
 CONFIG_KEXEC=y
 CONFIG_CRASH_DUMP=y
 CONFIG_MARCH_Z13=y
 CONFIG_NR_CPUS=2
 CONFIG_HZ_100=y
 # CONFIG_CHSC_SCH is not set
 # CONFIG_SCM_BUS is not set
 # CONFIG_AP is not set
 # CONFIG_PFAULT is not set
 # CONFIG_S390_HYPFS is not set
 # CONFIG_VIRTUALIZATION is not set
 # CONFIG_S390_GUEST is not set
 # CONFIG_SECCOMP is not set
 # CONFIG_GCC_PLUGINS is not set
 # CONFIG_BLOCK_LEGACY_AUTOLOAD is not set
 CONFIG_PARTITION_ADVANCED=y
 # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
 # CONFIG_SWAP is not set
 # CONFIG_COMPAT_BRK is not set
 # CONFIG_COMPACTION is not set
 # CONFIG_MIGRATION is not set
 CONFIG_NET=y
 # CONFIG_IUCV is not set
 # CONFIG_PCPU_DEV_REFCNT is not set

									
										54

arch/s390/kernel/crash_dump.c
									
												View File
												
				@@ -451,7 +451,7 @@ static void *nt_final(void *ptr)

				/*

				 * Initialize ELF header (new kernel)

				 */

				static void *ehdr_init(Elf64_Ehdr *ehdr, int mem_chunk_cnt)

				static void *ehdr_init(Elf64_Ehdr *ehdr, int phdr_count)

				{

					memset(ehdr, 0, sizeof(*ehdr));

					memcpy(ehdr->e_ident, ELFMAG, SELFMAG);

				@@ -465,11 +465,8 @@ static void *ehdr_init(Elf64_Ehdr *ehdr, int mem_chunk_cnt)

					ehdr->e_phoff = sizeof(Elf64_Ehdr);

					ehdr->e_ehsize = sizeof(Elf64_Ehdr);

					ehdr->e_phentsize = sizeof(Elf64_Phdr);

					/*

					 * Number of memory chunk PT_LOAD program headers plus one kernel

					 * image PT_LOAD program header plus one PT_NOTE program header.

					 */

					ehdr->e_phnum = mem_chunk_cnt + 1 + 1;

					/* Number of PT_LOAD program headers plus PT_NOTE program header */

					ehdr->e_phnum = phdr_count + 1;

					return ehdr + 1;

				}

				@@ -503,12 +500,14 @@ static int get_mem_chunk_cnt(void)

				/*

				 * Initialize ELF loads (new kernel)

				 */

				static void loads_init(Elf64_Phdr *phdr)

				static void loads_init(Elf64_Phdr *phdr, bool os_info_has_vm)

				{

					unsigned long old_identity_base = os_info_old_value(OS_INFO_IDENTITY_BASE);

					unsigned long old_identity_base = 0;

					phys_addr_t start, end;

					u64 idx;

					if (os_info_has_vm)

						old_identity_base = os_info_old_value(OS_INFO_IDENTITY_BASE);

					for_each_physmem_range(idx, &oldmem_type, &start, &end) {

						phdr->p_type = PT_LOAD;

						phdr->p_vaddr = old_identity_base + start;

				@@ -522,6 +521,11 @@ static void loads_init(Elf64_Phdr *phdr)

					}

				}

				static bool os_info_has_vm(void)

				{

					return os_info_old_value(OS_INFO_KASLR_OFFSET);

				}

				/*

				 * Prepare PT_LOAD type program header for kernel image region

				 */

				@@ -566,7 +570,7 @@ static void *notes_init(Elf64_Phdr *phdr, void *ptr, u64 notes_offset)

					return ptr;

				}

				static size_t get_elfcorehdr_size(int mem_chunk_cnt)

				static size_t get_elfcorehdr_size(int phdr_count)

				{

					size_t size;

				@@ -581,10 +585,8 @@ static size_t get_elfcorehdr_size(int mem_chunk_cnt)

					size += nt_vmcoreinfo_size();

					/* nt_final */

					size += sizeof(Elf64_Nhdr);

					/* PT_LOAD type program header for kernel text region */

					size += sizeof(Elf64_Phdr);

					/* PT_LOADS */

					size += mem_chunk_cnt * sizeof(Elf64_Phdr);

					size += phdr_count * sizeof(Elf64_Phdr);

					return size;

				}

				@@ -595,8 +597,8 @@ static size_t get_elfcorehdr_size(int mem_chunk_cnt)

				int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size)

				{

					Elf64_Phdr *phdr_notes, *phdr_loads, *phdr_text;

					int mem_chunk_cnt, phdr_text_cnt;

					size_t alloc_size;

					int mem_chunk_cnt;

					void *ptr, *hdr;

					u64 hdr_off;

				@@ -615,12 +617,14 @@ int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size)

					}

					mem_chunk_cnt = get_mem_chunk_cnt();

					phdr_text_cnt = os_info_has_vm() ? 1 : 0;

					alloc_size = get_elfcorehdr_size(mem_chunk_cnt);

					alloc_size = get_elfcorehdr_size(mem_chunk_cnt + phdr_text_cnt);

					hdr = kzalloc(alloc_size, GFP_KERNEL);

					/* Without elfcorehdr /proc/vmcore cannot be created. Thus creating

					/*

					 * Without elfcorehdr /proc/vmcore cannot be created. Thus creating

					 * a dump with this crash kernel will fail. Panic now to allow other

					 * dump mechanisms to take over.

					 */

				@@ -628,21 +632,23 @@ int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size)

						panic("s390 kdump allocating elfcorehdr failed");

					/* Init elf header */

					ptr = ehdr_init(hdr, mem_chunk_cnt);

					phdr_notes = ehdr_init(hdr, mem_chunk_cnt + phdr_text_cnt);

					/* Init program headers */

					phdr_notes = ptr;

					ptr = PTR_ADD(ptr, sizeof(Elf64_Phdr));

					phdr_text = ptr;

					ptr = PTR_ADD(ptr, sizeof(Elf64_Phdr));

					phdr_loads = ptr;

					ptr = PTR_ADD(ptr, sizeof(Elf64_Phdr) * mem_chunk_cnt);

					if (phdr_text_cnt) {

						phdr_text = phdr_notes + 1;

						phdr_loads = phdr_text + 1;

					} else {

						phdr_loads = phdr_notes + 1;

					}

					ptr = PTR_ADD(phdr_loads, sizeof(Elf64_Phdr) * mem_chunk_cnt);

					/* Init notes */

					hdr_off = PTR_DIFF(ptr, hdr);

					ptr = notes_init(phdr_notes, ptr, ((unsigned long) hdr) + hdr_off);

					/* Init kernel text program header */

					text_init(phdr_text);

					if (phdr_text_cnt)

						text_init(phdr_text);

					/* Init loads */

					loads_init(phdr_loads);

					loads_init(phdr_loads, phdr_text_cnt);

					/* Finalize program headers */

					hdr_off = PTR_DIFF(ptr, hdr);

					*addr = (unsigned long long) hdr;

									
										4

arch/x86/boot/compressed/Makefile
									
												View File
												
				@@ -105,9 +105,9 @@ vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/mem.o

				vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o

				vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o

				vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a

				vmlinux-libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a

				$(obj)/vmlinux: $(vmlinux-objs-y) FORCE

				$(obj)/vmlinux: $(vmlinux-objs-y) $(vmlinux-libs-y) FORCE

					$(call if_changed,ld)

				OBJCOPYFLAGS_vmlinux.bin :=  -R .comment -S

									
										1

arch/x86/events/intel/cstate.c
									
												View File
												
				@@ -114,6 +114,7 @@

				#include "../perf_event.h"

				#include "../probe.h"

				MODULE_DESCRIPTION("Support for Intel cstate performance events");

				MODULE_LICENSE("GPL");

				#define DEFINE_CSTATE_FORMAT_ATTR(_var, _name, _format)		\

									
										1

arch/x86/events/intel/uncore.c
									
												View File
												
				@@ -34,6 +34,7 @@ static struct event_constraint uncore_constraint_fixed =

				struct event_constraint uncore_constraint_empty =

					EVENT_CONSTRAINT(0, 0, 0);

				MODULE_DESCRIPTION("Support for Intel uncore performance events");

				MODULE_LICENSE("GPL");

				int uncore_pcibus_to_dieid(struct pci_bus *bus)

									
										1

arch/x86/events/rapl.c
									
												View File
												
				@@ -64,6 +64,7 @@

				#include "perf_event.h"

				#include "probe.h"

				MODULE_DESCRIPTION("Support Intel/AMD RAPL energy consumption counters");

				MODULE_LICENSE("GPL");

				/*

									
										1

arch/x86/include/asm/kvm_host.h
									
												View File
												
				@@ -2154,6 +2154,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);

				int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,

						       void *insn, int insn_len);

				void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);

				void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva);

				void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,

							     u64 addr, unsigned long roots);

									
										4

arch/x86/include/asm/uaccess.h
									
												View File
												
				@@ -78,10 +78,10 @@ extern int __get_user_bad(void);

					int __ret_gu;							\

					register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX);		\

					__chk_user_ptr(ptr);						\

					asm volatile("call __" #fn "_%c4"				\

					asm volatile("call __" #fn "_%c[size]"				\

						     : "=a" (__ret_gu), "=r" (__val_gu),		\

							ASM_CALL_CONSTRAINT				\

						     : "0" (ptr), "i" (sizeof(*(ptr))));		\

						     : "0" (ptr), [size] "i" (sizeof(*(ptr))));		\

					instrument_get_user(__val_gu);					\

					(x) = (__force __typeof__(*(ptr))) __val_gu;			\

					__builtin_expect(__ret_gu, 0);					\

									
										2

arch/x86/include/asm/vmxfeatures.h
									
												View File
												
				@@ -77,7 +77,7 @@

				#define VMX_FEATURE_ENCLS_EXITING	( 2*32+ 15) /* "" VM-Exit on ENCLS (leaf dependent) */

				#define VMX_FEATURE_RDSEED_EXITING	( 2*32+ 16) /* "" VM-Exit on RDSEED */

				#define VMX_FEATURE_PAGE_MOD_LOGGING	( 2*32+ 17) /* "pml" Log dirty pages into buffer */

				#define VMX_FEATURE_EPT_VIOLATION_VE	( 2*32+ 18) /* "" Conditionally reflect EPT violations as #VE exceptions */

				#define VMX_FEATURE_EPT_VIOLATION_VE	( 2*32+ 18) /* Conditionally reflect EPT violations as #VE exceptions */

				#define VMX_FEATURE_PT_CONCEAL_VMX	( 2*32+ 19) /* "" Suppress VMX indicators in Processor Trace */

				#define VMX_FEATURE_XSAVES		( 2*32+ 20) /* "" Enable XSAVES and XRSTORS in guest */

				#define VMX_FEATURE_MODE_BASED_EPT_EXEC	( 2*32+ 22) /* "ept_mode_based_exec" Enable separate EPT EXEC bits for supervisor vs. user */

									
										9

arch/x86/kernel/amd_nb.c
									
												View File
												
				@@ -215,7 +215,14 @@ out:

				int amd_smn_read(u16 node, u32 address, u32 *value)

				{

					return __amd_smn_rw(node, address, value, false);

					int err = __amd_smn_rw(node, address, value, false);

					if (PCI_POSSIBLE_ERROR(*value)) {

						err = -ENODEV;

						*value = 0;

					}

					return err;

				}

				EXPORT_SYMBOL_GPL(amd_smn_read);

									
										1

arch/x86/kernel/cpu/aperfmperf.c
									
												View File
												
				@@ -345,6 +345,7 @@ static DECLARE_WORK(disable_freq_invariance_work,

						    disable_freq_invariance_workfn);

				DEFINE_PER_CPU(unsigned long, arch_freq_scale) = SCHED_CAPACITY_SCALE;

				EXPORT_PER_CPU_SYMBOL_GPL(arch_freq_scale);

				static void scale_freq_tick(u64 acnt, u64 mcnt)

				{

									
										7

arch/x86/kernel/cpu/common.c
									
												View File
												
				@@ -1075,6 +1075,10 @@ void get_cpu_address_sizes(struct cpuinfo_x86 *c)

						c->x86_virt_bits = (eax >> 8) & 0xff;

						c->x86_phys_bits = eax & 0xff;

						/* Provide a sane default if not enumerated: */

						if (!c->x86_clflush_size)

							c->x86_clflush_size = 32;

					}

					c->x86_cache_bits = c->x86_phys_bits;

				@@ -1585,6 +1589,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)

					if (have_cpuid_p()) {

						cpu_detect(c);

						get_cpu_vendor(c);

						intel_unlock_cpuid_leafs(c);

						get_cpu_cap(c);

						setup_force_cpu_cap(X86_FEATURE_CPUID);

						get_cpu_address_sizes(c);

				@@ -1744,7 +1749,7 @@ static void generic_identify(struct cpuinfo_x86 *c)

					cpu_detect(c);

					get_cpu_vendor(c);

					intel_unlock_cpuid_leafs(c);

					get_cpu_cap(c);

					get_cpu_address_sizes(c);

									
										2

arch/x86/kernel/cpu/cpu.h
									
												View File
												
				@@ -61,9 +61,11 @@ extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state;

				extern void __init tsx_init(void);

				void tsx_ap_init(void);

				void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c);

				#else

				static inline void tsx_init(void) { }

				static inline void tsx_ap_init(void) { }

				static inline void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c) { }

				#endif /* CONFIG_CPU_SUP_INTEL */

				extern void init_spectral_chicken(struct cpuinfo_x86 *c);

									
										25

arch/x86/kernel/cpu/intel.c
									
												View File
												
				@@ -269,19 +269,26 @@ detect_keyid_bits:

					c->x86_phys_bits -= keyid_bits;

				}

				void intel_unlock_cpuid_leafs(struct cpuinfo_x86 *c)

				{

					if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)

						return;

					if (c->x86 < 6 || (c->x86 == 6 && c->x86_model < 0xd))

						return;

					/*

					 * The BIOS can have limited CPUID to leaf 2, which breaks feature

					 * enumeration. Unlock it and update the maximum leaf info.

					 */

					if (msr_clear_bit(MSR_IA32_MISC_ENABLE, MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0)

						c->cpuid_level = cpuid_eax(0);

				}

				static void early_init_intel(struct cpuinfo_x86 *c)

				{

					u64 misc_enable;

					/* Unmask CPUID levels if masked: */

					if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {

						if (msr_clear_bit(MSR_IA32_MISC_ENABLE,

								  MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) > 0) {

							c->cpuid_level = cpuid_eax(0);

							get_cpu_cap(c);

						}

					}

					if ((c->x86 == 0xf && c->x86_model >= 0x03) ||

						(c->x86 == 0x6 && c->x86_model >= 0x0e))

						set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);

									
										4

arch/x86/kernel/cpu/topology_amd.c
									
												View File
												
				@@ -84,9 +84,9 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)

					/*

					 * If leaf 0xb is available, then the domain shifts are set

					 * already and nothing to do here.

					 * already and nothing to do here. Only valid for family >= 0x17.

					 */

					if (!has_topoext) {

					if (!has_topoext && tscan->c->x86 >= 0x17) {

						/*

						 * Leaf 0x80000008 set the CORE domain shift already.

						 * Update the SMT domain, but do not propagate it.

Compare commits

966 Commits v6.10-rc1 ... arm64-uacc

3 .editorconfig Unescape Escape View File

12 .mailmap Unescape Escape View File

35 Documentation/admin-guide/LSM/tomoyo.rst Unescape Escape View File

22 Documentation/admin-guide/kernel-parameters.txt Unescape Escape View File

4 Documentation/admin-guide/mm/transhuge.rst Unescape Escape View File

4 Documentation/arch/riscv/uabi.rst Unescape Escape View File

4 Documentation/cdrom/cdrom-standard.rst Unescape Escape View File

2 Documentation/core-api/swiotlb.rst Unescape Escape View File

3 Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml Unescape Escape View File

6 Documentation/devicetree/bindings/arm/sunxi.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/iio/dac/adi,ad3552r.yaml Unescape Escape View File

19 Documentation/devicetree/bindings/input/elan,ekth6915.yaml Unescape Escape View File

66 Documentation/devicetree/bindings/input/ilitek,ili2901.yaml Normal file Unescape Escape View File

11 Documentation/devicetree/bindings/net/pse-pd/microchip,pd692x0.yaml Unescape Escape View File

18 Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml Unescape Escape View File

1 Documentation/devicetree/bindings/usb/realtek,rts5411.yaml Unescape Escape View File

12 Documentation/kbuild/kconfig-language.rst Unescape Escape View File

4 Documentation/netlink/specs/netdev.yaml Unescape Escape View File

31 Documentation/networking/af_xdp.rst Unescape Escape View File

2 Documentation/process/maintainer-netdev.rst Unescape Escape View File

2 Documentation/userspace-api/media/v4l/dev-subdev.rst Unescape Escape View File

12 MAINTAINERS Unescape Escape View File

2 Makefile Unescape Escape View File

2 arch/arc/net/bpf_jit.h Unescape Escape View File

10 arch/arc/net/bpf_jit_arcv2.c Unescape Escape View File

22 arch/arc/net/bpf_jit_core.c Unescape Escape View File

17 arch/arm/kernel/ftrace.c Unescape Escape View File

1 arch/arm64/Kconfig Unescape Escape View File

3 arch/arm64/include/asm/asm-extable.h Unescape Escape View File

6 arch/arm64/include/asm/el2_setup.h Unescape Escape View File

36 arch/arm64/include/asm/io.h Unescape Escape View File

6 arch/arm64/include/asm/kvm_arm.h Unescape Escape View File

71 arch/arm64/include/asm/kvm_emulate.h Unescape Escape View File

25 arch/arm64/include/asm/kvm_host.h Unescape Escape View File

4 arch/arm64/include/asm/kvm_hyp.h Unescape Escape View File

9 arch/arm64/include/asm/kvm_pkvm.h Unescape Escape View File

192 arch/arm64/include/asm/uaccess.h Unescape Escape View File

3 arch/arm64/kernel/armv8_deprecated.c Unescape Escape View File

12 arch/arm64/kernel/mte.c Unescape Escape View File

76 arch/arm64/kvm/arm.c Unescape Escape View File

21 arch/arm64/kvm/emulate-nested.c Unescape Escape View File

11 arch/arm64/kvm/fpsimd.c Unescape Escape View File

3 arch/arm64/kvm/guest.c Unescape Escape View File

18 arch/arm64/kvm/hyp/aarch32.c Unescape Escape View File

6 arch/arm64/kvm/hyp/fpsimd.S Unescape Escape View File

36 arch/arm64/kvm/hyp/include/hyp/switch.h Unescape Escape View File

1 arch/arm64/kvm/hyp/include/nvhe/pkvm.h Unescape Escape View File

84 arch/arm64/kvm/hyp/nvhe/hyp-main.c Unescape Escape View File

17 arch/arm64/kvm/hyp/nvhe/pkvm.c Unescape Escape View File

25 arch/arm64/kvm/hyp/nvhe/setup.c Unescape Escape View File

24 arch/arm64/kvm/hyp/nvhe/switch.c Unescape Escape View File

12 arch/arm64/kvm/hyp/vhe/switch.c Unescape Escape View File

6 arch/arm64/kvm/nested.c Unescape Escape View File

3 arch/arm64/kvm/reset.c Unescape Escape View File

4 arch/arm64/mm/contpte.c Unescape Escape View File

4 arch/loongarch/boot/dts/loongson-2k0500-ref.dts Unescape Escape View File

4 arch/loongarch/boot/dts/loongson-2k1000-ref.dts Unescape Escape View File

2 arch/loongarch/boot/dts/loongson-2k2000-ref.dts Unescape Escape View File

1 arch/loongarch/include/asm/numa.h Unescape Escape View File

2 arch/loongarch/include/asm/stackframe.h Unescape Escape View File

2 arch/loongarch/kernel/head.S Unescape Escape View File

6 arch/loongarch/kernel/setup.c Unescape Escape View File

5 arch/loongarch/kernel/smp.c Unescape Escape View File

10 arch/loongarch/kernel/vmlinux.lds.S Unescape Escape View File

15 arch/parisc/include/asm/cacheflush.h Unescape Escape View File

27 arch/parisc/include/asm/pgtable.h Unescape Escape View File

421 arch/parisc/kernel/cache.c Unescape Escape View File

2 arch/powerpc/Kconfig Unescape Escape View File

27 arch/powerpc/include/asm/uaccess.h Unescape Escape View File

12 arch/powerpc/net/bpf_jit_comp32.c Unescape Escape View File

12 arch/powerpc/net/bpf_jit_comp64.c Unescape Escape View File

4 arch/powerpc/platforms/pseries/lparcfg.c Unescape Escape View File

2 arch/riscv/Kconfig Unescape Escape View File

22 arch/riscv/include/asm/cmpxchg.h Unescape Escape View File

2 arch/riscv/kernel/cpu_ops_sbi.c Unescape Escape View File

3 arch/riscv/kernel/cpu_ops_spinwait.c Unescape Escape View File

7 arch/riscv/kvm/aia_device.c Unescape Escape View File

4 arch/riscv/kvm/vcpu_onereg.c Unescape Escape View File

966 Commits

v6.10-rc1 ... arm64-uacc

3

.editorconfig

View File

12

.mailmap

View File

35

Documentation/admin-guide/LSM/tomoyo.rst

View File

22

Documentation/admin-guide/kernel-parameters.txt

View File

4

Documentation/admin-guide/mm/transhuge.rst

View File

4

Documentation/arch/riscv/uabi.rst

View File

4

Documentation/cdrom/cdrom-standard.rst

View File

2

Documentation/core-api/swiotlb.rst

View File

3

Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml

View File

6

Documentation/devicetree/bindings/arm/sunxi.yaml

View File

2

Documentation/devicetree/bindings/iio/dac/adi,ad3552r.yaml

View File

19

Documentation/devicetree/bindings/input/elan,ekth6915.yaml

View File

66

Documentation/devicetree/bindings/input/ilitek,ili2901.yaml Normal file

View File

11

Documentation/devicetree/bindings/net/pse-pd/microchip,pd692x0.yaml

View File

18

Documentation/devicetree/bindings/net/pse-pd/ti,tps23881.yaml

View File

1

Documentation/devicetree/bindings/usb/realtek,rts5411.yaml

View File

12

Documentation/kbuild/kconfig-language.rst

View File

4

Documentation/netlink/specs/netdev.yaml

View File

31

Documentation/networking/af_xdp.rst

View File

2

Documentation/process/maintainer-netdev.rst

View File

2

Documentation/userspace-api/media/v4l/dev-subdev.rst

View File

12

MAINTAINERS

View File

2

Makefile

View File

2

arch/arc/net/bpf_jit.h

View File

10

arch/arc/net/bpf_jit_arcv2.c

View File

22

arch/arc/net/bpf_jit_core.c

View File

17

arch/arm/kernel/ftrace.c

View File

1

arch/arm64/Kconfig

View File

3

arch/arm64/include/asm/asm-extable.h

View File

6

arch/arm64/include/asm/el2_setup.h

View File

36

arch/arm64/include/asm/io.h

View File

6

arch/arm64/include/asm/kvm_arm.h

View File

71

arch/arm64/include/asm/kvm_emulate.h

View File

25

arch/arm64/include/asm/kvm_host.h

View File

4

arch/arm64/include/asm/kvm_hyp.h

View File

9

arch/arm64/include/asm/kvm_pkvm.h

View File

192

arch/arm64/include/asm/uaccess.h

View File

3

arch/arm64/kernel/armv8_deprecated.c

View File

12

arch/arm64/kernel/mte.c

View File

76

arch/arm64/kvm/arm.c

View File

21

arch/arm64/kvm/emulate-nested.c

View File

11

arch/arm64/kvm/fpsimd.c

View File

3

arch/arm64/kvm/guest.c

View File

18

arch/arm64/kvm/hyp/aarch32.c

View File

6

arch/arm64/kvm/hyp/fpsimd.S

View File

36

arch/arm64/kvm/hyp/include/hyp/switch.h

View File

1

arch/arm64/kvm/hyp/include/nvhe/pkvm.h

View File

84

arch/arm64/kvm/hyp/nvhe/hyp-main.c

View File

17

arch/arm64/kvm/hyp/nvhe/pkvm.c

View File

25

arch/arm64/kvm/hyp/nvhe/setup.c

View File

24

arch/arm64/kvm/hyp/nvhe/switch.c

View File

12

arch/arm64/kvm/hyp/vhe/switch.c

View File

6

arch/arm64/kvm/nested.c

View File

3

arch/arm64/kvm/reset.c

View File

4

arch/arm64/mm/contpte.c

View File

4

arch/loongarch/boot/dts/loongson-2k0500-ref.dts

View File

4

arch/loongarch/boot/dts/loongson-2k1000-ref.dts

View File

2

arch/loongarch/boot/dts/loongson-2k2000-ref.dts

View File

1

arch/loongarch/include/asm/numa.h

View File

2

arch/loongarch/include/asm/stackframe.h

View File

2

arch/loongarch/kernel/head.S

View File

6

arch/loongarch/kernel/setup.c

View File

5

arch/loongarch/kernel/smp.c

View File

10

arch/loongarch/kernel/vmlinux.lds.S

View File

15

arch/parisc/include/asm/cacheflush.h

View File

27

arch/parisc/include/asm/pgtable.h

View File

421

arch/parisc/kernel/cache.c

View File

2

arch/powerpc/Kconfig

View File

27

arch/powerpc/include/asm/uaccess.h

View File

12

arch/powerpc/net/bpf_jit_comp32.c

View File

12

arch/powerpc/net/bpf_jit_comp64.c

View File

4

arch/powerpc/platforms/pseries/lparcfg.c

View File

2

arch/riscv/Kconfig

View File

22

arch/riscv/include/asm/cmpxchg.h

View File

2

arch/riscv/kernel/cpu_ops_sbi.c

View File

3

arch/riscv/kernel/cpu_ops_spinwait.c

View File

7

arch/riscv/kvm/aia_device.c

View File

4

arch/riscv/kvm/vcpu_onereg.c

View File

4

arch/riscv/mm/fault.c

View File