0aec3847d044273733285dcff90afda89ad461d2
13662 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
480e035fc4 |
Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"Highlights are usual, more AMD IP blocks for future hw, i915/xe
changes, Displayport tunnelling support for i915, msm YUV over DP
changes, new tests for ttm, but its mostly a lot of stuff all over the
place from lots of people.
core:
- EDID cleanups
- scheduler error handling fixes
- managed: add drmm_release_action() with tests
- add ratelimited drm debug print
- DPCD PSR early transport macro
- DP tunneling and bandwidth allocation helpers
- remove built-in edids
- dp: Avoid AUX transfers on powered-down displays
- dp: Add VSC SDP helpers
cross drivers:
- use new drm print helpers
- switch to ->read_edid callback
- gem: add stats for shared buffers plus updates to amdgpu, i915, xe
syncobj:
- fixes to waiting and sleeping
ttm:
- add tests
- fix errno codes
- simply busy-placement handling
- fix page decryption
media:
- tc358743: fix v4l device registration
video:
- move all kernel parameters for video behind CONFIG_VIDEO
sound:
- remove <drm/drm_edid.h> include from header
ci:
- add tests for msm
- fix apq8016 runner
efifb:
- use copy of global screen_info state
vesafb:
- use copy of global screen_info state
simplefb:
- fix logging
bridge:
- ite-6505: fix DP link-training bug
- samsung-dsim: fix error checking in probe
- samsung-dsim: add bsh-smm-s2/pro boards
- tc358767: fix regmap usage
- imx: add i.MX8MP HDMI PVI plus DT bindings
- imx: add i.MX8MP HDMI TX plus DT bindings
- sii902x: fix probing and unregistration
- tc358767: limit pixel PLL input range
- switch to new drm_bridge_read_edid() interface
panel:
- ltk050h3146w: error-handling fixes
- panel-edp: support delay between power-on and enable; use put_sync
in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49
V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
- panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
- panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings
- add BOE TH101MB31IG002-28A plus DT bindings
- add EDT ETML1010G3DRA plus DT bindings
- add Novatek NT36672E LCD DSI plus DT bindings
- nt36523: support 120Hz timings, fix includes
- simple: fix display timings on RK32FN48H
- visionox-vtdr6130: fix initialization
- add Powkiddy RGB10MAX3 plus DT bindings
- st7703: support panel rotation plus DT bindings
- add Himax HX83112A plus DT bindings
- ltk500hd1829: add support for ltk101b4029w and admatec 9904370
- simple: add BOE BP082WX1-100 8.2" panel plus DT bindungs
panel-orientation-quirks:
- GPD Win Mini
amdgpu:
- Validate DMABuf imports in compute VMs
- Add RAS ACA framework
- PSP 13 fixes
- Misc code cleanups
- Replay fixes
- Atom interpretor PS, WS bounds checking
- DML2 fixes
- Audio fixes
- DCN 3.5 Z state fixes
- Remove deprecated ida_simple usage
- UBSAN fixes
- RAS fixes
- Enable seq64 infrastructure
- DC color block enablement
- Documentation updates
- DC documentation updates
- DMCUB updates
- ATHUB 4.1 support
- LSDMA 7.0 support
- JPEG DPG support
- IH 7.0 support
- HDP 7.0 support
- VCN 5.0 support
- SMU 13.0.6 updates
- NBIO 7.11 updates
- SDMA 6.1 updates
- MMHUB 3.3 updates
- DCN 3.5.1 support
- NBIF 6.3.1 support
- VPE 6.1.1 support
amdkfd:
- Validate DMABuf imports in compute VMs
- SVM fixes
- Trap handler updates and enhancements
- Fix cache size reporting
- Relocate the trap handler
radeon:
- Atom interpretor PS, WS bounds checking
- Misc code cleanups
xe:
- new query for GuC submission version
- Remove unused persistent exec_queues
- Add vram frequency sysfs attributes
- Add the flag XE_VM_BIND_FLAG_DUMPABLE
- Drop pre-production workarounds
- Drop kunit tests for unsupported platforms
- Start pumbling SR-IOV support with memory based interrupts for VF
- Allow to map BO in GGTT with PAT index corresponding to XE_CACHE_UC
to work with memory based interrupts
- Add GuC Doorbells Manager as prep work SR-IOV
- Implement additional workarounds for xe2 and MTL
- Program a few registers according to perfomance guide spec for Xe2
- Fix remaining 32b build issues and enable it back
- Fix build with CONFIG_DEBUG_FS=n
- Fix warnings from GuC ABI headers
- Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF
- Release mmap mappings on rpm suspend
- Disable mid-thread preemption when not properly supported by
hardware
- Fix xe_exec by reserving extra fence slot for CPU bind
- Fix xe_exec with full long running exec queue
- Canonicalize addresses where needed for Xe2 and add to devcoredum
- Toggle USM support for Xe2
- Only allow 1 ufence per exec / bind IOCTL
- Add GuC firmware loading for Lunar Lake
- Add XE_VMA_PTE_64K VMA flag
i915:
- Add more ADL-N PCI IDs
- Enable fastboot also on older platforms
- Early transport for panel replay and PSR
- New ARL PCI IDs
- DP TPS4 PHY test pattern support
- Unify and improve VSC SDP for PSR and non-PSR cases
- Refactor memory regions and improve debug logging
- Rework global state serialization
- Remove unused CDCLK divider fields
- Unify HDCP connector logging format
- Use display instead of graphics version in display code
- Move VBT and opregion debugfs next to the implementation
- Abstract opregion interface, use opaque type
- MTL fixes
- HPD handling fixes
- Add GuC submission interface version query
- Atomically invalidate userptr on mmu-notifier
- Update handling of MMIO triggered reports
- Don't make assumptions about intel_wakeref_t type
- Extend driver code of Xe_LPG to Xe_LPG+
- Add flex arrays to struct i915_syncmap
- Allow for very slow HuC loading
- DP tunneling and bandwidth allocation support
msm:
- Correct bindings for MSM8976 and SM8650 platforms
- Start migration of MDP5 platforms to DPU driver
- X1E80100 MDSS support
- DPU:
- Improve DSC allocation, fixing several important corner cases
- Add support for SDM630/SDM660 platforms
- Simplify dpu_encoder_phys_ops
- Apply fixes targeting DSC support with a single DSC encoder
- Apply fixes for HCTL_EN timing configuration
- X1E80100 support
- Add support for YUV420 over DP
- GPU:
- fix sc7180 UBWC config
- fix a7xx LLC config
- new gpu support: a305B, a750, a702
- machine support: SM7150 (different power levels than other a618)
- a7xx devcoredump support
habanalabs:
- configure IRQ affinity according to NUMA node
- move HBM MMU page tables inside the HBM
- improve device reset
- check extended PCIe errors
ivpu:
- updates to firmware API
- refactor BO allocation
imx:
- use devm_ functions during init
hisilicon:
- fix EDID includes
mgag200:
- improve ioremap usage
- convert to struct drm_edid
- Work around PCI write bursts
nouveau:
- disp: use kmemdup()
- fix EDID includes
- documentation fixes
qaic:
- fixes to BO handling
- make use of DRM managed release
- fix order of remove operations
rockchip:
- analogix_dp: get encoder port from DT
- inno_hdmi: support HDMI for RK3128
- lvds: error-handling fixes
ssd130x:
- support SSD133x plus DT bindings
tegra:
- fix error handling
tilcdc:
- make use of DRM managed release
v3d:
- show memory stats in debugfs
- Support display MMU page size
vc4:
- fix error handling in plane prepare_fb
- fix framebuffer test in plane helpers
virtio:
- add venus capset defines
vkms:
- fix OOB access when programming the LUT
- Kconfig improvements
vmwgfx:
- unmap surface before changing plane state
- fix memory leak in error handling
- documentation fixes
- list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid
- fix null-pointer deref in execbuf
- refactor display-mode probing
- fix fencing for creating cursor MOBs
- fix cursor-memory lifetime
xlnx:
- fix live video input for ZynqMP DPSUB
lima:
- fix memory leak
loongson:
- fail if no VRAM present
meson:
- switch to new drm_bridge_read_edid() interface
renesas:
- add RZ/G2L DU support plus DT bindings
mxsfb:
- Use managed mode config
sun4i:
- HDMI: updates to atomic mode setting
mediatek:
- Add display driver for MT8188 VDOSYS1
- DSI driver cleanups
- Filter modes according to hardware capability
- Fix a null pointer crash in mtk_drm_crtc_finish_page_flip
etnaviv:
- enhancements for NPU and MRT support"
* tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits)
drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo
drm/amd/pm: wait for completion of the EnableGfxImu message
drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1
drm/amdgpu: add smu 14.0.1 support
drm/amdgpu: add VPE 6.1.1 discovery support
drm/amdgpu/vpe: add VPE 6.1.1 support
drm/amdgpu/vpe: don't emit cond exec command under collaborate mode
drm/amdgpu/vpe: add collaborate mode support for VPE
drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE
drm/amdgpu/vpe: add multi instance VPE support
drm/amdgpu/discovery: add nbif v6_3_1 ip block
drm/amdgpu: Add nbif v6_3_1 ip block support
drm/amdgpu: Add pcie v6_1_0 ip headers (v5)
drm/amdgpu: Add nbif v6_3_1 ip headers (v5)
arch/powerpc: Remove <linux/fb.h> from backlight code
macintosh/via-pmu-backlight: Include <linux/backlight.h>
fbdev/chipsfb: Include <linux/backlight.h>
drm/etnaviv: Restore some id values
drm/amdkfd: make kfd_class constant
drm/amdgpu: add ring timeout information in devcoredump
...
|
||
|
|
1715f710e7 |
Merge tag 'fsnotify_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara: - fsnotify optimizations to reduce cost of fsnotify when nobody is watching - fix longstanding wart that system could not be suspended when some process was waiting for response to fanotify permission event - some spelling fixes * tag 'fsnotify_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fanotify: allow freeze when waiting response for permission events fanotify: Fix misspelling of "writable" fsnotify: Fix misspelling of "writable" inotify: Fix misspelling of "writable" fsnotify: Add fsnotify_sb_has_watchers() helper fsnotify: optimize the case of no parent watcher |
||
|
|
69afef4af4 |
Merge tag 'gpio-updates-for-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"The biggest feature is the locking overhaul. Up until now the
synchronization in the GPIO subsystem was broken. There was a single
spinlock "protecting" multiple data structures but doing it wrong (as
evidenced by several places where it would be released when a sleeping
function was called and then reacquired without checking the protected
state).
We tried to use an RW semaphore before but the main issue with GPIO is
that we have drivers implementing the interfaces in both sleeping and
non-sleeping ways as well as user-facing interfaces that can be called
both from process as well as atomic contexts. Both ends converge in
the same code paths that can use neither spinlocks nor mutexes. The
only reasonable way out is to use SRCU and go mostly lockless. To that
end: we add several SRCU structs in relevant places and use them to
assure consistency between API calls together with atomic reads and
writes of GPIO descriptor flags where it makes sense.
This code has spent several weeks in next and has received several
fixes in the first week or two after which it stabilized nicely. The
GPIO subsystem is now resilient to providers being suddenly unbound.
We managed to also remove the existing character device RW semaphore
and the obsolete global spinlock.
Other than the locking rework we have one new driver (for Chromebook
EC), much appreciated documentation improvements from Kent and the
regular driver improvements, DT-bindings updates and GPIOLIB core
tweaks.
Serialization rework:
- use SRCU to serialize access to the global GPIO device list, to
GPIO device structs themselves and to GPIO descriptors
- make the GPIO subsystem resilient to the GPIO providers being
unbound while the API calls are in progress
- don't dereference the SRCU-protected chip pointer if the
information we need can be obtained from the GPIO device structure
- move some of the information contained in struct gpio_chip to
struct gpio_device to further reduce the need to dereference the
former
- pass the GPIO device struct instead of the GPIO chip to sysfs
callback to, again, reduce the need for accessing the latter
- get GPIO descriptors from the GPIO device, not from the chip for
the same reason
- allow for mostly lockless operation of the GPIO driver API: assure
consistency with SRCU and atomic operations
- remove the global GPIO spinlock
- remove the character device RW semaphore
Core GPIOLIB:
- constify pointers in GPIO API where applicable
- unify the GPIO counting APIs for ACPI and OF
- provide a macro for iterating over all GPIOs, not only the ones
that are requested
- remove leftover typedefs
- pass the consumer device to GPIO core in
devm_fwnode_gpiod_get_index() for improved logging
- constify the GPIO bus type
- don't warn about removing GPIO chips with descriptors still held by
users as we can now handle this situation gracefully
- remove unused logging helpers
- unexport functions that are only used internally in the GPIO
subsystem
- set the device type (assign the relevant struct device_type) for
GPIO devices
New drivers:
- add the ChromeOS EC GPIO driver
Driver improvements:
- allow building gpio-vf610 with COMPILE_TEST as well as disabling it
in menuconfig (before it was always built for i.MX cofigs)
- count the number of EICs using the device properties instead of
hard-coding it in gpio-eic-sprd
- improve the device naming, extend the debugfs output and add
lockdep asserts to gpio-sim
DT bindings:
- document the 'label' property for gpio-pca9570
- convert aspeed,ast2400-gpio bindings to DT schema
- disallow unevaluated properties for gpio-mvebu
- document a new model in renesas,rcar-gpio
Documentation:
- improve the character device kerneldocs in user-space headers
- add proper documentation for the character device uAPI (both v1 and v2)
- move the sysfs and gpio-mockup docs into the "obsolete" section
- improve naming consistency for GPIO terms
- clarify the line values description for sysfs
- minor docs improvements
- improve the driver API contract for setting GPIO direction
- mark unsafe APIs as deprecated in kerneldocs and suggest
replacements
Other:
- remove an obsolete test from selftests"
* tag 'gpio-updates-for-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (79 commits)
gpio: sysfs: repair export returning -EPERM on 1st attempt
selftest: gpio: remove obsolete gpio-mockup test
gpiolib: Deduplicate cleanup for-loop in gpiochip_add_data_with_key()
dt-bindings: gpio: aspeed,ast2400-gpio: Convert to DT schema
gpio: acpi: Make acpi_gpio_count() take firmware node as a parameter
gpio: of: Make of_gpio_get_count() take firmware node as a parameter
gpiolib: Pass consumer device through to core in devm_fwnode_gpiod_get_index()
gpio: sim: use for_each_hwgpio()
gpio: provide for_each_hwgpio()
gpio: don't warn about removing GPIO chips with active users anymore
gpio: sim: delimit the fwnode name with a ":" when generating labels
gpio: sim: add lockdep asserts
gpio: Add ChromeOS EC GPIO driver
gpio: constify of_phandle_args in of_find_gpio_device_by_xlate()
gpio: fix memory leak in gpiod_request_commit()
gpio: constify opaque pointer "data" in gpio_device_find()
gpio: cdev: fix a NULL-pointer dereference with DEBUG enabled
gpio: uapi: clarify default_values being logical
gpio: sysfs: fix inverted pointer logic
gpio: don't let lockdep complain about inherently dangerous RCU usage
...
|
||
|
|
8c9c2f851b |
Merge tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core changes:
- Constification of bus_type pointer
- Preparations for user-space page-fault delivery
- Use a named kmem_cache for IOVA magazines
Intel VT-d changes from Lu Baolu:
- Add RBTree to track iommu probed devices
- Add Intel IOMMU debugfs document
- Cleanup and refactoring
ARM-SMMU Updates from Will Deacon:
- Device-tree binding updates for a bunch of Qualcomm SoCs
- SMMUv2: Support for Qualcomm X1E80100 MDSS
- SMMUv3: Significant rework of the driver's STE manipulation and
domain handling code. This is the initial part of a larger scale
rework aiming to improve the driver's implementation of the
IOMMU-API in preparation for hooking up IOMMUFD support.
AMD-Vi Updates:
- Refactor GCR3 table support for SVA
- Cleanups
Some smaller cleanups and fixes"
* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu: Fix compilation without CONFIG_IOMMU_INTEL
iommu/amd: Fix sleeping in atomic context
iommu/dma: Document min_align_mask assumption
iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
iommu/vt-d: Remove scalable mode context entry setup from attach_dev
iommu/vt-d: Setup scalable mode context entry in probe path
iommu/vt-d: Fix NULL domain on device release
iommu: Add static iommu_ops->release_domain
iommu/vt-d: Improve ITE fault handling if target device isn't present
iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
PCI: Make pci_dev_is_disconnected() helper public for other drivers
iommu/vt-d: Use device rbtree in iopf reporting path
iommu/vt-d: Use rbtree to track iommu probed devices
iommu/vt-d: Merge intel_svm_bind_mm() into its caller
iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
iommu/vt-d: Add the document for Intel IOMMU debugfs
iommu/vt-d: Use kcalloc() instead of kzalloc()
iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
...
|
||
|
|
cc4a875cf3 |
Merge tag 'lsm-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore: - Promote IMA/EVM to a proper LSM This is the bulk of the diffstat, and the source of all the changes in the VFS code. Prior to the start of the LSM stacking work it was important that IMA/EVM were separate from the rest of the LSMs, complete with their own hooks, infrastructure, etc. as it was the only way to enable IMA/EVM at the same time as a LSM. However, now that the bulk of the LSM infrastructure supports multiple simultaneous LSMs, we can simplify things greatly by bringing IMA/EVM into the LSM infrastructure as proper LSMs. This is something I've wanted to see happen for quite some time and Roberto was kind enough to put in the work to make it happen. - Use the LSM hook default values to simplify the call_int_hook() macro Previously the call_int_hook() macro required callers to supply a default return value, despite a default value being specified when the LSM hook was defined. This simplifies the macro by using the defined default return value which makes life easier for callers and should also reduce the number of return value bugs in the future (we've had a few pop up recently, hence this work). - Use the KMEM_CACHE() macro instead of kmem_cache_create() The guidance appears to be to use the KMEM_CACHE() macro when possible and there is no reason why we can't use the macro, so let's use it. - Fix a number of comment typos in the LSM hook comment blocks Not much to say here, we fixed some questionable grammar decisions in the LSM hook comment blocks. * tag 'lsm-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (28 commits) cred: Use KMEM_CACHE() instead of kmem_cache_create() lsm: use default hook return value in call_int_hook() lsm: fix typos in security/security.c comment headers integrity: Remove LSM ima: Make it independent from 'integrity' LSM evm: Make it independent from 'integrity' LSM evm: Move to LSM infrastructure ima: Move IMA-Appraisal to LSM infrastructure ima: Move to LSM infrastructure integrity: Move integrity_kernel_module_request() to IMA security: Introduce key_post_create_or_update hook security: Introduce inode_post_remove_acl hook security: Introduce inode_post_set_acl hook security: Introduce inode_post_create_tmpfile hook security: Introduce path_post_mknod hook security: Introduce file_release hook security: Introduce file_post_open hook security: Introduce inode_post_removexattr hook security: Introduce inode_post_setattr hook security: Align inode_setattr hook definition with EVM ... |
||
|
|
9187210eee |
Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Large effort by Eric to lower rtnl_lock pressure and remove locks:
- Make commonly used parts of rtnetlink (address, route dumps
etc) lockless, protected by RCU instead of rtnl_lock.
- Add a netns exit callback which already holds rtnl_lock,
allowing netns exit to take rtnl_lock once in the core instead
of once for each driver / callback.
- Remove locks / serialization in the socket diag interface.
- Remove 6 calls to synchronize_rcu() while holding rtnl_lock.
- Remove the dev_base_lock, depend on RCU where necessary.
- Support busy polling on a per-epoll context basis. Poll length and
budget parameters can be set independently of system defaults.
- Introduce struct net_hotdata, to make sure read-mostly global
config variables fit in as few cache lines as possible.
- Add optional per-nexthop statistics to ease monitoring / debug of
ECMP imbalance problems.
- Support TCP_NOTSENT_LOWAT in MPTCP.
- Ensure that IPv6 temporary addresses' preferred lifetimes are long
enough, compared to other configured lifetimes, and at least 2 sec.
- Support forwarding of ICMP Error messages in IPSec, per RFC 4301.
- Add support for the independent control state machine for bonding
per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
control state machine.
- Add "network ID" to MCTP socket APIs to support hosts with multiple
disjoint MCTP networks.
- Re-use the mono_delivery_time skbuff bit for packets which user
space wants to be sent at a specified time. Maintain the timing
information while traversing veth links, bridge etc.
- Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.
- Simplify many places iterating over netdevs by using an xarray
instead of a hash table walk (hash table remains in place, for use
on fastpaths).
- Speed up scanning for expired routes by keeping a dedicated list.
- Speed up "generic" XDP by trying harder to avoid large allocations.
- Support attaching arbitrary metadata to netconsole messages.
Things we sprinkled into general kernel code:
- Enforce VM_IOREMAP flag and range in ioremap_page_range and
introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
bpf_arena).
- Rework selftest harness to enable the use of the full range of ksft
exit code (pass, fail, skip, xfail, xpass).
Netfilter:
- Allow userspace to define a table that is exclusively owned by a
daemon (via netlink socket aliveness) without auto-removing this
table when the userspace program exits. Such table gets marked as
orphaned and a restarting management daemon can re-attach/regain
ownership.
- Speed up element insertions to nftables' concatenated-ranges set
type. Compact a few related data structures.
BPF:
- Add BPF token support for delegating a subset of BPF subsystem
functionality from privileged system-wide daemons such as systemd
through special mount options for userns-bound BPF fs to a trusted
& unprivileged application.
- Introduce bpf_arena which is sparse shared memory region between
BPF program and user space where structures inside the arena can
have pointers to other areas of the arena, and pointers work
seamlessly for both user-space programs and BPF programs.
- Introduce may_goto instruction that is a contract between the
verifier and the program. The verifier allows the program to loop
assuming it's behaving well, but reserves the right to terminate
it.
- Extend the BPF verifier to enable static subprog calls in spin lock
critical sections.
- Support registration of struct_ops types from modules which helps
projects like fuse-bpf that seeks to implement a new struct_ops
type.
- Add support for retrieval of cookies for perf/kprobe multi links.
- Support arbitrary TCP SYN cookie generation / validation in the TC
layer with BPF to allow creating SYN flood handling in BPF
firewalls.
- Add code generation to inline the bpf_kptr_xchg() helper which
improves performance when stashing/popping the allocated BPF
objects.
Wireless:
- Add SPP (signaling and payload protected) AMSDU support.
- Support wider bandwidth OFDMA, as required for EHT operation.
Driver API:
- Major overhaul of the Energy Efficient Ethernet internals to
support new link modes (2.5GE, 5GE), share more code between
drivers (especially those using phylib), and encourage more
uniform behavior. Convert and clean up drivers.
- Define an API for querying per netdev queue statistics from
drivers.
- IPSec: account in global stats for fully offloaded sessions.
- Create a concept of Ethernet PHY Packages at the Device Tree level,
to allow parameterizing the existing PHY package code.
- Enable Rx hashing (RSS) on GTP protocol fields.
Misc:
- Improvements and refactoring all over networking selftests.
- Create uniform module aliases for TC classifiers, actions, and
packet schedulers to simplify creating modprobe policies.
- Address all missing MODULE_DESCRIPTION() warnings in networking.
- Extend the Netlink descriptions in YAML to cover message
encapsulation or "Netlink polymorphism", where interpretation of
nested attributes depends on link type, classifier type or some
other "class type".
Drivers:
- Ethernet high-speed NICs:
- Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
- Intel (100G, ice, idpf):
- support E825-C devices
- nVidia/Mellanox:
- support devices with one port and multiple PCIe links
- Broadcom (bnxt):
- support n-tuple filters
- support configuring the RSS key
- Wangxun (ngbe/txgbe):
- implement irq_domain for TXGBE's sub-interrupts
- Pensando/AMD:
- support XDP
- optimize queue submission and wakeup handling (+17% bps)
- optimize struct layout, saving 28% of memory on queues
- Ethernet NICs embedded and virtual:
- Google cloud vNIC:
- refactor driver to perform memory allocations for new queue
config before stopping and freeing the old queue memory
- Synopsys (stmmac):
- obey queueMaxSDU and implement counters required by 802.1Qbv
- Renesas (ravb):
- support packet checksum offload
- suspend to RAM and runtime PM support
- Ethernet switches:
- nVidia/Mellanox:
- support for nexthop group statistics
- Microchip:
- ksz8: implement PHY loopback
- add support for KSZ8567, a 7-port 10/100Mbps switch
- PTP:
- New driver for RENESAS FemtoClock3 Wireless clock generator.
- Support OCP PTP cards designed and built by Adva.
- CAN:
- Support recvmsg() flags for own, local and remote traffic on CAN
BCM sockets.
- Support for esd GmbH PCIe/402 CAN device family.
- m_can:
- Rx/Tx submission coalescing
- wake on frame Rx
- WiFi:
- Intel (iwlwifi):
- enable signaling and payload protected A-MSDUs
- support wider-bandwidth OFDMA
- support for new devices
- bump FW API to 89 for AX devices; 90 for BZ/SC devices
- MediaTek (mt76):
- mt7915: newer ADIE version support
- mt7925: radio temperature sensor support
- Qualcomm (ath11k):
- support 6 GHz station power modes: Low Power Indoor (LPI),
Standard Power) SP and Very Low Power (VLP)
- QCA6390 & WCN6855: support 2 concurrent station interfaces
- QCA2066 support
- Qualcomm (ath12k):
- refactoring in preparation for Multi-Link Operation (MLO)
support
- 1024 Block Ack window size support
- firmware-2.bin support
- support having multiple identical PCI devices (firmware needs
to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
- QCN9274: support split-PHY devices
- WCN7850: enable Power Save Mode in station mode
- WCN7850: P2P support
- RealTek:
- rtw88: support for more rtw8811cu and rtw8821cu devices
- rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
- rtlwifi: speed up USB firmware initialization
- rtwl8xxxu:
- RTL8188F: concurrent interface support
- Channel Switch Announcement (CSA) support in AP mode
- Broadcom (brcmfmac):
- per-vendor feature support
- per-vendor SAE password setup
- DMI nvram filename quirk for ACEPC W5 Pro"
* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
nexthop: Fix out-of-bounds access during attribute validation
nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
nexthop: Only parse NHA_OP_FLAGS for get messages that require it
bpf: move sleepable flag from bpf_prog_aux to bpf_prog
bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
selftests/bpf: Add kprobe multi triggering benchmarks
ptp: Move from simple ida to xarray
vxlan: Remove generic .ndo_get_stats64
vxlan: Do not alloc tstats manually
devlink: Add comments to use netlink gen tool
nfp: flower: handle acti_netdevs allocation failure
net/packet: Add getsockopt support for PACKET_COPY_THRESH
net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
selftests/bpf: Add bpf_arena_htab test.
selftests/bpf: Add bpf_arena_list test.
selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
bpf: Add helper macro bpf_addr_space_cast()
libbpf: Recognize __arena global variables.
bpftool: Recognize arena map type
...
|
||
|
|
43a7548e28 |
Merge tag 'for-6.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Mostly stabilization, refactoring and cleanup changes. There rest are
minor performance optimizations due to caching or lock contention
reduction and a few notable fixes.
Performance improvements:
- minor speedup in logging when repeatedly allocated structure is
preallocated only once, improves latency and decreases lock
contention
- minor throughput increase (+6%), reduced lock contention after
clearing delayed allocation bits, applies to several common
workload types
- skip full quota rescan if a new relation is added in the same
transaction
Fixes:
- zstd fix for inline compressed file in subpage mode, updated
version from the 6.8 time
- proper qgroup inheritance ioctl parameter validation
- more fiemap followup fixes after reduced locking done in 6.8:
- fix race when detecting delalloc ranges
Core changes:
- more debugging code:
- added assertions for a very rare crash in raid56 calculation
- tree-checker dumps page state to give more insights into
possible reference counting issues
- add checksum calculation offloading sysfs knob, for now enabled
under DEBUG only to determine a good heuristic for deciding the
offload or synchronous, depends on various factors (block group
profile, device speed) and is not as clear as initially thought
(checksum type)
- error handling improvements, added assertions
- more page to folio conversion (defrag, truncate), cached size and
shift
- preparation for more fine grained locking of sectors in subpage
mode
- cleanups and refactoring:
- include cleanups, forward declarations
- pointer-to-structure helpers
- redundant argument removals
- removed unused code
- slab cache updates, last use of SLAB_MEM_SPREAD removed"
* tag 'for-6.9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (114 commits)
btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations
btrfs: fix race when detecting delalloc ranges during fiemap
btrfs: fix off-by-one chunk length calculation at contains_pending_extent()
btrfs: qgroup: allow quick inherit if snapshot is created and added to the same parent
btrfs: qgroup: validate btrfs_qgroup_inherit parameter
btrfs: include device major and minor numbers in the device scan notice
btrfs: mark btrfs_put_caching_control() static
btrfs: remove SLAB_MEM_SPREAD flag use
btrfs: qgroup: always free reserved space for extent records
btrfs: tree-checker: dump the page status if hit something wrong
btrfs: compression: remove dead comments in btrfs_compress_heuristic()
btrfs: subpage: make writer lock utilize bitmap
btrfs: subpage: make reader lock utilize bitmap
btrfs: unexport btrfs_subpage_start_writer() and btrfs_subpage_end_and_test_writer()
btrfs: pass a valid extent map cache pointer to __get_extent_map()
btrfs: merge btrfs_del_delalloc_inode() helpers
btrfs: pass btrfs_device to btrfs_scratch_superblocks()
btrfs: handle transaction commit errors in flush_reservations()
btrfs: use KMEM_CACHE() to create btrfs_free_space cache
btrfs: use KMEM_CACHE() to create delayed ref caches
...
|
||
|
|
5f20e6ab1f |
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says: ==================== pull-request: bpf-next 2024-03-11 We've added 59 non-merge commits during the last 9 day(s) which contain a total of 88 files changed, 4181 insertions(+), 590 deletions(-). The main changes are: 1) Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce VM_SPARSE kind and vm_area_[un]map_pages to be used in bpf_arena, from Alexei. 2) Introduce bpf_arena which is sparse shared memory region between bpf program and user space where structures inside the arena can have pointers to other areas of the arena, and pointers work seamlessly for both user-space programs and bpf programs, from Alexei and Andrii. 3) Introduce may_goto instruction that is a contract between the verifier and the program. The verifier allows the program to loop assuming it's behaving well, but reserves the right to terminate it, from Alexei. 4) Use IETF format for field definitions in the BPF standard document, from Dave. 5) Extend struct_ops libbpf APIs to allow specify version suffixes for stuct_ops map types, share the same BPF program between several map definitions, and other improvements, from Eduard. 6) Enable struct_ops support for more than one page in trampolines, from Kui-Feng. 7) Support kCFI + BPF on riscv64, from Puranjay. 8) Use bpf_prog_pack for arm64 bpf trampoline, from Puranjay. 9) Fix roundup_pow_of_two undefined behavior on 32-bit archs, from Toke. ==================== Link: https://lore.kernel.org/r/20240312003646.8692-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
38b334fc76 |
Merge tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov: - Add the x86 part of the SEV-SNP host support. This will allow the kernel to be used as a KVM hypervisor capable of running SNP (Secure Nested Paging) guests. Roughly speaking, SEV-SNP is the ultimate goal of the AMD confidential computing side, providing the most comprehensive confidential computing environment up to date. This is the x86 part and there is a KVM part which did not get ready in time for the merge window so latter will be forthcoming in the next cycle. - Rework the early code's position-dependent SEV variable references in order to allow building the kernel with clang and -fPIE/-fPIC and -mcmodel=kernel - The usual set of fixes, cleanups and improvements all over the place * tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/sev: Disable KMSAN for memory encryption TUs x86/sev: Dump SEV_STATUS crypto: ccp - Have it depend on AMD_IOMMU iommu/amd: Fix failure return from snp_lookup_rmpentry() x86/sev: Fix position dependent variable references in startup code crypto: ccp: Make snp_range_list static x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT Documentation: virt: Fix up pre-formatted text block for SEV ioctls crypto: ccp: Add the SNP_SET_CONFIG command crypto: ccp: Add the SNP_COMMIT command crypto: ccp: Add the SNP_PLATFORM_STATUS command x86/cpufeatures: Enable/unmask SEV-SNP CPU feature KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown crypto: ccp: Handle legacy SEV commands when SNP is enabled crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled crypto: ccp: Handle the legacy TMR allocation when SNP is enabled x86/sev: Introduce an SNP leaked pages list crypto: ccp: Provide an API to issue SEV and SNP commands ... |
||
|
|
eaf657f7ad |
devlink: Add comments to use netlink gen tool
Add the comment to remind people not to manually modify the net/devlink/netlink_gen.c, but to use tools/net/ynl/ynl-regen.sh to generate it. Signed-off-by: William Tu <witu@nvidia.com> Suggested-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240310145503.32721-1-witu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
667a86ad9b |
bpf: Disasm support for addr_space_cast instruction.
LLVM generates rX = addr_space_cast(rY, dst_addr_space, src_addr_space) instruction when pointers in non-zero address space are used by the bpf program. Recognize this insn in uapi and in bpf disassembler. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-3-alexei.starovoitov@gmail.com |
||
|
|
317460317a |
bpf: Introduce bpf_arena.
Introduce bpf_arena, which is a sparse shared memory region between the bpf program and user space. Use cases: 1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed anonymous region, like memcached or any key/value storage. The bpf program implements an in-kernel accelerator. XDP prog can search for a key in bpf_arena and return a value without going to user space. 2. The bpf program builds arbitrary data structures in bpf_arena (hash tables, rb-trees, sparse arrays), while user space consumes it. 3. bpf_arena is a "heap" of memory from the bpf program's point of view. The user space may mmap it, but bpf program will not convert pointers to user base at run-time to improve bpf program speed. Initially, the kernel vm_area and user vma are not populated. User space can fault in pages within the range. While servicing a page fault, bpf_arena logic will insert a new page into the kernel and user vmas. The bpf program can allocate pages from that region via bpf_arena_alloc_pages(). This kernel function will insert pages into the kernel vm_area. The subsequent fault-in from user space will populate that page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena creation time can be used to prevent fault-in from user space. In such a case, if a page is not allocated by the bpf program and not present in the kernel vm_area, the user process will segfault. This is useful for use cases 2 and 3 above. bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages either at a specific address within the arena or allocates a range with the maple tree. bpf_arena_free_pages() is analogous to munmap(), which frees pages and removes the range from the kernel vm_area and from user process vmas. bpf_arena can be used as a bpf program "heap" of up to 4GB. The speed of bpf program is more important than ease of sharing with user space. This is use case 3. In such a case, the BPF_F_NO_USER_CONV flag is recommended. It will tell the verifier to treat the rX = bpf_arena_cast_user(rY) instruction as a 32-bit move wX = wY, which will improve bpf prog performance. Otherwise, bpf_arena_cast_user is translated by JIT to conditionally add the upper 32 bits of user vm_start (if the pointer is not NULL) to arena pointers before they are stored into memory. This way, user space sees them as valid 64-bit pointers. Diff https://github.com/llvm/llvm-project/pull/84410 enables LLVM BPF backend generate the bpf_addr_space_cast() instruction to cast pointers between address_space(1) which is reserved for bpf_arena pointers and default address space zero. All arena pointers in a bpf program written in C language are tagged as __attribute__((address_space(1))). Hence, clang provides helpful diagnostics when pointers cross address space. Libbpf and the kernel support only address_space == 1. All other address space identifiers are reserved. rX = bpf_addr_space_cast(rY, /* dst_as */ 1, /* src_as */ 0) tells the verifier that rX->type = PTR_TO_ARENA. Any further operations on PTR_TO_ARENA register have to be in the 32-bit domain. The verifier will mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate them as kern_vm_start + 32bit_addr memory accesses. The behavior is similar to copy_from_kernel_nofault() except that no address checks are necessary. The address is guaranteed to be in the 4GB range. If the page is not present, the destination register is zeroed on read, and the operation is ignored on write. rX = bpf_addr_space_cast(rY, 0, 1) tells the verifier that rX->type = unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the verifier converts such cast instructions to mov32. Otherwise, JIT will emit native code equivalent to: rX = (u32)rY; if (rY) rX |= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */ After such conversion, the pointer becomes a valid user pointer within bpf_arena range. The user process can access data structures created in bpf_arena without any additional computations. For example, a linked list built by a bpf program can be walked natively by user space. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Barret Rhoden <brho@google.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-2-alexei.starovoitov@gmail.com |
||
|
|
1ddeeb2a05 |
Merge tag 'for-6.9/block-20240310' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- MD pull requests via Song:
- Cleanup redundant checks (Yu Kuai)
- Remove deprecated headers (Marc Zyngier, Song Liu)
- Concurrency fixes (Li Lingfeng)
- Memory leak fix (Li Nan)
- Refactor raid1 read_balance (Yu Kuai, Paul Luse)
- Clean up and fix for md_ioctl (Li Nan)
- Other small fixes (Gui-Dong Han, Heming Zhao)
- MD atomic limits (Christoph)
- NVMe pull request via Keith:
- RDMA target enhancements (Max)
- Fabrics fixes (Max, Guixin, Hannes)
- Atomic queue_limits usage (Christoph)
- Const use for class_register (Ricardo)
- Identification error handling fixes (Shin'ichiro, Keith)
- Improvement and cleanup for cached request handling (Christoph)
- Moving towards atomic queue limits. Core changes and driver bits so
far (Christoph)
- Fix UAF issues in aoeblk (Chun-Yi)
- Zoned fix and cleanups (Damien)
- s390 dasd cleanups and fixes (Jan, Miroslav)
- Block issue timestamp caching (me)
- noio scope guarding for zoned IO (Johannes)
- block/nvme PI improvements (Kanchan)
- Ability to terminate long running discard loop (Keith)
- bdev revalidation fix (Li)
- Get rid of old nr_queues hack for kdump kernels (Ming)
- Support for async deletion of ublk (Ming)
- Improve IRQ bio recycling (Pavel)
- Factor in CPU capacity for remote vs local completion (Qais)
- Add shared_tags configfs entry for null_blk (Shin'ichiro
- Fix for a regression in page refcounts introduced by the folio
unification (Tony)
- Misc fixes and cleanups (Arnd, Colin, John, Kunwu, Li, Navid,
Ricardo, Roman, Tang, Uwe)
* tag 'for-6.9/block-20240310' of git://git.kernel.dk/linux: (221 commits)
block: partitions: only define function mac_fix_string for CONFIG_PPC_PMAC
block/swim: Convert to platform remove callback returning void
cdrom: gdrom: Convert to platform remove callback returning void
block: remove disk_stack_limits
md: remove mddev->queue
md: don't initialize queue limits
md/raid10: use the atomic queue limit update APIs
md/raid5: use the atomic queue limit update APIs
md/raid1: use the atomic queue limit update APIs
md/raid0: use the atomic queue limit update APIs
md: add queue limit helpers
md: add a mddev_is_dm helper
md: add a mddev_add_trace_msg helper
md: add a mddev_trace_remap helper
bcache: move calculation of stripe_size and io_opt into bcache_device_init
virtio_blk: Do not use disk_set_max_open/active_zones()
aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts
block: move capacity validation to blkpg_do_ioctl()
block: prevent division by zero in blk_rq_stat_sum()
drbd: atomically update queue limits in drbd_reconsider_queue_parameters
...
|
||
|
|
d2c84bdce2 |
Merge tag 'for-6.9/io_uring-20240310' of git://git.kernel.dk/linux
Pull io_uring updates from Jens Axboe: - Make running of task_work internal loops more fair, and unify how the different methods deal with them (me) - Support for per-ring NAPI. The two minor networking patches are in a shared branch with netdev (Stefan) - Add support for truncate (Tony) - Export SQPOLL utilization stats (Xiaobing) - Multishot fixes (Pavel) - Fix for a race in manipulating the request flags via poll (Pavel) - Cleanup the multishot checking by making it generic, moving it out of opcode handlers (Pavel) - Various tweaks and cleanups (me, Kunwu, Alexander) * tag 'for-6.9/io_uring-20240310' of git://git.kernel.dk/linux: (53 commits) io_uring: Fix sqpoll utilization check racing with dying sqpoll io_uring/net: dedup io_recv_finish req completion io_uring: refactor DEFER_TASKRUN multishot checks io_uring: fix mshot io-wq checks io_uring/net: add io_req_msg_cleanup() helper io_uring/net: simplify msghd->msg_inq checking io_uring/kbuf: rename REQ_F_PARTIAL_IO to REQ_F_BL_NO_RECYCLE io_uring/net: remove dependency on REQ_F_PARTIAL_IO for sr->done_io io_uring/net: correctly handle multishot recvmsg retry setup io_uring/net: clear REQ_F_BL_EMPTY in the multishot retry handler io_uring: fix io_queue_proc modifying req->flags io_uring: fix mshot read defer taskrun cqe posting io_uring/net: fix overflow check in io_recvmsg_mshot_prep() io_uring/net: correct the type of variable io_uring/sqpoll: statistics of the true utilization of sq threads io_uring/net: move recv/recvmsg flags out of retry loop io_uring/kbuf: flag request if buffer pool is empty after buffer pick io_uring/net: improve the usercopy for sendmsg/recvmsg io_uring/net: move receive multishot out of the generic msghdr path io_uring/net: unify how recvmsg and sendmsg copy in the msghdr ... |
||
|
|
0f1a876682 |
Merge tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs uuid updates from Christian Brauner: "This adds two new ioctl()s for getting the filesystem uuid and retrieving the sysfs path based on the path of a mounted filesystem. Getting the filesystem uuid has been implemented in filesystem specific code for a while it's now lifted as a generic ioctl" * tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: xfs: add support for FS_IOC_GETFSSYSFSPATH fs: add FS_IOC_GETFSSYSFSPATH fat: Hook up sb->s_uuid fs: FS_IOC_GETUUID ovl: convert to super_set_uuid() fs: super_set_uuid() |
||
|
|
b5683a37c8 |
Merge tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pdfd updates from Christian Brauner:
- Until now pidfds could only be created for thread-group leaders but
not for threads. There was no technical reason for this. We simply
had no users that needed support for this. Now we do have users that
need support for this.
This introduces a new PIDFD_THREAD flag for pidfd_open(). If that
flag is set pidfd_open() creates a pidfd that refers to a specific
thread.
In addition, we now allow clone() and clone3() to be called with
CLONE_PIDFD | CLONE_THREAD which wasn't possible before.
A pidfd that refers to an individual thread differs from a pidfd that
refers to a thread-group leader:
(1) Pidfds are pollable. A task may poll a pidfd and get notified
when the task has exited.
For thread-group leader pidfds the polling task is woken if the
thread-group is empty. In other words, if the thread-group
leader task exits when there are still threads alive in its
thread-group the polling task will not be woken when the
thread-group leader exits but rather when the last thread in the
thread-group exits.
For thread-specific pidfds the polling task is woken if the
thread exits.
(2) Passing a thread-group leader pidfd to pidfd_send_signal() will
generate thread-group directed signals like kill(2) does.
Passing a thread-specific pidfd to pidfd_send_signal() will
generate thread-specific signals like tgkill(2) does.
The default scope of the signal is thus determined by the type
of the pidfd.
Since use-cases exist where the default scope of the provided
pidfd needs to be overriden the following flags are added to
pidfd_send_signal():
- PIDFD_SIGNAL_THREAD
Send a thread-specific signal.
- PIDFD_SIGNAL_THREAD_GROUP
Send a thread-group directed signal.
- PIDFD_SIGNAL_PROCESS_GROUP
Send a process-group directed signal.
The scope change will only work if the struct pid is actually
used for this scope.
For example, in order to send a thread-group directed signal the
provided pidfd must be used as a thread-group leader and
similarly for PIDFD_SIGNAL_PROCESS_GROUP the struct pid must be
used as a process group leader.
- Move pidfds from the anonymous inode infrastructure to a tiny pseudo
filesystem. This will unblock further work that we weren't able to do
simply because of the very justified limitations of anonymous inodes.
Moving pidfds to a tiny pseudo filesystem allows for statx on pidfds
to become useful for the first time. They can now be compared by
inode number which are unique for the system lifetime.
Instead of stashing struct pid in file->private_data we can now stash
it in inode->i_private. This makes it possible to introduce concepts
that operate on a process once all file descriptors have been closed.
A concrete example is kill-on-last-close. Another side-effect is that
file->private_data is now freed up for per-file options for pidfds.
Now, each struct pid will refer to a different inode but the same
struct pid will refer to the same inode if it's opened multiple
times. In contrast to now where each struct pid refers to the same
inode.
The tiny pseudo filesystem is not visible anywhere in userspace
exactly like e.g., pipefs and sockfs. There's no lookup, there's no
complex inode operations, nothing. Dentries and inodes are always
deleted when the last pidfd is closed.
We allocate a new inode and dentry for each struct pid and we reuse
that inode and dentry for all pidfds that refer to the same struct
pid. The code is entirely optional and fairly small. If it's not
selected we fallback to anonymous inodes. Heavily inspired by nsfs.
The dentry and inode allocation mechanism is moved into generic
infrastructure that is now shared between nsfs and pidfs. The
path_from_stashed() helper must be provided with a stashing location,
an inode number, a mount, and the private data that is supposed to be
used and it will provide a path that can be passed to dentry_open().
The helper will try retrieve an existing dentry from the provided
stashing location. If a valid dentry is found it is reused. If not a
new one is allocated and we try to stash it in the provided location.
If this fails we retry until we either find an existing dentry or the
newly allocated dentry could be stashed. Subsequent openers of the
same namespace or task are then able to reuse it.
- Currently it is only possible to get notified when a task has exited,
i.e., become a zombie and userspace gets notified with EPOLLIN. We
now also support waiting until the task has been reaped, notifying
userspace with EPOLLHUP.
- Ensure that ESRCH is reported for getfd if a task is exiting instead
of the confusing EBADF.
- Various smaller cleanups to pidfd functions.
* tag 'vfs-6.9.pidfd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
libfs: improve path_from_stashed()
libfs: add stashed_dentry_prune()
libfs: improve path_from_stashed() helper
pidfs: convert to path_from_stashed() helper
nsfs: convert to path_from_stashed() helper
libfs: add path_from_stashed()
pidfd: add pidfs
pidfd: move struct pidfd_fops
pidfd: allow to override signal scope in pidfd_send_signal()
pidfd: change pidfd_send_signal() to respect PIDFD_THREAD
signal: fill in si_code in prepare_kill_siginfo()
selftests: add ESRCH tests for pidfd_getfd()
pidfd: getfd should always report ESRCH if a task is exiting
pidfd: clone: allow CLONE_THREAD | CLONE_PIDFD together
pidfd: exit: kill the no longer used thread_group_exited()
pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN))
pid: kill the obsolete PIDTYPE_PID code in transfer_pid()
pidfd: kill the no longer needed do_notify_pidfd() in de_thread()
pidfd_poll: report POLLHUP when pid_task() == NULL
pidfd: implement PIDFD_THREAD flag for pidfd_open()
...
|
||
|
|
7ea65c89d8 |
Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Misc features, cleanups, and fixes for vfs and individual filesystems.
Features:
- Support idmapped mounts for hugetlbfs.
- Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
where the passed offset is ignored if the file is O_APPEND. The new
flag allows a caller to enforce that the offset is honored to
conform to posix even if the file was opened in append mode.
- Move i_mmap_rwsem in struct address_space to avoid false sharing
between i_mmap and i_mmap_rwsem.
- Convert efs, qnx4, and coda to use the new mount api.
- Add a generic is_dot_dotdot() helper that's used by various
filesystems and the VFS code instead of open-coding it multiple
times.
- Recently we've added stable offsets which allows stable ordering
when iterating directories exported through NFS on e.g., tmpfs
filesystems. Originally an xarray was used for the offset map but
that caused slab fragmentation issues over time. This switches the
offset map to the maple tree which has a dense mode that handles
this scenario a lot better. Includes tests.
- Finally merge the case-insensitive improvement series Gabriel has
been working on for a long time. This cleanly propagates case
insensitive operations through ->s_d_op which in turn allows us to
remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
It also improves performance by trying a case-sensitive comparison
first and then fallback to case-insensitive lookup if that fails.
This also fixes a bug where overlayfs would be able to be mounted
over a case insensitive directory which would lead to all sort of
odd behaviors.
Cleanups:
- Make file_dentry() a simple accessor now that ->d_real() is
simplified because of the backing file work we did the last two
cycles.
- Use the dedicated file_mnt_idmap helper in ntfs3.
- Use smp_load_acquire/store_release() in the i_size_read/write
helpers and thus remove the hack to handle i_size reads in the
filemap code.
- The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
fs/
- It's no longer necessary to perform a second built-in initramfs
unpack call because we retain the contents of the previous
extraction. Remove it.
- Now that we have removed various allocators kfree_rcu() always
works with kmem caches and kmalloc(). So simplify various places
that only use an rcu callback in order to handle the kmem cache
case.
- Convert the pipe code to use a lockdep comparison function instead
of open-coding the nesting making lockdep validation easier.
- Move code into fs-writeback.c that was located in a header but can
be made static as it's only used in that one file.
- Rewrite the alignment checking iterators for iovec and bvec to be
easier to read, and also significantly more compact in terms of
generated code. This saves 270 bytes of text on x86-64 (with
clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
saves a bit of time for the same workload.
- Switch various places to use KMEM_CACHE instead of
kmem_cache_create().
- Use inode_set_ctime_to_ts() in inode_set_ctime_current()
- Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.
- Various smaller cleanups for eventfds.
Fixes:
- Fix various comments and typos, and unneeded initializations.
- Fix stack allocation hack for clang in the select code.
- Improve dump_mapping() debug code on a best-effort basis.
- Fix build errors in various selftests.
- Avoid wrap-around instrumentation in various places.
- Don't allow user namespaces without an idmapping to be used for
idmapped mounts.
- Fix sysv sb_read() call.
- Fix fallback implementation of the get_name() export operation"
* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
hugetlbfs: support idmapped mounts
qnx4: convert qnx4 to use the new mount api
fs: use inode_set_ctime_to_ts to set inode ctime to current time
libfs: Drop generic_set_encrypted_ci_d_ops
ubifs: Configure dentry operations at dentry-creation time
f2fs: Configure dentry operations at dentry-creation time
ext4: Configure dentry operations at dentry-creation time
libfs: Add helper to choose dentry operations at mount-time
libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
fscrypt: Drop d_revalidate once the key is added
fscrypt: Drop d_revalidate for valid dentries during lookup
fscrypt: Factor out a helper to configure the lookup dentry
ovl: Always reject mounting over case-insensitive directories
libfs: Attempt exact-match comparison first during casefolded lookup
efs: remove SLAB_MEM_SPREAD flag usage
jfs: remove SLAB_MEM_SPREAD flag usage
minix: remove SLAB_MEM_SPREAD flag usage
openpromfs: remove SLAB_MEM_SPREAD flag usage
proc: remove SLAB_MEM_SPREAD flag usage
qnx6: remove SLAB_MEM_SPREAD flag usage
...
|
||
|
|
f541fd7adf |
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says: ==================== ethtool: ice: Support for RSS settings to GTP Takeru Hayasaka enables RSS functionality for GTP packets on ice driver with ethtool. A user can include TEID and make RSS work for GTP-U over IPv4 by doing the following:`ethtool -N ens3 rx-flow-hash gtpu4 sde` In addition to gtpu(4|6), we now support gtpc(4|6),gtpc(4|6)t,gtpu(4|6)e, gtpu(4|6)u, and gtpu(4|6)d. gtpc(4|6): Used for GTP-C in IPv4 and IPv6, where the GTP header format does not include a TEID. gtpc(4|6)t: Used for GTP-C in IPv4 and IPv6, with a GTP header format that includes a TEID. gtpu(4|6): Used for GTP-U in both IPv4 and IPv6 scenarios. gtpu(4|6)e: Used for GTP-U with extended headers in both IPv4 and IPv6. gtpu(4|6)u: Used when the PSC (PDU session container) in the GTP-U extended header includes Uplink, applicable to both IPv4 and IPv6. gtpu(4|6)d: Used when the PSC in the GTP-U extended header includes Downlink, for both IPv4 and IPv6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
5072ae00ae |
net: nexthop: Expose nexthop group HW stats to user space
Add netlink support for reading NH group hardware stats. Stats collection is done through a new notifier, NEXTHOP_EVENT_HW_STATS_REPORT_DELTA. Drivers that implement HW counters for a given NH group are thereby asked to collect the stats and report back to core by calling nh_grp_hw_stats_report_delta(). This is similar to what netdevice L3 stats do. Besides exposing number of packets that passed in the HW datapath, also include information on whether any driver actually realizes the counters. The core can tell based on whether it got any _report_delta() reports from the drivers. This allows enabling the statistics at the group at any time, with drivers opting into supporting them. This is also in line with what netdevice L3 stats are doing. So as not to waste time and space, tie the collection and reporting of HW stats with a new op flag, NHA_OP_FLAG_DUMP_HW_STATS. Co-developed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Kees Cook <keescook@chromium.org> # For the __counted_by bits Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
746c19a52e |
net: nexthop: Add ability to enable / disable hardware statistics
Add netlink support for enabling collection of HW statistics on nexthop groups. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
95fedd7685 |
net: nexthop: Expose nexthop group stats to user space
Add netlink support for reading NH group stats. This data is only for statistics of the traffic in the SW datapath. HW nexthop group statistics will be added in the following patches. Emission of the stats is keyed to a new op_stats flag to avoid cluttering the netlink message with stats if the user doesn't need them: NHA_OP_FLAG_DUMP_STATS. Co-developed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
a207eab103 |
net: nexthop: Add NHA_OP_FLAGS
In order to add per-nexthop statistics, but still not increase netlink message size for consumers that do not care about them, there needs to be a toggle through which the user indicates their desire to get the statistics. To that end, add a new attribute, NHA_OP_FLAGS. The idea is to be able to use the attribute for carrying of arbitrary operation-specific flags, i.e. not make it specific for get / dump. Add the new attribute to get and dump policies, but do not actually allow any flags yet -- those will come later as the flags themselves are defined. Add the necessary parsing code. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
f379a7e9c3 | Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next | ||
|
|
92f8b1f5ca |
netdev: add queue stat for alloc failures
Rx alloc failures are commonly counted by drivers. Support reporting those via netdev-genl queue stats. Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
ab63a2387c |
netdev: add per-queue statistics
The ethtool-nl family does a good job exposing various protocol related and IEEE/IETF statistics which used to get dumped under ethtool -S, with creative names. Queue stats don't have a netlink API, yet, and remain a lion's share of ethtool -S output for new drivers. Not only is that bad because the names differ driver to driver but it's also bug-prone. Intuitively drivers try to report only the stats for active queues, but querying ethtool stats involves multiple system calls, and the number of stats is read separately from the stats themselves. Worse still when user space asks for values of the stats, it doesn't inform the kernel how big the buffer is. If number of stats increases in the meantime kernel will overflow user buffer. Add a netlink API for dumping queue stats. Queue information is exposed via the netdev-genl family, so add the stats there. Support per-queue and sum-for-device dumps. Latter will be useful when subsequent patches add more interesting common stats than just bytes and packets. The API does not currently distinguish between HW and SW stats. The expectation is that the source of the stats will either not matter much (good packets) or be obvious (skb alloc errors). Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
b0b6739cb9 |
Merge tag 'drm-etnaviv-next-2024-03-07' of https://git.pengutronix.de/git/lst/linux into drm-next
- various code cleanups - enhancements for NPU and MRT support Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/72a783cd98d60f6ebb43b90a6b453eea87224409.camel@pengutronix.de |
||
|
|
e3afe5dd3a |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/page_pool_user.c |
||
|
|
011832b97b |
bpf: Introduce may_goto instruction
Introduce may_goto instruction that from the verifier pov is similar to
open coded iterators bpf_for()/bpf_repeat() and bpf_loop() helper, but it
doesn't iterate any objects.
In assembly 'may_goto' is a nop most of the time until bpf runtime has to
terminate the program for whatever reason. In the current implementation
may_goto has a hidden counter, but other mechanisms can be used.
For programs written in C the later patch introduces 'cond_break' macro
that combines 'may_goto' with 'break' statement and has similar semantics:
cond_break is a nop until bpf runtime has to break out of this loop.
It can be used in any normal "for" or "while" loop, like
for (i = zero; i < cnt; cond_break, i++) {
The verifier recognizes that may_goto is used in the program, reserves
additional 8 bytes of stack, initializes them in subprog prologue, and
replaces may_goto instruction with:
aux_reg = *(u64 *)(fp - 40)
if aux_reg == 0 goto pc+off
aux_reg -= 1
*(u64 *)(fp - 40) = aux_reg
may_goto instruction can be used by LLVM to implement __builtin_memcpy,
__builtin_strcmp.
may_goto is not a full substitute for bpf_for() macro.
bpf_for() doesn't have induction variable that verifiers sees,
so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded.
But when the code is written as:
for (i = 0; i < 100; cond_break, i++)
the verifier see 'i' as precise constant zero,
hence cond_break (aka may_goto) doesn't help to converge the loop.
A static or global variable can be used as a workaround:
static int zero = 0;
for (i = zero; i < 100; cond_break, i++) // works!
may_goto works well with arena pointers that don't need to be bounds
checked on access. Load/store from arena returns imprecise unbounded
scalar and loops with may_goto pass the verifier.
Reserve new opcode BPF_JMP | BPF_JCOND for may_goto insn.
JCOND stands for conditional pseudo jump.
Since goto_or_nop insn was proposed, it may use the same opcode.
may_goto vs goto_or_nop can be distinguished by src_reg:
code = BPF_JMP | BPF_JCOND
src_reg = 0 - may_goto
src_reg = 1 - goto_or_nop
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Tested-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com
|
||
|
|
0ed3bba16d |
ethtool: Add GTP RSS hash options to ethtool.h
This is a patch that enables RSS functionality for GTP packets using ethtool. A user can include TEID and make RSS work for GTP-U over IPv4 by doing the following:`ethtool -N ens3 rx-flow-hash gtpu4 sde` In addition to gtpu(4|6), we now support gtpc(4|6),gtpc(4|6)t,gtpu(4|6)e, gtpu(4|6)u, and gtpu(4|6)d. gtpc(4|6): Used for GTP-C in IPv4 and IPv6, where the GTP header format does not include a TEID. gtpc(4|6)t: Used for GTP-C in IPv4 and IPv6, with a GTP header format that includes a TEID. gtpu(4|6): Used for GTP-U in both IPv4 and IPv6 scenarios. gtpu(4|6)e: Used for GTP-U with extended headers in both IPv4 and IPv6. gtpu(4|6)u: Used when the PSC (PDU session container) in the GTP-U extended header includes Uplink, applicable to both IPv4 and IPv6. gtpu(4|6)d: Used when the PSC in the GTP-U extended header includes Downlink, for both IPv4 and IPv6. GTP generates a flow that includes an ID called TEID to identify the tunnel. This tunnel is created for each UE (User Equipment).By performing RSS based on this flow, it is possible to apply RSS for each communication unit from the UE. Without this, RSS would only be effective within the range of IP addresses. For instance, the PGW can only perform RSS within the IP range of the SGW. Problematic from a load distribution perspective, especially if there's a bias in the terminals connected to a particular base station.This case can be solved by using this patch. Signed-off-by: Takeru Hayasaka <hayatake396@gmail.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> |
||
|
|
8c2c2549fb |
fanotify: Fix misspelling of "writable"
Several file system notification system headers have "writable" misspelled as "writtable" in the comments. This patch fixes it in the fanotify header. Signed-off-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240306020831.1404033-3-vi@endrift.com> |
||
|
|
e225555028 |
inotify: Fix misspelling of "writable"
Several file system notification system headers have "writable" misspelled as "writtable" in the comments. This patch fixes it in the inotify header. Signed-off-by: Vicki Pfau <vi@endrift.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240306020831.1404033-1-vi@endrift.com> |
||
|
|
e9c717bee8 |
Merge tag 'v6.8-rc7' into gpio/for-next
Linux 6.8-rc7 |
||
|
|
86211eea8a |
btrfs: qgroup: validate btrfs_qgroup_inherit parameter
[BUG] Currently btrfs can create subvolume with an invalid qgroup inherit without triggering any error: # mkfs.btrfs -O quota -f $dev # mount $dev $mnt # btrfs subvolume create -i 2/0 $mnt/subv1 # btrfs qgroup show -prce --sync $mnt Qgroupid Referenced Exclusive Path -------- ---------- --------- ---- 0/5 16.00KiB 16.00KiB <toplevel> 0/256 16.00KiB 16.00KiB subv1 [CAUSE] We only do a very basic size check for btrfs_qgroup_inherit structure, but never really verify if the values are correct. Thus in btrfs_qgroup_inherit() function, we have to skip non-existing qgroups, and never return any error. [FIX] Fix the behavior and introduce extra checks: - Introduce early check for btrfs_qgroup_inherit structure Not only the size, but also all the qgroup ids would be verified. And the timing is very early, so we can return error early. This early check is very important for snapshot creation, as snapshot is delayed to transaction commit. - Drop support for btrfs_qgroup_inherit::num_ref_copies and num_excl_copies Those two members are used to specify to copy refr/excl numbers from other qgroups. This would definitely mark qgroup inconsistent, and btrfs-progs has dropped the support for them for a long time. It's time to drop the support for kernel. - Verify the supported btrfs_qgroup_inherit::flags Just in case we want to add extra flags for btrfs_qgroup_inherit. Now above subvolume creation would fail with -ENOENT other than silently ignore the non-existing qgroup. CC: stable@vger.kernel.org # 6.7+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> |
||
|
|
4b2765ae41 |
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-02-29
We've added 119 non-merge commits during the last 32 day(s) which contain
a total of 150 files changed, 3589 insertions(+), 995 deletions(-).
The main changes are:
1) Extend the BPF verifier to enable static subprog calls in spin lock
critical sections, from Kumar Kartikeya Dwivedi.
2) Fix confusing and incorrect inference of PTR_TO_CTX argument type
in BPF global subprogs, from Andrii Nakryiko.
3) Larger batch of riscv BPF JIT improvements and enabling inlining
of the bpf_kptr_xchg() for RV64, from Pu Lehui.
4) Allow skeleton users to change the values of the fields in struct_ops
maps at runtime, from Kui-Feng Lee.
5) Extend the verifier's capabilities of tracking scalars when they
are spilled to stack, especially when the spill or fill is narrowing,
from Maxim Mikityanskiy & Eduard Zingerman.
6) Various BPF selftest improvements to fix errors under gcc BPF backend,
from Jose E. Marchesi.
7) Avoid module loading failure when the module trying to register
a struct_ops has its BTF section stripped, from Geliang Tang.
8) Annotate all kfuncs in .BTF_ids section which eventually allows
for automatic kfunc prototype generation from bpftool, from Daniel Xu.
9) Several updates to the instruction-set.rst IETF standardization
document, from Dave Thaler.
10) Shrink the size of struct bpf_map resp. bpf_array,
from Alexei Starovoitov.
11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer,
from Benjamin Tissoires.
12) Fix bpftool to be more portable to musl libc by using POSIX's
basename(), from Arnaldo Carvalho de Melo.
13) Add libbpf support to gcc in CORE macro definitions,
from Cupertino Miranda.
14) Remove a duplicate type check in perf_event_bpf_event,
from Florian Lehner.
15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them
with notrace correctly, from Yonghong Song.
16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible
array to fix build warnings, from Kees Cook.
17) Fix resolve_btfids cross-compilation to non host-native endianness,
from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
selftests/bpf: Test if shadow types work correctly.
bpftool: Add an example for struct_ops map and shadow type.
bpftool: Generated shadow variables for struct_ops maps.
libbpf: Convert st_ops->data to shadow type.
libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
bpf, arm64: use bpf_prog_pack for memory management
arm64: patching: implement text_poke API
bpf, arm64: support exceptions
arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT
bpf: add is_async_callback_calling_insn() helper
bpf: introduce in_sleepable() helper
bpf: allow more maps in sleepable bpf programs
selftests/bpf: Test case for lacking CFI stub functions.
bpf: Check cfi_stubs before registering a struct_ops type.
bpf: Clarify batch lookup/lookup_and_delete semantics
bpf, docs: specify which BPF_ABS and BPF_IND fields were zero
bpf, docs: Fix typos in instruction-set.rst
selftests/bpf: update tcp_custom_syncookie to use scalar packet offset
bpf: Shrink size of struct bpf_map/bpf_array.
...
====================
Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
||
|
|
fbf9e3b697 |
Merge tag 'sound-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai: "The amount of changes wasn't as small as wished, but all reasonably small fixes. There is a PCM core API change, which is for correcting the behavior change we took in 6.8. The rest are device-specific fixes for ASoC AMD, Qualcomm, Cirrus codecs, HD-audio quirks & co" * tag 'sound-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: amd: yc: Fix non-functional mic on Lenovo 21J2 ASoC: amd: yc: add new YC platform variant (0x63) support ALSA: hda/realtek - ALC285 reduce pop noise from Headphone port ASoC: amd: yc: Add Lenovo ThinkBook 21J0 into DMI quirk table ALSA: hda/realtek: Add special fixup for Lenovo 14IRP8 ASoC: soc-card: Fix missing locking in snd_soc_card_get_kcontrol() ALSA: hda/realtek: tas2781: enable subwoofer volume control ALSA: pcm: clarify and fix default msbits value for all formats ASoC: qcom: Fix uninitialized pointer dmactl ALSA: hda/realtek: fix mute/micmute LED For HP mt440 ALSA: Drop leftover snd-rtctimer stuff from Makefile ALSA: ump: Fix the discard error code from snd_ump_legacy_open() ALSA: hda/realtek: Enable Mute LED on HP 840 G8 (MB 8AB8) ASoC: cs35l56: Must clear HALO_STATE before issuing SYSTEM_RESET ALSA: hda/realtek: Fix top speaker connection on Dell Inspiron 16 Plus 7630 ALSA: firewire-lib: fix to check cycle continuity |
||
|
|
7187ea0978 |
Merge tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie: "Bunch of fixes, xe, amdgpu, nouveau and tegra all have a few. Then drm/bridge including some drivers/soc fallout fixes. The biggest thing in here is a new unit test for some buddy allocator fixes, otherwise a misc fbcon, ttm unit test and one msm revert. Seems pretty normal for this stage. buddy: - two allocation fixes + unit test fbcon: - font restore syzkaller fix ttm: - kunit test fix bridge: - fix aux-hpd leaks - fix aux-hpd registration - fix use after free in soc/qcom - fix boot on soc/qcom xe: - A couple of tracepoint updates from Priyanka and Lucas - Make sure BINDs are completed before accepting UNBINDs on LR vms - Don't arbitrarily restrict max number of batched binds - Add uapi for dumpable bos (agreed on IRC) - Remove unused uapi flags and a leftover comment - A couple of fixes related to the execlist backend msm: - DP: Revert a change which was causing a HDP regression amdgpu: - Fix potential buffer overflow - Fix power min cap - Suspend/resume fix - SI PM fix - eDP fix nouveau: - fix a misreported VRAM sizing - fix a regression in suspend/resume due to freeing tegra: - host1x reset fix - only remove existing driver if display is possible" * tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernel: (32 commits) drm/nouveau: keep DMA buffers required for suspend/resume nouveau: report byte usage in VRAM usage drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace drm/xe: Deny unbinds if uapi ufence pending drm/xe: Expose user fence from xe_sync_entry drm/xe: Use pointers in trace events drm/xe/xe_bo_move: Enhance xe_bo_move trace drm/xe/mmio: fix build warning for BAR resize on 32-bit drm/xe: get rid of MAX_BINDS drm/xe: Use vmalloc for array of bind allocation in bind IOCTL drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer drm/xe: Fix execlist splat drm/xe/uapi: Remove unused flags drm/xe/uapi: Remove DRM_XE_VM_BIND_FLAG_ASYNC comment left over drm/xe: Add uapi for dumpable bos drm/amd/display: Add monitor patch for specific eDP Revert "drm/msm/dp: use drm_bridge_hpd_notify() to report HPD status changes" drm/tests/drm_buddy: add alloc_range_bias test drm/buddy: check range allocation matches alignment drm/buddy: fix range bias ... |
||
|
|
cb12fd8e0d |
pidfd: add pidfs
This moves pidfds from the anonymous inode infrastructure to a tiny pseudo filesystem. This has been on my todo for quite a while as it will unblock further work that we weren't able to do simply because of the very justified limitations of anonymous inodes. Moving pidfds to a tiny pseudo filesystem allows: * statx() on pidfds becomes useful for the first time. * pidfds can be compared simply via statx() and then comparing inode numbers. * pidfds have unique inode numbers for the system lifetime. * struct pid is now stashed in inode->i_private instead of file->private_data. This means it is now possible to introduce concepts that operate on a process once all file descriptors have been closed. A concrete example is kill-on-last-close. * file->private_data is freed up for per-file options for pidfds. * Each struct pid will refer to a different inode but the same struct pid will refer to the same inode if it's opened multiple times. In contrast to now where each struct pid refers to the same inode. Even if we were to move to anon_inode_create_getfile() which creates new inodes we'd still be associating the same struct pid with multiple different inodes. The tiny pseudo filesystem is not visible anywhere in userspace exactly like e.g., pipefs and sockfs. There's no lookup, there's no complex inode operations, nothing. Dentries and inodes are always deleted when the last pidfd is closed. We allocate a new inode for each struct pid and we reuse that inode for all pidfds. We use iget_locked() to find that inode again based on the inode number which isn't recycled. We allocate a new dentry for each pidfd that uses the same inode. That is similar to anonymous inodes which reuse the same inode for thousands of dentries. For pidfds we're talking way less than that. There usually won't be a lot of concurrent openers of the same struct pid. They can probably often be counted on two hands. I know that systemd does use separate pidfd for the same struct pid for various complex process tracking issues. So I think with that things actually become way simpler. Especially because we don't have to care about lookup. Dentries and inodes continue to be always deleted. The code is entirely optional and fairly small. If it's not selected we fallback to anonymous inodes. Heavily inspired by nsfs which uses a similar stashing mechanism just for namespaces. Link: https://lore.kernel.org/r/20240213-vfs-pidfd_fs-v1-2-f863f58cfce1@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> |
||
|
|
aa5fe428d5 |
Merge tag 'drm-xe-fixes-2024-02-29' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes: - A couple of tracepoint updates from Priyanka and Lucas. - Make sure BINDs are completed before accepting UNBINDs on LR vms. - Don't arbitrarily restrict max number of batched binds. - Add uapi for dumpable bos (agreed on IRC). - Remove unused uapi flags and a leftover comment. Driver Changes: - A couple of fixes related to the execlist backend. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZeCBg4MA2hd1oggN@fedora |
||
|
|
992909549a |
Merge tag 'drm-intel-gt-next-2024-02-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes: Fixes: - Add some boring kerneldoc (Tvrtko Ursulin) - Check before removing mm notifier (Nirmoy Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zd889Wvu/ZKZSK4/@tursulin-desk |
||
|
|
65f5dd4f02 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR. Conflicts: net/mptcp/protocol.c |
||
|
|
896880ff30 |
bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
Replace deprecated 0-length array in struct bpf_lpm_trie_key with
flexible array. Found with GCC 13:
../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=]
207 | *(__be16 *)&key->data[i]);
| ^~~~~~~~~~~~~
../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16'
102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
| ^
../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu'
97 | #define be16_to_cpu __be16_to_cpu
| ^~~~~~~~~~~~~
../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu'
206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i]
^
| ^~~~~~~~~~~
In file included from ../include/linux/bpf.h:7:
../include/uapi/linux/bpf.h:82:17: note: while referencing 'data'
82 | __u8 data[0]; /* Arbitrary size */
| ^~~~
And found at run-time under CONFIG_FORTIFY_SOURCE:
UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49
index 0 is out of range for type '__u8 [*]'
Changing struct bpf_lpm_trie_key is difficult since has been used by
userspace. For example, in Cilium:
struct egress_gw_policy_key {
struct bpf_lpm_trie_key lpm_key;
__u32 saddr;
__u32 daddr;
};
While direct references to the "data" member haven't been found, there
are static initializers what include the final member. For example,
the "{}" here:
struct egress_gw_policy_key in_key = {
.lpm_key = { 32 + 24, {} },
.saddr = CLIENT_IP,
.daddr = EXTERNAL_SVC_IP & 0Xffffff,
};
To avoid the build time and run time warnings seen with a 0-sized
trailing array for struct bpf_lpm_trie_key, introduce a new struct
that correctly uses a flexible array for the trailing bytes,
struct bpf_lpm_trie_key_u8. As part of this, include the "header"
portion (which is just the "prefixlen" member), so it can be used
by anything building a bpf_lpr_trie_key that has trailing members that
aren't a u8 flexible array (like the self-test[1]), which is named
struct bpf_lpm_trie_key_hdr.
Unfortunately, C++ refuses to parse the __struct_group() helper, so
it is not possible to define struct bpf_lpm_trie_key_hdr directly in
struct bpf_lpm_trie_key_u8, so we must open-code the union directly.
Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out,
and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment
to the UAPI header directing folks to the two new options.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Closes: https://paste.debian.net/hidden/ca500597/
Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1]
Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org
|
||
|
|
87adedeba5 |
Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, WiFi and netfilter.
We have one outstanding issue with the stmmac driver, which may be a
LOCKDEP false positive, not a blocker.
Current release - regressions:
- netfilter: nf_tables: re-allow NFPROTO_INET in
nft_(match/target)_validate()
- eth: ionic: fix error handling in PCI reset code
Current release - new code bugs:
- eth: stmmac: complete meta data only when enabled, fix null-deref
- kunit: fix again checksum tests on big endian CPUs
Previous releases - regressions:
- veth: try harder when allocating queue memory
- Bluetooth:
- hci_bcm4377: do not mark valid bd_addr as invalid
- hci_event: fix handling of HCI_EV_IO_CAPA_REQUEST
Previous releases - always broken:
- info leak in __skb_datagram_iter() on netlink socket
- mptcp:
- map v4 address to v6 when destroying subflow
- fix potential wake-up event loss due to sndbuf auto-tuning
- fix double-free on socket dismantle
- wifi: nl80211: reject iftype change with mesh ID change
- fix small out-of-bound read when validating netlink be16/32 types
- rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
- ipv6: fix potential "struct net" ref-leak in inet6_rtm_getaddr()
- ip_tunnel: prevent perpetual headroom growth with huge number of
tunnels on top of each other
- mctp: fix skb leaks on error paths of mctp_local_output()
- eth: ice: fixes for DPLL state reporting
- dpll: rely on rcu for netdev_dpll_pin() to prevent UaF
- eth: dpaa: accept phy-interface-type = '10gbase-r' in the device
tree"
* tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
dpll: fix build failure due to rcu_dereference_check() on unknown type
kunit: Fix again checksum tests on big endian CPUs
tls: fix use-after-free on failed backlog decryption
tls: separate no-async decryption request handling from async
tls: fix peeking with sync+async decryption
tls: decrement decrypt_pending if no async completion will be called
gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames
igb: extend PTP timestamp adjustments to i211
rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
tools: ynl: fix handling of multiple mcast groups
selftests: netfilter: add bridge conntrack + multicast test case
netfilter: bridge: confirm multicast packets before passing them up the stack
netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
Bluetooth: qca: Fix triggering coredump implementation
Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT
Bluetooth: qca: Fix wrong event type for patch config command
Bluetooth: Enforce validation on max value of connection interval
Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
Bluetooth: mgmt: Fix limited discoverable off timeout
...
|
||
|
|
eaa367a031 |
drm/xe/uapi: Remove unused flags
Those cases missed in previous uAPI cleanups were mostly accidentally brought in from i915 or created to exercise the possibilities of gpuvm but they are not used by userspace yet, so let's remove them. They can still be brought back later if needed. v2: - Fix XE_VM_FLAG_FAULT_MODE support in xe_lrc.c (Brian Welty) - Leave DRM_XE_VM_BIND_OP_UNMAP_ALL (José Roberto de Souza) - Ensure invalid flag values are rejected (Rodrigo Vivi) v3: Rebase after removal of persistent exec_queues (Francois Dugast) v4: Rodrigo: Rebase after the new dumpable flag. Fixes: |
||
|
|
b6f4fb397d |
drm/xe/uapi: Remove DRM_XE_VM_BIND_FLAG_ASYNC comment left over
This is a comment left over of commit |
||
|
|
7e10d87e63 |
drm/xe: Add uapi for dumpable bos
Add the flag XE_VM_BIND_FLAG_DUMPABLE to notify devcoredump that this
mapping should be dumped.
This is not hooked up, but the uapi should be ready before merging.
It's likely easier to dump the contents of the bo's at devcoredump
readout time, so it's better if the bos will stay unmodified after
a hang. The NEEDS_CPU_MAPPING flag is removed as requirement.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221133024.898315-3-maarten.lankhorst@linux.intel.com
(cherry picked from commit
|
||
|
|
13fe8e6825 |
ublk: add UBLK_CMD_DEL_DEV_ASYNC
The current command UBLK_CMD_DEL_DEV won't return until the device is released, this way looks more reliable, but makes userspace more difficult to implement, especially about orders: unmap command buffer(which holds one ublkc reference), ublkc close, io_uring_file_unregister, ublkb close. Add UBLK_CMD_DEL_DEV_ASYNC so that device deletion won't wait release, then userspace needn't worry about the above order. Actually both loop and nbd is deleted in this async way. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240223075539.89945-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
5fa918a335 |
uapi: ioam6: API for netlink multicast events
Add new api to support ioam6 events for generic netlink multicast. A first "trace" event is added to the list of ioam6 events, which will represent an IOAM Pre-allocated Trace Option-Type. It provides another solution to share IOAM data with user space. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
|
|
6a20086419 |
uapi: in6: replace temporary label with rfc9486
Not really a fix per se, but IPV6_TLV_IOAM is still tagged as "TEMPORARY
IANA allocation for IOAM", while RFC 9486 is available for some time
now. Just update the reference.
Fixes:
|
||
|
|
f112b68f27 |
Merge v6.8-rc6 into drm-next
Thomas Zimmermann asked to backmerge -rc6 for drm-misc branches, there's a few same-area-changed conflicts (xe and amdgpu mostly) that are getting a bit too annoying. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> |
||
|
|
85df6b5a66 |
ALSA: pcm: clarify and fix default msbits value for all formats
Return used most significant bits from sample bit-width rather than the whole
physical sample word size. The starting bit offset is defined in the format
itself.
The behaviour is not changed for 32-bit formats like S32_LE. But with this
change - msbits value 24 instead 32 is returned for 24-bit formats like S24_LE
etc.
Also, commit
|