Commit Graph

3164 Commits

Author SHA1 Message Date
Linus Torvalds
c013d0af81 Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:

 - Improve the type checking of request flags (Bart)

 - Ensure queue mapping for a single queues always picks the right queue
   (Bart)

 - Sanitize the io priority handling (Jan)

 - rq-qos race fix (Jinke)

 - Reserved tags handling improvements (John)

 - Separate memory alignment from file/disk offset aligment for O_DIRECT
   (Keith)

 - Add new ublk driver, userspace block driver using io_uring for
   communication with the userspace backend (Ming)

 - Use try_cmpxchg() to cleanup the code in various spots (Uros)

 - Finally remove bdevname() (Christoph)

 - Clean up the zoned device handling (Christoph)

 - Clean up independent access range support (Christoph)

 - Clean up and improve block sysfs handling (Christoph)

 - Clean up and improve teardown of block devices.

   This turns the usual two step process into something that is simpler
   to implement and handle in block drivers (Christoph)

 - Clean up chunk size handling (Christoph)

 - Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
   Ming, Sebastian, Yang, Ying)

* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
  ublk_drv: fix double shift bug
  ublk_drv: make sure that correct flags(features) returned to userspace
  ublk_drv: fix error handling of ublk_add_dev
  ublk_drv: fix lockdep warning
  block: remove __blk_get_queue
  block: call blk_mq_exit_queue from disk_release for never added disks
  blk-mq: fix error handling in __blk_mq_alloc_disk
  ublk: defer disk allocation
  ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
  ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
  ublk: cleanup ublk_ctrl_uring_cmd
  ublk: simplify ublk_ch_open and ublk_ch_release
  ublk: remove the empty open and release block device operations
  ublk: remove UBLK_IO_F_PREFLUSH
  ublk: add a MAINTAINERS entry
  block: don't allow the same type rq_qos add more than once
  mmc: fix disk/queue leak in case of adding disk failure
  ublk_drv: fix an IS_ERR() vs NULL check
  ublk: remove UBLK_IO_F_INTEGRITY
  ublk_drv: remove unneeded semicolon
  ...
2022-08-02 13:46:35 -07:00
Linus Torvalds
98e2474640 Merge tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring buffered writes support from Jens Axboe:
 "This contains support for buffered writes, specifically for XFS. btrfs
  is in progress, will be coming in the next release.

  io_uring does support buffered writes on any file type, but since the
  buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt
  to do so if IOCB_NOWAIT is set, any buffered write will effectively be
  handled by io-wq offload. This isn't very efficient, and we even have
  specific code in io-wq to serialize buffered writes to the same inode
  to avoid further inefficiencies with thread offload.

  This is particularly sad since most buffered writes don't block, they
  simply copy data to a page and dirty it. With this pull request, we
  can handle buffered writes a lot more effiently.

  If balance_dirty_pages() needs to block, we back off on writes as
  indicated.

  This improves buffered write support by 2-3x.

  Jan Kara helped with the mm bits for this, and Stefan handled the
  fs/iomap/xfs/io_uring parts of it"

* tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block:
  mm: honor FGP_NOWAIT for page cache page allocation
  xfs: Add async buffered write support
  xfs: Specify lockmode when calling xfs_ilock_for_iomap()
  io_uring: Add tracepoint for short writes
  io_uring: fix issue with io_write() not always undoing sb_start_write()
  io_uring: Add support for async buffered writes
  fs: Add async write file modification handling.
  fs: Split off inode_needs_update_time and __file_update_time
  fs: add __remove_file_privs() with flags parameter
  fs: add a FMODE_BUF_WASYNC flags for f_mode
  iomap: Return -EAGAIN from iomap_write_iter()
  iomap: Add async buffered write support
  iomap: Add flags parameter to iomap_page_create()
  mm: Add balance_dirty_pages_ratelimited_flags() function
  mm: Move updates of dirty_exceeded into one place
  mm: Move starting of background writeback into the main balancing loop
2022-08-02 13:27:23 -07:00
Linus Torvalds
b349b1181d Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:

 - As per (valid) complaint in the last merge window, fs/io_uring.c has
   grown quite large these days. io_uring isn't really tied to fs
   either, as it supports a wide variety of functionality outside of
   that.

   Move the code to io_uring/ and split it into files that either
   implement a specific request type, and split some code into helpers
   as well. The code is organized a lot better like this, and io_uring.c
   is now < 4K LOC (me).

 - Deprecate the epoll_ctl opcode. It'll still work, just trigger a
   warning once if used. If we don't get any complaints on this, and I
   don't expect any, then we can fully remove it in a future release
   (me).

 - Improve the cancel hash locking (Hao)

 - kbuf cleanups (Hao)

 - Efficiency improvements to the task_work handling (Dylan, Pavel)

 - Provided buffer improvements (Dylan)

 - Add support for recv/recvmsg multishot support. This is similar to
   the accept (or poll) support for have for multishot, where a single
   SQE can trigger everytime data is received. For applications that
   expect to do more than a few receives on an instantiated socket, this
   greatly improves efficiency (Dylan).

 - Efficiency improvements for poll handling (Pavel)

 - Poll cancelation improvements (Pavel)

 - Allow specifiying a range for direct descriptor allocations (Pavel)

 - Cleanup the cqe32 handling (Pavel)

 - Move io_uring types to greatly cleanup the tracing (Pavel)

 - Tons of great code cleanups and improvements (Pavel)

 - Add a way to do sync cancelations rather than through the sqe -> cqe
   interface, as that's a lot easier to use for some use cases (me).

 - Add support to IORING_OP_MSG_RING for sending direct descriptors to a
   different ring. This avoids the usually problematic SCM case, as we
   disallow those. (me)

 - Make the per-command alloc cache we use for apoll generic, place
   limits on it, and use it for netmsg as well (me).

 - Various cleanups (me, Michal, Gustavo, Uros)

* tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits)
  io_uring: ensure REQ_F_ISREG is set async offload
  net: fix compat pointer in get_compat_msghdr()
  io_uring: Don't require reinitable percpu_ref
  io_uring: fix types in io_recvmsg_multishot_overflow
  io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
  io_uring: support multishot in recvmsg
  net: copy from user before calling __get_compat_msghdr
  net: copy from user before calling __copy_msghdr
  io_uring: support 0 length iov in buffer select in compat
  io_uring: fix multishot ending when not polled
  io_uring: add netmsg cache
  io_uring: impose max limit on apoll cache
  io_uring: add abstraction around apoll cache
  io_uring: move apoll cache to poll.c
  io_uring: consolidate hash_locked io-wq handling
  io_uring: clear REQ_F_HASH_LOCKED on hash removal
  io_uring: don't race double poll setting REQ_F_ASYNC_DATA
  io_uring: don't miss setting REQ_F_DOUBLE_POLL
  io_uring: disable multishot recvmsg
  io_uring: only trace one of complete or overflow
  ...
2022-08-02 13:20:44 -07:00
Linus Torvalds
c1dbe9a1c8 Merge tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
 "These start a rework of the handling of trip points in the thermal
  core, improve the cpufreq/devfreq cooling device handling, update some
  thermal control drivers and the tmon utility and clean up code.

  Specifics:

   - Consolidate the thermal core code by beginning to move the thermal
     trip structure from the thermal OF code as a generic structure to
     be used by the different sensors when registering a thermal zone
     (Daniel Lezcano).

   - Make per cpufreq / devfreq cooling device ops instead of using a
     global variable, fix comments and rework the trace information
     (Lukasz Luba).

   - Add the include/dt-bindings/thermal.h under the area covered by the
     thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).

   - Improve the error output by giving the sensor identification when a
     thermal zone failed to initialize, the DT bindings by changing the
     positive logic and adding the r8a779f0 support on the rcar3
     (Wolfram Sang).

   - Convert the QCom tsens DT binding to the dtsformat format
     (Krzysztof Kozlowski).

   - Remove the pointless get_trend() function in the QCom, Ux500 and
     tegra thermal drivers, along with the unused DROP_FULL and
     RAISE_FULL trends definitions. Simplify the code by using clamp()
     macros (Daniel Lezcano).

   - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
     (Bryan Brattlof).

   - Fix array underflow in prep_lookup_table (Dan Carpenter).

   - Add static annotation to the k3_j72xx_bandgap_j7* data structure
     (Jin Xiaoyun).

   - Fix typos in comments detected on sun8i by Coccinelle (Julia
     Lawall).

   - Fix typos in comments on rzg2l (Biju Das).

   - Remove as unnecessary call to dev_err() as the error is already
     printed by the failing function on u8500 (Yang Li).

   - Register the thermal zones as hwmon sensors for the Qcom thermal
     sensors (Dmitry Baryshkov).

   - Fix 'tmon' tool compilation issue by adding phtread.h include
     (Markus Mayer).

   - Fix typo in the comments for the 'tmon' tool (Slark Xiao).

   - Make the thermal core use ida_alloc()/free() directly instead of
     ida_simple_get()/ida_simple_remove() that have been deprecated
     (keliu).

   - Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal
     control driver (Rafael Wysocki)"

* tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
  thermal/of: Initialize trip points separately
  thermal/of: Use thermal trips stored in the thermal zone
  thermal/core: Add thermal_trip in thermal_zone
  thermal/core: Rename 'trips' to 'num_trips'
  thermal/core: Move thermal_set_delay_jiffies to static
  thermal/core: Remove unneeded EXPORT_SYMBOLS
  thermal/of: Move thermal_trip structure to thermal.h
  thermal/of: Remove the device node pointer for thermal_trip
  thermal/of: Replace device node match with device node search
  thermal/core: Remove duplicate information when an error occurs
  thermal/core: Avoid calling ->get_trip_temp() unnecessarily
  thermal/tools/tmon: Fix typo 'the the' in comment
  thermal/tools/tmon: Include pthread and time headers in tmon.h
  thermal/ti-soc-thermal: Fix comment typo
  thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
  thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
  thermal/drivers/u8500: Remove unnecessary print function dev_err()
  thermal/drivers/rzg2l: Fix comments
  thermal/drivers/sun8i: Fix typo in comment
  thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
  ...
2022-08-02 11:27:53 -07:00
Linus Torvalds
a771ea6413 Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "These are mostly minor improvements all over including new CPU IDs for
  the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
  power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
  updates, documentation cleanups and a new version of the pm-graph
  suite of utilities.

  Specifics:

   - Make cpufreq_show_cpus() more straightforward (Viresh Kumar).

   - Drop unnecessary CPU hotplug locking from store() used by cpufreq
     sysfs attributes (Viresh Kumar).

   - Make the ACPI cpufreq driver support the boost control interface on
     Zhaoxin/Centaur processors (Tony W Wang-oc).

   - Print a warning message on attempts to free an active cpufreq
     policy which should never happen (Viresh Kumar).

   - Fix grammar in the Kconfig help text for the loongson2 cpufreq
     driver (Randy Dunlap).

   - Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
     governor (Zhao Liu).

   - Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
     cpuidle driver (Eiichi Tsukata).

   - Modify intel_idle to treat C1 and C1E as independent idle states on
     Sapphire Rapids (Artem Bityutskiy).

   - Extend support for wakeirq to callback wrappers used during system
     suspend and resume (Ulf Hansson).

   - Defer waiting for device probe before loading a hibernation image
     till the first actual device access to avoid possible deadlocks
     reported by syzbot (Tetsuo Handa).

   - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
     Helgaas).

   - Add Raptor Lake-P to the list of processors supported by the Intel
     RAPL driver (George D Sworo).

   - Add Alder Lake-N and Raptor Lake-P to the list of processors for
     which Power Limit4 is supported in the Intel RAPL driver (Sumeet
     Pawnikar).

   - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
     attempting to remove it (Hsin-Yi Wang).

   - Change the Energy Model code to represent power in micro-Watts and
     adjust its users accordingly (Lukasz Luba).

   - Add new devfreq driver for Mediatek CCI (Cache Coherent
     Interconnect) (Johnson Wang).

   - Convert the Samsung Exynos SoC Bus bindings to DT schema of
     exynos-bus.c (Krzysztof Kozlowski).

   - Address kernel-doc warnings by adding the description for unused
     function parameters in devfreq core (Mauro Carvalho Chehab).

   - Use NULL to pass a null pointer rather than zero according to the
     function propotype in imx-bus.c (Colin Ian King).

   - Print error message instead of error interger value in
     tegra30-devfreq.c (Dmitry Osipenko).

   - Add checks to prevent setting negative frequency QoS limits for
     CPUs (Shivnandan Kumar).

   - Update the pm-graph suite of utilities to the latest revision 5.9
     including multiple improvements (Todd Brandt).

   - Drop pme_interrupt reference from the PCI power management
     documentation (Mario Limonciello)"

* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
  powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
  PM: QoS: Add check to make sure CPU freq is non-negative
  PM: hibernate: defer device probing when resuming from hibernation
  intel_idle: make SPR C1 and C1E be independent
  cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
  cpufreq: loongson2: fix Kconfig "its" grammar
  pm-graph v5.9
  cpufreq: Warn users while freeing active policy
  cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
  firmware: arm_scmi: Get detailed power scale from perf
  Documentation: EM: Switch to micro-Watts scale
  PM: EM: convert power field to micro-Watts precision and align drivers
  PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
  PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
  PM / devfreq: shut up kernel-doc warnings
  dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
  PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
  dt-bindings: interconnect: Add MediaTek CCI dt-bindings
  PM: domains: Ensure genpd_debugfs_dir exists before remove
  PM: runtime: Extend support for wakeirq for force_suspend|resume
  ...
2022-08-02 11:17:00 -07:00
Linus Torvalds
47b62edcd4 Merge tag 'arm-drivers-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC drivers from Arnd Bergmann:
 "The SoC driver updates contain changes to improve support for
  additional SoC variants, as well as cleanups an minor bugfixes
  in a number of existing drivers.

  Notable updates this time include:

   - Support for Qualcomm MSM8909 (Snapdragon 210) in various drivers

   - Updates for interconnect drivers on Qualcomm Snapdragon

   - A new driver support for NMI interrupts on Fujitsu A64fx

   - A rework of Broadcom BCMBCA Kconfig dependencies

   - Improved support for BCM2711 (Raspberry Pi 4) power management to
     allow the use of the V3D GPU

   - Cleanups to the NXP guts driver

   - Arm SCMI firmware driver updates to add tracing support, and use
     the firmware interfaces for system power control and for power
     capping"

* tag 'arm-drivers-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (125 commits)
  soc: a64fx-diag: disable modular build
  dt-bindings: soc: qcom: qcom,smd-rpm: add power-controller
  dt-bindings: soc: qcom: aoss: document qcom,sm8450-aoss-qmp
  dt-bindings: soc: qcom,rpmh-rsc: simplify qcom,tcs-config
  ARM: mach-qcom: Add support for MSM8909
  dt-bindings: arm: cpus: Document "qcom,msm8909-smp" enable-method
  soc: qcom: spm: Add CPU data for MSM8909
  dt-bindings: soc: qcom: spm: Add MSM8909 CPU compatible
  soc: qcom: rpmpd: Add compatible for MSM8909
  dt-bindings: power: qcom-rpmpd: Add MSM8909 power domains
  soc: qcom: smd-rpm: Add compatible for MSM8909
  dt-bindings: soc: qcom: smd-rpm: Add MSM8909
  soc: qcom: icc-bwmon: Remove unnecessary print function dev_err()
  soc: fujitsu: Add A64FX diagnostic interrupt driver
  soc: qcom: socinfo: Fix the id of SA8540P SoC
  soc: qcom: Make QCOM_RPMPD depend on PM
  tty: serial: bcm63xx: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  spi: bcm63xx-hsspi: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  clk: bcm: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  hwrng: bcm2835: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  ...
2022-08-02 08:10:10 -07:00
Linus Torvalds
0dd1cabe8a Merge tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:

 - An addition of 'accounted' flag to slab allocation tracepoints to
   indicate memcg_kmem accounting, by Vasily

 - An optimization of memcg handling in freeing paths, by Muchun

 - Various smaller fixes and cleanups

* tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab_common: move generic bulk alloc/free functions to SLOB
  mm/sl[au]b: use own bulk free function when bulk alloc failed
  mm: slab: optimize memcg_slab_free_hook()
  mm/tracing: add 'accounted' entry into output of allocation tracepoints
  tools/vm/slabinfo: Handle files in debugfs
  mm/slub: Simplify __kmem_cache_alias()
  mm, slab: fix bad alignments
2022-08-01 11:46:58 -07:00
Linus Torvalds
eb43bbac4c Merge tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:

 - Delay the cleanup of interrupted posix lock requests until the user
   space result arrives. Previously, the immediate cleanup would lead to
   extraneous warnings when the result arrived.

 - Tracepoint improvements, e.g. adding the lock resource name.

 - Delay the completion of lockspace creation until one full recovery
   cycle has completed. This allows more error cases to be returned to
   the caller.

 - Remove warnings from the locking layer about delayed network replies.
   The recently added midcomms warnings are much more useful.

 - Begin the process of deprecating two unused lock-timeout-related
   features. These features now require enabling via a Kconfig option,
   and enabling them triggers deprecation warnings. We expect to remove
   the code in v6.2.

* tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  fs: dlm: move kref_put assert for lkb structs
  fs: dlm: don't use deprecated timeout features by default
  fs: dlm: add deprecation Kconfig and warnings for timeouts
  fs: dlm: remove timeout from dlm_user_adopt_orphan
  fs: dlm: remove waiter warnings
  fs: dlm: fix grammar in lowcomms output
  fs: dlm: add comment about lkb IFL flags
  fs: dlm: handle recovery result outside of ls_recover
  fs: dlm: make new_lockspace() wait until recovery completes
  fs: dlm: call dlm_lsop_recover_prep once
  fs: dlm: update comments about recovery and membership handling
  fs: dlm: add resource name to tracepoints
  fs: dlm: remove additional dereference of lksb
  fs: dlm: change ast and bast trace order
  fs: dlm: change posix lock sigint handling
  fs: dlm: use dlm_plock_info for do_unlock_close
  fs: dlm: change plock interrupted message to debug again
  fs: dlm: add pid to debug log
  fs: dlm: plock use list_first_entry
2022-08-01 08:46:53 -07:00
Rafael J. Wysocki
da9d01794e Merge tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control changes for 5.20-rc1 from Daniel Lezcano:

"- Make per cpufreq / devfreq cooling device ops instead of using a
   global variable, fix comments and rework the trace information
   (Lukasz Luba)

 - Add the include/dt-bindings/thermal.h under the area covered by the
   thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)

 - Improve the error output by giving the sensor identification when a
   thermal zone failed to initialize, the DT bindings by changing the
   positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
   Sang)

 - Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
   Kozlowski)

 - Remove the pointless get_trend() function in the QCom, Ux500 and
   tegra thermal drivers, along with the unused DROP_FULL and
   RAISE_FULL trends definitions. Simplify the code by using clamp()
   macros (Daniel Lezcano)

 - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
   (Bryan Brattlof)

 - Fix array underflow in prep_lookup_table (Dan Carpenter)

 - Add static annotation to the k3_j72xx_bandgap_j7* data structure
   (Jin Xiaoyun)

 - Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)

 - Fix typos in comments on rzg2l (Biju Das)

 - Remove as unnecessary call to dev_err() as the error is already
   printed by the failing function on u8500 (Yang Li)

 - Register the thermal zones as hwmon sensors for the Qcom thermal
   sensors (Dmitry Baryshkov)

 - Fix 'tmon' tool compilation issue by adding phtread.h include
   (Markus Mayer)

 - Fix typo in the comments for the 'tmon' tool (Slark Xiao)

 - Consolidate the thermal core code by beginning to move the thermal
   trip structure from the thermal OF code as a generic structure to be
   used by the different sensors when registering a thermal zone
   (Daniel Lezcano)"

* tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (36 commits)
  thermal/of: Initialize trip points separately
  thermal/of: Use thermal trips stored in the thermal zone
  thermal/core: Add thermal_trip in thermal_zone
  thermal/core: Rename 'trips' to 'num_trips'
  thermal/core: Move thermal_set_delay_jiffies to static
  thermal/core: Remove unneeded EXPORT_SYMBOLS
  thermal/of: Move thermal_trip structure to thermal.h
  thermal/of: Remove the device node pointer for thermal_trip
  thermal/of: Replace device node match with device node search
  thermal/core: Remove duplicate information when an error occurs
  thermal/core: Avoid calling ->get_trip_temp() unnecessarily
  thermal/tools/tmon: Fix typo 'the the' in comment
  thermal/tools/tmon: Include pthread and time headers in tmon.h
  thermal/ti-soc-thermal: Fix comment typo
  thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
  thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
  thermal/drivers/u8500: Remove unnecessary print function dev_err()
  thermal/drivers/rzg2l: Fix comments
  thermal/drivers/sun8i: Fix typo in comment
  thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
  ...
2022-07-29 19:10:56 +02:00
Rafael J. Wysocki
f611b33af2 Merge back cpuidle material for 5.20. 2022-07-29 17:15:30 +02:00
Lukasz Luba
3f7ced7ac9 drivers/thermal/cpufreq_cooling : Refactor thermal_power_cpu_get_power tracing
Simplify the thermal_power_cpu_get_power trace event by removing
complicated cpumask and variable length array. Now the tools parsing trace
output don't have to hassle to get this power data. The simplified format
version uses 'policy->cpu'. Remove also the 'load' information completely
since there is very little value of it in this trace event. To get the
CPUs' load (or utilization) there are other dedicated trace hooks in the
kernel. This patch also simplifies and speeds-up the main cooling code
when that trace event is enabled.

Rename the trace event to avoid confusion of tools which parse the trace
file.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220613124327.30766-3-lukasz.luba@arm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28 17:29:42 +02:00
Stefan Roesch
1c849b481b io_uring: Add tracepoint for short writes
This adds the io_uring_short_write tracepoint to io_uring. A short write
is issued if not all pages that are required for a write are in the page
cache and the async buffered writes have to return EAGAIN.

Signed-off-by: Stefan Roesch <shr@fb.com>
Link: https://lore.kernel.org/r/20220616212221.2024518-13-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24 18:39:32 -06:00
Dylan Yudaken
9b26e811e9 io_uring: fix io_uring_cqe_overflow trace format
Make the trace format consistent with io_uring_complete for cflags

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220630091231.1456789-12-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24 18:39:17 -06:00
Dylan Yudaken
eccd880185 io_uring: add trace event for running task work
This is useful for investigating if task_work is batching

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220622134028.2013417-8-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24 18:39:15 -06:00
Pavel Begunkov
48863ffd3e io_uring: clean up tracing events
We have lots of trace events accepting an io_uring request and wanting
to print some of its fields like user_data, opcode, flags and so on.
However, as trace points were unaware of io_uring structures, we had to
pass all the fields as arguments. Teach trace/events/io_uring.h about
struct io_kiocb and stop the misery of passing a horde of arguments to
trace helpers.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/40ff72f92798114e56d400f2b003beb6cde6ef53.1655384063.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-24 18:39:14 -06:00
Linus Torvalds
9bd572ec7a Merge tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bpf and wireless.

  Still no major regressions, the release continues to be calm. An
  uptick of fixes this time around due to trivial data race fixes and
  patches flowing down from subtrees.

  There has been a few driver fixes (particularly a few fixes for false
  positives due to 66e4c8d950 which went into -next in May!) that make
  me worry the wide testing is not exactly fully through.

  So "calm" but not "let's just cut the final ASAP" vibes over here.

  Current release - regressions:

   - wifi: rtw88: fix write to const table of channel parameters

  Current release - new code bugs:

   - mac80211: add gfp_t arg to ieeee80211_obss_color_collision_notify

   - mlx5:
      - TC, allow offload from uplink to other PF's VF
      - Lag, decouple FDB selection and shared FDB
      - Lag, correct get the port select mode str

   - bnxt_en: fix and simplify XDP transmit path

   - r8152: fix accessing unset transport header

  Previous releases - regressions:

   - conntrack: fix crash due to confirmed bit load reordering (after
     atomic -> refcount conversion)

   - stmmac: dwc-qos: disable split header for Tegra194

  Previous releases - always broken:

   - mlx5e: ring the TX doorbell on DMA errors

   - bpf: make sure mac_header was set before using it

   - mac80211: do not wake queues on a vif that is being stopped

   - mac80211: fix queue selection for mesh/OCB interfaces

   - ip: fix dflt addr selection for connected nexthop

   - seg6: fix skb checksums for SRH encapsulation/insertion

   - xdp: fix spurious packet loss in generic XDP TX path

   - bunch of sysctl data race fixes

   - nf_log: incorrect offset to network header

  Misc:

   - bpf: add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs"

* tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  nfp: flower: configure tunnel neighbour on cmsg rx
  net/tls: Check for errors in tls_device_init
  MAINTAINERS: Add an additional maintainer to the AMD XGBE driver
  xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
  selftests/net: test nexthop without gw
  ip: fix dflt addr selection for connected nexthop
  net: atlantic: remove aq_nic_deinit() when resume
  net: atlantic: remove deep parameter on suspend/resume functions
  sfc: fix kernel panic when creating VF
  seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
  seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
  seg6: fix skb checksum evaluation in SRH encapsulation/insertion
  sfc: fix use after free when disabling sriov
  net: sunhme: output link status with a single print.
  r8152: fix accessing unset transport header
  net: stmmac: fix leaks in probe
  net: ftgmac100: Hold reference returned by of_get_child_by_name()
  nexthop: Fix data-races around nexthop_compat_mode.
  ipv4: Fix data-races around sysctl_ip_dynaddr.
  tcp: Fix a data-race around sysctl_tcp_ecn_fallback.
  ...
2022-07-14 12:48:07 -07:00
Bart Van Assche
ed4512590b fs/nilfs2: Use the enum req_op and blk_opf_t types
Improve static type checking by using the enum req_op type for variables
that represent a request operation and the new blk_opf_t type for
variables that represent request flags. Combine the 'mode' and
'mode_flags' arguments of nilfs_btnode_submit_block into a single
argument 'opf'.

Reviewed-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-59-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:33 -06:00
Bart Van Assche
6669797b0d fs/jbd2: Fix the documentation of the jbd2_write_superblock() callers
Commit 2a222ca992 ("fs: have submit_bh users pass in op and flags
separately") renamed the jbd2_write_superblock() 'write_op' argument into
'write_flags'. Propagate this change to the jbd2_write_superblock()
callers. Additionally, change the type of 'write_flags' into blk_opf_t.

Cc: Mike Christie <michael.christie@oracle.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-57-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:32 -06:00
Bart Van Assche
7649c873c1 fs/f2fs: Use the enum req_op and blk_opf_t types
Improve static type checking by using the enum req_op type for variables
that represent a request operation and the new blk_opf_t type for
variables that represent request flags.

Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-53-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:32 -06:00
Li kunyu
0bb7e14c8e blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon
Remove extra semicolon.

Link: https://lkml.kernel.org/r/20220629030013.10362-1-kunyu@nfschina.com

Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-12 16:36:37 -04:00
Steven Rostedt (Google)
820b8963ad net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
The trace event sock_exceed_buf_limit saves the prot->sysctl_mem pointer
and then dereferences it in the TP_printk() portion. This is unsafe as the
TP_printk() portion is executed at the time the buffer is read. That is,
it can be seconds, minutes, days, months, even years later. If the proto
is freed, then this dereference will can also lead to a kernel crash.

Instead, save the sysctl_mem array into the ring buffer and have the
TP_printk() reference that instead. This is the proper and safe way to
read pointers in trace events.

Link: https://lore.kernel.org/all/20220706052130.16368-12-kuniyu@amazon.com/

Cc: stable@vger.kernel.org
Fixes: 3847ce32ae ("core: add tracepoints for queueing skb to rcvbuf")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-08 12:06:17 +01:00
Arnd Bergmann
77abf47213 Merge tag 'scmi-updates-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/drivers
Arm SCMI updates for v5.20

The main additions this time around are:

1. The capability to trace full SCMI message headers and payloads.
   The recent unearthing of chain of old firmware issues motivated
   this effort so that it is easier to trace them and debug quicker
   than it took this time around in absence of such tracing.

2. SCMI System power control driver to handle platform's requests for a
   graceful shutdown. Though the system power control protocol has been
   around since the begining of SCMI, it lacked the timeout information
   that was added in SCMI v3.1 that enables kernel to take appropriate
   action within the timeout and doesn't have to rely on any other
   user inputs(which was blocking factor for addition of this driver
   earlier)

3. Support for SCMI Power Capping protocol that was introduced in SCMI v3.1
   This protocol is intended for controlling and monitoring the power
   consumption of power capping domains. The firmware also provides the
   hierarchy of powercap domains by providing parent domain information.

It also contains a bug fix in the old SCPI driver addressing possible
user-after-free issues.

* tag 'scmi-updates-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  firmware: arm_scmi: Use fast channel tracing
  include: trace: Add SCMI fast channel tracing
  firmware: arm_scmi: Add SCMI v3.1 powercap fast channels support
  firmware: arm_scmi: Generalize the fast channel support
  firmware: arm_scmi: Add SCMI v3.1 powercap protocol basic support
  dt-bindings: firmware: arm,scmi: Add support for powercap protocol
  firmware: arm_scmi: Add SCMI System Power Control driver
  firmware: arm_scmi: Add devm_protocol_acquire helper
  firmware: arm_scmi: Add SCMI v3.1 System Power extensions
  firmware: arm_scmi: Support only one single system power device
  firmware: arm_scmi: Use new SCMI full message tracing
  include: trace: Add SCMI full message tracing
  firmware: arm_scpi: Ensure scpi_info is not assigned if the probe fails
  firmware: arm_scmi: Remove usage of the deprecated ida_simple_xxx API
  firmware: arm_scmi: Fix response size warning for OPTEE transport
  firmware: arm_scmi: Relax CLOCK_DESCRIBE_RATES out-of-spec checks

Link: https://lore.kernel.org/r/20220706115045.2272678-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-06 13:56:10 +02:00
Vasily Averin
b347aa7b57 mm/tracing: add 'accounted' entry into output of allocation tracepoints
Slab caches marked with SLAB_ACCOUNT force accounting for every
allocation from this cache even if __GFP_ACCOUNT flag is not passed.
Unfortunately, at the moment this flag is not visible in ftrace output,
and this makes it difficult to analyze the accounted allocations.

This patch adds boolean "accounted" entry into trace output,
and set it to 'true' for calls used __GFP_ACCOUNT flag and
for allocations from caches marked with SLAB_ACCOUNT.
Set it to 'false' if accounting is disabled in configs.

Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Link: https://lore.kernel.org/r/c418ed25-65fe-f623-fbf8-1676528859ed@openvz.org
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-07-04 17:11:27 +02:00
Cristian Marussi
e699eb9b4f include: trace: Add SCMI fast channel tracing
All the currently defined SCMI events are meant to trace only regular SCMI
transfers based on SCMI messages exchanges; SCMI transactions based on
fast channels, where used, are completely invisible from the tracing point
of view.

Add support to trace fast channel transactions; while doing that avoid
exposing full shared memory location addresses.

Link: https://lore.kernel.org/r/20220704102241.2988447-6-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04 14:28:43 +01:00
Cristian Marussi
2bd0467074 include: trace: Add SCMI full message tracing
Add a distinct trace event to dump full SCMI message headers and payloads.

Link: https://lore.kernel.org/r/20220630173135.2086631-2-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04 14:28:42 +01:00
Linus Torvalds
43627618a0 Merge tag 'ata-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA fix from Damien Le Moal:

 - a single patch to fix tracing of command completion (Edward)

* tag 'ata-5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata: add qc->flags in ata_qc_complete_template tracepoint
2022-06-24 11:12:34 -07:00
Alexander Aring
5d92a30e90 fs: dlm: add resource name to tracepoints
This patch adds the resource name to dlm tracepoints.  The name
usually comes through the lkb_resource, but in some cases a resource
may not yet be associated with an lkb, in which case the name and
namelen parameters are used.

It should be okay to access the lkb_resource and the res_name field at
the time when the tracepoint is invoked. The resource is assigned to a
lkb and it's reference is being held during the tracepoint call. During
this time the resource cannot be freed. Also a lkb will never switch
its assigned resource. The name of a dlm_rsb is assigned at creation
time and should never be changed during runtime as well.

The TP_printk() call uses always a hexadecimal string array
representation for the resource name (which is not necessarily ascii.)

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-06-24 11:53:09 -05:00
Alexander Aring
0c4c516fa2 fs: dlm: remove additional dereference of lksb
This patch removes a dereference of lksb of lkb when calling ast
tracepoint. First it reduces additional overhead, even if traces
are not active. Second we can deference it in TP_fast_assign from
the existing lkb parameter.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-06-24 11:53:08 -05:00
Dylan Yudaken
e70b64a3f2 io_uring: move io_uring_get_opcode out of TP_printk
The TP_printk macro's are not supposed to use custom code ([1]) or else
tools such as perf cannot use these events.

Convert the opcode string representation to use the __string wiring that
the event framework provides ([2]).

[1]: https://lwn.net/Articles/379903/
[2]: https://lwn.net/Articles/381064/

Fixes: 033b87d24f ("io_uring: use the text representation of ops in trace")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220623083743.2648321-1-dylany@fb.com
[axboe: fixup spurious removal of sq_thread assignment]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-23 08:40:36 -06:00
Edward Wu
540a92bfe6 ata: libata: add qc->flags in ata_qc_complete_template tracepoint
Add flags value to check the result of ata completion

Fixes: 255c03d15a ("libata: Add tracepoints")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Wu <edwardwu@realtek.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-06-17 16:30:03 +09:00
Eiichi Tsukata
0da11bf0ca cpuidle: haltpoll: Add trace points for guest_halt_poll_ns grow/shrink
Add trace points as are implemented in KVM host halt polling.
This helps tune guest halt polling params.

Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-14 16:01:35 +02:00
Wonhyuk Yang
873a400938 workqueue: Fix type of cpu in trace event
The trace event "workqueue_queue_work" use unsigned int type for
req_cpu, cpu. This casue confusing cpu number like below log.

$ cat /sys/kernel/debug/tracing/trace
cat-317  [001] ...: workqueue_queue_work: ... req_cpu=8192 cpu=4294967295

So, change unsigned type to signed type in the trace event. After
applying this patch, cpu number will be printed as -1 instead of
4294967295 as folllows.

$ cat /sys/kernel/debug/tracing/trace
cat-1338  [002] ...: workqueue_queue_work: ... req_cpu=8192 cpu=-1

Cc: Baik Song An <bsahn@etri.re.kr>
Cc: Hong Yeon Kim <kimhy@etri.re.kr>
Cc: Taeung Song <taeung@reallinux.co.kr>
Cc: linuxgeek@linuxgeek.io
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-06-07 07:09:47 -10:00
Linus Torvalds
500a434fc5 Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here is the set of driver core changes for 5.19-rc1.

  Lots of tiny driver core changes and cleanups happened this cycle, but
  the two major things are:

   - firmware_loader reorganization and additions including the ability
     to have XZ compressed firmware images and the ability for userspace
     to initiate the firmware load when it needs to, instead of being
     always initiated by the kernel. FPGA devices specifically want this
     ability to have their firmware changed over the lifetime of the
     system boot, and this allows them to work without having to come up
     with yet-another-custom-uapi interface for loading firmware for
     them.

   - physical location support added to sysfs so that devices that know
     this information, can tell userspace where they are located in a
     common way. Some ACPI devices already support this today, and more
     bus types should support this in the future.

  Smaller changes include:

   - driver_override api cleanups and fixes

   - error path cleanups and fixes

   - get_abi script fixes

   - deferred probe timeout changes.

  It's that last change that I'm the most worried about. It has been
  reported to cause boot problems for a number of systems, and I have a
  tested patch series that resolves this issue. But I didn't get it
  merged into my tree before 5.18-final came out, so it has not gotten
  any linux-next testing.

  I'll send the fixup patches (there are 2) as a follow-on series to this
  pull request.

  All have been tested in linux-next for weeks, with no reported issues
  other than the above-mentioned boot time-outs"

* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  driver core: fix deadlock in __device_attach
  kernfs: Separate kernfs_pr_cont_buf and rename_lock.
  topology: Remove unused cpu_cluster_mask()
  driver core: Extend deferred probe timeout on driver registration
  MAINTAINERS: add Russ Weight as a firmware loader maintainer
  driver: base: fix UAF when driver_attach failed
  test_firmware: fix end of loop test in upload_read_show()
  driver core: location: Add "back" as a possible output for panel
  driver core: location: Free struct acpi_pld_info *pld
  driver core: Add "*" wildcard support to driver_async_probe cmdline param
  driver core: location: Check for allocations failure
  arch_topology: Trace the update thermal pressure
  kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
  export: fix string handling of namespace in EXPORT_SYMBOL_NS
  rpmsg: use local 'dev' variable
  rpmsg: Fix calling device_lock() on non-initialized device
  firmware_loader: describe 'module' parameter of firmware_upload_register()
  firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
  firmware_loader: Fix configs for sysfs split
  selftests: firmware: Add firmware upload selftests
  ...
2022-06-03 11:48:47 -07:00
Linus Torvalds
1501f707d2 Merge tag 'f2fs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've refactored the existing atomic write support
  implemented by in-memory operations to have storing data in disk
  temporarily, which can give us a benefit to accept more atomic writes.

  At the same time, we removed the existing volatile write support.

  We've also revisited the file pinning and GC flows and found some
  corner cases which contributeed abnormal system behaviours.

  As usual, there're several minor code refactoring for readability,
  sanity check, and clean ups.

  Enhancements:

   - allow compression for mmap files in compress_mode=user

   - kill volatile write support

   - change the current atomic write way

   - give priority to select unpinned section for foreground GC

   - introduce data read/write showing path info

   - remove unnecessary f2fs_lock_op in f2fs_new_inode

  Bug fixes:

   - fix the file pinning flow during checkpoint=disable and GCs

   - fix foreground and background GCs to select the right victims and
     get free sections on time

   - fix GC flags on defragmenting pages

   - avoid an infinite loop to flush node pages

   - fix fallocate to use file_modified to update permissions
     consistently"

* tag 'f2fs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits)
  f2fs: fix to tag gcing flag on page during file defragment
  f2fs: replace F2FS_I(inode) and sbi by the local variable
  f2fs: add f2fs_init_write_merge_io function
  f2fs: avoid unneeded error handling for revoke_entry_slab allocation
  f2fs: allow compression for mmap files in compress_mode=user
  f2fs: fix typo in comment
  f2fs: make f2fs_read_inline_data() more readable
  f2fs: fix to do sanity check for inline inode
  f2fs: fix fallocate to use file_modified to update permissions consistently
  f2fs: don't use casefolded comparison for "." and ".."
  f2fs: do not stop GC when requiring a free section
  f2fs: keep wait_ms if EAGAIN happens
  f2fs: introduce f2fs_gc_control to consolidate f2fs_gc parameters
  f2fs: reject test_dummy_encryption when !CONFIG_FS_ENCRYPTION
  f2fs: kill volatile write support
  f2fs: change the current atomic write way
  f2fs: don't need inode lock for system hidden quota
  f2fs: stop allocating pinned sections if EAGAIN happens
  f2fs: skip GC if possible when checkpoint disabling
  f2fs: give priority to select unpinned section for foreground GC
  ...
2022-05-31 16:52:59 -07:00
Linus Torvalds
6d29d7fe4f Merge tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
 "We introduce 'courteous server' in this release. Previously NFSD would
  purge open and lock state for an unresponsive client after one lease
  period (typically 90 seconds). Now, after one lease period, another
  client can open and lock those files and the unresponsive client's
  lease is purged; otherwise if the unresponsive client's open and lock
  state is uncontended, the server retains that open and lock state for
  up to 24 hours, allowing the client's workload to resume after a
  lengthy network partition.

  A longstanding issue with NFSv4 file creation is also addressed.
  Previously a file creation can fail internally, returning an error to
  the client, but leave the newly created file in place as an artifact.
  The file creation code path has been reorganized so that internal
  failures and race conditions are less likely to result in an unwanted
  file creation.

  A fault injector has been added to help exercise paths that are run
  during kernel metadata cache invalidation. These caches contain
  information maintained by user space about exported filesystems. Many
  of our test workloads do not trigger cache invalidation.

  There is one patch that is needed to support PREEMPT_RT and a fix for
  an ancient 'sleep while spin-locked' splat that seems to have become
  easier to hit since v5.18-rc3"

* tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (36 commits)
  NFSD: nfsd_file_put() can sleep
  NFSD: Add documenting comment for nfsd4_release_lockowner()
  NFSD: Modernize nfsd4_release_lockowner()
  NFSD: Fix possible sleep during nfsd4_release_lockowner()
  nfsd: destroy percpu stats counters after reply cache shutdown
  nfsd: Fix null-ptr-deref in nfsd_fill_super()
  nfsd: Unregister the cld notifier when laundry_wq create failed
  SUNRPC: Use RMW bitops in single-threaded hot paths
  NFSD: Clean up the show_nf_flags() macro
  NFSD: Trace filecache opens
  NFSD: Move documenting comment for nfsd4_process_open2()
  NFSD: Fix whitespace
  NFSD: Remove dprintk call sites from tail of nfsd4_open()
  NFSD: Instantiate a struct file when creating a regular NFSv4 file
  NFSD: Clean up nfsd_open_verified()
  NFSD: Remove do_nfsd_create()
  NFSD: Refactor NFSv4 OPEN(CREATE)
  NFSD: Refactor NFSv3 CREATE
  NFSD: Refactor nfsd_create_setattr()
  NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create()
  ...
2022-05-26 20:52:24 -07:00
Linus Torvalds
98931dd95f Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
 "Almost all of MM here. A few things are still getting finished off,
  reviewed, etc.

   - Yang Shi has improved the behaviour of khugepaged collapsing of
     readonly file-backed transparent hugepages.

   - Johannes Weiner has arranged for zswap memory use to be tracked and
     managed on a per-cgroup basis.

   - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
     runtime enablement of the recent huge page vmemmap optimization
     feature.

   - Baolin Wang contributes a series to fix some issues around hugetlb
     pagetable invalidation.

   - Zhenwei Pi has fixed some interactions between hwpoisoned pages and
     virtualization.

   - Tong Tiangen has enabled the use of the presently x86-only
     page_table_check debugging feature on arm64 and riscv.

   - David Vernet has done some fixup work on the memcg selftests.

   - Peter Xu has taught userfaultfd to handle write protection faults
     against shmem- and hugetlbfs-backed files.

   - More DAMON development from SeongJae Park - adding online tuning of
     the feature and support for monitoring of fixed virtual address
     ranges. Also easier discovery of which monitoring operations are
     available.

   - Nadav Amit has done some optimization of TLB flushing during
     mprotect().

   - Neil Brown continues to labor away at improving our swap-over-NFS
     support.

   - David Hildenbrand has some fixes to anon page COWing versus
     get_user_pages().

   - Peng Liu fixed some errors in the core hugetlb code.

   - Joao Martins has reduced the amount of memory consumed by
     device-dax's compound devmaps.

   - Some cleanups of the arch-specific pagemap code from Anshuman
     Khandual.

   - Muchun Song has found and fixed some errors in the TLB flushing of
     transparent hugepages.

   - Roman Gushchin has done more work on the memcg selftests.

  ... and, of course, many smaller fixes and cleanups. Notably, the
  customary million cleanup serieses from Miaohe Lin"

* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
  mm: kfence: use PAGE_ALIGNED helper
  selftests: vm: add the "settings" file with timeout variable
  selftests: vm: add "test_hmm.sh" to TEST_FILES
  selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
  selftests: vm: add migration to the .gitignore
  selftests/vm/pkeys: fix typo in comment
  ksm: fix typo in comment
  selftests: vm: add process_mrelease tests
  Revert "mm/vmscan: never demote for memcg reclaim"
  mm/kfence: print disabling or re-enabling message
  include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
  include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
  mm: fix a potential infinite loop in start_isolate_page_range()
  MAINTAINERS: add Muchun as co-maintainer for HugeTLB
  zram: fix Kconfig dependency warning
  mm/shmem: fix shmem folio swapoff hang
  cgroup: fix an error handling path in alloc_pagecache_max_30M()
  mm: damon: use HPAGE_PMD_SIZE
  tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
  nodemask.h: fix compilation error with GCC12
  ...
2022-05-26 12:32:41 -07:00
Linus Torvalds
3f306ea2e1 Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Linus Torvalds
7e062cda7d Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
 "Core
  ----

   - Support TCPv6 segmentation offload with super-segments larger than
     64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP).

   - Generalize skb freeing deferral to per-cpu lists, instead of
     per-socket lists.

   - Add a netdev statistic for packets dropped due to L2 address
     mismatch (rx_otherhost_dropped).

   - Continue work annotating skb drop reasons.

   - Accept alternative netdev names (ALT_IFNAME) in more netlink
     requests.

   - Add VLAN support for AF_PACKET SOCK_RAW GSO.

   - Allow receiving skb mark from the socket as a cmsg.

   - Enable memcg accounting for veth queues, sysctl tables and IPv6.

  BPF
  ---

   - Add libbpf support for User Statically-Defined Tracing (USDTs).

   - Speed up symbol resolution for kprobes multi-link attachments.

   - Support storing typed pointers to referenced and unreferenced
     objects in BPF maps.

   - Add support for BPF link iterator.

   - Introduce access to remote CPU map elements in BPF per-cpu map.

   - Allow middle-of-the-road settings for the
     kernel.unprivileged_bpf_disabled sysctl.

   - Implement basic types of dynamic pointers e.g. to allow for
     dynamically sized ringbuf reservations without extra memory copies.

  Protocols
  ---------

   - Retire port only listening_hash table, add a second bind table
     hashed by port and address. Avoid linear list walk when binding to
     very popular ports (e.g. 443).

   - Add bridge FDB bulk flush filtering support allowing user space to
     remove all FDB entries matching a condition.

   - Introduce accept_unsolicited_na sysctl for IPv6 to implement
     router-side changes for RFC9131.

   - Support for MPTCP path manager in user space.

   - Add MPTCP support for fallback to regular TCP for connections that
     have never connected additional subflows or transmitted
     out-of-sequence data (partial support for RFC8684 fallback).

   - Avoid races in MPTCP-level window tracking, stabilize and improve
     throughput.

   - Support lockless operation of GRE tunnels with seq numbers enabled.

   - WiFi support for host based BSS color collision detection.

   - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets.

   - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2).

   - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile).

   - Allow matching on the number of VLAN tags via tc-flower.

   - Add tracepoint for tcp_set_ca_state().

  Driver API
  ----------

   - Improve error reporting from classifier and action offload.

   - Add support for listing line cards in switches (devlink).

   - Add helpers for reporting page pool statistics with ethtool -S.

   - Add support for reading clock cycles when using PTP virtual clocks,
     instead of having the driver convert to time before reporting. This
     makes it possible to report time from different vclocks.

   - Support configuring low-latency Tx descriptor push via ethtool.

   - Separate Clause 22 and Clause 45 MDIO accesses more explicitly.

  New hardware / drivers
  ----------------------

   - Ethernet:
      - Marvell's Octeon NIC PCI Endpoint support (octeon_ep)
      - Sunplus SP7021 SoC (sp7021_emac)
      - Add support for Renesas RZ/V2M (in ravb)
      - Add support for MediaTek mt7986 switches (in mtk_eth_soc)

   - Ethernet PHYs:
      - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting)
      - TI DP83TD510 PHY
      - Microchip LAN8742/LAN88xx PHYs

   - WiFi:
      - Driver for pureLiFi X, XL, XC devices (plfxlc)
      - Driver for Silicon Labs devices (wfx)
      - Support for WCN6750 (in ath11k)
      - Support Realtek 8852ce devices (in rtw89)

   - Mobile:
      - MediaTek T700 modems (Intel 5G 5000 M.2 cards)

   - CAN:
      - ctucanfd: add support for CTU CAN FD open-source IP core from
        Czech Technical University in Prague

  Drivers
  -------

   - Delete a number of old drivers still using virt_to_bus().

   - Ethernet NICs:
      - intel: support TSO on tunnels MPLS
      - broadcom: support multi-buffer XDP
      - nfp: support VF rate limiting
      - sfc: use hardware tx timestamps for more than PTP
      - mlx5: multi-port eswitch support
      - hyper-v: add support for XDP_REDIRECT
      - atlantic: XDP support (including multi-buffer)
      - macb: improve real-time perf by deferring Tx processing to NAPI

   - High-speed Ethernet switches:
      - mlxsw: implement basic line card information querying
      - prestera: add support for traffic policing on ingress and egress

   - Embedded Ethernet switches:
      - lan966x: add support for packet DMA (FDMA)
      - lan966x: add support for PTP programmable pins
      - ti: cpsw_new: enable bc/mc storm prevention

   - Qualcomm 802.11ax WiFi (ath11k):
      - Wake-on-WLAN support for QCA6390 and WCN6855
      - device recovery (firmware restart) support
      - support setting Specific Absorption Rate (SAR) for WCN6855
      - read country code from SMBIOS for WCN6855/QCA6390
      - enable keep-alive during WoWLAN suspend
      - implement remain-on-channel support

   - MediaTek WiFi (mt76):
      - support Wireless Ethernet Dispatch offloading packet movement
        between the Ethernet switch and WiFi interfaces
      - non-standard VHT MCS10-11 support
      - mt7921 AP mode support
      - mt7921 IPv6 NS offload support

   - Ethernet PHYs:
      - micrel: ksz9031/ksz9131: cabletest support
      - lan87xx: SQI support for T1 PHYs
      - lan937x: add interrupt support for link detection"

* tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits)
  ptp: ocp: Add firmware header checks
  ptp: ocp: fix PPS source selector debugfs reporting
  ptp: ocp: add .init function for sma_op vector
  ptp: ocp: vectorize the sma accessor functions
  ptp: ocp: constify selectors
  ptp: ocp: parameterize input/output sma selectors
  ptp: ocp: revise firmware display
  ptp: ocp: add Celestica timecard PCI ids
  ptp: ocp: Remove #ifdefs around PCI IDs
  ptp: ocp: 32-bit fixups for pci start address
  Revert "net/smc: fix listen processing for SMC-Rv2"
  ath6kl: Use cc-disable-warning to disable -Wdangling-pointer
  selftests/bpf: Dynptr tests
  bpf: Add dynptr data slices
  bpf: Add bpf_dynptr_read and bpf_dynptr_write
  bpf: Dynptr support for ring buffers
  bpf: Add bpf_dynptr_from_mem for local dynptrs
  bpf: Add verifier support for dynptrs
  bpf: Suppress 'passing zero to PTR_ERR' warning
  bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
  ...
2022-05-25 12:22:58 -07:00
Vasily Averin
e5c3f619a0 include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
Fix sparse warning about incorrect gfp_t cast.

Link: https://lkml.kernel.org/r/001979f3-e978-0998-cbed-61a4a2ac87b8@openvz.org
Fixes: f67bed134a ("percpu: improve percpu_alloc_percpu event trace")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25 10:47:48 -07:00
Vasily Averin
185194f191 include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
Redefines __def_gfpflag_names array according to akpm@, willy@ and Joe
Perches recommendations.

Link: https://lkml.kernel.org/r/6f811e19-41c6-f3e8-fca6-23a19a62e313@openvz.org
Fixes: fe573327ff ("tracing: incorrect gfp_t conversion")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-25 10:47:47 -07:00
Linus Torvalds
fdaf9a5840 Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache
Pull page cache updates from Matthew Wilcox:

 - Appoint myself page cache maintainer

 - Fix how scsicam uses the page cache

 - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS

 - Remove the AOP flags entirely

 - Remove pagecache_write_begin() and pagecache_write_end()

 - Documentation updates

 - Convert several address_space operations to use folios:
     - is_dirty_writeback
     - readpage becomes read_folio
     - releasepage becomes release_folio
     - freepage becomes free_folio

 - Change filler_t to require a struct file pointer be the first
   argument like ->read_folio

* tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
  nilfs2: Fix some kernel-doc comments
  Appoint myself page cache maintainer
  fs: Remove aops->freepage
  secretmem: Convert to free_folio
  nfs: Convert to free_folio
  orangefs: Convert to free_folio
  fs: Add free_folio address space operation
  fs: Convert drop_buffers() to use a folio
  fs: Change try_to_free_buffers() to take a folio
  jbd2: Convert release_buffer_page() to use a folio
  jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
  reiserfs: Convert release_buffer_page() to use a folio
  fs: Remove last vestiges of releasepage
  ubifs: Convert to release_folio
  reiserfs: Convert to release_folio
  orangefs: Convert to release_folio
  ocfs2: Convert to release_folio
  nilfs2: Remove comment about releasepage
  nfs: Convert to release_folio
  jfs: Convert to release_folio
  ...
2022-05-24 19:55:07 -07:00
Linus Torvalds
bd1b7c1384 Merge tag 'for-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
 "Features:

   - subpage:
      - support for PAGE_SIZE > 4K (previously only 64K)
      - make it work with raid56

   - repair super block num_devices automatically if it does not match
     the number of device items

   - defrag can convert inline extents to regular extents, up to now
     inline files were skipped but the setting of mount option
     max_inline could affect the decision logic

   - zoned:
      - minimal accepted zone size is explicitly set to 4MiB
      - make zone reclaim less aggressive and don't reclaim if there are
        enough free zones
      - add per-profile sysfs tunable of the reclaim threshold

   - allow automatic block group reclaim for non-zoned filesystems, with
     sysfs tunables

   - tree-checker: new check, compare extent buffer owner against owner
     rootid

  Performance:

   - avoid blocking on space reservation when doing nowait direct io
     writes (+7% throughput for reads and writes)

   - NOCOW write throughput improvement due to refined locking (+3%)

   - send: reduce pressure to page cache by dropping extent pages right
     after they're processed

  Core:

   - convert all radix trees to xarray

   - add iterators for b-tree node items

   - support printk message index

   - user bulk page allocation for extent buffers

   - switch to bio_alloc API, use on-stack bios where convenient, other
     bio cleanups

   - use rw lock for block groups to favor concurrent reads

   - simplify workques, don't allocate high priority threads for all
     normal queues as we need only one

   - refactor scrub, process chunks based on their constraints and
     similarity

   - allocate direct io structures on stack and pass around only
     pointers, avoids allocation and reduces potential error handling

  Fixes:

   - fix count of reserved transaction items for various inode
     operations

   - fix deadlock between concurrent dio writes when low on free data
     space

   - fix a few cases when zones need to be finished

  VFS, iomap:

   - add helper to check if sb write has started (usable for assertions)

   - new helper iomap_dio_alloc_bio, export iomap_dio_bio_end_io"

* tag 'for-5.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (173 commits)
  btrfs: zoned: introduce a minimal zone size 4M and reject mount
  btrfs: allow defrag to convert inline extents to regular extents
  btrfs: add "0x" prefix for unsupported optional features
  btrfs: do not account twice for inode ref when reserving metadata units
  btrfs: zoned: fix comparison of alloc_offset vs meta_write_pointer
  btrfs: send: avoid trashing the page cache
  btrfs: send: keep the current inode open while processing it
  btrfs: allocate the btrfs_dio_private as part of the iomap dio bio
  btrfs: move struct btrfs_dio_private to inode.c
  btrfs: remove the disk_bytenr in struct btrfs_dio_private
  btrfs: allocate dio_data on stack
  iomap: add per-iomap_iter private data
  iomap: allow the file system to provide a bio_set for direct I/O
  btrfs: add a btrfs_dio_rw wrapper
  btrfs: zoned: zone finish unused block group
  btrfs: zoned: properly finish block group on metadata write
  btrfs: zoned: finish block group when there are no more allocatable bytes left
  btrfs: zoned: consolidate zone finish functions
  btrfs: zoned: introduce btrfs_zoned_bg_is_full
  btrfs: improve error reporting in lookup_inline_extent_backref
  ...
2022-05-24 18:52:35 -07:00
Linus Torvalds
65965d9530 Merge tag 'erofs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs (and fscache) updates from Gao Xiang:
 "After working on it on the mailing list for more than half a year, we
  finally form 'erofs over fscache' feature into shape. Hopefully it
  could bring more possibility to the communities.

  The story mainly started from a new project what we called "RAFS v6" [1]
  for Nydus image service almost a year ago, which enhances EROFS to be
  a new form of one bootstrap (which includes metadata representing the
  whole fs tree) + several data-deduplicated content addressable blobs
  (actually treated as multiple devices). Each blob can represent one
  container image layer but not quite exactly since all new data can be
  fully existed in the previous blobs so no need to introduce another
  new blob.

  It is actually not a new idea (at least on my side it's much like a
  simpilied casync [2] for now) and has many benefits over per-file
  blobs or some other exist ways since typically each RAFS v6 image only
  has dozens of device blobs instead of thousands of per-file blobs.
  It's easy to be signed with user keys as a golden image, transfered
  untouchedly with minimal overhead over the network, kept in some type
  of storage conveniently, and run with (optional) runtime verification
  but without involving too many irrelevant features crossing the system
  beyond EROFS itself. At least it's our final goal and we're keeping
  working on it. There was also a good summary of this approach from the
  casync author [3].

  Regardless further optimizations, this work is almost done in the
  previous Linux release cycles. In this round, we'd like to introduce
  on-demand load for EROFS with the fscache/cachefiles infrastructure,
  considering the following advantages:

   - Introduce new file-based backend to EROFS. Although each image only
     contains dozens of blobs but in densely-deployed runC host for
     example, there could still be massive blobs on a machine, which is
     messy if each blob is treated as a device. In contrast, fscache and
     cachefiles are really great interfaces for us to make them work.

   - Introduce on-demand load to fscache and EROFS. Previously, fscache
     is mainly used to caching network-likewise filesystems, now it can
     support on-demand downloading for local fses too with the exact
     localfs on-disk format. It has many advantages which we're been
     described in the latest patchset cover letter [4]. In addition to
     that, most importantly, the cached data is still stored in the
     original local fs on-disk format so that it's still the one signed
     with private keys but only could be partially available. Users can
     fully trust it during running. Later, users can also back up
     cachefiles easily to another machine.

   - More reliable on-demand approach in principle. After data is all
     available locally, user daemon can be no longer online in some use
     cases, which helps daemon crash recovery (filesystems can still in
     service) and hot-upgrade (user daemon can be upgraded more
     frequently due to new features or protocols introduced.)

   - Other format can also be converted to EROFS filesystem format over
     the internet on the fly with the new on-demand load feature and
     mounted. That is entirely possible with on-demand load feature as
     long as such archive format metadata can be fetched in advance like
     stargz.

  In addition, although currently our target user is Nydus image service [5],
  but laterly, it can be used for other use cases like on-demand system
  booting, etc. As for the fscache on-demand load feature itself,
  strictly it can be used for other local fses too. Laterly we could
  promote most code to the iomap infrastructure and also enhance it in
  the read-write way if other local fses are interested.

  Thanks David Howells for taking so much time and patience on this
  these months, many thanks with great respect here again! Thanks Jeffle
  for working on this feature and Xin Yin from Bytedance for
  asynchronous I/O implementation as well as Zichen Tian, Jia Zhu, and
  Yan Song for testing, much appeciated. We're also exploring more
  possibly over fscache cache management over FSDAX for secure
  containers and working on more improvements and useful features for
  fscache, cachefiles, and on-demand load.

  In addition to "erofs over fscache", NFS export and idmapped mount are
  also completed in this cycle for container use cases as well.

  Summary:

   - Add erofs on-demand load support over fscache

   - Support NFS export for erofs

   - Support idmapped mounts for erofs

   - Don't prompt for risk any more when using big pcluster

   - Fix buffer copy overflow of ztailpacking feature

   - Several minor cleanups"

[1] https://lore.kernel.org/r/20210730194625.93856-1-hsiangkao@linux.alibaba.com
[2] https://github.com/systemd/casync
[3] http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html
[4] https://lore.kernel.org/r/20220509074028.74954-1-jefflexu@linux.alibaba.com
[5] https://github.com/dragonflyoss/image-service

* tag 'erofs-for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (29 commits)
  erofs: scan devices from device table
  erofs: change to use asynchronous io for fscache readpage/readahead
  erofs: add 'fsid' mount option
  erofs: implement fscache-based data readahead
  erofs: implement fscache-based data read for inline layout
  erofs: implement fscache-based data read for non-inline layout
  erofs: implement fscache-based metadata read
  erofs: register fscache context for extra data blobs
  erofs: register fscache context for primary data blob
  erofs: add erofs_fscache_read_folios() helper
  erofs: add anonymous inode caching metadata for data blobs
  erofs: add fscache context helper functions
  erofs: register fscache volume
  erofs: add fscache mode check helper
  erofs: make erofs_map_blocks() generally available
  cachefiles: document on-demand read mode
  cachefiles: add tracepoints for on-demand read mode
  cachefiles: enable on-demand read mode
  cachefiles: implement on-demand read
  cachefiles: notify the user daemon when withdrawing cookie
  ...
2022-05-24 18:42:04 -07:00
Linus Torvalds
2319be1356 Merge tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:

 - rwsem cleanups & optimizations/fixes:
    - Conditionally wake waiters in reader/writer slowpaths
    - Always try to wake waiters in out_nolock path

 - Add try_cmpxchg64() implementation, with arch optimizations - and use
   it to micro-optimize sched_clock_{local,remote}()

 - Various force-inlining fixes to address objdump instrumentation-check
   warnings

 - Add lock contention tracepoints:

    lock:contention_begin
    lock:contention_end

 - Misc smaller fixes & cleanups

* tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}
  locking/atomic/x86: Introduce arch_try_cmpxchg64
  locking/atomic: Add generic try_cmpxchg64 support
  futex: Remove a PREEMPT_RT_FULL reference.
  locking/qrwlock: Change "queue rwlock" to "queued rwlock"
  lockdep: Delete local_irq_enable_in_hardirq()
  locking/mutex: Make contention tracepoints more consistent wrt adaptive spinning
  locking: Apply contention tracepoints in the slow path
  locking: Add lock contention tracepoints
  locking/rwsem: Always try to wake waiters in out_nolock path
  locking/rwsem: Conditionally wake waiters in reader/writer slowpaths
  locking/rwsem: No need to check for handoff bit if wait queue empty
  lockdep: Fix -Wunused-parameter for _THIS_IP_
  x86/mm: Force-inline __phys_addr_nodebug()
  x86/kvm/svm: Force-inline GHCB accessors
  task_stack, x86/cea: Force-inline stack helpers
2022-05-24 10:18:23 -07:00
Jakub Kicinski
677fb75253 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/cadence/macb_main.c
  5cebb40bc9 ("net: macb: Fix PTP one step sync support")
  138badbc21 ("net: macb: use NAPI for TX completion path")
https://lore.kernel.org/all/20220523111021.31489367@canb.auug.org.au/

net/smc/af_smc.c
  75c1edf23b ("net/smc: postpone sk_refcnt increment in connect()")
  3aba103006 ("net/smc: align the connect behaviour with TCP")
https://lore.kernel.org/all/20220524114408.4bf1af38@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-23 21:19:17 -07:00
Linus Torvalds
8443516da6 Merge tag 'platform-drivers-x86-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
 "This includes some small changes to kernel/stop_machine.c and arch/x86
  which are deps of the new Intel IFS support.

  Highlights:

   - New drivers:
       - Intel "In Field Scan" (IFS) support
       - Winmate FM07/FM07P buttons
       - Mellanox SN2201 support

   -  AMD PMC driver enhancements

   -  Lots of various other small fixes and hardware-id additions"

* tag 'platform-drivers-x86-v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (54 commits)
  platform/x86/intel/ifs: Add CPU_SUP_INTEL dependency
  platform/x86: intel_cht_int33fe: Set driver data
  platform/x86: intel-hid: fix _DSM function index handling
  platform/x86: toshiba_acpi: use kobj_to_dev()
  platform/x86: samsung-laptop: use kobj_to_dev()
  platform/x86: gigabyte-wmi: Add support for Z490 AORUS ELITE AC and X570 AORUS ELITE WIFI
  tools/power/x86/intel-speed-select: Fix warning for perf_cap.cpu
  tools/power/x86/intel-speed-select: Display error on turbo mode disabled
  Documentation: In-Field Scan
  platform/x86/intel/ifs: add ABI documentation for IFS
  trace: platform/x86/intel/ifs: Add trace point to track Intel IFS operations
  platform/x86/intel/ifs: Add IFS sysfs interface
  platform/x86/intel/ifs: Add scan test support
  platform/x86/intel/ifs: Authenticate and copy to secured memory
  platform/x86/intel/ifs: Check IFS Image sanity
  platform/x86/intel/ifs: Read IFS firmware image
  platform/x86/intel/ifs: Add stub driver for In-Field Scan
  stop_machine: Add stop_core_cpuslocked() for per-core operations
  x86/msr-index: Define INTEGRITY_CAPABILITIES MSR
  x86/microcode/intel: Expose collect_cpu_info_early() for IFS
  ...
2022-05-23 20:38:39 -07:00
Linus Torvalds
6e01f86fb2 Merge tag 'timers-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and timekeeping updates from Thomas Gleixner:

 - Expose CLOCK_TAI to instrumentation to aid with TSN debugging.

 - Ensure that the clockevent is stopped when there is no timer armed to
   avoid pointless wakeups.

 - Make the sched clock frequency handling and rounding consistent.

 - Provide a better debugobject hint for delayed works. The timer
   callback is always the same, which makes it difficult to identify the
   underlying work. Use the work function as a hint instead.

 - Move the timer specific sysctl code into the timer subsystem.

 - The usual set of improvements and cleanups

* tag 'timers-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timers: Provide a better debugobjects hint for delayed works
  time/sched_clock: Fix formatting of frequency reporting code
  time/sched_clock: Use Hz as the unit for clock rate reporting below 4kHz
  time/sched_clock: Round the frequency reported to nearest rather than down
  timekeeping: Consolidate fast timekeeper
  timekeeping: Annotate ktime_get_boot_fast_ns() with data_race()
  timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped
  timekeeping: Introduce fast accessor to clock tai
  tracing/timer: Add missing argument documentation of trace points
  clocksource: Replace cpumask_weight() with cpumask_empty()
  timers: Move timer sysctl into the timer code
  clockevents: Use dedicated list iterator variable
  timers: Simplify calc_index()
  timers: Initialize base::next_expiry_recalc in timers_prepare_cpu()
2022-05-23 17:05:55 -07:00
Linus Torvalds
9836e93c0a Merge tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block
Pull io_uring NVMe command passthrough from Jens Axboe:
 "On top of everything else, this adds support for passthrough for
  io_uring.

  The initial feature for this is NVMe passthrough support, which allows
  non-filesystem based IO commands and admin commands.

  To support this, io_uring grows support for SQE and CQE members that
  are twice as big, allowing to pass in a full NVMe command without
  having to copy data around. And to complete with more than just a
  single 32-bit value as the output"

* tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block: (22 commits)
  io_uring: cleanup handling of the two task_work lists
  nvme: enable uring-passthrough for admin commands
  nvme: helper for uring-passthrough checks
  blk-mq: fix passthrough plugging
  nvme: add vectored-io support for uring-cmd
  nvme: wire-up uring-cmd support for io-passthru on char-device.
  nvme: refactor nvme_submit_user_cmd()
  block: wire-up support for passthrough plugging
  fs,io_uring: add infrastructure for uring-cmd
  io_uring: support CQE32 for nop operation
  io_uring: enable CQE32
  io_uring: support CQE32 in /proc info
  io_uring: add tracing for additional CQE32 fields
  io_uring: overflow processing for CQE32
  io_uring: flush completions for CQE32
  io_uring: modify io_get_cqe for CQE32
  io_uring: add CQE32 completion processing
  io_uring: add CQE32 setup processing
  io_uring: change ring size calculation for CQE32
  io_uring: store add. return values for CQE32
  ...
2022-05-23 13:06:15 -07:00
Linus Torvalds
368da430d0 Merge tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block
Pull io_uring socket() support from Jens Axboe:
 "This adds support for socket(2) for io_uring. This is handy when using
  direct / registered file descriptors with io_uring.

  Outside of those two patches, a small series from Dylan on top that
  improves the tracing by providing a text representation of the opcode
  rather than needing to decode this by reading the header file every
  time.

  That sits in this branch as it was the last opcode added (until it
  wasn't...)"

* tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block:
  io_uring: use the text representation of ops in trace
  io_uring: rename op -> opcode
  io_uring: add io_uring_get_opcode
  io_uring: add type to op enum
  io_uring: add socket(2) support
  net: add __sys_socket_file()
2022-05-23 12:42:33 -07:00
Linus Torvalds
09beaff75e Merge tag 'for-5.19/io_uring-xattr-2022-05-22' of git://git.kernel.dk/linux-block
Pull io_uring xattr support from Jens Axboe:
 "Support for the xattr variants"

* tag 'for-5.19/io_uring-xattr-2022-05-22' of git://git.kernel.dk/linux-block:
  io_uring: cleanup error-handling around io_req_complete
  io_uring: fix trace for reduced sqe padding
  io_uring: add fgetxattr and getxattr support
  io_uring: add fsetxattr and setxattr support
  fs: split off do_getxattr from getxattr
  fs: split off setxattr_copy and do_setxattr function from setxattr
2022-05-23 12:30:30 -07:00