Commit Graph

1136096 Commits

Author SHA1 Message Date
Jason Wang 477f719714 vdpa_sim_net: support feature provisioning
This patch implements features provisioning for vdpa_sim_net.

1) validating the provisioned features to be a subset of the parent
   features.
2) clearing the features that is not wanted by the userspace

For example:

vdpasim_net:
  supported_classes net
  max_supported_vqs 3
  dev_features MTU MAC CTRL_VQ CTRL_MAC_ADDR ANY_LAYOUT VERSION_1 ACCESS_PLATFORM

1) provision vDPA device with all features that are supported by the
   net simulator

dev1: mac 00:00:00:00:00:00 link up link_announce false mtu 1500
  negotiated_features MTU MAC CTRL_VQ CTRL_MAC_ADDR VERSION_1 ACCESS_PLATFORM

2) provision vDPA device with a subset of the features

dev1: mac 00:00:00:00:00:00 link up link_announce false mtu 1500
  negotiated_features CTRL_VQ VERSION_1 ACCESS_PLATFORM

Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220927074810.28627-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-10-07 09:32:40 -04:00
Jason Wang 90fea5a800 vdpa: device feature provisioning
This patch allows the device features to be provisioned through
netlink. A new attribute is introduced to allow the userspace to pass
a 64bit device features during device adding.

This provides several advantages:

- Allow to provision a subset of the features to ease the cross vendor
  live migration.
- Better debug-ability for vDPA framework and parent.

Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220927074810.28627-2-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07 09:32:40 -04:00
Gavin Li 4959aebba8 virtio-net: use mtu size as buffer length for big packets
Currently add_recvbuf_big() allocates MAX_SKB_FRAGS segments for big
packets even when GUEST_* offloads are not present on the device.
However, if guest GSO is not supported, it would be sufficient to
allocate segments to cover just up the MTU size and no further.
Allocating the maximum amount of segments results in a large waste of
buffer space in the queue, which limits the number of packets that can
be buffered and can result in reduced performance.

Therefore, if guest GSO is not supported, use the MTU to calculate the
optimal amount of segments required.

Below is the iperf TCP test results over a Mellanox NIC, using vDPA for
1 VQ, queue size 1024, before and after the change, with the iperf
server running over the virtio-net interface.

MTU(Bytes)/Bandwidth (Gbit/s)
             Before   After
  1500        22.5     22.4
  9000        12.8     25.9

And result of queue size 256.
MTU(Bytes)/Bandwidth (Gbit/s)
             Before   After
  9000        2.15     11.9

With this patch no degradation is observed with multiple below tests and
feature bit combinations. Results are summarized below for q depth of
1024. Interface MTU is 1500 if MTU feature is disabled. MTU is set to 9000
in other tests.

Features/              Bandwidth (Gbit/s)
                         Before   After
mtu off                   20.1     20.2
mtu/indirect on           17.4     17.3
mtu/indirect/packed on    17.2     17.2

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <20220914144911.56422-3-gavinl@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-10-07 09:32:40 -04:00
Gavin Li 46cd26f410 virtio-net: introduce and use helper function for guest gso support checks
Probe routine is already several hundred lines.
Use helper function for guest gso support check.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Message-Id: <20220914144911.56422-2-gavinl@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07 09:32:40 -04:00
Michael S. Tsirkin cdbd952bb7 virtio: drop vp_legacy_set_queue_size
There's actually no way to set queue size on legacy virtio pci.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220815220447.155860-1-mst@redhat.com>
2022-10-07 09:32:40 -04:00
Deming Wang f7adf38928 virtio_ring: make vring_alloc_queue_packed prettier
Add some spaces to vring_alloc_queue(make it look prettier).

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Message-Id: <20220926183306.4535-1-wangdeming@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07 09:32:40 -04:00
Deming Wang bdeb2f9836 virtio_ring: split: Operators use unified style
The operators of vring_alloc_queue_split should use the unified style.Add
space for the '|' ,make it be looked more pretty.

Signed-off-by: Deming Wang <wangdeming@inspur.com>
Message-Id: <20220926022202.1516-1-wangdeming@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07 09:32:40 -04:00
Xiu Jianfeng 078adb3bf4 vhost: add __init/__exit annotations to module init/exit funcs
Add missing __init/__exit annotations to module init/exit funcs.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Message-Id: <20220917083803.21521-1-xiujianfeng@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-10-07 09:32:40 -04:00
Mark Brown e1567b4f0e arm64/sysreg: Fix typo in SCTR_EL1.SPINTMASK
SPINTMASK was typoed as SPINMASK, fix it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221005181642.711734-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-07 14:30:11 +01:00
Kees Cook 10d5ea5a43 wifi: nl80211: Split memcpy() of struct nl80211_wowlan_tcp_data_token flexible array
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), split the memcpy() of the header and the payload
so no false positive run-time overflow warning will be generated.

[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org/

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 15:19:06 +02:00
Hawkins Jiawei e3e6e1d16a wifi: wext: use flex array destination for memcpy()
Syzkaller reports buffer overflow false positive as follows:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 8) of single field
	"&compat_event->pointer" at net/wireless/wext-core.c:623 (size 4)
WARNING: CPU: 0 PID: 3607 at net/wireless/wext-core.c:623
	wireless_send_event+0xab5/0xca0 net/wireless/wext-core.c:623
Modules linked in:
CPU: 1 PID: 3607 Comm: syz-executor659 Not tainted
	6.0.0-rc6-next-20220921-syzkaller #0
[...]
Call Trace:
 <TASK>
 ioctl_standard_call+0x155/0x1f0 net/wireless/wext-core.c:1022
 wireless_process_ioctl+0xc8/0x4c0 net/wireless/wext-core.c:955
 wext_ioctl_dispatch net/wireless/wext-core.c:988 [inline]
 wext_ioctl_dispatch net/wireless/wext-core.c:976 [inline]
 wext_handle_ioctl+0x26b/0x280 net/wireless/wext-core.c:1049
 sock_ioctl+0x285/0x640 net/socket.c:1220
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
 </TASK>

Wireless events will be sent on the appropriate channels in
wireless_send_event(). Different wireless events may have different
payload structure and size, so kernel uses **len** and **cmd** field
in struct __compat_iw_event as wireless event common LCP part, uses
**pointer** as a label to mark the position of remaining different part.

Yet the problem is that, **pointer** is a compat_caddr_t type, which may
be smaller than the relative structure at the same position. So during
wireless_send_event() tries to parse the wireless events payload, it may
trigger the memcpy() run-time destination buffer bounds checking when the
relative structure's data is copied to the position marked by **pointer**.

This patch solves it by introducing flexible-array field **ptr_bytes**,
to mark the position of the wireless events remaining part next to
LCP part. What's more, this patch also adds **ptr_len** variable in
wireless_send_event() to improve its maintainability.

Reported-and-tested-by: syzbot+473754e5af963cf014cf@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/00000000000070db2005e95a5984@google.com/
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 15:00:25 +02:00
Felix Fietkau d9e2497040 wifi: cfg80211: fix ieee80211_data_to_8023_exthdr handling of small packets
STP topology change notification packets only have a payload of 7 bytes,
so they get dropped due to the skb->len < hdrlen + 8 check.
Fix this by removing the extra 8 from the skb->len check and checking the
return code on the skb_copy_bits calls.

Fixes: 2d1c304cb2 ("cfg80211: add function for 802.3 conversion with separate output buffer")
Reported-by: Chad Monroe <chad.monroe@smartrg.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:57:20 +02:00
Alexander Wetzel c95014e1d0 wifi: mac80211: netdev compatible TX stop for iTXQ drivers
Properly handle TX stop for internal queues (iTXQs) within mac80211.

mac80211 must not stop netdev queues when using mac80211 iTXQs.
For these drivers the netdev interface is created with IFF_NO_QUEUE.

While netdev still drops frames for IFF_NO_QUEUE interfaces when we stop
the netdev queues, it also prints a warning when this happens:
Assuming the mac80211 interface is called wlan0 we would get
"Virtual device wlan0 asks to queue packet!" when netdev has to drop a
frame.

This patch is keeping the harmless netdev queue starts for iTXQ drivers.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:48:14 +02:00
Conor Dooley c210b91818 riscv: dts: microchip: fix fabric i2c reg size
The size of the reg should've been changed when the address was changed,
but obviously I forgot to do so.

Fixes: ab291621a8 ("riscv: dts: microchip: icicle: re-jig fabric peripheral addresses")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2022-10-07 13:43:42 +01:00
Felix Fietkau 3bf9e30e49 wifi: mac80211: fix decap offload for stations on AP_VLAN interfaces
Since AP_VLAN interfaces are not passed to the driver, check offload_flags
on the bss vif instead.

Reported-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Fixes: 80a915ec44 ("mac80211: add rx decapsulation offload support")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:43:29 +02:00
Dan Carpenter ceb3d688f9 wifi: mac80211: unlock on error in ieee80211_can_powered_addr_change()
Unlock before returning -EOPNOTSUPP.

Fixes: 3c06e91b40 ("wifi: mac80211: Support POWERED_ADDR_CHANGE feature")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:41:14 +02:00
James Prestwood 092197f1f4 wifi: mac80211: remove/avoid misleading prints
At some point a few kernel debug prints started appearing which
indicated something was sending invalid IEs:

"bad VHT capabilities, disabling VHT"
"Invalid HE elem, Disable HE"

Turns out these were being printed because the local hardware
supported HE/VHT but the peer/AP did not. Bad/invalid indicates,
to me at least, that the IE is in some way malformed, not missing.

For the HE print (ieee80211_verify_peer_he_mcs_support) it will
now silently fail if the HE capability element is missing (still
prints if the element size is wrong).

For the VHT print, it has been removed completely and will silently
set the DISABLE_VHT flag which is consistent with how DISABLE_HT
is set.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:40:33 +02:00
James Prestwood b650009fcb wifi: mac80211: fix probe req HE capabilities access
When building the probe request IEs HE support is checked for
the 6GHz band (wiphy->bands[NL80211_BAND_6GHZ]). If supported
the HE capability IE should be included according to the spec.
The problem is the 16-bit capability is obtained from the
band object (sband) that was passed in, not the 6GHz band
object (sband6). If the sband object doesn't support HE it will
result in a warning.

Fixes: 7d29bc50b3 ("mac80211: always include HE 6GHz capability in probe request")
Signed-off-by: James Prestwood <prestwoj@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:39:47 +02:00
Felix Fietkau f5369dcf5c wifi: mac80211: do not drop packets smaller than the LLC-SNAP header on fast-rx
Since STP TCN frames are only 7 bytes, the pskb_may_pull call returns an error.
Instead of dropping those packets, bump them back to the slow path for proper
processing.

Fixes: 49ddf8e6e2 ("mac80211: add fast-rx path")
Reported-by: Chad Monroe <chad.monroe@smartrg.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-10-07 14:38:58 +02:00
Li Zhong a8e633c604 net/9p: clarify trans_fd parse_opt failure handling
This parse_opts will set invalid opts.rfd/wfd in case of failure which
we already check, but it is not clear for readers that parse_opts error
are handled in p9_fd_create: clarify this by explicitely checking the
return value.

Link: https://lkml.kernel.org/r/20220921210921.1654735-1-floridsleeves@gmail.com
Signed-off-by: Li Zhong <floridsleeves@gmail.com>
[Dominique: reworded commit message to clarify this is NOOP]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-10-07 21:23:09 +09:00
Xiu Jianfeng 0664c63af1 net/9p: add __init/__exit annotations to module init/exit funcs
xen transport was missing annotations

Link: https://lkml.kernel.org/r/20220909103546.73015-1-xiujianfeng@huawei.com
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-10-07 21:23:09 +09:00
Dominique Martinet 296ab4a813 net/9p: use a dedicated spinlock for trans_fd
Shamelessly copying the explanation from Tetsuo Handa's suggested
patch[1] (slightly reworded):
syzbot is reporting inconsistent lock state in p9_req_put()[2],
for p9_tag_remove() from p9_req_put() from IRQ context is using
spin_lock_irqsave() on "struct p9_client"->lock but trans_fd
(not from IRQ context) is using spin_lock().

Since the locks actually protect different things in client.c and in
trans_fd.c, just replace trans_fd.c's lock by a new one specific to the
transport (client.c's protect the idr for fid/tag allocations,
while trans_fd.c's protects its own req list and request status field
that acts as the transport's state machine)

Link: https://lore.kernel.org/r/20220904112928.1308799-1-asmadeus@codewreck.org
Link: https://lkml.kernel.org/r/2470e028-9b05-2013-7198-1fdad071d999@I-love.SAKURA.ne.jp [1]
Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [2]
Reported-by: syzbot <syzbot+2f20b523930c32c160cc@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-10-07 21:22:48 +09:00
Nicholas Piggin 376b3275c1 KVM: PPC: Book3S HV: Fix stack frame regs marker
The hard-coded marker is out of date now, fix it using the nice define.

Fixes: 17773afdcd ("powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER")
Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221006143345.129077-1-npiggin@gmail.com
2022-10-07 21:30:25 +11:00
Prathamesh Shete b78870e7f4 mmc: sdhci-tegra: Use actual clock rate for SW tuning correction
Ensure tegra_host member "curr_clk_rate" holds the actual clock rate
instead of requested clock rate for proper use during tuning correction
algorithm. Actual clk rate may not be the same as the requested clk
frequency depending on the parent clock source set. Tuning correction
algorithm depends on certain parameters which are sensitive to current
clk rate. If the host clk is selected instead of the actual clock rate,
tuning correction algorithm may end up applying invalid correction,
which could result in errors

Fixes: ea8fc5953e ("mmc: tegra: update hw tuning process")
Signed-off-by: Aniruddha TVS Rao <anrao@nvidia.com>
Signed-off-by: Prathamesh Shete <pshete@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221006130622.22900-4-pshete@nvidia.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-10-07 11:03:39 +02:00
Dmitry Torokhov 099d387ebb watchdog: twl4030_wdt: add missing mod_devicetable.h include
The driver is using of_device_id and therefore needs to include
mod_devicetable.h header. We used to get this definition indirectly via
inclusion of matrix_keypad.h from twl.h, but we are cleaning up
matrix_keypad.h from unnecessary includes.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220927154611.3330871-1-dmitry.torokhov@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2022-10-07 11:03:25 +02:00
Biju Das f0c00454bf mmc: renesas_sdhi: Fix rounding errors
Due to clk rounding errors on RZ/G2L platforms, it selects a clock source
with a lower clock rate compared to a higher one.
For eg: The rounding error (533333333 Hz / 4 * 4 = 533333332 Hz < 5333333
33 Hz) selects a clk source of 400 MHz instead of 533.333333 MHz.

This patch fixes this issue by adding a margin of (1/1024) higher to
the clock rate.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Fixes: bb6d3fa98a ("clk: renesas: rcar-gen3: Switch to new SD clock handling")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220928110755.849275-1-biju.das.jz@bp.renesas.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-10-07 10:53:22 +02:00
Kees Cook aabf6155df net: ethernet: bgmac: Remove -Warray-bounds exception
GCC-12 emits false positive -Warray-bounds warnings with
CONFIG_UBSAN_SHIFT (-fsanitize=shift). This is fixed in GCC 13[1],
and there is top-level Makefile logic to remove -Warray-bounds for
known-bad GCC versions staring with commit f0be87c42c ("gcc-12: disable
'-Warray-bounds' universally for now").

Remove the local work-around.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:50:07 +01:00
Kees Cook 4af609b216 net: ethernet: mediatek: Remove -Warray-bounds exception
GCC-12 emits false positive -Warray-bounds warnings with
CONFIG_UBSAN_SHIFT (-fsanitize=shift). This is fixed in GCC 13[1],
and there is top-level Makefile logic to remove -Warray-bounds for
known-bad GCC versions staring with commit f0be87c42c ("gcc-12: disable
'-Warray-bounds' universally for now").

Remove the local work-around.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:49:29 +01:00
Serhiy Boiko fb4a5dfca0 prestera: matchall: do not rollback if rule exists
If you try to create a 'mirror' ACL rule on a port that already has a
mirror rule, prestera_span_rule_add() will fail with EEXIST error.

This forces rollback procedure which destroys existing mirror rule on
hardware leaving it visible in linux.

Add an explicit check for EEXIST to prevent the deletion of the existing
rule but keep user seeing error message:

  $ tc filter add dev sw1p1 ... skip_sw action mirred egress mirror dev sw1p2
  $ tc filter add dev sw1p1 ... skip_sw action mirred egress mirror dev sw1p3
  RTNETLINK answers: File exists
  We have an error talking to the kernel

Fixes: 13defa275e ("net: marvell: prestera: Add matchall support")
Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:48:34 +01:00
David Ahern 61b91eb33a ipv4: Handle attempt to delete multipath route when fib_info contains an nh reference
Gwangun Jung reported a slab-out-of-bounds access in fib_nh_match:
    fib_nh_match+0xf98/0x1130 linux-6.0-rc7/net/ipv4/fib_semantics.c:961
    fib_table_delete+0x5f3/0xa40 linux-6.0-rc7/net/ipv4/fib_trie.c:1753
    inet_rtm_delroute+0x2b3/0x380 linux-6.0-rc7/net/ipv4/fib_frontend.c:874

Separate nexthop objects are mutually exclusive with the legacy
multipath spec. Fix fib_nh_match to return if the config for the
to be deleted route contains a multipath spec while the fib_info
is using a nexthop object.

Fixes: 493ced1ac4 ("ipv4: Allow routes to use nexthop objects")
Fixes: 6bf92d70e6 ("net: ipv4: fix route with nexthop object delete warning")
Reported-by: Gwangun Jung <exsociety@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:47:08 +01:00
Yang Li 3030cbff67 net: enetc: Remove duplicated include in enetc_qos.c
net/pkt_sched.h is included twice in enetc_qos.c,
remove one of them.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2334
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:45:16 +01:00
Yang Li 7305e7804d octeontx2-pf: mcs: remove unneeded semicolon
Semicolon is not required after curly braces.

Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2332
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:44:44 +01:00
Gaurav Kohli 365e1ececb hv_netvsc: Fix race between VF offering and VF association message from host
During vm boot, there might be possibility that vf registration
call comes before the vf association from host to vm.

And this might break netvsc vf path, To prevent the same block
vf registration until vf bind message comes from host.

Cc: stable@vger.kernel.org
Fixes: 00d7ddba11 ("hv_netvsc: pair VF based on serial number")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:43:58 +01:00
Alexander Aring 30393181fd net: ieee802154: return -EINVAL for unknown addr type
This patch adds handling to return -EINVAL for an unknown addr type. The
current behaviour is to return 0 as successful but the size of an
unknown addr type is not defined and should return an error like -EINVAL.

Fixes: 94160108a7 ("net/ieee802154: fix uninit value bug in dgram_sendmsg")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:42:00 +01:00
Wenjia Zhang 87d1aa8b90 MAINTAINERS: add Jan as SMC maintainer
Add Jan as maintainer for Shared Memory Communications (SMC)
Sockets.

Acked-by: Jan Karcher <jaka@linux.ibm.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-07 08:40:15 +01:00
Juergen Gross 2849752f36 xen/pcifront: move xenstore config scanning into sub-function
pcifront_try_connect() and pcifront_attach_devices() share a large
chunk of duplicated code for reading the config information from
Xenstore, which only differs regarding calling pcifront_rescan_root()
or pcifront_scan_root().

Put that code into a new sub-function. It is fine to always call
pcifront_rescan_root() from that common function, as it will fallback
to pcifront_scan_root() if the domain/bus combination isn't known
yet (and pcifront_scan_root() should never be called for an already
known domain/bus combination anyway). In order to avoid duplicate
messages for the fallback case move the check for domain/bus not known
to the beginning of pcifront_rescan_root().

While at it fix the error reporting in case the root-xx node had the
wrong format.

As the return value of pcifront_try_connect() and
pcifront_attach_devices() are not used anywhere make those functions
return void. As an additional bonus this removes the dubious return
of -EFAULT in case of an unexpected driver state.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-07 07:36:44 +02:00
Jisheng Zhang 87f81e66e2
riscv: enable THP_SWAP for RV64
I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
memory optimizations such as swap on zram are helpful. As is seen
in commit d0637c505f ("arm64: enable THP_SWAP for arm64") and
commit bd4c82c22c ("mm, THP, swap: delay splitting THP after
swapped out"), THP_SWAP can improve the swap throughput significantly.

Enable THP_SWAP for RV64, testing the micro-benchmark which is
introduced by commit d0637c505f ("arm64: enable THP_SWAP for arm64")
shows below numbers on the Lichee RV dock board:

swp out bandwidth w/o patch: 66908 bytes/ms (mean of 10 tests)
swp out bandwidth w/ patch: 322638 bytes/ms (mean of 10 tests)

Improved by 382%!

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220829145742.3139-1-jszhang@kernel.org/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-06 20:03:48 -07:00
Palmer Dabbelt 61a41d16ad
RISC-V: Print SSTC in canonical order
This got out of order during a merge conflict, fix it by putting the
entries in the correct order.

Fixes: 7ab52f75a9 ("RISC-V: Add Sstc extension support")
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220920204518.10988-1-palmer@rivosinc.com/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-10-06 20:03:27 -07:00
Dave Airlie bafaf67c42 Revert "drm/sched: Use parent fence instead of finished"
This reverts commit e4dc45b184.

This is causing instability on Linus' desktop, and I'm seeing
oops with VK CTS runs.

netconsole got me the following oops:
[ 1234.778760] BUG: kernel NULL pointer dereference, address: 0000000000000088
[ 1234.778782] #PF: supervisor read access in kernel mode
[ 1234.778787] #PF: error_code(0x0000) - not-present page
[ 1234.778791] PGD 0 P4D 0
[ 1234.778798] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1234.778803] CPU: 7 PID: 805 Comm: systemd-journal Not tainted 6.0.0+ #2
[ 1234.778809] Hardware name: System manufacturer System Product
Name/PRIME X370-PRO, BIOS 5603 07/28/2020
[ 1234.778813] RIP: 0010:drm_sched_job_done.isra.0+0xc/0x140 [gpu_sched]
[ 1234.778828] Code: aa 0f 1d ce e9 57 ff ff ff 48 89 d7 e8 9d 8f 3f
ce e9 4a ff ff ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 55 53
48 89 fb <48> 8b af 88 00 00 00 f0 ff 8d f0 00 00 00 48 8b 85 80 01 00
00 f0
[ 1234.778834] RSP: 0000:ffffabe680380de0 EFLAGS: 00010087
[ 1234.778839] RAX: ffffffffc04e9230 RBX: 0000000000000000 RCX: 0000000000000018
[ 1234.778897] RDX: 00000ba278e8977a RSI: ffff953fb288b460 RDI: 0000000000000000
[ 1234.778901] RBP: ffff953fb288b598 R08: 00000000000000e0 R09: ffff953fbd98b808
[ 1234.778905] R10: 0000000000000000 R11: ffffabe680380ff8 R12: ffffabe680380e00
[ 1234.778908] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff953fbd9ec458
[ 1234.778912] FS:  00007f35e7008580(0000) GS:ffff95428ebc0000(0000)
knlGS:0000000000000000
[ 1234.778916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1234.778919] CR2: 0000000000000088 CR3: 000000010147c000 CR4: 00000000003506e0
[ 1234.778924] Call Trace:
[ 1234.778981]  <IRQ>
[ 1234.778989]  dma_fence_signal_timestamp_locked+0x6a/0xe0
[ 1234.778999]  dma_fence_signal+0x2c/0x50
[ 1234.779005]  amdgpu_fence_process+0xc8/0x140 [amdgpu]
[ 1234.779234]  sdma_v3_0_process_trap_irq+0x70/0x80 [amdgpu]
[ 1234.779395]  amdgpu_irq_dispatch+0xa9/0x1d0 [amdgpu]
[ 1234.779609]  amdgpu_ih_process+0x80/0x100 [amdgpu]
[ 1234.779783]  amdgpu_irq_handler+0x1f/0x60 [amdgpu]
[ 1234.779940]  __handle_irq_event_percpu+0x46/0x190
[ 1234.779946]  handle_irq_event+0x34/0x70
[ 1234.779949]  handle_edge_irq+0x9f/0x240
[ 1234.779954]  __common_interrupt+0x66/0x100
[ 1234.779960]  common_interrupt+0xa0/0xc0
[ 1234.779965]  </IRQ>
[ 1234.779968]  <TASK>
[ 1234.779971]  asm_common_interrupt+0x22/0x40
[ 1234.779976] RIP: 0010:finish_mkwrite_fault+0x22/0x110
[ 1234.779981] Code: 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 41
54 55 48 89 fd 53 48 8b 07 f6 40 50 08 0f 84 eb 00 00 00 48 8b 45 30
48 8b 18 <48> 89 df e8 66 bd ff ff 48 85 c0 74 0d 48 89 c2 83 e2 01 48
83 ea
[ 1234.779985] RSP: 0000:ffffabe680bcfd78 EFLAGS: 00000202

Revert it for now and figure it out later.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-10-07 12:58:39 +10:00
Tetsuo Handa ef575281b2 9p/trans_fd: always use O_NONBLOCK read/write
syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.

Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
the file descriptor refers to a pipe. In other words, pipe file descriptor
needs to be handled as if socket file descriptor.

We somehow need to interrupt kernel_read()/kernel_write() on pipes.

A minimal change, which this patch is doing, is to set O_NONBLOCK flag
 from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
of regular files. But this approach changes O_NONBLOCK flag on userspace-
supplied file descriptors (which might break userspace programs), and
O_NONBLOCK flag could be changed by userspace. It would be possible to set
O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
remains small race window for clearing O_NONBLOCK flag.

If we don't want to manipulate O_NONBLOCK flag, we might be able to
surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
processed by kernel threads which process global system_wq workqueue,
signals could not be delivered from remote threads when p9_mux_poll_stop()
 from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
needed if we count on signals for making kernel_read()/kernel_write()
non-blocking.

Link: https://lkml.kernel.org/r/345de429-a88b-7097-d177-adecf9fed342@I-love.SAKURA.ne.jp
Link: https://syzkaller.appspot.com/bug?extid=8b41a1365f1106fd0f33 [1]
Reported-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+8b41a1365f1106fd0f33@syzkaller.appspotmail.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
[Dominique: add comment at Christian's suggestion]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
2022-10-07 09:59:36 +09:00
Linus Torvalds 4c86114194 New code for 6.1:
- Fix a UAF bug when recording writeback mapping errors.
  - Add a tracepoint so that we can monitor writeback mappings.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmM5290ACgkQ+H93GTRK
 tOv1sg//XftuWPDdcUir6XQUUwdcGDP4bYRknQjDK3h0HjnrLY+YeBqUSg/2HPx3
 288wxJkah7JIIbTqlGNwuPfCz/nDbNf1hFwjkDXjKa0wnG/JGpXlEW93UOEQT7Sz
 XMpcEA6fcZt2HbFv7TMbU5MuN4ipOdfKmY2gkgJVh2Ku1lgcwaKH4xM1b/6o4PDz
 gmKfN6E2W3DID6ETXzSef4Zid3qbE5a0njT5p5R7zZVhQOXNdgBLRexiUV4bd6HQ
 jTzs80VuHl/leKfJLwn7D6HPrkqvSBF2gys/eANUPQri8V+fGiiEILSEc/+LKHej
 3Ob9sBIuTlPDUmCfsjnw2c4nKlmCBXRPSS0zZdcE881imQF7XeOXblvCZB7obzU0
 vVEf+VmEo2xCWxedxNwBuX4m2q+yjY2YFuNEPYOIrspPqDKyrZwDvs+y9o5GaNcu
 ZgIY/qe9LcRpdsrBjiGt4Zf/tgaVKiLeuRH/7F3RAvMLMX/lLrHMf927GP3Mm25j
 XN4K0Eww9ZB+3sxkxRbOr6jIXYW8jkmDgs4DGhoaRQvtJJTryzPp0YK+0CWgaKSm
 5AMTn3wELx63j1vv/iGnNuWEHuSfsyJMZngutC3zp5r5rR7KgOJcTXfbeadz7kiM
 Eo2ri663iDdCw3u4FrpNaFAHyUYL4AfZkUAMnq71gbnsG0BOLdA=
 =5YUg
 -----END PGP SIGNATURE-----

Merge tag 'iomap-6.1-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap updates from Darrick Wong:
 "It's pretty quiet this time around -- a UAF bugfix and a new
  tracepoint so we can watch file writeback:

   - Fix a UAF bug when recording writeback mapping errors

   - Add a tracepoint so that we can monitor writeback mappings"

* tag 'iomap-6.1-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: add a tracepoint for mappings returned by map_blocks
  iomap: iomap: fix memory corruption when recording errors during writeback
2022-10-06 17:57:50 -07:00
Linus Torvalds bc32a6330f The first two changes that involve files outside of fs/ext4:
- submit_bh() can never return an error, so change it to return void,
   and remove the unused checks from its callers
 
 - fix I_DIRTY_TIME handling so it will be set even if the inode
   already has I_DIRTY_INODE
 
 Performance:
 
 - Always enable i_version counter (as btrfs and xfs already do).
   Remove some uneeded i_version bumps to avoid unnecessary nfs cache
   invalidations.
 
 - Wake up journal waters in FIFO order, to avoid some journal users
   from not getting a journal handle for an unfairly long time.
 
 - In ext4_write_begin() allocate any necessary buffer heads before
   starting the journal handle.
 
 - Don't try to prefetch the block allocation bitmaps for a read-only
   file system.
 
 Bug Fixes:
 
 - Fix a number of fast commit bugs, including resources leaks and out
   of bound references in various error handling paths and/or if the fast
   commit log is corrupted.
 
 - Avoid stopping the online resize early when expanding a file system
   which is less than 16TiB to a size greater than 16TiB.
 
 - Fix apparent metadata corruption caused by a race with a metadata
   buffer head getting migrated while it was trying to be read.
 
 - Mark the lazy initialization thread freezable to prevent suspend
   failures.
 
 - Other miscellaneous bug fixes.
 
 Cleanups:
 
 - Break up the incredibly long ext4_full_super() function by
   refactoring to move code into more understandable, smaller
   functions.
 
 - Remove the deprecated (and ignored) noacl and nouser_attr mount
   option.
 
 - Factor out some common code in fast commit handling.
 
 - Other miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmM8/2gACgkQ8vlZVpUN
 gaPohAf9GDMUq3QIYoWLlJ+ygJhL0xQGPfC6sypMjHaUO5GSo+1+sAMU3JBftxUS
 LrgTtmzSKzwp9PyOHNs+mswUzhLZivKVCLMmOznQUZS228GSVKProhN1LPL4UP2Q
 Ks8i1M5XTWS+mtJ5J5Mw6jRHxcjfT6ynyJKPnIWKTwXyeru1WSJ2PWqtWQD4EZkE
 lImECy0jX/zlK02s0jDYbNIbXIvI/TTYi7wT8o1ouLCAXMDv5gJRc5TXCVtX8i59
 /Pl9rGG/+IWTnYT/aQ668S2g0Cz6Wyv2EkmiPUW0Y8NoLaaouBYZoC2hDujiv+l1
 ucEI14TEQ+DojJTdChrtwKqgZfqDOw==
 =xoLC
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "The first two changes involve files outside of fs/ext4:

   - submit_bh() can never return an error, so change it to return void,
     and remove the unused checks from its callers

   - fix I_DIRTY_TIME handling so it will be set even if the inode
     already has I_DIRTY_INODE

  Performance:

   - Always enable i_version counter (as btrfs and xfs already do).
     Remove some uneeded i_version bumps to avoid unnecessary nfs cache
     invalidations

   - Wake up journal waiters in FIFO order, to avoid some journal users
     from not getting a journal handle for an unfairly long time

   - In ext4_write_begin() allocate any necessary buffer heads before
     starting the journal handle

   - Don't try to prefetch the block allocation bitmaps for a read-only
     file system

  Bug Fixes:

   - Fix a number of fast commit bugs, including resources leaks and out
     of bound references in various error handling paths and/or if the
     fast commit log is corrupted

   - Avoid stopping the online resize early when expanding a file system
     which is less than 16TiB to a size greater than 16TiB

   - Fix apparent metadata corruption caused by a race with a metadata
     buffer head getting migrated while it was trying to be read

   - Mark the lazy initialization thread freezable to prevent suspend
     failures

   - Other miscellaneous bug fixes

  Cleanups:

   - Break up the incredibly long ext4_full_super() function by
     refactoring to move code into more understandable, smaller
     functions

   - Remove the deprecated (and ignored) noacl and nouser_attr mount
     option

   - Factor out some common code in fast commit handling

   - Other miscellaneous cleanups"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (53 commits)
  ext4: fix potential out of bound read in ext4_fc_replay_scan()
  ext4: factor out ext4_fc_get_tl()
  ext4: introduce EXT4_FC_TAG_BASE_LEN helper
  ext4: factor out ext4_free_ext_path()
  ext4: remove unnecessary drop path references in mext_check_coverage()
  ext4: update 'state->fc_regions_size' after successful memory allocation
  ext4: fix potential memory leak in ext4_fc_record_regions()
  ext4: fix potential memory leak in ext4_fc_record_modified_inode()
  ext4: remove redundant checking in ext4_ioctl_checkpoint
  jbd2: add miss release buffer head in fc_do_one_pass()
  ext4: move DIOREAD_NOLOCK setting to ext4_set_def_opts()
  ext4: remove useless local variable 'blocksize'
  ext4: unify the ext4 super block loading operation
  ext4: factor out ext4_journal_data_mode_check()
  ext4: factor out ext4_load_and_init_journal()
  ext4: factor out ext4_group_desc_init() and ext4_group_desc_free()
  ext4: factor out ext4_geometry_check()
  ext4: factor out ext4_check_feature_compatibility()
  ext4: factor out ext4_init_metadata_csum()
  ext4: factor out ext4_encoding_init()
  ...
2022-10-06 17:45:53 -07:00
Linus Torvalds 7f198ba7ae affs-for-6.1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmM71hEACgkQxWXV+ddt
 WDsiGg/9EGardWzoQs/YCexFxjDTnIQxaTmoxN2igfsNJ7PwQQDb2R2hUSSQgarp
 vepmjkO0YdFI4Hlym81x5t/cRmdGYz1fjt/0nBibhrhxJBUi3S6CcSOABuOdP699
 E9Iuzyx6YRa4eJN5+rmyDj/51sOgZNGZ+cMKY8ukkXk1Y/VBfOzU/yIoLVjNKPMJ
 UZwXN8C/Tvh/wbE4uBYu5G4dbXfC+qx3ywz/+KdccmSm1iPgA1NzmbAJDFjyCVKI
 R0qub1qXzOTqM0Y1uAu1+LjS2mIv1uASCt2MXt1ragZJsClFK8k+pQwHU9/m4S4G
 lOtUobsfrAPN8laH9B6aIWJMTSY2hglQxbKLFXAy12znlHoPXa5VANNh7KIurBNH
 xtV9eFjcc1b3CJ+nF1NaekamFhWFDSL1kHQpu16gyTZc1p9wkbAh8mYSl0LVHOCp
 n7ZiSJ1Gq5+WdbTMvjVAlgObtzpjOkvl0MpsDRqBn7fva/CNYI+D1NWbf2uP9NyY
 Pe9XO8w7bNXRLyV8vKtISrCnxULGuXxsXPG23K8nd78h5ZPrjCyU3RQaAANFfgDe
 YEzCXF+WKLT8UflJOB1l9rYnY6VrwlktVYOHgHRT94CHPn7vJkCf2HVlOg28sBnM
 39sspyIyO9b55IsWyY0RxRwdQ6mMTXnJqxG5GgorMQ+ZrBg9BPk=
 =fNXH
 -----END PGP SIGNATURE-----

Merge tag 'affs-for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull affs update from David Sterba:
 "One minor update for AFFS, switching away from strlcpy"

* tag 'affs-for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  affs: move from strlcpy with unused retval to strscpy
2022-10-06 17:42:29 -07:00
Linus Torvalds 76e4503534 for-6.1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmM6zNkACgkQxWXV+ddt
 WDsNMg/+LTuwf6Js+mAl1AgtSpLOl2gLfNBJAUXhzwPbc3nF9bwONE/EUYEXTo5h
 kTf1cQRj0NCIZ7iHDwXuWNm77diNl+SChEDIoc7k0d6P7Qmmn2AWbTLM4dleyg5S
 6jxPpOMbegycQfL9tSJNaiT9zlZxj9Z+0yPibR99otrgtuv6zuvRxcdh34rEFIyf
 xoabO3/18lAKHzYzAZxNXMpbUSBmqLPVoZEOcfBAXvcuIJkzKRP6Y9gwlYs+kn+D
 J8BPa3LoSNxXrpCvWzlu7vO3gwNp7H7pQQqZKjjEcOZ+dj2UYQeTyJvl1vdzaNyk
 EoFYlkaKkYi7RaonuHjNaTeD/igJf8Eo6DTiXzACECssbKutlvNG4HXuFApsWy7M
 T7KZ5jTAQ98ZMYjgZ27UbEpFZd8lYHzV952Njjo9zbRVbqwaPEZTTdkjpz+3X6t4
 Z0A951ixOYKiOVdu3Uj1fHaBv0n/p0wrXIGt3ZIdjufM9TctV3oJwOZOiM2H0ccb
 XJVwsQG92+ja9XLZrw8H62PCKBYo3LL52r9b9NVodY9aTsQWTfiV5OP84RRlncCp
 hzPkHmO1YIyVcLoijagiO7cW21pQbKfqsRX/P1F7DXyjosHppmDS7IHDWA7Adf3W
 QA6eBnoWqVwBh7P+IyxJuRG0CrnxkPZeAZIhohDwk5Mt4NGATkA=
 =NlUz
 -----END PGP SIGNATURE-----

Merge tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "There's a bunch of performance improvements, most notably the FIEMAP
  speedup, the new block group tree to speed up mount on large
  filesystems, more io_uring integration, some sysfs exports and the
  usual fixes and core updates.

  Summary:

  Performance:

   - outstanding FIEMAP speed improvement
      - algorithmic change how extents are enumerated leads to orders of
        magnitude speed boost (uncached and cached)
      - extent sharing check speedup (2.2x uncached, 3x cached)
      - add more cancellation points, allowing to interrupt seeking in
        files with large number of extents
      - more efficient hole and data seeking (4x uncached, 1.3x cached)
      - sample results:
	    256M, 32K extents:   4s ->  29ms  (~150x)
	    512M, 64K extents:  30s ->  59ms  (~550x)
	    1G,  128K extents: 225s -> 120ms (~1800x)

   - improved inode logging, especially for directories (on dbench
     workload throughput +25%, max latency -21%)

   - improved buffered IO, remove redundant extent state tracking,
     lowering memory consumption and avoiding rb tree traversal

   - add sysfs tunable to let qgroup temporarily skip exact accounting
     when deleting snapshot, leading to a speedup but requiring a rescan
     after that, will be used by snapper

   - support io_uring and buffered writes, until now it was just for
     direct IO, with the no-wait semantics implemented in the buffered
     write path it now works and leads to speed improvement in IOPS
     (2x), throughput (2.2x), latency (depends, 2x to 150x)

   - small performance improvements when dropping and searching for
     extent maps as well as when flushing delalloc in COW mode
     (throughput +5MB/s)

  User visible changes:

   - new incompatible feature block-group-tree adding a dedicated tree
     for tracking block groups, this allows a much faster load during
     mount and avoids seeking unlike when it's scattered in the extent
     tree items
      - this reduces mount time for many-terabyte sized filesystems
      - conversion tool will be provided so existing filesystem can also
        be updated in place
      - to reduce test matrix and feature combinations requires no-holes
        and free-space-tree (mkfs defaults since 5.15)

   - improved reporting of super block corruption detected by scrub

   - scrub also tries to repair super block and does not wait until next
     commit

   - discard stats and tunables are exported in sysfs
     (/sys/fs/btrfs/FSID/discard)

   - qgroup status is exported in sysfs
     (/sys/sys/fs/btrfs/FSID/qgroups/)

   - verify that super block was not modified when thawing filesystem

  Fixes:

   - FIEMAP fixes
      - fix extent sharing status, does not depend on the cached status
        where merged
      - flush delalloc so compressed extents are reported correctly

   - fix alignment of VMA for memory mapped files on THP

   - send: fix failures when processing inodes with no links (orphan
     files and directories)

   - fix race between quota enable and quota rescan ioctl

   - handle more corner cases for read-only compat feature verification

   - fix missed extent on fsync after dropping extent maps

  Core:

   - lockdep annotations to validate various transactions states and
     state transitions

   - preliminary support for fs-verity in send

   - more effective memory use in scrub for subpage where sector is
     smaller than page

   - block group caching progress logic has been removed, load is now
     synchronous

   - simplify end IO callbacks and bio handling, use chained bios
     instead of own tracking

   - add no-wait semantics to several functions (tree search, nocow,
     flushing, buffered write

   - cleanups and refactoring

  MM changes:

   - export balance_dirty_pages_ratelimited_flags"

* tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (177 commits)
  btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
  btrfs: drop extent map range more efficiently
  btrfs: avoid pointless extent map tree search when flushing delalloc
  btrfs: remove unnecessary next extent map search
  btrfs: remove unnecessary NULL pointer checks when searching extent maps
  btrfs: assert tree is locked when clearing extent map from logging
  btrfs: remove unnecessary extent map initializations
  btrfs: remove the refcount warning/check at free_extent_map()
  btrfs: add helper to replace extent map range with a new extent map
  btrfs: move open coded extent map tree deletion out of inode eviction
  btrfs: use cond_resched_rwlock_write() during inode eviction
  btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
  btrfs: move btrfs_drop_extent_cache() to extent_map.c
  btrfs: fix missed extent on fsync after dropping extent maps
  btrfs: remove stale prototype of btrfs_write_inode
  btrfs: enable nowait async buffered writes
  btrfs: assert nowait mode is not used for some btree search functions
  btrfs: make btrfs_buffered_write nowait compatible
  btrfs: plumb NOWAIT through the write path
  btrfs: make lock_and_cleanup_extent_if_need nowait compatible
  ...
2022-10-06 17:36:48 -07:00
Linus Torvalds 4c0ed7d8d6 whack-a-mole: constifying struct path *
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxmRQAKCRBZ7Krx/gZQ
 6+/kAQD2xyf+i4zOYVBr1NB3qBbhVS1zrni1NbC/kT3dJPgTvwEA7z7eqwnrN4zg
 scKFP8a3yPoaQBfs4do5PolhuSr2ngA=
 =NBI+
 -----END PGP SIGNATURE-----

Merge tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs constification updates from Al Viro:
 "whack-a-mole: constifying struct path *"

* tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ecryptfs: constify path
  spufs: constify path
  nd_jump_link(): constify path
  audit_init_parent(): constify path
  __io_setxattr(): constify path
  do_proc_readlink(): constify path
  overlayfs: constify path
  fs/notify: constify path
  may_linkat(): constify path
  do_sys_name_to_handle(): constify path
  ->getprocattr(): attribute name is const char *, TYVM...
2022-10-06 17:31:02 -07:00
Linus Torvalds 1586a7036d a couple of assorted tomoyo patches
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxjfQAKCRBZ7Krx/gZQ
 63XXAQCC4mKe9HVosfkC6rmoN5ADJCl+lMmk5q8OFN8w7MQrGgD/cF0fVAyYvotr
 iRCwx8qAqQVmsh+d3DzU1UVP+f54owU=
 =pAf9
 -----END PGP SIGNATURE-----

Merge tag 'pull-tomoyo' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc tomoyo changes from Al Viro:
 "A couple of assorted tomoyo patches"

* tag 'pull-tomoyo' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  tomoyo: struct path it might get from LSM callers won't have NULL dentry or mnt
  tomoyo: use vsnprintf() properly
2022-10-06 17:26:56 -07:00
Linus Torvalds ab29622157 whack-a-mole: cropped up open-coded file_inode() uses...
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxj0gAKCRBZ7Krx/gZQ
 66/1AQC/KfIAINNOPxozsZaxOaOKo0ouVJ7sJV4ZGsPKpU69gwD/UodJZCtyZ52h
 wwkmfzTDjAgGt1QCKj96zk2XFqg4swE=
 =u0pv
 -----END PGP SIGNATURE-----

Merge tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file_inode() updates from Al Vrio:
 "whack-a-mole: cropped up open-coded file_inode() uses..."

* tag 'pull-file_inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  orangefs: use ->f_mapping
  _nfs42_proc_copy(): use ->f_mapping instead of file_inode()->i_mapping
  dma_buf: no need to bother with file_inode()->i_mapping
  nfs_finish_open(): don't open-code file_inode()
  bprm_fill_uid(): don't open-code file_inode()
  sgx: use ->f_mapping...
  exfat_iterate(): don't open-code file_inode(file)
  ibmvmc: don't open-code file_inode()
2022-10-06 17:22:11 -07:00
Linus Torvalds 7a3353c5c4 struct file-related stuff
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxjIQAKCRBZ7Krx/gZQ
 6/FPAQCNCZygQzd+54//vo4kTwv5T2Bv3hS8J51rASPJT87/BQD/TfCLS5urt/Gt
 81A1dFOfnTXseofuBKyGSXwQm0dWpgA=
 =PLre
 -----END PGP SIGNATURE-----

Merge tag 'pull-file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs file updates from Al Viro:
 "struct file-related stuff"

* tag 'pull-file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  dma_buf_getfile(): don't bother with ->f_flags reassignments
  Change calling conventions for filldir_t
  locks: fix TOCTOU race when granting write lease
2022-10-06 17:13:18 -07:00
Linus Torvalds 70df64d6c6 d_path pile
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxjQAAKCRBZ7Krx/gZQ
 683pAP9oSHaXo3Twl6rweirNbHocgm8MynCgIU3bpzeVPi6Z1wEApfEq4IInWQyL
 R6ObOneoSobi+9Iaqsoe+uKu54MghAY=
 =rt7w
 -----END PGP SIGNATURE-----

Merge tag 'pull-d_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs d_path updates from Al Viro.

* tag 'pull-d_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  d_path.c: typo fix...
  dynamic_dname(): drop unused dentry argument
2022-10-06 16:55:41 -07:00
Linus Torvalds 46811b5cb3 saner inode_init_always()
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxivgAKCRBZ7Krx/gZQ
 66iyAQD2btSJlwKoqMlo+Xnj0J7nvHEYpFOf+rMWa/1SJIJOnAEAr9VYpgstELHP
 bAZ3EbGyLPZbLPnmiTzrDlvqZDbKDgs=
 =4e/D
 -----END PGP SIGNATURE-----

Merge tag 'pull-inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs inode update from Al Viro:
 "Saner inode_init_always(), also fixing a nilfs problem"

* tag 'pull-inode' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: fix UAF/GPF bug in nilfs_mdt_destroy
2022-10-06 16:49:00 -07:00