Commit Graph

1281524 Commits

Author SHA1 Message Date
Taehee Yoo
59a931c5b7 xdp: fix invalid wait context of page_pool_destroy()
If the driver uses a page pool, it creates a page pool with
page_pool_create().
The reference count of page pool is 1 as default.
A page pool will be destroyed only when a reference count reaches 0.
page_pool_destroy() is used to destroy page pool, it decreases a
reference count.
When a page pool is destroyed, ->disconnect() is called, which is
mem_allocator_disconnect().
This function internally acquires mutex_lock().

If the driver uses XDP, it registers a memory model with
xdp_rxq_info_reg_mem_model().
The xdp_rxq_info_reg_mem_model() internally increases a page pool
reference count if a memory model is a page pool.
Now the reference count is 2.

To destroy a page pool, the driver should call both page_pool_destroy()
and xdp_unreg_mem_model().
The xdp_unreg_mem_model() internally calls page_pool_destroy().
Only page_pool_destroy() decreases a reference count.

If a driver calls page_pool_destroy() then xdp_unreg_mem_model(), we
will face an invalid wait context warning.
Because xdp_unreg_mem_model() calls page_pool_destroy() with
rcu_read_lock().
The page_pool_destroy() internally acquires mutex_lock().

Splat looks like:
=============================
[ BUG: Invalid wait context ]
6.10.0-rc6+ #4 Tainted: G W
-----------------------------
ethtool/1806 is trying to lock:
ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150
other info that might help us debug this:
context-{5:5}
3 locks held by ethtool/1806:
stack backtrace:
CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
Call Trace:
<TASK>
dump_stack_lvl+0x7e/0xc0
__lock_acquire+0x1681/0x4de0
? _printk+0x64/0xe0
? __pfx_mark_lock.part.0+0x10/0x10
? __pfx___lock_acquire+0x10/0x10
lock_acquire+0x1b3/0x580
? mem_allocator_disconnect+0x73/0x150
? __wake_up_klogd.part.0+0x16/0xc0
? __pfx_lock_acquire+0x10/0x10
? dump_stack_lvl+0x91/0xc0
__mutex_lock+0x15c/0x1690
? mem_allocator_disconnect+0x73/0x150
? __pfx_prb_read_valid+0x10/0x10
? mem_allocator_disconnect+0x73/0x150
? __pfx_llist_add_batch+0x10/0x10
? console_unlock+0x193/0x1b0
? lockdep_hardirqs_on+0xbe/0x140
? __pfx___mutex_lock+0x10/0x10
? tick_nohz_tick_stopped+0x16/0x90
? __irq_work_queue_local+0x1e5/0x330
? irq_work_queue+0x39/0x50
? __wake_up_klogd.part.0+0x79/0xc0
? mem_allocator_disconnect+0x73/0x150
mem_allocator_disconnect+0x73/0x150
? __pfx_mem_allocator_disconnect+0x10/0x10
? mark_held_locks+0xa5/0xf0
? rcu_is_watching+0x11/0xb0
page_pool_release+0x36e/0x6d0
page_pool_destroy+0xd7/0x440
xdp_unreg_mem_model+0x1a7/0x2a0
? __pfx_xdp_unreg_mem_model+0x10/0x10
? kfree+0x125/0x370
? bnxt_free_ring.isra.0+0x2eb/0x500
? bnxt_free_mem+0x5ac/0x2500
xdp_rxq_info_unreg+0x4a/0xd0
bnxt_free_mem+0x1356/0x2500
bnxt_close_nic+0xf0/0x3b0
? __pfx_bnxt_close_nic+0x10/0x10
? ethnl_parse_bit+0x2c6/0x6d0
? __pfx___nla_validate_parse+0x10/0x10
? __pfx_ethnl_parse_bit+0x10/0x10
bnxt_set_features+0x2a8/0x3e0
__netdev_update_features+0x4dc/0x1370
? ethnl_parse_bitset+0x4ff/0x750
? __pfx_ethnl_parse_bitset+0x10/0x10
? __pfx___netdev_update_features+0x10/0x10
? mark_held_locks+0xa5/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x70
? __pm_runtime_resume+0x7d/0x110
ethnl_set_features+0x32d/0xa20

To fix this problem, it uses rhashtable_lookup_fast() instead of
rhashtable_lookup() with rcu_read_lock().
Using xa without rcu_read_lock() here is safe.
xa is freed by __xdp_mem_allocator_rcu_free() and this is called by
call_rcu() of mem_xa_remove().
The mem_xa_remove() is called by page_pool_destroy() if a reference
count reaches 0.
The xa is already protected by the reference count mechanism well in the
control plane.
So removing rcu_read_lock() for page_pool_destroy() is safe.

Fixes: c3f812cea0 ("page_pool: do not release pool until inflight == 0.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240712095116.3801586-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 20:40:21 -07:00
Chengen Du
79eecf631c af_packet: Handle outgoing VLAN packets without hardware offloading
The issue initially stems from libpcap. The ethertype will be overwritten
as the VLAN TPID if the network interface lacks hardware VLAN offloading.
In the outbound packet path, if hardware VLAN offloading is unavailable,
the VLAN tag is inserted into the payload but then cleared from the sk_buff
struct. Consequently, this can lead to a false negative when checking for
the presence of a VLAN tag, causing the packet sniffing outcome to lack
VLAN tag information (i.e., TCI-TPID). As a result, the packet capturing
tool may be unable to parse packets as expected.

The TCI-TPID is missing because the prb_fill_vlan_info() function does not
modify the tp_vlan_tci/tp_vlan_tpid values, as the information is in the
payload and not in the sk_buff struct. The skb_vlan_tag_present() function
only checks vlan_all in the sk_buff struct. In cooked mode, the L2 header
is stripped, preventing the packet capturing tool from determining the
correct TCI-TPID value. Additionally, the protocol in SLL is incorrect,
which means the packet capturing tool cannot parse the L3 header correctly.

Link: https://github.com/the-tcpdump-group/libpcap/issues/1105
Link: https://lore.kernel.org/netdev/20240520070348.26725-1-chengen.du@canonical.com/T/#u
Fixes: 393e52e33c ("packet: deliver VLAN TCI to userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Chengen Du <chengen.du@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240713114735.62360-1-chengen.du@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 20:27:36 -07:00
Breno Leitao
97d9fba9a8 net: netconsole: Disable target before netpoll cleanup
Currently, netconsole cleans up the netpoll structure before disabling
the target. This approach can lead to race conditions, as message
senders (write_ext_msg() and write_msg()) check if the target is
enabled before using netpoll. The sender can validate that the target is
enabled, but, the netpoll might be de-allocated already, causing
undesired behaviours.

This patch reverses the order of operations:
1. Disable the target
2. Clean up the netpoll structure

This change eliminates the potential race condition, ensuring that
no messages are sent through a partially cleaned-up netpoll structure.

Fixes: 2382b15bcc ("netconsole: take care of NETDEV_UNREGISTER event")
Cc: stable@vger.kernel.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240712143415.1141039-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:38:44 -07:00
Jakub Kicinski
d657f5c76c Merge branch 'vrf-fix-source-address-selection-with-route-leak'
Nicolas Dichtel says:

====================
vrf: fix source address selection with route leak

For patch 1 and 2, I didn't find the exact commit that introduced this bug, but
I suspect it has been here since the first version. I arbitrarily choose one.
====================

Link: https://patch.msgid.link/20240710081521.3809742-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:34:18 -07:00
Nicolas Dichtel
39367183ae selftests: vrf_route_leaking: add local test
The goal is to check that the source address selected by the kernel is
routable when a leaking route is used. ICMP, TCP and UDP connections are
tested.
The symmetric topology is enough for this test.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-5-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:34:16 -07:00
Nicolas Dichtel
abb9a68d2c ipv6: take care of scope when choosing the src addr
When the source address is selected, the scope must be checked. For
example, if a loopback address is assigned to the vrf device, it must not
be chosen for packets sent outside.

CC: stable@vger.kernel.org
Fixes: afbac6010a ("net: ipv6: Address selection needs to consider L3 domains")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-4-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:34:16 -07:00
Nicolas Dichtel
252442f2ae ipv6: fix source address selection with route leak
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.

Let's add a check against the output interface and call the appropriate
function to select the source address.

CC: stable@vger.kernel.org
Fixes: 0d240e7811 ("net: vrf: Implement get_saddr for IPv6")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20240710081521.3809742-3-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:34:16 -07:00
Nicolas Dichtel
6807352353 ipv4: fix source address selection with route leak
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.

Let's add a check against the output interface and call the appropriate
function to select the source address.

CC: stable@vger.kernel.org
Fixes: 8cbb512c92 ("net: Add source address lookup op for VRF")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-2-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:34:15 -07:00
Amit Cohen
f67a90a0c8 selftests: forwarding: devlink_lib: Wait for udev events after reloading
Lately, an additional locking was added by commit c0a40097f0
("drivers: core: synchronize really_probe() and dev_uevent()"). The
locking protects dev_uevent() calling. This function is used to send
messages from the kernel to user space. Uevent messages notify user space
about changes in device states, such as when a device is added, removed,
or changed. These messages are used by udev (or other similar user-space
tools) to apply device-specific rules.

After reloading devlink instance, udev events should be processed. This
locking causes a short delay of udev events handling.

One example for useful udev rule is renaming ports. 'forwading.config'
can be configured to use names after udev rules are applied. Some tests run
devlink_reload() and immediately use the updated names. This worked before
the above mentioned commit was pushed, but now the delay of uevent messages
causes that devlink_reload() returns before udev events are handled and
tests fail.

Adjust devlink_reload() to not assume that udev events are already
processed when devlink reload is done, instead, wait for udev events to
ensure they are processed before returning from the function.

Without this patch:
TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4                                           [ OK ]
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory
Cannot find device "swp1"
Cannot find device "swp2"
TEST: setup_wait_dev (: Interface swp1 does not come up.) [FAIL]

With this patch:
$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4                                           [ OK ]
TEST: 'rif_mac_profile' overflow 5                                  [ OK ]

This is relevant not only for this test.

Fixes: bc7cbb1e9f ("selftests: forwarding: Add devlink_lib.sh")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/89367666e04b38a8993027f1526801ca327ab96a.1720709333.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:17:13 -07:00
Jakub Kicinski
5f25f553b8 Merge branch 'net-pse-pd-fix-possible-issues-with-a-pse-supporting-both-c33-and-podl'
Kory Maincent says:

====================
net: pse-pd: Fix possible issues with a PSE supporting both c33 and PoDL

Although PSE controllers supporting both c33 and PoDL are not on the
market yet, we want to prevent potential issues from arising in the
future. Two possible issues could occur with a PSE supporting both c33
and PoDL:

- Setting the config for one type of PSE leaves the other type's config
  null. In this case, the PSE core would return EOPNOTSUPP, which is not
  the correct behavior.
- Null dereference of Netlink attributes as only one of the Netlink
  attributes would be specified at a time.

This patch series contains two patches to fix these issues.
====================

Link: https://patch.msgid.link/20240711-fix_pse_pd_deref-v3-0-edd78fc4fe42@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:16:21 -07:00
Kory Maincent
4cddb0f15e net: ethtool: pse-pd: Fix possible null-deref
Fix a possible null dereference when a PSE supports both c33 and PoDL, but
only one of the netlink attributes is specified. The c33 or PoDL PSE
capabilities are already validated in the ethnl_set_pse_validate() call.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240705184116.13d8235a@kernel.org/
Fixes: 4d18e3ddf4 ("net: ethtool: pse-pd: Expand pse commands with the PSE PoE interface")
Link: https://patch.msgid.link/20240711-fix_pse_pd_deref-v3-2-edd78fc4fe42@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:16:18 -07:00
Kory Maincent
93c3a96c30 net: pse-pd: Do not return EOPNOSUPP if config is null
For a PSE supporting both c33 and PoDL, setting config for one type of PoE
leaves the other type's config null. Currently, this case returns
EOPNOTSUPP, which is incorrect. Instead, we should do nothing if the
configuration is empty.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Fixes: d83e13761d ("net: pse-pd: Use regulator framework within PSE framework")
Link: https://patch.msgid.link/20240711-fix_pse_pd_deref-v3-1-edd78fc4fe42@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:16:18 -07:00
Jakub Kicinski
70c676cb3d Merge tag 'ipsec-2024-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2024-07-11

1) Fix esp_output_tail_tcp() on unsupported ESPINTCP.
   From Hagar Hemdan.

2) Fix two bugs in the recently introduced SA direction separation.
   From Antony Antony.

3) Fix unregister netdevice hang on hardware offload. We had to add another
   list where skbs linked to that are unlinked from the lists (deleted)
   but not yet freed.

4) Fix netdev reference count imbalance in xfrm_state_find.
   From Jianbo Liu.

5) Call xfrm_dev_policy_delete when killingi them on offloaded policies.
   Jianbo Liu.

* tag 'ipsec-2024-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: call xfrm_dev_policy_delete when kill policy
  xfrm: fix netdev reference count imbalance
  xfrm: Export symbol xfrm_dev_state_delete.
  xfrm: Fix unregister netdevice hang on hardware offload.
  xfrm: Log input direction mismatch error in one place
  xfrm: Fix input error path memory access
  net: esp: cleanup esp_output_tail_tcp() in case of unsupported ESPINTCP
====================

Link: https://patch.msgid.link/20240711100025.1949454-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-14 07:10:49 -07:00
Linus Torvalds
528dd46d0f Merge tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull more networking fixes from Jakub Kicinski:
 "A quick follow up to yesterday's pull. We got a regressions report for
  the bnxt patch as soon as it got to your tree. The ethtool fix is also
  good to have, although it's an older regression.

  Current release - regressions:

   - eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW
     when user tries to decrease the ring count

  Previous releases - regressions:

   - ethtool: fix RSS setting, accept "no change" setting if the driver
     doesn't support the new features

   - eth: i40e: remove needless retries of NVM update, don't wait 20min
     when we know the firmware update won't succeed"

* tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
  bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
  octeontx2-af: fix issue with IPv4 match for RSS
  octeontx2-af: fix issue with IPv6 ext match for RSS
  octeontx2-af: fix detection of IP layer
  octeontx2-af: fix a issue with cpt_lf_alloc mailbox
  octeontx2-af: replace cpt slot with lf id on reg write
  i40e: fix: remove needless retries of NVM update
  net: ethtool: Fix RSS setting
2024-07-12 18:33:33 -07:00
Michael Chan
f7ce5eb2cb bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring()
On older chips not supporting multiple RSS contexts, reducing
ethtool channels will crash:

BUG: kernel NULL pointer dereference, address: 00000000000000b8
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 7032 Comm: ethtool Tainted: G S                 6.10.0-rc4 #1
Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83
RSP: 0018:ffffaaa501d23ba8 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff8efdf600c940 RCX: 0000000000000000
RDX: 000000000000007f RSI: ffffffffacf429c4 RDI: ffff8efdf600ca78
RBP: 0000000000000080 R08: 0000000000000000 R09: 0000000000000100
R10: 0000000000000001 R11: ffffaaa501d238c0 R12: 0000000000000080
R13: 0000000000000000 R14: ffff8efdf600c000 R15: 0000000000000006
FS:  00007f977a7d2740(0000) GS:ffff8f041f840000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b8 CR3: 00000002320aa004 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die_body+0x15/0x60
? page_fault_oops+0x157/0x440
? do_user_addr_fault+0x60/0x770
? _raw_spin_lock_irqsave+0x12/0x40
? exc_page_fault+0x61/0x120
? asm_exc_page_fault+0x22/0x30
? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en]
? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en]
bnxt_set_channels+0x9d/0x340 [bnxt_en]
ethtool_set_channels+0x14b/0x210
__dev_ethtool+0xdf8/0x2890
? preempt_count_add+0x6a/0xa0
? percpu_counter_add_batch+0x23/0x90
? filemap_map_pages+0x417/0x4a0
? avc_has_extended_perms+0x185/0x420
? __pfx_udp_ioctl+0x10/0x10
? sk_ioctl+0x55/0xf0
? kmalloc_trace_noprof+0xe0/0x210
? dev_ethtool+0x54/0x170
dev_ethtool+0xa2/0x170
dev_ioctl+0xbe/0x530
sock_do_ioctl+0xa3/0xf0
sock_ioctl+0x20d/0x2e0

bp->rss_ctx_list is not initialized if the chip or firmware does not
support multiple RSS contexts.  Fix it by adding a check in
bnxt_get_max_rss_ctx_ring() before proceeding to reference
bp->rss_ctx_list.

Fixes: 0d1b7d6c92 ("bnxt: fix crashes when reducing ring count with active RSS contexts")
Reported-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-12 18:00:00 -07:00
Linus Torvalds
975f3b6da1 Merge tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "Fix a regression in extent map shrinker behaviour.

  In the past weeks we got reports from users that there are huge
  latency spikes or freezes. This was bisected to newly added shrinker
  of extent maps (it was added to fix a build up of the structures in
  memory).

  I'm assuming that the freezes would happen to many users after release
  so I'd like to get it merged now so it's in 6.10. Although the diff
  size is not small the changes are relatively straightforward, the
  reporters verified the fixes and we did testing on our side.

  The fixes:

   - adjust behaviour under memory pressure and check lock or scheduling
     conditions, bail out if needed

   - synchronize tracking of the scanning progress so inode ranges are
     not skipped or work duplicated

   - do a delayed iput when scanning a root so evicting an inode does
     not slow things down in case of lots of dirty data, also fix
     lockdep warning, a deadlock could happen when writing the dirty
     data would need to start a transaction"

* tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: avoid races when tracking progress for extent map shrinking
  btrfs: stop extent map shrinker if reschedule is needed
  btrfs: use delayed iput during extent map shrinking
2024-07-12 12:08:42 -07:00
Linus Torvalds
a52ff901a1 Merge tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
 "A fix for a possible use-after-free following "rbd unmap" or "umount"
  marked for stable and two kernel-doc fixups"

* tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client:
  libceph: fix crush_choose_firstn() kernel-doc warnings
  libceph: suppress crush_choose_indep() kernel-doc warnings
  libceph: fix race between delayed_work() and ceph_monc_stop()
2024-07-12 10:39:29 -07:00
Linus Torvalds
ac6a9e07a7 Merge tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fix from Ulf Hansson:

 - qcom: Skip retention level for rpmhpd's

* tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: qcom: rpmhpd: Skip retention level for Power Domains
2024-07-12 10:29:49 -07:00
Linus Torvalds
01ec3bb6ea Merge tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:

 - davinci_mmc: Prevent transmitted data size from exceeding sgm's
   length

 - sdhci: Fix max_seg_size for 64KiB PAGE_SIZE

* tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length
  mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
2024-07-12 10:26:48 -07:00
Linus Torvalds
e091caf99f Merge tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
 "Most of these changes are Qualcomm SoC specific and came in just after
  I sent out the last set of fixes. This includes two regression fixes
  for SoC drivers, a defconfig change to ensure the Lenovo X13s is
  usable and 11 changes to DT files to fix regressions and minor
  platform specific issues.

  Tony and Chunyan step back from their respective maintainership roles
  on the omap and unisoc platforms, and Christophe in turn takes over
  maintaining some of the Freescale SoC drivers that he has been taking
  care of in practice already.

  Lastly, there are two trivial fixes for the davinci and sunxi
  platforms"

* tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
  MAINTAINERS: Add more maintainers for omaps
  ARM: davinci: Convert comma to semicolon
  MAINTAINERS: Move myself from SPRD Maintainer to Reviewer
  Revert "dt-bindings: cache: qcom,llcc: correct QDU1000 reg entries"
  arm64: dts: qcom: qdu1000: Fix LLCC reg property
  arm64: dts: qcom: sm6115: add iommu for sdhc_1
  arm64: dts: qcom: x1e80100-crd: fix DAI used for headset recording
  arm64: dts: qcom: x1e80100-crd: fix WCD audio codec TX port mapping
  soc: qcom: pmic_glink: disable UCSI on sc8280xp
  arm64: defconfig: enable Elan i2c-hid driver
  arm64: dts: qcom: sc8280xp-crd: use external pull up for touch reset
  arm64: dts: qcom: sc8280xp-x13s: fix touchscreen power on
  arm64: dts: qcom: x1e80100: Fix PCIe 6a reg offsets and add MHI
  arm64: dts: qcom: sa8775p: Correct IRQ number of EL2 non-secure physical timer
  arm64: dts: allwinner: Fix PMIC interrupt number
  arm64: dts: qcom: sc8280xp: Set status = "reserved" on PSHOLD
  arm64: dts: qcom: x1e80100-*: Allocate some CMA buffers
  arm64: dts: qcom: sc8180x: Fix LLCC reg property again
2024-07-12 09:00:25 -07:00
Linus Torvalds
f469cf967b Merge tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver fixes from Greg KH:
 "Here are some small remaining driver fixes for 6.10-final that have
  all been in linux-next for a while and resolve reported issues.
  Included in here are:

   - mei driver fixes (and a spelling fix at the end just to be clean)

   - iio driver fixes for reported problems

   - fastrpc bugfixes

   - nvmem small fixes"

* tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  mei: vsc: Fix spelling error
  mei: vsc: Enhance SPI transfer of IVSC ROM
  mei: vsc: Utilize the appropriate byte order swap function
  mei: vsc: Prevent timeout error with added delay post-firmware download
  mei: vsc: Enhance IVSC chipset stability during warm reboot
  nvmem: core: limit cell sysfs permissions to main attribute ones
  nvmem: core: only change name to fram for current attribute
  nvmem: meson-efuse: Fix return value of nvmem callbacks
  nvmem: rmem: Fix return value of rmem_read()
  misc: microchip: pci1xxxx: Fix return value of nvmem callbacks
  hpet: Support 32-bit userspace
  misc: fastrpc: Restrict untrusted app to attach to privileged PD
  misc: fastrpc: Fix ownership reassignment of remote heap
  misc: fastrpc: Fix memory leak in audio daemon attach operation
  misc: fastrpc: Avoid updating PD type for capability request
  misc: fastrpc: Copy the complete capability structure to user
  misc: fastrpc: Fix DSP capabilities request
  iio: light: apds9306: Fix error handing
  iio: trigger: Fix condition for own trigger
2024-07-12 08:45:27 -07:00
Linus Torvalds
1cb67bcc21 Merge tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
 "Here are some small serial driver fixes for 6.10-final. Included in
  here are:

   - qcom-geni fixes for a much much much discussed issue and everyone
     now seems to be agreed that this is the proper way forward to
     resolve the reported lockups

   - imx serial driver bugfixes

   - 8250_omap errata fix

   - ma35d1 serial driver bugfix

  All of these have been in linux-next for over a week with no reported
  issues"

* tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: qcom-geni: do not kill the machine on fifo underrun
  serial: qcom-geni: fix hard lockup on buffer flush
  serial: qcom-geni: fix soft lockup on sw flow control and suspend
  serial: imx: ensure RTS signal is not left active after shutdown
  tty: serial: ma35d1: Add a NULL check for of_node
  serial: 8250_omap: Fix Errata i2310 with RX FIFO level check
  serial: imx: only set receiver level if it is zero
2024-07-12 08:39:44 -07:00
Linus Torvalds
1293147aa8 Merge tag 'usb-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for
  6.10-final. Included in here are:

   - new usb-serial device ids for reported devices

   - syzbot-triggered duplicate endpoint bugfix

   - gadget bugfix for configfs memory overwrite

   - xhci resume bugfix

   - new device quirk added

   - usb core error path bugfix

  All of these have been in linux-next (most for a while) with no
  reported issues"

* tag 'usb-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: mos7840: fix crash on resume
  USB: serial: option: add Rolling RW350-GL variants
  USB: serial: option: add support for Foxconn T99W651
  USB: serial: option: add Netprisma LCUK54 series modules
  usb: gadget: configfs: Prevent OOB read/write in usb_string_copy()
  usb: dwc3: pci: add support for the Intel Panther Lake
  usb: core: add missing of_node_put() in usb_of_has_devices_or_graph
  USB: Add USB_QUIRK_NO_SET_INTF quirk for START BP-850k
  USB: core: Fix duplicate endpoint bug by clearing reserved bits in the descriptor
  xhci: always resume roothubs if xHC was reset during resume
  USB: serial: option: add Telit generic core-dump composition
  USB: serial: option: add Fibocom FM350-GL
  USB: serial: option: add Telit FN912 rmnet compositions
2024-07-12 08:35:56 -07:00
Linus Torvalds
9b48104b2c Merge tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "The majority of changes here are small device-specific fixes for ASoC
  SOF / Intel and usual HD-audio quirks.

  The only significant high LOC is found in the Cirrus firmware driver,
  but all those are for hardening against malicious firmware blobs, and
  they look fine for taking as a last minute fix, too"

* tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: Enable Mute LED on HP 250 G7
  firmware: cs_dsp: Use strnlen() on name fields in V1 wmfw files
  ALSA: hda/realtek: Limit mic boost on VAIO PRO PX
  ALSA: hda: cs35l41: Fix swapped l/r audio channels for Lenovo ThinBook 13x Gen4
  ASoC: SOF: Intel: hda-pcm: Limit the maximum number of periods by MAX_BDL_ENTRIES
  ASoC: rt711-sdw: add missing readable registers
  ASoC: SOF: Intel: hda: fix null deref on system suspend entry
  ALSA: hda/realtek: add quirk for Clevo V5[46]0TU
  firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers
  firmware: cs_dsp: Validate payload length before processing block
  firmware: cs_dsp: Return error if block header overflows file
  firmware: cs_dsp: Fix overflow checking of wmfw header
2024-07-12 08:32:40 -07:00
Linus Torvalds
5d4c85134b Merge tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs fixes from Kent Overstreet:

 - revert the SLAB_ACCOUNT patch, something crazy is going on in memcg
   and someone forgot to test

 - minor fixes: missing rcu_read_lock(), scheduling while atomic (in an
   emergency shutdown path)

 - two lockdep fixes; these could have gone earlier, but were left to
   bake awhile

* tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: bch2_gc_btree() should not use btree_root_lock
  bcachefs: Set PF_MEMALLOC_NOFS when trans->locked
  bcachefs; Use trans_unlock_long() when waiting on allocator
  Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"
  bcachefs: fix scheduling while atomic in break_cycle()
  bcachefs: Fix RCU splat
2024-07-12 08:22:43 -07:00
David S. Miller
425652d45c Merge branch 'octeontx2-cpt-rss-cfg-fixes' into main
Srujana Challa says:

====================
Fixes for CPT and RSS configuration

This series of patches fixes various issues related to CPT
configuration and RSS configuration.

v1->v2:
- Excluded the patch "octeontx2-af: reduce cpt flt interrupt vectors for
  cn10kb" to submit it to net-next.
- Addressed the review comments.

Kiran Kumar K (1):
  octeontx2-af: Fix issue with IPv6 ext match for RSS
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:02 +01:00
Satheesh Paul
60795bbf04 octeontx2-af: fix issue with IPv4 match for RSS
While performing RSS based on IPv4, packets with
IPv4 options are not being considered. Adding changes
to match both plain IPv4 and IPv4 with option header.

Fixes: 41a7aa7b80 ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:01 +01:00
Kiran Kumar K
e23ac1095b octeontx2-af: fix issue with IPv6 ext match for RSS
While performing RSS based on IPv6, extension ltype
is not being considered. This will be problem for
fragmented packets or packets with extension header.
Adding changes to match IPv6 ext header along with IPv6
ltype.

Fixes: 41a7aa7b80 ("octeontx2-af: NIX Rx flowkey configuration for RSS")
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:01 +01:00
Michal Mazur
404dc0fd6f octeontx2-af: fix detection of IP layer
Checksum and length checks are not enabled for IPv4 header with
options and IPv6 with extension headers.
To fix this a change in enum npc_kpu_lc_ltype is required which will
allow adjustment of LTYPE_MASK to detect all types of IP headers.

Fixes: 21e6699e5c ("octeontx2-af: Add NPC KPU profile")
Signed-off-by: Michal Mazur <mmazur2@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:00 +01:00
Srujana Challa
845fe19139 octeontx2-af: fix a issue with cpt_lf_alloc mailbox
This patch fixes CPT_LF_ALLOC mailbox error due to
incompatible mailbox message format. Specifically, it
corrects the `blkaddr` field type from `int` to `u8`.

Fixes: de2854c87c ("octeontx2-af: Mailbox changes for 98xx CPT block")
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:00 +01:00
Nithin Dabilpuram
bc35e28af7 octeontx2-af: replace cpt slot with lf id on reg write
Replace slot id with global CPT lf id on reg read/write as
CPTPF/VF driver would send slot number instead of global
lf id in the reg offset. And also update the mailbox response
with the global lf's register offset.

Fixes: ae454086e3 ("octeontx2-af: add mailbox interface for CPT")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-12 13:42:00 +01:00
Christophe Leroy
6fba5cbd32 MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY
FREESCALE SOC DRIVERS has been orphaned since
commit eaac25d026 ("MAINTAINERS: Drop Li Yang as their email address
stopped working")
QUICC ENGINE LIBRARY has Qiang Zhao as maintainer but he hasn't
responded for years and when Li Yang was still maintaining FREESCALE
SOC DRIVERS he was also handling QUICC ENGINE LIBRARY directly.

As a maintainer of LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX, I
also need FREESCALE SOC DRIVERS to be actively maintained, so add
myself as maintainer of FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY.

See below link for more context.

Link: https://lore.kernel.org/linuxppc-dev/20240219153016.ntltc76bphwrv6hn@skbuf/T/#mf6d4a5eef79e8eae7ae0456a2794c01e630a6756
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-12 13:16:09 +02:00
Tony Lindgren
dfd168e74e MAINTAINERS: Add more maintainers for omaps
There are many generations of omaps to maintain, and I will be only active
as a hobbyist with time permitting. Let's add more maintainers to ensure
continued Linux support.

TI is interested in maintaining the active SoCs such as am3, am4 and
dra7. And the hobbyists are interested in maintaining some of the older
devices, mainly based on omap3 and 4 SoCs.

Kevin and Roger have agreed to maintain the active TI parts. Both Kevin
and Roger have been working on the omap variants for a long time, and
have a good understanding of the hardware.

Aaro and Andreas have agreed to maintain the community devices. Both Aaro
and Andreas have long experience on working with the earlier TI SoCs.

While at it, let's also change me to be a reviewer for the omap1, and
drop the link to my old omap web page.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-12 12:07:10 +02:00
Aleksandr Loktionov
8b9b59e27a i40e: fix: remove needless retries of NVM update
Remove wrong EIO to EGAIN conversion and pass all errors as is.

After commit 230f3d53a5 ("i40e: remove i40e_status"), which should only
replace F/W specific error codes with Linux kernel generic, all EIO errors
suddenly started to be converted into EAGAIN which leads nvmupdate to retry
until it timeouts and sometimes fails after more than 20 minutes in the
middle of NVM update, so NVM becomes corrupted.

The bug affects users only at the time when they try to update NVM, and
only F/W versions that generate errors while nvmupdate. For example, X710DA2
with 0x8000ECB7 F/W is affected, but there are probably more...

Command for reproduction is just NVM update:
 ./nvmupdate64

In the log instead of:
 i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM)
appears:
 i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM
 i40e: eeprom check failed (-5), Tx/Rx traffic disabled

The problematic code did silently convert EIO into EAGAIN which forced
nvmupdate to ignore EAGAIN error and retry the same operation until timeout.
That's why NVM update takes 20+ minutes to finish with the fail in the end.

Fixes: 230f3d53a5 ("i40e: remove i40e_status")
Co-developed-by: Kelvin Kang <kelvin.kang@intel.com>
Signed-off-by: Kelvin Kang <kelvin.kang@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 17:31:52 -07:00
Saeed Mahameed
503757c809 net: ethtool: Fix RSS setting
When user submits a rxfh set command without touching XFRM_SYM_XOR,
rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff.

Testing if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR &&
	    !ops->cap_rss_sym_xor_supported)
		return -EOPNOTSUPP;

Will always be true on devices that don't set cap_rss_sym_xor_supported,
since rxfh.input_xfrm & RXH_XFRM_SYM_XOR is always true, if input_xfrm
was not set, i.e RXH_XFRM_NO_CHANGE=0xff, which will result in failure
of any command that doesn't require any change of XFRM, e.g RSS context
or hash function changes.

To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE
before testing other conditions. Note that the problem will only trigger
with XFRM-aware userspace, old ethtool CLI would continue to work.

Fixes: 0dd415d155 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 17:21:08 -07:00
Kent Overstreet
1841027c7d bcachefs: bch2_gc_btree() should not use btree_root_lock
btree_root_lock is for the root keys in btree_root, not the pointers to
the nodes themselves; this fixes a lock ordering issue between
btree_root_lock and btree node locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11 20:10:55 -04:00
Kent Overstreet
f236ea4bca bcachefs: Set PF_MEMALLOC_NOFS when trans->locked
proper lock ordering is: fs_reclaim -> btree node locks

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11 20:10:55 -04:00
Kent Overstreet
f0f3e51148 bcachefs; Use trans_unlock_long() when waiting on allocator
not using unlock_long() blocks key cache reclaim, and the allocator may
take awhile

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11 20:10:55 -04:00
Kent Overstreet
aacd897d4d Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT"
This reverts commit 86d81ec5f5.

This wasn't tested with memcg enabled, it immediately hits a null ptr
deref in list_lru_add().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-11 20:01:38 -04:00
Linus Torvalds
43db1e03c0 Merge tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mikulas Patocka:

 - Fix broken discard for device mapper VDO target

* tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm vdo: replace max_discard_sectors with max_hw_discard_sectors
2024-07-11 15:11:14 -07:00
Bruce Johnston
d5cfecfe7f dm vdo: replace max_discard_sectors with max_hw_discard_sectors
Commit 4f563a6473 ("block: add a max_user_discard_sectors queue
limit") changed block core to set max_discard_sectors to:
min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors)

Commit 825d8bbd2f ("dm: always manage discard support in terms
of max_hw_discard_sectors") fixed most dm targetss to deal with
this, by replacing max_discard_sectors with max_hw_discard_sectors.
Unfortunately, dm-vdo did not get fixed at that time.

Fixes: 825d8bbd2f ("dm: always manage discard support in terms of max_hw_discard_sectors")
Signed-off-by: Bruce Johnston <bjohnsto@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2024-07-11 21:24:41 +02:00
Linus Torvalds
8a18fda0fe Merge tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "This fixes two regressions that have been bubbling along for a large
  part of this release.

  One is a revert of the multi mode support for the OMAP SPI controller,
  this introduced regressions on a number of systems and while there has
  been progress on fixing those we've not got something that works for
  everyone yet so let's just drop the change for now.

  The other is a series of fixes from David Lechner for his recent
  message optimisation work, this interacted badly with spi-mux which
  is altogether too clever with recursive use of the bus and creates
  situations that hadn't been considered.

  There are also a couple of small driver specific fixes, including one
  more patch from David for sleep duration calculations in the AXI
  driver"

* tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: mux: set ctlr->bits_per_word_mask
  spi: add defer_optimize_message controller flag
  spi: don't unoptimize message in spi_async()
  spi: omap2-mcspi: Revert multi mode support
  spi: davinci: Unset POWERDOWN bit when releasing resources
  spi: axi-spi-engine: fix sleep calculation
  spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices
2024-07-11 12:07:50 -07:00
Linus Torvalds
51df8e0cba Merge tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - core: fix rc7's __skb_datagram_iter() regression

  Current release - new code bugs:

   - eth: bnxt: fix crashes when reducing ring count with active RSS
     contexts

  Previous releases - regressions:

   - sched: fix UAF when resolving a clash

   - skmsg: skip zero length skb in sk_msg_recvmsg2

   - sunrpc: fix kernel free on connection failure in
     xs_tcp_setup_socket

   - tcp: avoid too many retransmit packets

   - tcp: fix incorrect undo caused by DSACK of TLP retransmit

   - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().

   - eth: ks8851: fix deadlock with the SPI chip variant

   - eth: i40e: fix XDP program unloading while removing the driver

  Previous releases - always broken:

   - bpf:
       - fix too early release of tcx_entry
       - fail bpf_timer_cancel when callback is being cancelled
       - bpf: fix order of args in call to bpf_map_kvcalloc

   - netfilter: nf_tables: prefer nft_chain_validate

   - ppp: reject claimed-as-LCP but actually malformed packets

   - wireguard: avoid unaligned 64-bit memory accesses"

* tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
  net/sched: Fix UAF when resolving a clash
  net: ks8851: Fix potential TX stall after interface reopen
  udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
  netfilter: nf_tables: prefer nft_chain_validate
  netfilter: nfnetlink_queue: drop bogus WARN_ON
  ethtool: netlink: do not return SQI value if link is down
  ppp: reject claimed-as-LCP but actually malformed packets
  selftests/bpf: Add timer lockup selftest
  net: ethernet: mtk-star-emac: set mac_managed_pm when probing
  e1000e: fix force smbus during suspend flow
  tcp: avoid too many retransmit packets
  bpf: Defer work in bpf_timer_cancel_and_free
  bpf: Fail bpf_timer_cancel when callback is being cancelled
  bpf: fix order of args in call to bpf_map_kvcalloc
  net: ethernet: lantiq_etop: fix double free in detach
  i40e: Fix XDP program unloading while removing the driver
  net: fix rc7's __skb_datagram_iter()
  net: ks8851: Fix deadlock with the SPI chip variant
  octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability()
  ...
2024-07-11 09:29:49 -07:00
Linus Torvalds
83ab4b461e Merge tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
 "cachefiles:

   - Export an existing and add a new cachefile helper to be used in
     filesystems to fix reference count bugs

   - Use the newly added fscache_ty_get_volume() helper to get a
     reference count on an fscache_volume to handle volumes that are
     about to be removed cleanly

   - After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN
     wait for all ongoing cookie lookups to complete and for the object
     count to reach zero

   - Propagate errors from vfs_getxattr() to avoid an infinite loop in
     cachefiles_check_volume_xattr() because it keeps seeing ESTALE

   - Don't send new requests when an object is dropped by raising
     CACHEFILES_ONDEMAND_OJBSTATE_DROPPING

   - Cancel all requests for an object that is about to be dropped

   - Wait for the ondemand_boject_worker to finish before dropping a
     cachefiles object to prevent use-after-free

   - Use cyclic allocation for message ids to better handle id recycling

   - Add missing lock protection when iterating through the xarray when
     polling

  netfs:

   - Use standard logging helpers for debug logging

  VFS:

   - Fix potential use-after-free in file locks during
     trace_posix_lock_inode(). The tracepoint could fire while another
     task raced it and freed the lock that was requested to be traced

   - Only increment the nr_dentry_negative counter for dentries that are
     present on the superblock LRU. Currently, DCACHE_LRU_LIST list is
     used to detect this case. However, the flag is also raised in
     combination with DCACHE_SHRINK_LIST to indicate that dentry->d_lru
     is used. So checking only DCACHE_LRU_LIST will lead to wrong
     nr_dentry_negative count. Fix the check to not count dentries that
     are on a shrink related list

  Misc:

   - hfsplus: fix an uninitialized value issue in copy_name

   - minix: fix minixfs_rename with HIGHMEM. It still uses kunmap() even
     though we switched it to kmap_local_page() a while ago"

* tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  minixfs: Fix minixfs_rename with HIGHMEM
  hfsplus: fix uninit-value in copy_name
  vfs: don't mod negative dentry count when on shrinker list
  filelock: fix potential use-after-free in posix_lock_inode
  cachefiles: add missing lock protection when polling
  cachefiles: cyclic allocation of msg_id to avoid reuse
  cachefiles: wait for ondemand_object_worker to finish when dropping object
  cachefiles: cancel all requests for the object that is being dropped
  cachefiles: stop sending new request when dropping object
  cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop
  cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
  cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
  netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume()
  netfs: Switch debug logging to pr_debug()
2024-07-11 09:03:28 -07:00
Bastien Curutchet
16198eef11 mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length
No check is done on the size of the data to be transmiited. This causes
a kernel panic when this size exceeds the sg_miter's length.

Limit the number of transmitted bytes to sgm->length.

Cc: stable@vger.kernel.org
Fixes: ed01d210fd ("mmc: davinci_mmc: Use sg_miter for PIO")
Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20240711081838.47256-2-bastien.curutchet@bootlin.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-07-11 17:48:54 +02:00
Adrian Hunter
63d20a94f2 mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE
blk_queue_max_segment_size() ensured:

	if (max_size < PAGE_SIZE)
		max_size = PAGE_SIZE;

whereas:

blk_validate_limits() makes it an error:

	if (WARN_ON_ONCE(lim->max_segment_size < PAGE_SIZE))
		return -EINVAL;

The change from one to the other, exposed sdhci which was setting maximum
segment size too low in some circumstances.

Fix the maximum segment size when it is too low.

Fixes: 616f876617 ("mmc: pass queue_limits to blk_mq_alloc_disk")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20240710180737.142504-1-adrian.hunter@intel.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-07-11 17:48:40 +02:00
Takashi Iwai
f19e1027f6 Merge tag 'asoc-fix-v6.10-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.10

A few fairly small fixes for ASoC, there's a relatively large set of
hardening changes for the cs_dsp firmware file parsing and a couple of
other small device specific fixes.
2024-07-11 17:11:50 +02:00
Filipe Manana
4484940514 btrfs: avoid races when tracking progress for extent map shrinking
We store the progress (root and inode numbers) of the extent map shrinker
in fs_info without any synchronization but we can have multiple tasks
calling into the shrinker during memory allocations when there's enough
memory pressure for example.

This can result in a task A reading fs_info->extent_map_shrinker_last_ino
after another task B updates it, and task A reading
fs_info->extent_map_shrinker_last_root before task B updates it, making
task A see an odd state that isn't necessarily harmful but may make it
skip certain inode ranges or do more work than necessary by going over
the same inodes again. These unprotected accesses would also trigger
warnings from tools like KCSAN.

So add a lock to protect access to these progress fields.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 16:50:54 +02:00
Filipe Manana
b3ebb9b7e9 btrfs: stop extent map shrinker if reschedule is needed
The extent map shrinker can be called in a variety of contexts where we
are under memory pressure, and of them is when a task is trying to
allocate memory. For this reason the shrinker is typically called with a
value of struct shrink_control::nr_to_scan that is much smaller than what
we return in the nr_cached_objects callback of struct super_operations
(fs/btrfs/super.c:btrfs_nr_cached_objects()), so that the shrinker does
not take a long time and cause high latencies. However we can still take
a lot of time in the shrinker even for a limited amount of nr_to_scan:

1) When traversing the red black tree that tracks open inodes in a root,
   as for example with millions of open inodes we get a deep tree which
   takes time searching for an inode;

2) Iterating over the extent map tree, which is a red black tree, of an
   inode when doing the rb_next() calls and when removing an extent map
   from the tree, since often that requires rebalancing the red black
   tree;

3) When trying to write lock an inode's extent map tree we may wait for a
   significant amount of time, because there's either another task about
   to do IO and searching for an extent map in the tree or inserting an
   extent map in the tree, and we can have thousands or even millions of
   extent maps for an inode. Furthermore, there can be concurrent calls
   to the shrinker so the lock might be busy simply because there is
   already another task shrinking extent maps for the same inode;

4) We often reschedule if we need to, which further increases latency.

So improve on this by stopping the extent map shrinking code whenever we
need to reschedule and make it skip an inode if we can't immediately lock
its extent map tree.

Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CABXGCsMmmb36ym8hVNGTiU8yfUS_cGvoUmGCcBrGWq9OxTrs+A@mail.gmail.com/
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 16:45:42 +02:00
Filipe Manana
68a3ebd18b btrfs: use delayed iput during extent map shrinking
When putting an inode during extent map shrinking we're doing a standard
iput() but that may take a long time in case the inode is dirty and we are
doing the final iput that triggers eviction - the VFS will have to wait
for writeback before calling the btrfs evict callback (see
fs/inode.c:evict()).

This slows down the task running the shrinker which may have been
triggered while updating some tree for example, meaning locks are held
as well as an open transaction handle.

Also if the iput() ends up triggering eviction and the inode has no links
anymore, then we trigger item truncation which requires flushing delayed
items, space reservation to start a transaction and that may trigger the
space reclaim task and wait for it, resulting in deadlocks in case the
reclaim task needs for example to commit a transaction and the shrinker
is being triggered from a path holding a transaction handle.

Syzbot reported such a case with the following stack traces:

   ======================================================
   WARNING: possible circular locking dependency detected
   6.10.0-rc2-syzkaller-00010-g2ab795141095 #0 Not tainted
   ------------------------------------------------------
   kswapd0/111 is trying to acquire lock:
   ffff88801eae4610 (sb_internal#3){.+.+}-{0:0}, at: btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275

   but task is already holding lock:
   ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -> #3 (fs_reclaim){+.+.}-{0:0}:
          __fs_reclaim_acquire mm/page_alloc.c:3783 [inline]
          fs_reclaim_acquire+0x102/0x160 mm/page_alloc.c:3797
          might_alloc include/linux/sched/mm.h:334 [inline]
          slab_pre_alloc_hook mm/slub.c:3890 [inline]
          slab_alloc_node mm/slub.c:3980 [inline]
          kmem_cache_alloc_lru_noprof+0x58/0x2f0 mm/slub.c:4019
          btrfs_alloc_inode+0x118/0xb20 fs/btrfs/inode.c:8411
          alloc_inode+0x5d/0x230 fs/inode.c:261
          iget5_locked fs/inode.c:1235 [inline]
          iget5_locked+0x1c9/0x2c0 fs/inode.c:1228
          btrfs_iget_locked fs/btrfs/inode.c:5590 [inline]
          btrfs_iget_path fs/btrfs/inode.c:5607 [inline]
          btrfs_iget+0xfb/0x230 fs/btrfs/inode.c:5636
          create_reloc_inode+0x403/0x820 fs/btrfs/relocation.c:3911
          btrfs_relocate_block_group+0x471/0xe60 fs/btrfs/relocation.c:4114
          btrfs_relocate_chunk+0x143/0x450 fs/btrfs/volumes.c:3373
          __btrfs_balance fs/btrfs/volumes.c:4157 [inline]
          btrfs_balance+0x211a/0x3f00 fs/btrfs/volumes.c:4534
          btrfs_ioctl_balance fs/btrfs/ioctl.c:3675 [inline]
          btrfs_ioctl+0x12ed/0x8290 fs/btrfs/ioctl.c:4742
          __do_compat_sys_ioctl+0x2c3/0x330 fs/ioctl.c:1007
          do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
          __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
          do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
          entry_SYSENTER_compat_after_hwframe+0x84/0x8e

   -> #2 (btrfs_trans_num_extwriters){++++}-{0:0}:
          join_transaction+0x164/0xf40 fs/btrfs/transaction.c:315
          start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700
          btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323
          btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999
          open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554
          btrfs_fill_super fs/btrfs/super.c:946 [inline]
          btrfs_get_tree_super fs/btrfs/super.c:1863 [inline]
          btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089
          vfs_get_tree+0x8f/0x380 fs/super.c:1780
          fc_mount+0x16/0xc0 fs/namespace.c:1125
          btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline]
          btrfs_get_tree+0xa53/0x1b90 fs/btrfs/super.c:2090
          vfs_get_tree+0x8f/0x380 fs/super.c:1780
          do_new_mount fs/namespace.c:3352 [inline]
          path_mount+0x6e1/0x1f10 fs/namespace.c:3679
          do_mount fs/namespace.c:3692 [inline]
          __do_sys_mount fs/namespace.c:3898 [inline]
          __se_sys_mount fs/namespace.c:3875 [inline]
          __ia32_sys_mount+0x295/0x320 fs/namespace.c:3875
          do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
          __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
          do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
          entry_SYSENTER_compat_after_hwframe+0x84/0x8e

   -> #1 (btrfs_trans_num_writers){++++}-{0:0}:
          join_transaction+0x148/0xf40 fs/btrfs/transaction.c:314
          start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700
          btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323
          btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999
          open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554
          btrfs_fill_super fs/btrfs/super.c:946 [inline]
          btrfs_get_tree_super fs/btrfs/super.c:1863 [inline]
          btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089
          vfs_get_tree+0x8f/0x380 fs/super.c:1780
          fc_mount+0x16/0xc0 fs/namespace.c:1125
          btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline]
          btrfs_get_tree+0xa53/0x1b90 fs/btrfs/super.c:2090
          vfs_get_tree+0x8f/0x380 fs/super.c:1780
          do_new_mount fs/namespace.c:3352 [inline]
          path_mount+0x6e1/0x1f10 fs/namespace.c:3679
          do_mount fs/namespace.c:3692 [inline]
          __do_sys_mount fs/namespace.c:3898 [inline]
          __se_sys_mount fs/namespace.c:3875 [inline]
          __ia32_sys_mount+0x295/0x320 fs/namespace.c:3875
          do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
          __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
          do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
          entry_SYSENTER_compat_after_hwframe+0x84/0x8e

   -> #0 (sb_internal#3){.+.+}-{0:0}:
          check_prev_add kernel/locking/lockdep.c:3134 [inline]
          check_prevs_add kernel/locking/lockdep.c:3253 [inline]
          validate_chain kernel/locking/lockdep.c:3869 [inline]
          __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
          lock_acquire kernel/locking/lockdep.c:5754 [inline]
          lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
          percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
          __sb_start_write include/linux/fs.h:1655 [inline]
          sb_start_intwrite include/linux/fs.h:1838 [inline]
          start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694
          btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275
          btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291
          evict+0x2ed/0x6c0 fs/inode.c:667
          iput_final fs/inode.c:1741 [inline]
          iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
          iput+0x5c/0x80 fs/inode.c:1757
          btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline]
          btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189
          super_cache_scan+0x409/0x550 fs/super.c:227
          do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
          shrink_slab+0x18a/0x1310 mm/shrinker.c:662
          shrink_one+0x493/0x7c0 mm/vmscan.c:4790
          shrink_many mm/vmscan.c:4851 [inline]
          lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951
          shrink_node mm/vmscan.c:5910 [inline]
          kswapd_shrink_node mm/vmscan.c:6720 [inline]
          balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911
          kswapd+0x5ea/0xbf0 mm/vmscan.c:7180
          kthread+0x2c1/0x3a0 kernel/kthread.c:389
          ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
          ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

   other info that might help us debug this:

   Chain exists of:
     sb_internal#3 --> btrfs_trans_num_extwriters --> fs_reclaim

    Possible unsafe locking scenario:

          CPU0                    CPU1
          ----                    ----
     lock(fs_reclaim);
                                  lock(btrfs_trans_num_extwriters);
                                  lock(fs_reclaim);
     rlock(sb_internal#3);

    *** DEADLOCK ***

   2 locks held by kswapd0/111:
    #0: ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924
    #1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_trylock_shared fs/super.c:562 [inline]
    #1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_cache_scan+0x96/0x550 fs/super.c:196

   stack backtrace:
   CPU: 0 PID: 111 Comm: kswapd0 Not tainted 6.10.0-rc2-syzkaller-00010-g2ab795141095 #0
   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
   Call Trace:
    <TASK>
    __dump_stack lib/dump_stack.c:88 [inline]
    dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
    check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2187
    check_prev_add kernel/locking/lockdep.c:3134 [inline]
    check_prevs_add kernel/locking/lockdep.c:3253 [inline]
    validate_chain kernel/locking/lockdep.c:3869 [inline]
    __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
    lock_acquire kernel/locking/lockdep.c:5754 [inline]
    lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
    percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
    __sb_start_write include/linux/fs.h:1655 [inline]
    sb_start_intwrite include/linux/fs.h:1838 [inline]
    start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694
    btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275
    btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291
    evict+0x2ed/0x6c0 fs/inode.c:667
    iput_final fs/inode.c:1741 [inline]
    iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
    iput+0x5c/0x80 fs/inode.c:1757
    btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline]
    btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189
    super_cache_scan+0x409/0x550 fs/super.c:227
    do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
    shrink_slab+0x18a/0x1310 mm/shrinker.c:662
    shrink_one+0x493/0x7c0 mm/vmscan.c:4790
    shrink_many mm/vmscan.c:4851 [inline]
    lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951
    shrink_node mm/vmscan.c:5910 [inline]
    kswapd_shrink_node mm/vmscan.c:6720 [inline]
    balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911
    kswapd+0x5ea/0xbf0 mm/vmscan.c:7180
    kthread+0x2c1/0x3a0 kernel/kthread.c:389
    ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
    ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
    </TASK>

So fix this by using btrfs_add_delayed_iput() so that the final iput is
delegated to the cleaner kthread.

Link: https://lore.kernel.org/linux-btrfs/000000000000892280061a344581@google.com/
Reported-by: syzbot+3dad89b3993a4b275e72@syzkaller.appspotmail.com
Fixes: 956a17d9d0 ("btrfs: add a shrinker for extent maps")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 16:45:18 +02:00