pipe_buffer might refer to a compound page (and contain more than a PAGE_SIZE
worth of data). Theoretically it had been possible since way back, but
nfsd_splice_actor() hadn't run into that until copy_page_to_iter() change.
Fortunately, the only thing that changes for compound pages is that we
need to stuff each relevant subpage in and convert the offset into offset
in the first subpage.
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: f0f6b614f8 "copy_page_to_iter(): don't split high-order page in case of ITER_PIPE"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The CPU uplink port on the KSZ9477 is P5 not P6 - fix this.
Fixes: 7899eb6cb1 ("arm64: dts: imx: Add i.MX8M Plus Gateworks gw7400 dts support")
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The CAN STBY poarlity is active-low. Specify it as such by removing the
'enable-active-high' property and updating the gpio property.
Fixes: 7899eb6cb1 ("arm64: dts: imx: Add i.MX8M Plus Gateworks gw7400 dts support")
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The width and height arguments in the cmdq packet for mtk_dither_config()
are inverted. We fix the incorrect width and height for dither settings
in mtk_dither_config().
Fixes: 73d3724745 ("drm/mediatek: Adjust to the alphabetic order for mediatek-drm")
Co-developed-by: Yongqiang Niu <yongqiang.niu@mediatek.com>
Signed-off-by: Yongqiang Niu <yongqiang.niu@mediatek.com>
Signed-off-by: Allen-KH Cheng <allen-kh.cheng@mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20220908141205.18256-1-allen-kh.cheng@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Kernel bugzilla: 216301
When doing direct writes we need to also invalidate the mapping in case
we have a cached copy of the affected page(s) in memory or else
subsequent reads of the data might return the old/stale content
before we wrote an update to the server.
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
LoongArch irqchips have a fixed hierarchy which currently can't be
described by ACPI tables, so upstream irqchip drivers call downstream
irqchip drivers' initialization directly. As a result, the top level
(CPU-level) irqchip driver should explicitly select downstream drivers
to avoid build errors.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220808085319.3350111-1-chenhuacai@loongson.cn
Running a PREEMPT_RT kernel based on v5.19-rc3-rt4 on an Ampere Altra
triggers:
[ 22.616229] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
[ 22.616239] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1884, name: kworker/80:1
[ 22.616243] preempt_count: 3, expected: 0
[ 22.616244] RCU nest depth: 0, expected: 0
[...]
[ 22.616250] hardirqs last enabled at (33): _raw_spin_unlock_irq (/home/piegon01/linux/./arch/arm64/include/asm/irqflags.h:35)
[ 22.616273] hardirqs last disabled at (34): __schedule (/home/piegon01/linux/kernel/sched/core.c:6432 (discriminator 1))
[ 22.616283] softirqs last enabled at (0): copy_process (/home/piegon01/linux/./include/linux/lockdep.h:191)
[ 22.616297] softirqs last disabled at (0): 0x0
[ 22.616305] Preemption disabled at:
[ 22.616307] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1612)
[ 22.616322] CPU: 80 PID: 1884 Comm: kworker/80:1 Tainted: G W [...]
[ 22.616328] Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18
[ 22.616333] Workqueue: events work_for_cpu_fn
[ 22.616344] Call trace:
[...]
[ 22.616403] alloc_cpumask_var_node (/home/piegon01/linux/lib/cpumask.c:115)
[ 22.616414] alloc_cpumask_var (/home/piegon01/linux/lib/cpumask.c:147)
[ 22.616417] its_select_cpu (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1580)
[ 22.616428] its_set_affinity (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1659)
[ 22.616431] msi_domain_set_affinity (/home/piegon01/linux/kernel/irq/msi.c:501)
[ 22.616440] irq_do_set_affinity (/home/piegon01/linux/kernel/irq/manage.c:276)
[ 22.616443] irq_setup_affinity (/home/piegon01/linux/kernel/irq/manage.c:633)
[ 22.616447] irq_startup (/home/piegon01/linux/kernel/irq/chip.c:280)
[ 22.616453] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1777)
Follow the pattern established in commit cba4235e60 ("genirq: Remove
mask argument from setup_affinity()") and co to overcome this issue by
defining a static struct cpumask and protecting it by a raw spinlock.
Since its_select_cpu() can be executed with IRQs enabled or disabled,
enforce that the cpumask computation is done with interrupts disabled.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220912141857.1391343-1-pierre.gondois@arm.com
The field drv_data is assigned during driver's probe, where it's
already checked to be not NULL.
Remove the always false check '!host_data->drv_data'.
This fixes a warning "variable dereferenced before check" detected
by '0-DAY CI Kernel Test Service'.
Fixes: c297493336 ("irqchip/stm32-exti: Simplify irq description table")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/lkml/202208131739.gJvcs9ls-lkp@intel.com/
Signed-off-by: Antonio Borneo <antonio.borneo@foss.st.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220817125758.5975-1-antonio.borneo@foss.st.com
fix edp clock on Gru, fix usb port on BPI-R2-Pro, fix license typo,
fix wlan-wake-pin on Gru-Bob and lower the sd-card speed on Quartz64-B
-----BEGIN PGP SIGNATURE-----
iQFEBAABCAAuFiEE7v+35S2Q1vLNA3Lx86Z5yZzRHYEFAmMb4fQQHGhlaWtvQHNu
dGVjaC5kZQAKCRDzpnnJnNEdgbZ6B/4j5Up0eubn3kxWt1Mn5XSYXoPOHH+dgwxu
CkR0ZijvvTmdHjYCSg7Iy+ZNreowSUJkbjES6iEGVZbflopAcss6hxvwTM+Y7GU0
Q6+ZCKDqpDr23TZZ7VMzkOwQIYEyV3FKZ1vZfhdAk6/WNpdkHl9NVBrWcfSmJHY4
Mh32a7YIuaWhLI3MKTt5/umdGRzqMtvxhKfYlLKOrRc3ePoTIXVTWGdx0OzOFvCT
B9cWCq5YbVb+0VTbS5yHLG9i6ssOsMlYpoyl+m6mcqKu7CDjsNVwucc/keTIIavD
NeaJY7wkRdvjsd+0MF3U63bVSCcuslnLlGsGt5NTREpLvIuHHYRH
=DVTd
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmMfRJoACgkQmmx57+YA
GNnZ/A//ZHE9W/+pPxXUkFW81BWTaVbqsGMd7S74+0WK9kdQGejdnovkj2HUKOc2
m3f7K36qsfhFMZsgdn9o1QO01wM7moPa23/T99yd1i7Ja/y4r0oTNBC1Bn6VlGu4
3NWGptaOZhdn4WHKxFjVZP/U4Mf67Ze5R62YsnfvjSsMDKysbUzJJFhzP/MkyJEP
RmJwcFiVAIGKI6LTRqynYx2PkcH/wanjMDOsPol0gpRxBSdGrIMTLBDqnp+qpbbp
9jXkHE2kU+Eb64ZdeR0wkf34bkHYrxExgP7tjN6hrbGbbXspnX5VfRcKpUeU5vfJ
FStzsKgEN0Lq4ssH5qBnoNCAxLoihGwz4UT5jn+bgCsLnJpSmc8XQRfoHCbJejK9
+lvG1q7RjVJhX7ekmn1gSOw7ltxebk9hx3lV6T8rTilg25HXM+aATuiDTyckD1v4
Oi8hxw+NS8p0XRXAe9p6lsxJYXlsIpEXIWoXC9SYgzF2iqLpHwFHEvfpyBbVltw5
QtPG1fhq9a3/At3GEjYg/rSYq/VDrBu1DwVfshy6DpyqeUB30Mdrj3uVpH2ZPH5L
Uv0/Wp+F4WynXxn14qmuLiIXBsxTtb1oLZifUCMmrb6w8eeGudu7TN6olT0+ynVl
L7ks2Kah3Nf9p2JshEVBI76xQnrU4+RDCN/LfRs9KN2P6WemcX4=
=6QKc
-----END PGP SIGNATURE-----
Merge tag 'v6.0-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes
Drop some not-specified properties, fix phy-supply properties,
fix edp clock on Gru, fix usb port on BPI-R2-Pro, fix license typo,
fix wlan-wake-pin on Gru-Bob and lower the sd-card speed on Quartz64-B
* tag 'v6.0-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm64: dts: rockchip: Remove 'enable-active-low' from rk3566-quartz64-a
arm64: dts: rockchip: Remove 'enable-active-low' from rk3399-puma
arm64: dts: rockchip: fix property for usb2 phy supply on rk3568-evb1-v10
arm64: dts: rockchip: fix property for usb2 phy supply on rock-3a
arm64: dts: rockchip: Set RK3399-Gru PCLK_EDP to 24 MHz
arm64: dts: rockchip: fix upper usb port on BPI-R2-Pro
arm64: dts: rockchip: Fix typo in lisense text for PX30.Core
arm64: dts: rockchip: Pull up wlan wake# on Gru-Bob
arm64: dts: rockchip: Lower sd speed on quartz64-b
Link: https://lore.kernel.org/r/2645885.mvXUDI8C0e@phil
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The worker is canceled in gt_park path, but earlier it was assumed that
gt_park path cannot sleep and the cancel is asynchronous. This caused a
race with suspend flow where the worker runs after suspend and causes an
unclaimed register access warning. Cancel the worker synchronously since
the gt_park is indeed allowed to sleep.
v2: Fix author name and sign-off mismatch
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4419
Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220827002135.139349-1-umesh.nerlige.ramappa@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit 31335aa8e0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fix regression introduced by commit:
"drm/i915: Individualize fences before adding to dma_resv obj"
which sets obj->read_domains to 0 for both read and write paths.
Also set obj->write_domain to 0 on read path which was removed by
the commit.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/6639
Fixes: 420a07b841 ("drm/i915: Individualize fences before adding to dma_resv obj")
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.16+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907172641.12555-1-nirmoy.das@intel.com
(cherry picked from commit 04f7eb3d45)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Currently, pic_height of vdsc_cfg structure is being used to calculate
slice_height, before it is set for DP.
So taking out the lines to set pic_height from the helper
intel_dp_dsc_compute_params() to individual encoders, and setting
pic_height, before it is used to calculate slice_height for DP.
Fixes: 5a6d866f8e ("drm/i915: Get slice height before computing rc params")
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220902103219.1168781-1-ankit.k.nautiyal@intel.com
(cherry picked from commit e72df53dcb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
imx8mm-tqma8mqml.dtsi has PCIe support, so it should include
<dt-bindings/phy/phy-imx8-pcie.h>.
Otherwise, there are build errors when this SoM dtsi is included
on customers' carrier boards.
While at it, remove the PCI header from imx8mm-tqma8mqml-mba8mx.dts,
which is now unneeded.
Fixes: 1d84283101 ("arm64: dts: tqma8mqml: add PCIe support")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Some users have reported being unable to connect to MT76x0 APs running mt76
after a commit enabling the VHT extneded NSS BW feature.
Fix this regression by ensuring that this feature only gets enabled on drivers
that support it
Cc: stable@vger.kernel.org
Fixes: d9fcfc1424 ("mt76: enable the VHT extended NSS BW feature")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220907095228.82072-1-nbd@nbd.name
The code was accidentally shifting register values down by tid % 32 instead of
(tid * field_size) % 32.
Cc: stable@vger.kernel.org
Fixes: a28bef561a ("mt76: mt7615: re-enable offloading of sequence number assignment")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220826182329.18155-1-nbd@nbd.name
The iwlmei driver breaks iwlwifi when returning from suspend. The interface
ends up in the 'down' state after coming back from suspend. And iwd doesn't
touch the interface state, but wpa_supplicant does, so the bug only happens on
iwd.
The bug report[0] has been open for four months now, and no fix seems to be
forthcoming. Since just disabling the iwlmei driver works as a workaround,
let's mark the config option as broken until it can be fixed properly.
[0] https://bugzilla.kernel.org/show_bug.cgi?id=215937
Fixes: 2da4366f9e ("iwlwifi: mei: add the driver to allow cooperation with CSME")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220907134450.1183045-1-toke@toke.dk
Commit 566e373fe0 ("arm64: Kconfig.platforms: Group NXP platforms
together") introduced a new symbol ARCH_NXP and made ARCH_LAYERSCAPE
(among others) depend on it, but didn't enable it in the defconfig.
Thus, now the defconfig doesn't include support for any NXP
architectures anymore. Fix it.
Fixes: 566e373fe0 ("arm64: Kconfig.platforms: Group NXP platforms together")
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Although the RTC is on the module, the RTC_EVENT# signal is connected
on the mainboard. Already set by bootloader, but make it explicit in Linux
as well.
Fixes: 418d1d840e ("arm64: dts: freescale: add initial device tree for TQMa8MPQL with i.MX8MP")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Smatch checker complains that 'secretmem_mnt' dereferencing possible
ERR_PTR(). Let the function return if 'secretmem_mnt' is ERR_PTR, to
avoid deferencing it.
Link: https://lkml.kernel.org/r/20220904074647.GA64291@cloud-MacBookPro
Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Binyi Han <dantengknight@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foudation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
These two predicates are the same for file pages, but are not the same for
anonymous pages.
Link: https://lkml.kernel.org/r/20220902192639.1737108-3-willy@infradead.org
Fixes: 07f67a8ded ("mm/vmscan: convert shrink_active_list() to use a folio")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "Folio fixes for 6.0".
This patch (of 2):
The recent folio conversion changed the VM_BUG_ON() to dump the folio
we're storing instead of the entry we retrieved. This was a mistake;
the entry we retrieved is the more interesting page to dump.
Link: https://lkml.kernel.org/r/20220902192639.1737108-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20220902192639.1737108-2-willy@infradead.org
Fixes: ceff9d3354 ("mm/swap: convert __delete_from_swap_cache() to a folio")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When gfp_types.h was split from gfp.h, it broke the radix test suite. Fix
the test suite by using gfp_types.h in the tools gfp.h header.
Link: https://lkml.kernel.org/r/20220902191923.1735933-1-willy@infradead.org
Fixes: cb5a065b4e (headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>)
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. Fix this up by properly calling
dput().
Link: https://lkml.kernel.org/r/20220902191149.112434-1-sj@kernel.org
Fixes: 75c1c2b53c ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that
installs migration entries directly if it can lock the migrating page.
When removing a dirty pte the dirty bit is supposed to be carried over to
the underlying page to prevent it being lost.
Currently migrate_vma_*() can only be used for private anonymous mappings.
That means loss of the dirty bit usually doesn't result in data loss
because these pages are typically not file-backed. However pages may be
backed by swap storage which can result in data loss if an attempt is made
to migrate a dirty page that doesn't yet have the PageDirty flag set.
In this case migration will fail due to unexpected references but the
dirty pte bit will be lost. If the page is subsequently reclaimed data
won't be written back to swap storage as it is considered uptodate,
resulting in data loss if the page is subsequently accessed.
Prevent this by copying the dirty bit to the page when removing the pte to
match what try_to_migrate_one() does.
Link: https://lkml.kernel.org/r/dd48e4882ce859c295c1a77612f66d198b0403f9.1662078528.git-series.apopple@nvidia.com
Fixes: 8c3328f1f3 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reported-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently we only call flush_cache_page() for the anon_exclusive case,
however in both cases we clear the pte so should flush the cache.
Link: https://lkml.kernel.org/r/5676f30436ab71d1a587ac73f835ed8bd2113ff5.1662078528.git-series.apopple@nvidia.com
Fixes: 8c3328f1f3 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When clearing a PTE the TLB should be flushed whilst still holding the PTL
to avoid a potential race with madvise/munmap/etc. For example consider
the following sequence:
CPU0 CPU1
---- ----
migrate_vma_collect_pmd()
pte_unmap_unlock()
madvise(MADV_DONTNEED)
-> zap_pte_range()
pte_offset_map_lock()
[ PTE not present, TLB not flushed ]
pte_unmap_unlock()
[ page is still accessible via stale TLB ]
flush_tlb_range()
In this case the page may still be accessed via the stale TLB entry after
madvise returns. Fix this by flushing the TLB while holding the PTL.
Fixes: 8c3328f1f3 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Link: https://lkml.kernel.org/r/9f801e9d8d830408f2ca27821f606e09aa856899.1662078528.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 4867fbbdd6 ("x86/mm: move protection_map[] inside the platform")
moved accesses to protection_map[] from mem_encrypt_amd.c to pgprot.c. As
a result, the accesses are now targets of KASAN (and other
instrumentations), leading to the crash during the boot process.
Disable the instrumentations for pgprot.c like commit 67bb8e999e
("x86/mm: Disable various instrumentations of mm/mem_encrypt.c and
mm/tlb.c").
Before this patch, my AMD machine cannot boot since v6.0-rc1 with KASAN
enabled, without anything printed. After the change, it successfully
boots up.
Fixes: 4867fbbdd6 ("x86/mm: move protection_map[] inside the platform")
Link: https://lkml.kernel.org/r/20220824084726.2174758-1-naohiro.aota@wdc.com
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In the case where a filesystem is polled to take over the memory failure
and receives -EOPNOTSUPP it indicates that page->index and page->mapping
are valid for reverse mapping the failure address. Introduce
FSDAX_INVALID_PGOFF to distinguish when add_to_kill() is being called from
mf_dax_kill_procs() by a filesytem vs the typical memory_failure() path.
Otherwise, vma_pgoff_address() is called with an invalid fsdax_pgoff which
then trips this failing signature:
kernel BUG at mm/memory-failure.c:319!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
CPU: 13 PID: 1262 Comm: dax-pmd Tainted: G OE N 6.0.0-rc2+ #62
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:add_to_kill.cold+0x19d/0x209
[..]
Call Trace:
<TASK>
collect_procs.part.0+0x2c4/0x460
memory_failure+0x71b/0xba0
? _printk+0x58/0x73
do_madvise.part.0.cold+0xaf/0xc5
Link: https://lkml.kernel.org/r/166153429427.2758201.14605968329933175594.stgit@dwillia2-xfh.jf.intel.com
Fixes: c36e202495 ("mm: introduce mf_dax_kill_procs() for fsdax case")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The SB_BORN flag is stored in the vfs superblock, not xfs_sb.
Link: https://lkml.kernel.org/r/166153428094.2758201.7936572520826540019.stgit@dwillia2-xfh.jf.intel.com
Fixes: 6f643c57d5 ("xfs: implement ->notify_failure() for XFS")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm, xfs, dax: Fixes for memory_failure() handling".
I failed to run the memory error injection section of the ndctl test suite
on linux-next prior to the merge window and as a result some bugs were
missed. While the new enabling targeted reflink enabled XFS filesystems
the bugs cropped up in the surrounding cases of DAX error injection on
ext4-fsdax and device-dax.
One new assumption / clarification in this set is the notion that if a
filesystem's ->notify_failure() handler returns -EOPNOTSUPP, then it must
be the case that the fsdax usage of page->index and page->mapping are
valid. I am fairly certain this is true for xfs_dax_notify_failure(), but
would appreciate another set of eyes.
This patch (of 4):
XFS always registers dax_holder_operations regardless of whether the
filesystem is capable of handling the notifications. The expectation is
that if the notify_failure handler cannot run then there are no scenarios
where it needs to run. In other words the expected semantic is that
page->index and page->mapping are valid for memory_failure() when the
conditions that cause -EOPNOTSUPP in xfs_dax_notify_failure() are present.
A fallback to the generic memory_failure() path is expected so do not warn
when that happens.
Link: https://lkml.kernel.org/r/166153426798.2758201.15108211981034512993.stgit@dwillia2-xfh.jf.intel.com
Link: https://lkml.kernel.org/r/166153427440.2758201.6709480562966161512.stgit@dwillia2-xfh.jf.intel.com
Fixes: 6f643c57d5 ("xfs: implement ->notify_failure() for XFS")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Daly reported the following problem;
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - before offline operation
[0] - ZONE_MOVABLE
[1] - ZONE_NORMAL
[2] - NULL
For a GFP_KERNEL allocation, alloc_pages_slowpath() will save the
offset of ZONE_NORMAL in ac->preferred_zoneref. If a concurrent
memory_offline operation removes the last page from ZONE_MOVABLE,
build_all_zonelists() & build_zonerefs_node() will update
node_zonelists as shown below. Only populated zones are added.
NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - after offline operation
[0] - ZONE_NORMAL
[1] - NULL
[2] - NULL
The race is simple -- page allocation could be in progress when a memory
hot-remove operation triggers a zonelist rebuild that removes zones. The
allocation request will still have a valid ac->preferred_zoneref that is
now pointing to NULL and triggers an OOM kill.
This problem probably always existed but may be slightly easier to trigger
due to 6aa303defb ("mm, vmscan: only allocate and reclaim from zones
with pages managed by the buddy allocator") which distinguishes between
zones that are completely unpopulated versus zones that have valid pages
not managed by the buddy allocator (e.g. reserved, memblock, ballooning
etc). Memory hotplug had multiple stages with timing considerations
around managed/present page updates, the zonelist rebuild and the zone
span updates. As David Hildenbrand puts it
memory offlining adjusts managed+present pages of the zone
essentially in one go. If after the adjustments, the zone is no
longer populated (present==0), we rebuild the zone lists.
Once that's done, we try shrinking the zone (start+spanned
pages) -- which results in zone_start_pfn == 0 if there are no
more pages. That happens *after* rebuilding the zonelists via
remove_pfn_range_from_zone().
The only requirement to fix the race is that a page allocation request
identifies when a zonelist rebuild has happened since the allocation
request started and no page has yet been allocated. Use a seqlock_t to
track zonelist updates with a lockless read-side of the zonelist and
protecting the rebuild and update of the counter with a spinlock.
[akpm@linux-foundation.org: make zonelist_update_seq static]
Link: https://lkml.kernel.org/r/20220824110900.vh674ltxmzb3proq@techsingularity.net
Fixes: 6aa303defb ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Patrick Daly <quic_pdaly@quicinc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Remove unused scripts/gcc-ld script
- Add zstd support to scripts/extract-ikconfig
- Check 'make headers' for UML
- Fix scripts/mksysmap to ignore local symbols
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmMd0iwVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGxpkP/ivD5efMSqNKGU1gRLsWXlTdF28k
x1UDLo2q43ylzNemE8NARnDHZhcEp4u/92U7Iwkc5tc+MgMCSRPr1klcdPf5PgwN
GSXHIaWgoN3wn2/8BgFhM6UUddqCkKnGDItfsKumQn70Q5KH1n1ht7Cei5KDI5nY
YmPWaIaKY9eILOTLAVo0UcyoGgX8s9gQyZ8oQr7zRAQjYhjTrb0C9B5aym5PeKOF
MTUjZpJOMUZcYVEv3y+4X9Dwxlx4Tj3PggijBPIRc/O8AagWRBju3GY5jpKbuR/q
vavdhBVe+1Obo9qzh1/sioSvbRdortr8xiwNFhlYelsr5JIasGPjXEZVElRJhwql
Dh+g03GdSTnBwJEWrca7ArbWJ43ODVJyhQggSmhaPj8MYsUKcbL6OK1wq1PZWHxn
lDDriBJ9zYFIefzZnMnMn6uDzoKBn6eErI4svPtCDoNe6YooJQpDwFTgD/P8JKmW
ektGEeee5OUbKCNyxtBhMIYgIhIdbdtwPSTwaq3+Kwc+PXRD+9Ohofswq/cu6rkT
0lA9o+zflMLzqK6TRS35In2cJkgGGVas4UVrjR2vLuLHtwleDXKqeLo4oWyp3xno
voaNUtT2IYCcX/v7f80OTw4i2B3AbdxK6g0+75VbsONxEsx+RWUM4URJJYiFoTPt
6B3329tK59GNtFF+
=FJba
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Remove unused scripts/gcc-ld script
- Add zstd support to scripts/extract-ikconfig
- Check 'make headers' for UML
- Fix scripts/mksysmap to ignore local symbols
* tag 'kbuild-fixes-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
mksysmap: Fix the mismatch of 'L0' symbols in System.map
kbuild: disable header exports for UML in a straightforward way
scripts/extract-ikconfig: add zstd compression support
scripts: remove obsolete gcc-ld script
- Disable in-kernel BTI when compiling with GCC, as it makes invalid
assumptions about the distance between functions which has led to
crashes when calling modules on a CPU with BTI support.
- Remove bogus TIF_SME flag management if memory allocation fails in the
ptrace code.
- Fix the resume path when configured for 52-bit virtual addressing.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmMcmCUQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNJEPB/9hhMNrX0yAc64otI99DRBuuZjlGWm5tLVj
A5dByt0bvca6XIE/5nZpduYcSGBORL4Ss29mbYteWlaU89TlLEHaedOpDTX0xj5g
61H8UiK0YDX9w/7hbPZw7Uxdvw30LJSxszWZ4Ucst9CYTq+mMbbhgf8h7/BDmBD+
OBpkOlouW4fSw15/UlbgBu6UTAVkbWNcWxzwzOVAdqFYzQ75eMnmOEkEZ8b/YBAA
wZ+0kFmpjGkWymZKlgTxLfQ3eVdKwlrFzBliDtbFoFNuWyyDQ2e1qDQKFZvST4ul
NMVlF19K9NKGVlxl2++rafbg5BQ+Y+0RJ2Hujf72W6sPaThJUDM2
=7sa1
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Three small arm64 fixes, all related to optional architecture
extensions: BTI, SME and 52-bit virtual addressing:
- Disable in-kernel BTI when compiling with GCC, as it makes invalid
assumptions about the distance between functions which has led to
crashes when calling modules on a CPU with BTI support
- Remove bogus TIF_SME flag management if memory allocation fails in
the ptrace code
- Fix the resume path when configured for 52-bit virtual addressing"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix resume for 52-bit enabled builds
arm64/ptrace: Don't clear calling process' TIF_SME on OOM
arm64/bti: Disable in kernel BTI when cross section thunks are broken
Including:
- Intel VT-d fixes from Lu Baolu
- Boot kdump kernels with VT-d scalable mode on
- Calculate the right page table levels
- Fix two recursive locking issues
- Fix a lockdep splat issue
- AMD IOMMU fixes:
- Fix for completion-wait command to use full 64 bits of data
- Fix PASID related issue where GPU sound devices failed to
initialize
- Fix for Virtio-IOMMU to report correct caching behavior, needed for
use with VFIO
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmMdjj8ACgkQK/BELZcB
GuNn8hAAv/7ff2sINSHMswNWI+i2Ik/vdWr+J5OaS81I3IIgs1Exw/0LiwSRUSf4
UHPYInkpBe2Chmaj14K75dA39GHj/0DJDz4DyVby8p4VDwT2/+8NNhYMUv22+mT1
E+1cm2ZdjcYGQVzK18oisK/w4HqMLRrLP7KUiP9/ZeIYbsakHf9kBVmcxEDQNX/A
OM0Baa/DMiHkMPVJjj5Fv7hXrsRpQgQnw30/mwi5DsGM58+gepaSJLhvALtwY/h9
njxdag019dUyrywdIgrOTbfx5ZAGEM8337W4Yl+u8xnYcJvC7sWVmcMtqzUnkx2J
LcV2Ejwnma++cN35waYYtIxCQ8VUHIvAg2Ub67O6k3nVwE7T4cNuLJRFft7fgM/u
AzRl2T7QX9htJ4fmZ+GPY/1NaEr/otRIi/KZ8yKlgs1k2OMX3unyUa8uxXeVHFCo
KaxbiQymVo29Y/YeU5KE5bBI28ozTCEyJoxbSrccEbAwgR8nnLe7iJST8OSP8yzK
STkPEt+aUOcbYeaqktcW3qSpXiwaQYcq5uOAiw2dFJWEDSUY9hU6J8XDccbzQd/o
fqLcebryXkQMSwq60up2D1sG4/V1NfvwukAApGczz3Vh9nmwWhC+h0tShSPfGUBa
Zd4ORR3L8L7o5Uk88u2/Rn09vwyKjGHl6Oz4GKZW1xJFeaEeFAw=
=+h6p
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Intel VT-d fixes from Lu Baolu:
- Boot kdump kernels with VT-d scalable mode on
- Calculate the right page table levels
- Fix two recursive locking issues
- Fix a lockdep splat issue
- AMD IOMMU fixes:
- Fix for completion-wait command to use full 64 bits of data
- Fix PASID related issue where GPU sound devices failed to
initialize
- Fix for Virtio-IOMMU to report correct caching behavior, needed for
use with VFIO
* tag 'iommu-fixes-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Fix false ownership failure on AMD systems with PASID activated
iommu/vt-d: Fix possible recursive locking in intel_iommu_init()
iommu/virtio: Fix interaction with VFIO
iommu/vt-d: Fix lockdep splat due to klist iteration in atomic context
iommu/vt-d: Fix recursive lock issue in iommu_flush_dev_iotlb()
iommu/vt-d: Correctly calculate sagaw value of IOMMU
iommu/vt-d: Fix kdump kernels boot failure with scalable mode
iommu/amd: use full 64-bit value in build_completion_wait()
- fix for octeon irq setup problem
- fix compiler warning for new CONFIG option
- switch to SPARSEMEM_EXTREME for all platforms selecting SPARSEMEM
-----BEGIN PGP SIGNATURE-----
iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmMdE8EaHHRzYm9nZW5k
QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHAGKw//XrnVQLw+GR1MAXkWuc6z
uhiWHuBKY+7TYzRywf9qdoHp7X38nrWQzGoOHqYeZxV3cnO/eTzoczkyRJWttFdY
Av3LZejy0eANT/489irR5Xin19kHkEm+LUlyZDnRdb4kT311IaF9OghMue1uWmzF
7KRr7Pp3GhcfyrJCKEQ1RSnbydkfsZZ7kVkOOr35m7HFsyb2te/Pll5BaaZLp8Fv
ZPVjqmWNa2bZKVCOiZzNZBCxKIvX4VNA074rN8et9/sRBWELrE0ffNpRxt/ETz6r
KOyYJ6dSqHQGk5OQRthcCWqarnGe4zPlwT0d2eBRlvP3Hz/wWRNMlsAMPCgTGN6d
jX6IVTujYQBNrCyhKWMF7daXq92mKm7D0egMsyo1a2pnNrBZtmjtMsyw+xHsOXAo
blPGiazkx+lFJ/0+GsGWph0taymklez7le145RTcBfyJzkcDVnmnii7IueD2xN3H
/PLZRUH0T1Kt4xSsuLeBGpRP3F0IT35kyoocPOHE97AKCa2AVgaOVRhLDW99GGOT
Y8Vmk5h+WTDR/c6fRdC0cAjiEfkumkBnGr0lfWA7k8/xI1Wj58Oiw4e/dETcP4bg
+PMzSLiVMdJjUkKDmPKfgy120t6fFDlQ4lptSkRr0T+sB39tlVK1BWGEs6YEiZPg
BUUnQg7mBp11HRiZK+l5Gnk=
=djqb
-----END PGP SIGNATURE-----
Merge tag 'mips-fixes_6.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix for loongson32 starup hang
- fix for octeon irq setup problem
- fix compiler warning for new CONFIG option
- switch to SPARSEMEM_EXTREME for all platforms selecting SPARSEMEM
* tag 'mips-fixes_6.0_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
mips: Select SPARSEMEM_EXTREME
MIPS: OCTEON: irq: Fix octeon_irq_force_ciu_mapping()
MIPS: octeon: Get rid of preprocessor directives around RESERVE32
MIPS: loongson32: ls1c: Fix hang during startup
The AMD IOMMU driver cannot activate PASID mode on a RID without the RID's
translation being set to IDENTITY. Further it requires changing the RID's
page table layout from the normal v1 IOMMU_DOMAIN_IDENTITY layout to a
different v2 layout.
It does this by creating a new iommu_domain, configuring that domain for
v2 identity operation and then attaching it to the group, from within the
driver. This logic assumes the group is already set to the IDENTITY domain
and is being used by the DMA API.
However, since the ownership logic is based on the group's domain pointer
equaling the default domain to detect DMA API ownership, this causes it to
look like the group is not attached to the DMA API any more. This blocks
attaching drivers to any other devices in the group.
In a real system this manifests itself as the HD-audio devices on some AMD
platforms losing their device drivers.
Work around this unique behavior of the AMD driver by checking for
equality of IDENTITY domains based on their type, not their pointer
value. This allows the AMD driver to have two IDENTITY domains for
internal purposes without breaking the check.
Have the AMD driver properly declare that the special domain it created is
actually an IDENTITY domain.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: stable@vger.kernel.org
Fixes: 512881eacf ("bus: platform,amba,fsl-mc,PCI: Add device DMA ownership management")
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/0-v1-ea566e16b06b+811-amd_owner_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.
The dmar_global_lock used in the intel_iommu_init() might cause recursive
locking issue, for example, intel_iommu_get_resv_regions() is taking the
dmar_global_lock from within a section where intel_iommu_init() already
holds it via probe_acpi_namespace_devices().
Using dmar_global_lock in intel_iommu_init() could be relaxed since it is
unlikely that any IO board must be hot added before the IOMMU subsystem is
initialized. This eliminates the possible recursive locking issue by moving
down DMAR hotplug support after the IOMMU is initialized and removing the
uses of dmar_global_lock in intel_iommu_init().
Fixes: d5692d4af0 ("iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/894db0ccae854b35c73814485569b634237b5538.1657034828.git.robin.murphy@arm.com
Link: https://lore.kernel.org/r/20220718235325.3952426-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>