linux

Commit Graph

Author	SHA1	Message	Date
Peter Xu	0ccf7f168e	mm/thp: carry over dirty bit when thp splits on pmd Carry over the dirty bit from pmd to pte when a huge pmd splits. It shouldn't be a correctness issue since when pmd_dirty() we'll have the page marked dirty anyway, however having dirty bit carried over helps the next initial writes of split ptes on some archs like x86. Link: https://lkml.kernel.org/r/20220811161331.37055-5-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Minchan Kim <minchan@kernel.org> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:05 -07:00
Peter Xu	0d206b5d2e	mm/swap: add swp_offset_pfn() to fetch PFN from swap entry We've got a bunch of special swap entries that stores PFN inside the swap offset fields. To fetch the PFN, normally the user just calls swp_offset() assuming that'll be the PFN. Add a helper swp_offset_pfn() to fetch the PFN instead, fetching only the max possible length of a PFN on the host, meanwhile doing proper check with MAX_PHYSMEM_BITS to make sure the swap offsets can actually store the PFNs properly always using the BUILD_BUG_ON() in is_pfn_swap_entry(). One reason to do so is we never tried to sanitize whether swap offset can really fit for storing PFN. At the meantime, this patch also prepares us with the future possibility to store more information inside the swp offset field, so assuming "swp_offset(entry)" to be the PFN will not stand any more very soon. Replace many of the swp_offset() callers to use swp_offset_pfn() where proper. Note that many of the existing users are not candidates for the replacement, e.g.: (1) When the swap entry is not a pfn swap entry at all, or, (2) when we wanna keep the whole swp_offset but only change the swp type. For the latter, it can happen when fork() triggered on a write-migration swap entry pte, we may want to only change the migration type from write->read but keep the rest, so it's not "fetching PFN" but "changing swap type only". They're left aside so that when there're more information within the swp offset they'll be carried over naturally in those cases. Since at it, dropping hwpoison_entry_to_pfn() because that's exactly what the new swp_offset_pfn() is about. Link: https://lkml.kernel.org/r/20220811161331.37055-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Minchan Kim <minchan@kernel.org> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:05 -07:00
Peter Xu	eba4d770ef	mm/swap: comment all the ifdef in swapops.h swapops.h contains quite a few layers of ifdef, some of the "else" and "endif" doesn't get proper comment on the macro so it's hard to follow on what are they referring to. Add the comments. Link: https://lkml.kernel.org/r/20220811161331.37055-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Peter Xu	9c61d5321e	mm/x86: use SWP_TYPE_BITS in 3-level swap macros Patch series "mm: Remember a/d bits for migration entries", v4. Problem ======= When migrating a page, right now we always mark the migrated page as old & clean. However that could lead to at least two problems: (1) We lost the real hot/cold information while we could have persisted. That information shouldn't change even if the backing page is changed after the migration, (2) There can be always extra overhead on the immediate next access to any migrated page, because hardware MMU needs cycles to set the young bit again for reads, and dirty bits for write, as long as the hardware MMU supports these bits. Many of the recent upstream works showed that (2) is not something trivial and actually very measurable. In my test case, reading 1G chunk of memory - jumping in page size intervals - could take 99ms just because of the extra setting on the young bit on a generic x86_64 system, comparing to 4ms if young set. This issue is originally reported by Andrea Arcangeli. Solution ======== To solve this problem, this patchset tries to remember the young/dirty bits in the migration entries and carry them over when recovering the ptes. We have the chance to do so because in many systems the swap offset is not really fully used. Migration entries use swp offset to store PFN only, while the PFN is normally not as large as swp offset and normally smaller. It means we do have some free bits in swp offset that we can use to store things like A/D bits, and that's how this series tried to approach this problem. max_swapfile_size() is used here to detect per-arch offset length in swp entries. We'll automatically remember the A/D bits when we find that we have enough swp offset field to keep both the PFN and the extra bits. Since max_swapfile_size() can be slow, the last two patches cache the results for it and also swap_migration_ad_supported as a whole. Known Issues / TODOs ==================== We still haven't taught madvise() to recognize the new A/D bits in migration entries, namely MADV_COLD/MADV_FREE. E.g. when MADV_COLD upon a migration entry. It's not clear yet on whether we should clear the A bit, or we should just drop the entry directly. We didn't teach idle page tracking on the new migration entries, because it'll need larger rework on the tree on rmap pgtable walk. However it should make it already better because before this patchset page will be old page after migration, so the series will fix potential false negative of idle page tracking when pages were migrated before observing. The other thing is migration A/D bits will not start to working for private device swap entries. The code is there for completeness but since private device swap entries do not yet have fields to store A/D bits, even if we'll persistent A/D across present pte switching to migration entry, we'll lose it again when the migration entry converted to private device swap entry. Tests ===== After the patchset applied, the immediate read access test [1] of above 1G chunk after migration can shrink from 99ms to 4ms. The test is done by moving 1G pages from node 0->1->0 then read it in page size jumps. The test is with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz. Similar effect can also be measured when writting the memory the 1st time after migration. After applying the patchset, both initial immediate read/write after page migrated will perform similarly like before migration happened. Patch Layout ============ Patch 1-2: Cleanups from either previous versions or on swapops.h macros. Patch 3-4: Prepare for the introduction of migration A/D bits Patch 5: The core patch to remember young/dirty bit in swap offsets. Patch 6-7: Cache relevant fields to make migration_entry_supports_ad() fast. [1] https://github.com/xzpeter/clibs/blob/master/misc/swap-young.c This patch (of 7): Replace all the magic "5" with the macro. Link: https://lkml.kernel.org/r/20220811161331.37055-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20220811161331.37055-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Miaohe Lin	9cf2819159	mm, hwpoison: cleanup some obsolete comments 1.Remove meaningless comment in kill_proc(). That doesn't tell anything. 2.Fix the wrong function name get_hwpoison_unless_zero(). It should be get_page_unless_zero(). 3.The gate keeper for free hwpoison page has moved to check_new_page(). Update the corresponding comment. Link: https://lkml.kernel.org/r/20220830123604.25763-7-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Miaohe Lin	b680dae9a8	mm, hwpoison: check PageTable() explicitly in hwpoison_user_mappings() PageTable can't be handled by memory_failure(). Filter it out explicitly in hwpoison_user_mappings(). This will also make code more consistent with the relevant check in unpoison_memory(). Link: https://lkml.kernel.org/r/20220830123604.25763-6-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Miaohe Lin	36537a67d3	mm, hwpoison: avoid unneeded page_mapped_in_vma() overhead in collect_procs_anon() If vma->vm_mm != t->mm, there's no need to call page_mapped_in_vma() as add_to_kill() won't be called in this case. Move up the mm check to avoid possible unneeded calling to page_mapped_in_vma(). Link: https://lkml.kernel.org/r/20220830123604.25763-5-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Miaohe Lin	21c9e90ab9	mm, hwpoison: use num_poisoned_pages_sub() to decrease num_poisoned_pages Use num_poisoned_pages_sub() to combine multiple atomic ops into one. Also num_poisoned_pages_dec() can be killed as there's no caller now. Link: https://lkml.kernel.org/r/20220830123604.25763-4-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:04 -07:00
Miaohe Lin	da29499124	mm, hwpoison: use __PageMovable() to detect non-lru movable pages It's more recommended to use __PageMovable() to detect non-lru movable pages. We can avoid bumping page refcnt via isolate_movable_page() for the isolated lru pages. Also if pages become PageLRU just after they're checked but before trying to isolate them, isolate_lru_page() will be called to do the right work. [linmiaohe@huawei.com: fixes per Naoya Horiguchi] Link: https://lkml.kernel.org/r/1f7ee86e-7d28-0d8c-e0de-b7a5a94519e8@huawei.com Link: https://lkml.kernel.org/r/20220830123604.25763-3-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:03 -07:00
Miaohe Lin	2fe62e2226	mm, hwpoison: use ClearPageHWPoison() in memory_failure() Patch series "A few cleanup patches for memory-failure". his series contains a few cleanup patches to use __PageMovable() to detect non-lru movable pages, use num_poisoned_pages_sub() to reduce multiple atomic ops overheads and so on. More details can be found in the respective changelogs. This patch (of 6): Use ClearPageHWPoison() instead of TestClearPageHWPoison() to clear page hwpoison flags to avoid unneeded full memory barrier overhead. Link: https://lkml.kernel.org/r/20220830123604.25763-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20220830123604.25763-2-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:03 -07:00
Yang Shi	4d24de9425	mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock The syzbot reported the below problem: BUG: Bad page map in process syz-executor198 pte:8000000071c00227 pmd:74b30067 addr:0000000020563000 vm_flags:08100077 anon_vma:ffff8880547d2200 mapping:0000000000000000 index:20563 file:(null) fault:0x0 mmap:0x0 read_folio:0x0 CPU: 1 PID: 3614 Comm: syz-executor198 Not tainted 6.0.0-rc3-next-20220901-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_bad_pte.cold+0x2a7/0x2d0 mm/memory.c:565 vm_normal_page+0x10c/0x2a0 mm/memory.c:636 hpage_collapse_scan_pmd+0x729/0x1da0 mm/khugepaged.c:1199 madvise_collapse+0x481/0x910 mm/khugepaged.c:2433 madvise_vma_behavior+0xd0a/0x1cc0 mm/madvise.c:1062 madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1236 do_madvise.part.0+0x24a/0x340 mm/madvise.c:1415 do_madvise mm/madvise.c:1428 [inline] __do_sys_madvise mm/madvise.c:1428 [inline] __se_sys_madvise mm/madvise.c:1426 [inline] __x64_sys_madvise+0x113/0x150 mm/madvise.c:1426 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f770ba87929 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f770ba18308 EFLAGS: 00000246 ORIG_RAX: 000000000000001c RAX: ffffffffffffffda RBX: 00007f770bb0f3f8 RCX: 00007f770ba87929 RDX: 0000000000000019 RSI: 0000000000600003 RDI: 0000000020000000 RBP: 00007f770bb0f3f0 R08: 00007f770ba18700 R09: 0000000000000000 R10: 00007f770ba18700 R11: 0000000000000246 R12: 00007f770bb0f3fc R13: 00007ffc2d8b62ef R14: 00007f770ba18400 R15: 0000000000022000 Basically the test program does the below conceptually: 1. mmap 0x2000000 - 0x21000000 as anonymous region 2. mmap io_uring SQ stuff at 0x20563000 with MAP_FIXED, io_uring_mmap() actually remaps the pages with special PTEs 3. call MADV_COLLAPSE for 0x20000000 - 0x21000000 It actually triggered the below race: CPU A CPU B mmap 0x20000000 - 0x21000000 as anon madvise_collapse is called on this area Retrieve start and end address from the vma (NEVER updated later!) Collapsed the first 2M area and dropped mmap_lock Acquire mmap_lock mmap io_uring file at 0x20563000 Release mmap_lock Reacquire mmap_lock revalidate vma pass since 0x20200000 + 0x200000 > 0x20563000 scan the next 2M (0x20200000 - 0x20400000), but due to whatever reason it didn't release mmap_lock scan the 3rd 2M area (start from 0x20400000) get into the vma created by io_uring The hend should be updated after MADV_COLLAPSE reacquire mmap_lock since the vma may be shrunk. We don't have to worry about shink from the other direction since it could be caught by hugepage_vma_revalidate(). Either no valid vma is found or the vma doesn't fit anymore. Link: https://lkml.kernel.org/r/20220914162220.787703-1-shy828301@gmail.com Fixes: `7d8faaf155` ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") Reported-by: syzbot+915f3e317adb0e85835f@syzkaller.appspotmail.com Signed-off-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-26 19:46:03 -07:00
Satya Priya	31e4fcf971	clk: qcom: lpass: Fix lpass audiocc probe Change the qcom_cc_probe_by_index() call to qcom_cc_really_probe() to avoid remapping of memory region for index 0, which is already being done through qcom_cc_map(). Fixes: `7c6a6641c2` ("clk: qcom: lpass: Add support for resets & external mclk for SC7280") Signed-off-by: Satya Priya <quic_c_skakit@quicinc.com> Reviewed-by: Neil Armstrong <neil.armstrong@baylibre.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/1663673683-7018-1-git-send-email-quic_c_skakit@quicinc.com	2022-09-26 21:45:31 -05:00
Robert Marko	cca7b7d5f1	clk: qcom: apss-ipq-pll: add support for IPQ8074 Add support for IPQ8074 since it uses the same PLL setup, however it uses slightly different Alpha PLL config. Alpha PLL config was obtained by dumping PLL registers from a running device. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-7-robimarko@gmail.com	2022-09-26 21:40:11 -05:00
Robert Marko	2a4d702465	clk: qcom: apss-ipq-pll: update IPQ6018 Alpha PLL config Update the IPQ6018 Alpha PLL config to the latest one from the downstream 5.4 kernel[1]. This one should match the production SoC-s. Tested on IPQ6018 CP01-C1 reference board. [1] https://git.codelinaro.org/clo/qsdk/oss/kernel/linux-ipq-5.4/-/blob/NHSS.QSDK.12.1.r4/drivers/clk/qcom/apss-ipq-pll.c#L41 Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-6-robimarko@gmail.com	2022-09-26 21:40:11 -05:00
Robert Marko	823a117e1d	clk: qcom: apss-ipq-pll: use OF match data for Alpha PLL config Convert the driver to use OF match data for providing the Alpha PLL config per compatible. This is required for IPQ8074 support since it uses a different Alpha PLL config. While we are here rename "ipq_pll_config" to "ipq6018_pll_config" to make it clear that it is for IPQ6018 only. Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-5-robimarko@gmail.com	2022-09-26 21:40:10 -05:00
Robert Marko	d522c77aa8	dt-bindings: clock: qcom,a53pll: add IPQ8074 compatible Add IPQ8074 compatible to A53 PLL bindings. Signed-off-by: Robert Marko <robimarko@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-4-robimarko@gmail.com	2022-09-26 21:40:10 -05:00
Robert Marko	86e78995c9	clk: qcom: apss-ipq6018: mark apcs_alias0_core_clk as critical While fixing up the driver I noticed that my IPQ8074 board was hanging after CPUFreq switched the frequency during boot, WDT would eventually reset it. So mark apcs_alias0_core_clk as critical since its the clock feeding the CPU cluster and must never be disabled. Fixes: `5e77b4ef1b` ("clk: qcom: Add ipq6018 apss clock controller") Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-3-robimarko@gmail.com	2022-09-26 21:40:10 -05:00
Robert Marko	43a56cbf2a	clk: qcom: apss-ipq6018: fix apcs_alias0_clk_src While working on IPQ8074 APSS driver it was discovered that IPQ6018 and IPQ8074 use almost the same PLL and APSS clocks, however APSS driver is currently broken. More precisely apcs_alias0_clk_src is broken, it was added as regmap_mux clock. However after debugging why it was always stuck at 800Mhz, it was figured out that its not regmap_mux compatible at all. It is a simple mux but it uses RCG2 register layout and control bits, so utilize the new clk_rcg2_mux_closest_ops to correctly drive it while not having to provide a dummy frequency table. While we are here, use ARRAY_SIZE for number of parents. Tested on IPQ6018-CP01-C1 reference board and multiple IPQ8074 boards. Fixes: `5e77b4ef1b` ("clk: qcom: Add ipq6018 apss clock controller") Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-2-robimarko@gmail.com	2022-09-26 21:40:10 -05:00
Christian Marangi	c5d2c96b3a	clk: qcom: clk-rcg2: add rcg2 mux ops An RCG may act as a mux that switch between 2 parents. This is the case on IPQ6018 and IPQ8074 where the APCS core clk that feeds the CPU cluster clock just switches between XO and the PLL that feeds it. Add the required ops to add support for this special configuration and use the generic mux function to determine the rate. This way we dont have to keep a essentially dummy frequency table to use RCG2 as a mux. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20220818220628.339366-1-robimarko@gmail.com	2022-09-26 21:40:10 -05:00
Christoph Hellwig	99e6038743	blk-cgroup: pass a gendisk to the blkg allocation helpers Prepare for storing the blkcg information in the gendisk instead of the request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:28 -06:00
Christoph Hellwig	de185b56e8	blk-cgroup: pass a gendisk to blkcg_schedule_throttle Pass the gendisk to blkcg_schedule_throttle as part of moving the blk-cgroup infrastructure to be gendisk based. Remove the unused !BLK_CGROUP stub while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:28 -06:00
Christoph Hellwig	00ad6991bb	blk-cgroup: pass a gendisk to blkg_destroy_all Pass the gendisk to blkg_destroy_all as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:28 -06:00
Christoph Hellwig	cad9266abc	blk-throttle: pass a gendisk to blk_throtl_cancel_bios Pass the gendisk to blk_throtl_cancel_bios as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:28 -06:00
Christoph Hellwig	5f6dc7522a	blk-throttle: pass a gendisk to blk_throtl_register_queue Pass the gendisk to blk_throtl_register_queue as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:27 -06:00
Christoph Hellwig	e13793bae6	blk-throttle: pass a gendisk to blk_throtl_init and blk_throtl_exit Pass the gendisk to blk_throtl_init and blk_throtl_exit as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:27 -06:00
Christoph Hellwig	3657647e33	blk-iocost: cleanup ioc_qos_write Use a local disk variable instead of retrieving the disk and request_queue over and over by various means. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:27 -06:00
Christoph Hellwig	57b6455497	blk-iocost: pass a gendisk to blk_iocost_init Pass the gendisk to blk_iocost_init as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:27 -06:00
Christoph Hellwig	9df3e65139	blk-iocost: simplify ioc_name Just directly dereference the disk name instead of going through multiple hoops to find the same value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:27 -06:00
Christoph Hellwig	16fac1b591	blk-iolatency: pass a gendisk to blk_iolatency_init Pass the gendisk to blk_iolatency_init as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-9-hch@lst.de [axboe: missed inline for blk_iolatency_init() and !CONFIG_BLK_CGROUP_IOLATENCY] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:17:24 -06:00
Christoph Hellwig	b0dde3f5d6	blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit Pass the gendisk to blk_ioprio_init and blk_ioprio_exit as part of moving the blk-cgroup infrastructure to be gendisk based. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	9823538fb7	blk-cgroup: pass a gendisk to blkcg_init_queue and blkcg_exit_queue Pass the gendisk to blkcg_init_disk and blkcg_exit_disk as part of moving the blk-cgroup infrastructure to be gendisk based. Also remove the rather pointless kerneldoc comments for these internal functions with a single caller each. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	f753526e32	blk-cgroup: remove blkg_lookup_check The combinations of an error check with an ERR_PTR return and a lookup with a NULL return leads to ugly handling of the return values in the callers. Just open coding the check and the lookup is much simpler. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	4a69f325aa	blk-cgroup: cleanup the blkg_lookup family of functions Add a fully inlined blkg_lookup as the extra two checks aren't going to generated a lot more code vs the call to the slowpath routine, and open code the hint update in the two callers that care. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	79fcc5be93	blk-cgroup: remove open coded blkg_lookup instances Use blkg_lookup instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	928f6f00a9	blk-cgroup: remove blk_queue_root_blkg Just open code it in the only caller and drop the unused !BLK_CGROUP stub. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Christoph Hellwig	33dc62796c	blk-cgroup: fix error unwinding in blkcg_init_queue When blk_throtl_init fails, we need to call blk_ioprio_exit. Switch to proper goto based unwinding to fix this. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Herrmann <aherrmann@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20220921180501.1539876-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-09-26 19:09:31 -06:00
Jakub Kicinski	5dcf41a8e9	Merge branch 'net-sunhme-cleanups-and-logging-improvements' Sean Anderson says: ==================== net: sunhme: Cleanups and logging improvements This series is a continuation of [1] with a focus on logging improvements (in the style of commit `b11e5f6a3a` ("net: sunhme: output link status with a single print.")). I have included several of Rolf's patches in the series where appropriate (with slight modifications). After this series is applied, many more messages from this driver will come with driver/device information. Additionally, most messages (especially debug messages) have been condensed onto one line (as KERN_CONT messages get split). [1] https://lore.kernel.org/netdev/4686583.GXAFRqVoOG@eto.sf-tec.de/ ==================== Link: https://lore.kernel.org/r/20220924015339.1816744-1-seanga2@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:41 -07:00
Sean Anderson	77ceb3731e	sunhme: Add myself as a maintainer I have the hardware so at the very least I can test things. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:38 -07:00
Sean Anderson	26657c70b9	sunhme: Use vdbg for spam-y prints The SXD, TXD, and RXD macros are used only once (or twice). Just use the vdbg print, which seems to have been devised for these sorts of very verbose messages. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:38 -07:00
Sean Anderson	24cddbc3ef	sunhme: Combine continued messages This driver seems to have been written under the assumption that messages can be continued arbitrarily. I'm not when this changed (if ever), but such ad-hoc continuations are liable to be rudely interrupted. Convert all such instances to single prints. This loses a bit of timing information (such as when a line was constructed piecemeal as the function executed), but it's easy to add a few prints if necessary. This also adds newlines to the ends of any prints without them. Since (almost every) debug print included the name of the function, include it automatically. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:38 -07:00
Sean Anderson	8acf878f29	sunhme: Use (net)dev_foo wherever possible Wherever possible, use the associated netdev (or device) when printing errors or other messages. This makes it immediately clear what device caused the error, and provides more information than just the device name. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:38 -07:00
Sean Anderson	0bc1f45410	sunhme: Convert printk(KERN_FOO ...) to pr_foo(...) This is a mostly-mechanical translation of the existing printks into pr_foos. In several places, I have pasted messages which were broken over several lines to allow for easier grepping. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:37 -07:00
Sean Anderson	30931367ba	sunhme: Clean up debug infrastructure Remove all the single-use debug conditionals, and just collect the debug defines at the top of the file. HMD seems like it is used for general debug info, so just redefine it as pr_debug. Additionally, instead of using the default loglevel, use the debug loglevel for debugging. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:37 -07:00
Sean Anderson	03290907a5	sunhme: Convert FOO((...)) to FOO(...) With the power of variadic macros, double parentheses are unnecessary. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:37 -07:00
Rolf Eike Beer	914d9b2711	sunhme: switch to devres This not only removes a lot of code, it also fixes the memleak of the DMA memory when register_netdev() fails. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> [ rebased onto net-next/master; fixed error reporting ] Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:37 -07:00
Sean Anderson	5b3dc6dda6	sunhme: Regularize probe errors This fixes several error paths to ensure they return an appropriate error (instead of ENODEV). Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:37 -07:00
Sean Anderson	d6f1e89bdb	sunhme: Return an ERR_PTR from quattro_pci_find In order to differentiate between a missing bridge and an OOM condition, return ERR_PTRs from quattro_pci_find. This also does some general linting in the area. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:36 -07:00
Rolf Eike Beer	acb3f35f92	sunhme: forward the error code from pci_enable_device() This already returns a proper error value, so pass it to the caller. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:36 -07:00
Sean Anderson	6478c6e994	sunhme: Remove version Module versions are not very useful: > The basic problem is, the version string does not identify the sources > with enough accuracy. It says nothing about back ported fixes in > stable kernels. It tells you nothing about vendor patches to the > network core, etc. https://lore.kernel.org/all/Yf6mtvA1zO7cdzr7@lunn.ch/ While we're at it, inline the author and use the driver name a bit more. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:36 -07:00
Rolf Eike Beer	8247ab50c2	sunhme: remove unused tx_dump_ring() I can't find a reference to it in the entire git history. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-26 17:45:36 -07:00

... 72 73 74 75 76 ...

1136096 Commits All Branches Search

1136096 Commits

All Branches