Commit Graph

87886 Commits

Author SHA1 Message Date
Linus Torvalds c290db0137 drm fixes for 6.1-rc8
i915:
 - Fix dram info readout
 - Remove non-existent pipes from bigjoiner pipe mask
 - Fix negative value passed as remaining time
 - Never return 0 if not all requests retired
 
 amdgpu:
 - VCN fix for vangogh
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmOJV+oACgkQDHTzWXnE
 hr4fLhAAjvH0Qp/b7mjJ6J7C5b8w2IPGVUDeGdZIqmaFv905825o8Hoj132F5HV0
 NvSK5B69Z+of958ky1ksXowAMfKyUbqpOx00QjX1F4v+R0C1QAslobQirhfVpaf5
 uvqw69/b1A7uPI1Pz+2SXWgmmrJ1qyMc7fqPodNWudBDyjm+Wsz6NnTxCF+OMsJV
 LRlZ73IjLqfX17sUFpH9Gr/1PsAF9d4PkLPcc2WVFQrV8O7K5dPBwRdtqtCuZ54K
 zRE3k0hIYyRQHhqCd+IBGpnbwTGAhLIb4FAN+wQ5hmO/gU5kJm3o+1ruhpUepiLM
 jhZOHritZAqU3NE42odWrKT3Juz9Zvf84fTaULKcmk/cNUPPBhlLbBU4CL5/OCAD
 RbT7kSxMzqO1uVDKXggblaFWjeMmeulz3iSqU3dmSGWue39/2kMSDKKykCSpSJTn
 ync5iEXD9nIADjgdnu9W7sbQaEhoJc0/bJ01/sy1FPimR5rcJh15pozabSMz95cO
 YtnkzYymyCQbyaSdPHgWSRrAmFHfGi6rMdLR+vl6CHTRdyYfb/tB5hhcwxLZoWpt
 K4/+IrJO7kUR7wbBpYRq1sQfl98PinfVxXiCI3PnLSBBkLFCNxRwrYw+4hMe2Bjw
 sYPT7ADAIwNW2HWM+z7GxTMYQZK5lcsolPeFaxycA3h4B5NA8QU=
 =TdXQ
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2022-12-02' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Things do seem to have finally settled down, just four i915 and one
  amdgpu this week. Probably won't have much for next week if you do
  push rc8 out.

  i915:
   - Fix dram info readout
   - Remove non-existent pipes from bigjoiner pipe mask
   - Fix negative value passed as remaining time
   - Never return 0 if not all requests retired

  amdgpu:
   - VCN fix for vangogh"

* tag 'drm-fixes-2022-12-02' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: enable Vangogh VCN indirect sram mode
  drm/i915: Never return 0 if not all requests retired
  drm/i915: Fix negative value passed as remaining time
  drm/i915: Remove non-existent pipes from bigjoiner pipe mask
  drm/i915/mtl: Fix dram info readout
2022-12-02 15:35:21 -08:00
Linus Torvalds bdaa78c6aa 15 hotfixes. 11 marked cc:stable. Only three or four of the latter
address post-6.0 issues, which is hopefully a sign that things are
 converging.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY4pQpQAKCRDdBJ7gKXxA
 jquxAP9Lqif7CGDgdq8uWY2hHS/Ujc3k7Ohgyzs37olnCuU8KwEA6/J7SpjsBgtY
 OfzvnwxpCTh8Kfzu/oNckIHo/EEiIA8=
 =o6qT
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "15 hotfixes,  11 marked cc:stable.

  Only three or four of the latter address post-6.0 issues, which is
  hopefully a sign that things are converging"

* tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible"
  Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
  drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
  mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths
  mm/khugepaged: fix GUP-fast interaction by sending IPI
  mm/khugepaged: take the right locks for page table retraction
  mm: migrate: fix THP's mapcount on isolation
  mm: introduce arch_has_hw_nonleaf_pmd_young()
  mm: add dummy pmd_young() for architectures not having it
  mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()
  tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
  nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
  hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
  madvise: use zap_page_range_single for madvise dontneed
  mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-12-02 13:39:38 -08:00
Dave Airlie c082fbd687 Merge tag 'amd-drm-fixes-6.1-2022-12-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-12-01:

amdgpu:
- VCN fix for vangogh

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221201202015.5931-1-alexander.deucher@amd.com
2022-12-02 09:12:46 +10:00
Leo Liu 9a8cc8cabc drm/amdgpu: enable Vangogh VCN indirect sram mode
So that uses PSP to initialize HW.

Fixes: 0c2c02b66c ("drm/amdgpu/vcn: add firmware support for dimgrey_cavefish")
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-12-01 15:09:49 -05:00
Lee Jones 6f6cb17143 drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
Patch series "Fix a bunch of allmodconfig errors", v2.

Since b339ec9c22 ("kbuild: Only default to -Werror if COMPILE_TEST")
WERROR now defaults to COMPILE_TEST meaning that it's enabled for
allmodconfig builds.  This leads to some interesting build failures when
using Clang, each resolved in this set.

With this set applied, I am able to obtain a successful allmodconfig Arm
build.


This patch (of 2):

calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 ||
ARM64) architectures built with Clang (all released versions), whereby the
stack frame gets blown up to well over 5k.  This would cause an immediate
kernel panic on most architectures.  We'll revert this when the following
bug report has been resolved:
https://github.com/llvm/llvm-project/issues/41896.

Link: https://lkml.kernel.org/r/20221125120750.3537134-1-lee@kernel.org
Link: https://lkml.kernel.org/r/20221125120750.3537134-2-lee@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Lee Jones <lee@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Tom Rix <trix@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 14:49:42 -08:00
Andrzej Hajda 04aa64375f drm/i915: fix TLB invalidation for Gen12 video and compute engines
In case of Gen12 video and compute engines, TLB_INV registers are masked -
to modify one bit, corresponding bit in upper half of the register must
be enabled, otherwise nothing happens.

CVE: CVE-2022-4139
Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 7938d61591 ("drm/i915: Flush TLBs before releasing backing store")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-11-30 08:44:00 -08:00
Janusz Krzysztofik 12b8b046e4 drm/i915: Never return 0 if not all requests retired
Users of intel_gt_retire_requests_timeout() expect 0 return value on
success.  However, we have no protection from passing back 0 potentially
returned by a call to dma_fence_wait_timeout() when it succedes right
after its timeout has expired.

Replace 0 with -ETIME before potentially using the timeout value as return
code, so -ETIME is returned if there are still some requests not retired
after timeout, 0 otherwise.

v3: Use conditional expression, more compact but also better reflecting
    intention standing behind the change.

v2: Move the added lines down so flush_submission() is not affected.

Fixes: f33a8a5160 ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-3-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f301a29f14)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-29 08:49:15 +00:00
Janusz Krzysztofik a8899b8728 drm/i915: Fix negative value passed as remaining time
Commit b97060a99b ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned.  However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time.  If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.

If request retirement succeeds but an error code is passed back via
remaininig_timeout, we may have no clue on how much of the initial timeout
might have been left for spending it on waiting for GuC to become idle.
OTOH, since all pending requests have been successfully retired, that
error code has been already ignored by intel_gt_retire_requests_timeout(),
then we shouldn't fail.

Assume no more time has been left on error and pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.

v3: Don't fail on any error passed back via remaining_timeout.

v2: Fix the issue on the caller side, not the provider.

Fixes: b97060a99b ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit f235dbd5b7)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-29 08:49:15 +00:00
Ville Syrjälä 3c1ea6a5f4 drm/i915: Remove non-existent pipes from bigjoiner pipe mask
bigjoiner_pipes() doesn't consider that:
- RKL only has three pipes
- some pipes may be fused off

This means that intel_atomic_check_bigjoiner() won't reject
all configurations that would need a non-existent pipe.
Instead we just keep on rolling witout actually having
reserved the slave pipe we need.

It's possible that we don't outright explode anywhere due to
this since eg. for_each_intel_crtc_in_pipe_mask() will only
walk the crtcs we've registered even though the passed in
pipe_mask asks for more of them. But clearly the thing won't
do what is expected of it when the required pipes are not
present.

Fix the problem by consulting the device info pipe_mask already
in bigjoiner_pipes().

Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118185201.10469-1-ville.syrjala@linux.intel.com
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
(cherry picked from commit f1c87a94a1)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-29 08:49:15 +00:00
Radhakrishna Sripada 2f3830544a drm/i915/mtl: Fix dram info readout
MEM_SS_INFO_GLOBAL Register info read from the hardware is cached in val. However
the variable is being modified when determining the DRAM type thereby clearing out
the channels and qgv info extracted later in the function xelpdp_get_dram_info. Preserve
the register value and use extracted fields in the switch statement.

Fixes: 825477e779 ("drm/i915/mtl: Obtain SAGV values from MMIO instead of GT pcode mailbox")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117213015.584417-1-radhakrishna.sripada@intel.com
(cherry picked from commit ec35c41d91)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-29 08:49:14 +00:00
Dave Airlie e57702069b Merge tag 'amd-drm-fixes-6.1-2022-11-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-11-23:

amdgpu:
- DCN 3.1.4 fixes
- DP MST DSC deadlock fixes
- HMM userptr fixes
- Fix Aldebaran CU occupancy reporting
- GFX11 fixes
- PSP suspend/resume fix
- DCE12 KASAN fix
- DCN 3.2.x fixes
- Rotated cursor fix
- SMU 13.x fix
- DELL platform suspend/resume fixes
- VCN4 SR-IOV fix
- Display regression fix for polled connectors

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221123143453.8977-1-alexander.deucher@amd.com
2022-11-25 10:55:23 +10:00
Dave Airlie b65a648865 Merge tag 'drm-intel-fixes-2022-11-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix GVT KVM reference count handling (Sean Christopherson)
- Never purge busy TTM objects (Matthew Auld)
- Fix warn in intel_display_power_*_domain() functions (Imre Deak)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y38u44hb1LZfZC+M@tursulin-desk
2022-11-25 09:42:02 +10:00
Dave Airlie 9e2c5c651a Merge tag 'drm-misc-fixes-2022-11-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v6.1-rc7:
- Another amdgpu gang submit fix.
- Use dma_fence_unwrap_for_each when importing sync files.
- Fix race in dma_heap_add().
- Fix use of uninitialized memory in logo.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a5721505-4823-98ef-7d6f-0ea478221391@linux.intel.com
2022-11-25 09:21:11 +10:00
Jane Jian ecb41b71ef drm/amdgpu/vcn: re-use original vcn0 doorbell value
root cause that S2A need to use deduct offset flag.
after setting this flag, vcn0 doorbell value works.
so return it as before

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:54 -05:00
Alex Deucher 602ad43c3c drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"
This partially reverts 20543be93c.

Calling drm_connector_update_edid_property() in
amdgpu_connector_free_edid() causes a noticeable pause in
the system every 10 seconds on polled outputs so revert this
part of the change.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2257
Cc: Claudio Suarez <cssk@net-c.es>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-11-23 09:01:53 -05:00
Tsung-hua Lin a6e1775da0 drm/amd/display: No display after resume from WB/CB
[why]
First MST sideband message returns AUX_RET_ERROR_HPD_DISCON
on certain intel platform. Aux transaction considered failure
if HPD unexpected pulled low. The actual aux transaction success
in such case, hence do not return error.

[how]
Not returning error when AUX_RET_ERROR_HPD_DISCON detected
on the first sideband message.

v2: squash in fix (Alex)

Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Tsung-hua Lin <Tsung-hua.Lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-11-23 09:01:53 -05:00
Stanley.Yang 3cb93f3904 drm/amdgpu: fix use-after-free during gpu recovery
[Why]
    [  754.862560] refcount_t: underflow; use-after-free.
    [  754.862898] Call Trace:
    [  754.862903]  <TASK>
    [  754.862913]  amdgpu_job_free_cb+0xc2/0xe1 [amdgpu]
    [  754.863543]  drm_sched_main.cold+0x34/0x39 [amd_sched]

[How]
    The fw_fence may be not init, check whether dma_fence_init
    is performed before job free

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
lyndonli f2e1aa267f drm/amd/pm: update driver if header for smu_13_0_7
update driver if header for smu_13_0_7

Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-11-23 09:01:53 -05:00
David Galiffi a26a54fbe3 drm/amd/display: Fix rotated cursor offset calculation
[Why]
Underflow is observed when cursor is still enabled when the cursor
rectangle is outside the bounds of it's surface viewport.

[How]
Update parameters used to determine when cursor should be disabled.

Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
Dillon Varone e667ee3b0c drm/amd/display: Use new num clk levels struct for max mclk index
[WHY?]
When calculating watermark and dlg values, the max mclk level index and
associated speed are needed to find the correlated dummy latency value.
Currently the incorrect index is given due to a clock manager refactor.

[HOW?]
Use num_memclk_level from num_entries_per_clk struct for getting the correct max
mem speed.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
Taimur Hassan 2a5dd86a69 drm/amd/display: Avoid setting pixel rate divider to N/A
[Why]
Pixel rate divider values should never be set to N/A (0xF) as the K1/K2
field is only 1/2 bits wide.

[How]
Set valid divider values for virtual and FRL/DP2 cases.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
Dillon Varone dd2c028c13 drm/amd/display: Use viewport height for subvp mall allocation size
[WHY?]
MALL allocation size depends on the viewport height, not the addressable
vertical lines, which will not match when scaling.

[HOW?]
Base MALL allocation size calculations off viewport height.

Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
Dillon Varone 5d82c82f1d drm/amd/display: Update soc bounding box for dcn32/dcn321
[Description]
New values for soc bounding box and dummy pstate.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Brian Chang <Brian.Chang@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-11-23 09:01:53 -05:00
Lyude Paul 44035ec2fd drm/amd/dc/dce120: Fix audio register mapping, stop triggering KASAN
There's been a very long running bug that seems to have been neglected for
a while, where amdgpu consistently triggers a KASAN error at start:

  BUG: KASAN: global-out-of-bounds in read_indirect_azalia_reg+0x1d4/0x2a0 [amdgpu]
  Read of size 4 at addr ffffffffc2274b28 by task modprobe/1889

After digging through amd's rather creative method for accessing registers,
I eventually discovered the problem likely has to do with the fact that on
my dce120 GPU there are supposedly 7 sets of audio registers. But we only
define a register mapping for 6 sets.

So, fix this and fix the KASAN warning finally.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-23 09:01:53 -05:00
Alex Deucher 4f2bea62cf drm/amdgpu/psp: don't free PSP buffers on suspend
We can reuse the same buffers on resume.

v2: squash in S4 fix from Shikai

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2213
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-11-23 09:01:53 -05:00
Tvrtko Ursulin 14af5d3858 Merge tag 'gvt-fixes-2022-11-11' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2022-11-11

- kvm reference fix from Sean

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221111090208.GQ30028@zhen-hp.sh.intel.com
2022-11-22 07:59:17 +00:00
Jack Xiao 91abf28a63 drm/amd/amdgpu: reserve vm invalidation engine for firmware
If mes enabled, reserve VM invalidation engine 5 for firmware.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-11-21 16:43:13 -05:00
Ramesh Errabolu b9ab82da88 drm/amdgpu: Enable Aldebaran devices to report CU Occupancy
Allow user to know number of compute units (CU) that are in use at any
given moment. Enable access to the method kgd_gfx_v9_get_cu_occupancy
that computes CU occupancy.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-11-21 16:41:51 -05:00
Christian König 4458da0bb0 drm/amdgpu: fix userptr HMM range handling v2
The basic problem here is that it's not allowed to page fault while
holding the reservation lock.

So it can happen that multiple processes try to validate an userptr
at the same time.

Work around that by putting the HMM range object into the mutex
protected bo list for now.

v2: make sure range is set to NULL in case of an error

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:40:30 -05:00
Christian König b39df63b16 drm/amdgpu: always register an MMU notifier for userptr
Since switching to HMM we always need that because we no longer grab
references to the pages.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
CC: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:40:04 -05:00
Lyude Paul 85ef1679a1 drm/amdgpu/dm/mst: Fix uninitialized var in pre_compute_mst_dsc_configs_for_state()
Coverity noticed this one, so let's fix it.

Fixes: ba891436c2 ("drm/amdgpu/mst: Stop ignoring error codes and deadlocking")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org # v5.6+
2022-11-21 16:39:26 -05:00
Lyude Paul d60b82aa4d drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state
Now that we've fixed the issue with using the incorrect topology manager,
we're actually grabbing the topology manager's lock - and consequently
deadlocking. Luckily for us though, there's actually nothing in AMD's DSC
state computation code that really should need this lock. The one exception
is the mutex_lock() in dm_dp_mst_is_port_support_mode(), however we grab no
locks beneath &mgr->lock there so that should be fine to leave be.

Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:38:50 -05:00
Lyude Paul dfbc00410c drm/amdgpu/dm/mst: Use the correct topology mgr pointer in amdgpu_dm_connector
This bug hurt me. Basically, it appears that we've been grabbing the
entirely wrong mutex in the MST DSC computation code for amdgpu! While
we've been grabbing:

  amdgpu_dm_connector->mst_mgr

That's zero-initialized memory, because the only connectors we'll ever
actually be doing DSC computations for are MST ports. Which have mst_mgr
zero-initialized, and instead have the correct topology mgr pointer located
at:

  amdgpu_dm_connector->mst_port->mgr;

I'm a bit impressed that until now, this code has managed not to crash
anyone's systems! It does seem to cause a warning in LOCKDEP though:

  [   66.637670] DEBUG_LOCKS_WARN_ON(lock->magic != lock)

This was causing the problems that appeared to have been introduced by:

  commit 4d07b0bc40 ("drm/display/dp_mst: Move all payload info into the atomic state")

This wasn't actually where they came from though. Presumably, before the
only thing we were doing with the topology mgr pointer was attempting to
grab mst_mgr->lock. Since the above commit however, we grab much more
information from mst_mgr including the atomic MST state and respective
modesetting locks.

This patch also implies that up until now, it's quite likely we could be
susceptible to race conditions when going through the MST topology state
for DSC computations since we technically will not have grabbed any lock
when going through it.

So, let's fix this by adjusting all the respective code paths to look at
the right pointer and skip things that aren't actual MST connectors from a
topology.

Gitlab issue: https://gitlab.freedesktop.org/drm/amd/-/issues/2171

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b ("drm/amd/display: MST DSC compute fair share")
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:38:42 -05:00
Lyude Paul 2f3a127386 drm/display/dp_mst: Fix drm_dp_mst_add_affected_dsc_crtcs() return code
Looks like that we're accidentally dropping a pretty important return code
here. For some reason, we just return -EINVAL if we fail to get the MST
topology state. This is wrong: error codes are important and should never
be squashed without being handled, which here seems to have the potential
to cause a deadlock.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Fixes: 8ec046716c ("drm/dp_mst: Add helper to trigger modeset on affected DSC MST CRTCs")
Cc: <stable@vger.kernel.org> # v5.6+
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:38:35 -05:00
Lyude Paul ba891436c2 drm/amdgpu/mst: Stop ignoring error codes and deadlocking
It appears that amdgpu makes the mistake of completely ignoring the return
values from the DP MST helpers, and instead just returns a simple
true/false. In this case, it seems to have come back to bite us because as
a result of simply returning false from
compute_mst_dsc_configs_for_state(), amdgpu had no way of telling when a
deadlock happened from these helpers. This could definitely result in some
kernel splats.

V2:
* Address Wayne's comments (fix another bunch of spots where we weren't
  passing down return codes)

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 8c20a1ed9b ("drm/amd/display: MST DSC compute fair share")
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:38:23 -05:00
Roman Li 3ca6823894 drm/amd/display: Align dcn314_smu logging with other DCNs
[Why]
Assert on non-OK response from SMU is unnecessary.
It was replaced with respective log message on other asics
in the past with commit:
"drm/amd/display: Removing assert statements for Linux"

[How]
Remove assert and add dbg logging as on other DCNs.

Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-21 16:38:14 -05:00
Imre Deak ebbaa4392e drm/i915: Fix warn in intel_display_power_*_domain() functions
The intel_display_power_*_domain() functions should always warn if a
default domain is returned as a fallback, fix this up. Spotted by Ville.

Fixes: 979e1b32e0 ("drm/i915: Sanitize the port -> DDI/AUX power domain mapping for each platform")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221114122251.21327-2-imre.deak@intel.com
(cherry picked from commit 10b85f0e1d)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-21 09:41:14 +00:00
Matthew Auld 00a6c36cca drm/i915/ttm: never purge busy objects
In i915_gem_madvise_ioctl() we immediately purge the object is not
currently used, like when the mm.pages are NULL.  With shmem the pages
might still be hanging around or are perhaps swapped out. Similarly with
ttm we might still have the pages hanging around on the ttm resource,
like with lmem or shmem, but here we need to be extra careful since
async unbinds are possible as well as in-progress kernel moves. In
i915_ttm_purge() we expect the pipeline-gutting to nuke the ttm resource
for us, however if it's busy the memory is only moved to a ghost object,
which then leads to broken behaviour when for example clearing the
i915_tt->filp, since the actual ttm_tt is still alive and populated,
even though it's been moved to the ghost object.  When we later destroy
the ghost object we hit the following, since the filp is now NULL:

[  +0.006982] #PF: supervisor read access in kernel mode
[  +0.005149] #PF: error_code(0x0000) - not-present page
[  +0.005147] PGD 11631d067 P4D 11631d067 PUD 115972067 PMD 0
[  +0.005676] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  +0.012962] Workqueue: events ttm_device_delayed_workqueue [ttm]
[  +0.006022] RIP: 0010:i915_ttm_tt_unpopulate+0x3a/0x70 [i915]
[  +0.005879] Code: 89 fb 48 85 f6 74 11 8b 55 4c 48 8b 7d 30 45 31 c0 31 c9 e8 18 6a e5 e0 80 7d 60 00 74 20 48 8b 45 68
8b 55 08 4c 89 e7 5b 5d <48> 8b 40 20 83 e2 01 41 5c 89 d1 48 8b 70
 30 e9 42 b2 ff ff 4c 89
[  +0.018782] RSP: 0000:ffffc9000bf6fd70 EFLAGS: 00010202
[  +0.005244] RAX: 0000000000000000 RBX: ffff8883e12ae380 RCX: 0000000000000000
[  +0.007150] RDX: 000000008000000e RSI: ffffffff823559b4 RDI: ffff8883e12ae3c0
[  +0.007142] RBP: ffff888103b65d48 R08: 0000000000000001 R09: 0000000000000001
[  +0.007144] R10: 0000000000000001 R11: ffff88829c2c8040 R12: ffff8883e12ae3c0
[  +0.007148] R13: 0000000000000001 R14: ffff888115184140 R15: ffff888115184248
[  +0.007154] FS:  0000000000000000(0000) GS:ffff88844db00000(0000) knlGS:0000000000000000
[  +0.008108] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.005763] CR2: 0000000000000020 CR3: 000000013fdb4004 CR4: 00000000003706e0
[  +0.007152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.007145] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.007154] Call Trace:
[  +0.002459]  <TASK>
[  +0.002126]  ttm_tt_unpopulate.part.0+0x17/0x70 [ttm]
[  +0.005068]  ttm_bo_tt_destroy+0x1c/0x50 [ttm]
[  +0.004464]  ttm_bo_cleanup_memtype_use+0x25/0x40 [ttm]
[  +0.005244]  ttm_bo_cleanup_refs+0x90/0x2c0 [ttm]
[  +0.004721]  ttm_bo_delayed_delete+0x235/0x250 [ttm]
[  +0.004981]  ttm_device_delayed_workqueue+0x13/0x40 [ttm]
[  +0.005422]  process_one_work+0x248/0x560
[  +0.004028]  worker_thread+0x4b/0x390
[  +0.003682]  ? process_one_work+0x560/0x560
[  +0.004199]  kthread+0xeb/0x120
[  +0.003163]  ? kthread_complete_and_exit+0x20/0x20
[  +0.004815]  ret_from_fork+0x1f/0x30

v2:
 - Just use ttm_bo_wait() directly (Niranjana)
 - Add testcase reference

Testcase: igt@gem_madvise@dontneed-evict-race
Fixes: 213d509277 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Nirmoy Das <Nirmoy.Das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221115104620.120432-1-matthew.auld@intel.com
(cherry picked from commit 5524b5e52e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-21 09:41:13 +00:00
Dave Airlie b1010b93fe drm/tegra: Fixes for v6.1-rc6
This contains a single fix that avoids using the GART on Tegra20 because
 it doesn't work well with the way the Tegra DRM driver tries to use it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmN3dtETHHRyZWRpbmdA
 bnZpZGlhLmNvbQAKCRDdI6zXfz6zoUjRD/9asvgInB2t2LXQXSpP/m48biafa3uw
 ybwz1pjWG9RA/1/atbDxxM/d8m+Woc/fNspC4e3fSVymaSNazuzY4uce26zKEqC8
 wDUHutv3tb+l+0sw32vR12eM8Z5ATExG9OHZg0TB4UNfkC56iYBUCv4rtGGO/k1y
 erU+372l4SPZGXGl3BKo+VEZu0oD2FcWAlLmRfbo4KaX5DTr6k6bCOlBud5QoQPW
 e19D2lOhO+kGI1Y84q3udxxp7Wbu7B+SMbhxyHzGLE1GVuP+SyshbUGwippWsHkk
 6eUSEvT36aROwxh150K9CJYWq8zDdd2/m2+iFCy5dWlTNIjSY5Tj/0rn03bNQs/7
 Ofh0c0Jx5j5V0nkD8T9o8DbgpcW5oAeWKfgevhuz/AnaQO7RBfcjgQ8G9CmrlKcq
 cGyTANacukf6+4hDCgMhcFd6TBH9gITjkcfi2FxJ1ny4zfwRr0gzAXv09tK4JkWB
 VJX8xQh2+W5hZwVMSVf0EWm0xm14vg+G/Ev/2lMalsCYkiekkZqJGAGJ2Boaqkql
 fnEbILC5m8B0dwJcTDuKqXlqJRWc7Rat0AWlflmnbjR4wPb965o5QIH387xj4hKI
 Wrn9fnPvX8WASxId9Em1wqYgRRdJkb22UmNUc4kk/pd1yXzS0xp0KXX64Ht88ipm
 Oso15mW9Aaud5w==
 =qmNJ
 -----END PGP SIGNATURE-----

Merge tag 'drm/tegra/for-6.1-rc6' of https://gitlab.freedesktop.org/drm/tegra into drm-fixes

drm/tegra: Fixes for v6.1-rc6

This contains a single fix that avoids using the GART on Tegra20 because
it doesn't work well with the way the Tegra DRM driver tries to use it.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118121614.3511110-1-thierry.reding@gmail.com
2022-11-19 06:15:37 +10:00
Christian König b09d6acba1 drm/amdgpu: handle gang submit before VMID
Otherwise it can happen that not all gang members can get a VMID
assigned and we deadlock.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 68ce8b2422 ("drm/amdgpu: add gang submit backend v2")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221118153023.312582-1-christian.koenig@amd.com
2022-11-18 17:52:15 +01:00
Robin Murphy c2418f911a gpu: host1x: Avoid trying to use GART on Tegra20
Since commit c7e3ca515e ("iommu/tegra: gart: Do not register with
bus") quite some time ago, the GART driver has effectively disabled
itself to avoid issues with the GPU driver expecting it to work in ways
that it doesn't. As of commit 57365a04c9 ("iommu: Move bus setup to
IOMMU device registration") that bodge no longer works, but really the
GPU driver should be responsible for its own behaviour anyway. Make the
workaround explicit.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Suggested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2022-11-18 09:33:20 +01:00
Dave Airlie 585f2bc8fe Merge tag 'amd-drm-fixes-6.1-2022-11-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-11-16:

amdgpu:
- Fix a possible memory leak in ganng submit error path
- DP tunneling fixes
- DCN 3.1 page flip fix
- DCN 3.2.x fixes
- DCN 3.1.4 fixes
- Don't expose degamma on hardware that doesn't support it
- BACO fixes for SMU 11.x
- BACO fixes for SMU 13.x
- Virtual display fix for devices with no display hardware

amdkfd:
- Memory limit regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221117040416.6100-1-alexander.deucher@amd.com
2022-11-18 11:09:04 +10:00
Dave Airlie a73b603f91 Merge tag 'drm-intel-fixes-2022-11-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix uaf with lmem_userfault_list handling (Matthew Auld)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y3X2bNJ/4GR1BAiG@tursulin-desk
2022-11-18 11:02:54 +10:00
Dave Airlie 5fa8813878 Merge tag 'drm-misc-fixes-2022-11-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v6.1-rc6:
- Fix error handling in vc4_atomic_commit_tail()
- Set bpc for logictechno panels.
- Fix potential memory leak in drm_dev_init()
- Fix potential null-ptr-deref in drm_vblank_destroy_worker()
- Set lima's clkname corrrectly when regulator is missing.
- Small amdgpu fix to gang submission.
- Revert hiding unregistered connectors from userspace, as it breaks on DP-MST.
- Add workaround for DP++ dual mode adaptors that don't support
  i2c subaddressing.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c7d02936-c550-199b-6cb7-cbf6cf104e4a@linux.intel.com
2022-11-18 07:08:57 +10:00
Simon Rettberg 5954acbacb drm/display: Don't assume dual mode adaptors support i2c sub-addressing
Current dual mode adaptor ("DP++") detection code assumes that all
adaptors support i2c sub-addressing for read operations from the
DP-HDMI adaptor ID buffer.  It has been observed that multiple
adaptors do not in fact support this, and always return data starting
at register 0.  On affected adaptors, the code fails to read the proper
registers that would identify the device as a type 2 adaptor, and
handles those as type 1, limiting the TMDS clock to 165MHz, even if
the according register would announce a higher TMDS clock.
Fix this by always reading the ID buffer starting from offset 0, and
discarding any bytes before the actual offset of interest.

We tried finding authoritative documentation on whether or not this is
allowed behaviour, but since all the official VESA docs are paywalled,
the best we could come up with was the spec sheet for Texas Instruments'
SNx5DP149 chip family.[1]  It explicitly mentions that sub-addressing is
supported for register writes, but *not* for reads (See NOTE in
section 8.5.3).  Unless TI openly decided to violate the VESA spec, one
could take that as a hint that sub-addressing is in fact not mandated
by VESA.
The other two adaptors affected used the PS8409(A) and the LT8611,
according to the data returned from their ID buffers.

[1] https://www.ti.com/lit/ds/symlink/sn75dp149.pdf

Cc: stable@vger.kernel.org
Signed-off-by: Simon Rettberg <simon.rettberg@rz.uni-freiburg.de>
Reviewed-by: Rafael Gieschke <rafael.gieschke@rz.uni-freiburg.de>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006113314.41101987@computer
Acked-by: Jani Nikula <jani.nikula@intel.com>
2022-11-15 23:31:02 +02:00
Evan Quan 4b14841c9a drm/amd/pm: fix SMU13 runpm hang due to unintentional workaround
The workaround designed for some specific ASICs is wrongly applied
to SMU13 ASICs. That leads to some runpm hang.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2022-11-15 13:29:06 -05:00
Evan Quan df7c013efc drm/amd/pm: enable runpm support over BACO for SMU13.0.7
Enable SMU13.0.7 runpm support.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-11-15 13:28:38 -05:00
Evan Quan 8652da45d0 drm/amd/pm: enable runpm support over BACO for SMU13.0.0
Enable SMU13.0.0 runpm support.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
2022-11-15 13:28:09 -05:00
Alex Deucher f8794f31ab drm/amdgpu: there is no vbios fb on devices with no display hw (v2)
If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.

v2: move the check into common code

Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:24:40 -05:00
Eric Huang 6f9eea4392 drm/amdkfd: Fix a memory limit issue
It is to resolve a regression, which fails to allocate
VRAM due to no free memory in application, the reason
is we add check of vram_pin_size for memory limit, and
application is pinning the memory for Peerdirect, KFD
should not count it in memory limit. So removing
vram_pin_size will resolve it.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-11-15 13:24:32 -05:00