For computing PCIe bandwidth in userspace and troubleshooting PCIe
bandwidth issues. Note that this intentionally fills holes and padding
in drm_amdgpu_info_device.
Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20790
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows to correlate the infos printed by
/sys/kernel/debug/dri/n/amdgpu_gem_info to the ones found
in /proc/.../fdinfo and /sys/kernel/debug/dma_buf/bufinfo.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add sdma ras function on sdma v6_0_3.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Laptops with APUs from a variety of manufacturers and generations
show a warning about a missing PSP runtime database.
As it's not required for PSP to dump this database into framebuffer,
decrease messages about it missing to debug.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only VCN0 supports AV1.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Always enable multipipe policy on ASICs with GC VERSION > 9.0.0
instead of MEC number > 1.
This will allow multipipe policy on ASICs with one MEC,
e.g., gfx11 APUs.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is only one MEC on these APUs.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The HW model validation that guards the indirect SRAM checking in the
VCN code path is redundant - there's no model that's not included in the
switch, making it useless in practice [0].
So, let's remove this switch statement for good.
[0] lore.kernel.org/amd-gfx/MN0PR12MB61013D20B8A2263B22AE1BCFE2C19@MN0PR12MB6101.namprd12.prod.outlook.com
Suggested-by: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: James Zhu <James.Zhu@amd.com>
Cc: Lazar Lijo <Lijo.Lazar@amd.com>
Cc: Leo Liu <leo.liu@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is an incredibly trivial fix, just for the sake of
"aesthetical" organization of the defines. Some were space based,
most were tab based and there was a lack of "alignment", now it's
all the same and aligned.
Cc: James Zhu <James.Zhu@amd.com>
Cc: Lazar Lijo <Lijo.Lazar@amd.com>
Cc: Leo Liu <leo.liu@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Several source files include drm_crtc_helper.h without needing it or
only to get its transitive include statements; leading to unnecessary
compile-time dependencies.
Directly include required headers and drop drm_crtc_helper.h where
possible.
v2:
* keep includes sorted in amdgpu_device.c (Sam)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230116131235.18917-4-tzimmermann@suse.de
The problem is that base sched hasn't been assigned yet at this moment,
causing something like "ring=0" all the time from trace.
mpv:cs0-3473 [002] ..... 129.047431: amdgpu_cs: ring=0, dw=48, fences=0
mpv:cs0-3473 [002] ..... 129.089125: amdgpu_cs: ring=0, dw=48, fences=0
mpv:cs0-3473 [002] ..... 129.130987: amdgpu_cs: ring=0, dw=48, fences=0
mpv:cs0-3473 [002] ..... 129.172478: amdgpu_cs: ring=0, dw=48, fences=0
Fixes: 4624459c84 ("drm/amdgpu: add gang submit frontend v6")
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is xgmi3x16 pcs error status for aldebaran, driver should
check xgmi3x16 pcs error status field instead of gopx16 pcs error
status field.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The pcs error count should be determined by PCS ERROR status and
PCS ERROR MASK registers, only PCS ERROR status register can not
refect error counts accurately.
Changed from V1:
remove clean noncorrectable mask registers
optimize query pcs error status
Changed from V2:
remove check mask_value bits
correct set value corresponding bit
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It can be that neither fence were initialized when we run out of UVD
streams for example.
v2: fix typo breaking compile
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2324
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use gfx ras common initialization interface to
initialize gfx ras block.
V2:
Update function call due to amdgpu_gfx_ras_sw_init
interface changes.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Align a closing brace and remove trailing whitespaces. No functional
changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If early init fails for a single IP block, then no further IP blocks
are evaluated. This means that if a user was missing more than one
firmware binary they would have to keep adding binaries and re-probing
until they discovered the ones missing.
To make this easier, run early init for each IP block and report a single
failure if not all passed.
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is already a "default" case in the switch block, so there is
no need to have a break after the switch block.
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to reset this or otherwise run into list corruption later on.
Fixes: e44a0fe630 ("drm/amdgpu: rework reserved VMID handling")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]:
Amdgpu ras uses amdgpu_ras_is_supported to check whether
the ras block supports the ras function. amdgpu_ras_is_supported
uses .ras_enabled to determine whether the ras function of the
block is enabled.
But for special asic with mem ecc enabled but sram ecc not
enabled, some ras blocks support poison mode but their ras function
is not enabled on .ras_enabled, these ras blocks will run abnormally.
[How]:
If the ras block is not supported on .ras_enabled but the asic
supports poison mode and the ras block has ras configuration, it
can be considered that the ras block supports ras function.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]:
For special asic with mem ecc enabled but sram ecc
not enabled, some ras blocks can register their ras
configuration to ras list, but these ras blocks are not
enabled on .ras_enabled, so it can not get ras block
object using amdgpu_ras_get_ras_block.
[How]:
Remove ras block support check. Even if the ras block
checked is not in the ras list, it will return a null
pointer and will have no effect.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Perform gpu reset after gfx finishes processing
ras poison consumption on gfx_v11_0_3.
V2:
Move gfx poison consumption handler from hw_ops to ip
function level.
V3:
Adjust the calling position of amdgpu_gfx_poison_consumation_handler.
V4:
Since gfx v11_0_3 does not have .hw_ops instance, the .hw_ops null
pointer check in amdgpu_ras_interrupt_poison_consumption_handler
needs to be adjusted.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add gfx ras function on gfx v11_0_3.
V2:
1. Add separate source files for gfx v11_0_3.
2. Create a common function to initialize gfx ras block.
V3:
1. Rename amdgpu_gfx_ras_block_init to amdgpu_gfx_ras_sw_init.
2. Adjust the calling position of amdgpu_gfx_ras_sw_init.
3. Remove gfx_v11_0_3_ras_ops.
V4:
Revert changes in amdgpu_ras_interrupt_poison_consumption_handler.
V5:
1. Remove invalid include file in gfx_v11_0_3.c.
2. Reduce the number of parameters of amdgpu_gfx_ras_sw_init.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The existing codebase never had a case for detecting MP0 version on
Renoir and instead relied upon hardcoded chip name. This was missed as
part of the changes to migrate all IP blocks to build filenames from
`amdgpu_ucode.c`.
Consequently, Renoir tries to fetch a binary with 11_0_3 in the filename
and since it's supposed to have "renoir" in the filename fails to probe.
The fbdev still works though so the series worked.
Add a case for Renoir into the legacy table to ensure the right ASD and
TA firmware load again.
Reported-by: Ekene Akuneme <Ekene.Akuneme@amd.com>
Reported-by: Nicholas Choi <Nicholas.Choi@amd.com>
Cc: Alex Hung <Alex.Hung@amd.com>
Fixes: 994a97447e ("drm/amd: Parse both v1 and v2 TA microcode headers using same function")
Fixes: 54a3e03234 ("drm/amd: Add a legacy mapping to "amdgpu_ucode_ip_version_decode"")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This tab was deleted accidentally and triggers a Smatch warning:
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:1006 gfx_v8_0_init_microcode()
warn: inconsistent indenting
Add it back.
Fixes: 0aaafb7359 ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX8")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UAPI Changes:
* fourcc: Document Open Source user waiver
Cross-subsystem Changes:
* firmware: fix color-format selection for system framebuffers
Core Changes:
* format-helper: Add conversion from XRGB8888 to various sysfb formats;
Make XRGB8888 the only driver-emulated legacy format
* fb-helper: Avoid blank consoles from selecting an incorrect color format
* probe-helper: Enable/disable HPD on connectors plus driver updates
* Use drm_dbg_ helpers in several places
* docs: Document defaults for CRTC backgrounds; Document use of drm_minor
Driver Changes:
* arm/hdlcd: Use new debugfs helpers
* gud: Use new debugfs helpers
* panel: Support Visionox VTDR6130 AMOLED DSI; Support Himax HX8394; Convert
many drivers to common generic DSI write-sequence helper
* v3d: Do not opencode drm_gem_object_lookup()
* vc4: Various HVS an CRTC fixes
* vkms: Fix SEGFAULT from incorrect GEM-buffer mapping
* Convert various drivers to i2c probe_new()
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmPAA0AACgkQaA3BHVML
eiMKNAgAoXCFd7YMKoHSDlmAOztP+0qdZWM1LkpyC3fTpYbbNC1xI8yn7TVZIbgY
qtkceCUAGmzuO5kTo0aSZdgPWvdWCuDmYBQsX/VwACchSA0aOKZcCaqXGi2Y39D6
G4xsHJyRhU/lWzn/GZQ4+l46m6NAlnrLFBfuLlOHDhYtYZq9D8gXEDm7NSl8/HGF
OjtfSFINKKtOCjvi5GAa8CfnE8GVp59paFKJuj8xU6Dqqlorff+XroWkQG8LV4qi
3UyShQL6d/2hiRQ7fIhXEGCKit7rJkEpfsLvSqJkTkzOKyjhnIZbJIuesgqhtbEH
OpWsmUVKNphi0Y5zw5ZiTqrSy0Py7g==
=+mHi
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2023-01-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.3:
UAPI Changes:
* fourcc: Document Open Source user waiver
Cross-subsystem Changes:
* firmware: fix color-format selection for system framebuffers
Core Changes:
* format-helper: Add conversion from XRGB8888 to various sysfb formats;
Make XRGB8888 the only driver-emulated legacy format
* fb-helper: Avoid blank consoles from selecting an incorrect color format
* probe-helper: Enable/disable HPD on connectors plus driver updates
* Use drm_dbg_ helpers in several places
* docs: Document defaults for CRTC backgrounds; Document use of drm_minor
Driver Changes:
* arm/hdlcd: Use new debugfs helpers
* gud: Use new debugfs helpers
* panel: Support Visionox VTDR6130 AMOLED DSI; Support Himax HX8394; Convert
many drivers to common generic DSI write-sequence helper
* v3d: Do not opencode drm_gem_object_lookup()
* vc4: Various HVS an CRTC fixes
* vkms: Fix SEGFAULT from incorrect GEM-buffer mapping
* Convert various drivers to i2c probe_new()
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y8ADeSzZDj+tpibF@linux-uq9g
amd-drm-next-6.3-2023-01-13:
amdgpu:
- Fix possible segfault in failure case
- Rework FW requests to happen in early_init for all IPs so
that we don't lose the sbios console if FW is missing
- PSR fixes
- Misc cleanups
- Unload fix
- SMU13 fixes
amdkfd:
- Fix for cleared VRAM BOs
- Fix cleanup if GPUVM creation fails
- Memory accounting fix
- Use resource_size rather than open codeing it
- GC11 mGPU fix
radeon:
- Fix memory leak on shutdown
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230113225911.7776-1-alexander.deucher@amd.com
[Why]
SDMA0_CNTL and MMHUB system aperture related registers are blocked by L1 Policy.
Therefore, they cannot be accessed by VF and loged in violation.
[How]
For MMHUB registers, they will be programmed by PF. So VF will skip to program them in mmhubv3_0.
For SDMA0_CNTL which is a PF_only register, VF don't need to program it in sdma_v6_0.
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some dead code was introduced as part of utilizing the `amdgpu_ucode_*`
helpers. Adjust the control flow to make sure that firmware is released
in the appropriate error flows.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1530548 ("Control flow issues")
Fixes: ec787deb2d ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX9")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Including <drm/drm_fb_helper.h> is not required, so remove the include
statements. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20230111130206.29974-8-tzimmermann@suse.de
Include <linux/backlight.h> in source files that need it. Some of
DRM's source code gets the backlight header via drm_crtc_helper.h
and <linux/fb.h>, which can leed to unnecessary recompilation. If
possible, do not include drm_crtc_helper.h any longer.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Christian König <christian.koenig@amd.com> # amd
Acked-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20230111130206.29974-2-tzimmermann@suse.de
Use page aligned size to reserve memory usage because page aligned TTM
BO size is used to unreserve memory usage, otherwise no page aligned
size causes memory usage accounting unbalanced.
Change vram_used definition type to int64_t to be able to trigger
WARN_ONCE(adev && adev->kfd.vram_used < 0, "..."), to help debug the
accounting issue with warning and backtrace.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If acquire_vm failed when initializing KFD vm, set vm->process_info to
NULL and free process info, otherwise, the future acquire_vm will
always fail as vm->process_info is not NULL.
Pass avm as parameter to remove the duplicate code.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The point bo->kfd_bo is NULL for queue's write pointer BO
when creating queue on mGPU. To avoid using the pointer
fixes the error.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No consumers outside of amdgpu_ucode.c use amdgpu_ucode_validate
anymore, so make the function static.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.
The `amdgpu_ucode_release` helper is for symmetry on unloading.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.
The `amdgpu_ucode_release` helper is for symmetry on unloading.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The `amdgpu_ucode_request` helper will ensure that the return code for
missing firmware is -ENODEV so that early_init can fail.
The `amdgpu_ucode_release` helper is for symmetry on unloading.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>