linux

Author	SHA1	Message	Date
ZhenGuo Yin	5659b0c93a	drm/amdgpu: reset vm state machine after gpu reset(vram lost) [Why] Page table of compute VM in the VRAM will lost after gpu reset. VRAM won't be restored since compute VM has no shadows. [How] Use higher 32-bit of vm->generation to record a vram_lost_counter. Reset the VM state machine when vm->genertaion is not equal to the new generation token. v2: Check vm->generation instead of calling drm_sched_entity_error in amdgpu_vm_validate. v3: Use new generation token instead of vram_lost_counter for check. Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit `47c0388b05`)	2024-07-24 17:30:49 -04:00
Tim Huang	fab1ead0ae	drm/amdgpu: add missed harvest check for VCN IP v4/v5 To prevent below probe failure, add a check for models with VCN IP v4.0.6 where VCN1 may be harvested. v2: Apply the same check to VCN IP v4.0 and v5.0. [ 54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu] [ 54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43 c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02 00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00 [ 54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286 [ 54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX: 0000000000000000 [ 54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI: ffff99a82f680000 [ 54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09: 0000000000000000 [ 54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12: 0000000000000000 [ 54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15: 0000000000000001 [ 54.075400] FS: 00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000) knlGS:0000000000000000 [ 54.075988] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4: 0000000000750ef0 [ 54.076927] PKRU: 55555554 [ 54.077132] Call Trace: [ 54.077319] <TASK> [ 54.077484] ? show_regs+0x69/0x80 [ 54.077747] ? __die+0x28/0x70 [ 54.077979] ? page_fault_oops+0x180/0x4b0 [ 54.078286] ? do_user_addr_fault+0x2d2/0x680 [ 54.078610] ? exc_page_fault+0x84/0x190 [ 54.078910] ? asm_exc_page_fault+0x2b/0x30 [ 54.079224] ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu] [ 54.079941] ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu] [ 54.080617] vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu] [ 54.081316] amdgpu_device_ip_set_powergating_state+0x64/0xc0 [amdgpu] [ 54.082057] amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu] [ 54.082727] amdgpu_ring_alloc+0x44/0x70 [amdgpu] [ 54.083351] amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu] [ 54.084054] amdgpu_ring_test_helper+0x22/0x90 [amdgpu] [ 54.084698] vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu] [ 54.085307] amdgpu_device_init+0x1f96/0x2780 [amdgpu] [ 54.085951] amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu] [ 54.086591] amdgpu_pci_probe+0x19f/0x550 [amdgpu] [ 54.087215] local_pci_probe+0x48/0xa0 [ 54.087509] pci_device_probe+0xc9/0x250 [ 54.087812] really_probe+0x1a4/0x3f0 [ 54.088101] __driver_probe_device+0x7d/0x170 [ 54.088443] driver_probe_device+0x24/0xa0 [ 54.088765] __driver_attach+0xdd/0x1d0 [ 54.089068] ? __pfx___driver_attach+0x10/0x10 [ 54.089417] bus_for_each_dev+0x8e/0xe0 [ 54.089718] driver_attach+0x22/0x30 [ 54.090000] bus_add_driver+0x120/0x220 [ 54.090303] driver_register+0x62/0x120 [ 54.090606] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu] [ 54.091255] __pci_register_driver+0x62/0x70 [ 54.091593] amdgpu_init+0x67/0xff0 [amdgpu] [ 54.092190] do_one_initcall+0x5f/0x330 [ 54.092495] do_init_module+0x68/0x240 [ 54.092794] load_module+0x201c/0x2110 [ 54.093093] init_module_from_file+0x97/0xd0 [ 54.093428] ? init_module_from_file+0x97/0xd0 [ 54.093777] idempotent_init_module+0x11c/0x2a0 [ 54.094134] __x64_sys_finit_module+0x64/0xc0 [ 54.094476] do_syscall_64+0x58/0x120 [ 54.094767] entry_SYSCALL_64_after_hwframe+0x6e/0x76 Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit `0b071245dd`)	2024-07-24 17:30:23 -04:00
Stanley.Yang	1a8825259a	drm/amdgpu: Fix eeprom max record count The eeprom table is empty before initializing, set eeprom table version first before initializing. Changed from V1: Reuse amdgpu_ras_set_eeprom_table_version function Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `015b8a2fdf`)	2024-07-24 17:30:23 -04:00
YiPeng Chai	afac8c6554	drm/amdgpu: fix ras UE error injection failure issue The ras command shared memory is allocated from VRAM and the response status of the command buffer will not be zero due to gpu being in fatal error state after ras UE error injection. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8284951a6e`)	2024-07-24 17:30:23 -04:00
Rodrigo Siqueira	5302d1a06a	drm/amd/display: Remove ASSERT if significance is zero in math_ceil2 In the DML math_ceil2 function, there is one ASSERT if the significance is equal to zero. However, significance might be equal to zero sometimes, and this is not an issue for a ceil function, but the current ASSERT will trigger warnings in those cases. This commit removes the ASSERT if the significance is equal to zero to avoid unnecessary noise. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `332315885d`)	2024-07-24 17:30:23 -04:00
Sung Joon Kim	4ab68e168a	drm/amd/display: Check for NULL pointer [why & how] Need to make sure plane_state is initialized before accessing its members. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `295d91cbc7`)	2024-07-24 17:30:23 -04:00
Jane Jian	6728f55590	drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF For VCN/JPEG 4.0.3, use only the local addressing scheme. - Mask bit higher than AID0 range v2 remain the case for mmhub use master XCC Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `caaf576292`)	2024-07-24 17:30:23 -04:00
Lijo Lazar	485432d090	drm/amdgpu: Add empty HDP flush function to VCN v4.0.3 VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch will do the HDP flush. This change is necessary for VCN v4.0.3, no need for backward compatibility Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `49cfaebe48`)	2024-07-24 17:30:23 -04:00
Lijo Lazar	23df34997d	drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3 JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead, mmsch fw will do the flush. This change is necessary for JPEG v4.0.3, no need for backward compatibility Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `585e3fdb36`)	2024-07-24 17:30:23 -04:00
Ma Ke	df65aabef3	drm/amd/amdgpu: Fix uninitialized variable warnings Return 0 to avoid returning an uninitialized variable r. Cc: stable@vger.kernel.org Fixes: `230dd6bb61` ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `6472de66c0`)	2024-07-24 17:30:23 -04:00
David Belanger	73048bda46	drm/amdgpu: Fix atomics on GFX12 If PCIe supports atomics, configure register to prevent DF from breaking atomics in separate load/store operations. Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `666f14cab2`)	2024-07-24 17:30:23 -04:00
Alex Deucher	a03ebf1163	drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell We seem to have a case where SDMA will sometimes miss a doorbell if GFX is entering the powergating state when the doorbell comes in. To workaround this, we can update the wptr via MMIO, however, this is only safe because we disallow gfxoff in begin_ring() for SDMA 5.2 and then allow it again in end_ring(). Enable this workaround while we are root causing the issue with the HW team. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440 Tested-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit `f2ac526349`)	2024-07-24 17:29:57 -04:00
Alex Deucher	e3615bd198	drm/amd/display: fix corruption with high refresh rates on DCN 3.0 This reverts commit `bc87d666c0` and the register changes from commit `6d4279cb99`. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3412 Cc: mikhail.v.gavrilov@gmail.com Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.10.x	2024-07-17 17:41:28 -04:00
Rodrigo Siqueira	d938ec1a12	drm/amd/display: Add simple struct doc to remove doc build warning This commit is a part of a series that addresses the following build warning for opp: ./drivers/gpu/drm/amd/display/dc/inc/hw/opp.h:1: warning: no structured comments found ./drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h:1: warning: no structured comments found This commit fixes this issue by adding a simple kernel-doc to a struct in the opp.h and the dpp.h files. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:45:09 -04:00
Rodrigo Siqueira	4cf300f604	drm/amd/display: Move DIO documentation to the right place When building the kernel-doc, it complains with the below warning: ./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found ./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found This warning was caused by the wrong use of the ':export:' and the lack of function documentation in the file pointed under the ':internal:'. This commit addresses those issues by relocating the overview documentation to the correct C file, removing the ':export:' options, and adding two simple kernel-doc to ensure that ':internal:' does not have any warning. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/dri-devel/20240715085918.68f5ecc9@canb.auug.org.au/ Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:49 -04:00
Li Ma	5ae8fb9712	drm/amd/swsmu: enable Pstates profile levels for SMU v14.0.4 Enables following UMD stable Pstates profile levels of power_dpm_force_performance_level for SMU v14.0.4. - profile_peak - profile_min_mclk - profile_min_sclk - profile_standard Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:43 -04:00
Tim Huang	2fdc99b96e	drm/amd/pm: early return if disabling DPMS for GFX IP v11.5.2 This was intended to add support for GFX IP v11.5.2, but it needs to be applied to all GFX11 and subsequent APUs. Therefore the code should be revised to accommodate this. Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:36 -04:00
YiPeng Chai	b3fb79cda5	drm/amdgpu: add mutex to protect ras shared memory Add mutex to protect ras shared memory. v2: Add TA_RAS_COMMAND__TRIGGER_ERROR command call status check. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:30 -04:00
Roman Li	0e2c796b49	drm/amd/display: Add function banner for idle_workqueue [Why] htmldocs warning: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h: warning: Function parameter or struct member 'idle_workqueue' not described in 'amdgpu_display_manager'. [How] Add comment section for idle_workqueue with param description. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/dri-devel/20240715090211.736a9b4d@canb.auug.org.au/ Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:20 -04:00
Alex Hung	6e169c7e0f	drm/amd/display: Add doc entry for program_3dlut_size Fixes the warning: Function parameter or struct member 'program_3dlut_size' not described in 'mpc_funcs' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/dri-devel/20240715090445.7e9387ec@canb.auug.org.au/ Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:10 -04:00
Boyuan Zhang	7d75ef3736	drm/amdgpu/vcn: not pause dpg for unified queue For unified queue, DPG pause for encoding is done inside VCN firmware, so there is no need to pause dpg based on ring type in kernel. For VCN3 and below, pausing DPG for encoding in kernel is still needed. v2: add more comments v3: update commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:44:04 -04:00
Boyuan Zhang	ecfa23c8df	drm/amdgpu/vcn: identify unified queue in sw init Determine whether VCN using unified queue in sw_init, instead of calling functions later on. v2: fix coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-16 11:43:48 -04:00
Aurabindo Pillai	7bbae44cf1	drm/amd/display: fix doc entry for bb_from_dmub Fixes the warning: Function parameter or struct member 'bb_from_dmub' not described in 'amdgpu_display_manager' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-15 17:41:23 -04:00
Aurabindo Pillai	28814be882	drm/amd: Bump KMS_DRIVER_MINOR version Increase the KMS minor version to indicate GFX12 DCC support since this contains a major change in how DCC is managed across IPs like GFX, DCN etc. This will be used mainly by userspace like Mesa to figure out DCC support on GFX12 hardware. v2: fix version number (Alex) Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-15 17:41:23 -04:00
Alex Deucher	1cff1010be	drm/amdgpu/mes12: add missing opcode string Fixes the indexing of the string array. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-12 11:46:46 -04:00
Alex Deucher	478cb8badf	drm/amdgpu/mes11: update opcode strings Add new packet. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-12 11:46:46 -04:00
Leo Li	7ed58b68ac	Revert "drm/amd/display: Reset freesync config before update new state" This change caused PSR SU panels to not read from their remote fb, preventing us from entering self-refresh. It is a regression. This reverts commit `eb6dfbb7a9`. Signed-off-by: Leo Li <sunpeng.li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dc1000bf46`)	2024-07-12 11:46:46 -04:00
Alex Deucher	8030f6533e	drm/amdgpu: remove exp hw support check for gfx12 Enable it by default. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 13:45:57 -04:00
YiPeng Chai	e23300dfff	drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed The problem case is as follows: 1. GPU A triggers a gpu ras reset, and GPU A drives GPU B to also perform a gpu ras reset. 2. After gpu B ras reset started, gpu B queried a DE data. Since the DE data was queried in the ras reset thread instead of the page retirement thread, bad page retirement work would not be triggered. Then even if all gpu resets are completed, the bad pages will be cached in RAM until GPU B's bad page retirement work is triggered again and then saved to eeprom. This patch can save the bad pages to eeprom in time after gpu ras reset is completed. v2: 1. Add the above description to code comments. 2. Reuse existing function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:13:41 -04:00
YiPeng Chai	c04706914d	drm/amdgpu: flush all cached ras bad pages to eeprom Before uninstalling gpu driver, flush all cached ras bad pages to eeprom. v2: Put the same code into a function and reuse the function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:13:35 -04:00
Sunil Khatri	c39385710c	drm/amdgpu: select compute ME engines dynamically GFX ME right now is one but this could change in future SOC's. Use no of ME for GFX as start point for ME for compute for GFX12. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:13:22 -04:00
Aurabindo Pillai	21e6f6085b	drm/amd/display: Allow display DCC for DCN401 To enable mesa to use display dcc, DM should expose them in the supported modifiers. Add the best (most efficient) modifiers first. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:13:10 -04:00
Sunil Khatri	a85cc86cce	drm/amdgpu: select compute ME engines dynamically GFX ME right now is one but this could change in future SOC's. Use no of ME for GFX as start point for ME for compute for GFX11. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:13:04 -04:00
Alex Deucher	7d570f56f1	drm/amdgpu/job: Replace DRM_INFO/ERROR logging Use the dev_info/err variants so we get per device logging in multi-GPU cases. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:12:55 -04:00
Sunil Khatri	948f2828a6	drm/amdgpu: select compute ME engines dynamically GFX ME right now is one but this could change in future SOC's. Use no of ME for GFX as start point for ME for compute for GFX10. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:12:48 -04:00
Danijel Slivka	708f220567	drm/amd/pm: Ignore initial value in smu response register Why: If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver load or driver unload, subsequent amdgpu driver load will fail at smu_hw_init. The default of mmMP1_SMN_C2PMSG_90 register at a clean environment is 0x1 and if value differs from expected, amdgpu driver load will fail. How to fix: Ignore the initial value in smu response register before the first smu message is sent,if smc in SMU_FW_INIT state, just proceed further to send the message. If register holds an unexpected value after smu message was sent set, smc_state to SMU_FW_HANG state and no further smu messages will be sent. v2: Set SMU_FW_INIT state at the start of smu hw_init/resume. Check smc_fw_state before sending smu message if in hang state skip sending message. Set SMU_FW_HANG only in case unexpected value is detected Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:12:37 -04:00
Lijo Lazar	d02ddefc7e	drm/amdgpu: Initialize VF partition mode For SOCs with GFX v9.4.3, a VF may have multiple compute partitions. Fetch the partition information during init and initialize partition nodes. There is no support to switch partition mode in VF mode, hence disable the same. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:12:28 -04:00
Gavin Wan	5d64af40e3	drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping. sdma has 2 instances in SRIOV cpx mode. Odd numbered VFs have sdma0/sdma1 instances. Even numbered vfs have sdma2/sdma3. For Even numbered vfs, the sdma2 & sdma3 (irq srouce id CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1. Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-10 10:12:19 -04:00
Zhigang Luo	ee98fb71ba	drm/amdgpu: set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to avoid reading wrong WPTR from doorbell in sriov vf, set CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD. Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:56:27 -04:00
Yang Wang	59f488be76	drm/amdgpu: add ras event state device attribute support add amdgpu ras 'event_state' sysfs device attribute support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:56:13 -04:00
Li Ma	1dd34092c1	drm/amd/swsmu: enable more Pstates profile levels for SMU v14.0.0 and v14.0.1 V1: This patch enables following UMD stable Pstates profile levels for power_dpm_force_performance_level interface. - profile_peak - profile_min_mclk - profile_min_sclk - profile_standard V2: Fix conflict with commit "drm/amd/pm: smu v14.0.4 reuse smu v14.0.0 dpmtable " V3: Add VCLK1 and DCLK1 support for SMU V14.0.1 And avoid to set VCLK1 and DCLK1 for SMU v14.0.0 Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:55:43 -04:00
Yang Wang	12b435a40c	drm/amdgpu: add ras POSION_CONSUMPTION event id support add amdgpu ras POSION_CONSUMPTION event id support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:55:37 -04:00
Stanley.Yang	91ba536ead	drm/amdkfd: Use mode1 reset for GFX v9.4.4 GFX v9.4.4 uses mode1 reset to handle poison consumption. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:55:26 -04:00
Yang Wang	5b9de2596f	drm/amdgpu: add ras POSION_CREATION event id support add amdgpu ras POSION_CREATION event id support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:55:18 -04:00
Yang Wang	75ac6a2506	drm/amdgpu: refine amdgpu ras event id core code v1: - use unified event id to manage ras events - add a new function amdgpu_ras_query_error_status_with_event() to accept event type as parameter. v2: add a warn log to show the location of function failure when calling amdgpu_ras_mark_event(). (Tao Zhou) v3: change RAS_EVENT_TYPE_ISR to RAS_EVENT_TYPE_FATAL. v4: rename amdgpu_ras_get_recovery_event() to amdgpu_ras_get_fatal_error_event(). Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:55:11 -04:00
Wayne Lin	e33697141b	drm/amd/display: Solve mst monitors blank out problem after resume [Why] In dm resume, we firstly restore dc state and do the mst resume for topology probing thereafter. If we change dpcd DP_MSTM_CTRL value after LT in mst reume, it will cause light up problem on the hub. [How] Revert commit `202dc359ad` ("drm/amd/display: Defer handling mst up request in resume"). And adjust the reason to trigger dc_link_detect by DETECT_REASON_RESUMEFROMS3S4. Cc: stable@vger.kernel.org Fixes: `202dc359ad` ("drm/amd/display: Defer handling mst up request in resume") Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Fangzhi Zuo <jerry.zuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:51:02 -04:00
Christian König	320debca1b	drm/amdgpu: reject gang submit on reserved VMIDs A gang submit won't work if the VMID is reserved and we can't flush out VM changes from multiple engines at the same time. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:50:53 -04:00
Saleemkhan Jamadar	b6ad109166	drm/amdgpu: enable dpg for vcn and jpeg on GC 11_5_2 DPG mode is enabled for vcn and jpeg on VCN v4_0_5 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:50:40 -04:00
Yang Wang	332210c13a	drm/amdgpu: remove redundant semicolons in RAS_EVENT_LOG remove redundant semicolons in RAS_EVENT_LOG to avoid code format check warning. Fixes: `b712d7c201` ("drm/amdgpu: fix compiler 'side-effect' check issue for RAS_EVENT_LOG()") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:47:43 -04:00
Frank Min	54837bd2be	drm/amdgpu: restore dcc bo tilling configs while moving While moving buffer which has dcc tiling config, it is needed to restore its original dcc tiling. 1. extend copy flag to cover tiling bits 2. add logic to restore original dcc tiling config Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-08 16:47:27 -04:00

1 2 3 4 5 ...

30669 Commits