linux/drivers
Yunxiang Li 6e4aa08fa9 drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic
The retry loop for SRIOV reset has refcount and memory leak issues.
Depending on which function call fails, it can call
amdgpu_amdkfd_pre/post_reset a different number of times, which leaves
the kfd_locked count wrong. This blocks all future attempts at
opening /dev/kfd. The retry loop also leaks resources by calling
amdgpu_virt_init_data_exchange multiple times without calling the
corresponding fini function.

Align with the bare-metal reset path, which doesn't have these issues.
This means taking the amdgpu_amdkfd_pre/post_reset functions out of the
reset loop and calling amdgpu_device_pre_asic_reset on each retry, which
properly frees the resources from the previous try by calling
amdgpu_virt_fini_data_exchange.
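
The imbalance described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the driver code:
struct sim_dev and every sim_* function are hypothetical stand-ins for the
amdgpu functions named in this message, and the "leaky" loop shows just one
possible way the pre/post calls could become unbalanced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant driver state (not the real structs). */
struct sim_dev {
	int kfd_locked;     /* models the kfd_locked count */
	int exchange_bufs;  /* live data-exchange allocations */
	int fail_budget;    /* how many reset attempts fail before one succeeds */
};

/* Stand-ins for amdgpu_amdkfd_pre_reset / amdgpu_amdkfd_post_reset. */
static void sim_kfd_pre_reset(struct sim_dev *d)  { d->kfd_locked++; }
static void sim_kfd_post_reset(struct sim_dev *d) { d->kfd_locked--; }

/* Stand-ins for amdgpu_virt_init/fini_data_exchange. */
static void sim_init_data_exchange(struct sim_dev *d) { d->exchange_bufs++; }
static void sim_fini_data_exchange(struct sim_dev *d)
{
	if (d->exchange_bufs > 0)
		d->exchange_bufs--;
	assert(d->exchange_bufs >= 0);
}

/* Stand-in for amdgpu_device_pre_asic_reset: releases the previous try. */
static void sim_pre_asic_reset(struct sim_dev *d)
{
	sim_fini_data_exchange(d);
}

/* One reset attempt: sets up data exchange, then may report failure. */
static bool sim_reset_once(struct sim_dev *d)
{
	sim_init_data_exchange(d);
	if (d->fail_budget > 0) {
		d->fail_budget--;
		return false;
	}
	return true;
}

/* Assumed problematic shape: pre_reset inside the loop, post_reset only
 * reached on success, and no fini between attempts. */
static bool sim_reset_leaky(struct sim_dev *d, int max_retries)
{
	for (int i = 0; i <= max_retries; i++) {
		sim_kfd_pre_reset(d);
		if (!sim_reset_once(d))
			continue; /* a failed try skips post_reset */
		sim_kfd_post_reset(d);
		return true;
	}
	return false;
}

/* Shape described in this message: pre/post_reset outside the loop,
 * pre_asic_reset at the start of every try. */
static bool sim_reset_fixed(struct sim_dev *d, int max_retries)
{
	bool ok = false;

	sim_kfd_pre_reset(d);
	for (int i = 0; i <= max_retries && !ok; i++) {
		sim_pre_asic_reset(d);
		ok = sim_reset_once(d);
	}
	sim_kfd_post_reset(d);
	return ok;
}
```

With two failed attempts before a success, the leaky model ends with
kfd_locked at 2 and three live data-exchange allocations, while the fixed
model ends with kfd_locked at 0 and a single live allocation.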

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02 15:41:05 -04:00