Pull drm updates from Dave Airlie:
"Highlights are usual, more AMD IP blocks for future hw, i915/xe
changes, Displayport tunnelling support for i915, msm YUV over DP
changes, new tests for ttm, but its mostly a lot of stuff all over the
place from lots of people.
core:
- EDID cleanups
- scheduler error handling fixes
- managed: add drmm_release_action() with tests
- add ratelimited drm debug print
- DPCD PSR early transport macro
- DP tunneling and bandwidth allocation helpers
- remove built-in edids
- dp: Avoid AUX transfers on powered-down displays
- dp: Add VSC SDP helpers
cross drivers:
- use new drm print helpers
- switch to ->read_edid callback
- gem: add stats for shared buffers plus updates to amdgpu, i915, xe
syncobj:
- fixes to waiting and sleeping
ttm:
- add tests
- fix errno codes
- simplify busy-placement handling
- fix page decryption
media:
- tc358743: fix v4l device registration
video:
- move all kernel parameters for video behind CONFIG_VIDEO
sound:
- remove <drm/drm_edid.h> include from header
ci:
- add tests for msm
- fix apq8016 runner
efifb:
- use copy of global screen_info state
vesafb:
- use copy of global screen_info state
simplefb:
- fix logging
bridge:
- ite-6505: fix DP link-training bug
- samsung-dsim: fix error checking in probe
- samsung-dsim: add bsh-smm-s2/pro boards
- tc358767: fix regmap usage
- imx: add i.MX8MP HDMI PVI plus DT bindings
- imx: add i.MX8MP HDMI TX plus DT bindings
- sii902x: fix probing and unregistration
- tc358767: limit pixel PLL input range
- switch to new drm_bridge_read_edid() interface
panel:
- ltk050h3146w: error-handling fixes
- panel-edp: support delay between power-on and enable; use put_sync
in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49
V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings
- panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings
- panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings
- add BOE TH101MB31IG002-28A plus DT bindings
- add EDT ETML1010G3DRA plus DT bindings
- add Novatek NT36672E LCD DSI plus DT bindings
- nt36523: support 120Hz timings, fix includes
- simple: fix display timings on RK32FN48H
- visionox-vtdr6130: fix initialization
- add Powkiddy RGB10MAX3 plus DT bindings
- st7703: support panel rotation plus DT bindings
- add Himax HX83112A plus DT bindings
- ltk500hd1829: add support for ltk101b4029w and admatec 9904370
- simple: add BOE BP082WX1-100 8.2" panel plus DT bindings
panel-orientation-quirks:
- GPD Win Mini
amdgpu:
- Validate DMABuf imports in compute VMs
- Add RAS ACA framework
- PSP 13 fixes
- Misc code cleanups
- Replay fixes
- Atom interpreter PS, WS bounds checking
- DML2 fixes
- Audio fixes
- DCN 3.5 Z state fixes
- Remove deprecated ida_simple usage
- UBSAN fixes
- RAS fixes
- Enable seq64 infrastructure
- DC color block enablement
- Documentation updates
- DC documentation updates
- DMCUB updates
- ATHUB 4.1 support
- LSDMA 7.0 support
- JPEG DPG support
- IH 7.0 support
- HDP 7.0 support
- VCN 5.0 support
- SMU 13.0.6 updates
- NBIO 7.11 updates
- SDMA 6.1 updates
- MMHUB 3.3 updates
- DCN 3.5.1 support
- NBIF 6.3.1 support
- VPE 6.1.1 support
amdkfd:
- Validate DMABuf imports in compute VMs
- SVM fixes
- Trap handler updates and enhancements
- Fix cache size reporting
- Relocate the trap handler
radeon:
- Atom interpreter PS, WS bounds checking
- Misc code cleanups
xe:
- new query for GuC submission version
- Remove unused persistent exec_queues
- Add vram frequency sysfs attributes
- Add the flag XE_VM_BIND_FLAG_DUMPABLE
- Drop pre-production workarounds
- Drop kunit tests for unsupported platforms
- Start plumbing SR-IOV support with memory based interrupts for VF
- Allow mapping a BO in GGTT with a PAT index corresponding to XE_CACHE_UC,
to work with memory-based interrupts
- Add GuC Doorbells Manager as prep work for SR-IOV
- Implement additional workarounds for xe2 and MTL
- Program a few registers according to the performance guide spec for Xe2
- Fix remaining 32-bit build issues and re-enable the 32-bit build
- Fix build with CONFIG_DEBUG_FS=n
- Fix warnings from GuC ABI headers
- Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF
- Release mmap mappings on rpm suspend
- Disable mid-thread preemption when not properly supported by
hardware
- Fix xe_exec by reserving extra fence slot for CPU bind
- Fix xe_exec with full long running exec queue
- Canonicalize addresses where needed for Xe2 and add them to devcoredump
- Toggle USM support for Xe2
- Only allow 1 ufence per exec / bind IOCTL
- Add GuC firmware loading for Lunar Lake
- Add XE_VMA_PTE_64K VMA flag
i915:
- Add more ADL-N PCI IDs
- Enable fastboot also on older platforms
- Early transport for panel replay and PSR
- New ARL PCI IDs
- DP TPS4 PHY test pattern support
- Unify and improve VSC SDP for PSR and non-PSR cases
- Refactor memory regions and improve debug logging
- Rework global state serialization
- Remove unused CDCLK divider fields
- Unify HDCP connector logging format
- Use display instead of graphics version in display code
- Move VBT and opregion debugfs next to the implementation
- Abstract opregion interface, use opaque type
- MTL fixes
- HPD handling fixes
- Add GuC submission interface version query
- Atomically invalidate userptr on mmu-notifier
- Update handling of MMIO triggered reports
- Don't make assumptions about intel_wakeref_t type
- Extend driver code of Xe_LPG to Xe_LPG+
- Add flex arrays to struct i915_syncmap
- Allow for very slow HuC loading
- DP tunneling and bandwidth allocation support
msm:
- Correct bindings for MSM8976 and SM8650 platforms
- Start migration of MDP5 platforms to DPU driver
- X1E80100 MDSS support
- DPU:
- Improve DSC allocation, fixing several important corner cases
- Add support for SDM630/SDM660 platforms
- Simplify dpu_encoder_phys_ops
- Apply fixes targeting DSC support with a single DSC encoder
- Apply fixes for HCTL_EN timing configuration
- X1E80100 support
- Add support for YUV420 over DP
- GPU:
- fix sc7180 UBWC config
- fix a7xx LLC config
- new gpu support: a305B, a750, a702
- machine support: SM7150 (different power levels than other a618)
- a7xx devcoredump support
habanalabs:
- configure IRQ affinity according to NUMA node
- move HBM MMU page tables inside the HBM
- improve device reset
- check extended PCIe errors
ivpu:
- updates to firmware API
- refactor BO allocation
imx:
- use devm_ functions during init
hisilicon:
- fix EDID includes
mgag200:
- improve ioremap usage
- convert to struct drm_edid
- Work around PCI write bursts
nouveau:
- disp: use kmemdup()
- fix EDID includes
- documentation fixes
qaic:
- fixes to BO handling
- make use of DRM managed release
- fix order of remove operations
rockchip:
- analogix_dp: get encoder port from DT
- inno_hdmi: support HDMI for RK3128
- lvds: error-handling fixes
ssd130x:
- support SSD133x plus DT bindings
tegra:
- fix error handling
tilcdc:
- make use of DRM managed release
v3d:
- show memory stats in debugfs
- Support display MMU page size
vc4:
- fix error handling in plane prepare_fb
- fix framebuffer test in plane helpers
virtio:
- add venus capset defines
vkms:
- fix OOB access when programming the LUT
- Kconfig improvements
vmwgfx:
- unmap surface before changing plane state
- fix memory leak in error handling
- documentation fixes
- list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid
- fix null-pointer deref in execbuf
- refactor display-mode probing
- fix fencing for creating cursor MOBs
- fix cursor-memory lifetime
xlnx:
- fix live video input for ZynqMP DPSUB
lima:
- fix memory leak
loongson:
- fail if no VRAM present
meson:
- switch to new drm_bridge_read_edid() interface
renesas:
- add RZ/G2L DU support plus DT bindings
mxsfb:
- Use managed mode config
sun4i:
- HDMI: updates to atomic mode setting
mediatek:
- Add display driver for MT8188 VDOSYS1
- DSI driver cleanups
- Filter modes according to hardware capability
- Fix a null pointer crash in mtk_drm_crtc_finish_page_flip
etnaviv:
- enhancements for NPU and MRT support"
* tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits)
drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo
drm/amd/pm: wait for completion of the EnableGfxImu message
drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1
drm/amdgpu: add smu 14.0.1 support
drm/amdgpu: add VPE 6.1.1 discovery support
drm/amdgpu/vpe: add VPE 6.1.1 support
drm/amdgpu/vpe: don't emit cond exec command under collaborate mode
drm/amdgpu/vpe: add collaborate mode support for VPE
drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE
drm/amdgpu/vpe: add multi instance VPE support
drm/amdgpu/discovery: add nbif v6_3_1 ip block
drm/amdgpu: Add nbif v6_3_1 ip block support
drm/amdgpu: Add pcie v6_1_0 ip headers (v5)
drm/amdgpu: Add nbif v6_3_1 ip headers (v5)
arch/powerpc: Remove <linux/fb.h> from backlight code
macintosh/via-pmu-backlight: Include <linux/backlight.h>
fbdev/chipsfb: Include <linux/backlight.h>
drm/etnaviv: Restore some id values
drm/amdkfd: make kfd_class constant
drm/amdgpu: add ring timeout information in devcoredump
...
drivers/gpu/drm/xe/xe_exec_queue.c (900 lines, 22 KiB, C):
// SPDX-License-Identifier: MIT
/*
 * Copyright © 2021 Intel Corporation
 */

#include "xe_exec_queue.h"

#include <linux/nospec.h>

#include <drm/drm_device.h>
#include <drm/drm_file.h>
#include <drm/xe_drm.h>

#include "xe_device.h"
#include "xe_gt.h"
#include "xe_hw_engine_class_sysfs.h"
#include "xe_hw_fence.h"
#include "xe_lrc.h"
#include "xe_macros.h"
#include "xe_migrate.h"
#include "xe_pm.h"
#include "xe_ring_ops_types.h"
#include "xe_trace.h"
#include "xe_vm.h"

enum xe_exec_queue_sched_prop {
        XE_EXEC_QUEUE_JOB_TIMEOUT = 0,
        XE_EXEC_QUEUE_TIMESLICE = 1,
        XE_EXEC_QUEUE_PREEMPT_TIMEOUT = 2,
        XE_EXEC_QUEUE_SCHED_PROP_MAX = 3,
};

static int exec_queue_user_extensions(struct xe_device *xe, struct xe_exec_queue *q,
                                      u64 extensions, int ext_number, bool create);

static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
                                                   struct xe_vm *vm,
                                                   u32 logical_mask,
                                                   u16 width, struct xe_hw_engine *hwe,
                                                   u32 flags, u64 extensions)
{
        struct xe_exec_queue *q;
        struct xe_gt *gt = hwe->gt;
        int err;

        /* only kernel queues can be permanent */
        XE_WARN_ON((flags & EXEC_QUEUE_FLAG_PERMANENT) && !(flags & EXEC_QUEUE_FLAG_KERNEL));

        q = kzalloc(struct_size(q, lrc, width), GFP_KERNEL);
        if (!q)
                return ERR_PTR(-ENOMEM);

        kref_init(&q->refcount);
        q->flags = flags;
        q->hwe = hwe;
        q->gt = gt;
        q->class = hwe->class;
        q->width = width;
        q->logical_mask = logical_mask;
        q->fence_irq = &gt->fence_irq[hwe->class];
        q->ring_ops = gt->ring_ops[hwe->class];
        q->ops = gt->exec_queue_ops;
        INIT_LIST_HEAD(&q->compute.link);
        INIT_LIST_HEAD(&q->multi_gt_link);

        q->sched_props.timeslice_us = hwe->eclass->sched_props.timeslice_us;
        q->sched_props.preempt_timeout_us =
                                hwe->eclass->sched_props.preempt_timeout_us;
        q->sched_props.job_timeout_ms =
                                hwe->eclass->sched_props.job_timeout_ms;
        if (q->flags & EXEC_QUEUE_FLAG_KERNEL &&
            q->flags & EXEC_QUEUE_FLAG_HIGH_PRIORITY)
                q->sched_props.priority = XE_EXEC_QUEUE_PRIORITY_KERNEL;
        else
                q->sched_props.priority = XE_EXEC_QUEUE_PRIORITY_NORMAL;

        if (extensions) {
                /*
                 * may set q->usm, must come before xe_lrc_init(),
                 * may overwrite q->sched_props, must come before q->ops->init()
                 */
                err = exec_queue_user_extensions(xe, q, extensions, 0, true);
                if (err) {
                        kfree(q);
                        return ERR_PTR(err);
                }
        }

        if (vm)
                q->vm = xe_vm_get(vm);

        if (xe_exec_queue_is_parallel(q)) {
                q->parallel.composite_fence_ctx = dma_fence_context_alloc(1);
                q->parallel.composite_fence_seqno = XE_FENCE_INITIAL_SEQNO;
        }

        return q;
}

static void __xe_exec_queue_free(struct xe_exec_queue *q)
{
        if (q->vm)
                xe_vm_put(q->vm);
        kfree(q);
}

static int __xe_exec_queue_init(struct xe_exec_queue *q)
{
        struct xe_device *xe = gt_to_xe(q->gt);
        int i, err;

        for (i = 0; i < q->width; ++i) {
                err = xe_lrc_init(q->lrc + i, q->hwe, q, q->vm, SZ_16K);
                if (err)
                        goto err_lrc;
        }

        err = q->ops->init(q);
        if (err)
                goto err_lrc;

        /*
         * Normally the user vm holds an rpm ref to keep the device
         * awake, and the context holds a ref for the vm, however for
         * some engines we use the kernels migrate vm underneath which offers no
         * such rpm ref, or we lack a vm. Make sure we keep a ref here, so we
         * can perform GuC CT actions when needed. Caller is expected to have
         * already grabbed the rpm ref outside any sensitive locks.
         */
        if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
                drm_WARN_ON(&xe->drm, !xe_device_mem_access_get_if_ongoing(xe));

        return 0;

err_lrc:
        for (i = i - 1; i >= 0; --i)
                xe_lrc_finish(q->lrc + i);
        return err;
}

struct xe_exec_queue *xe_exec_queue_create(struct xe_device *xe, struct xe_vm *vm,
                                           u32 logical_mask, u16 width,
                                           struct xe_hw_engine *hwe, u32 flags,
                                           u64 extensions)
{
        struct xe_exec_queue *q;
        int err;

        q = __xe_exec_queue_alloc(xe, vm, logical_mask, width, hwe, flags,
                                  extensions);
        if (IS_ERR(q))
                return q;

        if (vm) {
                err = xe_vm_lock(vm, true);
                if (err)
                        goto err_post_alloc;
        }

        err = __xe_exec_queue_init(q);
        if (vm)
                xe_vm_unlock(vm);
        if (err)
                goto err_post_alloc;

        return q;

err_post_alloc:
        __xe_exec_queue_free(q);
        return ERR_PTR(err);
}

struct xe_exec_queue *xe_exec_queue_create_class(struct xe_device *xe, struct xe_gt *gt,
                                                 struct xe_vm *vm,
                                                 enum xe_engine_class class, u32 flags)
{
        struct xe_hw_engine *hwe, *hwe0 = NULL;
        enum xe_hw_engine_id id;
        u32 logical_mask = 0;

        for_each_hw_engine(hwe, gt, id) {
                if (xe_hw_engine_is_reserved(hwe))
                        continue;

                if (hwe->class == class) {
                        logical_mask |= BIT(hwe->logical_instance);
                        if (!hwe0)
                                hwe0 = hwe;
                }
        }

        if (!logical_mask)
                return ERR_PTR(-ENODEV);

        return xe_exec_queue_create(xe, vm, logical_mask, 1, hwe0, flags, 0);
}

void xe_exec_queue_destroy(struct kref *ref)
{
        struct xe_exec_queue *q = container_of(ref, struct xe_exec_queue, refcount);
        struct xe_exec_queue *eq, *next;

        xe_exec_queue_last_fence_put_unlocked(q);
        if (!(q->flags & EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD)) {
                list_for_each_entry_safe(eq, next, &q->multi_gt_list,
                                         multi_gt_link)
                        xe_exec_queue_put(eq);
        }

        q->ops->fini(q);
}

void xe_exec_queue_fini(struct xe_exec_queue *q)
{
        int i;

        for (i = 0; i < q->width; ++i)
                xe_lrc_finish(q->lrc + i);
        if (!(q->flags & EXEC_QUEUE_FLAG_PERMANENT) && (q->flags & EXEC_QUEUE_FLAG_VM || !q->vm))
                xe_device_mem_access_put(gt_to_xe(q->gt));
        __xe_exec_queue_free(q);
}

void xe_exec_queue_assign_name(struct xe_exec_queue *q, u32 instance)
{
        switch (q->class) {
        case XE_ENGINE_CLASS_RENDER:
                sprintf(q->name, "rcs%d", instance);
                break;
        case XE_ENGINE_CLASS_VIDEO_DECODE:
                sprintf(q->name, "vcs%d", instance);
                break;
        case XE_ENGINE_CLASS_VIDEO_ENHANCE:
                sprintf(q->name, "vecs%d", instance);
                break;
        case XE_ENGINE_CLASS_COPY:
                sprintf(q->name, "bcs%d", instance);
                break;
        case XE_ENGINE_CLASS_COMPUTE:
                sprintf(q->name, "ccs%d", instance);
                break;
        case XE_ENGINE_CLASS_OTHER:
                sprintf(q->name, "gsccs%d", instance);
                break;
        default:
                XE_WARN_ON(q->class);
        }
}

struct xe_exec_queue *xe_exec_queue_lookup(struct xe_file *xef, u32 id)
{
        struct xe_exec_queue *q;

        mutex_lock(&xef->exec_queue.lock);
        q = xa_load(&xef->exec_queue.xa, id);
        if (q)
                xe_exec_queue_get(q);
        mutex_unlock(&xef->exec_queue.lock);

        return q;
}

enum xe_exec_queue_priority
xe_exec_queue_device_get_max_priority(struct xe_device *xe)
{
        return capable(CAP_SYS_NICE) ? XE_EXEC_QUEUE_PRIORITY_HIGH :
                                       XE_EXEC_QUEUE_PRIORITY_NORMAL;
}

static int exec_queue_set_priority(struct xe_device *xe, struct xe_exec_queue *q,
                                   u64 value, bool create)
{
        if (XE_IOCTL_DBG(xe, value > XE_EXEC_QUEUE_PRIORITY_HIGH))
                return -EINVAL;

        if (XE_IOCTL_DBG(xe, value > xe_exec_queue_device_get_max_priority(xe)))
                return -EPERM;

        if (!create)
                return q->ops->set_priority(q, value);

        q->sched_props.priority = value;
        return 0;
}

static bool xe_exec_queue_enforce_schedule_limit(void)
{
#if IS_ENABLED(CONFIG_DRM_XE_ENABLE_SCHEDTIMEOUT_LIMIT)
        return true;
#else
        return !capable(CAP_SYS_NICE);
#endif
}

static void
xe_exec_queue_get_prop_minmax(struct xe_hw_engine_class_intf *eclass,
                              enum xe_exec_queue_sched_prop prop,
                              u32 *min, u32 *max)
{
        switch (prop) {
        case XE_EXEC_QUEUE_JOB_TIMEOUT:
                *min = eclass->sched_props.job_timeout_min;
                *max = eclass->sched_props.job_timeout_max;
                break;
        case XE_EXEC_QUEUE_TIMESLICE:
                *min = eclass->sched_props.timeslice_min;
                *max = eclass->sched_props.timeslice_max;
                break;
        case XE_EXEC_QUEUE_PREEMPT_TIMEOUT:
                *min = eclass->sched_props.preempt_timeout_min;
                *max = eclass->sched_props.preempt_timeout_max;
                break;
        default:
                break;
        }
#if IS_ENABLED(CONFIG_DRM_XE_ENABLE_SCHEDTIMEOUT_LIMIT)
        if (capable(CAP_SYS_NICE)) {
                switch (prop) {
                case XE_EXEC_QUEUE_JOB_TIMEOUT:
                        *min = XE_HW_ENGINE_JOB_TIMEOUT_MIN;
                        *max = XE_HW_ENGINE_JOB_TIMEOUT_MAX;
                        break;
                case XE_EXEC_QUEUE_TIMESLICE:
                        *min = XE_HW_ENGINE_TIMESLICE_MIN;
                        *max = XE_HW_ENGINE_TIMESLICE_MAX;
                        break;
                case XE_EXEC_QUEUE_PREEMPT_TIMEOUT:
                        *min = XE_HW_ENGINE_PREEMPT_TIMEOUT_MIN;
                        *max = XE_HW_ENGINE_PREEMPT_TIMEOUT_MAX;
                        break;
                default:
                        break;
                }
        }
#endif
}

static int exec_queue_set_timeslice(struct xe_device *xe, struct xe_exec_queue *q,
                                    u64 value, bool create)
{
        u32 min = 0, max = 0;

        xe_exec_queue_get_prop_minmax(q->hwe->eclass,
                                      XE_EXEC_QUEUE_TIMESLICE, &min, &max);

        if (xe_exec_queue_enforce_schedule_limit() &&
            !xe_hw_engine_timeout_in_range(value, min, max))
                return -EINVAL;

        if (!create)
                return q->ops->set_timeslice(q, value);

        q->sched_props.timeslice_us = value;
        return 0;
}

typedef int (*xe_exec_queue_set_property_fn)(struct xe_device *xe,
                                             struct xe_exec_queue *q,
                                             u64 value, bool create);

static const xe_exec_queue_set_property_fn exec_queue_set_property_funcs[] = {
        [DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY] = exec_queue_set_priority,
        [DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE] = exec_queue_set_timeslice,
};

static int exec_queue_user_ext_set_property(struct xe_device *xe,
                                            struct xe_exec_queue *q,
                                            u64 extension,
                                            bool create)
{
        u64 __user *address = u64_to_user_ptr(extension);
        struct drm_xe_ext_set_property ext;
        int err;
        u32 idx;

        err = __copy_from_user(&ext, address, sizeof(ext));
        if (XE_IOCTL_DBG(xe, err))
                return -EFAULT;

        if (XE_IOCTL_DBG(xe, ext.property >=
                         ARRAY_SIZE(exec_queue_set_property_funcs)) ||
            XE_IOCTL_DBG(xe, ext.pad) ||
            XE_IOCTL_DBG(xe, ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_PRIORITY &&
                         ext.property != DRM_XE_EXEC_QUEUE_SET_PROPERTY_TIMESLICE))
                return -EINVAL;

        idx = array_index_nospec(ext.property, ARRAY_SIZE(exec_queue_set_property_funcs));
        if (!exec_queue_set_property_funcs[idx])
                return -EINVAL;

        return exec_queue_set_property_funcs[idx](xe, q, ext.value, create);
}

typedef int (*xe_exec_queue_user_extension_fn)(struct xe_device *xe,
                                               struct xe_exec_queue *q,
                                               u64 extension,
                                               bool create);

static const xe_exec_queue_set_property_fn exec_queue_user_extension_funcs[] = {
        [DRM_XE_EXEC_QUEUE_EXTENSION_SET_PROPERTY] = exec_queue_user_ext_set_property,
};

#define MAX_USER_EXTENSIONS     16
static int exec_queue_user_extensions(struct xe_device *xe, struct xe_exec_queue *q,
                                      u64 extensions, int ext_number, bool create)
{
        u64 __user *address = u64_to_user_ptr(extensions);
        struct drm_xe_user_extension ext;
        int err;
        u32 idx;

        if (XE_IOCTL_DBG(xe, ext_number >= MAX_USER_EXTENSIONS))
                return -E2BIG;

        err = __copy_from_user(&ext, address, sizeof(ext));
        if (XE_IOCTL_DBG(xe, err))
                return -EFAULT;

        if (XE_IOCTL_DBG(xe, ext.pad) ||
            XE_IOCTL_DBG(xe, ext.name >=
                         ARRAY_SIZE(exec_queue_user_extension_funcs)))
                return -EINVAL;

        idx = array_index_nospec(ext.name,
                                 ARRAY_SIZE(exec_queue_user_extension_funcs));
        err = exec_queue_user_extension_funcs[idx](xe, q, extensions, create);
        if (XE_IOCTL_DBG(xe, err))
                return err;

        if (ext.next_extension)
                return exec_queue_user_extensions(xe, q, ext.next_extension,
                                                  ++ext_number, create);

        return 0;
}

static const enum xe_engine_class user_to_xe_engine_class[] = {
        [DRM_XE_ENGINE_CLASS_RENDER] = XE_ENGINE_CLASS_RENDER,
        [DRM_XE_ENGINE_CLASS_COPY] = XE_ENGINE_CLASS_COPY,
        [DRM_XE_ENGINE_CLASS_VIDEO_DECODE] = XE_ENGINE_CLASS_VIDEO_DECODE,
        [DRM_XE_ENGINE_CLASS_VIDEO_ENHANCE] = XE_ENGINE_CLASS_VIDEO_ENHANCE,
        [DRM_XE_ENGINE_CLASS_COMPUTE] = XE_ENGINE_CLASS_COMPUTE,
};

static struct xe_hw_engine *
find_hw_engine(struct xe_device *xe,
               struct drm_xe_engine_class_instance eci)
{
        u32 idx;

        if (eci.engine_class > ARRAY_SIZE(user_to_xe_engine_class))
                return NULL;

        if (eci.gt_id >= xe->info.gt_count)
                return NULL;

        idx = array_index_nospec(eci.engine_class,
                                 ARRAY_SIZE(user_to_xe_engine_class));

        return xe_gt_hw_engine(xe_device_get_gt(xe, eci.gt_id),
                               user_to_xe_engine_class[idx],
                               eci.engine_instance, true);
}

static u32 bind_exec_queue_logical_mask(struct xe_device *xe, struct xe_gt *gt,
                                        struct drm_xe_engine_class_instance *eci,
                                        u16 width, u16 num_placements)
{
        struct xe_hw_engine *hwe;
        enum xe_hw_engine_id id;
        u32 logical_mask = 0;

        if (XE_IOCTL_DBG(xe, width != 1))
                return 0;
        if (XE_IOCTL_DBG(xe, num_placements != 1))
                return 0;
        if (XE_IOCTL_DBG(xe, eci[0].engine_instance != 0))
                return 0;

        eci[0].engine_class = DRM_XE_ENGINE_CLASS_COPY;

        for_each_hw_engine(hwe, gt, id) {
                if (xe_hw_engine_is_reserved(hwe))
                        continue;

                if (hwe->class ==
                    user_to_xe_engine_class[DRM_XE_ENGINE_CLASS_COPY])
                        logical_mask |= BIT(hwe->logical_instance);
        }

        return logical_mask;
}

static u32 calc_validate_logical_mask(struct xe_device *xe, struct xe_gt *gt,
                                      struct drm_xe_engine_class_instance *eci,
                                      u16 width, u16 num_placements)
{
        int len = width * num_placements;
        int i, j, n;
        u16 class;
        u16 gt_id;
        u32 return_mask = 0, prev_mask;

        if (XE_IOCTL_DBG(xe, !xe_device_uc_enabled(xe) &&
                         len > 1))
                return 0;

        for (i = 0; i < width; ++i) {
                u32 current_mask = 0;

                for (j = 0; j < num_placements; ++j) {
                        struct xe_hw_engine *hwe;

                        n = j * width + i;

                        hwe = find_hw_engine(xe, eci[n]);
                        if (XE_IOCTL_DBG(xe, !hwe))
                                return 0;

                        if (XE_IOCTL_DBG(xe, xe_hw_engine_is_reserved(hwe)))
                                return 0;

                        if (XE_IOCTL_DBG(xe, n && eci[n].gt_id != gt_id) ||
                            XE_IOCTL_DBG(xe, n && eci[n].engine_class != class))
                                return 0;

                        class = eci[n].engine_class;
                        gt_id = eci[n].gt_id;

                        if (width == 1 || !i)
                                return_mask |= BIT(eci[n].engine_instance);
                        current_mask |= BIT(eci[n].engine_instance);
                }

                /* Parallel submissions must be logically contiguous */
                if (i && XE_IOCTL_DBG(xe, current_mask != prev_mask << 1))
                        return 0;

                prev_mask = current_mask;
        }

        return return_mask;
}

int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
                               struct drm_file *file)
{
        struct xe_device *xe = to_xe_device(dev);
        struct xe_file *xef = to_xe_file(file);
        struct drm_xe_exec_queue_create *args = data;
        struct drm_xe_engine_class_instance eci[XE_HW_ENGINE_MAX_INSTANCE];
        struct drm_xe_engine_class_instance __user *user_eci =
                u64_to_user_ptr(args->instances);
        struct xe_hw_engine *hwe;
        struct xe_vm *vm, *migrate_vm;
        struct xe_gt *gt;
        struct xe_exec_queue *q = NULL;
        u32 logical_mask;
        u32 id;
        u32 len;
        int err;

        if (XE_IOCTL_DBG(xe, args->flags) ||
            XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
                return -EINVAL;

        len = args->width * args->num_placements;
        if (XE_IOCTL_DBG(xe, !len || len > XE_HW_ENGINE_MAX_INSTANCE))
                return -EINVAL;

        err = __copy_from_user(eci, user_eci,
                               sizeof(struct drm_xe_engine_class_instance) *
                               len);
        if (XE_IOCTL_DBG(xe, err))
                return -EFAULT;

        if (XE_IOCTL_DBG(xe, eci[0].gt_id >= xe->info.gt_count))
                return -EINVAL;

        if (eci[0].engine_class == DRM_XE_ENGINE_CLASS_VM_BIND) {
                for_each_gt(gt, xe, id) {
                        struct xe_exec_queue *new;
                        u32 flags;

                        if (xe_gt_is_media_type(gt))
                                continue;

                        eci[0].gt_id = gt->info.id;
                        logical_mask = bind_exec_queue_logical_mask(xe, gt, eci,
                                                                    args->width,
                                                                    args->num_placements);
                        if (XE_IOCTL_DBG(xe, !logical_mask))
                                return -EINVAL;

                        hwe = find_hw_engine(xe, eci[0]);
                        if (XE_IOCTL_DBG(xe, !hwe))
                                return -EINVAL;

                        /* The migration vm doesn't hold rpm ref */
                        xe_device_mem_access_get(xe);

                        flags = EXEC_QUEUE_FLAG_VM | (id ? EXEC_QUEUE_FLAG_BIND_ENGINE_CHILD : 0);

                        migrate_vm = xe_migrate_get_vm(gt_to_tile(gt)->migrate);
                        new = xe_exec_queue_create(xe, migrate_vm, logical_mask,
                                                   args->width, hwe, flags,
                                                   args->extensions);

                        xe_device_mem_access_put(xe); /* now held by engine */

                        xe_vm_put(migrate_vm);
                        if (IS_ERR(new)) {
                                err = PTR_ERR(new);
                                if (q)
                                        goto put_exec_queue;
                                return err;
                        }
                        if (id == 0)
                                q = new;
                        else
                                list_add_tail(&new->multi_gt_list,
                                              &q->multi_gt_link);
                }
        } else {
                gt = xe_device_get_gt(xe, eci[0].gt_id);
                logical_mask = calc_validate_logical_mask(xe, gt, eci,
                                                          args->width,
                                                          args->num_placements);
                if (XE_IOCTL_DBG(xe, !logical_mask))
                        return -EINVAL;

                hwe = find_hw_engine(xe, eci[0]);
                if (XE_IOCTL_DBG(xe, !hwe))
                        return -EINVAL;

                vm = xe_vm_lookup(xef, args->vm_id);
                if (XE_IOCTL_DBG(xe, !vm))
                        return -ENOENT;

                err = down_read_interruptible(&vm->lock);
                if (err) {
                        xe_vm_put(vm);
                        return err;
                }

                if (XE_IOCTL_DBG(xe, xe_vm_is_closed_or_banned(vm))) {
                        up_read(&vm->lock);
                        xe_vm_put(vm);
                        return -ENOENT;
                }

                q = xe_exec_queue_create(xe, vm, logical_mask,
                                         args->width, hwe, 0,
                                         args->extensions);
                up_read(&vm->lock);
                xe_vm_put(vm);
                if (IS_ERR(q))
                        return PTR_ERR(q);

                if (xe_vm_in_preempt_fence_mode(vm)) {
                        q->compute.context = dma_fence_context_alloc(1);
                        spin_lock_init(&q->compute.lock);

                        err = xe_vm_add_compute_exec_queue(vm, q);
                        if (XE_IOCTL_DBG(xe, err))
                                goto put_exec_queue;
                }
        }

        mutex_lock(&xef->exec_queue.lock);
        err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL);
        mutex_unlock(&xef->exec_queue.lock);
        if (err)
                goto kill_exec_queue;

        args->exec_queue_id = id;

        return 0;

kill_exec_queue:
        xe_exec_queue_kill(q);
put_exec_queue:
        xe_exec_queue_put(q);
        return err;
}

int xe_exec_queue_get_property_ioctl(struct drm_device *dev, void *data,
                                     struct drm_file *file)
{
        struct xe_device *xe = to_xe_device(dev);
        struct xe_file *xef = to_xe_file(file);
        struct drm_xe_exec_queue_get_property *args = data;
        struct xe_exec_queue *q;
        int ret;

        if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
                return -EINVAL;

        q = xe_exec_queue_lookup(xef, args->exec_queue_id);
        if (XE_IOCTL_DBG(xe, !q))
                return -ENOENT;

        switch (args->property) {
        case DRM_XE_EXEC_QUEUE_GET_PROPERTY_BAN:
                args->value = !!(q->flags & EXEC_QUEUE_FLAG_BANNED);
                ret = 0;
                break;
        default:
                ret = -EINVAL;
        }

        xe_exec_queue_put(q);

        return ret;
}

/**
 * xe_exec_queue_is_lr() - Whether an exec_queue is long-running
 * @q: The exec_queue
 *
 * Return: True if the exec_queue is long-running, false otherwise.
 */
bool xe_exec_queue_is_lr(struct xe_exec_queue *q)
{
        return q->vm && xe_vm_in_lr_mode(q->vm) &&
                !(q->flags & EXEC_QUEUE_FLAG_VM);
}

static s32 xe_exec_queue_num_job_inflight(struct xe_exec_queue *q)
{
        return q->lrc->fence_ctx.next_seqno - xe_lrc_seqno(q->lrc) - 1;
}

/**
 * xe_exec_queue_ring_full() - Whether an exec_queue's ring is full
 * @q: The exec_queue
 *
 * Return: True if the exec_queue's ring is full, false otherwise.
 */
bool xe_exec_queue_ring_full(struct xe_exec_queue *q)
{
        struct xe_lrc *lrc = q->lrc;
        s32 max_job = lrc->ring.size / MAX_JOB_SIZE_BYTES;

        return xe_exec_queue_num_job_inflight(q) >= max_job;
}

/**
 * xe_exec_queue_is_idle() - Whether an exec_queue is idle.
 * @q: The exec_queue
 *
 * FIXME: Need to determine what to use as the short-lived
 * timeline lock for the exec_queues, so that the return value
 * of this function becomes more than just an advisory
 * snapshot in time. The timeline lock must protect the
 * seqno from racing submissions on the same exec_queue.
 * Typically vm->resv, but user-created timeline locks use the migrate vm
 * and never grabs the migrate vm->resv so we have a race there.
 *
 * Return: True if the exec_queue is idle, false otherwise.
 */
bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
{
        if (xe_exec_queue_is_parallel(q)) {
                int i;

                for (i = 0; i < q->width; ++i) {
                        if (xe_lrc_seqno(&q->lrc[i]) !=
                            q->lrc[i].fence_ctx.next_seqno - 1)
                                return false;
                }

                return true;
        }

        return xe_lrc_seqno(&q->lrc[0]) ==
                q->lrc[0].fence_ctx.next_seqno - 1;
}

void xe_exec_queue_kill(struct xe_exec_queue *q)
{
        struct xe_exec_queue *eq = q, *next;

        list_for_each_entry_safe(eq, next, &eq->multi_gt_list,
                                 multi_gt_link) {
                q->ops->kill(eq);
                xe_vm_remove_compute_exec_queue(q->vm, eq);
        }

        q->ops->kill(q);
        xe_vm_remove_compute_exec_queue(q->vm, q);
}

int xe_exec_queue_destroy_ioctl(struct drm_device *dev, void *data,
                                struct drm_file *file)
{
        struct xe_device *xe = to_xe_device(dev);
        struct xe_file *xef = to_xe_file(file);
        struct drm_xe_exec_queue_destroy *args = data;
        struct xe_exec_queue *q;

        if (XE_IOCTL_DBG(xe, args->pad) ||
            XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1]))
                return -EINVAL;

        mutex_lock(&xef->exec_queue.lock);
        q = xa_erase(&xef->exec_queue.xa, args->exec_queue_id);
        mutex_unlock(&xef->exec_queue.lock);
        if (XE_IOCTL_DBG(xe, !q))
                return -ENOENT;

        xe_exec_queue_kill(q);

        trace_xe_exec_queue_close(q);
        xe_exec_queue_put(q);

        return 0;
}

static void xe_exec_queue_last_fence_lockdep_assert(struct xe_exec_queue *q,
                                                    struct xe_vm *vm)
{
        if (q->flags & EXEC_QUEUE_FLAG_VM)
                lockdep_assert_held(&vm->lock);
        else
                xe_vm_assert_held(vm);
}

/**
 * xe_exec_queue_last_fence_put() - Drop ref to last fence
 * @q: The exec queue
 * @vm: The VM the engine does a bind or exec for
 */
void xe_exec_queue_last_fence_put(struct xe_exec_queue *q, struct xe_vm *vm)
{
        xe_exec_queue_last_fence_lockdep_assert(q, vm);

        if (q->last_fence) {
                dma_fence_put(q->last_fence);
                q->last_fence = NULL;
        }
}

/**
 * xe_exec_queue_last_fence_put_unlocked() - Drop ref to last fence unlocked
 * @q: The exec queue
 *
 * Only safe to be called from xe_exec_queue_destroy().
 */
void xe_exec_queue_last_fence_put_unlocked(struct xe_exec_queue *q)
{
        if (q->last_fence) {
                dma_fence_put(q->last_fence);
                q->last_fence = NULL;
        }
}

/**
 * xe_exec_queue_last_fence_get() - Get last fence
 * @q: The exec queue
 * @vm: The VM the engine does a bind or exec for
 *
 * Get last fence, takes a ref
 *
 * Returns: last fence if not signaled, dma fence stub if signaled
 */
struct dma_fence *xe_exec_queue_last_fence_get(struct xe_exec_queue *q,
                                               struct xe_vm *vm)
{
        struct dma_fence *fence;

        xe_exec_queue_last_fence_lockdep_assert(q, vm);

        if (q->last_fence &&
            test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &q->last_fence->flags))
                xe_exec_queue_last_fence_put(q, vm);

        fence = q->last_fence ? q->last_fence : dma_fence_get_stub();
        dma_fence_get(fence);
        return fence;
}

/**
 * xe_exec_queue_last_fence_set() - Set last fence
 * @q: The exec queue
 * @vm: The VM the engine does a bind or exec for
 * @fence: The fence
 *
 * Set the last fence for the engine. Increases reference count for fence, when
 * closing engine xe_exec_queue_last_fence_put should be called.
 */
void xe_exec_queue_last_fence_set(struct xe_exec_queue *q, struct xe_vm *vm,
                                  struct dma_fence *fence)
{
        xe_exec_queue_last_fence_lockdep_assert(q, vm);

        xe_exec_queue_last_fence_put(q, vm);
        q->last_fence = dma_fence_get(fence);
}