Cross-driver (xe-core) Changes:
- Require BMG scanout buffers to be 64k physically aligned (Maarten)
Core (drm) Changes:
- Introducing Xe2 ccs modifiers for integrated and discrete graphics (Juha-Pekka)
Driver Changes:
- General cleanup and more work moving towards intel_display isolation (Jani)
- New display workaround (Suraj)
- Use correct cp_irq_count on HDCP (Suraj)
- eDP PSR fix when CRC is enabled (Jouni)
- Fix DP MST state after a sink reset (Imre)
- Fix Arrow Lake GSC firmware version (John)
- Use chained DSBs for LUT programming (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZtCC0lJ0Zf3MoSdW@intel.com
UAPI Changes:
- Fix OA format masks which were breaking build with gcc-5
Cross-subsystem Changes:
Driver Changes:
- Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost)
- Refactor hw engine lookup and mmio access to be used in more places
(Dominik, Matt Auld, Mika Kuoppala)
- Enable priority mem read for Xe2 and later (Pallavi Mishra)
- Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik)
- Fix refcount and speedup devcoredump (Matthew Brost)
- Add performance tuning changes to Xe2 (Akshata, Shekhar)
- Fix OA sysfs entry (Ashutosh)
- Add first GuC firmware support for BMG (Julia)
- Bump minimum GuC firmware for platforms under force_probe to match LNL
and BMG (Julia)
- Fix access check on user fence creation (Nirmoy)
- Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas)
- Document workaround and use proper WA infra (Matt Roper)
- Fix VF configuration on media GT (Michal Wajdeczko)
- Fix VM dma-resv lock (Matthew Brost)
- Allow suspend/resume exec queue backend op to be called multiple times
(Matthew Brost)
- Add GT stats to debugfs (Nirmoy)
- Add hwconfig to debugfs (Matt Roper)
- Compile out all debugfs code with ONFIG_DEUBG_FS=n (Lucas)
- Remove dead kunit code (Jani Nikula)
- Refactor drvdata storing to help display (Jani Nikula)
- Cleanup unsused xe parameter in pte handling (Himal)
- Rename s/enable_display/probe_display/ for clarity (Lucas)
- Fix missing MCR annotation in couple of registers (Tejas)
- Fix DGFX display suspend/resume (Maarten)
- Prepare exec_queue_kill for PXP handling (Daniele)
- Fix devm/drmm issues (Daniele, Matthew Brost)
- Fix tile and ggtt fini sequences (Matthew Brost)
- Fix crashes when probing without firmware in place (Daniele, Matthew Brost)
- Use xe_managed for kernel BOs (Daniele, Matthew Brost)
- Future-proof dss_per_group calculation by using hwconfig (Matt Roper)
- Use reserved copy engine for user binds on faulting devices
(Matthew Brost)
- Allow mixing dma-fence jobs and long-running faulting jobs (Francois)
- Cleanup redundant arg when creating use BO (Nirmoy)
- Prevent UAF around preempt fence (Auld)
- Fix display suspend/resume (Maarten)
- Use vma_pages() helper (Thorsten)
- Calculate pagefault queue size (Stuart, Matthew Auld)
- Fix missing pagefault wq destroy (Stuart)
- Fix lifetime handling of HW fence ctx (Matthew Brost)
- Fix order destroy order for jobs (Matthew Brost)
- Fix TLB invalidation for media GT (Matthew Brost)
- Document GGTT (Rodrigo Vivi)
- Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi)
- Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod)
- Drop unrequired NULL checks (Apoorva, Himal)
- Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström)
- Support "nomodeset" kernel command-line option (Thomas Zimmermann)
- Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/wd42jsh4i3q5zlrmi2cljejohdsrqc6hvtxf76lbxsp3ibrgmz@y54fa7wwxgsd
In order to avoid the DSB keeping the DEwake permanently
asserted we must clear DSB_PMCTRL_2.DSB_FORCE_DEWAKE once
we are done. For good measure do the same for
DSB_PMCTRL.DSB_ENABLE_DEWAKE.
Experimentally this doens't seem to be actually necessary
(unlike with DSB_FORCE_DEWAKE). That is, the DSB_ENABLE_DEWAKE
doesn't seem to do anything whenever the DSB is not active.
But I'd hate to waste a ton of power in case there I'm wrong
and there is some way DEwake could remaing asserted. One extra
register write is a small price to pay for some peace of mind.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-13-ville.syrjala@linux.intel.com
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Allow intel_dsb_chain() to start the chained DSB
at start of the undelaye vblank. This is slightly
more involved than simply setting the bit as we
must use the DEwake mechanism to eliminate pkgC
latency.
And DSB_ENABLE_DEWAKE itself is problematic in that
it allows us to configure just a single scanline,
and if the current scanline is already past that
DSB_ENABLE_DEWAKE won't do anything, rendering the
whole thing moot.
The current workaround involves checking the pipe's current
scanline with the CPU, and if it looks like we're about to
miss the configured DEwake scanline we set DSB_FORCE_DEWAKE
to immediately assert DEwake. This is somewhat racy since the
hardware is making progress all the while we're checking it on
the CPU.
We can make things less racy by chaining two DSBs and handling
the DSB_FORCE_DEWAKE stuff entirely without CPU involvement:
1. CPU starts the first DSB immediately
2. First DSB configures the second DSB, including its dewake_scanline
3. First DSB starts the second w/ DSB_WAIT_FOR_VBLANK
4. First DSB asserts DSB_FORCE_DEWAKE
5. First DSB waits until we're outside the dewake_scanline-vblank_start
window
6. First DSB deasserts DSB_FORCE_DEWAKE
That will guarantee that the we are fully awake when the second
DSB starts to actually execute.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-12-ville.syrjala@linux.intel.com
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Add functions to emit a DSB scanline window wait instructions.
We can either wait for the scanline to be IN the window
or OUT of the window.
The hardware doesn't handle wraparound so we must manually
deal with it by swapping the IN range to the inverse OUT
range, or vice versa.
Also add a bit of paranoia to catch the edge case of waiting
for the entire frame. That doesn't make sense since an IN
wait would be a nop, and an OUT wait would imply waiting
forever. Most of the time this also results in both scanline
ranges (original and inverted) to have lower=upper+1
which is nonsense from the hw POV.
For now we are only handling the case where the scanline wait
happens prior to latching the double buffered registers during
the commit (which might change the timings due to LRR/VRR/etc.)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-10-ville.syrjala@linux.intel.com
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
When determining various scanlines for DSB use we should take into
account whether VRR is active at the time when the DSB uses said
scanline information. For now all DSB scanline usage occurs prior
to the actual commit, so we only need to care about the state of
VRR at that time.
I've decided to move intel_crtc_scanline_to_hw() in its entirety
to the DSB code as it will also need to know the actual state
of VRR in order to do its job 100% correctly.
TODO: figure out how much of this could be moved to some
more generic place and perhaps be shared with the CPU
vblank evasion code/etc...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-8-ville.syrjala@linux.intel.com
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Currently we calculate the DEwake scanline based on
the delayed vblank start, while in reality it should be computed
based on the undelayed vblank start (as that is where the DSB
actually starts). Currently it doesn't really matter as we
don't have any vblank delay configured, but that may change
in the future so let's be accurate in what we do.
We can also remove the max() as intel_crtc_scanline_to_hw()
can deal with negative numbers, which there really shouldn't
be anyway.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-7-ville.syrjala@linux.intel.com
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Currently we switch from out software idea of a scanline
to the hw's idea of a scanline during the commit phase in
_intel_dsb_commit(). While that is slightly easier due to
fastsets fiddling with the timings, we'll also need to
generate proper hw scanline numbers already when emitting
DSB scanline wait instructions. So this approach won't
do in the future. Switch to hw scanline numbers earlier.
Also intel_dsb_dewake_scanline() itself already makes
some assumptions about VRR that don't take into account
VRR toggling during fastsets, so technically delaying
the sw->hw conversion doesn't even help us.
The other reason for delaying the conversion was that we
are using intel_get_crtc_scanline() during intel_dsb_commit()
which gives us the current sw scanline. But this is pretty
low level stuff anyway so just using raw PIPEDSL reads seems
fine here, and that of course gives us the hw scanline
directly, reducing the need to do so many conversions.
v2: Return the non-hw scanline from intel_dsb_dewake_scanline()
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624191032.27333-5-ville.syrjala@linux.intel.com
On ilk/snb the pipe may be configured to place the LUT before or
after the CSC depending on various factors, but as there is only
one LUT (no split mode like on IVB+) we only advertise a gamma_lut
and no degamma_lut in the uapi to avoid confusing userspace.
This can cause a problem during readout if the VBIOS/GOP enabled
the LUT in the pre CSC configuration. The current code blindly
assigns the results of the readout to the degamma_lut, which will
cause a failure during the next atomic_check() as we aren't expecting
anything to be in degamma_lut since it's not visible to userspace.
Fix the problem by assigning whatever LUT we read out from the
hardware into gamma_lut.
Cc: stable@vger.kernel.org
Fixes: d2559299d3 ("drm/i915: Make ilk_read_luts() capable of degamma readout")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11608
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710124137.16773-1-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
The function parameter out can be used directly instead of assigning it
to a temporary u64 variable first.
Remove the local variable tmp2 and use the parameter out directly as the
divisor in do_div() to remove the following Coccinelle/coccicheck
warning reported by do_div.cocci:
WARNING: do_div() does a 64-by-32 division, please consider using div64_u64 instead
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240710210034.796032-2-thorsten.blum@toblux.com
For non-d3cold-capable devices we'd like to be able to wake up the
device from reclaim. In particular, for Lunar Lake we'd like to be
able to blit CCS metadata to system at shrink time; at least from
kswapd where it's reasonable OK to wait for rpm resume and a
preceding rpm suspend.
Therefore use a separate lockdep map for such devices and prime it
reclaim-tainted.
v2:
- Rename lockmap acquire- and release functions. (Rodrigo Vivi).
- Reinstate the old xe_pm_runtime_lockdep_prime() function and
rename it to xe_rpm_might_enter_cb(). (Matthew Auld).
- Introduce a separate xe_pm_runtime_lockdep_prime function
called from module init for known required locking orders.
v3:
- Actually hook up the prime function at module init.
v4:
- Rebase.
v5:
- Don't use reclaim-safe RPM with sriov.
Cc: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Cc: "Auld, Matthew" <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826143450.92511-1-thomas.hellstrom@linux.intel.com
For CCS formats on affected platforms, CCS can be used freely, but
display engine requires a multiple of 64k physical pages. No other
changes are needed.
At the BO creation time we don't know if the BO will be used for CCS
or not. If the scanout flag is set, and the BO is a multiple of 64k,
we take the safe route and force the physical alignment of 64k pages.
If the BO is not a multiple of 64k, or the scanout flag was not set
at BO creation, we reject it for usage as CCS in display. The physical
pages are likely not aligned correctly, and this will cause corruption
when used as FB.
The scanout flag and size being a multiple of 64k are used together
to enforce 64k physical placement.
VM_BIND is completely unaffected, mappings to a VM can still be aligned
to 4k, just like for normal buffers.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240826170117.327709-3-maarten.lankhorst@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Need to take some Xe bo definition in here before
we can add the BMG display 64k aligned size restrictions.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Setting 'nomodeset' on the kernel command line disables all graphics
drivers with modesetting capabilities, leaving only firmware drivers,
such as simpledrm or efifb.
Most DRM drivers automatically support 'nomodeset' via DRM's module
helper macros. In xe, which uses regular module_init(), manually call
drm_firmware_drivers_only() to test for 'nomodeset'. Do not register
the driver if set.
v2:
- use xe's init table (Lucas)
- do NULL test for init/exit functions
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240827121003.97429-1-tzimmermann@suse.de
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
This reverts commit 8ebb1fc2e6.
The panel should be handled through the samsung-atna33xc20 driver for
correct power up timings. Otherwise the backlight does not work correctly.
We have existing users of this panel through the generic "edp-panel"
compatible (e.g. the Qualcomm X1E80100 CRD), but the screen works only
partially in that configuration: It works after boot but once the screen
gets disabled it does not turn on again until after reboot. It behaves the
same way with the default "conservative" timings, so we might as well drop
the configuration from the panel-edp driver. That way, users with old DTBs
will get a warning and can move to the new driver.
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715-x1e80100-crd-backlight-v2-2-31b7f2f658a3@linaro.org