When handling a GPU page fault addr_to_drm_mm_node() is used to
translate the GPU address to a buffer object. However it is possible for
the buffer object to be freed after the function has returned resulting
in a use-after-free of the BO.
Change addr_to_drm_mm_node to return the panfrost_gem_object with an
extra reference on it, preventing the BO from being freed until after
the page fault has been handled.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913160310.50444-1-steven.price@arm.com
The panfrost driver requests a supply using regulator_get_optional()
but both the name of the supply and the usage pattern suggest that it is
being used for the main power for the device and is not at all optional
for the device for function, there is no meaningful handling for absent
supplies. Such regulators should use the vanilla regulator_get()
interface, it will ensure that even if a supply is not described in the
system integration one will be provided in software.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904123032.23263-1-broonie@kernel.org
Currently the self refresh idle timer is a const set by the crtc. This
is fine if the self refresh entry/exit times are well-known for all
panels used on that crtc. However panels and workloads can vary quite a
bit, and a timeout which works well for one doesn't work well for
another.
In the extreme, if the timeout is too short we could get in a situation
where the self refresh exits are taking so long we queue up a self refresh
entry before the exit commit is even finished.
This patch changes the idle timeout to a moving average of the entry
times + a moving average of exit times + the crtc constant.
This patch was tested on rockchip, with a kevin CrOS panel the idle
delay averages out to about ~235ms (35 entry + 100 exit + 100 const). On
the same board, the bob panel idle delay lands around ~340ms (90 entry
+ 150 exit + 100 const).
WRT the dedicated mutex in self_refresh_data, it would be nice if we
could rely on drm_crtc.mutex to protect the average times, but there are
a few reasons why a separate lock is a better choice:
- We can't rely on drm_crtc.mutex being held if we're doing a nonblocking
commit
- We can't grab drm_crtc.mutex since drm_modeset_lock() doesn't tell us
whether the lock was already held in the acquire context (it eats
-EALREADY), so we can't tell if we should drop it or not
- We don't need such a heavy-handed lock for what we're trying to do,
commit ordering doesn't matter, so a point-of-use lock will be less
contentious
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link to v1: https://patchwork.freedesktop.org/patch/msgid/20190917200443.64481-2-sean@poorly.run
Link: https://patchwork.freedesktop.org/patch/msgid/20190918200734.149876-2-sean@poorly.run
Changes in v2:
- Migrate locking explanation from comment to commit msg (Daniel)
- Turf constant entry delay and multiply the avg times by 2 (Daniel)
commit 4f5368b554
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Jun 14 08:17:23 2019 +0200
drm/kms: Catch mode_object lifetime errors
uncovered a bit a mess in dp drivers. Most drivers (from a quick look,
all except i915) register all the dp stuff in their init code, which
is too early. With CONFIG_DRM_DP_AUX_CHARDEV this will blow up,
because drm_dp_aux_register tries to add a child to a device in sysfs
(the connector) which doesn't even exist yet.
No one seems to have cared thus far. But with the above change I also
moved the setting of dev->registered after the ->load callback, in an
attempt to keep old drivers from hitting any WARN_ON backtraces. But
that moved radeon.ko from the "working, by accident" to "now also
broken" category.
Since this is a huge mess I figured a revert would be simplest. But
this check has already caught issues in i915:
commit 1b9bd09630
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Tue Aug 20 19:16:57 2019 +0300
drm/i915: Do not create a new max_bpc prop for MST connectors
Hence I'd like to retain it. Fix the radeon regression by moving the
setting of dev->registered back to were it was, and stop the
backtraces with an explicit check for dev->driver->load.
Everyone else will stay as broken with CONFIG_DRM_DP_AUX_CHARDEV. The
next patch will improve the kerneldoc and add a todo entry for this.
Fixes: 4f5368b554 ("drm/kms: Catch mode_object lifetime errors")
Cc: Sean Paul <sean@poorly.run>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190917120936.7501-1-daniel.vetter@ffwll.ch
Writing the 0x1704 (BUS_BAR1_BLOCK) register causes the GPU to probe the
memory region at the programmed address. The result is an address decode
error in the external memory controller because address 0, which is what
is written to the register, is not designated as accessible to devices.
Avoid triggering DMA from the GPU by removing teardown of the BAR1.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
When the last reference to a TTM BO is dropped, ttm_bo_release() will
acquire the DMA reservation object's wound/wait mutex while trying to
clean up (ttm_bo_cleanup_refs_or_queue() via ttm_bo_release()). It is
therefore essential that drm_gem_object_release() be called after the
TTM BO has been uninitialized, otherwise drm_gem_object_release() has
already destroyed the wound/wait mutex (via dma_resv_fini()).
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Prior to commit 019cbd4a4f ("drm/nouveau: Initialize GEM object before
TTM object"), the reservation object was locked across all of the buffer
object creation.
After splitting nouveau_bo_new() into separate nouveau_bo_alloc() and
nouveau_bo_init() functions, the reservation object is passed to the
latter, so the lock needs to be held across that function as well.
Fixes: 019cbd4a4f ("drm/nouveau: Initialize GEM object before TTM object")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Commit 019cbd4a4f ("drm/nouveau: Initialize GEM object before TTM
object") introduced a subtle change in how the buffer allocation size is
handled. Prior to that change, the size would get aligned to at least a
page, whereas after that change a non-page-aligned size would get passed
through unmodified. This ultimately causes a BUG_ON() to trigger in
drm_gem_private_object_init() and crashes the system.
Fix this by restoring the code that align the allocation size.
Fixes: 019cbd4a4f ("drm/nouveau: Initialize GEM object before TTM object")
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
On the ThinkPad P71, we have one eDP connector exposed along with 5 DP
connectors, resulting in a total of 11 TMDS encoders. Since the GPU on
this system is also capable of MST, we create an additional 4 fake MST
encoders for each DP port. Unfortunately, we also do this for the eDP
port as well, resulting in:
1 eDP port: +1 TMDS encoder
+4 DPMST encoders
5 DP ports: +2 TMDS encoders
+4 DPMST encoders
*5 ports
== 35 encoders
Which breaks things, since DRM has a hard coded limit of 32 encoders.
So, fix this by not creating MSTMs for any eDP connectors. This brings
us down to 31 encoders, although we can do better.
This fixes driver probing for nouveau on the ThinkPad P71.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This reverts commit 83f35bc3a8.
There are at least two DSI controller drivers which relies on the old
behaviour of adv7511 driver. To avoid platform breakage this patch
should be reverted. This is a temporary solution, as it blocks adv7511
usage with other platforms.
Assumption that DSI device driver (bridge/panel) should first expose
drm_bridge/drm_panel object, then look for DSI bus is just incorrect -
it can work with devices controlled via i2c but it cannot work with
devices controlled via DSI - they will not be able to probe.
To solve the issue following steps should be performed:
- rework reverted patch allowing co-operation with broken DSI controller
drivers - with simple/ugly workaround,
- fix controller drivers and then remove workaround.
Signed-off-by: Rob Clark <robdclark@chromium.org>
[a.hajda: changed commit message]
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829180836.14453-1-robdclark@gmail.com
The guest may use this register to identify the running state of one
context. Emulate it as the value in context image as if the context runs
on the GPU hardware.
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
when creating a vGPU workload, the guest context head pointer should
be updated correctly by comparing with the exsiting workload in the
guest worklod queue including the current running context.
in some situation, there is a running context A and then received 2 new
vGPU workload context B and A. in the new workload context A, it's head
pointer should be updated with the running context A's tail.
v2: walk through guest workload list in backward way.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There were bugs in the DSI transfer (read and write) function
as it was only tested with displays ever needing a single byte
to be written. Fixed it up and tested so we can now write
messages of up to 16 bytes and read up to 4 bytes from the
display.
Tested with a Sony ACX424AKP display: this display now self-
identifies and can control backlight in command mode.
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 5fc537bfd0 ("drm/mcde: Add new driver for ST-Ericsson MCDE")
Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903170804.17053-1-linus.walleij@linaro.org
This was useful for debugging fps drops. I suspect it will be useful
again.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
In addition, moving to kms->flush_commit() lets us drop the only user
of kms->commit().
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Now that flush/wait/complete is decoupled from the "synchronous" part of
atomic commit_tail(), add support to defer flush to a timer that expires
shortly before vblank for async commits. In this way, multiple atomic
commits (for example, cursor updates) can be coalesced into a single
flush at the end of the frame.
v2: don't hold lock over ->wait_flush(), to avoid locking interaction
that was causing fps drop when combining page flips or non-async
atomic commits and lots of legacy cursor updates
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
With atomic commit, ->prepare_commit() and ->complete_commit() may not
be evenly balanced (although ->complete_commit() will complete each
crtc that had been previously prepared). So these will no longer be
a good place to enable/disable clocks needed for hw access.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Add ->flush_commit(crtc_mask). Currently a no-op, but kms backends
should migrate writing flush registers to this hook, so we can decouple
pushing updates to hardware, and flushing the updates.
Once we add async commit support, the hw updates will be pushed down to
the hw synchronously, but flushing the updates will be deferred until as
close to vblank as possible, so that multiple updates can be combined in
a single frame.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Prep work for async commits, in which case this will be called after we
no longer have the atomic state object.
This drops some wait_for_vblanks(), but those should be unnecessary, as
we call this after waiting for flush to complete.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
First step in re-working the atomic related internal API to prepare for
async updates pending.. ->wait_flush() is intended to block until there
is no in-progress flush.
A crtc_mask is used, rather than an atomic state object, as this will
later be used for async flush after the atomic state is destroyed.
This replaces ->wait_for_crtc_commit_done()
v2: update for review comments
Signed-off-by: Rob Clark <robdclark@chromium.org>
Previously the callback was called from whoever called wait_for_vblank(),
but that isn't a great plan when wait_for_vblank() stops getting called,
and results in frame_done_timer expiring.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Just waiting for next vblank isn't ideal.. we should really be looking
at the hw FLUSH register value to know if there is still an in-progress
flush without stalling unnecessarily when there is no pending flush.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Rob Clark <robdclark@chromium.org>
It attempted to avoid fps drops in the presence of cursor updates. But
it is racing, and can result in hw updates after flush before vblank,
which leads to underruns.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Remove the default for CONFIG_DRM_MSM and let the user select the driver
manually as one does.
Additionally select QCOM_COMMAND_DB for ARCH_QCOM targets to make sure
it doesn't get missed when we need it for a6xx targets.
v2: Move from default 'm' to no default
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Previously, dpu_crtc_frame_event_work() would try to aquire all the
modeset locks in order to check whether it can release bandwidth. (If
we only have cmd-mode display, bandwidth can be released at frame-done
time.)
The problem with this is that it is also responsible for signalling
frame_done_comp, which dpu_crtc_commit_kickoff() waits on if there is
already a frame pending. This is called in the msm_atomic_commit_tail()
path.. which means that for non-nonblock commits, at least some of the
modeset locks are already held.
Re-work this scheme to use a reference count to track our need to have
clocks enabled. It is incremented for each atomic commit, and
decremented in the corresponding frame-done. Additionally, any crtc
used in video mode hold an extra reference while they are enabled. The
net effect is that we can determine in frame-done whether it is safe to
drop bandwidth without needing to aquire any modeset locks.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Sean Paul <sean@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct msm_gem_submit {
...
struct {
...
} bos[0];
};
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
So, replace the following form:
sizeof(*submit) + ((u64)nr_bos * sizeof(submit->bos[0]))
with:
struct_size(submit, bos, nr_bos)
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>