linux/drivers
Jason Ekstrand 3761baae90 Revert "drm/i915: Propagate errors on awaiting already signaled fences"
This reverts commit 9e31c1fe45.  Ever
since that commit, we've been having issues where a hang in one client
can propagate to another.  In particular, a hang in an app can propagate
to the X server which causes the whole desktop to lock up.

Error propagation along fences sound like a good idea, but as your bug
shows, surprising consequences, since propagating errors across security
boundaries is not a good thing.

What we do have is track the hangs on the ctx, and report information to
userspace using RESET_STATS. That's how arb_robustness works. Also, if my
understanding is still correct, the EIO from execbuf is when your context
is banned (because not recoverable or too many hangs). And in all these
cases it's up to userspace to figure out what is all impacted and should
be reported to the application, that's not on the kernel to guess and
automatically propagate.

What's more, we're also building more features on top of ctx error
reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the
userspace fence wait also relies on that mechanism. So it is the path
going forward for reporting gpu hangs and resets to userspace.

So all together that's why I think we should just bury this idea again as
not quite the direction we want to go to, hence why I think the revert is
the right option here.

For backporters: Please note that you _must_ have a backport of
https://lore.kernel.org/dri-devel/20210602164149.391653-2-jason@jlekstrand.net/
for otherwise backporting just this patch opens up a security bug.

v2: Augment commit message. Also restore Jason's sob that I
accidentally lost.

v3: Add a note for backporters

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Marcin Slusarz <marcin.slusarz@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Marcin Slusarz <marcin.slusarz@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080
Fixes: 9e31c1fe45 ("drm/i915: Propagate errors on awaiting already signaled fences")
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-3-jason@jlekstrand.net
(cherry picked from commit 93a2711cdd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-07-19 09:55:24 -04:00
..
accessibility TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
acpi More ACPI updates for 5.14-rc1 2021-07-07 13:30:01 -07:00
amba
android
ata ARM: SoC changes for 5.14 2021-07-10 09:22:44 -07:00
atm Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
auxdisplay
base More power management updates for 5.14-rc1 2021-07-07 13:22:59 -07:00
bcma
block block-5.14-2021-07-16 2021-07-16 12:31:44 -07:00
bluetooth TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
bus ARM: Drivers for 5.14 2021-07-10 09:46:20 -07:00
cdrom block: remove REQ_OP_SCSI_{IN,OUT} 2021-06-30 15:34:19 -06:00
char powerpc/powernv: Fix fall-through warning for Clang 2021-07-13 19:21:41 -05:00
clk dt-bindings: clock: r9a07g044-cpg: Update clock/reset definitions 2021-07-12 10:52:03 +02:00
clocksource This round has a diffstat dominated by Qualcomm clk drivers. Honestly though 2021-07-01 13:26:16 -07:00
comedi Staging / IIO driver patches for 5.14-rc1 2021-07-05 14:01:53 -07:00
connector
counter
cpufreq cpufreq: Fix fall-through warning for Clang 2021-07-13 11:53:07 -05:00
cpuidle - Add support for the Qcom MSM8226 (Bartosz Dudziak) 2021-06-30 14:56:51 +02:00
crypto ARM: SoC changes for 5.14 2021-07-10 09:22:44 -07:00
cxl
dax fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
dca
devfreq PM / devfreq: passive: Fix get_target_freq when not using required-opp 2021-06-24 10:37:35 +09:00
dio
dma dmaengine: mpc512x: Fix fall-through warning for Clang 2021-07-14 11:05:55 -05:00
dma-buf Short summary of fixes pull: 2021-07-13 15:15:17 +02:00
edac EDAC/igen6: fix core dependency AGAIN 2021-07-15 11:59:59 -07:00
eisa
extcon Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
firewire Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
firmware ARM SCMI fixes for v5.14 2021-07-16 23:01:25 +02:00
fpga fpga: machxo2-spi: Address warning about unused variable 2021-06-24 15:45:11 +02:00
fsi
gnss
gpio - Core Frameworks 2021-07-05 12:10:34 -07:00
gpu Revert "drm/i915: Propagate errors on awaiting already signaled fences" 2021-07-19 09:55:24 -04:00
greybus
hid Merge branch 'for-5.14/multitouch' into for-linus 2021-06-30 09:15:15 +02:00
hsi
hv Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
hwmon Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
hwspinlock
hwtracing Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
i2c Merge branch 'i2c/for-mergewindow' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2021-07-04 11:47:18 -07:00
i3c I3C for 5.14 2021-07-10 11:53:06 -07:00
idle
iio Staging / IIO driver patches for 5.14-rc1 2021-07-05 14:01:53 -07:00
infiniband Tracing updates for 5.14: 2021-07-03 11:13:22 -07:00
input This pull request contains the following changes for UML: 2021-07-09 10:19:13 -07:00
interconnect
iommu fallthrough fixes for Clang for 5.14-rc2 2021-07-15 13:57:31 -07:00
ipack TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
irqchip irqchip fixes for 5.14, take #1 2021-07-09 15:35:13 +02:00
isdn TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
leds This contains quite a lot of fixes, with more fixes in my inbox that 2021-07-03 11:57:42 -07:00
lightnvm
macintosh
mailbox mbox: add polarfire soc system controller mailbox 2021-06-26 12:06:48 -05:00
mcb mcb: Use DEFINE_RES_MEM() helper macro and fix the end address 2021-06-24 15:56:25 +02:00
md - Various DM persistent-data library improvements and fixes that 2021-06-30 18:19:39 -07:00
media Driver core changes for 5.14-rc1 2021-07-05 13:51:41 -07:00
memory
memstick for-5.14/block-2021-06-29 2021-06-30 12:12:56 -07:00
message
mfd Driver core changes for 5.14-rc1 2021-07-05 13:51:41 -07:00
misc TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
mmc mmc: jz4740: Fix fall-through warning for Clang 2021-07-13 13:59:22 -05:00
most
mtd mtd: cfi_util: Fix unreachable code issue 2021-07-12 11:15:28 -05:00
mux
net fallthrough fixes for Clang for 5.14-rc2 2021-07-15 13:57:31 -07:00
nfc
ntb
nubus
nvdimm cxl for 5.14 2021-07-04 11:55:13 -07:00
nvme block-5.14-2021-07-16 2021-07-16 12:31:44 -07:00
nvmem Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
of Devicetree updates for v5.14: 2021-07-03 10:54:08 -07:00
opp
parisc kernel.h: split out panic and oops helpers 2021-07-01 11:06:04 -07:00
parport
pci PCI: Fix fall-through warning for Clang 2021-07-13 13:59:12 -05:00
pcmcia
perf
phy USB / Thunderbolt patches for 5.14-rc1 2021-07-05 14:16:22 -07:00
pinctrl This is the bulk of pin control changes for the v5.14 kernel: 2021-07-01 16:57:14 -07:00
platform Driver core changes for 5.14-rc1 2021-07-05 13:51:41 -07:00
pnp Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
power power: supply: Fix fall-through warnings for Clang 2021-07-13 14:50:47 -05:00
powercap
pps
ps3
ptp ptp: Relocate lookup cookie to correct block. 2021-07-08 12:33:10 -07:00
pwm pwm: ep93xx: Ensure configuring period and duty_cycle isn't wrongly skipped 2021-07-08 16:09:30 +02:00
rapidio
ras
regulator - Core Frameworks 2021-07-05 12:10:34 -07:00
remoteproc remoteproc updates for v5.14 2021-07-07 10:50:03 -07:00
reset ARM: Drivers for 5.14 2021-07-10 09:46:20 -07:00
rpmsg
rtc RTC for 5.14 2021-07-10 16:19:10 -07:00
s390 SCSI fixes on 20210717 2021-07-17 13:09:23 -07:00
sbus
scsi SCSI fixes on 20210717 2021-07-17 13:09:23 -07:00
sh
siox siox: Simplify error handling via dev_err_probe() 2021-06-24 15:46:34 +02:00
slimbus
soc ARM: Drivers for 5.14 2021-07-10 09:46:20 -07:00
soundwire Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
spi Merge remote-tracking branch 'spi/for-5.14' into spi-next 2021-06-25 14:08:26 +01:00
spmi spmi: hisi-spmi-controller: move driver from staging 2021-06-25 10:02:05 +02:00
ssb
staging TTY / Serial patches for 5.14-rc1 2021-07-05 14:08:24 -07:00
target block-5.14-2021-07-08 2021-07-09 12:05:33 -07:00
tc
tee fallthrough fixes for Clang for 5.14-rc1 2021-06-28 20:03:38 -07:00
thermal - Add rk3568 sensor support (Finley Xiao) 2021-07-10 11:43:25 -07:00
thunderbolt USB / Thunderbolt patches for 5.14-rc1 2021-07-05 14:16:22 -07:00
tty This pull request contains the following changes for UML: 2021-07-09 10:19:13 -07:00
uio
usb usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang 2021-07-14 11:02:37 -05:00
vdpa vp_vdpa: allow set vq state to initial state after reset 2021-07-08 07:49:02 -04:00
vfio VFIO update for v5.14-rc1 2021-07-03 11:49:33 -07:00
vhost vdpa: support packed virtqueue for set/get_vq_state() 2021-07-08 07:49:01 -04:00
video drm fixes for 5.14-rc2 2021-07-16 11:14:54 -07:00
virt nitro_enclaves: Set Bus Master for the NE PCI device 2021-06-24 15:48:27 +02:00
virtio virtio,vhost,vdpa: features, fixes 2021-07-09 11:06:29 -07:00
visorbus
vlynq
vme
w1
watchdog linux-watchdog 5.14-rc1 tag 2021-07-07 12:57:46 -07:00
xen xen: branch for v5.14-rc1 2021-07-07 11:07:13 -07:00
zorro
Kconfig
Makefile hyperv-next for 5.14 2021-06-29 11:21:35 -07:00