The soundwire subsystem uses two completion structures that allow
drivers to wait for soundwire device to become enumerated on the bus and
initialised by their drivers, respectively.
The code implementing the signalling is currently broken as it does not
signal all current and future waiters and also uses the wrong
reinitialisation function, which can potentially lead to memory
corruption if there are still waiters on the queue.
Not signalling future waiters specifically breaks sound card probe
deferrals as codec drivers can not tell that the soundwire device is
already attached when being reprobed. Some codec runtime PM
implementations suffer from similar problems as waiting for enumeration
during resume can also timeout despite the device already having been
enumerated.
Fixes: fb9469e54f ("soundwire: bus: fix race condition with enumeration_complete signaling")
Fixes: a90def0681 ("soundwire: bus: fix race condition with initialization_complete signaling")
Cc: stable@vger.kernel.org # 5.7
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Rander Wang <rander.wang@linux.intel.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20230705123018.30903-2-johan+linaro@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Always add buses to the stream->master_list in a fixed order.
The unique bus->id is used to order the adding of buses to the
list.
This prevents lockdep asserts and possible deadlocks on streams
that have multiple buses.
sdw_acquire_bus_lock() takes bus_lock in the order that buses
are listed in stream->master_list. do_bank_switch() takes all
the msg_lock in the same order.
To prevent a lockdep assert, and a possible real deadlock, the
relative order of taking these mutexes must always be the same.
For example, if a stream takes the mutexes in the order
(bus0, bus1) lockdep will assert if another stream takes them
in the order (bus1, bus0).
More complex relative ordering will also assert, for example
if two streams take (bus0, bus1) and (bus1, bus2), then a third
stream takes (bus2, bus0).
Previously sdw_stream_add_master() simply added the given bus
to the end of the list, requiring the caller to guarantee that
buses are added in a fixed order. This isn't reasonable or
necessary - it's an internal implementation detail that should
not be exposed by the API. It doesn't really make sense when
there could be multiple independent calling drivers, to say
"you must add your buses in the same order as a different driver,
that you don't know about, added them".
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20230615141208.679011-2-rf@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Give the bus_lock and msg_lock of each bus a different unique key
so that it is possible to acquire the locks of multiple buses
without lockdep asserting a possible deadlock.
Using mutex_init() to initialize a mutex gives all those mutexes
the same lock class. Lockdep checking treats it as an error to
attempt to take a mutex while already holding a mutex of the same
class. This causes a lockdep assert when sdw_acquire_bus_lock()
attempts to lock multiple buses, and when do_bank_switch() takes
multiple msg_lock.
[ 138.697350] WARNING: possible recursive locking detected
[ 138.697366] 6.3.0-test #1 Tainted: G E
[ 138.697380] --------------------------------------------
[ 138.697394] play/903 is trying to acquire lock:
[ 138.697409] ffff99b8c41aa8c8 (&bus->bus_lock){+.+.}-{3:3}, at:
sdw_prepare_stream+0x52/0x2e0
[ 138.697443]
but task is already holding lock:
[ 138.697468] ffff99b8c41af8c8 (&bus->bus_lock){+.+.}-{3:3}, at:
sdw_prepare_stream+0x52/0x2e0
[ 138.697493]
other info that might help us debug this:
[ 138.697521] Possible unsafe locking scenario:
[ 138.697540] CPU0
[ 138.697550] ----
[ 138.697559] lock(&bus->bus_lock);
[ 138.697570] lock(&bus->bus_lock);
[ 138.697581]
*** DEADLOCK ***
Giving each mutex a unique key allows multiple to be held
without triggering a lockdep assert. But note that it does not
allow them to be taken in one order then a different order.
If two mutexes are taken in the order A, B then they must
always be taken in that order otherwise they could deadlock.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20230615141208.679011-1-rf@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The ace2x driver can be build with or without mlink support, but
when SND_SOC_SOF_HDA_MLINK is set to =m and soundwire is built-in,
it fails with a link error:
ld.lld: error: undefined symbol: hdac_bus_eml_sdw_wait_syncpu_unlocked
>>> referenced by intel_ace2x.c
>>> drivers/soundwire/intel_ace2x.o:(intel_link_power_up) in archive vmlinux.a
ld.lld: error: undefined symbol: hdac_bus_eml_sdw_sync_arm_unlocked
>>> referenced by intel_ace2x.c
>>> drivers/soundwire/intel_ace2x.o:(intel_sync_arm) in archive vmlinux.a
Add a Kconfig dependency that prevents that broken configuration but
still allows soundwire to be a loadable module instead.
Fixes: 4d1e2464a1 ("soundwire: intel_ace2x: add sync_arm/sync_go helpers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230616090932.2714714-1-arnd@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Returning an error code in the remove callback yields to an error
message
remove callback returned a non-zero value. This will be ignored.
After that the device is removed anyhow. Improve the error message to at
least say what the actual problem is. While touching that code, convert
the driver to the .remove_new() callback which returns no value with the
same effect as returning zero in a .remove() callback.
As the return value is ignored by the core the only effect of this patch
is to improve the error message. (And the motivating effect is that
there is one less driver using .remove().)
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230518200823.249795-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This reverts commit 57ed510b05 ("soundwire: qcom: use
pm_runtime_resume_and_get()") which introduced unbalanced
pm_runtime_put(), when device did not have runtime PM enabled.
If pm_runtime_resume_and_get() failed with -EACCES, the driver continued
execution and finally called pm_runtime_put_autosuspend(). Since
pm_runtime_resume_and_get() drops the usage counter on every error, this
lead to double decrement of that counter visible in certain debugfs
actions on unattached devices (still in reset state):
$ cat /sys/kernel/debug/soundwire/master-0-0/sdw:0:0217:f001:00:0/registers
qcom-soundwire 3210000.soundwire-controller: swrm_wait_for_wr_fifo_avail err write overflow
soundwire sdw-master-0: trf on Slave 1 failed:-5 read addr e36 count 1
soundwire sdw:0:0217:f001:00:0: Runtime PM usage count underflow!
Fixes: 57ed510b05 ("soundwire: qcom: use pm_runtime_resume_and_get()")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20230517163750.997629-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Start from ACE1.x, DOAISE is added to AC timing control
register bit 5, it combines with DOAIS to get effective
timing, and has the default value 1.
The current code fills DOAIS, DACTQE and DODS bits to a
variable initialized to zero, and updates the variable
to AC timing control register. With this operation, We
change DOAISE to 0, and force a much more aggressive
timing. The timing is even unable to form a working
waveform on SDA pin on Meteorlake.
This patch uses read-modify-write operation for the AC
timing control register access, thus makes sure those
bits not supposed and intended to change are not touched.
Signed-off-by: Chao Song <chao.song@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Rander Wang <rander.wang@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://lore.kernel.org/r/20230515081301.12921-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
When a peripheral reports as ATTACHED, the manager may need to follow
a programming sequence, e.g. to assign DMA resources and/or assign a
command queue for that peripheral.
This patch adds an optional callback, which will be invoked every time
the peripheral attaches. This might be overkill in some scenarios, and
one could argue that this should be invoked only on the first
attachment. The bus does not however track this first attachment with
any existing state-mirroring variable, and using dev_num_sticky would
not work across suspend-resume cycles.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Rander Wang <rander.wang@intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://lore.kernel.org/r/20230515071042.2038-20-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
It makes sense to have only a single point responsible for ensuring
that all currently pending IRQs are handled. The current code in
sdw_handle_slave_alerts confusingly splits this process in two. This
code will loop until the asserted IRQs are cleared but it will only
handle IRQs that were already asserted when it was called. This
means the caller must also loop (either manually, or through its IRQ
mechanism) until the IRQs are all handled. It makes sense to either do
all the looping in sdw_handle_slave_alerts or do no looping there and
let the host controller repeatedly call it until things are handled.
There are realistically two sensible host controllers, those that
will generate an IRQ when the alert status changes and those
that will generate an IRQ continuously whilst the alert status
is high. The current code will work fine for the second of those
systems but not the first with out additional looping in the host
controller. Removing the code that filters out new IRQs whilst
the handler is running enables both types of host controller to be
supported and simplifies the code. The code will still only loop up to
SDW_READ_INTR_CLEAR_RETRY times, so it shouldn't be possible for it to
get completely stuck handling IRQs forever, and if you are generating
IRQs faster than you can handle them you likely have bigger problems
anyway.
This fixes an issue on the Cadence SoundWire IP, which only generates
IRQs on an alert status change, where an alert which arrives whilst
another alert is being handled will never be handled and will block
all future alerts from being handled.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20230418140650.297279-1-ckeepax@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>