Currently, when a controller is configured to use filtered mode, thermal
readings are valid only about 30% of the time.
Upon testing, it was noticed that lowering any of the interval settings
resulted in an improved rate of valid data. The same was observed when
decreasing the number of samples for each sensor (which also results in
quicker measurements).
Retrying the read with a timeout longer than the time it takes to
resample (about 344us with these settings and 4 sensors) also improves
the rate.
Lower all timing settings to the minimum, configure the filtering to
single sample, and poll the measurement register for at least one period
to improve the data validity on filtered mode. With these changes in
place, out of 100000 reads, a single one failed, ie 99.999% of the data
was valid.
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230713154743.611870-1-nfraprado@collabora.com
Each LVTS thermal controller can have up to four sensors, each capable
of triggering its own interrupt when its measured temperature crosses
the configured threshold. The threshold for each sensor is handled
separately by the thermal framework, since each one is registered with
its own thermal zone and trips. However, the temperature thresholds are
configured on the controller, and therefore are shared between all
sensors on that controller.
When the temperature measured by the sensors is different enough to
cause the thermal framework to configure different thresholds for each
one, interrupts start triggering on sensors outside the last threshold
configured.
To address the issue, track the thresholds required by each sensor and
only actually set the highest one in the hardware, and disable
interrupts for all sensors outside the current configured range.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-7-nfraprado@collabora.com
The thermal framework might leave the low threshold unset if there
aren't any lower trip points. This leaves the register zeroed, which
translates to a very high temperature for the low threshold. The
interrupt for this threshold is then immediately triggered, and the
state machine gets stuck, preventing any other temperature monitoring
interrupts to ever trigger.
(The same happens by not setting the Cold or Hot to Normal thresholds
when using those)
Set the unused threshold to a valid low value. This value was chosen so
that for any valid golden temperature read from the efuse, when the
value is converted to raw and back again to milliCelsius, the result
doesn't underflow.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-6-nfraprado@collabora.com
Out of the many interrupts supported by the hardware, the only ones of
interest to the driver currently are:
* The temperature went over the high offset threshold, for any of the
sensors
* The temperature went below the low offset threshold, for any of the
sensors
* The temperature went over the stage3 threshold
These are the only thresholds configured by the driver through the
OFFSETH, OFFSETL, and PROTTC registers, respectively.
The current interrupt mask in LVTS_MONINT_CONF, enables many more
interrupts, including data ready on sensors for both filtered and
immediate mode. These are not only not handled by the driver, but they
are also triggered too often, causing unneeded overhead. Disable these
unnecessary interrupts.
The meaning of each bit can be seen in the comment describing
LVTS_MONINTST in the IRQ handler.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-5-nfraprado@collabora.com
There are two kinds of temperature monitoring interrupts available:
* High Offset, Low Offset
* Hot, Hot to normal, Cold
The code currently uses the hot/h2n/cold interrupts, however in a way
that doesn't work: the cold threshold is left uninitialized, which
prevents the other thresholds from ever triggering, and the h2n
interrupt is used as the lower threshold, which prevents the hot
interrupt from triggering again after the thresholds are updated by the
thermal framework, since a hot interrupt can only trigger again after
the hot to normal interrupt has been triggered.
But better yet than addressing those issues, is to use the high/low
offset interrupts instead. This way only two thresholds need to be
managed, which have a simpler state machine, making them a better match
to the thermal framework's high and low thresholds.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-4-nfraprado@collabora.com
Each controller can be configured to operate on immediate or filtered
mode. On filtered mode, the sensors are enabled by setting the
corresponding bits in MONCTL0, while on immediate mode, by setting
MSRCTL1.
Previously, the code would set MSRCTL1 for all four sensors when
configured to immediate mode, but given that the controller might not
have all four sensors connected, this would cause interrupts to trigger
for non-existent sensors. Fix this by handling the MSRCTL1 register
analogously to the MONCTL0: only enable the sensors that were declared.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-3-nfraprado@collabora.com
There is a single IRQ handler for each LVTS thermal domain, and it is
supposed to check each of its underlying controllers for the origin of
the interrupt and clear its status. However due to a typo, only the
first controller was ever being handled, which resulted in the interrupt
never being cleared when it happened on the other controllers. Add the
missing index so interrupts are handled for all controllers.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230706153823.201943-2-nfraprado@collabora.com
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The AUXADC thermal v1 allows reading temperature range between -20°C to
150°C and any value out of this range is invalid.
Add new definitions for MT8173_TEMP_{MIN_MAX} and a new small helper
mtk_thermal_temp_is_valid() to check if new readings are in range: if
not, we tell to the API that the reading is invalid by returning
THERMAL_TEMP_INVALID.
It was chosen to introduce the helper function because, even though this
temperature range is realistically ok for all, it comes from a downstream
kernel driver for version 1, but here we also support v2 and v3 which may
may have wider constraints.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230419061146.22246-3-angelogioacchino.delregno@collabora.com
Some more testing revealed that this commit introduces a regression on some
MT8173 Chromebooks and at least on one MT6795 Sony Xperia M5 smartphone due
to the delay being apparently variable and machine specific.
Another solution would be to delay for a bit more (~70ms) but this is not
feasible for two reasons: first of all, we're adding an even bigger delay
in a probe function; second, some machines need less, some may need even
more, making the msleep at probe solution highly suboptimal.
This reverts commit 10debf8c2d.
Fixes: 10debf8c2d ("thermal/drivers/mediatek: Add delay after thermal banks initialization")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230419061146.22246-2-angelogioacchino.delregno@collabora.com
The binary representation for sensor 1 interrupt status was incorrectly
assembled, when compared to the full table given in the same comment
section. The conversion into hex was also incorrect, leading to
incorrect interrupt status bitmask for sensor 1. This would cause the
driver to incorrectly identify changes for sensor 1, when in fact it
was sensor 0, or a sensor access time out.
Fix the binary and hex representations in the comments, and the actual
bitmask macro.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230328031017.1360976-1-wenst@chromium.org
Some drivers are directly using the thermal zone's 'device' structure
field.
Use the driver device pointer instead of the thermal zone device when
it is available.
Remove the traces when they are duplicate with the traces in the core
code.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Balsam CHIHI <bchihi@baylibre.com> #Mediatek LVTS
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> #MediaTek LVTS
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The devres variant of thermal_add_hwmon_sysfs() only takes the thermal
zone structure pointer as parameter.
Actually, it uses the tz->device to add it in the devres list.
It is preferable to use the device registering the thermal zone
instead of the thermal zone device itself. That prevents the driver
accessing the thermal zone structure internals and it is from my POV
more correct regarding how devm_ is used.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> #amlogic_thermal
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> #sun8i_thermal
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> #MediaTek auxadc
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Low Voltage Thermal Sensor (LVTS) is a multiple sensors, multi
controllers contained in a thermal domain.
A thermal domains can be the MCU or the AP.
Each thermal domains contain up to seven controllers, each thermal
controller handle up to four thermal sensors.
The LVTS has two Finite State Machines (FSM), one to handle the
functionin temperatures range like hot or cold temperature and another
one to handle monitoring trip point. The FSM notifies via interrupts
when a trip point is crossed.
The interrupt is managed at the thermal controller level, so when an
interrupt occurs, the driver has to find out which sensor triggered
such an interrupt.
The sampling of the thermal can be filtered or immediate. For the
former, the LVTS measures several points and applies a low pass
filter.
Signed-off-by: Balsam CHIHI <bchihi@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
On MT8195 Tomato Chromebook:
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20230209105628.50294-5-bchihi@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>