Pull ARM SoC fixes from Olof Johansson:
"Another batch of fixes for ARM SoC platforms. Most are smaller fixes.
Two areas that are worth pointing out are:
- OMAP had a handful of changes to voltage specs that caused a bit of
churn, most of volume of change in this branch is due to this.
- There are a couple of _rcuidle fixes from Paul that touch common
code and came in through the OMAP tree since they were the ones who
saw the problems.
The rest is smaller changes across a handful of platforms"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (36 commits)
ARM: dts: STi: stih407-family: Disable reserved-memory co-processor nodes
ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218
ARM: OMAP2+: timer: add probe for clocksources
ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ
memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
arm: Use _rcuidle for smp_cross_call() tracepoints
MAINTAINERS: Add myself as reviewer of ARM FSL/NXP
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_mem_ret
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_logic_ret
ARM: OMAP: DRA7: powerdomain data: Set L3init and L4per to ON
ARM: imx6ul: Fix Micrel PHY mask
ARM: OMAP2+: Select OMAP_INTERCONNECT for SOC_AM43XX
ARM: dts: DRA74x: fix DSS PLL2 addresses
ARM: OMAP2: Enable Errata 430973 for OMAP3
ARM: dts: socfpga: Add missing PHY phandle
ARM: dts: exynos: Fix port nodes names for Exynos5420 Peach Pit board
ARM: dts: exynos: Fix port nodes names for Exynos5250 Snow board
ARM: dts: sun6i: yones-toptech-bs1078-v2: Drop constraints on dc1sw regulator
ARM: dts: sun6i: primo81: Drop constraints on dc1sw regulator
ARM: dts: sunxi: Add OLinuXino Lime2 eMMC to the Makefile
...
OMAP-GPMC: Fixes for for v4.7-rc cycle:
- Fix omap gpmc EXTRADELAY timing. The DT provided timings
were wrongly used causing devices requiring extra delay timing
to fail.
* tag 'gpmc-omap-fixes-for-v4.7' of https://github.com/rogerq/linux:
memory: omap-gpmc: Fix omap gpmc EXTRADELAY timing
+ Linux 4.7-rc3
Signed-off-by: Olof Johansson <olof@lixom.net>
Fixes for omaps for v4.7-rc cycle:
- Fix dra7 for hardware issues limiting L4Per and L3init power domains
to on state. Without this the devices may not work correctly after
some time of use because of asymmetric aging. And related to this,
let's also remove the unusable states.
- Always select omap interconnect for am43x as otherwise the am43x
only configurations will not boot properly. This can happen easily
for any product kernels that leave out other SoCs to save memory.
- Fix DSS PLL2 addresses that have gone unused for now
- Select erratum 430973 for omap3, this is now safe to do and can
save quite a bit of debugging time for people who may have left
it out.
* tag 'omap-for-v4.7/fixes-powedomain' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_mem_ret
ARM: OMAP: DRA7: powerdomain data: Remove unused pwrsts_logic_ret
ARM: OMAP: DRA7: powerdomain data: Set L3init and L4per to ON
ARM: OMAP2+: Select OMAP_INTERCONNECT for SOC_AM43XX
ARM: dts: DRA74x: fix DSS PLL2 addresses
ARM: OMAP2: Enable Errata 430973 for OMAP3
+ Linux 4.7-rc2
Signed-off-by: Olof Johansson <olof@lixom.net>
Fixes for omaps for v4.7-rc cycle:
- Two boot warning fixes from the RCU tree that should have gotten
merged several weeks ago already but did not because of issues
with who merges them. Paul has now split the RCU warning fixes into
sets for various maintainers.
- Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes
- Fix PM for omap3 boards using timer12 and gptimer, like the
original beagleboard
- Fix hangs on am437x-sk-evm by lowering the I2C bus speed
* tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218
ARM: OMAP2+: timer: add probe for clocksources
ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ
arm: Use _rcuidle for smp_cross_call() tracepoints
arm: Use _rcuidle tracepoint to allow use from idle
Signed-off-by: Olof Johansson <olof@lixom.net>
This patch fixes a non-booting issue in Mainline.
When booting with a compressed kernel, we need to be careful how we
populate memory close to DDR start. AUTO_ZRELADDR is enabled by default
in multi-arch enabled configurations, which place some restrictions on
where the kernel is placed and where it will be uncompressed to on boot.
AUTO_ZRELADDR takes the decompressor code's start address and masks out
the bottom 28 bits to obtain an address to uncompress the kernel to
(thus a load address of 0x42000000 means that the kernel will be
uncompressed to 0x40000000 i.e. DDR START on this platform).
Even changing the load address to after the co-processor's shared memory
won't render a booting platform, since the AUTO_ZRELADDR algorithm still
ensures the kernel is uncompressed into memory shared with the first
co-processor (0x40000000).
Another option would be to move loading to 0x4A000000, since this will
mean the decompressor will decompress the kernel to 0x48000000. However,
this would mean a large chunk (0x44000000 => 0x48000000 (64MB)) of
memory would essentially be wasted for no good reason.
Until we can work with ST to find a suitable memory location to
relocate co-processor shared memory, let's disable the shared memory
nodes. This will ensure a working platform in the mean time.
NB: The more observant of you will notice that we're leaving the DMU
shared memory node enabled; this is because a) it is the only one in
active use at the time of this writing and b) it is not affected by
the current default behaviour which is causing issues.
Fixes: fe135c6 (ARM: dts: STiH407: Move over to using the 'reserved-memory' API for obtaining DMA memory)
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The i.MX fixes for 4.7:
- Correct Micrel PHY mask to fix the issue that i.MX6UL ethernet works
in U-Boot but not in kernel.
* tag 'imx-fixes-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx6ul: Fix Micrel PHY mask
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull ARM fixes from Russell King:
"A couple of fixes for pmd_mknotpresent()/pmd_present() for LPAE
systems"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8579/1: mm: Fix definition of pmd_mknotpresent
ARM: 8578/1: mm: ensure pmd_present only checks the valid bit
Based on the latest timing specifications for the TPS65218 from the data
sheet, http://www.ti.com/lit/ds/symlink/tps65218.pdf, document SLDS206
from November 2014, we must change the i2c bus speed to better fit within
the minimum high SCL time required for proper i2c transfer.
When running at 400khz, measurements show that SCL spends
0.8125 uS/1.666 uS high/low which violates the requirement for minimum
high period of SCL provided in datasheet Table 7.6 which is 1 uS.
Switching to 100khz gives us 5 uS/5 uS high/low which both fall above
the minimum given values for 100 khz, 4.0 uS/4.7 uS high/low.
Without this patch occasionally a voltage set operation from the kernel
will appear to have worked but the actual voltage reflected on the PMIC
will not have updated, causing problems especially with cpufreq that may
update to a higher OPP without actually raising the voltage on DCDC2,
leading to a hang.
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Aparna Balasubramanian <aparnab@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
A few platforms are currently missing clocksource_probe() completely
in their time_init functionality. On OMAP3430 for example, this is
causing cpuidle to be pretty much dead, as the counter32k is not
going to be registered and instead a gptimer is used as a clocksource.
This will tick in periodic mode, preventing any deeper idle states.
While here, also drop one unnecessary check for populated DT before
existing clocksource_probe() call.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
After OMAP1 IRQ definitions have been changed by commit 685e2d08c5
("ARM: OMAP1: Change interrupt numbering for sparse IRQ") introduced
in v4.2, ams-delta FIQ handler which depends on them no longer works
as expected. Fix it.
Created and tested on Amstrad Delta against Linux-4.7-rc3
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Further testing with false negatives suppressed by commit 293e2421fe
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified another unprotected use of RCU from the idle loop. Because RCU
actively ignores idle-loop code (for energy-efficiency reasons, among
other things), using RCU from the idle loop can result in too-short
grace periods, in turn resulting in arbitrary misbehavior.
The resulting lockdep-RCU splat is as follows:
------------------------------------------------------------------------
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
[<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
[<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0 x38/0x9c)
[<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
[<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
[<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
------------------------------------------------------------------------
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <linux-omap@vger.kernel.org>
Cc: <linux-arm-kernel@lists.infradead.org>
Fixes for Exynos-based Snow and Peach Pit boards for regressions introduced in
4.7-rc1 because OF graph logic expects specific names of child nodes.
* tag 'samsung-fixes-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
ARM: dts: exynos: Fix port nodes names for Exynos5420 Peach Pit board
ARM: dts: exynos: Fix port nodes names for Exynos5250 Snow board
Signed-off-by: Olof Johansson <olof@lixom.net>
As per the latest revision F of public TRM for DRA7/AM57xx SoCs
SPRUHZ6F[1] (April 2016), with the exception of MPU power domain, all
other power domains do not have memories capable of retention since
they all operate in either "ON" or "OFF" mode. For these power states,
the retention state for memories are basically ignored by PRCM and does
not require to be programmed.
[1] http://www.ti.com/lit/pdf/spruhz6
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
As per the latest revision F of public TRM for DRA7/AM57xx SoCs
SPRUHZ6F[1] (April 2016), with the exception of MPU power domain (and
CPUx sub power domains), all other power domains can either operate
in "ON" mode OR in some cases, "OFF" mode. For these power states,
the logic retention state is basically ignored by PRCM and does not
require to be programmed.
[1] http://www.ti.com/lit/pdf/spruhz6
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
As per the latest revision F of public TRM for DRA7/AM57xx SoCs
SPRUHZ6F[1] (April 2016), L4Per and L3init power domains now operate in
always "ON" mode due to asymmetric aging limitations. Update the same
[1] http://www.ti.com/lit/pdf/spruhz6
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Currently pmd_mknotpresent will use a zero entry to respresent an
invalidated pmd.
Unfortunately this definition clashes with pmd_none, thus it is
possible for a race condition to occur if zap_pmd_range sees pmd_none
whilst __split_huge_pmd_locked is running too with pmdp_invalidate
just called.
This patch fixes the race condition by modifying pmd_mknotpresent to
create non-zero faulting entries (as is done in other architectures),
removing the ambiguity with pmd_none.
[catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT]
Fixes: 8d96250700 ("ARM: mm: Transparent huge page support for LPAE systems.")
Cc: <stable@vger.kernel.org> # 3.11+
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In a subsequent patch, pmd_mknotpresent will clear the valid bit of the
pmd entry, resulting in a not-present entry from the hardware's
perspective. Unfortunately, pmd_present simply checks for a non-zero pmd
value and will therefore continue to return true even after a
pmd_mknotpresent operation. Since pmd_mknotpresent is only used for
managing huge entries, this is only an issue for the 3-level case.
This patch fixes the 3-level pmd_present implementation to take into
account the valid bit. For bisectability, the change is made before the
fix to pmd_mknotpresent.
[catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch]
Fixes: 8d96250700 ("ARM: mm: Transparent huge page support for LPAE systems.")
Cc: <stable@vger.kernel.org> # 3.11+
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steve Capper <Steve.Capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
AM43XX SoCs make use of the omap_l3_noc driver so explicitly select
OMAP_INTERCONNECT in the Kconfig for SOC_AM43XX to ensure it always gets
enabled for AM43XX only builds.
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
DSS's 'pll2_clkctrl' and 'pll2' have wrong addresses in the dra74x.dtsi
file. Video PLL2 has not been used so wrong addresses went unnoticed.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Enable Erratum 430973 similar to commit 5c86c5339c ("ARM:
omap2plus_defconfig: Enable ARM erratum 430973 for omap3") - Since
multiple defconfigs can exist from various points of view (multi_v7,
omap2plus etc.. it is always better to enable the erratum from the
Kconfig selection point of view so that downstream kernels dont have
to rediscover this all over again.
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Pull clk fixes from Stephen Boyd:
"This finally removes the CLK_IS_ROOT flag by picking up the last few
stragglers that didn't get merged by anyone this time around.
Better to do it now than wait for another one to pop up. There's also
a minor maintainers update and a Kconfig fix"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: nxp: Select MFD_SYSCON for creg driver
MAINTAINERS: Add file patterns for clock device tree bindings
clk: Remove CLK_IS_ROOT flag
clk: microchip: Remove CLK_IS_ROOT
powerpc/512x: clk: Remove CLK_IS_ROOT
vexpress/spc: Remove CLK_IS_ROOT
Commit bea7eef694 ("ARM: dts: exynos: Fix DTC unit name warnings in
Peach Pit") fixed the DTC warnings about mismatches between unit names
and reg properties in the Exynos5420 Peach Pit DTS.
But unfortunately it also added a regression on the Peach Pit when
changing the port node names since the OF graph logic expects the port
nodes to be always named 'port'.
The Documentation/devicetree/bindings/graph.txt binding document says
that when there is more than one port, '#address-cells', '#size-cells'
and 'reg' properties should be used to number the port nodes.
Fixes: bea7eef694 ("ARM: dts: exynos: Fix DTC unit name warnings in Peach Pit")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Commit 5c9cbade06 ("ARM: dts: exynos: Fix DTC unit name warnings in
Exynos5250") fixed all the DTC warnings about mismatchs between unit
names and reg properties in Exynos5250 boards DTS.
But unfortunately it also added a regression on the Exynos5250 Snow
Chromebook when changing the port node names since the OF graph logic
expects the port nodes to be always named 'port'.
The Documentation/devicetree/bindings/graph.txt binding document says
that when there is more than one port, '#address-cells', '#size-cells'
and 'reg' properties should be used to number the port nodes.
Fixes: 5c9cbade06 ("ARM: dts: exynos: Fix DTC unit name warnings in Exynos5250")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Pull ARM fix from Russell King:
"Just one fix to the ptrace code, spotted by Simon Marchi, where if a
thread migrates to a different CPU and the VFP registers are changed
through ptrace, the application doesn't see the updated VFP registers"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: fix PTRACE_SETVFPREGS on SMP systems
This is the same issue fixed in commit dcf5341f01 ("ARM: dts:
sun8i-q8-common: Do not set constraints on dc1sw regulator").
Commit message copied:
dc1sw is an on/off only regulator and as such it cannot have constraints.
This is a limitation of the kernel regulator implementation which resolves
supplies on the first regulator_get(), which is done after applying
constraints, and applying the constrains will fail because it calls
_regulator_get_voltage() and _regulator_do_set_voltage() both of which
will fail on a switch regulator when there is no supply (yet).
This causes registering of all axp22x regulators to fail with the
following errors:
[ 1.395249] vcc-lcd: failed to get the current voltage(-22)
[ 1.405131] axp20x-regulator axp20x-regulator: Failed to register dc1sw
[ 1.412436] axp20x-regulator: probe of axp20x-regulator failed with error -22
This commit removes the constrains on dc1sw / vcc-lcd fixing this problem.
Note that dcdc1 itself is contrained to the exact same values, so this
does not change anything.
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Cc: <stable@vger.kernel.org> # 4.6
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
This is the same issue fixed in commit dcf5341f01 ("ARM: dts:
sun8i-q8-common: Do not set constraints on dc1sw regulator").
Commit message copied:
dc1sw is an on/off only regulator and as such it cannot have constraints.
This is a limitation of the kernel regulator implementation which resolves
supplies on the first regulator_get(), which is done after applying
constraints, and applying the constrains will fail because it calls
_regulator_get_voltage() and _regulator_do_set_voltage() both of which
will fail on a switch regulator when there is no supply (yet).
This causes registering of all axp22x regulators to fail with the
following errors:
[ 1.395249] vcc-lcd: failed to get the current voltage(-22)
[ 1.405131] axp20x-regulator axp20x-regulator: Failed to register dc1sw
[ 1.412436] axp20x-regulator: probe of axp20x-regulator failed with error -22
This commit removes the constrains on dc1sw / vcc-lcd fixing this problem.
Note that dcdc1 itself is contrained to the exact same values, so this
does not change anything.
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Cc: <stable@vger.kernel.org> # 4.6
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
DTS fixes for omaps for v4.7 merge window for issues noted
with patches in Linux next:
- Fix omap5 and am57xx-idk input voltages to fix micro-sd
probing at least for some omap5-uevm configurations
- Fix unhandled fault for igepv5 audio
- Fix UART wakeirqs for omap5 by removing WAKUP_EN flags,
those are managed by the wakeirq and can currently confuse
the wakeirqs as there is no handler necessarily registered
- Fix LDO7 source for igepv5
Also included are few minor changes not strictly fixes
are good to have merged:
- Fix HP T410 boot time warnings for eMMC and disable the
unused MMC interfaces while at it
- Add dra7 gpmc dma channel
- Add igep00x0 SD card detect and write protect GPIOs
* tag 'omap-for-v4.7-dts-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: igep0020: Add SD card write-protect pin.
ARM: dts: igep00x0: Add SD card-detect.
ARM: dts: am57xx-idk-common: Fix input supply names
ARM: dts: dra7: Add gpmc dma channel
ARM: dts: disable mmc by default and enable when needed for dm814x
ARM: dts: Add non-removable to hsmmc on hp-t410
ARM: dts: Fix ldo7 source for HDMI on igepv5
ARM: dts: Fix uart wakeirq on omap5 by removing WAKEUP_EN for omaps
ARM: dts: Fix igepv5 audiopwon-gpio
ARM: dts: omap5-board-common: Describe the voltage supply mapping accurately
Signed-off-by: Olof Johansson <olof@lixom.net>
Two fixes for omaps for v4.7 merge window, one to enable
ARM errata for am437x, and the other to add ARM errtum
workaround for dra7.
AFAIK these both can wait for v4.7, we can then request them
for stable kernels as needed.
* tag 'omap-for-v4.7/fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: AM43XX: Enable fixes for Cortex-A9 errata
ARM: OMAP5 / DRA7: Introduce workaround for 801819
Signed-off-by: Olof Johansson <olof@lixom.net>
commit 27dd9af6bc ("ARM: dts: sunxi: Add a olinuxino-lime2-emmc")
added the new emmc equipped lime2 but forgot its Makefile.
This patch adds an entry to the Makefile.
Signed-off-by: Olliver Schinagl <oliver@schinagl.nl>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Two fixes for v4.7 cycle for build issues:
1. Fix samsung-keypad build error if INPUT is selected as module.
The error though depends on some uncommon build settings so it
is not as easy to trigger.
2. Get rid of 'samsung_device_dma_mask' defined but not used warning.
* tag 'samsung-fixes-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
ARM: exynos: don't select keyboard driver
ARM: samsung: improve static dma_mask definition
Signed-off-by: Olof Johansson <olof@lixom.net>
PTRACE_SETVFPREGS fails to properly mark the VFP register set to be
reloaded, because it undoes one of the effects of vfp_flush_hwstate().
Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to
an invalid CPU number, but vfp_set() overwrites this with the original
CPU number, thereby rendering the hardware state as apparently "valid",
even though the software state is more recent.
Fix this by reverting the previous change.
Cc: <stable@vger.kernel.org>
Fixes: 8130b9d7b9 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers")
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Simon Marchi <simon.marchi@ericsson.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
This flag is a no-op now (see commit 47b0eeb3dc "clk: Deprecate
CLK_IS_ROOT", 2016-02-02) so remove it.
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Pull second batch of KVM updates from Radim Krčmář:
"General:
- move kvm_stat tool from QEMU repo into tools/kvm/kvm_stat (kvm_stat
had nothing to do with QEMU in the first place -- the tool only
interprets debugfs)
- expose per-vm statistics in debugfs and support them in kvm_stat
(KVM always collected per-vm statistics, but they were summarised
into global statistics)
x86:
- fix dynamic APICv (VMX was improperly configured and a guest could
access host's APIC MSRs, CVE-2016-4440)
- minor fixes
ARM changes from Christoffer Dall:
- new vgic reimplementation of our horribly broken legacy vgic
implementation. The two implementations will live side-by-side
(with the new being the configured default) for one kernel release
and then we'll remove the legacy one.
- fix for a non-critical issue with virtual abort injection to guests"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
tools: kvm_stat: Add comments
tools: kvm_stat: Introduce pid monitoring
KVM: Create debugfs dir and stat files for each VM
MAINTAINERS: Add kvm tools
tools: kvm_stat: Powerpc related fixes
tools: Add kvm_stat man page
tools: Add kvm_stat vm monitor script
kvm:vmx: more complete state update on APICv on/off
KVM: SVM: Add more SVM_EXIT_REASONS
KVM: Unify traced vector format
svm: bitwise vs logical op typo
KVM: arm/arm64: vgic-new: Synchronize changes to active state
KVM: arm/arm64: vgic-new: enable build
KVM: arm/arm64: vgic-new: implement mapped IRQ handling
KVM: arm/arm64: vgic-new: Wire up irqfd injection
KVM: arm/arm64: vgic-new: Add vgic_v2/v3_enable
KVM: arm/arm64: vgic-new: vgic_init: implement map_resources
KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init
KVM: arm/arm64: vgic-new: vgic_init: implement vgic_create
KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init
...
Pull kbuild updates from Michal Marek:
- new option CONFIG_TRIM_UNUSED_KSYMS which does a two-pass build and
unexports symbols which are not used in the current config [Nicolas
Pitre]
- several kbuild rule cleanups [Masahiro Yamada]
- warning option adjustments for gcov etc [Arnd Bergmann]
- a few more small fixes
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (31 commits)
kbuild: move -Wunused-const-variable to W=1 warning level
kbuild: fix if_change and friends to consider argument order
kbuild: fix adjust_autoksyms.sh for modules that need only one symbol
kbuild: fix ksym_dep_filter when multiple EXPORT_SYMBOL() on the same line
gcov: disable -Wmaybe-uninitialized warning
gcov: disable tree-loop-im to reduce stack usage
gcov: disable for COMPILE_TEST
Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES
Kbuild: change CC_OPTIMIZE_FOR_SIZE definition
kbuild: forbid kernel directory to contain spaces and colons
kbuild: adjust ksym_dep_filter for some cmd_* renames
kbuild: Fix dependencies for final vmlinux link
kbuild: better abstract vmlinux sequential prerequisites
kbuild: fix call to adjust_autoksyms.sh when output directory specified
kbuild: Get rid of KBUILD_STR
kbuild: rename cmd_as_s_S to cmd_cpp_s_S
kbuild: rename cmd_cc_i_c to cmd_cpp_i_c
kbuild: drop redundant "PHONY += FORCE"
kbuild: delete unnecessary "@:"
kbuild: mark help target as PHONY
...
Pull perf updates from Ingo Molnar:
"Mostly tooling and PMU driver fixes, but also a number of late updates
such as the reworking of the call-chain size limiting logic to make
call-graph recording more robust, plus tooling side changes for the
new 'backwards ring-buffer' extension to the perf ring-buffer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
perf record: Read from backward ring buffer
perf record: Rename variable to make code clear
perf record: Prevent reading invalid data in record__mmap_read
perf evlist: Add API to pause/resume
perf trace: Use the ptr->name beautifier as default for "filename" args
perf trace: Use the fd->name beautifier as default for "fd" args
perf report: Add srcline_from/to branch sort keys
perf evsel: Record fd into perf_mmap
perf evsel: Add overwrite attribute and check write_backward
perf tools: Set buildid dir under symfs when --symfs is provided
perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
perf annotate: Sort list of recognised instructions
perf annotate: Fix identification of ARM blt and bls instructions
perf tools: Fix usage of max_stack sysctl
perf callchain: Stop validating callchains by the max_stack sysctl
perf trace: Fix exit_group() formatting
perf top: Use machine->kptr_restrict_warned
perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
perf machine: Do not bail out if not managing to read ref reloc symbol
perf/x86/intel/p4: Trival indentation fix, remove space
...
Pull pwm updates from Thierry Reding:
"This set of changes introduces an atomic API to the PWM subsystem.
This is influenced by the DRM atomic API that was introduced a while
back, though it is obviously a lot simpler. The fundamental idea
remains the same, though: drivers provide a single callback to
implement the atomic configuration of a PWM channel.
As a side-effect the PWM subsystem gains the ability for initial state
retrieval, so that the logical state mirrors that of the hardware.
Many use-cases don't care about this, but for others it is essential.
These new features require changes in all users, which these patches
take care of. The core is transitioned to use the atomic callback if
available and provides a fallback mechanism for other drivers.
Changes to transition users and drivers to the atomic API are
postponed to v4.8"
* tag 'pwm/for-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (30 commits)
pwm: Add information about polarity, duty cycle and period to debugfs
pwm: Switch to the atomic API
pwm: Update documentation
pwm: Add core infrastructure to allow atomic updates
pwm: Add hardware readout infrastructure
pwm: Move the enabled/disabled info into pwm_state
pwm: Introduce the pwm_state concept
pwm: Keep PWM state in sync with hardware state
ARM: Explicitly apply PWM config extracted from pwm_args
drm: i915: Explicitly apply PWM config extracted from pwm_args
input: misc: pwm-beeper: Explicitly apply PWM config extracted from pwm_args
input: misc: max8997: Explicitly apply PWM config extracted from pwm_args
backlight: lm3630a: explicitly apply PWM config extracted from pwm_args
backlight: lp855x: Explicitly apply PWM config extracted from pwm_args
backlight: lp8788: Explicitly apply PWM config extracted from pwm_args
backlight: pwm_bl: Use pwm_get_args() where appropriate
fbdev: ssd1307fb: Use pwm_get_args() where appropriate
regulator: pwm: Use pwm_get_args() where appropriate
leds: pwm: Use pwm_get_args() where appropriate
input: misc: max77693: Use pwm_get_args() where appropriate
...
Pull ARM SoC fixes from Arnd Bergmann:
"This is a first set of bug fixes on top of what was merged for 4.7.
Two patches for lpc32xx address a harmless build warning that was just
introduced, one patch for the mediatek soc driver fixes a warning for
arm64, and the pxa changes are minor cleanups that should have been
part of the original pull requests but that I forgot to apply to the
cleanup-fixes branch earlier"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: lpc32xx: fix NR_IRQS confict
ARM: lpc32xx: remove legacy irq controller driver
soc: mtk-pmic-wrap: avoid integer overflow warning
ARM: pxa: Remove CLK_IS_ROOT
ARM: pxa: activate pinctrl for device-tree machines
Pull ARM SoC late DT updates from Arnd Bergmann:
"This is a collection of a few late fixes and other misc stuff that had
dependencies on things being merged from other trees.
The Renesas R-Car power domain handling, and the Nvidia Tegra USB
support both hand notable changes that required changing the DT
binding in a way that only provides compatibility with old DT blobs on
new kernels but not vice versa. As a consequence, the DT changes are
based on top of the driver changes and are now in this branch.
For NXP i.MX and Samsung Exynos, the changes in here depend on other
changes that got merged through the clk maintainer tree"
* tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (35 commits)
ARM: dts: exynos: Add support of Bus frequency using VDD_INT for exynos5422-odroidxu3
ARM: dts: exynos: Add bus nodes using VDD_INT for Exynos542x SoC
ARM: dts: exynos: Add NoC Probe dt node for Exynos542x SoC
ARM: dts: exynos: Add support of bus frequency for exynos4412-trats/odroidu3
ARM: dts: exynos: Expand the voltage range of buck1/3 regulator for exynos4412-odroidu3
ARM: dts: exynos: Add support of bus frequency using VDD_INT for exynos3250-rinato
ARM: dts: exynos: Add exynos4412-ppmu-common dtsi to delete duplicate PPMU nodes
ARM: dts: exynos: Add bus nodes using VDD_MIF for Exynos4210
ARM: dts: exynos: Add bus nodes using VDD_INT for Exynos4x12
ARM: dts: exynos: Add bus nodes using VDD_MIF for Exynos4x12
ARM: dts: exynos: Add bus nodes using VDD_INT for Exynos3250
ARM: dts: exynos: Add DMC bus frequency for exynos3250-rinato/monk
ARM: dts: exynos: Add DMC bus node for Exynos3250
ARM: tegra: Enable XUSB on Nyan
ARM: tegra: Enable XUSB on Jetson TK1
ARM: tegra: Enable XUSB on Venice2
ARM: tegra: Add Tegra124 XUSB controller
ARM: tegra: Move Tegra124 to the new XUSB pad controller binding
ARM: dts: r8a7794: Use SYSC "always-on" PM Domain
ARM: dts: r8a7793: Use SYSC "always-on" PM Domain
...
With the change to sparse IRQs, the lpc32xx platform gets a warning about
conflicting macros:
In file included from arch/arm/mach-lpc32xx/irq.c:31:0:
arch/arm/mach-lpc32xx/include/mach/irqs.h:115:0: warning: "NR_IRQS" redefined
#define NR_IRQS 96
arch/arm/include/asm/irq.h:9:0: note: this is the location of the previous definition
#define NR_IRQS NR_IRQS_LEGACY
One such instance was in the old irq driver that is now removed by
the previous patch, but any other file including mach/irqs.h still
has the issue. Since none of them use this constant, we can just
remove the old definition.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 8cb17b5ed0 ("irqchip: Add LPC32xx interrupt controller driver")
New NXP LPC32xx irq chip driver is used instead of a legacy one.
[this also fixes a harmless build warning about the NR_IRQS redefinition]
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull MTD updates from Brian Norris:
"First cycle with Boris as NAND maintainer! Many (most) bullets stolen
from him.
Generic:
- Migrated NAND LED trigger to be a generic MTD trigger
NAND:
- Introduction of the "ECC algorithm" concept, to avoid overloading
the ECC mode field too much more
- Replaced the nand_ecclayout infrastructure with something a little
more flexible (finally!) and future proof
- Rework of the OMAP GPMC and NAND drivers; the TI folks pulled some
of this into their own tree as well
- Prepare the sunxi NAND driver to receive DMA support
- Handle bitflips in erased pages on GPMI revisions that do not
support this in hardware.
SPI NOR:
- Start using the spi_flash_read() API for SPI drivers that support
it (i.e., SPI drivers with special memory-mapped flash modes)
And other small scattered improvments"
* tag 'for-linus-20160523' of git://git.infradead.org/linux-mtd: (155 commits)
mtd: spi-nor: support GigaDevice gd25lq64c
mtd: nand_bch: fix spelling of "probably"
mtd: brcmnand: respect ECC algorithm set by NAND subsystem
gpmi-nand: Handle ECC Errors in erased pages
Documentation: devicetree: deprecate "soft_bch" nand-ecc-mode value
mtd: nand: add support for "nand-ecc-algo" DT property
mtd: mtd: drop NAND_ECC_SOFT_BCH enum value
mtd: drop support for NAND_ECC_SOFT_BCH as "soft_bch" mapping
mtd: nand: read ECC algorithm from the new field
mtd: nand: fsmc: validate ECC setup by checking algorithm directly
mtd: nand: set ECC algorithm to Hamming on fallback
staging: mt29f_spinand: set ECC algorithm explicitly
CRIS v32: nand: set ECC algorithm explicitly
mtd: nand: atmel: set ECC algorithm explicitly
mtd: nand: davinci: set ECC algorithm explicitly
mtd: nand: bf5xx: set ECC algorithm explicitly
mtd: nand: omap2: Fix high memory dma prefetch transfer
mtd: nand: omap2: Start dma request before enabling prefetch
mtd: nandsim: add __init attribute
mtd: nand: move of_get_nand_xxx() helpers into nand_base.c
...
KVM/ARM Changes for v4.7 take 2
"The GIC is dead; Long live the GIC"
This set of changes include the new vgic, which is a reimplementation of
our horribly broken legacy vgic implementation. The two implementations
will live side-by-side (with the new being the configured default) for
one kernel release and then we'll remove it.
Also fixes a non-critical issue with virtual abort injection to guests.
Merge yet more updates from Andrew Morton:
- Oleg's "wait/ptrace: assume __WALL if the child is traced". It's a
kernel-based workaround for existing userspace issues.
- A few hotfixes
- befs cleanups
- nilfs2 updates
- sys_wait() changes
- kexec updates
- kdump
- scripts/gdb updates
- the last of the MM queue
- a few other misc things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (84 commits)
kgdb: depends on VT
drm/amdgpu: make amdgpu_mn_get wait for mmap_sem killable
drm/radeon: make radeon_mn_get wait for mmap_sem killable
drm/i915: make i915_gem_mmap_ioctl wait for mmap_sem killable
uprobes: wait for mmap_sem for write killable
prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable
exec: make exec path waiting for mmap_sem killable
aio: make aio_setup_ring killable
coredump: make coredump_wait wait for mmap_sem for write killable
vdso: make arch_setup_additional_pages wait for mmap_sem for write killable
ipc, shm: make shmem attach/detach wait for mmap_sem killable
mm, fork: make dup_mmap wait for mmap_sem for write killable
mm, proc: make clear_refs killable
mm: make vm_brk killable
mm, elf: handle vm_brk error
mm, aout: handle vm_brk failures
mm: make vm_munmap killable
mm: make vm_mmap killable
mm: make mmap_sem for write waits killable for mm syscalls
MAINTAINERS: add co-maintainer for scripts/gdb
...
most architectures are relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Andy Lutomirski <luto@amacapital.net> [x86 vdso]
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull drm updates from Dave Airlie:
"Here's the main drm pull request for 4.7, it's been a busy one, and
I've been a bit more distracted in real life this merge window. Lots
more ARM drivers, not sure if it'll ever end. I think I've at least
one more coming the next merge window.
But changes are all over the place, support for AMD Polaris GPUs is in
here, some missing GM108 support for nouveau (found in some Lenovos),
a bunch of MST and skylake fixes.
I've also noticed a few fixes from Arnd in my inbox, that I'll try and
get in asap, but I didn't think they should hold this up.
New drivers:
- Hisilicon kirin display driver
- Mediatek MT8173 display driver
- ARC PGU - bitstreamer on Synopsys ARC SDP boards
- Allwinner A13 initial RGB output driver
- Analogix driver for DisplayPort IP found in exynos and rockchip
DRM Core:
- UAPI headers fixes and C++ safety
- DRM connector reference counting
- DisplayID mode parsing for Dell 5K monitors
- Removal of struct_mutex from drivers
- Connector registration cleanups
- MST robustness fixes
- MAINTAINERS updates
- Lockless GEM object freeing
- Generic fbdev deferred IO support
panel:
- Support for a bunch of new panels
i915:
- VBT refactoring
- PLL computation cleanups
- DSI support for BXT
- Color manager support
- More atomic patches
- GEM improvements
- GuC fw loading fixes
- DP detection fixes
- SKL GPU hang fixes
- Lots of BXT fixes
radeon/amdgpu:
- Initial Polaris support
- GPUVM/Scheduler/Clock/Power improvements
- ASYNC pageflip support
- New mesa feature support
nouveau:
- GM108 support
- Power sensor support improvements
- GR init + ucode fixes.
- Use GPU provided topology information
vmwgfx:
- Add host messaging support
gma500:
- Some cleanups and fixes
atmel:
- Bridge support
- Async atomic commit support
fsl-dcu:
- Timing controller for LCD support
- Pixel clock polarity support
rcar-du:
- Misc fixes
exynos:
- Pipeline clock support
- Exynoss4533 SoC support
- HW trigger mode support
- export HDMI_PHY clock
- DECON5433 fixes
- Use generic prime functions
- use DMA mapping APIs
rockchip:
- Lots of little fixes
vc4:
- Render node support
- Gamma ramp support
- DPI output support
msm:
- Mostly cleanups and fixes
- Conversion to generic struct fence
etnaviv:
- Fix for prime buffer handling
- Allow hangcheck to be coalesced with other wakeups
tegra:
- Gamme table size fix"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1050 commits)
drm/edid: add displayid detailed 1 timings to the modelist. (v1.1)
drm/edid: move displayid validation to it's own function.
drm/displayid: Iterate over all DisplayID blocks
drm/edid: move displayid tiled block parsing into separate function.
drm: Nuke ->vblank_disable_allowed
drm/vmwgfx: Report vmwgfx version to vmware.log
drm/vmwgfx: Add VMWare host messaging capability
drm/vmwgfx: Kill some lockdep warnings
drm/nouveau/gr/gf100-: fix race condition in fecs/gpccs ucode
drm/nouveau/core: recognise GM108 chipsets
drm/nouveau/gr/gm107-: fix touching non-existent ppcs in attrib cb setup
drm/nouveau/gr/gk104-: share implementation of ppc exception init
drm/nouveau/gr/gk104-: move rop_active_fbps init to nonctx
drm/nouveau/bios/pll: check BIT table version before trying to parse it
drm/nouveau/bios/pll: prevent oops when limits table can't be parsed
drm/nouveau/volt/gk104: round up in gk104_volt_set
drm/nouveau/fb/gm200: setup mmu debug buffer registers at init()
drm/nouveau/fb/gk20a,gm20b: setup mmu debug buffer registers at init()
drm/nouveau/fb/gf100-: allocate mmu debug buffers
drm/nouveau/fb: allow chipset-specific actions for oneinit()
...
The binary GCD algorithm is based on the following facts:
1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
Even on x86 machines with reasonable division hardware, the binary
algorithm runs about 25% faster (80% the execution time) than the
division-based Euclidian algorithm.
On platforms like Alpha and ARMv6 where division is a function call to
emulation code, it's even more significant.
There are two variants of the code here, depending on whether a fast
__ffs (find least significant set bit) instruction is available. This
allows the unpredictable branches in the bit-at-a-time shifting loop to
be eliminated.
If fast __ffs is not available, the "even/odd" GCD variant is used.
I use the following code to benchmark:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#define swap(a, b) \
do { \
a ^= b; \
b ^= a; \
a ^= b; \
} while (0)
unsigned long gcd0(unsigned long a, unsigned long b)
{
unsigned long r;
if (a < b) {
swap(a, b);
}
if (b == 0)
return a;
while ((r = a % b) != 0) {
a = b;
b = r;
}
return b;
}
unsigned long gcd1(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
for (;;) {
a >>= __builtin_ctzl(a);
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd2(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
unsigned long gcd3(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
b >>= __builtin_ctzl(b);
if (b == 1)
return r & -r;
for (;;) {
a >>= __builtin_ctzl(a);
if (a == 1)
return r & -r;
if (a == b)
return a << __builtin_ctzl(r);
if (a < b)
swap(a, b);
a -= b;
}
}
unsigned long gcd4(unsigned long a, unsigned long b)
{
unsigned long r = a | b;
if (!a || !b)
return r;
r &= -r;
while (!(b & r))
b >>= 1;
if (b == r)
return r;
for (;;) {
while (!(a & r))
a >>= 1;
if (a == r)
return r;
if (a == b)
return a;
if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}
static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
gcd0, gcd1, gcd2, gcd3, gcd4,
};
#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
#if defined(__x86_64__)
#define rdtscll(val) do { \
unsigned long __a,__d; \
__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
} while(0)
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
unsigned long long start, end;
unsigned long long ret;
unsigned long gcd_res;
rdtscll(start);
gcd_res = gcd(a, b);
rdtscll(end);
if (end >= start)
ret = end - start;
else
ret = ~0ULL - start + 1 + end;
*res = gcd_res;
return ret;
}
#else
static inline struct timespec read_time(void)
{
struct timespec time;
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
return time;
}
static inline unsigned long long diff_time(struct timespec start, struct timespec end)
{
struct timespec temp;
if ((end.tv_nsec - start.tv_nsec) < 0) {
temp.tv_sec = end.tv_sec - start.tv_sec - 1;
temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
} else {
temp.tv_sec = end.tv_sec - start.tv_sec;
temp.tv_nsec = end.tv_nsec - start.tv_nsec;
}
return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
}
static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
struct timespec start, end;
unsigned long gcd_res;
start = read_time();
gcd_res = gcd(a, b);
end = read_time();
*res = gcd_res;
return diff_time(start, end);
}
#endif
static inline unsigned long get_rand()
{
if (sizeof(long) == 8)
return (unsigned long)rand() << 32 | rand();
else
return rand();
}
int main(int argc, char **argv)
{
unsigned int seed = time(0);
int loops = 100;
int repeats = 1000;
unsigned long (*res)[TEST_ENTRIES];
unsigned long long elapsed[TEST_ENTRIES];
int i, j, k;
for (;;) {
int opt = getopt(argc, argv, "n:r:s:");
/* End condition always first */
if (opt == -1)
break;
switch (opt) {
case 'n':
loops = atoi(optarg);
break;
case 'r':
repeats = atoi(optarg);
break;
case 's':
seed = strtoul(optarg, NULL, 10);
break;
default:
/* You won't actually get here. */
break;
}
}
res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
memset(elapsed, 0, sizeof(elapsed));
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
/* Do we have args? */
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
unsigned long long min_elapsed[TEST_ENTRIES];
for (k = 0; k < repeats; k++) {
for (i = 0; i < TEST_ENTRIES; i++) {
unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
if (k == 0 || min_elapsed[i] > tmp)
min_elapsed[i] = tmp;
}
}
for (i = 0; i < TEST_ENTRIES; i++)
elapsed[i] += min_elapsed[i];
}
for (i = 0; i < TEST_ENTRIES; i++)
printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
k = 0;
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
for (i = 1; i < TEST_ENTRIES; i++) {
if (res[j][i] != res[j][0])
break;
}
if (i < TEST_ENTRIES) {
if (k == 0) {
k = 1;
fprintf(stderr, "Error:\n");
}
fprintf(stderr, "gcd(%lu, %lu): ", a, b);
for (i = 0; i < TEST_ENTRIES; i++)
fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
}
}
if (k == 0)
fprintf(stderr, "PASS\n");
free(res);
return 0;
}
Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 10174
gcd1: elapsed 2120
gcd2: elapsed 2902
gcd3: elapsed 2039
gcd4: elapsed 2812
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9309
gcd1: elapsed 2280
gcd2: elapsed 2822
gcd3: elapsed 2217
gcd4: elapsed 2710
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9589
gcd1: elapsed 2098
gcd2: elapsed 2815
gcd3: elapsed 2030
gcd4: elapsed 2718
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9914
gcd1: elapsed 2309
gcd2: elapsed 2779
gcd3: elapsed 2228
gcd4: elapsed 2709
PASS
[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>