* pm-cpuidle:
cpuidle: Measure idle state durations with monotonic clock
cpuidle: fix a suspicious RCU usage in menu governor
cpuidle: support multiple drivers
cpuidle: prepare the cpuidle core to handle multiple drivers
cpuidle: move driver checking within the lock section
cpuidle: move driver's refcount to cpuidle
cpuidle: fixup device.h header in cpuidle.h
cpuidle / sysfs: move structure declaration into the sysfs.c file
cpuidle: Get typical recent sleep interval
cpuidle: Set residency to 0 if target Cstate not enter
cpuidle: Quickly notice prediction failure in general case
cpuidle: Quickly notice prediction failure for repeat mode
cpuidle / sysfs: move kobj initialization in the syfs file
cpuidle / sysfs: change function parameter
* pm-cpufreq: (21 commits)
cpufreq: ondemand: update sampling rate only on right CPUs
cpufreq: SPEAr: Add CPUFreq driver
cpufreq: governors: Fix jiffies/cputime mixup (revisited)
cpufreq: ondemand: fix wrong delay sampling rate
cpufreq: exynos: Use static for functions used in only this file
cpufreq: exynos: Broadcast frequency change notifications for all cores
cpufreq: remove use of __devexit
cpufreq: remove use of __devinit
cpufreq: remove use of __devexit_p
cpufreq: Remove unnecessary initialization of a local variable
cpufreq: Make sure target freq is within limits
cpufreq: Avoid calling cpufreq driver's target() routine if target_freq == policy->cur
cpufreq: Fix sparse warning by making local function static
cpufreq: Fix sparse warnings by updating cputime64_t to u64
cpufreq: governors: remove redundant code
cpufreq: return early from __cpufreq_driver_getavg()
cpufreq: fix jiffies/cputime mixup in conservative/ondemand governors
cpufreq: Improve debug prints
cpufreq: Move common part from governors to separate file, v2
cpufreq / core: Fix printing of governor and driver name
...
* acpi-general: (38 commits)
ACPI / thermal: _TMP and _CRT/_HOT/_PSV/_ACx dependency fix
ACPI: drop unnecessary local variable from acpi_system_write_wakeup_device()
ACPI: Fix logging when no pci_irq is allocated
ACPI: Update Dock hotplug error messages
ACPI: Update Container hotplug error messages
ACPI: Update Memory hotplug error messages
ACPI: Update CPU hotplug error messages
ACPI: Add acpi_handle_<level>() interfaces
ACPI: remove use of __devexit
ACPI / PM: Add Sony Vaio VPCEB1S1E to nonvs blacklist.
ACPI / battery: Correct battery capacity values on Thinkpads
Revert "ACPI / x86: Add quirk for "CheckPoint P-20-00" to not use bridge _CRS_ info"
ACPI: create _SUN sysfs file
ACPI / memhotplug: bind the memory device when the driver is being loaded
ACPI / memhotplug: don't allow to eject the memory device if it is being used
ACPI / memhotplug: free memory device if acpi_memory_enable_device() failed
ACPI / memhotplug: fix memory leak when memory device is unbound from acpi_memhotplug
ACPI / memhotplug: deal with eject request in hotplug queue
ACPI / memory-hotplug: add memory offline code to acpi_memory_device_remove()
ACPI / memory-hotplug: call acpi_bus_trim() to remove memory device
...
Conflicts:
include/linux/acpi.h (two additions at the end of the same file)
* acpi-dev-pm:
ACPI / PM: Allow attach/detach routines to change device power states
ACPI / PM: Introduce os_accessible flag for power_state
ACPI / PM: Add check preventing transitioning to non-D0 state from D3.
ACPI / PM: Fix build problem when CONFIG_ACPI or CONFIG_PM is not set
ACPI / PM: Fix build problem related to acpi_target_system_state()
ACPI / PM: Provide ACPI PM callback routines for subsystems
ACPI / PM: Move device PM functions related to sleep states
ACPI / PM: Provide device PM functions operating on struct acpi_device
ACPI / PM: Split device wakeup management routines
ACPI / PM: Move runtime remote wakeup setup routine to device_pm.c
ACPI / PM: Move device power state selection routine to device_pm.c
ACPI / PM: Move routines for adding/removing device wakeup notifiers
ACPI / PM: Fix device PM kernedoc comments and #ifdefs
* pm-qos:
PM / QoS: Handle device PM QoS flags while removing constraints
PM / QoS: Resume device before exposing/hiding PM QoS flags
PM / QoS: Document request manipulation requirement for flags
PM / QoS: Fix a free error in the dev_pm_qos_constraints_destroy()
PM / QoS: Fix the return value of dev_pm_qos_update_request()
PM / ACPI: Take device PM QoS flags into account
PM / Domains: Check device PM QoS flags in pm_genpd_poweroff()
PM / QoS: Make it possible to expose PM QoS device flags to user space
PM / QoS: Introduce PM QoS device flags support
PM / QoS: Prepare struct dev_pm_qos_request for more request types
PM / QoS: Introduce request and constraint data types for PM QoS flags
PM / QoS: Prepare device structure for adding more constraint types
Make it possible to ask the routines used for adding/removing devices
to/from the general ACPI PM domain, acpi_dev_pm_attach() and
acpi_dev_pm_detach(), respectively, to change the power states of
devices so that they are put into the full-power state automatically
by acpi_dev_pm_attach() and into the lowest-power state available
automatically by acpi_dev_pm_detach().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Pull device tree regression fix from Grant Likely:
"Simple build regression fix for DT device drivers on Sparc. An
earlier change had masked out the of_iomap() helper on SPARC."
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
of/address: sparc: Declare of_iomap as an extern function for sparc again
This bug-fix makes sure that of_iomap is defined extern for sparc so that the
sparc-specific implementation of_iomap is once again used when including
include/linux/of_address.h in a sparc context. OF_GPIO that is now available for
sparc relies on this.
The bug was inadvertently introduced in a850a75, "of/address: add empty static
inlines for !CONFIG_OF", that added a static dummy inline for of_iomap when
!CONFIG_OF_ADDRESS. However, CONFIG_OF_ADDRESS is never defined for sparc, but
there is a sparc-specific implementation /arch/sparc/kernel/of_device_common.c.
This fix takes the same approach as 0bce04b that solved the equivalent problem
for of_address_to_resource.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Pull i2c fixes from Wolfram Sang:
"Bugfixes for the i2c subsystem.
Except for a few one-liners, there is mainly one revert because of an
overlooked dependency. Since there is no linux-next at the moment, I
did some extra testing, and all was fine for me."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: mxs: Handle i2c DMA failure properly
i2c: s3c2410: Fix code to free gpios
i2c: omap: ensure writes to dev->buf_len are ordered
Revert "ARM: OMAP: convert I2C driver to PM QoS for MPU latency constraints"
i2c: at91: fix SMBus quick command
If FREEZER is not defined, the error as following will be throw
when compiled.
arch/arm/kernel/signal.c:645: error: implicit declaration of function
'try_to_freeze_nowarn'
Signed-off-by: Haifeng Li <omycle@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ACPI 5 introduced I2cSerialBus resource that makes it possible to enumerate
and configure the I2C slave devices behind the I2C controller. This patch
adds helper functions to support I2C slave enumeration.
An ACPI enabled I2C controller driver only needs to call acpi_i2c_register_devices()
in order to get its slave devices enumerated, created and bound to the
corresponding ACPI handle.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull input updates from Dmitry Torokhov:
"This fixes recent regression where /dev/input/mice got assigned wrong
device node which messed up setups with static /dev, and a regression
in ads7846 GPIO debounce setup."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
ARM - OMAP: ads7846: fix pendown debounce setting
Input: ads7846 - enable pendown GPIO debounce time setting
Input: mousedev - move /dev/input/mice to the correct minor
Input: MT - document new 'flags' argument of input_mt_init_slots()
This patch introduces acpi_handle_<level>(), where <level> is
a kernel message level such as err/warn/info, to support improved
logging messages for ACPI, esp. hot-plug operations.
acpi_handle_<level>() appends "ACPI" prefix and ACPI object path
to the messages. This improves diagnosis of hotplug operations
since an error message in a log file identifies an object that
caused an issue. This interface acquires the global namespace
mutex to obtain an object path. In interrupt context, it shows
the object path as <n/a>.
acpi_handle_<level>() takes acpi_handle as an argument, which is
passed to ACPI hotplug notify handlers from the ACPICA. Therefore,
it is always available unlike other kernel objects, such as device.
For example:
acpi_handle_err(handle, "Device don't exist, dropping EJECT\n");
logs an error message like this at KERN_ERR.
ACPI: \_SB_.SCK4.CPU4: Device don't exist, dropping EJECT
ACPI hot-plug drivers can use acpi_handle_<level>() when they need
to identify a target ACPI object path in their messages, such as
error cases. The usage model is similar to dev_<level>().
acpi_handle_<level>() can be used when a device is not created or
is invalid during hot-plug operations. ACPI object path is also
consistent on the platform, unlike device name that gets incremented
over hotplug operations.
ACPI drivers should use dev_<level>() when a device object is valid.
Device name provides more user friendly information, and avoids
acquiring the global ACPI namespace mutex. ACPI drivers also
continue to use pr_<level>() when they do not need to specify device
information, such as boot-up messages.
Note: ACPI_[WARNING|INFO|ERROR]() are intended for the ACPICA and
are not associated with the kernel message level.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Tested-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some platforms need the pendown GPIO debounce time setting programmed.
Since the pendown GPIO is handled by the driver, the debounce time
should also be handled along with the pendown GPIO request.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The current platform device creation and registration code in
acpi_create_platform_device() is quite convoluted. This function
takes an ACPI device node as an argument and eventually calls
platform_device_register_resndata() to create and register a
platform device object on the basis of the information contained
in that code. However, it doesn't associate the new platform
device with the ACPI node directly, but instead it relies on
acpi_platform_notify(), called from within device_add(), to find
that ACPI node again with the help of acpi_platform_find_device()
and acpi_platform_match() and then attach the new platform device
to it. This causes an additional ACPI namespace walk to happen and
is clearly suboptimal.
Use the observation that it is now possible to initialize the ACPI
handle of a device before calling device_add() for it to make this
code more straightforward. Namely, add a new field to struct
platform_device_info allowing us to pass the ACPI handle of interest
to platform_device_register_full(), which will then use it to
initialize the new device's ACPI handle before registering it.
This will cause acpi_platform_notify() to use the ACPI handle from
the device structure directly instead of using the .find_device()
routine provided by the device's bus type. In consequence,
acpi_platform_bus, acpi_platform_find_device(), and
acpi_platform_match() are not necessary any more, so remove them.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To avoid adding an ACPI handle pointer to struct device on
architectures that don't use ACPI, or generally when CONFIG_ACPI is
not set, in which cases that pointer is useless, define struct
acpi_dev_node that will contain the handle pointer if CONFIG_ACPI is
set and will be empty otherwise and use it to represent the ACPI
device node field in struct device.
In addition to that define macros for reading and setting the ACPI
handle of a device that don't generate code when CONFIG_ACPI is
unset. Modify the ACPI subsystem to use those macros instead of
referring to the given device's ACPI handle directly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
revert "mm: fix-up zone present pages"
tmpfs: change final i_blocks BUG to WARNING
tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
rapidio: fix kernel-doc warnings
swapfile: fix name leak in swapoff
memcg: fix hotplugged memory zone oops
mips, arc: fix build failure
memcg: oom: fix totalpages calculation for memory.swappiness==0
mm: fix build warning for uninitialized value
mm: add anon_vma_lock to validate_mm()
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")
That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM. With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator. Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.
Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix rapidio kernel-doc warnings:
Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
Warning(include/linux/rio.h:290): No description found for parameter 'switches'
Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec. (On
many configurations, calculating zone from page is slower.)
But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded. Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
IP: __mod_zone_page_state+0x9/0x60
Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
Call Trace:
__pagevec_lru_add_fn+0xdf/0x140
pagevec_lru_move_fn+0xb1/0x100
__pagevec_lru_add+0x1c/0x30
lru_add_drain_cpu+0xa3/0x130
lru_add_drain+0x2f/0x40
...
The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone. The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.
So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.
Ah, there was one exceptionr. For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too. In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.
I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.
Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM SoC fixes from Olof Johansson:
"We've been sitting on this longer than we meant to due to travel and
other activities, but the number of patches is luckily not that high.
Biggest changes are from a batch of OMAP bugfixes, but there are a few
for the broader set of SoCs too (bcm2835, pxa, highbank, tegra, at91
and i.MX).
The OMAP patches contain some fixes for MUSB/PHY on omap4 which ends
up being a bit on the large side but needed for legacy (non-DT)
platforms. Beyond that there are a handful of hwmod/pm changes.
So, fairly noncontroversial stuff all in all, and as usual around this
time the fixes are well targeted at specific problems."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: imx: ehci: fix host power mask bit
ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
ARM: at91/usbh: fix overcurrent gpio setup
ARM: at91/AT91SAM9G45: fix crypto peripherals irq issue due to sparse irq support
ARM: boot: Fix usage of kecho
ARM: OMAP: ocp2scp: create omap device for ocp2scp
ARM: OMAP4: add _dev_attr_ to ocp2scp for representing usb_phy
drivers: bus: ocp2scp: add pdata support
irqchip: irq-bcm2835: Add terminating entry for of_device_id table
ARM: highbank: retry wfi on reset request
ARM: OMAP4: PM: fix regulator name for VDD_MPU
ARM: OMAP4: hwmod data: do not enable or reset the McPDM during kernel init
ARM: OMAP2+: hwmod: add flag to prevent hwmod code from touching IP block during init
ARM: dt: tegra: fix length of pad control and mux registers
ARM: OMAP: hwmod: wait for sysreset complete after enabling hwmod
ARM: OMAP2+: clockdomain: Fix OMAP4 ISS clk domain to support only SWSUP
ARM: pxa/spitz_pm: Fix hang when resuming from STR
ARM: pxa: hx4700: Fix backlight PWM device number
ARM: OMAP2+: PM: add missing newline to VC warning message
From Sascha Hauer <s.hauer@pengutronix.de>:
ARM i.MX fixes for 3.7-rc
* tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6:
ARM: imx: ehci: fix host power mask bit
ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Users of GCC 4.7 have reported compiler errors due to having inline
applied to function declarations in clk-provider.h. The definitions
exist in drivers/clk/clk.c. An example error:
In file included from arch/arm/mach-omap2/clockdomain.c:25:0:
arch/arm/mach-omap2/clockdomain.c: In function ‘clkdm_clk_disable’:
include/linux/clk-provider.h:338:12: error: inlining failed in call to always_inline ‘__clk_get_enable_count’: function body not available
arch/arm/mach-omap2/clockdomain.c:1001:28: error: called from here
make[1]: *** [arch/arm/mach-omap2/clockdomain.o] Error 1
make: *** [arch/arm/mach-omap2] Error 2
This patch removes the use of inline from include/linux/clk-provider.h
but keeps the function definitions in drivers/clk/clk.c as inlined since
they are one-liners.
Signed-off-by: Igor Mazanov <i.mazanov@gmail.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: improved subject, added changelog]
Commit e5cc8ef (ACPI / PM: Provide ACPI PM callback routines for
subsystems) introduced a build problem occuring if CONFIG_ACPI is
unset or CONFIG_PM is unset and errno.h is not included before
acpi.h, because in that case ENODEV used in acpi.h is undefined.
Fix the issue by making acpi.h include errno.h.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With the tegra3 and the big.LITTLE [1] new architectures, several cpus
with different characteristics (latencies and states) can co-exists on the
system.
The cpuidle framework has the limitation of handling only identical cpus.
This patch removes this limitation by introducing the multiple driver support
for cpuidle.
This option is configurable at compile time and should be enabled for the
architectures mentioned above. So there is no impact for the other platforms
if the option is disabled. The option defaults to 'n'. Note the multiple drivers
support is also compatible with the existing drivers, even if just one driver is
needed, all the cpu will be tied to this driver using an extra small chunk of
processor memory.
The multiple driver support use a per-cpu driver pointer instead of a global
variable and the accessor to this variable are done from a cpu context.
In order to keep the compatibility with the existing drivers, the function
'cpuidle_register_driver' and 'cpuidle_unregister_driver' will register
the specified driver for all the cpus.
The semantic for the output of /sys/devices/system/cpu/cpuidle/current_driver
remains the same except the driver name will be related to the current cpu.
The /sys/devices/system/cpu/cpu[0-9]/cpuidle/driver/name files are added
allowing to read the per cpu driver name.
[1] http://lwn.net/Articles/481055/
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We want to support different cpuidle drivers co-existing together.
In this case we should move the refcount to the cpuidle_driver
structure to handle several drivers at a time.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The structure cpuidle_state_kobj is not used anywhere except
in the sysfs.c file. The definition of this structure is not
needed in the cpuidle header file. This patch moves it to the
sysfs.c file in order to encapsulate the code a bit more.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The prediction for future is difficult and when the cpuidle governor prediction
fails and govenor possibly choose the shallower C-state than it should. How to
quickly notice and find the failure becomes important for power saving.
cpuidle menu governor has a method to predict the repeat pattern if there are 8
C-states residency which are continuous and the same or very close, so it will
predict the next C-states residency will keep same residency time.
There is a real case that turbostat utility (tools/power/x86/turbostat)
at kernel 3.3 or early. turbostat utility will read 10 registers one by one at
Sandybridge, so it will generate 10 IPIs to wake up idle CPUs. So cpuidle menu
governor will predict it is repeat mode and there is another IPI wake up idle
CPU soon, so it keeps idle CPU stay at C1 state even though CPU is totally
idle. However, in the turbostat, following 10 registers reading is sleep 5
seconds by default, so the idle CPU will keep at C1 for a long time though it is
idle until break event occurs.
In a idle Sandybridge system, run "./turbostat -v", we will notice that deep
C-state dangles between "70% ~ 99%". After patched the kernel, we will notice
deep C-state stays at >99.98%.
In the patch, a timer is added when menu governor detects a repeat mode and
choose a shallow C-state. The timer is set to a time out value that greater
than predicted time, and we conclude repeat mode prediction failure if timer is
triggered. When repeat mode happens as expected, the timer is not triggered
and CPU waken up from C-states and it will cancel the timer initiatively.
When repeat mode does not happen, the timer will be time out and menu governor
will quickly notice that the repeat mode prediction fails and then re-evaluates
deeper C-states possibility.
Below is another case which will clearly show the patch much benefit:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/time.h>
#include <time.h>
#include <pthread.h>
volatile int * shutdown;
volatile long * count;
int delay = 20;
int loop = 8;
void usage(void)
{
fprintf(stderr,
"Usage: idle_predict [options]\n"
" --help -h Print this help\n"
" --thread -n Thread number\n"
" --loop -l Loop times in shallow Cstate\n"
" --delay -t Sleep time (uS)in shallow Cstate\n");
}
void *simple_loop() {
int idle_num = 1;
while (!(*shutdown)) {
*count = *count + 1;
if (idle_num % loop)
usleep(delay);
else {
/* sleep 1 second */
usleep(1000000);
idle_num = 0;
}
idle_num++;
}
}
static void sighand(int sig)
{
*shutdown = 1;
}
int main(int argc, char *argv[])
{
sigset_t sigset;
int signum = SIGALRM;
int i, c, er = 0, thread_num = 8;
pthread_t pt[1024];
static char optstr[] = "n:l:t:h:";
while ((c = getopt(argc, argv, optstr)) != EOF)
switch (c) {
case 'n':
thread_num = atoi(optarg);
break;
case 'l':
loop = atoi(optarg);
break;
case 't':
delay = atoi(optarg);
break;
case 'h':
default:
usage();
exit(1);
}
printf("thread=%d,loop=%d,delay=%d\n",thread_num,loop,delay);
count = malloc(sizeof(long));
shutdown = malloc(sizeof(int));
*count = 0;
*shutdown = 0;
sigemptyset(&sigset);
sigaddset(&sigset, signum);
sigprocmask (SIG_BLOCK, &sigset, NULL);
signal(SIGINT, sighand);
signal(SIGTERM, sighand);
for(i = 0; i < thread_num ; i++)
pthread_create(&pt[i], NULL, simple_loop, NULL);
for (i = 0; i < thread_num; i++)
pthread_join(pt[i], NULL);
exit(0);
}
Get powertop V2 from git://github.com/fenrus75/powertop, build powertop.
After build the above test application, then run it.
Test plaform can be Intel Sandybridge or other recent platforms.
#./idle_predict -l 10 &
#./powertop
We will find that deep C-state will dangle between 40%~100% and much time spent
on C1 state. It is because menu governor wrongly predict that repeat mode
is kept, so it will choose the C1 shallow C-state even though it has chance to
sleep 1 second in deep C-state.
While after patched the kernel, we find that deep C-state will keep >99.6%.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Initially ondemand governor was written and then using its code conservative
governor is written. It used a lot of code from ondemand governor, but copy of
code was created instead of using the same routines from both governors. Which
increased code redundancy, which is difficult to manage.
This patch is an attempt to move common part of both the governors to
cpufreq_governor.c file to come over above mentioned issues.
This shouldn't change anything from functionality point of view.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Multiple cpufreq governers have defined similar get_cpu_idle_time_***()
routines. These routines must be moved to some common place, so that all
governors can use them.
So moving them to cpufreq_governor.c, which seems to be a better place for
keeping these routines.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Arrays for governer and driver name are of size CPUFREQ_NAME_LEN or 16.
i.e. 15 bytes for name and 1 for trailing '\0'.
When cpufreq driver print these names (for sysfs), it includes '\n' or ' ' in
the fmt string and still passes length as CPUFREQ_NAME_LEN. If the driver or
governor names are using all 15 fields allocated to them, then the trailing '\n'
or ' ' will never be printed. And so commands like:
root@linaro-developer# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
will print something like:
cpufreq_foodrvroot@linaro-developer#
Fix this by increasing print length by one character.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, whoever wants to use ACPI device resources has to call
acpi_walk_resources() to browse the buffer returned by the _CRS
method for the given device and create filters passed to that
routine to apply to the individual resource items. This generally
is cumbersome, time-consuming and inefficient. Moreover, it may
be problematic if resource conflicts need to be resolved, because
the different users of _CRS will need to do that in a consistent
way. However, if there are resource conflicts, the ACPI core
should be able to resolve them centrally instead of relying on
various users of acpi_walk_resources() to handle them correctly
together.
For this reason, introduce a new function, acpi_dev_get_resources(),
that can be used by subsystems to obtain a list of struct resource
objects corresponding to the ACPI device resources returned by
_CRS and, if necessary, to apply additional preprocessing routine
to the ACPI resources before converting them to the struct resource
format.
Make the ACPI code that creates platform device objects use
acpi_dev_get_resources() for resource processing instead of executing
acpi_walk_resources() twice by itself, which causes it to be much
more straightforward and easier to follow.
In the future, acpi_dev_get_resources() can be extended to meet
the needs of the ACPI PNP subsystem and other users of _CRS in
the kernel.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Move some code used for parsing ACPI device resources from the PNP
subsystem to the ACPI core, so that other bus types (platform, SPI,
I2C) can use the same routines for parsing resources in a consistent
way, without duplicating code.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Introduce function acpi_match_device() allowing callers to match
struct device objects with populated acpi_handle fields against
arrays of ACPI device IDs. Also introduce function
acpi_driver_match_device() using acpi_match_device() internally and
allowing callers to match a struct device object against an array of
ACPI device IDs provided by a device driver.
Additionally, introduce macro ACPI_PTR() that may be used by device
drivers to escape pointers to data structures whose definitions
depend on CONFIG_ACPI.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With ACPI 5 we are starting to see devices that don't natively support
discovery but can be enumerated with the help of the ACPI namespace.
Typically, these devices can be represented in the Linux device driver
model as platform devices or some serial bus devices, like SPI or I2C
devices.
Since we want to re-use existing drivers for those devices, we need a
way for drivers to specify the ACPI IDs of supported devices, so that
they can be matched against device nodes in the ACPI namespace. To
this end, it is sufficient to add a pointer to an array of supported
ACPI device IDs, that can be provided by the driver, to struct device.
Moreover, things like ACPI power management need to have access to
the ACPI handle of each supported device, because that handle is used
to invoke AML methods associated with the corresponding ACPI device
node. The ACPI handles of devices are now stored in the archdata
member structure of struct device whose definition depends on the
architecture and includes the ACPI handle only on x86 and ia64. Since
the pointer to an array of supported ACPI IDs is added to struct
device_driver in an architecture-independent way, it is logical to
move the ACPI handle from archdata to struct device itself at the same
time. This also makes code more straightforward in some places and
follows the example of Device Trees that have a poiter to struct
device_node in there too.
This changeset is based on Mika Westerberg's work.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
acpi_no_s4_hw_signature is defined in #ifdef CONFIG_HIBERNATION block,
but the current code put the declaration in #ifdef CONFIG_PM_SLEEP block.
I happened to meet this issue when I turned off PM_SLEEP config manually:
arch/x86/kernel/acpi/sleep.c:100:4: error: implicit declaration of function ‘acpi_no_s4_hw_signature’ [-Werror=implicit-function-declaration]
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ACPI specificiation would like us to save NVS at hibernation time,
but makes no mention of saving NVS over S3. Not all versions of
Windows do this either, and it is clear that not all machines need NVS
saved/restored over S3. Allow the user to improve their suspend/resume
time by disabling the NVS save/restore at S3 time, but continue to do
the NVS save/restore for S4 as specified.
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some bus types don't support power management natively, but generally
there may be device nodes in ACPI tables corresponding to the devices
whose bus types they are (under ACPI 5 those bus types may be SPI,
I2C and platform). If that is the case, standard ACPI power
management may be applied to those devices, although currently the
kernel has no means for that.
For this reason, provide a set of routines that may be used as power
management callbacks for such devices. This may be done in three
different ways.
(1) Device drivers handling the devices in question may run
acpi_dev_pm_attach() in their .probe() routines, which (on
success) will cause the devices to be added to the general ACPI
PM domain and ACPI power management will be used for them going
forward. Then, acpi_dev_pm_detach() may be used to remove the
devices from the general ACPI PM domain if ACPI power management
is not necessary for them any more.
(2) The devices' subsystems may use acpi_subsys_runtime_suspend(),
acpi_subsys_runtime_resume(), acpi_subsys_prepare(),
acpi_subsys_suspend_late(), acpi_subsys_resume_early() as their
power management callbacks in the same way as the general ACPI
PM domain does that.
(3) The devices' drivers may execute acpi_dev_suspend_late(),
acpi_dev_resume_early(), acpi_dev_runtime_suspend(),
acpi_dev_runtime_resume() from their power management callbacks
as appropriate, if that's absolutely necessary, but it is not
recommended to do that, because such drivers may not work
without ACPI support as a result.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull networking fixes from David Miller:
"Bug fixes galore, mostly in drivers as is often the case:
1) USB gadget and cdc_eem drivers need adjustments to their frame size
lengths in order to handle VLANs correctly. From Ian Coolidge.
2) TIPC and several network drivers erroneously call tasklet_disable
before tasklet_kill, fix from Xiaotian Feng.
3) r8169 driver needs to apply the WOL suspend quirk to more chipsets,
fix from Cyril Brulebois.
4) Fix multicast filters on RTL_GIGA_MAC_VER_35 r8169 chips, from
Nathan Walp.
5) FDB netlink dumps should use RTM_NEWNEIGH as the message type, not
zero. From John Fastabend.
6) Fix smsc95xx tx checksum offload on big-endian, from Steve
Glendinning.
7) __inet_diag_dump() needs to repsect and report the error value
returned from inet_diag_lock_handler() rather than ignore it.
Otherwise if an inet diag handler is not available for a particular
protocol, we essentially report success instead of giving an error
indication. Fix from Cyrill Gorcunov.
8) When the QFQ packet scheduler sees TSO/GSO packets it does not
handle things properly, and in fact ends up corrupting it's
datastructures as well as mis-schedule packets. Fix from Paolo
Valente.
9) Fix oopser in skb_loop_sk(), from Eric Leblond.
10) CXGB4 passes partially uninitialized datastructures in to FW
commands, fix from Vipul Pandya.
11) When we send unsolicited ipv6 neighbour advertisements, we should
send them to the link-local allnodes multicast address, as per
RFC4861. Fix from Hannes Frederic Sowa.
12) There is some kind of bug in the usbnet's kevent deferral
mechanism, but more immediately when it triggers an uncontrolled
stream of kernel messages spam the log. Rate limit the error log
message triggered when this problem occurs, as sending thousands
of error messages into the kernel log doesn't help matters at all,
and in fact makes further diagnosis more difficult.
From Steve Glendinning.
13) Fix gianfar restore from hibernation, from Wang Dongsheng.
14) The netlink message attribute sizes are wrong in the ipv6 GRE
driver, it was using the size of ipv4 addresses instead of ipv6
ones :-) Fix from Nicolas Dichtel."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
gre6: fix rtnl dump messages
gianfar: ethernet vanishes after restoring from hibernation
usbnet: ratelimit kevent may have been dropped warnings
ipv6: send unsolicited neighbour advertisements to all-nodes
net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs
usb: gadget: g_ether: fix frame size check for 802.1Q
cxgb4: Fix initialization of SGE_CONTROL register
isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICES
cxgb4: Initialize data structures before using.
af-packet: fix oops when socket is not present
pkt_sched: enable QFQ to support TSO/GSO
net: inet_diag -- Return error code if protocol handler is missed
net: bnx2x: Fix typo in bnx2x driver
smsc95xx: fix tx checksum offload for big endian
rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump
ptp: update adjfreq callback description
r8169: allow multicast packets on sub-8168f chipset.
r8169: Fix WoL on RTL8168d/8111d.
drivers/net: use tasklet_kill in device remove/close process
tipc: do not use tasklet_disable before tasklet_kill
Pull sparc fixes from David Miller:
"Several build/bug fixes for sparc, including:
1) Configuring a mix of static vs. modular sparc64 crypto modules
didn't work, remove an ill-conceived attempt to only have to build
the device match table for these drivers once to fix the problem.
Reported by Meelis Roos.
2) Make the montgomery multiple/square and mpmul instructions actually
usable in 32-bit tasks. Essentially this involves providing 32-bit
userspace with a way to use a 64-bit stack when it needs to.
3) Our sparc64 atomic backoffs don't yield cpu strands properly on
Niagara chips. Use pause instruction when available to achieve
this, otherwise use a benign instruction we know blocks the strand
for some time.
4) Wire up kcmp
5) Fix the build of various drivers by removing the unnecessary
blocking of OF_GPIO when SPARC.
6) Fix unintended regression wherein of_address_to_resource stopped
being provided. Fix from Andreas Larsson.
7) Fix NULL dereference in leon_handle_ext_irq(), also from Andreas
Larsson."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix build with mix of modular vs. non-modular crypto drivers.
sparc: Support atomic64_dec_if_positive properly.
of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again
sparc32, leon: Check for existent irq_map entry in leon_handle_ext_irq
sparc: Add sparc support for platform_get_irq()
sparc: Allow OF_GPIO on sparc.
qlogicpti: Fix build warning.
sparc: Wire up sys_kcmp.
sparc64: Improvde documentation and readability of atomic backoff code.
sparc64: Use pause instruction when available.
sparc64: Fix cpu strand yielding.
sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.
This bug-fix makes sure that of_address_to_resource is defined extern for sparc
so that the sparc-specific implementation of of_address_to_resource() is once
again used when including include/linux/of_address.h in a sparc context. A
number of drivers in mainline relies on this function working for sparc.
The bug was introduced in a850a75544, "of/address:
add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
defined for sparc. This is good behavior for the other functions in
include/linux/of_address.h, as the extern functions defined in
drivers/of/address.c only gets linked when OF_ADDRESS is configured. However,
for of_address_to_resource there exists a sparc-specific implementation in
arch/sparc/arch/sparc/kernel/of_device_common.c
Solution suggested by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The of_device_id match data is now marked as const and
must not be modified. This changes the dw_mmc to mark
all pointers passing the dw_mci_drv_data or dw_mci_dma_ops
structures as const, and also marks the static definitions
as const.
drivers/mmc/host/dw_mmc-exynos.c: In function 'dw_mci_exynos_probe':
drivers/mmc/host/dw_mmc-exynos.c:234:11: warning: assignment discards 'const' qualifier from pointer target type [enabled by default]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Abraham <thomas.abraham@linaro.org>
Cc: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
CMD23 causes lots of errors in kernel on some freescale SoCs
(P1020, P1021, P1022, P1024, P1025 and P4080) when MMC card used,
which is because these controllers does not support CMD23,
even on the SoCs which declares CMD23 is supported.
Therefore, we'll not use CMD23.
Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Even though platform_get_irq returns error, 'host->irq'
always has an unsigned value. Less-than-zero comparison
of an unsigned value is never true. Type of 'unsigned int'
will be changed for 'int'.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>