Commit Graph

1852 Commits

Author SHA1 Message Date
Yasuaki Ishimatsu
3464046824 numa, cpu hotplug: change links of CPU and node when changing node number by onlining CPU
When booting x86 system contains memoryless node, node numbers of CPUs
on memoryless node were changed to nearest online node number by
init_cpu_to_node() because the node is not online.

In my system, node numbers of cpu#30-44 and 75-89 were changed from 2 to
0 as follows:

  $ numactl --hardware
  available: 2 nodes (0-1)
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 30 31 32 33 34 35 36 37 38 39 40
  41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 75 76 77 78 79 80 81 82
  83 84 85 86 87 88 89
  node 0 size: 32394 MB
  node 0 free: 27898 MB
  node 1 cpus: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66
  67 68 69 70 71 72 73 74
  node 1 size: 32768 MB
  node 1 free: 30335 MB

If we hot add memory to memoryless node and offine/online all CPUs on
the node, node numbers of these CPUs are changed to correct node numbers
by srat_detect_node() because the node become online.

In this case, node numbers of cpu#30-44 and 75-89 were changed from 0 to
2 in my system as follows:

  $ numactl --hardware
  available: 3 nodes (0-2)
  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 45 46 47 48 49 50 51 52 53 54 55
  56 57 58 59
  node 0 size: 32394 MB
  node 0 free: 27218 MB
  node 1 cpus: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 60 61 62 63 64 65 66
  67 68 69 70 71 72 73 74
  node 1 size: 32768 MB
  node 1 free: 30014 MB
  node 2 cpus: 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 75 76 77 78 79 80 81
  82 83 84 85 86 87 88 89
  node 2 size: 16384 MB
  node 2 free: 16384 MB

But "cpu to node" and "node to cpu" links were not changed as follows:

  $ ls /sys/devices/system/cpu/cpu30/|grep node
  node0
  $ ls /sys/devices/system/node/node0/|grep cpu30
  cpu30

"numactl --hardware" shows that cpu30 belongs to node 2.  But sysfs
links does not change.

This patch changes "cpu to node" and "node to cpu" links when node
number changed by onlining CPU.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:39 -07:00
Tang Chen
6056d619a8 mm: Remove unused parameter of pages_correctly_reserved()
nr_pages is not used in pages_correctly_reserved().
So remove it.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:38 -07:00
David Rientjes
4edd7ceff0 mm, hotplug: avoid compiling memory hotremove functions when disabled
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE.  PowerPC
pseries will return -EOPNOTSUPP if unsupported.

Adding an #ifdef causes several other functions it depends on to also
become unnecessary, which saves in .text when disabled (it's disabled in
most defconfigs besides powerpc, including x86).  remove_memory_block()
becomes static since it is not referenced outside of
drivers/base/memory.c.

Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
and disabled.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:37 -07:00
Andrew Morton
6e259e7dc4 drivers/base/node.c: switch to register_hotmemory_notifier()
Squishes a warning which my change to hotplug_memory_notifier() added.

I want to keep that warning, because it is punishment for failnig to check
the hotplug_memory_notifier() return value.

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:36 -07:00
Greg Kroah-Hartman
0d1d392f01 Merge 3.9-rc7 into driver-core-next
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-14 18:37:05 -07:00
Ulf Hansson
fa180eb448 PM / Runtime: Idle devices asynchronously after probe|release
Putting devices into idle|suspend in a synchronous manner means we are
waiting for each device to become idle|suspended before the probe|release
is fully done.

This patch switch to use the asynchronous runtime PM API:s instead and
thus improves the parallelism since we can move on and handle the next
device in queue in an earlier phase.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-11 12:42:52 -07:00
Greg Kroah-Hartman
4e4098a3e0 driver core: handle user namespaces properly with the uid/gid devtmpfs change
Now that devtmpfs is caring about uid/gid, we need to use the correct
internal types so users who have USER_NS enabled will have things work
properly for them.

Thanks to Eric for pointing this out, and the patch review.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-11 11:43:29 -07:00
Ming Lei
d81c8d19da driver core: devtmpfs: fix compile failure with CONFIG_UIDGID_STRICT_TYPE_CHECKS
If CONFIG_UIDGID_STRICT_TYPE_CHECKS is enalbed, the below compile
failure will be triggered:

	drivers/base/devtmpfs.c: In function 'handle_create':
	drivers/base/devtmpfs.c:214:19: error: incompatible types when assigning to type 'kuid_t' from type 'uid_t'
	drivers/base/devtmpfs.c:215:19: error: incompatible types when assigning to type 'kgid_t' from type 'gid_t'
	make[2]: *** [drivers/base/devtmpfs.o] Error 1

This patch fixes the compile failure.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-10 09:36:06 -07:00
Greg Kroah-Hartman
c3a3042090 devtmpfs: add base.h include
This fixes a sparse warning, and is a good idea given that the
devtmpfs_init() prototype is in this file.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-10 09:00:15 -07:00
Mark Brown
51a246aa5c regmap: Back out work buffer fix
This reverts commit bc8ce4 (regmap: don't corrupt work buffer in
_regmap_raw_write()) since it turns out that it can cause issues when
taken in isolation from the other changes in -next that lead to its
discovery.  On the basis that nobody noticed the problems for quite some
time without that subsequent work let's drop it from v3.9.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-04-09 18:03:25 +01:00
Kay Sievers
3c2670e651 driver core: add uid and gid to devtmpfs
Some drivers want to tell userspace what uid and gid should be used for
their device nodes, so allow that information to percolate through the
driver core to userspace in order to make this happen.  This means that
some systems (i.e.  Android and friends) will not need to even run a
udev-like daemon for their device node manager and can just rely in
devtmpfs fully, reducing their footprint even more.

Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-08 08:21:48 -07:00
Linus Torvalds
d08d528dc1 Merge tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:

 - Revert of a recent cpuidle change that caused Nehalem machines to
   hang on boot from Alex Shi.

 - USB power management fix addressing a crash in the port device
   object's release routine from Rafael J Wysocki.

 - Device PM QoS fix for a potential deadlock related to sysfs interface
   from Rafael J Wysocki.

 - Fix for a cpufreq crash when the /cpus Device Tree node is missing
   from Paolo Pisati.

 - Fix for a build issue on ia64 related to the Boot Graphics Resource
   Table (BGRT) from Tony Luck.

 - Two fixes for ACPI handles being set incorrectly for device objects
   that don't correspond to any ACPI namespace nodes in the I2C and SPI
   subsystems from Rafael J Wysocki.

 - Fix for compiler warnings related to CONFIG_PM_DEVFREQ being unset
   from Rajagopal Venkat.

 - Fix for a symbol definition typo in cpufreq_governor.h from Borislav
   Petkov.

* tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / BGRT: Don't let users configure BGRT on non X86 systems
  cpuidle / ACPI: recover percpu ACPI processor cstate
  ACPI / I2C: Use parent's ACPI_HANDLE() in acpi_i2c_register_devices()
  cpufreq: Correct header guards typo
  ACPI / SPI: Use parent's ACPI_HANDLE() in acpi_register_spi_devices()
  cpufreq: check OF node /cpus presence before dereferencing it
  PM / devfreq: Fix compiler warnings for CONFIG_PM_DEVFREQ unset
  PM / QoS: Avoid possible deadlock related to sysfs access
  USB / PM: Don't try to hide PM QoS flags from usb_port_device_release()
2013-04-04 15:56:28 -07:00
Arnd Bergmann
bcfb87fb75 sysfs: fix crash_notes_size build warning
commit eca4549f57 "sysfs: Add crash_notes_size to export percpu
note size" adds a printk that outputs a size_t value as %lu
when it should be %zu, resulting in this warning.

drivers/base/cpu.c: In function 'show_crash_notes_size':
drivers/base/cpu.c:142:2: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'unsigned int' [-Wformat=]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-03 11:09:03 -07:00
Rafael J. Wysocki
0f70306929 PM / QoS: Avoid possible deadlock related to sysfs access
Commit b81ea1b (PM / QoS: Fix concurrency issues and memory leaks in
device PM QoS) put calls to pm_qos_sysfs_add_latency(),
pm_qos_sysfs_add_flags(), pm_qos_sysfs_remove_latency(), and
pm_qos_sysfs_remove_flags() under dev_pm_qos_mtx, which was a
mistake, because it may lead to deadlocks in some situations.
For example, if pm_qos_remote_wakeup_store() is run in parallel
with dev_pm_qos_constraints_destroy(), they may deadlock in the
following way:

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319 Tainted: G        W
 -------------------------------------------------------
 trinity-child6/12371 is trying to acquire lock:
  (s_active#54){++++.+}, at: [<ffffffff81301631>] sysfs_addrm_finish+0x31/0x60

 but task is already holding lock:
  (dev_pm_qos_mtx){+.+.+.}, at: [<ffffffff81f07cc3>] dev_pm_qos_constraints_destroy+0x23/0x250

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (dev_pm_qos_mtx){+.+.+.}:
        [<ffffffff811811da>] lock_acquire+0x1aa/0x240
        [<ffffffff83dab809>] __mutex_lock_common+0x59/0x5e0
        [<ffffffff83dabebf>] mutex_lock_nested+0x3f/0x50
        [<ffffffff81f07f2f>] dev_pm_qos_update_flags+0x3f/0xc0
        [<ffffffff81f05f4f>] pm_qos_remote_wakeup_store+0x3f/0x70
        [<ffffffff81efbb43>] dev_attr_store+0x13/0x20
        [<ffffffff812ffdaa>] sysfs_write_file+0xfa/0x150
        [<ffffffff8127f2c1>] __kernel_write+0x81/0x150
        [<ffffffff812afc2d>] write_pipe_buf+0x4d/0x80
        [<ffffffff812af57c>] splice_from_pipe_feed+0x7c/0x120
        [<ffffffff812afa25>] __splice_from_pipe+0x45/0x80
        [<ffffffff812b14fc>] splice_from_pipe+0x4c/0x70
        [<ffffffff812b1538>] default_file_splice_write+0x18/0x30
        [<ffffffff812afae3>] do_splice_from+0x83/0xb0
        [<ffffffff812afb2e>] direct_splice_actor+0x1e/0x20
        [<ffffffff812b0277>] splice_direct_to_actor+0xe7/0x200
        [<ffffffff812b15bc>] do_splice_direct+0x4c/0x70
        [<ffffffff8127eda9>] do_sendfile+0x169/0x300
        [<ffffffff8127ff94>] SyS_sendfile64+0x64/0xb0
        [<ffffffff83db7d18>] tracesys+0xe1/0xe6

 -> #0 (s_active#54){++++.+}:
        [<ffffffff811800cf>] __lock_acquire+0x15bf/0x1e50
        [<ffffffff811811da>] lock_acquire+0x1aa/0x240
        [<ffffffff81300aa2>] sysfs_deactivate+0x122/0x1a0
        [<ffffffff81301631>] sysfs_addrm_finish+0x31/0x60
        [<ffffffff812ff77f>] sysfs_hash_and_remove+0x7f/0xb0
        [<ffffffff813035a1>] sysfs_unmerge_group+0x51/0x70
        [<ffffffff81f068f4>] pm_qos_sysfs_remove_flags+0x14/0x20
        [<ffffffff81f07490>] __dev_pm_qos_hide_flags+0x30/0x70
        [<ffffffff81f07cd5>] dev_pm_qos_constraints_destroy+0x35/0x250
        [<ffffffff81f06931>] dpm_sysfs_remove+0x11/0x50
        [<ffffffff81efcf6f>] device_del+0x3f/0x1b0
        [<ffffffff81efd128>] device_unregister+0x48/0x60
        [<ffffffff82d4083c>] usb_hub_remove_port_device+0x1c/0x20
        [<ffffffff82d2a9cd>] hub_disconnect+0xdd/0x160
        [<ffffffff82d36ab7>] usb_unbind_interface+0x67/0x170
        [<ffffffff81f001a7>] __device_release_driver+0x87/0xe0
        [<ffffffff81f00559>] device_release_driver+0x29/0x40
        [<ffffffff81effc58>] bus_remove_device+0x148/0x160
        [<ffffffff81efd07f>] device_del+0x14f/0x1b0
        [<ffffffff82d344f9>] usb_disable_device+0xf9/0x280
        [<ffffffff82d34ff8>] usb_set_configuration+0x268/0x840
        [<ffffffff82d3a7fc>] usb_remove_store+0x4c/0x80
        [<ffffffff81efbb43>] dev_attr_store+0x13/0x20
        [<ffffffff812ffdaa>] sysfs_write_file+0xfa/0x150
        [<ffffffff8127f71d>] do_loop_readv_writev+0x4d/0x90
        [<ffffffff8127f999>] do_readv_writev+0xf9/0x1e0
        [<ffffffff8127faba>] vfs_writev+0x3a/0x60
        [<ffffffff8127fc60>] SyS_writev+0x50/0xd0
        [<ffffffff83db7d18>] tracesys+0xe1/0xe6

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(dev_pm_qos_mtx);
                                lock(s_active#54);
                                lock(dev_pm_qos_mtx);
   lock(s_active#54);

  *** DEADLOCK ***

To avoid that, remove the calls to functions mentioned above from
under dev_pm_qos_mtx and introduce a separate lock to prevent races
between functions that add or remove device PM QoS sysfs attributes
from happening.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-02 01:25:24 +02:00
Mark Brown
af8ee69df3 Merge remote-tracking branch 'regmap/fix/async' into tmp 2013-03-31 23:27:38 +01:00
Mark Brown
6d66df4109 Merge remote-tracking branch 'regmap/fix/core' into tmp 2013-03-31 23:09:22 +01:00
Zhang Yanfei
eca4549f57 sysfs: Add crash_notes_size to export percpu note size
For percpu notes, we are exporting only address and not size. So
the userspace tool kexec-tools is putting an upper limit of 1024
and putting the value in p_memsz and p_filesz fields. So the patch
add the new sysfile crash_notes_size to export the exact percpu
note size and let the kexec-tools parse it intead of using 1024.

The idea came from Vivek Goyal. And a later patch will be sent to
kexec-tools to let it parse the size.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-29 09:17:22 -07:00
Fabio Porcedda
0258e182e7 driver core: platform.c: fix checkpatch errors and warnings
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-29 09:10:55 -07:00
Fabio Porcedda
647c86d0a2 driver core: warn that platform_driver_probe can not use deferred probing
Add documentation that platform_driver_probe() is incompatible with
deferred probing.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-29 09:10:55 -07:00
Mark Brown
f951b6587b regmap: async: Add missing return
Let's only write once...

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-03-27 13:08:44 +00:00
Stephen Warren
bc8ce4afd7 regmap: don't corrupt work buffer in _regmap_raw_write()
_regmap_raw_write() contains code to call regcache_write() to write
values to the cache. That code calls memcpy() to copy the value data to
the start of the work_buf. However, at least when _regmap_raw_write() is
called from _regmap_bus_raw_write(), the value data is in the work_buf,
and this memcpy() operation may over-write part of that value data,
depending on the value of reg_bytes + pad_bytes. At least when using
reg_bytes==1 and pad_bytes==0, corruption of the value data does occur.

To solve this, remove the memcpy() operation, and modify the subsequent
.parse_val() call to parse the original value buffer directly.

At least in the case of 8-bit register address and 16-bit values, and
writes of single registers at a time, this memcpy-then-parse combination
used to cancel each-other out; for a work-buffer containing xx 89 03,
the memcpy changed it to 89 03 03, and the parse_val changed it back to
89 89 03, thus leaving the value uncorrupted. This appears completely
accidental though. Since commit 8a819ff "regmap: core: Split out in
place value parsing", .parse_val only returns the parsed value, and does
not modify the buffer, and hence does not (accidentally) undo the
corruption caused by memcpy(). This caused bogus values to get written
to HW, thus preventing e.g. audio playback on systems with a WM8903
CODEC. This patch fixes that.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-03-21 20:08:15 +01:00
Felipe Balbi
8f46baaa7e base: core: WARN() about bogus permissions on device attributes
Whenever a struct device_attribute is registered
with mismatched permissions - read permission without
a show routine or write permission without store
routine - we will issue a big warning so we catch
those early enough.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-15 09:39:15 -07:00
Lars-Peter Clausen
8abac3ba51 regmap: cache Fix regcache-rbtree sync
The last register block, which falls into the specified range, is not handled
correctly. The formula which calculates the number of register which should be
synced is inverse (and off by one). E.g. if all registers in that block should
be synced only one is synced, and if only one should be synced all (but one) are
synced. To calculate the number of registers that need to be synced we need to
subtract the number of the first register in the block from the max register
number and add one. This patch updates the code accordingly.

The issue was introduced in commit ac8d91c ("regmap: Supply ranges to the sync
operations").

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@vger.kernel.org
2013-03-13 19:07:19 +00:00
Michal Hocko
be871b7e54 device: separate all subsys mutexes
ca22e56d (driver-core: implement 'sysdev' functionality for regular
devices and buses) has introduced bus_register macro with a static
key to distinguish different subsys mutex classes.

This however doesn't work for different subsys which use a common
registering function. One example is subsys_system_register (and
mce_device and cpu_device).

In the end this leads to the following lockdep splat:
[  207.271924] ======================================================
[  207.271932] [ INFO: possible circular locking dependency detected ]
[  207.271942] 3.9.0-rc1-0.7-default+ #34 Not tainted
[  207.271948] -------------------------------------------------------
[  207.271957] bash/10493 is trying to acquire lock:
[  207.271963]  (subsys mutex){+.+.+.}, at: [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.271987]
[  207.271987] but task is already holding lock:
[  207.271995]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81046ccf>] cpu_hotplug_begin+0x2f/0x60
[  207.272012]
[  207.272012] which lock already depends on the new lock.
[  207.272012]
[  207.272023]
[  207.272023] the existing dependency chain (in reverse order) is:
[  207.272033]
[  207.272033] -> #4 (cpu_hotplug.lock){+.+.+.}:
[  207.272044]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272056]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272069]        [<ffffffff81046ba9>] get_online_cpus+0x29/0x40
[  207.272082]        [<ffffffff81185210>] drain_all_stock+0x30/0x150
[  207.272094]        [<ffffffff811853da>] mem_cgroup_reclaim+0xaa/0xe0
[  207.272104]        [<ffffffff8118775e>] __mem_cgroup_try_charge+0x51e/0xcf0
[  207.272114]        [<ffffffff81188486>] mem_cgroup_charge_common+0x36/0x60
[  207.272125]        [<ffffffff811884da>] mem_cgroup_newpage_charge+0x2a/0x30
[  207.272135]        [<ffffffff81150531>] do_wp_page+0x231/0x830
[  207.272147]        [<ffffffff8115151e>] handle_pte_fault+0x19e/0x8d0
[  207.272157]        [<ffffffff81151da8>] handle_mm_fault+0x158/0x1e0
[  207.272166]        [<ffffffff814b6153>] do_page_fault+0x2a3/0x4e0
[  207.272178]        [<ffffffff814b2578>] page_fault+0x28/0x30
[  207.272189]
[  207.272189] -> #3 (&mm->mmap_sem){++++++}:
[  207.272199]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272208]        [<ffffffff8114c5ad>] might_fault+0x6d/0x90
[  207.272218]        [<ffffffff811a11e3>] filldir64+0xb3/0x120
[  207.272229]        [<ffffffffa013fc19>] call_filldir+0x89/0x130 [ext3]
[  207.272248]        [<ffffffffa0140377>] ext3_readdir+0x6b7/0x7e0 [ext3]
[  207.272263]        [<ffffffff811a1519>] vfs_readdir+0xa9/0xc0
[  207.272273]        [<ffffffff811a15cb>] sys_getdents64+0x9b/0x110
[  207.272284]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272296]
[  207.272296] -> #2 (&type->i_mutex_dir_key#3){+.+.+.}:
[  207.272309]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272319]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272329]        [<ffffffff8119c254>] link_path_walk+0x6f4/0x9a0
[  207.272339]        [<ffffffff8119e7fa>] path_openat+0xba/0x470
[  207.272349]        [<ffffffff8119ecf8>] do_filp_open+0x48/0xa0
[  207.272358]        [<ffffffff8118d81c>] file_open_name+0xdc/0x110
[  207.272369]        [<ffffffff8118d885>] filp_open+0x35/0x40
[  207.272378]        [<ffffffff8135c76e>] _request_firmware+0x52e/0xb20
[  207.272389]        [<ffffffff8135cdd6>] request_firmware+0x16/0x20
[  207.272399]        [<ffffffffa03bdb91>] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272416]        [<ffffffffa03bd554>] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272431]        [<ffffffffa03bd61c>] mc_device_add+0x7c/0xb0 [microcode]
[  207.272444]        [<ffffffff8134a419>] subsys_interface_register+0xc9/0x100
[  207.272457]        [<ffffffffa04fc0f4>] 0xffffffffa04fc0f4
[  207.272472]        [<ffffffff81000202>] do_one_initcall+0x42/0x180
[  207.272485]        [<ffffffff810bbeff>] load_module+0x19df/0x1b70
[  207.272499]        [<ffffffff810bc376>] sys_init_module+0xe6/0x130
[  207.272511]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272523]
[  207.272523] -> #1 (umhelper_sem){++++.+}:
[  207.272537]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272548]        [<ffffffff814ae9c4>] down_read+0x34/0x50
[  207.272559]        [<ffffffff81062bff>] usermodehelper_read_trylock+0x4f/0x100
[  207.272575]        [<ffffffff8135c7dd>] _request_firmware+0x59d/0xb20
[  207.272587]        [<ffffffff8135cdd6>] request_firmware+0x16/0x20
[  207.272599]        [<ffffffffa03bdb91>] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272613]        [<ffffffffa03bd554>] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272627]        [<ffffffffa03bd61c>] mc_device_add+0x7c/0xb0 [microcode]
[  207.272641]        [<ffffffff8134a419>] subsys_interface_register+0xc9/0x100
[  207.272654]        [<ffffffffa04fc0f4>] 0xffffffffa04fc0f4
[  207.272666]        [<ffffffff81000202>] do_one_initcall+0x42/0x180
[  207.272678]        [<ffffffff810bbeff>] load_module+0x19df/0x1b70
[  207.272690]        [<ffffffff810bc376>] sys_init_module+0xe6/0x130
[  207.272702]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272715]
[  207.272715] -> #0 (subsys mutex){+.+.+.}:
[  207.272729]        [<ffffffff810ae002>] __lock_acquire+0x13b2/0x15f0
[  207.272740]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272751]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272763]        [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.272775]        [<ffffffff81349114>] device_del+0x134/0x1f0
[  207.272786]        [<ffffffff813491f2>] device_unregister+0x22/0x60
[  207.272798]        [<ffffffff814a24ea>] mce_cpu_callback+0x15e/0x1ad
[  207.272812]        [<ffffffff814b6402>] notifier_call_chain+0x72/0x130
[  207.272824]        [<ffffffff81073d6e>] __raw_notifier_call_chain+0xe/0x10
[  207.272839]        [<ffffffff81498f76>] _cpu_down+0x1d6/0x350
[  207.272851]        [<ffffffff81499130>] cpu_down+0x40/0x60
[  207.272862]        [<ffffffff8149cc55>] store_online+0x75/0xe0
[  207.272874]        [<ffffffff813474a0>] dev_attr_store+0x20/0x30
[  207.272886]        [<ffffffff812090d9>] sysfs_write_file+0xd9/0x150
[  207.272900]        [<ffffffff8118e10b>] vfs_write+0xcb/0x130
[  207.272911]        [<ffffffff8118e924>] sys_write+0x64/0xa0
[  207.272923]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272936]
[  207.272936] other info that might help us debug this:
[  207.272936]
[  207.272952] Chain exists of:
[  207.272952]   subsys mutex --> &mm->mmap_sem --> cpu_hotplug.lock
[  207.272952]
[  207.272973]  Possible unsafe locking scenario:
[  207.272973]
[  207.272984]        CPU0                    CPU1
[  207.272992]        ----                    ----
[  207.273000]   lock(cpu_hotplug.lock);
[  207.273009]                                lock(&mm->mmap_sem);
[  207.273020]                                lock(cpu_hotplug.lock);
[  207.273031]   lock(subsys mutex);
[  207.273040]
[  207.273040]  *** DEADLOCK ***
[  207.273040]
[  207.273055] 5 locks held by bash/10493:
[  207.273062]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff81209049>] sysfs_write_file+0x49/0x150
[  207.273080]  #1:  (s_active#150){.+.+.+}, at: [<ffffffff812090c2>] sysfs_write_file+0xc2/0x150
[  207.273099]  #2:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81027557>] cpu_hotplug_driver_lock+0x17/0x20
[  207.273121]  #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8149911c>] cpu_down+0x2c/0x60
[  207.273140]  #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81046ccf>] cpu_hotplug_begin+0x2f/0x60
[  207.273158]
[  207.273158] stack backtrace:
[  207.273170] Pid: 10493, comm: bash Not tainted 3.9.0-rc1-0.7-default+ #34
[  207.273180] Call Trace:
[  207.273192]  [<ffffffff810ab373>] print_circular_bug+0x223/0x310
[  207.273204]  [<ffffffff810ae002>] __lock_acquire+0x13b2/0x15f0
[  207.273216]  [<ffffffff812086b0>] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273227]  [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.273239]  [<ffffffff8134af27>] ? bus_remove_device+0x37/0x1c0
[  207.273251]  [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.273263]  [<ffffffff8134af27>] ? bus_remove_device+0x37/0x1c0
[  207.273274]  [<ffffffff812086b0>] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273286]  [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.273298]  [<ffffffff81349114>] device_del+0x134/0x1f0
[  207.273309]  [<ffffffff813491f2>] device_unregister+0x22/0x60
[  207.273321]  [<ffffffff814a24ea>] mce_cpu_callback+0x15e/0x1ad
[  207.273332]  [<ffffffff814b6402>] notifier_call_chain+0x72/0x130
[  207.273344]  [<ffffffff81073d6e>] __raw_notifier_call_chain+0xe/0x10
[  207.273356]  [<ffffffff81498f76>] _cpu_down+0x1d6/0x350
[  207.273368]  [<ffffffff81027557>] ? cpu_hotplug_driver_lock+0x17/0x20
[  207.273380]  [<ffffffff81499130>] cpu_down+0x40/0x60
[  207.273391]  [<ffffffff8149cc55>] store_online+0x75/0xe0
[  207.273402]  [<ffffffff813474a0>] dev_attr_store+0x20/0x30
[  207.273413]  [<ffffffff812090d9>] sysfs_write_file+0xd9/0x150
[  207.273425]  [<ffffffff8118e10b>] vfs_write+0xcb/0x130
[  207.273436]  [<ffffffff8118e924>] sys_write+0x64/0xa0
[  207.273447]  [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b

Which reports a false possitive deadlock because it sees:
1) load_module -> subsys_interface_register -> mc_deveice_add (*) -> subsys->p->mutex -> link_path_walk -> lookup_slow -> i_mutex
2) sys_write -> _cpu_down -> cpu_hotplug_begin -> cpu_hotplug.lock -> mce_cpu_callback -> mce_device_remove(**) -> device_unregister -> bus_remove_device -> subsys mutex
3) vfs_readdir -> i_mutex -> filldir64 -> might_fault -> might_lock_read(mmap_sem) -> page_fault -> mmap_sem -> drain_all_stock -> cpu_hotplug.lock

but
1) takes cpu_subsys subsys (*) but 2) takes mce_device subsys (**) so
the deadlock is not possible AFAICS.

The fix is quite simple. We can pull the key inside bus_type structure
because they are defined per device so the pointer will be unique as
well. bus_register doesn't need to be a macro anymore so change it
to the inline. We could get rid of __bus_register as there is no other
caller but maybe somebody will want to use a different key so keep it
around for now.

Reported-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-13 08:48:28 -07:00
Dimitris Papastamos
c6432ea9cc regmap: Initialize `map->debugfs' before regcache
In the rbtree code we are exposing statistics relating to the
number of nodes/registers of the rbtree cache for each of the
devices.  Ensure that `map->debugfs' has been initialized before
we attempt to initialize the debugfs entry for the rbtree cache.

Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@vger.kernel.org
2013-03-12 18:25:10 +00:00
Linus Torvalds
c89b148fd3 Merge tag 'pm+acpi-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael J Wysocki:

 - Two fixes for the new intel_pstate driver from Dirk Brandewie.

 - Fix for incorrect usage of the .find_bridge() callback from struct
   acpi_bus_type in the USB core and subsequent removal of that callback
   from Rafael J Wysocki.

 - ACPI processor driver cleanups from Chen Gang and Syam Sidhardhan.

 - ACPI initialization and error messages fix from Joe Perches.

 - Operating Performance Points documentation improvement from Nishanth
   Menon.

 - Fixes for memory leaks and potential concurrency issues and sysfs
  attributes leaks during device removal in the core device PM QoS code
  from Rafael J Wysocki.

 - Calxeda Highbank cpufreq driver simplification from Emilio López.

 - cpufreq comment cleanup from Namhyung Kim.

 - Fix for a section mismatch in Calxeda Highbank interprocessor
   communication code from Mark Langsdorf (this is not a PM fix strictly
   speaking, but the code in question went in through the PM tree).

* tag 'pm+acpi-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq / intel_pstate: Do not load on VM that does not report max P state.
  cpufreq / intel_pstate: Fix intel_pstate_init() error path
  ACPI / glue: Drop .find_bridge() callback from struct acpi_bus_type
  ACPI / glue: Add .match() callback to struct acpi_bus_type
  ACPI / porocessor: Beautify code, pr->id is u32 which is never < 0
  ACPI / processor: Remove redundant NULL check before kfree
  ACPI / Sleep: Avoid interleaved message on errors
  PM / QoS: Remove device PM QoS sysfs attributes at the right place
  PM / QoS: Fix concurrency issues and memory leaks in device PM QoS
  cpufreq: highbank: do not initialize array with a loop
  PM / OPP: improve introductory documentation
  cpufreq: Fix a typo in comment
  mailbox, pl320-ipc: remove __init from probe function
2013-03-07 14:54:28 -08:00
Linus Torvalds
0c8150d2c4 Merge tag 'regmap-v3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap PM fix from Mark Brown:
 "A simple fix to stop us leaking a runtime PM reference in the case
  where we fail to enable a device."

* tag 'regmap-v3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: irq: call pm_runtime_put in pm_runtime_get_sync failed case
2013-03-07 13:06:21 -08:00
Rafael J. Wysocki
37530f2bda PM / QoS: Remove device PM QoS sysfs attributes at the right place
Device PM QoS sysfs attributes, if present during device removal,
are removed from within device_pm_remove(), which is too late,
since dpm_sysfs_remove() has already removed the whole attribute
group they belonged to.  However, moving the removal of those
attributes to dpm_sysfs_remove() alone is not sufficient, because
in theory they still can be re-added right after being removed by it
(the device's driver is still bound to it at that point).

For this reason, move the entire desctruction of device PM QoS
constraints to dpm_sysfs_remove() and make it prevent any new
constraints from being added after it has run.  Also, move the
initialization of the power.qos field in struct device to
device_pm_init_common() and drop the no longer needed
dev_pm_qos_constraints_init().

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-04 14:23:12 +01:00
Rafael J. Wysocki
b81ea1b5ac PM / QoS: Fix concurrency issues and memory leaks in device PM QoS
The current device PM QoS code assumes that certain functions will
never be called in parallel with each other (for example, it is
assumed that dev_pm_qos_expose_flags() won't be called in parallel
with dev_pm_qos_hide_flags() for the same device and analogously
for the latency limit), which may be overly optimistic.  Moreover,
dev_pm_qos_expose_flags() and dev_pm_qos_expose_latency_limit()
leak memory in error code paths (req needs to be freed on errors)
and __dev_pm_qos_drop_user_request() forgets to free the request.

To fix the above issues put more things under the device PM QoS
mutex to make them mutually exclusive and add the missing freeing
of memory.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-03-04 14:23:11 +01:00
Li Fei
283189d3be regmap: irq: call pm_runtime_put in pm_runtime_get_sync failed case
Even in failed case of pm_runtime_get_sync, the usage_count
is incremented. In order to keep the usage_count with correct
value and runtime power management to behave correctly, call
pm_runtime_put(_sync) in such case.

Signed-off-by Liu Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Li Fei <fei.li@intel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-03-01 14:54:16 +08:00
John Sheu
495c10cc1c CHROMIUM: dma-buf: restore args on failure of dma_buf_mmap
Callers to dma_buf_mmap expect to fput() the vma struct's vm_file
themselves on failure.  Not restoring the struct's data on failure
causes a double-decrement of the vm_file's refcount.

Signed-off-by: John Sheu <sheu@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2013-02-27 15:14:02 +05:30
Daniel Vetter
f00b4dad9d dma-buf: implement vmap refcounting in the interface logic
All drivers which implement this need to have some sort of refcount to
allow concurrent vmap usage. Hence implement this in the dma-buf core.

To protect against concurrent calls we need a lock, which potentially
causes new funny locking inversions. But this shouldn't be a problem
for exporters with statically allocated backing storage, and more
dynamic drivers have decent issues already anyway.

Inspired by some refactoring patches from Aaron Plattner, who
implemented the same idea, but only for drm/prime drivers.

v2: Check in dma_buf_release that no dangling vmaps are left.
Suggested by Aaron Plattner. We might want to do similar checks for
attachments, but that's for another patch. Also fix up ERR_PTR return
for vmap.

v3: Check whether the passed-in vmap address matches with the cached
one for vunmap. Eventually we might want to remove that parameter -
compared to the kmap functions there's no need for the vaddr for
unmapping.  Suggested by Chris Wilson.

v4: Fix a brown-paper-bag bug spotted by Aaron Plattner.

Cc: Aaron Plattner <aplattner@nvidia.com>
Reviewed-by: Aaron Plattner <aplattner@nvidia.com>
Tested-by: Aaron Plattner <aplattner@nvidia.com>
Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
2013-02-27 15:13:36 +05:30
Linus Torvalds
d895cb1af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...
2013-02-26 20:16:07 -08:00
Al Viro
3dadecce20 switch vfs_getattr() to struct path
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-26 02:46:08 -05:00
Linus Torvalds
9043a2650c Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
 "The sweeping change is to make add_taint() explicitly indicate whether
  to disable lockdep, but it's a mechanical change."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Add option to not sign modules during modules_install
  MODSIGN: Add -s <signature> option to sign-file
  MODSIGN: Specify the hash algorithm on sign-file command line
  MODSIGN: Simplify Makefile with a Kconfig helper
  module: clean up load_module a little more.
  modpost: Ignore ARC specific non-alloc sections
  module: constify within_module_*
  taint: add explicit flag to show whether lock dep is still OK.
  module: printk message when module signature fail taints kernel.
2013-02-25 15:41:43 -08:00
Ming Lei
db88175f41 pm / runtime: force memory allocation with no I/O during Runtime PM callbcack
Apply the introduced memalloc_noio_save() and memalloc_noio_restore() to
force memory allocation with no I/O during runtime_resume/runtime_suspend
callback on device with the flag of 'memalloc_noio' set.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:16 -08:00
Ming Lei
e823407f7b pm / runtime: introduce pm_runtime_set_memalloc_noio()
Introduce the flag memalloc_noio in 'struct dev_pm_info' to help PM core
to teach mm not allocating memory with GFP_KERNEL flag for avoiding
probable deadlock.

As explained in the comment, any GFP_KERNEL allocation inside
runtime_resume() or runtime_suspend() on any one of device in the path
from one block or network device to the root device in the device tree
may cause deadlock, the introduced pm_runtime_set_memalloc_noio() sets
or clears the flag on device in the path recursively.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:16 -08:00
Yasuaki Ishimatsu
6677e3eaf4 memory-hotplug: check whether all memory blocks are offlined or not when removing memory
We remove the memory like this:

 1. lock memory hotplug
 2. offline a memory block
 3. unlock memory hotplug
 4. repeat 1-3 to offline all memory blocks
 5. lock memory hotplug
 6. remove memory(TODO)
 7. unlock memory hotplug

All memory blocks must be offlined before removing memory.  But we don't
hold the lock in the whole operation.  So we should check whether all
memory blocks are offlined before step6.  Otherwise, kernel maybe
panicked.

Offlining a memory block and removing a memory device can be two
different operations.  Users can just offline some memory blocks without
removing the memory device.  For this purpose, the kernel has held
lock_memory_hotplug() in __offline_pages().  To reuse the code for
memory hot-remove, we repeat step 1-3 to offline all the memory blocks,
repeatedly lock and unlock memory hotplug, but not hold the memory
hotplug lock in the whole operation.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:11 -08:00
Linus Torvalds
74e1a2a393 Merge tag 'usb-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB patches from Greg Kroah-Hartman:
 "Here's the big USB merge for 3.9-rc1

  Nothing major, lots of gadget fixes, and of course, xhci stuff.

  All of this has been in linux-next for a while, with the exception of
  the last 3 patches, which were reverts of patches in the tree that
  caused problems, they went in yesterday."

* tag 'usb-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (190 commits)
  Revert "USB: EHCI: make ehci-vt8500 a separate driver"
  Revert "USB: EHCI: make ehci-orion a separate driver"
  Revert "USB: update host controller Kconfig entries"
  USB: update host controller Kconfig entries
  USB: EHCI: make ehci-orion a separate driver
  USB: EHCI: make ehci-vt8500 a separate driver
  USB: usb-storage: unusual_devs update for Super TOP SATA bridge
  USB: ehci-omap: Fix autoloading of module
  USB: ehci-omap: Don't free gpios that we didn't request
  USB: option: add Huawei "ACM" devices using protocol = vendor
  USB: serial: fix null-pointer dereferences on disconnect
  USB: option: add Yota / Megafon M100-1 4g modem
  drivers/usb: add missing GENERIC_HARDIRQS dependencies
  USB: storage: properly handle the endian issues of idProduct
  testusb: remove all mentions of 'usbfs'
  usb: gadget: imx_udc: make it depend on BROKEN
  usb: omap_control_usb: fix compile warning
  ARM: OMAP: USB: Add phy binding information
  ARM: OMAP2: MUSB: Specify omap4 has mailbox
  ARM: OMAP: devices: create device for usb part of control module
  ...
2013-02-21 12:20:00 -08:00
Linus Torvalds
06991c28f3 Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches from Greg Kroah-Hartman:
 "Here is the big driver core merge for 3.9-rc1

  There are two major series here, both of which touch lots of drivers
  all over the kernel, and will cause you some merge conflicts:

   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.

   - remove CONFIG_EXPERIMENTAL

  Other than those patches, there's not much here, some minor fixes and
  updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
  base: memory: fix soft/hard_offline_page permissions
  drivercore: Fix ordering between deferred_probe and exiting initcalls
  backlight: fix class_find_device() arguments
  TTY: mark tty_get_device call with the proper const values
  driver-core: constify data for class_find_device()
  firmware: Ignore abort check when no user-helper is used
  firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
  firmware: Make user-mode helper optional
  firmware: Refactoring for splitting user-mode helper code
  Driver core: treat unregistered bus_types as having no devices
  watchdog: Convert to devm_ioremap_resource()
  thermal: Convert to devm_ioremap_resource()
  spi: Convert to devm_ioremap_resource()
  power: Convert to devm_ioremap_resource()
  mtd: Convert to devm_ioremap_resource()
  mmc: Convert to devm_ioremap_resource()
  mfd: Convert to devm_ioremap_resource()
  media: Convert to devm_ioremap_resource()
  iommu: Convert to devm_ioremap_resource()
  drm: Convert to devm_ioremap_resource()
  ...
2013-02-21 12:05:51 -08:00
Linus Torvalds
8793422fd9 Merge tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:

 - Rework of the ACPI namespace scanning code from Rafael J.  Wysocki
   with contributions from Bjorn Helgaas, Jiang Liu, Mika Westerberg,
   Toshi Kani, and Yinghai Lu.

 - ACPI power resources handling and ACPI device PM update from Rafael
   J Wysocki.

 - ACPICA update to version 20130117 from Bob Moore and Lv Zheng with
   contributions from Aaron Lu, Chao Guan, Jesper Juhl, and Tim Gardner.

 - Support for Intel Lynxpoint LPSS from Mika Westerberg.

 - cpuidle update from Len Brown including Intel Haswell support, C1
   state for intel_idle, removal of global pm_idle.

 - cpuidle fixes and cleanups from Daniel Lezcano.

 - cpufreq fixes and cleanups from Viresh Kumar and Fabio Baltieri with
   contributions from Stratos Karafotis and Rickard Andersson.

 - Intel P-states driver for Sandy Bridge processors from Dirk
   Brandewie.

 - cpufreq driver for Marvell Kirkwood SoCs from Andrew Lunn.

 - cpufreq fixes related to ordering issues between acpi-cpufreq and
   powernow-k8 from Borislav Petkov and Matthew Garrett.

 - cpufreq support for Calxeda Highbank processors from Mark Langsdorf
   and Rob Herring.

 - cpufreq driver for the Freescale i.MX6Q SoC and cpufreq-cpu0 update
   from Shawn Guo.

 - cpufreq Exynos fixes and cleanups from Jonghwan Choi, Sachin Kamat,
   and Inderpal Singh.

 - Support for "lightweight suspend" from Zhang Rui.

 - Removal of the deprecated power trace API from Paul Gortmaker.

 - Assorted updates from Andreas Fleig, Colin Ian King, Davidlohr Bueso,
   Joseph Salisbury, Kees Cook, Li Fei, Nishanth Menon, ShuoX Liu,
   Srinivas Pandruvada, Tejun Heo, Thomas Renninger, and Yasuaki
   Ishimatsu.

* tag 'pm+acpi-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (267 commits)
  PM idle: remove global declaration of pm_idle
  unicore32 idle: delete stray pm_idle comment
  openrisc idle: delete pm_idle
  mn10300 idle: delete pm_idle
  microblaze idle: delete pm_idle
  m32r idle: delete pm_idle, and other dead idle code
  ia64 idle: delete pm_idle
  cris idle: delete idle and pm_idle
  ARM64 idle: delete pm_idle
  ARM idle: delete pm_idle
  blackfin idle: delete pm_idle
  sparc idle: rename pm_idle to sparc_idle
  sh idle: rename global pm_idle to static sh_idle
  x86 idle: rename global pm_idle to static x86_idle
  APM idle: register apm_cpu_idle via cpuidle
  cpufreq / intel_pstate: Add kernel command line option disable intel_pstate.
  cpufreq / intel_pstate: Change to disallow module build
  tools/power turbostat: display SMI count by default
  intel_idle: export both C1 and C1E
  ACPI / hotplug: Fix concurrency issues and memory leaks
  ...
2013-02-20 11:26:56 -08:00
Linus Torvalds
8a3a11f91d Merge tag 'pinctrl-for-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl changes from Linus Walleij:
 "These are the main pinctrl changes for the v3.9 merge window.  The
  most interesting change by far is how the device core grabs pinctrl
  default handles avoiding the need to stick boilerplate into driver
  consumers.

   - Grabbing of default pinctrl handles from the device core.  These
     are the hunks hitting drivers/base.  All is ACKed by Greg, after a
     long discussion about different alternatives.

   - Some stuff also touches the MFD and ARM SoC trees, this has been
     coordinated and ACKed.

   - New drivers for:
     - The Tegra 114 sub-SoC
     - Allwinner sunxi
     - New ABx500 driver and sub-SoC drivers for AB8500, AB8505, AB9540
       and AB8540.

   - Make it possible for hogged pins to enter a sleep mode, and make it
     possible for drivers to control that mode.

   - Various clean-up, extensions and device tree support to various pin
     controllers."

* tag 'pinctrl-for-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (68 commits)
  pinctrl: tegra: add clfvs function to Tegra114 support
  pinctrl: generic: rename input schmitt disable
  pinctrl/pinconfig: add debug interface
  pinctrl: samsung: remove duplicated line
  ARM: ux500: use real AB8500 IRQ numbers instead of virtual ones
  ARM: ux500: remove irq_base property from platform_data
  pinctrl/abx500: use direct IRQ defines
  pinctrl/abx500: replace IRQ offsets with table read-in values
  pinctrl/abx500: move IRQ handling to ab8500-core
  pinctrl: exynos5440: remove erroneous __init
  pinctrl/abx500: adjust offset for get_mode()
  pinctrl/abx500: add Device Tree support
  pinctrl/abx500: align GPIO cluster boundaries
  pinctrl/abx500: prevent error path from corrupting returning error
  pinctrl: sunxi: add of_xlate function
  pinctrl/lantiq: fix pin number in ltq_pmx_gpio_request_enable
  pinctrl/lantiq: add functionality to falcon_pinconf_dbg_show
  pinctrl/lantiq: fix pinconfig parameters
  pinctrl/lantiq: one of the boot leds was defined incorrectly
  pinctrl/lantiq: only probe available pad controllers
  ...
2013-02-20 09:23:30 -08:00
Felipe Balbi
74fef7a8fd base: memory: fix soft/hard_offline_page permissions
those two sysfs files don't have a 'show' method,
so they shouldn't have a read permission. Thanks
to Greg Kroah-Hartman for actually looking into
the source code and figuring out we had a real bug
with these two files.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-18 11:18:13 -08:00
Grant Likely
d72cca1eee drivercore: Fix ordering between deferred_probe and exiting initcalls
One of the side effects of deferred probe is that some drivers which
used to be probed before initcalls completed are now happening slightly
later. This causes two problems.
- If a console driver gets deferred, then it may not be ready when
  userspace starts. For example, if a uart depends on pinctrl, then the
  uart will get deferred and /dev/console will not be available
- __init sections will be discarded before built-in drivers are probed.
  Strictly speaking, __init functions should not be called in a drivers
  __probe path, but there are a lot of drivers (console stuff again)
  that do anyway. In the past it was perfectly safe to do so because all
  built-in drivers got probed before the end of initcalls.

This patch fixes the problem by forcing the first pass of the deferred
list to complete at late_initcall time. This is late enough to catch the
drivers that are known to have the above issues.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Haojian Zhuang <haojian.zhuang@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable <stable@vger.kernel.org> # 3.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-15 10:50:33 -08:00
Rafael J. Wysocki
4419fbd4b4 Merge branch 'pm-cpufreq'
* pm-cpufreq: (55 commits)
  cpufreq / intel_pstate: Fix 32 bit build
  cpufreq: conservative: Fix typos in comments
  cpufreq: ondemand: Fix typos in comments
  cpufreq: exynos: simplify .init() for setting policy->cpus
  cpufreq: kirkwood: Add a cpufreq driver for Marvell Kirkwood SoCs
  cpufreq/x86: Add P-state driver for sandy bridge.
  cpufreq_stats: do not remove sysfs files if frequency table is not present
  cpufreq: Do not track governor name for scaling drivers with internal governors.
  cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target()
  cpufreq: Retrieve current frequency from scaling drivers with internal governors
  cpufreq: Fix locking issues
  cpufreq: Create a macro for unlock_policy_rwsem{read,write}
  cpufreq: Remove unused HOTPLUG_CPU code
  cpufreq: governors: Fix WARN_ON() for multi-policy platforms
  cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold
  cpufreq / stats: Get rid of CPUFREQ_STATDEVICE_ATTR
  cpufreq: Don't check cpu_online(policy->cpu)
  cpufreq: add imx6q-cpufreq driver
  cpufreq: Don't remove sysfs link for policy->cpu
  cpufreq: Remove unnecessary use of policy->shared_type
  ...
2013-02-15 13:59:07 +01:00
Mark Brown
a2b37efc4e Merge remote-tracking branch 'regmap/topic/no-bus' into regmap-next 2013-02-14 17:11:09 +00:00
Mark Brown
a31f68497e Merge remote-tracking branch 'regmap/topic/mmio' into regmap-next 2013-02-14 17:11:08 +00:00
Mark Brown
5dea215028 Merge remote-tracking branch 'regmap/topic/irq' into regmap-next 2013-02-14 17:11:08 +00:00
Mark Brown
7798b582d3 Merge remote-tracking branch 'regmap/topic/flat' into regmap-next 2013-02-14 17:11:07 +00:00
Mark Brown
43280026c8 Merge remote-tracking branch 'regmap/topic/debugfs' into regmap-next 2013-02-14 17:11:06 +00:00