Commit Graph

41697 Commits

Author SHA1 Message Date
Linus Torvalds
0586bed3e8 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rtmutex: tester: Remove the remaining BKL leftovers
  lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  rtmutex: Simplify PI algorithm and make highest prio task get lock
  rwsem: Remove redundant asmregparm annotation
  rwsem: Move duplicate function prototypes to linux/rwsem.h
  rwsem: Unify the duplicate rwsem_is_locked() inlines
  rwsem: Move duplicate init macros and functions to linux/rwsem.h
  rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
  x86: Cleanup rwsem_count_t typedef
  rwsem: Cleanup includes
  locking: Remove deprecated lock initializers
  cred: Replace deprecated spinlock initialization
  kthread: Replace deprecated spinlock initialization
  xtensa: Replace deprecated spinlock initialization
  um: Replace deprecated spinlock initialization
  sparc: Replace deprecated spinlock initialization
  mips: Replace deprecated spinlock initialization
  cris: Replace deprecated spinlock initialization
  alpha: Replace deprecated spinlock initialization
  rtmutex-tester: Remove BKL tests
2011-03-15 18:28:30 -07:00
Linus Torvalds
b80cd62b7d Merge branch 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
  futex: Deobfuscate handle_futex_death()
  plist: Add priority list test
  plist: Shrink struct plist_head
  futex,plist: Remove debug lock assignment from plist_node
  futex,plist: Pass the real head of the priority list to plist_del()
  futex: Sanitize futex ops argument types
  futex: Sanitize cmpxchg_futex_value_locked API
  futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
  futex: Avoid redudant evaluation of task_pid_vnr()
  futex: Update futex_wait_setup comments about locking
2011-03-15 18:23:52 -07:00
Linus Torvalds
c345f60a5f Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: Add hint for better object identification
2011-03-15 18:23:25 -07:00
Linus Torvalds
422e6c4bc4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
  tidy the trailing symlinks traversal up
  Turn resolution of trailing symlinks iterative everywhere
  simplify link_path_walk() tail
  Make trailing symlink resolution in path_lookupat() iterative
  update nd->inode in __do_follow_link() instead of after do_follow_link()
  pull handling of one pathname component into a helper
  fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
  Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
  readlinkat(), fchownat() and fstatat() with empty relative pathnames
  Allow O_PATH for symlinks
  New kind of open files - "location only".
  ext4: Copy fs UUID to superblock
  ext3: Copy fs UUID to superblock.
  vfs: Export file system uuid via /proc/<pid>/mountinfo
  unistd.h: Add new syscalls numbers to asm-generic
  x86: Add new syscalls for x86_64
  x86: Add new syscalls for x86_32
  fs: Remove i_nlink check from file system link callback
  fs: Don't allow to create hardlink for deleted file
  vfs: Add open by file handle support
  ...
2011-03-15 15:48:13 -07:00
Linus Torvalds
76ca078328 Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  xen: suspend: remove xen_hvm_suspend
  xen: suspend: pull pre/post suspend hooks out into suspend_info
  xen: suspend: move arch specific pre/post suspend hooks into generic hooks
  xen: suspend: refactor non-arch specific pre/post suspend hooks
  xen: suspend: add "arch" to pre/post suspend hooks
  xen: suspend: pass extra hypercall argument via suspend_info struct
  xen: suspend: refactor cancellation flag into a structure
  xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
  xen: switch to new schedop hypercall by default.
  xen: use new schedop interface for suspend
  xen: do not respond to unknown xenstore control requests
  xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
  xen: PV on HVM: support PV spinlocks and IPIs
  xen: make the ballon driver work for hvm domains
  xen-blkfront: handle Xen major numbers other than XENVBD
  xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
  xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
2011-03-15 10:59:09 -07:00
Linus Torvalds
27d2a8b97e Merge branches 'stable/ia64', 'stable/blkfront-cleanup' and 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/ia64' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: ia64 build broken due to "xen: switch to new schedop hypercall by default."

* 'stable/blkfront-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Union the blkif_request request specific fields

* 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: annotate functions which only call into __init at start of day
  xen p2m: annotate variable which appears unused
  xen: events: mark cpu_evtchn_mask_p as __refdata
2011-03-15 10:49:16 -07:00
Linus Torvalds
010b8f4e26 Merge branch 'stable/irq.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/irq.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: events: remove dom0 specific xen_create_msi_irq
  xen: events: use xen_bind_pirq_msi_to_irq from xen_create_msi_irq
  xen: events: push set_irq_msi down into xen_create_msi_irq
  xen: events: update pirq_to_irq in xen_create_msi_irq
  xen: events: refactor xen_create_msi_irq slightly
  xen: events: separate MSI PIRQ allocation from PIRQ binding to IRQ
  xen: events: assume PHYSDEVOP_get_free_pirq exists
  xen: pci: collapse apic_register_gsi_xen_hvm and xen_hvm_register_pirq
  xen: events: return irq from xen_allocate_pirq_msi
  xen: events: drop XEN_ALLOC_IRQ flag to xen_allocate_pirq_msi
  xen: events: do not leak IRQ from xen_allocate_pirq_msi when no pirq available.
  xen: pci: only define xen_initdom_setup_msi_irqs if CONFIG_XEN_DOM0
2011-03-15 10:47:56 -07:00
Linus Torvalds
397fae0818 Merge branches 'stable/irq.rework' and 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/irq.rework' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/irq: Cleanup up the pirq_to_irq for DomU PV PCI passthrough guests as well.
  xen: Use IRQF_FORCE_RESUME
  xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
  xen: Fix compile error introduced by "switch to new irq_chip functions"
  xen: Switch to new irq_chip functions
  xen: Remove stale irq_chip.end
  xen: events: do not free legacy IRQs
  xen: events: allocate GSIs and dynamic IRQs from separate IRQ ranges.
  xen: events: add xen_allocate_irq_{dynamic, gsi} and xen_free_irq
  xen:events: move find_unbound_irq inside CONFIG_PCI_MSI
  xen: handled remapped IRQs when enabling a pcifront PCI device.
  genirq: Add IRQF_FORCE_RESUME

* 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  pci/xen: When free-ing MSI-X/MSI irq->desc also use generic code.
  pci/xen: Cleanup: convert int** to int[]
  pci/xen: Use xen_allocate_pirq_msi instead of xen_allocate_pirq
  xen-pcifront: Sanity check the MSI/MSI-X values
  xen-pcifront: don't use flush_scheduled_work()
2011-03-15 10:47:16 -07:00
Al Viro
1abf0c718f New kind of open files - "location only".
New flag for open(2) - O_PATH.  Semantics:
	* pathname is resolved, but the file itself is _NOT_ opened
as far as filesystem is concerned.
	* almost all operations on the resulting descriptors shall
fail with -EBADF.  Exceptions are:
	1) operations on descriptors themselves (i.e.
		close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
		fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
		fcntl(fd, F_SETFD, ...))
	2) fcntl(fd, F_GETFL), for a common non-destructive way to
		check if descriptor is open
	3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
		points of pathname resolution
	* closing such descriptor does *NOT* affect dnotify or
posix locks.
	* permissions are checked as usual along the way to file;
no permission checks are applied to the file itself.  Of course,
giving such thing to syscall will result in permission checks (at
the moment it means checking that starting point of ....at() is
a directory and caller has exec permissions on it).

fget() and fget_light() return NULL on such descriptors; use of
fget_raw() and fget_raw_light() is needed to get them.  That protects
existing code from dealing with those things.

There are two things still missing (they come in the next commits):
one is handling of symlinks (right now we refuse to open them that
way; see the next commit for semantics related to those) and another
is descriptor passing via SCM_RIGHTS datagrams.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:45 -04:00
Aneesh Kumar K.V
93f1c20bc8 vfs: Export file system uuid via /proc/<pid>/mountinfo
We add a per superblock uuid field. File systems should
update the uuid in the fill_super callback

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:45 -04:00
Aneesh Kumar K.V
a51571ccb8 unistd.h: Add new syscalls numbers to asm-generic
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:45 -04:00
Aneesh Kumar K.V
becfd1f375 vfs: Add open by file handle support
[AV: duplicate of open() guts removed; file_open_root() used instead]

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:44 -04:00
Aneesh Kumar K.V
990d6c2d7a vfs: Add name to file handle conversion support
The syscall also return mount id which can be used
to lookup file system specific information such as uuid
in /proc/<pid>/mountinfo

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:37 -04:00
Al Viro
f52e0c1130 New AT_... flag: AT_EMPTY_PATH
For name_to_handle_at(2) we'll want both ...at()-style syscall that
would be usable for non-directory descriptors (with empty relative
pathname).  Introduce new flag (AT_EMPTY_PATH) to deal with that and
corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to
deal with the latter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 19:12:20 -04:00
Linus Torvalds
5f40d42094 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: NFSROOT should default to "proto=udp"
  nfs4: remove duplicated #include
  NFSv4: nfs4_state_mark_reclaim_nograce() should be static
  NFSv4: Fix the setlk error handler
  NFSv4.1: Fix the handling of the SEQUENCE status bits
  NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
  NFSv4.1 reclaim complete must wait for completion
  NFSv4: remove duplicate clientid in struct nfs_client
  NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
  sunrpc: Propagate errors from xs_bind() through xs_create_sock()
  (try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
  nfs: fix compilation warning
  nfs: add kmalloc return value check in decode_and_add_ds
  SUNRPC: Remove resource leak in svc_rdma_send_error()
  nfs: close NFSv4 COMMIT vs. CLOSE race
  SUNRPC: Close a race in __rpc_wait_for_completion_task()
2011-03-14 11:19:50 -07:00
Aneesh Kumar K.V
5fe0c23788 exportfs: Return the minimum required handle size
The exportfs encode handle function should return the minimum required
handle size. This helps user to find out the handle size by passing 0
handle size in the first step and then redoing to the call again with
the returned handle size value.

Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:28 -04:00
Al Viro
c8b91accfa clean statfs-like syscalls up
New helpers: user_statfs() and fd_statfs(), taking userland pathname and
descriptor resp. and filling struct kstatfs.  Syscalls of statfs family
(native, compat and foreign - osf and hpux on alpha and parisc resp.)
switched to those.  Removes some boilerplate code, simplifies cleanup
on errors...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:28 -04:00
Al Viro
73d049a40f open-style analog of vfs_path_lookup()
new function: file_open_root(dentry, mnt, name, flags) opens the file
vfs_path_lookup would arrive to.

Note that name can be empty; in that case the usual requirement that
dentry should be a directory is lifted.

open-coded equivalents switched to it, may_open() got down exactly
one caller and became static.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:28 -04:00
Al Viro
5b6ca027d8 reduce vfs_path_lookup() to do_path_lookup()
New lookup flag: LOOKUP_ROOT.  nd->root is set (and held) by caller,
path_init() starts walking from that place and all pathname resolution
machinery never drops nd->root if that flag is set.  That turns
vfs_path_lookup() into a special case of do_path_lookup() *and*
gets us down to 3 callers of link_path_walk(), making it finally
feasible to rip the handling of trailing symlink out of link_path_walk().
That will not only simply the living hell out of it, but make life
much simpler for unionfs merge.  Trailing symlink handling will
become iterative, which is a good thing for stack footprint in
a lot of situations as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:27 -04:00
Al Viro
70e9b35711 get rid of nd->file
Don't stash the struct file * used as starting point of walk in nameidata;
pass file ** to path_init() instead.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:26 -04:00
Al Viro
47c805dc2d switch do_filp_open() to struct open_flags
take calculation of open_flags by open(2) arguments into new helper
in fs/open.c, move filp_open() over there, have it and do_sys_open()
use that helper, switch exec.c callers of do_filp_open() to explicit
(and constant) struct open_flags.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:25 -04:00
Al Viro
16c2cd7179 untangle the "need_reval_dot" mess
instead of ad-hackery around need_reval_dot(), do the following:
set a flag (LOOKUP_JUMPED) in the beginning of path, on absolute
symlink traversal, on ".." and on procfs-style symlinks.  Clear on
normal components, leave unchanged on ".".  Non-nested callers of
link_path_walk() call handle_reval_path(), which checks that flag
is set and that fs does want the final revalidate thing, then does
->d_revalidate().  In link_path_walk() all the return_reval stuff
is gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:24 -04:00
Al Viro
c9c6cac0c2 kill path_lookup()
all remaining callers pass LOOKUP_PARENT to it, so
flags argument can die; renamed to kern_path_parent()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:23 -04:00
Linus Torvalds
eebea5d13d Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] target: Fix t_transport_aborted handling in LUN_RESET + active I/O shutdown
2011-03-13 16:00:28 -07:00
Thomas Gleixner
995612178c Merge branch 'tip/futex/devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-rt into core/futexes
futex,plist: Pass the real head of the priority list to plist_del()
 futex,plist: Remove debug lock assignment from plist_node
 plist: Shrink struct plist_head
 plist: Add priority list test
2011-03-12 11:43:32 +01:00
Trond Myklebust
0400a6b0cb NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
nfs4_schedule_state_recovery() should only be used when we need to force
the state manager to check the lease. If we just want to start the
state manager in order to handle a state recovery situation, we should be
using nfs4_schedule_state_manager().

This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
its use with a set of helper functions that do the right thing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:22 -05:00
Lai Jiangshan
bf6a9b8336 plist: Shrink struct plist_head
struct plist_head is used in struct task_struct as well as struct
rtmutex. If we can make it smaller, it will also make these structures
smaller as well.

The field prio_list in struct plist_head is seldom used and we can get
its information from the plist_nodes. Removing this field will decrease
the size of plist_head by half.

Signed-off-by:  Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4D107982.9090700@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-11 15:13:26 -05:00
Michel Lespinasse
8d7718aa08 futex: Sanitize futex ops argument types
Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
prototypes to use u32 types for the futex as this is the data type the
futex core code uses all over the place.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 12:23:31 +01:00
Michel Lespinasse
37a9d912b2 futex: Sanitize cmpxchg_futex_value_locked API
The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
    
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
    
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>  [tile]
Acked-by: Tony Luck <tony.luck@intel.com>  [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>  [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 12:23:08 +01:00
Andy Adamson
114f64b5f2 NFSv4: remove duplicate clientid in struct nfs_client
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:05:00 -05:00
Trond Myklebust
bf294b41ce SUNRPC: Close a race in __rpc_wait_for_completion_task()
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:52 -05:00
Ian Campbell
71eef7d1e3 xen: events: remove dom0 specific xen_create_msi_irq
The function name does not distinguish it from xen_allocate_pirq_msi
(which operates on domU and pvhvm domains rather than dom0).

Hoist domain 0 specific functionality up into the only caller leaving
functionality common to all guest types in xen_bind_pirq_msi_to_irq.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 14:44:45 -05:00
Ian Campbell
ca1d8fe952 xen: events: use xen_bind_pirq_msi_to_irq from xen_create_msi_irq
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 14:44:44 -05:00
Ian Campbell
bf480d952b xen: events: separate MSI PIRQ allocation from PIRQ binding to IRQ
Split the binding aspect of xen_allocate_pirq_msi out into a new
xen_bind_pirq_to_irq function.

In xen_hvm_setup_msi_irq when allocating a pirq write the MSI message
to signal the PIRQ as soon as the pirq is obtained. There is no way to
free the pirq back so if the subsequent binding to an IRQ fails we
want to ensure that we will reuse the PIRQ next time rather than leak
it.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 14:44:40 -05:00
Ian Campbell
4b41df7f6e xen: events: return irq from xen_allocate_pirq_msi
consistent with other similar functions.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 14:44:37 -05:00
Ian Campbell
bb5d079aef xen: events: drop XEN_ALLOC_IRQ flag to xen_allocate_pirq_msi
All callers pass this flag so it is pointless.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-10 14:44:35 -05:00
Konrad Rzeszutek Wilk
8054c3634c Merge branch 'stable/irq.rework' into stable/irq.cleanup
* stable/irq.rework:
  xen/irq: Cleanup up the pirq_to_irq for DomU PV PCI passthrough guests as well.
  xen: Use IRQF_FORCE_RESUME
  xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
  xen: Fix compile error introduced by "switch to new irq_chip functions"
  xen: Switch to new irq_chip functions
  xen: Remove stale irq_chip.end
  xen: events: do not free legacy IRQs
  xen: events: allocate GSIs and dynamic IRQs from separate IRQ ranges.
  xen: events: add xen_allocate_irq_{dynamic, gsi} and xen_free_irq
  xen:events: move find_unbound_irq inside CONFIG_PCI_MSI
  xen: handled remapped IRQs when enabling a pcifront PCI device.
  genirq: Add IRQF_FORCE_RESUME
2011-03-10 14:41:43 -05:00
Linus Torvalds
ab02a95405 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules
2011-03-09 16:45:02 -08:00
Stephen Rothwell
684adca4f8 sysctl: the include of rcupdate.h is only needed in the kernel
Fixes this build-check error:

  include/linux/sysctl.h:28: included file 'linux/rcupdate.h' is not exported

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-09 16:43:24 -08:00
Vasiliy Kulikov
8909c9ad8f net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules
Since a8f80e8ff9 any process with
CAP_NET_ADMIN may load any module from /lib/modules/.  This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**.  However, CAP_NET_ADMIN capability shouldn't
allow anybody load any module not related to networking.

This patch restricts an ability of autoloading modules to netdev modules
with explicit aliases.  This fixes CVE-2011-1019.

Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE to maintain the compatibility with network scripts
that use autoloading netdev modules by aliases like "eth0", "wlan0".

Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.

    root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	fffffff800001000
    CapEff:	fffffff800001000
    CapBnd:	fffffff800001000
    root@albatros:~# modprobe xfs
    FATAL: Error inserting xfs
    (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit
    sit: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit0
    sit0      Link encap:IPv6-in-IPv4
	      NOARP  MTU:1480  Metric:1

    root@albatros:~# lsmod | grep sit
    sit                    10457  0
    tunnel4                 2957  1 sit

For CAP_SYS_MODULE module loading is still relaxed:

    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	ffffffffffffffff
    CapEff:	ffffffffffffffff
    CapBnd:	ffffffffffffffff
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    xfs                   745319  0

Reference: https://lkml.org/lkml/2011/2/24/203

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-10 10:25:19 +11:00
Linus Torvalds
78833dd706 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  nd->inode is not set on the second attempt in path_walk()
  unfuck proc_sysctl ->d_compare()
  minimal fix for do_filp_open() race
2011-03-09 13:55:51 -08:00
Owen Smith
51de69523f xen: Union the blkif_request request specific fields
Prepare for extending the block device ring to allow request
specific fields, by moving the request specific fields for
reads, writes and barrier requests to a union member.

Acked-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-08 15:07:00 -05:00
Stanislaw Gruszka
9977728840 debugobjects: Add hint for better object identification
In complex subsystems like mac80211 structures can contain several
timers and work structs, so identifying a specific instance from the
call trace and object type output of debugobjects can be hard.

Allow the subsystems which support debugobjects to provide a hint
function. This function returns a pointer to a kernel address
(preferrably the objects callback function) which is printed along
with the debugobjects type.

Add hint methods for timer_list, work_struct and hrtimer.

[ tglx: Massaged changelog, made it compile ]

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
LKML-Reference: <20110307085809.GA9334@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-08 16:10:38 +01:00
Al Viro
dfef6dcd35 unfuck proc_sysctl ->d_compare()
a) struct inode is not going to be freed under ->d_compare();
however, the thing PROC_I(inode)->sysctl points to just might.
Fortunately, it's enough to make freeing that sucker delayed,
provided that we don't step on its ->unregistering, clear
the pointer to it in PROC_I(inode) before dropping the reference
and check if it's NULL in ->d_compare().

b) I'm not sure that we *can* walk into NULL inode here (we recheck
dentry->seq between verifying that it's still hashed / fetching
dentry->d_inode and passing it to ->d_compare() and there's no
negative hashed dentries in /proc/sys/*), but if we can walk into
that, we really should not have ->d_compare() return 0 on it!
Said that, I really suspect that this check can be simply killed.
Nick?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-08 02:22:27 -05:00
Linus Torvalds
fb62c00a6d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: no .snap inside of snapped namespace
  libceph: fix msgr standby handling
  libceph: fix msgr keepalive flag
  libceph: fix msgr backoff
  libceph: retry after authorization failure
  libceph: fix handling of short returns from get_user_pages
  ceph: do not clear I_COMPLETE from d_release
  ceph: do not set I_COMPLETE
  Revert "ceph: keep reference to parent inode on ceph_dentry"
2011-03-05 10:43:22 -08:00
Andi Kleen
236344d6b4 mm: add alloc_page_vma_node()
Add a alloc_page_vma_node that allows passing the "local" node in.  Used
in a followon patch.

Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 17:53:39 -08:00
Andi Kleen
2f5f9486f8 mm: change alloc_pages_vma to pass down the policy node for local policy
Currently alloc_pages_vma() always uses the local node as policy node for
the LOCAL policy.  Pass this node down as an argument instead.

No behaviour change from this patch, but will be needed for followons.

Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 17:53:39 -08:00
Sage Weil
e76661d0a5 libceph: fix msgr keepalive flag
There was some broken keepalive code using a dead variable.  Shift to using
the proper bit flag.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-04 12:24:31 -08:00
Sage Weil
60bf8bf881 libceph: fix msgr backoff
With commit f363e45f we replaced a bunch of hacky workqueue mutual
exclusion logic with the WQ_NON_REENTRANT flag.  One pieces of fallout is
that the exponential backoff breaks in certain cases:

 * con_work attempts to connect.
 * we get an immediate failure, and the socket state change handler queues
   immediate work.
 * con_work calls con_fault, we decide to back off, but can't queue delayed
   work.

In this case, we add a BACKOFF bit to make con_work reschedule delayed work
next time it runs (which should be immediately).

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-04 12:24:28 -08:00
Linus Torvalds
e3e89cc535 Mark ptrace_{traceme,attach,detach} static
They are only used inside kernel/ptrace.c, and have been for a long
time.  We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 09:23:30 -08:00