This attempts to clarify names utilized during block I/O remap
operations (partition, volume manager). It correctly matches up the
/from/ information for both device & sector. This takes in the concept
from Kosaki Motohiro and extends it to include better naming for the
"device_from" field.
[ Impact: cleanup ]
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <49FF4FAE.3000301@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The conversion to the ntpv4 reference model
f199239373 ("ntp: convert to the NTP4
reference model") in 2.6.19 added nanosecond resolution the adjtimex
interface, but also changed the "stiffness" of the frequency adjustments,
causing NTP convergence time to greatly increase.
SHIFT_PLL, which reduces the stiffness of the freq adjustments, was
designed to be inversely linked to HZ, and the reference value of 4 was
designed for Unix systems using HZ=100. However Linux's clock steering
code mostly independent of HZ.
So this patch reduces the SHIFT_PLL value from 4 to 2, which causes NTPd
behavior to match kernels prior to 2.6.19, greatly reducing convergence
times, and improving close synchronization through environmental thermal
changes.
The patch also changes some l's to L's in nearby code to avoid misreading
50l as 501.
[ Impact: tweak NTP algorithm for faster convergence ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: zippel@linux-m68k.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <200905051956.n45JuVo9025575@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The mem= option will truncate the memory map at a specified address so
it's not possible to register nodes with memory beyond the e820 upper
bound.
unparse_node() is only called when then node had memory associated with
it, although with the mem= option it is no longer addressable.
[ Impact: fix boot hang on certain (large) systems ]
Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <alpine.DEB.2.00.0905051248150.20021@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A module will add/remove its trace events when it gets loaded/unloaded, so
the ftrace_events list is not "const", and concurrent access needs to be
protected.
This patch thus fixes races between loading/unloding modules and read
'available_events' or read/write 'set_event', etc.
Below shows how to reproduce the race:
# for ((; ;)) { cat /mnt/tracing/available_events; } > /dev/null &
# for ((; ;)) { insmod trace-events-sample.ko; rmmod sample; } &
After a while:
BUG: unable to handle kernel paging request at 0010011c
IP: [<c1080f27>] t_next+0x1b/0x2d
...
Call Trace:
[<c10c90e6>] ? seq_read+0x217/0x30d
[<c10c8ecf>] ? seq_read+0x0/0x30d
[<c10b4c19>] ? vfs_read+0x8f/0x136
[<c10b4fc3>] ? sys_read+0x40/0x65
[<c1002a68>] ? sysenter_do_call+0x12/0x36
[ Impact: fix races when concurrent accessing ftrace_events list ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4A00F709.3080800@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Normally a config should be default to n. This patch also makes the
sample module-only, like SAMPLE_MARKERS and SAMPLE_TRACEPOINTS.
[ Impact: don't build trace event sample by default ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A00F6C0.8090803@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The sample is useful for testing, and I'm using it. But after
loading the module, it keeps saying hi every 10 seconds, this may
be disturbing.
Also Steven said commenting out the "hi" helped in causing races. :)
[ Impact: make testing a bit easier ]
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A00F6AD.2070008@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When UBIFS runs out of space it spends a lot of time trying to
find more space before returning ENOSPC. As there is no point
repeating that unless something has changed, UBIFS has an
optimization to record that the file system is 100% full and not
try to find space. That flag was not being reset when a pending
deletion was finally done.
Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Reviewed-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Merge reason: we moved a mutex.h commit that originated from the
perfcounters tree into core/locking - but now merge
back that branch to solve a merge artifact and to
pick up cleanups of this commit that happened in
core/locking.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This build failure:
arch/powerpc/sysdev/mpic.c:810: error: conflicting types for 'mpic_set_affinity'
arch/powerpc/sysdev/mpic.h:39: error: previous declaration of 'mpic_set_affinity' was here
make[2]: *** [arch/powerpc/sysdev/mpic.o] Error 1
make[2]: *** Waiting for unfinished jobs....
Triggers because the function prototype was not updated when the
function call signature got changed by:
d5dedd4: irq: change ->set_affinity() to return status
[ Impact: build fix on powerpc ]
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-arch@vger.kernel.org
LKML-Reference: <49F654E9.4070809@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On mount, "sec=ntlmssp" can now be specified to allow
"rawntlmssp" security to be enabled during
CIFS session establishment/authentication (ntlmssp used to
require specifying krb5 which was counterintuitive).
Signed-off-by: Steve French <sfrench@us.ibm.com>
This patch adds code that can benchmark the ring buffer as well as
test it. This code can be compiled into the kernel (not recommended)
or as a module.
A separate ring buffer is used to not interfer with other users, like
ftrace. It creates a producer and a consumer (option to disable creation
of the consumer) and will run for 10 seconds, then sleep for 10 seconds
and then repeat.
While running, the producer will write 10 byte loads into the ring
buffer with just putting in the current CPU number. The reader will
continually try to read the buffer. The reader will alternate from reading
the buffer via event by event, or by full pages.
The output is a pr_info, thus it will fill up the syslogs.
Starting ring buffer hammer
End ring buffer hammer
Time: 9000349 (usecs)
Overruns: 12578640
Read: 5358440 (by events)
Entries: 0
Total: 17937080
Missed: 0
Hit: 17937080
Entries per millisec: 1993
501 ns per entry
Sleeping for 10 secs
Starting ring buffer hammer
End ring buffer hammer
Time: 9936350 (usecs)
Overruns: 0
Read: 28146644 (by pages)
Entries: 74
Total: 28146718
Missed: 0
Hit: 28146718
Entries per millisec: 2832
353 ns per entry
Sleeping for 10 secs
Time: is the time the test ran
Overruns: the number of events that were overwritten and not read
Read: the number of events read (either by pages or events)
Entries: the number of entries left in the buffer
(the by pages will only read full pages)
Total: Entries + Read + Overruns
Missed: the number of entries that failed to write
Hit: the number of entries that were written
The above example shows that it takes ~353 nanosecs per entry when
there is a reader, reading by pages (and no overruns)
The event by event reader slowed the producer down to 501 nanosecs.
[ Impact: see how changes to the ring buffer affect stability and performance ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Fix a trivial typo in Documentation/x86/x86_64/mm.txt.
[ Impact: documentation only ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Rik van Riel <riel@redhat.com>
Extend the maximum addressable memory on x86-64 from 2^44 to
2^46 bytes. This requires some shuffling around of the vmalloc
and virtual memmap memory areas, to keep them away from the
direct mapping of up to 64TB of physical memory.
This patch also introduces a guard hole between the vmalloc
area and the virtual memory map space. There's really no
good reason why we wouldn't have a guard hole there.
[ Impact: future hardware enablement ]
Signed-off-by: Rik van Riel <riel@redhat.com>
LKML-Reference: <20090505172856.6820db22@cuia.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In the hot path of the ring buffer "__rb_reserve_next" there's a big
if statement that does not even return back to the work flow.
code;
if (cross to next page) {
[ lots of code ]
return;
}
more code;
The condition is even the unlikely path, although we do not denote it
with an unlikely because gcc is fine with it. The condition is true when
the write crosses a page boundary, and we need to start at a new page.
Having this if statement makes it hard to read, but calling another
function to do the work is also not appropriate, because we are using a lot
of variables that were set before the if statement, and we do not want to
send them as parameters.
This patch changes it to a goto:
code;
if (cross to next page)
goto next_page;
more code;
return;
next_page:
[ lots of code]
This makes the code easier to understand, and a bit more obvious.
The output from gcc is practically identical. For some reason, gcc decided
to use different registers when I switched it to a goto. But other than that,
the logic is the same.
[ Impact: easier to read code ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
We were not setting the SMB uid in NTLMSSP authenticate
request which could lead to INVALID_PARAMETER error
on 2nd session setup.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Based on a request from Eric Paris to simplify parsing, replace
audit_log_format statements containing "%s" with audit_log_string().
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
An audit subsystem change replaced AUDIT_EQUAL with Audit_equal.
Update calls to security_filter_rule_init()/match() to reflect
the change.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/r128: fix r128 ioremaps to use ioremap_wc.
drm: cleanup properly in drm_get_dev() failure paths
drm: clean the map list before destroying the hash table
drm: remove unreachable code in drm_sysfs.c
drm: add control node checks missing from kms merge
drm/kms: don't try to shortcut drm mode set function
drm/radeon: bump minor version for occlusion queries support
When adding the EXPORT_SYMBOL to some of the tracing API, I accidently
used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes
that mistake.
[ Impact: export the tracing code only for GPL modules ]
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The only references in the kernel to the .text.sched section are in
recordmcount.pl. Since the code it has is intended to be example code
it should refer to real kernel sections. So change it to .sched.text
instead.
[ Impact: consistency in comments ]
Signed-off-by: Tim Abbott <tabbott@mit.edu>
LKML-Reference: <1241136371-10768-1-git-send-email-tabbott@mit.edu>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In the mainline ocfs2 code, the interface for masklog is in files under
/sys/fs/o2cb/masklog, but the comments in fs/ocfs2/cluster/masklog.h
reference the old /proc interface. They are out of date.
This patch modifies the comments in cluster/masklog.h, which also provides
a bash script example on how to change the log mask bits.
Signed-off-by: Coly Li <coly.li@suse.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Currently the kernel defines XATTR_LIST_MAX as 65536
in include/linux/limits.h. This is the largest buffer that is used for
listing xattrs.
But with ocfs2 xattr tree, we actually have no limit for the number. If
filesystem has more names than can fit in the buffer, the kernel
logs will be pollluted with something like this when listing:
(27738,0):ocfs2_iterate_xattr_buckets:3158 ERROR: status = -34
(27738,0):ocfs2_xattr_tree_list_index_block:3264 ERROR: status = -34
So don't print "ERROR" message as this is not an ocfs2 error.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
madvise(MADV_WILLNEED) forces page cache readahead on a range of memory
backed by a file. The assumption is made that the page required is
order-0 and "normal" page cache.
On hugetlbfs, this assumption is not true and order-0 pages are
allocated and inserted into the hugetlbfs page cache. This leaks
hugetlbfs page reservations and can cause BUGs to trigger related to
corrupted page tables.
This patch causes MADV_WILLNEED to be ignored for hugetlbfs-backed
regions.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As a precaution, it is best to disable writing to the ring buffers
when reseting them.
[ Impact: prevent weird things if write happens during reset ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In the swap page ring buffer code that is used by the ftrace splice code,
we scan the page to increment the counter of entries read.
With the number of entries already in the page we simply need to add it.
[ Impact: speed up reading page from ring buffer ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
drivers/net/lasi_82596.c:164: warning: format '%lx' expects type
'long unsigned int', but argument 3 has type 'resource_size_t'
drivers/net/lasi_82596.c:169: warning: format '%lx' expects type
'long unsigned int', but argument 2 has type 'resource_size_t'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices have two R6040 MACs but the second one
is not wired to any PHY, therefore the interface is
just unusable. Warn the user about that and prevent
device from registering.
Tested-by: bifferos <bifferos@yahoo.co.uk>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove CONFIG_PROC_FS ifdefs from the code by adding void functions.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bonding/bond_main.c | 30 ++++++++++++++++++++----------
1 files changed, 20 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
v5 -> v6 (current):
-removed so far unused static functions
-corrected dev_addr_del_multiple to call del instead of add
v4 -> v5:
-added device address type (suggested by davem)
-removed refcounting (better to have simplier code then safe potentially few
bytes)
v3 -> v4:
-changed kzalloc to kmalloc in __hw_addr_add_ii()
-ASSERT_RTNL() avoided in dev_addr_flush() and dev_addr_init()
v2 -> v3:
-removed unnecessary rcu read locking
-moved dev_addr_flush() calling to ensure no null dereference of dev_addr
v1 -> v2:
-added forgotten ASSERT_RTNL to dev_addr_init and dev_addr_flush
-removed unnecessary rcu_read locking in dev_addr_init
-use compare_ether_addr_64bits instead of compare_ether_addr
-use L1_CACHE_BYTES as size for allocating struct netdev_hw_addr
-use call_rcu instead of rcu_synchronize
-moved is_etherdev_addr into __KERNEL__ ifdef
This patch introduces a new list in struct net_device and brings a set of
functions to handle the work with device address list. The list is a replacement
for the original dev_addr field and because in some situations there is need to
carry several device addresses with the net device. To be backward compatible,
dev_addr is made to point to the first member of the list so original drivers
sees no difference.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a power management implementation for smsc911x.c which assumes
the chips remains powered during suspend. The device is put in its D1
power saving mode.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Acked-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an upstream port reports an AER error to root port, kernel
starts error recovery procedures. The default return value of
function pcie_portdrv_slot_reset is PCI_ERS_RESULT_NONE. If all
port service drivers of the downstream port under the upstream
port have no slot_reset method in pci_error_handlers, AER recovery
would stop without resume. Below patch against 2.6.30-rc3 fixes it.
Signed-off-by: Zhang Yanmin <yanmin.zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
disable_irq() should wait for all running handlers to complete
before returning. As such, if it's used to disable an interrupt
from that interrupt's handler it will deadlock. This replaces
the dangerous instances with the _nosync() variant which doesn't
have this problem.
Note the 2 handlers in question are only used #ifdef DEBUG so
I imagine these code paths don't get hit often.
Signed-off-by: Ben Nizette <bn@niasdigital.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* 'irq/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "genirq: assert that irq handlers are indeed running in hardirq context"