Fix warn_on triggered by mounting a fsfuzzer corrupted file system, where
the root inode has been corrupted.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Steve Grubb <sgrubb@redhat.com>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] use max load in conservative governor
[CPUFREQ] fix a lockdep warning
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
gianfar: Fix potential oops during OF address translation
fsl_pq_mdio: Fix kernel oops during OF address translation
tcp: bind() fix when many ports are bound
rdma: potential ERR_PTR dereference
rtnetlink: potential ERR_PTR dereference
net: ipv6 bind to device issue
ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU field less than IPV6_MIN_MTU
drivers/net/usb: Add new driver ipheth
cxgb3: fix linkup issue
X25 fix dead unaccepted sockets
KS8851: NULL pointer dereference if list is empty
net: 3c574_cs fix stats.tx_bytes counter
xfrm6: ensure to use the same dev when building a bundle
can: Fix possible NULL pointer dereference in ems_usb.c
net: Fix an RCU warning in dev_pick_tx()
ipv6: Fix tcp_v6_send_response transport header setting.
bridge: add a missing ntohs()
8139too: Fix a typo in the function name.
mac80211: pass HT changes to driver when off channel
mac80211: remove bogus TX agg state assignment
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Ensure we re-enable devices on resume
x86/PCI: parse additional host bridge window resource types
PCI: revert broken device warning
PCI aerdrv: use correct bit defines and add 2ms delay to aer_root_reset
x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
eeepc-laptop: add missing sparse_keymap_free
eeepc-wmi: Build fix
asus: don't modify bluetooth/wlan on boot
dell-wmi: Fix memory leak
eeepc-wmi: add backlight support
eeepc-wmi: use a platform device as parent device of all sub-devices
eeepc-wmi: add an eeepc_wmi context structure
The unpack routine fails to handle decompress_method() returning an
unrecognised decompressor (compress_name == NULL). This results in the
routine looping and eventually oopsing on an out-of-bounds memory access.
Note this bug is usually hidden, only triggering on trailing junk after
one or more correct compressed blocks. The case of the compressed archive
being complete junk is (by accident?) caught by the if (state != Reset)
check because state is initialised to Start, but not updated due to the
decompressor not having been called. Obviously if the junk is trailing a
correctly decompressed buffer, state == Reset from the previous call to
the decompressor.
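A minimal user-space sketch of the failure mode and the check, not the
kernel code. The block format ('Z', a length byte, then payload) and the
names toy_decompress, detect_method and unpack_all are made up for
illustration; only the idea of stopping when no decompressor name is
reported comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*decompress_fn)(const unsigned char *buf, size_t len, size_t *used);

/* Toy decompressor (hypothetical): consumes 'Z', a length byte and the payload. */
static int toy_decompress(const unsigned char *buf, size_t len, size_t *used)
{
	size_t need;

	if (len < 2)
		return -1;
	need = 2 + (size_t)buf[1];
	if (need > len)
		return -1;
	*used = need;
	return 0;
}

/* Stand-in for decompress_method(): *name stays NULL when nothing matches. */
static decompress_fn detect_method(const unsigned char *buf, size_t len,
				   const char **name)
{
	*name = NULL;
	if (len >= 1 && buf[0] == 'Z') {
		*name = "toy";
		return toy_decompress;
	}
	return NULL;
}

/* Returns the number of blocks unpacked, or -1 if junk is found. */
int unpack_all(const unsigned char *buf, size_t len)
{
	int blocks = 0;

	assert(buf != NULL || len == 0);
	while (len > 0) {
		const char *name;
		size_t used = 0;
		decompress_fn fn = detect_method(buf, len, &name);

		/* Unrecognised data: report an error instead of looping on. */
		if (name == NULL || fn == NULL)
			return -1;
		if (fn(buf, len, &used) != 0 || used == 0)
			return -1;
		buf += used;
		len -= used;
		blocks++;
	}
	return blocks;
}
```

With two valid blocks unpack_all() returns 2; with one valid block
followed by junk it returns -1 rather than walking off the buffer.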
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a standalone version of the VMware Balloon driver. Ballooning is a
technique that allows the hypervisor to dynamically limit the amount of
memory available to the guest (with guest cooperation). In the overcommit
scenario, when the hypervisor detects that it needs to shuffle some
memory, it instructs the driver to allocate a certain number of pages, and
the underlying memory gets returned to the hypervisor. Later the hypervisor
may return memory to the guest by reattaching memory to the page frames and
instructing the driver to "deflate" the balloon.
We are submitting a standalone driver because the KVM maintainer (Avi Kivity)
expressed the opinion (rightly) that our transport does not fit well into
the virtqueue paradigm and thus it does not make much sense to integrate
with virtio.
There were also some concerns about whether the current ballooning technique
is the right thing. If a better framework to achieve this appears, we are
prepared to evaluate and switch to using it, but in the meantime we'd like
to get this driver upstream.
We want to get the driver accepted in distributions so that users do not
have to deal with an out-of-tree module, and many distributions have an
"upstream first" requirement.
The driver has been shipping for a number of years and users running on
the VMware platform will have it installed as part of VMware Tools even if
it does not come from a distribution, thus there should not be additional
risk in pulling the driver into mainline. The driver will only activate
if the host is VMware, so everyone else should not be affected at all.
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We are seeing a large regression in database performance on recent
kernels. The database opens a block device with O_DIRECT|O_SYNC and a
number of threads write to different regions of the file at the same time.
A simple test case is below. I haven't defined DEVICE since getting it
wrong will destroy your data :) On a 3-disk LVM with a 64k chunk size we
see about 17MB/sec and only a few threads in IO wait:
procs -----io---- -system-- -----cpu------
 r  b    bi    bo   in   cs us sy id wa st
 0  3     0 16170  656 2259  0  0 86 14  0
 0  2     0 16704  695 2408  0  0 92  8  0
 0  2     0 17308  744 2653  0  0 86 14  0
 0  2     0 17933  759 2777  0  0 89 10  0
Most threads are blocking in vfs_fsync_range, which has:
	mutex_lock(&mapping->host->i_mutex);
	err = fop->fsync(file, dentry, datasync);
	if (!ret)
		ret = err;
	mutex_unlock(&mapping->host->i_mutex);
Commit 148f948ba8 (vfs: Introduce new helpers for syncing after writing to
O_SYNC file or IS_SYNC inode) offers some explanation of what is going on:
> Use these new helpers for syncing from generic VFS functions. This makes
> O_SYNC writes to block devices acquire i_mutex for syncing. If we really
> care about this, we can make block_fsync() drop the i_mutex and reacquire
> it before it returns.
Thanks Jan for such a good commit message! As well as dropping i_mutex,
Christoph suggests we should remove the call to sync_blockdev():
> sync_blockdev is an overcomplicated alias for filemap_write_and_wait on
> the block device inode, which is exactly what we did just before calling
> into ->fsync
The patch below incorporates both suggestions. With it the testcase improves
from 17MB/sec to 68MB/sec:
procs -----io---- -system-- -----cpu------
 r  b    bi    bo   in   cs us sy id wa st
 0  7     0 65536 1000 3878  0  0 70 30  0
 0 34     0 69632 1016 3921  0  1 46 53  0
 0 57     0 69632 1000 3921  0  0 55 45  0
 0 53     0 69640  754 4111  0  0 81 19  0
Testcase:
#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define NR_THREADS 64
#define BUFSIZE (64 * 1024)

#define DEVICE "/dev/mapper/XXXXXX"

#define ALIGN(VAL, SIZE) (((VAL)+(SIZE)-1) & ~((SIZE)-1))

static int fd;

static void *doit(void *arg)
{
	unsigned long offset = (long)arg;
	char *b, *buf;

	b = malloc(BUFSIZE + 1024);
	buf = (char *)ALIGN((unsigned long)b, 1024);
	memset(buf, 0, BUFSIZE);

	while (1)
		pwrite(fd, buf, BUFSIZE, offset);
}

int main(int argc, char *argv[])
{
	int flags = O_RDWR|O_DIRECT;
	int i;
	unsigned long offset = 0;

	if (argc > 1 && !strcmp(argv[1], "O_SYNC"))
		flags |= O_SYNC;

	fd = open(DEVICE, flags);
	if (fd == -1) {
		perror("open");
		exit(1);
	}

	for (i = 0; i < NR_THREADS-1; i++) {
		pthread_t tid;
		pthread_create(&tid, NULL, doit, (void *)offset);
		offset += BUFSIZE;
	}
	doit((void *)offset);

	return 0;
}
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes the following error:
drivers/w1/masters/omap_hdq.c: In function 'hdq_wait_for_flag':
drivers/w1/masters/omap_hdq.c:137: error: implicit declaration of function 'schedule_timeout_uninterruptible'
drivers/w1/masters/omap_hdq.c: In function 'hdq_write_byte':
drivers/w1/masters/omap_hdq.c:177: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function)
drivers/w1/masters/omap_hdq.c:177: error: (Each undeclared identifier is reported only once
drivers/w1/masters/omap_hdq.c:177: error: for each function it appears in.)
drivers/w1/masters/omap_hdq.c:177: error: implicit declaration of function 'schedule_timeout'
drivers/w1/masters/omap_hdq.c: In function 'hdq_isr':
drivers/w1/masters/omap_hdq.c:221: error: 'TASK_NORMAL' undeclared (first use in this function)
drivers/w1/masters/omap_hdq.c: In function 'omap_hdq_break':
drivers/w1/masters/omap_hdq.c:316: error: 'TASK_UNINTERRUPTIBLE' undeclared (first use in this function)
Signed-off-by: Amit Kucheria <amit.kucheria@canonical.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If find_mergeable_anon_vma() succeeds but another thread installs
->anon_vma before we take ptl, then allocated == NULL but avc should be
freed. Change the code to check avc != NULL to detect this case.
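The race can be modelled in user space as below. This is only a sketch:
struct vma_model, prepare_model and the chain_frees counter are
hypothetical, malloc() stands in for the real allocators, and the lock
is represented by comments. The point it illustrates is that the chain
link is freed based on its own pointer, not on whether an anon_vma was
allocated.

```c
#include <assert.h>
#include <stdlib.h>

struct vma_model {
	void *anon_vma;	/* set once, by whichever thread wins the race */
	void *chain;	/* chain link owned by the vma after install */
};

int chain_frees;	/* how many unused chain links were freed */

/* "mergeable" models the result of find_mergeable_anon_vma(); may be NULL. */
int prepare_model(struct vma_model *vma, void *mergeable)
{
	void *avc, *anon_vma = mergeable, *allocated = NULL;

	assert(vma != NULL);
	avc = malloc(16);
	if (avc == NULL)
		return -1;
	if (anon_vma == NULL) {
		anon_vma = malloc(16);
		if (anon_vma == NULL) {
			free(avc);
			return -1;
		}
		allocated = anon_vma;
	}

	/* critical section starts (the lock would be taken here) */
	if (vma->anon_vma == NULL) {
		vma->anon_vma = anon_vma;
		vma->chain = avc;
		allocated = NULL;
		avc = NULL;
	}
	/* critical section ends */

	if (allocated != NULL)
		free(allocated);
	if (avc != NULL) {	/* tested on its own, not via "allocated" */
		free(avc);
		chain_frees++;
	}
	return 0;
}
```

When a mergeable anon_vma was found but the vma already has one
installed, allocated is NULL yet the chain link is still released.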
Also, a couple of whitespace changes to make the critical section more
visible.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following RCU warning:
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/keys/request_key.c:116 invoked rcu_dereference_check() without protection!
This was caused by doing:
[root@andromeda ~]# keyctl newring fred @s
539196288
[root@andromeda ~]# keyctl request2 user a a 539196288
request_key: Required key not available
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes 2 issues with the LZO decompressor:
- It doesn't handle the case where a block isn't compressed at all. In
  this case, calling lzo1x_decompress_safe will fail, so we need to just
  use memcpy() instead (the upstream LZO code does something similar).
- Since commit 54291362d2 ("initramfs: add missing decompressor error
  check"), the decompressor return code is checked in init/initramfs.c.
  The LZO decompressor didn't return the expected value, causing the
  initramfs code to falsely believe a decompression error occurred.
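A minimal sketch of the first point, not the kernel code. It assumes a
block that was stored uncompressed can be recognised by its source and
destination lengths being equal, and it uses a toy expander
(toy_expand, hypothetical) in place of lzo1x_decompress_safe(). The
name unpack_block and its 0-on-success return convention are also
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a real decompressor: doubles every input byte. */
static int toy_expand(const unsigned char *in, size_t in_len,
		      unsigned char *out, size_t *out_len)
{
	size_t i;

	if (*out_len < 2 * in_len)
		return -1;
	for (i = 0; i < in_len; i++) {
		out[2 * i] = in[i];
		out[2 * i + 1] = in[i];
	}
	*out_len = 2 * in_len;
	return 0;
}

/* Returns 0 on success and -1 on error (assumed convention). */
int unpack_block(const unsigned char *in, size_t src_len,
		 unsigned char *out, size_t dst_len)
{
	size_t produced = dst_len;

	assert(in != NULL && out != NULL);
	if (src_len > dst_len)
		return -1;
	if (src_len == dst_len) {
		/* Stored block: copy it, do not call the decompressor. */
		memcpy(out, in, src_len);
		return 0;
	}
	if (toy_expand(in, src_len, out, &produced) != 0 || produced != dst_len)
		return -1;
	return 0;
}
```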
Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Tested-by: bert schulze <spambemyguest@googlemail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a futex key happens to be located within a huge page mapped
MAP_PRIVATE, get_futex_key() can go into an infinite loop waiting for a
page->mapping that will never exist.
See https://bugzilla.redhat.com/show_bug.cgi?id=552257 for more details
about the problem.
For huge pages mapped MAP_PRIVATE, this patch sets page->mapping to a
poisoned value that includes PAGE_MAPPING_ANON. This is enough for futex to
continue, but because of PAGE_MAPPING_ANON the poisoned value is not
dereferenced or used by futex. No other part of the VM should be
dereferencing the page->mapping of a hugetlbfs page as its page cache is
not on the LRU.
This patch fixes the problem with the test case described in the bugzilla.
[akpm@linux-foundation.org: mel cant spel]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Darren Hart <darren@dvhart.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Description of patch:
---------------------
This is a patch for the EFI framebuffer driver to enable the framebuffer
of the NVIDIA 9400M as found in MacBook Pro (MBP) 5,1 and up. The
framebuffers of the NVIDIA graphics cards are located at the following
addresses in memory:
9400M: 0xC0010000
9600M GT: 0xB0030000
The patch delivered right here only provides the memory location of the
framebuffer of the 9400M device. The 9600M GT is not covered. It is
assumed that the 9400M is used when the MBP is powered up.
The information about which device is currently powered and in use is
stored in the 64-byte EFI variable "gpu-power-prefs". More specifically,
byte 0x3B indicates whether 9600M GT (0x00) or 9400M (0x01) is online.
The PCI bus IDs are the following:
9400M: PCI 03:00:00
9600M GT: PCI 02:00:00
The EFI variables can be easily read out and manipulated with "rEFIt", an
MBP-specific bootloader tool. For more information on how to handle rEFIt
and EFI variables please consult "http://refit.sourceforge.net" and
"http://ubuntuforums.org/archive/index.php/t-1076879.html".
IMPORTANT NOTE: The information on how to activate the 9400M device given
at "ubuntuforums.org" is not correct, since it states
gpu-power-prefs[0x3B] = 0x00 -> 9400M (PCI 02:00:00)
gpu-power-prefs[0x3B] = 0x01 -> 9600M GT (PCI 03:00:00)
Actually, the assignment of the values and the PCI bus IDs are swapped.
Suggestions:
------------
To cover framebuffers of both 9400M and 9600M GT, I would suggest
implementing a conditional on "gpu-power-prefs". Depending on the value of
byte 0x3B, the corresponding framebuffer is selected. However, this requires
kernel access to the EFI variables.
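The suggested conditional could look roughly like this. It is a sketch
only: the function name efifb_base_from_prefs and the macro names are
hypothetical, and the variable contents are assumed to have been read
into a buffer already. The byte offset, values and addresses are the
ones listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPU_PREFS_SIZE   64	/* size of "gpu-power-prefs" */
#define GPU_PREFS_ACTIVE 0x3B	/* byte naming the GPU that is online */

/* Returns the framebuffer base for the active GPU, or 0 if unknown. */
uint32_t efifb_base_from_prefs(const uint8_t *prefs, size_t len)
{
	assert(prefs != NULL);
	if (len < GPU_PREFS_SIZE)
		return 0;

	switch (prefs[GPU_PREFS_ACTIVE]) {
	case 0x00:
		return 0xB0030000u;	/* 9600M GT */
	case 0x01:
		return 0xC0010000u;	/* 9400M */
	default:
		return 0;		/* unexpected value: pick nothing */
	}
}
```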
[akpm@linux-foundation.org: rename optname, per Peter Jones]
Signed-off-by: Thomas Gerlach <t.m.gerlach@freenet.de>
Acked-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We must tell GCC to use an even register for the variable passed to the
ldrd instruction. Without this patch GCC 4.2.1 puts this variable in r2/r3
on EABI and r3/r4 on OABI, so force it to r2/r3. This changes nothing for
EABI, and OABI compilation now works.
Without this patch and with OABI I get:
CC drivers/mtd/nand/orion_nand.o
/tmp/ccMkwOCs.s: Assembler messages:
/tmp/ccMkwOCs.s:63: Error: first destination register must be even -- `ldrd r3,[ip]'
make[5]: *** [drivers/mtd/nand/orion_nand.o] Error 1
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jamie Lokier <jamie@shareable.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On ppc64 you get this error:
$ setarch ppc -R true
setarch: ppc: Unrecognized architecture
because uname still reports ppc64 as the machine.
So mask off the personality flags when checking for PER_LINUX32.
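A small sketch of the masked comparison, assuming the flag set by
"setarch -R" is ADDR_NO_RANDOMIZE. The numeric values match
linux/personality.h, but the SKETCH_ names and is_linux32() are local
to this example.

```c
#include <assert.h>

#define SKETCH_PER_MASK          0x00ffUL	/* PER_MASK */
#define SKETCH_PER_LINUX32       0x0008UL	/* PER_LINUX32 */
#define SKETCH_ADDR_NO_RANDOMIZE 0x0040000UL	/* ADDR_NO_RANDOMIZE */

/* The base personality must fit entirely inside the mask. */
static_assert((SKETCH_PER_LINUX32 & ~SKETCH_PER_MASK) == 0,
	      "PER_LINUX32 must lie within PER_MASK");

/* Compare only the base personality, ignoring the flag bits above it. */
int is_linux32(unsigned long persona)
{
	return (persona & SKETCH_PER_MASK) == SKETCH_PER_LINUX32;
}
```

A plain equality test against PER_LINUX32 fails as soon as a flag bit
is set, which is the case described above.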
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 677c9b2e39 ("reiserfs: remove
privroot hiding in lookup") removed the magic from the lookup code to hide
the .reiserfs_priv directory since it was getting loaded at mount-time
instead. The intent was that the entry would be hidden from the user via
a poisoned d_compare, but this was faulty.
This introduced a security issue where unprivileged users could access and
modify extended attributes or ACLs belonging to other users, including
root.
This patch resolves the issue by properly hiding .reiserfs_priv. This was
the intent of the xattr poisoning code, but it appears to have never
worked as expected. This is fixed by using d_revalidate instead of
d_compare.
This patch makes -oexpose_privroot a no-op. I'm fine leaving it this way.
The effort involved in working out the corner cases wrt permissions and
caching outweighs the benefit of the feature.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Edward Shishkin <edward.shishkin@gmail.com>
Reported-by: Matt McCutchen <matt@mattmccutchen.net>
Tested-by: Matt McCutchen <matt@mattmccutchen.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket. The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add dontfrag argument to relevant functions for
IPV6_DONTFRAG support, as well as allowing the value
to be passed-in via ancillary cmsg data.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add underlying data structure changes and basic setsockopt()
and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
and IPV6_DONTFRAG. IPV6_PATHMTU is actually fully functional
at this point.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new generic sample events reordering from perf trace.
Before that, the displayed traces were ordered as they were
in the input as recorded by perf record (not time ordered).
This finally makes perf trace display the events in time
order.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
The sample events recorded by perf record are not time ordered
because we have one buffer per cpu for each event (even demultiplexed
per task/per cpu for task bound events). But when we read trace events
we want them to be ordered by time because many state machines are
involved.
There are currently two ways perf tools deal with that:
- use -M to multiplex all buffers (perf sched, perf kmem)
  But this creates a lot of contention on SMP machines at
  record time.
- use a post-processing time reordering (perf timechart, perf lock)
  The reordering used by timechart is simple but doesn't scale well
  with a huge flow of events, in terms of performance and memory use
  (unusable with perf lock for example).
  Perf lock has its own samples reordering that flushes its memory
  use on a regular basis and that uses a sorting based on the
  previous event queued (a new event to be queued is close to the
  previous one most of the time).
This patch proposes to export perf lock's samples reordering facility
to the session layer that reads the events. So if a tool wants to
get ordered sample events, it needs to set its
struct perf_event_ops::ordered_samples to true and that's it.
This prepares tracing based perf tools to get rid of the need to
use buffers multiplexing (-M) or to implement their own
reordering.
Also lower the flush period to 2 as it's sufficient already.
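The reordering idea (search for the insertion point starting from the
previously queued sample, and flush periodically) can be sketched as
below. This is a toy array-based model, not the session code: struct
sample_queue, queue_sample, flush_until and QUEUE_CAP are hypothetical,
and samples are reduced to their timestamps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAP 64

struct sample_queue {
	uint64_t ts[QUEUE_CAP];	/* timestamps, kept sorted */
	size_t count;
	size_t last;		/* index of the previously queued sample */
};

/* Insert t, searching from the previous insertion point. */
int queue_sample(struct sample_queue *q, uint64_t t)
{
	size_t pos, i;

	assert(q != NULL);
	if (q->count == QUEUE_CAP)
		return -1;

	pos = q->count ? q->last + 1 : 0;
	while (pos > 0 && q->ts[pos - 1] > t)		/* walk back */
		pos--;
	while (pos < q->count && q->ts[pos] <= t)	/* or forward */
		pos++;

	for (i = q->count; i > pos; i--)
		q->ts[i] = q->ts[i - 1];
	q->ts[pos] = t;
	q->count++;
	q->last = pos;
	return 0;
}

/* Move every sample with a timestamp <= limit to out[], oldest first. */
size_t flush_until(struct sample_queue *q, uint64_t limit, uint64_t *out)
{
	size_t n = 0, i;

	while (n < q->count && q->ts[n] <= limit) {
		out[n] = q->ts[n];
		n++;
	}
	for (i = n; i < q->count; i++)
		q->ts[i - n] = q->ts[i];
	q->count -= n;
	q->last = q->last >= n ? q->last - n : 0;
	return n;
}
```

Because a new sample is usually close to the previous one, the two
search loops normally stop after a step or two.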
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
The parse_single_tracepoint_event() was setting some attributes
before it validated the event was indeed a tracepoint event. This
caused problems with other initialization routines like in the
builtin-top.c module whereby sample_period is not set if not 0.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <4bcf232b.698fd80a.6fbe.ffffb737@mx.google.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Previous state machine of perf lock was really broken.
This patch improves it a little.
This patch prepares a list of state machines that represent
lock sequences for each thread.
These state machines can be one of these sequences:
1) acquire -> acquired -> release
2) acquire -> contended -> acquired -> release
3) acquire (w/ try) -> release
4) acquire (w/ read) -> release
The case of 4) is a little special.
Double acquire of a read lock is allowed, so the state machine
counts the read locks held, and permits double acquire and release.
But, things are not so simple. Something in my model is still wrong.
I counted the number of lock instances with bad sequence,
and ratio is like this (case of tracing whoami): bad:233, total:2279
version 2:
* threads are now identified with tid, not pid
* prepared SEQ_STATE_READ_ACQUIRED for read lock.
* the bunch of struct lock_seq_stat is now a linked list
* debug information enhanced (this has to be removed someday)
e.g.
| === output for debug===
|
| bad:233, total:2279
| bad rate:0.000000
| histogram of events caused bad sequence
| acquire: 165
| acquired: 0
| contended: 0
| release: 68
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1271852634-9351-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
[rename SEQ_STATE_UNINITED to SEQ_STATE_UNINITIALIZED]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Atom erratum AAE44/AAF40/AAG38/AAH41:
"If software clears the PS (page size) bit in a present PDE (page
directory entry), that will cause linear addresses mapped through this
PDE to use 4-KByte pages instead of using a large page after old TLB
entries are invalidated. Due to this erratum, if a code fetch uses
this PDE before the TLB entry for the large page is invalidated then
it may fetch from a different physical address than specified by
either the old large page translation or the new 4-KByte page
translation. This erratum may also cause speculative code fetches from
incorrect addresses."
[http://download.intel.com/design/processor/specupdt/319536.pdf]
Whereas commit 211b3d03c7 seems to work around erratum AAH41 (mixed 4K
TLBs), it only reduces the window of opportunity for the bug to occur and
does not totally remove it. This patch disables mixed 4K/4MB page tables
entirely, avoiding the page splitting and not tripping this processor
issue.
This is based on an original patch by Colin King.
Originally-by: Colin Ian King <colin.king@canonical.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com>
Cc: <stable@kernel.org>
When we do a thread switch, we clear the outgoing FS/GS base if the
corresponding selector is nonzero. This is taken by __switch_to() as
an entry invariant; it does not verify that it is true on entry.
However, copy_thread() doesn't enforce this constraint, which can
result in inconsistent results after fork().
Make copy_thread() match the behavior of __switch_to().
Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4BD1E061.8030605@zytor.com>
Cc: <stable@kernel.org>
Since .size is set properly in "struct pernet_operations l2tp_eth_net_ops",
allocating space for "struct l2tp_eth_net" by hand is not correct and even
causes a memory leak.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since .size is set properly in "struct pernet_operations l2tp_net_ops",
allocating space for "struct l2tp_net" by hand is not correct and even
causes a memory leak.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The gianfar driver may pass a NULL pointer to of_translate_address(),
which may lead to a kernel oops. Fix this by using of_iomap(), which
is also much simpler and shorter.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Old P1020RDB device trees were not specifying the tbipa address for
MDIO nodes, which is now causing this kernel oops:
...
eth2: TX BD ring size for Q[6]: 256
eth2: TX BD ring size for Q[7]: 256
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0015504
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0015504] memcpy+0x3c/0x9c
LR [c000a9f8] __of_translate_address+0xfc/0x21c
Call Trace:
[df839e00] [c000a94c] __of_translate_address+0x50/0x21c (unreliable)
[df839e50] [c01a33e8] get_gfar_tbipa+0xb0/0xe0
...
The old device trees are buggy, though having a dead ethernet is
better than a dead kernel, so fix the issue by using of_iomap().
Also, a somewhat similar issue exists in the probe() routine, though
there the oops is only a possibility. Nonetheless, fix it too.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While testing an application using the xpmem (out of kernel) driver, we
noticed a significant page fault rate reduction on x86_64 with respect
to ia64. For one test running with 32 cpus, one thread per cpu, it
took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages.
For the same test running on 256 cpus, one thread per cpu, it took 14:48
to vm_insert_pfn 2 GB worth of pages.
The slowdown was tracked to lookup_memtype which acquires the
spinlock memtype_lock. This heavily contended lock was slowing down
vm_insert_pfn().
With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
cases take approx 00:01.3 seconds to complete.
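The lock-free update can be illustrated with C11 atomics as follows.
This is a sketch rather than the kernel implementation: the two-bit
field position (MEMTYPE_SHIFT, MEMTYPE_MASK), struct page_model and the
helper names are hypothetical, and atomic_compare_exchange_weak() plays
the role of cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>

#define MEMTYPE_SHIFT 4			/* hypothetical bit position */
#define MEMTYPE_MASK  (3UL << MEMTYPE_SHIFT)

struct page_model {
	_Atomic unsigned long flags;
};

/* Readers need no lock: a single atomic load is enough. */
unsigned long get_memtype(struct page_model *pg)
{
	return (atomic_load(&pg->flags) & MEMTYPE_MASK) >> MEMTYPE_SHIFT;
}

/* Replace only the memtype bits, retrying if another CPU raced with us. */
void set_memtype(struct page_model *pg, unsigned long type)
{
	unsigned long old, next;

	assert(type <= 3);
	old = atomic_load(&pg->flags);
	do {
		next = (old & ~MEMTYPE_MASK) | (type << MEMTYPE_SHIFT);
		/* On failure, "old" is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&pg->flags, &old, next));
}
```

The lookup side never touches a shared lock, so it no longer serialises
all CPUs the way a single contended spinlock does.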
Signed-off-by: Robin Holt <holt@sgi.com>
LKML-Reference: <20100423153627.751194346@gulag1.americas.sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Rafael Wysocki <rjw@novell.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
They are not needed and add over 512 bytes to kernel data.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Old code from original patch contains beagle board pins that are
not available on the Devkit8000.
Signed-off-by: Thomas Weber <weber@corscience.de>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Change position of calling serial and ethernet initialization.
Signed-off-by: Thomas Weber <weber@corscience.de>
Signed-off-by: Tony Lindgren <tony@atomide.com>