Impact: None (cleanup only)
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.
While we're at it, correct the bitmasks for the limit and flags. The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The truncate patch should not use the i_allocated_meta_blocks
value. So add seperate functions to be used in the truncate
and alloc path. We also need to release the meta-data block
that we reserved for the blocks that we are truncating.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
With the FLEX_BG layout, there is no reason why extents can't cross
block groups, so make the truncate code reserve enough credits so we
don't BUG if we come across such an extent.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The extents codepath for ext4_truncate() requests journal transaction
credits in very small chunks, requesting only what is needed. This
means there may not be enough credits left on the transaction handle
after ext4_truncate() returns and then when ext4_delete_inode() tries
finish up its work, it may not have enough transaction credits,
causing a BUG() oops in the jbd2 core.
Also, reserve an extra 2 blocks when starting an ext4_delete_inode()
since we need to update the inode bitmap, as well as update the
orphaned inode linked list.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext4_ext_journal_restart() is a convenience function which checks
to see if the requested number of credits is present, and if so it
closes the current transaction and attaches the current handle to the
new transaction. Unfortunately, it wasn't proprely checking the
return value from ext4_journal_extend(), so it was starting a new
transaction when one was not necessary, and returning an error when
all that was necessary was to restart the handle with a new
transaction.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
ext4_da_write_begin needs to call journal_stop before returning,
if the page allocation fails.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In ordered mode, the current jbd2 aborts the journal if a file data buffer
has an error. But this behavior is unintended, and we found that it has
been adopted accidentally.
This patch undoes it and just calls printk() instead of aborting the
journal. Unlike a similar patch for ext3/jbd, file data buffers are
written via generic_writepages(). But we also need to set AS_EIO
into their mappings because wait_on_page_writeback_range() clears
AS_EIO before a user process sees it.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
A transient I/O error can corrupt inode data. Here is the scenario:
(1) update inode_A at the block_B
(2) pdflush writes out new inode_A to the filesystem, but it results
in write I/O error, at this point, BH_Uptodate flag of the buffer
for block_B is cleared and BH_Write_EIO is set
(3) create new inode_C which located at block_B, and
__ext4_get_inode_loc() tries to read on-disk block_B because the
buffer is not uptodate
(4) if it can read on-disk block_B successfully, inode_A is
overwritten by old data
This patch makes __ext4_get_inode_loc() not read the inode block if the
buffer has BH_Write_EIO flag. In this case, the buffer should have the
latest information, so setting the uptodate flag to the buffer (this
avoids WARN_ON_ONCE() in mark_buffer_dirty().)
According to this change, we would need to test BH_Write_EIO flag for the
error checking. Currently nobody checks write I/O errors on metadata
buffers, but it will be done in other patches I'm working on.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: sugita <yumiko.sugita.yf@hitachi.com>
Cc: Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Jan Kara <jack@ucw.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, the locality group prealloc list is freed only when there
is a block allocation failure. This can result in large number of
entries in the preallocation list making ext4_mb_use_preallocated()
expensive.
To fix this, we convert the locality group prealloc list to a hash
list. The hash index is the order of number of blocks in the prealloc
space with a max order of 9. When adding prealloc space to the list we
make sure total entries for each order does not exceed 8. If it is
more than 8 we discard few entries and make sure the we have only <= 5
entries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
NR_CPUS can be really large. We should be using nr_cpu_ids instead.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't call BUG_ON on file system failures. Instead use ext4_error and
also handle the continue case properly.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
I noticed when filling a 1T filesystem with 4 threads using the
fs_mark benchmark:
fs_mark -d /mnt/test -D 256 -n 100000 -t 4 -s 20480 -F -S 0
that I occasionally got checksum mismatch errors:
EXT4-fs error (device sdb): ext4_init_inode_bitmap: Checksum bad for group 6935
etc. I'd reliably get 4-5 of them during the run.
It appears that the problem is likely a race to init the bg's
when the uninit_bg feature is enabled.
With the patch below, which adds sb_bgl_locking around initialization,
I was able to complete several runs with no errors or warnings.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4_read_block_bitmap and read_inode_bitmap do essentially
the same thing, and yet they are structured quite differently.
I came across this difference while looking at doing bg locking
during bg initialization.
This patch:
* removes unnecessary casts in the error messages
* renames read_inode_bitmap to ext4_read_inode_bitmap
* and more substantially, restructures the inode bitmap
reading function to be more like the block bitmap counterpart.
The change to the inode bitmap reader simplifies the locking
to be applied in the next patch.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the block group checksums are corrupted, still allow the mount to
succeed, so e2fsck can have a chance to try to fix things up. Add
code in the remount r/w path to make sure the block group checksums
are valid before allowing the filesystem to be remounted read/write.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Inserting an extent can cause a new entry in the already existing index
block. That doesn't increase the depth of the instead. Instead it adds a
new leaf block. Now with the new leaf block the path information
corresponding to the logical block should be fetched from the new block.
The old path will be pointing to the old leaf block.
We need to recalucate the path information on extent insert
even if depth doesn't change. Without this change, the extent merge
after converting an unwritten extent to initialized extent takes the wrong
extent and cause data corruption.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Switches to unlocked_ioctl read to remove ioctl BKL method. Fix the
unknown ioctl return. Probably a nice easy one to kill off BKL usage
entirely later
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch removes the experimental status of kdump on IA64. kdump is on IA64
now since more than one year and it has proven to be stable.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.
While we're at it, correct the bitmasks for the limit and flags. The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
acpi_map_lsapic tries to stuff a long into ia64_cpu_to_sapicid[],
which can only hold ints, so let's fix that.
We need to update the signature of acpi_map_cpu2node() too.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
module_free() refers the first parameter before checking.
But it is called like below(in kernel/kprobes). The first parameter is always NULL.
This happens when many probe points(>1024) are set by kprobes.
I encountered this with using SystemTap. It can set many probes easily.
static int __kprobes collect_one_slot(struct kprobe_insn_page *kip, int idx)
{
...
if (kip->nused == 0) {
hlist_del(&kip->hlist);
if (hlist_empty(&kprobe_insn_pages)) {
...
} else {
module_free(NULL, kip->insns); //<<< 1st param always NULL
kfree(kip);
}
return 1;
}
return 0;
}
Signed-off-by: Akiyama, Nobuyuki <akiyama.nobuyuk@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
When dprintk is enabled the following warnings are generated:
arch/ia64/kernel/cpufreq/acpi-cpufreq.c: In function 'processor_set_pstate':
arch/ia64/kernel/cpufreq/acpi-cpufreq.c:54: warning: format '%x' expects type 'unsigned int', but argumen
t 3 has type 's64'
arch/ia64/kernel/cpufreq/acpi-cpufreq.c: In function 'processor_get_pstate':
arch/ia64/kernel/cpufreq/acpi-cpufreq.c:76: warning: format '%x' expects type 'unsigned int', but argumen
t 2 has type 's64'
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
[PATCH] ocfs2: fix oops in mmap_truncate testing
configfs: call drop_link() to cleanup after create_link() failure
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
configfs: Fix failing mkdir() making racing rmdir() fail
configfs: Fix deadlock with racing rmdir() and rename()
configfs: Make configfs_new_dirent() return error code instead of NULL
configfs: Protect configfs_dirent s_links list mutations
configfs: Introduce configfs_dirent_lock
ocfs2: Don't snprintf() without a format.
ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
ocfs2/net: Silence build warnings on sparc64
ocfs2: Handle error during journal load
ocfs2: Silence an error message in ocfs2_file_aio_read()
ocfs2: use simple_read_from_buffer()
ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
[PATCH 2/2] ocfs2: Instrument fs cluster locks
[PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix asm/e820.h for userspace inclusion
x86: fix numaq_tsc_disable
x86: fix kernel_physical_mapping_init() for large x86 systems
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ftrace: do not trace library functions
ftrace: do not trace scheduler functions
ftrace: fix lockup with MAXSMP
ftrace: fix merge buglet
asm-x86/e820.h is included from userspace. 'x86: make e820.c to have
common functions' (b79cd8f126) broke it:
make -C Documentation/lguest
cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c -lz -o lguest
In file included from ../../include/asm-x86/bootparam.h:8,
from lguest.c:45:
../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/numaq_32.c: In function ‘numaq_tsc_disable’:
arch/x86/kernel/numaq_32.c:99: warning: ‘return’ with a value, in function returning void
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Xen save/restore needs bits of code enabled by PM_SLEEP, and PM_SLEEP
depends on PM. So make XEN_SAVE_RESTORE depend on PM and PM_SLEEP
depend on XEN_SAVE_RESTORE.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix calls of smp_call_function*() in arch/ia64/kvm for recent API
changes.
CC [M] arch/ia64/kvm/kvm-ia64.o
arch/ia64/kvm/kvm-ia64.c: In function 'handle_global_purge':
arch/ia64/kvm/kvm-ia64.c:398: error: too many arguments to function 'smp_call_function_single'
arch/ia64/kvm/kvm-ia64.c: In function 'kvm_vcpu_kick':
arch/ia64/kvm/kvm-ia64.c:1696: error: too many arguments to function 'smp_call_function_single'
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
fix dangling zombie when new parent ignores children
do_wait: return security_task_wait() error code in place of -ECHILD
ptrace children revamp
do_wait reorganization
The patch named "powerpc/mpc5121: Add clock driver", also contained
an unrelated and bogus change to the top-level makefile. This patch
backs out the bad bit.
SHA1 of offending patch: 137e95906e)
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Repented-by: John Rigby <jrigby@freescale.com>
[ Heh. Normally I pick these out from the diffstats, but I guess
I've grown to trust the ppc tree too much ;) - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
make function tracing more robust: do not trace library functions.
We've already got a sizable list of exceptions:
ifdef CONFIG_FTRACE
# Do not profile string.o, since it may be used in early boot or vdso
CFLAGS_REMOVE_string.o = -pg
# Also do not profile any debug utilities
CFLAGS_REMOVE_spinlock_debug.o = -pg
CFLAGS_REMOVE_list_debug.o = -pg
CFLAGS_REMOVE_debugobjects.o = -pg
CFLAGS_REMOVE_find_next_bit.o = -pg
CFLAGS_REMOVE_cpumask.o = -pg
CFLAGS_REMOVE_bitmap.o = -pg
endif
... and the pattern has been that random library functionality showed
up in ftrace's critical path (outside of its recursion check), causing
hard to debug lockups.
So be a bit defensive about it and exclude all lib/*.o functions by
default. It's not that they are overly interesting for tracing purposes
anyway. Specific ones can still be traced, in an opt-in manner.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
MAXSMP brings in lots of use of various bitops in smp_processor_id()
and friends - causing ftrace to lock up during bootup:
calling anon_inode_init+0x0/0x130
initcall anon_inode_init+0x0/0x130 returned 0 after 0 msecs
calling acpi_event_init+0x0/0x57
[ hard hang ]
So exclude the bitops facilities from tracing.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make chsc_error_from_response() available to chsc callers outside
of chsc.c (namely qdio) to avoid duplicating error checking code.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add missing module.h include to fix this:
CC arch/s390/kernel/stacktrace.o
arch/s390/kernel/stacktrace.c:84: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:84: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:84: warning: parameter names (without types) in function declaration
arch/s390/kernel/stacktrace.c:97: warning: data definition has no type or storage class
arch/s390/kernel/stacktrace.c:97: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/s390/kernel/stacktrace.c:97: warning: parameter names (without types) in function declaration
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Compiling a kernel with allmodconfig or allyesconfig results in tons
of gcc warnings, because the default maximum stacksize from which on
gcc will emit a warning is just 256 bytes.
Increase this to 2048, so these warnings don't distract from the real
warnings that we need to watch at.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>