Back before e1000-7.3.20, the e1000 driver had a simple algorithm that
managed interrupt moderation. The driver was updated in 7.3.20 to
have the new "adaptive" interrupt moderation but we have customer
requests to redeploy the old way as an option. This patch adds the
old functionality back. The old method can be enabled via
module parameter or at runtime via ethtool.
Module parameter: (InterruptThrottleRate=4) to use this
moderation method.
Ethtool method: ethtool -C ethX rx-usecs 4
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change increases the RX fifo size to 36K for standard frames and
decreases the TX fifo size to 4K. The reason for this change is that on
slower systems the RX is much more likely to backfill and need space than
the TX is. As long as the TX fifo is twice the size of the MTU we should
have more than enough TX fifo.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce number of writes to RX producer pointer. When alloc'ing RX
buffers, only write the RX producer pointer once every
E1000_RX_BUFFER_WRITE (16) buffers created.
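
A minimal userspace sketch of the batching described above. The ring
struct, write_tail() and alloc_rx_buffers_sketch() are hypothetical
stand-ins, not the driver's own code; the only detail taken from this
message is the batch size of 16.

```c
#include <assert.h>

#define RX_BUFFER_WRITE 16	/* mirrors E1000_RX_BUFFER_WRITE; power of 2 */

/* Hypothetical stand-in for the RX ring state; not the driver's struct. */
struct rx_ring_sketch {
	unsigned int count;		/* number of descriptors in the ring */
	unsigned int next_to_use;	/* next descriptor to fill */
	unsigned int tail;		/* last value "written" to the tail register */
	unsigned int tail_writes;	/* how many register writes happened */
};

/* Stand-in for the MMIO write that publishes new buffers to hardware. */
static void write_tail(struct rx_ring_sketch *ring, unsigned int value)
{
	ring->tail = value;
	ring->tail_writes++;
}

/* Fill cleaned_count descriptors, bumping the tail once per 16 buffers. */
static void alloc_rx_buffers_sketch(struct rx_ring_sketch *ring,
				    unsigned int cleaned_count)
{
	unsigned int i = ring->next_to_use;

	assert(ring->count > 0 && cleaned_count < ring->count);

	while (cleaned_count--) {
		/* ... a real driver allocates and maps a buffer here ... */
		if (++i == ring->count)
			i = 0;
		/* Only every 16th index triggers a register write. */
		if ((i & (RX_BUFFER_WRITE - 1)) == 0)
			write_tail(ring, i);
	}
	ring->next_to_use = i;
}
```

Filling 64 descriptors of an empty 256-entry ring in this sketch costs
4 tail writes rather than 64.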
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In e1000_tx_map, precompute the number of segments and bytecounts which
are derived from fields in skb; these are stored in buffer_info. When
cleaning tx in e1000_clean_tx_irq use the values in the associated
buffer_info for statistics counting; this eliminates cache misses
on skb fields.
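
A rough sketch of the idea, using trimmed-down stand-in structs rather
than the real sk_buff and buffer_info. The bytecount formula (headers
repeated once per extra segment) is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-ins for sk_buff and buffer_info. */
struct skb_sketch {
	unsigned int len;	/* total packet length in bytes */
	unsigned int headlen;	/* header bytes repeated per segment */
	unsigned int gso_segs;	/* 0 when the packet is not segmented */
};

struct buffer_info_sketch {
	unsigned int segs;
	unsigned int bytecount;
};

struct tx_stats_sketch {
	unsigned long packets;
	unsigned long bytes;
};

/* Map time: the skb is hot in cache, so derive the counters now. */
static void tx_map_sketch(const struct skb_sketch *skb,
			  struct buffer_info_sketch *bi)
{
	assert(skb->headlen <= skb->len);

	bi->segs = skb->gso_segs ? skb->gso_segs : 1;
	/* Assumed formula: each extra segment repeats the headers. */
	bi->bytecount = (bi->segs - 1) * skb->headlen + skb->len;
}

/* Clean time: only buffer_info is read; the skb is never dereferenced. */
static void clean_tx_sketch(const struct buffer_info_sketch *bi,
			    struct tx_stats_sketch *stats)
{
	stats->packets += bi->segs;
	stats->bytes += bi->bytecount;
}
```

The point of the split is that the cleanup path, which runs long after
the skb was last touched, reads two integers from a structure it already
has in hand.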
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flag used in the check to get rxhash out of the descriptor is the
incorrect one. Fix it to use the proper features flag.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of TOMOYO's functions may sleep after mutex_lock(). If OOM-killer selected
a process which is waiting at mutex_lock(), the to-be-killed process can't be
killed. Thus, replace mutex_lock() with mutex_lock_interruptible() so that the
to-be-killed process can immediately return from TOMOYO's functions.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
On/Off contains a slash in the name, which causes a warning during boot.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Update CPUID pattern of PXA9xx in head.S and fix the duplicate
entries for pxa935.
Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
"on/off button" was recently renamed to remove the slash character.
Follow that change in the pin polarity detection as well.
While at it, fix another cosmetic coding style flaw as well.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
- Bring in a CMDLINE that actually works and prints to the right tty
- Compile-in JFFS2 to boot into rootfs
- Remove unneeded options for Bluetooth and radio
- Disable CPU_FREQ as it makes the flash driver fail
Thanks Jonathan for spotting what I messed up.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
UARTs in the S3C2416 are almost the same as in the S3C2443 and can be
handled by the s3c2440 serial driver.
Signed-off-by: Yauhen Kharuzhy <jekhor@gmail.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Remove the old S3C2410_GPJ as we will be moving to the new gpiolib
based driver code and these numbers will become invalid.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Change s3c2410_gpio_setpin() and s3c2410_gpio_pullup() to use
the new s3c_ gpio configuration calls until all their users
are converted.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Remove the last s3c2410_gpio_pullup() users in arch/arm/mach-s3c2410.
Note, since mach-h1940.c is setting output and a pull-up, the call
has been changed to S3C_GPIO_PULL_NONE instead of S3C_GPIO_PULL_UP.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Move mach-mini2440 to using the gpiolib API for the GPIOs it
directly uses, and s3c_gpio calls for configuration.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Add the necessary 1, 2 and 4 bit configuration read calls for the new
gpio code to allow removal of the old s3c24xx gpio code.
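
As an illustration only, reading a pin's configuration from a control
register where each pin owns 1, 2 or 4 bits could look like the sketch
below. gpio_getcfg_sketch() is a hypothetical helper, not one of the
calls added here, and it assumes the pin's field lies within a single
32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the configuration field for pin from a control register value
 * in which every pin owns width bits (1, 2 or 4).
 */
static unsigned int gpio_getcfg_sketch(uint32_t con, unsigned int pin,
				       unsigned int width)
{
	unsigned int shift = pin * width;
	uint32_t mask = (1u << width) - 1;

	assert(width == 1 || width == 2 || width == 4);
	assert(shift + width <= 32);	/* pin must live in this register */

	return (con >> shift) & mask;
}
```

With a register value of 0xA4, pin 1 reads back as 1 in the 2-bit
layout and as 10 in the 4-bit layout.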
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Lockres hash size of 16KB is far too small for large filesystems (where we
have hundreds of thousands of lock resources stored in the table).
This patch increases it to 128KB.
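
To give a feel for the numbers, here is a back-of-the-envelope sketch.
It assumes the table is an array of list heads that are one pointer
each, so the bucket count is the table size divided by the pointer
size; the helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a one-pointer hash list head. */
struct hlist_head_sketch {
	void *first;
};

/* Number of bucket heads that fit in a table of table_bytes bytes. */
static size_t lockres_hash_buckets_sketch(size_t table_bytes)
{
	assert(table_bytes >= sizeof(struct hlist_head_sketch));
	return table_bytes / sizeof(struct hlist_head_sketch);
}

/* Average chain length when nr_lockres entries are spread evenly. */
static size_t avg_chain_length_sketch(size_t nr_lockres, size_t table_bytes)
{
	size_t buckets = lockres_hash_buckets_sketch(table_bytes);

	return (nr_lockres + buckets - 1) / buckets;	/* round up */
}
```

Under that assumption and with 8-byte pointers, 16KB gives 2048 buckets
and 128KB gives 16384, so chains shrink by roughly a factor of eight for
the same number of lock resources.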
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
In ocfs2, we use ocfs2_extend_trans() to extend a journal handle's
blocks. But if jbd2_journal_extend() fails, it will only restart
with the new number of blocks. This tends to be awkward since
in most cases we want additional reserved blocks. It makes our code
harder to maintain since the caller can't be sure all the original
blocks will not be accessed and dirtied again. There are 15 callers
of ocfs2_extend_trans() in fs/ocfs2, and 12 of them have to add
h_buffer_credits before they call ocfs2_extend_trans(). This makes
ocfs2_extend_trans() really extend atop the original block count.
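
The intended behaviour can be sketched in userspace as follows. The
handle struct and the journal_*_sketch() helpers are simplified mocks
invented for this example; they only model the credit arithmetic, not
jbd2 itself.

```c
#include <assert.h>

/* Hypothetical stand-in for a jbd2 handle; only the credits matter. */
struct handle_sketch {
	int buffer_credits;	/* credits still available to this handle */
	int journal_free;	/* credits the running transaction can give */
	int restarts;		/* how many times the handle was restarted */
};

/* Try to grow in place; returns 0 on success, 1 if there was no room. */
static int journal_extend_sketch(struct handle_sketch *h, int nblocks)
{
	if (nblocks > h->journal_free)
		return 1;
	h->journal_free -= nblocks;
	h->buffer_credits += nblocks;
	return 0;
}

/* Start a fresh transaction holding exactly nblocks credits. */
static void journal_restart_sketch(struct handle_sketch *h, int nblocks)
{
	h->buffer_credits = nblocks;
	h->restarts++;
}

/* Extend by nblocks on top of whatever the handle already holds. */
static void extend_trans_sketch(struct handle_sketch *h, int nblocks)
{
	int old_nblocks = h->buffer_credits;

	assert(nblocks > 0);

	if (journal_extend_sketch(h, nblocks) != 0)
		/* Restart with the sum, not just the new request. */
		journal_restart_sketch(h, old_nblocks + nblocks);
}
```

Either way the caller ends up with its original credits plus the
request, so it no longer needs to add h_buffer_credits by hand first.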
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Two tiny cleanups for allocation reservation.
1. Remove some extra code in ocfs2_local_alloc_find_clear_bits.
2. Remove an unneeded variable in ocfs2_find_resv_lhs.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
When we allocate some bits from the reservation, we always
allocate from the r_start (see ocfs2_resmap_resv_bits).
So there should be no reason to check between r_start
and start. And I don't think we will change this behaviour
later by allocating from some bits after r_start. Why not make
ocfs2_adjust_resv_from_alloc simple for now?
The only chance we have to adjust the reservation is when we haven't
reached the end. With this patch, the function is more readable.
Note:
btw, this patch also fixes an original bug in the function
which I haven't found before.
	if (end < ocfs2_resv_end(resv))
		rhs = end - ocfs2_resv_end(resv);
This code is of course buggy. ;)
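
For illustration, the simplified logic might look like the sketch
below. In the quoted snippet the smaller value has the larger one
subtracted from it, which is negative, or wraps if the values are
unsigned. The struct and function names here are hypothetical
stand-ins, and marking an exhausted window as empty by zeroing r_len is
an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the reservation structure. */
struct resv_sketch {
	unsigned int r_start;	/* first bit of the window */
	unsigned int r_len;	/* window length in bits; 0 means empty */
};

/* Last bit covered by the window (inclusive). */
static unsigned int resv_end_sketch(const struct resv_sketch *resv)
{
	return resv->r_len ? resv->r_start + resv->r_len - 1 : resv->r_start;
}

/*
 * Bits [start, end] were just allocated out of the window. Because
 * allocation always begins at r_start, only the tail can be left over.
 */
static void adjust_resv_from_alloc_sketch(struct resv_sketch *resv,
					  unsigned int start, unsigned int end)
{
	unsigned int old_end = resv_end_sketch(resv);

	assert(start == resv->r_start && end <= old_end);

	if (end == old_end) {
		/* Window fully consumed; mark it empty. */
		resv->r_len = 0;
		return;
	}

	/* Larger minus smaller, unlike the quoted snippet. */
	resv->r_start = end + 1;
	resv->r_len = old_end - end;
}
```

A 16-bit window starting at bit 100 that has bits 100..103 allocated
from it is left covering bits 104..115.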
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
OCFS2 has never really supported intr. This patch acknowledges this reality
and makes nointr the default mount option. In a later patch, we intend to
support intr.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
o2dlm join and leave messages are more than informational as they are
required for debugging locking issues. This patch changes them from
KERN_INFO to KERN_NOTICE.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
The default behavior for directory reservations stays the same, but we add a
mount option so people can tweak the size of directory reservations
according to their workloads.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
The default reservation size of 4 (32-bit windows) is a bit too ambitious.
Scale it back to 16 bits (resv_level=2). I have been testing various sizes
on a 4-node cluster which runs a mixed workload that is heavily threaded.
With a 256MB local alloc, I get *roughly* the following levels of average file
fragmentation:
    resv_level=0    70%
    resv_level=1    21%
    resv_level=2    23%
    resv_level=3    24%
    resv_level=4    60%
    resv_level=5    did not test
    resv_level=6    60%
resv_level=2 seemed like a good compromise between not letting windows be
too small, but not so big that heavier workloads will immediately suffer
without tuning.
This patch also changes the behavior of directory reservations - they now
track file reservations. The previous compromise of giving directory
windows only 8 bits wound up fragmenting more at some window sizes because
file allocations had smaller unused windows to poach from.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
I have observed that the current size of 8M gives us pretty poor
fragmentation on multi-threaded workloads which do lots of writes.
Generally, I can increase the size of local alloc windows and observe a
marked decrease in fragmentation, even up to and beyond window sizes of 512
megabytes. This makes sense for a couple reasons - larger local alloc means
more room for reservation windows. On multi-node workloads the larger local
alloc helps as well because we don't have to do window slides as often.
Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no
longer used and the comment above it was out of date.
To test fragmentation, I used a workload which launched 4 threads that did
4k writes into a series of about 140 alternating files.
With resv_level=2, and a 4k/4k file system I observed the following average
fragmentation for various localalloc= parameters:
    localalloc=    avg. fragmentation
    8              48
    32             16
    64             10
    120            7
On larger cluster sizes, the difference is more dramatic.
The new default size tops out at 256M, which we'll only get for cluster
sizes of 32K and above.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
This patch pulls the local alloc sizing code into localalloc.c and provides
a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged
except that I correctly calculate the maximum local alloc size. The old code
in ocfs2_parse_options() calculated the max size as:
	ocfs2_local_alloc_size(sb) * 8
which is correct, in bits. Unfortunately though the option passed in is in
megabytes. Ultimately, this bug made no real difference - the shrink code
would catch a too-large size and bring it down to something reasonable.
Still, it's less than efficient as-is.
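
The unit mismatch can be seen with a small sketch. It assumes each bit
of the local alloc bitmap stands for one cluster; the helper name and
the bitmap size used in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Largest local alloc window, in megabytes, that a bitmap of
 * bitmap_bytes bytes can describe when each bit is one cluster of
 * (1 << clustersize_bits) bytes.
 */
static uint64_t la_max_megabytes_sketch(unsigned int bitmap_bytes,
					unsigned int clustersize_bits)
{
	uint64_t max_bits = (uint64_t)bitmap_bytes * 8;	/* one bit per cluster */

	/* Cluster sizes from 4K to 1M. */
	assert(clustersize_bits >= 12 && clustersize_bits <= 20);

	/* clusters -> bytes -> megabytes */
	return (max_bits << clustersize_bits) >> 20;
}
```

With a hypothetical 512-byte bitmap and 4K clusters, the bit count is
4096 but the real ceiling is 16 megabytes, so comparing the raw bit
count against a value given in megabytes overstates the limit by a
factor of 256.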
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Inodes are always allocated from the global bitmap now so we don't need this
any more. Also, the existing implementation bounces reservations around
needlessly.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Otherwise, the need for a very large contiguous allocation tends to
wreak havoc on many inode allocation reservations on the local alloc, thus
ruining any chances for contiguousness.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Use the reservations system for unindexed dir tree allocations. We don't
bother with the indexed tree as reads from it are mostly random anyway.
Directory reservations are marked separately, to allow the reservations code
a chance to optimize their window sizes. This patch allocates only 8 bits
for directory windows as they generally are not expected to grow as quickly
as file data. Future improvements to dir window sizing can trivially be
made.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
This patch improves Ocfs2 allocation policy by allowing an inode to
reserve a portion of the local alloc bitmap for itself. The reserved
portion (allocation window) is advisory in that other allocation
windows might steal it if the local alloc bitmap becomes
full. Otherwise, the reservations are honored and guaranteed to be
free. When the local alloc window is moved to a different portion of
the bitmap, existing reservations are discarded.
Reservation windows are represented internally by a red-black
tree. Within that tree, each node represents the reservation window of
one inode. An LRU of active reservations is also maintained. When new
data is written, we allocate it from the inode's window. When all bits
in a window are exhausted, we allocate a new one as close to the
previous one as possible. Should we not find free space, an existing
reservation is pulled off the LRU and cannibalized.
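
A much simplified sketch of the window search follows. It swaps the
red-black tree for a sorted array and leaves out the LRU entirely, so it
only shows how a new window is placed as close as possible to a goal
bit; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One inode's reservation window: bits [start, start + len). */
struct window_sketch {
	unsigned int start;
	unsigned int len;
};

/*
 * Find the first free run of want bits at or after goal in a bitmap of
 * bitmap_bits bits, given n existing windows sorted by start and not
 * overlapping. Returns the start bit, or -1 if nothing fits (the point
 * at which a real allocator would cannibalize a window from the LRU).
 */
static long find_window_sketch(const struct window_sketch *w, size_t n,
			       unsigned int bitmap_bits, unsigned int goal,
			       unsigned int want)
{
	unsigned int candidate = goal;
	size_t i;

	assert(want > 0);

	for (i = 0; i < n; i++) {
		unsigned int w_end = w[i].start + w[i].len;	/* exclusive */

		if (w_end <= candidate)
			continue;	/* window lies entirely before us */
		if (w[i].start >= candidate && w[i].start - candidate >= want)
			break;		/* gap before this window is big enough */
		candidate = w_end;	/* skip past the occupied window */
	}

	if (candidate > bitmap_bits || bitmap_bits - candidate < want)
		return -1;
	return (long)candidate;
}
```

A tree gives the same answer with a logarithmic lookup of the first
window near the goal, which matters once there are many inodes with
active reservations.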
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0
since before the kernel moved to git. There is no point in checking
this error.
ocfs2_journal_dirty() has been faithfully returning the status since the
beginning. All over ocfs2, we have blocks of code checking this
can't-fail status. In the past few years, we've tried to avoid adding these
checks, because they are pointless. But anyone who looks at our code
assumes they are needed.
Finally, ocfs2_journal_dirty() is made a void function. All error
checking is removed from other files. We'll BUG_ON() the status of
jbd2_journal_dirty_metadata() just in case they change it someday. They
won't.
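
In userspace terms the new shape is roughly the following, with
assert() standing in for BUG_ON() and a mock in place of the jbd2 call.
Both function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in: like the jbd2 call, this only ever returns 0. */
static int journal_dirty_metadata_sketch(int *dirty_flag)
{
	*dirty_flag = 1;
	return 0;
}

/* Void wrapper: callers have no status to check. */
static void journal_dirty_sketch(int *dirty_flag)
{
	int status = journal_dirty_metadata_sketch(dirty_flag);

	/* Userspace analogue of BUG_ON(status): trap if the contract changes. */
	assert(status == 0);
	(void)status;	/* avoid an unused-variable warning under NDEBUG */
}
```

Callers simply invoke the wrapper and move on; there is no return value
left to test.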
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Errors from construct_alloc_key() shouldn't just be ignored in the way they are
by construct_key_and_link(). The only error that can be ignored so is
EINPROGRESS as that is used to indicate that we've found a key and don't need
to construct one.
We don't, however, handle ENOMEM, EDQUOT or EACCES, which indicate
allocation failures of one sort or another.
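
The intended handling can be sketched like this. The helper below is
hypothetical and only models the decision on the allocation step's
return value, not the real construct_key_and_link().

```c
#include <assert.h>
#include <errno.h>

/*
 * Decide what to do with the result of the allocation step.
 * Returns 0 to carry on with the key in hand, or the negative errno
 * that must be passed back to the caller. *need_construct is set when
 * a new key was allocated and still has to be constructed.
 */
static int handle_alloc_result_sketch(int alloc_ret, int *need_construct)
{
	assert(alloc_ret <= 0);

	*need_construct = 0;

	if (alloc_ret == 0) {
		*need_construct = 1;	/* fresh key: go and construct it */
		return 0;
	}
	if (alloc_ret == -EINPROGRESS)
		return 0;		/* existing key found: not an error */

	return alloc_ret;		/* -ENOMEM, -EDQUOT, -EACCES, ... */
}
```

Only -EINPROGRESS is folded into success; every other negative value
reaches the caller unchanged.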
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
keyring_serialise_link_sem is only needed for keyring->keyring links as it's
used to prevent cycle detection from being avoided by parallel keyring
additions.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Remove s3c2410_gpio_getirq() as the only user is the pm code, and it
can be replicated by using gpio_to_irq().
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Remove the implementation of s3c2410_gpio_setcfg() as it should now be
functionally equivalent to s3c_gpio_cfgpin(), and add a wrapper for those
drivers that are still using this call.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>