Commit Graph

482213 Commits

Author SHA1 Message Date
Joe Carnuccio
4096953054 qla2xxx: ISP25xx multiqueue shadow register crash fix.
When creating request/response queues from qla25xx_setup_mode(),
the shadow index register pointers were not being initialized
to point at the registers.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:02 +02:00
Joe Carnuccio
98aee70d19 qla2xxx: Add endianizer to max_payload_size modifier.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Chad Dupuis
420854b3cd qla2xxx: Enable fast flash access for ISP83xx.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Joe Carnuccio
2ac224bc0e qla2xxx: Add ISP27xx fwdump template entry T275 (insert buffer).
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Joe Carnuccio
ce9b9b0858 qla2xxx: ISP27xx fwdump template fix insertbuf() routine.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Joe Carnuccio
01cb65f1bb qla2xxx: ISP27xx fwdump template remove high frequency debug logs.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Joe Carnuccio
aa2dc3727a qla2xxx: ISP27xx optimize fwdump entry table lookup.
Since the entry call array is sorted in order of entry type opcode,
the search can be terminated as soon as the search key is exceeded.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
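The early-exit lookup described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's code: the struct, the table contents, and the names `entry_call`, `entry_table`, and `find_entry` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: an opcode paired with a handler id. */
struct entry_call {
    uint32_t type;   /* entry type opcode; the table is sorted by this field */
    int handler_id;  /* stand-in for a handler function pointer */
};

/* Illustrative table, sorted in ascending order of type. */
static const struct entry_call entry_table[] = {
    { 0,   1 },
    { 255, 2 },
    { 256, 3 },
    { 260, 4 },
    { 275, 5 },
};

/*
 * Linear scan that stops as soon as a table value exceeds the key.
 * Returns the matching entry, or NULL when the key is not present.
 */
static const struct entry_call *find_entry(uint32_t type)
{
    size_t n = sizeof(entry_table) / sizeof(entry_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (entry_table[i].type == type)
            return &entry_table[i];
        if (entry_table[i].type > type)
            break;  /* sorted table: no later entry can match */
    }
    return NULL;
}
```

A lookup for a missing opcode such as 100 stops at the entry with type 255 instead of walking the whole table.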
Joe Carnuccio
299f5e27ac qla2xxx: ISP27xx add tests for incomplete template.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:25:01 +02:00
Chris J Arges
4089b71cc8 mptfusion: enable no_write_same for vmware scsi disks
When using a virtual SCSI disk in a VMware VM, data can be improperly
zeroed out by the mptfusion driver if blkdev_issue_zeroout is used. This patch
disables write_same for this driver and the VMware subsystem_vendor, which
ensures that manual zeroing out is used instead.

Cc: stable@vger.kernel.org
BugLink: http://bugs.launchpad.net/bugs/1371591
Reported-by: Bruce Lucas <bruce.lucas@mongodb.com>
Tested-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:24:48 +02:00
NeilBrown
cbbce82209 SCHED: add some "wait..on_bit...timeout()" interfaces.
In commit c1221321b7
   sched: Allow wait_on_bit_action() functions to support a timeout

I suggested that a "wait_on_bit_timeout()" interface would not meet my
need.  This isn't true - I was just over-engineering.

Including a 'private' field in wait_bit_key instead of a focused
"timeout" field was just premature generalization.  If some other
use is ever found, it can be generalized or added later.

So this patch renames "private" to "timeout", meaning "stop
waiting when 'jiffies' reaches or passes 'timeout'",
and adds two of the many possible wait..bit..timeout() interfaces:

wait_on_page_bit_killable_timeout(), which is the one I want to use,
and out_of_line_wait_on_bit_timeout() which is a reasonably general
example.  Others can be added as needed.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-25 08:23:57 -04:00
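The "reaches or passes" test in the commit above has to stay correct when the tick counter wraps around. The kernel provides time_after_eq() for this kind of comparison; the sketch below is a userspace stand-in with a hypothetical name, `deadline_reached`, and is not taken from the patch.

```c
#include <stdbool.h>

/*
 * Wrap-safe "now has reached or passed deadline" test for a free-running
 * unsigned tick counter. The subtraction is done in unsigned arithmetic and
 * the result is reinterpreted as signed, so the answer stays correct as long
 * as the two values are less than half the counter range apart.
 */
static bool deadline_reached(unsigned long now, unsigned long deadline)
{
    return (long)(now - deadline) >= 0;
}
```

A plain `now >= deadline` would give the wrong answer for a deadline set just before the counter wraps and checked just after.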
Tomas Henzl
859c75aba2 hpsa: add missing pci_set_master in kdump path
Add the call to pci_set_master(...) that was missing from the previous
patch "hpsa: refine the pci enable/disable handling".
Found thanks to Rob Elliott.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Robert Elliott <elliott@hp.com>
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:23:41 +02:00
Ching Huang
2e9feb434a arcmsr: simplify ioctl data read/write
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:23:40 +02:00
wenxiong@linux.vnet.ibm.com
3185ea6390 ipr: don't log error messages when applications issue illegal requests
Failing-device information is logged when the IOA firmware detects an
illegal request, for example an inquiry with page code 2, which the IOA
firmware does not support. This patch suppresses those log messages.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-25 14:23:18 +02:00
Greg Kroah-Hartman
346e2e4a8b Merge tag 'phy-for_3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-next
Kishon writes:

Adds 3 new PHY drivers stih407, stih41x and rcar gen2 PHY. It also
includes miscellaneous cleanup of other PHY drivers.

Conflicts:
	MAINTAINERS
2014-09-25 13:11:52 +02:00
Srinivas Kandagatla
9aacd602f0 of/fdt: fix memory range check
In cases where the board has the memory DT node below

memory{
	device_type = "memory";
	reg = <0x80000000 0x80000000>;
};

the check on the memory range in fdt.c will always fail because it
compares MAX_PHYS_ADDR with base + size; it should instead compare
it with base + size - 1.

This issue was originally noticed on the Qualcomm IFC6410 board.
Without this patch the kernel shows unnecessary warnings

[    0.000000] Machine model: Qualcomm APQ8064/IFC6410
[    0.000000] Ignoring memory range 0xffffffff - 0x100000000
[    0.000000] cma: Reserved 64 MiB at ab800000

and as a result the size gets reduced to 0x7fffffff, which looks wrong.

This patch fixes the check involved in generating this warning and
as a result it also fixes the wrong size calculation.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
[grant.likely: adjust new size calculation also]
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2014-09-25 11:55:50 +01:00
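The off-by-one in the of/fdt range check above can be reproduced with the node values from the commit message. The sketch assumes a 32-bit physical address space, so MAX_PHYS_ADDR is taken to be 0xffffffff; the function names are hypothetical and the code is not the kernel's fdt.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 32-bit physical addresses, so the highest address is 0xffffffff. */
#define MAX_PHYS_ADDR UINT64_C(0xffffffff)

/* Old-style check: compares the exclusive end (base + size) with the limit. */
static bool range_rejected_old(uint64_t base, uint64_t size)
{
    return base + size > MAX_PHYS_ADDR;
}

/* Fixed check: compares the last valid byte (base + size - 1) with the limit.
 * Assumes size is non-zero. */
static bool range_rejected_new(uint64_t base, uint64_t size)
{
    return base + size - 1 > MAX_PHYS_ADDR;
}
```

With base 0x80000000 and size 0x80000000, base + size is 0x100000000, which exceeds the limit, while the last byte of the range is 0xffffffff, which does not.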
Greg Kroah-Hartman
5caf6ae5ce Merge tag 'usb-serial-3.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-next
Johan writes:

USB-serial fixes for v3.17

Here are two more device IDs for v3.17.

Signed-off-by: Johan Hovold <johan@kernel.org>
2014-09-25 12:18:11 +02:00
Peter Hurley
cc952e7017 tty: Fix width of unsigned long bitfield padding
Commit c545b66c69,
'tty: Serialize tcflow() with other tty flow control changes' and
commit 99416322dd,
'tty: Workaround Alpha non-atomic byte storage in tty_struct' work around
compiler bugs and non-atomic storage on multiple arches by padding
bitfields out to the declared type which is unsigned long. However, the
width varies by arch.

Pad bitfields to actual width of unsigned long (which is BITS_PER_LONG).

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-25 12:17:25 +02:00
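Padding a bitfield group out to the full width of unsigned long, as the tty commit above describes, might look like the sketch below. The struct and its field names are hypothetical, BITS_PER_LONG is redefined here as a userspace stand-in for the kernel macro, and bit-fields of type unsigned long rely on compiler support (GCC and Clang accept them).

```c
#include <limits.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical flag group: three 1-bit flags plus explicit padding. */
struct flow_flags {
    unsigned long stopped:1;
    unsigned long flow_stopped:1;
    unsigned long hw_stopped:1;
    unsigned long unused:BITS_PER_LONG - 3;  /* fill the rest of the word */
};
```

Because the padding width is derived from the type rather than hard-coded, the group occupies exactly one unsigned long on both 32-bit and 64-bit targets.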
Frank Praznik
981c5b4a3b HID: sony: Update the DualShock 4 touchpad resolution
The DualShock 4 touchpad has been measured to have a resolution of
44.86 dots/mm which equates to 1920x942.

Signed-off-by: Frank Praznik <frank.praznik@oh.rr.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-09-25 11:23:26 +02:00
Sjoerd Simons
508423bebc ARM: exynos_defconfig: enable USB gadget support
Enable USB gadget support without support for any specific gadgets to
more easily catch cases where a device's dts doesn't specify the USB
controller's dr_mode when it should.

Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2014-09-25 18:20:18 +09:00
Lorenzo Pieralisi
d2e5c871ed drivers: cpuidle: initialize big.LITTLE driver through DT
With the introduction of DT based idle states, CPUidle drivers for ARM
can now initialize idle states data through properties in the device tree.

This patch adds code to the big.LITTLE CPUidle driver to dynamically
initialize idle states data through the updated device tree source file.

Cc: Chander Kashyap <k.chander@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-09-25 10:52:21 +02:00
Lorenzo Pieralisi
3299b63de3 drivers: cpuidle: CPU idle ARM64 driver
This patch implements a generic CPU idle driver for ARM64 machines.

It relies on the DT idle states infrastructure to initialize idle
states count and respective parameters. Current code assumes the driver
is managing idle states on all possible CPUs but can be easily
generalized to support heterogeneous systems and build cpumasks at
runtime using MIDRs or DT cpu nodes compatible properties.

The driver relies on the arm64 CPU operations to call the idle
initialization hook used to parse and save suspend back-end specific
idle states information upon probing.

Idle state index 0 is always initialized as a simple wfi state, i.e. always
considered present and functional on all ARM64 platforms.

Idle state indices higher than 0 trigger idle state entry by calling
the cpu_suspend function, that triggers the suspend operation through
the CPU operations suspend back-end hook. cpu_suspend passes the idle
state index as a parameter so that the CPU operations suspend back-end
can retrieve the required idle state data by using the idle state
index to execute a look-up on its internal data structures.

Reviewed-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-09-25 10:52:21 +02:00
Lorenzo Pieralisi
9f14da3455 drivers: cpuidle: implement DT based idle states infrastructure
On most common ARM systems, the low-power states a CPU can be put into are
not discoverable in HW and require device tree bindings to describe
power down suspend operations and idle states parameters.

In order to enable DT based idle states and configure idle drivers, this
patch implements the bulk infrastructure required to parse the device tree
idle states bindings and initialize the corresponding CPUidle driver states
data.

The parsing API accepts a start index that defines the first idle state
that should be initialized by the parsing code in order to give new and
legacy driver flexibility over which states should be parsed using the
new DT mechanism.

The idle states node(s) is obtained from the phandle list of the first cpu
in the driver cpumask;  the kernel checks that the idle state node phandle
is the same for all CPUs in the driver cpumask before declaring the idle state
valid and starting to parse its content.

The idle state enter function pointer is initialized through DT match
structures passed in by the CPUidle driver, so that ARM legacy code can
cope with platform specific idle entry method based on compatible
string matching and the code used to initialize the enter function pointer
can be moved to the DT generic layer.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-09-25 10:52:20 +02:00
Jan Willeke
2a0a5b2299 s390/uprobes: architecture backend for uprobes
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:17 +02:00
Jan Willeke
975fab1739 s390/uprobes: common library for kprobes and uprobes
This patch moves common functions from kprobes.c to probes.c.
Thus it's possible for uprobes to use them without enabling kprobes.

Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:14 +02:00
Martin Schwidefsky
bbae71bf9c s390/rwlock: use the interlocked-access facility 1 instructions
Make use of the load-and-add, load-and-or and load-and-and instructions
to atomically update the read-write lock without a compare-and-swap loop.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:13 +02:00
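The difference between a compare-and-swap loop and a single atomic read-modify-write, which the rwlock commit above relies on, can be shown with portable C11 atomics. This is an analogue only: it does not use the s390 instructions, the function names are hypothetical, and whether a compiler maps atomic_fetch_add() to a load-and-add instruction depends on the target.

```c
#include <stdatomic.h>

/* Reader count update written as a compare-and-swap retry loop. */
static unsigned int add_reader_cas(atomic_uint *lock)
{
    unsigned int old = atomic_load(lock);

    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak(lock, &old, old + 1))
        ;
    return old;
}

/* The same update as one atomic fetch-and-add, with no retry loop. */
static unsigned int add_reader_faa(atomic_uint *lock)
{
    return atomic_fetch_add(lock, 1);
}
```

Both functions return the previous value of the lock word; the second cannot spin under contention because the update is a single operation.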
Martin Schwidefsky
94232a4332 s390/rwlock: improve writer fairness
Set the write-lock bit in the out-of-line rwlock code to indicate that
a writer is waiting. Additional readers will not be able to get the lock
until at least one writer got the lock. Additional writers have to wait
for the first writer to release the lock again.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:12 +02:00
Martin Schwidefsky
2684e73a86 s390/rwlock: remove interrupt-enabling rwlock variant.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:10 +02:00
Heiko Carstens
6a5c1482e2 s390/mm: remove change bit override support
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:09 +02:00
Heiko Carstens
70c9d29632 s390/vmemmap: remove memset call from vmemmap_populate()
If the vmemmap array gets filled with large pages we allocate those
pages with vmemmap_alloc_block(), which returns cleared pages.
Only for single 4k pages we call our own vmem_alloc_pages() which does
not return cleared pages. However we can also call vmemmap_alloc_block()
to allocate the 4k pages.
This way we can also make sure the vmemmap array is cleared after its
population.
Therefore we can remove the memset at the end of the function which
would clear the vmemmap array a second time on machines which do
support EDAT1.

On very large configurations this can save us several seconds.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:07 +02:00
Christian Borntraeger
b881dcfbf7 s390/head.s: use zero as address for stfl
The architecture suggests using address 0 as the parameter for stfl,
to allow for future extensions. Using __LC_STFL_FAC_LIST (0x200)
shows which address is used, but might not be future-proof.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:06 +02:00
Martin Schwidefsky
d59b93da5e s390/rwlock: use directed yield for write-locked rwlocks
Add an owner field to the arch_rwlock_t to be able to pass the timeslice
of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock
is write-locked. The undirected yield in case the rwlock is acquired
writable but the lock is read-locked is removed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:05 +02:00
Ingo Tuchscherer
46b05c7bd5 s390/zcrypt: Fixed possible race condition in zcrypt module handling
Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:04 +02:00
Stefan Haberland
9fc98ad0d2 s390/tape: fix MTIOCGET ioctl to report blocksize
Remove tape_state from status register and report the drive's current
setting for block size instead as known from other tapes.
Density is not supported so nothing to report here.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:03 +02:00
Ralf Hoppe
8f933b1043 s390/hmcdrv: HMC drive CD/DVD access
This device driver allows accessing a HMC drive CD/DVD-ROM.
It can be used in a LPAR and z/VM environment.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ralf Hoppe <rhoppe@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:02 +02:00
Martin Schwidefsky
ea61a579ab s390/sclp: reduce dependency on event type masks
The event type masks can change asynchronously. These changes are reported
by SCLP to the OS by state-change events which are retrieved with the read
event data command. The SCLP driver has a request queue, there is a window
where the read event data request has not completed yet but the SCLP console
drivers are trying to queue output requests. As the masks are not updated
yet the requests are discarded.

The simplest fix is to queue the console requests independent of the
event type masks and rely on SCLP to return with an error code if a
specific event type is not available.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:01 +02:00
Ingo Tuchscherer
170387a887 s390/zcrypt: support for extended number of ap domains
Extends the number of ap domains within the zcrypt device driver up to 256.
AP domains in the range 00..255 will be detected.

Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:51:57 +02:00
Kevin Hilman
6baf6ee534 cpuidle: big.LITTLE: add Exynos5800 compatible string
Exynos 5800 is big.LITTLE SoC compatible with the 5420.  Add the
compatible string so this driver works on the 5800.

Tested on exynos5800-peach-pi (aka Samsung Chromebook2)

Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-09-25 10:51:19 +02:00
Daniel Lezcano
f4ea5332c8 Merge branch 'for-next/cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into cpuidle/3.18
These are the specific changes for ARM64 to make it possible to integrate the
DT based generic cpuidle driver in this tree.

It contains:
  * The documentation for the DT definitions for ARM
  * The refactoring of the cpu_suspend function for ARM64
  * Introduce the cpu_idle_init function for ARM64
  * Add the PSCI CPU SUSPEND based on the previous changes on cpu_suspend

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-09-25 10:47:25 +02:00
Johan Hedberg
565766b087 Bluetooth: Rename sco_param_wideband table to esco_param_msbc
The sco_param_wideband table represents the eSCO parameters for
specifically mSBC encoding. This patch renames the table to the more
descriptive esco_param_msbc name.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-09-25 10:35:08 +02:00
Uwe Kleine-König
e4742d5769 pinctrl: bcm281xx: make Kconfig dependency more strict
This driver is only useful on BCM281xx, so let the driver depend on
ARCH_BCM_MOBILE but allow compile coverage testing.
The main benefit is that the driver isn't available to be selected for
machines that don't have the matching hardware.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Sherman Yin <syin@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-09-25 09:53:39 +02:00
Uwe Kleine-König
7b31997a73 gpio: kona: enable only on BCM_MOBILE or for compile testing
This change makes it easier to configure a kernel for a real machine by
not showing the option to enable it at all if COMPILE_TEST is off.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Markus Mayer <mmayer@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-09-25 09:52:17 +02:00
Nikolaus Voss
e2e0897010 pwm: atmel: Fix calculation of prescale value
The prescale value used for calculating the period was incremented
afterwards, so the resulting prescale value was too high by one.
This resulted in a PWM frequency only half as high as requested.

This patch moves the 64 bit division out of the prescale loop to
correct the above issue and make the calculation more efficient.

Signed-off-by: Nikolaus Voss <n.voss@weinmann-emt.de>
Tested-by: Bo Shen <voice.shen@atmel.com>
Acked-by: Bo Shen <voice.shen@atmel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2014-09-25 08:52:39 +02:00
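One way to picture the corrected prescale calculation from the pwm commit above is the sketch below: the 64-bit division is done once, and the cycle count is then halved until it fits the period register. The register width (16 bits), the maximum prescaler exponent (10), and every name here are assumptions for illustration, not values taken from the driver.

```c
#include <stdint.h>

#define NSEC_PER_SEC   UINT64_C(1000000000)
#define MAX_PERIOD_CNT 0xffffu  /* assumed 16-bit period register */
#define MAX_PRESCALE   10u      /* assumed highest power-of-two divider exponent */

/*
 * Compute the period counter value and prescaler exponent for a requested
 * period. Returns 0 on success, -1 if the period does not fit.
 * Assumes clk_hz * period_ns does not overflow 64 bits.
 */
static int pwm_calc(uint64_t clk_hz, uint64_t period_ns,
                    uint32_t *period_cnt, uint32_t *prescale)
{
    /* One division, outside the loop. */
    uint64_t cycles = clk_hz * period_ns / NSEC_PER_SEC;
    uint32_t pres = 0;

    /* Each prescaler step halves the counting clock. */
    while (cycles > MAX_PERIOD_CNT) {
        cycles >>= 1;
        pres++;
    }
    if (pres > MAX_PRESCALE)
        return -1;

    *period_cnt = (uint32_t)cycles;
    *prescale = pres;
    return 0;
}
```

For a 100 MHz clock and a 1 ms period, the raw count of 100000 cycles is halved once, giving a count of 50000 with prescale 1, which is again 1 ms at the divided clock.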
NeilBrown
e87b4c7a7a NFS: don't use STABLE writes during writeback.
commit b31268ac79
  FS: Use stable writes when not doing a bulk flush

was a bit heavy handed.
The particular problem that led to this patch was that
small writes to an O_SYNC file were being written as UNSTABLE writes
followed by a commit.
This is appropriate for large writes (which require multiple NFS
requests) but for small writes (single NFS request), using
NFS_FILE_SYNC is more efficient.

So that patch causes the code to select between the two methods
depending on how many nfs requests get generated.

Unfortunately this ends up applying to non O_SYNC writes as well.
In particular if you memory-map a file and update random pages, then
when they are eventually written out by writeback they will go as
NFS_FILE_SYNC.  This is inefficient and slows down the application.

So: only set FLUSH_COND_STABLE when wbc->sync_mode is WB_SYNC_ALL.
With this patch:
 O_SYNC writes are NFS_FILE_SYNC for single requests, and NFS_UNSTABLE
    followed by COMMIT for multiple requests
 Writes immediately before a close or fsync follow the same pattern.
 Non-O_SYNC writes without an fsync or close eventually get flushed
 out as UNSTABLE and a commit follows eventually as appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:23:02 -04:00
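The selection rule described in the NFS writeback commit above can be modelled as a small decision function. This is a simplified sketch: WB_SYNC_ALL and WB_SYNC_NONE are real kernel names, but the enums are redeclared here for a standalone example, and `write_kind` and `choose_write` are hypothetical.

```c
/* Standalone stand-ins for the kernel's writeback sync modes. */
enum wb_sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical result type: how the data is sent to the server. */
enum write_kind {
    WRITE_UNSTABLE,   /* UNSTABLE write, COMMIT follows later */
    WRITE_FILE_SYNC   /* single stable write, no COMMIT needed */
};

/*
 * Conditional stability is only considered for data-integrity writeback
 * (WB_SYNC_ALL). Within that mode, a single request goes out as a stable
 * write and multiple requests go out unstable with a later COMMIT.
 */
static enum write_kind choose_write(enum wb_sync_mode mode,
                                    unsigned int nr_requests)
{
    int cond_stable = (mode == WB_SYNC_ALL);

    if (cond_stable && nr_requests == 1)
        return WRITE_FILE_SYNC;
    return WRITE_UNSTABLE;
}
```

Background writeback of a single dirty page therefore stays UNSTABLE, which is the case the commit message identifies as having been slowed down.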
NeilBrown
8478eaa16e NFSv4: use exponential retry on NFS4ERR_DELAY for async requests.
Currently synchronous NFSv4 requests are retried with an
exponential timeout (from 1/10 to 15 seconds), but async
requests always use a 15-second retry.

Some "async" requests are really synchronous though.  The
async mechanism is used to allow the request to continue if
the requesting process is killed.
In those cases, an exponential retry is appropriate.

For example, if two different clients both open a file and
get a READ delegation, and one client then unlinks the file
(while still holding an open file descriptor), that unlink
will use the "silly-rename" handling which is async.
The first rename will result in NFS4ERR_DELAY while the
delegation is reclaimed from the other client.  The rename
will not be retried for 15 seconds, causing an unlink to take
15 seconds rather than 100msec.

This patch only added exponential timeout for async unlink and
async rename.  Other async calls, such as 'close' are sometimes
waited for so they might benefit from exponential timeout too.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:22:47 -04:00
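An exponential retry bounded by the 1/10 second and 15 second limits quoted in the commit above could be sketched like this. Delays are expressed in milliseconds for readability; the macro and function names are hypothetical and this is not the NFS client's implementation.

```c
#define RETRY_MIN_MS 100u    /* 1/10 second */
#define RETRY_MAX_MS 15000u  /* 15 seconds */

/*
 * Return the delay to use for this attempt and advance *next for the
 * following one. A zero *next means "first attempt".
 */
static unsigned int next_retry_delay(unsigned int *next)
{
    unsigned int delay;

    if (*next == 0)
        *next = RETRY_MIN_MS;
    if (*next > RETRY_MAX_MS)
        *next = RETRY_MAX_MS;
    delay = *next;
    *next *= 2;  /* double for the following attempt */
    return delay;
}
```

Starting from zero state, successive calls yield 100, 200, 400 and so on, reaching the 15000 ms cap on the ninth call and staying there.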
Jason Baron
3dedbb5ca1 rpc: Add -EPERM processing for xs_udp_send_request()
If an iptables drop rule is added for an nfs server, the client can end up in
a softlockup. Because of the way that xs_sendpages() is structured, the -EPERM
is ignored since the prior bits of the packet may have been successfully queued
and thus xs_sendpages() returns a non-zero value. Then, xs_udp_send_request()
thinks that because some bits were queued it should return -EAGAIN. We then try
the request again and again, resulting in cpu spinning. Reproducer:

1) open a file on the nfs server '/nfs/foo' (mounted using udp)
2) iptables -A OUTPUT -d <nfs server ip> -j DROP
3) write to /nfs/foo
4) close /nfs/foo
5) iptables -D OUTPUT -d <nfs server ip> -j DROP

The softlockup occurs in step 4 above.

The previous patch allows xs_sendpages() to return both a sent count and
any error values that may have occurred. Thus, if we get an -EPERM, return
that to the higher level code.

With this patch in place we can successfully abort the above sequence and
avoid the softlockup.

I also tried the above test case on an nfs mount on tcp and although the system
does not softlockup, I still ended up with the 'hung_task' firing after 120
seconds, due to the i/o being stuck. The tcp case appears a bit harder to fix,
since -EPERM appears to get ignored much lower down in the stack and does not
propagate up to xs_sendpages(). This case is not quite as insidious as the
softlockup and it is not addressed here.

Reported-by: Yigong Lou <ylou@akamai.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:13:46 -04:00
Jason Baron
f279cd008f rpc: return sent and err from xs_sendpages()
If an error is returned after the first bits of a packet have already been
successfully queued, xs_sendpages() will return a positive 'int' value
indicating success. Callers seem to treat this as -EAGAIN.

However, there are cases where it's not a question of waiting for the write
queue to drain. For example, when there is an iptables rule dropping packets
to the destination, the lower level code can return -EPERM only after parts
of the packet have been successfully queued. In this case, we can end up
continuously retrying resulting in a kernel softlockup.

This patch is intended to make no changes in behavior but is in preparation for
subsequent patches that can make decisions based on both the number of bytes
sent by xs_sendpages() and any errors that may have been returned.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:13:37 -04:00
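The idea behind the two rpc commits above, reporting the byte count and the error separately so that a partial send cannot hide a hard error, can be sketched as follows. Both functions are hypothetical models: `send_pages` simulates a sender that queues at most `limit` bytes, and neither signature is the real xs_sendpages() or xs_udp_send_request().

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical sender: queues at most `limit` of `len` bytes. The number of
 * bytes queued is reported through *sent; the return value is 0 on a full
 * send, or `short_err` (a negative errno, possibly 0) on a short one.
 */
static int send_pages(size_t len, size_t limit, int short_err, size_t *sent)
{
    *sent = len < limit ? len : limit;
    return *sent < len ? short_err : 0;
}

/*
 * Hypothetical caller: a hard error wins over the partial byte count, and
 * only a short send without such an error is turned into a retry.
 */
static int send_request(size_t len, size_t limit, int short_err)
{
    size_t sent = 0;
    int err = send_pages(len, limit, short_err, &sent);

    if (err == -EPERM)
        return err;      /* propagate: retrying would never succeed */
    if (sent < len)
        return -EAGAIN;  /* partial send: ask the caller to retry */
    return 0;
}
```

If only a positive byte count were returned for the partial case, the caller would see 40 of 100 bytes queued, conclude it should retry, and never learn about the -EPERM.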
Benjamin Coddington
173b3afcee lockd: Try to reconnect if statd has moved
If rpc.statd is restarted, upcalls to monitor hosts can fail with
ECONNREFUSED.  In that case force a lookup of statd's new port and retry the
upcall.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:08:43 -04:00
Benjamin Coddington
a743419f42 SUNRPC: Don't wake tasks during connection abort
When aborting a connection to preserve source ports, don't wake the task in
xs_error_report.  This allows tasks with RPC_TASK_SOFTCONN to succeed if the
connection needs to be re-established since it preserves the task's status
instead of setting it to the status of the aborting kernel_connect().

This may also avoid a potential conflict on the socket's lock.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:06:56 -04:00
Olga Kornievskaia
8faaa6d5d4 Fixing lease renewal
Commit c9fdeb28 removed a 'continue' after checking if the lease needs
to be renewed. However, if client hasn't moved, the code falls down to
starting reboot recovery erroneously (ie., sends open reclaim and gets
back stale_clientid error) before recovering from getting stale_clientid
on the renew operation.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: c9fdeb280b (NFS: Add basic migration support to state manager thread)
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:03:15 -04:00
Fabian Frederick
2f3169fb18 nfs: fix duplicate proc entries
Commit 65b38851a1
("NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes")

updated the following function:
static int nfs_volume_list_open(struct inode *inode, struct file *file)

it used &nfs_server_list_ops instead of &nfs_volume_list_ops
which means cat /proc/fs/nfsfs/volumes = /proc/fs/nfsfs/servers

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Fixes: 65b38851a1 (NFS: Fix /proc/fs/nfsfs/servers and...)
Cc: stable@vger.kernel.org # 3.4.x+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-24 23:00:18 -04:00