Commit Graph

506942 Commits

Author SHA1 Message Date
Lucas Stach
2801f7252d PCI: Add missing DT binding for "linux,pci-domain" property
41e5c0f81d ("of/pci: Add pci_get_new_domain_nr() and
of_get_pci_domain_nr()") added parsing of the "linux,pci-domain" property,
but didn't add the binding documentation.

Since this property will be supported by a number of host bridge drivers,
add it to the common PCI binding doc.

Fixes: 41e5c0f81d ("of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
2014-11-13 15:43:41 -07:00
Dave Airlie
bcfef97398 Merge tag 'drm/tegra/for-3.18-rc5' of git://people.freedesktop.org/~tagr/linux into drm-fixes
drm/tegra: Fixes for v3.18-rc5

This is a single patch that fixes the VBLANK machinery after:

	7ffd7a6851 drm: Always reject drm_vblank_get() after drm_vblank_off()

* tag 'drm/tegra/for-3.18-rc5' of git://people.freedesktop.org/~tagr/linux:
  drm/tegra: dc: Add missing call to drm_vblank_on()
2014-11-14 07:03:02 +10:00
Dave Airlie
3d0f8536cd Merge branch 'linux-3.18' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
One modesetting, one gk20a fix.

* 'linux-3.18' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau/nv50/disp: Fix modeset on G94
  drm/gk20a/fb: fix setting of large page size bit
2014-11-14 06:24:50 +10:00
Eric Dumazet
d649a7a81f tcp: limit GSO packets to half cwnd
In DC world, GSO packets initially cooked by tcp_sendmsg() are usually
big, as sk_pacing_rate is high.

When network is congested, cwnd can be smaller than the GSO packets
found in socket write queue. tcp_write_xmit() splits GSO packets
using the available cwnd, and we end up sending a single GSO packet,
consuming all available cwnd.

With GRO aggregation on the receiver, we might handle a single GRO
packet, sending back a single ACK.

1) This single ACK might be lost
   TLP or RTO are forced to attempt a retransmit.
2) This ACK releases a full cwnd, sender sends another big GSO packet,
   in a ping pong mode.

This behavior does not fill the pipes in the best way, because of
scheduling artifacts.

Make sure we always have at least two GSO packets in flight.

This allows us to safely increase GRO efficiency without risking
spurious retransmits.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:21:44 -05:00
Dave Airlie
64e5fcc68b Merge tag 'drm-intel-fixes-2014-11-13' of git://anongit.freedesktop.org/drm-intel into drm-fixes
one regression fix.

* tag 'drm-intel-fixes-2014-11-13' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Fix obj->map_and_fenceable across tiling changes
2014-11-14 06:21:40 +10:00
Marcelo Leitner
19ca9fc144 vxlan: Do not reuse sockets for a different address family
Currently, we only match against local port number in order to reuse
socket. But if this new vxlan wants an IPv6 socket and a IPv4 one bound
to that port, vxlan will reuse an IPv4 socket as IPv6 and a panic will
follow. The following steps reproduce it:

   # ip link add vxlan6 type vxlan id 42 group 229.10.10.10 \
       srcport 5000 6000 dev eth0
   # ip link add vxlan7 type vxlan id 43 group ff0e::110 \
       srcport 5000 6000 dev eth0
   # ip link set vxlan6 up
   # ip link set vxlan7 up
   <panic>

[    4.187481] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
...
[    4.188076] Call Trace:
[    4.188085]  [<ffffffff81667c4a>] ? ipv6_sock_mc_join+0x3a/0x630
[    4.188098]  [<ffffffffa05a6ad6>] vxlan_igmp_join+0x66/0xd0 [vxlan]
[    4.188113]  [<ffffffff810a3430>] process_one_work+0x220/0x710
[    4.188125]  [<ffffffff810a33c4>] ? process_one_work+0x1b4/0x710
[    4.188138]  [<ffffffff810a3a3b>] worker_thread+0x11b/0x3a0
[    4.188149]  [<ffffffff810a3920>] ? process_one_work+0x710/0x710

So address family must also match in order to reuse a socket.

Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:19:59 -05:00
Thomas Graf
6eba82248e rhashtable: Drop gfp_flags arg in insert/remove functions
Reallocation is only required for shrinking and expanding and both rely
on a mutex for synchronization and callers of rhashtable_init() are in
non atomic context. Therefore, no reason to continue passing allocation
hints through the API.

Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow
for silent fall back to vzalloc() without the OOM killer jumping in as
pointed out by Eric Dumazet and Eric W. Biederman.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:18:40 -05:00
David S. Miller
64bb7e9949 Merge branch 'mlx4-next'
Or Gerlitz says:

====================
mlx4: Flexible (asymmetric) allocation of EQs and MSI-X vectors

This series from Matan Barak is built as follows:

The 1st two patches fix small bugs w.r.t firmware spec. Next
are two patches which do more re-factoring of the init/fini flow
and a patch that adds support for the QUERY_FUNC firmware command,
these are all pre-steps for the major patch of the series. In this
patch (#6) we change the order of talking/querying the firmware
and enabling SRIOV. This allows to remote worst-case assumption
w.r.t the number of available MSI-X vectors and EQs per function.

The last patch easily enjoys this ordering change, to enable
supports > 64 VFs over a firmware that allows for that.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:28 -05:00
Matan Barak
de966c5928 net/mlx4_core: Support more than 64 VFs
We now allow up to 126 VFs. Note though that certain firmware
versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs.
In these cases, we limit the maximum number of VFs to 64.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:22 -05:00
Matan Barak
7ae0e400cd net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs
Previously, the driver queried the firmware in order to get the number
of supported EQs. Under SRIOV, since this was done before the driver
notified the firmware how many VFs it actually needs, the firmware had
to take into account a worst case scenario and always allocated four EQs
per VF, where one was used for events while the others were used for completions.

Now, when the firmware supports the asymmetric allocation scheme, denoted
by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the
QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we
can get more EQs and MSI-X vectors per function.

Moreover, when running in the new firmware/driver mode, the limitation
that the number of EQs should be a power of two is lifted.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:21 -05:00
Matan Barak
e8c4265bea net/mlx4_core: Add QUERY_FUNC firmware command
QUERY_FUNC firmware command could be used in order to query the
number of EQs, reserved EQs, etc for a specific function.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:19 -05:00
Matan Barak
a0eacca948 net/mlx4_core: Refactor mlx4_load_one
Refactor mlx4_load_one, as a preparation step for a new and
more complicated load function. The goal is to support both
newer firmware that required init_hca to be done before
enable_sriov and legacy firmwares that requires things to
be done the other way around.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:18 -05:00
Matan Barak
ffc39f6d6f net/mlx4_core: Refactor mlx4_cmd_init and mlx4_cmd_cleanup
Refactoring mlx4_cmd_init and mlx4_cmd_cleanup such that partial init
and cleanup are possible. After this refactoring, calling mlx4_cmd_init
several times is safe.

This is necessary in the VF init flow when mlx4_init_hca returns -EACCESS,
we need to issue cleanup and re-attempt to call it with the slave flag.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:17 -05:00
Matan Barak
225c6c8c6b net/mlx4_core: Use correct variable type for mlx4_slave_cap
We've used an incorrect type for the loop counter and the
mlx4_QUERY_FUNC_CAP function. The current input modifier
is either a port or a boolean.
Since the number of ports is always a positive value < 255,
we should use u8 instead of an integer with casting.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:16 -05:00
Matan Barak
7c68dd435b net/mlx4_core: Fix wrong reading of reserved_eqs
We mistakenly read the reserved_eqs field as a standard
numeric value rather than a log2 value.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:15 -05:00
David S. Miller
9cf5476bfd Merge branch 'rhash_prove_locking'
Herbert Xu says:

====================
rhashtable: Allow local locks to be used and tested

This series moves mutex_is_held entirely under PROVE_LOCKING so
there is zero foot print when we're not debugging.  More importantly
it adds a parrent argument to mutex_is_held so that we can test
local locks rather than global ones (e.g., per-namespace locks).
====================

Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:13 -05:00
Herbert Xu
7b4ce23534 rhashtable: Add parent argument to mutex_is_held
Currently mutex_is_held can only test locks in the that are global
since it takes no arguments.  This prevents rhashtable from being
used in places where locks are lock, e.g., per-namespace locks.

This patch adds a parent field to mutex_is_held and rhashtable_params
so that local locks can be used (and tested).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
Herbert Xu
1b2f309d70 rhashtable: Move mutex_is_held under PROVE_LOCKING
The rhashtable function mutex_is_held is only used when PROVE_LOCKING
is enabled.  This patch makes the mutex_is_held field in rhashtable
optional depending on PROVE_LOCKING.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
Herbert Xu
1f501d6252 netfilter: Move mutex_is_held under PROVE_LOCKING
The rhashtable function mutex_is_held is only used when PROVE_LOCKING
is enabled.  This patch modifies netfilter so that we can rhashtable.h
itself can later make mutex_is_held optional depending on PROVE_LOCKING.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
Herbert Xu
9712756620 netlink: Move mutex_is_held under PROVE_LOCKING
The rhashtable function mutex_is_held is only used when PROVE_LOCKING
is enabled.  This patch modifies netlink so that we can rhashtable.h
itself can later make mutex_is_held optional depending on PROVE_LOCKING.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
Enric Balletbo i Serra
ccf899a27c smsc911x: power-up phydev before doing a software reset.
With commit be9dad1f9f ("net: phy: suspend phydev when going
to HALTED"), the PHY device will be put in a low-power mode using
BMCR_PDOWN if the the interface is set down. The smsc911x driver does
a software_reset opening the device driver (ndo_open). In such case,
the PHY must be powered-up before access to any register and before
calling the software_reset function. Otherwise, as the PHY is powered
down the software reset fails and the interface can not be enabled
again.

This patch fixes this scenario that is easy to reproduce setting down
the network interface and setting up again.

    $ ifconfig eth0 down
    $ ifconfig eth0 up
    ifconfig: SIOCSIFFLAGS: Input/output error

Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:09:28 -05:00
Hisashi Nakamura
9488e1e5b3 net: sh_eth: Add r8a7793 support
The device tree probing for R-Car M2N (r8a7793) is added.

Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:03:53 -05:00
Hisashi Nakamura
966d6dbb6b net: sh_eth: Add RMII mode setting in probe
When using RMMI mode, it is necessary to change in probe.

Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:03:53 -05:00
Michal Kubeček
fbe168ba91 net: generic dev_disable_lro() stacked device handling
Large receive offloading is known to cause problems if received packets
are passed to other host. Therefore the kernel disables it by calling
dev_disable_lro() whenever a network device is enslaved in a bridge or
forwarding is enabled for it (or globally). For virtual devices we need
to disable LRO on the underlying physical device (which is actually
receiving the packets).

Current dev_disable_lro() code handles this  propagation for a vlan
(including 802.1ad nested vlan), macvlan or a vlan on top of a macvlan.
It doesn't handle other stacked devices and their combinations, in
particular propagation from a bond to its slaves which often causes
problems in virtualization setups.

As we now have generic data structures describing the upper-lower device
relationship, dev_disable_lro() can be generalized to disable LRO also
for all lower devices (if any) once it is disabled for the device
itself.

For bonding and teaming devices, it is necessary to disable LRO not only
on current slaves at the moment when dev_disable_lro() is called but
also on any slave (port) added later.

v2: use lower device links for all devices (including vlan and macvlan)

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:48:56 -05:00
Dan Carpenter
b6267d3e80 amd-xgbe: fix ->rss_hash_type
There was a missing break statement so we set everything to
PKT_HASH_TYPE_L3 even when we intended to use PKT_HASH_TYPE_L4.

Fixes: 5b9dfe299e ('amd-xgbe: Provide support for receive side scaling')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:39:54 -05:00
Herbert Xu
0c828f2f83 lib: rhashtable - Remove weird non-ASCII characters from comments
My editor spewed garbage that looked like memory corruption on
my screen.  It turns out that a number of occurences of "fi" got
turned into a ligature.

This patch replaces these ligatures with the ASCII letters "fi".

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:38:46 -05:00
Alexander Kochetkov
6ff53fd371 net/smsc911x: Fix delays in the PHY enable/disable routines
Increased delay in the smsc911x_phy_disable_energy_detect (from 1ms to 2ms).
Dropped delays in the smsc911x_phy_enable_energy_detect (100ms and 1ms).

The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).

I saw problems with soft reset due to wrong udelay timings.
After I fixed udelay, I measured the time needed to bring integrated PHY
from power-down to operational mode (the time beetween clearing EDPWRDOWN
bit and soft reset complete event). I got 1ms (measured using ktime_get).
The value is equal to the current value (1ms) used in the
smsc911x_phy_disable_energy_detect. It is near the upper bound and in order
to avoid rare soft reset faults it is doubled (2ms).

I don't know official timing for bringing up integrated PHY as specs doesn't
clarify this (or may be I didn't found).

It looks safe to drop delays before and after setting EDPWRDOWN bit
(enable PHY power-down mode). I didn't saw any regressions with the patch.

The patch was reviewed by Steve Glendinning and Microchip Team.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:37:53 -05:00
Alexander Kochetkov
242bcd5ba1 net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).

It is possible that PHY could enter power-down mode (ENERGYON clear),
between ENERGYON bit check in smsc911x_phy_disable_energy_detect and SRST
bit set in smsc911x_soft_reset. This could happen, for example, if someone
disconnect ethernet cable between the checks. The PHY in a power-down mode
would prevent the MAC portion of chip to be software reseted.

Initially found by code review, confirmed later using test case.

This is low probability issue, and in order to reproduce it you have to
run the script:

while true; do
	ifconfig eth0 down
	ifconfig eth0 up || break
done

While the script is running you have to plug/unplug ethernet cable many
times (using gpio controlled ethernet switch, for example) until get:

[ 4516.477783] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.512207] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.524658] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.559082] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.571990] ADDRCONF(NETDEV_UP): eth0: link is not ready
ifconfig: SIOCSIFFLAGS: Input/output error

The patch was reviewed by Steve Glendinning and Microchip Team.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:37:53 -05:00
Anish Bhatt
d7990b0c34 cxgb4i/cxgb4 : Refactor macros to conform to uniform standards
Refactored all macros used in cxgb4i as part of previously started cxgb4 macro
names cleanup. Makes them more uniform and avoids namespace collision.
Minor changes in other drivers where required as some of these macros are used
 by multiple drivers, affected drivers are iw_cxgb4, cxgb4(vf) & csiostor

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:36:22 -05:00
Jason Wang
8c847d2541 tun: fix issues of iovec iterators using in tun_put_user()
This patch fixes two issues after using iovec iterators:
- vlan_offset should be initialized to zero, otherwise unexpected offset
  will be used in skb_copy_datagram_iter()
- advance iovec iterator when vnet_hdr_sz is greater than sizeof(gso), this
  is the case when mergeable rx buffer were enabled for a virt guest.

Fixes e0b46d0ee9 ("tun: Use iovec iterators")
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:33:22 -05:00
Thomas Graf
882288c05e FOU: Fix no return statement warning for !CONFIG_NET_FOU_IP_TUNNELS
net/ipv4/fou.c: In function ‘ip_tunnel_encap_del_fou_ops’:
net/ipv4/fou.c:861:1: warning: no return statement in function returning non-void [-Wreturn-type]

Fixes: a8c5f90fb5 ("ip_tunnel: Ops registration for secondary encap (fou, gue)")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 14:32:00 -05:00
Ilya Dryomov
cc9f1f518c libceph: change from BUG to WARN for __remove_osd() asserts
No reason to use BUG_ON for osd request list assertions.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2014-11-13 22:26:34 +03:00
Ilya Dryomov
ba9d114ec5 libceph: clear r_req_lru_item in __unregister_linger_request()
kick_requests() can put linger requests on the notarget list.  This
means we need to clear the much-overloaded req->r_req_lru_item in
__unregister_linger_request() as well, or we get an assertion failure
in ceph_osdc_release_request() - !list_empty(&req->r_req_lru_item).

AFAICT the assumption was that registered linger requests cannot be on
any of req->r_req_lru_item lists, but that's clearly not the case.

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2014-11-13 22:21:14 +03:00
Ilya Dryomov
a390de0208 libceph: unlink from o_linger_requests when clearing r_osd
Requests have to be unlinked from both osd->o_requests (normal
requests) and osd->o_linger_requests (linger requests) lists when
clearing req->r_osd.  Otherwise __unregister_linger_request() gets
confused and we trip over a !list_empty(&osd->o_linger_requests)
assert in __remove_osd().

MON=1 OSD=1:

    # cat remove-osd.sh
    #!/bin/bash
    rbd create --size 1 test
    DEV=$(rbd map test)
    ceph osd out 0
    sleep 3
    rbd map dne/dne # obtain a new osdmap as a side effect
    rbd unmap $DEV & # will block
    sleep 3
    ceph osd in 0

Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2014-11-13 22:21:13 +03:00
Ilya Dryomov
aaef31703a libceph: do not crash on large auth tickets
Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:

[   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[   28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.686032] PGD 0
[   28.688088] Oops: 0000 [#1] PREEMPT SMP
[   28.688088] Modules linked in:
[   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   28.688088] Workqueue: ceph-msgr con_work
[   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[   28.688088] RIP: 0010:[<ffffffff81392b42>]  [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
[   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[   28.688088] Stack:
[   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[   28.688088] Call Trace:
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[   28.688088]  [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
[   28.688088]  [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
[   28.688088]  [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
[   28.688088]  [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
[   28.688088]  [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
[   28.688088]  [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
[   28.688088]  [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
[   28.688088]  [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
[   28.688088]  [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
[   28.688088]  [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
[   28.688088]  [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
[   28.688088]  [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
[   28.688088]  [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
[   28.688088]  [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
[   28.688088]  [<ffffffff81559289>] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed.  Fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-11-13 22:21:12 +03:00
Yan, Zheng
3231300bb9 ceph: fix flush tid comparision
TID of cap flush ack is 64 bits, but ceph_inode_info::flushing_cap_tid
is only 16 bits. 16 bits should be plenty to let the cap flush updates
pipeline appropriately, but we need to cast in the proper direction when
comparing these differently-sized versions. So downcast the 64-bits one
to 16 bits.

Reflects ceph.git commit a5184cf46a6e867287e24aeb731634828467cd98.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-11-13 22:19:05 +03:00
Greg Kroah-Hartman
2aea83a484 Merge tag 'fixes-for-v3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus
Felipe writes:

usb: fixes for v3.18-rc5

Just one fix here on dwc3 which fixes a minor
bug caused by a fix that went in this v3.18-rc
cycle.

The corner case is minimal as it can only be
reproduced with back-to-back Setup transfers
(without starting data or status phase) by
means of a LeCroy USB Trainer, where we can
generate USB packets any way we like.

Signed-off-by: Felipe Balbi <balbi@ti.com>
2014-11-13 11:12:56 -08:00
Thierry Reding
6b1c4d7674 PCI: tegra: Add Kconfig help text
Add a standard help text to the Kconfig entry for the Tegra PCIe host
controller driver.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-11-13 12:03:57 -07:00
Thierry Reding
4407308b73 PCI: tegra: Do not build on 64-bit ARM
32-bit and 64-bit ARM use very different infrastructure to register a PCI
host bridge.  The Tegra PCIe host controller driver currently only supports
the 32-bit ARM infrastructure, so prevent it from being built on 64-bit ARM
where it will break.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-11-13 12:03:28 -07:00
Paul E. McKenney
9ea6c58856 Merge branches 'torture.2014.11.03a', 'cpu.2014.11.03a', 'doc.2014.11.13a', 'fixes.2014.11.13a', 'signal.2014.10.29a' and 'rt.2014.10.29a' into HEAD
cpu.2014.11.03a: Changes for per-CPU variables.
doc.2014.11.13a: Documentation updates.
fixes.2014.11.13a: Miscellaneous fixes.
signal.2014.10.29a: Signal changes.
rt.2014.10.29a: Real-time changes.
torture.2014.11.03a: torture-test changes.
2014-11-13 10:39:04 -08:00
Paul E. McKenney
60ced4950c rcu: Fix FIXME in rcu_tasks_kthread()
This commit affines rcu_tasks_kthread() to the housekeeping CPUs
in CONFIG_NO_HZ_FULL builds.  This is just a default, so systems
administrators are free to put this kthread somewhere else if they wish.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:41 -08:00
Oleg Nesterov
ce36f2f3eb rcu: More info about potential deadlocks with rcu_read_unlock()
The comment above rcu_read_unlock() explains the potential deadlock
if the caller holds one of the locks taken by rt_mutex_unlock() paths,
but it is not clear from this documentation that any lock which can
be taken from interrupt can lead to deadlock as well and we need to
take rt_mutex_lock() into account too.

The problem is that rt_mutex_lock() takes wait_lock without disabling
irqs, and thus an interrupt taking some LOCK can obviously race with
rcu_read_unlock_special() called with the same LOCK held.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:40 -08:00
Paul E. McKenney
b6331ae8af rcu: Optimize cond_resched_rcu_qs()
The current implementation of cond_resched_rcu_qs() can invoke
rcu_note_voluntary_context_switch() twice in the should_resched()
case, once via the call to __schedule() and once directly.  However, as
noted by Joe Lawrence in a patch to the team subsystem, cond_resched()
returns an indication as to whether or not the call to __schedule()
actually happened.  This commit therefore changes cond_resched_rcu_qs()
so as to invoke rcu_note_voluntary_context_switch() only when __schedule()
was not called.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:39 -08:00
Pranith Kumar
1a6c9b2675 rcu: Add sparse check for RCU_INIT_POINTER()
Add a sparse check when RCU_INIT_POINTER() is used to assign a non __rcu
annotated pointer.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:35:38 -08:00
Pranith Kumar
8ab8b3e183 documentation: memory-barriers.txt: Correct example for reorderings
Correct the example of memory orderings in memory-barriers.txt

Commit 615cc2c9cf "Documentation/memory-barriers.txt: fix important typo re
memory barriers" changed the assignment to x and y. Change the rest of the
example to match this change.

Reported-by: Ganesh Rapolu <ganesh.rapolu@hotmail.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:34:55 -08:00
Paul E. McKenney
1f7870dd87 documentation: Add atomic_long_t to atomic_ops.txt
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-13 10:34:54 -08:00
Paul E. McKenney
8b19d1dead documentation: Additional restriction for control dependencies
Short-circuit booleans are not defences against compilers breaking
your intended control dependencies.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-13 10:34:53 -08:00
Pranith Kumar
74860feed5 documentation: Document RCU self test boot params
Document the RCU self test boot parameters in kernel-parameters.txt.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:34:52 -08:00
Paul Mackerras
ad0eab9293 Fix thinko in iov_iter_single_seg_count
The branches of the if (i->type & ITER_BVEC) statement in
iov_iter_single_seg_count() are the wrong way around; if ITER_BVEC is
clear then we use i->bvec, when we should be using i->iov.  This fixes
it.

In my case, the symptom that this caused was that a KVM guest doing
filesystem operations on a virtual disk would result in one of qemu's
threads on the host going into an infinite loop in
generic_perform_write().  The loop would hit the copied == 0 case and
call iov_iter_single_seg_count() to reduce the number of bytes to try
to process, but because of the error, iov_iter_single_seg_count()
would just return i->count and the loop made no progress and continued
forever.

Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-13 13:28:55 -05:00
Jeff Layton
b3ecba0967 sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
Bruce reported that he was seeing the following BUG pop:

    BUG: sleeping function called from invalid context at mm/slab.c:2846
    in_atomic(): 0, irqs_disabled(): 0, pid: 4539, name: mount.nfs
    2 locks held by mount.nfs/4539:
    #0:  (nfs_clid_init_mutex){+.+.+.}, at: [<ffffffffa01c0a9a>] nfs4_discover_server_trunking+0x4a/0x2f0 [nfsv4]
    #1:  (rcu_read_lock){......}, at: [<ffffffffa00e3185>] gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
    Preemption disabled at:[<ffffffff81a4f082>] printk+0x4d/0x4f

    CPU: 3 PID: 4539 Comm: mount.nfs Not tainted 3.18.0-rc1-00013-g5b095e9 #3393
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    ffff880021499390 ffff8800381476a8 ffffffff81a534cf 0000000000000001
    0000000000000000 ffff8800381476c8 ffffffff81097854 00000000000000d0
    0000000000000018 ffff880038147718 ffffffff8118e4f3 0000000020479f00
    Call Trace:
    [<ffffffff81a534cf>] dump_stack+0x4f/0x7c
    [<ffffffff81097854>] __might_sleep+0x114/0x180
    [<ffffffff8118e4f3>] __kmalloc+0x1a3/0x280
    [<ffffffffa00e31d8>] gss_stringify_acceptor+0x58/0xb0 [auth_rpcgss]
    [<ffffffffa00e3185>] ? gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
    [<ffffffffa006b438>] rpcauth_stringify_acceptor+0x18/0x30 [sunrpc]
    [<ffffffffa01b0469>] nfs4_proc_setclientid+0x199/0x380 [nfsv4]
    [<ffffffffa01b04d0>] ? nfs4_proc_setclientid+0x200/0x380 [nfsv4]
    [<ffffffffa01bdf1a>] nfs40_discover_server_trunking+0xda/0x150 [nfsv4]
    [<ffffffffa01bde45>] ? nfs40_discover_server_trunking+0x5/0x150 [nfsv4]
    [<ffffffffa01c0acf>] nfs4_discover_server_trunking+0x7f/0x2f0 [nfsv4]
    [<ffffffffa01c8e24>] nfs4_init_client+0x104/0x2f0 [nfsv4]
    [<ffffffffa01539b4>] nfs_get_client+0x314/0x3f0 [nfs]
    [<ffffffffa0153780>] ? nfs_get_client+0xe0/0x3f0 [nfs]
    [<ffffffffa01c83aa>] nfs4_set_client+0x8a/0x110 [nfsv4]
    [<ffffffffa0069708>] ? __rpc_init_priority_wait_queue+0xa8/0xf0 [sunrpc]
    [<ffffffffa01c9b2f>] nfs4_create_server+0x12f/0x390 [nfsv4]
    [<ffffffffa01c1472>] nfs4_remote_mount+0x32/0x60 [nfsv4]
    [<ffffffff81196489>] mount_fs+0x39/0x1b0
    [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
    [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
    [<ffffffffa01c1396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
    [<ffffffffa01c1784>] nfs4_try_mount+0x44/0xc0 [nfsv4]
    [<ffffffffa01549b7>] ? get_nfs_version+0x27/0x90 [nfs]
    [<ffffffffa0161a2d>] nfs_fs_mount+0x47d/0xd60 [nfs]
    [<ffffffff81a59c5e>] ? mutex_unlock+0xe/0x10
    [<ffffffffa01606a0>] ? nfs_remount+0x430/0x430 [nfs]
    [<ffffffffa01609c0>] ? nfs_clone_super+0x140/0x140 [nfs]
    [<ffffffff81196489>] mount_fs+0x39/0x1b0
    [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
    [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
    [<ffffffff811b5830>] do_mount+0x210/0xbe0
    [<ffffffff811b54ca>] ? copy_mount_options+0x3a/0x160
    [<ffffffff811b651f>] SyS_mount+0x6f/0xb0
    [<ffffffff81a5c852>] system_call_fastpath+0x12/0x17

Sleeping under the rcu_read_lock is bad. This patch fixes it by dropping
the rcu_read_lock before doing the allocation and then reacquiring it
and redoing the dereference before doing the copy. If we find that the
string has somehow grown in the meantime, we'll reallocate and try again.

Cc: <stable@vger.kernel.org> # v3.17+
Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-11-13 13:15:49 -05:00