Commit Graph

42415 Commits

Author SHA1 Message Date
Jay Vosburgh
a77f9c5dcd Revert "fast_hash: avoid indirect function calls"
This reverts commit e5a2c89995.

	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
that selects between __jhash2 and __intel_crc4_2_hash based on the
X86_FEATURE_XMM4_2.

	Unfortunately, the alternative_call system does not appear to be
suitable for use with C functions, as register usage is not handled
properly for the called functions.  The __jhash2 function in particular
clobbers registers that are not preserved when called via
alternative_call, resulting in a panic for direct callers of
arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
__intel_crc4_2_hash works merely by chance because it uses fewer
registers.

	This commit was suggested as the source of the problem by Jesse
Gross <jesse@nicira.com>.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 16:36:25 -05:00
David S. Miller
076ce44825 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/chelsio/cxgb4vf/sge.c
	drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c

sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.

ixgbe_phy.c was a set of overlapping whitespace changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 01:01:12 -05:00
Linus Torvalds
5cf5203704 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) sunhme driver lacks DMA mapping error checks, based upon a report by
    Meelis Roos.

 2) Fix memory leak in mvpp2 driver, from Sudip Mukherjee.

 3) DMA memory allocation sizes are wrong in systemport ethernet driver,
    fix from Florian Fainelli.

 4) Fix use after free in mac80211 defragmentation code, from Johannes
    Berg.

 5) Some networking uapi headers missing from Kbuild file, from Stephen
    Hemminger.

 6) TUN driver gets csum_start offset wrong when VLAN accel is enabled,
    and macvtap has a similar bug, from Herbert Xu.

 7) Adjust several tunneling drivers to set dev->iflink after registry,
    because registry sets that to -1 overwriting whatever we did.  From
    Steffen Klassert.

 8) Geneve forgets to set inner tunneling type, causing GSO segmentation
    to fail on some NICs.  From Jesse Gross.

 9) Fix several locking bugs in stmmac driver, from Fabrice Gasnier and
    Giuseppe CAVALLARO.

10) Fix spurious timeouts with NewReno on low traffic connections, from
    Marcelo Leitner.

11) Fix descriptor updates in enic driver, from Govindarajulu
    Varadarajan.

12) PPP calls bpf_prog_create() with locks held, which isn't kosher.
    Fix from Takashi Iwai.

13) Fix NULL deref in SCTP with malformed INIT packets, from Daniel
    Borkmann.

14) psock_fanout selftest accesses past the end of the mmap ring, fix
    from Shuah Khan.

15) Fix PTP timestamping for VLAN packets, from Richard Cochran.

16) netlink_unbind() calls in netlink pass wrong initial argument, from
    Hiroaki SHIMODA.

17) vxlan socket reuse accidentally reuses a socket when the address
    family is different, so we have to explicitly check this, from
    Marcelo Leitner.

18) Fix missing include in nft_reject_bridge.c breaking the build on ppc
    and other architectures, from Guenter Roeck.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  vxlan: Do not reuse sockets for a different address family
  smsc911x: power-up phydev before doing a software reset.
  lib: rhashtable - Remove weird non-ASCII characters from comments
  net/smsc911x: Fix delays in the PHY enable/disable routines
  net/smsc911x: Fix rare soft reset timeout issue due to PHY power-down mode
  netlink: Properly unbind in error conditions.
  net: ptp: fix time stamp matching logic for VLAN packets.
  cxgb4 : dcb open-lldp interop fixes
  selftests/net: psock_fanout seg faults in sock_fanout_read_ring()
  net: bcmgenet: apply MII configuration in bcmgenet_open()
  net: bcmgenet: connect and disconnect from the PHY state machine
  net: qualcomm: Fix dependency
  ixgbe: phy: fix uninitialized status in ixgbe_setup_phy_link_tnx
  net: phy: Correctly handle MII ioctl which changes autonegotiation.
  ipv6: fix IPV6_PKTINFO with v4 mapped
  net: sctp: fix memory leak in auth key management
  net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
  net: ppp: Don't call bpf_prog_create() in ppp_lock
  net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set
  cxgb4 : Fix bug in DCB app deletion
  ...
2014-11-13 17:54:08 -08:00
Tang Chen
f784a3f196 mem-hotplug: reset node managed pages when hot-adding a new pgdat
In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.

But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory.  As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.

Even if the memory on that node has not been onlined,
/sys/devices/system/node/nodeXXX/meminfo has a wrong value:

  hot-add node2 (memory not onlined)
  cat /sys/devices/system/node/node2/meminfo
  Node 2 MemTotal:       33554432 kB
  Node 2 MemFree:               0 kB
  Node 2 MemUsed:        33554432 kB
  Node 2 Active:                0 kB

This patch fixes this problem by resetting the node's managed pages to 0
after hot-adding a new node.

1. Move reset_managed_pages_done from reset_node_managed_pages() to
   reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
   is initialized

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>	[3.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-13 16:17:06 -08:00
Joonsoo Kim
ad53f92eb4 mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Before describing the bugs themselves, I first explain the definition of a freepage.

 1. pages on buddy list are counted as freepage.
 2. pages on isolate migratetype buddy list are *not* counted as freepage.
 3. pages on cma buddy list are counted as CMA freepage, too.

Now, I describe the problems and the related patches.

Patch 1: There are race conditions when getting the pageblock migratetype,
which result in misplacement of freepages on the buddy list, an incorrect
freepage count, and unavailability of freepages.

Patch 2: Freepages on the pcp list could have stale cached information for
determining which migratetype buddy list they go to.  This causes
misplacement of freepages on the buddy list and an incorrect freepage count.

Patch 4: Merging between freepages on different migratetypes of
pageblocks will cause a freepage accounting problem.  This patch fixes it.

Without patchset [3], the above problems don't happen in my CMA allocation
test, because CMA reserved pages aren't used at all.  So there is no
chance for the above race.

With patchset [3], I did a simple CMA allocation test and got the result
below:

 - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
 - run kernel build (make -j16) in the background
 - 30 CMA allocation attempts (8MB * 30 = 240MB) at 5 sec intervals
 - Result: more than 5000 freepages are missing from the count

With patchset [3] and this patchset, I found that no freepages are missing
from the count, so I conclude that the problems are solved.

These problems also occur in my simple memory offlining test.

This patch (of 4):

There are two paths to reach the core free function of the buddy allocator,
__free_one_page(): one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each path has a race condition causing serious problems.  This patch
focuses on the first type of freepath; the following patch will solve
the problem in the second type of freepath.

In the first type of freepath, we get the migratetype of the page being
freed without holding the zone lock, so it could be racy.  There are two
cases of this race.

 1. pages are added to isolate buddy list after restoring original
    migratetype

    CPU1                                   CPU2

    get migratetype => return MIGRATE_ISOLATE
    call free_one_page() with MIGRATE_ISOLATE

                                grab the zone lock
                                unisolate pageblock
                                release the zone lock

    grab the zone lock
    call __free_one_page() with MIGRATE_ISOLATE
    freepage go into isolate buddy list,
    although pageblock is already unisolated

This may cause two problems.  One is that we can't use this page anymore
until the next isolation attempt of this pageblock, because the freepage is
on the isolate buddy list.  The other is that freepage accounting could be
wrong due to merging between different buddy lists.  Freepages on the isolate
buddy list aren't counted as freepages, but ones on the normal buddy list are
counted as freepages.  If a merge happens, a buddy freepage on the normal
buddy list is inevitably moved to the isolate buddy list without any
consideration of freepage accounting, so it could be incorrect.

 2. pages are added to normal buddy list while pageblock is isolated.
    It is similar to the above case.

This also may cause two problems.  One is that we can't keep these
freepages from being allocated.  Although this pageblock is isolated,
the freepage would be added to the normal buddy list so that it could be
allocated without any restriction.  The other problem is the same as in
case 1, that is, incorrect freepage accounting.

This race condition can be prevented by checking the migratetype again
while holding the zone lock.  Because it is a somewhat heavy operation and
it isn't needed in the common case, we want to avoid rechecking as much as
possible.  So this patch introduces a new variable, nr_isolate_pageblock, in
struct zone to check if there is an isolated pageblock.  With this, we can
avoid rechecking the migratetype in the common case and do it only if there
is an isolated pageblock or the migratetype is MIGRATE_ISOLATE.  This solves
the above-mentioned problems.

Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, the above-mentioned case 1 could happen.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Heesub Shin <heesub.shin@samsung.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>
Cc: Gioh Kim <gioh.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-13 16:17:05 -08:00
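
The recheck described in commit ad53f92eb4 can be illustrated with a small userspace model. This is only a sketch, not the kernel code: the `Zone` class, the string migratetypes, and the page names are hypothetical stand-ins, and the two CPUs are replayed as a fixed sequence of steps rather than real threads. Only `nr_isolate_pageblock`, `free_one_page()` and `MIGRATE_ISOLATE` come from the commit message.

```python
import threading

MIGRATE_MOVABLE = "movable"
MIGRATE_ISOLATE = "isolate"


class Zone:
    """Toy stand-in for struct zone; only the fields the commit discusses."""

    def __init__(self):
        self.lock = threading.Lock()
        self.nr_isolate_pageblock = 0
        self.pageblock_type = {}  # pageblock id -> current migratetype
        self.free_lists = {MIGRATE_MOVABLE: [], MIGRATE_ISOLATE: []}

    def set_isolated(self, pageblock, isolated):
        """Isolate or unisolate a pageblock under the zone lock."""
        with self.lock:
            self.pageblock_type[pageblock] = (
                MIGRATE_ISOLATE if isolated else MIGRATE_MOVABLE
            )
            self.nr_isolate_pageblock += 1 if isolated else -1


def free_one_page(zone, page, pageblock, migratetype):
    """Free a page; `migratetype` was sampled without the lock and may be stale."""
    with zone.lock:
        # Recheck only when isolation could be involved (the uncommon case).
        if zone.nr_isolate_pageblock > 0 or migratetype == MIGRATE_ISOLATE:
            migratetype = zone.pageblock_type[pageblock]
        zone.free_lists[migratetype].append(page)


# Case 1: CPU1 samples MIGRATE_ISOLATE, then CPU2 unisolates the pageblock.
zone1 = Zone()
zone1.set_isolated(0, True)
stale = zone1.pageblock_type[0]           # CPU1: unlocked read
zone1.set_isolated(0, False)              # CPU2: unisolate
free_one_page(zone1, "page-A", 0, stale)  # CPU1: free with the stale value

# Case 2: CPU1 samples a normal type, then CPU2 isolates the pageblock.
zone2 = Zone()
zone2.pageblock_type[0] = MIGRATE_MOVABLE
stale = zone2.pageblock_type[0]
zone2.set_isolated(0, True)
free_one_page(zone2, "page-B", 0, stale)

print(zone1.free_lists)
print(zone2.free_lists)
```

In case 1 the counter is already back to zero when the page is freed, so only the extra `migratetype == MIGRATE_ISOLATE` check (the change from v3) triggers the recheck. In case 2 the non-zero counter does.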
Thomas Graf
6eba82248e rhashtable: Drop gfp_flags arg in insert/remove functions
Reallocation is only required for shrinking and expanding and both rely
on a mutex for synchronization and callers of rhashtable_init() are in
non atomic context. Therefore, no reason to continue passing allocation
hints through the API.

Instead, use GFP_KERNEL and add __GFP_NOWARN | __GFP_NORETRY to allow
for silent fall back to vzalloc() without the OOM killer jumping in as
pointed out by Eric Dumazet and Eric W. Biederman.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:18:40 -05:00
Matan Barak
de966c5928 net/mlx4_core: Support more than 64 VFs
We now allow up to 126 VFs. Note though that certain firmware
versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs.
In these cases, we limit the maximum number of VFs to 64.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:22 -05:00
Matan Barak
7ae0e400cd net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs
Previously, the driver queried the firmware in order to get the number
of supported EQs. Under SRIOV, since this was done before the driver
notified the firmware how many VFs it actually needs, the firmware had
to take into account a worst case scenario and always allocated four EQs
per VF, where one was used for events while the others were used for completions.

Now, when the firmware supports the asymmetric allocation scheme, denoted
by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the
QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we
can get more EQs and MSI-X vectors per function.

Moreover, when running in the new firmware/driver mode, the limitation
that the number of EQs should be a power of two is lifted.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:16:21 -05:00
Herbert Xu
7b4ce23534 rhashtable: Add parent argument to mutex_is_held
Currently mutex_is_held can only test locks that are global
since it takes no arguments.  This prevents rhashtable from being
used in places where locks are local, e.g., per-namespace locks.

This patch adds a parent field to mutex_is_held and rhashtable_params
so that local locks can be used (and tested).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
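
The idea behind commit 7b4ce23534 can be sketched in userspace as a lock-check callback that receives the object owning the lock. Everything here is a hypothetical model: `Params`, `Namespace`, `ns_lock_is_held` and `table_lock_is_held` are illustrative names, and only the `mutex_is_held` callback and the `parent` field are taken from the commit message.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Params:
    """Toy counterpart of rhashtable_params: a lock check plus its argument."""
    mutex_is_held: Callable[[Any], bool]
    parent: Any


@dataclass
class Namespace:
    """Hypothetical owner of a local (per-namespace) lock."""
    lock: threading.Lock = field(default_factory=threading.Lock)


def ns_lock_is_held(parent: Namespace) -> bool:
    # Lock.locked() reports whether anyone holds the lock, not whether the
    # current thread does; that is enough for this single-threaded model.
    return parent.lock.locked()


def table_lock_is_held(params: Params) -> bool:
    """What the table would call before touching its internals."""
    return params.mutex_is_held(params.parent)


ns_a, ns_b = Namespace(), Namespace()
params_a = Params(ns_lock_is_held, ns_a)
params_b = Params(ns_lock_is_held, ns_b)

with ns_a.lock:
    held_a = table_lock_is_held(params_a)
    held_b = table_lock_is_held(params_b)

print(held_a, held_b)
```

Because the callback is handed `parent`, one function can serve any number of tables with different locks, which a zero-argument callback could only do for a global lock.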
Herbert Xu
1b2f309d70 rhashtable: Move mutex_is_held under PROVE_LOCKING
The rhashtable function mutex_is_held is only used when PROVE_LOCKING
is enabled.  This patch makes the mutex_is_held field in rhashtable
optional depending on PROVE_LOCKING.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-13 15:13:05 -05:00
Linus Torvalds
15e5cda9e6 Merge tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Rabin Vincent found a way that tracing could cause an infinite loop in
  the kernel.  The splice logic wants a full page from the ring buffer
  but the ring_buffer_wait() returns when there's any data in the ring
  buffer.  The splice code would then continue the loop waiting for a
  full page.  But if a full page never happens, the splice code will
  never sleep and just continue to loop.

  There's another case that Rabin fixed that could loop if there's no
  memory and kmalloc() constantly returns NULL"

* tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not risk busy looping in buffer splice
  tracing: Do not busy wait in buffer splice
2014-11-12 14:02:29 -08:00
Linus Torvalds
c921220115 Merge tag 'mfd-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fixes from Lee Jones:
 - register offset fix for stmpe
 - eradicate build warning when !PM in rtsx_pcr
 - fix device ID collision when multiple boards are connected in
   viperboard
 - use correct Regmap handle - fixing unhandled IRQs in max77693
 - unmask MUIC IRQs in max77693
 - clear VBUS & CHG bits so board doesn't reboot instead of poweroff in
   twl4030

* tag 'mfd-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: twl4030-power: Fix poweroff with PM configuration enabled
  mfd: max77693: Fix always masked MUIC interrupts
  mfd: max77693: Use proper regmap for handling MUIC interrupts
  mfd: viperboard: Fix platform-device id collision
  mfd: rtsx: Fix build warnings for !PM
  mfd: stmpe: Fix STMPE24xx GPMR LSB
2014-11-12 13:13:24 -08:00
Johan Hovold
c31accd159 net: phy: add module_phy_driver macro
Add helper macro for PHY drivers which do not do anything special in
module init/exit. This will allow us to eliminate a lot of boilerplate
code.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12 13:52:52 -05:00
Alexander Duyck
160d2aba55 net: Remove __skb_alloc_page and __skb_alloc_pages
Remove the two functions which are now dead code.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12 00:00:14 -05:00
Alexander Duyck
71dfda58aa net: Add device Rx page allocation function
This patch implements __dev_alloc_pages and __dev_alloc_page.  These are
meant to replace the __skb_alloc_pages and __skb_alloc_page functions.  The
reason for doing this is that it occurred to me that __skb_alloc_page is
supposed to be passed an sk_buff pointer, but it is NULL in all cases where
it is used.  Worse is that in the case of ixgbe it is passed NULL via the
sk_buff pointer in the rx_buffer info structure which means the compiler is
not correctly stripping it out.

The naming for these functions is based on dev_alloc_skb and __dev_alloc_skb.
There was originally a netdev_alloc_page, however that was passed a
net_device pointer and this function is not so I thought it best to follow
that naming scheme since that is the same difference between dev_alloc_skb
and netdev_alloc_skb.

In the case of anything greater than order 0 it is assumed that we want a
compound page so __GFP_COMP is set for all allocations as we expect a
compound page when assigning a page frag.

The other change in this patch is to exploit the behaviors of the page
allocator in how it handles flags.  So for example we can always set
__GFP_COMP and __GFP_MEMALLOC since they are ignored if they are not
applicable or are overridden by another flag.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-12 00:00:13 -05:00
WANG Cong
09626e9d15 net: kill netif_copy_real_num_queues()
vlan was the only user of netif_copy_real_num_queues(),
but it no longer calls it after
commit 4af429d29b ("vlan: lockless transmit path").
So we can just remove it.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 16:30:16 -05:00
Shani Michaeli
f8c6455bb0 net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).

Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.

In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.

Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 13:20:02 -05:00
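
The pseudo-header subtraction described in commit f8c6455bb0 relies on one's-complement arithmetic, where removing a term means adding its bitwise complement. The sketch below is a 16-bit userspace model with made-up values, not the driver's code. The function names mirror the kernel's `csum_add()`/`csum_sub()` helpers for readability, but the signatures here are an assumption for illustration.

```python
def csum_add(a: int, b: int) -> int:
    """16-bit one's-complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def csum_sub(a: int, b: int) -> int:
    """Remove b from a one's-complement sum by adding its complement."""
    return csum_add(a, ~b & 0xFFFF)


# Hypothetical values: what the stack wants, and the pseudo-header sum
# that the hardware folded in on top of it.
payload_sum = 0x1234
pseudo_hdr_sum = 0xABCD

hw_reported = csum_add(payload_sum, pseudo_hdr_sum)
recovered = csum_sub(hw_reported, pseudo_hdr_sum)

print(hex(hw_reported), hex(recovered))
```

One caveat of the representation: zero has two encodings (0x0000 and 0xFFFF), so a round trip that starts from 0x0000 comes back as 0xFFFF, which is the same value in one's-complement terms.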
Rabin Vincent
e30f53aad2 tracing: Do not busy wait in buffer splice
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:

 # trace-cmd record -e raw_syscalls:* -F false
 NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
 ...
 Call Trace:
  [<ffffffff8105b580>] ? __wake_up_common+0x90/0x90
  [<ffffffff81092e25>] wait_on_pipe+0x35/0x40
  [<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0
  [<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0
  [<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40
  [<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290
  [<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0
  [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
  [<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60
  [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
  [<ffffffff8112415d>] do_splice_to+0x6d/0x90
  [<ffffffff81126971>] SyS_splice+0x7c1/0x800
  [<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8

The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers.  The buffers
are not empty so ring_buffer_wait() returns immediately.  But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page.  When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.

Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e.  until full-page reads will succeed.

Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in

Cc: stable@vger.kernel.org # 3.16+
Fixes: b1169cc69b ("tracing: Remove mock up poll wait function")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-10 16:45:43 -05:00
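
The loop described in commit e30f53aad2 can be reproduced with a toy model. This is a sketch under simplifying assumptions: the buffer is reduced to a byte counter, "the writer has left the reader's page" is approximated as "at least one page of data is available", and `PAGE_SIZE`, `splice_read` and `ring_buffer_wait_done` are hypothetical names, not kernel functions.

```python
PAGE_SIZE = 4096


def ring_buffer_wait_done(available: int, full: bool) -> bool:
    """Would the wait return right away? With full=True it needs a whole page."""
    return available >= PAGE_SIZE if full else available > 0


def splice_read(available, pending_writes, full, spin_limit=1000):
    """Model the reader loop; returns (got_page, busy_spins, sleeps)."""
    pending = list(pending_writes)
    spins = sleeps = 0
    while spins < spin_limit:
        if not ring_buffer_wait_done(available, full):
            if not pending:
                return False, spins, sleeps  # would block forever; end the model
            available += pending.pop(0)      # sleeping lets the writer make progress
            sleeps += 1
            continue
        if available >= PAGE_SIZE:           # full-page read succeeds
            return True, spins, sleeps
        spins += 1                           # wait returned, read failed: retry at once
    return False, spins, sleeps


old_behavior = splice_read(100, [2000, 2000, 2000], full=False)
new_behavior = splice_read(100, [2000, 2000, 2000], full=True)

print(old_behavior)
print(new_behavior)
```

With `full=False` the reader never sleeps, so the modeled writer never runs and the loop only ends at the spin limit, which corresponds to the soft lockup on a !PREEMPT kernel. With `full=True` the reader sleeps until a page is available and then reads it without spinning.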
David S. Miller
b92172661e Merge tag 'master-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-11-07

Please pull this batch of updates intended for the 3.19 stream!

For the mac80211 bits, Johannes says:

"This relatively large batch of changes is comprised of the following:
 * large mac80211-hwsim changes from Ben, Jukka and a bit myself
 * OCB/WAVE/11p support from Rostislav on behalf of the Czech Technical
   University in Prague and Volkswagen Group Research
 * minstrel VHT work from Karl
 * more CSA work from Luca
 * WMM admission control support in mac80211 (myself)
 * various smaller fixes, spelling corrections, and minor API additions"

For the Bluetooth bits, Johan says:

"Here's the first bluetooth-next pull request for 3.19. The vast majority
of patches are for ieee802154 from Alexander Aring with various fixes
and cleanups. There are also several LE/SMP fixes as well as improved
support for handling LE devices that have lost their pairing information
(the patches from Alfonso). Jukka provides a couple of stability fixes
for 6lowpan and Szymon conformance fixes for RFCOMM. For the HCI drivers
we have one new USB ID for an Acer controller as well as a reset
handling fix for H5."

For the Atheros bits, Kalle says:

"Major changes are:

o ethtool support (Ben)

o print dev string prefix with debug hex buffers dump (Michal)

o debugfs file to read calibration data from the firmware for verification
  purposes (me)

o fix fw_stats debugfs file, now results are more reliable (Michal)

o firmware crash counters via debugfs (Ben&me)

o various tracing points to debug firmware (Rajkumar)

o make it possible to provide firmware calibration data via a file (me)

And we have quite a lot of smaller fixes and clean up."

For the iwlwifi bits, Emmanuel says:

"The big new thing here is netdetect which allows the
firmware to wake up the platform when a specific network
is detected. Along with that I have fixes for d3 operation.
The usual amount of rate scaling stuff - we now support STBC.
The other commit that stands out is Johannes's work on
devcoredump. He basically starts to use the standard
infrastructure he built."

Along with that are the usual sort of updates and such for ath9k,
brcmfmac, wil6210, and a handful of other bits here and there...

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-10 14:34:59 -05:00
Eric Dumazet
3b47d30396 net: gro: add a per device gro flush timer
Tuning coalescing parameters on NIC can be really hard.

Servers can handle both bulk and RPC like traffic, with conflicting
goals : bulk flows want as big GRO packets as possible, RPC want minimal
latencies.

To reach big GRO packets on 10Gbe NIC, one can use :

ethtool -C eth0 rx-usecs 4 rx-frames 44

But this penalizes rpc sessions, with an increase of latencies, up to
50% in some cases, as NICs generally do not force an interrupt when
a packet with TCP Push flag is received.

Some NICs do not have an absolute timer, only a timer rearmed for every
incoming packet.

This patch uses a different strategy : Let the GRO stack decide what to do,
based on the traffic pattern.

Packets with Push flag won't be delayed.
Packets without Push flag might be held in GRO engine, if we keep
receiving data.

This new mechanism is off by default, and shall be enabled by setting
/sys/class/net/ethX/gro_flush_timeout to a value in nanoseconds.

To fully enable this mechanism, drivers should use napi_complete_done()
instead of napi_complete().

Tested:
 Ran 200 netperf TCP_STREAM from A to B (10Gbe mlx4 link, 8 RX queues)

Without this feature, we send back about 305,000 ACK per second.

GRO aggregation ratio is low (811/305 = 2.65 segments per GRO packet)

Setting a timer of 2000 nsec is enough to increase GRO packet sizes
and reduce number of ACK packets. (811/19.2 = 42)

Receiver performs fewer calls to upper stacks and fewer wakeups.
This also reduces cpu usage on the sender, as it receives fewer ACK
packets.

Note that reducing the number of wakeups increases cpu efficiency, but can
decrease QPS, as applications won't have the chance to warm up cpu caches
doing a partial read of RPC requests/answers if they fit in one skb.

B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average:         eth0 811269.80 305732.30 1199462.57  19705.72      0.00      0.00      0.50

B:~# echo 2000 >/sys/class/net/eth0/gro_flush_timeout

B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average:         eth0 811577.30  19230.80 1199916.51   1239.80      0.00      0.00      0.50

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-10 12:05:59 -05:00
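
The aggregation ratios quoted in commit 3b47d30396 can be rechecked from the two sar samples. This is a minimal sketch which assumes that the first two numeric columns of the `sar -n DEV` line are received and transmitted packets per second (the commit shows no column headers) and that the packets B transmits are essentially all ACKs. The helper name `gro_ratio` is hypothetical.

```python
def gro_ratio(sar_line: str) -> float:
    """Received packets per transmitted packet for one sar Average line.

    Assumed field order: "Average:", interface, rx packets/s, tx packets/s, ...
    """
    fields = sar_line.split()
    rx_pps = float(fields[2])
    tx_pps = float(fields[3])
    return rx_pps / tx_pps


before = "Average: eth0 811269.80 305732.30 1199462.57 19705.72 0.00 0.00 0.50"
after = "Average: eth0 811577.30 19230.80 1199916.51 1239.80 0.00 0.00 0.50"

print(f"without timer:      {gro_ratio(before):.2f} packets per ACK")
print(f"with 2000 ns timer: {gro_ratio(after):.2f} packets per ACK")
```

The two results, about 2.65 and about 42.2, match the rounded figures 811/305 and 811/19.2 given in the commit message.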
Krzysztof Kozlowski
c0acb8144b mfd: max77693: Fix always masked MUIC interrupts
All interrupts coming from MUIC were ignored because interrupt source
register was masked.

The Maxim 77693 has an "interrupt source" - a separate register and interrupts
which give information about the PMIC block triggering the individual
interrupt (charger, topsys, MUIC, flash LED).

The bootloader could initialize this register to a "mask all" value by
default. In such a case (observed on Trats2 board) MUIC interrupts won't be
generated regardless of their mask status. Regmap irq chip was unmasking
individual MUIC interrupts but the source was masked.

Before introducing regmap irq chip this interrupt source was unmasked,
read and acked. Reading and acking is not necessary but unmasking is.

Fixes: 342d669c1e ("mfd: max77693: Handle IRQs using regmap")

Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-11-10 15:22:02 +00:00
Linus Torvalds
a315780977 Merge branch 'devicetree/merge' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux
Pull devicetree bugfix from Grant Likely:
 "One buffer overflow bug that shouldn't be left around"

* 'devicetree/merge' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
  of: Fix overflow bug in string property parsing functions
2014-11-09 14:33:49 -08:00
Herbert Xu
bfe1be38fc net: Kill skb_copy_datagram_const_iovec
Now that both macvtap and tun are using skb_copy_datagram_iter, we
can kill the abomination that is skb_copy_datagram_const_iovec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07 12:13:34 -05:00
Herbert Xu
a8f820aa40 inet: Add skb_copy_datagram_iter
This patch adds skb_copy_datagram_iter, which is identical to
skb_copy_datagram_iovec except that it operates on iov_iter
instead of iovec.

Eventually all users of skb_copy_datagram_iovec should switch
over to iov_iter and then we can remove skb_copy_datagram_iovec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-07 12:13:34 -05:00
David S. Miller
4e84b496fd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
2014-11-06 22:01:18 -05:00
Linus Torvalds
ed78bb846e Merge tag 'pci-v3.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
 "This fixes an oops when enabling SR-IOV VF devices.  The oops is a
  regression I added by configuring all devices during enumeration.

    - Don't oops on virtual buses in acpi_pci_get_bridge_handle() (Yinghai Lu)"

* tag 'pci-v3.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
2014-11-06 11:33:06 -08:00
Pravin B Shelar
59b93b41e7 net: Remove MPLS GSO feature.
A device can export MPLS GSO support in dev->mpls_features the same way
it exports vlan features in dev->vlan_features. So it is safe to
remove the redundant NETIF_F_GSO_MPLS flag.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-05 23:52:33 -08:00
Hannes Frederic Sowa
e5a2c89995 fast_hash: avoid indirect function calls
By default the arch_fast_hash hashing function pointers are initialized
to jhash(2). If during boot-up a CPU with SSE4.2 is detected they get
updated to the CRC32 ones. This dispatching scheme incurs a function
pointer lookup and indirect call for every hashing operation.

rhashtable as a user of arch_fast_hash e.g. stores pointers to hashing
functions in its structure, too, causing two indirect branches per
hashing operation.

Using alternative_call we can get away with one of those indirect branches.

Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 22:01:21 -05:00
WANG Cong
25de4668d0 ipv6: move INET6_MATCH() to include/net/inet6_hashtables.h
It is only used in net/ipv6/inet6_hashtables.c.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 16:59:04 -05:00
David S. Miller
51f3d02b98 net: Add and use skb_copy_datagram_msg() helper.
This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".

When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.

Having a helper like this means there will be fewer places to touch
during that transformation.

Based upon descriptions and patch from Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 16:46:40 -05:00
Tom Herbert
e585f23636 udp: Changes to udp_offload to support remote checksum offload
Add a new GSO type, SKB_GSO_TUNNEL_REMCSUM, which indicates remote
checksum offload being done (in this case inner checksum must not
be offloaded to the NIC).

Added logic in __skb_udp_tunnel_segment to handle remote checksum
offload case.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 16:30:03 -05:00
Rasmus Villemoes
9cdb5dbf79 include/linux/socket.h: Fix comment
File descriptors are always closed on exit :-)

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-05 15:52:45 -05:00
Yinghai Lu
32f638fc11 PCI: Don't oops on virtual buses in acpi_pci_get_bridge_handle()
acpi_pci_get_bridge_handle() returns the ACPI handle for the bridge device
(either a host bridge or a PCI-to-PCI bridge) leading to a PCI bus.  But
SR-IOV virtual functions can be on a virtual bus with no bridge leading to
it.  Return a NULL acpi_handle in this case instead of trying to
dereference the NULL pointer to the bridge.

This fixes a NULL pointer dereference oops in pci_get_hp_params() when
adding SR-IOV VF devices on virtual buses.

[bhelgaas: changelog, add comment in code]
Fixes: 6cd33649fa ("PCI: Add pci_configure_device() during enumeration")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=87591
Reported-by: Chao Zhou <chao.zhou@intel.com>
Reported-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-11-05 13:06:16 -07:00
John W. Linville
bf515fb11a Merge tag 'mac80211-next-for-john-2014-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg <johannes@sipsolutions.net> says:

"This relatively large batch of changes is comprised of the
following:
 * large mac80211-hwsim changes from Ben, Jukka and a bit myself
 * OCB/WAVE/11p support from Rostislav on behalf of the Czech Technical
   University in Prague and Volkswagen Group Research
 * minstrel VHT work from Karl
 * more CSA work from Luca
 * WMM admission control support in mac80211 (myself)
 * various smaller fixes, spelling corrections, and minor API additions"

Conflicts:
	drivers/net/wireless/ath/wil6210/cfg80211.c

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-11-04 16:18:12 -05:00
Grant Likely
a87fa1d81a of: Fix overflow bug in string property parsing functions
The string property read helpers will run off the end of the buffer if
it is handed a malformed string property. Rework the parsers to make
sure that doesn't happen. At the same time add new test cases to make
sure the functions behave themselves.

The original implementations of of_property_read_string_index() and
of_property_count_strings() both open-coded the same block of parsing
code, each with its own subtly different bugs. The fix here merges
functions into a single helper and makes the original functions static
inline wrappers around the helper.
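
As a rough illustration of the bounded parsing described above (a
userspace sketch, not the helper this patch adds), the following counts
the NUL-terminated strings packed into a buffer without reading past its
end. The name count_packed_strings and the -1 error value are
assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the NUL-terminated strings packed into
 * buf[0..len). Returns -1 if the data is malformed, i.e. the buffer is
 * empty or the last string has no terminator inside the buffer. */
static int count_packed_strings(const char *buf, size_t len)
{
	const char *p = buf;
	const char *end = buf + len;
	int count = 0;

	if (!buf || len == 0)
		return -1;

	while (p < end) {
		/* memchr only inspects the bytes that remain in the buffer. */
		const char *nul = memchr(p, '\0', (size_t)(end - p));

		if (!nul)
			return -1;	/* unterminated string */
		p = nul + 1;		/* step over the string and its NUL */
		count++;
	}
	return count;
}
```

Because every scan is limited to the bytes that remain, a property whose
final string lacks a terminator is rejected instead of being read past
the end of the buffer.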

One non-bugfix aspect of this patch is the addition of a new wrapper,
of_property_read_string_array(). The new wrapper is needed by the
device_properties feature that Rafael is working on and planning to
merge for v3.19. The implementation is identical both with and without
the new static inline wrapper, so it just got left in to reduce the
churn on the header file.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Darren Hart <darren.hart@intel.com>
Cc: <stable@vger.kernel.org>  # v3.3+: Drop selftest hunks that don't apply
2014-11-04 10:19:48 +00:00
Eran Harary
0563921abf ieee80211: add "max length of AMPDU" enum for VHT
Maximum length of AMPDU that an STA can receive in VHT.
length = 2 ^ (13 + max_ampdu_length_exp) - 1.
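
A small sketch of the formula above, for illustration only; it is not
the enum added by this commit. The function name vht_max_ampdu_len is
hypothetical, and the 0..7 exponent range is an assumption (a 3-bit
capability field).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maximum A-MPDU length in bytes for a given
 * exponent, using length = 2 ^ (13 + exp) - 1 from the text. */
static uint32_t vht_max_ampdu_len(unsigned int exp)
{
	assert(exp <= 7);	/* assumed range of the exponent field */
	return (UINT32_C(1) << (13 + exp)) - 1;
}
```

Under that assumption, an exponent of 0 gives 8191 bytes and an exponent
of 7 gives 1048575 bytes.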

Signed-off-by: Eran Harary <eran.harary@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-11-04 09:57:44 +01:00
Linus Torvalds
f3ed88a6bc Merge branch 'fixes-for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull CMA and DMA-mapping fixes from Marek Szyprowski:
 "This contains important fixes for recently introduced highmem support
  for default contiguous memory region used for dma-mapping subsystem"

* 'fixes-for-v3.18' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  mm, cma: make parameters order consistent in func declaration and definition
  mm: cma: Use %pa to print physical addresses
  mm: cma: Ensure that reservations never cross the low/high mem boundary
  mm: cma: Always consider a 0 base address reservation as dynamic
  mm: cma: Don't crash on allocation if CMA area can't be activated
2014-11-03 21:01:04 -08:00
Eric Dumazet
56b174256b net: add rbnode to struct sk_buff
Yaogong replaces TCP out of order receive queue by an RB tree.

As netem already does a private skb->{next/prev/tstamp} union
with a 'struct rb_node', let's do this in a cleaner way.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yaogong Wang <wygivan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03 16:13:03 -05:00
Matan Barak
d475c95b4b net/mlx4_core: Add retrieval of CONFIG_DEV parameters
Add code to issue CONFIG_DEV "get" firmware command.

This command is used in order to obtain certain parameters used for
supporting various RX checksumming options and vxlan UDP port.

The GET operation is allowed for VFs too.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03 12:28:14 -05:00
Eric Dumazet
4cdb1e2e3d net: shrink struct softnet_data
flow_limit in struct softnet_data is only read from local cpu
and can be moved to fill a hole, reducing softnet_data size by
64 bytes on x86_64

While we are at it, move output_queue, output_queue_tailp and
completion_queue, so that rx / tx paths touch a single cache line.
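
The effect of moving a member into a hole can be seen with a toy
example. The two structures below are hypothetical and unrelated to the
real softnet_data layout; they only show how member order changes
padding on common ABIs.

```c
#include <assert.h>
#include <stddef.h>

/* Small members on either side of a pointer leave padding holes. */
struct with_hole {
	char  flag;	/* padded up to the alignment of 'ptr' */
	void *ptr;
	char  state;	/* followed by tail padding */
};

/* Same members, reordered so the small ones share one slot. */
struct hole_filled {
	void *ptr;
	char  flag;
	char  state;
};

/* Reordering never makes this structure larger. */
static_assert(sizeof(struct hole_filled) <= sizeof(struct with_hole),
	      "filling the hole must not grow the structure");
```

On x86_64 the first layout typically occupies 24 bytes and the second
16; a tool such as pahole reports holes like these for real kernel
structures.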

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03 12:25:08 -05:00
Linus Torvalds
81d92dc117 Merge tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
 "Three main MTD fixes for 3.18:

   - A regression from 3.16 which was noticed in 3.17.  With the
     restructuring of the m25p80.c driver and the SPI NOR library
     framework, we omitted proper listing of the SPI device IDs.  This
     means m25p80.c wouldn't auto-load (modprobe) properly when built as
     a module.  For now, we duplicate the device IDs into both modules.

   - The OMAP / ELM modules were depending on an implicit link ordering.
     Use deferred probing so that the new link order (in 3.18-rc) can
     still allow for successful probing.

   - Fix suspend/resume support for LH28F640BF NOR flash"

* tag 'for-linus-20141102' of git://git.infradead.org/linux-mtd:
  mtd: cfi_cmdset_0001.c: fix resume for LH28F640BF chips
  mtd: omap: fix mtd devices not showing up
  mtd: m25p80,spi-nor: Fix module aliases for m25p80
  mtd: spi-nor: make spi_nor_scan() take a chip type name, not spi_device_id
  mtd: m25p80: get rid of spi_get_device_id
2014-11-02 14:45:52 -08:00
Linus Torvalds
ad2be3796f Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "This is a set of six patches consisting of:
   - two MAINTAINER updates
   - two scsi-mq fixes for the old parallel interface (not every request
     is tagged and we need to set the right flags to populate the SPI
     tag message)
   - a fix for a memory leak in scatterlist traversal caused by a
     preallocation update in 3.17
   - an ipv6 fix for cxgbi"

[ The scatterlist fix also came in separately through the block layer tree ]

* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  MAINTAINERS: ufs - remove self
  MAINTAINERS: change hpsa and cciss maintainer
  libcxgbi : support ipv6 address host_param
  scsi: set REQ_QUEUE for the blk-mq case
  Revert "block: all blk-mq requests are tagged"
  lib/scatterlist: fix memory leak with scsi-mq
2014-11-02 14:39:35 -08:00
Linus Torvalds
7e05b807b9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS fixes from Al Viro:
 "A bunch of assorted fixes, most of them followups to overlayfs merge"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ovl: initialize ->is_cursor
  Return short read or 0 at end of a raw device, not EIO
  isofs: don't bother with ->d_op for normal case
  isofs_cmp(): we'll never see a dentry for . or ..
  overlayfs: fix lockdep misannotation
  ovl: fix check for cursor
  overlayfs: barriers for opening upper-layer directory
  rcu: Provide counterpart to rcu_dereference() for non-RCU situations
  staging: android: logger: Fix log corruption regression
2014-11-02 10:28:43 -08:00
David S. Miller
55b42b5ca2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/phy/marvell.c

Simple overlapping changes in drivers/net/phy/marvell.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01 14:53:27 -04:00
Linus Torvalds
89453379aa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A bit has accumulated, but it's been a week or so since my last batch
  of post-merge-window fixes, so...

   1) Missing module license in netfilter reject module, from Pablo.
      Lots of people ran into this.

   2) Off by one in mac80211 baserate calculation, from Karl Beldan.

   3) Fix incorrect return value from ax88179_178a driver's set_mac_addr
      op, which broke use of it with bonding.  From Ian Morgan.

   4) Checking of skb_gso_segment()'s return value was not all
      encompassing, it can return an SKB pointer, a pointer error, or
      NULL.  Fix from Florian Westphal.

      This is crummy, and longer term will be fixed to just return error
      pointers or a real SKB.

   6) Encapsulation offloads not being handled by
      skb_gso_transport_seglen().  From Florian Westphal.

   7) Fix deadlock in TIPC stack, from Ying Xue.

   8) Fix performance regression from using rhashtable for netlink
      sockets.  The problem was the synchronize_net() invoked for every
      socket destroy.  From Thomas Graf.

   9) Fix bug in eBPF verifier, and remove the strong dependency of BPF
      on NET.  From Alexei Starovoitov.

  10) In qdisc_create(), use the correct interface to allocate
      ->cpu_bstats, otherwise the u64_stats_sync member isn't
      initialized properly.  From Sabrina Dubroca.

  11) Off by one in ip_set_nfnl_get_byindex(), from Dan Carpenter.

  12) nf_tables_newchain() was erroneously expecting error pointers from
      netdev_alloc_pcpu_stats().  It only returns a valid pointer or
      NULL.  From Sabrina Dubroca.

  13) Fix use-after-free in _decode_session6(), from Li RongQing.

  14) When we set the TX flow hash on a socket, we mistakenly do so
      before we've nailed down the final source port.  Move the setting
      deeper to fix this.  From Sathya Perla.

  15) NAPI budget accounting in amd-xgbe driver was counting descriptors
      instead of full packets, fix from Thomas Lendacky.

  16) Fix total_data_buflen calculation in hyperv driver, from Haiyang
      Zhang.

  17) Fix bcma driver build with OF_ADDRESS disabled, from Hauke
      Mehrtens.

  18) Fix mis-use of per-cpu memory in TCP md5 code.  The problem is
      that something that ends up being vmalloc memory can't be passed
      to the crypto hash routines via scatter-gather lists.  From Eric
      Dumazet.

  19) Fix regression in promiscuous mode enabling in cdc-ether, from
      Olivier Blin.

  20) Bucket eviction and frag entry killing can race with each other,
      causing an unlink of the object from the wrong list.  Fix from
      Nikolay Aleksandrov.

  21) Missing initialization of spinlock in cxgb4 driver, from Anish
      Bhatt.

  22) Do not cache ipv4 routing failures, otherwise if the sysctl for
      forwarding is subsequently enabled this won't be seen.  From
      Nicolas Cavallari"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (131 commits)
  drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch mode
  drivers: net: cpsw: Fix broken loop condition in switch mode
  net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0
  stmmac: pci: set default of the filter bins
  net: smc91x: Fix gpios for device tree based booting
  mpls: Allow mpls_gso to be built as module
  mpls: Fix mpls_gso handler.
  r8152: stop submitting intr for -EPROTO
  netfilter: nft_reject_bridge: restrict reject to prerouting and input
  netfilter: nft_reject_bridge: don't use IP stack to reject traffic
  netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions
  netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions
  netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing
  drivers/net: macvtap and tun depend on INET
  drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets
  drivers/net: Disable UFO through virtio
  net: skb_fclone_busy() needs to detect orphaned skb
  gre: Use inner mac length when computing tunnel length
  mlx4: Avoid leaking steering rules on flow creation error flow
  net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN
  ...
2014-10-31 15:04:58 -07:00
John W. Linville
15a892e728 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
2014-10-31 16:05:31 -04:00
Linus Torvalds
aea4869f68 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Ingo Molnar:
 "The tree contains two RCU fixes and a compiler quirk comment fix"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Make rcu_barrier() understand about missing rcuo kthreads
  compiler/gcc4+: Remove inaccurate comment about 'asm goto' miscompiles
  rcu: More on deadlock between CPU hotplug and expedited grace periods
2014-10-31 12:43:52 -07:00
David Jeffery
b2de525f09 Return short read or 0 at end of a raw device, not EIO
Author: David Jeffery <djeffery@redhat.com>
Changes to the basic direct I/O code have broken the raw driver when reading
to the end of a raw device.  Instead of returning a short read for a read that
extends partially beyond the device's end or 0 when at the end of the device,
these reads now return EIO.

The raw driver needs the same end of device handling as was added for normal
block devices.  Using blkdev_read_iter, which has the needed size checks,
prevents the EIO conditions at the end of the device.
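
The expected end-of-device behaviour can be sketched in userspace as a
simple clamp. This is an illustration of the semantics described above,
not the code in blkdev_read_iter; the name clamp_read_len and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes a read of 'count' bytes at
 * offset 'pos' should return on a device of 'dev_size' bytes. */
static size_t clamp_read_len(uint64_t pos, size_t count, uint64_t dev_size)
{
	uint64_t remaining;
	size_t n;

	if (pos >= dev_size)
		return 0;		/* at or past the end: return 0 */

	remaining = dev_size - pos;
	n = count < remaining ? count : (size_t)remaining;
	assert(n <= count);		/* never return more than requested */
	return n;			/* short read when crossing the end */
}
```

A read that starts before the end and extends beyond it gets the
remaining bytes, and a read at the end gets 0, rather than an error.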

Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-31 06:33:26 -04:00
Eric Dumazet
39bb5e6286 net: skb_fclone_busy() needs to detect orphaned skb
Some drivers are unable to perform TX completions in a bound time.
They instead call skb_orphan()

Problem is skb_fclone_busy() has to detect this case, otherwise
we block TCP retransmits and can freeze unlucky tcp sessions on
mostly idle hosts.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1f3279ae0c ("tcp: avoid retransmits of TCP packets hanging in host queues")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-30 19:58:30 -04:00
Linus Torvalds
a7ca10f263 Merge branch 'akpm' (incoming from Andrew Morton)
Merge misc fixes from Andrew Morton:
 "21 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
  mm/balloon_compaction: fix deflation when compaction is disabled
  sh: fix sh770x SCIF memory regions
  zram: avoid NULL pointer access in concurrent situation
  mm/slab_common: don't check for duplicate cache names
  ocfs2: fix d_splice_alias() return code checking
  mm: rmap: split out page_remove_file_rmap()
  mm: memcontrol: fix missed end-writeback page accounting
  mm: page-writeback: inline account_page_dirtied() into single caller
  lib/bitmap.c: fix undefined shift in __bitmap_shift_{left|right}()
  drivers/rtc/rtc-bq32k.c: fix register value
  memory-hotplug: clear pgdat which is allocated by bootmem in try_offline_node()
  drivers/rtc/rtc-s3c.c: fix initialization failure without rtc source clock
  kernel/kmod: fix use-after-free of the sub_info structure
  drivers/rtc/rtc-pm8xxx.c: rework to support pm8941 rtc
  mm, thp: fix collapsing of hugepages on madvise
  drivers: of: add return value to of_reserved_mem_device_init()
  mm: free compound page with correct order
  gcov: add ARM64 to GCOV_PROFILE_ALL
  fsnotify: next_i is freed during fsnotify_unmount_inodes.
  mm/compaction.c: avoid premature range skip in isolate_migratepages_range
  ...
2014-10-29 16:38:48 -07:00