Commit Graph

52598 Commits

Author SHA1 Message Date
Eric Dumazet
404f7c9e11 net: struct napi_struct fields reordering
Remove two holes on 64bit arches, and put dev_list at the end of
napi_struct since its not used in fast path.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-27 19:29:35 -04:00
Eric Dumazet
e2bcabec6e net: remove sk_init() helper
It seems sk_init() has no value today and even does strange things :

# grep . /proc/sys/net/core/?mem_*
/proc/sys/net/core/rmem_default:212992
/proc/sys/net/core/rmem_max:131071
/proc/sys/net/core/wmem_default:212992
/proc/sys/net/core/wmem_max:131071

We can remove it completely.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-27 18:42:00 -04:00
stephen hemminger
eccc1bb8d4 tunnel: drop packet if ECN present with not-ECT
Linux tunnels were written before RFC6040 and therefore never
implemented the corner case of ECN getting set in the outer header
and the inner header not being ready for it.

Section 4.2.  Default Tunnel Egress Behaviour.
 o If the inner ECN field is Not-ECT, the decapsulator MUST NOT
      propagate any other ECN codepoint onwards.  This is because the
      inner Not-ECT marking is set by transports that rely on dropped
      packets as an indication of congestion and would not understand or
      respond to any other ECN codepoint [RFC4774].  Specifically:

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         CE, the decapsulator MUST drop the packet.

      *  If the inner ECN field is Not-ECT and the outer ECN field is
         Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the
         outgoing packet with the ECN field cleared to Not-ECT.

This patch moves the ECN decap logic out of the individual tunnels
into a common place.

It also adds logging to allow detecting broken systems that
set ECN bits incorrectly when tunneling (or an intermediate
router might be changing the header).

Overloads rx_frame_error to keep track of ECN related error.

Thanks to Chris Wright who caught this while reviewing the new VXLAN
tunnel.

This code was tested by injecting faulty logic in other end GRE
to send incorrectly encapsulated packets.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-27 18:12:37 -04:00
Daniel Borkmann
9e49e88958 filter: add XOR instruction for use with X/K
SKF_AD_ALU_XOR_X has been added a while ago, but as an 'ancillary'
operation that is invoked through a negative offset in K within BPF
load operations. Since BPF_MOD has recently been added, BPF_XOR should
also be part of the common ALU operations. Removing SKF_AD_ALU_XOR_X
might not be an option since this is exposed to user space.

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 16:49:21 -04:00
Eric Dumazet
5640f76858 net: use a per task frag allocator
We currently use a per socket order-0 page cache for tcp_sendmsg()
operations.

This page is used to build fragments for skbs.

Its done to increase probability of coalescing small write() into
single segments in skbs still in write queue (not yet sent)

But it wastes a lot of memory for applications handling many mostly
idle sockets, since each socket holds one page in sk->sk_sndmsg_page

Its also quite inefficient to build TSO 64KB packets, because we need
about 16 pages per skb on arches where PAGE_SIZE = 4096, so we hit
page allocator more than wanted.

This patch adds a per task frag allocator and uses bigger pages,
if available. An automatic fallback is done in case of memory pressure.

(up to 32768 bytes per frag, thats order-3 pages on x86)

This increases TCP stream performance by 20% on loopback device,
but also benefits on other network devices, since 8x less frags are
mapped on transmit and unmapped on tx completion. Alexander Duyck
mentioned a probable performance win on systems with IOMMU enabled.

Its possible some SG enabled hardware cant cope with bigger fragments,
but their ndo_start_xmit() should already handle this, splitting a
fragment in sub fragments, since some arches have PAGE_SIZE=65536

Successfully tested on various ethernet devices.
(ixgbe, igb, bnx2x, tg3, mellanox mlx4)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 16:31:37 -04:00
David S. Miller
2a6c8c7998 net: Remove unnecessary NULL check in scm_destroy().
All callers provide a non-NULL scm argument.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 15:52:33 -04:00
David S. Miller
ae4735166e Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo Neira Ayuso says:

====================
This patchset contains updates for your net-next tree, they are:

* Mostly fixes for the recently pushed IPv6 NAT support:

- Fix crash while removing nf_nat modules from Patrick McHardy.
- Fix unbalanced rcu_read_unlock from Ulrich Weber.
- Merge NETMAP and REDIRECT into one single xt_target module, from
  Jan Engelhardt.
- Fix Kconfig for IPv6 NAT, which allows inconsistent configurations,
  from myself.

* Updates for ipset, all of the from Jozsef Kadlecsik:

- Add the new "nomatch" option to obtain reverse set matching.
- Support for /0 CIDR in hash:net,iface set type.
- One non-critical fix for a rare crash due to pass really
  wrong configuration parameters.
- Coding style cleanups.
- Sparse fixes.
- Add set revision supported via modinfo.i

* One extension for the xt_time match, to support matching during
  the transition between two days with one single rule, from
  Florian Westphal.

* Fix maximum packet length supported by nfnetlink_queue and add
  NFQA_CAP_LEN attribute, from myself.

You can notice that this batch contains a couple of fixes that may
go to 3.6-rc but I don't consider them critical to push them:

* The ipset fix for the /0 cidr case, which is triggered with one
  inconsistent command line invocation of ipset.

* The nfnetlink_queue maximum packet length supported since it requires
  the new NFQA_CAP_LEN attribute to provide a full workaround for the
  described problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-24 15:42:04 -04:00
Pablo Neira Ayuso
6ee584be3e netfilter: nfnetlink_queue: add NFQA_CAP_LEN attribute
This patch adds the NFQA_CAP_LEN attribute that allows us to know
what is the real packet size from user-space (even if we decided
to retrieve just a few bytes from the packet instead of all of it).

Security software that inspects packets should always check for
this new attribute to make sure that it is inspecting the entire
packet.

This also helps to provide a workaround for the problem described
in: http://marc.info/?l=netfilter-devel&m=134519473212536&w=2

Original idea from Florian Westphal.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-24 15:10:29 +02:00
Pablo Neira Ayuso
7be54ca476 netfilter: nf_ct_ftp: add sequence tracking pickup facility for injected entries
This patch allows the FTP helper to pickup the sequence tracking from
the first packet seen. This is useful to fix the breakage of the first
FTP command after the failover while using conntrackd to synchronize
states.

The seq_aft_nl_num field in struct nf_ct_ftp_info has been shrinked to
16-bits (enough for what it does), so we can use the remaining 16-bits
to store the flags while using the same size for the private FTP helper
data.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-24 14:29:40 +02:00
Florian Westphal
54eb3df3a7 netfilter: xt_time: add support to ignore day transition
Currently, if you want to do something like:
"match Monday, starting 23:00, for two hours"
You need two rules, one for Mon 23:00 to 0:00 and one for Tue 0:00-1:00.

The rule: --weekdays Mo --timestart 23:00  --timestop 01:00

looks correct, but it will first match on monday from midnight to 1 a.m.
and then again for another hour from 23:00 onwards.

This permits userspace to explicitly ignore the day transition and
match for a single, continuous time period instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-24 14:29:01 +02:00
David S. Miller
c9d2ea96ca netlink: Rearrange netlink_kernel_cfg to save space on 64-bit.
Suggested by Jan Engelhardt.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-23 02:09:23 -04:00
Jozsef Kadlecsik
3e0304a583 netfilter: ipset: Support to match elements marked with "nomatch"
Exceptions can now be matched and we can branch according to the
possible cases:

a. match in the set if the element is not flagged as "nomatch"
b. match in the set if the element is flagged with "nomatch"
c. no match

i.e.

iptables ... -m set --match-set ... -j ...
iptables ... -m set --match-set ... --nomatch-entries -j ...
...

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2012-09-22 22:44:34 +02:00
Jozsef Kadlecsik
3ace95c0ac netfilter: ipset: Coding style fixes
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2012-09-22 22:44:29 +02:00
Jozsef Kadlecsik
10111a6ef3 netfilter: ipset: Include supported revisions in module description
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2012-09-22 22:44:24 +02:00
Jozsef Kadlecsik
85f8c13e41 netfilter: ipset: Rewrite cidr book keeping to handle /0
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2012-09-22 22:43:42 +02:00
Neal Cardwell
016818d076 tcp: TCP Fast Open Server - take SYNACK RTT after completing 3WHS
When taking SYNACK RTT samples for servers using TCP Fast Open, fix
the code to ensure that we only call tcp_valid_rtt_meas() after we
receive the ACK that completes the 3-way handshake.

Previously we were always taking an RTT sample in
tcp_v4_syn_recv_sock(). However, for TCP Fast Open connections
tcp_v4_conn_req_fastopen() calls tcp_v4_syn_recv_sock() at the time we
receive the SYN. So for TFO we must wait until tcp_rcv_state_process()
to take the RTT sample.

To fix this, we wait until after TFO calls tcp_v4_syn_recv_sock()
before we set the snt_synack timestamp, since tcp_synack_rtt_meas()
already ensures that we only take a SYNACK RTT sample if snt_synack is
non-zero. To be careful, we only take a snt_synack timestamp when
a SYNACK transmit or retransmit succeeds.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 15:47:10 -04:00
Neal Cardwell
623df484a7 tcp: extract code to compute SYNACK RTT
In preparation for adding another spot where we compute the SYNACK
RTT, extract this code so that it can be shared.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 15:47:10 -04:00
Richard Cochran
de46584675 ptp: clarify the clock_name sysfs attribute
There has been some confusion among PHC driver authors about the
intended purpose of the clock_name attribute. This patch expands the
documation in order to clarify how the clock_name field should be
understood.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 15:42:46 -04:00
Richard Cochran
1ef761582c ptp: link the phc device to its parent device
PTP Hardware Clock devices appear as class devices in sysfs. This patch
changes the registration API to use the parent device, clarifying the
clock's relationship to the underlying device.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-22 15:42:38 -04:00
Pablo Neira Ayuso
abb17e6c0c netlink: use <linux/export.h> instead of <linux/module.h>
Since (9f00d97 netlink: hide struct module parameter in netlink_kernel_create),
linux/netlink.h includes linux/module.h because of the use of THIS_MODULE.

Use linux/export.h instead, as suggested by Stephen Rothwell, which is
significantly smaller and defines THIS_MODULES.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-21 15:43:58 -04:00
Christoph Paasch
bb68b64724 ipv4: Don't add TCP-code in inet_sock_destruct
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Acked-by: H.K. Jerry Chu <hkchu@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 17:12:27 -04:00
Or Gerlitz
9baa0b0364 IB/ipoib: Add rtnl_link_ops support
Add rtnl_link_ops to IPoIB, with the first usage being child device
create/delete through them. Childs devices are now either legacy ones,
created/deleted through the ipoib sysfs entries, or RTNL ones.

Adding support for RTNL childs involved refactoring of ipoib_vlan_add
which is now used by both the sysfs and the link_ops code.

Also, added ndo_uninit entry to support calling unregister_netdevice_queue
from the rtnl dellink entry. This required removal of calls to
ipoib_dev_cleanup from the driver in flows which use unregister_netdevice,
since the networking core will invoke ipoib_uninit which does exactly that.

Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 16:49:17 -04:00
David S. Miller
b85c715c2e Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next
Ben Hutchings says:

====================
1. Extension to PPS/PTP to allow for PHC devices where pulses are
   subject to a variable but measurable delay.
2. PPS/PTP/PHC support for Solarflare boards with a timestamping
   peripheral.
3. MTD support for updating the timestamping peripheral on those boards.
4. Fix for potential over-length requests to firmware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-20 16:39:59 -04:00
Amerigo Wang
6b102865e7 ipv6: unify fragment thresh handling code
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 17:23:28 -04:00
Amerigo Wang
d4915c087f ipv6: make ip6_frag_nqueues() and ip6_frag_mem() static inline
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 17:23:28 -04:00
Amerigo Wang
b836c99fd6 ipv6: unify conntrack reassembly expire code with standard one
Two years ago, Shan Wei tried to fix this:
http://patchwork.ozlabs.org/patch/43905/

The problem is that RFC2460 requires an ICMP Time
Exceeded -- Fragment Reassembly Time Exceeded message should be
sent to the source of that fragment, if the defragmentation
times out.

"
   If insufficient fragments are received to complete reassembly of a
   packet within 60 seconds of the reception of the first-arriving
   fragment of that packet, reassembly of that packet must be
   abandoned and all the fragments that have been received for that
   packet must be discarded.  If the first fragment (i.e., the one
   with a Fragment Offset of zero) has been received, an ICMP Time
   Exceeded -- Fragment Reassembly Time Exceeded message should be
   sent to the source of that fragment.
"

As Herbert suggested, we could actually use the standard IPv6
reassembly code which follows RFC2460.

With this patch applied, I can see ICMP Time Exceeded sent
from the receiver when the sender sent out 3/4 fragmented
IPv6 UDP packet.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 17:23:28 -04:00
Amerigo Wang
c038a767cd ipv6: add a new namespace for nf_conntrack_reasm
As pointed by Michal, it is necessary to add a new
namespace for nf_conntrack_reasm code, this prepares
for the second patch.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: David Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel@vger.kernel.org
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 17:23:28 -04:00
Amerigo Wang
8c4c49df5c netpoll: call ->ndo_select_queue() in tx path
In netpoll tx path, we miss the chance of calling ->ndo_select_queue(),
thus could cause problems when bonding is involved.

This patch makes dev_pick_tx() extern (and rename it to netdev_pick_tx())
to let netpoll call it in netpoll_send_skb_on_dev().

Reported-by: Sylvain Munaut <s.munaut@whatever-company.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Tested-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 17:19:09 -04:00
stephen hemminger
6b6e27255f netdev: make address const in device address management
The internal functions for add/deleting addresses don't change
their argument.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-19 16:35:22 -04:00
David S. Miller
b4516a288e llc: Remove stray reference to sysctl_llc_station_ack_timeout.
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-17 13:13:24 -04:00
David S. Miller
ba01dfe182 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
This is another batch of updates intended for the 3.7 stream.

There are not a lot of large items, but iwlwifi, mwifiex, rt2x00,
ath9k, and brcmfmac all get some attention.  Wei Yongjun also provides
a series of small maintenance fixes.

This also includes a pull of the wireless tree in order to satisfy
some prerequisites for later patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-17 00:57:32 -04:00
David S. Miller
b48b63a1f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/netfilter/nfnetlink_log.c
	net/netfilter/xt_LOG.c

Rather easy conflict resolution, the 'net' tree had bug fixes to make
sure we checked if a socket is a time-wait one or not and elide the
logging code if so.

Whereas on the 'net-next' side we are calculating the UID and GID from
the creds using different interfaces due to the user namespace changes
from Eric Biederman.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-15 11:43:53 -04:00
Linus Torvalds
5b799dde31 Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
 "The last bunch of (typical) i2c-embedded driver fixes for 3.6.

  Also update the MAINTAINERS file to point to my tree since people keep
  asking where to find their patches."

* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
  i2c: algo: pca: Fix mode selection for PCA9665
  MAINTAINERS: fix tree for current i2c-embedded development
  i2c: mxs: correctly setup speed for non devicetree
  i2c: pnx: Fix read transactions of >= 2 bytes
  i2c: pnx: Fix bit definitions
2012-09-14 17:55:57 -07:00
Linus Torvalds
dd383af6aa Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "I realise this a bit bigger than I would want at this point.

  Exynos is a large chunk, I got them to half what they wanted already,
  and hey its ARM based, so not going to hurt many people.

  Radeon has only two fixes, but the PLL fixes were a bit bigger, but
  required for a lot of scenarios, the fence fix is really urgent.

  vmwgfx: I've pulled in a dumb ioctl support patch that I was going to
  shove in later and cc stable, but we need it asap, its mainly to stop
  mesa growing a really ugly dependency in userspace to run stuff on
  vmware, and if I don't stick it in the kernel now, everyone will have
  to ship ugly userspace libs to workaround it.

  nouveau: single urgent fix found in F18 testing, causes X to not start
  properly when f18 plymouth is used

  i915: smattering of fixes and debug quieting

  gma500: single regression fix

  So as I said a bit large, but its fairly well scattered and its all
  stuff I'll be shipping in F18's 3.6 kernel."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (26 commits)
  drm/nouveau: fix booting with plymouth + dumb support
  drm/radeon: make 64bit fences more robust v3
  drm/radeon: rework pll selection (v3)
  drm: Drop the NV12M and YUV420M formats
  drm/exynos: remove DRM_FORMAT_NV12M from plane module
  drm/exynos: fix double call of drm_prime_(init/destroy)_file_private
  drm/exynos: add dummy support for dmabuf-mmap
  drm/exynos: Add missing braces around sizeof in exynos_mixer.c
  drm/exynos: Add missing braces around sizeof in exynos_hdmi.c
  drm/exynos: Make g2d_pm_ops static
  drm/exynos: Add dependency for G2D in Kconfig
  drm/exynos: fixed page align bug.
  drm/exynos: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(.. [1]
  drm/exynos: Use devm_* functions in exynos_drm_g2d.c file
  drm/exynos: Use devm_kzalloc in exynos_drm_hdmi.c file
  drm/exynos: Use devm_kzalloc in exynos_drm_vidi.c file
  drm/exynos: Remove redundant check in exynos_drm_fimd.c file
  drm/exynos: Remove redundant check in exynos_hdmi.c file
  vmwgfx: add dumb ioctl support
  gma500: Fix regression on Oaktrail devices
  ...
2012-09-14 17:51:10 -07:00
Linus Torvalds
7ef6e97380 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree includes various fixes"

Ingo really needs to improve on the whole "explain git pull" part.
"Various fixes" indeed.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled
  perf/x86: Enable Intel Cedarview Atom suppport
  perf_event: Switch to internal refcount, fix race with close()
  oprofile, s390: Fix uninitialized memory access when writing to oprofilefs
  perf/x86: Fix microcode revision check for SNB-PEBS
2012-09-14 17:43:45 -07:00
Linus Torvalds
a1362d504e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Use after free and new device IDs in bluetooth from Andre Guedes,
    Yevgeniy Melnichuk, Gustavo Padovan, and Henrik Rydberg.

 2) Fix crashes with short packet lengths and VLAN in pktgen, from
    Nishank Trivedi.

 3) mISDN calls flush_work_sync() with locks held, fix from Karsten
    Keil.

 4) Packet scheduler gred parameters are reported to userspace
    improperly scaled, and WRED idling is not performed correctly.  All
    from David Ward.

 5) Fix TCP socket refcount problem in ipv6, from Julian Anastasov.

 6) ibmveth device has RX queue alignment requirements which are not
    being explicitly met resulting in sporadic failures, fix from
    Santiago Leon.

 7) Netfilter needs to take care when interpreting sockets attached to
    socket buffers, they could be time-wait minisockets.  Fix from Eric
    Dumazet.

 8) sock_edemux() has the same issue as netfilter did in #7 above, fix
    from Eric Dumazet.

 9) Avoid infinite loops in CBQ scheduler with some configurations, from
    Eric Dumazet.

10) Deal with "Reflection scan: an Off-Path Attack on TCP", from Jozsef
    Kadlecsik.

11) SCTP overcharges socket for TX packets, fix from Thomas Graf.

12) CODEL packet scheduler should not reset it's state every time it
    builds a new flow, fix from Eric Dumazet.

13) Fix memory leak in nl80211, from Wei Yongjun.

14) NETROM doesn't check skb_copy_datagram_iovec() return values, from
    Alan Cox.

15) l2tp ethernet was using sizeof(ETH_HLEN) instead of plain ETH_HLEN,
    oops.  From Eric Dumazet.

16) Fix selection of ath9k chips on which PA linearization and AM2PM
    predistoration are used, from Felix Fietkau.

17) Flow steering settings in mlx4 driver need to be validated properly,
    from Hadar Hen Zion.

18) bnx2x doesn't show the correct link duplex setting, from Yaniv
    Rosner.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  pktgen: fix crash with vlan and packet size less than 46
  bnx2x: Add missing afex code
  bnx2x: fix registers dumped
  bnx2x: correct advertisement of pause capabilities
  bnx2x: display the correct duplex value
  bnx2x: prevent timeouts when using PFC
  bnx2x: fix stats copying logic
  bnx2x: Avoid sending multiple statistics queries
  net: qmi_wwan: call subdriver with control intf only
  net_sched: gred: actually perform idling in WRED mode
  net_sched: gred: fix qave reporting via netlink
  net_sched: gred: eliminate redundant DP prio comparisons
  net_sched: gred: correct comment about qavg calculation in RIO mode
  mISDN: Fix wrong usage of flush_work_sync while holding locks
  netfilter: log: Fix log-level processing
  net-sched: sch_cbq: avoid infinite loop
  net: qmi_wwan: fix Gobi device probing for un2430
  net: fix net/core/sock.c build error
  ixp4xx_hss: fix build failure due to missing linux/module.h inclusion
  caif: move the dereference below the NULL test
  ...
2012-09-14 15:34:07 -07:00
Linus Torvalds
0462bfc88d Merge tag 'driver-core-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg Kroah-Hartman:
 "Here is one fix for 3.6-rc6 for the kobject.h file.

  It fixes a reported oops if CONFIG_HOTPLUG is disabled.  It's been in
  the linux-next tree for a while now.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'driver-core-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kobject: fix oops with "input0: bad kobj_uevent_env content in show_uevent()"
2012-09-14 14:53:22 -07:00
John W. Linville
9316f0e3c6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-09-14 13:53:49 -04:00
Eric W. Biederman
8289bab1da scsi_netlink: Remove dead and buggy code
The scsi netlink code confuses the netlink port id with a process id,
going so far as to read NETLINK_CREDS(skb)->pid instead of the correct
NETLINK_CB(skb).pid.  Fortunately it does not matter because nothing
registers to respond to scsi netlink requests.

The only interesting use of the scsi_netlink interface is
fc_host_post_vendor_event which sends a netlink multicast message.

Since nothing registers to handle scsi netlink messages kill all of the
registration logic, while retaining the same error handling behavior
preserving the userspace visible behavior and removing all of the
confused code that thought a netlink port id was a process id.

This was tested with a kernel allyesconfig build which had no problems.

Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: James Smart <James.Smart@Emulex.Com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 16:26:39 -04:00
Karsten Keil
4b921eda53 mISDN: Fix wrong usage of flush_work_sync while holding locks
It is a bad idea to hold a spinlock and call flush_work_sync.
Move the workqueue cleanup outside the spinlock and use cancel_work_sync,
on closing the channel this seems to be the more correct function.
Remove the never used and constant return value of mISDN_freebchannel.

Signed-off-by: Karsten Keil <keil@b1-systems.de>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-13 14:58:54 -04:00
Ville Syrjälä
d9dd85dd4e drm: Drop the NV12M and YUV420M formats
The NV12M/YUV420M formats are identical to the NV12/YUV420 formats.
So just remove these duplicated format names.

This might look like breaking the ABI, but the code has never actually
accepted these formats, so nothing can be using them.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-09-13 12:38:10 +09:00
Linus Torvalds
22b4e63ebe Merge tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:

 - Final (hopefully) fix for the range checking code in NFSv4 getacl.
   This should fix the Oopses being seen when the acl size is close to
   PAGE_SIZE.
 - Fix a regression with the legacy binary mount code
 - Fix a regression in the readdir cookieverf initialisation
 - Fix an RPC over UDP regression
 - Ensure that we report all errors in the NFSv4 open code
 - Ensure that fsync() reports all relevant synchronisation errors.

* tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: fsync() must exit with an error if page writeback failed
  SUNRPC: Fix a UDP transport regression
  NFS: return error from decode_getfh in decode open
  NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached
  NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl
  NFS: Fix a problem with the legacy binary mount code
  NFS: Fix the initialisation of the readdir 'cookieverf' array
2012-09-13 09:04:13 +08:00
Roland Stigge
c076ada4e4 i2c: pnx: Fix read transactions of >= 2 bytes
On transactions with n>=2 bytes, the controller actually wrongly clocks in n+1
bytes. This is caused by the (wrong) assumption that RFE in the Status Register
is 1 iff there is no byte already ordered (via a dummy TX byte). This lead to
the implementation of synchronized byte ordering, e.g.:

Dummy-TX - RX - Dummy-TX - RX - ...

But since RFE actually stays high after some Dummy-TX, it rather looks like:

Dummy-TX - Dummy-TX - RX - Dummy-TX - RX - (RX)

The last RX byte is clocked in by the bus controller, but ignored by the kernel
when filling the userspace buffer.

This patch fixes the issue by asking for RX via Dummy-TX asynchronously.
Introducing a separate counter for TX bytes.

Signed-off-by: Roland Stigge <stigge@antcom.de>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
2012-09-12 17:52:44 +02:00
Duan Jiong
6d57e9078e etherdevice: introduce help function eth_zero_addr()
a lot of code has either the memset or an inefficient copy
from a static array that contains the all-zeros Ethernet address.
Introduce help function eth_zero_addr() to fill an address with
all zeros, making the code clearer and allowing us to get rid of
some constant arrays.

Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 15:49:54 -04:00
Eric Dumazet
b6069a9570 filter: add MOD operation
Add a new ALU opcode, to compute a modulus.

Commit ffe06c17af used an ancillary to implement XOR_X,
but here we reserve one of the available ALU opcode to implement both
MOD_X and MOD_K

Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: George Bakos <gbakos@alpinista.org>
Cc: Jay Schulist <jschlst@samba.org>
Cc: Jiri Pirko <jpirko@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 15:44:56 -04:00
Eric W. Biederman
15e473046c netlink: Rename pid to portid to avoid confusion
It is a frequent mistake to confuse the netlink port identifier with a
process identifier.  Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.

I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.

I have successfully built an allyesconfig kernel with this change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-10 15:30:41 -04:00
Pablo Neira Ayuso
9f00d9776b netlink: hide struct module parameter in netlink_kernel_create
This patch defines netlink_kernel_create as a wrapper function of
__netlink_kernel_create to hide the struct module *me parameter
(which seems to be THIS_MODULE in all existing netlink subsystems).

Suggested by David S. Miller.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08 18:46:30 -04:00
Pablo Neira Ayuso
9785e10aed netlink: kill netlink_set_nonroot
Replace netlink_set_nonroot by one new field `flags' in
struct netlink_kernel_cfg that is passed to netlink_kernel_create.

This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since
now the flags field in nl_table is generic (so we can add more
flags if needed in the future).

Also adjust all callers in the net-next tree to use these flags
instead of netlink_set_nonroot.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-08 18:45:27 -04:00
Ben Hutchings
220a60a425 pps/ptp: Allow PHC devices to adjust PPS events for known delay
Initial version by Stuart Hodgson <smhodgson@solarflare.com>

Some PHC device drivers may deliver PPS events with a significant
and variable delay, but still be able to measure precisely what
that delay is.

Add a pps_sub_ts() function for subtracting a delay from the
timestamp(s) in a PPS event, and a PTP event type (PTP_CLOCK_PPSUSR)
for which the caller provides a complete PPS event.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-09-07 21:13:28 +01:00
John W. Linville
fac805f8c1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2012-09-07 15:07:55 -04:00