Commit Graph

157407 Commits

Author SHA1 Message Date
Geert Uytterhoeven
80947e7c99 powerpc: Keep track of emulated instructions
If CONFIG_PPC_EMULATED_STATS is enabled, make available counters for the
various classes of emulated instructions under
/sys/kernel/debug/powerpc/emulated_instructions/ (assumed debugfs is mounted on
/sys/kernel/debug).  Optionally (controlled by
/sys/kernel/debug/powerpc/emulated_instructions/do_warn), rate-limited warnings
can be printed to the console when instructions are emulated.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:26 +10:00
Vinay Sridhar
f312deb4cd powerpc/xmon: Add dl command to dump contents of __log_buf
Hello All,

Quite a while back Michael Ellerman had posted a patch to add support to xmon to print the contents of the console log pointed to by __log_buf.
Here's the link to that patch - http://ozlabs.org/pipermail/linuxppc64-dev/2005-March/003657.html
I've ported the patch in the above link to 2.6.30-rc5 and have tested it.

Thanks & regards,
Vinay

Signed-off-by: Michael Ellerman <michael at ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:25 +10:00
Becky Bruce
2138422bba powerpc: Use sg->dma_length in sg_dma_len() macro on 32-bit
Currently, the 32-bit code uses sg->length instead of sg->dma_lentgh
to report sg_dma_len.  However, since the default dma code for 32-bit
(the dma_direct case) sets dma_length and length to the same thing,
we should be able to use dma_length there as well.  This gets rid of
some 32-vs-64-bit ifdefs, and is needed by the swiotlb code which
actually distinguishes between dma_length and length.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:25 +10:00
Jan Blunck
45db924089 powerpc/spufs: Remove double check for non-negative dentry
This patch removes an unnecessary double check if the dentry returned by
lookup_create() is actually non-negative. Since lookup_create() itself returns
an error in this case just remove the check.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:25 +10:00
Anton Blanchard
8d165db107 powerpc: Improve decrementer accuracy
I have been looking at sources of OS jitter and notice that after a long
NO_HZ idle period we wakeup too early:

relative time (us)    event
                      timer irq exit
    999946.405        timer irq entry
         4.835        timer irq exit
        21.685        timer irq entry
         3.540          timer (tick_sched_timer) entry

Here we slept for just under a second then took a timer interrupt that did
nothing. 21.685 us later we wake up again and do the work.

We set a rather low shift value of 16 for the decrementer clockevent, which I
think is causing this issue. On this box we have a 207MHz decrementer and see:

clockevent: decrementer mult[3501] shift[16] cpu[0]

For calculations of large intervals this mult/shift combination could be
off by a significant amount. I notice the sparc code has a loop that iterates
to find a mult/shift combination that maximises the shift value while
keeping mult under 32bit. With the patch below we get:

clockevent: decrementer mult[35015c20] shift[32] cpu[15]

And we no longer see the spurious wakeups.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
9aa4e7b169 powerpc/pci: Remove redundant pcnet32 fixup
The pcnet32 driver has had in its device id table for some time a match
against the "broken" vendor ID.  No need to fake out the vendor ID
with an explicit fixup.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
cf6692c07a powerpc/pci: Cleanup some minor cruft
* Removed building setup-irq on ppc32, we don't use it anymore
* Remove duplicate prototype for setup_grackle() code that needs it
  gets it from <asm/grackle.h>
* Removed gratuitous pci_io_size type differences between ppc32/ppc64

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
2eb4afb69f powerpc/pci: Move pseries code into pseries platform specific area
There doesn't appear to be any specific reason that we need to setup the
pseries specific notifier in generic arch pci code.  Move it into pseries
land.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
58513dc40d powerpc/pci: Clean up direct access to sysdata by celleb platforms
We shouldn't directly access sysdata to get the device node to just
go get the pci_controller.  We can call pci_bus_to_host() for this
purpose.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
2f52297665 powerpc/pci: Clean up direct access to sysdata by RTAS
We shouldn't directly access sysdata to get the device node but call
pci_bus_to_OF_node() for this purpose.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Kumar Gala
95272262aa powerpc/pci: Clean up direct access to sysdata by powermac platforms
We shouldn't directly access sysdata to get the device node but call
pci_bus_to_OF_node() for this purpose.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Kumar Gala
bccd6f73f0 powerpc/pci: Clean up direct access to sysdata on tsi108 platforms
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Kumar Gala
8456993ead powerpc/pci: Clean up direct access to sysdata by CHRP platforms
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Kumar Gala
f159edaef3 powerpc/pci: Clean up direct access to sysdata by 4xx platforms
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Kumar Gala
5b21fb8e76 powerpc/pci: Clean up direct access to sysdata by 52xx platforms
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:22 +10:00
Kumar Gala
8206a110cb powerpc/pci: Clean up direct access to sysdata by FSL platforms
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:22 +10:00
Kumar Gala
19afa40797 powerpc/pci: Clean up direct access to sysdata by indirect ops
We shouldn't directly access sysdata to get the pci_controller.  Instead
use pci_bus_to_host() for this purpose.  In the future we might have
sysdata be a device_node to match ppc64 and unify the code between ppc32
& ppc64.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:22 +10:00
Milton Miller
60dbf43851 powerpc: Add 2.06 tlbie mnemonics
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards
compatibilty for CPUs before 2.06.

Only useful for bare metal systems.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Milton Miller
af20aeb1a3 powerpc: Enable MMU feature sections for inline asm
powerpc: Enable MMU feature sections for inline asm

This adds the ability to do MMU feature sections for inline asm.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Michael Neuling
dfb432cb96 powerpc: Move VSX load/stores into ppc-opcode.h
Cleans up the VSX load/store instructions by moving them into
ppc-opcode.h.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Michael Neuling
da6b43c833 powerpc: Cleanup macros in ppc-opcode.h
Make macros more braces happy.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Kumar Gala
d89ebca224 powerpc: Fix up elf_read_implies_exec() usage
We believe if a toolchain supports PT_GNU_STACK that it sets the proper
PHDR permissions.  Therefor elf_read_implies_exec() should only be true
if we don't see PT_GNU_STACK set.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Michael Ellerman
8e27f4dab3 powerpc/irq: We don't need __do_IRQ() anymore
So select GENERIC_HARDIRQS_NO__DO_IRQ to disable it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:20 +10:00
Michael Ellerman
f11f76d4b8 powerpc/powermac: Use generic_handle_irq() in gatwick_action()
Don't call __do_IRQ() directly in gatwick_action(), use
generic_handle_irq().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:20 +10:00
Michael Ellerman
835363e67d powerpc/irq: Remove fallback to __do_IRQ()
We should no longer have any irq code that needs __do_IRQ(), so
remove the fallback to __do_IRQ().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:20 +10:00
Michael Ellerman
9b647a30cb powerpc/irq: Move get_irq() comment into header
The guts of do_IRQ() isn't really the right place to be documenting
the ppc_md.get_irq() interface. So move the comment into machdep.h

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:59 +10:00
Michael Ellerman
d7cb10d6d2 powerpc/irq: Move stack overflow check into a separate function
Makes do_IRQ() shorter and clearer.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Michael Ellerman
f2694ba568 powerpc/irq: Move #ifdef'ed body of do_IRQ() into a separate function
Rather than a giant ifdef in the body of do_IRQ(), including a
dangling else, move the irq stack logic into a separate routine and
do the ifdef there.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Geoff Levand
fb94fc2b89 powerpc/ps3: Use smp_request_message_ipi
ps3 has 4 ipis per cpu and can use the new smp_request_message_ipi to
reduce path length when receiving an ipi.

This has the side effect of setting IRQF_PERCPU.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Robert Jennings
14f966e794 powerpc/pseries: CMO unused page hinting
Adds support for the "unused" page hint which can be used in shared
memory partitions to flag pages not in use, which will then be stolen
before active pages by the hypervisor when memory needs to be moved to
LPARs in need of additional memory.  Failure to mark pages as 'unused'
makes the LPAR slower to give up unused memory to other partitions.

This adds the kernel parameter 'cmo_free_hint' to disable this
functionality.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Tony Breeds
d69a78d7da powerpc/mpic: Cleanup mpic_find() implementation
mpic_find() was overloaded to do two things, finding the mpic instance
for a given interrupt and returning if it's an IPI. Instead we introduce
mpic_is_ipi() and simplify mpic_find() to just return the mpic instance

Also silences the warning:
arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_set_priority':
arch/powerpc/sysdev/mpic.c:1382: warning: 'is_ipi' may be used uninitialized in this function

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:42:56 +10:00
Sean MacLennan
805e324b7f powerpc: Update Warp to use leds-gpio driver
Now that leds-gpio is a proper OF platform driver, the Warp can use
the leds-gpio driver rather than the old out-of-kernel driver.

One side-effect is the leds-gpio driver always turns the leds off
while the old driver left them alone. So we have to set them back to
the correct settings.

Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:42:56 +10:00
Divy Le Ray
703cebabd1 cxgb: set phy's mdio dev before the phy init sequence
mdio's dev field needs to be set before mdio ops occur.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 20:51:59 -07:00
Divy Le Ray
86c890ab1b cxgb3: set phy's mdio dev before the phy init sequence
mdio's dev field needs to be set before mdio ops occur.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 20:51:58 -07:00
Ben Hutchings
64318334bf cxgb3: Use generic XENPAK LASI register definitions
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 20:51:58 -07:00
Ben Hutchings
aa22437e87 chelsio: Use generic XENPAK LASI register definitions
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 20:51:56 -07:00
Thomas Reitmayr
385aa9e701 [ARM] Kirkwood: Correct MPP for SATA activity/presence LEDs of QNAP TS-119/TS-219.
For the QNAP TS-119 and TS-219 the wrong MPPs were used for the SATA
activity/presence LEDs. The new settings make these LEDs work as
expected.

Signed-off-by: Thomas Reitmayr <treitmayr@devbase.at>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
2009-05-20 22:17:32 -04:00
Jean-Mickael Guerin
4f72427998 IPv6: set RTPROT_KERNEL to initial route
The use of unspecified protocol in IPv6 initial route prevents quagga to
install IPv6 default route:
# show ipv6 route
S   ::/0 [1/0] via fe80::1, eth1_0
K>* ::/0 is directly connected, lo, rej
C>* ::1/128 is directly connected, lo
C>* fe80::/64 is directly connected, eth1_0

# ip -6 route
fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
hoplimit -1
ff00::/8 dev eth1_0  metric 256  mtu 1500 advmss 1440 hoplimit -1
unreachable default dev lo  proto none  metric -1  error -101 hoplimit 255

The attached patch ensures RTPROT_KERNEL to the default initial route
and fixes the problem for quagga.
This is similar to "ipv6: protocol for address routes"
f410a1fba7.

# show ipv6 route
S>* ::/0 [1/0] via fe80::1, eth1_0
C>* ::1/128 is directly connected, lo
C>* fe80::/64 is directly connected, eth1_0

# ip -6 route
fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
hoplimit -1
fe80::/64 dev eth1_0  proto kernel  metric 256  mtu 1500 advmss 1440
hoplimit -1
ff00::/8 dev eth1_0  metric 256  mtu 1500 advmss 1440 hoplimit -1
default via fe80::1 dev eth1_0  proto zebra  metric 1024  mtu 1500
advmss 1440 hoplimit -1
unreachable default dev lo  proto kernel  metric -1  error -101 hoplimit 255

Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:38:59 -07:00
Klaus-Dieter Wacket
9f29f6de56 qeth: Clear SBALF15 in any case for output buffers.
Function qeth_clear_output_buffer for HiperSockets may not clear
all 16 SBALEs, but only the used ones. The error flag in SBALF15
has to be cleared in any case.

Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:36:51 -07:00
Ursula Braun
b91398008c qeth: omit upstream checksumming for HiperSockets
For HiperSocket devices receive-path checksumming is not required.
Thus NO_CHECKSUMMING is used as default for HiperSocket interfaces.
For layer3 devices configured with NO_CHECKSUMMING received skbs
should have set their ip_summed field to CHECKSUM_UNNECESSARY.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:36:51 -07:00
Ursula Braun
0666eb06ab qeth: support z/VM VSWITCH Port Isolation
z/VM Virtual Switch Port Isolation allows guests on a VLAN UNAWARE
virtual switch to be isolated from other guests on the VSWITCH.
(See z/VM Apars VM64281 and VM64463).
The Linux qeth driver is affected, because it has to handle new
error codes introduced with the z/VM VSWITCH Port Isolation support.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:36:50 -07:00
Ursula Braun
8ac6d45228 ctcm: avoid crash in ctcm_remove_device
Channels are already removed when setting a ctcm-device offline.
Thus ctcm_remove_device must not refer to channel information.
Solution: delete channel information from the trace call in
ctcm_remove_device.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:36:49 -07:00
Ursula Braun
f214856540 qeth: avoid crash after detach of replugged device
If a qeth device is plugged off, setting the device online stops in
state HARDSETUP and a failure is reported to the base cio-layer
causing halt/clear to be invoked. Replugging the device again triggers
a qeth recovery without notification of the cio-layer. If a device
is ungrouped in this state, the qeth set_offline function is not
invoked, because the corresponding ccwgroup device is not in state
ONLINE. Then incoming traffic is still handled by the qdio layer
resulting in a crash in qeth_l<x>_qdio_input_handler, because (part
of) the qeth data structures for this device are already removed.
Solution: After replugging the device qeth recovery should lead to a
working net device. Thus a "LAN offline" result when setting a qeth
device online must not report a failure to the base cio-layer.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:36:49 -07:00
David S. Miller
86c2fe1e3a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-05-20 17:31:25 -07:00
Rami Rosen
04af8cf6f3 net: Remove unused parameter from fill method in fib_rules_ops.
The netlink message header (struct nlmsghdr) is an unused parameter in
fill method of fib_rules_ops struct.  This patch removes this
parameter from this method and fixes the places where this method is
called.

(include/net/fib_rules.h)

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:26:23 -07:00
Eric Dumazet
1ddbcb005c net: fix rtable leak in net/ipv4/route.c
Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
analysis found in http://bugzilla.kernel.org/show_bug.cgi?id=13339
Quoted here because its a perfect one :

begin_of_quotation
 2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
 patch has at least one critical flaw, and another problem.

 rt_intern_hash calculates rthi pointer, which is later used for new entry
 insertion. The same loop calculates cand pointer which is used to clean the
 list. If the pointers are the same, rtable leak occurs, as first the cand is
 removed then the new entry is appended to it.

 This leak leads to unregister_netdevice problem (usage count > 0).

 Another problem of the patch is that it tries to insert the entries in certain
 order, to facilitate counting of entries distinct by all but QoS parameters.
 Unfortunately, referencing an existing rtable entry moves it to list beginning,
 to speed up further lookups, so the carefully built order is destroyed.

 For the first problem the simplest patch it to set rthi=0 when rthi==cand, but
 it will also destroy the ordering.
end_of_quotation

Problematic commit is 1080d709fb
(net: implement emergency route cache rebulds when gc_elasticity is exceeded)

Trying to keep dst_entries ordered is too complex and breaks the fact that
order should depend on the frequency of use for garbage collection.

A possible fix is to make rt_intern_hash() simpler, and only makes
rt_check_expire() a litle bit smarter, being able to cope with an arbitrary
entries order. The added loop is running on cache hot data, while cpu
is prefetching next object, so should be unnoticied.

Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:18:02 -07:00
Eric Dumazet
cf8da764fc net: fix length computation in rt_check_expire()
rt_check_expire() computes average and standard deviation of chain lengths,
but not correclty reset length to 0 at beginning of each chain.
This probably gives overflows for sum2 (and sum) on loaded machines instead
of meaningful results.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-20 17:18:01 -07:00
Linus Torvalds
ecca1c5e3a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI PM: Fix initialization and kexec breakage for some devices
2009-05-20 16:44:37 -07:00
Linus Torvalds
5805977e63 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-2.6:
  drm: Copy back ioctl data to userspace regardless of return code.
  drm: Round size of SHM maps to PAGE_SIZE
2009-05-20 16:40:24 -07:00
Linus Torvalds
a9523f4526 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: 64-bit: Fix system lockup.
  MIPS: IP28: Change to build with -mr10k-cache-barrier=store
  MIPS: IP22: Fix hang in power button interrupt handler
  MIPS: IP32: Fix hang on shutdown in power button interrupt handler.
2009-05-20 16:32:19 -07:00