For performance reasons we are about to change ISYNC_ON_SMP to sometimes be
lwsync. Now that the macro name doesn't make sense, change it and LWSYNC_ON_SMP
to better explain what the barriers are doing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now we have real bit locks use them instead of open coding it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements the lwarx/ldarx hint bit for bit locks.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Recent versions of the PowerPC architecture added a hint bit to the larx
instructions to differentiate between an atomic operation and a lock operation:
> 0 Other programs might attempt to modify the word in storage addressed by EA
> even if the subsequent Store Conditional succeeds.
>
> 1 Other programs will not attempt to modify the word in storage addressed by
> EA until the program that has acquired the lock performs a subsequent store
> releasing the lock.
To avoid a binutils dependency this patch create macros for the extended lwarx
format and uses it in the spinlock code. To test this change I used a simple
test case that acquires and releases a global pthread mutex:
pthread_mutex_lock(&mutex);
pthread_mutex_unlock(&mutex);
On a 32 core POWER6, running 32 test threads we spend almost all our time in
the futex spinlock code:
94.37% perf [kernel] [k] ._raw_spin_lock
|
|--99.95%-- ._raw_spin_lock
| |
| |--63.29%-- .futex_wake
| |
| |--36.64%-- .futex_wait_setup
Which is a good test for this patch. The results (in lock/unlock operations per
second) are:
before: 1538203 ops/sec
after: 2189219 ops/sec
An improvement of 42%
A 32 core POWER7 improves even more:
before: 1279529 ops/sec
after: 2282076 ops/sec
An improvement of 78%
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I often get asked if BAD interrupts are really bad. On some boxes (eg
IBM machines running a hypervisor) there are valid cases where are
presented with an interrupt that is not for us. These cases are common
enough to show up as thousands of BAD interrupts a day.
Tone them down by calling them spurious. Since they can be a significant cause
of OS jitter, we may as well log them per cpu so we know where they are
occurring.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With NO_HZ it is useful to know how often the decrementer is going off. The
patch below adds an entry for it and also adds it into the /proc/stat
summaries.
While here, I added performance monitoring and machine check exceptions.
I found it useful to keep an eye on the PMU exception rate
when using the perf tool. Since it's possible to take a completely
handled machine check on a System p box it also sounds like a good idea to
keep a machine check summary.
The event naming matches x86 to keep gratuitous differences to a minimum.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now we use printf style alignment there is no need to manually space
these fields.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On a large machine I noticed the columns of /proc/interrupts failed to line up
with the header after CPU9. At sufficiently large numbers of CPUs it becomes
impossible to line up the CPU number with the counts.
While fixing this I noticed x86 has a number of updates that we may as well
pull in. On PowerPC we currently omit an interrupt completely if there is no
active handler, whereas on x86 it is printed if there is a non zero count.
The x86 code also spaces the first column correctly based on nr_irqs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Right now we allocate a cacheline sized NR_CPUS array for xics IPI
communication. Use DECLARE_PER_CPU_SHARED_ALIGNED to put it in percpu
data in its own cacheline since it is written to by other cpus.
On a kernel with NR_CPUS=1024, this saves quite a lot of memory:
text data bss dec hex filename
8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat
8767555 2813444 1505724 13086723 c7b003 vmlinux.xics
A saving of around 128kB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
PowerPC is currently using asm-generic/hardirq.h which statically allocates an
NR_CPUS irq_stat array. Switch to an arch specific implementation which uses
per cpu data:
On a kernel with NR_CPUS=1024, this saves quite a lot of memory:
text data bss dec hex filename
8767938 2944132 1636796 13348866 cbb002 vmlinux.baseline
8767779 2944260 1505724 13217763 c9afe3 vmlinux.irq_cpustat
A saving of around 128kB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During a EEH recover, the pci_dev structure can be null, mainly if an
eeh event is detected during cpi config operation. In this case, the
pci_dev will not be known (and will be null) the kernel will crash
with the following message:
Unable to handle kernel paging request for data at address 0x000000a0
Faulting instruction address: 0xc00000000006b8b4
Oops: Kernel access of bad area, sig: 11 [#1]
NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0
LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
Call Trace:
[c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
[c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70
The bug occurs because pci_name() tries to access a null pointer.
This patch just guarantee that pci_name() is not called on Null pointers.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
DMA ops requires that coherent_dma_mask be set properly for a device,
but this was not being done for devices on the MV64x60 that use DMA.
Both the serial and ethernet devices need this or they won't be able
to allocate memory.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: gadget: fix EEM gadget CRC usage
USB: otg Kconfig: let USB_OTG_UTILS select USB_ULPI option
USB: g_multi: fix CONFIG_USB_G_MULTI_RNDIS usage
kfifo: Don't use integer as NULL pointer
USB: FHCI: Fix build after kfifo rework
kfifo: Make kfifo_initialized work after kfifo_free
USB: serial: add usbid for dell wwan card to sierra.c
USB: SIS USB2VGA DRIVER: support KAIREN's USB VGA adaptor USB20SVGA-MB-PLUS
USB: ehci: phy low power mode bug fixing
USB: s3c-hsotg: Export usb_gadget_register_driver()
USB: r8a66597-udc: Prototype IS_ERR() and PTR_ERR()
USB: ftdi_sio: add device IDs (several ELV, one Mindstorms NXT)
USB: storage: Remove unneeded SC/PR from unusual_devs.h
USB: ftdi_sio: new device id for papouch AD4USB
USB: usbfs: properly clean up the as structure on error paths
USB: usbfs: only copy the actual data received
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
class: Free the class private data in class_release
sysfs: sysfs_sd_setattr set iattrs unconditionally
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
be2net: set proper value to version field in req hdr
xfrm: Fix xfrm_state_clone leak
ipcomp: Avoid duplicate calls to ipcomp_destroy
ethtool: allow non-admin user to read GRO settings.
ixgbe: fix WOL register setup for 82599
ixgbe: Fix - Do not allow Rx FC on 82598 at 1G due to errata
sfc: Fix SFE4002 initialisation
mac80211: fix handling of null-rate control in rate_control_get_rate
inet: Remove bogus IGMPv3 report handling
iwlwifi: fix AMSDU Rx after paged Rx patch
tcp: fix ICMP-RTO war
via-velocity: Fix races on shared interrupts
via-velocity: Take spinlock on set coalesce
via-velocity: Remove unused IRQ status parameter from rx_srv and tx_srv
rtl8187: Add new device ID
iwmc3200wifi: Test of wrong pointer after kzalloc in iwm_mlme_update_bss_table()
ath9k: Fix sequence numbers for PAE frames
mac80211: fix deferred hardware scan requests
iwlwifi: Fix to set correct ht configuration
mac80211: Fix probe request filtering in IBSS mode
...
Despite all its bugs, the middleware support of our CAPI stack was
already in use for many, many moons. And after going through its code,
fixing all issues I found, I feel it deserves to officially become a
non-experimental feature.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
With dynamic TTY nodes and the help of udev, we no longer need this
special filesystem. Schedule it for removal in one year from now.
As a last duty to this feature, move its help to right option so that
users can read the rationale.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This strange special rule to fall back to controller 1 cannot be derived
from the CAPI specs and looks a lot like it was once dedicated to some
out-of-tree driver, probably AVM's broken fcdsl2 (FRITZ!Card DSL v2.0).
I found no in-tree user that needs this check, and I'm now taking care
of the fcdsl2. So drop these bits from our stack.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We did not evaluate handle_minor_send's return value, just (void)'ed it
away. Time for a cleanup.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need for irqsave acquisition of acklock, bh-safe is sufficient.
Moverover, move kfree out of the lock and do not take acklock at all
in capiminor_del_all_ack as we are the last user of the list here.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce outlock as a spin lock that protects capiminor's outqueue,
outbytes and outskb (formerly known as ttyskb). outlock can be acquired
from soft-IRQ context via capinc_write, so make it bh-safe.
This finally removes the last reason for keeping the workaround lock
around (which was incomplete and partly broken anyway). And as we no
longer call handle_recv_skb in atomic context, gen_data_b3_resp_for can
use non-atomic allocation now.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The inbytes counter was only updated but never read.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The capiminor members datahandle and msgid are incremented outside any
lock, so better do this atomically.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This struct is describing a queue entry, not the queue itself.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid re-queuing skbs unless the error detected in handle_recv_skb is
expected to be recoverable such as lacking memory, a full CAPI queue, a
full TTY input buffer, or a not yet existing TTY.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sending a message down the CAPI stack may trigger the reception of an
answer, but this will go through capi_recv_message and call
handle_minor_recv from there. There is no need to walk the receive queue
on capinc_tty_write.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not needed, tty->count keeps track of this information. At this chance,
drop traces of ancient attempts to debug this logic via _DEBUG_REFCOUNT.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use a plain spin lock for capiminors_lock, drop inconsistent irqsafe
acquisitions (it's only used in process context anyway).
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nccip in capiminor used to serve as an indicator that the NCCI was
close. But we don't need this, we issue a hangup on capincci_free_minor.
So drop this legacy.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
capincci_free and, thus, capincci_free_minor runs in process context, so
we can issue the hangup of the associated TTY synchronously.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
tty_struct's driver_data cannot be NULL, no need to test for it.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the reference management features of tty_port to look up and drop
again the tty_struct associated with a capiminor.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Properly associate/disassociate a capiminor object with its TTY via the
install/cleanup handlers instead of trying to guess first open and last
close.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Install a reference counter for capiminor objects. Acquire it when
obtaining a capiminor from the array during capinc_tty_open, drop it
when closing the tty again. Another reference is held for the hook-up
with capincci.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to allocate a fixed major for this TTY, both capifs and udev
make this transparent to the user.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register capiminors dynamically with the TTY core so that udev can make
them show up as the NCCIs appear or disappear. This removes the need to
check if the capiminor requested in capinc_tty_open actually exists.
And this completely obsoletes capifs which will be scheduled for removal
in a later patch.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return proper error code if tty_register_driver fails. In contrast,
tty_unregister_driver cannot practically fail, so drop that error
handling. Finally, mark capinc_tty_init/exit with __init/__exit.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using a plain array of pointers simplifies the management of capiminors.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace open-coded NCCI list management with standard mechanisms.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
capi_read still used interruptible_sleep_on, risking to miss a wakeup
this way. Convert it to wait_event_interruptible.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both capincci_alloc and capiminor_alloc run in non-atomic context,
update their memory allocations accordingly.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename 'ncci_list_mtx' to 'lock', expressing that it now protects a
larger set of capidev members: the NCCI list, ap.applid (ie. the
registration of the application), and modifications of userflags.
We do not need to protect each and every check for ap.applid because,
once an application is registered, it will stay for the whole lifetime
of the device.
Also, there is no need to apply the capidev mutex during release (if
there could be concurrent users, we would crash them anyway by freeing
the device at the end of capi_release).
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fold capidev_alloc and capidev_free into capi_open and capi_release -
there are no other users. Someone pushed a lock_kernel into capi_open.
Drop it, we don't need it. Also remove the useless test from open that
checks for private_data == NULL.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need for anything "harder" here (specifically no need for
irqsave...). Also, make the list removal the first operation of
capidev_free to avoid dumping half-released devices via /proc.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the code a bit more readable be providing stub functions for the
!CONFIG_ISDN_CAPI_MIDDLEWARE case. Though a few lines are moved around,
this comes with no functional changes.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the application rw-lock in favour of RCU. This synchronizes
capi20_release against capi_ctr_handle_message which may dereference an
application from (soft-)IRQ context. Any other access to the application
list is now protected by the capi_controller_lock as well. This also
allows to safely inspect applications for /proc dumping by holding
capi_controller_lock.
At this chance, drop some useless release_in_progress checks where we
obtained the application pointer from the list (which becomes NULL on
release_in_progress).
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch applies the mutex so far only protecting the controller list
to (almost) all accesses of controller data structures. It also reworks
waiting on state changes in old_capi_manufacturer so that it no longer
poll and holds a module reference to the controller owner while waiting
(the latter was partly done already). Modification and checking of the
blocked state remains racy by design, the caller is responsible for
dealing with this.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Another step towards proper locking: Rework the callback provided to
capidrv for controller state changes. This is so far attached to an
application, which would require us to hold the corresponding lock
across notification calls.
But there is no direct relation between a controller up/down event and
an application, so let's decouple them and provide a notifier call chain
for those events instead. This notifier chain is first of all used
internally. Here we request the highest priority to unsure that
housekeeping work is done before any other notifications. The chain is
exported via [un]register_capictr_notifier to our only user, capidrv, to
replace the racy and unfixable capi20_set_callback.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This step prepares the application of proper controller locking: Push
all state changing work into the notify handler that are called by
capi_ctr_ready and capi_ctr_down, switch detach_capi_ctr to issue a
synchronous ctr_down. Also ensure that we do not go through any action
if the state did not change.
Signed-off-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>