Both avg_overlap and avg_wakeup had an inherent problem in that their accuracy
was detrimentally affected by cross-cpu wakeups, this because we are missing
the necessary call to update_curr(). This can't be fixed without increasing
overhead in our already too fat fastpath.
Additionally, with recent load balancing changes making us prefer to place tasks
in an idle cache domain (which is good for compute bound loads), communicating
tasks suffer when a sync wakeup, which would enable affine placement, is turned
into a non-sync wakeup by SYNC_LESS. With one task on the runqueue, wake_affine()
rejects the affine wakeup request, leaving the unfortunate where placed, taking
frequent cache misses.
Remove it, and recover some fastpath cycles.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268301121.6785.30.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Testing the load which led to this heuristic (nfs4 kbuild) shows that it has
outlived it's usefullness. With intervening load balancing changes, I cannot
see any difference with/without, so recover there fastpath cycles.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268301062.6785.29.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Entering nohz code on every micro-idle is costing ~10% throughput for netperf
TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises
ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128.
The higher the context switch rate, the more nohz entry costs. With this patch
and some cycle recovery patches in my tree, max cross cpu context switch rate is
improved by ~16%, a large portion of which of which is this ratelimiting.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268301003.6785.28.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Put all statistic fields of sched_entity in one struct, sched_statistics,
and embed it into sched_entity.
This change allows to memset the sched_statistics to 0 when needed (for
instance when forking), avoiding bugs of non initialized fields.
Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268275065-18542-1-git-send-email-lucas.de.marchi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix:
arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'
Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add wakeup support to the ads7846 driver. Platforms can enable wakeup
capability by setting the wakeup flag in ads7846_platform_data. With this
patch the ads7846 driver can be used to wake the system from suspend.
Signed-off-by: Ranjith Lohithakshan <ranjithl@ti.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Commit a43912ab19... ("tunnel: eliminate recursion field") eliminated
use of recursion field from tunnel structures, but its definition
still exists in ip6_tnl{}.
Let's remove that unused field.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are taking a wrong regs snapshot when a trace event triggers.
Either we use get_irq_regs(), which gives us the interrupted
registers if we are in an interrupt, or we use task_pt_regs()
which gives us the state before we entered the kernel, assuming
we are lucky enough to be no kernel thread, in which case
task_pt_regs() returns the initial set of regs when the kernel
thread was started.
What we want is different. We need a hot snapshot of the regs,
so that we can get the instruction pointer to record in the
sample, the frame pointer for the callchain, and some other
things.
Let's use the new perf_fetch_caller_regs() for that.
Comparison with perf record -e lock: -R -a -f -g
Before:
perf [kernel] [k] __do_softirq
|
--- __do_softirq
|
|--55.16%-- __open
|
--44.84%-- __write_nocancel
After:
perf [kernel] [k] perf_tp_event
|
--- perf_tp_event
|
|--41.07%-- lock_acquire
| |
| |--39.36%-- _raw_spin_lock
| | |
| | |--7.81%-- hrtimer_interrupt
| | | smp_apic_timer_interrupt
| | | apic_timer_interrupt
The old case was producing unreliable callchains. Now having
right frame and instruction pointers, we have the trace we
want.
Also syscalls and kprobe events already have the right regs,
let's use them instead of wasting a retrieval.
v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>
Events that trigger overflows by interrupting a context can
use get_irq_regs() or task_pt_regs() to retrieve the state
when the event triggered. But this is not the case for some
other class of events like trace events as tracepoints are
executed in the same context than the code that triggered
the event.
It means we need a different api to capture the regs there,
namely we need a hot snapshot to get the most important
informations for perf: the instruction pointer to get the
event origin, the frame pointer for the callchain, the code
segment for user_mode() tests (we always use __KERNEL_CS as
trace events always occur from the kernel) and the eflags
for further purposes.
v2: rename perf_save_regs to perf_fetch_caller_regs as per
Masami's suggestion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Archs <linux-arch@vger.kernel.org>
Use the LBR to fix up the PEBS IP+1 issue.
As said, PEBS reports the next instruction, here we use the LBR to find
the last branch and from that construct the actual IP. If the IP matches
the LBR-TO, we use LBR-FROM, otherwise we use the LBR-TO address as the
beginning of the last basic block and decode forward.
Once we find a match to the current IP, we use the previous location.
This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which
conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction
that caused the event (barring CPU errata).
The fixup can fail due to various reasons:
1) LBR contains invalid data (quite possible)
2) part of the basic block got paged out
3) the reported IP isn't part of the basic block (see 1)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.619375431@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement simple suport Intel Last-Branch-Record, it supports all
hardware that implements FREEZE_LBRS_ON_PMI, but does not (yet) implement
the LBR config register.
The Intel LBR is a FIFO of From,To addresses describing the last few
branches the hardware took.
This patch does not add perf interface to the LBR, but merely provides an
interface for internal use.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.544191154@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements support for Intel Precise Event Based Sampling,
which is an alternative counter mode in which the counter triggers a
hardware assist to collect information on events. The hardware assist
takes a trap like snapshot of a subset of the machine registers.
This data is written to the Intel Debug-Store, which can be programmed
with a data threshold at which to raise a PMI.
With the PEBS hardware assist being trap like, the reported IP is always
one instruction after the actual instruction that triggered the event.
This implements a simple PEBS model that always takes a single PEBS event
at a time. This is done so that the interaction with the rest of the
system is as expected (freq adjust, period randomization, lbr,
callchains, etc.).
It adds an ABI element: perf_event_attr::precise, which indicates that we
wish to use this (constrained, but precise) mode.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
LKML-Reference: <20100304140100.392111285@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The AD7873 is almost identical to the ADS7846; the only difference is
related to the Power Management bits PD0 and PD1. This results in a
slightly different PENIRQ enable behavior. For the AD7873, VREF should
be turned off during differential measurements.
So, add the AD7873/43 to the list of driver supported devices, and prevent
VREF usage during differential/ratiometric conversion modes.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Presently early platform devices suffer from the fact they are unable to
use dev_xxx() calls early on due to dev_name() and others being
unavailable at the time ->probe() is called.
This implements early init_name construction from the matched name/id
pair following the semantics of the late device/driver match. As a
result, matched IDs (inclusive of requested ones) are preserved when the
handoff from the early platform code happens at kobject initialization
time.
Since we still require kmalloc slabs to be available at this point, using
kstrdup() for establishing the init_name works fine. This subsequently
needs to be tested from dev_name() prior to the init_name being cleared
by the driver core. We don't kfree() since others will already have a
handle on the string long before the kobject initialization takes place.
This is also needed to permit drivers to use the clock framework early,
without having to manually construct their own device IDs from the match
id/name pair locally (needed by the early console and timer code on sh
and arm).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
The noise value as is won't be used, isn't
filled by most drivers and doesn't really
make a whole lot of sense on a per packet
basis -- proper cfg80211 survey support in
mac80211 will need to be different.
Mark the struct member as deprecated so it
will be removed from drivers.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The HID layer has some scan codes of the form 0xffbc0000 for logitech
devices which do not work if scancode is typed as signed int, so we need
to switch to unsigned it instead. While at it keycode being signed does
not make much sense either.
Acked-by: Márton Németh <nm127@freemail.hu>
Acked-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
of dropping frames when backlog queue is full.
Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
possibility of dropping frames when TTL is under a given limit.
This patch adds new SNMP MIB entries, named TCPBacklogDrop and
TCPMinTTLDrop, published in /proc/net/netstat in TcpExt: line
netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
TCPBacklogDrop: 0
TCPMinTTLDrop: 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the "__must_check" tag to sk_add_backlog() so that any failure to
check and drop packets will be warned about.
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (62 commits)
msi-laptop: depends on RFKILL
msi-laptop: Detect 3G device exists by standard ec command
msi-laptop: Add resume method for set the SCM load again
msi-laptop: Support some MSI 3G netbook that is need load SCM
msi-laptop: Add threeg sysfs file for support query 3G state by standard 66/62 ec command
msi-laptop: Support standard ec 66/62 command on MSI notebook and nebook
Driver core: create lock/unlock functions for struct device
sysfs: fix for thinko with sysfs_bin_attr_init()
sysfs: Kill unused sysfs_sb variable.
sysfs: Pass super_block to sysfs_get_inode
driver core: Use sysfs_rename_link in device_rename
sysfs: Implement sysfs_rename_link
sysfs: Pack sysfs_dirent more tightly.
sysfs: Serialize updates to the vfs inode
sysfs: windfarm: init sysfs attributes
sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on module dynamic attributes
sysfs: Document sysfs_attr_init and sysfs_bin_attr_init
sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributes
sysfs: Use one lockdep class per sysfs attribute.
sysfs: Only take active references on attributes.
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (26 commits)
ALSA: hdmi - show debug message on changing audio infoframe
ALSA: hdmi - merge common code for intelhdmi and nvhdmi
ALSA: hda - Add ASRock mobo to MSI blacklist
ALSA: hda: uninitialized variable fix
ALSA: hda: Use LPIB for a Biostar Microtech board
ALSA: usb/audio.h: Fix field order
ALSA: fix jazz16 compile (udelay)
ALSA: hda: Use LPIB for Dell Latitude 131L
ALSA: hda - Build hda_eld into snd-hda-codec module
ALSA: hda - Support NVIDIA MCP89 and GT21x hdmi audio
ALSA: hda - Support max codecs to 8 for nvidia hda controller
ALSA: riptide: clean up while loop
ALSA: usbaudio - remove debug "SAMPLE BYTES" printk line
ALSA: timer - pass real event in snd_timer_notify1() to instance callback
ALSA: oxygen: change || to &&
ALSA: opti92x: use PnP data to select Master Control port
ASoC: fix ak4104 register array access
ASoC: soc_pcm_open: Add missing bailout tag
ALSA: usbaudio: Fix wrong bitrate for Creative Creative VF0470 Live Cam
ALSA: ua101: removing debugging code
...
Commit f2ffd9ee... ("[NETFILTER]: Move ip6_masked_addrcmp to
include/net/ipv6.h") replaced ip6_masked_addrcmp() with
ipv6_masked_addr_cmp(). Function definition went away.
Let's remove its declaration as well in header file.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
include/linux/netfilter/nf_conntrack_tuple_common.h:5: ERROR: open brace '{' following enum go on the same line
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:
drivers/pci/pci-sysfs.c: In function 'pci_create_legacy_files':
drivers/pci/pci-sysfs.c:645: error: lvalue required as unary '&' operand
drivers/pci/pci-sysfs.c:658: error: lvalue required as unary '&' operand
Caused by commit "sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on
dynamic attributes" interacting with commit "sysfs: Use one lockdep
class per sysfs attribute") both from the driver-core tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Because of rename ordering problems we occassionally give false
warnings about invalid sysfs operations. So using sysfs_rename
create a sysfs_rename_link function that doesn't need strange
workarounds.
Cc: Benjamin Thery <benjamin.thery@bull.net>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I have added a new requirement to the external sysfs interface
that dynamically allocated sysfs attributes must call sysfs_attr_init
if lockdep is enabled. For the time being callying sysfs_attr_init
is only mandatory if lockdep is enabled, so we can live with a few
unconverted instances until we find them all. As this is part of
the public interface of sysfs it is a good idea to document these
pseudo functions so someone inspeciting the code can find out
what has happened.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acknowledge that the logical sysfs rwsem has one instance per
sysfs attribute with different locking depencencies for different
attributes.
There is a sysfs idiom where writing to one sysfs file causes the
addition or removal of other sysfs files. Lumping all of the
sysfs attributes together in one lock class causes lockdep to
generate lots of false positives.
This introduces the requirement that non-static sysfs attributes
need to be initialized with sysfs_attr_init or sysfs_bin_attr_init.
Strictly speaking this requirement only exists when lockdep is
enabled, and when lockdep is enabled we get a bit fat warning
if this requirement is not met.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes a warning on several pxa based machines:
arch/arm/mach-pxa/ssp.c:475: warning: initialization discards qualifiers from pointer target type
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Vikram Dhillon <dhillonv10@gmail.com>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Constify struct kset_uevent_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Several drivers just export a static string as class attributes.
Use the new extensible attribute support to define a simple
CLASS_ATTR_STRING() macro for this.
This will allow to remove code from drivers in followon patches.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.
Also drivers can extend the attributes with own data fields
and use that in the low level function.
This makes the class attributes the same as sysdev_class attributes
and plain attributes.
This will allow further cleanups in drivers.
Full tree sweep converting all users.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Allow to create/remove arrays of sysdev attributes
Just wrappers around sysfs_create/move_files
Will be used later to clean up some drivers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a attribute array that is automatically registered and unregistered
to struct sysdev_class. This is similar to what struct class has.
A lot of drivers add list of attributes, so it's better to do
this easily in the common sysdev layer.
This adds a new field to struct sysdev_class. I audited the
whole tree and there are no dynamically allocated sysdev classes,
so this is fully compatible.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Adding/Removing a whole array of attributes is very common. Add a standard
utility function to do this with a simple function call, instead of
requiring drivers to open code this.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
an own function for every piece of data.
Also drivers can extend the attributes with own data fields
and use that in the low level function.
Similar to sysdev_attributes and normal attributes.
This is a tree-wide sweep, converting everything in one go.
No functional changes in this patch other than passing the new
argument everywhere.
Tested on x86, the non x86 parts are uncompiled.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The platform ID table is normally const, force that by adding the attribute.
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>