Vince Weaver and Stephane Eranian reported warnings in the PEBS
code when running the perf fuzzer. Stephane wrote:
> I can reproduce the problem on my HSW running the fuzzer.
>
> I can see why this could be happening if you are mixing PEBS and non PEBS events
> in the bottom 4 counters. I suspect:
> for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
> if ((counts[bit] == 0) && (error[bit] == 0))
> continue;
>
> This test is not correct when you have non-PEBS events mixed with
> PEBS events and they overflow at the same time. They will have
> counts[i] != 0 but error[i] == 0, and thus you fall thru the loop
> and hit the assert. Or it is something along those lines.
The only way I can make this work is if ->status only has !PEBS events
set, because if it has both set we'll take that slow path which masks
out the !PEBS bits.
After masking there are 3 options:
- there is one bit set, and its @bit, we increment counts[bit].
- there are multiple bits set, we increment error[] for each set bit,
we do not increment counts[].
- there are no bits set, we do nothing.
The intent was to never increment counts[] for !PEBS events.
Now if we start out with only a single !PEBS event set, we'll pass the
test and increment counts[] for a !PEBS and hit the warn.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add perf core PMU support for future Intel Skylake CPU cores.
The code is based on Haswell/Broadwell.
There is a new cache event list, based on the updated Haswell
event list.
Skylake has removed most counter constraints on basic
events, so the basic constraints table now only has a single
entry (plus the fixed counters).
TSX support and various other setups are all shared with Haswell.
Skylake has 32 LBR entries. Add a new LBR init function
to set this up. The filters are all the same as Haswell.
It also has a new LBR format with a separate LBR_INFO_* MSR,
but that has been already added earlier.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-7-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for the new LBRv5 format used on Intel Skylake CPUs.
The flags for mispredict, abort, in_tx etc. moved to range of separate
LBR_INFO_* MSRs. Teach the LBR code to read those. The original
LBR registers stay the same, except they have full sign
extension now.
LBR_INFO also reports a cycle count to the last branch.
Report the cycle information using the new "cycles" branch_info
output field.
In addition we have to context switch and clear the new INFO
MSRs to avoid any information leaks.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PEBSv3 is the same as the existing PEBSv2 used on Haswell,
but it adds a new TSC field. Add support to the generic
PEBS handler to handle the new format, and overwrite
the perf time stamp using the new native_sched_clock_from_tsc().
Right now the time stamp is just slightly more accurate,
as it is nearer the actual event trigger point. With
the PEBS threshold > 1 patchkit it will be much more accurate,
avoid the problems with MMAP mismatches earlier.
The accurate time stamping is only implemented for
the default trace clock for now.
v2: Use _skl prefix. Check for default clock_id.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285767-27027-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Intel PT chapter in the new Intel Architecture SDM adds several packets
corresponding enable bits and registers that control packet generation.
Also, additional bits in the Intel PT CPUID leaf were added to enumerate
presence and parameters of these new packets and features.
The packets and enables are:
* CYC: cycle accurate mode, provides the number of cycles elapsed since
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between
full TSC packets; its presence and available resolution options are
enumerated via CPUID;
* PSB packet period is now configurable, available period values are
enumerated via CPUID.
This patch adds corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for
the new featurens and extends event configuration validation function
to take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, the PT driver zeroes out the status register every time before
starting the event. However, all the writable bits are already taken care
of in pt_handle_status() function, except the new PacketByteCnt field,
which in new versions of PT contains the number of packet bytes written
since the last sync (PSB) packet. Zeroing it out before enabling PT forces
a sync packet to be written. This means that, with the existing code, a
sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
time a PT event is scheduled in.
To avoid these unnecessary syncs and save a WRMSR in the fast path, this
patch changes the default behavior to not clear PacketByteCnt field, so
that the sync packets will be generated with the period specified as
"psb_period" attribute config field. This has little impact on the trace
data as the other packets that are normally sent within PSB+ (between PSB
and PSBEND) have their own generation scenarios which do not depend on the
sync packets.
One exception where we do need to force PSB like this when tracing starts,
so that the decoder has a clear sync point in the trace. For this purpose
we aready have hw::itrace_started flag, which we are currently using to
output PERF_RECORD_ITRACE_START. This patch moves setting itrace_started
from perf core to the pmu::start, where it should still be 0 on the very
first run.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1438264104-16189-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The previous change documents that cleanup_return_instances()
can't always detect the dead frames, the stack can grow. But
there is one special case which imho worth fixing:
arch_uretprobe_is_alive() can return true when the stack didn't
actually grow, but the next "call" insn uses the already
invalidated frame.
Test-case:
#include <stdio.h>
#include <setjmp.h>
jmp_buf jmp;
int nr = 1024;
void func_2(void)
{
if (--nr == 0)
return;
longjmp(jmp, 1);
}
void func_1(void)
{
setjmp(jmp);
func_2();
}
int main(void)
{
func_1();
return 0;
}
If you ret-probe func_1() and func_2() prepare_uretprobe() hits
the MAX_URETPROBE_DEPTH limit and "return" from func_2() is not
reported.
When we know that the new call is not chained, we can do the
more strict check. In this case "sp" points to the new ret-addr,
so every frame which uses the same "sp" must be dead. The only
complication is that arch_uretprobe_is_alive() needs to know was
it chained or not, so we add the new RP_CHECK_CHAIN_CALL enum
and change prepare_uretprobe() to pass RP_CHECK_CALL only if
!chained.
Note: arch_uretprobe_is_alive() could also re-read *sp and check
if this word is still trampoline_vaddr. This could obviously
improve the logic, but I would like to avoid another
copy_from_user() especially in the case when we can't avoid the
false "alive == T" positives.
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Anton Arapov <arapov@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150721134028.GA4786@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add the x86 specific version of arch_uretprobe_is_alive()
helper. It returns true if the stack frame mangled by
prepare_uretprobe() is still on stack. So if it returns false,
we know that the probed function has already returned.
We add the new return_instance->stack member and change the
generic code to initialize it in prepare_uretprobe, but it
should be equally useful for other architectures.
TODO: this assumes that the probed application can't use
multiple stacks (say sigaltstack). We will try to improve
this logic later.
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Anton Arapov <arapov@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150721134018.GA4766@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf fix from Thomas Gleixner:
"A single fix for the intel cqm perf facility to prevent IPIs from
interrupt context"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/cqm: Return cached counter value from IRQ context
Pull x86 fixes from Thomas Gleixner:
"This update contains:
- the manual revert of the SYSCALL32 changes which caused a
regression
- a fix for the MPX vma handling
- three fixes for the ioremap 'is ram' checks.
- PAT warning fixes
- a trivial fix for the size calculation of TLB tracepoints
- handle old EFI structures gracefully
This also contains a PAT fix from Jan plus a revert thereof. Toshi
explained why the code is correct"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/pat: Revert 'Adjust default caching mode translation tables'
x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commit
x86/mm: Fix newly introduced printk format warnings
mm: Fix bugs in region_is_ram()
x86/mm: Remove region_is_ram() call from ioremap
x86/mm: Move warning from __ioremap_check_ram() to the call site
x86/mm/pat, drivers/media/ivtv: Move the PAT warning and replace WARN() with pr_warn()
x86/mm/pat, drivers/infiniband/ipath: Replace WARN() with pr_warn()
x86/mm/pat: Adjust default caching mode translation tables
x86/fpu: Disable dependent CPU features on "noxsave"
x86/mpx: Do not set ->vm_ops on MPX VMAs
x86/mm: Add parenthesis for TLB tracepoint size calculation
efi: Handle memory error structures produced based on old versions of standard
Toshi explains:
"No, the default values need to be set to the fallback types,
i.e. minimal supported mode. For WC and WT, UC is the fallback type.
When PAT is disabled, pat_init() does update the tables below to
enable WT per the default BIOS setup. However, when PAT is enabled,
but CPU has PAT -errata, WT falls back to UC per the default values."
Revert: ca1fec58bc 'x86/mm/pat: Adjust default caching mode translation tables'
Requested-by: Toshi Kani <toshi.kani@hp.com>
Cc: Jan Beulich <jbeulich@suse.de>
Link: http://lkml.kernel.org/r/1437577776.3214.252.camel@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull tty/serial driver fixes from Greg KH:
"Here are a number of small serial and tty fixes for reported issues.
All have been in linux-next successfully"
* tag 'tty-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: vt: Fix !TASK_RUNNING diagnostic warning from paste_selection()
serial: core: Fix crashes while echoing when closing
m32r: Add ioreadXX/iowriteXX big-endian mmio accessors
Revert "serial: imx: initialized DMA w/o HW flow enabled"
sc16is7xx: fix FIFO address of secondary UART
sc16is7xx: fix Kconfig dependencies
serial: etraxfs-uart: Fix release etraxfs_uart_ports
tty/vt: Fix the memory leak in visual_init
serial: amba-pl011: Fix devm_ioremap_resource return value check
n_tty: signal and flush atomically
Pull MMC fixes from Ulf Hansson:
"Here are some mmc fixes intended for v4.2 rc4.
Note, most of the changes are for the sdhci-esdhc-imx controller,
which also required us to modify some related DTS files. Those
changes have been acked by the SoC maintainer.
MMC core:
- Fix a reference inbalance issue for power_ro_lock_show() sysfs handler
MMC host:
- omap_hsmmc: Fix IRQ errorhandling for CD, DTO, and CRC
- sdhci: Prevent a kernel panic while using DMA
- mtk-sd: Let it depend on HAS_DMA to prevent build errors
- sdhci-esdhc: Make 8BIT bus work
- sdhci-esdhc-imx: Fix some regressions for DT based platforms
- sdhci-pxav3: Fix a regression for DT based platforms"
* tag 'mmc-4.2-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sdhci-pxav3: fix platform_data is not initialized
dts: mmc: fsl-imx-esdhc: remove fsl,cd-controller support
mmc: sdhci-esdhc-imx: clear f_max in boarddata
mmc: sdhci-esdhc-imx: remove duplicated dts parsing
mmc: sdhci: make max-frequency property in device tree work
mmc: sdhci-esdhc-imx: move all non dt probe code into one function
mmc: sdhci-esdhc-imx: fix cd regression for dt platform
dts: imx7: fix sd card gpio polarity specified in device tree
dts: imx25: fix sd card gpio polarity specified in device tree
dts: imx6: fix sd card gpio polarity specified in device tree
dts: imx53: fix sd card gpio polarity specified in device tree
dts: imx51: fix sd card gpio polarity specified in device tree
mmc: sdhci-esdhc: Make 8BIT bus work
mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()
mmc: MMC_MTK should depend on HAS_DMA
mmc: sdhci check parameters before call dma_free_coherent
mmc: omap_hsmmc: Handle BADA, DEB and CEB interrupts
mmc: omap_hsmmc: Fix DTO and DCRC handling
Pull arch/tile bugfix from Chris Metcalf:
"This fixes a bug in freeing the initramfs memory"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: use free_bootmem_late() for initrd
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().
Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
commit c627f2ceb6 ("serial: 8250: Add support for big-endian MMIO accesses")
added support for 32-bit big-endian mmio to the 8250 driver. Support for
ioreadXXbe/iowriteXXbe io accessors was missing from m32r arch, which caused
build errors.
Add trivial macro mmio accessors.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We were previously using free_bootmem() and just getting lucky
that nothing too bad happened.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: stable@vger.kernel.org
Make them clearly architecture-dependent; the capability is valid for
all architectures, but the argument is not.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OVMF depends on WB to boot fast, because it only clears caches after
it has set up MTRRs---which is too late.
Let's do writeback if CR0.CD is set to make it happy, similar to what
SVM is already doing.
Signed-off-by: Xiao Guangrong <guangrong.xiao@intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>