functions.
We add the glue code that sets up a dma_ops structure with the
xen_swiotlb_* functions. The code turns on xen_swiotlb flag
when it detects it is running under Xen and it is either
in privileged mode or the iommu=soft flag was passed in.
It also disables the bare-metal SWIOTLB if the Xen-SWIOTLB has
been enabled.
Note: The Xen-SWIOTLB is only built when CONFIG_XEN is enabled.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Albert Herranz <albert_herranz@yahoo.es>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
This patchset:
PV guests under Xen are running in an non-contiguous memory architecture.
When PCI pass-through is utilized, this necessitates an IOMMU for
translating bus (DMA) to virtual and vice-versa and also providing a
mechanism to have contiguous pages for device drivers operations (say DMA
operations).
Specifically, under Xen the Linux idea of pages is an illusion. It
assumes that pages start at zero and go up to the available memory. To
help with that, the Linux Xen MMU provides a lookup mechanism to
translate the page frame numbers (PFN) to machine frame numbers (MFN)
and vice-versa. The MFN are the "real" frame numbers. Furthermore
memory is not contiguous. Xen hypervisor stitches memory for guests
from different pools, which means there is no guarantee that PFN==MFN
and PFN+1==MFN+1. Lastly with Xen 4.0, pages (in debug mode) are
allocated in descending order (high to low), meaning the guest might
never get any MFN's under the 4GB mark.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Albert Herranz <albert_herranz@yahoo.es>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Rather than trying to deal with aliases once they appear, just completely
inhibit them. Mostly the removal of aliases was managable, but it comes
unstuck in xen_create_contiguous_region() because it gets executed at
interrupt time (as a result of dma_alloc_coherent()), which causes all
sorts of confusion in the vmap code, as it was never intended to be run
in interrupt context.
This has the unfortunate side effect of removing all the unmap batching
the vmap code so carefully added, but that can't be helped.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add a flag to force lazy_max_pages() to zero to prevent any outstanding
mapped pages. We'll need this for Xen.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nick Piggin <npiggin@suse.de>
This patch exports the capability of the AMD IOMMU to force
cache coherency of DMA transactions through the IOMMU-API.
This is required to disable some nasty hacks in KVM when
this capability is not available.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Tidy-up patch to remove some code and struct perf_session data members
which are no longer needed due to the previous patch: "perf tools: Don't
abbreviate file paths relative to the cwd".
LKML-Reference: <new-submission>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This avoids around some problems where the full path is executables and DSOs it
needed for finding debug symbols on platforms with separated debug symbol files
such as Ubuntu. This is simpler than tracking an extra name for each image.
The only impact should be that paths in verbose output from the perf tools
become absolute, instead of relative to .
LKML-Reference: <new-submission>
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The stock newt checkbox tree widget we were using was not really
suitable for hist entry + callchain browsing.
The problems with it were manifold:
- We needed to traverse the whole hist_entry rb_tree to add each entry +
callchains beforehand.
- No control over the colors used for each row
So a new tree widget, based mostly on slang, was written.
It extends the ui_browser class already used for annotate to allow the
user to fold/unfold branches in the callchains tree, using extra fields
in the symbol_map class that is embedded in hist_entry and
callchain_node instances to store the folding state and when changing
this state calculates the number of rows that are produced when showing
a particular hist_entry instance.
This greatly speeds up browsing as we don't have to upfront touch all
the entries and only calculate callchain related operations when some
callchain branch is actually unfolded.
The memory footprint is also reduced as the data structure is not
duplicated, just some extra fields for controling callchain state and to
simplify the process of seeking thru entries (nr_rows, row_offset) were
added.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we gain two columns and look more like classical (at least in
TUIs) scroll bars bars.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we call ui_browser__show we may have called
ui_browser__refresh_dimensions to check if the maximum lenght for the
contained entries changed, such as when zooming in and out DSOs or
threads in the hist browser.
For that to happen we must delete the old form, that will take care of
deleting the vertical scrollbar, etc, and then recreate them, with the
new dimensions.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should use perf_sample_data_init() to initialize struct
perf_sample_data. As explained in the description of commit dc1d628a
("perf: Provide generic perf_sample_data initialization"), it is
possible for userspace to get the kernel to dereference data.raw,
so if it is not initialized, that means that unprivileged userspace
can possibly oops the kernel. Using perf_sample_data_init makes sure
it gets initialized to NULL.
This conversion should have been included in commit dc1d628a, but it
got missed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There is no need to handle POST_PMU, POST_PMD event with
the Capture Route widget.
It is enough to handle POST_REG event, since that will come
when the user changes the routing, and we will switch the needed
bits in the registers.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@nokia.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Provides an accessor function to replace hrtimer.c's
direct access of wall_to_monotonic.
This will allow wall_to_monotonic to be made static as
planned in Documentation/feature-removal-schedule.txt
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-9-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
update_vsyscall() did not provide the wall_to_monotoinc offset,
so arch specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
After accidentally misusing timespec_add_safe, I wanted to make sure
we don't accidently trip over that issue again, so I created a simple
timespec_add() function which we can use to replace the instances
of timespec_add_safe() that don't want the overflow detection.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1279068988-21864-3-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Due to vtime calling vgettimeofday(), its possible that an application
could call time();create("stuff",O_RDRW); only to see the file's
creation timestamp to be before the value returned by time.
A similar way to reproduce the issue is to compare the vsyscall time()
with the syscall time(), and observe ordering issues.
The modified test case from Oleg Nesterov below can illustrate this:
int main(void)
{
time_t sec1,sec2;
do {
sec1 = time(&sec2);
sec2 = syscall(__NR_time, NULL);
} while (sec1 <= sec2);
printf("vtime: %d.000000\n", sec1);
printf("time: %d.000000\n", sec2);
return 0;
}
The proper fix is to make vtime use the same time value as
current_kernel_time() (which is exported via update_vsyscall) instead of
vgettime().
Thanks to Jiri Olsa for bringing up the issue and catching bugs in
earlier verisons of this fix.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
LKML-Reference: <1279068988-21864-2-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the MMC core's ability to poll for card detection. This also has
the advantage of doing the gpio_get_value from a workqueue instead of
timer, allowing the gpio to be on a sleeping gpiochip.
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The mmci driver's SG list iteration logic assumes that each SG entry
spans only one page, and only maps and flushes one page of the sg. This
is not a valid assumption. Fix it by converting the driver to the
sg_miter API, which correctly handles sgs which span multiple pages.
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Found in the Versatile build:
WARNING: drivers/built-in.o(.data+0x14c): Section mismatch in reference from the variable pl061_gpio_driver to the (unknown reference) .init.data:(unknown)
The variable pl061_gpio_driver references
the (unknown reference) __initdata (unknown)
WARNING: drivers/built-in.o(.data+0x40f8): Section mismatch in reference from the variable pl011_driver to the (unknown reference) .init.data:(unknown)
The variable pl011_driver references
the (unknown reference) __initdata (unknown)
WARNING: drivers/built-in.o(.data+0x5ab4): Section mismatch in reference from the variable pl031_driver to the (unknown reference) .init.data:(unknown)
The variable pl031_driver references
the (unknown reference) __initdata (unknown)
Basically, amba_id structures must not be __initdata. Also fix:
WARNING: drivers/built-in.o(.data+0x138): Section mismatch in reference from the variable pl061_gpio_driver to the function .init.text:pl061_probe()
The variable pl061_gpio_driver references
the function __init pl061_probe()
which is an incorrectly annotated probe function. Fix it to reflect
the other AMBA bus probe functions by removing the __init attributation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
x86 calls machine_shutdown() from the various machine_*() calls which
take the machine down ready for halting, restarting, etc, and uses
this to bring the system safely to a point where those actions can be
performed. Such actions are stopping the secondary CPUs.
So, change the ARM implementation of these to reflect what x86 does.
This solves kexec problems on ARM SMP platforms, where the secondary
CPUs were left running across the kexec call.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The TWD local timers are unable to wake up the CPU when it is placed
into a low power mode, eg. C3. Therefore, we need to adapt things
such that the TWD code can cope with this.
We do this by always providing a broadcast tick function, and marking
the fact that the TWD local timer will stop in low power modes. This
means that when the CPU is placed into a low power mode, the core
timer code marks this fact, and allows an IPI to be given to the core.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
All implementations of cpu_proc_fin() start by disabling interrupts
and then flush caches. Rather than have every processors proc_fin()
implementation do this, move it out into generic code - and move the
cache flush past setup_mm_for_reboot() (so it can benefit from having
caches still enabled.)
This allows cpu_proc_fin() to become independent of the L1/L2 cache
types, and eventually move the L2 cache flushing into the L2 support
code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Statuses 3 (0b00011) and 6 (0x00110) of DFSR are Access Flags faults on
ARMv6K and ARMv7. Let's patch fsr_info[] at runtime if we are on ARMv7
or later.
Unfortunately, we don't have runtime check for 'K' extension, so we
can't check for it.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On ARM one Linux PGD entry contains two hardware entries (see page
tables layout in pgtable.h). We normally guarantee that we always
fill both L1 entries. But create_mapping() doesn't follow the rule.
It can create inidividual L1 entries, so here we have to call
pmd_none() check in do_translation_fault() for the entry really
corresponded to address, not for the first of pair.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add one more parameter to hook_fault_code() to be able to set 'code'
field of struct fsr_info.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>