In ocfs2_link(), the parent directory inode passed to
ocfs2_lookup_ino_from_name() is wrong: parameter dir is the parent of
new_dentry, not old_dentry. We should get old_dir from old_dentry and
look up old_dentry in old_dir, in case another node has removed the old
dentry. With this change, hard linking works again when paths are relative
and contain at least one subdirectory. The problem could be reproduced as
follows:
# mkdir a
# mkdir b
# touch a/test
# ln a/test b/test
ln: failed to create hard link `b/test' => `a/test': No such file or directory
However, creating links within the same directory worked fine.
Now the link gets created.
Fixes: 0e048316ff ("ocfs2: check existence of old dentry in ocfs2_link()")
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Szabo Aron - UBIT <aron@ubit.hu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Tested-by: Aron Szabo <aron@ubit.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
My ISP finally gave up on the old mail address, so I am moving things
over to bitmath.org instead. Also change the status fields to better
reflect reality.
Signed-off-by: Henrik Rydberg <rydberg@bitmath.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:
  __set_page_dirty_nobuffers()        __delete_from_page_cache()
    if (TestSetPageDirty(page))
                                        page->mapping = NULL
                                        if (PageDirty())
                                          dec_zone_page_state(page, NR_FILE_DIRTY);
                                          dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
      if (page->mapping)
        account_page_dirtied(page)
          __inc_zone_page_state(page, NR_FILE_DIRTY);
          __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.
Dirtiers usually lock out truncation, either by holding the page lock
directly, or in case of zap_pte_range(), by pinning the mapcount with
the page table lock held. The notable exception to this rule, though,
is do_wp_page(), for which this race exists. However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes. Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.
Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed. Remove it.
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A constantly forking task causes unbounded growth of the anon_vma chain.
Each new child allocates a new level of anon_vmas and links its vma to all
previous levels, because pages might be inherited from any level.
This patch adds a heuristic that decides whether to reuse an existing
anon_vma instead of forking a new one. It adds a counter, anon_vma->degree,
which counts linked vmas and directly descending anon_vmas, and reuses an
anon_vma if the counter is lower than two. As a result, each anon_vma has
either a vma or at least two descending anon_vmas. In such trees half of the
nodes are leaves with live vmas, so the number of anon_vmas is no more than
twice the number of vmas.
The heuristic reuses anon_vmas as rarely as possible, because each reuse
adds false aliasing among vmas, and the rmap walker has to scan more ptes
when it searches for where a page might be mapped.
Link: http://lkml.kernel.org/r/20120816024610.GA5350@evergreen.ssec.wisc.edu
Fixes: 5beb493052 ("mm: change anon_vma linking to fix multi-process server scalability issue")
[akpm@linux-foundation.org: fix typo, per Rik]
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Tested-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org> [2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE, and
both checks can fail if we race with an EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
change in between, because gcc needs to reload p->exit_state after
security_task_wait(). In this case ->notask_error will be wrongly
cleared and do_wait() can hang forever if it was the last eligible
child.
Many thanks to Arne who carefully investigated the problem.
Note: this bug is very old, but it was purely theoretical until commit
b3ab03160d ("wait: completely ignore the EXIT_DEAD tasks"). Before
this commit "-O2" was probably enough to guarantee that the compiler
would not read ->exit_state twice.
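The idea of taking a single snapshot of the state can be sketched in
userspace C as below. This is only an illustration, not the kernel code:
the ST_* values, struct task_model, LOAD_ONCE_INT() and
security_check_model() are hypothetical stand-ins, and the return codes
are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state values for illustration; not the kernel's EXIT_* constants. */
enum { ST_RUNNING = 0, ST_ZOMBIE = 1, ST_DEAD = 2, ST_TRACE = 3 };

struct task_model {
    int exit_state;   /* may be changed concurrently by another thread */
};

/* Force a single load through a volatile-qualified access. */
#define LOAD_ONCE_INT(x) (*(const volatile int *)&(x))

/* Stand-in for an opaque call (security_task_wait() in the commit text). */
static int security_check_model(const struct task_model *p)
{
    (void)p;
    return 0;
}

/*
 * Returns 0 to ignore the task, 1 to treat it as a zombie, 2 for any
 * other state and -1 if the (modelled) security check fails.
 */
int consider_task_model(struct task_model *p)
{
    assert(p != NULL);

    /* One snapshot; both checks below test the same value. */
    int state = LOAD_ONCE_INT(p->exit_state);

    if (state == ST_DEAD || state == ST_TRACE)
        return 0;
    if (security_check_model(p))
        return -1;
    if (state == ST_ZOMBIE)
        return 1;
    return 2;
}
```

Because both comparisons use the local copy, a concurrent change of
exit_state between them cannot make both checks fail.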
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Arne Goedeke <el@laramies.com>
Tested-by: Arne Goedeke <el@laramies.com>
Cc: <stable@vger.kernel.org> [3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kbuild fix from Michal Marek:
"make mrproper / distclean stopped removing the generated debian/
directory in v3.16. This fixes it"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: Fix removal of the debian/ directory
The introduction of the uapi directories in v3.7-rc1 moved some of the
generated headers from arch/*/include/generated to the uapi directory,
keeping the #include directives intact.
This creates a problem when bisecting, because the unversioned files are
not cleaned automatically by git and the compiler might include stale
headers as a result. Instead of cleaning them in the Makefiles, promote
arch/*/include/generated/uapi in the search path. Under normal
circumstances, there is no overlap between this uapi subdirectory and
its parent, so the include choices remain the same. We keep
arch/*/include/generated/uapi in the USERINCLUDE variable so that it is
usable standalone.
Note that we cannot completely swap the order of the uapi and
kernel-only directories, since the headers in include/uapi/asm-generic
are meant to be wrapped by their include/asm-generic counterparts when
building kernel code.
Reported-by: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Reported-by: David Drysdale <dmd@lurklurk.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull pinctrl fixes from Linus Walleij:
"Allright allright I've been lazy over christmas and New Years. Here
are a few collected pin control fixes eventually. Details:
A set of assorted pin control fixes for the Rockchip and STi drivers"
* tag 'pinctrl-v3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: st: Add irq_disable hook to st_gpio_irqchip
pinctrl: st: avoid multiple mutex lock
pinctrl: rockchip: Fix enable/disable/mask/unmask
pinctrl: rockchip: Handle wakeup pins
Pull power management and ACPI fixes from Rafael Wysocki:
"These are an ACPI device power management initialization fix (-stable
material), two commits renaming stuff in the ACPI processor driver to
make it more suitable for ARM64 processors and a new ACPI backlight
blacklist entry.
Specifics:
- Fix ACPI power management initialization for device objects
corresponding to devices that are not present at the init time (the
_STA control method returns 0 for them) and therefore should not be
regarded as power manageable (Rafael J Wysocki).
- Rename a structure field and two functions used by the ACPI
processor driver to make them less tied to architectures that use
APICs (both x86 and ia64) and more suitable for ARM64 processors
(Hanjun Guo).
- Add a disable_native_backlight quirk for Dell XPS15 L521X designed
in an unusual way preventing native backlight from working on that
machine (Hans de Goede)"
* tag 'pm+acpi-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Add disable_native_backlight quirk for Dell XPS15 L521X
ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()
ACPI / processor: Convert apic_id to phys_id to make it arch agnostic
ACPI / PM: Fix PM initialization for devices that are not present
Instead of registering device attributes individually let's use attribute
groups and also devm_* infrastructure to ease cleanup.
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
This change adds support for the Power Enable Key found on MFD AXP202
and AXP209. Besides the basic support for the button, the driver adds
two entries in sysfs to configure the time delay for power on/off.
Signed-off-by: Carlo Caione <carlo@caione.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[wens@csie.org: made axp20x_pek_remove() static; removed driver owner
field; fixed path for sysfs entries]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
If the given input handler is not a filter, there is no point in iterating
over the list of events in a packet to see if some of them need to be
filtered out.
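A minimal userspace sketch of this early exit follows. The types
event_model and handler_model, the example filter and pass_to_handler()
are hypothetical and only model the idea; they are not the input core's
data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct event_model {
    int type;
    int code;
    int value;
};

struct handler_model {
    /* NULL when the handler is not a filter (assumption of this model). */
    bool (*filter)(const struct event_model *ev);
};

/* Example filter: asks to drop events whose value is 0. */
bool drop_zero_value(const struct event_model *ev)
{
    return ev->value == 0;
}

/*
 * Compacts evs[] in place, removing filtered events, and returns the
 * number of events left. Non-filter handlers return immediately
 * without walking the packet.
 */
size_t pass_to_handler(const struct handler_model *h,
                       struct event_model *evs, size_t count)
{
    assert(h != NULL && (evs != NULL || count == 0));

    if (h->filter == NULL)
        return count;           /* not a filter: nothing to scan */

    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (!h->filter(&evs[i]))
            evs[kept++] = evs[i];
    }
    return kept;
}
```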
Signed-off-by: Anshul Garg <anshul.g@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
If a device does not support autorepeat or does not emit any key events, we
should not scan all events in a packet to decide whether to start
or stop the autorepeat function.
Signed-off-by: Anshul Garg <anshul.g@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull keyrings fixes from David Howells:
"Two fixes:
- Fix for the order in which things are done during key garbage
collection to prevent named keyrings causing a crash
[CVE-2014-9529].
- Fix assoc_array to explicitly #include rcupdate.h to prevent
compilation errors under certain circumstances"
* tag 'keys-fixes-20150107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
assoc_array: Include rcupdate.h for call_rcu() definition
KEYS: close race between key lookup and freeing
Adds a function kvm_vcpu_set_pending_timer instead of calling
kvm_make_request in lapic.c.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When an access to a descriptor in the LDT/GDT wraps around outside long mode,
the address of the descriptor should be truncated to 32 bits. Citing Intel SDM
2.1.1.1 "Global and Local Descriptor Tables in IA-32e Mode": "GDTR and LDTR
registers are expanded to 64-bits wide in both IA-32e sub-modes (64-bit mode
and compatibility mode)."
So in other cases, we need to truncate. Create a new function that returns a
pointer to the descriptor table to avoid too much code duplication.
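The truncation rule can be illustrated with the following sketch. It is
not the emulator code: struct desc_table_model and descriptor_addr() are
hypothetical names, and limit checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the base held in GDTR/LDTR. */
struct desc_table_model {
    uint64_t base;
};

/*
 * Linear address of the 8-byte descriptor named by `selector`.
 * Bits 15:3 of a selector are the table index. Outside IA-32e mode
 * the result wraps at 4 GiB.
 */
uint64_t descriptor_addr(const struct desc_table_model *dt,
                         uint16_t selector, bool ia32e_mode)
{
    assert(dt != NULL);

    uint64_t addr = dt->base + (uint64_t)(selector >> 3) * 8;

    if (!ia32e_mode)
        addr &= 0xffffffffULL;  /* truncate to 32 bits */
    return addr;
}
```

With a base of 0xfffffff8 and selector 0x10 (index 2), the sum crosses
the 4 GiB boundary, so the two modes yield different addresses.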
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Wrap 64-bit check with #ifdef CONFIG_X86_64, to avoid a "right shift count
>= width of type" warning and consequent undefined behavior. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a segment is loaded, the segment accessed bit is set unconditionally. In
fact, it should be set conditionally, based on whether the segment had the
accessed bit set before. In addition, doing so can improve performance.
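As a rough sketch of the conditional update (not the emulator's code;
mark_segment_accessed() is a hypothetical helper operating on a raw
8-byte code/data segment descriptor):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 40 of an 8-byte code/data segment descriptor is the accessed bit. */
#define DESC_ACCESSED (1ULL << 40)

/*
 * Sets the accessed bit only if it is clear. Returns true when the
 * descriptor was modified, i.e. when a write-back would be needed.
 */
bool mark_segment_accessed(uint64_t *desc)
{
    assert(desc != NULL);

    if (*desc & DESC_ACCESSED)
        return false;           /* already set: skip the write */

    *desc |= DESC_ACCESSED;
    return true;
}
```

Skipping the write when the bit is already set is where the performance
gain mentioned above would come from.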
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to Intel SDM: "If the ESP register is used as a base register for
addressing a destination operand in memory, the POP instruction computes the
effective address of the operand after it increments the ESP register."
The current emulation does not behave this way. The fix requires spending
another of the precious instruction flags and checking the flag in
decode_modrm.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, if em_call_far fails it returns success instead of the resulting
error-code. Fix it.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The KVM emulator does not emulate JMP and CALL that target a call gate or a
task gate. This patch does not try to implement these scenarios, as they are
presumably rare; instead, it returns an X86EMUL_UNHANDLEABLE error in such
cases rather than generating an exception.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the operand size of fnstcw and fnstsw is updated during execution,
the emulation may cause spurious exceptions as it reads the memory beforehand.
Mark these instructions as Mov (since the previous value is ignored) and
DstMem16 to simplify the setting of the operand size.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although pop sreg updates RSP according to the operand size, only 2 bytes are
read. The current behavior may result in incorrect #GP or #PF exceptions.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because ASSERT is just a printk, these would oops right away.
The assertion thus hardly adds anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initialization function in mmu.c can always use walk_mmu, which
is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to
initialize vcpu->arch.nested_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add tracepoint to wait_lapic_expire.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[Remind reader if early or late. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the hrtimer which emulates the tscdeadline timer in the guest,
add an option to advance expiration, and busy spin on VM-entry waiting
for the actual expiration time to elapse.
This allows achieving low latencies in cyclictest (or any scenario
which requires strict timing regarding timer expiration).
Reduces average cyclictest latency from 12us to 8us
on Core i5 desktop.
Note: this option requires tuning to find the appropriate value
for a particular hardware/guest combination. One method is to measure the
average delay between apic_timer_fn and VM-entry.
Another method is to start with 1000ns, and increase the value
in say 500ns increments until avg cyclictest numbers stop decreasing.
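The arithmetic behind the advance can be sketched as follows. This is a
simplified model in nanoseconds with hypothetical names
(tscdeadline_model, hrtimer_expiry_ns(), spin_budget_ns()); it does not
reflect how the patch itself keeps time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tscdeadline_model {
    uint64_t deadline_ns;   /* when the guest expects the timer to fire */
    uint64_t advance_ns;    /* tunable: how much earlier the hrtimer fires */
};

/* Host hrtimer expiry: the guest deadline pulled in by the advance. */
uint64_t hrtimer_expiry_ns(const struct tscdeadline_model *t)
{
    assert(t != NULL);
    return t->deadline_ns > t->advance_ns ?
           t->deadline_ns - t->advance_ns : 0;
}

/* Time left to busy-spin on VM-entry until the real deadline. */
uint64_t spin_budget_ns(const struct tscdeadline_model *t, uint64_t now_ns)
{
    assert(t != NULL);
    return now_ns < t->deadline_ns ? t->deadline_ns - now_ns : 0;
}
```

If the advance is too small, VM-entry happens after the deadline and the
spin budget is zero; if it is too large, the CPU spins longer than
needed, which is why the value has to be tuned.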
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_x86_ops->test_posted_interrupt() returns true/false depending
on whether 'vector' is set.
The next patch makes use of this interface.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In most cases where hwapic_isr_update() is called, we first check whether
kvm_apic_vid_enabled() == 1, but in fact
kvm_apic_vid_enabled()
-> kvm_x86_ops->vm_has_apicv()
-> vmx_vm_has_apicv() or '0' in the svm case
-> return enable_apicv && irqchip_in_kernel(kvm)
So calling vmx_vm_has_apicv() again inside hwapic_isr_update() adds a
small cost. Instead, NULL out hwapic_isr_update() inside hardware_setup()
in the !enable_apicv case and make all related code follow this. Note
that we do not check irqchip_in_kernel() under that condition, since we
should make sure that no caller runs without an in-kernel irqchip.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove FIXME comments about needing fault addresses to be returned. These
are propagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When generating a #PF VM-exit, check the equality:
(PFEC & PFEC_MASK) == PFEC_MATCH
If it holds, bit 14 of the exception bitmap decides whether to generate
the #PF VM-exit. If it does not hold, the inverted bit 14 is used.
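The decision can be written as a small predicate. The sketch below uses
a hypothetical pf_controls_model structure and function name; it models
only the rule stated here, not the nVMX code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three control fields involved (hypothetical container). */
struct pf_controls_model {
    uint32_t exception_bitmap;
    uint32_t pfec_mask;
    uint32_t pfec_match;
};

/* Returns true if a page fault with error code `pfec` causes a VM-exit. */
bool page_fault_causes_vmexit(const struct pf_controls_model *c,
                              uint32_t pfec)
{
    assert(c != NULL);

    bool bit14 = (c->exception_bitmap >> 14) & 1;   /* #PF is vector 14 */
    bool equal = (pfec & c->pfec_mask) == c->pfec_match;

    /* On equality use bit 14 as is, otherwise use its inverse. */
    return equal ? bit14 : !bit14;
}
```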
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch improves the checks required by the Intel Software Developer's
Manual.
- SMM MSRs are not allowed.
- microcode MSRs are not allowed.
- check x2apic MSRs only when LAPIC is in x2apic mode.
- MSR switch areas must be aligned to 16 bytes.
- address of first and last byte in MSR switch areas should not set any bits
beyond the processor's physical-address width.
Also it adds warning messages on failures during MSR switch. These messages
are useful for people who debug their VMMs in nVMX.
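The two address checks from the list above can be sketched as follows.
This is an illustration only: msr_switch_area_ok() is a hypothetical
helper, it assumes 16-byte entries and that an empty area needs no
checking, and it does not cover the per-MSR checks (SMM, microcode,
x2apic).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * addr:       guest-physical address of the MSR switch area
 * count:      number of 16-byte entries in the area
 * maxphyaddr: physical-address width of the processor, in bits
 */
bool msr_switch_area_ok(uint64_t addr, uint32_t count, unsigned maxphyaddr)
{
    assert(maxphyaddr > 0 && maxphyaddr <= 64);

    if (count == 0)
        return true;            /* assumption: empty area, nothing to check */

    if (addr & 0xf)
        return false;           /* must be aligned to 16 bytes */

    uint64_t last = addr + (uint64_t)count * 16 - 1;
    if (last < addr)
        return false;           /* address range wrapped around */

    /* First and last byte must fit in the physical-address width. */
    if (maxphyaddr < 64 &&
        ((addr >> maxphyaddr) != 0 || (last >> maxphyaddr) != 0))
        return false;

    return true;
}
```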
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several hypervisors need the MSR auto load/restore feature.
We read MSRs from the VM-entry MSR load area specified by L1,
and load them via kvm_set_msr on nested entry.
When a nested exit occurs, we get MSRs via kvm_get_msr and write
them to L1's MSR store area. After this, we read MSRs from the VM-exit
MSR load area and load them via kvm_set_msr.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull virtio/vhost fixes from Michael Tsirkin:
"This fixes a couple of bugs triggered by hot-unplug of virtio devices,
as well as a regression in vhost-net"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost/net: length miscalculation
virtio_pci: document why we defer kfree
virtio_pci: defer kfree until release callback
virtio_pci: device-specific release callback
virtio: make del_vqs idempotent
Document usage of maxim,ena-gpios properties which turn on external/GPIO
control over regulator.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
When drivers use the simplified DT parsing method (they provide
'regulator_desc.of_match'), they may still want to parse custom
properties for some of the regulators. For example, some of the
regulators support GPIO enable control.
Add a driver-supplied callback for such cases. This way the regulator
core parses the common bindings, offloading a lot of code from drivers,
and custom properties may still be used.
The callback, called for each parsed regulator, may modify the
'regulator_config' initially passed to regulator_register().
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add enable control over GPIO for regulators supporting this: LDO20,
LDO21, LDO22, buck8 and buck9.
This is needed for proper (and full) configuration of the Maxim 77686
PMIC without creating redundant 'regulator-fixed' entries.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Copy the 'regulator_config' structure passed to the regulator_register()
function so that the driver can safely modify it after parsing init data.
The driver may want to change the config as a result of specific init
data parsed by the regulator core (e.g. when the core handled parsing
the device tree).
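The copy-then-modify pattern can be sketched in plain C as below. All
names (regulator_config_model, of_parse_cb_model, example_parse_cb(),
register_model()) and the GPIO number are hypothetical; the sketch only
shows why the copy keeps the caller's structure untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the configuration passed at registration. */
struct regulator_config_model {
    int  ena_gpio;
    bool ena_gpio_initialized;
};

/* Driver-supplied callback that may adjust the configuration. */
typedef void (*of_parse_cb_model)(struct regulator_config_model *cfg);

/* Example callback: pretends a custom property selected GPIO 42. */
void example_parse_cb(struct regulator_config_model *cfg)
{
    cfg->ena_gpio = 42;
    cfg->ena_gpio_initialized = true;
}

/*
 * Works on a private copy, so the callback can modify the config while
 * the caller's (const) structure stays as it was.
 * Returns the enable GPIO in effect, or -1 if none was configured.
 */
int register_model(const struct regulator_config_model *cfg,
                   of_parse_cb_model cb)
{
    assert(cfg != NULL);

    struct regulator_config_model local = *cfg;     /* the copy */

    if (cb != NULL)
        cb(&local);

    return local.ena_gpio_initialized ? local.ena_gpio : -1;
}
```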
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull IOMMU fixes from Joerg Roedel:
"Including:
- a domain structure leak fix in the Intel VT-d driver
- compile error fix for the VMSA IPMMU driver because of the
IOMMU_EXEC -> IOMMU_NOEXEC conversion
- two small cleanups as an aftermath of the merge window and the
domain-leak fix"
* tag 'iommu-fixes-v3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/rockchip: Drop owner assignment from platform_drivers
iommu/vt-d: Remove dead code in device_notifier
iommu/vt-d: Fix dmar_domain leak in iommu_attach_device
iommu/ipmmu-vmsa: Change IOMMU_EXEC to IOMMU_NOEXEC
XTFPGA boards provide an audio subsystem that consists of a TI CDCE706
clock synthesizer, an I2S transmitter and a TLV320AIC23 audio codec.
The I2S transmitter has an MMIO-based interface that resembles that of the
OpenCores I2S transmitter. The I2S transmitter is always a master on the
I2S bus. There is no specialized audio DMA; sample data are transferred to
the I2S transmitter FIFO by the CPU through a memory-mapped queue interface.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull crypto fixes from Herbert Xu:
"This fixes a build problem with sha-mb with old toolchains and an
implementation bug in the ctr(aes)/by8 branch of aesni-intel that's
enabled when AVX is available"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sha-mb - Add avx2_supported check.
crypto: aesni - fix "by8" variant for 128 bit keys
irqmap is an optional property, so priv->domain can be NULL if !irqmap.
Thus add a NULL test for priv->domain before calling irq_domain_remove()
to prevent a NULL pointer dereference.
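A userspace sketch of the guarded teardown, using hypothetical model
types in place of the driver's private data and the irq domain API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct irq_domain_model {
    int removed;
};

struct gpio_priv_model {
    struct irq_domain_model *domain;    /* NULL when no irqmap was given */
};

/* Stand-in for the removal call; dereferences its argument. */
static void domain_remove_model(struct irq_domain_model *d)
{
    d->removed = 1;
}

/* Returns true if a domain existed and was removed. */
bool gpio_teardown_model(struct gpio_priv_model *priv)
{
    assert(priv != NULL);

    if (priv->domain == NULL)
        return false;           /* optional irqmap absent: nothing to do */

    domain_remove_model(priv->domain);
    priv->domain = NULL;
    return true;
}
```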
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This patch adds the MICBIAS VDD setting to the platform data. MICBIAS VDD
can be set to 1V8 or 3V3.
Signed-off-by: Oder Chiou <oder_chiou@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>