Commit Graph

1136096 Commits

Author SHA1 Message Date
Vitaly Kuznetsov
9bcb90650e KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
With the updated eVMCSv1 definition, there's no known 'problematic'
controls which are exposed in VMX control MSRs but are not present in
eVMCSv1: all known Hyper-V versions either don't expose the new fields
by not setting bits in the VMX feature controls or support the new
eVMCS revision.

Get rid of VMX control MSRs filtering for KVM on Hyper-V.

Note: VMX control MSRs filtering for Hyper-V on KVM
(nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
definition doesn't have all the features implemented by KVM and some
fields are still missing. Moreover, nested_evmcs_filter_control_msr()
has to support the original eVMCSv1 version when VMM wishes so.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-17-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:48 -04:00
Vitaly Kuznetsov
4da77090b0 KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS
Enlightened VMCS v1 got updated and now includes the required fields
for loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT features. For KVM on
Hyper-V enablement, KVM can just observe VMX control MSRs and use the
features (with or without eVMCS) when possible.

Hyper-V on KVM is messier as Windows 11 guests fail to boot if the
controls are advertised and a new PV feature flag, CPUID.0x4000000A.EBX
BIT(0), is not set.  Honor the Hyper-V CPUID feature flag to play nice
with Windows guests.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-16-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:47 -04:00
Sean Christopherson
3ff8a13d41 KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0
WARN and reject nested VM-Enter if KVM is using eVMCS and manages to
allow a non-zero value in the upper 32 bits of VM-function controls.  The
eVMCS code assumes all inputs are 32-bit values and subtly drops the
upper bits.  WARN instead of adding proper "support", it's unlikely the
upper bits will be defined/used in the next decade.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-15-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:47 -04:00
Vitaly Kuznetsov
8174193163 KVM: selftests: Switch to updated eVMCSv1 definition
Update Enlightened VMCS definition in selftests from KVM.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-14-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:46 -04:00
Vitaly Kuznetsov
a0fa4b7abf KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
The updated Enlightened VMCS definition has 'encls_exiting_bitmap'
field which needs mapping to VMCS, add the missing encoding.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-13-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:45 -04:00
Vitaly Kuznetsov
dea6e140d9 KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
leaf to know which Enlightened VMCS definition to use (original or 2022
update). Cache the leaf along with other Hyper-V CPUID feature leaves
to make the check quick.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-12-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:44 -04:00
Vitaly Kuznetsov
c9d31986e8 KVM: nVMX: Support several new fields in eVMCSv1
Enlightened VMCS v1 definition was updated with new fields, add
support for them for Hyper-V on KVM.

Note: SSP, CET and Guest LBR features are not supported by KVM yet
and 'struct vmcs12' has no corresponding fields.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-11-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:43 -04:00
Vitaly Kuznetsov
b19e4ff5e5 KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
Enlightened VMCS v1 definition was updated with new fields, support
them in KVM by defining VMCS-to-EVMCS conversion.

Note: SSP, CET and Guest LBR features are not supported by KVM yet and
the corresponding fields are not defined in 'enum vmcs_field', leave
them commented out for now.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-10-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:42 -04:00
Sean Christopherson
6cce93de28 KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks
Locally #define and use the nested virtualization Consistency Check (CC)
macro to handle eVMCS unsupported controls checks.  Using the macro loses
the existing printing of the unsupported controls, but that's a feature
and not a bug.  The existing approach is flawed because the @err param to
trace_kvm_nested_vmenter_failed() is the error code, not the error value.

The eVMCS trickery mostly works as __print_symbolic() falls back to
printing the raw hex value, but that subtly relies on not having a match
between the unsupported value and VMX_VMENTER_INSTRUCTION_ERRORS.

If it's really truly necessary to snapshot the bad value, then the
tracepoint can be extended in the future.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-9-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:42 -04:00
Vitaly Kuznetsov
f4d361b4c2 KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array
Refactor the handling of unsupported eVMCS to use a 2-d array to store
the set of unsupported controls.  KVM's handling of eVMCS is completely
broken as there is no way for userspace to query which features are
unsupported, nor does KVM prevent userspace from attempting to enable
unsupported features.  A future commit will remedy that by filtering and
enforcing unsupported features when eVMCS, but that needs to be opt-in
from userspace to avoid breakage, i.e. KVM needs to maintain its legacy
behavior by snapshotting the exact set of controls that are currently
(un)supported by eVMCS.

No functional change intended.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: split to standalone patch, write changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-8-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:41 -04:00
Sean Christopherson
85ab071af8 KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled
When querying whether or not eVMCS is enabled on behalf of the guest,
treat eVMCS as enable if and only if Hyper-V is enabled/exposed to the
guest.

Note, flows that come from the host, e.g. KVM_SET_NESTED_STATE, must NOT
check for Hyper-V being enabled as KVM doesn't require guest CPUID to be
set before most ioctls().

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-7-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:40 -04:00
Sean Christopherson
3be29eb7b5 KVM: x86: Report error when setting CPUID if Hyper-V allocation fails
Return -ENOMEM back to userspace if allocating the Hyper-V vCPU struct
fails when enabling Hyper-V in guest CPUID.  Silently ignoring failure
means that KVM will not have an up-to-date CPUID cache if allocating the
struct succeeds later on, e.g. when activating SynIC.

Rejecting the CPUID operation also guarantess that vcpu->arch.hyperv is
non-NULL if hyperv_enabled is true, which will allow for additional
cleanup, e.g. in the eVMCS code.

Note, the initialization needs to be done before CPUID is set, and more
subtly before kvm_check_cpuid(), which potentially enables dynamic
XFEATURES.  Sadly, there's no easy way to avoid exposing Hyper-V details
to CPUID or vice versa.  Expose kvm_hv_vcpu_init() and the Hyper-V CPUID
signature to CPUID instead of exposing cpuid_entry2_find() outside of
CPUID code.  It's hard to envision kvm_hv_vcpu_init() being misused,
whereas cpuid_entry2_find() absolutely shouldn't be used outside of core
CPUID code.

Fixes: 10d7bf1e46 ("KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-6-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:39 -04:00
Sean Christopherson
1cac8d9f6b KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init()
When potentially allocating/initializing the Hyper-V vCPU struct, check
for an existing instance in kvm_hv_vcpu_init() instead of requiring
callers to perform the check.  Relying on callers to do the check is
risky as it's all too easy for KVM to overwrite vcpu->arch.hyperv and
leak memory, and it adds additional burden on callers without much
benefit.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20220830133737.1539624-5-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:38 -04:00
Vitaly Kuznetsov
ce2196b831 KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries
Wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead of having to
zero each particular member when the corresponding CPUID entry was not
found.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: split to separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20220830133737.1539624-4-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:38 -04:00
Vitaly Kuznetsov
5ef384a60f x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
Updated Hyper-V Enlightened VMCS specification lists several new
fields for the following features:

- PerfGlobalCtrl
- EnclsExitingBitmap
- Tsc Scaling
- GuestLbrCtl
- CET
- SSP

Update the definition.

Note, the updated spec also provides an additional CPUID feature flag,
CPUIDD.0x4000000A.EBX BIT(0), for PerfGlobalCtrl to workaround a Windows
11 quirk.  Despite what the TLFS says:

  Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl
  fields in the enlightened VMCS.

guests can safely use the fields if they are enumerated in the
architectural VMX MSRs.  I.e. KVM-on-HyperV doesn't need to check the
CPUID bit, but KVM-as-HyperV must ensure the bit is set if PerfGlobalCtrl
fields are exposed to L1.

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
[sean: tweak CPUID name to make it PerfGlobalCtrl only]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20220830133737.1539624-3-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:37 -04:00
Vitaly Kuznetsov
ea9da788a6 x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
Section 1.9 of TLFS v6.0b says:

"All structures are padded in such a way that fields are aligned
naturally (that is, an 8-byte field is aligned to an offset of 8 bytes
and so on)".

'struct enlightened_vmcs' has a glitch:

...
        struct {
                u32                nested_flush_hypercall:1; /*   836: 0  4 */
                u32                msr_bitmap:1;         /*   836: 1  4 */
                u32                reserved:30;          /*   836: 2  4 */
        } hv_enlightenments_control;                     /*   836     4 */
        u32                        hv_vp_id;             /*   840     4 */
        u64                        hv_vm_id;             /*   844     8 */
        u64                        partition_assist_page; /*   852     8 */
...

And the observed values in 'partition_assist_page' make no sense at
all. Fix the layout by padding the structure properly.

Fixes: 68d1eb72ee ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220830133737.1539624-2-vkuznets@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:36 -04:00
Oliver Upton
5f5651c673 KVM: selftests: Require DISABLE_NX_HUGE_PAGES cap for NX hugepage test
Require KVM_CAP_VM_DISABLE_NX_HUGE_PAGES for the entire NX hugepage test
instead of skipping the "disable" subtest if the capability isn't
supported by the host kernel.  While the "enable" subtest does provide
value when the capability isn't supported, silently providing only half
the promised coveraged is undesirable, i.e. it's better to skip the test
so that the user knows something.

Alternatively, the test could print something to alert the user instead
of silently skipping the subtest, but that would encourage other tests
to follow suit, and it's not clear that it's desirable to take selftests
in that direction.  And if selftests do head down the path of skipping
subtests, such behavior needs first-class support in the framework.

Opportunistically convert other test preconditions to TEST_REQUIRE().

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20220812175301.3915004-1-oliver.upton@linux.dev
[sean: rewrote changelog to capture discussion about skipping the test]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:36 -04:00
Uros Bizjak
57abfa11ba KVM: VMX: Do not declare vmread_error() asmlinkage
There is no need to declare vmread_error() asmlinkage, its arguments
can be passed via registers for both 32-bit and 64-bit targets.
Function argument registers are considered call-clobbered registers,
they are saved in the trampoline just before the function call and
restored afterwards.

Dropping "asmlinkage" patch unifies trampoline function argument handling
between 32-bit and 64-bit targets and improves generated code for 32-bit
targets.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20220817144045.3206-1-ubizjak@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:35 -04:00
Liam Ni
e390f4d69d KVM:x86: Clean up ModR/M "reg" initialization in reg op decoding
Refactor decode_register_operand() to get the ModR/M register if and
only if the instruction uses a ModR/M encoding to make it more obvious
how the register operand is retrieved.

Signed-off-by: Liam Ni <zhiguangni01@gmail.com>
Link: https://lore.kernel.org/r/20220908141210.1375828-1-zhiguangni01@zhaoxin.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:34 -04:00
Mingwei Zhang
02dfc44f20 KVM: x86: Print guest pgd in kvm_nested_vmenter()
Print guest pgd in kvm_nested_vmenter() to enrich the information for
tracing. When tdp is enabled, print the value of tdp page table (EPT/NPT);
when tdp is disabled, print the value of non-nested CR3.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/r/20220825225755.907001-4-mizhang@google.com
[sean: print nested_cr3 vs. nested_eptp vs. guest_cr3]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:33 -04:00
David Matlack
37ef0be269 KVM: nVMX: Add tracepoint for nested VM-Enter
Call trace_kvm_nested_vmenter() during nested VMLAUNCH/VMRESUME to bring
parity with nSVM's usage of the tracepoint during nested VMRUN.

Attempt to use analagous VMCS fields to the VMCB fields that are
reported in the SVM case:

"int_ctl": 32-bit field of the VMCB that the CPU uses to deliver virtual
interrupts. The analagous VMCS field is the 16-bit "guest interrupt
status".

"event_inj": 32-bit field of VMCB that is used to inject events
(exceptions and interrupts) into the guest. The analagous VMCS field
is the "VM-entry interruption-information field".

"npt_enabled": 1 when the VCPU has enabled nested paging. The analagous
VMCS field is the enable-EPT execution control.

"npt_addr": 64-bit field when the VCPU has enabled nested paging. The
analagous VMCS field is the ept_pointer.

Signed-off-by: David Matlack <dmatlack@google.com>
[move the code into the nested_vmx_enter_non_root_mode().]
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/r/20220825225755.907001-3-mizhang@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:32 -04:00
Mingwei Zhang
89e54ec592 KVM: x86: Update trace function for nested VM entry to support VMX
Update trace function for nested VM entry to support VMX. Existing trace
function only supports nested VMX and the information printed out is AMD
specific.

So, rename trace_kvm_nested_vmrun() to trace_kvm_nested_vmenter(), since
'vmenter' is generic. Add a new field 'isa' to recognize Intel and AMD;
Update the output to print out VMX/SVM related naming respectively, eg.,
vmcb vs. vmcs; npt vs. ept.

Opportunistically update the call site of trace_kvm_nested_vmenter() to
make one line per parameter.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/r/20220825225755.907001-2-mizhang@google.com
[sean: align indentation, s/update/rename in changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:32 -04:00
Sean Christopherson
bff0adc40c KVM: x86: Use u64 for address and error code in page fault tracepoint
Track the address and error code as 64-bit values in the page fault
tracepoint.  When TDP is enabled, the address is a GPA and thus can be a
64-bit value even on 32-bit hosts.  And SVM's #NPF genereates 64-bit
error codes.

Opportunistically clean up the formatting.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:31 -04:00
Wonhyuk Yang
faa03b3972 KVM: Add extra information in kvm_page_fault trace point
Currently, kvm_page_fault trace point provide fault_address and error
code. However it is not enough to find which cpu and instruction
cause kvm_page_faults. So add vcpu id and instruction pointer in
kvm_page_fault trace point.

Cc: Baik Song An <bsahn@etri.re.kr>
Cc: Hong Yeon Kim <kimhy@etri.re.kr>
Cc: Taeung Song <taeung@reallinux.co.kr>
Cc: linuxgeek@linuxgeek.io
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Link: https://lore.kernel.org/r/20220510071001.87169-1-vvghjk1234@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:30 -04:00
Aaron Lewis
b5cb32b16c KVM: x86: Delete duplicate documentation for KVM_X86_SET_MSR_FILTER
Two copies of KVM_X86_SET_MSR_FILTER somehow managed to make it's way
into the documentation.  Remove one copy and merge the difference from
the removed copy into the copy that's being kept.

Fixes: fd49e8ee70 ("Merge branch 'kvm-sev-cgroup' into HEAD")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20220712001045.2364298-2-aaronlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:29 -04:00
Paolo Bonzini
db25eb87ad KVM: SVM: remove unnecessary check on INIT intercept
Since svm_check_nested_events() is now handling INIT signals, there is
no need to latch it until the VMEXIT is injected.  The only condition
under which INIT signals are latched is GIF=0.

Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220819165643.83692-1-pbonzini@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:28 -04:00
Uros Bizjak
afe30b59d3 KVM/VMX: Avoid stack engine synchronization uop in __vmx_vcpu_run
Avoid instructions with explicit uses of the stack pointer between
instructions that implicitly refer to it. The sequence of
POP %reg; ADD $x, %RSP; POP %reg forces emission of synchronization
uop to synchronize the value of the stack pointer in the stack engine
and the out-of-order core.

Using POP with the dummy register instead of ADD $x, %RSP results in a
smaller code size and faster code.

The patch also fixes the reference to the wrong register in the
nearby comment.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20220816211010.25693-1-ubizjak@gmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:27 -04:00
Miaohe Lin
5a2a961be2 KVM: fix memoryleak in kvm_init()
When alloc_cpumask_var_node() fails for a certain cpu, there might be some
allocated cpumasks for percpu cpu_kick_mask. We should free these cpumasks
or memoryleak will occur.

Fixes: baff59ccdc ("KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Link: https://lore.kernel.org/r/20220823063414.59778-1-linmiaohe@huawei.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:02:27 -04:00
Gao Xiang
5c2a64252c erofs: introduce partial-referenced pclusters
Due to deduplication for compressed data, pclusters can be partially
referenced with their prefixes.

Together with the user-space implementation, it enables EROFS
variable-length global compressed data deduplication with rolling
hash.

Link: https://lore.kernel.org/r/20220923014915.4362-1-hsiangkao@linux.alibaba.com
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-09-26 23:55:43 +08:00
Yue Hu
b15b2e307c erofs: support on-disk compressed fragments data
Introduce on-disk compressed fragments data feature.

This approach adds a new field called `h_fragmentoff' in the per-file
compression header to indicate the fragment offset of each tail pcluster
or the whole file in the special packed inode.

Similar to ztailpacking, it will also find and record the 'headlcn'
of the tail pcluster when initializing per-inode zmap for making
follow-on requests more easy.

Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/YzHKxcFTlHGgXeH9@B-P7TQMD6M-0146.local
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-09-26 23:55:39 +08:00
Liu Jian
bec217197b skmsg: Schedule psock work if the cached skb exists on the psock
In sk_psock_backlog function, for ingress direction skb, if no new data
packet arrives after the skb is cached, the cached skb does not have a
chance to be added to the receive queue of psock. As a result, the cached
skb cannot be received by the upper-layer application. Fix this by reschedule
the psock work to dispose the cached skb in sk_msg_recvmsg function.

Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220907071311.60534-1-liujian56@huawei.com
2022-09-26 17:48:05 +02:00
Liu Jian
043a7356db selftests/bpf: Add wait send memory test for sockmap redirect
Add one test for wait redirect sock's send memory test for sockmap.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220823133755.314697-3-liujian56@huawei.com
2022-09-26 17:45:31 +02:00
Liu Jian
3f8ef65af9 net: If sock is dead don't access sock's sk_wq in sk_stream_wait_memory
Fixes the below NULL pointer dereference:

  [...]
  [   14.471200] Call Trace:
  [   14.471562]  <TASK>
  [   14.471882]  lock_acquire+0x245/0x2e0
  [   14.472416]  ? remove_wait_queue+0x12/0x50
  [   14.473014]  ? _raw_spin_lock_irqsave+0x17/0x50
  [   14.473681]  _raw_spin_lock_irqsave+0x3d/0x50
  [   14.474318]  ? remove_wait_queue+0x12/0x50
  [   14.474907]  remove_wait_queue+0x12/0x50
  [   14.475480]  sk_stream_wait_memory+0x20d/0x340
  [   14.476127]  ? do_wait_intr_irq+0x80/0x80
  [   14.476704]  do_tcp_sendpages+0x287/0x600
  [   14.477283]  tcp_bpf_push+0xab/0x260
  [   14.477817]  tcp_bpf_sendmsg_redir+0x297/0x500
  [   14.478461]  ? __local_bh_enable_ip+0x77/0xe0
  [   14.479096]  tcp_bpf_send_verdict+0x105/0x470
  [   14.479729]  tcp_bpf_sendmsg+0x318/0x4f0
  [   14.480311]  sock_sendmsg+0x2d/0x40
  [   14.480822]  ____sys_sendmsg+0x1b4/0x1c0
  [   14.481390]  ? copy_msghdr_from_user+0x62/0x80
  [   14.482048]  ___sys_sendmsg+0x78/0xb0
  [   14.482580]  ? vmf_insert_pfn_prot+0x91/0x150
  [   14.483215]  ? __do_fault+0x2a/0x1a0
  [   14.483738]  ? do_fault+0x15e/0x5d0
  [   14.484246]  ? __handle_mm_fault+0x56b/0x1040
  [   14.484874]  ? lock_is_held_type+0xdf/0x130
  [   14.485474]  ? find_held_lock+0x2d/0x90
  [   14.486046]  ? __sys_sendmsg+0x41/0x70
  [   14.486587]  __sys_sendmsg+0x41/0x70
  [   14.487105]  ? intel_pmu_drain_pebs_core+0x350/0x350
  [   14.487822]  do_syscall_64+0x34/0x80
  [   14.488345]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
  [...]

The test scenario has the following flow:

thread1                               thread2
-----------                           ---------------
 tcp_bpf_sendmsg
  tcp_bpf_send_verdict
   tcp_bpf_sendmsg_redir              sock_close
    tcp_bpf_push_locked                 __sock_release
     tcp_bpf_push                         //inet_release
      do_tcp_sendpages                    sock->ops->release
       sk_stream_wait_memory          	   // tcp_close
          sk_wait_event                      sk->sk_prot->close
           release_sock(__sk);
            ***
                                                lock_sock(sk);
                                                  __tcp_close
                                                    sock_orphan(sk)
                                                      sk->sk_wq  = NULL
                                                release_sock
            ****
           lock_sock(__sk);
          remove_wait_queue(sk_sleep(sk), &wait);
             sk_sleep(sk)
             //NULL pointer dereference
             &rcu_dereference_raw(sk->sk_wq)->wait

While waiting for memory in thread1, the socket is released with its wait
queue because thread2 has closed it. This caused by tcp_bpf_send_verdict
didn't increase the f_count of psock->sk_redir->sk_socket->file in thread1.

We should check if SOCK_DEAD flag is set on wakeup in sk_stream_wait_memory
before accessing the wait queue.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/bpf/20220823133755.314697-2-liujian56@huawei.com
2022-09-26 17:43:43 +02:00
Iskren Chernev
2bd1228a70 dt-bindings: arm: cpus: Add kryo240 compatible
Kryo240 is found in SM4250, the slower sibling of the SM6115.

Signed-off-by: Iskren Chernev <iskren.chernev@gmail.com>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220919180618.1840194-3-iskren.chernev@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
2022-09-26 10:20:27 -05:00
Dmitry Torokhov
da7a0123ed Input: ep93xx_keypad - add missing linux/input.h include
We are going to clean up matrix_keymap.h from unnecessary includes,
so the driver needs to include API that it uses directly.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220923194738.927408-5-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-26 08:15:17 -07:00
Dmitry Torokhov
4e9cded619 Input: imx_keypad - add missing linux/input.h include
We are going to clean up matrix_keymap.h from unnecessary includes,
so the driver needs to include API that it uses directly.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220923194738.927408-4-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-26 08:15:10 -07:00
Dmitry Torokhov
81a7cba79d Input: mt6779-keypad - add missing linux/input.h include
We are going to clean up matrix_keymap.h from unnecessary includes,
so the driver needs to include API that it uses directly.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220923194738.927408-3-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-26 08:15:03 -07:00
Dmitry Torokhov
d25a9d8f8d Input: st-keyscan - add missing linux/input.h and linux/of.h includes
We are going to clean up matrix_keymap.h from unnecessary includes,
so the driver needs to include API that it uses directly.

Also let's sort includes alphabetically.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220923194738.927408-2-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-26 08:14:29 -07:00
Dmitry Torokhov
ddc9589d79 Input: lm8333 - add missing linux/input.h include
We are going to clean up matrix_keymap.h from unnecessary includes,
so the driver needs to include API that it uses directly.

Also let's sort includes alphabetically and drop unneeded irq.h

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220923194738.927408-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-26 08:13:24 -07:00
Yang Yingliang
fc13b5a25e spi: spi-fsl-qspi: Use devm_platform_ioremap_resource_byname()
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220924131854.964923-3-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:41 +01:00
Yang Yingliang
c9e1bb724d spi: spi-fsl-lpspi: Use devm_platform_get_and_ioremap_resource()
Use the devm_platform_get_and_ioremap_resource() helper instead of calling
platform_get_resource() and devm_ioremap_resource() separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220924131854.964923-2-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:40 +01:00
Yang Yingliang
f96087a38c spi: spi-fsl-dspi: Use devm_platform_get_and_ioremap_resource()
Use the devm_platform_get_and_ioremap_resource() helper instead of calling
platform_get_resource() and devm_ioremap_resource() separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220924131854.964923-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:39 +01:00
Zhang Qilong
29f65f2171 spi/omap100k:Fix PM disable depth imbalance in omap1_spi100k_probe
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context.

Fixes:db91841b58f9a ("spi/omap100k: Convert to runtime PM")

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Link: https://lore.kernel.org/r/20220924121310.78331-4-zhangqilong3@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:37 +01:00
Zhang Qilong
618d815fc9 spi: dw: Fix PM disable depth imbalance in dw_spi_bt1_probe
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context.

Fixes:abf00907538e2 ("spi: dw: Add Baikal-T1 SPI Controller glue driver")

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Link: https://lore.kernel.org/r/20220924121310.78331-3-zhangqilong3@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:36 +01:00
Zhang Qilong
4d0ef0a1c3 spi: cadence-quadspi: Fix PM disable depth imbalance in cqspi_probe
The pm_runtime_enable will increase power disable depth. Thus
a pairing decrement is needed on the error handling path to
keep it balanced according to context.

Fixes:73d5fe0462702 ("spi: cadence-quadspi: Remove spi_master_put() in probe failure path")

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Link: https://lore.kernel.org/r/20220924121310.78331-2-zhangqilong3@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:35 +01:00
Axel Lin
947934e389 regulator: tps65219: Fix is_enabled checking in tps65219_set_bypass
Testing .enable cannot tell if a regulator is enabled or not, check return
value of .is_enabled() instead.
Also remove unneeded ret variable.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20220919122353.384171-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:33 +01:00
Krzysztof Kozlowski
2b381b4a91 ASoC: MAINTAINERS: add bindings and APR to Qualcomm Audio entry
Extend the Qualcomm Audio maintainer entry to include sound related
bindings and the Qualcomm APR/GPR (Asynchronous/Generic Packet Router)
IPC driver, which is tightly related to the Audio DSP.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220923203140.514730-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:31 +01:00
Ren Zhijie
4d73b97b8d ASoC: codecs: wcd934x: Fix Kconfig dependency
If CONFIG_REGMAP_SLIMBUS is not set,
make ARCH=x86_64 CROSS_COMPILE=x86_64-linux-gnu-,
will be failed, like this:

sound/soc/codecs/wcd934x.o: In function `wcd934x_codec_probe':
wcd934x.c:(.text+0x3310): undefined reference to `__regmap_init_slimbus'
make: *** [vmlinux] Error 1

Add select REGMAP_SLIMBUS to config SND_SOC_WCD934X.

Fixes: a61f3b4f47 ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220926074042.13297-1-renzhijie2@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:30 +01:00
Chunxu Li
f7c91bf653 ASoC: SOF: mediatek: mt8195: Add pcm_pointer callback
Add pcm_pointer callback for mt8195 to support read
host position from DSP

Signed-off-by: Chunxu Li <chunxu.li@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220924033559.26599-3-chunxu.li@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:28 +01:00
Chunxu Li
978a7144ae ASoC: SOF: mediatek: mt8195: Add pcm_hw_params callback
Add pcm_hw_params callback for mt8195 to support continue
update dma host position

Signed-off-by: Chunxu Li <chunxu.li@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20220924033559.26599-2-chunxu.li@mediatek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-26 16:10:27 +01:00