Commit Graph

235 Commits

Sean Christopherson
3a87c7e0d1 KVM: nSVM: Add helper to synthesize nested VM-Exit without collateral
Add a helper to consolidate boilerplate for nested VM-Exits that don't
provide any data in exit_info_*.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210302174515.2812275-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:43:16 -04:00
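A rough sketch of what such a helper can look like (the helper name and the field layout are assumptions based on the description above, not the actual patch):

  /*
   * Illustrative sketch: synthesize a nested VM-Exit that carries no
   * extra information, i.e. exit_info_1/exit_info_2 are simply zeroed
   * before forwarding the exit to L1.
   */
  static int nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code)
  {
          svm->vmcb->control.exit_code   = exit_code;
          svm->vmcb->control.exit_info_1 = 0;
          svm->vmcb->control.exit_info_2 = 0;

          return nested_svm_vmexit(svm);
  }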
Paolo Bonzini
6312975417 KVM: SVM: Pass struct kvm_vcpu to exit handlers (and many, many other places)
Refactor the svm_exit_handlers API to pass @vcpu instead of @svm to
allow directly invoking common x86 exit handlers (in a future patch).
Opportunistically convert an absurd number of instances of 'svm->vcpu'
to direct uses of 'vcpu' to avoid pointless casting.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210205005750.3841462-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:43:09 -04:00
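The shape of the refactor, sketched below (handler body elided; to_svm() is the existing container-of helper):

  /*
   * Exit handlers previously had the signature
   *     static int msr_interception(struct vcpu_svm *svm);
   * and now take the generic vCPU, deriving the SVM-specific data locally.
   */
  static int msr_interception(struct kvm_vcpu *vcpu)
  {
          struct vcpu_svm *svm = to_svm(vcpu);

          /* ... handler body uses vcpu directly instead of svm->vcpu ... */
          return 1;
  }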
Paolo Bonzini
2a32a77cef KVM: SVM: merge update_cr0_intercept into svm_set_cr0
The logic of update_cr0_intercept is pointlessly complicated.
All svm_set_cr0 needs to do is compute the effective cr0 and compare it
with the guest value.

Inlining the function and simplifying the condition
clarifies what it is doing.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:38 -04:00
Paolo Bonzini
9e8f0fbfff KVM: nSVM: rename functions and variables according to vmcbXY nomenclature
Now that SVM is using a separate vmcb01 and vmcb02 (and also uses the vmcb12
naming) we can give clearer names to functions that write to and read
from those VMCBs.  Likewise, variables and parameters can be renamed
from nested_vmcb to vmcb12.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:31 -04:00
Cathy Avery
193015adf4 KVM: nSVM: Track the ASID generation of the vmcb vmrun through the vmcb
This patch moves the asid_generation from the vcpu to the vmcb
in order to track the ASID generation that was active the last
time the vmcb was run. If sd->asid_generation changes between
two runs, the old ASID is invalid and must be changed.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20210112164313.4204-3-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:30 -04:00
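Conceptually, the per-VMCB generation check before a run looks roughly like this (the wrapper name is hypothetical; new_asid() is the existing allocator in svm.c, and the field layout is assumed from the description):

  /*
   * Sketch: each VMCB remembers the per-CPU ASID generation it last ran
   * with; if the generation has moved on, the old ASID is stale and a new
   * one must be assigned before VMRUN.
   */
  static void check_asid_generation(struct vcpu_svm *svm, struct svm_cpu_data *sd)
  {
          if (svm->current_vmcb->asid_generation != sd->asid_generation)
                  new_asid(svm, sd);
  }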
Cathy Avery
af18fa775d KVM: nSVM: Track the physical cpu of the vmcb vmrun through the vmcb
This patch moves the physical cpu tracking from the vcpu
to the vmcb in svm_switch_vmcb. If either vmcb01 or vmcb02
change physical cpus from one vmrun to the next the vmcb's
previous cpu is preserved for comparison with the current
cpu and the vmcb is marked dirty if different. This prevents
the processor from using old cached data for a vmcb that may
have been updated on a prior run on a different processor.

It also moves the physical cpu check from svm_vcpu_load
to pre_svm_run as the check only needs to be done at run.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20210112164313.4204-2-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:29 -04:00
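A sketch of the check after the move to pre_svm_run() (structure layout and helper placement are assumptions):

  /*
   * Sketch: if the VMCB last ran on a different physical CPU, mark every
   * clean bit dirty so the processor reloads all cached VMCB state.
   */
  static void pre_svm_run_cpu_check(struct kvm_vcpu *vcpu)
  {
          struct vcpu_svm *svm = to_svm(vcpu);

          if (unlikely(vcpu->cpu != svm->current_vmcb->cpu)) {
                  vmcb_mark_all_dirty(svm->vmcb);
                  svm->current_vmcb->cpu = vcpu->cpu;
          }
  }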
Cathy Avery
4995a3685f KVM: SVM: Use a separate vmcb for the nested L2 guest
svm->vmcb will now point to a separate vmcb for L1 (not nested) or L2
(nested).

The main advantages are removing get_host_vmcb and hsave, in favor of
concepts that are shared with VMX.

We no longer need to stash the L1 registers in hsave while L2
runs, but we need to copy the VMLOAD/VMSAVE registers from VMCB01 to
VMCB02 and back.  This more or less has the same cost, but code-wise
nested_svm_vmloadsave can be reused.

This patch omits several optimizations that are possible:

- for simplicity there is some wholesale copying of vmcb.control areas
which can go away.

- we should be able to better use the VMCB01 and VMCB02 clean bits.

- another possibility is to always use VMCB01 for VMLOAD and VMSAVE,
thus avoiding the copy of VMLOAD/VMSAVE registers from VMCB01 to
VMCB02 and back.

Tested:
kvm-unit-tests
kvm self tests
Loaded fedora nested guest on fedora

Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20201011184818.3609-3-cavery@redhat.com>
[Fix conflicts; keep VMCB02 G_PAT up to date whenever guest writes the
 PAT MSR; do not copy CR4 over from VMCB01 as it is not needed anymore; add
 a few more comments. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:28 -04:00
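At its core, the change means svm->vmcb can point at either buffer; switching between them reduces to something like this simplified sketch (the real code also deals with dirty bits, register caches and the per-CPU bookkeeping from the patches above):

  /*
   * Simplified sketch of switching the active VMCB between vmcb01 (L1)
   * and vmcb02 (L2).
   */
  static void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target)
  {
          svm->current_vmcb = target;
          svm->vmcb         = target->ptr;
          svm->vmcb_pa      = target->pa;
  }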
Sean Christopherson
6d1b867d04 KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
Don't strip the C-bit from the faulting address on an intercepted #PF;
the address is a virtual address, not a physical address.

Fixes: 0ede79e132 ("KVM: SVM: Clear C-bit from the page fault address")
Cc: stable@vger.kernel.org
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210305011101.3597423-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-15 04:42:26 -04:00
Sean Christopherson
99840a7545 KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled'
Directly connect the 'npt' param to the 'npt_enabled' variable so that
runtime adjustments to npt_enabled are reflected in sysfs.  Move the
!PAE restriction to a runtime check to ensure NPT is forced off if the
host is using 2-level paging, and add a comment explicitly stating why
NPT requires a 64-bit kernel or a kernel with PAE enabled.

Opportunistically switch the param to octal permissions.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210305021637.3768573-1-seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-05 08:33:15 -05:00
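A sketch of the parameter wiring and the runtime restriction described above (placement inside a setup helper is an assumption):

  /* 'npt_enabled' now directly backs the module parameter, so any runtime
   * adjustment is visible through sysfs; 0444 is the octal permission. */
  bool npt_enabled = true;
  module_param_named(npt, npt_enabled, bool, 0444);

  static void svm_npt_setup(void)
  {
          /*
           * Per the commit above, NPT requires a 64-bit kernel or PAE;
           * force it off if the host is using 2-level paging, and also
           * when the CPU does not report the feature.
           */
          if (!IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_X86_PAE))
                  npt_enabled = false;

          if (!boot_cpu_has(X86_FEATURE_NPT))
                  npt_enabled = false;
  }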
Babu Moger
9e46f6c6c9 KVM: SVM: Clear the CR4 register on reset
This problem was reported on an SVM guest while executing kexec.
Kexec fails to load the new kernel when the PCID feature is enabled.

When kexec starts loading the new kernel, it starts the process by
resetting the vCPUs and then bringing each vCPU online one by one.
The vCPU reset is supposed to reset all the register states before the
vCPUs are brought online. However, the CR4 register is not reset during
this process. If this register was already set up during the last boot,
all the flags can remain intact. The X86_CR4_PCIDE bit can only be
enabled in long mode, so it must be enabled much later during SMP
initialization.  Having the X86_CR4_PCIDE bit set during SMP boot can
cause boot failures.

Fix the issue by resetting the CR4 register in init_vmcb().

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <161471109108.30811.6392805173629704166.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-02 14:39:11 -05:00
Sean Christopherson
2df8d3807c KVM: SVM: Fix nested VM-Exit on #GP interception handling
Fix the interpretation of nested_svm_vmexit()'s return value when
synthesizing a nested VM-Exit after intercepting an SVM instruction while
L2 was running.  The helper returns '0' on success, whereas a return
value of '0' in the exit handler path means "exit to userspace".  The
incorrect return value causes KVM to exit to userspace without filling
the run state, e.g. QEMU logs "KVM: unknown exit, hardware reason 0".

Fixes: 14c2bf81fc ("KVM: SVM: Fix #GP handling for doubly-nested virtualization")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210224005627.657028-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-25 05:13:05 -05:00
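A sketch of the corrected mapping between the two return-value conventions (the wrapper name is hypothetical; the conventions are the ones spelled out in the commit message):

  /*
   * nested_svm_vmexit() returns 0 on success, while exit handlers return
   * 1 to resume the guest and 0 to exit to userspace, so the result must
   * be translated rather than passed through unchanged.
   */
  static int forward_exit_to_l1(struct vcpu_svm *svm, u32 exit_code)
  {
          svm->vmcb->control.exit_code = exit_code;

          if (nested_svm_vmexit(svm))
                  return 0;       /* genuine failure: exit to userspace */

          return 1;               /* VM-Exit delivered to L1, keep running */
  }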
Sean Christopherson
e420333422 KVM: x86: Advertise INVPCID by default
Advertise INVPCID by default (if supported by the host kernel) instead
of having both SVM and VMX opt in.  INVPCID was opt in when it was a
VMX only feature so that KVM wouldn't prematurely advertise support
if/when it showed up in the kernel on AMD hardware.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210212003411.1102677-3-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-18 07:33:29 -05:00
Sean Christopherson
0a8ed2eaac KVM: SVM: Intercept INVPCID when it's disabled to inject #UD
Intercept INVPCID if it's disabled in the guest, even when using NPT,
as KVM needs to inject #UD in this case.

Fixes: 4407a797e9 ("KVM: SVM: Enable INVPCID feature on AMD")
Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210212003411.1102677-2-seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-18 07:33:28 -05:00
Wei Yongjun
2e215216d6 KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static
The sparse tool complains as follows:

arch/x86/kvm/svm/svm.c:204:6: warning:
 symbol 'svm_gp_erratum_intercept' was not declared. Should it be static?

This symbol is not used outside of svm.c, so this
commit marks it static.

Fixes: 82a11e9c6f ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Message-Id: <20210210075958.1096317-1-weiyongjun1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-11 08:02:08 -05:00
Paolo Bonzini
996ff5429e KVM: x86: move kvm_inject_gp up from kvm_set_dr to callers
Push the injection of #GP up to the callers, so that they can just use
kvm_complete_insn_gp. __kvm_set_dr is pretty much what the callers can use
together with kvm_complete_insn_gp, so rename it to kvm_set_dr and drop
the old kvm_set_dr wrapper.

This also allows nested VMX code, which really wanted to use __kvm_set_dr,
to use the right function.

While at it, remove the kvm_require_dr() check from the SVM interception.
The APM states:

  All normal exception checks take precedence over the SVM intercepts.

which includes the CR4.DE=1 #UD.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-09 08:17:07 -05:00
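The caller-side pattern after the change is roughly as follows (the handler name is hypothetical; kvm_set_dr() and kvm_complete_insn_gp() are the functions named in the commit):

  /*
   * Sketch: the DR-write helper reports failure instead of injecting #GP
   * itself; kvm_complete_insn_gp() injects #GP when err != 0 and
   * otherwise completes the instruction.
   */
  static int dr_write_intercept(struct kvm_vcpu *vcpu, int dr, unsigned long val)
  {
          int err = kvm_set_dr(vcpu, dr, val);

          return kvm_complete_insn_gp(vcpu, err);
  }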
Sean Christopherson
6f7a343987 KVM: SVM: Remove an unnecessary forward declaration
Drop a defunct forward declaration of svm_complete_interrupts().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210205005750.3841462-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-09 08:17:06 -05:00
Sean Christopherson
ca29e14506 KVM: x86: SEV: Treat C-bit as legal GPA bit regardless of vCPU mode
Rename cr3_lm_rsvd_bits to reserved_gpa_bits, and use it for all GPA
legality checks.  AMD's APM states:

  If the C-bit is an address bit, this bit is masked from the guest
  physical address when it is translated through the nested page tables.

Thus, any access that can conceivably be run through NPT should ignore
the C-bit when checking for validity.

For features that KVM emulates in software, e.g. MTRRs, there is no
clear direction in the APM for how the C-bit should be handled.  For
such cases, follow the SME behavior as much as possible, since SEV is
essentially a VM-specific variant of SME.  For SME, the APM states:

  In this case the upper physical address bits are treated as reserved
  when the feature is enabled except where otherwise indicated.

Collecting the various relevant SME snippets in the APM and cross-
referencing the omissions with Linux kernel code, this leaves MTRRs and
APIC_BASE as the only flows that KVM emulates that should _not_ ignore
the C-bit.

Note, this means the reserved bit checks in the page tables are
technically broken.  This will be remedied in a future patch.

Although the page table checks are technically broken, in practice, it's
all but guaranteed to be irrelevant.  NPT is required for SEV, i.e.
shadowing page tables isn't needed in the common case.  Theoretically,
the checks could be in play for nested NPT, but it's extremely unlikely
that anyone is running nested VMs on SEV, as doing so would require L1
to expose sensitive data to L0, e.g. the entire VMCB.  And if anyone is
running nested VMs, L0 can't read the guest's encrypted memory, i.e. L1
would need to put its NPT in shared memory, in which case the C-bit will
never be set.  Or, L1 could use shadow paging, but again, if L0 needs to
read page tables, e.g. to load PDPTRs, the memory can't be encrypted if
L1 has any expectation of L0 doing the right thing.

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210204000117.3303214-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 09:27:29 -05:00
Paolo Bonzini
bbefd4fc8f KVM: x86: move kvm_inject_gp up from kvm_set_xcr to callers
Push the injection of #GP up to the callers, so that they can just use
kvm_complete_insn_gp.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:37 -05:00
Krish Sadhukhan
04548ed020 KVM: SVM: Replace hard-coded value with #define
Replace the hard-coded value for bit# 1 in EFLAGS, with the available
#define.

Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <20210203012842.101447-2-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:35 -05:00
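For reference, the existing define for the always-one bit 1 of EFLAGS (from the x86 processor-flags header) is roughly:

  /* Bit 1 of EFLAGS is architecturally reserved and always reads as one. */
  #define X86_EFLAGS_FIXED        0x00000002

  /* so initialization code can write the symbolic name, e.g.
   *     save->rflags = X86_EFLAGS_FIXED;
   * instead of a bare '2'. */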
Michael Roth
a7fc06dd2f KVM: SVM: use .prepare_guest_switch() to handle CPU register save/setup
Currently we save host state like user-visible host MSRs, and do some
initial guest register setup for MSR_TSC_AUX and MSR_AMD64_TSC_RATIO
in svm_vcpu_load(). Defer this until just before we enter the guest by
moving the handling to kvm_x86_ops.prepare_guest_switch() similarly to
how it is done for the VMX implementation.

Additionally, since handling of saving/restoring host user MSRs is the
same both with/without SEV-ES enabled, move that handling to common
code.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:35 -05:00
Michael Roth
553cc15f6e KVM: SVM: remove unneeded fields from host_save_users_msrs
Now that the set of host user MSRs that need to be individually
saved/restored are the same with/without SEV-ES, we can drop the
.sev_es_restored flag and just iterate through the list unconditionally
for both cases. A subsequent patch can then move these loops to a
common path.

Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-3-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:34 -05:00
Michael Roth
e79b91bb3c KVM: SVM: use vmsave/vmload for saving/restoring additional host state
Using a guest workload which simply issues 'hlt' in a tight loop to
generate VMEXITs, it was observed (on a recent EPYC processor) that a
significant amount of the VMEXIT overhead measured on the host was the
result of MSR reads/writes in svm_vcpu_load/svm_vcpu_put according to
perf:

  67.49%--kvm_arch_vcpu_ioctl_run
          |
          |--23.13%--vcpu_put
          |          kvm_arch_vcpu_put
          |          |
          |          |--21.31%--native_write_msr
          |          |
          |           --1.27%--svm_set_cr4
          |
          |--16.11%--vcpu_load
          |          |
          |           --15.58%--kvm_arch_vcpu_load
          |                     |
          |                     |--13.97%--svm_set_cr4
          |                     |          |
          |                     |          |--12.64%--native_read_msr

Most of these MSRs relate to 'syscall'/'sysenter' and segment bases, and
can be saved/restored using 'vmsave'/'vmload' instructions rather than
explicit MSR reads/writes. In doing so there is a significant reduction
in the svm_vcpu_load/svm_vcpu_put overhead measured for the above
workload:

  50.92%--kvm_arch_vcpu_ioctl_run
          |
          |--19.28%--disable_nmi_singlestep
          |
          |--13.68%--vcpu_load
          |          kvm_arch_vcpu_load
          |          |
          |          |--9.19%--svm_set_cr4
          |          |          |
          |          |           --6.44%--native_read_msr
          |          |
          |           --3.55%--native_write_msr
          |
          |--6.05%--kvm_inject_nmi
          |--2.80%--kvm_sev_es_mmio_read
          |--2.19%--vcpu_put
          |          |
          |           --1.25%--kvm_arch_vcpu_put
          |                     native_write_msr

Quantifying this further, if we look at the raw cycle counts for a
normal iteration of the above workload (according to 'rdtscp'),
kvm_arch_vcpu_ioctl_run() takes ~4600 cycles from start to finish with
the current behavior. Using 'vmsave'/'vmload', this is reduced to
~2800 cycles, a savings of 39%.

While this approach doesn't seem to manifest in any noticeable
improvement for more realistic workloads like UnixBench, netperf, and
kernel builds, likely due to their exit paths generally involving IO
with comparatively high latencies, it does improve overall overhead
of KVM_RUN significantly, which may still be noticeable for certain
situations. It also simplifies some aspects of the code.

With this change, explicit save/restore is no longer needed for the
following host MSRs, since they are documented[1] as being part of the
VMCB State Save Area:

  MSR_STAR, MSR_LSTAR, MSR_CSTAR,
  MSR_SYSCALL_MASK, MSR_KERNEL_GS_BASE,
  MSR_IA32_SYSENTER_CS,
  MSR_IA32_SYSENTER_ESP,
  MSR_IA32_SYSENTER_EIP,
  MSR_FS_BASE, MSR_GS_BASE

and only the following MSR needs individual handling in
svm_vcpu_put/svm_vcpu_load:

  MSR_TSC_AUX

We could drop the host_save_user_msrs array/loop and instead handle
MSR read/write of MSR_TSC_AUX directly, but we leave that for now as
a potential follow-up.

Since 'vmsave'/'vmload' also handles the LDTR and FS/GS segment
registers (and associated hidden state)[2], some of the code
previously used to handle this is no longer needed, so we drop it
as well.

The first public release of the SVM spec[3] also documents the same
handling for the host state in question, so we make these changes
unconditionally.

Also worth noting is that we 'vmsave' to the same page that is
subsequently used by 'vmrun' to record some additional host state. This
is okay, since, in accordance with the spec[2], the additional state
written to the page by 'vmrun' does not overwrite any fields written by
'vmsave'. This has also been confirmed through testing (for the above
CPU, at least).

[1] AMD64 Architecture Programmer's Manual, Rev 3.33, Volume 2, Appendix B, Table B-2
[2] AMD64 Architecture Programmer's Manual, Rev 3.31, Volume 3, Chapter 4, VMSAVE/VMLOAD
[3] Secure Virtual Machine Architecture Reference Manual, Rev 3.01

Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210202190126.2185715-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:34 -05:00
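Conceptually, the load/put paths shrink to something like the following sketch (vmsave()/vmload() stand in for thin wrappers around the instructions; the real code also handles faults and the remaining MSR_TSC_AUX case):

  /* Before entering the guest: stash host syscall/sysenter/segment-base
   * state in the per-CPU save area with a single VMSAVE. */
  static void svm_save_host_state(unsigned long hsave_pa)
  {
          vmsave(hsave_pa);
  }

  /* After VMEXIT, back on the host: restore the same state with VMLOAD
   * instead of a series of wrmsr calls. */
  static void svm_restore_host_state(unsigned long hsave_pa)
  {
          vmload(hsave_pa);
  }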
Sean Christopherson
35a7831912 KVM: SVM: Use asm goto to handle unexpected #UD on SVM instructions
Add svm_asm*() macros, a la the existing vmx_asm*() macros, to handle
faults on SVM instructions instead of using the generic __ex(), a.k.a.
__kvm_handle_fault_on_reboot().  Using asm goto generates slightly
better code as it eliminates the in-line JMP+CALL sequences that are
needed by __kvm_handle_fault_on_reboot() to avoid triggering BUG()
from fixup (which generates bad stack traces).

Using SVM specific macros also drops the last user of __ex() and the
last asm linkage to kvm_spurious_fault(), and adds a helper for
VMSAVE, which may gain an additional call site in the future (as part
of optimizing the SVM context switching).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20201231002702.2223707-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:33 -05:00
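A minimal sketch of the asm-goto pattern (closely modeled on the existing vmx_asm*() style; the exact macro bodies in the kernel differ in detail):

  /*
   * On a faulting SVM instruction the exception table entry redirects
   * execution straight to the 'fault' label, so no inline JMP+CALL
   * fixup trampoline is needed.
   */
  #define svm_asm1(insn, op1, clobber...)                                 \
  do {                                                                    \
          asm goto("1: " __stringify(insn) " %0\n\t"                      \
                   _ASM_EXTABLE(1b, %l[fault])                            \
                   :: op1 : clobber : fault);                             \
          return;                                                         \
  fault:                                                                  \
          kvm_spurious_fault();                                           \
  } while (0)

  static void svm_vmsave(unsigned long pa)
  {
          svm_asm1(vmsave, "a" (pa), "memory");
  }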
Jason Baron
b6a7cc3544 KVM: X86: prepend vmx/svm prefix to additional kvm_x86_ops functions
A subsequent patch introduces macros in preparation for simplifying the
definition for vmx_x86_ops and svm_x86_ops. Making the naming more uniform
expands the coverage of the macros. Add vmx/svm prefix to the following
functions: update_exception_bitmap(), enable_nmi_window(),
enable_irq_window(), update_cr8_intercept() and enable_smi_window().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Message-Id: <ed594696f8e2c2b2bfc747504cee9bbb2a269300.1610680941.git.jbaron@akamai.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:29 -05:00
Wei Huang
14c2bf81fc KVM: SVM: Fix #GP handling for doubly-nested virtualization
In the nested-on-nested case (L0, L1 and L2 are all hypervisors), we do
not support emulation of the vVMLOAD/VMSAVE feature; instead, the L0
hypervisor can inject the proper #VMEXIT to inform L1 of what is
happening, so that L1 can avoid invoking the #GP workaround.  For this
reason, turn on the guest VM's X86_FEATURE_SVME_ADDR_CHK bit for KVM
running inside a VM, so it receives the notification and changes
behavior.

Similarly, check whether the vCPU is in guest mode before emulating the
VMware backdoor instructions.  In the nested-on-nested case, let the
guest handle it.

Co-developed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210126081831.570253-5-wei.huang2@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:28 -05:00
Wei Huang
3b9c723ed7 KVM: SVM: Add support for SVM instruction address check change
New AMD CPUs have a change that checks #VMEXIT intercept on special SVM
instructions before checking their EAX against reserved memory region.
This change is indicated by CPUID_0x8000000A_EDX[28]. If it is 1, the #VMEXIT
is triggered before #GP, so on such CPUs KVM doesn't need to intercept and
emulate #GP faults.

Co-developed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210126081831.570253-4-wei.huang2@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:28 -05:00
Bandan Das
82a11e9c6f KVM: SVM: Add emulation support for #GP triggered by SVM instructions
While running SVM related instructions (VMRUN/VMSAVE/VMLOAD), some AMD
CPUs check EAX against reserved memory regions (e.g. SMM memory on host)
before checking VMCB's instruction intercept. If EAX falls into such
memory areas, #GP is triggered before VMEXIT. This causes problem under
nested virtualization. To solve this problem, KVM needs to trap #GP and
check the instructions triggering #GP. For VM execution instructions,
KVM emulates these instructions.

Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <20210126081831.570253-3-wei.huang2@amd.com>
[Conditionally enable #GP intercept. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:28 -05:00
Chenyi Qiang
9a3ecd5e2a KVM: X86: Rename DR6_INIT to DR6_ACTIVE_LOW
DR6_INIT contains the 1-reserved bits as well as the bit that is cleared
to 0 when the condition (e.g. RTM) happens. The value can be used to
initialize dr6 and also be the XOR mask between the #DB exit
qualification (or payload) and DR6.

Since DR6_INIT is used as an initial value only once, rename it
to DR6_ACTIVE_LOW and apply it in other places, which makes the
incoming changes for the bus lock debug exception simpler.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210202090433.13441-2-chenyi.qiang@intel.com>
[Define DR6_FIXED_1 from DR6_ACTIVE_LOW and DR6_VOLATILE. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04 05:27:27 -05:00
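The resulting definitions look roughly like this (values follow the architectural DR6 layout as used in the kernel, quoted from memory, so treat them as illustrative):

  /*
   * DR6_ACTIVE_LOW: bits that read as 1 when the corresponding condition
   * is inactive (the reserved-1 bits plus active-low bits such as DR6.RTM).
   * It can seed DR6 and also serve as the XOR mask between a #DB exit
   * qualification/payload and the architectural DR6 value.
   */
  #define DR6_ACTIVE_LOW  0xffff0ff0
  #define DR6_VOLATILE    0x0001e00f
  #define DR6_FIXED_1     (DR6_ACTIVE_LOW & ~DR6_VOLATILE)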
Sean Christopherson
ccd85d90ce KVM: SVM: Treat SVM as unsupported when running as an SEV guest
Don't let KVM load when running as an SEV guest, regardless of what
CPUID says.  Memory is encrypted with a key that is not accessible to
the host (L0), thus it's impossible for L0 to emulate SVM, e.g. it'll
see garbage when reading the VMCB.

Technically, KVM could decrypt all memory that needs to be accessible to
the L0 and use shadow paging so that L0 does not need to shadow NPT, but
exposing such information to L0 largely defeats the purpose of running as
an SEV guest.  This can always be revisited if someone comes up with a
use case for running VMs inside SEV guests.

Note, VMLOAD, VMRUN, etc... will also #GP on GPAs with C-bit set, i.e. KVM
is doomed even if the SEV guest is debuggable and the hypervisor is willing
to decrypt the VMCB.  This may or may not be fixed on CPUs that have the
SVME_ADDR_CHK fix.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210202212017.2486595-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-03 04:30:37 -05:00
Lorenzo Brescia
d95df95106 kvm: tracing: Fix unmatched kvm_entry and kvm_exit events
On VMX, if we exit and then re-enter immediately without leaving
the vmx_vcpu_run() function, the kvm_entry event is not logged.
That means we will see one (or more) kvm_exit, without its (their)
corresponding kvm_entry, as shown here:

 CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
 CPU-1979 [002] 89.871218: kvm_exit:  reason MSR_WRITE
 CPU-1979 [002] 89.871259: kvm_exit:  reason MSR_WRITE

It also seems possible for a kvm_entry event to be logged, but then
we leave vmx_vcpu_run() right away (if vmx->emulation_required is
true). In this case, we will have a spurious kvm_entry event in the
trace.

Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run()
(where trace_kvm_exit() already is).

A trace obtained with this patch applied looks like this:

 CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
 CPU-14295 [000] 8388.395392: kvm_exit:  reason MSR_WRITE
 CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
 CPU-14295 [000] 8388.395503: kvm_exit:  reason EXTERNAL_INTERRUPT

Of course, not calling trace_kvm_entry() in common x86 code any
longer means that we need to adjust the SVM side of things too.

Signed-off-by: Lorenzo Brescia <lorenzo.brescia@edu.unito.it>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-25 18:52:08 -05:00
Tom Lendacky
647daca25d KVM: SVM: Add support for booting APs in an SEV-ES guest
Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
where the guest vCPU register state is updated and then VMRUN is issued on
the vCPU to begin execution of the AP. For an SEV-ES guest, this won't work because
the guest register state is encrypted.

Following the GHCB specification, the hypervisor must not alter the guest
register state, so KVM must track an AP/vCPU boot. Should the guest want
to park the AP, it must use the AP Reset Hold exit event in place of, for
example, a HLT loop.

First AP boot (first INIT-SIPI-SIPI sequence):
  Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
  support. It is up to the guest to transfer control of the AP to the
  proper location.

Subsequent AP boot:
  KVM will expect to receive an AP Reset Hold exit event indicating that
  the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
  awaken it. When the AP Reset Hold exit event is received, KVM will place
  the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
  sequence, KVM will make the vCPU runnable. It is again up to the guest
  to then transfer control of the AP to the proper location.

  To differentiate between an actual HLT and an AP Reset Hold, a new MP
  state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
  placed in upon receiving the AP Reset Hold exit event. Additionally, to
  communicate the AP Reset Hold exit event up to userspace (if needed), a
  new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.

A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
a new function that, for non SEV-ES guests, invokes the original SIPI
delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
implements the logic above.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-07 18:11:37 -05:00
Uros Bizjak
52782d5b63 KVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c
Commit 16809ecdc1 moved the __svm_vcpu_run prototype to svm.h,
but forgot to remove the original from svm.c.

Fixes: 16809ecdc1 ("KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests")
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Message-Id: <20201220200339.65115-1-ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-07 18:07:28 -05:00
Paolo Bonzini
bc351f0726 Merge branch 'kvm-master' into kvm-next
Fixes to get_mmio_spte, destined to 5.10 stable branch.
2021-01-07 18:06:52 -05:00
Tom Lendacky
16809ecdc1 KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests
The run sequence is different for an SEV-ES guest compared to a legacy or
even an SEV guest. The guest vCPU register state of an SEV-ES guest will
be restored on VMRUN and saved on VMEXIT. There is no need to restore the
guest registers directly and through VMLOAD before VMRUN and no need to
save the guest registers directly and through VMSAVE on VMEXIT.

Update the svm_vcpu_run() function to skip register state saving and
restoring and provide an alternative function for running an SEV-ES guest
in vmenter.S

Additionally, certain host state is restored across an SEV-ES VMRUN. As
a result certain register states are not required to be restored upon
VMEXIT (e.g. FS, GS, etc.), so only do that if the guest is not an SEV-ES
guest.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <fb1c66d32f2194e171b95fc1a8affd6d326e10c1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:59 -05:00
Tom Lendacky
861377730a KVM: SVM: Provide support for SEV-ES vCPU loading
An SEV-ES vCPU has additional VMCB load/put requirements. SEV-ES
hardware will restore certain registers on VMEXIT, but not save them on
VMRUN (see Table B-3 and Table B-4 of the AMD64 APM Volume 2), so make the
following changes:

General vCPU load changes:
  - During vCPU loading, perform a VMSAVE to the per-CPU SVM save area and
    save the current values of XCR0, XSS and PKRU to the per-CPU SVM save
    area as these registers will be restored on VMEXIT.

General vCPU put changes:
  - Do not attempt to restore registers that SEV-ES hardware has already
    restored on VMEXIT.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <019390e9cb5e93cd73014fa5a040c17d42588733.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:59 -05:00
Tom Lendacky
376c6d2850 KVM: SVM: Provide support for SEV-ES vCPU creation/loading
An SEV-ES vCPU has additional VMCB initialization requirements for
vCPU creation and vCPU load/put. This includes:

General VMCB initialization changes:
  - Set a VMCB control bit to enable SEV-ES support on the vCPU.
  - Set the VMCB encrypted VM save area address.
  - CRx registers are part of the encrypted register state and cannot be
    updated. Remove the CRx register read and write intercepts and replace
    them with CRx register write traps to track the CRx register values.
  - Certain MSR values are part of the encrypted register state and cannot
    be updated. Remove certain MSR intercepts (EFER, CR_PAT, etc.).
  - Remove the #GP intercept (no support for "enable_vmware_backdoor").
  - Remove the XSETBV intercept since the hypervisor cannot modify XCR0.

General vCPU creation changes:
  - Set the initial GHCB gpa value as per the GHCB specification.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <3a8aef366416eddd5556dfa3fdc212aafa1ad0a2.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:58 -05:00
Tom Lendacky
85ca8be938 KVM: SVM: Set the encryption mask for the SVM host save area
The SVM host save area is used to restore some host state on VMEXIT of an
SEV-ES guest. After allocating the save area, clear it and add the
encryption mask to the SVM host save area physical address that is
programmed into the VM_HSAVE_PA MSR.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <b77aa28af6d7f1a0cb545959e08d6dc75e0c3cba.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:57 -05:00
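The gist in code, roughly (__sme_set() is the existing helper that ORs in the encryption mask; the function name here is hypothetical):

  /*
   * Sketch: clear the freshly allocated host save area and program its
   * physical address, with the SME/SEV encryption bit set, into the
   * VM_HSAVE_PA MSR.
   */
  static void svm_program_hsave(struct page *save_area)
  {
          clear_page(page_address(save_area));
          wrmsrl(MSR_VM_HSAVE_PA,
                 __sme_set(page_to_pfn(save_area) << PAGE_SHIFT));
  }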
Tom Lendacky
4444dfe405 KVM: SVM: Add NMI support for an SEV-ES guest
The GHCB specification defines how NMIs are to be handled for an SEV-ES
guest. To detect the completion of an NMI the hypervisor must not
intercept the IRET instruction (because a #VC while running the NMI will
issue an IRET) and, instead, must receive an NMI Complete exit event from
the guest.

Update the KVM support for detecting the completion of NMIs in the guest
to follow the GHCB specification. When an SEV-ES guest is active, the
IRET instruction will no longer be intercepted. Now, when the NMI Complete
exit event is received, the iret_interception() function will be called
to simulate the completion of the NMI.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5ea3dd69b8d4396cefdc9048ebc1ab7caa70a847.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:56 -05:00
Tom Lendacky
ed02b21309 KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest
The guest FPU state is automatically restored on VMRUN and saved on VMEXIT
by the hardware, so there is no reason to do this in KVM. Eliminate the
allocation of the guest_fpu save area and key off that to skip operations
related to the guest FPU state.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <173e429b4d0d962c6a443c4553ffdaf31b7665a4.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:56 -05:00
Tom Lendacky
5719455fbd KVM: SVM: Do not report support for SMM for an SEV-ES guest
SEV-ES guests do not currently support SMM. Update the has_emulated_msr()
kvm_x86_ops function to take a struct kvm parameter so that the capability
can be reported at a VM level.

Since this op is also called during KVM initialization and before a struct
kvm instance is available, comments will be added to each implementation
of has_emulated_msr() to indicate the kvm parameter can be null.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <75de5138e33b945d2fb17f81ae507bda381808e3.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:55 -05:00
Tom Lendacky
d1949b93c6 KVM: SVM: Add support for CR8 write traps for an SEV-ES guest
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.

SEV-ES guests introduce new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.

Add support to track the value of the guest CR8 register using the control
register write trap so that the hypervisor understands the guest operating
mode.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5a01033f4c8b3106ca9374b7cadf8e33da852df1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:54 -05:00
Tom Lendacky
5b51cb1316 KVM: SVM: Add support for CR4 write traps for an SEV-ES guest
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.

SEV-ES guests introduce new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.

Add support to track the value of the guest CR4 register using the control
register write trap so that the hypervisor understands the guest operating
mode.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c3880bf2db8693aa26f648528fbc6e967ab46e25.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:53 -05:00
Tom Lendacky
f27ad38aac KVM: SVM: Add support for CR0 write traps for an SEV-ES guest
For SEV-ES guests, the interception of control register write access
is not recommended. Control register interception occurs prior to the
control register being modified and the hypervisor is unable to modify
the control register itself because the register is located in the
encrypted register state.

SEV-ES support introduces new control register write traps. These traps
provide intercept support of a control register write after the control
register has been modified. The new control register value is provided in
the VMCB EXITINFO1 field, allowing the hypervisor to track the setting
of the guest control registers.

Add support to track the value of the guest CR0 register using the control
register write trap so that the hypervisor understands the guest operating
mode.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <182c9baf99df7e40ad9617ff90b84542705ef0d7.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:52 -05:00
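As an illustration of how the write-trap flavour of interception differs from a classic intercept (shared by the CR0/CR4/CR8 and EFER commits in this series), a trap handler only has to consume the already-written value from EXITINFO1. A simplified sketch of the CR0 case follows; the real handler covers more registers and error handling:

  /*
   * Sketch of a control-register write trap: the guest has already
   * updated the (encrypted) register, so KVM just mirrors the new value
   * into its own tracking state and completes the instruction.
   */
  static int cr0_write_trap(struct kvm_vcpu *vcpu)
  {
          struct vcpu_svm *svm = to_svm(vcpu);
          unsigned long old_value = kvm_read_cr0(vcpu);
          unsigned long new_value = svm->vmcb->control.exit_info_1;

          svm_set_cr0(vcpu, new_value);
          kvm_post_set_cr0(vcpu, old_value, new_value);

          return kvm_complete_insn_gp(vcpu, 0);
  }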
Tom Lendacky
2985afbcdb KVM: SVM: Add support for EFER write traps for an SEV-ES guest
For SEV-ES guests, the interception of EFER write access is not
recommended. EFER interception occurs prior to EFER being modified and
the hypervisor is unable to modify EFER itself because the register is
located in the encrypted register state.

SEV-ES support introduces a new EFER write trap. This trap provides
intercept support of an EFER write after it has been modified. The new
EFER value is provided in the VMCB EXITINFO1 field, allowing the
hypervisor to track the setting of the guest EFER.

Add support to track the value of the guest EFER value using the EFER
write trap so that the hypervisor understands the guest operating mode.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <8993149352a3a87cd0625b3b61bfd31ab28977e1.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:51 -05:00
Tom Lendacky
7ed9abfe8e KVM: SVM: Support string IO operations for an SEV-ES guest
For an SEV-ES guest, string-based port IO is performed to a shared
(un-encrypted) page so that both the hypervisor and guest can read or
write to it and each see the contents.

For string-based port IO operations, invoke SEV-ES specific routines that
can complete the operation using common KVM port IO support.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <9d61daf0ffda496703717218f415cdc8fd487100.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:51 -05:00
Tom Lendacky
291bd20d5d KVM: SVM: Add initial support for a VMGEXIT VMEXIT
SEV-ES adds a new VMEXIT reason code, VMGEXIT. Initial support for a
VMGEXIT includes mapping the GHCB based on the guest GPA, which is
obtained from a new VMCB field, and then validating the required inputs
for the VMGEXIT exit reason.

Since many of the VMGEXIT exit reasons correspond to existing VMEXIT
reasons, the information from the GHCB is copied into the VMCB control
exit code areas and KVM register areas. The standard exit handlers are
invoked, similar to standard VMEXIT processing. Before restarting the
vCPU, the GHCB is updated with any registers that have been updated by
the hypervisor.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <c6a4ed4294a369bd75c44d03bd7ce0f0c3840e50.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:47 -05:00
Tom Lendacky
e9093fd492 KVM: SVM: Prepare for SEV-ES exit handling in the sev.c file
This is a pre-patch to consolidate some exit handling code into callable
functions. Follow-on patches for SEV-ES exit handling will then be able
to use them from the sev.c file.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <5b8b0ffca8137f3e1e257f83df9f5c881c8a96a3.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:47 -05:00
Tom Lendacky
8164a5ffe4 KVM: SVM: Cannot re-initialize the VMCB after shutdown with SEV-ES
When a SHUTDOWN VMEXIT is encountered, normally the VMCB is re-initialized
so that the guest can be re-launched. But when a guest is running as an
SEV-ES guest, the VMSA cannot be re-initialized because it has been
encrypted. For now, just return -EINVAL to prevent a possible attempt at
a guest reset.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <aa6506000f6f3a574de8dbcdab0707df844cb00c.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:46 -05:00
Tom Lendacky
bc624d9f1b KVM: SVM: Do not allow instruction emulation under SEV-ES
When a guest is running as an SEV-ES guest, it is not possible to emulate
instructions. Add support to prevent instruction emulation.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <f6355ea3024fda0a3eb5eb99c6b62dca10d792bd.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:46 -05:00
Tom Lendacky
8d4846b9b1 KVM: SVM: Prevent debugging under SEV-ES
Since the guest register state of an SEV-ES guest is encrypted, debugging
is not supported. Update the code to prevent guest debugging when the
guest has protected state.

Additionally, an SEV-ES guest must only and always intercept DR7 reads and
writes. Update set_dr_intercepts() and clr_dr_intercepts() to account for
this.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <8db966fa2f9803d6454ce773863025d0e2e7f3cc.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:46 -05:00
Tom Lendacky
f1c6366e30 KVM: SVM: Add required changes to support intercepts under SEV-ES
When a guest is running under SEV-ES, the hypervisor cannot access the
guest register state. There are numerous places in the KVM code where
certain registers are accessed that are not allowed to be accessed (e.g.
RIP, CR0, etc). Add checks to prevent register accesses and add intercept
update support at various points within the KVM code.

Also, when handling a VMGEXIT, exceptions are passed back through the
GHCB. Since the RDMSR/WRMSR intercepts (may) inject a #GP on error,
update the SVM intercepts to handle this for SEV-ES guests.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[Redo MSR part using the .complete_emulated_msr callback. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:45 -05:00
Paolo Bonzini
f9a4d62176 KVM: x86: introduce complete_emulated_msr callback
This will be used by SEV-ES to inject MSR failure via the GHCB.

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-15 05:20:34 -05:00
Tom Lendacky
add5e2f045 KVM: SVM: Add support for the SEV-ES VMSA
Allocate a page during vCPU creation to be used as the encrypted VM save
area (VMSA) for the SEV-ES guest. Provide a flag in the kvm_vcpu_arch
structure that indicates whether the guest state is protected.

When freeing a VMSA page that has been encrypted, the cache contents must
be flushed using the MSR_AMD64_VM_PAGE_FLUSH MSR before freeing the page.

[ i386 build warnings ]
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <fde272b17eec804f3b9db18c131262fe074015c5.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-14 11:09:32 -05:00
Tom Lendacky
916391a2d1 KVM: SVM: Add support for SEV-ES capability in KVM
Add support to KVM for determining if a system is capable of supporting
SEV-ES as well as determining if a guest is an SEV-ES guest.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <e66792323982c822350e40c7a1cf67ea2978a70b.1607620209.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-14 11:09:31 -05:00
Uros Bizjak
3f1a18b9fa KVM/VMX/SVM: Move kvm_machine_check function to x86.h
Move kvm_machine_check to x86.h to avoid two exact copies
of the same function in vmx.c and svm.c.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Message-Id: <20201029135600.122392-1-ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-14 11:09:29 -05:00
Paolo Bonzini
39485ed95d KVM: x86: reinstate vendor-agnostic check on SPEC_CTRL cpuid bits
Until commit e7c587da12 ("x86/speculation: Use synthetic bits for
IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before
allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.
Testing only Intel bits on VMX processors, or only AMD bits on SVM
processors, fails if the guests are created with the "opposite" vendor
as the host.

While at it, also tweak the host CPU check to use the vendor-agnostic
feature bit X86_FEATURE_IBPB, since we only care about the availability
of the MSR on the host here and not about specific CPUID bits.

Fixes: e7c587da12 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP")
Cc: stable@vger.kernel.org
Reported-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-11 19:05:13 -05:00
Jacob Xu
a2b2d4bf50 kvm: svm: de-allocate svm_cpu_data for all cpus in svm_cpu_uninit()
The cpu arg for svm_cpu_uninit() was previously ignored, resulting in the
per-CPU structure svm_cpu_data not being de-allocated for all CPUs.

Signed-off-by: Jacob Xu <jacobhxu@google.com>
Message-Id: <20201203205939.1783969-1-jacobhxu@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-12-04 03:47:58 -05:00
Chen Zhou
054409ab25 KVM: SVM: fix error return code in svm_create_vcpu()
Return a negative error code from the error handling path in
svm_create_vcpu() instead of 0, as is done elsewhere in the
function.

Fixes: f4c847a956 ("KVM: SVM: refactor msr permission bitmap allocation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Message-Id: <20201117025426.167824-1-chenzhou10@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-17 02:40:08 -05:00
Paolo Bonzini
dc924b0624 KVM: SVM: check CR4 changes against vcpu->arch
Similarly to what vmx/vmx.c does, use vcpu->arch.cr4 to check if CR4
bits PGE, PKE and OSXSAVE have changed.  When switching between VMCB01
and VMCB02, CPUID has to be adjusted every time if CR4.PKE or CR4.OSXSAVE
change; without this patch, instead, CR4 would be checked against the
previous value for L2 on vmentry, and against the previous value for
L1 on vmexit, and CPUID would not be updated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-16 13:14:22 -05:00
Cathy Avery
7e8e6eed75 KVM: SVM: Move asid to vcpu_svm
KVM does not have separate ASIDs for L1 and L2; either the nested
hypervisor and nested guests share a single ASID, or on older processors
the ASID is used only to implement TLB flushing.

Either way, ASIDs are handled at the VM level.  In preparation
for having different VMCBs passed to VMLOAD/VMRUN/VMSAVE for L1 and
L2, store the current ASID to struct vcpu_svm and only move it to
the VMCB in svm_vcpu_run.  This way, TLB flushes can be applied
no matter which VMCB will be active during the next svm_vcpu_run.

Signed-off-by: Cathy Avery <cavery@redhat.com>
Message-Id: <20201011184818.3609-2-cavery@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-16 13:14:21 -05:00
Jim Mattson
2259c17f01 kvm: x86: Sink cpuid update into vendor-specific set_cr4 functions
On emulated VM-entry and VM-exit, update the CPUID bits that reflect
CR4.OSXSAVE and CR4.PKE.

This fixes a bug where the CPUID bits could continue to reflect L2 CR4
values after emulated VM-exit to L1. It also fixes a related bug where
the CPUID bits could continue to reflect L1 CR4 values after emulated
VM-entry to L2. The latter bug is mainly relevant to SVM, wherein
CPUID is not a required intercept. However, it could also be relevant
to VMX, because the code to conditionally update these CPUID bits
assumes that the guest CPUID and the guest CR4 are always in sync.

Fixes: 8eb3f87d90 ("KVM: nVMX: fix guest CR4 loading when emulating L2 to L1 exit")
Fixes: 2acf923e38 ("KVM: VMX: Enable XSAVE/XRSTOR for guest")
Fixes: b9baba8614 ("KVM, pkeys: expose CPUID/CR4 to guest")
Reported-by: Abhiroop Dabral <adabral@paloaltonetworks.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Huaitong Han <huaitong.han@intel.com>
Message-Id: <20201029170648.483210-1-jmattson@google.com>
2020-11-15 09:49:18 -05:00
Sean Christopherson
c2fe3cd460 KVM: x86: Move vendor CR4 validity check to dedicated kvm_x86_ops hook
Split out VMX's checks on CR4.VMXE to a dedicated hook, .is_valid_cr4(),
and invoke the new hook from kvm_valid_cr4().  This fixes an issue where
KVM_SET_SREGS would return success while failing to actually set CR4.

Fixing the issue by explicitly checking kvm_x86_ops.set_cr4()'s return
in __set_sregs() is not a viable option as KVM has already stuffed a
variety of vCPU state.

Note, kvm_valid_cr4() and is_valid_cr4() have different return types and
inverted semantics.  This will be remedied in a future patch.

Fixes: 5e1746d620 ("KVM: nVMX: Allow setting the VMXE bit in CR4")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201007014417.29276-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-15 09:49:07 -05:00
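The split between the validity check and the setter looks roughly like this (the wrapper name is hypothetical; the hook names follow the commit):

  /*
   * Sketch: common code validates CR4 (generic reserved bits plus the new
   * vendor hook) before committing any vCPU state, then invokes the setter.
   */
  static int kvm_try_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
  {
          if (kvm_valid_cr4(vcpu, cr4))
                  return 1;       /* generic reserved-bit failure */

          if (!kvm_x86_ops.is_valid_cr4(vcpu, cr4))
                  return 1;       /* vendor-specific rejection, e.g. the VMX CR4.VMXE check */

          kvm_x86_ops.set_cr4(vcpu, cr4);
          return 0;
  }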
Sean Christopherson
311a06593b KVM: SVM: Drop VMXE check from svm_set_cr4()
Drop svm_set_cr4()'s explicit check on CR4.VMXE now that common x86 handles
the check by incorporating VMXE into the CR4 reserved bits, via
kvm_cpu_caps.  SVM obviously does not set X86_FEATURE_VMX.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20201007014417.29276-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-15 09:49:07 -05:00
Babu Moger
96308b0661 KVM: SVM: Update cr3_lm_rsvd_bits for AMD SEV guests
For AMD SEV guests, update the cr3_lm_rsvd_bits to mask
the memory encryption bit in reserved bits.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <160521948301.32054.5783800787423231162.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-11-13 06:31:14 -05:00
Linus Torvalds
f9a705ad1c ARM:
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 
 PPC:
 - Fix for running nested guests with in-kernel IRQ chip
 - Fix race condition causing occasional host hard lockup
 - Minor cleanups and bugfixes
 
 x86:
 - allow trapping unknown MSRs to userspace
 - allow userspace to force #GP on specific MSRs
 - INVPCID support on AMD
 - nested AMD cleanup, on demand allocation of nested SVM state
 - hide PV MSRs and hypercalls for features not enabled in CPUID
 - new test for MSR_IA32_TSC writes from host and guest
 - cleanups: MMU, CPUID, shared MSRs
 - LAPIC latency optimizations and bugfixes
 
 For x86, also included in this pull request is a new alternative and
 (in the future) more scalable implementation of extended page tables
 that does not need a reverse map from guest physical addresses to
 host physical addresses.  For now it is disabled by default because
 it is still lacking a few of the existing MMU's bells and whistles.
 However it is a very solid piece of work and it is already available
 for people to hammer on it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
 Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
 tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
 Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
 STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
 qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
 =AD6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable
  implementation of extended page tables that does not need a reverse
  map from guest physical addresses to host physical addresses.

  For now it is disabled by default because it is still lacking a few of
  the existing MMU's bells and whistles. However it is a very solid
  piece of work and it is already available for people to hammer on it.

  Other updates:

  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation

  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes

  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations and bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 11:17:56 -07:00
Paolo Bonzini
c0623f5e5d Merge branch 'kvm-fixes' into 'next'
Pick up bugfixes from 5.9, otherwise various tests fail.
2020-10-21 18:05:58 -04:00
Maxim Levitsky
2fcf4876ad KVM: nSVM: implement on demand allocation of the nested state
This way we don't waste memory on VMs which don't use nested
virtualization even when the host enabled it for them.
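
A minimal standalone sketch of the idea (all names, sizes, and the
EFER.SVME trigger point are illustrative assumptions, not the actual
kernel code): the nested state is allocated only the first time the
guest turns SVM on, and an allocation failure is reported to the caller
so the triggering write can fail cleanly.

    #include <stdbool.h>
    #include <stdlib.h>

    /* Illustrative stand-ins for the real KVM structures. */
    struct nested_state_sketch {
        char msrpm[8192];       /* nested MSR permission bitmap */
        char save_area[4096];   /* cached vmcb12 state */
    };

    struct vcpu_sketch {
        bool nested_allocated;
        struct nested_state_sketch *nested;
    };

    /* Called lazily, e.g. when the guest sets EFER.SVME, instead of at
     * vCPU creation time; a failure is returned so the caller can fail
     * the MSR write instead of killing the VM. */
    static int svm_allocate_nested_sketch(struct vcpu_sketch *vcpu)
    {
        if (vcpu->nested_allocated)
            return 0;
        vcpu->nested = calloc(1, sizeof(*vcpu->nested));
        if (!vcpu->nested)
            return -1;          /* stands in for -ENOMEM */
        vcpu->nested_allocated = true;
        return 0;
    }

    int main(void)
    {
        struct vcpu_sketch vcpu = { 0 };
        return svm_allocate_nested_sketch(&vcpu);
    }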

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201001112954.6258-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:48:48 -04:00
Maxim Levitsky
72f211ecaa KVM: x86: allow kvm_x86_ops.set_efer to return an error value
This will be used to signal an error to userspace in case the vendor
code fails while handling this MSR (e.g. -ENOMEM).
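
As a hedged illustration only (the structure and function names below
are invented stand-ins, not the kernel's): the shape of the change is
turning a void vendor callback into one that returns int and letting the
common MSR path forward that value.

    #include <stdio.h>

    /* The vendor callback now returns int instead of void, and the common
     * MSR-write path simply propagates a non-zero result to its caller. */
    struct kvm_x86_ops_sketch {
        int (*set_efer)(unsigned long long efer);
    };

    static int svm_set_efer_sketch(unsigned long long efer)
    {
        (void)efer;
        return -12;             /* pretend a lazy allocation failed: -ENOMEM */
    }

    static int kvm_set_efer_sketch(const struct kvm_x86_ops_sketch *ops,
                                   unsigned long long efer)
    {
        return ops->set_efer(efer);     /* error propagates to the MSR write */
    }

    int main(void)
    {
        struct kvm_x86_ops_sketch ops = { .set_efer = svm_set_efer_sketch };

        printf("%d\n", kvm_set_efer_sketch(&ops, 1ULL << 12));
        return 0;
    }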

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201001112954.6258-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-10-21 17:48:48 -04:00
Linus Torvalds
da9803dfd3 This feature enhances the current guest memory encryption support
called SEV by also encrypting the guest register state, making the
 registers inaccessible to the hypervisor by en-/decrypting them on world
 switches. Thus, it adds additional protection to Linux guests against
 exfiltration, control flow and rollback attacks.
 
 With SEV-ES, the guest is in full control of what registers the
 hypervisor can access. This is provided by a guest-host exchange
 mechanism based on a new exception vector called VMM Communication
 Exception (#VC), a new instruction called VMGEXIT and a shared
 Guest-Host Communication Block which is a decrypted page shared between
 the guest and the hypervisor.
 
 Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest so
 in order for that exception mechanism to work, the early x86 init code
 needed to be made able to handle exceptions, which, in itself, brings
 a bunch of very nice cleanups and improvements to the early boot code
 like an early page fault handler, allowing for on-demand building of the
 identity mapping. With that, !KASLR configurations do not use the EFI
 page table anymore but switch to a kernel-controlled one.
 
 The main part of this series adds the support for that new exchange
 mechanism. The goal has been to keep this as separate as possible
 from the core x86 code by concentrating the machinery in two
 SEV-ES-specific files:
 
  arch/x86/kernel/sev-es-shared.c
  arch/x86/kernel/sev-es.c
 
 Other interaction with core x86 code has been kept to a minimum and behind
 static keys to minimize the performance impact on !SEV-ES setups.
 
 Work by Joerg Roedel and Thomas Lendacky and others.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl+FiKYACgkQEsHwGGHe
 VUqS5BAAlh5mKwtxXMyFyAIHa5tpsgDjbecFzy1UVmZyxN0JHLlM3NLmb+K52drY
 PiWjNNMi/cFMFazkuLFHuY0poBWrZml8zRS/mExKgUJC6EtguS9FQnRE9xjDBoWQ
 gOTSGJWEzT5wnFqo8qHwlC2CDCSF1hfL8ks3cUFW2tCWus4F9pyaMSGfFqD224rg
 Lh/8+arDMSIKE4uH0cm7iSuyNpbobId0l5JNDfCEFDYRigQZ6pZsQ9pbmbEpncs4
 rmjDvBA5eHDlNMXq0ukqyrjxWTX4ZLBOBvuLhpyssSXnnu2T+Tcxg09+ZSTyJAe0
 LyC9Wfo0v78JASXMAdeH9b1d1mRYNMqjvnBItNQoqweoqUXWz7kvgxCOp6b/G4xp
 cX5YhB6BprBW2DXL45frMRT/zX77UkEKYc5+0IBegV2xfnhRsjqQAQaWLIksyEaX
 nz9/C6+1Sr2IAv271yykeJtY6gtlRjg/usTlYpev+K0ghvGvTmuilEiTltjHrso1
 XAMbfWHQGSd61LNXofvx/GLNfGBisS6dHVHwtkayinSjXNdWxI6w9fhbWVjQ+y2V
 hOF05lmzaJSG5kPLrsFHFqm2YcxOmsWkYYDBHvtmBkMZSf5B+9xxDv97Uy9NETcr
 eSYk//TEkKQqVazfCQS/9LSm0MllqKbwNO25sl0Tw2k6PnheO2g=
 =toqi
 -----END PGP SIGNATURE-----

Merge tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV-ES support from Borislav Petkov:
 "SEV-ES enhances the current guest memory encryption support called SEV
  by also encrypting the guest register state, making the registers
  inaccessible to the hypervisor by en-/decrypting them on world
  switches. Thus, it adds additional protection to Linux guests against
  exfiltration, control flow and rollback attacks.

  With SEV-ES, the guest is in full control of what registers the
  hypervisor can access. This is provided by a guest-host exchange
  mechanism based on a new exception vector called VMM Communication
  Exception (#VC), a new instruction called VMGEXIT and a shared
  Guest-Host Communication Block which is a decrypted page shared
  between the guest and the hypervisor.

  Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest
  so in order for that exception mechanism to work, the early x86 init
  code needed to be made able to handle exceptions, which, in itself,
  brings a bunch of very nice cleanups and improvements to the early
  boot code like an early page fault handler, allowing for on-demand
  building of the identity mapping. With that, !KASLR configurations do
  not use the EFI page table anymore but switch to a kernel-controlled
  one.

  The main part of this series adds the support for that new exchange
  mechanism. The goal has been to keep this as separate as possible
  from the core x86 code by concentrating the machinery in two
  SEV-ES-specific files:

    arch/x86/kernel/sev-es-shared.c
    arch/x86/kernel/sev-es.c

  Other interaction with core x86 code has been kept to a minimum and
  behind static keys to minimize the performance impact on !SEV-ES
  setups.

  Work by Joerg Roedel and Thomas Lendacky and others"

* tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer
  x86/sev-es: Check required CPU features for SEV-ES
  x86/efi: Add GHCB mappings when SEV-ES is active
  x86/sev-es: Handle NMI State
  x86/sev-es: Support CPU offline/online
  x86/head/64: Don't call verify_cpu() on starting APs
  x86/smpboot: Load TSS and getcpu GDT entry before loading IDT
  x86/realmode: Setup AP jump table
  x86/realmode: Add SEV-ES specific trampoline entry point
  x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES
  x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES
  x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES
  x86/sev-es: Handle #DB Events
  x86/sev-es: Handle #AC Events
  x86/sev-es: Handle VMMCALL Events
  x86/sev-es: Handle MWAIT/MWAITX Events
  x86/sev-es: Handle MONITOR/MONITORX Events
  x86/sev-es: Handle INVD Events
  x86/sev-es: Handle RDPMC Events
  x86/sev-es: Handle RDTSC(P) Events
  ...
2020-10-14 10:21:34 -07:00
Linus Torvalds
6873139ed0 objtool changes for v5.10:
- Most of the changes are cleanups and reorganization to make the objtool code
    more arch-agnostic. This is in preparation for non-x86 support.
 
 Fixes:
 
  - KASAN fixes.
  - Handle unreachable trap after call to noreturn functions better.
  - Ignore unreachable fake jumps.
  - Misc smaller fixes & cleanups.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+FgwIRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1juGw/6A6goA5/HHapM965yG1eY/rTLp3eIbcma
 1ZbkUsP0YfT6wVUzw/sOeZzKNOwOq1FuMfkjuH2KcnlxlcMekIaKvLk8uauW4igM
 hbFGuuZfZ0An5ka9iQ1W6HGdsuD3vVlN1w/kxdWk0c3lJCVQSTxdCfzF8fuF3gxX
 lF3Bc1D/ZFcHIHT/hu/jeIUCgCYpD3qZDjQJBScSwVthZC+Fw6weLLGp2rKDaCao
 HhSQft6MUfDrUKfH3LBIUNPRPCOrHo5+AX6BXxLXJVxqlwO/YU3e0GMwSLedMtBy
 TASWo7/9GAp+wNNZe8EliyTKrfC3sLxN1QImfjuojxbBVXx/YQ/ToTt9fVGpF4Y+
 XhhRFv9520v1tS2wPHIgQGwbh7EWG6mdrmo10RAs/31ViONPrbEZ4WmcA08b/5FY
 KEkOVb18yfmDVzVZPpSc+HpIFkppEBOf7wPg27Bj3RTZmzIl/y+rKSnxROpsJsWb
 R6iov7SFVET14lHl1G7tPNXfqRaS7HaOQIj3rSUyAP0ZfX+yIupVJp32dc6Ofg8b
 SddUCwdIHoFdUNz4Y9csUCrewtCVJbxhV4MIdv0GpWbrgSw96RFZgetaH+6mGRpj
 0Kh6M1eC3irDbhBuarWUBAr2doPAq4iOUeQU36Q6YSAbCs83Ws2uKOWOHoFBVwCH
 uSKT0wqqG+E=
 =KX5o
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "Most of the changes are cleanups and reorganization to make the
  objtool code more arch-agnostic. This is in preparation for non-x86
  support.

  Other changes:

   - KASAN fixes

   - Handle unreachable trap after call to noreturn functions better

   - Ignore unreachable fake jumps

   - Misc smaller fixes & cleanups"

* tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf build: Allow nested externs to enable BUILD_BUG() usage
  objtool: Allow nested externs to enable BUILD_BUG()
  objtool: Permit __kasan_check_{read,write} under UACCESS
  objtool: Ignore unreachable trap after call to noreturn functions
  objtool: Handle calling non-function symbols in other sections
  objtool: Ignore unreachable fake jumps
  objtool: Remove useless tests before save_reg()
  objtool: Decode unwind hint register depending on architecture
  objtool: Make unwind hint definitions available to other architectures
  objtool: Only include valid definitions depending on source file type
  objtool: Rename frame.h -> objtool.h
  objtool: Refactor jump table code to support other architectures
  objtool: Make relocation in alternative handling arch dependent
  objtool: Abstract alternative special case handling
  objtool: Move macros describing structures to arch-dependent code
  objtool: Make sync-check consider the target architecture
  objtool: Group headers to check in a single list
  objtool: Define 'struct orc_entry' only when needed
  objtool: Skip ORC entry creation for non-text sections
  objtool: Move ORC logic out of check()
  ...
2020-10-14 10:13:37 -07:00
Alexander Graf
fd6fa73d13 KVM: x86: SVM: Prevent MSR passthrough when MSR access is denied
We will introduce the concept of MSRs that may not be handled in kernel
space soon. Some MSRs are directly passed through to the guest, effectively
making them handled by KVM from user space's point of view.

This patch introduces all logic required to ensure that MSRs that
user space wants trapped are not marked as direct access for guests.
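
A small standalone sketch of the rule described above (the filter lookup
and the MSR numbers are invented for illustration): direct access is
granted only when userspace has not asked for the MSR to be trapped.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative stand-in for "userspace wants this MSR trapped". */
    static bool msr_trapped_by_userspace(unsigned int msr)
    {
        static const unsigned int trapped[] = { 0x10, 0xc0000080 };
        unsigned int i;

        for (i = 0; i < sizeof(trapped) / sizeof(trapped[0]); i++)
            if (trapped[i] == msr)
                return true;
        return false;
    }

    /* Only mark the MSR for direct (passthrough) access if KVM would pass
     * it through anyway AND userspace has not requested interception. */
    static bool allow_direct_access(unsigned int msr, bool kvm_passthrough)
    {
        return kvm_passthrough && !msr_trapped_by_userspace(msr);
    }

    int main(void)
    {
        printf("%d\n", allow_direct_access(0xc0000080, true));  /* 0: trapped */
        printf("%d\n", allow_direct_access(0xc0000100, true));  /* 1: direct  */
        return 0;
    }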

Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20200925143422.21718-6-graf@amazon.com>
[Make terminology a bit more similar to VMX. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:58:06 -04:00
Aaron Lewis
476c9bd8e9 KVM: x86: Prepare MSR bitmaps for userspace tracked MSRs
Prepare vmx and svm for a subsequent change that ensures the MSR permission
bitmap is set to allow an MSR that userspace is tracking to force a vmx_vmexit
in the guest.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
[agraf: rebase, adapt SVM scheme to nested changes that came in between]
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20200925143422.21718-5-graf@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:58:05 -04:00
Sean Christopherson
cc167bd7ee KVM: x86: Use common definition for kvm_nested_vmexit tracepoint
Use the newly introduced TRACE_EVENT_KVM_EXIT to define the guts of
kvm_nested_vmexit so that it captures and prints the same information as
kvm_exit.  This has the bonus side effect of fixing the interrupt info
and error code printing for the case where they're invalid, e.g. if the
exit was a failed VM-Entry.  This also sets the stage for retrieving
EXIT_QUALIFICATION and VM_EXIT_INTR_INFO in nested_vmx_reflect_vmexit()
if and only if the VM-Exit is being routed to L1.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923201349.16097-7-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:52 -04:00
Sean Christopherson
235ba74f00 KVM: x86: Add intr/vectoring info and error code to kvm_exit tracepoint
Extend the kvm_exit tracepoint to align it with kvm_nested_vmexit in
terms of what information is captured.  On SVM, add interrupt info and
error code, while on VMX it adds IDT vectoring and error code.  This
sets the stage for macrofying the kvm_exit tracepoint definition so that
it can be reused for kvm_nested_vmexit without loss of information.

Opportunistically stuff a zero for VM_EXIT_INTR_INFO if the VM-Enter
failed, as the field is guaranteed to be invalid.  Note, it'd be
possible to further filter the interrupt/exception fields based on the
VM-Exit reason, but the helper is intended only for tracepoints, i.e.
an extra VMREAD or two is a non-issue, the failed VM-Enter case is just
low hanging fruit.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923201349.16097-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:51 -04:00
Sean Christopherson
a9d7d76c66 KVM: x86: Read guest RIP from within the kvm_nested_vmexit tracepoint
Use kvm_rip_read() to read the guest's RIP for the nested VM-Exit
tracepoint instead of having the caller pass in an argument.  Params
that are passed into a tracepoint are evaluated even if the tracepoint
is disabled, i.e. passing in RIP for VMX incurs a VMREAD and retpoline
to retrieve a value that may never be used, e.g. if the exit is due to a
hardware interrupt.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923201349.16097-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:50 -04:00
Haiwei Li
95b28ac9db KVM: SVM: Add tracepoint for cr_interception
Add trace_kvm_cr_write and trace_kvm_cr_read for svm.

Signed-off-by: Haiwei Li <lihaiwei@tencent.com>
Message-Id: <f3031602-db3b-c4fe-b719-d402663b0a2b@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:21 -04:00
Wanpeng Li
4e810adb53 KVM: SVM: Analyze is_guest_mode() in svm_vcpu_run()
Analyze is_guest_mode() in svm_vcpu_run() instead of svm_exit_handlers_fastpath()
in conformity with the VMX version.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1600066548-4343-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:21 -04:00
Sean Christopherson
09e3e2a1cc KVM: x86: Add kvm_x86_ops hook to short circuit emulation
Replace the existing kvm_x86_ops.need_emulation_on_page_fault() with a
more generic is_emulatable(), and unconditionally call the new function
in x86_emulate_instruction().

KVM will use the generic hook to support multiple security related
technologies that prevent emulation in one way or another.  Similar to
the existing AMD #NPF case where emulation of the current instruction is
not possible due to lack of information, AMD's SEV-ES and Intel's SGX
and TDX will introduce scenarios where emulation is impossible due to
the guest's register state being inaccessible.  And again similar to the
existing #NPF case, emulation can be initiated by kvm_mmu_page_fault(),
i.e. outside of the control of vendor-specific code.

While the cause and architecturally visible behavior of the various
cases are different, e.g. SGX will inject a #UD, AMD #NPF is a clean
resume or complete shutdown, and SEV-ES and TDX "return" an error, the
impact on the common emulation code is identical: KVM must stop
emulation immediately and resume the guest.

Query is_emulatable() in handle_ud() as well so that the
force_emulation_prefix code doesn't incorrectly modify RIP before
calling emulate_instruction() in the absurdly unlikely scenario that
KVM encounters forced emulation in conjunction with "do not emulate".
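
A hedged sketch of the control flow (only the hook's purpose comes from
the text above; the structures and names here are illustrative): the
common emulation entry point asks the vendor hook first and bails out,
resuming the guest, when emulation is impossible.

    #include <stdbool.h>
    #include <stdio.h>

    struct vcpu_sketch {
        bool guest_state_accessible;    /* false for the SEV-ES/SGX/TDX cases */
    };

    struct x86_ops_sketch {
        bool (*is_emulatable)(struct vcpu_sketch *vcpu);
    };

    static bool vendor_is_emulatable(struct vcpu_sketch *vcpu)
    {
        return vcpu->guest_state_accessible;
    }

    /* 1 == "handled, resume the guest"; the point is the early bail-out
     * before any instruction fetch or decode happens. */
    static int x86_emulate_instruction_sketch(const struct x86_ops_sketch *ops,
                                              struct vcpu_sketch *vcpu)
    {
        if (!ops->is_emulatable(vcpu))
            return 1;                   /* stop emulation immediately */

        /* ... fetch, decode and emulate the instruction here ... */
        return 1;
    }

    int main(void)
    {
        struct x86_ops_sketch ops = { .is_emulatable = vendor_is_emulatable };
        struct vcpu_sketch vcpu = { .guest_state_accessible = false };

        printf("%d\n", x86_emulate_instruction_sketch(&ops, &vcpu));
        return 0;
    }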

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200915232702.15945-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:19 -04:00
Babu Moger
4407a797e9 KVM: SVM: Enable INVPCID feature on AMD
The following intercept bit has been added to support VMEXIT
for the INVPCID instruction:
Code    Name            Cause
A2h     VMEXIT_INVPCID  INVPCID instruction

The following bit has been added to the VMCB layout control area
to control intercept of INVPCID:
Byte Offset     Bit(s)    Function
14h             2         intercept INVPCID

Enable the interception when the guest is running with shadow
page tables enabled and handle the TLB flush based on the INVPCID
instruction type.

For guests with nested page table (NPT) support, the INVPCID
instruction works as if run natively; KVM does not need to do any
special handling in this case.

AMD documentation for INVPCID feature is available at "AMD64
Architecture Programmer’s Manual Volume 2: System Programming,
Pub. 24593 Rev. 3.34(or later)"

The documentation can be obtained at the links below:
Link: https://www.amd.com/system/files/TechDocs/24593.pdf
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
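
A minimal standalone sketch of the enablement rule (only the A2h exit
code comes from the text above; the flags and helper are illustrative
assumptions): trap INVPCID only when NPT is not in use, because with NPT
the instruction can run natively.

    #include <stdbool.h>
    #include <stdio.h>

    #define SVM_EXIT_INVPCID_SKETCH 0xA2    /* exit code documented as A2h */

    struct vcpu_flags {
        bool invpcid_supported;     /* INVPCID available to the guest */
        bool npt_enabled;           /* nested paging in use */
    };

    /* With shadow paging KVM must see INVPCID so it can do the TLB flush
     * itself; with NPT no intercept is needed. */
    static bool should_intercept_invpcid(const struct vcpu_flags *v)
    {
        return v->invpcid_supported && !v->npt_enabled;
    }

    int main(void)
    {
        struct vcpu_flags shadow = { true, false }, npt = { true, true };

        printf("shadow paging: %d, NPT: %d\n",
               should_intercept_invpcid(&shadow),
               should_intercept_invpcid(&npt));
        return 0;
    }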

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985255929.11252.17346684135277453258.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:17 -04:00
Babu Moger
830bd71f2c KVM: SVM: Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept
Remove set_cr_intercept, clr_cr_intercept and is_cr_intercept. Instead
call the generic svm_set_intercept, svm_clr_intercept and svm_is_intercept
for all CR intercepts.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985253016.11252.16945893859439811480.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:15 -04:00
Babu Moger
c62e2e94b9 KVM: SVM: Modify 64 bit intercept field to two 32 bit vectors
Convert all the intercepts to one array of 32 bit vectors in
vmcb_control_area. This makes it easy for future intercept vector
additions. Also update trace functions.
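
A self-contained sketch of the data-structure change (the array size and
field name are assumptions; the helper names follow the neighbouring
commits): a flat intercept bit number indexes into an array of 32-bit
vectors, and one set/clear/test trio serves every intercept type.

    #include <stdint.h>
    #include <stdio.h>

    #define MAX_INTERCEPT_VECTORS 5         /* illustrative count */

    struct vmcb_control_sketch {
        uint32_t intercepts[MAX_INTERCEPT_VECTORS];
    };

    static void vmcb_set_intercept(struct vmcb_control_sketch *c, unsigned int bit)
    {
        c->intercepts[bit / 32] |= 1u << (bit % 32);
    }

    static void vmcb_clr_intercept(struct vmcb_control_sketch *c, unsigned int bit)
    {
        c->intercepts[bit / 32] &= ~(1u << (bit % 32));
    }

    static int vmcb_is_intercept(const struct vmcb_control_sketch *c, unsigned int bit)
    {
        return (c->intercepts[bit / 32] >> (bit % 32)) & 1;
    }

    int main(void)
    {
        struct vmcb_control_sketch c = { { 0 } };

        vmcb_set_intercept(&c, 34);                 /* a bit in the second vector */
        printf("%d\n", vmcb_is_intercept(&c, 34));  /* 1 */
        vmcb_clr_intercept(&c, 34);
        printf("%d\n", vmcb_is_intercept(&c, 34));  /* 0 */
        return 0;
    }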

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985250813.11252.5736581193881040525.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:15 -04:00
Babu Moger
9780d51dc2 KVM: SVM: Modify intercept_exceptions to generic intercepts
Modify intercept_exceptions to generic intercepts in vmcb_control_area. Use
the generic vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept to
set/clear/test the intercept_exceptions bits.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985250037.11252.1361972528657052410.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:14 -04:00
Babu Moger
30abaa8838 KVM: SVM: Change intercept_dr to generic intercepts
Modify intercept_dr to generic intercepts in vmcb_control_area. Use
the generic vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept
to set/clear/test the intercept_dr bits.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985249255.11252.10000868032136333355.stgit@bmoger-ubuntu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:14 -04:00
Babu Moger
03bfeeb988 KVM: SVM: Change intercept_cr to generic intercepts
Change intercept_cr to generic intercepts in vmcb_control_area.
Use the new vmcb_set_intercept, vmcb_clr_intercept and vmcb_is_intercept
where applicable.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <159985248506.11252.9081085950784508671.stgit@bmoger-ubuntu>
[Change constant names. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:13 -04:00
Maxim Levitsky
8d22b90e94 KVM: SVM: refactor exit labels in svm_create_vcpu
Kernel coding style suggests not to use labels like error1, error2.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827171145.374620-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:12 -04:00
Maxim Levitsky
0681de1b83 KVM: SVM: use __GFP_ZERO instead of clear_page
Another small refactoring.

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827171145.374620-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:11 -04:00
Maxim Levitsky
f4c847a956 KVM: SVM: refactor msr permission bitmap allocation
Replace svm_vcpu_init_msrpm with svm_vcpu_alloc_msrpm, which also allocates
the MSR bitmap, and add svm_vcpu_free_msrpm to free it.

This will be used later to move the nested msr permission bitmap allocation
to nested.c
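
A hedged sketch of the split (the bitmap size and the "all ones means
intercept everything" initialization are placeholder assumptions): one
helper allocates and initializes the bitmap, a matching helper frees it,
so nested code can later call the same pair.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define MSRPM_SIZE_SKETCH (2 * 4096)    /* placeholder size */

    /* Allocate and initialize an MSR permission bitmap. */
    static uint8_t *svm_vcpu_alloc_msrpm_sketch(void)
    {
        uint8_t *msrpm = malloc(MSRPM_SIZE_SKETCH);

        if (msrpm)
            memset(msrpm, 0xff, MSRPM_SIZE_SKETCH);
        return msrpm;
    }

    static void svm_vcpu_free_msrpm_sketch(uint8_t *msrpm)
    {
        free(msrpm);
    }

    int main(void)
    {
        uint8_t *msrpm = svm_vcpu_alloc_msrpm_sketch();

        svm_vcpu_free_msrpm_sketch(msrpm);
        return 0;
    }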

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827171145.374620-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:11 -04:00
Maxim Levitsky
0dd16b5b0c KVM: nSVM: rename nested vmcb to vmcb12
This is to be more consistent with VMX, and to support the
upcoming addition of vmcb02.

Hopefully no functional changes.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827171145.374620-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:10 -04:00
Maxim Levitsky
1feaba144c KVM: SVM: rename a variable in the svm_create_vcpu
The 'page' variable holds the vcpu's vmcb, so name it as such to
avoid confusion.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20200827171145.374620-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:10 -04:00
Wanpeng Li
010fd37fdd KVM: LAPIC: Reduce world switch latency caused by timer_advance_ns
All the checks in lapic_timer_int_injected() and __kvm_wait_lapic_expire(),
as well as the function calls themselves, waste CPU cycles when the timer
mode is not tscdeadline. We can observe ~1.3% world-switch time overhead with
kvm-unit-tests/vmexit.flat vmcall testing on an AMD server. This patch reduces
the world-switch latency caused by the timer_advance_ns feature when the timer
mode is not tscdeadline by simply moving the check against
apic->lapic_timer.expired_tscdeadline much earlier.
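
A hedged illustration of the reordering only (the structure and field
names are stand-ins): the cheap mode/expiry check is done first, so the
more expensive bookkeeping is skipped entirely on the non-tscdeadline
path.

    #include <stdbool.h>

    struct lapic_timer_sketch {
        bool tscdeadline_mode;
        unsigned long long expired_tscdeadline;
    };

    /* Do the cheap check up front: if there is nothing to wait for, return
     * before touching any of the costlier "advance the timer" logic. */
    static void wait_lapic_expire_sketch(struct lapic_timer_sketch *t)
    {
        if (!t->tscdeadline_mode || !t->expired_tscdeadline)
            return;

        /* ... timer_advance_ns handling would run only here ... */
    }

    int main(void)
    {
        struct lapic_timer_sketch t = { .tscdeadline_mode = false };

        wait_lapic_expire_sketch(&t);   /* returns immediately */
        return 0;
    }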

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1599731444-3525-7-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:57:10 -04:00
Linus Torvalds
7c7ec3226f Five small fixes. The nested migration bug will be fixed
with a better API in 5.10 or 5.11, for now this is a fix
 that works with existing userspace but keeps the current
 ugly API.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl9ufLMUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMe9AgAgU3YQ2SktkqEOXjHMLqCH5Y3PKFI
 S2anYpoKlH36Q6kzoqtkCj0GVagvdh5+Envz3I/tMdhv3Y/JgZaX1wHAe4cUl9BT
 VyoiDBTWkhYRmpUbLYA8AtmgxQw1Hp8srH86rnvVGmLG6zdAa/rgUAKiQgT688Ej
 CQvF5H7Zi3viPo2rInNSkgTIgewduqSWkwJ6+h4AQMmNJpbRaeZs45yMYyyu/FIi
 hUazy7Rwk2vkWcuTd/sqH9b9y3VCYpN9juRaehEiK8qxXT3ydTU4Tub25BHmvXdr
 dx5pShG4P3nAGnfV1qKAemyQcY7sjfMieqN1F3QcsRcxqZgySUm11o2JRw==
 =sHsX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm fixes from Paolo Bonzini:
 "Five small fixes.

  The nested migration bug will be fixed with a better API in 5.10 or
  5.11, for now this is a fix that works with existing userspace but
  keeps the current ugly API"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Add a dedicated INVD intercept routine
  KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
  KVM: x86: fix MSR_IA32_TSC read for nested migration
  selftests: kvm: Fix assert failure in single-step test
  KVM: x86: VMX: Make smaller physical guest address space support user-configurable
2020-09-25 17:15:19 -07:00
Tom Lendacky
4bb05f3048 KVM: SVM: Add a dedicated INVD intercept routine
The INVD instruction intercept performs emulation. Emulation can't be done
on an SEV guest because the guest memory is encrypted.

Provide a dedicated intercept routine for the INVD intercept. And since
the instruction is emulated as a NOP, just skip it instead.
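
A minimal sketch of the handler shape (helper names are illustrative, not
the kernel's): instead of going through the emulator, which cannot read
encrypted SEV guest memory, the intercept simply skips over the
instruction.

    /* Stand-in for KVM's "advance RIP past the current instruction" helper. */
    static int skip_emulated_instruction_sketch(void *vcpu)
    {
        (void)vcpu;
        return 1;               /* 1 == handled, resume the guest */
    }

    /* INVD is emulated as a NOP anyway, so just skip it. */
    static int invd_interception_sketch(void *vcpu)
    {
        return skip_emulated_instruction_sketch(vcpu);
    }

    int main(void)
    {
        return invd_interception_sketch(0) == 1 ? 0 : 1;
    }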

Fixes: 1654efcbc4 ("KVM: SVM: Add KVM_SEV_INIT command")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <a0b9a19ffa7fef86a3cc700c7ea01cb2731e04e5.1600972918.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-25 13:27:35 -04:00
Paolo Bonzini
bf3c0e5e71 Merge branch 'x86-seves-for-paolo' of https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD 2020-09-22 06:43:17 -04:00
Linus Torvalds
84b1349972 ARM:
- Multiple stolen time fixes, with a new capability to match x86
 - Fix for hugetlbfs mappings when PUD and PMD are the same level
 - Fix for hugetlbfs mappings when PTE mappings are enforced
   (dirty logging, for example)
 - Fix tracing output of 64bit values
 
 x86:
 - nSVM state restore fixes
 - Async page fault fixes
 - Lots of small fixes everywhere
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl9dM5kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM+Iwf+LbISO7ccpPMK1kKtOeug/jZv+xQA
 sVaBGRzYo+k2e0XtV8E8IV4N30FBtYSwXsbBKkMAoy2FpmMebgDWDQ7xspb6RJMS
 /y8t1iqPwdOaLIkUkgc7UihSTlZm05Es3f3q6uZ9+oaM4Fe+V7xWzTUX4Oy89JO7
 KcQsTD7pMqS4bfZGADK781ITR/WPgCi0aYx5s6dcwcZAQXhb1K1UKEjB8OGKnjUh
 jliReJtxRA16rjF+S5aJ7L07Ce/ksrfwkI4NXJ4GxW+lyOfVNdSBJUBaZt1m7G2M
 1We5+i5EjKCjuxmgtUUUfVdazpj1yl+gBGT7KKkLte9T9WZdXyDnixAbvg==
 =OFb3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit on the bigger side, mostly due to me being on vacation, then
  busy, then on parental leave, but there's nothing worrisome.

  ARM:
   - Multiple stolen time fixes, with a new capability to match x86
   - Fix for hugetlbfs mappings when PUD and PMD are the same level
   - Fix for hugetlbfs mappings when PTE mappings are enforced (dirty
     logging, for example)
   - Fix tracing output of 64bit values

  x86:
   - nSVM state restore fixes
   - Async page fault fixes
   - Lots of small fixes everywhere"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: emulator: more strict rsm checks.
  KVM: nSVM: more strict SMM checks when returning to nested guest
  SVM: nSVM: setup nested msr permission bitmap on nested state load
  SVM: nSVM: correctly restore GIF on vmexit from nesting after migration
  x86/kvm: don't forget to ACK async PF IRQ
  x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro
  KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit
  KVM: SVM: avoid emulation with stale next_rip
  KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN
  KVM: SVM: Periodically schedule when unregistering regions on destroy
  KVM: MIPS: Change the definition of kvm type
  kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed
  KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control
  KVM: fix memory leak in kvm_io_bus_unregister_dev()
  KVM: Check the allocation of pv cpu mask
  KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected
  KVM: arm64: Update page shift if stage 2 block mapping not supported
  KVM: arm64: Fix address truncation in traces
  KVM: arm64: Do not try to map PUDs when they are folded into PMD
  arm64/x86: KVM: Introduce steal-time cap
  ...
2020-09-13 08:34:47 -07:00
Maxim Levitsky
3ebb5d2617 KVM: nSVM: more strict SMM checks when returning to nested guest
* check that the guest is a 64-bit guest, otherwise the SVM-related fields
  in the SMM state area are not defined

* If the SMM area indicates that SMM interrupted a running guest,
  check that EFER.SVME, which is also saved in this area, is set; otherwise
  the guest might have tampered with the SMM save area, so indicate an
  emulation failure, which should triple fault the guest (a rough sketch of
  this check follows below)

* Check that the guest CPUID supports SVM (due to the same issue as above)
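
A rough standalone sketch of the EFER.SVME consistency check mentioned
above (the save-area layout and the return convention are assumptions):
if SMM claims to have interrupted a running guest but the saved EFER has
SVME clear, the state is inconsistent and the return from SMM is refused.

    #include <stdbool.h>
    #include <stdio.h>

    #define EFER_SVME_SKETCH (1ULL << 12)   /* EFER.SVME bit */

    struct smm_save_area_sketch {
        unsigned long long efer;
        bool guest_interrupted;     /* SMM entered while a nested guest ran */
    };

    /* Returning false signals an emulation failure (which triple faults
     * the guest) when the save area is self-inconsistent. */
    static bool smm_state_is_consistent(const struct smm_save_area_sketch *s)
    {
        if (s->guest_interrupted && !(s->efer & EFER_SVME_SKETCH))
            return false;
        return true;
    }

    int main(void)
    {
        struct smm_save_area_sketch bad = { .efer = 0, .guest_interrupted = true };

        printf("consistent: %d\n", smm_state_is_consistent(&bad));     /* 0 */
        return 0;
    }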

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827162720.278690-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-12 12:21:43 -04:00
Wanpeng Li
e42c68281b KVM: SVM: avoid emulation with stale next_rip
svm->next_rip is reset in svm_vcpu_run() only after calling
svm_exit_handlers_fastpath(), which will cause SVM's
skip_emulated_instruction() to write a stale RIP.

We can move svm_exit_handlers_fastpath towards the end of
svm_vcpu_run().  To align VMX with SVM, keep svm_complete_interrupts()
close as well.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Paul K. <kronenpj@kronenpj.dyndns.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
[Also move vmcb_mark_all_clean before any possible write to the VMCB.
 - Paolo]
2020-09-12 02:19:23 -04:00
Julien Thierry
00089c048e objtool: Rename frame.h -> objtool.h
Header frame.h is getting more code annotations to help objtool analyze
object files.

Rename the file to objtool.h.

[ jpoimboe: add objtool.h to MAINTAINERS ]

Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-09-10 10:43:13 -05:00
Tom Lendacky
d07f46f9f5 KVM: SVM: Add GHCB definitions
Extend the vmcb_save_area with SEV-ES fields and add a new
'struct ghcb' which will be used for guest-hypervisor communication.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-3-joro@8bytes.org
2020-09-07 19:45:24 +02:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
markings where they are unnecessary.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
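
A tiny compilable example of the conversion (the fallback definition here
is a simplification; the kernel's real one lives in its compiler
attribute headers): a "fall through" comment becomes the fallthrough
pseudo-keyword.

    #include <stdio.h>

    #ifdef __has_attribute
    # if __has_attribute(__fallthrough__)
    #  define fallthrough __attribute__((__fallthrough__))
    # endif
    #endif
    #ifndef fallthrough
    # define fallthrough do {} while (0)    /* simplified fallback */
    #endif

    static int classify(int c)
    {
        int score = 0;

        switch (c) {
        case 'a':
            score += 1;
            fallthrough;        /* was a "fall through" comment */
        case 'e':
            score += 1;
            break;
        default:
            break;
        }
        return score;
    }

    int main(void)
    {
        printf("%d %d\n", classify('a'), classify('e'));    /* 2 1 */
        return 0;
    }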

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
921d2597ab s390: implement diag318
x86:
 * Report last CPU for debugging
 * Emulate smaller MAXPHYADDR in the guest than in the host
 * .noinstr and tracing fixes from Thomas
 * nested SVM page table switching optimization and fixes
 
 Generic:
 * Unify shadow MMU cache data structures across architectures
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/
 ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1
 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5
 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84
 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh
 Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg==
 =YVx0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...
2020-08-06 12:59:31 -07:00