Add x86-64 support for exception fixup on single instructions, without
forcing tests to install their own fault handlers. Use registers r9-r11
to flag the instruction as "safe" and pass fixup/vector information,
i.e. introduce yet another flavor of fixup (versus the kernel's in-memory
tables and KUT's per-CPU area) to take advantage of KVM sefltests being
64-bit only.
Using only registers avoids the need to allocate fixup tables, ensure
FS or GS base is valid for the guest, ensure memory is mapped into the
guest, etc..., and also reduces the potential for recursive faults due to
accessing memory.
Providing exception fixup trivializes tests that just want to verify that
an instruction faults, e.g. no need to track start/end using global
labels, no need to install a dedicated handler, etc...
Deliberately do not support #DE in exception fixup so that the fixup glue
doesn't need to account for a fault with vector == 0, i.e. the vector can
also indicate that a fault occurred. KVM injects #DE only for esoteric
emulation scenarios, i.e. there's very, very little value in testing #DE.
Force any test that wants to generate #DEs to install its own handler(s).
Use kvm_pv_test as a guinea pig for the new fixup, as it has a very
straightforward use case of wanting to verify that RDMSR and WRMSR fault.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220608224516.3788274-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace calls to kvm_check_cap() that treat its return as a boolean with
calls to kvm_has_cap(). Several instances of kvm_check_cap() were missed
when kvm_has_cap() was introduced.
Reported-by: Andrew Jones <drjones@redhat.com>
Fixes: 3ea9b80965 ("KVM: selftests: Add kvm_has_cap() to provide syntactic sugar")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220613161942.1586791-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add TEST_REQUIRE() and __TEST_REQUIRE() to replace the myriad open coded
instances of selftests exiting with KSFT_SKIP after printing an
informational message. In addition to reducing the amount of boilerplate
code in selftests, the UPPERCASE macro names make it easier to visually
identify a test's requirements.
Convert usage that erroneously uses something other than print_skip()
and/or "exits" with '0' or some other non-KSFT_SKIP value.
Intentionally drop a kvm_vm_free() in aarch64/debug-exceptions.c as part
of the conversion. All memory and file descriptors are freed on process
exit, so the explicit free is superfluous.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Take a vCPU directly instead of a VM+vcpu pair in all vCPU-scoped helpers
and ioctls.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename vm_vcpu_add() to __vm_vcpu_add(), and vm_vcpu_add_default() to
vm_vcpu_add() to show the relationship between the newly minted
vm_vcpu_add() and __vm_vcpu_add().
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Return the created 'struct kvm_vcpu' object from vm_vcpu_add_default(),
which cleans up a few tests and will eventually allow removing vcpu_get()
entirely.
Opportunistically rename @vcpuid to @vcpu_id to follow preferred kernel
style.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add "arch" into the name of utility functions that are declared in common
code, but (surprise!) have arch-specific implementations. Shuffle code
around so that all such helpers' declarations are bundled together.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An "unrestricted guest" is an VMX-only concept, move the relevant helper
to x86-64 code. Assume most readers can correctly convert underscores to
spaces and oppurtunistically trim the function comment.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add helpers for the various one-off helpers used by x86's vCPU state
save/restore helpers, and convert the other open coded ioctl()s to use
existing helpers.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the get/set part of the MP_STATE and GUEST_DEBUG helpers to the end
to align with the many other ioctl() wrappers/helpers. Note, this is not
an endorsement of the predominant style, the goal is purely to provide
consistency in the selftests.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Consolidate the helper for retrieving the list of save/restore MSRs and
the list of feature MSRs, and use the common helpers in the related
get_msr_index_features test. Switching to the common helpers eliminates
the testcase that KVM returns the same -E2BIG result if the input number
of MSRs is '1' versus '0', but considered that testcase isn't very
interesting, e.g. '0' and '1' are equally arbitrary, and certainly not
worth the additional code.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cache the list of MSRs to save restore, mostly to justify not freeing the
list in the caller, which simplifies consumption of the list.
Opportunistically move the XSS test's so called is_supported_msr() to
common code as kvm_msr_is_in_save_restore_list(). The XSS is "supported"
by KVM, it's simply not in the save/restore list because KVM doesn't yet
allow a non-zero value.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fold kvm_util_internal.h into kvm_util_base.h, i.e. make all KVM utility
stuff "public". Hiding struct implementations from tests has been a
massive failure, as it has led to pointless and poorly named wrappers,
unnecessarily opaque code, etc...
Not to mention that the approach was a complete failure as evidenced by
the non-zero number of tests that were including kvm_util_internal.h.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make regs_dump() and sregs_dump() static, they're only implemented by
x86 and only used internally.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the recently introduced KVM-specific ioctl() helpers instead of open
coding calls to ioctl() just to pretty print the ioctl name.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add vcpu_get() to wrap vcpu_find() and deduplicate a pile of code that
asserts the requested vCPU exists.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the recently introduced vCPU-specific ioctl() helpers instead of
open coding calls to ioctl() just to pretty print the ioctl name.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_page_size is an enum used to communicate the desired page size with
which to map a range of memory. Under the hood they just encode the
desired level at which to map the page. This ends up being clunky in a
few ways:
- The name suggests it encodes the size of the page rather than the
level.
- In other places in x86_64/processor.c we just use a raw int to encode
the level.
Simplify this by adopting the kernel style of PG_LEVEL_XX enums and pass
around raw ints when referring to the level. This makes the code easier
to understand since these macros are very common in KVM MMU code.
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up code that was hardcoding masks for various fields,
now that the masks are included in processor.h.
For more cleanup, define PAGE_SIZE and PAGE_MASK just like in Linux.
PAGE_SIZE in particular was defined by several tests.
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Red Hat's QE team reported test failure on access_tracking_perf_test:
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory offset: 0x3fffbffff000
Populating memory : 0.684014577s
Writing to populated memory : 0.006230175s
Reading from populated memory : 0.004557805s
==== Test Assertion Failure ====
lib/kvm_util.c:1411: false
pid=125806 tid=125809 errno=4 - Interrupted system call
1 0x0000000000402f7c: addr_gpa2hva at kvm_util.c:1411
2 (inlined by) addr_gpa2hva at kvm_util.c:1405
3 0x0000000000401f52: lookup_pfn at access_tracking_perf_test.c:98
4 (inlined by) mark_vcpu_memory_idle at access_tracking_perf_test.c:152
5 (inlined by) vcpu_thread_main at access_tracking_perf_test.c:232
6 0x00007fefe9ff81ce: ?? ??:0
7 0x00007fefe9c64d82: ?? ??:0
No vm physical memory at 0xffbffff000
I can easily reproduce it with a Intel(R) Xeon(R) CPU E5-2630 with 46 bits
PA.
It turns out that the address translation for clearing idle page tracking
returned a wrong result; addr_gva2gpa()'s last step, which is based on
"pte[index[0]].pfn", did the calculation with 40 bits length and the
high 12 bits got truncated. In above case the GPA address to be returned
should be 0x3fffbffff000 for GVA 0xc0000000, but it got truncated into
0xffbffff000 and the subsequent gpa2hva lookup failed.
The width of operations on bit fields greater than 32-bit is
implementation defined, and differs between GCC (which uses the bitfield
precision) and clang (which uses 64-bit arithmetic), so this is a
potential minefield. Remove the bit fields and using manual masking
instead.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2075036
Reported-by: Nana Liu <nanliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no need for tests other than amx_test to enable dynamic xsave
states. Remove the call to vm_xsave_req_perm from generic code,
and move it inside the test. While at it, allow customizing the bit
that is requested, so that future tests can use it differently.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The following warning appears when executing
make -C tools/testing/selftests/kvm
include/x86_64/processor.h:290:2: warning: 'ecx' may be used uninitialized in this
function [-Wmaybe-uninitialized]
asm volatile("cpuid"
^~~
lib/x86_64/processor.c:1523:21: note: 'ecx' was declared here
uint32_t eax, ebx, ecx, edx, max_ext_leaf;
Just initialize ecx to remove this warning.
Fixes: c8cc43c1ea ("selftests: KVM: avoid failures due to reserved HyperTransport region")
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Message-Id: <20220119140325.59369-1-cloudliang@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For AMX support it is recommended to load XCR0 after XFD, so
that KVM does not see XFD=0, XCR=1 for a save state that will
eventually be disabled (which would lead to premature allocation
of the space required for that save state).
It is also required to load XSAVE data after XCR0 and XFD, so
that KVM can trigger allocation of the extra space required to
store AMX state.
Adjust vcpu_load_state to obey these new requirements.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20211223145322.2914028-2-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AMD proceessors define an address range that is reserved by HyperTransport
and causes a failure if used for guest physical addresses. Avoid
selftests failures by reserving those guest physical addresses; the
rules are:
- On parts with <40 bits, its fully hidden from software.
- Before Fam17h, it was always 12G just below 1T, even if there was more
RAM above this location. In this case we just not use any RAM above 1T.
- On Fam17h and later, it is variable based on SME, and is either just
below 2^48 (no encryption) or 2^43 (encryption).
Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210805105423.412878-1-pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent kernels have checks to ensure the GPA values in special-purpose
registers like CR3 are within the maximum physical address range and
don't overlap with anything in the upper/reserved range. In the case of
SEV kselftest guests booting directly into 64-bit mode, CR3 needs to be
initialized to the GPA of the page table root, with the encryption bit
set. The kernel accounts for this encryption bit by removing it from
reserved bit range when the guest advertises the bit position via
KVM_SET_CPUID*, but kselftests currently call KVM_SET_SREGS as part of
vm_vcpu_add_default(), before KVM_SET_CPUID*.
As a result, KVM_SET_SREGS will return an error in these cases.
Address this by moving vcpu_set_cpuid() (which calls KVM_SET_CPUID*)
ahead of vcpu_setup() (which calls KVM_SET_SREGS).
While there, address a typo in the assertion that triggers when
KVM_SET_SREGS fails.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20211006203617.13045-1-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Nathan Tempelman <natet@google.com>
KVM/arm64 updates for v5.14.
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration
and apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
This test exercises the feature KVM_CAP_EXIT_ON_EMULATION_FAILURE. When
enabled, errors in the in-kernel instruction emulator are forwarded to
userspace with the instruction bytes stored in the exit struct for
KVM_EXIT_INTERNAL_ERROR. So, when the guest attempts to emulate an
'flds' instruction, which isn't able to be emulated in KVM, instead
of failing, KVM sends the instruction to userspace to handle.
For this test to work properly the module parameter
'allow_smaller_maxphyaddr' has to be set.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20210510144834.658457-3-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add x86-64 hugepage support in the form of a x86-only variant of
virt_pg_map() that takes an explicit page size. To keep things simple,
follow the existing logic for 4k pages and disallow creating a hugepage
if the upper-level entry is present, even if the desired pfn matches.
Opportunistically fix a double "beyond beyond" reported by checkpatch.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-19-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In preparation for adding hugepage support, replace "pageMapL4Entry",
"pageDirectoryPointerEntry", and "pageDirectoryEntry" with a common
"pageUpperEntry", and add a helper to create an upper level entry. All
upper level entries have the same layout, using unique structs provides
minimal value and requires a non-trivial amount of code duplication.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a helper to allocate a page for use in constructing the guest's page
tables. All architectures have identical address and memslot
requirements (which appear to be arbitrary anyways).
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-15-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the memslot param from virt_pg_map() and virt_map() and shove the
hardcoded '0' down to the vm_phy_page_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the memslot param(s) from vm_vaddr_alloc() now that all callers
directly specific '0' as the memslot. Drop the memslot param from
virt_pgd_alloc() as well since vm_vaddr_alloc() is its only user.
I.e. shove the hardcoded '0' down to the vm_phy_pages_alloc() calls.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Switch to the vm_vaddr_alloc_page() helper for x86-64's "kernel"
allocations now that the helper uses the same min virtual address as the
open coded versions.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-9-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refactor x86's GDT/TSS allocations to for memslot '0' at its
vm_addr_alloc() call sites instead of passing in '0' from on high. This
is a step toward using a common helper for allocating pages.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622200529.3650424-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 22f232d134 ("KVM: selftests: x86: Set supported CPUIDs on
default VM") moved vcpu_set_cpuid into vm_create_with_vcpus, but
dirty_log_test doesn't use it to create vm. So vcpu's CPUIDs is
not set, the guest's pa_bits in kvm would be smaller than the
value queried by userspace.
However, the dirty track memory slot is in the highest GPA, the
reserved bits in gpte would be set with wrong pa_bits.
For shadow paging, page fault would fail in permission_fault and
be injected into guest. Since guest doesn't have idt, it finally
leads to vm_exit for triple fault.
Move vcpu_set_cpuid into vm_vcpu_add_default to set supported
CPUIDs on default vcpu, since almost all tests need it.
Fixes: 22f232d134 ("KVM: selftests: x86: Set supported CPUIDs on default VM")
Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Message-Id: <411ea2173f89abce56fc1fca5af913ed9c5a89c9.1624351343.git.houwenlong93@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86, the only arch implementing exception handling, reports unhandled
vectors using port IO at a specific port number. This replicates what
ucall already does.
Introduce a new ucall type, UCALL_UNHANDLED, for guests to report
unhandled exceptions. Then replace the x86 unhandled vector exception
reporting to use it instead of port IO. This new ucall type will be
used in the next commits by arm64 to report unhandled vectors as well.
Tested: Forcing a page fault in the ./x86_64/xapic_ipi_test
halter_guest_code() shows this:
$ ./x86_64/xapic_ipi_test
...
Unexpected vectored event in guest (vector:0xe)
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210611011020.3420067-4-ricarkol@google.com
If a KVM selftest is run on a machine without /dev/kvm, it will exit
silently. Make it easy to tell what's happening by printing an error
message.
Opportunistically consolidate all codepaths that open /dev/kvm into a
single function so they all print the same message.
This slightly changes the semantics of vm_is_unrestricted_guest() by
changing a TEST_ASSERT() to exit(KSFT_SKIP). However
vm_is_unrestricted_guest() is only called in one place
(x86_64/mmio_warning_test.c) and that is to determine if the test should
be skipped or not.
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210511202120.1371800-1-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The variable in practice will never be uninitialized, because the
loop will always go through at least one iteration.
In case it would not, make vcpu_get_cpuid report an assertion
failure.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V emulation is enabled in KVM unconditionally. This is bad at least
from security standpoint as it is an extra attack surface. Ideally, there
should be a per-VM capability explicitly enabled by VMM but currently it
is not the case and we can't mandate one without breaking backwards
compatibility. We can, however, check guest visible CPUIDs and only enable
Hyper-V emulation when "Hv#1" interface was exposed in
HYPERV_CPUID_INTERFACE.
Note, VMMs are free to act in any sequence they like, e.g. they can try
to set MSRs first and CPUIDs later so we still need to allow the host
to read/write Hyper-V specific MSRs unconditionally.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210126134816.1880136-14-vkuznets@redhat.com>
[Add selftest vcpu_set_hv_cpuid API to avoid breaking xen_vmcall_test. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>