Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 "ARM:
   - support for SVE and Pointer Authentication in guests
   - PMU improvements

  POWER:
   - support for direct access to the POWER9 XIVE interrupt controller
   - memory and performance optimizations

  x86:
   - support for accessing memory not backed by struct page
   - fixes and refactoring

  Generic:
   - dirty page tracking improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
  kvm: fix compilation on aarch64
  Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
  kvm: x86: Fix L1TF mitigation for shadow MMU
  KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
  KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
  KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
  KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
  kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
  tests: kvm: Add tests for KVM_SET_NESTED_STATE
  KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
  tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
  tests: kvm: Add tests to .gitignore
  KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
  KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
  KVM: Fix the bitmap range to copy during clear dirty
  KVM: arm64: Fix ptrauth ID register masking logic
  KVM: x86: use direct accessors for RIP and RSP
  KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
  KVM: x86: Omit caching logic for always-available GPRs
  kvm, x86: Properly check whether a pfn is an MMIO or not
  ...
This commit is contained in:
commit 0ef0fd3515

Documentation/arm64/perf.txt (new file, 85 lines)
@@ -0,0 +1,85 @@
Perf Event Attributes
=====================

Author: Andrew Murray <andrew.murray@arm.com>
Date: 2019-03-06


exclude_user
------------

This attribute excludes userspace.

Userspace always runs at EL0 and thus this attribute will exclude EL0.


exclude_kernel
--------------

This attribute excludes the kernel.

The kernel runs at EL2 with VHE and at EL1 without. Guest kernels always run
at EL1.

For the host this attribute will exclude EL1 and additionally EL2 on a VHE
system.

For the guest this attribute will exclude EL1. Please note that EL2 is
never counted within a guest.


exclude_hv
----------

This attribute excludes the hypervisor.

For a VHE host this attribute is ignored as we consider the host kernel to
be the hypervisor.

For a non-VHE host this attribute will exclude EL2 as we consider the
hypervisor to be any code that runs at EL2, which is predominantly used for
guest/host transitions.

For the guest this attribute has no effect. Please note that EL2 is
never counted within a guest.


exclude_host / exclude_guest
----------------------------

These attributes exclude the KVM host and guest, respectively.

The KVM host may run at EL0 (userspace), EL1 (non-VHE kernel) and EL2 (VHE
kernel or non-VHE hypervisor).

The KVM guest may run at EL0 (userspace) and EL1 (kernel).

Due to the overlapping exception levels between host and guests we cannot
exclusively rely on the PMU's hardware exception filtering - therefore we
must enable/disable counting on entry to and exit from the guest. This is
performed differently on VHE and non-VHE systems.

For non-VHE systems we exclude EL2 for exclude_host - upon entering and
exiting the guest we disable/enable the event as appropriate based on the
exclude_host and exclude_guest attributes.

For VHE systems we exclude EL1 for exclude_guest and exclude both EL0 and EL2
for exclude_host. Upon entering and exiting the guest we modify the event
to include/exclude EL0 as appropriate based on the exclude_host and
exclude_guest attributes.

The statements above also apply when these attributes are used within a
non-VHE guest; however, please note that EL2 is never counted within a guest.


Accuracy
--------

On non-VHE hosts we enable/disable counters on the entry/exit of the
host/guest transition at EL2 - however there is a period of time between
enabling/disabling the counters and entering/exiting the guest. We are
able to eliminate counters counting host events on the boundaries of guest
entry/exit when counting guest events by filtering out EL2 for
exclude_host. However when using !exclude_hv there is a small blackout
window at the guest entry/exit where host events are not captured.

On VHE systems there are no blackout windows.
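To make the attribute semantics above concrete, here is a small illustrative userspace sketch (not part of the patch): it opens a cycle-counting event that only counts guest activity by setting exclude_host, leaving exclude_hv clear. The perf_event_open() wrapper is spelled out because glibc does not provide one; the event is attached to the calling task, which is assumed to be the one issuing KVM_RUN.

```c
/* Illustrative only: count CPU cycles attributable to KVM guests. */
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	long long count;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.exclude_host = 1;	/* drop host EL0/EL1 (and EL2 on VHE) */
	attr.exclude_hv = 0;	/* on non-VHE, EL2 entry/exit remains visible */

	fd = perf_event_open(&attr, 0, -1, -1, 0);	/* this task, any CPU */
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	if (read(fd, &count, sizeof(count)) == sizeof(count))
		printf("guest cycles: %lld\n", count);
	close(fd);
	return 0;
}
```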
Documentation/arm64/pointer-authentication.txt

@@ -87,7 +87,21 @@ used to get and set the keys for a thread.

Virtualization
--------------

-Pointer authentication is not currently supported in KVM guests. KVM
-will mask the feature bits from ID_AA64ISAR1_EL1, and attempted use of
-the feature will result in an UNDEFINED exception being injected into
-the guest.
+Pointer authentication is enabled in KVM guests when each virtual cpu is
+initialised by passing the flags KVM_ARM_VCPU_PTRAUTH_[ADDRESS/GENERIC],
+requesting these two separate cpu features to be enabled. The current KVM
+guest implementation works by enabling both features together, so both of
+these userspace flags are checked before enabling pointer authentication.
+Keeping the flags separate means no userspace ABI change will be needed if
+support is added in the future to allow these two features to be enabled
+independently of one another.
+
+Because the Arm architecture specifies that the Pointer Authentication
+feature is implemented along with the VHE feature, the KVM arm64 ptrauth
+code relies on VHE mode being present.
+
+Additionally, when these vcpu feature flags are not set, KVM will filter
+out the Pointer Authentication system key registers from the
+KVM_GET/SET_REG_* ioctls and mask those features from the cpufeature ID
+register. Any attempt to use the Pointer Authentication instructions will
+result in an UNDEFINED exception being injected into the guest.
Documentation/virtual/kvm/api.txt

@@ -69,23 +69,6 @@ by and on behalf of the VM's process may not be freed/unaccounted when
the VM is shut down.


-It is important to note that althought VM ioctls may only be issued from
-the process that created the VM, a VM's lifecycle is associated with its
-file descriptor, not its creator (process). In other words, the VM and
-its resources, *including the associated address space*, are not freed
-until the last reference to the VM's file descriptor has been released.
-For example, if fork() is issued after ioctl(KVM_CREATE_VM), the VM will
-not be freed until both the parent (original) process and its child have
-put their references to the VM's file descriptor.
-
-Because a VM's resources are not freed until the last reference to its
-file descriptor is released, creating additional references to a VM via
-via fork(), dup(), etc... without careful consideration is strongly
-discouraged and may have unwanted side effects, e.g. memory allocated
-by and on behalf of the VM's process may not be freed/unaccounted when
-the VM is shut down.
-
-
3. Extensions
-------------

@@ -347,7 +330,7 @@ They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.

The bits in the dirty bitmap are cleared before the ioctl returns, unless
-KVM_CAP_MANUAL_DIRTY_LOG_PROTECT is enabled. For more information,
+KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled. For more information,
see the description of the capability.

4.9 KVM_SET_MEMORY_ALIAS
@@ -1117,9 +1100,8 @@ struct kvm_userspace_memory_region {
This ioctl allows the user to create, modify or delete a guest physical
memory slot. Bits 0-15 of "slot" specify the slot id and this value
should be less than the maximum number of user memory slots supported per
-VM. The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS,
-if this capability is supported by the architecture. Slots may not
-overlap in guest physical address space.
+VM. The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS.
+Slots may not overlap in guest physical address space.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
specifies the address space which is being modified. They must be
@@ -1901,6 +1883,12 @@ Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in)
Returns: 0 on success, negative value on failure
Errors:
  ENOENT:   no such register
  EINVAL:   invalid register ID, or no such register
  EPERM:    (arm64) register access not allowed before vcpu finalization
(These error codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

struct kvm_one_reg {
       __u64 id;
|
||||
PPC | KVM_REG_PPC_TLB3PS | 32
|
||||
PPC | KVM_REG_PPC_EPTCFG | 32
|
||||
PPC | KVM_REG_PPC_ICP_STATE | 64
|
||||
PPC | KVM_REG_PPC_VP_STATE | 128
|
||||
PPC | KVM_REG_PPC_TB_OFFSET | 64
|
||||
PPC | KVM_REG_PPC_SPMC1 | 32
|
||||
PPC | KVM_REG_PPC_SPMC2 | 32
|
||||
@@ -2137,6 +2126,37 @@ contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array.
  0x60x0 0000 0010 <index into the kvm_regs struct:16>

Specifically:
    Encoding            Register  Bits  kvm_regs member
----------------------------------------------------------------
  0x6030 0000 0010 0000 X0          64  regs.regs[0]
  0x6030 0000 0010 0002 X1          64  regs.regs[1]
    ...
  0x6030 0000 0010 003c X30         64  regs.regs[30]
  0x6030 0000 0010 003e SP          64  regs.sp
  0x6030 0000 0010 0040 PC          64  regs.pc
  0x6030 0000 0010 0042 PSTATE      64  regs.pstate
  0x6030 0000 0010 0044 SP_EL1      64  sp_el1
  0x6030 0000 0010 0046 ELR_EL1     64  elr_el1
  0x6030 0000 0010 0048 SPSR_EL1    64  spsr[KVM_SPSR_EL1] (alias SPSR_SVC)
  0x6030 0000 0010 004a SPSR_ABT    64  spsr[KVM_SPSR_ABT]
  0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
  0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
  0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    (*)
  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    (*)
    ...
  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   (*)
  0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
  0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr

(*) These encodings are not accepted for SVE-enabled vcpus. See
    KVM_ARM_VCPU_INIT.

    The equivalent register content can be accessed via bits [127:0] of
    the corresponding SVE Zn registers instead for vcpus that have SVE
    enabled (see below).

arm64 CCSIDR registers are demultiplexed by CSSELR value:
  0x6020 0000 0011 00 <csselr:8>

@@ -2146,6 +2166,64 @@ arm64 system registers have the following id bit patterns:
arm64 firmware pseudo-registers have the following bit pattern:
  0x6030 0000 0014 <regno:16>

arm64 SVE registers have the following bit patterns:
  0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
  0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
  0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
  0x6060 0000 0015 ffff                 KVM_REG_ARM64_SVE_VLS pseudo-register

Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
ENOENT. max_vq is the vcpu's maximum supported vector length in 128-bit
quadwords: see (**) below.

These registers are only accessible on vcpus for which SVE is enabled.
See KVM_ARM_VCPU_INIT for details.

In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
accessible until the vcpu's SVE configuration has been finalized
using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE). See KVM_ARM_VCPU_INIT
and KVM_ARM_VCPU_FINALIZE for more information about this procedure.

KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
lengths supported by the vcpu to be discovered and configured by
userspace. When transferred to or from user memory via KVM_GET_ONE_REG
or KVM_SET_ONE_REG, the value of this register is of type
__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
follows:

__u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];

if (vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX &&
    ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
		((vq - KVM_ARM64_SVE_VQ_MIN) % 64)) & 1))
	/* Vector length vq * 16 bytes supported */
else
	/* Vector length vq * 16 bytes not supported */

(**) The maximum value vq for which the above condition is true is
max_vq. This is the maximum vector length available to the guest on
this vcpu, and determines which register slices are visible through
this ioctl interface.

(See Documentation/arm64/sve.txt for an explanation of the "vq"
nomenclature.)

KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT.
KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that
the host supports.

Userspace may subsequently modify it if desired until the vcpu's SVE
configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).

Apart from simply removing all vector lengths from the host set that
exceed some value, support for arbitrarily chosen sets of vector lengths
is hardware-dependent and may not be available. Attempting to configure
an invalid set of vector lengths via KVM_SET_ONE_REG will fail with
EINVAL.

After the vcpu's SVE configuration is finalized, further attempts to
write this register will fail with EPERM.


MIPS registers are mapped using the lower 32 bits. The upper 16 of that is
the register group type:
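As a concrete illustration of the KVM_REG_ARM64_SVE_VLS encoding described above, the sketch below (editorial, not part of the patch) reads the pseudo-register, clears every vector length above a chosen limit, and writes it back before finalization. It assumes vcpu_fd comes from a prior KVM_CREATE_VCPU/KVM_ARM_VCPU_INIT sequence and that <linux/kvm.h> is new enough to provide the SVE register constants.

```c
/* Illustrative sketch: restrict a vcpu's SVE vector lengths to <= max_vq
 * quadwords via the KVM_REG_ARM64_SVE_VLS pseudo-register. Assumes the
 * vcpu was initialised with the KVM_ARM_VCPU_SVE feature flag set and has
 * not yet been finalized.
 */
#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

static int limit_sve_vls(int vcpu_fd, unsigned int max_vq)
{
	uint64_t vls[KVM_ARM64_SVE_VLS_WORDS];
	struct kvm_one_reg reg = {
		.id = KVM_REG_ARM64_SVE_VLS,
		.addr = (uint64_t)(unsigned long)vls,
	};
	unsigned int vq;

	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))	/* best host set */
		return -1;

	for (vq = KVM_ARM64_SVE_VQ_MIN; vq <= KVM_ARM64_SVE_VQ_MAX; vq++) {
		if (vq > max_vq)	/* clear bits for unwanted lengths */
			vls[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] &=
				~(1ULL << ((vq - KVM_ARM64_SVE_VQ_MIN) % 64));
	}

	/* Must happen before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE). */
	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
```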
@@ -2198,6 +2276,12 @@ Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in and out)
Returns: 0 on success, negative value on failure
Errors include:
  ENOENT:   no such register
  EINVAL:   invalid register ID, or no such register
  EPERM:    (arm64) register access not allowed before vcpu finalization
(These error codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

This ioctl allows to receive the value of a single register implemented
in a vcpu. The register to read is indicated by the "id" field of the
@@ -2690,6 +2774,49 @@ Possible features:
	- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
	  Depends on KVM_CAP_ARM_PMU_V3.

	- KVM_ARM_VCPU_PTRAUTH_ADDRESS: Enables Address Pointer authentication
	  for arm64 only.
	  Depends on KVM_CAP_ARM_PTRAUTH_ADDRESS.
	  If KVM_CAP_ARM_PTRAUTH_ADDRESS and KVM_CAP_ARM_PTRAUTH_GENERIC are
	  both present, then both KVM_ARM_VCPU_PTRAUTH_ADDRESS and
	  KVM_ARM_VCPU_PTRAUTH_GENERIC must be requested or neither must be
	  requested.

	- KVM_ARM_VCPU_PTRAUTH_GENERIC: Enables Generic Pointer authentication
	  for arm64 only.
	  Depends on KVM_CAP_ARM_PTRAUTH_GENERIC.
	  If KVM_CAP_ARM_PTRAUTH_ADDRESS and KVM_CAP_ARM_PTRAUTH_GENERIC are
	  both present, then both KVM_ARM_VCPU_PTRAUTH_ADDRESS and
	  KVM_ARM_VCPU_PTRAUTH_GENERIC must be requested or neither must be
	  requested.

	- KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only).
	  Depends on KVM_CAP_ARM_SVE.
	  Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

	   * After KVM_ARM_VCPU_INIT:

	      - KVM_REG_ARM64_SVE_VLS may be read using KVM_GET_ONE_REG: the
	        initial value of this pseudo-register indicates the best set of
	        vector lengths possible for a vcpu on this host.

	   * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

	      - KVM_RUN and KVM_GET_REG_LIST are not available;

	      - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
	        the scalable architectural SVE registers
	        KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
	        KVM_REG_ARM64_SVE_FFR;

	      - KVM_REG_ARM64_SVE_VLS may optionally be written using
	        KVM_SET_ONE_REG, to modify the set of vector lengths available
	        for the vcpu.

	   * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

	      - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and can
	        no longer be written using KVM_SET_ONE_REG.

4.83 KVM_ARM_PREFERRED_TARGET

@@ -3809,7 +3936,7 @@ to I/O ports.

4.117 KVM_CLEAR_DIRTY_LOG (vm ioctl)

-Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT
+Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
Architectures: x86, arm, arm64, mips
Type: vm ioctl
Parameters: struct kvm_dirty_log (in)
@@ -3842,10 +3969,10 @@ the address space for which you want to return the dirty bitmap.
They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.

-This ioctl is mostly useful when KVM_CAP_MANUAL_DIRTY_LOG_PROTECT
+This ioctl is mostly useful when KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
is enabled; for more information, see the description of the capability.
However, it can always be used as long as KVM_CHECK_EXTENSION confirms
-that KVM_CAP_MANUAL_DIRTY_LOG_PROTECT is present.
+that KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is present.

4.118 KVM_GET_SUPPORTED_HV_CPUID

@@ -3904,6 +4031,40 @@ number of valid entries in the 'entries' array, which is then filled.
'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently reserved,
userspace should not expect to get any particular value there.

4.119 KVM_ARM_VCPU_FINALIZE

Architectures: arm, arm64
Type: vcpu ioctl
Parameters: int feature (in)
Returns: 0 on success, -1 on error
Errors:
  EPERM:     feature not enabled, needs configuration, or already finalized
  EINVAL:    feature unknown or not present

Recognised values for feature:
  arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)

Finalizes the configuration of the specified vcpu feature.

The vcpu must already have been initialised, enabling the affected feature, by
means of a successful KVM_ARM_VCPU_INIT call with the appropriate flag set in
features[].

For affected vcpu features, this is a mandatory step that must be performed
before the vcpu is fully usable.

Between KVM_ARM_VCPU_INIT and KVM_ARM_VCPU_FINALIZE, the feature may be
configured by use of ioctls such as KVM_SET_ONE_REG. The exact configuration
that should be performed and how to do it are feature-dependent.

Other calls that depend on a particular feature being finalized, such as
KVM_RUN, KVM_GET_REG_LIST, KVM_GET_ONE_REG and KVM_SET_ONE_REG, will fail with
-EPERM unless the feature has already been finalized by means of a
KVM_ARM_VCPU_FINALIZE call.

See KVM_ARM_VCPU_INIT for details of vcpu features that require finalization
using this ioctl.

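The following sketch (illustrative only, error handling trimmed) shows the ordering the text above requires for an SVE-capable vcpu: init with the SVE feature flag, optionally adjust KVM_REG_ARM64_SVE_VLS, then finalize before KVM_RUN or KVM_GET_REG_LIST become usable. The vm_fd/vcpu_fd variables are assumed to come from the usual KVM_CREATE_VM/KVM_CREATE_VCPU calls.

```c
/* Illustrative ordering only; real code must check each ioctl's result. */
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

static void setup_sve_vcpu(int vm_fd, int vcpu_fd)
{
	struct kvm_vcpu_init init;
	int feature = KVM_ARM_VCPU_SVE;

	/* Ask KVM for the preferred target, then request the SVE feature. */
	memset(&init, 0, sizeof(init));
	ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init);
	init.features[0] |= 1 << KVM_ARM_VCPU_SVE;
	ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);

	/*
	 * Window in which KVM_REG_ARM64_SVE_VLS may be written via
	 * KVM_SET_ONE_REG to shrink the vector length set (optional).
	 */

	/* Mandatory before KVM_RUN / KVM_GET_REG_LIST on this vcpu. */
	ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature);
}
```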
5. The kvm_run structure
------------------------

@@ -4505,6 +4666,15 @@ struct kvm_sync_regs {
	struct kvm_vcpu_events events;
};

6.75 KVM_CAP_PPC_IRQ_XIVE

Architectures: ppc
Target: vcpu
Parameters: args[0] is the XIVE device fd
            args[1] is the XIVE CPU number (server ID) for this vcpu

This capability connects the vcpu to an in-kernel XIVE device.

7. Capabilities that can be enabled on VMs
------------------------------------------

@@ -4798,7 +4968,7 @@ and injected exceptions.
* For the new DR6 bits, note that bit 16 is set iff the #DB exception
  will clear DR6.RTM.

-7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT
+7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2

Architectures: x86, arm, arm64, mips
Parameters: args[0] whether feature should be enabled or not

@@ -4821,6 +4991,11 @@ while userspace can see false reports of dirty pages. Manual reprotection
helps reducing this time, improving guest performance and reducing the
number of dirty log false positives.

KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that make
it hard or impossible to use it correctly. The availability of
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 signals that those bugs are fixed.
Userspace should not try to use KVM_CAP_MANUAL_DIRTY_LOG_PROTECT.

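To illustrate how the two-phase interface described above is meant to be used, here is a rough userspace sketch (editorial, not from the patch): enable the capability on the VM, then fetch the dirty bitmap with KVM_GET_DIRTY_LOG and re-protect the harvested pages with KVM_CLEAR_DIRTY_LOG. The slot number and bitmap size are placeholders and error handling is omitted.

```c
/* Rough sketch of the manual dirty-log-protect flow; sizes are placeholders. */
#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

#define SLOT_ID     0
#define SLOT_PAGES  (1u << 16)		/* placeholder: pages in the memslot */

static void harvest_dirty_pages(int vm_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
		.args[0] = 1,			/* enable manual reprotection */
	};
	unsigned long *bitmap = calloc(SLOT_PAGES / 8, 1);
	struct kvm_dirty_log log = {
		.slot = SLOT_ID,
		.dirty_bitmap = bitmap,
	};
	struct kvm_clear_dirty_log clear = {
		.slot = SLOT_ID,
		.first_page = 0,
		.num_pages = SLOT_PAGES,
		.dirty_bitmap = bitmap,
	};

	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);

	/* Pages are no longer write-protected when GET_DIRTY_LOG returns... */
	ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);

	/* ...so userspace re-protects them explicitly after copying them out. */
	ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &clear);

	free(bitmap);
}
```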
8. Other capabilities.
----------------------

Documentation/virtual/kvm/devices/vm.txt

@@ -141,7 +141,8 @@ struct kvm_s390_vm_cpu_subfunc {
	u8 pcc[16];	  # valid with Message-Security-Assist-Extension 4
	u8 ppno[16];	  # valid with Message-Security-Assist-Extension 5
	u8 kma[16];	  # valid with Message-Security-Assist-Extension 8
-	u8 reserved[1808]; # reserved for future instructions
+	u8 kdsa[16];	  # valid with Message-Security-Assist-Extension 9
+	u8 reserved[1792]; # reserved for future instructions
};

Parameters: address of a buffer to load the subfunction blocks from.
Documentation/virtual/kvm/devices/xive.txt (new file, 197 lines)

@@ -0,0 +1,197 @@
POWER9 eXternal Interrupt Virtualization Engine (XIVE Gen1)
===========================================================

Device types supported:
  KVM_DEV_TYPE_XIVE     POWER9 XIVE Interrupt Controller generation 1

This device acts as a VM interrupt controller. It provides the KVM
interface to configure the interrupt sources of a VM in the underlying
POWER9 XIVE interrupt controller.

Only one XIVE instance may be instantiated. A guest XIVE device
requires a POWER9 host and the guest OS should have support for the
XIVE native exploitation interrupt mode. If not, it should run using
the legacy interrupt mode, referred to as XICS (POWER7/8).

* Device Mappings

  The KVM device exposes different MMIO ranges of the XIVE HW which
  are required for interrupt management. These are exposed to the
  guest in VMAs populated with a custom VM fault handler.

  1. Thread Interrupt Management Area (TIMA)

  Each thread has an associated Thread Interrupt Management context
  composed of a set of registers. These registers let the thread
  handle priority management and interrupt acknowledgment. The most
  important are:

      - Interrupt Pending Buffer     (IPB)
      - Current Processor Priority   (CPPR)
      - Notification Source Register (NSR)

  They are exposed to software in four different pages, each providing
  a view with a different privilege. The first page is for the
  physical thread context and the second for the hypervisor. Only the
  third (operating system) and the fourth (user level) are exposed to
  the guest.

  2. Event State Buffer (ESB)

  Each source is associated with an Event State Buffer (ESB) with a
  pair of even/odd pages which provides commands to manage the
  source: to trigger, to EOI, to turn off the source, for instance.

  3. Device pass-through

  When a device is passed-through into the guest, the source
  interrupts are from a different HW controller (PHB4) and the ESB
  pages exposed to the guest should accommodate this change.

  The passthru_irq helpers, kvmppc_xive_set_mapped() and
  kvmppc_xive_clr_mapped(), are called when the device HW irqs are
  mapped into or unmapped from the guest IRQ number space. The KVM
  device extends these helpers to clear the ESB pages of the guest IRQ
  number being mapped and then lets the VM fault handler repopulate.
  The handler will insert the ESB page corresponding to the HW
  interrupt of the device being passed-through or the initial IPI ESB
  page if the device has been removed.

  The ESB remapping is fully transparent to the guest and the OS
  device driver. All handling is done within VFIO and the above
  helpers in KVM-PPC.

* Groups:

  1. KVM_DEV_XIVE_GRP_CTRL
  Provides global controls on the device
  Attributes:
    1.1 KVM_DEV_XIVE_RESET (write only)
    Resets the interrupt controller configuration for sources and event
    queues. To be used by kexec and kdump.
    Errors: none

    1.2 KVM_DEV_XIVE_EQ_SYNC (write only)
    Sync all the sources and queues and mark the EQ pages dirty. This
    is to make sure that a consistent memory state is captured when
    migrating the VM.
    Errors: none

  2. KVM_DEV_XIVE_GRP_SOURCE (write only)
  Initializes a new source in the XIVE device and masks it.
  Attributes:
    Interrupt source number  (64-bit)
    The kvm_device_attr.addr points to a __u64 value:
    bits:     | 63   ....   2 |   1   |   0
    values:   |    unused     | level | type
    - type:  0:MSI 1:LSI
    - level: assertion level in case of an LSI.
    Errors:
      -E2BIG:  Interrupt source number is out of range
      -ENOMEM: Could not create a new source block
      -EFAULT: Invalid user pointer for attr->addr.
      -ENXIO:  Could not allocate underlying HW interrupt

  3. KVM_DEV_XIVE_GRP_SOURCE_CONFIG (write only)
  Configures source targeting
  Attributes:
    Interrupt source number  (64-bit)
    The kvm_device_attr.addr points to a __u64 value:
    bits:     | 63   ....   33 |  32  | 31 .. 3 |  2 .. 0
    values:   |      eisn      | mask |  server | priority
    - priority: 0-7 interrupt priority level
    - server: CPU number chosen to handle the interrupt
    - mask: mask flag (unused)
    - eisn: Effective Interrupt Source Number
    Errors:
      -ENOENT: Unknown source number
      -EINVAL: Not initialized source number
      -EINVAL: Invalid priority
      -EINVAL: Invalid CPU number.
      -EFAULT: Invalid user pointer for attr->addr.
      -ENXIO:  CPU event queues not configured or configuration of the
               underlying HW interrupt failed
      -EBUSY:  No CPU available to serve interrupt

  4. KVM_DEV_XIVE_GRP_EQ_CONFIG (read-write)
  Configures an event queue of a CPU
  Attributes:
    EQ descriptor identifier (64-bit)
    The EQ descriptor identifier is a tuple (server, priority):
    bits:     | 63   ....  32 | 31 .. 3 |  2 .. 0
    values:   |    unused     |  server | priority
    The kvm_device_attr.addr points to:
    struct kvm_ppc_xive_eq {
	__u32 flags;
	__u32 qshift;
	__u64 qaddr;
	__u32 qtoggle;
	__u32 qindex;
	__u8  pad[40];
    };
    - flags: queue flags
      KVM_XIVE_EQ_ALWAYS_NOTIFY (required)
	forces notification without using the coalescing mechanism
	provided by the XIVE END ESBs.
    - qshift: queue size (power of 2)
    - qaddr: real address of queue
    - qtoggle: current queue toggle bit
    - qindex: current queue index
    - pad: reserved for future use
    Errors:
      -ENOENT: Invalid CPU number
      -EINVAL: Invalid priority
      -EINVAL: Invalid flags
      -EINVAL: Invalid queue size
      -EINVAL: Invalid queue address
      -EFAULT: Invalid user pointer for attr->addr.
      -EIO:    Configuration of the underlying HW failed

  5. KVM_DEV_XIVE_GRP_SOURCE_SYNC (write only)
  Synchronize the source to flush event notifications
  Attributes:
    Interrupt source number  (64-bit)
    Errors:
      -ENOENT: Unknown source number
      -EINVAL: Not initialized source number

* VCPU state

  The XIVE IC maintains VP interrupt state in an internal structure
  called the NVT. When a VP is not dispatched on a HW processor
  thread, this structure can be updated by HW if the VP is the target
  of an event notification.

  It is important for migration to capture the cached IPB from the NVT
  as it synthesizes the priorities of the pending interrupts. We
  capture a bit more to report debug information.

  KVM_REG_PPC_VP_STATE (2 * 64bits)
  bits:     |  63  ....  32  |  31  ....  0  |
  values:   |   TIMA word0   |   TIMA word1  |
  bits:     | 127       ..........       64  |
  values:   |            unused              |

* Migration:

  Saving the state of a VM using the XIVE native exploitation mode
  should follow a specific sequence. When the VM is stopped:

  1. Mask all sources (PQ=01) to stop the flow of events.

  2. Sync the XIVE device with the KVM control KVM_DEV_XIVE_EQ_SYNC to
  flush any in-flight event notification and to stabilize the EQs. At
  this stage, the EQ pages are marked dirty to make sure they are
  transferred in the migration sequence.

  3. Capture the state of the source targeting, the EQs configuration
  and the state of the thread interrupt context registers.

  Restore is similar:

  1. Restore the EQ configuration, as targeting depends on it.
  2. Restore targeting.
  3. Restore the thread interrupt contexts.
  4. Restore the source states.
  5. Let the vCPU run.
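As a usage illustration (editorial, not part of the patch), the fragment below creates the XIVE device, connects a vcpu to it through the KVM_CAP_PPC_IRQ_XIVE capability described in api.txt, and configures one interrupt source with KVM_DEV_XIVE_GRP_SOURCE_CONFIG. The source number, server and priority are arbitrary placeholders, error checks are omitted, and the headers are assumed to be new enough to provide these constants.

```c
/* Illustrative XIVE setup; numbers are placeholders, error checks omitted. */
#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

static void xive_example(int vm_fd, int vcpu_fd)
{
	struct kvm_create_device cd = { .type = KVM_DEV_TYPE_XIVE };
	struct kvm_enable_cap cap;
	struct kvm_device_attr attr;
	uint64_t cfg;

	ioctl(vm_fd, KVM_CREATE_DEVICE, &cd);	/* cd.fd now holds the device fd */

	/* Connect the vcpu to the in-kernel XIVE device (server ID 0). */
	cap = (struct kvm_enable_cap) {
		.cap = KVM_CAP_PPC_IRQ_XIVE,
		.args = { cd.fd, 0 },
	};
	ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);

	/*
	 * Target source 32 at server 0, priority 5 (eisn = 32, unmasked).
	 * The source is assumed to have been initialized beforehand via
	 * KVM_DEV_XIVE_GRP_SOURCE.
	 */
	cfg = ((uint64_t)32 << 33) | (0 << 3) | 5;
	attr = (struct kvm_device_attr) {
		.group = KVM_DEV_XIVE_GRP_SOURCE_CONFIG,
		.attr = 32,			/* interrupt source number */
		.addr = (uint64_t)(unsigned long)&cfg,
	};
	ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);
}
```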
arch/arm/include/asm/kvm_emulate.h

@@ -343,4 +343,6 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
	}
}

static inline void vcpu_ptrauth_setup_lazy(struct kvm_vcpu *vcpu) {}

#endif /* __ARM_KVM_EMULATE_H__ */
arch/arm/include/asm/kvm_host.h

@@ -19,6 +19,7 @@
#ifndef __ARM_KVM_HOST_H__
#define __ARM_KVM_HOST_H__

#include <linux/errno.h>
#include <linux/types.h>
#include <linux/kvm_types.h>
#include <asm/cputype.h>

@@ -53,6 +54,8 @@

DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);

static inline int kvm_arm_init_sve(void) { return 0; }

u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
int __attribute_const__ kvm_target_cpu(void);
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);

@@ -150,9 +153,13 @@ struct kvm_cpu_context {
	u32 cp15[NR_CP15_REGS];
};

-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;

-static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
+static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt,
					     int cpu)
{
	/* The host's MPIDR is immutable, so let's set it up at boot time */

@@ -182,7 +189,7 @@ struct kvm_vcpu_arch {
	struct kvm_vcpu_fault_info fault;

	/* Host FP context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;

	/* VGIC state */
	struct vgic_cpu vgic_cpu;

@@ -361,6 +368,9 @@ static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}

static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}

static inline void kvm_arm_vhe_guest_enter(void) {}
static inline void kvm_arm_vhe_guest_exit(void) {}

@@ -409,4 +419,14 @@ static inline int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
	return 0;
}

static inline int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
{
	return -EINVAL;
}

static inline bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
{
	return true;
}

#endif /* __ARM_KVM_HOST_H__ */
arch/arm64/Kconfig

@@ -1341,6 +1341,7 @@ menu "ARMv8.3 architectural features"
config ARM64_PTR_AUTH
	bool "Enable support for pointer authentication"
	default y
	depends on !KVM || ARM64_VHE
	help
	  Pointer authentication (part of the ARMv8.3 Extensions) provides
	  instructions for signing and authenticating pointers against secret

@@ -1354,8 +1355,9 @@ config ARM64_PTR_AUTH
	  context-switched along with the process.

	  The feature is detected at runtime. If the feature is not present in
-	  hardware it will not be advertised to userspace nor will it be
-	  enabled.
+	  hardware it will not be advertised to userspace/KVM guests nor will
+	  it be enabled. However, KVM guests also require VHE mode and hence
+	  the CONFIG_ARM64_VHE=y option to use this feature.

endmenu

arch/arm64/include/asm/fpsimd.h

@@ -24,10 +24,13 @@

#ifndef __ASSEMBLY__

#include <linux/bitmap.h>
#include <linux/build_bug.h>
#include <linux/bug.h>
#include <linux/cache.h>
#include <linux/init.h>
#include <linux/stddef.h>
#include <linux/types.h>

#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
/* Masks for extracting the FPSR and FPCR from the FPSCR */

@@ -56,7 +59,8 @@ extern void fpsimd_restore_current_state(void);
extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);

extern void fpsimd_bind_task_to_cpu(void);
-extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state);
+extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
+				     void *sve_state, unsigned int sve_vl);

extern void fpsimd_flush_task_state(struct task_struct *target);
extern void fpsimd_flush_cpu_state(void);

@@ -87,6 +91,29 @@ extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
extern u64 read_zcr_features(void);

extern int __ro_after_init sve_max_vl;
extern int __ro_after_init sve_max_virtualisable_vl;
extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);

/*
 * Helpers to translate bit indices in sve_vq_map to VQ values (and
 * vice versa). This allows find_next_bit() to be used to find the
 * _maximum_ VQ not exceeding a certain value.
 */
static inline unsigned int __vq_to_bit(unsigned int vq)
{
	return SVE_VQ_MAX - vq;
}

static inline unsigned int __bit_to_vq(unsigned int bit)
{
	return SVE_VQ_MAX - bit;
}

/* Ensure vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX before calling this function */
static inline bool sve_vq_available(unsigned int vq)
{
	return test_bit(__vq_to_bit(vq), sve_vq_map);
}

#ifdef CONFIG_ARM64_SVE

arch/arm64/include/asm/kvm_asm.h

@@ -108,7 +108,8 @@ extern u32 __kvm_get_mdcr_el2(void);
.endm

.macro get_host_ctxt reg, tmp
-	hyp_adr_this_cpu \reg, kvm_host_cpu_state, \tmp
+	hyp_adr_this_cpu \reg, kvm_host_data, \tmp
+	add	\reg, \reg, #HOST_DATA_CONTEXT
.endm

.macro get_vcpu_ptr vcpu, ctxt
arch/arm64/include/asm/kvm_emulate.h

@@ -98,6 +98,22 @@ static inline void vcpu_set_wfe_traps(struct kvm_vcpu *vcpu)
	vcpu->arch.hcr_el2 |= HCR_TWE;
}

static inline void vcpu_ptrauth_enable(struct kvm_vcpu *vcpu)
{
	vcpu->arch.hcr_el2 |= (HCR_API | HCR_APK);
}

static inline void vcpu_ptrauth_disable(struct kvm_vcpu *vcpu)
{
	vcpu->arch.hcr_el2 &= ~(HCR_API | HCR_APK);
}

static inline void vcpu_ptrauth_setup_lazy(struct kvm_vcpu *vcpu)
{
	if (vcpu_has_ptrauth(vcpu))
		vcpu_ptrauth_disable(vcpu);
}

static inline unsigned long vcpu_get_vsesr(struct kvm_vcpu *vcpu)
{
	return vcpu->arch.vsesr_el2;
arch/arm64/include/asm/kvm_host.h

@@ -22,9 +22,13 @@
#ifndef __ARM64_KVM_HOST_H__
#define __ARM64_KVM_HOST_H__

#include <linux/bitmap.h>
#include <linux/types.h>
#include <linux/jump_label.h>
#include <linux/kvm_types.h>
#include <linux/percpu.h>
#include <asm/arch_gicv3.h>
#include <asm/barrier.h>
#include <asm/cpufeature.h>
#include <asm/daifflags.h>
#include <asm/fpsimd.h>

@@ -45,7 +49,7 @@

#define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS

-#define KVM_VCPU_MAX_FEATURES 4
+#define KVM_VCPU_MAX_FEATURES 7

#define KVM_REQ_SLEEP \
	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)

@@ -54,8 +58,12 @@

DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);

extern unsigned int kvm_sve_max_vl;
int kvm_arm_init_sve(void);

int __attribute_const__ kvm_target_cpu(void);
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu);
int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext);
void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t idmap_start);

@@ -117,6 +125,7 @@ enum vcpu_sysreg {
	SCTLR_EL1,	/* System Control Register */
	ACTLR_EL1,	/* Auxiliary Control Register */
	CPACR_EL1,	/* Coprocessor Access Control */
	ZCR_EL1,	/* SVE Control */
	TTBR0_EL1,	/* Translation Table Base Register 0 */
	TTBR1_EL1,	/* Translation Table Base Register 1 */
	TCR_EL1,	/* Translation Control Register */

@@ -152,6 +161,18 @@ enum vcpu_sysreg {
	PMSWINC_EL0,	/* Software Increment Register */
	PMUSERENR_EL0,	/* User Enable Register */

	/* Pointer Authentication Registers in a strict increasing order. */
	APIAKEYLO_EL1,
	APIAKEYHI_EL1,
	APIBKEYLO_EL1,
	APIBKEYHI_EL1,
	APDAKEYLO_EL1,
	APDAKEYHI_EL1,
	APDBKEYLO_EL1,
	APDBKEYHI_EL1,
	APGAKEYLO_EL1,
	APGAKEYHI_EL1,

	/* 32bit specific registers. Keep them at the end of the range */
	DACR32_EL2,	/* Domain Access Control Register */
	IFSR32_EL2,	/* Instruction Fault Status Register */

@@ -212,7 +233,17 @@ struct kvm_cpu_context {
	struct kvm_vcpu *__hyp_running_vcpu;
};

-typedef struct kvm_cpu_context kvm_cpu_context_t;
+struct kvm_pmu_events {
+	u32 events_host;
+	u32 events_guest;
+};
+
+struct kvm_host_data {
+	struct kvm_cpu_context host_ctxt;
+	struct kvm_pmu_events pmu_events;
+};
+
+typedef struct kvm_host_data kvm_host_data_t;

struct vcpu_reset_state {
	unsigned long	pc;

@@ -223,6 +254,8 @@ struct vcpu_reset_state {

struct kvm_vcpu_arch {
	struct kvm_cpu_context ctxt;
	void *sve_state;
	unsigned int sve_max_vl;

	/* HYP configuration */
	u64 hcr_el2;

@@ -255,7 +288,7 @@ struct kvm_vcpu_arch {
	struct kvm_guest_debug_arch external_debug_state;

	/* Pointer to host CPU context */
-	kvm_cpu_context_t *host_cpu_context;
+	struct kvm_cpu_context *host_cpu_context;

	struct thread_info *host_thread_info;	/* hyp VA */
	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */

@@ -318,12 +351,40 @@ struct kvm_vcpu_arch {
	bool sysregs_loaded_on_cpu;
};

/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
#define vcpu_sve_pffr(vcpu) ((void *)((char *)((vcpu)->arch.sve_state) + \
				      sve_ffr_offset((vcpu)->arch.sve_max_vl)))

#define vcpu_sve_state_size(vcpu) ({					\
	size_t __size_ret;						\
	unsigned int __vcpu_vq;						\
									\
	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
		__size_ret = 0;						\
	} else {							\
		__vcpu_vq = sve_vq_from_vl((vcpu)->arch.sve_max_vl);	\
		__size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq);		\
	}								\
									\
	__size_ret;							\
})

/* vcpu_arch flags field values: */
#define KVM_ARM64_DEBUG_DIRTY		(1 << 0)
#define KVM_ARM64_FP_ENABLED		(1 << 1) /* guest FP regs loaded */
#define KVM_ARM64_FP_HOST		(1 << 2) /* host FP regs loaded */
#define KVM_ARM64_HOST_SVE_IN_USE	(1 << 3) /* backup for host TIF_SVE */
#define KVM_ARM64_HOST_SVE_ENABLED	(1 << 4) /* SVE enabled for EL0 */
#define KVM_ARM64_GUEST_HAS_SVE		(1 << 5) /* SVE exposed to guest */
#define KVM_ARM64_VCPU_SVE_FINALIZED	(1 << 6) /* SVE config completed */
#define KVM_ARM64_GUEST_HAS_PTRAUTH	(1 << 7) /* PTRAUTH exposed to guest */

#define vcpu_has_sve(vcpu) (system_supports_sve() && \
			    ((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_SVE))

#define vcpu_has_ptrauth(vcpu)	((system_supports_address_auth() || \
				  system_supports_generic_auth()) && \
				 ((vcpu)->arch.flags & KVM_ARM64_GUEST_HAS_PTRAUTH))

#define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)

@@ -432,9 +493,9 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);

struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);

-DECLARE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
+DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);

-static inline void kvm_init_host_cpu_context(kvm_cpu_context_t *cpu_ctxt,
+static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt,
					     int cpu)
{
	/* The host's MPIDR is immutable, so let's set it up at boot time */

@@ -452,8 +513,8 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
	 * kernel's mapping to the linear mapping, and store it in tpidr_el2
	 * so that we can use adr_l to access per-cpu variables in EL2.
	 */
-	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_cpu_state) -
-			 (u64)kvm_ksym_ref(kvm_host_cpu_state));
+	u64 tpidr_el2 = ((u64)this_cpu_ptr(&kvm_host_data) -
+			 (u64)kvm_ksym_ref(kvm_host_data));

	/*
	 * Call initialization code, and switch to the full blown HYP code.

@@ -491,9 +552,10 @@ static inline bool kvm_arch_requires_vhe(void)
	return false;
}

void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);

static inline void kvm_arch_hardware_unsetup(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}

@@ -516,11 +578,28 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);

static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
{
	return (!has_vhe() && attr->exclude_host);
}

#ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
{
	return kvm_arch_vcpu_run_map_fp(vcpu);
}

void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u32 clr);

void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);

void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
#else
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
#endif

static inline void kvm_arm_vhe_guest_enter(void)

@@ -594,4 +673,10 @@ void kvm_arch_free_vm(struct kvm *kvm);

int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type);

int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);

#define kvm_arm_vcpu_sve_finalized(vcpu) \
	((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)

#endif /* __ARM64_KVM_HOST_H__ */
arch/arm64/include/asm/kvm_hyp.h

@@ -149,7 +149,6 @@ void __debug_switch_to_host(struct kvm_vcpu *vcpu);

void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-bool __fpsimd_enabled(void);

void activate_traps_vhe_load(struct kvm_vcpu *vcpu);
void deactivate_traps_vhe_put(void);
arch/arm64/include/asm/kvm_ptrauth.h (new file, 111 lines)

@@ -0,0 +1,111 @@
/* SPDX-License-Identifier: GPL-2.0 */
/* arch/arm64/include/asm/kvm_ptrauth.h: Guest/host ptrauth save/restore
 * Copyright 2019 Arm Limited
 * Authors: Mark Rutland <mark.rutland@arm.com>
 *          Amit Daniel Kachhap <amit.kachhap@arm.com>
 */

#ifndef __ASM_KVM_PTRAUTH_H
#define __ASM_KVM_PTRAUTH_H

#ifdef __ASSEMBLY__

#include <asm/sysreg.h>

#ifdef CONFIG_ARM64_PTR_AUTH

#define PTRAUTH_REG_OFFSET(x)	(x - CPU_APIAKEYLO_EL1)

/*
 * CPU_AP*_EL1 values exceed the immediate offset range (512) for stp
 * instructions, so the macros below take CPU_APIAKEYLO_EL1 as the base and
 * calculate the offset of the keys from it, avoiding an extra add
 * instruction. These macros assume the key offsets follow the order of
 * the sysreg enum in kvm_host.h.
 */
.macro	ptrauth_save_state base, reg1, reg2
	mrs_s	\reg1, SYS_APIAKEYLO_EL1
	mrs_s	\reg2, SYS_APIAKEYHI_EL1
	stp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APIAKEYLO_EL1)]
	mrs_s	\reg1, SYS_APIBKEYLO_EL1
	mrs_s	\reg2, SYS_APIBKEYHI_EL1
	stp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APIBKEYLO_EL1)]
	mrs_s	\reg1, SYS_APDAKEYLO_EL1
	mrs_s	\reg2, SYS_APDAKEYHI_EL1
	stp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APDAKEYLO_EL1)]
	mrs_s	\reg1, SYS_APDBKEYLO_EL1
	mrs_s	\reg2, SYS_APDBKEYHI_EL1
	stp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APDBKEYLO_EL1)]
	mrs_s	\reg1, SYS_APGAKEYLO_EL1
	mrs_s	\reg2, SYS_APGAKEYHI_EL1
	stp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APGAKEYLO_EL1)]
.endm

.macro	ptrauth_restore_state base, reg1, reg2
	ldp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APIAKEYLO_EL1)]
	msr_s	SYS_APIAKEYLO_EL1, \reg1
	msr_s	SYS_APIAKEYHI_EL1, \reg2
	ldp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APIBKEYLO_EL1)]
	msr_s	SYS_APIBKEYLO_EL1, \reg1
	msr_s	SYS_APIBKEYHI_EL1, \reg2
	ldp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APDAKEYLO_EL1)]
	msr_s	SYS_APDAKEYLO_EL1, \reg1
	msr_s	SYS_APDAKEYHI_EL1, \reg2
	ldp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APDBKEYLO_EL1)]
	msr_s	SYS_APDBKEYLO_EL1, \reg1
	msr_s	SYS_APDBKEYHI_EL1, \reg2
	ldp	\reg1, \reg2, [\base, #PTRAUTH_REG_OFFSET(CPU_APGAKEYLO_EL1)]
	msr_s	SYS_APGAKEYLO_EL1, \reg1
	msr_s	SYS_APGAKEYHI_EL1, \reg2
.endm

/*
 * Both the ptrauth_switch_to_guest and ptrauth_switch_to_host macros
 * check for the presence of one of the cpufeature flags
 * ARM64_HAS_ADDRESS_AUTH_ARCH or ARM64_HAS_ADDRESS_AUTH_IMP_DEF and
 * only then proceed with the save/restore of the Pointer Authentication
 * key registers.
 */
.macro	ptrauth_switch_to_guest g_ctxt, reg1, reg2, reg3
alternative_if ARM64_HAS_ADDRESS_AUTH_ARCH
	b	1000f
alternative_else_nop_endif
alternative_if_not ARM64_HAS_ADDRESS_AUTH_IMP_DEF
	b	1001f
alternative_else_nop_endif
1000:
	ldr	\reg1, [\g_ctxt, #(VCPU_HCR_EL2 - VCPU_CONTEXT)]
	and	\reg1, \reg1, #(HCR_API | HCR_APK)
	cbz	\reg1, 1001f
	add	\reg1, \g_ctxt, #CPU_APIAKEYLO_EL1
	ptrauth_restore_state	\reg1, \reg2, \reg3
1001:
.endm

.macro	ptrauth_switch_to_host g_ctxt, h_ctxt, reg1, reg2, reg3
alternative_if ARM64_HAS_ADDRESS_AUTH_ARCH
	b	2000f
alternative_else_nop_endif
alternative_if_not ARM64_HAS_ADDRESS_AUTH_IMP_DEF
	b	2001f
alternative_else_nop_endif
2000:
	ldr	\reg1, [\g_ctxt, #(VCPU_HCR_EL2 - VCPU_CONTEXT)]
	and	\reg1, \reg1, #(HCR_API | HCR_APK)
	cbz	\reg1, 2001f
	add	\reg1, \g_ctxt, #CPU_APIAKEYLO_EL1
	ptrauth_save_state	\reg1, \reg2, \reg3
	add	\reg1, \h_ctxt, #CPU_APIAKEYLO_EL1
	ptrauth_restore_state	\reg1, \reg2, \reg3
	isb
2001:
.endm

#else /* !CONFIG_ARM64_PTR_AUTH */
.macro ptrauth_switch_to_guest g_ctxt, reg1, reg2, reg3
.endm
.macro ptrauth_switch_to_host g_ctxt, h_ctxt, reg1, reg2, reg3
.endm
#endif /* CONFIG_ARM64_PTR_AUTH */
#endif /* __ASSEMBLY__ */
#endif /* __ASM_KVM_PTRAUTH_H */
arch/arm64/include/asm/sysreg.h

@@ -454,6 +454,9 @@
#define SYS_ICH_LR14_EL2		__SYS__LR8_EL2(6)
#define SYS_ICH_LR15_EL2		__SYS__LR8_EL2(7)

/* VHE encodings for architectural EL0/1 system registers */
#define SYS_ZCR_EL12			sys_reg(3, 5, 1, 2, 0)

/* Common SCTLR_ELx flags. */
#define SCTLR_ELx_DSSBS	(_BITUL(44))
#define SCTLR_ELx_ENIA	(_BITUL(31))
arch/arm64/include/uapi/asm/kvm.h

@@ -35,6 +35,7 @@
#include <linux/psci.h>
#include <linux/types.h>
#include <asm/ptrace.h>
#include <asm/sve_context.h>

#define __KVM_HAVE_GUEST_DEBUG
#define __KVM_HAVE_IRQ_LINE

@@ -102,6 +103,9 @@ struct kvm_regs {
#define KVM_ARM_VCPU_EL1_32BIT		1 /* CPU running a 32bit VM */
#define KVM_ARM_VCPU_PSCI_0_2		2 /* CPU uses PSCI v0.2 */
#define KVM_ARM_VCPU_PMU_V3		3 /* Support guest PMUv3 */
#define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
#define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
#define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */

struct kvm_vcpu_init {
	__u32 target;

@@ -226,6 +230,45 @@ struct kvm_vcpu_events {
					 KVM_REG_ARM_FW | ((r) & 0xffff))
#define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)

/* SVE registers */
#define KVM_REG_ARM64_SVE		(0x15 << KVM_REG_ARM_COPROC_SHIFT)

/* Z- and P-regs occupy blocks at the following offsets within this range: */
#define KVM_REG_ARM64_SVE_ZREG_BASE	0
#define KVM_REG_ARM64_SVE_PREG_BASE	0x400
#define KVM_REG_ARM64_SVE_FFR_BASE	0x600

#define KVM_ARM64_SVE_NUM_ZREGS		__SVE_NUM_ZREGS
#define KVM_ARM64_SVE_NUM_PREGS		__SVE_NUM_PREGS

#define KVM_ARM64_SVE_MAX_SLICES	32

#define KVM_REG_ARM64_SVE_ZREG(n, i)					\
	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | KVM_REG_ARM64_SVE_ZREG_BASE | \
	 KVM_REG_SIZE_U2048 |						\
	 (((n) & (KVM_ARM64_SVE_NUM_ZREGS - 1)) << 5) |			\
	 ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1)))

#define KVM_REG_ARM64_SVE_PREG(n, i)					\
	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | KVM_REG_ARM64_SVE_PREG_BASE | \
	 KVM_REG_SIZE_U256 |						\
	 (((n) & (KVM_ARM64_SVE_NUM_PREGS - 1)) << 5) |			\
	 ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1)))

#define KVM_REG_ARM64_SVE_FFR(i)					\
	(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | KVM_REG_ARM64_SVE_FFR_BASE | \
	 KVM_REG_SIZE_U256 |						\
	 ((i) & (KVM_ARM64_SVE_MAX_SLICES - 1)))

#define KVM_ARM64_SVE_VQ_MIN __SVE_VQ_MIN
#define KVM_ARM64_SVE_VQ_MAX __SVE_VQ_MAX

/* Vector lengths pseudo-register: */
#define KVM_REG_ARM64_SVE_VLS		(KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
					 KVM_REG_SIZE_U512 | 0xffff)
#define KVM_ARM64_SVE_VLS_WORDS	\
	((KVM_ARM64_SVE_VQ_MAX - KVM_ARM64_SVE_VQ_MIN) / 64 + 1)

/* Device Control API: ARM VGIC */
#define KVM_DEV_ARM_VGIC_GRP_ADDR	0
#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
arch/arm64/kernel/asm-offsets.c

@@ -125,9 +125,16 @@ int main(void)
  DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
  DEFINE(VCPU_WORKAROUND_FLAGS,	offsetof(struct kvm_vcpu, arch.workaround_flags));
  DEFINE(VCPU_HCR_EL2,		offsetof(struct kvm_vcpu, arch.hcr_el2));
  DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
  DEFINE(CPU_APIAKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APIAKEYLO_EL1]));
  DEFINE(CPU_APIBKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APIBKEYLO_EL1]));
  DEFINE(CPU_APDAKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APDAKEYLO_EL1]));
  DEFINE(CPU_APDBKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APDBKEYLO_EL1]));
  DEFINE(CPU_APGAKEYLO_EL1,	offsetof(struct kvm_cpu_context, sys_regs[APGAKEYLO_EL1]));
  DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
  DEFINE(HOST_CONTEXT_VCPU,	offsetof(struct kvm_cpu_context, __hyp_running_vcpu));
  DEFINE(HOST_DATA_CONTEXT,	offsetof(struct kvm_host_data, host_ctxt));
#endif
#ifdef CONFIG_CPU_PM
  DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
arch/arm64/kernel/cpufeature.c

@@ -1913,7 +1913,7 @@ static void verify_sve_features(void)
	unsigned int len = zcr & ZCR_ELx_LEN_MASK;

	if (len < safe_len || sve_verify_vq_map()) {
-		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+		pr_crit("CPU%d: SVE: vector length support mismatch\n",
			smp_processor_id());
		cpu_die_early();
	}
@@ -18,6 +18,7 @@
 */

#include <linux/bitmap.h>
#include <linux/bitops.h>
#include <linux/bottom_half.h>
#include <linux/bug.h>
#include <linux/cache.h>
@@ -48,6 +49,7 @@
#include <asm/sigcontext.h>
#include <asm/sysreg.h>
#include <asm/traps.h>
#include <asm/virt.h>

#define FPEXC_IOF	(1 << 0)
#define FPEXC_DZF	(1 << 1)
@@ -119,6 +121,8 @@
 */
struct fpsimd_last_state_struct {
	struct user_fpsimd_state *st;
+	void *sve_state;
+	unsigned int sve_vl;
};

static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -130,14 +134,23 @@ static int sve_default_vl = -1;

/* Maximum supported vector length across all CPUs (initially poisoned) */
int __ro_after_init sve_max_vl = SVE_VL_MIN;
-/* Set of available vector lengths, as vq_to_bit(vq): */
-static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+int __ro_after_init sve_max_virtualisable_vl = SVE_VL_MIN;
+
+/*
+ * Set of available vector lengths,
+ * where length vq encoded as bit __vq_to_bit(vq):
+ */
+__ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+/* Set of vector lengths present on at least one cpu: */
+static __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);

static void __percpu *efi_sve_state;

#else /* ! CONFIG_ARM64_SVE */

/* Dummy declaration for code that will be optimised out: */
-extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+extern __ro_after_init DECLARE_BITMAP(sve_vq_partial_map, SVE_VQ_MAX);
extern void __percpu *efi_sve_state;

#endif /* ! CONFIG_ARM64_SVE */
@@ -235,14 +248,15 @@ static void task_fpsimd_load(void)
 */
void fpsimd_save(void)
{
-	struct user_fpsimd_state *st = __this_cpu_read(fpsimd_last_state.st);
+	struct fpsimd_last_state_struct const *last =
+		this_cpu_ptr(&fpsimd_last_state);
+	/* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */

	WARN_ON(!in_softirq() && !irqs_disabled());

	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
		if (system_supports_sve() && test_thread_flag(TIF_SVE)) {
-			if (WARN_ON(sve_get_vl() != current->thread.sve_vl)) {
+			if (WARN_ON(sve_get_vl() != last->sve_vl)) {
				/*
				 * Can't save the user regs, so current would
				 * re-enter user with corrupt state.
@@ -252,31 +266,14 @@ void fpsimd_save(void)
				return;
			}

-			sve_save_state(sve_pffr(&current->thread), &st->fpsr);
+			sve_save_state((char *)last->sve_state +
+				       sve_ffr_offset(last->sve_vl),
+				       &last->st->fpsr);
		} else
-			fpsimd_save_state(st);
+			fpsimd_save_state(last->st);
	}
}

/*
 * Helpers to translate bit indices in sve_vq_map to VQ values (and
 * vice versa). This allows find_next_bit() to be used to find the
 * _maximum_ VQ not exceeding a certain value.
 */
static unsigned int vq_to_bit(unsigned int vq)
{
	return SVE_VQ_MAX - vq;
}

static unsigned int bit_to_vq(unsigned int bit)
{
	if (WARN_ON(bit >= SVE_VQ_MAX))
		bit = SVE_VQ_MAX - 1;

	return SVE_VQ_MAX - bit;
}
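The reversed encoding above (vq_to_bit(vq) == SVE_VQ_MAX - vq, renamed __vq_to_bit()/__bit_to_vq() elsewhere in this series) is what lets find_next_bit() return the largest supported VQ not exceeding a requested one. The following is a small standalone sketch of that idea with made-up values and a plain unsigned long instead of the kernel's bitmap API; the real SVE_VQ_MAX is much larger.

#include <stdio.h>

#define SVE_VQ_MIN 1
#define SVE_VQ_MAX 16	/* illustrative only */

static unsigned int vq_to_bit(unsigned int vq)  { return SVE_VQ_MAX - vq; }
static unsigned int bit_to_vq(unsigned int bit) { return SVE_VQ_MAX - bit; }

int main(void)
{
	/* Pretend only VQ 1, 2 and 4 are supported (VL = 16, 32, 64 bytes). */
	unsigned long map = 0;
	unsigned int vq, want = 8, found = 0;

	for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; vq++)
		if (vq == 1 || vq == 2 || vq == 4)
			map |= 1UL << vq_to_bit(vq);

	/*
	 * Scanning bit indices upwards from vq_to_bit(want) visits VQs in
	 * decreasing order, so the first set bit found corresponds to the
	 * largest supported VQ that does not exceed want.
	 */
	for (unsigned int bit = vq_to_bit(want); bit <= SVE_VQ_MAX; bit++)
		if (map & (1UL << bit)) {
			found = bit_to_vq(bit);
			break;
		}

	printf("largest supported VQ <= %u is %u\n", want, found);	/* prints 4 */
	return 0;
}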
/*
 * All vector length selection from userspace comes through here.
 * We're on a slow path, so some sanity-checks are included.
@@ -298,8 +295,8 @@ static unsigned int find_supported_vector_length(unsigned int vl)
		vl = max_vl;

	bit = find_next_bit(sve_vq_map, SVE_VQ_MAX,
-			    vq_to_bit(sve_vq_from_vl(vl)));
-	return sve_vl_from_vq(bit_to_vq(bit));
+			    __vq_to_bit(sve_vq_from_vl(vl)));
+	return sve_vl_from_vq(__bit_to_vq(bit));
}

#ifdef CONFIG_SYSCTL
@@ -550,7 +547,6 @@ int sve_set_vector_length(struct task_struct *task,
		local_bh_disable();

		fpsimd_save();
-		set_thread_flag(TIF_FOREIGN_FPSTATE);
	}

	fpsimd_flush_task_state(task);
@@ -624,12 +620,6 @@ int sve_get_current_vl(void)
	return sve_prctl_status(0);
}

-/*
- * Bitmap for temporary storage of the per-CPU set of supported vector lengths
- * during secondary boot.
- */
-static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX);
-
static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
{
	unsigned int vq, vl;
@@ -644,40 +634,82 @@ static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
		write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */
		vl = sve_get_vl();
		vq = sve_vq_from_vl(vl); /* skip intervening lengths */
-		set_bit(vq_to_bit(vq), map);
+		set_bit(__vq_to_bit(vq), map);
	}
}

/*
 * Initialise the set of known supported VQs for the boot CPU.
 * This is called during kernel boot, before secondary CPUs are brought up.
 */
void __init sve_init_vq_map(void)
{
	sve_probe_vqs(sve_vq_map);
	bitmap_copy(sve_vq_partial_map, sve_vq_map, SVE_VQ_MAX);
}

/*
 * If we haven't committed to the set of supported VQs yet, filter out
 * those not supported by the current CPU.
 * This function is called during the bring-up of early secondary CPUs only.
 */
void sve_update_vq_map(void)
{
-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
+	DECLARE_BITMAP(tmp_map, SVE_VQ_MAX);
+
+	sve_probe_vqs(tmp_map);
+	bitmap_and(sve_vq_map, sve_vq_map, tmp_map, SVE_VQ_MAX);
+	bitmap_or(sve_vq_partial_map, sve_vq_partial_map, tmp_map, SVE_VQ_MAX);
}
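To make the bookkeeping concrete, here is a hedged sketch in plain C (toy bitmaps, hypothetical VQ sets, ignoring the reversed __vq_to_bit() encoding used by the real code): sve_vq_map shrinks to the intersection of every CPU's VQ set, while sve_vq_partial_map grows to their union, which is what sve_verify_vq_map() and sve_setup() later compare.

#include <stdio.h>

int main(void)
{
	/* Bit (n - 1) set means VQ n is supported; values are made up. */
	unsigned long boot_cpu   = 0xf;	/* VQ 1..4 */
	unsigned long second_cpu = 0x7;	/* VQ 1..3 */

	unsigned long vq_map         = boot_cpu;	/* as in sve_init_vq_map() */
	unsigned long vq_partial_map = boot_cpu;

	/* As in sve_update_vq_map() for the secondary CPU: */
	vq_map         &= second_cpu;	/* intersection: VQs usable on every CPU      */
	vq_partial_map |= second_cpu;	/* union: VQs present on at least one CPU      */

	printf("common VQs:  %#lx\n", vq_map);		/* 0x7 */
	printf("partial VQs: %#lx\n", vq_partial_map);	/* 0xf */
	return 0;
}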
-/* Check whether the current CPU supports all VQs in the committed set */
+/*
+ * Check whether the current CPU supports all VQs in the committed set.
+ * This function is called during the bring-up of late secondary CPUs only.
+ */
int sve_verify_vq_map(void)
{
-	int ret = 0;
+	DECLARE_BITMAP(tmp_map, SVE_VQ_MAX);
+	unsigned long b;

-	sve_probe_vqs(sve_secondary_vq_map);
-	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
-		      SVE_VQ_MAX);
-	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
+	sve_probe_vqs(tmp_map);
+
+	bitmap_complement(tmp_map, tmp_map, SVE_VQ_MAX);
+	if (bitmap_intersects(tmp_map, sve_vq_map, SVE_VQ_MAX)) {
		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
			smp_processor_id());
-		ret = -EINVAL;
+		return -EINVAL;
	}

-	return ret;
+	if (!IS_ENABLED(CONFIG_KVM) || !is_hyp_mode_available())
+		return 0;
+
+	/*
+	 * For KVM, it is necessary to ensure that this CPU doesn't
+	 * support any vector length that guests may have probed as
+	 * unsupported.
+	 */
+
+	/* Recover the set of supported VQs: */
+	bitmap_complement(tmp_map, tmp_map, SVE_VQ_MAX);
+	/* Find VQs supported that are not globally supported: */
+	bitmap_andnot(tmp_map, tmp_map, sve_vq_map, SVE_VQ_MAX);
+
+	/* Find the lowest such VQ, if any: */
+	b = find_last_bit(tmp_map, SVE_VQ_MAX);
+	if (b >= SVE_VQ_MAX)
+		return 0; /* no mismatches */
+
+	/*
+	 * Mismatches above sve_max_virtualisable_vl are fine, since
+	 * no guest is allowed to configure ZCR_EL2.LEN to exceed this:
+	 */
+	if (sve_vl_from_vq(__bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+		pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
+			smp_processor_id());
+		return -EINVAL;
+	}
+
+	return 0;
}

static void __init sve_efi_setup(void)
@@ -744,6 +776,8 @@ u64 read_zcr_features(void)
|
||||
void __init sve_setup(void)
|
||||
{
|
||||
u64 zcr;
|
||||
DECLARE_BITMAP(tmp_map, SVE_VQ_MAX);
|
||||
unsigned long b;
|
||||
|
||||
if (!system_supports_sve())
|
||||
return;
|
||||
@ -753,8 +787,8 @@ void __init sve_setup(void)
|
||||
* so sve_vq_map must have at least SVE_VQ_MIN set.
|
||||
* If something went wrong, at least try to patch it up:
|
||||
*/
|
||||
if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
|
||||
set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map);
|
||||
if (WARN_ON(!test_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
|
||||
set_bit(__vq_to_bit(SVE_VQ_MIN), sve_vq_map);
|
||||
|
||||
zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
|
||||
sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1);
|
||||
@ -772,11 +806,31 @@ void __init sve_setup(void)
|
||||
*/
|
||||
sve_default_vl = find_supported_vector_length(64);
|
||||
|
||||
bitmap_andnot(tmp_map, sve_vq_partial_map, sve_vq_map,
|
||||
SVE_VQ_MAX);
|
||||
|
||||
b = find_last_bit(tmp_map, SVE_VQ_MAX);
|
||||
if (b >= SVE_VQ_MAX)
|
||||
/* No non-virtualisable VLs found */
|
||||
sve_max_virtualisable_vl = SVE_VQ_MAX;
|
||||
else if (WARN_ON(b == SVE_VQ_MAX - 1))
|
||||
/* No virtualisable VLs? This is architecturally forbidden. */
|
||||
sve_max_virtualisable_vl = SVE_VQ_MIN;
|
||||
else /* b + 1 < SVE_VQ_MAX */
|
||||
sve_max_virtualisable_vl = sve_vl_from_vq(__bit_to_vq(b + 1));
|
||||
|
||||
if (sve_max_virtualisable_vl > sve_max_vl)
|
||||
sve_max_virtualisable_vl = sve_max_vl;
|
||||
|
||||
pr_info("SVE: maximum available vector length %u bytes per vector\n",
|
||||
sve_max_vl);
|
||||
pr_info("SVE: default vector length %u bytes per vector\n",
|
||||
sve_default_vl);
|
||||
|
||||
/* KVM decides whether to support mismatched systems. Just warn here: */
|
||||
if (sve_max_virtualisable_vl < sve_max_vl)
|
||||
pr_warn("SVE: unvirtualisable vector lengths present\n");
|
||||
|
||||
sve_efi_setup();
|
||||
}
|
||||
|
||||
@ -816,12 +870,11 @@ asmlinkage void do_sve_acc(unsigned int esr, struct pt_regs *regs)
|
||||
local_bh_disable();
|
||||
|
||||
fpsimd_save();
|
||||
fpsimd_to_sve(current);
|
||||
|
||||
/* Force ret_to_user to reload the registers: */
|
||||
fpsimd_flush_task_state(current);
|
||||
set_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
|
||||
fpsimd_to_sve(current);
|
||||
if (test_and_set_thread_flag(TIF_SVE))
|
||||
WARN_ON(1); /* SVE access shouldn't have trapped */
|
||||
|
||||
@ -894,9 +947,9 @@ void fpsimd_flush_thread(void)
|
||||
|
||||
local_bh_disable();
|
||||
|
||||
fpsimd_flush_task_state(current);
|
||||
memset(&current->thread.uw.fpsimd_state, 0,
|
||||
sizeof(current->thread.uw.fpsimd_state));
|
||||
fpsimd_flush_task_state(current);
|
||||
|
||||
if (system_supports_sve()) {
|
||||
clear_thread_flag(TIF_SVE);
|
||||
@ -933,8 +986,6 @@ void fpsimd_flush_thread(void)
|
||||
current->thread.sve_vl_onexec = 0;
|
||||
}
|
||||
|
||||
set_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
|
||||
local_bh_enable();
|
||||
}
|
||||
|
||||
@ -974,6 +1025,8 @@ void fpsimd_bind_task_to_cpu(void)
|
||||
this_cpu_ptr(&fpsimd_last_state);
|
||||
|
||||
last->st = &current->thread.uw.fpsimd_state;
|
||||
last->sve_state = current->thread.sve_state;
|
||||
last->sve_vl = current->thread.sve_vl;
|
||||
current->thread.fpsimd_cpu = smp_processor_id();
|
||||
|
||||
if (system_supports_sve()) {
|
||||
@ -987,7 +1040,8 @@ void fpsimd_bind_task_to_cpu(void)
|
||||
}
|
||||
}
|
||||
|
||||
void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
|
||||
void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
|
||||
unsigned int sve_vl)
|
||||
{
|
||||
struct fpsimd_last_state_struct *last =
|
||||
this_cpu_ptr(&fpsimd_last_state);
|
||||
@ -995,6 +1049,8 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st)
|
||||
WARN_ON(!in_softirq() && !irqs_disabled());
|
||||
|
||||
last->st = st;
|
||||
last->sve_state = sve_state;
|
||||
last->sve_vl = sve_vl;
|
||||
}
|
||||
|
||||
/*
|
||||
@ -1043,12 +1099,29 @@ void fpsimd_update_current_state(struct user_fpsimd_state const *state)
|
||||
|
||||
/*
|
||||
* Invalidate live CPU copies of task t's FPSIMD state
|
||||
*
|
||||
* This function may be called with preemption enabled. The barrier()
|
||||
* ensures that the assignment to fpsimd_cpu is visible to any
|
||||
* preemption/softirq that could race with set_tsk_thread_flag(), so
|
||||
* that TIF_FOREIGN_FPSTATE cannot be spuriously re-cleared.
|
||||
*
|
||||
* The final barrier ensures that TIF_FOREIGN_FPSTATE is seen set by any
|
||||
* subsequent code.
|
||||
*/
|
||||
void fpsimd_flush_task_state(struct task_struct *t)
|
||||
{
|
||||
t->thread.fpsimd_cpu = NR_CPUS;
|
||||
|
||||
barrier();
|
||||
set_tsk_thread_flag(t, TIF_FOREIGN_FPSTATE);
|
||||
|
||||
barrier();
|
||||
}
|
||||
|
||||
/*
|
||||
* Invalidate any task's FPSIMD state that is present on this cpu.
|
||||
* This function must be called with softirqs disabled.
|
||||
*/
|
||||
void fpsimd_flush_cpu_state(void)
|
||||
{
|
||||
__this_cpu_write(fpsimd_last_state.st, NULL);
|
||||
|
@@ -26,6 +26,7 @@
|
||||
|
||||
#include <linux/acpi.h>
|
||||
#include <linux/clocksource.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/of.h>
|
||||
#include <linux/perf/arm_pmu.h>
|
||||
#include <linux/platform_device.h>
|
||||
@ -528,12 +529,21 @@ static inline int armv8pmu_enable_counter(int idx)
|
||||
|
||||
static inline void armv8pmu_enable_event_counter(struct perf_event *event)
|
||||
{
|
||||
struct perf_event_attr *attr = &event->attr;
|
||||
int idx = event->hw.idx;
|
||||
u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
|
||||
|
||||
armv8pmu_enable_counter(idx);
|
||||
if (armv8pmu_event_is_chained(event))
|
||||
armv8pmu_enable_counter(idx - 1);
|
||||
isb();
|
||||
counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
|
||||
|
||||
kvm_set_pmu_events(counter_bits, attr);
|
||||
|
||||
/* We rely on the hypervisor switch code to enable guest counters */
|
||||
if (!kvm_pmu_counter_deferred(attr)) {
|
||||
armv8pmu_enable_counter(idx);
|
||||
if (armv8pmu_event_is_chained(event))
|
||||
armv8pmu_enable_counter(idx - 1);
|
||||
}
|
||||
}
|
||||
|
||||
static inline int armv8pmu_disable_counter(int idx)
|
||||
@ -546,11 +556,21 @@ static inline int armv8pmu_disable_counter(int idx)
|
||||
static inline void armv8pmu_disable_event_counter(struct perf_event *event)
|
||||
{
|
||||
struct hw_perf_event *hwc = &event->hw;
|
||||
struct perf_event_attr *attr = &event->attr;
|
||||
int idx = hwc->idx;
|
||||
u32 counter_bits = BIT(ARMV8_IDX_TO_COUNTER(idx));
|
||||
|
||||
if (armv8pmu_event_is_chained(event))
|
||||
armv8pmu_disable_counter(idx - 1);
|
||||
armv8pmu_disable_counter(idx);
|
||||
counter_bits |= BIT(ARMV8_IDX_TO_COUNTER(idx - 1));
|
||||
|
||||
kvm_clr_pmu_events(counter_bits);
|
||||
|
||||
/* We rely on the hypervisor switch code to disable guest counters */
|
||||
if (!kvm_pmu_counter_deferred(attr)) {
|
||||
if (armv8pmu_event_is_chained(event))
|
||||
armv8pmu_disable_counter(idx - 1);
|
||||
armv8pmu_disable_counter(idx);
|
||||
}
|
||||
}
|
||||
|
||||
static inline int armv8pmu_enable_intens(int idx)
|
||||
@ -827,14 +847,23 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event,
|
||||
* with other architectures (x86 and Power).
|
||||
*/
|
||||
if (is_kernel_in_hyp_mode()) {
|
||||
if (!attr->exclude_kernel)
|
||||
if (!attr->exclude_kernel && !attr->exclude_host)
|
||||
config_base |= ARMV8_PMU_INCLUDE_EL2;
|
||||
} else {
|
||||
if (attr->exclude_kernel)
|
||||
if (attr->exclude_guest)
|
||||
config_base |= ARMV8_PMU_EXCLUDE_EL1;
|
||||
if (!attr->exclude_hv)
|
||||
if (attr->exclude_host)
|
||||
config_base |= ARMV8_PMU_EXCLUDE_EL0;
|
||||
} else {
|
||||
if (!attr->exclude_hv && !attr->exclude_host)
|
||||
config_base |= ARMV8_PMU_INCLUDE_EL2;
|
||||
}
|
||||
|
||||
/*
|
||||
* Filter out !VHE kernels and guest kernels
|
||||
*/
|
||||
if (attr->exclude_kernel)
|
||||
config_base |= ARMV8_PMU_EXCLUDE_EL1;
|
||||
|
||||
if (attr->exclude_user)
|
||||
config_base |= ARMV8_PMU_EXCLUDE_EL0;
|
||||
|
||||
@ -864,6 +893,9 @@ static void armv8pmu_reset(void *info)
|
||||
armv8pmu_disable_intens(idx);
|
||||
}
|
||||
|
||||
/* Clear the counters we flip at guest entry/exit */
|
||||
kvm_clr_pmu_events(U32_MAX);
|
||||
|
||||
/*
|
||||
* Initialize & Reset PMNC. Request overflow interrupt for
|
||||
* 64 bit cycle counter but cheat in armv8pmu_write_counter().
|
||||
|
@@ -296,11 +296,6 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
|
||||
*/
|
||||
|
||||
fpsimd_flush_task_state(current);
|
||||
barrier();
|
||||
/* From now, fpsimd_thread_switch() won't clear TIF_FOREIGN_FPSTATE */
|
||||
|
||||
set_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
barrier();
|
||||
/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
|
||||
|
||||
sve_alloc(current);
|
||||
|
@@ -17,7 +17,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o pmu.o
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
|
||||
|
||||
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
|
||||
|
@@ -9,6 +9,7 @@
|
||||
#include <linux/sched.h>
|
||||
#include <linux/thread_info.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <asm/fpsimd.h>
|
||||
#include <asm/kvm_asm.h>
|
||||
#include <asm/kvm_host.h>
|
||||
#include <asm/kvm_mmu.h>
|
||||
@ -85,9 +86,12 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
|
||||
WARN_ON_ONCE(!irqs_disabled());
|
||||
|
||||
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
|
||||
fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs);
|
||||
fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.gp_regs.fp_regs,
|
||||
vcpu->arch.sve_state,
|
||||
vcpu->arch.sve_max_vl);
|
||||
|
||||
clear_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
clear_thread_flag(TIF_SVE);
|
||||
update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
|
||||
}
|
||||
}
|
||||
|
||||
@ -100,14 +104,21 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
|
||||
void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
unsigned long flags;
|
||||
bool host_has_sve = system_supports_sve();
|
||||
bool guest_has_sve = vcpu_has_sve(vcpu);
|
||||
|
||||
local_irq_save(flags);
|
||||
|
||||
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
|
||||
u64 *guest_zcr = &vcpu->arch.ctxt.sys_regs[ZCR_EL1];
|
||||
|
||||
/* Clean guest FP state to memory and invalidate cpu view */
|
||||
fpsimd_save();
|
||||
fpsimd_flush_cpu_state();
|
||||
} else if (system_supports_sve()) {
|
||||
|
||||
if (guest_has_sve)
|
||||
*guest_zcr = read_sysreg_s(SYS_ZCR_EL12);
|
||||
} else if (host_has_sve) {
|
||||
/*
|
||||
* The FPSIMD/SVE state in the CPU has not been touched, and we
|
||||
* have SVE (and VHE): CPACR_EL1 (alias CPTR_EL2) has been
|
||||
|
@@ -19,18 +19,25 @@
|
||||
* along with this program. If not, see <http://www.gnu.org/licenses/>.
|
||||
*/
|
||||
|
||||
#include <linux/bits.h>
|
||||
#include <linux/errno.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/stddef.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <linux/fs.h>
|
||||
#include <kvm/arm_psci.h>
|
||||
#include <asm/cputype.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <asm/fpsimd.h>
|
||||
#include <asm/kvm.h>
|
||||
#include <asm/kvm_emulate.h>
|
||||
#include <asm/kvm_coproc.h>
|
||||
#include <asm/kvm_host.h>
|
||||
#include <asm/sigcontext.h>
|
||||
|
||||
#include "trace.h"
|
||||
|
||||
@ -52,12 +59,19 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static bool core_reg_offset_is_vreg(u64 off)
|
||||
{
|
||||
return off >= KVM_REG_ARM_CORE_REG(fp_regs.vregs) &&
|
||||
off < KVM_REG_ARM_CORE_REG(fp_regs.fpsr);
|
||||
}
|
||||
|
||||
static u64 core_reg_offset_from_id(u64 id)
|
||||
{
|
||||
return id & ~(KVM_REG_ARCH_MASK | KVM_REG_SIZE_MASK | KVM_REG_ARM_CORE);
|
||||
}
|
||||
|
||||
static int validate_core_offset(const struct kvm_one_reg *reg)
|
||||
static int validate_core_offset(const struct kvm_vcpu *vcpu,
|
||||
const struct kvm_one_reg *reg)
|
||||
{
|
||||
u64 off = core_reg_offset_from_id(reg->id);
|
||||
int size;
|
||||
@ -89,11 +103,19 @@ static int validate_core_offset(const struct kvm_one_reg *reg)
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
if (KVM_REG_SIZE(reg->id) == size &&
|
||||
IS_ALIGNED(off, size / sizeof(__u32)))
|
||||
return 0;
|
||||
if (KVM_REG_SIZE(reg->id) != size ||
|
||||
!IS_ALIGNED(off, size / sizeof(__u32)))
|
||||
return -EINVAL;
|
||||
|
||||
return -EINVAL;
|
||||
/*
|
||||
* The KVM_REG_ARM64_SVE regs must be used instead of
|
||||
* KVM_REG_ARM_CORE for accessing the FPSIMD V-registers on
|
||||
* SVE-enabled vcpus:
|
||||
*/
|
||||
if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(off))
|
||||
return -EINVAL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
@ -115,7 +137,7 @@ static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
(off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
|
||||
return -ENOENT;
|
||||
|
||||
if (validate_core_offset(reg))
|
||||
if (validate_core_offset(vcpu, reg))
|
||||
return -EINVAL;
|
||||
|
||||
if (copy_to_user(uaddr, ((u32 *)regs) + off, KVM_REG_SIZE(reg->id)))
|
||||
@ -140,7 +162,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
(off + (KVM_REG_SIZE(reg->id) / sizeof(__u32))) >= nr_regs)
|
||||
return -ENOENT;
|
||||
|
||||
if (validate_core_offset(reg))
|
||||
if (validate_core_offset(vcpu, reg))
|
||||
return -EINVAL;
|
||||
|
||||
if (KVM_REG_SIZE(reg->id) > sizeof(tmp))
|
||||
@ -183,6 +205,239 @@ out:
|
||||
return err;
|
||||
}
|
||||
|
||||
#define vq_word(vq) (((vq) - SVE_VQ_MIN) / 64)
|
||||
#define vq_mask(vq) ((u64)1 << ((vq) - SVE_VQ_MIN) % 64)
|
||||
|
||||
static bool vq_present(
|
||||
const u64 (*const vqs)[KVM_ARM64_SVE_VLS_WORDS],
|
||||
unsigned int vq)
|
||||
{
|
||||
return (*vqs)[vq_word(vq)] & vq_mask(vq);
|
||||
}
|
||||
|
||||
static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
{
|
||||
unsigned int max_vq, vq;
|
||||
u64 vqs[KVM_ARM64_SVE_VLS_WORDS];
|
||||
|
||||
if (!vcpu_has_sve(vcpu))
|
||||
return -ENOENT;
|
||||
|
||||
if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
|
||||
return -EINVAL;
|
||||
|
||||
memset(vqs, 0, sizeof(vqs));
|
||||
|
||||
max_vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
|
||||
for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
|
||||
if (sve_vq_available(vq))
|
||||
vqs[vq_word(vq)] |= vq_mask(vq);
|
||||
|
||||
if (copy_to_user((void __user *)reg->addr, vqs, sizeof(vqs)))
|
||||
return -EFAULT;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
{
|
||||
unsigned int max_vq, vq;
|
||||
u64 vqs[KVM_ARM64_SVE_VLS_WORDS];
|
||||
|
||||
if (!vcpu_has_sve(vcpu))
|
||||
return -ENOENT;
|
||||
|
||||
if (kvm_arm_vcpu_sve_finalized(vcpu))
|
||||
return -EPERM; /* too late! */
|
||||
|
||||
if (WARN_ON(vcpu->arch.sve_state))
|
||||
return -EINVAL;
|
||||
|
||||
if (copy_from_user(vqs, (const void __user *)reg->addr, sizeof(vqs)))
|
||||
return -EFAULT;
|
||||
|
||||
max_vq = 0;
|
||||
for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; ++vq)
|
||||
if (vq_present(&vqs, vq))
|
||||
max_vq = vq;
|
||||
|
||||
if (max_vq > sve_vq_from_vl(kvm_sve_max_vl))
|
||||
return -EINVAL;
|
||||
|
||||
/*
|
||||
* Vector lengths supported by the host can't currently be
|
||||
* hidden from the guest individually: instead we can only set a
|
||||
* maxmium via ZCR_EL2.LEN. So, make sure the available vector
|
||||
* lengths match the set requested exactly up to the requested
|
||||
* maximum:
|
||||
*/
|
||||
for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
|
||||
if (vq_present(&vqs, vq) != sve_vq_available(vq))
|
||||
return -EINVAL;
|
||||
|
||||
/* Can't run with no vector lengths at all: */
|
||||
if (max_vq < SVE_VQ_MIN)
|
||||
return -EINVAL;
|
||||
|
||||
/* vcpu->arch.sve_state will be alloc'd by kvm_vcpu_finalize_sve() */
|
||||
vcpu->arch.sve_max_vl = sve_vl_from_vq(max_vq);
|
||||
|
||||
return 0;
|
||||
}
|
||||
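set_sve_vls() above is the kernel side of per-vcpu SVE configuration. From userspace the expected order of operations is roughly: request the SVE feature at KVM_ARM_VCPU_INIT, optionally write the KVM_REG_ARM64_SVE_VLS pseudo-register to restrict the vector lengths, then issue KVM_ARM_VCPU_FINALIZE. A hedged sketch, with error handling trimmed and vcpu_fd assumed to be an SVE-enabled vcpu fd:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Restrict the guest to the vector lengths encoded in vqs[], then finalize SVE. */
static int configure_sve(int vcpu_fd, const uint64_t vqs[KVM_ARM64_SVE_VLS_WORDS])
{
	struct kvm_one_reg vls = {
		.id   = KVM_REG_ARM64_SVE_VLS,
		.addr = (uint64_t)(unsigned long)vqs,
	};
	int feature = KVM_ARM_VCPU_SVE;

	if (ioctl(vcpu_fd, KVM_SET_ONE_REG, &vls) < 0)
		return -1;

	/* After this point KVM_REG_ARM64_SVE_VLS can no longer be changed. */
	return ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature);
}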
|
||||
#define SVE_REG_SLICE_SHIFT 0
|
||||
#define SVE_REG_SLICE_BITS 5
|
||||
#define SVE_REG_ID_SHIFT (SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
|
||||
#define SVE_REG_ID_BITS 5
|
||||
|
||||
#define SVE_REG_SLICE_MASK \
|
||||
GENMASK(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS - 1, \
|
||||
SVE_REG_SLICE_SHIFT)
|
||||
#define SVE_REG_ID_MASK \
|
||||
GENMASK(SVE_REG_ID_SHIFT + SVE_REG_ID_BITS - 1, SVE_REG_ID_SHIFT)
|
||||
|
||||
#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
|
||||
|
||||
#define KVM_SVE_ZREG_SIZE KVM_REG_SIZE(KVM_REG_ARM64_SVE_ZREG(0, 0))
|
||||
#define KVM_SVE_PREG_SIZE KVM_REG_SIZE(KVM_REG_ARM64_SVE_PREG(0, 0))
|
||||
|
||||
/*
|
||||
* Number of register slices required to cover each whole SVE register.
|
||||
* NOTE: Only the first slice every exists, for now.
|
||||
* If you are tempted to modify this, you must also rework sve_reg_to_region()
|
||||
* to match:
|
||||
*/
|
||||
#define vcpu_sve_slices(vcpu) 1
|
||||
|
||||
/* Bounds of a single SVE register slice within vcpu->arch.sve_state */
|
||||
struct sve_state_reg_region {
|
||||
unsigned int koffset; /* offset into sve_state in kernel memory */
|
||||
unsigned int klen; /* length in kernel memory */
|
||||
unsigned int upad; /* extra trailing padding in user memory */
|
||||
};
|
||||
|
||||
/*
|
||||
* Validate SVE register ID and get sanitised bounds for user/kernel SVE
|
||||
* register copy
|
||||
*/
|
||||
static int sve_reg_to_region(struct sve_state_reg_region *region,
|
||||
struct kvm_vcpu *vcpu,
|
||||
const struct kvm_one_reg *reg)
|
||||
{
|
||||
/* reg ID ranges for Z- registers */
|
||||
const u64 zreg_id_min = KVM_REG_ARM64_SVE_ZREG(0, 0);
|
||||
const u64 zreg_id_max = KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
|
||||
SVE_NUM_SLICES - 1);
|
||||
|
||||
/* reg ID ranges for P- registers and FFR (which are contiguous) */
|
||||
const u64 preg_id_min = KVM_REG_ARM64_SVE_PREG(0, 0);
|
||||
const u64 preg_id_max = KVM_REG_ARM64_SVE_FFR(SVE_NUM_SLICES - 1);
|
||||
|
||||
unsigned int vq;
|
||||
unsigned int reg_num;
|
||||
|
||||
unsigned int reqoffset, reqlen; /* User-requested offset and length */
|
||||
unsigned int maxlen; /* Maxmimum permitted length */
|
||||
|
||||
size_t sve_state_size;
|
||||
|
||||
const u64 last_preg_id = KVM_REG_ARM64_SVE_PREG(SVE_NUM_PREGS - 1,
|
||||
SVE_NUM_SLICES - 1);
|
||||
|
||||
/* Verify that the P-regs and FFR really do have contiguous IDs: */
|
||||
BUILD_BUG_ON(KVM_REG_ARM64_SVE_FFR(0) != last_preg_id + 1);
|
||||
|
||||
/* Verify that we match the UAPI header: */
|
||||
BUILD_BUG_ON(SVE_NUM_SLICES != KVM_ARM64_SVE_MAX_SLICES);
|
||||
|
||||
reg_num = (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
|
||||
|
||||
if (reg->id >= zreg_id_min && reg->id <= zreg_id_max) {
|
||||
if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0)
|
||||
return -ENOENT;
|
||||
|
||||
vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
|
||||
|
||||
reqoffset = SVE_SIG_ZREG_OFFSET(vq, reg_num) -
|
||||
SVE_SIG_REGS_OFFSET;
|
||||
reqlen = KVM_SVE_ZREG_SIZE;
|
||||
maxlen = SVE_SIG_ZREG_SIZE(vq);
|
||||
} else if (reg->id >= preg_id_min && reg->id <= preg_id_max) {
|
||||
if (!vcpu_has_sve(vcpu) || (reg->id & SVE_REG_SLICE_MASK) > 0)
|
||||
return -ENOENT;
|
||||
|
||||
vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
|
||||
|
||||
reqoffset = SVE_SIG_PREG_OFFSET(vq, reg_num) -
|
||||
SVE_SIG_REGS_OFFSET;
|
||||
reqlen = KVM_SVE_PREG_SIZE;
|
||||
maxlen = SVE_SIG_PREG_SIZE(vq);
|
||||
} else {
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
sve_state_size = vcpu_sve_state_size(vcpu);
|
||||
if (WARN_ON(!sve_state_size))
|
||||
return -EINVAL;
|
||||
|
||||
region->koffset = array_index_nospec(reqoffset, sve_state_size);
|
||||
region->klen = min(maxlen, reqlen);
|
||||
region->upad = reqlen - region->klen;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
{
|
||||
int ret;
|
||||
struct sve_state_reg_region region;
|
||||
char __user *uptr = (char __user *)reg->addr;
|
||||
|
||||
/* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */
|
||||
if (reg->id == KVM_REG_ARM64_SVE_VLS)
|
||||
return get_sve_vls(vcpu, reg);
|
||||
|
||||
/* Try to interpret reg ID as an architectural SVE register... */
|
||||
ret = sve_reg_to_region(&region, vcpu, reg);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (!kvm_arm_vcpu_sve_finalized(vcpu))
|
||||
return -EPERM;
|
||||
|
||||
if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset,
|
||||
region.klen) ||
|
||||
clear_user(uptr + region.klen, region.upad))
|
||||
return -EFAULT;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
{
|
||||
int ret;
|
||||
struct sve_state_reg_region region;
|
||||
const char __user *uptr = (const char __user *)reg->addr;
|
||||
|
||||
/* Handle the KVM_REG_ARM64_SVE_VLS pseudo-reg as a special case: */
|
||||
if (reg->id == KVM_REG_ARM64_SVE_VLS)
|
||||
return set_sve_vls(vcpu, reg);
|
||||
|
||||
/* Try to interpret reg ID as an architectural SVE register... */
|
||||
ret = sve_reg_to_region(&region, vcpu, reg);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (!kvm_arm_vcpu_sve_finalized(vcpu))
|
||||
return -EPERM;
|
||||
|
||||
if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr,
|
||||
region.klen))
|
||||
return -EFAULT;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
|
||||
{
|
||||
return -EINVAL;
|
||||
@ -193,9 +448,37 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
static unsigned long num_core_regs(void)
|
||||
static int copy_core_reg_indices(const struct kvm_vcpu *vcpu,
|
||||
u64 __user *uindices)
|
||||
{
|
||||
return sizeof(struct kvm_regs) / sizeof(__u32);
|
||||
unsigned int i;
|
||||
int n = 0;
|
||||
const u64 core_reg = KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE;
|
||||
|
||||
for (i = 0; i < sizeof(struct kvm_regs) / sizeof(__u32); i++) {
|
||||
/*
|
||||
* The KVM_REG_ARM64_SVE regs must be used instead of
|
||||
* KVM_REG_ARM_CORE for accessing the FPSIMD V-registers on
|
||||
* SVE-enabled vcpus:
|
||||
*/
|
||||
if (vcpu_has_sve(vcpu) && core_reg_offset_is_vreg(i))
|
||||
continue;
|
||||
|
||||
if (uindices) {
|
||||
if (put_user(core_reg | i, uindices))
|
||||
return -EFAULT;
|
||||
uindices++;
|
||||
}
|
||||
|
||||
n++;
|
||||
}
|
||||
|
||||
return n;
|
||||
}
|
||||
|
||||
static unsigned long num_core_regs(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return copy_core_reg_indices(vcpu, NULL);
|
||||
}
|
||||
|
||||
/**
|
||||
@ -251,6 +534,67 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
return copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)) ? -EFAULT : 0;
|
||||
}
|
||||
|
||||
static unsigned long num_sve_regs(const struct kvm_vcpu *vcpu)
|
||||
{
|
||||
const unsigned int slices = vcpu_sve_slices(vcpu);
|
||||
|
||||
if (!vcpu_has_sve(vcpu))
|
||||
return 0;
|
||||
|
||||
/* Policed by KVM_GET_REG_LIST: */
|
||||
WARN_ON(!kvm_arm_vcpu_sve_finalized(vcpu));
|
||||
|
||||
return slices * (SVE_NUM_PREGS + SVE_NUM_ZREGS + 1 /* FFR */)
|
||||
+ 1; /* KVM_REG_ARM64_SVE_VLS */
|
||||
}
|
||||
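For a single slice this arithmetic works out to 32 Z-registers + 16 P-registers + 1 FFR + 1 VLS pseudo-register = 50 SVE register IDs reported per SVE-enabled vcpu, since SVE_NUM_ZREGS is 32 and SVE_NUM_PREGS is 16 in the SVE UAPI headers.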
|
||||
static int copy_sve_reg_indices(const struct kvm_vcpu *vcpu,
|
||||
u64 __user *uindices)
|
||||
{
|
||||
const unsigned int slices = vcpu_sve_slices(vcpu);
|
||||
u64 reg;
|
||||
unsigned int i, n;
|
||||
int num_regs = 0;
|
||||
|
||||
if (!vcpu_has_sve(vcpu))
|
||||
return 0;
|
||||
|
||||
/* Policed by KVM_GET_REG_LIST: */
|
||||
WARN_ON(!kvm_arm_vcpu_sve_finalized(vcpu));
|
||||
|
||||
/*
|
||||
* Enumerate this first, so that userspace can save/restore in
|
||||
* the order reported by KVM_GET_REG_LIST:
|
||||
*/
|
||||
reg = KVM_REG_ARM64_SVE_VLS;
|
||||
if (put_user(reg, uindices++))
|
||||
return -EFAULT;
|
||||
++num_regs;
|
||||
|
||||
for (i = 0; i < slices; i++) {
|
||||
for (n = 0; n < SVE_NUM_ZREGS; n++) {
|
||||
reg = KVM_REG_ARM64_SVE_ZREG(n, i);
|
||||
if (put_user(reg, uindices++))
|
||||
return -EFAULT;
|
||||
num_regs++;
|
||||
}
|
||||
|
||||
for (n = 0; n < SVE_NUM_PREGS; n++) {
|
||||
reg = KVM_REG_ARM64_SVE_PREG(n, i);
|
||||
if (put_user(reg, uindices++))
|
||||
return -EFAULT;
|
||||
num_regs++;
|
||||
}
|
||||
|
||||
reg = KVM_REG_ARM64_SVE_FFR(i);
|
||||
if (put_user(reg, uindices++))
|
||||
return -EFAULT;
|
||||
num_regs++;
|
||||
}
|
||||
|
||||
return num_regs;
|
||||
}
|
||||
|
||||
/**
|
||||
* kvm_arm_num_regs - how many registers do we present via KVM_GET_ONE_REG
|
||||
*
|
||||
@ -258,8 +602,15 @@ static int get_timer_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
*/
|
||||
unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return num_core_regs() + kvm_arm_num_sys_reg_descs(vcpu)
|
||||
+ kvm_arm_get_fw_num_regs(vcpu) + NUM_TIMER_REGS;
|
||||
unsigned long res = 0;
|
||||
|
||||
res += num_core_regs(vcpu);
|
||||
res += num_sve_regs(vcpu);
|
||||
res += kvm_arm_num_sys_reg_descs(vcpu);
|
||||
res += kvm_arm_get_fw_num_regs(vcpu);
|
||||
res += NUM_TIMER_REGS;
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
/**
|
||||
@ -269,23 +620,25 @@ unsigned long kvm_arm_num_regs(struct kvm_vcpu *vcpu)
|
||||
*/
|
||||
int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
|
||||
{
|
||||
unsigned int i;
|
||||
const u64 core_reg = KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM_CORE;
|
||||
int ret;
|
||||
|
||||
for (i = 0; i < sizeof(struct kvm_regs) / sizeof(__u32); i++) {
|
||||
if (put_user(core_reg | i, uindices))
|
||||
return -EFAULT;
|
||||
uindices++;
|
||||
}
|
||||
ret = copy_core_reg_indices(vcpu, uindices);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
uindices += ret;
|
||||
|
||||
ret = copy_sve_reg_indices(vcpu, uindices);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
uindices += ret;
|
||||
|
||||
ret = kvm_arm_copy_fw_reg_indices(vcpu, uindices);
|
||||
if (ret)
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
uindices += kvm_arm_get_fw_num_regs(vcpu);
|
||||
|
||||
ret = copy_timer_indices(vcpu, uindices);
|
||||
if (ret)
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
uindices += NUM_TIMER_REGS;
|
||||
|
||||
@ -298,12 +651,11 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
|
||||
return -EINVAL;
|
||||
|
||||
/* Register group 16 means we want a core register. */
|
||||
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
|
||||
return get_core_reg(vcpu, reg);
|
||||
|
||||
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
|
||||
return kvm_arm_get_fw_reg(vcpu, reg);
|
||||
switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
|
||||
case KVM_REG_ARM_CORE: return get_core_reg(vcpu, reg);
|
||||
case KVM_REG_ARM_FW: return kvm_arm_get_fw_reg(vcpu, reg);
|
||||
case KVM_REG_ARM64_SVE: return get_sve_reg(vcpu, reg);
|
||||
}
|
||||
|
||||
if (is_timer_reg(reg->id))
|
||||
return get_timer_reg(vcpu, reg);
|
||||
@ -317,12 +669,11 @@ int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
|
||||
if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
|
||||
return -EINVAL;
|
||||
|
||||
/* Register group 16 means we set a core register. */
|
||||
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_CORE)
|
||||
return set_core_reg(vcpu, reg);
|
||||
|
||||
if ((reg->id & KVM_REG_ARM_COPROC_MASK) == KVM_REG_ARM_FW)
|
||||
return kvm_arm_set_fw_reg(vcpu, reg);
|
||||
switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
|
||||
case KVM_REG_ARM_CORE: return set_core_reg(vcpu, reg);
|
||||
case KVM_REG_ARM_FW: return kvm_arm_set_fw_reg(vcpu, reg);
|
||||
case KVM_REG_ARM64_SVE: return set_sve_reg(vcpu, reg);
|
||||
}
|
||||
|
||||
if (is_timer_reg(reg->id))
|
||||
return set_timer_reg(vcpu, reg);
|
||||
|
@@ -173,20 +173,40 @@ static int handle_sve(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
return 1;
|
||||
}
|
||||
|
||||
#define __ptrauth_save_key(regs, key) \
|
||||
({ \
|
||||
regs[key ## KEYLO_EL1] = read_sysreg_s(SYS_ ## key ## KEYLO_EL1); \
|
||||
regs[key ## KEYHI_EL1] = read_sysreg_s(SYS_ ## key ## KEYHI_EL1); \
|
||||
})
|
||||
|
||||
/*
|
||||
* Handle the guest trying to use a ptrauth instruction, or trying to access a
|
||||
* ptrauth register.
|
||||
*/
|
||||
void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *ctxt;
|
||||
|
||||
if (vcpu_has_ptrauth(vcpu)) {
|
||||
vcpu_ptrauth_enable(vcpu);
|
||||
ctxt = vcpu->arch.host_cpu_context;
|
||||
__ptrauth_save_key(ctxt->sys_regs, APIA);
|
||||
__ptrauth_save_key(ctxt->sys_regs, APIB);
|
||||
__ptrauth_save_key(ctxt->sys_regs, APDA);
|
||||
__ptrauth_save_key(ctxt->sys_regs, APDB);
|
||||
__ptrauth_save_key(ctxt->sys_regs, APGA);
|
||||
} else {
|
||||
kvm_inject_undefined(vcpu);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Guest usage of a ptrauth instruction (which the guest EL1 did not turn into
|
||||
* a NOP).
|
||||
*/
|
||||
static int kvm_handle_ptrauth(struct kvm_vcpu *vcpu, struct kvm_run *run)
|
||||
{
|
||||
/*
|
||||
* We don't currently support ptrauth in a guest, and we mask the ID
|
||||
* registers to prevent well-behaved guests from trying to make use of
|
||||
* it.
|
||||
*
|
||||
* Inject an UNDEF, as if the feature really isn't present.
|
||||
*/
|
||||
kvm_inject_undefined(vcpu);
|
||||
kvm_arm_vcpu_ptrauth_trap(vcpu);
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
@@ -24,6 +24,7 @@
|
||||
#include <asm/kvm_arm.h>
|
||||
#include <asm/kvm_asm.h>
|
||||
#include <asm/kvm_mmu.h>
|
||||
#include <asm/kvm_ptrauth.h>
|
||||
|
||||
#define CPU_GP_REG_OFFSET(x) (CPU_GP_REGS + x)
|
||||
#define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
|
||||
@ -64,6 +65,13 @@ ENTRY(__guest_enter)
|
||||
|
||||
add x18, x0, #VCPU_CONTEXT
|
||||
|
||||
// Macro ptrauth_switch_to_guest format:
|
||||
// ptrauth_switch_to_guest(guest cxt, tmp1, tmp2, tmp3)
|
||||
// The below macro to restore guest keys is not implemented in C code
|
||||
// as it may cause Pointer Authentication key signing mismatch errors
|
||||
// when this feature is enabled for kernel code.
|
||||
ptrauth_switch_to_guest x18, x0, x1, x2
|
||||
|
||||
// Restore guest regs x0-x17
|
||||
ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)]
|
||||
ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)]
|
||||
@ -118,6 +126,13 @@ ENTRY(__guest_exit)
|
||||
|
||||
get_host_ctxt x2, x3
|
||||
|
||||
// Macro ptrauth_switch_to_guest format:
|
||||
// ptrauth_switch_to_host(guest cxt, host cxt, tmp1, tmp2, tmp3)
|
||||
// The below macro to save/restore keys is not implemented in C code
|
||||
// as it may cause Pointer Authentication key signing mismatch errors
|
||||
// when this feature is enabled for kernel code.
|
||||
ptrauth_switch_to_host x1, x2, x3, x4, x5
|
||||
|
||||
// Now restore the host regs
|
||||
restore_callee_saved_regs x2
|
||||
|
||||
|
@@ -100,7 +100,10 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
|
||||
val = read_sysreg(cpacr_el1);
|
||||
val |= CPACR_EL1_TTA;
|
||||
val &= ~CPACR_EL1_ZEN;
|
||||
if (!update_fp_enabled(vcpu)) {
|
||||
if (update_fp_enabled(vcpu)) {
|
||||
if (vcpu_has_sve(vcpu))
|
||||
val |= CPACR_EL1_ZEN;
|
||||
} else {
|
||||
val &= ~CPACR_EL1_FPEN;
|
||||
__activate_traps_fpsimd32(vcpu);
|
||||
}
|
||||
@ -317,16 +320,48 @@ static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
|
||||
return true;
|
||||
}
|
||||
|
||||
static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
|
||||
/* Check for an FPSIMD/SVE trap and handle as appropriate */
|
||||
static bool __hyp_text __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct user_fpsimd_state *host_fpsimd = vcpu->arch.host_fpsimd_state;
|
||||
bool vhe, sve_guest, sve_host;
|
||||
u8 hsr_ec;
|
||||
|
||||
if (has_vhe())
|
||||
write_sysreg(read_sysreg(cpacr_el1) | CPACR_EL1_FPEN,
|
||||
cpacr_el1);
|
||||
else
|
||||
if (!system_supports_fpsimd())
|
||||
return false;
|
||||
|
||||
if (system_supports_sve()) {
|
||||
sve_guest = vcpu_has_sve(vcpu);
|
||||
sve_host = vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE;
|
||||
vhe = true;
|
||||
} else {
|
||||
sve_guest = false;
|
||||
sve_host = false;
|
||||
vhe = has_vhe();
|
||||
}
|
||||
|
||||
hsr_ec = kvm_vcpu_trap_get_class(vcpu);
|
||||
if (hsr_ec != ESR_ELx_EC_FP_ASIMD &&
|
||||
hsr_ec != ESR_ELx_EC_SVE)
|
||||
return false;
|
||||
|
||||
/* Don't handle SVE traps for non-SVE vcpus here: */
|
||||
if (!sve_guest)
|
||||
if (hsr_ec != ESR_ELx_EC_FP_ASIMD)
|
||||
return false;
|
||||
|
||||
/* Valid trap. Switch the context: */
|
||||
|
||||
if (vhe) {
|
||||
u64 reg = read_sysreg(cpacr_el1) | CPACR_EL1_FPEN;
|
||||
|
||||
if (sve_guest)
|
||||
reg |= CPACR_EL1_ZEN;
|
||||
|
||||
write_sysreg(reg, cpacr_el1);
|
||||
} else {
|
||||
write_sysreg(read_sysreg(cptr_el2) & ~(u64)CPTR_EL2_TFP,
|
||||
cptr_el2);
|
||||
}
|
||||
|
||||
isb();
|
||||
|
||||
@ -335,21 +370,28 @@ static bool __hyp_text __hyp_switch_fpsimd(struct kvm_vcpu *vcpu)
|
||||
* In the SVE case, VHE is assumed: it is enforced by
|
||||
* Kconfig and kvm_arch_init().
|
||||
*/
|
||||
if (system_supports_sve() &&
|
||||
(vcpu->arch.flags & KVM_ARM64_HOST_SVE_IN_USE)) {
|
||||
if (sve_host) {
|
||||
struct thread_struct *thread = container_of(
|
||||
host_fpsimd,
|
||||
vcpu->arch.host_fpsimd_state,
|
||||
struct thread_struct, uw.fpsimd_state);
|
||||
|
||||
sve_save_state(sve_pffr(thread), &host_fpsimd->fpsr);
|
||||
sve_save_state(sve_pffr(thread),
|
||||
&vcpu->arch.host_fpsimd_state->fpsr);
|
||||
} else {
|
||||
__fpsimd_save_state(host_fpsimd);
|
||||
__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
|
||||
}
|
||||
|
||||
vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
|
||||
}
|
||||
|
||||
__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
|
||||
if (sve_guest) {
|
||||
sve_load_state(vcpu_sve_pffr(vcpu),
|
||||
&vcpu->arch.ctxt.gp_regs.fp_regs.fpsr,
|
||||
sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1);
|
||||
write_sysreg_s(vcpu->arch.ctxt.sys_regs[ZCR_EL1], SYS_ZCR_EL12);
|
||||
} else {
|
||||
__fpsimd_restore_state(&vcpu->arch.ctxt.gp_regs.fp_regs);
|
||||
}
|
||||
|
||||
/* Skip restoring fpexc32 for AArch64 guests */
|
||||
if (!(read_sysreg(hcr_el2) & HCR_RW))
|
||||
@ -385,10 +427,10 @@ static bool __hyp_text fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
|
||||
* and restore the guest context lazily.
|
||||
* If FP/SIMD is not implemented, handle the trap and inject an
|
||||
* undefined instruction exception to the guest.
|
||||
* Similarly for trapped SVE accesses.
|
||||
*/
|
||||
if (system_supports_fpsimd() &&
|
||||
kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_FP_ASIMD)
|
||||
return __hyp_switch_fpsimd(vcpu);
|
||||
if (__hyp_handle_fpsimd(vcpu))
|
||||
return true;
|
||||
|
||||
if (!__populate_fault_info(vcpu))
|
||||
return true;
|
||||
@ -524,6 +566,7 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *host_ctxt;
|
||||
struct kvm_cpu_context *guest_ctxt;
|
||||
bool pmu_switch_needed;
|
||||
u64 exit_code;
|
||||
|
||||
/*
|
||||
@ -543,6 +586,8 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
|
||||
host_ctxt->__hyp_running_vcpu = vcpu;
|
||||
guest_ctxt = &vcpu->arch.ctxt;
|
||||
|
||||
pmu_switch_needed = __pmu_switch_to_guest(host_ctxt);
|
||||
|
||||
__sysreg_save_state_nvhe(host_ctxt);
|
||||
|
||||
__activate_vm(kern_hyp_va(vcpu->kvm));
|
||||
@ -589,6 +634,9 @@ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
|
||||
*/
|
||||
__debug_switch_to_host(vcpu);
|
||||
|
||||
if (pmu_switch_needed)
|
||||
__pmu_switch_to_host(host_ctxt);
|
||||
|
||||
/* Returning to host will clear PSR.I, remask PMR if needed */
|
||||
if (system_uses_irq_prio_masking())
|
||||
gic_write_pmr(GIC_PRIO_IRQOFF);
|
||||
|
239
arch/arm64/kvm/pmu.c
Normal file
@@ -0,0 +1,239 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
/*
|
||||
* Copyright 2019 Arm Limited
|
||||
* Author: Andrew Murray <Andrew.Murray@arm.com>
|
||||
*/
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/perf_event.h>
|
||||
#include <asm/kvm_hyp.h>
|
||||
|
||||
/*
|
||||
* Given the perf event attributes and system type, determine
|
||||
* if we are going to need to switch counters at guest entry/exit.
|
||||
*/
|
||||
static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
|
||||
{
|
||||
/**
|
||||
* With VHE the guest kernel runs at EL1 and the host at EL2,
|
||||
* where user (EL0) is excluded then we have no reason to switch
|
||||
* counters.
|
||||
*/
|
||||
if (has_vhe() && attr->exclude_user)
|
||||
return false;
|
||||
|
||||
/* Only switch if attributes are different */
|
||||
return (attr->exclude_host != attr->exclude_guest);
|
||||
}
|
||||
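As a concrete illustration of the rule implemented by kvm_pmu_switch_needed() above, counters only need to be flipped at guest entry/exit when the host and guest exclude attributes differ (and, on VHE, only if EL0 is not excluded anyway). A hedged userspace-style sketch of one attribute setup that does require switching, using only standard perf_event_attr fields:

#include <string.h>
#include <linux/perf_event.h>

/* Count CPU cycles in the guest only: exclude_host != exclude_guest, so the
 * arm_pmu driver leaves enabling/disabling to the KVM entry/exit switch code. */
static void guest_only_cycles(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type          = PERF_TYPE_HARDWARE;
	attr->size          = sizeof(*attr);
	attr->config        = PERF_COUNT_HW_CPU_CYCLES;
	attr->exclude_host  = 1;	/* do not count while the host runs */
	attr->exclude_guest = 0;	/* do count while the guest runs    */
}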
|
||||
/*
|
||||
* Add events to track that we may want to switch at guest entry/exit
|
||||
* time.
|
||||
*/
|
||||
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
|
||||
{
|
||||
struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
|
||||
|
||||
if (!kvm_pmu_switch_needed(attr))
|
||||
return;
|
||||
|
||||
if (!attr->exclude_host)
|
||||
ctx->pmu_events.events_host |= set;
|
||||
if (!attr->exclude_guest)
|
||||
ctx->pmu_events.events_guest |= set;
|
||||
}
|
||||
|
||||
/*
|
||||
* Stop tracking events
|
||||
*/
|
||||
void kvm_clr_pmu_events(u32 clr)
|
||||
{
|
||||
struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
|
||||
|
||||
ctx->pmu_events.events_host &= ~clr;
|
||||
ctx->pmu_events.events_guest &= ~clr;
|
||||
}
|
||||
|
||||
/**
|
||||
* Disable host events, enable guest events
|
||||
*/
|
||||
bool __hyp_text __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt)
|
||||
{
|
||||
struct kvm_host_data *host;
|
||||
struct kvm_pmu_events *pmu;
|
||||
|
||||
host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
|
||||
pmu = &host->pmu_events;
|
||||
|
||||
if (pmu->events_host)
|
||||
write_sysreg(pmu->events_host, pmcntenclr_el0);
|
||||
|
||||
if (pmu->events_guest)
|
||||
write_sysreg(pmu->events_guest, pmcntenset_el0);
|
||||
|
||||
return (pmu->events_host || pmu->events_guest);
|
||||
}
|
||||
|
||||
/**
|
||||
* Disable guest events, enable host events
|
||||
*/
|
||||
void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
|
||||
{
|
||||
struct kvm_host_data *host;
|
||||
struct kvm_pmu_events *pmu;
|
||||
|
||||
host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
|
||||
pmu = &host->pmu_events;
|
||||
|
||||
if (pmu->events_guest)
|
||||
write_sysreg(pmu->events_guest, pmcntenclr_el0);
|
||||
|
||||
if (pmu->events_host)
|
||||
write_sysreg(pmu->events_host, pmcntenset_el0);
|
||||
}
|
||||
|
||||
#define PMEVTYPER_READ_CASE(idx) \
|
||||
case idx: \
|
||||
return read_sysreg(pmevtyper##idx##_el0)
|
||||
|
||||
#define PMEVTYPER_WRITE_CASE(idx) \
|
||||
case idx: \
|
||||
write_sysreg(val, pmevtyper##idx##_el0); \
|
||||
break
|
||||
|
||||
#define PMEVTYPER_CASES(readwrite) \
|
||||
PMEVTYPER_##readwrite##_CASE(0); \
|
||||
PMEVTYPER_##readwrite##_CASE(1); \
|
||||
PMEVTYPER_##readwrite##_CASE(2); \
|
||||
PMEVTYPER_##readwrite##_CASE(3); \
|
||||
PMEVTYPER_##readwrite##_CASE(4); \
|
||||
PMEVTYPER_##readwrite##_CASE(5); \
|
||||
PMEVTYPER_##readwrite##_CASE(6); \
|
||||
PMEVTYPER_##readwrite##_CASE(7); \
|
||||
PMEVTYPER_##readwrite##_CASE(8); \
|
||||
PMEVTYPER_##readwrite##_CASE(9); \
|
||||
PMEVTYPER_##readwrite##_CASE(10); \
|
||||
PMEVTYPER_##readwrite##_CASE(11); \
|
||||
PMEVTYPER_##readwrite##_CASE(12); \
|
||||
PMEVTYPER_##readwrite##_CASE(13); \
|
||||
PMEVTYPER_##readwrite##_CASE(14); \
|
||||
PMEVTYPER_##readwrite##_CASE(15); \
|
||||
PMEVTYPER_##readwrite##_CASE(16); \
|
||||
PMEVTYPER_##readwrite##_CASE(17); \
|
||||
PMEVTYPER_##readwrite##_CASE(18); \
|
||||
PMEVTYPER_##readwrite##_CASE(19); \
|
||||
PMEVTYPER_##readwrite##_CASE(20); \
|
||||
PMEVTYPER_##readwrite##_CASE(21); \
|
||||
PMEVTYPER_##readwrite##_CASE(22); \
|
||||
PMEVTYPER_##readwrite##_CASE(23); \
|
||||
PMEVTYPER_##readwrite##_CASE(24); \
|
||||
PMEVTYPER_##readwrite##_CASE(25); \
|
||||
PMEVTYPER_##readwrite##_CASE(26); \
|
||||
PMEVTYPER_##readwrite##_CASE(27); \
|
||||
PMEVTYPER_##readwrite##_CASE(28); \
|
||||
PMEVTYPER_##readwrite##_CASE(29); \
|
||||
PMEVTYPER_##readwrite##_CASE(30)
|
||||
|
||||
/*
|
||||
* Read a value direct from PMEVTYPER<idx> where idx is 0-30
|
||||
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
|
||||
*/
|
||||
static u64 kvm_vcpu_pmu_read_evtype_direct(int idx)
|
||||
{
|
||||
switch (idx) {
|
||||
PMEVTYPER_CASES(READ);
|
||||
case ARMV8_PMU_CYCLE_IDX:
|
||||
return read_sysreg(pmccfiltr_el0);
|
||||
default:
|
||||
WARN_ON(1);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Write a value direct to PMEVTYPER<idx> where idx is 0-30
|
||||
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
|
||||
*/
|
||||
static void kvm_vcpu_pmu_write_evtype_direct(int idx, u32 val)
|
||||
{
|
||||
switch (idx) {
|
||||
PMEVTYPER_CASES(WRITE);
|
||||
case ARMV8_PMU_CYCLE_IDX:
|
||||
write_sysreg(val, pmccfiltr_el0);
|
||||
break;
|
||||
default:
|
||||
WARN_ON(1);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Modify ARMv8 PMU events to include EL0 counting
|
||||
*/
|
||||
static void kvm_vcpu_pmu_enable_el0(unsigned long events)
|
||||
{
|
||||
u64 typer;
|
||||
u32 counter;
|
||||
|
||||
for_each_set_bit(counter, &events, 32) {
|
||||
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
|
||||
typer &= ~ARMV8_PMU_EXCLUDE_EL0;
|
||||
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Modify ARMv8 PMU events to exclude EL0 counting
|
||||
*/
|
||||
static void kvm_vcpu_pmu_disable_el0(unsigned long events)
|
||||
{
|
||||
u64 typer;
|
||||
u32 counter;
|
||||
|
||||
for_each_set_bit(counter, &events, 32) {
|
||||
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
|
||||
typer |= ARMV8_PMU_EXCLUDE_EL0;
|
||||
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* On VHE ensure that only guest events have EL0 counting enabled
|
||||
*/
|
||||
void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *host_ctxt;
|
||||
struct kvm_host_data *host;
|
||||
u32 events_guest, events_host;
|
||||
|
||||
if (!has_vhe())
|
||||
return;
|
||||
|
||||
host_ctxt = vcpu->arch.host_cpu_context;
|
||||
host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
|
||||
events_guest = host->pmu_events.events_guest;
|
||||
events_host = host->pmu_events.events_host;
|
||||
|
||||
kvm_vcpu_pmu_enable_el0(events_guest);
|
||||
kvm_vcpu_pmu_disable_el0(events_host);
|
||||
}
|
||||
|
||||
/*
|
||||
* On VHE ensure that only host events have EL0 counting enabled
|
||||
*/
|
||||
void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpu_context *host_ctxt;
|
||||
struct kvm_host_data *host;
|
||||
u32 events_guest, events_host;
|
||||
|
||||
if (!has_vhe())
|
||||
return;
|
||||
|
||||
host_ctxt = vcpu->arch.host_cpu_context;
|
||||
host = container_of(host_ctxt, struct kvm_host_data, host_ctxt);
|
||||
events_guest = host->pmu_events.events_guest;
|
||||
events_host = host->pmu_events.events_host;
|
||||
|
||||
kvm_vcpu_pmu_enable_el0(events_host);
|
||||
kvm_vcpu_pmu_disable_el0(events_guest);
|
||||
}
|
@@ -20,20 +20,26 @@
|
||||
*/
|
||||
|
||||
#include <linux/errno.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/kvm.h>
|
||||
#include <linux/hw_breakpoint.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
#include <kvm/arm_arch_timer.h>
|
||||
|
||||
#include <asm/cpufeature.h>
|
||||
#include <asm/cputype.h>
|
||||
#include <asm/fpsimd.h>
|
||||
#include <asm/ptrace.h>
|
||||
#include <asm/kvm_arm.h>
|
||||
#include <asm/kvm_asm.h>
|
||||
#include <asm/kvm_coproc.h>
|
||||
#include <asm/kvm_emulate.h>
|
||||
#include <asm/kvm_mmu.h>
|
||||
#include <asm/virt.h>
|
||||
|
||||
/* Maximum phys_shift supported for any VM on this host */
|
||||
static u32 kvm_ipa_limit;
|
||||
@ -92,6 +98,14 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
case KVM_CAP_ARM_VM_IPA_SIZE:
|
||||
r = kvm_ipa_limit;
|
||||
break;
|
||||
case KVM_CAP_ARM_SVE:
|
||||
r = system_supports_sve();
|
||||
break;
|
||||
case KVM_CAP_ARM_PTRAUTH_ADDRESS:
|
||||
case KVM_CAP_ARM_PTRAUTH_GENERIC:
|
||||
r = has_vhe() && system_supports_address_auth() &&
|
||||
system_supports_generic_auth();
|
||||
break;
|
||||
default:
|
||||
r = 0;
|
||||
}
|
||||
@ -99,13 +113,148 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
return r;
|
||||
}
|
||||
|
||||
unsigned int kvm_sve_max_vl;
|
||||
|
||||
int kvm_arm_init_sve(void)
|
||||
{
|
||||
if (system_supports_sve()) {
|
||||
kvm_sve_max_vl = sve_max_virtualisable_vl;
|
||||
|
||||
/*
|
||||
* The get_sve_reg()/set_sve_reg() ioctl interface will need
|
||||
* to be extended with multiple register slice support in
|
||||
* order to support vector lengths greater than
|
||||
* SVE_VL_ARCH_MAX:
|
||||
*/
|
||||
if (WARN_ON(kvm_sve_max_vl > SVE_VL_ARCH_MAX))
|
||||
kvm_sve_max_vl = SVE_VL_ARCH_MAX;
|
||||
|
||||
/*
|
||||
* Don't even try to make use of vector lengths that
|
||||
* aren't available on all CPUs, for now:
|
||||
*/
|
||||
if (kvm_sve_max_vl < sve_max_vl)
|
||||
pr_warn("KVM: SVE vector length for guests limited to %u bytes\n",
|
||||
kvm_sve_max_vl);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
if (!system_supports_sve())
|
||||
return -EINVAL;
|
||||
|
||||
/* Verify that KVM startup enforced this when SVE was detected: */
|
||||
if (WARN_ON(!has_vhe()))
|
||||
return -EINVAL;
|
||||
|
||||
vcpu->arch.sve_max_vl = kvm_sve_max_vl;
|
||||
|
||||
/*
|
||||
* Userspace can still customize the vector lengths by writing
|
||||
* KVM_REG_ARM64_SVE_VLS. Allocation is deferred until
|
||||
* kvm_arm_vcpu_finalize(), which freezes the configuration.
|
||||
*/
|
||||
vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_SVE;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Finalize vcpu's maximum SVE vector length, allocating
|
||||
* vcpu->arch.sve_state as necessary.
|
||||
*/
|
||||
static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
void *buf;
|
||||
unsigned int vl;
|
||||
|
||||
vl = vcpu->arch.sve_max_vl;
|
||||
|
||||
/*
|
||||
* Resposibility for these properties is shared between
|
||||
* kvm_arm_init_arch_resources(), kvm_vcpu_enable_sve() and
|
||||
* set_sve_vls(). Double-check here just to be sure:
|
||||
*/
	if (WARN_ON(!sve_vl_valid(vl) || vl > sve_max_virtualisable_vl ||
		    vl > SVE_VL_ARCH_MAX))
		return -EIO;

	buf = kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	vcpu->arch.sve_state = buf;
	vcpu->arch.flags |= KVM_ARM64_VCPU_SVE_FINALIZED;
	return 0;
}

int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
{
	switch (feature) {
	case KVM_ARM_VCPU_SVE:
		if (!vcpu_has_sve(vcpu))
			return -EINVAL;

		if (kvm_arm_vcpu_sve_finalized(vcpu))
			return -EPERM;

		return kvm_vcpu_finalize_sve(vcpu);
	}

	return -EINVAL;
}

bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
{
	if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu))
		return false;

	return true;
}

void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
{
	kfree(vcpu->arch.sve_state);
}

static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
{
	if (vcpu_has_sve(vcpu))
		memset(vcpu->arch.sve_state, 0, vcpu_sve_state_size(vcpu));
}

static int kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
{
	/* Support ptrauth only if the system supports these capabilities. */
	if (!has_vhe())
		return -EINVAL;

	if (!system_supports_address_auth() ||
	    !system_supports_generic_auth())
		return -EINVAL;
	/*
	 * For now make sure that both address/generic pointer authentication
	 * features are requested by the userspace together.
	 */
	if (!test_bit(KVM_ARM_VCPU_PTRAUTH_ADDRESS, vcpu->arch.features) ||
	    !test_bit(KVM_ARM_VCPU_PTRAUTH_GENERIC, vcpu->arch.features))
		return -EINVAL;

	vcpu->arch.flags |= KVM_ARM64_GUEST_HAS_PTRAUTH;
	return 0;
}

/**
 * kvm_reset_vcpu - sets core registers and sys_regs to reset value
 * @vcpu: The VCPU pointer
 *
 * This function finds the right table above and sets the registers on
 * the virtual CPU struct to their architecturally defined reset
 * values.
 * values, except for registers whose reset is deferred until
 * kvm_arm_vcpu_finalize().
 *
 * Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
 * ioctl or as part of handling a request issued by another VCPU in the PSCI
@@ -131,6 +280,22 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
	if (loaded)
		kvm_arch_vcpu_put(vcpu);

	if (!kvm_arm_vcpu_sve_finalized(vcpu)) {
		if (test_bit(KVM_ARM_VCPU_SVE, vcpu->arch.features)) {
			ret = kvm_vcpu_enable_sve(vcpu);
			if (ret)
				goto out;
		}
	} else {
		kvm_vcpu_reset_sve(vcpu);
	}

	if (test_bit(KVM_ARM_VCPU_PTRAUTH_ADDRESS, vcpu->arch.features) ||
	    test_bit(KVM_ARM_VCPU_PTRAUTH_GENERIC, vcpu->arch.features)) {
		if (kvm_vcpu_enable_ptrauth(vcpu))
			goto out;
	}

	switch (vcpu->arch.target) {
	default:
		if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
@@ -695,6 +695,7 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
		val |= p->regval & ARMV8_PMU_PMCR_MASK;
		__vcpu_sys_reg(vcpu, PMCR_EL0) = val;
		kvm_pmu_handle_pmcr(vcpu, val);
		kvm_vcpu_pmu_restore_guest(vcpu);
	} else {
		/* PMCR.P & PMCR.C are RAZ */
		val = __vcpu_sys_reg(vcpu, PMCR_EL0)
@@ -850,6 +851,7 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
	if (p->is_write) {
		kvm_pmu_set_counter_event_type(vcpu, p->regval, idx);
		__vcpu_sys_reg(vcpu, reg) = p->regval & ARMV8_PMU_EVTYPE_MASK;
		kvm_vcpu_pmu_restore_guest(vcpu);
	} else {
		p->regval = __vcpu_sys_reg(vcpu, reg) & ARMV8_PMU_EVTYPE_MASK;
	}
@@ -875,6 +877,7 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
		/* accessing PMCNTENSET_EL0 */
		__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val;
		kvm_pmu_enable_counter(vcpu, val);
		kvm_vcpu_pmu_restore_guest(vcpu);
	} else {
		/* accessing PMCNTENCLR_EL0 */
		__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= ~val;
@@ -1007,6 +1010,37 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
	{ SYS_DESC(SYS_PMEVTYPERn_EL0(n)), \
	  access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), }

static bool trap_ptrauth(struct kvm_vcpu *vcpu,
			 struct sys_reg_params *p,
			 const struct sys_reg_desc *rd)
{
	kvm_arm_vcpu_ptrauth_trap(vcpu);

	/*
	 * Return false for both cases as we never skip the trapped
	 * instruction:
	 *
	 * - Either we re-execute the same key register access instruction
	 *   after enabling ptrauth.
	 * - Or an UNDEF is injected as ptrauth is not supported/enabled.
	 */
	return false;
}

static unsigned int ptrauth_visibility(const struct kvm_vcpu *vcpu,
			const struct sys_reg_desc *rd)
{
	return vcpu_has_ptrauth(vcpu) ? 0 : REG_HIDDEN_USER | REG_HIDDEN_GUEST;
}

#define __PTRAUTH_KEY(k) \
	{ SYS_DESC(SYS_## k), trap_ptrauth, reset_unknown, k, \
	.visibility = ptrauth_visibility}

#define PTRAUTH_KEY(k) \
	__PTRAUTH_KEY(k ## KEYLO_EL1), \
	__PTRAUTH_KEY(k ## KEYHI_EL1)

static bool access_arch_timer(struct kvm_vcpu *vcpu,
			      struct sys_reg_params *p,
			      const struct sys_reg_desc *r)
@@ -1044,25 +1078,20 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu,
	}

/* Read a sanitised cpufeature ID register by sys_reg_desc */
static u64 read_id_reg(struct sys_reg_desc const *r, bool raz)
static u64 read_id_reg(const struct kvm_vcpu *vcpu,
		struct sys_reg_desc const *r, bool raz)
{
	u32 id = sys_reg((u32)r->Op0, (u32)r->Op1,
			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);

	if (id == SYS_ID_AA64PFR0_EL1) {
		if (val & (0xfUL << ID_AA64PFR0_SVE_SHIFT))
			kvm_debug("SVE unsupported for guests, suppressing\n");

	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu)) {
		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
	} else if (id == SYS_ID_AA64ISAR1_EL1) {
		const u64 ptrauth_mask = (0xfUL << ID_AA64ISAR1_APA_SHIFT) |
					 (0xfUL << ID_AA64ISAR1_API_SHIFT) |
					 (0xfUL << ID_AA64ISAR1_GPA_SHIFT) |
					 (0xfUL << ID_AA64ISAR1_GPI_SHIFT);
		if (val & ptrauth_mask)
			kvm_debug("ptrauth unsupported for guests, suppressing\n");
		val &= ~ptrauth_mask;
	} else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) {
		val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) |
			 (0xfUL << ID_AA64ISAR1_API_SHIFT) |
			 (0xfUL << ID_AA64ISAR1_GPA_SHIFT) |
			 (0xfUL << ID_AA64ISAR1_GPI_SHIFT));
	}

	return val;
@@ -1078,7 +1107,7 @@ static bool __access_id_reg(struct kvm_vcpu *vcpu,
	if (p->is_write)
		return write_to_read_only(vcpu, p, r);

	p->regval = read_id_reg(r, raz);
	p->regval = read_id_reg(vcpu, r, raz);
	return true;
}

@@ -1100,6 +1129,81 @@ static int reg_from_user(u64 *val, const void __user *uaddr, u64 id);
static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
static u64 sys_reg_to_index(const struct sys_reg_desc *reg);

/* Visibility overrides for SVE-specific control registers */
static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
				   const struct sys_reg_desc *rd)
{
	if (vcpu_has_sve(vcpu))
		return 0;

	return REG_HIDDEN_USER | REG_HIDDEN_GUEST;
}

/* Visibility overrides for SVE-specific ID registers */
static unsigned int sve_id_visibility(const struct kvm_vcpu *vcpu,
				      const struct sys_reg_desc *rd)
{
	if (vcpu_has_sve(vcpu))
		return 0;

	return REG_HIDDEN_USER;
}

/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
{
	if (!vcpu_has_sve(vcpu))
		return 0;

	return read_sanitised_ftr_reg(SYS_ID_AA64ZFR0_EL1);
}

static bool access_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
				   struct sys_reg_params *p,
				   const struct sys_reg_desc *rd)
{
	if (p->is_write)
		return write_to_read_only(vcpu, p, rd);

	p->regval = guest_id_aa64zfr0_el1(vcpu);
	return true;
}

static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
		const struct sys_reg_desc *rd,
		const struct kvm_one_reg *reg, void __user *uaddr)
{
	u64 val;

	if (WARN_ON(!vcpu_has_sve(vcpu)))
		return -ENOENT;

	val = guest_id_aa64zfr0_el1(vcpu);
	return reg_to_user(uaddr, &val, reg->id);
}

static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
		const struct sys_reg_desc *rd,
		const struct kvm_one_reg *reg, void __user *uaddr)
{
	const u64 id = sys_reg_to_index(rd);
	int err;
	u64 val;

	if (WARN_ON(!vcpu_has_sve(vcpu)))
		return -ENOENT;

	err = reg_from_user(&val, uaddr, id);
	if (err)
		return err;

	/* This is what we mean by invariant: you can't change it. */
	if (val != guest_id_aa64zfr0_el1(vcpu))
		return -EINVAL;

	return 0;
}

/*
 * cpufeature ID register user accessors
 *
@@ -1107,16 +1211,18 @@ static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
 * are stored, and for set_id_reg() we don't allow the effective value
 * to be changed.
 */
static int __get_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
static int __get_id_reg(const struct kvm_vcpu *vcpu,
			const struct sys_reg_desc *rd, void __user *uaddr,
			bool raz)
{
	const u64 id = sys_reg_to_index(rd);
	const u64 val = read_id_reg(rd, raz);
	const u64 val = read_id_reg(vcpu, rd, raz);

	return reg_to_user(uaddr, &val, id);
}

static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
static int __set_id_reg(const struct kvm_vcpu *vcpu,
			const struct sys_reg_desc *rd, void __user *uaddr,
			bool raz)
{
	const u64 id = sys_reg_to_index(rd);
@@ -1128,7 +1234,7 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
		return err;

	/* This is what we mean by invariant: you can't change it. */
	if (val != read_id_reg(rd, raz))
	if (val != read_id_reg(vcpu, rd, raz))
		return -EINVAL;

	return 0;
@@ -1137,25 +1243,25 @@ static int __set_id_reg(const struct sys_reg_desc *rd, void __user *uaddr,
static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
		      const struct kvm_one_reg *reg, void __user *uaddr)
{
	return __get_id_reg(rd, uaddr, false);
	return __get_id_reg(vcpu, rd, uaddr, false);
}

static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
		      const struct kvm_one_reg *reg, void __user *uaddr)
{
	return __set_id_reg(rd, uaddr, false);
	return __set_id_reg(vcpu, rd, uaddr, false);
}

static int get_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
			  const struct kvm_one_reg *reg, void __user *uaddr)
{
	return __get_id_reg(rd, uaddr, true);
	return __get_id_reg(vcpu, rd, uaddr, true);
}

static int set_raz_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
			  const struct kvm_one_reg *reg, void __user *uaddr)
{
	return __set_id_reg(rd, uaddr, true);
	return __set_id_reg(vcpu, rd, uaddr, true);
}

static bool access_ctr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
@@ -1343,7 +1449,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
	ID_SANITISED(ID_AA64PFR1_EL1),
	ID_UNALLOCATED(4,2),
	ID_UNALLOCATED(4,3),
	ID_UNALLOCATED(4,4),
	{ SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = sve_id_visibility },
	ID_UNALLOCATED(4,5),
	ID_UNALLOCATED(4,6),
	ID_UNALLOCATED(4,7),
@@ -1380,10 +1486,17 @@ static const struct sys_reg_desc sys_reg_descs[] = {

	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
	{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
	{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
	{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
	{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },

	PTRAUTH_KEY(APIA),
	PTRAUTH_KEY(APIB),
	PTRAUTH_KEY(APDA),
	PTRAUTH_KEY(APDB),
	PTRAUTH_KEY(APGA),

	{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
	{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
	{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
@@ -1924,6 +2037,12 @@ static void perform_access(struct kvm_vcpu *vcpu,
{
	trace_kvm_sys_access(*vcpu_pc(vcpu), params, r);

	/* Check for regs disabled by runtime config */
	if (sysreg_hidden_from_guest(vcpu, r)) {
		kvm_inject_undefined(vcpu);
		return;
	}

	/*
	 * Not having an accessor means that we have configured a trap
	 * that we don't know how to handle. This certainly qualifies
@@ -2435,6 +2554,10 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
	if (!r)
		return get_invariant_sys_reg(reg->id, uaddr);

	/* Check for regs disabled by runtime config */
	if (sysreg_hidden_from_user(vcpu, r))
		return -ENOENT;

	if (r->get_user)
		return (r->get_user)(vcpu, r, reg, uaddr);

@@ -2456,6 +2579,10 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
	if (!r)
		return set_invariant_sys_reg(reg->id, uaddr);

	/* Check for regs disabled by runtime config */
	if (sysreg_hidden_from_user(vcpu, r))
		return -ENOENT;

	if (r->set_user)
		return (r->set_user)(vcpu, r, reg, uaddr);

@@ -2512,7 +2639,8 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
	return true;
}

static int walk_one_sys_reg(const struct sys_reg_desc *rd,
static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
			    const struct sys_reg_desc *rd,
			    u64 __user **uind,
			    unsigned int *total)
{
@@ -2523,6 +2651,9 @@ static int walk_one_sys_reg(const struct sys_reg_desc *rd,
	if (!(rd->reg || rd->get_user))
		return 0;

	if (sysreg_hidden_from_user(vcpu, rd))
		return 0;

	if (!copy_reg_to_user(rd, uind))
		return -EFAULT;

@@ -2551,9 +2682,9 @@ static int walk_sys_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
		int cmp = cmp_sys_reg(i1, i2);
		/* target-specific overrides generic entry. */
		if (cmp <= 0)
			err = walk_one_sys_reg(i1, &uind, &total);
			err = walk_one_sys_reg(vcpu, i1, &uind, &total);
		else
			err = walk_one_sys_reg(i2, &uind, &total);
			err = walk_one_sys_reg(vcpu, i2, &uind, &total);

		if (err)
			return err;
@@ -64,8 +64,15 @@ struct sys_reg_desc {
			const struct kvm_one_reg *reg, void __user *uaddr);
	int (*set_user)(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
			const struct kvm_one_reg *reg, void __user *uaddr);

	/* Return mask of REG_* runtime visibility overrides */
	unsigned int (*visibility)(const struct kvm_vcpu *vcpu,
				   const struct sys_reg_desc *rd);
};

#define REG_HIDDEN_USER		(1 << 0) /* hidden from userspace ioctls */
#define REG_HIDDEN_GUEST	(1 << 1) /* hidden from guest */

static inline void print_sys_reg_instr(const struct sys_reg_params *p)
{
	/* Look, we even formatted it for you to paste into the table! */
@@ -102,6 +109,24 @@ static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r
	__vcpu_sys_reg(vcpu, r->reg) = r->val;
}

static inline bool sysreg_hidden_from_guest(const struct kvm_vcpu *vcpu,
					    const struct sys_reg_desc *r)
{
	if (likely(!r->visibility))
		return false;

	return r->visibility(vcpu, r) & REG_HIDDEN_GUEST;
}

static inline bool sysreg_hidden_from_user(const struct kvm_vcpu *vcpu,
					   const struct sys_reg_desc *r)
{
	if (likely(!r->visibility))
		return false;

	return r->visibility(vcpu, r) & REG_HIDDEN_USER;
}

static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
			      const struct sys_reg_desc *i2)
{
@ -201,6 +201,8 @@ struct kvmppc_spapr_tce_iommu_table {
|
||||
struct kref kref;
|
||||
};
|
||||
|
||||
#define TCES_PER_PAGE (PAGE_SIZE / sizeof(u64))
|
||||
|
||||
struct kvmppc_spapr_tce_table {
|
||||
struct list_head list;
|
||||
struct kvm *kvm;
|
||||
@ -210,6 +212,7 @@ struct kvmppc_spapr_tce_table {
|
||||
u64 offset; /* in pages */
|
||||
u64 size; /* window size in pages */
|
||||
struct list_head iommu_tables;
|
||||
struct mutex alloc_lock;
|
||||
struct page *pages[0];
|
||||
};
|
||||
|
||||
@ -222,6 +225,7 @@ extern struct kvm_device_ops kvm_xics_ops;
|
||||
struct kvmppc_xive;
|
||||
struct kvmppc_xive_vcpu;
|
||||
extern struct kvm_device_ops kvm_xive_ops;
|
||||
extern struct kvm_device_ops kvm_xive_native_ops;
|
||||
|
||||
struct kvmppc_passthru_irqmap;
|
||||
|
||||
@ -312,7 +316,11 @@ struct kvm_arch {
|
||||
#endif
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
struct kvmppc_xics *xics;
|
||||
struct kvmppc_xive *xive;
|
||||
struct kvmppc_xive *xive; /* Current XIVE device in use */
|
||||
struct {
|
||||
struct kvmppc_xive *native;
|
||||
struct kvmppc_xive *xics_on_xive;
|
||||
} xive_devices;
|
||||
struct kvmppc_passthru_irqmap *pimap;
|
||||
#endif
|
||||
struct kvmppc_ops *kvm_ops;
|
||||
@ -449,6 +457,7 @@ struct kvmppc_passthru_irqmap {
|
||||
#define KVMPPC_IRQ_DEFAULT 0
|
||||
#define KVMPPC_IRQ_MPIC 1
|
||||
#define KVMPPC_IRQ_XICS 2 /* Includes a XIVE option */
|
||||
#define KVMPPC_IRQ_XIVE 3 /* XIVE native exploitation mode */
|
||||
|
||||
#define MMIO_HPTE_CACHE_SIZE 4
|
||||
|
||||
|
@ -197,10 +197,6 @@ extern struct kvmppc_spapr_tce_table *kvmppc_find_table(
|
||||
(iommu_tce_check_ioba((stt)->page_shift, (stt)->offset, \
|
||||
(stt)->size, (ioba), (npages)) ? \
|
||||
H_PARAMETER : H_SUCCESS)
|
||||
extern long kvmppc_tce_to_ua(struct kvm *kvm, unsigned long tce,
|
||||
unsigned long *ua, unsigned long **prmap);
|
||||
extern void kvmppc_tce_put(struct kvmppc_spapr_tce_table *tt,
|
||||
unsigned long idx, unsigned long tce);
|
||||
extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
unsigned long ioba, unsigned long tce);
|
||||
extern long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
@ -273,6 +269,7 @@ union kvmppc_one_reg {
|
||||
u64 addr;
|
||||
u64 length;
|
||||
} vpaval;
|
||||
u64 xive_timaval[2];
|
||||
};
|
||||
|
||||
struct kvmppc_ops {
|
||||
@ -480,6 +477,9 @@ extern void kvm_hv_vm_activated(void);
|
||||
extern void kvm_hv_vm_deactivated(void);
|
||||
extern bool kvm_hv_mode_active(void);
|
||||
|
||||
extern void kvmppc_check_need_tlb_flush(struct kvm *kvm, int pcpu,
|
||||
struct kvm_nested_guest *nested);
|
||||
|
||||
#else
|
||||
static inline void __init kvm_cma_reserve(void)
|
||||
{}
|
||||
@ -594,6 +594,22 @@ extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
|
||||
extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
|
||||
int level, bool line_status);
|
||||
extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
|
||||
|
||||
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return vcpu->arch.irq_type == KVMPPC_IRQ_XIVE;
|
||||
}
|
||||
|
||||
extern int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
|
||||
struct kvm_vcpu *vcpu, u32 cpu);
|
||||
extern void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu);
|
||||
extern void kvmppc_xive_native_init_module(void);
|
||||
extern void kvmppc_xive_native_exit_module(void);
|
||||
extern int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
|
||||
union kvmppc_one_reg *val);
|
||||
extern int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu,
|
||||
union kvmppc_one_reg *val);
|
||||
|
||||
#else
|
||||
static inline int kvmppc_xive_set_xive(struct kvm *kvm, u32 irq, u32 server,
|
||||
u32 priority) { return -1; }
|
||||
@ -617,6 +633,21 @@ static inline int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval) { retur
|
||||
static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
|
||||
int level, bool line_status) { return -ENODEV; }
|
||||
static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
|
||||
|
||||
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
|
||||
{ return 0; }
|
||||
static inline int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
|
||||
struct kvm_vcpu *vcpu, u32 cpu) { return -EBUSY; }
|
||||
static inline void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { }
|
||||
static inline void kvmppc_xive_native_init_module(void) { }
|
||||
static inline void kvmppc_xive_native_exit_module(void) { }
|
||||
static inline int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
|
||||
union kvmppc_one_reg *val)
|
||||
{ return 0; }
|
||||
static inline int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu,
|
||||
union kvmppc_one_reg *val)
|
||||
{ return -ENOENT; }
|
||||
|
||||
#endif /* CONFIG_KVM_XIVE */
|
||||
|
||||
#if defined(CONFIG_PPC_POWERNV) && defined(CONFIG_KVM_BOOK3S_64_HANDLER)
|
||||
@ -665,6 +696,8 @@ long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
|
||||
unsigned long pte_index);
|
||||
long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
|
||||
unsigned long pte_index);
|
||||
long kvmppc_rm_h_page_init(struct kvm_vcpu *vcpu, unsigned long flags,
|
||||
unsigned long dest, unsigned long src);
|
||||
long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
|
||||
unsigned long slb_v, unsigned int status, bool data);
|
||||
unsigned long kvmppc_rm_h_xirr(struct kvm_vcpu *vcpu);
|
||||
|
@ -23,6 +23,7 @@
|
||||
* same offset regardless of where the code is executing
|
||||
*/
|
||||
extern void __iomem *xive_tima;
|
||||
extern unsigned long xive_tima_os;
|
||||
|
||||
/*
|
||||
* Offset in the TM area of our current execution level (provided by
|
||||
@ -73,6 +74,8 @@ struct xive_q {
|
||||
u32 esc_irq;
|
||||
atomic_t count;
|
||||
atomic_t pending_count;
|
||||
u64 guest_qaddr;
|
||||
u32 guest_qshift;
|
||||
};
|
||||
|
||||
/* Global enable flags for the XIVE support */
|
||||
|
@ -482,6 +482,8 @@ struct kvm_ppc_cpu_char {
|
||||
#define KVM_REG_PPC_ICP_PPRI_SHIFT 16 /* pending irq priority */
|
||||
#define KVM_REG_PPC_ICP_PPRI_MASK 0xff
|
||||
|
||||
#define KVM_REG_PPC_VP_STATE (KVM_REG_PPC | KVM_REG_SIZE_U128 | 0x8d)
|
||||
|
||||
/* Device control API: PPC-specific devices */
|
||||
#define KVM_DEV_MPIC_GRP_MISC 1
|
||||
#define KVM_DEV_MPIC_BASE_ADDR 0 /* 64-bit */
|
||||
@ -677,4 +679,48 @@ struct kvm_ppc_cpu_char {
|
||||
#define KVM_XICS_PRESENTED (1ULL << 43)
|
||||
#define KVM_XICS_QUEUED (1ULL << 44)
|
||||
|
||||
/* POWER9 XIVE Native Interrupt Controller */
|
||||
#define KVM_DEV_XIVE_GRP_CTRL 1
|
||||
#define KVM_DEV_XIVE_RESET 1
|
||||
#define KVM_DEV_XIVE_EQ_SYNC 2
|
||||
#define KVM_DEV_XIVE_GRP_SOURCE 2 /* 64-bit source identifier */
|
||||
#define KVM_DEV_XIVE_GRP_SOURCE_CONFIG 3 /* 64-bit source identifier */
|
||||
#define KVM_DEV_XIVE_GRP_EQ_CONFIG 4 /* 64-bit EQ identifier */
|
||||
#define KVM_DEV_XIVE_GRP_SOURCE_SYNC 5 /* 64-bit source identifier */
|
||||
|
||||
/* Layout of 64-bit XIVE source attribute values */
|
||||
#define KVM_XIVE_LEVEL_SENSITIVE (1ULL << 0)
|
||||
#define KVM_XIVE_LEVEL_ASSERTED (1ULL << 1)
|
||||
|
||||
/* Layout of 64-bit XIVE source configuration attribute values */
|
||||
#define KVM_XIVE_SOURCE_PRIORITY_SHIFT 0
|
||||
#define KVM_XIVE_SOURCE_PRIORITY_MASK 0x7
|
||||
#define KVM_XIVE_SOURCE_SERVER_SHIFT 3
|
||||
#define KVM_XIVE_SOURCE_SERVER_MASK 0xfffffff8ULL
|
||||
#define KVM_XIVE_SOURCE_MASKED_SHIFT 32
|
||||
#define KVM_XIVE_SOURCE_MASKED_MASK 0x100000000ULL
|
||||
#define KVM_XIVE_SOURCE_EISN_SHIFT 33
|
||||
#define KVM_XIVE_SOURCE_EISN_MASK 0xfffffffe00000000ULL
|
||||
|
||||
/* Layout of 64-bit EQ identifier */
|
||||
#define KVM_XIVE_EQ_PRIORITY_SHIFT 0
|
||||
#define KVM_XIVE_EQ_PRIORITY_MASK 0x7
|
||||
#define KVM_XIVE_EQ_SERVER_SHIFT 3
|
||||
#define KVM_XIVE_EQ_SERVER_MASK 0xfffffff8ULL
|
||||
|
||||
/* Layout of EQ configuration values (64 bytes) */
|
||||
struct kvm_ppc_xive_eq {
|
||||
__u32 flags;
|
||||
__u32 qshift;
|
||||
__u64 qaddr;
|
||||
__u32 qtoggle;
|
||||
__u32 qindex;
|
||||
__u8 pad[40];
|
||||
};
|
||||
|
||||
#define KVM_XIVE_EQ_ALWAYS_NOTIFY 0x00000001
|
||||
|
||||
#define KVM_XIVE_TIMA_PAGE_OFFSET 0
|
||||
#define KVM_XIVE_ESB_PAGE_OFFSET 4
|
||||
|
||||
#endif /* __LINUX_KVM_POWERPC_H */
|
||||
|
@ -94,7 +94,7 @@ endif
|
||||
kvm-book3s_64-objs-$(CONFIG_KVM_XICS) += \
|
||||
book3s_xics.o
|
||||
|
||||
kvm-book3s_64-objs-$(CONFIG_KVM_XIVE) += book3s_xive.o
|
||||
kvm-book3s_64-objs-$(CONFIG_KVM_XIVE) += book3s_xive.o book3s_xive_native.o
|
||||
kvm-book3s_64-objs-$(CONFIG_SPAPR_TCE_IOMMU) += book3s_64_vio.o
|
||||
|
||||
kvm-book3s_64-module-objs := \
|
||||
|
@ -651,6 +651,18 @@ int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
|
||||
*val = get_reg_val(id, kvmppc_xics_get_icp(vcpu));
|
||||
break;
|
||||
#endif /* CONFIG_KVM_XICS */
|
||||
#ifdef CONFIG_KVM_XIVE
|
||||
case KVM_REG_PPC_VP_STATE:
|
||||
if (!vcpu->arch.xive_vcpu) {
|
||||
r = -ENXIO;
|
||||
break;
|
||||
}
|
||||
if (xive_enabled())
|
||||
r = kvmppc_xive_native_get_vp(vcpu, val);
|
||||
else
|
||||
r = -ENXIO;
|
||||
break;
|
||||
#endif /* CONFIG_KVM_XIVE */
|
||||
case KVM_REG_PPC_FSCR:
|
||||
*val = get_reg_val(id, vcpu->arch.fscr);
|
||||
break;
|
||||
@ -724,6 +736,18 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
|
||||
r = kvmppc_xics_set_icp(vcpu, set_reg_val(id, *val));
|
||||
break;
|
||||
#endif /* CONFIG_KVM_XICS */
|
||||
#ifdef CONFIG_KVM_XIVE
|
||||
case KVM_REG_PPC_VP_STATE:
|
||||
if (!vcpu->arch.xive_vcpu) {
|
||||
r = -ENXIO;
|
||||
break;
|
||||
}
|
||||
if (xive_enabled())
|
||||
r = kvmppc_xive_native_set_vp(vcpu, val);
|
||||
else
|
||||
r = -ENXIO;
|
||||
break;
|
||||
#endif /* CONFIG_KVM_XIVE */
|
||||
case KVM_REG_PPC_FSCR:
|
||||
vcpu->arch.fscr = set_reg_val(id, *val);
|
||||
break;
|
||||
@ -891,6 +915,17 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
|
||||
kvmppc_rtas_tokens_free(kvm);
|
||||
WARN_ON(!list_empty(&kvm->arch.spapr_tce_tables));
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
/*
|
||||
* Free the XIVE devices which are not directly freed by the
|
||||
* device 'release' method
|
||||
*/
|
||||
kfree(kvm->arch.xive_devices.native);
|
||||
kvm->arch.xive_devices.native = NULL;
|
||||
kfree(kvm->arch.xive_devices.xics_on_xive);
|
||||
kvm->arch.xive_devices.xics_on_xive = NULL;
|
||||
#endif /* CONFIG_KVM_XICS */
|
||||
}
|
||||
|
||||
int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
|
||||
@ -1050,6 +1085,9 @@ static int kvmppc_book3s_init(void)
|
||||
if (xics_on_xive()) {
|
||||
kvmppc_xive_init_module();
|
||||
kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS);
|
||||
kvmppc_xive_native_init_module();
|
||||
kvm_register_device_ops(&kvm_xive_native_ops,
|
||||
KVM_DEV_TYPE_XIVE);
|
||||
} else
|
||||
#endif
|
||||
kvm_register_device_ops(&kvm_xics_ops, KVM_DEV_TYPE_XICS);
|
||||
@ -1060,8 +1098,10 @@ static int kvmppc_book3s_init(void)
|
||||
static void kvmppc_book3s_exit(void)
|
||||
{
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
if (xics_on_xive())
|
||||
if (xics_on_xive()) {
|
||||
kvmppc_xive_exit_module();
|
||||
kvmppc_xive_native_exit_module();
|
||||
}
|
||||
#endif
|
||||
#ifdef CONFIG_KVM_BOOK3S_32_HANDLER
|
||||
kvmppc_book3s_exit_pr();
|
||||
|
@ -228,11 +228,33 @@ static void release_spapr_tce_table(struct rcu_head *head)
|
||||
unsigned long i, npages = kvmppc_tce_pages(stt->size);
|
||||
|
||||
for (i = 0; i < npages; i++)
|
||||
__free_page(stt->pages[i]);
|
||||
if (stt->pages[i])
|
||||
__free_page(stt->pages[i]);
|
||||
|
||||
kfree(stt);
|
||||
}
|
||||
|
||||
static struct page *kvm_spapr_get_tce_page(struct kvmppc_spapr_tce_table *stt,
|
||||
unsigned long sttpage)
|
||||
{
|
||||
struct page *page = stt->pages[sttpage];
|
||||
|
||||
if (page)
|
||||
return page;
|
||||
|
||||
mutex_lock(&stt->alloc_lock);
|
||||
page = stt->pages[sttpage];
|
||||
if (!page) {
|
||||
page = alloc_page(GFP_KERNEL | __GFP_ZERO);
|
||||
WARN_ON_ONCE(!page);
|
||||
if (page)
|
||||
stt->pages[sttpage] = page;
|
||||
}
|
||||
mutex_unlock(&stt->alloc_lock);
|
||||
|
||||
return page;
|
||||
}
|
||||
|
||||
static vm_fault_t kvm_spapr_tce_fault(struct vm_fault *vmf)
|
||||
{
|
||||
struct kvmppc_spapr_tce_table *stt = vmf->vma->vm_file->private_data;
|
||||
@ -241,7 +263,10 @@ static vm_fault_t kvm_spapr_tce_fault(struct vm_fault *vmf)
|
||||
if (vmf->pgoff >= kvmppc_tce_pages(stt->size))
|
||||
return VM_FAULT_SIGBUS;
|
||||
|
||||
page = stt->pages[vmf->pgoff];
|
||||
page = kvm_spapr_get_tce_page(stt, vmf->pgoff);
|
||||
if (!page)
|
||||
return VM_FAULT_OOM;
|
||||
|
||||
get_page(page);
|
||||
vmf->page = page;
|
||||
return 0;
|
||||
@ -296,7 +321,6 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
|
||||
struct kvmppc_spapr_tce_table *siter;
|
||||
unsigned long npages, size = args->size;
|
||||
int ret = -ENOMEM;
|
||||
int i;
|
||||
|
||||
if (!args->size || args->page_shift < 12 || args->page_shift > 34 ||
|
||||
(args->offset + args->size > (ULLONG_MAX >> args->page_shift)))
|
||||
@ -318,14 +342,9 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
|
||||
stt->offset = args->offset;
|
||||
stt->size = size;
|
||||
stt->kvm = kvm;
|
||||
mutex_init(&stt->alloc_lock);
|
||||
INIT_LIST_HEAD_RCU(&stt->iommu_tables);
|
||||
|
||||
for (i = 0; i < npages; i++) {
|
||||
stt->pages[i] = alloc_page(GFP_KERNEL | __GFP_ZERO);
|
||||
if (!stt->pages[i])
|
||||
goto fail;
|
||||
}
|
||||
|
||||
mutex_lock(&kvm->lock);
|
||||
|
||||
/* Check this LIOBN hasn't been previously allocated */
|
||||
@ -352,17 +371,28 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
|
||||
if (ret >= 0)
|
||||
return ret;
|
||||
|
||||
fail:
|
||||
for (i = 0; i < npages; i++)
|
||||
if (stt->pages[i])
|
||||
__free_page(stt->pages[i]);
|
||||
|
||||
kfree(stt);
|
||||
fail_acct:
|
||||
kvmppc_account_memlimit(kvmppc_stt_pages(npages), false);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static long kvmppc_tce_to_ua(struct kvm *kvm, unsigned long tce,
|
||||
unsigned long *ua)
|
||||
{
|
||||
unsigned long gfn = tce >> PAGE_SHIFT;
|
||||
struct kvm_memory_slot *memslot;
|
||||
|
||||
memslot = search_memslots(kvm_memslots(kvm), gfn);
|
||||
if (!memslot)
|
||||
return -EINVAL;
|
||||
|
||||
*ua = __gfn_to_hva_memslot(memslot, gfn) |
|
||||
(tce & ~(PAGE_MASK | TCE_PCI_READ | TCE_PCI_WRITE));
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
unsigned long tce)
|
||||
{
|
||||
@ -378,7 +408,7 @@ static long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
if (iommu_tce_check_gpa(stt->page_shift, gpa))
|
||||
return H_TOO_HARD;
|
||||
|
||||
if (kvmppc_tce_to_ua(stt->kvm, tce, &ua, NULL))
|
||||
if (kvmppc_tce_to_ua(stt->kvm, tce, &ua))
|
||||
return H_TOO_HARD;
|
||||
|
||||
list_for_each_entry_rcu(stit, &stt->iommu_tables, next) {
|
||||
@ -397,6 +427,36 @@ static long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
return H_SUCCESS;
|
||||
}
|
||||
|
||||
/*
|
||||
* Handles TCE requests for emulated devices.
|
||||
* Puts guest TCE values to the table and expects user space to convert them.
|
||||
* Cannot fail so kvmppc_tce_validate must be called before it.
|
||||
*/
|
||||
static void kvmppc_tce_put(struct kvmppc_spapr_tce_table *stt,
|
||||
unsigned long idx, unsigned long tce)
|
||||
{
|
||||
struct page *page;
|
||||
u64 *tbl;
|
||||
unsigned long sttpage;
|
||||
|
||||
idx -= stt->offset;
|
||||
sttpage = idx / TCES_PER_PAGE;
|
||||
page = stt->pages[sttpage];
|
||||
|
||||
if (!page) {
|
||||
/* We allow any TCE, not just with read|write permissions */
|
||||
if (!tce)
|
||||
return;
|
||||
|
||||
page = kvm_spapr_get_tce_page(stt, sttpage);
|
||||
if (!page)
|
||||
return;
|
||||
}
|
||||
tbl = page_to_virt(page);
|
||||
|
||||
tbl[idx % TCES_PER_PAGE] = tce;
|
||||
}
|
||||
|
||||
static void kvmppc_clear_tce(struct mm_struct *mm, struct iommu_table *tbl,
|
||||
unsigned long entry)
|
||||
{
|
||||
@ -551,7 +611,7 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
|
||||
dir = iommu_tce_direction(tce);
|
||||
|
||||
if ((dir != DMA_NONE) && kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL)) {
|
||||
if ((dir != DMA_NONE) && kvmppc_tce_to_ua(vcpu->kvm, tce, &ua)) {
|
||||
ret = H_PARAMETER;
|
||||
goto unlock_exit;
|
||||
}
|
||||
@ -612,7 +672,7 @@ long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
return ret;
|
||||
|
||||
idx = srcu_read_lock(&vcpu->kvm->srcu);
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce_list, &ua, NULL)) {
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce_list, &ua)) {
|
||||
ret = H_TOO_HARD;
|
||||
goto unlock_exit;
|
||||
}
|
||||
@ -647,7 +707,7 @@ long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
}
|
||||
tce = be64_to_cpu(tce);
|
||||
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL))
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce, &ua))
|
||||
return H_PARAMETER;
|
||||
|
||||
list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
|
||||
|
@ -66,8 +66,6 @@
|
||||
|
||||
#endif
|
||||
|
||||
#define TCES_PER_PAGE (PAGE_SIZE / sizeof(u64))
|
||||
|
||||
/*
|
||||
* Finds a TCE table descriptor by LIOBN.
|
||||
*
|
||||
@ -88,6 +86,25 @@ struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm *kvm,
|
||||
EXPORT_SYMBOL_GPL(kvmppc_find_table);
|
||||
|
||||
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
|
||||
static long kvmppc_rm_tce_to_ua(struct kvm *kvm, unsigned long tce,
|
||||
unsigned long *ua, unsigned long **prmap)
|
||||
{
|
||||
unsigned long gfn = tce >> PAGE_SHIFT;
|
||||
struct kvm_memory_slot *memslot;
|
||||
|
||||
memslot = search_memslots(kvm_memslots_raw(kvm), gfn);
|
||||
if (!memslot)
|
||||
return -EINVAL;
|
||||
|
||||
*ua = __gfn_to_hva_memslot(memslot, gfn) |
|
||||
(tce & ~(PAGE_MASK | TCE_PCI_READ | TCE_PCI_WRITE));
|
||||
|
||||
if (prmap)
|
||||
*prmap = &memslot->arch.rmap[gfn - memslot->base_gfn];
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Validates TCE address.
|
||||
* At the moment flags and page mask are validated.
|
||||
@ -111,7 +128,7 @@ static long kvmppc_rm_tce_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
if (iommu_tce_check_gpa(stt->page_shift, gpa))
|
||||
return H_PARAMETER;
|
||||
|
||||
if (kvmppc_tce_to_ua(stt->kvm, tce, &ua, NULL))
|
||||
if (kvmppc_rm_tce_to_ua(stt->kvm, tce, &ua, NULL))
|
||||
return H_TOO_HARD;
|
||||
|
||||
list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
|
||||
@ -129,7 +146,6 @@ static long kvmppc_rm_tce_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
|
||||
return H_SUCCESS;
|
||||
}
|
||||
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
|
||||
|
||||
/* Note on the use of page_address() in real mode,
|
||||
*
|
||||
@ -161,13 +177,9 @@ static u64 *kvmppc_page_address(struct page *page)
|
||||
/*
|
||||
* Handles TCE requests for emulated devices.
|
||||
* Puts guest TCE values to the table and expects user space to convert them.
|
||||
* Called in both real and virtual modes.
|
||||
* Cannot fail so kvmppc_tce_validate must be called before it.
|
||||
*
|
||||
* WARNING: This will be called in real-mode on HV KVM and virtual
|
||||
* mode on PR KVM
|
||||
* Cannot fail so kvmppc_rm_tce_validate must be called before it.
|
||||
*/
|
||||
void kvmppc_tce_put(struct kvmppc_spapr_tce_table *stt,
|
||||
static void kvmppc_rm_tce_put(struct kvmppc_spapr_tce_table *stt,
|
||||
unsigned long idx, unsigned long tce)
|
||||
{
|
||||
struct page *page;
|
||||
@ -175,35 +187,48 @@ void kvmppc_tce_put(struct kvmppc_spapr_tce_table *stt,
|
||||
|
||||
idx -= stt->offset;
|
||||
page = stt->pages[idx / TCES_PER_PAGE];
|
||||
/*
|
||||
* page must not be NULL in real mode,
|
||||
* kvmppc_rm_ioba_validate() must have taken care of this.
|
||||
*/
|
||||
WARN_ON_ONCE_RM(!page);
|
||||
tbl = kvmppc_page_address(page);
|
||||
|
||||
tbl[idx % TCES_PER_PAGE] = tce;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_tce_put);
|
||||
|
||||
long kvmppc_tce_to_ua(struct kvm *kvm, unsigned long tce,
|
||||
unsigned long *ua, unsigned long **prmap)
|
||||
/*
|
||||
* TCEs pages are allocated in kvmppc_rm_tce_put() which won't be able to do so
|
||||
* in real mode.
|
||||
* Check if kvmppc_rm_tce_put() can succeed in real mode, i.e. a TCEs page is
|
||||
* allocated or not required (when clearing a tce entry).
|
||||
*/
|
||||
static long kvmppc_rm_ioba_validate(struct kvmppc_spapr_tce_table *stt,
|
||||
unsigned long ioba, unsigned long npages, bool clearing)
|
||||
{
|
||||
unsigned long gfn = tce >> PAGE_SHIFT;
|
||||
struct kvm_memory_slot *memslot;
|
||||
unsigned long i, idx, sttpage, sttpages;
|
||||
unsigned long ret = kvmppc_ioba_validate(stt, ioba, npages);
|
||||
|
||||
memslot = search_memslots(kvm_memslots(kvm), gfn);
|
||||
if (!memslot)
|
||||
return -EINVAL;
|
||||
if (ret)
|
||||
return ret;
|
||||
/*
|
||||
* clearing==true says kvmppc_rm_tce_put won't be allocating pages
|
||||
* for empty tces.
|
||||
*/
|
||||
if (clearing)
|
||||
return H_SUCCESS;
|
||||
|
||||
*ua = __gfn_to_hva_memslot(memslot, gfn) |
|
||||
(tce & ~(PAGE_MASK | TCE_PCI_READ | TCE_PCI_WRITE));
|
||||
idx = (ioba >> stt->page_shift) - stt->offset;
|
||||
sttpage = idx / TCES_PER_PAGE;
|
||||
sttpages = _ALIGN_UP(idx % TCES_PER_PAGE + npages, TCES_PER_PAGE) /
|
||||
TCES_PER_PAGE;
|
||||
for (i = sttpage; i < sttpage + sttpages; ++i)
|
||||
if (!stt->pages[i])
|
||||
return H_TOO_HARD;
|
||||
|
||||
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
|
||||
if (prmap)
|
||||
*prmap = &memslot->arch.rmap[gfn - memslot->base_gfn];
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
return H_SUCCESS;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_tce_to_ua);
|
||||
|
||||
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
|
||||
static long iommu_tce_xchg_rm(struct mm_struct *mm, struct iommu_table *tbl,
|
||||
unsigned long entry, unsigned long *hpa,
|
||||
enum dma_data_direction *direction)
|
||||
@ -381,7 +406,7 @@ long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
if (!stt)
|
||||
return H_TOO_HARD;
|
||||
|
||||
ret = kvmppc_ioba_validate(stt, ioba, 1);
|
||||
ret = kvmppc_rm_ioba_validate(stt, ioba, 1, tce == 0);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
|
||||
@ -390,7 +415,7 @@ long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
return ret;
|
||||
|
||||
dir = iommu_tce_direction(tce);
|
||||
if ((dir != DMA_NONE) && kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL))
|
||||
if ((dir != DMA_NONE) && kvmppc_rm_tce_to_ua(vcpu->kvm, tce, &ua, NULL))
|
||||
return H_PARAMETER;
|
||||
|
||||
entry = ioba >> stt->page_shift;
|
||||
@ -409,7 +434,7 @@ long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
}
|
||||
}
|
||||
|
||||
kvmppc_tce_put(stt, entry, tce);
|
||||
kvmppc_rm_tce_put(stt, entry, tce);
|
||||
|
||||
return H_SUCCESS;
|
||||
}
|
||||
@ -480,7 +505,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
if (tce_list & (SZ_4K - 1))
|
||||
return H_PARAMETER;
|
||||
|
||||
ret = kvmppc_ioba_validate(stt, ioba, npages);
|
||||
ret = kvmppc_rm_ioba_validate(stt, ioba, npages, false);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
|
||||
@ -492,7 +517,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
*/
|
||||
struct mm_iommu_table_group_mem_t *mem;
|
||||
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce_list, &ua, NULL))
|
||||
if (kvmppc_rm_tce_to_ua(vcpu->kvm, tce_list, &ua, NULL))
|
||||
return H_TOO_HARD;
|
||||
|
||||
mem = mm_iommu_lookup_rm(vcpu->kvm->mm, ua, IOMMU_PAGE_SIZE_4K);
|
||||
@ -508,7 +533,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
* We do not require memory to be preregistered in this case
|
||||
* so lock rmap and do __find_linux_pte_or_hugepte().
|
||||
*/
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce_list, &ua, &rmap))
|
||||
if (kvmppc_rm_tce_to_ua(vcpu->kvm, tce_list, &ua, &rmap))
|
||||
return H_TOO_HARD;
|
||||
|
||||
rmap = (void *) vmalloc_to_phys(rmap);
|
||||
@ -542,7 +567,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
unsigned long tce = be64_to_cpu(((u64 *)tces)[i]);
|
||||
|
||||
ua = 0;
|
||||
if (kvmppc_tce_to_ua(vcpu->kvm, tce, &ua, NULL))
|
||||
if (kvmppc_rm_tce_to_ua(vcpu->kvm, tce, &ua, NULL))
|
||||
return H_PARAMETER;
|
||||
|
||||
list_for_each_entry_lockless(stit, &stt->iommu_tables, next) {
|
||||
@ -557,7 +582,7 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
|
||||
}
|
||||
}
|
||||
|
||||
kvmppc_tce_put(stt, entry + i, tce);
|
||||
kvmppc_rm_tce_put(stt, entry + i, tce);
|
||||
}
|
||||
|
||||
unlock_exit:
|
||||
@ -583,7 +608,7 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
|
||||
if (!stt)
|
||||
return H_TOO_HARD;
|
||||
|
||||
ret = kvmppc_ioba_validate(stt, ioba, npages);
|
||||
ret = kvmppc_rm_ioba_validate(stt, ioba, npages, tce_value == 0);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
|
||||
@ -610,7 +635,7 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
|
||||
}
|
||||
|
||||
for (i = 0; i < npages; ++i, ioba += (1ULL << stt->page_shift))
|
||||
kvmppc_tce_put(stt, ioba >> stt->page_shift, tce_value);
|
||||
kvmppc_rm_tce_put(stt, ioba >> stt->page_shift, tce_value);
|
||||
|
||||
return H_SUCCESS;
|
||||
}
|
||||
@ -635,6 +660,10 @@ long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
|
||||
|
||||
idx = (ioba >> stt->page_shift) - stt->offset;
|
||||
page = stt->pages[idx / TCES_PER_PAGE];
|
||||
if (!page) {
|
||||
vcpu->arch.regs.gpr[4] = 0;
|
||||
return H_SUCCESS;
|
||||
}
|
||||
tbl = (u64 *)page_address(page);
|
||||
|
||||
vcpu->arch.regs.gpr[4] = tbl[idx % TCES_PER_PAGE];
|
||||
|
@ -750,7 +750,7 @@ static bool kvmppc_doorbell_pending(struct kvm_vcpu *vcpu)
|
||||
/*
|
||||
* Ensure that the read of vcore->dpdes comes after the read
|
||||
* of vcpu->doorbell_request. This barrier matches the
|
||||
* smb_wmb() in kvmppc_guest_entry_inject().
|
||||
* smp_wmb() in kvmppc_guest_entry_inject().
|
||||
*/
|
||||
smp_rmb();
|
||||
vc = vcpu->arch.vcore;
|
||||
@ -802,6 +802,80 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
|
||||
}
|
||||
}

/* Copy guest memory in place - must reside within a single memslot */
static int kvmppc_copy_guest(struct kvm *kvm, gpa_t to, gpa_t from,
			     unsigned long len)
{
	struct kvm_memory_slot *to_memslot = NULL;
	struct kvm_memory_slot *from_memslot = NULL;
	unsigned long to_addr, from_addr;
	int r;

	/* Get HPA for from address */
	from_memslot = gfn_to_memslot(kvm, from >> PAGE_SHIFT);
	if (!from_memslot)
		return -EFAULT;
	if ((from + len) >= ((from_memslot->base_gfn + from_memslot->npages)
			     << PAGE_SHIFT))
		return -EINVAL;
	from_addr = gfn_to_hva_memslot(from_memslot, from >> PAGE_SHIFT);
	if (kvm_is_error_hva(from_addr))
		return -EFAULT;
	from_addr |= (from & (PAGE_SIZE - 1));

	/* Get HPA for to address */
	to_memslot = gfn_to_memslot(kvm, to >> PAGE_SHIFT);
	if (!to_memslot)
		return -EFAULT;
	if ((to + len) >= ((to_memslot->base_gfn + to_memslot->npages)
			   << PAGE_SHIFT))
		return -EINVAL;
	to_addr = gfn_to_hva_memslot(to_memslot, to >> PAGE_SHIFT);
	if (kvm_is_error_hva(to_addr))
		return -EFAULT;
	to_addr |= (to & (PAGE_SIZE - 1));

	/* Perform copy */
	r = raw_copy_in_user((void __user *)to_addr, (void __user *)from_addr,
			     len);
	if (r)
		return -EFAULT;
	mark_page_dirty(kvm, to >> PAGE_SHIFT);
	return 0;
}

static long kvmppc_h_page_init(struct kvm_vcpu *vcpu, unsigned long flags,
			       unsigned long dest, unsigned long src)
{
	u64 pg_sz = SZ_4K;	/* 4K page size */
	u64 pg_mask = SZ_4K - 1;
	int ret;

	/* Check for invalid flags (H_PAGE_SET_LOANED covers all CMO flags) */
	if (flags & ~(H_ICACHE_INVALIDATE | H_ICACHE_SYNCHRONIZE |
		      H_ZERO_PAGE | H_COPY_PAGE | H_PAGE_SET_LOANED))
		return H_PARAMETER;

	/* dest (and src if copy_page flag set) must be page aligned */
	if ((dest & pg_mask) || ((flags & H_COPY_PAGE) && (src & pg_mask)))
		return H_PARAMETER;

	/* zero and/or copy the page as determined by the flags */
	if (flags & H_COPY_PAGE) {
		ret = kvmppc_copy_guest(vcpu->kvm, dest, src, pg_sz);
		if (ret < 0)
			return H_PARAMETER;
	} else if (flags & H_ZERO_PAGE) {
		ret = kvm_clear_guest(vcpu->kvm, dest, pg_sz);
		if (ret < 0)
			return H_PARAMETER;
	}

	/* We can ignore the remaining flags */

	return H_SUCCESS;
}

static int kvm_arch_vcpu_yield_to(struct kvm_vcpu *target)
|
||||
{
|
||||
struct kvmppc_vcore *vcore = target->arch.vcore;
|
||||
@ -1004,6 +1078,11 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
|
||||
if (nesting_enabled(vcpu->kvm))
|
||||
ret = kvmhv_copy_tofrom_guest_nested(vcpu);
|
||||
break;
|
||||
case H_PAGE_INIT:
|
||||
ret = kvmppc_h_page_init(vcpu, kvmppc_get_gpr(vcpu, 4),
|
||||
kvmppc_get_gpr(vcpu, 5),
|
||||
kvmppc_get_gpr(vcpu, 6));
|
||||
break;
|
||||
default:
|
||||
return RESUME_HOST;
|
||||
}
|
||||
@ -1048,6 +1127,7 @@ static int kvmppc_hcall_impl_hv(unsigned long cmd)
|
||||
case H_IPOLL:
|
||||
case H_XIRR_X:
|
||||
#endif
|
||||
case H_PAGE_INIT:
|
||||
return 1;
|
||||
}
|
||||
|
||||
@ -2505,37 +2585,6 @@ static void kvmppc_prepare_radix_vcpu(struct kvm_vcpu *vcpu, int pcpu)
|
||||
}
|
||||
}
|
||||
|
||||
static void kvmppc_radix_check_need_tlb_flush(struct kvm *kvm, int pcpu,
|
||||
struct kvm_nested_guest *nested)
|
||||
{
|
||||
cpumask_t *need_tlb_flush;
|
||||
int lpid;
|
||||
|
||||
if (!cpu_has_feature(CPU_FTR_HVMODE))
|
||||
return;
|
||||
|
||||
if (cpu_has_feature(CPU_FTR_ARCH_300))
|
||||
pcpu &= ~0x3UL;
|
||||
|
||||
if (nested) {
|
||||
lpid = nested->shadow_lpid;
|
||||
need_tlb_flush = &nested->need_tlb_flush;
|
||||
} else {
|
||||
lpid = kvm->arch.lpid;
|
||||
need_tlb_flush = &kvm->arch.need_tlb_flush;
|
||||
}
|
||||
|
||||
mtspr(SPRN_LPID, lpid);
|
||||
isync();
|
||||
smp_mb();
|
||||
|
||||
if (cpumask_test_cpu(pcpu, need_tlb_flush)) {
|
||||
radix__local_flush_tlb_lpid_guest(lpid);
|
||||
/* Clear the bit after the TLB flush */
|
||||
cpumask_clear_cpu(pcpu, need_tlb_flush);
|
||||
}
|
||||
}
|
||||
|
||||
static void kvmppc_start_thread(struct kvm_vcpu *vcpu, struct kvmppc_vcore *vc)
|
||||
{
|
||||
int cpu;
|
||||
@ -3229,19 +3278,11 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
|
||||
for (sub = 0; sub < core_info.n_subcores; ++sub)
|
||||
spin_unlock(&core_info.vc[sub]->lock);
|
||||
|
||||
if (kvm_is_radix(vc->kvm)) {
|
||||
/*
|
||||
* Do we need to flush the process scoped TLB for the LPAR?
|
||||
*
|
||||
* On POWER9, individual threads can come in here, but the
|
||||
* TLB is shared between the 4 threads in a core, hence
|
||||
* invalidating on one thread invalidates for all.
|
||||
* Thus we make all 4 threads use the same bit here.
|
||||
*
|
||||
* Hash must be flushed in realmode in order to use tlbiel.
|
||||
*/
|
||||
kvmppc_radix_check_need_tlb_flush(vc->kvm, pcpu, NULL);
|
||||
}
|
||||
guest_enter_irqoff();
|
||||
|
||||
srcu_idx = srcu_read_lock(&vc->kvm->srcu);
|
||||
|
||||
this_cpu_disable_ftrace();
|
||||
|
||||
/*
|
||||
* Interrupts will be enabled once we get into the guest,
|
||||
@ -3249,19 +3290,14 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
|
||||
*/
|
||||
trace_hardirqs_on();
|
||||
|
||||
guest_enter_irqoff();
|
||||
|
||||
srcu_idx = srcu_read_lock(&vc->kvm->srcu);
|
||||
|
||||
this_cpu_disable_ftrace();
|
||||
|
||||
trap = __kvmppc_vcore_entry();
|
||||
|
||||
trace_hardirqs_off();
|
||||
|
||||
this_cpu_enable_ftrace();
|
||||
|
||||
srcu_read_unlock(&vc->kvm->srcu, srcu_idx);
|
||||
|
||||
trace_hardirqs_off();
|
||||
set_irq_happened(trap);
|
||||
|
||||
spin_lock(&vc->lock);
|
||||
@ -3514,6 +3550,7 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
|
||||
#ifdef CONFIG_ALTIVEC
|
||||
load_vr_state(&vcpu->arch.vr);
|
||||
#endif
|
||||
mtspr(SPRN_VRSAVE, vcpu->arch.vrsave);
|
||||
|
||||
mtspr(SPRN_DSCR, vcpu->arch.dscr);
|
||||
mtspr(SPRN_IAMR, vcpu->arch.iamr);
|
||||
@ -3605,6 +3642,7 @@ int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
|
||||
#ifdef CONFIG_ALTIVEC
|
||||
store_vr_state(&vcpu->arch.vr);
|
||||
#endif
|
||||
vcpu->arch.vrsave = mfspr(SPRN_VRSAVE);
|
||||
|
||||
if (cpu_has_feature(CPU_FTR_TM) ||
|
||||
cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST))
|
||||
@ -3970,7 +4008,7 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
|
||||
unsigned long lpcr)
|
||||
{
|
||||
int trap, r, pcpu;
|
||||
int srcu_idx;
|
||||
int srcu_idx, lpid;
|
||||
struct kvmppc_vcore *vc;
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
struct kvm_nested_guest *nested = vcpu->arch.nested;
|
||||
@ -4046,8 +4084,12 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
|
||||
vc->vcore_state = VCORE_RUNNING;
|
||||
trace_kvmppc_run_core(vc, 0);
|
||||
|
||||
if (cpu_has_feature(CPU_FTR_HVMODE))
|
||||
kvmppc_radix_check_need_tlb_flush(kvm, pcpu, nested);
|
||||
if (cpu_has_feature(CPU_FTR_HVMODE)) {
|
||||
lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
|
||||
mtspr(SPRN_LPID, lpid);
|
||||
isync();
|
||||
kvmppc_check_need_tlb_flush(kvm, pcpu, nested);
|
||||
}
|
||||
|
||||
trace_hardirqs_on();
|
||||
guest_enter_irqoff();
|
||||
|
@ -805,3 +805,60 @@ void kvmppc_guest_entry_inject_int(struct kvm_vcpu *vcpu)
|
||||
vcpu->arch.doorbell_request = 0;
|
||||
}
|
||||
}
|
||||
|
||||
static void flush_guest_tlb(struct kvm *kvm)
|
||||
{
|
||||
unsigned long rb, set;
|
||||
|
||||
rb = PPC_BIT(52); /* IS = 2 */
|
||||
if (kvm_is_radix(kvm)) {
|
||||
/* R=1 PRS=1 RIC=2 */
|
||||
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
|
||||
: : "r" (rb), "i" (1), "i" (1), "i" (2),
|
||||
"r" (0) : "memory");
|
||||
for (set = 1; set < kvm->arch.tlb_sets; ++set) {
|
||||
rb += PPC_BIT(51); /* increment set number */
|
||||
/* R=1 PRS=1 RIC=0 */
|
||||
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
|
||||
: : "r" (rb), "i" (1), "i" (1), "i" (0),
|
||||
"r" (0) : "memory");
|
||||
}
|
||||
} else {
|
||||
for (set = 0; set < kvm->arch.tlb_sets; ++set) {
|
||||
/* R=0 PRS=0 RIC=0 */
|
||||
asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
|
||||
: : "r" (rb), "i" (0), "i" (0), "i" (0),
|
||||
"r" (0) : "memory");
|
||||
rb += PPC_BIT(51); /* increment set number */
|
||||
}
|
||||
}
|
||||
asm volatile("ptesync": : :"memory");
|
||||
}
|
||||
|
||||
void kvmppc_check_need_tlb_flush(struct kvm *kvm, int pcpu,
|
||||
struct kvm_nested_guest *nested)
|
||||
{
|
||||
cpumask_t *need_tlb_flush;
|
||||
|
||||
/*
|
||||
* On POWER9, individual threads can come in here, but the
|
||||
* TLB is shared between the 4 threads in a core, hence
|
||||
* invalidating on one thread invalidates for all.
|
||||
* Thus we make all 4 threads use the same bit.
|
||||
*/
|
||||
if (cpu_has_feature(CPU_FTR_ARCH_300))
|
||||
pcpu = cpu_first_thread_sibling(pcpu);
|
||||
|
||||
if (nested)
|
||||
need_tlb_flush = &nested->need_tlb_flush;
|
||||
else
|
||||
need_tlb_flush = &kvm->arch.need_tlb_flush;
|
||||
|
||||
if (cpumask_test_cpu(pcpu, need_tlb_flush)) {
|
||||
flush_guest_tlb(kvm);
|
||||
|
||||
/* Clear the bit after the TLB flush */
|
||||
cpumask_clear_cpu(pcpu, need_tlb_flush);
|
||||
}
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_check_need_tlb_flush);
|
||||
|
@ -13,6 +13,7 @@
|
||||
#include <linux/hugetlb.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/log2.h>
|
||||
#include <linux/sizes.h>
|
||||
|
||||
#include <asm/trace.h>
|
||||
#include <asm/kvm_ppc.h>
|
||||
@ -867,6 +868,149 @@ long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long gpa,
|
||||
int writing, unsigned long *hpa,
|
||||
struct kvm_memory_slot **memslot_p)
|
||||
{
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
struct kvm_memory_slot *memslot;
|
||||
unsigned long gfn, hva, pa, psize = PAGE_SHIFT;
|
||||
unsigned int shift;
|
||||
pte_t *ptep, pte;
|
||||
|
||||
/* Find the memslot for this address */
|
||||
gfn = gpa >> PAGE_SHIFT;
|
||||
memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
|
||||
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
|
||||
return H_PARAMETER;
|
||||
|
||||
/* Translate to host virtual address */
|
||||
hva = __gfn_to_hva_memslot(memslot, gfn);
|
||||
|
||||
/* Try to find the host pte for that virtual address */
|
||||
ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
|
||||
if (!ptep)
|
||||
return H_TOO_HARD;
|
||||
pte = kvmppc_read_update_linux_pte(ptep, writing);
|
||||
if (!pte_present(pte))
|
||||
return H_TOO_HARD;
|
||||
|
||||
/* Convert to a physical address */
|
||||
if (shift)
|
||||
psize = 1UL << shift;
|
||||
pa = pte_pfn(pte) << PAGE_SHIFT;
|
||||
pa |= hva & (psize - 1);
|
||||
pa |= gpa & ~PAGE_MASK;
|
||||
|
||||
if (hpa)
|
||||
*hpa = pa;
|
||||
if (memslot_p)
|
||||
*memslot_p = memslot;
|
||||
|
||||
return H_SUCCESS;
|
||||
}
|
||||
|
||||
static long kvmppc_do_h_page_init_zero(struct kvm_vcpu *vcpu,
|
||||
unsigned long dest)
|
||||
{
|
||||
struct kvm_memory_slot *memslot;
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
unsigned long pa, mmu_seq;
|
||||
long ret = H_SUCCESS;
|
||||
int i;
|
||||
|
||||
/* Used later to detect if we might have been invalidated */
|
||||
mmu_seq = kvm->mmu_notifier_seq;
|
||||
smp_rmb();
|
||||
|
||||
ret = kvmppc_get_hpa(vcpu, dest, 1, &pa, &memslot);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
|
||||
/* Check if we've been invalidated */
|
||||
raw_spin_lock(&kvm->mmu_lock.rlock);
|
||||
if (mmu_notifier_retry(kvm, mmu_seq)) {
|
||||
ret = H_TOO_HARD;
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
/* Zero the page */
|
||||
for (i = 0; i < SZ_4K; i += L1_CACHE_BYTES, pa += L1_CACHE_BYTES)
|
||||
dcbz((void *)pa);
|
||||
kvmppc_update_dirty_map(memslot, dest >> PAGE_SHIFT, PAGE_SIZE);
|
||||
|
||||
out_unlock:
|
||||
raw_spin_unlock(&kvm->mmu_lock.rlock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static long kvmppc_do_h_page_init_copy(struct kvm_vcpu *vcpu,
|
||||
unsigned long dest, unsigned long src)
|
||||
{
|
||||
unsigned long dest_pa, src_pa, mmu_seq;
|
||||
struct kvm_memory_slot *dest_memslot;
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
long ret = H_SUCCESS;
|
||||
|
||||
/* Used later to detect if we might have been invalidated */
|
||||
mmu_seq = kvm->mmu_notifier_seq;
|
||||
smp_rmb();
|
||||
|
||||
ret = kvmppc_get_hpa(vcpu, dest, 1, &dest_pa, &dest_memslot);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
ret = kvmppc_get_hpa(vcpu, src, 0, &src_pa, NULL);
|
||||
if (ret != H_SUCCESS)
|
||||
return ret;
|
||||
|
||||
/* Check if we've been invalidated */
|
||||
raw_spin_lock(&kvm->mmu_lock.rlock);
|
||||
if (mmu_notifier_retry(kvm, mmu_seq)) {
|
||||
ret = H_TOO_HARD;
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
/* Copy the page */
|
||||
memcpy((void *)dest_pa, (void *)src_pa, SZ_4K);
|
||||
|
||||
kvmppc_update_dirty_map(dest_memslot, dest >> PAGE_SHIFT, PAGE_SIZE);
|
||||
|
||||
out_unlock:
|
||||
raw_spin_unlock(&kvm->mmu_lock.rlock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
long kvmppc_rm_h_page_init(struct kvm_vcpu *vcpu, unsigned long flags,
|
||||
unsigned long dest, unsigned long src)
|
||||
{
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
u64 pg_mask = SZ_4K - 1; /* 4K page size */
|
||||
long ret = H_SUCCESS;
|
||||
|
||||
/* Don't handle radix mode here, go up to the virtual mode handler */
|
||||
if (kvm_is_radix(kvm))
|
||||
return H_TOO_HARD;
|
||||
|
||||
/* Check for invalid flags (H_PAGE_SET_LOANED covers all CMO flags) */
|
||||
if (flags & ~(H_ICACHE_INVALIDATE | H_ICACHE_SYNCHRONIZE |
|
||||
H_ZERO_PAGE | H_COPY_PAGE | H_PAGE_SET_LOANED))
|
||||
return H_PARAMETER;
|
||||
|
||||
/* dest (and src if copy_page flag set) must be page aligned */
|
||||
if ((dest & pg_mask) || ((flags & H_COPY_PAGE) && (src & pg_mask)))
|
||||
return H_PARAMETER;
|
||||
|
||||
/* zero and/or copy the page as determined by the flags */
|
||||
if (flags & H_COPY_PAGE)
|
||||
ret = kvmppc_do_h_page_init_copy(vcpu, dest, src);
|
||||
else if (flags & H_ZERO_PAGE)
|
||||
ret = kvmppc_do_h_page_init_zero(vcpu, dest);
|
||||
|
||||
/* We can ignore the other flags */
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
|
||||
unsigned long pte_index)
|
||||
{
|
||||
|
@ -589,11 +589,8 @@ kvmppc_hv_entry:
|
||||
1:
|
||||
#endif
|
||||
|
||||
/* Use cr7 as an indication of radix mode */
|
||||
ld r5, HSTATE_KVM_VCORE(r13)
|
||||
ld r9, VCORE_KVM(r5) /* pointer to struct kvm */
|
||||
lbz r0, KVM_RADIX(r9)
|
||||
cmpwi cr7, r0, 0
|
||||
|
||||
/*
|
||||
* POWER7/POWER8 host -> guest partition switch code.
|
||||
@ -616,9 +613,6 @@ kvmppc_hv_entry:
|
||||
cmpwi r6,0
|
||||
bne 10f
|
||||
|
||||
/* Radix has already switched LPID and flushed core TLB */
|
||||
bne cr7, 22f
|
||||
|
||||
lwz r7,KVM_LPID(r9)
|
||||
BEGIN_FTR_SECTION
|
||||
ld r6,KVM_SDR1(r9)
|
||||
@ -630,41 +624,13 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
|
||||
mtspr SPRN_LPID,r7
|
||||
isync
|
||||
|
||||
/* See if we need to flush the TLB. Hash has to be done in RM */
|
||||
lhz r6,PACAPACAINDEX(r13) /* test_bit(cpu, need_tlb_flush) */
|
||||
BEGIN_FTR_SECTION
|
||||
/*
|
||||
* On POWER9, individual threads can come in here, but the
|
||||
* TLB is shared between the 4 threads in a core, hence
|
||||
* invalidating on one thread invalidates for all.
|
||||
* Thus we make all 4 threads use the same bit here.
|
||||
*/
|
||||
clrrdi r6,r6,2
|
||||
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
|
||||
clrldi r7,r6,64-6 /* extract bit number (6 bits) */
|
||||
srdi r6,r6,6 /* doubleword number */
|
||||
sldi r6,r6,3 /* address offset */
|
||||
add r6,r6,r9
|
||||
addi r6,r6,KVM_NEED_FLUSH /* dword in kvm->arch.need_tlb_flush */
|
||||
li r8,1
|
||||
sld r8,r8,r7
|
||||
ld r7,0(r6)
|
||||
and. r7,r7,r8
|
||||
beq 22f
|
||||
/* Flush the TLB of any entries for this LPID */
|
||||
lwz r0,KVM_TLB_SETS(r9)
|
||||
mtctr r0
|
||||
li r7,0x800 /* IS field = 0b10 */
|
||||
ptesync
|
||||
li r0,0 /* RS for P9 version of tlbiel */
|
||||
28: tlbiel r7 /* On P9, rs=0, RIC=0, PRS=0, R=0 */
|
||||
addi r7,r7,0x1000
|
||||
bdnz 28b
|
||||
ptesync
|
||||
23: ldarx r7,0,r6 /* clear the bit after TLB flushed */
|
||||
andc r7,r7,r8
|
||||
stdcx. r7,0,r6
|
||||
bne 23b
|
||||
/* See if we need to flush the TLB. */
|
||||
mr r3, r9 /* kvm pointer */
|
||||
lhz r4, PACAPACAINDEX(r13) /* physical cpu number */
|
||||
li r5, 0 /* nested vcpu pointer */
|
||||
bl kvmppc_check_need_tlb_flush
|
||||
nop
|
||||
ld r5, HSTATE_KVM_VCORE(r13)
|
||||
|
||||
/* Add timebase offset onto timebase */
|
||||
22: ld r8,VCORE_TB_OFFSET(r5)
|
||||
@ -980,17 +946,27 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
|
||||
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
/* We are entering the guest on that thread, push VCPU to XIVE */
|
||||
ld r10, HSTATE_XIVE_TIMA_PHYS(r13)
|
||||
cmpldi cr0, r10, 0
|
||||
beq no_xive
|
||||
ld r11, VCPU_XIVE_SAVED_STATE(r4)
|
||||
li r9, TM_QW1_OS
|
||||
lwz r8, VCPU_XIVE_CAM_WORD(r4)
|
||||
li r7, TM_QW1_OS + TM_WORD2
|
||||
mfmsr r0
|
||||
andi. r0, r0, MSR_DR /* in real mode? */
|
||||
beq 2f
|
||||
ld r10, HSTATE_XIVE_TIMA_VIRT(r13)
|
||||
cmpldi cr1, r10, 0
|
||||
beq cr1, no_xive
|
||||
eieio
|
||||
stdx r11,r9,r10
|
||||
stwx r8,r7,r10
|
||||
b 3f
|
||||
2: ld r10, HSTATE_XIVE_TIMA_PHYS(r13)
|
||||
cmpldi cr1, r10, 0
|
||||
beq cr1, no_xive
|
||||
eieio
|
||||
stdcix r11,r9,r10
|
||||
lwz r11, VCPU_XIVE_CAM_WORD(r4)
|
||||
li r9, TM_QW1_OS + TM_WORD2
|
||||
stwcix r11,r9,r10
|
||||
li r9, 1
|
||||
stwcix r8,r7,r10
|
||||
3: li r9, 1
|
||||
stb r9, VCPU_XIVE_PUSHED(r4)
|
||||
eieio
|
||||
|
||||
@ -1009,12 +985,16 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
|
||||
* on, we mask it.
|
||||
*/
|
||||
lbz r0, VCPU_XIVE_ESC_ON(r4)
|
||||
cmpwi r0,0
|
||||
beq 1f
|
||||
ld r10, VCPU_XIVE_ESC_RADDR(r4)
|
||||
cmpwi cr1, r0,0
|
||||
beq cr1, 1f
|
||||
li r9, XIVE_ESB_SET_PQ_01
|
||||
beq 4f /* in real mode? */
|
||||
ld r10, VCPU_XIVE_ESC_VADDR(r4)
|
||||
ldx r0, r10, r9
|
||||
b 5f
|
||||
4: ld r10, VCPU_XIVE_ESC_RADDR(r4)
|
||||
ldcix r0, r10, r9
|
||||
sync
|
||||
5: sync
|
||||
|
||||
/* We have a possible subtle race here: The escalation interrupt might
|
||||
* have fired and be on its way to the host queue while we mask it,
|
||||
@ -2292,7 +2272,7 @@ hcall_real_table:
|
||||
#endif
|
||||
.long 0 /* 0x24 - H_SET_SPRG0 */
|
||||
.long DOTSYM(kvmppc_h_set_dabr) - hcall_real_table
|
||||
.long 0 /* 0x2c */
|
||||
.long DOTSYM(kvmppc_rm_h_page_init) - hcall_real_table
|
||||
.long 0 /* 0x30 */
|
||||
.long 0 /* 0x34 */
|
||||
.long 0 /* 0x38 */
|
||||
|
@ -166,7 +166,8 @@ static irqreturn_t xive_esc_irq(int irq, void *data)
|
||||
return IRQ_HANDLED;
|
||||
}
|
||||
|
||||
static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
|
||||
int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
|
||||
bool single_escalation)
|
||||
{
|
||||
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
|
||||
struct xive_q *q = &xc->queues[prio];
|
||||
@ -185,7 +186,7 @@ static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
|
||||
return -EIO;
|
||||
}
|
||||
|
||||
if (xc->xive->single_escalation)
|
||||
if (single_escalation)
|
||||
name = kasprintf(GFP_KERNEL, "kvm-%d-%d",
|
||||
vcpu->kvm->arch.lpid, xc->server_num);
|
||||
else
|
||||
@ -217,7 +218,7 @@ static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
|
||||
* interrupt, thus leaving it effectively masked after
|
||||
* it fires once.
|
||||
*/
|
||||
if (xc->xive->single_escalation) {
|
||||
if (single_escalation) {
|
||||
struct irq_data *d = irq_get_irq_data(xc->esc_virq[prio]);
|
||||
struct xive_irq_data *xd = irq_data_get_irq_handler_data(d);
|
||||
|
||||
@ -291,7 +292,8 @@ static int xive_check_provisioning(struct kvm *kvm, u8 prio)
|
||||
continue;
|
||||
rc = xive_provision_queue(vcpu, prio);
|
||||
if (rc == 0 && !xive->single_escalation)
|
||||
xive_attach_escalation(vcpu, prio);
|
||||
kvmppc_xive_attach_escalation(vcpu, prio,
|
||||
xive->single_escalation);
|
||||
if (rc)
|
||||
return rc;
|
||||
}
|
||||
@ -342,7 +344,7 @@ static int xive_try_pick_queue(struct kvm_vcpu *vcpu, u8 prio)
|
||||
return atomic_add_unless(&q->count, 1, max) ? 0 : -EBUSY;
|
||||
}
|
||||
|
||||
static int xive_select_target(struct kvm *kvm, u32 *server, u8 prio)
|
||||
int kvmppc_xive_select_target(struct kvm *kvm, u32 *server, u8 prio)
|
||||
{
|
||||
struct kvm_vcpu *vcpu;
|
||||
int i, rc;
|
||||
@ -380,11 +382,6 @@ static int xive_select_target(struct kvm *kvm, u32 *server, u8 prio)
|
||||
return -EBUSY;
|
||||
}
|
||||
|
||||
static u32 xive_vp(struct kvmppc_xive *xive, u32 server)
|
||||
{
|
||||
return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
|
||||
}
|
||||
|
||||
static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
|
||||
struct kvmppc_xive_src_block *sb,
|
||||
struct kvmppc_xive_irq_state *state)
|
||||
@ -430,8 +427,8 @@ static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
|
||||
*/
|
||||
if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
|
||||
xive_native_configure_irq(hw_num,
|
||||
xive_vp(xive, state->act_server),
|
||||
MASKED, state->number);
|
||||
kvmppc_xive_vp(xive, state->act_server),
|
||||
MASKED, state->number);
|
||||
/* set old_p so we can track if an H_EOI was done */
|
||||
state->old_p = true;
|
||||
state->old_q = false;
|
||||
@ -486,8 +483,8 @@ static void xive_finish_unmask(struct kvmppc_xive *xive,
|
||||
*/
|
||||
if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
|
||||
xive_native_configure_irq(hw_num,
|
||||
xive_vp(xive, state->act_server),
|
||||
state->act_priority, state->number);
|
||||
kvmppc_xive_vp(xive, state->act_server),
|
||||
state->act_priority, state->number);
|
||||
/* If an EOI is needed, do it here */
|
||||
if (!state->old_p)
|
||||
xive_vm_source_eoi(hw_num, xd);
|
||||
@ -535,7 +532,7 @@ static int xive_target_interrupt(struct kvm *kvm,
|
||||
* priority. The count for that new target will have
|
||||
* already been incremented.
|
||||
*/
|
||||
rc = xive_select_target(kvm, &server, prio);
|
||||
rc = kvmppc_xive_select_target(kvm, &server, prio);
|
||||
|
||||
/*
|
||||
* We failed to find a target? Not much we can do
|
||||
@ -563,7 +560,7 @@ static int xive_target_interrupt(struct kvm *kvm,
|
||||
kvmppc_xive_select_irq(state, &hw_num, NULL);
|
||||
|
||||
return xive_native_configure_irq(hw_num,
|
||||
xive_vp(xive, server),
|
||||
kvmppc_xive_vp(xive, server),
|
||||
prio, state->number);
|
||||
}
|
||||
|
||||
@ -849,7 +846,8 @@ int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval)
|
||||
|
||||
/*
|
||||
* We can't update the state of a "pushed" VCPU, but that
|
||||
* shouldn't happen.
|
||||
* shouldn't happen because the vcpu->mutex makes running a
|
||||
* vcpu mutually exclusive with doing one_reg get/set on it.
|
||||
*/
|
||||
if (WARN_ON(vcpu->arch.xive_pushed))
|
||||
return -EIO;
|
||||
@ -940,6 +938,13 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
/* Turn the IPI hard off */
|
||||
xive_vm_esb_load(&state->ipi_data, XIVE_ESB_SET_PQ_01);
|
||||
|
||||
/*
|
||||
* Reset ESB guest mapping. Needed when ESB pages are exposed
|
||||
* to the guest in XIVE native mode
|
||||
*/
|
||||
if (xive->ops && xive->ops->reset_mapped)
|
||||
xive->ops->reset_mapped(kvm, guest_irq);
|
||||
|
||||
/* Grab info about irq */
|
||||
state->pt_number = hw_irq;
|
||||
state->pt_data = irq_data_get_irq_handler_data(host_data);
|
||||
@ -951,7 +956,7 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
* which is fine for a never started interrupt.
|
||||
*/
|
||||
xive_native_configure_irq(hw_irq,
|
||||
xive_vp(xive, state->act_server),
|
||||
kvmppc_xive_vp(xive, state->act_server),
|
||||
state->act_priority, state->number);
|
||||
|
||||
/*
|
||||
@ -1025,9 +1030,17 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
state->pt_number = 0;
|
||||
state->pt_data = NULL;
|
||||
|
||||
/*
|
||||
* Reset ESB guest mapping. Needed when ESB pages are exposed
|
||||
* to the guest in XIVE native mode
|
||||
*/
|
||||
if (xive->ops && xive->ops->reset_mapped) {
|
||||
xive->ops->reset_mapped(kvm, guest_irq);
|
||||
}
|
||||
|
||||
/* Reconfigure the IPI */
|
||||
xive_native_configure_irq(state->ipi_number,
|
||||
xive_vp(xive, state->act_server),
|
||||
kvmppc_xive_vp(xive, state->act_server),
|
||||
state->act_priority, state->number);
|
||||
|
||||
/*
|
||||
@ -1049,7 +1062,7 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvmppc_xive_clr_mapped);
|
||||
|
||||
static void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu)
|
||||
void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
|
||||
struct kvm *kvm = vcpu->kvm;
|
||||
@ -1083,14 +1096,35 @@ static void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu)
|
||||
arch_spin_unlock(&sb->lock);
|
||||
}
|
||||
}
|
||||
|
||||
/* Disable vcpu's escalation interrupt */
|
||||
if (vcpu->arch.xive_esc_on) {
|
||||
__raw_readq((void __iomem *)(vcpu->arch.xive_esc_vaddr +
|
||||
XIVE_ESB_SET_PQ_01));
|
||||
vcpu->arch.xive_esc_on = false;
|
||||
}
|
||||
|
||||
/*
|
||||
* Clear pointers to escalation interrupt ESB.
|
||||
* This is safe because the vcpu->mutex is held, preventing
|
||||
* any other CPU from concurrently executing a KVM_RUN ioctl.
|
||||
*/
|
||||
vcpu->arch.xive_esc_vaddr = 0;
|
||||
vcpu->arch.xive_esc_raddr = 0;
|
||||
}
|
||||
|
||||
void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
|
||||
struct kvmppc_xive *xive = xc->xive;
|
||||
struct kvmppc_xive *xive = vcpu->kvm->arch.xive;
|
||||
int i;
|
||||
|
||||
if (!kvmppc_xics_enabled(vcpu))
|
||||
return;
|
||||
|
||||
if (!xc)
|
||||
return;
|
||||
|
||||
pr_devel("cleanup_vcpu(cpu=%d)\n", xc->server_num);
|
||||
|
||||
/* Ensure no interrupt is still routed to that VP */
|
||||
@ -1129,6 +1163,10 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
/* Free the VP */
|
||||
kfree(xc);
|
||||
|
||||
/* Cleanup the vcpu */
|
||||
vcpu->arch.irq_type = KVMPPC_IRQ_DEFAULT;
|
||||
vcpu->arch.xive_vcpu = NULL;
|
||||
}
|
||||
|
||||
int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
|
||||
@ -1146,7 +1184,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
|
||||
}
|
||||
if (xive->kvm != vcpu->kvm)
|
||||
return -EPERM;
|
||||
if (vcpu->arch.irq_type)
|
||||
if (vcpu->arch.irq_type != KVMPPC_IRQ_DEFAULT)
|
||||
return -EBUSY;
|
||||
if (kvmppc_xive_find_server(vcpu->kvm, cpu)) {
|
||||
pr_devel("Duplicate !\n");
|
||||
@ -1166,7 +1204,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
|
||||
xc->xive = xive;
|
||||
xc->vcpu = vcpu;
|
||||
xc->server_num = cpu;
|
||||
xc->vp_id = xive_vp(xive, cpu);
|
||||
xc->vp_id = kvmppc_xive_vp(xive, cpu);
|
||||
xc->mfrr = 0xff;
|
||||
xc->valid = true;
|
||||
|
||||
@ -1219,7 +1257,8 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
|
||||
if (xive->qmap & (1 << i)) {
|
||||
r = xive_provision_queue(vcpu, i);
|
||||
if (r == 0 && !xive->single_escalation)
|
||||
xive_attach_escalation(vcpu, i);
|
||||
kvmppc_xive_attach_escalation(
|
||||
vcpu, i, xive->single_escalation);
|
||||
if (r)
|
||||
goto bail;
|
||||
} else {
|
||||
@ -1234,7 +1273,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
|
||||
}
|
||||
|
||||
/* If not done above, attach priority 0 escalation */
|
||||
r = xive_attach_escalation(vcpu, 0);
|
||||
r = kvmppc_xive_attach_escalation(vcpu, 0, xive->single_escalation);
|
||||
if (r)
|
||||
goto bail;
|
||||
|
||||
@ -1485,8 +1524,8 @@ static int xive_get_source(struct kvmppc_xive *xive, long irq, u64 addr)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct kvmppc_xive_src_block *xive_create_src_block(struct kvmppc_xive *xive,
|
||||
int irq)
|
||||
struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
|
||||
struct kvmppc_xive *xive, int irq)
|
||||
{
|
||||
struct kvm *kvm = xive->kvm;
|
||||
struct kvmppc_xive_src_block *sb;
|
||||
@ -1509,6 +1548,7 @@ static struct kvmppc_xive_src_block *xive_create_src_block(struct kvmppc_xive *x
|
||||
|
||||
for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
|
||||
sb->irq_state[i].number = (bid << KVMPPC_XICS_ICS_SHIFT) | i;
|
||||
sb->irq_state[i].eisn = 0;
|
||||
sb->irq_state[i].guest_priority = MASKED;
|
||||
sb->irq_state[i].saved_priority = MASKED;
|
||||
sb->irq_state[i].act_priority = MASKED;
|
||||
@ -1565,7 +1605,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr)
|
||||
sb = kvmppc_xive_find_source(xive, irq, &idx);
|
||||
if (!sb) {
|
||||
pr_devel("No source, creating source block...\n");
|
||||
sb = xive_create_src_block(xive, irq);
|
||||
sb = kvmppc_xive_create_src_block(xive, irq);
|
||||
if (!sb) {
|
||||
pr_devel("Failed to create block...\n");
|
||||
return -ENOMEM;
|
||||
@ -1789,7 +1829,7 @@ static void kvmppc_xive_cleanup_irq(u32 hw_num, struct xive_irq_data *xd)
|
||||
xive_cleanup_irq_data(xd);
|
||||
}
|
||||
|
||||
static void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb)
|
||||
void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb)
|
||||
{
|
||||
int i;
|
||||
|
||||
@ -1810,16 +1850,55 @@ static void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb)
|
||||
}
|
||||
}
|
||||
|
||||
static void kvmppc_xive_free(struct kvm_device *dev)
|
||||
/*
|
||||
* Called when device fd is closed. kvm->lock is held.
|
||||
*/
|
||||
static void kvmppc_xive_release(struct kvm_device *dev)
|
||||
{
|
||||
struct kvmppc_xive *xive = dev->private;
|
||||
struct kvm *kvm = xive->kvm;
|
||||
struct kvm_vcpu *vcpu;
|
||||
int i;
|
||||
int was_ready;
|
||||
|
||||
pr_devel("Releasing xive device\n");
|
||||
|
||||
debugfs_remove(xive->dentry);
|
||||
|
||||
if (kvm)
|
||||
kvm->arch.xive = NULL;
|
||||
/*
|
||||
* Clearing mmu_ready temporarily while holding kvm->lock
|
||||
* is a way of ensuring that no vcpus can enter the guest
|
||||
* until we drop kvm->lock. Doing kick_all_cpus_sync()
|
||||
* ensures that any vcpu executing inside the guest has
|
||||
* exited the guest. Once kick_all_cpus_sync() has finished,
|
||||
* we know that no vcpu can be executing the XIVE push or
|
||||
* pull code, or executing a XICS hcall.
|
||||
*
|
||||
* Since this is the device release function, we know that
|
||||
* userspace does not have any open fd referring to the
|
||||
* device. Therefore there can not be any of the device
|
||||
* attribute set/get functions being executed concurrently,
|
||||
* and similarly, the connect_vcpu and set/clr_mapped
|
||||
* functions also cannot be being executed.
|
||||
*/
|
||||
was_ready = kvm->arch.mmu_ready;
|
||||
kvm->arch.mmu_ready = 0;
|
||||
kick_all_cpus_sync();
|
||||
|
||||
/*
|
||||
* We should clean up the vCPU interrupt presenters first.
|
||||
*/
|
||||
kvm_for_each_vcpu(i, vcpu, kvm) {
|
||||
/*
|
||||
* Take vcpu->mutex to ensure that no one_reg get/set ioctl
|
||||
* (i.e. kvmppc_xive_[gs]et_icp) can be done concurrently.
|
||||
*/
|
||||
mutex_lock(&vcpu->mutex);
|
||||
kvmppc_xive_cleanup_vcpu(vcpu);
|
||||
mutex_unlock(&vcpu->mutex);
|
||||
}
|
||||
|
||||
kvm->arch.xive = NULL;
|
||||
|
||||
/* Mask and free interrupts */
|
||||
for (i = 0; i <= xive->max_sbid; i++) {
|
||||
@ -1832,11 +1911,47 @@ static void kvmppc_xive_free(struct kvm_device *dev)
|
||||
if (xive->vp_base != XIVE_INVALID_VP)
|
||||
xive_native_free_vp_block(xive->vp_base);
|
||||
|
||||
kvm->arch.mmu_ready = was_ready;
|
||||
|
||||
/*
|
||||
* A reference of the kvmppc_xive pointer is now kept under
|
||||
* the xive_devices struct of the machine for reuse. It is
|
||||
* freed when the VM is destroyed for now until we fix all the
|
||||
* execution paths.
|
||||
*/
|
||||
|
||||
kfree(xive);
|
||||
kfree(dev);
|
||||
}
|
||||
|
||||
/*
|
||||
* When the guest chooses the interrupt mode (XICS legacy or XIVE
|
||||
* native), the VM will switch KVM devices. The previous device will
|
||||
* be "released" before the new one is created.
|
||||
*
|
||||
* Until we are sure all execution paths are well protected, provide a
|
||||
* fail safe (transitional) method for device destruction, in which
|
||||
* the XIVE device pointer is recycled and not directly freed.
|
||||
*/
|
||||
struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type)
|
||||
{
|
||||
struct kvmppc_xive **kvm_xive_device = type == KVM_DEV_TYPE_XIVE ?
|
||||
&kvm->arch.xive_devices.native :
|
||||
&kvm->arch.xive_devices.xics_on_xive;
|
||||
struct kvmppc_xive *xive = *kvm_xive_device;
|
||||
|
||||
if (!xive) {
|
||||
xive = kzalloc(sizeof(*xive), GFP_KERNEL);
|
||||
*kvm_xive_device = xive;
|
||||
} else {
|
||||
memset(xive, 0, sizeof(*xive));
|
||||
}
|
||||
|
||||
return xive;
|
||||
}
|
||||
|
||||
/*
|
||||
* Create a XICS device with XIVE backend. kvm->lock is held.
|
||||
*/
|
||||
static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
|
||||
{
|
||||
struct kvmppc_xive *xive;
|
||||
@ -1845,7 +1960,7 @@ static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
|
||||
|
||||
pr_devel("Creating xive for partition\n");
|
||||
|
||||
xive = kzalloc(sizeof(*xive), GFP_KERNEL);
|
||||
xive = kvmppc_xive_get_device(kvm, type);
|
||||
if (!xive)
|
||||
return -ENOMEM;
|
||||
|
||||
@ -1883,6 +1998,43 @@ static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
|
||||
return 0;
|
||||
}
|
||||
|
||||
int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
|
||||
struct xive_q *q = &xc->queues[i];
|
||||
u32 i0, i1, idx;
|
||||
|
||||
if (!q->qpage && !xc->esc_virq[i])
|
||||
continue;
|
||||
|
||||
seq_printf(m, " [q%d]: ", i);
|
||||
|
||||
if (q->qpage) {
|
||||
idx = q->idx;
|
||||
i0 = be32_to_cpup(q->qpage + idx);
|
||||
idx = (idx + 1) & q->msk;
|
||||
i1 = be32_to_cpup(q->qpage + idx);
|
||||
seq_printf(m, "T=%d %08x %08x...\n", q->toggle,
|
||||
i0, i1);
|
||||
}
|
||||
if (xc->esc_virq[i]) {
|
||||
struct irq_data *d = irq_get_irq_data(xc->esc_virq[i]);
|
||||
struct xive_irq_data *xd =
|
||||
irq_data_get_irq_handler_data(d);
|
||||
u64 pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
|
||||
|
||||
seq_printf(m, "E:%c%c I(%d:%llx:%llx)",
|
||||
(pq & XIVE_ESB_VAL_P) ? 'P' : 'p',
|
||||
(pq & XIVE_ESB_VAL_Q) ? 'Q' : 'q',
|
||||
xc->esc_virq[i], pq, xd->eoi_page);
|
||||
seq_puts(m, "\n");
|
||||
}
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int xive_debug_show(struct seq_file *m, void *private)
|
||||
{
|
||||
@ -1908,7 +2060,6 @@ static int xive_debug_show(struct seq_file *m, void *private)
|
||||
|
||||
kvm_for_each_vcpu(i, vcpu, kvm) {
|
||||
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
|
||||
unsigned int i;
|
||||
|
||||
if (!xc)
|
||||
continue;
|
||||
@ -1918,33 +2069,8 @@ static int xive_debug_show(struct seq_file *m, void *private)
|
||||
xc->server_num, xc->cppr, xc->hw_cppr,
|
||||
xc->mfrr, xc->pending,
|
||||
xc->stat_rm_h_xirr, xc->stat_vm_h_xirr);
|
||||
for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) {
|
||||
struct xive_q *q = &xc->queues[i];
|
||||
u32 i0, i1, idx;
|
||||
|
||||
if (!q->qpage && !xc->esc_virq[i])
|
||||
continue;
|
||||
|
||||
seq_printf(m, " [q%d]: ", i);
|
||||
|
||||
if (q->qpage) {
|
||||
idx = q->idx;
|
||||
i0 = be32_to_cpup(q->qpage + idx);
|
||||
idx = (idx + 1) & q->msk;
|
||||
i1 = be32_to_cpup(q->qpage + idx);
|
||||
seq_printf(m, "T=%d %08x %08x... \n", q->toggle, i0, i1);
|
||||
}
|
||||
if (xc->esc_virq[i]) {
|
||||
struct irq_data *d = irq_get_irq_data(xc->esc_virq[i]);
|
||||
struct xive_irq_data *xd = irq_data_get_irq_handler_data(d);
|
||||
u64 pq = xive_vm_esb_load(xd, XIVE_ESB_GET);
|
||||
seq_printf(m, "E:%c%c I(%d:%llx:%llx)",
|
||||
(pq & XIVE_ESB_VAL_P) ? 'P' : 'p',
|
||||
(pq & XIVE_ESB_VAL_Q) ? 'Q' : 'q',
|
||||
xc->esc_virq[i], pq, xd->eoi_page);
|
||||
seq_printf(m, "\n");
|
||||
}
|
||||
}
|
||||
kvmppc_xive_debug_show_queues(m, vcpu);
|
||||
|
||||
t_rm_h_xirr += xc->stat_rm_h_xirr;
|
||||
t_rm_h_ipoll += xc->stat_rm_h_ipoll;
|
||||
@ -1999,7 +2125,7 @@ struct kvm_device_ops kvm_xive_ops = {
|
||||
.name = "kvm-xive",
|
||||
.create = kvmppc_xive_create,
|
||||
.init = kvmppc_xive_init,
|
||||
.destroy = kvmppc_xive_free,
|
||||
.release = kvmppc_xive_release,
|
||||
.set_attr = xive_set_attr,
|
||||
.get_attr = xive_get_attr,
|
||||
.has_attr = xive_has_attr,
|
||||
|
@ -12,6 +12,13 @@
|
||||
#ifdef CONFIG_KVM_XICS
|
||||
#include "book3s_xics.h"
|
||||
|
||||
/*
|
||||
* The XIVE Interrupt source numbers are within the range 0 to
|
||||
* KVMPPC_XICS_NR_IRQS.
|
||||
*/
|
||||
#define KVMPPC_XIVE_FIRST_IRQ 0
|
||||
#define KVMPPC_XIVE_NR_IRQS KVMPPC_XICS_NR_IRQS
|
||||
|
||||
/*
|
||||
* State for one guest irq source.
|
||||
*
|
||||
@ -54,6 +61,9 @@ struct kvmppc_xive_irq_state {
|
||||
bool saved_p;
|
||||
bool saved_q;
|
||||
u8 saved_scan_prio;
|
||||
|
||||
/* Xive native */
|
||||
u32 eisn; /* Guest Effective IRQ number */
|
||||
};
|
||||
|
||||
/* Select the "right" interrupt (IPI vs. passthrough) */
|
||||
@ -84,6 +94,11 @@ struct kvmppc_xive_src_block {
|
||||
struct kvmppc_xive_irq_state irq_state[KVMPPC_XICS_IRQ_PER_ICS];
|
||||
};
|
||||
|
||||
struct kvmppc_xive;
|
||||
|
||||
struct kvmppc_xive_ops {
|
||||
int (*reset_mapped)(struct kvm *kvm, unsigned long guest_irq);
|
||||
};
|
||||
|
||||
struct kvmppc_xive {
|
||||
struct kvm *kvm;
|
||||
@ -122,6 +137,10 @@ struct kvmppc_xive {
|
||||
|
||||
/* Flags */
|
||||
u8 single_escalation;
|
||||
|
||||
struct kvmppc_xive_ops *ops;
|
||||
struct address_space *mapping;
|
||||
struct mutex mapping_lock;
|
||||
};
|
||||
|
||||
#define KVMPPC_XIVE_Q_COUNT 8
|
||||
@ -198,6 +217,11 @@ static inline struct kvmppc_xive_src_block *kvmppc_xive_find_source(struct kvmpp
|
||||
return xive->src_blocks[bid];
|
||||
}
|
||||
|
||||
static inline u32 kvmppc_xive_vp(struct kvmppc_xive *xive, u32 server)
|
||||
{
|
||||
return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
|
||||
}
|
||||
|
||||
/*
|
||||
* Mapping between guest priorities and host priorities
|
||||
* is as follow.
|
||||
@ -248,5 +272,18 @@ extern int (*__xive_vm_h_ipi)(struct kvm_vcpu *vcpu, unsigned long server,
|
||||
extern int (*__xive_vm_h_cppr)(struct kvm_vcpu *vcpu, unsigned long cppr);
|
||||
extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
|
||||
|
||||
/*
|
||||
* Common Xive routines for XICS-over-XIVE and XIVE native
|
||||
*/
|
||||
void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu);
|
||||
int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu);
|
||||
struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
|
||||
struct kvmppc_xive *xive, int irq);
|
||||
void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
|
||||
int kvmppc_xive_select_target(struct kvm *kvm, u32 *server, u8 prio);
|
||||
int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
|
||||
bool single_escalation);
|
||||
struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type);
|
||||
|
||||
#endif /* CONFIG_KVM_XICS */
|
||||
#endif /* _KVM_PPC_BOOK3S_XICS_H */
|
||||
|
arch/powerpc/kvm/book3s_xive_native.c (1249 lines; diff suppressed because it is too large)
@ -130,25 +130,15 @@ static u32 GLUE(X_PFX,scan_interrupts)(struct kvmppc_xive_vcpu *xc,
|
||||
*/
|
||||
prio = ffs(pending) - 1;
|
||||
|
||||
/*
|
||||
* If the most favoured prio we found pending is less
|
||||
* favored (or equal) than a pending IPI, we return
|
||||
* the IPI instead.
|
||||
*
|
||||
* Note: If pending was 0 and mfrr is 0xff, we will
|
||||
* not spuriously take an IPI because mfrr cannot
|
||||
* then be smaller than cppr.
|
||||
*/
|
||||
if (prio >= xc->mfrr && xc->mfrr < xc->cppr) {
|
||||
prio = xc->mfrr;
|
||||
hirq = XICS_IPI;
|
||||
/* Don't scan past the guest cppr */
|
||||
if (prio >= xc->cppr || prio > 7) {
|
||||
if (xc->mfrr < xc->cppr) {
|
||||
prio = xc->mfrr;
|
||||
hirq = XICS_IPI;
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
/* Don't scan past the guest cppr */
|
||||
if (prio >= xc->cppr || prio > 7)
|
||||
break;
|
||||
|
||||
/* Grab queue and pointers */
|
||||
q = &xc->queues[prio];
|
||||
idx = q->idx;
|
||||
@ -184,9 +174,12 @@ skip_ipi:
|
||||
* been set and another occurrence of the IPI will trigger.
|
||||
*/
|
||||
if (hirq == XICS_IPI || (prio == 0 && !qpage)) {
|
||||
if (scan_type == scan_fetch)
|
||||
if (scan_type == scan_fetch) {
|
||||
GLUE(X_PFX,source_eoi)(xc->vp_ipi,
|
||||
&xc->vp_ipi_data);
|
||||
q->idx = idx;
|
||||
q->toggle = toggle;
|
||||
}
|
||||
/* Loop back on same queue with updated idx/toggle */
|
||||
#ifdef XIVE_RUNTIME_CHECKS
|
||||
WARN_ON(hirq && hirq != XICS_IPI);
|
||||
@ -199,32 +192,41 @@ skip_ipi:
|
||||
if (hirq == XICS_DUMMY)
|
||||
goto skip_ipi;
|
||||
|
||||
/* Clear the pending bit if the queue is now empty */
|
||||
if (!hirq) {
|
||||
pending &= ~(1 << prio);
|
||||
|
||||
/*
|
||||
* Check if the queue count needs adjusting due to
|
||||
* interrupts being moved away.
|
||||
*/
|
||||
if (atomic_read(&q->pending_count)) {
|
||||
int p = atomic_xchg(&q->pending_count, 0);
|
||||
if (p) {
|
||||
#ifdef XIVE_RUNTIME_CHECKS
|
||||
WARN_ON(p > atomic_read(&q->count));
|
||||
#endif
|
||||
atomic_sub(p, &q->count);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* If the most favoured prio we found pending is less
|
||||
* favored (or equal) than a pending IPI, we return
|
||||
* the IPI instead.
|
||||
*/
|
||||
if (prio >= xc->mfrr && xc->mfrr < xc->cppr) {
|
||||
prio = xc->mfrr;
|
||||
hirq = XICS_IPI;
|
||||
break;
|
||||
}
|
||||
|
||||
/* If fetching, update queue pointers */
|
||||
if (scan_type == scan_fetch) {
|
||||
q->idx = idx;
|
||||
q->toggle = toggle;
|
||||
}
|
||||
|
||||
/* Something found, stop searching */
|
||||
if (hirq)
|
||||
break;
|
||||
|
||||
/* Clear the pending bit on the now empty queue */
|
||||
pending &= ~(1 << prio);
|
||||
|
||||
/*
|
||||
* Check if the queue count needs adjusting due to
|
||||
* interrupts being moved away.
|
||||
*/
|
||||
if (atomic_read(&q->pending_count)) {
|
||||
int p = atomic_xchg(&q->pending_count, 0);
|
||||
if (p) {
|
||||
#ifdef XIVE_RUNTIME_CHECKS
|
||||
WARN_ON(p > atomic_read(&q->count));
|
||||
#endif
|
||||
atomic_sub(p, &q->count);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/* If we are just taking a "peek", do nothing else */
|
||||
|
@ -570,6 +570,16 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
case KVM_CAP_PPC_GET_CPU_CHAR:
|
||||
r = 1;
|
||||
break;
|
||||
#ifdef CONFIG_KVM_XIVE
|
||||
case KVM_CAP_PPC_IRQ_XIVE:
|
||||
/*
|
||||
* We need XIVE to be enabled on the platform (implies
|
||||
* a POWER9 processor) and the PowerNV platform, as
|
||||
* nested is not yet supported.
|
||||
*/
|
||||
r = xive_enabled() && !!cpu_has_feature(CPU_FTR_HVMODE);
|
||||
break;
|
||||
#endif
|
||||
|
||||
case KVM_CAP_PPC_ALLOC_HTAB:
|
||||
r = hv_enabled;
|
||||
@ -644,9 +654,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
else
|
||||
r = num_online_cpus();
|
||||
break;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
r = KVM_USER_MEM_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_MAX_VCPUS:
|
||||
r = KVM_MAX_VCPUS;
|
||||
break;
|
||||
@ -753,6 +760,9 @@ void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
|
||||
else
|
||||
kvmppc_xics_free_icp(vcpu);
|
||||
break;
|
||||
case KVMPPC_IRQ_XIVE:
|
||||
kvmppc_xive_native_cleanup_vcpu(vcpu);
|
||||
break;
|
||||
}
|
||||
|
||||
kvmppc_core_vcpu_free(vcpu);
|
||||
@ -1941,6 +1951,30 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
|
||||
break;
|
||||
}
|
||||
#endif /* CONFIG_KVM_XICS */
|
||||
#ifdef CONFIG_KVM_XIVE
|
||||
case KVM_CAP_PPC_IRQ_XIVE: {
|
||||
struct fd f;
|
||||
struct kvm_device *dev;
|
||||
|
||||
r = -EBADF;
|
||||
f = fdget(cap->args[0]);
|
||||
if (!f.file)
|
||||
break;
|
||||
|
||||
r = -ENXIO;
|
||||
if (!xive_enabled())
|
||||
break;
|
||||
|
||||
r = -EPERM;
|
||||
dev = kvm_device_from_filp(f.file);
|
||||
if (dev)
|
||||
r = kvmppc_xive_native_connect_vcpu(dev, vcpu,
|
||||
cap->args[1]);
|
||||
|
||||
fdput(f);
|
||||
break;
|
||||
}
|
||||
#endif /* CONFIG_KVM_XIVE */
|
||||
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
|
||||
case KVM_CAP_PPC_FWNMI:
|
||||
r = -EINVAL;
|
||||
|
@ -521,6 +521,9 @@ u32 xive_native_default_eq_shift(void)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(xive_native_default_eq_shift);
|
||||
|
||||
unsigned long xive_tima_os;
|
||||
EXPORT_SYMBOL_GPL(xive_tima_os);
|
||||
|
||||
bool __init xive_native_init(void)
|
||||
{
|
||||
struct device_node *np;
|
||||
@ -573,6 +576,14 @@ bool __init xive_native_init(void)
|
||||
for_each_possible_cpu(cpu)
|
||||
kvmppc_set_xive_tima(cpu, r.start, tima);
|
||||
|
||||
/* Resource 2 is OS window */
|
||||
if (of_address_to_resource(np, 2, &r)) {
|
||||
pr_err("Failed to get thread mgmnt area resource\n");
|
||||
return false;
|
||||
}
|
||||
|
||||
xive_tima_os = r.start;
|
||||
|
||||
/* Grab size of provisioning pages */
|
||||
xive_parse_provisioning(np);
|
||||
|
||||
|
@ -28,6 +28,7 @@
|
||||
#define CPACF_KMCTR 0xb92d /* MSA4 */
|
||||
#define CPACF_PRNO 0xb93c /* MSA5 */
|
||||
#define CPACF_KMA 0xb929 /* MSA8 */
|
||||
#define CPACF_KDSA 0xb93a /* MSA9 */
|
||||
|
||||
/*
|
||||
* En/decryption modifier bits
|
||||
|
@ -278,6 +278,7 @@ struct kvm_s390_sie_block {
|
||||
#define ECD_HOSTREGMGMT 0x20000000
|
||||
#define ECD_MEF 0x08000000
|
||||
#define ECD_ETOKENF 0x02000000
|
||||
#define ECD_ECC 0x00200000
|
||||
__u32 ecd; /* 0x01c8 */
|
||||
__u8 reserved1cc[18]; /* 0x01cc */
|
||||
__u64 pp; /* 0x01de */
|
||||
@ -312,6 +313,7 @@ struct kvm_vcpu_stat {
|
||||
u64 halt_successful_poll;
|
||||
u64 halt_attempted_poll;
|
||||
u64 halt_poll_invalid;
|
||||
u64 halt_no_poll_steal;
|
||||
u64 halt_wakeup;
|
||||
u64 instruction_lctl;
|
||||
u64 instruction_lctlg;
|
||||
|
@ -152,7 +152,10 @@ struct kvm_s390_vm_cpu_subfunc {
|
||||
__u8 pcc[16]; /* with MSA4 */
|
||||
__u8 ppno[16]; /* with MSA5 */
|
||||
__u8 kma[16]; /* with MSA8 */
|
||||
__u8 reserved[1808];
|
||||
__u8 kdsa[16]; /* with MSA9 */
|
||||
__u8 sortl[32]; /* with STFLE.150 */
|
||||
__u8 dfltcc[32]; /* with STFLE.151 */
|
||||
__u8 reserved[1728];
|
||||
};
|
||||
|
||||
/* kvm attributes for crypto */
|
||||
|
@ -30,6 +30,7 @@ config KVM
|
||||
select HAVE_KVM_IRQFD
|
||||
select HAVE_KVM_IRQ_ROUTING
|
||||
select HAVE_KVM_INVALID_WAKEUPS
|
||||
select HAVE_KVM_NO_POLL
|
||||
select SRCU
|
||||
select KVM_VFIO
|
||||
---help---
|
||||
|
@ -14,6 +14,7 @@
|
||||
#include <linux/kvm_host.h>
|
||||
#include <linux/hrtimer.h>
|
||||
#include <linux/mmu_context.h>
|
||||
#include <linux/nospec.h>
|
||||
#include <linux/signal.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/bitmap.h>
|
||||
@ -2307,6 +2308,7 @@ static struct s390_io_adapter *get_io_adapter(struct kvm *kvm, unsigned int id)
|
||||
{
|
||||
if (id >= MAX_S390_IO_ADAPTERS)
|
||||
return NULL;
|
||||
id = array_index_nospec(id, MAX_S390_IO_ADAPTERS);
|
||||
return kvm->arch.adapters[id];
|
||||
}
|
||||
|
||||
@ -2320,8 +2322,13 @@ static int register_io_adapter(struct kvm_device *dev,
|
||||
(void __user *)attr->addr, sizeof(adapter_info)))
|
||||
return -EFAULT;
|
||||
|
||||
if ((adapter_info.id >= MAX_S390_IO_ADAPTERS) ||
|
||||
(dev->kvm->arch.adapters[adapter_info.id] != NULL))
|
||||
if (adapter_info.id >= MAX_S390_IO_ADAPTERS)
|
||||
return -EINVAL;
|
||||
|
||||
adapter_info.id = array_index_nospec(adapter_info.id,
|
||||
MAX_S390_IO_ADAPTERS);
|
||||
|
||||
if (dev->kvm->arch.adapters[adapter_info.id] != NULL)
|
||||
return -EINVAL;
|
||||
|
||||
adapter = kzalloc(sizeof(*adapter), GFP_KERNEL);
|
||||
|
@ -75,6 +75,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
|
||||
{ "halt_successful_poll", VCPU_STAT(halt_successful_poll) },
|
||||
{ "halt_attempted_poll", VCPU_STAT(halt_attempted_poll) },
|
||||
{ "halt_poll_invalid", VCPU_STAT(halt_poll_invalid) },
|
||||
{ "halt_no_poll_steal", VCPU_STAT(halt_no_poll_steal) },
|
||||
{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
|
||||
{ "instruction_lctlg", VCPU_STAT(instruction_lctlg) },
|
||||
{ "instruction_lctl", VCPU_STAT(instruction_lctl) },
|
||||
@ -177,6 +178,11 @@ static int hpage;
|
||||
module_param(hpage, int, 0444);
|
||||
MODULE_PARM_DESC(hpage, "1m huge page backing support");
|
||||
|
||||
/* maximum percentage of steal time for polling. >100 is treated like 100 */
|
||||
static u8 halt_poll_max_steal = 10;
|
||||
module_param(halt_poll_max_steal, byte, 0644);
|
||||
MODULE_PARM_DESC(hpage, "Maximum percentage of steal time to allow polling");
|
||||
|
||||
/*
|
||||
* For now we handle at most 16 double words as this is what the s390 base
|
||||
* kernel handles and stores in the prefix page. If we ever need to go beyond
|
||||
@ -321,6 +327,22 @@ static inline int plo_test_bit(unsigned char nr)
|
||||
return cc == 0;
|
||||
}
|
||||
|
||||
static inline void __insn32_query(unsigned int opcode, u8 query[32])
{
	register unsigned long r0 asm("0") = 0;	/* query function */
	register unsigned long r1 asm("1") = (unsigned long) query;

	asm volatile(
		/* Parameter regs are ignored */
		" .insn rrf,%[opc] << 16,2,4,6,0\n"
		: "=m" (*query)
		: "d" (r0), "a" (r1), [opc] "i" (opcode)
		: "cc");
}

#define INSN_SORTL 0xb938
#define INSN_DFLTCC 0xb939
|
||||
|
||||
static void kvm_s390_cpu_feat_init(void)
|
||||
{
|
||||
int i;
|
||||
@ -368,6 +390,16 @@ static void kvm_s390_cpu_feat_init(void)
|
||||
__cpacf_query(CPACF_KMA, (cpacf_mask_t *)
|
||||
kvm_s390_available_subfunc.kma);
|
||||
|
||||
if (test_facility(155)) /* MSA9 */
|
||||
__cpacf_query(CPACF_KDSA, (cpacf_mask_t *)
|
||||
kvm_s390_available_subfunc.kdsa);
|
||||
|
||||
if (test_facility(150)) /* SORTL */
|
||||
__insn32_query(INSN_SORTL, kvm_s390_available_subfunc.sortl);
|
||||
|
||||
if (test_facility(151)) /* DFLTCC */
|
||||
__insn32_query(INSN_DFLTCC, kvm_s390_available_subfunc.dfltcc);
|
||||
|
||||
if (MACHINE_HAS_ESOP)
|
||||
allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
|
||||
/*
|
||||
@ -513,9 +545,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
else if (sclp.has_esca && sclp.has_64bscao)
|
||||
r = KVM_S390_ESCA_CPU_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
r = KVM_USER_MEM_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_S390_COW:
|
||||
r = MACHINE_HAS_ESOP;
|
||||
break;
|
||||
@ -657,6 +686,14 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
|
||||
set_kvm_facility(kvm->arch.model.fac_mask, 135);
|
||||
set_kvm_facility(kvm->arch.model.fac_list, 135);
|
||||
}
|
||||
if (test_facility(148)) {
|
||||
set_kvm_facility(kvm->arch.model.fac_mask, 148);
|
||||
set_kvm_facility(kvm->arch.model.fac_list, 148);
|
||||
}
|
||||
if (test_facility(152)) {
|
||||
set_kvm_facility(kvm->arch.model.fac_mask, 152);
|
||||
set_kvm_facility(kvm->arch.model.fac_list, 152);
|
||||
}
|
||||
r = 0;
|
||||
} else
|
||||
r = -EINVAL;
|
||||
@ -1323,6 +1360,19 @@ static int kvm_s390_set_processor_subfunc(struct kvm *kvm,
|
||||
VM_EVENT(kvm, 3, "SET: guest KMA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kma)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kma)[1]);
|
||||
VM_EVENT(kvm, 3, "SET: guest KDSA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kdsa)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kdsa)[1]);
|
||||
VM_EVENT(kvm, 3, "SET: guest SORTL subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[1],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[2],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[3]);
|
||||
VM_EVENT(kvm, 3, "SET: guest DFLTCC subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[1],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[2],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[3]);
|
||||
|
||||
return 0;
|
||||
}
|
||||
@ -1491,6 +1541,19 @@ static int kvm_s390_get_processor_subfunc(struct kvm *kvm,
|
||||
VM_EVENT(kvm, 3, "GET: guest KMA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kma)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kma)[1]);
|
||||
VM_EVENT(kvm, 3, "GET: guest KDSA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kdsa)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.kdsa)[1]);
|
||||
VM_EVENT(kvm, 3, "GET: guest SORTL subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[1],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[2],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.sortl)[3]);
|
||||
VM_EVENT(kvm, 3, "GET: guest DFLTCC subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[0],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[1],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[2],
|
||||
((unsigned long *) &kvm->arch.model.subfuncs.dfltcc)[3]);
|
||||
|
||||
return 0;
|
||||
}
|
||||
@ -1546,6 +1609,19 @@ static int kvm_s390_get_machine_subfunc(struct kvm *kvm,
|
||||
VM_EVENT(kvm, 3, "GET: host KMA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm_s390_available_subfunc.kma)[0],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.kma)[1]);
|
||||
VM_EVENT(kvm, 3, "GET: host KDSA subfunc 0x%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm_s390_available_subfunc.kdsa)[0],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.kdsa)[1]);
|
||||
VM_EVENT(kvm, 3, "GET: host SORTL subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm_s390_available_subfunc.sortl)[0],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.sortl)[1],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.sortl)[2],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.sortl)[3]);
|
||||
VM_EVENT(kvm, 3, "GET: host DFLTCC subfunc 0x%16.16lx.%16.16lx.%16.16lx.%16.16lx",
|
||||
((unsigned long *) &kvm_s390_available_subfunc.dfltcc)[0],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.dfltcc)[1],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.dfltcc)[2],
|
||||
((unsigned long *) &kvm_s390_available_subfunc.dfltcc)[3]);
|
||||
|
||||
return 0;
|
||||
}
|
||||
@ -2817,6 +2893,25 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
|
||||
vcpu->arch.enabled_gmap = vcpu->arch.gmap;
|
||||
}
|
||||
|
||||
static bool kvm_has_pckmo_subfunc(struct kvm *kvm, unsigned long nr)
|
||||
{
|
||||
if (test_bit_inv(nr, (unsigned long *)&kvm->arch.model.subfuncs.pckmo) &&
|
||||
test_bit_inv(nr, (unsigned long *)&kvm_s390_available_subfunc.pckmo))
|
||||
return true;
|
||||
return false;
|
||||
}
|
||||
|
||||
static bool kvm_has_pckmo_ecc(struct kvm *kvm)
|
||||
{
|
||||
/* At least one ECC subfunction must be present */
|
||||
return kvm_has_pckmo_subfunc(kvm, 32) ||
|
||||
kvm_has_pckmo_subfunc(kvm, 33) ||
|
||||
kvm_has_pckmo_subfunc(kvm, 34) ||
|
||||
kvm_has_pckmo_subfunc(kvm, 40) ||
|
||||
kvm_has_pckmo_subfunc(kvm, 41);
|
||||
|
||||
}
|
||||
|
||||
static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
/*
|
||||
@ -2829,13 +2924,19 @@ static void kvm_s390_vcpu_crypto_setup(struct kvm_vcpu *vcpu)
|
||||
vcpu->arch.sie_block->crycbd = vcpu->kvm->arch.crypto.crycbd;
|
||||
vcpu->arch.sie_block->ecb3 &= ~(ECB3_AES | ECB3_DEA);
|
||||
vcpu->arch.sie_block->eca &= ~ECA_APIE;
|
||||
vcpu->arch.sie_block->ecd &= ~ECD_ECC;
|
||||
|
||||
if (vcpu->kvm->arch.crypto.apie)
|
||||
vcpu->arch.sie_block->eca |= ECA_APIE;
|
||||
|
||||
/* Set up protected key support */
|
||||
if (vcpu->kvm->arch.crypto.aes_kw)
|
||||
if (vcpu->kvm->arch.crypto.aes_kw) {
|
||||
vcpu->arch.sie_block->ecb3 |= ECB3_AES;
|
||||
/* ecc is also wrapped with AES key */
|
||||
if (kvm_has_pckmo_ecc(vcpu->kvm))
|
||||
vcpu->arch.sie_block->ecd |= ECD_ECC;
|
||||
}
|
||||
|
||||
if (vcpu->kvm->arch.crypto.dea_kw)
|
||||
vcpu->arch.sie_block->ecb3 |= ECB3_DEA;
|
||||
}
|
||||
@ -3068,6 +3169,17 @@ static void kvm_gmap_notifier(struct gmap *gmap, unsigned long start,
|
||||
}
|
||||
}
|
||||
|
||||
bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
{
	/* do not poll with more than halt_poll_max_steal percent of steal time */
	if (S390_lowcore.avg_steal_timer * 100 / (TICK_USEC << 12) >=
	    halt_poll_max_steal) {
		vcpu->stat.halt_no_poll_steal++;
		return true;
	}
	return false;
}
|
||||
|
||||
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
/* kvm common code refers to this, but never calls it */
|
||||
|
@ -288,7 +288,9 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
const u32 crycb_addr = crycbd_o & 0x7ffffff8U;
|
||||
unsigned long *b1, *b2;
|
||||
u8 ecb3_flags;
|
||||
u32 ecd_flags;
|
||||
int apie_h;
|
||||
int apie_s;
|
||||
int key_msk = test_kvm_facility(vcpu->kvm, 76);
|
||||
int fmt_o = crycbd_o & CRYCB_FORMAT_MASK;
|
||||
int fmt_h = vcpu->arch.sie_block->crycbd & CRYCB_FORMAT_MASK;
|
||||
@ -297,7 +299,8 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
scb_s->crycbd = 0;
|
||||
|
||||
apie_h = vcpu->arch.sie_block->eca & ECA_APIE;
|
||||
if (!apie_h && (!key_msk || fmt_o == CRYCB_FORMAT0))
|
||||
apie_s = apie_h & scb_o->eca;
|
||||
if (!apie_s && (!key_msk || (fmt_o == CRYCB_FORMAT0)))
|
||||
return 0;
|
||||
|
||||
if (!crycb_addr)
|
||||
@ -308,7 +311,7 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
((crycb_addr + 128) & PAGE_MASK))
|
||||
return set_validity_icpt(scb_s, 0x003CU);
|
||||
|
||||
if (apie_h && (scb_o->eca & ECA_APIE)) {
|
||||
if (apie_s) {
|
||||
ret = setup_apcb(vcpu, &vsie_page->crycb, crycb_addr,
|
||||
vcpu->kvm->arch.crypto.crycb,
|
||||
fmt_o, fmt_h);
|
||||
@ -320,7 +323,8 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
/* we may only allow it if enabled for guest 2 */
|
||||
ecb3_flags = scb_o->ecb3 & vcpu->arch.sie_block->ecb3 &
|
||||
(ECB3_AES | ECB3_DEA);
|
||||
if (!ecb3_flags)
|
||||
ecd_flags = scb_o->ecd & vcpu->arch.sie_block->ecd & ECD_ECC;
|
||||
if (!ecb3_flags && !ecd_flags)
|
||||
goto end;
|
||||
|
||||
/* copy only the wrapping keys */
|
||||
@ -329,6 +333,7 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
return set_validity_icpt(scb_s, 0x0035U);
|
||||
|
||||
scb_s->ecb3 |= ecb3_flags;
|
||||
scb_s->ecd |= ecd_flags;
|
||||
|
||||
/* xor both blocks in one run */
|
||||
b1 = (unsigned long *) vsie_page->crycb.dea_wrapping_key_mask;
|
||||
@ -339,7 +344,7 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
|
||||
end:
|
||||
switch (ret) {
|
||||
case -EINVAL:
|
||||
return set_validity_icpt(scb_s, 0x0020U);
|
||||
return set_validity_icpt(scb_s, 0x0022U);
|
||||
case -EFAULT:
|
||||
return set_validity_icpt(scb_s, 0x0035U);
|
||||
case -EACCES:
|
||||
|
@ -93,6 +93,9 @@ static struct facility_def facility_defs[] = {
|
||||
131, /* enhanced-SOP 2 and side-effect */
|
||||
139, /* multiple epoch facility */
|
||||
146, /* msa extension 8 */
|
||||
150, /* enhanced sort */
|
||||
151, /* deflate conversion */
|
||||
155, /* msa extension 9 */
|
||||
-1 /* END */
|
||||
}
|
||||
},
|
||||
|
@ -2384,7 +2384,11 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
|
||||
*/
|
||||
if (__test_and_clear_bit(55, (unsigned long *)&status)) {
|
||||
handled++;
|
||||
intel_pt_interrupt();
|
||||
if (unlikely(perf_guest_cbs && perf_guest_cbs->is_in_guest() &&
|
||||
perf_guest_cbs->handle_intel_pt_intr))
|
||||
perf_guest_cbs->handle_intel_pt_intr();
|
||||
else
|
||||
intel_pt_interrupt();
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -10,6 +10,7 @@ extern struct e820_table *e820_table_firmware;
|
||||
|
||||
extern unsigned long pci_mem_start;
|
||||
|
||||
extern bool e820__mapped_raw_any(u64 start, u64 end, enum e820_type type);
|
||||
extern bool e820__mapped_any(u64 start, u64 end, enum e820_type type);
|
||||
extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
|
||||
|
||||
|
@ -470,6 +470,7 @@ struct kvm_pmu {
|
||||
u64 global_ovf_ctrl;
|
||||
u64 counter_bitmask[2];
|
||||
u64 global_ctrl_mask;
|
||||
u64 global_ovf_ctrl_mask;
|
||||
u64 reserved_bits;
|
||||
u8 version;
|
||||
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
|
||||
@ -781,6 +782,9 @@ struct kvm_vcpu_arch {
|
||||
|
||||
/* Flush the L1 Data cache for L1TF mitigation on VMENTER */
|
||||
bool l1tf_flush_l1d;
|
||||
|
||||
/* AMD MSRC001_0015 Hardware Configuration */
|
||||
u64 msr_hwcr;
|
||||
};
|
||||
|
||||
struct kvm_lpage_info {
|
||||
@ -1168,7 +1172,8 @@ struct kvm_x86_ops {
|
||||
uint32_t guest_irq, bool set);
|
||||
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
|
||||
|
||||
int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc);
|
||||
int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
|
||||
bool *expired);
|
||||
void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
|
||||
|
||||
void (*setup_mce)(struct kvm_vcpu *vcpu);
|
||||
|
@ -789,6 +789,14 @@
|
||||
#define MSR_CORE_PERF_GLOBAL_CTRL 0x0000038f
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x00000390
|
||||
|
||||
/* PERF_GLOBAL_OVF_CTL bits */
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI_BIT 55
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI (1ULL << MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI_BIT)
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_OVF_BUF_BIT 62
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_OVF_BUF (1ULL << MSR_CORE_PERF_GLOBAL_OVF_CTRL_OVF_BUF_BIT)
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_COND_CHGD_BIT 63
|
||||
#define MSR_CORE_PERF_GLOBAL_OVF_CTRL_COND_CHGD (1ULL << MSR_CORE_PERF_GLOBAL_OVF_CTRL_COND_CHGD_BIT)
|
||||
|
||||
/* Geode defined MSRs */
|
||||
#define MSR_GEODE_BUSCONT_CONF0 0x00001900
|
||||
|
||||
|
@ -73,12 +73,13 @@ EXPORT_SYMBOL(pci_mem_start);
|
||||
* This function checks if any part of the range <start,end> is mapped
|
||||
* with type.
|
||||
*/
|
||||
bool e820__mapped_any(u64 start, u64 end, enum e820_type type)
|
||||
static bool _e820__mapped_any(struct e820_table *table,
|
||||
u64 start, u64 end, enum e820_type type)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < e820_table->nr_entries; i++) {
|
||||
struct e820_entry *entry = &e820_table->entries[i];
|
||||
for (i = 0; i < table->nr_entries; i++) {
|
||||
struct e820_entry *entry = &table->entries[i];
|
||||
|
||||
if (type && entry->type != type)
|
||||
continue;
|
||||
@ -88,6 +89,17 @@ bool e820__mapped_any(u64 start, u64 end, enum e820_type type)
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
bool e820__mapped_raw_any(u64 start, u64 end, enum e820_type type)
|
||||
{
|
||||
return _e820__mapped_any(e820_table_firmware, start, end, type);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(e820__mapped_raw_any);
|
||||
|
||||
bool e820__mapped_any(u64 start, u64 end, enum e820_type type)
|
||||
{
|
||||
return _e820__mapped_any(e820_table, start, end, type);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(e820__mapped_any);
|
||||
|
||||
/*
|
||||
|
@ -963,13 +963,13 @@ int kvm_emulate_cpuid(struct kvm_vcpu *vcpu)
|
||||
if (cpuid_fault_enabled(vcpu) && !kvm_require_cpl(vcpu, 0))
|
||||
return 1;
|
||||
|
||||
eax = kvm_register_read(vcpu, VCPU_REGS_RAX);
|
||||
ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
eax = kvm_rax_read(vcpu);
|
||||
ecx = kvm_rcx_read(vcpu);
|
||||
kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx, true);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, eax);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RBX, ebx);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RCX, ecx);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDX, edx);
|
||||
kvm_rax_write(vcpu, eax);
|
||||
kvm_rbx_write(vcpu, ebx);
|
||||
kvm_rcx_write(vcpu, ecx);
|
||||
kvm_rdx_write(vcpu, edx);
|
||||
return kvm_skip_emulated_instruction(vcpu);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvm_emulate_cpuid);
|
||||
|
@ -1535,10 +1535,10 @@ static void kvm_hv_hypercall_set_result(struct kvm_vcpu *vcpu, u64 result)
|
||||
|
||||
longmode = is_64_bit_mode(vcpu);
|
||||
if (longmode)
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, result);
|
||||
kvm_rax_write(vcpu, result);
|
||||
else {
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDX, result >> 32);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, result & 0xffffffff);
|
||||
kvm_rdx_write(vcpu, result >> 32);
|
||||
kvm_rax_write(vcpu, result & 0xffffffff);
|
||||
}
|
||||
}
|
||||
|
||||
@ -1611,18 +1611,18 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
|
||||
longmode = is_64_bit_mode(vcpu);
|
||||
|
||||
if (!longmode) {
|
||||
param = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDX) << 32) |
|
||||
(kvm_register_read(vcpu, VCPU_REGS_RAX) & 0xffffffff);
|
||||
ingpa = ((u64)kvm_register_read(vcpu, VCPU_REGS_RBX) << 32) |
|
||||
(kvm_register_read(vcpu, VCPU_REGS_RCX) & 0xffffffff);
|
||||
outgpa = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDI) << 32) |
|
||||
(kvm_register_read(vcpu, VCPU_REGS_RSI) & 0xffffffff);
|
||||
param = ((u64)kvm_rdx_read(vcpu) << 32) |
|
||||
(kvm_rax_read(vcpu) & 0xffffffff);
|
||||
ingpa = ((u64)kvm_rbx_read(vcpu) << 32) |
|
||||
(kvm_rcx_read(vcpu) & 0xffffffff);
|
||||
outgpa = ((u64)kvm_rdi_read(vcpu) << 32) |
|
||||
(kvm_rsi_read(vcpu) & 0xffffffff);
|
||||
}
|
||||
#ifdef CONFIG_X86_64
|
||||
else {
|
||||
param = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
ingpa = kvm_register_read(vcpu, VCPU_REGS_RDX);
|
||||
outgpa = kvm_register_read(vcpu, VCPU_REGS_R8);
|
||||
param = kvm_rcx_read(vcpu);
|
||||
ingpa = kvm_rdx_read(vcpu);
|
||||
outgpa = kvm_r8_read(vcpu);
|
||||
}
|
||||
#endif
|
||||
|
||||
|
@ -9,6 +9,34 @@
|
||||
(X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR \
|
||||
| X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_PGE)
|
||||
|
||||
#define BUILD_KVM_GPR_ACCESSORS(lname, uname) \
|
||||
static __always_inline unsigned long kvm_##lname##_read(struct kvm_vcpu *vcpu)\
|
||||
{ \
|
||||
return vcpu->arch.regs[VCPU_REGS_##uname]; \
|
||||
} \
|
||||
static __always_inline void kvm_##lname##_write(struct kvm_vcpu *vcpu, \
|
||||
unsigned long val) \
|
||||
{ \
|
||||
vcpu->arch.regs[VCPU_REGS_##uname] = val; \
|
||||
}
|
||||
BUILD_KVM_GPR_ACCESSORS(rax, RAX)
|
||||
BUILD_KVM_GPR_ACCESSORS(rbx, RBX)
|
||||
BUILD_KVM_GPR_ACCESSORS(rcx, RCX)
|
||||
BUILD_KVM_GPR_ACCESSORS(rdx, RDX)
|
||||
BUILD_KVM_GPR_ACCESSORS(rbp, RBP)
|
||||
BUILD_KVM_GPR_ACCESSORS(rsi, RSI)
|
||||
BUILD_KVM_GPR_ACCESSORS(rdi, RDI)
|
||||
#ifdef CONFIG_X86_64
|
||||
BUILD_KVM_GPR_ACCESSORS(r8, R8)
|
||||
BUILD_KVM_GPR_ACCESSORS(r9, R9)
|
||||
BUILD_KVM_GPR_ACCESSORS(r10, R10)
|
||||
BUILD_KVM_GPR_ACCESSORS(r11, R11)
|
||||
BUILD_KVM_GPR_ACCESSORS(r12, R12)
|
||||
BUILD_KVM_GPR_ACCESSORS(r13, R13)
|
||||
BUILD_KVM_GPR_ACCESSORS(r14, R14)
|
||||
BUILD_KVM_GPR_ACCESSORS(r15, R15)
|
||||
#endif
|
||||
|
||||
static inline unsigned long kvm_register_read(struct kvm_vcpu *vcpu,
|
||||
enum kvm_reg reg)
|
||||
{
|
||||
@ -37,6 +65,16 @@ static inline void kvm_rip_write(struct kvm_vcpu *vcpu, unsigned long val)
|
||||
kvm_register_write(vcpu, VCPU_REGS_RIP, val);
|
||||
}
|
||||
|
||||
static inline unsigned long kvm_rsp_read(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return kvm_register_read(vcpu, VCPU_REGS_RSP);
|
||||
}
|
||||
|
||||
static inline void kvm_rsp_write(struct kvm_vcpu *vcpu, unsigned long val)
|
||||
{
|
||||
kvm_register_write(vcpu, VCPU_REGS_RSP, val);
|
||||
}
|
||||
|
||||
static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
|
||||
{
|
||||
might_sleep(); /* on svm */
|
||||
@ -83,8 +121,8 @@ static inline ulong kvm_read_cr4(struct kvm_vcpu *vcpu)
|
||||
|
||||
static inline u64 kvm_read_edx_eax(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
return (kvm_register_read(vcpu, VCPU_REGS_RAX) & -1u)
|
||||
| ((u64)(kvm_register_read(vcpu, VCPU_REGS_RDX) & -1u) << 32);
|
||||
return (kvm_rax_read(vcpu) & -1u)
|
||||
| ((u64)(kvm_rdx_read(vcpu) & -1u) << 32);
|
||||
}
|
||||
|
||||
static inline void enter_guest_mode(struct kvm_vcpu *vcpu)
|
||||
|
@ -1454,7 +1454,7 @@ static void apic_timer_expired(struct kvm_lapic *apic)
|
||||
if (swait_active(q))
|
||||
swake_up_one(q);
|
||||
|
||||
if (apic_lvtt_tscdeadline(apic))
|
||||
if (apic_lvtt_tscdeadline(apic) || ktimer->hv_timer_in_use)
|
||||
ktimer->expired_tscdeadline = ktimer->tscdeadline;
|
||||
}
|
||||
|
||||
@ -1696,37 +1696,42 @@ static void cancel_hv_timer(struct kvm_lapic *apic)
|
||||
static bool start_hv_timer(struct kvm_lapic *apic)
|
||||
{
|
||||
struct kvm_timer *ktimer = &apic->lapic_timer;
|
||||
int r;
|
||||
struct kvm_vcpu *vcpu = apic->vcpu;
|
||||
bool expired;
|
||||
|
||||
WARN_ON(preemptible());
|
||||
if (!kvm_x86_ops->set_hv_timer)
|
||||
return false;
|
||||
|
||||
if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
|
||||
return false;
|
||||
|
||||
if (!ktimer->tscdeadline)
|
||||
return false;
|
||||
|
||||
r = kvm_x86_ops->set_hv_timer(apic->vcpu, ktimer->tscdeadline);
|
||||
if (r < 0)
|
||||
if (kvm_x86_ops->set_hv_timer(vcpu, ktimer->tscdeadline, &expired))
|
||||
return false;
|
||||
|
||||
ktimer->hv_timer_in_use = true;
|
||||
hrtimer_cancel(&ktimer->timer);
|
||||
|
||||
/*
|
||||
* Also recheck ktimer->pending, in case the sw timer triggered in
|
||||
* the window. For periodic timer, leave the hv timer running for
|
||||
* simplicity, and the deadline will be recomputed on the next vmexit.
|
||||
* To simplify handling the periodic timer, leave the hv timer running
|
||||
* even if the deadline timer has expired, i.e. rely on the resulting
|
||||
* VM-Exit to recompute the periodic timer's target expiration.
|
||||
*/
|
||||
if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
|
||||
if (r)
|
||||
if (!apic_lvtt_period(apic)) {
|
||||
/*
|
||||
* Cancel the hv timer if the sw timer fired while the hv timer
|
||||
* was being programmed, or if the hv timer itself expired.
|
||||
*/
|
||||
if (atomic_read(&ktimer->pending)) {
|
||||
cancel_hv_timer(apic);
|
||||
} else if (expired) {
|
||||
apic_timer_expired(apic);
|
||||
return false;
|
||||
cancel_hv_timer(apic);
|
||||
}
|
||||
}
|
||||
|
||||
trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, true);
|
||||
trace_kvm_hv_timer_state(vcpu->vcpu_id, ktimer->hv_timer_in_use);
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
@ -1750,8 +1755,13 @@ static void start_sw_timer(struct kvm_lapic *apic)
|
||||
static void restart_apic_timer(struct kvm_lapic *apic)
|
||||
{
|
||||
preempt_disable();
|
||||
|
||||
if (!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))
|
||||
goto out;
|
||||
|
||||
if (!start_hv_timer(apic))
|
||||
start_sw_timer(apic);
|
||||
out:
|
||||
preempt_enable();
|
||||
}
|
||||
|
||||
|
@@ -44,6 +44,7 @@
#include <asm/page.h>
#include <asm/pat.h>
#include <asm/cmpxchg.h>
#include <asm/e820/api.h>
#include <asm/io.h>
#include <asm/vmx.h>
#include <asm/kvm_page_track.h>
@@ -487,16 +488,24 @@ static void kvm_mmu_reset_all_pte_masks(void)
	 * If the CPU has 46 or less physical address bits, then set an
	 * appropriate mask to guard against L1TF attacks. Otherwise, it is
	 * assumed that the CPU is not vulnerable to L1TF.
	 *
	 * Some Intel CPUs address the L1 cache using more PA bits than are
	 * reported by CPUID. Use the PA width of the L1 cache when possible
	 * to achieve more effective mitigation, e.g. if system RAM overlaps
	 * the most significant bits of legal physical address space.
	 */
	low_phys_bits = boot_cpu_data.x86_phys_bits;
	if (boot_cpu_data.x86_phys_bits <
	shadow_nonpresent_or_rsvd_mask = 0;
	low_phys_bits = boot_cpu_data.x86_cache_bits;
	if (boot_cpu_data.x86_cache_bits <
	    52 - shadow_nonpresent_or_rsvd_mask_len) {
		shadow_nonpresent_or_rsvd_mask =
			rsvd_bits(boot_cpu_data.x86_phys_bits -
			rsvd_bits(boot_cpu_data.x86_cache_bits -
				  shadow_nonpresent_or_rsvd_mask_len,
				  boot_cpu_data.x86_phys_bits - 1);
				  boot_cpu_data.x86_cache_bits - 1);
		low_phys_bits -= shadow_nonpresent_or_rsvd_mask_len;
	}
	} else
		WARN_ON_ONCE(boot_cpu_has_bug(X86_BUG_L1TF));

	shadow_nonpresent_or_rsvd_lower_gfn_mask =
		GENMASK_ULL(low_phys_bits - 1, PAGE_SHIFT);
}
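The mask arithmetic in that hunk is easier to see with concrete numbers. The sketch below redefines rsvd_bits()/GENMASK_ULL()-style helpers so it is self-contained, and assumes a cache PA width of 44 bits and a mask length of 5 (the length used by this file; treat both values as example assumptions, not part of the patch):

    #include <stdint.h>
    #include <stdio.h>

    /* Self-contained equivalents of the kernel's GENMASK_ULL()/rsvd_bits(). */
    static uint64_t genmask_ull(unsigned hi, unsigned lo)
    {
            return (~0ULL >> (63 - hi)) & (~0ULL << lo);
    }
    static uint64_t rsvd_bits(unsigned s, unsigned e)
    {
            return genmask_ull(e, s);
    }

    int main(void)
    {
            unsigned cache_bits = 44;   /* assumed boot_cpu_data.x86_cache_bits */
            unsigned mask_len   = 5;    /* shadow_nonpresent_or_rsvd_mask_len */
            unsigned page_shift = 12;

            uint64_t rsvd = rsvd_bits(cache_bits - mask_len, cache_bits - 1);
            uint64_t low  = genmask_ull(cache_bits - mask_len - 1, page_shift);

            printf("rsvd mask = 0x%016llx\n", (unsigned long long)rsvd); /* bits 39..43 */
            printf("lower gfn = 0x%016llx\n", (unsigned long long)low);  /* bits 12..38 */
            return 0;
    }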
@@ -2892,7 +2901,9 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
			 */
			(!pat_enabled() || pat_pfn_immune_to_uc_mtrr(pfn));

	return true;
	return !e820__mapped_raw_any(pfn_to_hpa(pfn),
				     pfn_to_hpa(pfn + 1) - 1,
				     E820_TYPE_RAM);
}

/* Bits which may be returned by set_spte() */
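For the e820 check above, pfn_to_hpa() is simply pfn << PAGE_SHIFT, so the queried range is the single page [pfn_to_hpa(pfn), pfn_to_hpa(pfn + 1) - 1]. A tiny illustration of that address arithmetic (values here are made up):

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12

    static uint64_t pfn_to_hpa(uint64_t pfn)
    {
            return pfn << PAGE_SHIFT;
    }

    int main(void)
    {
            uint64_t pfn = 0x12345;
            /* Inclusive physical range covered by this one page frame. */
            printf("[%#llx, %#llx]\n",
                   (unsigned long long)pfn_to_hpa(pfn),
                   (unsigned long long)(pfn_to_hpa(pfn + 1) - 1));
            return 0;
    }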
@@ -48,11 +48,6 @@ static bool msr_mtrr_valid(unsigned msr)
	return false;
}

static bool valid_pat_type(unsigned t)
{
	return t < 8 && (1 << t) & 0xf3;	/* 0, 1, 4, 5, 6, 7 */
}

static bool valid_mtrr_type(unsigned t)
{
	return t < 8 && (1 << t) & 0x73;	/* 0, 1, 4, 5, 6 */
@@ -67,10 +62,7 @@ bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data)
		return false;

	if (msr == MSR_IA32_CR_PAT) {
		for (i = 0; i < 8; i++)
			if (!valid_pat_type((data >> (i * 8)) & 0xff))
				return false;
		return true;
		return kvm_pat_valid(data);
	} else if (msr == MSR_MTRRdefType) {
		if (data & ~0xcff)
			return false;
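The semantics that kvm_pat_valid() takes over here are visible in the removed loop: each of the eight IA32_PAT entries must encode one of the memory types 0, 1, 4, 5, 6 or 7. A stand-alone version of that check follows; it mirrors the removed per-byte loop rather than the kernel's actual helper, which may use a branch-free formulation:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* True if every byte of the IA32_PAT value is a valid memory type:
     * UC(0), WC(1), WT(4), WP(5), WB(6) or UC-(7). */
    static bool pat_value_valid(uint64_t data)
    {
            for (int i = 0; i < 8; i++) {
                    unsigned t = (data >> (i * 8)) & 0xff;
                    if (t >= 8 || !((1u << t) & 0xf3))
                            return false;
            }
            return true;
    }

    int main(void)
    {
            printf("%d\n", pat_value_valid(0x0007040600070406ull)); /* 1: all valid */
            printf("%d\n", pat_value_valid(0x0000000000000002ull)); /* 0: type 2 reserved */
            return 0;
    }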
@@ -141,15 +141,35 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
	struct page *page;

	npages = get_user_pages_fast((unsigned long)ptep_user, 1, FOLL_WRITE, &page);
	/* Check if the user is doing something meaningless. */
	if (unlikely(npages != 1))
		return -EFAULT;
	if (likely(npages == 1)) {
		table = kmap_atomic(page);
		ret = CMPXCHG(&table[index], orig_pte, new_pte);
		kunmap_atomic(table);

	table = kmap_atomic(page);
	ret = CMPXCHG(&table[index], orig_pte, new_pte);
	kunmap_atomic(table);
		kvm_release_page_dirty(page);
	} else {
		struct vm_area_struct *vma;
		unsigned long vaddr = (unsigned long)ptep_user & PAGE_MASK;
		unsigned long pfn;
		unsigned long paddr;

	kvm_release_page_dirty(page);
		down_read(&current->mm->mmap_sem);
		vma = find_vma_intersection(current->mm, vaddr, vaddr + PAGE_SIZE);
		if (!vma || !(vma->vm_flags & VM_PFNMAP)) {
			up_read(&current->mm->mmap_sem);
			return -EFAULT;
		}
		pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
		paddr = pfn << PAGE_SHIFT;
		table = memremap(paddr, PAGE_SIZE, MEMREMAP_WB);
		if (!table) {
			up_read(&current->mm->mmap_sem);
			return -EFAULT;
		}
		ret = CMPXCHG(&table[index], orig_pte, new_pte);
		memunmap(table);
		up_read(&current->mm->mmap_sem);
	}

	return (ret != orig_pte);
}
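The VM_PFNMAP fallback in that hunk derives the physical address by hand from the vma's linear pfn mapping. A small illustration of the same arithmetic with assumed example values (none of these numbers come from the patch):

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
            /* Assumed values for a VM_PFNMAP vma. */
            uint64_t vm_start = 0x7f0000000000ull; /* vma->vm_start */
            uint64_t vm_pgoff = 0x100000ull;       /* first pfn mapped by the vma */
            uint64_t vaddr    = 0x7f0000003000ull; /* page-aligned user address */

            uint64_t pfn   = ((vaddr - vm_start) >> PAGE_SHIFT) + vm_pgoff;
            uint64_t paddr = pfn << PAGE_SHIFT;

            printf("pfn=%#llx paddr=%#llx\n",
                   (unsigned long long)pfn, (unsigned long long)paddr);
            return 0;
    }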
@ -2091,7 +2091,7 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
|
||||
init_vmcb(svm);
|
||||
|
||||
kvm_cpuid(vcpu, &eax, &dummy, &dummy, &dummy, true);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDX, eax);
|
||||
kvm_rdx_write(vcpu, eax);
|
||||
|
||||
if (kvm_vcpu_apicv_active(vcpu) && !init_event)
|
||||
avic_update_vapic_bar(svm, APIC_DEFAULT_PHYS_BASE);
|
||||
@ -3071,32 +3071,6 @@ static inline bool nested_svm_nmi(struct vcpu_svm *svm)
|
||||
return false;
|
||||
}
|
||||
|
||||
static void *nested_svm_map(struct vcpu_svm *svm, u64 gpa, struct page **_page)
|
||||
{
|
||||
struct page *page;
|
||||
|
||||
might_sleep();
|
||||
|
||||
page = kvm_vcpu_gfn_to_page(&svm->vcpu, gpa >> PAGE_SHIFT);
|
||||
if (is_error_page(page))
|
||||
goto error;
|
||||
|
||||
*_page = page;
|
||||
|
||||
return kmap(page);
|
||||
|
||||
error:
|
||||
kvm_inject_gp(&svm->vcpu, 0);
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static void nested_svm_unmap(struct page *page)
|
||||
{
|
||||
kunmap(page);
|
||||
kvm_release_page_dirty(page);
|
||||
}
|
||||
|
||||
static int nested_svm_intercept_ioio(struct vcpu_svm *svm)
|
||||
{
|
||||
unsigned port, size, iopm_len;
|
||||
@ -3299,10 +3273,11 @@ static inline void copy_vmcb_control_area(struct vmcb *dst_vmcb, struct vmcb *fr
|
||||
|
||||
static int nested_svm_vmexit(struct vcpu_svm *svm)
|
||||
{
|
||||
int rc;
|
||||
struct vmcb *nested_vmcb;
|
||||
struct vmcb *hsave = svm->nested.hsave;
|
||||
struct vmcb *vmcb = svm->vmcb;
|
||||
struct page *page;
|
||||
struct kvm_host_map map;
|
||||
|
||||
trace_kvm_nested_vmexit_inject(vmcb->control.exit_code,
|
||||
vmcb->control.exit_info_1,
|
||||
@ -3311,9 +3286,14 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
|
||||
vmcb->control.exit_int_info_err,
|
||||
KVM_ISA_SVM);
|
||||
|
||||
nested_vmcb = nested_svm_map(svm, svm->nested.vmcb, &page);
|
||||
if (!nested_vmcb)
|
||||
rc = kvm_vcpu_map(&svm->vcpu, gfn_to_gpa(svm->nested.vmcb), &map);
|
||||
if (rc) {
|
||||
if (rc == -EINVAL)
|
||||
kvm_inject_gp(&svm->vcpu, 0);
|
||||
return 1;
|
||||
}
|
||||
|
||||
nested_vmcb = map.hva;
|
||||
|
||||
/* Exit Guest-Mode */
|
||||
leave_guest_mode(&svm->vcpu);
|
||||
@ -3408,16 +3388,16 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
|
||||
} else {
|
||||
(void)kvm_set_cr3(&svm->vcpu, hsave->save.cr3);
|
||||
}
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RAX, hsave->save.rax);
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RSP, hsave->save.rsp);
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RIP, hsave->save.rip);
|
||||
kvm_rax_write(&svm->vcpu, hsave->save.rax);
|
||||
kvm_rsp_write(&svm->vcpu, hsave->save.rsp);
|
||||
kvm_rip_write(&svm->vcpu, hsave->save.rip);
|
||||
svm->vmcb->save.dr7 = 0;
|
||||
svm->vmcb->save.cpl = 0;
|
||||
svm->vmcb->control.exit_int_info = 0;
|
||||
|
||||
mark_all_dirty(svm->vmcb);
|
||||
|
||||
nested_svm_unmap(page);
|
||||
kvm_vcpu_unmap(&svm->vcpu, &map, true);
|
||||
|
||||
nested_svm_uninit_mmu_context(&svm->vcpu);
|
||||
kvm_mmu_reset_context(&svm->vcpu);
|
||||
@ -3483,7 +3463,7 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
|
||||
}
|
||||
|
||||
static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
|
||||
struct vmcb *nested_vmcb, struct page *page)
|
||||
struct vmcb *nested_vmcb, struct kvm_host_map *map)
|
||||
{
|
||||
if (kvm_get_rflags(&svm->vcpu) & X86_EFLAGS_IF)
|
||||
svm->vcpu.arch.hflags |= HF_HIF_MASK;
|
||||
@ -3516,9 +3496,9 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
|
||||
kvm_mmu_reset_context(&svm->vcpu);
|
||||
|
||||
svm->vmcb->save.cr2 = svm->vcpu.arch.cr2 = nested_vmcb->save.cr2;
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RAX, nested_vmcb->save.rax);
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RSP, nested_vmcb->save.rsp);
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RIP, nested_vmcb->save.rip);
|
||||
kvm_rax_write(&svm->vcpu, nested_vmcb->save.rax);
|
||||
kvm_rsp_write(&svm->vcpu, nested_vmcb->save.rsp);
|
||||
kvm_rip_write(&svm->vcpu, nested_vmcb->save.rip);
|
||||
|
||||
/* In case we don't even reach vcpu_run, the fields are not updated */
|
||||
svm->vmcb->save.rax = nested_vmcb->save.rax;
|
||||
@ -3567,7 +3547,7 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
|
||||
svm->vmcb->control.pause_filter_thresh =
|
||||
nested_vmcb->control.pause_filter_thresh;
|
||||
|
||||
nested_svm_unmap(page);
|
||||
kvm_vcpu_unmap(&svm->vcpu, map, true);
|
||||
|
||||
/* Enter Guest-Mode */
|
||||
enter_guest_mode(&svm->vcpu);
|
||||
@ -3587,17 +3567,23 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
|
||||
|
||||
static bool nested_svm_vmrun(struct vcpu_svm *svm)
|
||||
{
|
||||
int rc;
|
||||
struct vmcb *nested_vmcb;
|
||||
struct vmcb *hsave = svm->nested.hsave;
|
||||
struct vmcb *vmcb = svm->vmcb;
|
||||
struct page *page;
|
||||
struct kvm_host_map map;
|
||||
u64 vmcb_gpa;
|
||||
|
||||
vmcb_gpa = svm->vmcb->save.rax;
|
||||
|
||||
nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, &page);
|
||||
if (!nested_vmcb)
|
||||
rc = kvm_vcpu_map(&svm->vcpu, gfn_to_gpa(vmcb_gpa), &map);
|
||||
if (rc) {
|
||||
if (rc == -EINVAL)
|
||||
kvm_inject_gp(&svm->vcpu, 0);
|
||||
return false;
|
||||
}
|
||||
|
||||
nested_vmcb = map.hva;
|
||||
|
||||
if (!nested_vmcb_checks(nested_vmcb)) {
|
||||
nested_vmcb->control.exit_code = SVM_EXIT_ERR;
|
||||
@ -3605,7 +3591,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
|
||||
nested_vmcb->control.exit_info_1 = 0;
|
||||
nested_vmcb->control.exit_info_2 = 0;
|
||||
|
||||
nested_svm_unmap(page);
|
||||
kvm_vcpu_unmap(&svm->vcpu, &map, true);
|
||||
|
||||
return false;
|
||||
}
|
||||
@ -3649,7 +3635,7 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
|
||||
|
||||
copy_vmcb_control_area(hsave, vmcb);
|
||||
|
||||
enter_svm_guest_mode(svm, vmcb_gpa, nested_vmcb, page);
|
||||
enter_svm_guest_mode(svm, vmcb_gpa, nested_vmcb, &map);
|
||||
|
||||
return true;
|
||||
}
|
||||
@ -3673,21 +3659,26 @@ static void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb)
|
||||
static int vmload_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
struct vmcb *nested_vmcb;
|
||||
struct page *page;
|
||||
struct kvm_host_map map;
|
||||
int ret;
|
||||
|
||||
if (nested_svm_check_permissions(svm))
|
||||
return 1;
|
||||
|
||||
nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, &page);
|
||||
if (!nested_vmcb)
|
||||
ret = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map);
|
||||
if (ret) {
|
||||
if (ret == -EINVAL)
|
||||
kvm_inject_gp(&svm->vcpu, 0);
|
||||
return 1;
|
||||
}
|
||||
|
||||
nested_vmcb = map.hva;
|
||||
|
||||
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
|
||||
ret = kvm_skip_emulated_instruction(&svm->vcpu);
|
||||
|
||||
nested_svm_vmloadsave(nested_vmcb, svm->vmcb);
|
||||
nested_svm_unmap(page);
|
||||
kvm_vcpu_unmap(&svm->vcpu, &map, true);
|
||||
|
||||
return ret;
|
||||
}
|
||||
@ -3695,21 +3686,26 @@ static int vmload_interception(struct vcpu_svm *svm)
|
||||
static int vmsave_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
struct vmcb *nested_vmcb;
|
||||
struct page *page;
|
||||
struct kvm_host_map map;
|
||||
int ret;
|
||||
|
||||
if (nested_svm_check_permissions(svm))
|
||||
return 1;
|
||||
|
||||
nested_vmcb = nested_svm_map(svm, svm->vmcb->save.rax, &page);
|
||||
if (!nested_vmcb)
|
||||
ret = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map);
|
||||
if (ret) {
|
||||
if (ret == -EINVAL)
|
||||
kvm_inject_gp(&svm->vcpu, 0);
|
||||
return 1;
|
||||
}
|
||||
|
||||
nested_vmcb = map.hva;
|
||||
|
||||
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
|
||||
ret = kvm_skip_emulated_instruction(&svm->vcpu);
|
||||
|
||||
nested_svm_vmloadsave(svm->vmcb, nested_vmcb);
|
||||
nested_svm_unmap(page);
|
||||
kvm_vcpu_unmap(&svm->vcpu, &map, true);
|
||||
|
||||
return ret;
|
||||
}
|
||||
@ -3791,11 +3787,11 @@ static int invlpga_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
struct kvm_vcpu *vcpu = &svm->vcpu;
|
||||
|
||||
trace_kvm_invlpga(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RCX),
|
||||
kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
|
||||
trace_kvm_invlpga(svm->vmcb->save.rip, kvm_rcx_read(&svm->vcpu),
|
||||
kvm_rax_read(&svm->vcpu));
|
||||
|
||||
/* Let's treat INVLPGA the same as INVLPG (can be optimized!) */
|
||||
kvm_mmu_invlpg(vcpu, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
|
||||
kvm_mmu_invlpg(vcpu, kvm_rax_read(&svm->vcpu));
|
||||
|
||||
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
|
||||
return kvm_skip_emulated_instruction(&svm->vcpu);
|
||||
@ -3803,7 +3799,7 @@ static int invlpga_interception(struct vcpu_svm *svm)
|
||||
|
||||
static int skinit_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
trace_kvm_skinit(svm->vmcb->save.rip, kvm_register_read(&svm->vcpu, VCPU_REGS_RAX));
|
||||
trace_kvm_skinit(svm->vmcb->save.rip, kvm_rax_read(&svm->vcpu));
|
||||
|
||||
kvm_queue_exception(&svm->vcpu, UD_VECTOR);
|
||||
return 1;
|
||||
@ -3817,7 +3813,7 @@ static int wbinvd_interception(struct vcpu_svm *svm)
|
||||
static int xsetbv_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
u64 new_bv = kvm_read_edx_eax(&svm->vcpu);
|
||||
u32 index = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
|
||||
u32 index = kvm_rcx_read(&svm->vcpu);
|
||||
|
||||
if (kvm_set_xcr(&svm->vcpu, index, new_bv) == 0) {
|
||||
svm->next_rip = kvm_rip_read(&svm->vcpu) + 3;
|
||||
@ -4213,7 +4209,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
|
||||
static int rdmsr_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
|
||||
u32 ecx = kvm_rcx_read(&svm->vcpu);
|
||||
struct msr_data msr_info;
|
||||
|
||||
msr_info.index = ecx;
|
||||
@ -4225,10 +4221,8 @@ static int rdmsr_interception(struct vcpu_svm *svm)
|
||||
} else {
|
||||
trace_kvm_msr_read(ecx, msr_info.data);
|
||||
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RAX,
|
||||
msr_info.data & 0xffffffff);
|
||||
kvm_register_write(&svm->vcpu, VCPU_REGS_RDX,
|
||||
msr_info.data >> 32);
|
||||
kvm_rax_write(&svm->vcpu, msr_info.data & 0xffffffff);
|
||||
kvm_rdx_write(&svm->vcpu, msr_info.data >> 32);
|
||||
svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
|
||||
return kvm_skip_emulated_instruction(&svm->vcpu);
|
||||
}
|
||||
@ -4422,7 +4416,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
|
||||
static int wrmsr_interception(struct vcpu_svm *svm)
|
||||
{
|
||||
struct msr_data msr;
|
||||
u32 ecx = kvm_register_read(&svm->vcpu, VCPU_REGS_RCX);
|
||||
u32 ecx = kvm_rcx_read(&svm->vcpu);
|
||||
u64 data = kvm_read_edx_eax(&svm->vcpu);
|
||||
|
||||
msr.data = data;
|
||||
@ -6236,7 +6230,7 @@ static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
|
||||
{
|
||||
struct vcpu_svm *svm = to_svm(vcpu);
|
||||
struct vmcb *nested_vmcb;
|
||||
struct page *page;
|
||||
struct kvm_host_map map;
|
||||
u64 guest;
|
||||
u64 vmcb;
|
||||
|
||||
@ -6244,10 +6238,10 @@ static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
|
||||
vmcb = GET_SMSTATE(u64, smstate, 0x7ee0);
|
||||
|
||||
if (guest) {
|
||||
nested_vmcb = nested_svm_map(svm, vmcb, &page);
|
||||
if (!nested_vmcb)
|
||||
if (kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb), &map) == -EINVAL)
|
||||
return 1;
|
||||
enter_svm_guest_mode(svm, vmcb, nested_vmcb, page);
|
||||
nested_vmcb = map.hva;
|
||||
enter_svm_guest_mode(svm, vmcb, nested_vmcb, &map);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
@ -2,6 +2,8 @@
|
||||
#ifndef __KVM_X86_VMX_CAPS_H
|
||||
#define __KVM_X86_VMX_CAPS_H
|
||||
|
||||
#include <asm/vmx.h>
|
||||
|
||||
#include "lapic.h"
|
||||
|
||||
extern bool __read_mostly enable_vpid;
|
||||
|
@ -193,10 +193,8 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
|
||||
if (!vmx->nested.hv_evmcs)
|
||||
return;
|
||||
|
||||
kunmap(vmx->nested.hv_evmcs_page);
|
||||
kvm_release_page_dirty(vmx->nested.hv_evmcs_page);
|
||||
kvm_vcpu_unmap(vcpu, &vmx->nested.hv_evmcs_map, true);
|
||||
vmx->nested.hv_evmcs_vmptr = -1ull;
|
||||
vmx->nested.hv_evmcs_page = NULL;
|
||||
vmx->nested.hv_evmcs = NULL;
|
||||
}
|
||||
|
||||
@ -229,16 +227,9 @@ static void free_nested(struct kvm_vcpu *vcpu)
|
||||
kvm_release_page_dirty(vmx->nested.apic_access_page);
|
||||
vmx->nested.apic_access_page = NULL;
|
||||
}
|
||||
if (vmx->nested.virtual_apic_page) {
|
||||
kvm_release_page_dirty(vmx->nested.virtual_apic_page);
|
||||
vmx->nested.virtual_apic_page = NULL;
|
||||
}
|
||||
if (vmx->nested.pi_desc_page) {
|
||||
kunmap(vmx->nested.pi_desc_page);
|
||||
kvm_release_page_dirty(vmx->nested.pi_desc_page);
|
||||
vmx->nested.pi_desc_page = NULL;
|
||||
vmx->nested.pi_desc = NULL;
|
||||
}
|
||||
kvm_vcpu_unmap(vcpu, &vmx->nested.virtual_apic_map, true);
|
||||
kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
|
||||
vmx->nested.pi_desc = NULL;
|
||||
|
||||
kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
|
||||
|
||||
@ -519,39 +510,19 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
int msr;
|
||||
struct page *page;
|
||||
unsigned long *msr_bitmap_l1;
|
||||
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
|
||||
/*
|
||||
* pred_cmd & spec_ctrl are trying to verify two things:
|
||||
*
|
||||
* 1. L0 gave a permission to L1 to actually passthrough the MSR. This
|
||||
* ensures that we do not accidentally generate an L02 MSR bitmap
|
||||
* from the L12 MSR bitmap that is too permissive.
|
||||
* 2. That L1 or L2s have actually used the MSR. This avoids
|
||||
* unnecessarily merging of the bitmap if the MSR is unused. This
|
||||
* works properly because we only update the L01 MSR bitmap lazily.
|
||||
* So even if L0 should pass L1 these MSRs, the L01 bitmap is only
|
||||
* updated to reflect this when L1 (or its L2s) actually write to
|
||||
* the MSR.
|
||||
*/
|
||||
bool pred_cmd = !msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD);
|
||||
bool spec_ctrl = !msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL);
|
||||
struct kvm_host_map *map = &to_vmx(vcpu)->nested.msr_bitmap_map;
|
||||
|
||||
/* Nothing to do if the MSR bitmap is not in use. */
|
||||
if (!cpu_has_vmx_msr_bitmap() ||
|
||||
!nested_cpu_has(vmcs12, CPU_BASED_USE_MSR_BITMAPS))
|
||||
return false;
|
||||
|
||||
if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
|
||||
!pred_cmd && !spec_ctrl)
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->msr_bitmap), map))
|
||||
return false;
|
||||
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->msr_bitmap);
|
||||
if (is_error_page(page))
|
||||
return false;
|
||||
|
||||
msr_bitmap_l1 = (unsigned long *)kmap(page);
|
||||
msr_bitmap_l1 = (unsigned long *)map->hva;
|
||||
|
||||
/*
|
||||
* To keep the control flow simple, pay eight 8-byte writes (sixteen
|
||||
@ -592,20 +563,42 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
|
||||
}
|
||||
}
|
||||
|
||||
if (spec_ctrl)
|
||||
/* KVM unconditionally exposes the FS/GS base MSRs to L1. */
|
||||
nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
|
||||
MSR_FS_BASE, MSR_TYPE_RW);
|
||||
|
||||
nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
|
||||
MSR_GS_BASE, MSR_TYPE_RW);
|
||||
|
||||
nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
|
||||
MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
|
||||
|
||||
/*
|
||||
* Checking the L0->L1 bitmap is trying to verify two things:
|
||||
*
|
||||
* 1. L0 gave a permission to L1 to actually passthrough the MSR. This
|
||||
* ensures that we do not accidentally generate an L02 MSR bitmap
|
||||
* from the L12 MSR bitmap that is too permissive.
|
||||
* 2. That L1 or L2s have actually used the MSR. This avoids
|
||||
* unnecessarily merging of the bitmap if the MSR is unused. This
|
||||
* works properly because we only update the L01 MSR bitmap lazily.
|
||||
* So even if L0 should pass L1 these MSRs, the L01 bitmap is only
|
||||
* updated to reflect this when L1 (or its L2s) actually write to
|
||||
* the MSR.
|
||||
*/
|
||||
if (!msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL))
|
||||
nested_vmx_disable_intercept_for_msr(
|
||||
msr_bitmap_l1, msr_bitmap_l0,
|
||||
MSR_IA32_SPEC_CTRL,
|
||||
MSR_TYPE_R | MSR_TYPE_W);
|
||||
|
||||
if (pred_cmd)
|
||||
if (!msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD))
|
||||
nested_vmx_disable_intercept_for_msr(
|
||||
msr_bitmap_l1, msr_bitmap_l0,
|
||||
MSR_IA32_PRED_CMD,
|
||||
MSR_TYPE_W);
|
||||
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
kvm_vcpu_unmap(vcpu, &to_vmx(vcpu)->nested.msr_bitmap_map, false);
|
||||
|
||||
return true;
|
||||
}
|
||||
@ -613,20 +606,20 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
|
||||
static void nested_cache_shadow_vmcs12(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
struct kvm_host_map map;
|
||||
struct vmcs12 *shadow;
|
||||
struct page *page;
|
||||
|
||||
if (!nested_cpu_has_shadow_vmcs(vmcs12) ||
|
||||
vmcs12->vmcs_link_pointer == -1ull)
|
||||
return;
|
||||
|
||||
shadow = get_shadow_vmcs12(vcpu);
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->vmcs_link_pointer);
|
||||
|
||||
memcpy(shadow, kmap(page), VMCS12_SIZE);
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->vmcs_link_pointer), &map))
|
||||
return;
|
||||
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
memcpy(shadow, map.hva, VMCS12_SIZE);
|
||||
kvm_vcpu_unmap(vcpu, &map, false);
|
||||
}
|
||||
|
||||
static void nested_flush_cached_shadow_vmcs12(struct kvm_vcpu *vcpu,
|
||||
@ -930,7 +923,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
|
||||
if (cr3 != kvm_read_cr3(vcpu) || (!nested_ept && pdptrs_changed(vcpu))) {
|
||||
if (!nested_cr3_valid(vcpu, cr3)) {
|
||||
*entry_failure_code = ENTRY_FAIL_DEFAULT;
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
@ -941,7 +934,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
|
||||
!nested_ept) {
|
||||
if (!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)) {
|
||||
*entry_failure_code = ENTRY_FAIL_PDPTE;
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -1794,13 +1787,11 @@ static int nested_vmx_handle_enlightened_vmptrld(struct kvm_vcpu *vcpu,
|
||||
|
||||
nested_release_evmcs(vcpu);
|
||||
|
||||
vmx->nested.hv_evmcs_page = kvm_vcpu_gpa_to_page(
|
||||
vcpu, assist_page.current_nested_vmcs);
|
||||
|
||||
if (unlikely(is_error_page(vmx->nested.hv_evmcs_page)))
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(assist_page.current_nested_vmcs),
|
||||
&vmx->nested.hv_evmcs_map))
|
||||
return 0;
|
||||
|
||||
vmx->nested.hv_evmcs = kmap(vmx->nested.hv_evmcs_page);
|
||||
vmx->nested.hv_evmcs = vmx->nested.hv_evmcs_map.hva;
|
||||
|
||||
/*
|
||||
* Currently, KVM only supports eVMCS version 1
|
||||
@ -2373,19 +2364,19 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
|
||||
*/
|
||||
if (vmx->emulation_required) {
|
||||
*entry_failure_code = ENTRY_FAIL_DEFAULT;
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/* Shadow page tables on either EPT or shadow page tables. */
|
||||
if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
|
||||
entry_failure_code))
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
|
||||
if (!enable_ept)
|
||||
vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
|
||||
|
||||
kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->guest_rsp);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->guest_rip);
|
||||
kvm_rsp_write(vcpu, vmcs12->guest_rsp);
|
||||
kvm_rip_write(vcpu, vmcs12->guest_rip);
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -2589,11 +2580,19 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Checks related to Host Control Registers and MSRs
|
||||
*/
|
||||
static int nested_check_host_control_regs(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
if (nested_check_vm_execution_controls(vcpu, vmcs12) ||
|
||||
nested_check_vm_exit_controls(vcpu, vmcs12) ||
|
||||
nested_check_vm_entry_controls(vcpu, vmcs12))
|
||||
return -EINVAL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
bool ia32e;
|
||||
|
||||
@ -2606,6 +2605,10 @@ static int nested_check_host_control_regs(struct kvm_vcpu *vcpu,
|
||||
is_noncanonical_address(vmcs12->host_ia32_sysenter_eip, vcpu))
|
||||
return -EINVAL;
|
||||
|
||||
if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PAT) &&
|
||||
!kvm_pat_valid(vmcs12->host_ia32_pat))
|
||||
return -EINVAL;
|
||||
|
||||
/*
|
||||
* If the load IA32_EFER VM-exit control is 1, bits reserved in the
|
||||
* IA32_EFER MSR must be 0 in the field for that register. In addition,
|
||||
@ -2624,6 +2627,32 @@ static int nested_check_host_control_regs(struct kvm_vcpu *vcpu,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int nested_vmx_check_vmcs_link_ptr(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
int r = 0;
|
||||
struct vmcs12 *shadow;
|
||||
struct kvm_host_map map;
|
||||
|
||||
if (vmcs12->vmcs_link_pointer == -1ull)
|
||||
return 0;
|
||||
|
||||
if (!page_address_valid(vcpu, vmcs12->vmcs_link_pointer))
|
||||
return -EINVAL;
|
||||
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->vmcs_link_pointer), &map))
|
||||
return -EINVAL;
|
||||
|
||||
shadow = map.hva;
|
||||
|
||||
if (shadow->hdr.revision_id != VMCS12_REVISION ||
|
||||
shadow->hdr.shadow_vmcs != nested_cpu_has_shadow_vmcs(vmcs12))
|
||||
r = -EINVAL;
|
||||
|
||||
kvm_vcpu_unmap(vcpu, &map, false);
|
||||
return r;
|
||||
}
|
||||
|
||||
/*
|
||||
* Checks related to Guest Non-register State
|
||||
*/
|
||||
@ -2636,53 +2665,9 @@ static int nested_check_guest_non_reg_state(struct vmcs12 *vmcs12)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int nested_vmx_check_vmentry_prereqs(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
if (nested_check_vm_execution_controls(vcpu, vmcs12) ||
|
||||
nested_check_vm_exit_controls(vcpu, vmcs12) ||
|
||||
nested_check_vm_entry_controls(vcpu, vmcs12))
|
||||
return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
|
||||
|
||||
if (nested_check_host_control_regs(vcpu, vmcs12))
|
||||
return VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;
|
||||
|
||||
if (nested_check_guest_non_reg_state(vmcs12))
|
||||
return VMXERR_ENTRY_INVALID_CONTROL_FIELD;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int nested_vmx_check_vmcs_link_ptr(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
int r;
|
||||
struct page *page;
|
||||
struct vmcs12 *shadow;
|
||||
|
||||
if (vmcs12->vmcs_link_pointer == -1ull)
|
||||
return 0;
|
||||
|
||||
if (!page_address_valid(vcpu, vmcs12->vmcs_link_pointer))
|
||||
return -EINVAL;
|
||||
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->vmcs_link_pointer);
|
||||
if (is_error_page(page))
|
||||
return -EINVAL;
|
||||
|
||||
r = 0;
|
||||
shadow = kmap(page);
|
||||
if (shadow->hdr.revision_id != VMCS12_REVISION ||
|
||||
shadow->hdr.shadow_vmcs != nested_cpu_has_shadow_vmcs(vmcs12))
|
||||
r = -EINVAL;
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
return r;
|
||||
}
|
||||
|
||||
static int nested_vmx_check_vmentry_postreqs(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12,
|
||||
u32 *exit_qual)
|
||||
static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12,
|
||||
u32 *exit_qual)
|
||||
{
|
||||
bool ia32e;
|
||||
|
||||
@ -2690,11 +2675,15 @@ static int nested_vmx_check_vmentry_postreqs(struct kvm_vcpu *vcpu,
|
||||
|
||||
if (!nested_guest_cr0_valid(vcpu, vmcs12->guest_cr0) ||
|
||||
!nested_guest_cr4_valid(vcpu, vmcs12->guest_cr4))
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
|
||||
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) &&
|
||||
!kvm_pat_valid(vmcs12->guest_ia32_pat))
|
||||
return -EINVAL;
|
||||
|
||||
if (nested_vmx_check_vmcs_link_ptr(vcpu, vmcs12)) {
|
||||
*exit_qual = ENTRY_FAIL_VMCS_LINK_PTR;
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
@ -2713,13 +2702,16 @@ static int nested_vmx_check_vmentry_postreqs(struct kvm_vcpu *vcpu,
|
||||
ia32e != !!(vmcs12->guest_ia32_efer & EFER_LMA) ||
|
||||
((vmcs12->guest_cr0 & X86_CR0_PG) &&
|
||||
ia32e != !!(vmcs12->guest_ia32_efer & EFER_LME)))
|
||||
return 1;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS) &&
|
||||
(is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
|
||||
(vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))
|
||||
return 1;
|
||||
(is_noncanonical_address(vmcs12->guest_bndcfgs & PAGE_MASK, vcpu) ||
|
||||
(vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD)))
|
||||
return -EINVAL;
|
||||
|
||||
if (nested_check_guest_non_reg_state(vmcs12))
|
||||
return -EINVAL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
@ -2832,6 +2824,7 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
struct kvm_host_map *map;
|
||||
struct page *page;
|
||||
u64 hpa;
|
||||
|
||||
@ -2864,20 +2857,14 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
|
||||
if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW)) {
|
||||
if (vmx->nested.virtual_apic_page) { /* shouldn't happen */
|
||||
kvm_release_page_dirty(vmx->nested.virtual_apic_page);
|
||||
vmx->nested.virtual_apic_page = NULL;
|
||||
}
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->virtual_apic_page_addr);
|
||||
map = &vmx->nested.virtual_apic_map;
|
||||
|
||||
/*
|
||||
* If translation failed, VM entry will fail because
|
||||
* prepare_vmcs02 set VIRTUAL_APIC_PAGE_ADDR to -1ull.
|
||||
*/
|
||||
if (!is_error_page(page)) {
|
||||
vmx->nested.virtual_apic_page = page;
|
||||
hpa = page_to_phys(vmx->nested.virtual_apic_page);
|
||||
vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, hpa);
|
||||
if (!kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->virtual_apic_page_addr), map)) {
|
||||
vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, pfn_to_hpa(map->pfn));
|
||||
} else if (nested_cpu_has(vmcs12, CPU_BASED_CR8_LOAD_EXITING) &&
|
||||
nested_cpu_has(vmcs12, CPU_BASED_CR8_STORE_EXITING) &&
|
||||
!nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
|
||||
@ -2898,26 +2885,15 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
|
||||
if (nested_cpu_has_posted_intr(vmcs12)) {
|
||||
if (vmx->nested.pi_desc_page) { /* shouldn't happen */
|
||||
kunmap(vmx->nested.pi_desc_page);
|
||||
kvm_release_page_dirty(vmx->nested.pi_desc_page);
|
||||
vmx->nested.pi_desc_page = NULL;
|
||||
vmx->nested.pi_desc = NULL;
|
||||
vmcs_write64(POSTED_INTR_DESC_ADDR, -1ull);
|
||||
map = &vmx->nested.pi_desc_map;
|
||||
|
||||
if (!kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->posted_intr_desc_addr), map)) {
|
||||
vmx->nested.pi_desc =
|
||||
(struct pi_desc *)(((void *)map->hva) +
|
||||
offset_in_page(vmcs12->posted_intr_desc_addr));
|
||||
vmcs_write64(POSTED_INTR_DESC_ADDR,
|
||||
pfn_to_hpa(map->pfn) + offset_in_page(vmcs12->posted_intr_desc_addr));
|
||||
}
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->posted_intr_desc_addr);
|
||||
if (is_error_page(page))
|
||||
return;
|
||||
vmx->nested.pi_desc_page = page;
|
||||
vmx->nested.pi_desc = kmap(vmx->nested.pi_desc_page);
|
||||
vmx->nested.pi_desc =
|
||||
(struct pi_desc *)((void *)vmx->nested.pi_desc +
|
||||
(unsigned long)(vmcs12->posted_intr_desc_addr &
|
||||
(PAGE_SIZE - 1)));
|
||||
vmcs_write64(POSTED_INTR_DESC_ADDR,
|
||||
page_to_phys(vmx->nested.pi_desc_page) +
|
||||
(unsigned long)(vmcs12->posted_intr_desc_addr &
|
||||
(PAGE_SIZE - 1)));
|
||||
}
|
||||
if (nested_vmx_prepare_msr_bitmap(vcpu, vmcs12))
|
||||
vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
|
||||
@ -3000,7 +2976,7 @@ int nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, bool from_vmentry)
|
||||
return -1;
|
||||
}
|
||||
|
||||
if (nested_vmx_check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
|
||||
if (nested_vmx_check_guest_state(vcpu, vmcs12, &exit_qual))
|
||||
goto vmentry_fail_vmexit;
|
||||
}
|
||||
|
||||
@ -3145,9 +3121,11 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
|
||||
launch ? VMXERR_VMLAUNCH_NONCLEAR_VMCS
|
||||
: VMXERR_VMRESUME_NONLAUNCHED_VMCS);
|
||||
|
||||
ret = nested_vmx_check_vmentry_prereqs(vcpu, vmcs12);
|
||||
if (ret)
|
||||
return nested_vmx_failValid(vcpu, ret);
|
||||
if (nested_vmx_check_controls(vcpu, vmcs12))
|
||||
return nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
|
||||
|
||||
if (nested_vmx_check_host_state(vcpu, vmcs12))
|
||||
return nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
|
||||
|
||||
/*
|
||||
* We're finally done with prerequisite checking, and can start with
|
||||
@ -3310,11 +3288,12 @@ static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
|
||||
|
||||
max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
|
||||
if (max_irr != 256) {
|
||||
vapic_page = kmap(vmx->nested.virtual_apic_page);
|
||||
vapic_page = vmx->nested.virtual_apic_map.hva;
|
||||
if (!vapic_page)
|
||||
return;
|
||||
|
||||
__kvm_apic_update_irr(vmx->nested.pi_desc->pir,
|
||||
vapic_page, &max_irr);
|
||||
kunmap(vmx->nested.virtual_apic_page);
|
||||
|
||||
status = vmcs_read16(GUEST_INTR_STATUS);
|
||||
if ((u8)max_irr > ((u8)status & 0xff)) {
|
||||
status &= ~0xff;
|
||||
@ -3425,8 +3404,8 @@ static void sync_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
|
||||
vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
|
||||
vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
|
||||
|
||||
vmcs12->guest_rsp = kvm_register_read(vcpu, VCPU_REGS_RSP);
|
||||
vmcs12->guest_rip = kvm_register_read(vcpu, VCPU_REGS_RIP);
|
||||
vmcs12->guest_rsp = kvm_rsp_read(vcpu);
|
||||
vmcs12->guest_rip = kvm_rip_read(vcpu);
|
||||
vmcs12->guest_rflags = vmcs_readl(GUEST_RFLAGS);
|
||||
|
||||
vmcs12->guest_es_selector = vmcs_read16(GUEST_ES_SELECTOR);
|
||||
@ -3609,8 +3588,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
|
||||
vcpu->arch.efer &= ~(EFER_LMA | EFER_LME);
|
||||
vmx_set_efer(vcpu, vcpu->arch.efer);
|
||||
|
||||
kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->host_rsp);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->host_rip);
|
||||
kvm_rsp_write(vcpu, vmcs12->host_rsp);
|
||||
kvm_rip_write(vcpu, vmcs12->host_rip);
|
||||
vmx_set_rflags(vcpu, X86_EFLAGS_FIXED);
|
||||
vmx_set_interrupt_shadow(vcpu, 0);
|
||||
|
||||
@ -3955,16 +3934,9 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
|
||||
kvm_release_page_dirty(vmx->nested.apic_access_page);
|
||||
vmx->nested.apic_access_page = NULL;
|
||||
}
|
||||
if (vmx->nested.virtual_apic_page) {
|
||||
kvm_release_page_dirty(vmx->nested.virtual_apic_page);
|
||||
vmx->nested.virtual_apic_page = NULL;
|
||||
}
|
||||
if (vmx->nested.pi_desc_page) {
|
||||
kunmap(vmx->nested.pi_desc_page);
|
||||
kvm_release_page_dirty(vmx->nested.pi_desc_page);
|
||||
vmx->nested.pi_desc_page = NULL;
|
||||
vmx->nested.pi_desc = NULL;
|
||||
}
|
||||
kvm_vcpu_unmap(vcpu, &vmx->nested.virtual_apic_map, true);
|
||||
kvm_vcpu_unmap(vcpu, &vmx->nested.pi_desc_map, true);
|
||||
vmx->nested.pi_desc = NULL;
|
||||
|
||||
/*
|
||||
* We are now running in L2, mmu_notifier will force to reload the
|
||||
@ -4260,7 +4232,7 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
int ret;
|
||||
gpa_t vmptr;
|
||||
struct page *page;
|
||||
uint32_t revision;
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
|
||||
| FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
|
||||
@ -4306,21 +4278,13 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
|
||||
* Note - IA32_VMX_BASIC[48] will never be 1 for the nested case;
|
||||
* which replaces physical address width with 32
|
||||
*/
|
||||
if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu)))
|
||||
if (!page_address_valid(vcpu, vmptr))
|
||||
return nested_vmx_failInvalid(vcpu);
|
||||
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmptr);
|
||||
if (is_error_page(page))
|
||||
if (kvm_read_guest(vcpu->kvm, vmptr, &revision, sizeof(revision)) ||
|
||||
revision != VMCS12_REVISION)
|
||||
return nested_vmx_failInvalid(vcpu);
|
||||
|
||||
if (*(u32 *)kmap(page) != VMCS12_REVISION) {
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
return nested_vmx_failInvalid(vcpu);
|
||||
}
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
|
||||
vmx->nested.vmxon_ptr = vmptr;
|
||||
ret = enter_vmx_operation(vcpu);
|
||||
if (ret)
|
||||
@ -4377,7 +4341,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
|
||||
if (nested_vmx_get_vmptr(vcpu, &vmptr))
|
||||
return 1;
|
||||
|
||||
if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu)))
|
||||
if (!page_address_valid(vcpu, vmptr))
|
||||
return nested_vmx_failValid(vcpu,
|
||||
VMXERR_VMCLEAR_INVALID_ADDRESS);
|
||||
|
||||
@ -4385,7 +4349,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
|
||||
return nested_vmx_failValid(vcpu,
|
||||
VMXERR_VMCLEAR_VMXON_POINTER);
|
||||
|
||||
if (vmx->nested.hv_evmcs_page) {
|
||||
if (vmx->nested.hv_evmcs_map.hva) {
|
||||
if (vmptr == vmx->nested.hv_evmcs_vmptr)
|
||||
nested_release_evmcs(vcpu);
|
||||
} else {
|
||||
@ -4584,7 +4548,7 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
|
||||
if (nested_vmx_get_vmptr(vcpu, &vmptr))
|
||||
return 1;
|
||||
|
||||
if (!PAGE_ALIGNED(vmptr) || (vmptr >> cpuid_maxphyaddr(vcpu)))
|
||||
if (!page_address_valid(vcpu, vmptr))
|
||||
return nested_vmx_failValid(vcpu,
|
||||
VMXERR_VMPTRLD_INVALID_ADDRESS);
|
||||
|
||||
@ -4597,11 +4561,10 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
|
||||
return 1;
|
||||
|
||||
if (vmx->nested.current_vmptr != vmptr) {
|
||||
struct kvm_host_map map;
|
||||
struct vmcs12 *new_vmcs12;
|
||||
struct page *page;
|
||||
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmptr);
|
||||
if (is_error_page(page)) {
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmptr), &map)) {
|
||||
/*
|
||||
* Reads from an unbacked page return all 1s,
|
||||
* which means that the 32 bits located at the
|
||||
@ -4611,12 +4574,13 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
|
||||
return nested_vmx_failValid(vcpu,
|
||||
VMXERR_VMPTRLD_INCORRECT_VMCS_REVISION_ID);
|
||||
}
|
||||
new_vmcs12 = kmap(page);
|
||||
|
||||
new_vmcs12 = map.hva;
|
||||
|
||||
if (new_vmcs12->hdr.revision_id != VMCS12_REVISION ||
|
||||
(new_vmcs12->hdr.shadow_vmcs &&
|
||||
!nested_cpu_has_vmx_shadow_vmcs(vcpu))) {
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
kvm_vcpu_unmap(vcpu, &map, false);
|
||||
return nested_vmx_failValid(vcpu,
|
||||
VMXERR_VMPTRLD_INCORRECT_VMCS_REVISION_ID);
|
||||
}
|
||||
@ -4628,8 +4592,7 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
|
||||
* cached.
|
||||
*/
|
||||
memcpy(vmx->nested.cached_vmcs12, new_vmcs12, VMCS12_SIZE);
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
kvm_vcpu_unmap(vcpu, &map, false);
|
||||
|
||||
set_current_vmptr(vmx, vmptr);
|
||||
}
|
||||
@ -4804,7 +4767,7 @@ static int handle_invvpid(struct kvm_vcpu *vcpu)
|
||||
static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12)
|
||||
{
|
||||
u32 index = vcpu->arch.regs[VCPU_REGS_RCX];
|
||||
u32 index = kvm_rcx_read(vcpu);
|
||||
u64 address;
|
||||
bool accessed_dirty;
|
||||
struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
|
||||
@ -4850,7 +4813,7 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
struct vmcs12 *vmcs12;
|
||||
u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
|
||||
u32 function = kvm_rax_read(vcpu);
|
||||
|
||||
/*
|
||||
* VMFUNC is only supported for nested guests, but we always enable the
|
||||
@ -4936,7 +4899,7 @@ static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
|
||||
static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
|
||||
struct vmcs12 *vmcs12, u32 exit_reason)
|
||||
{
|
||||
u32 msr_index = vcpu->arch.regs[VCPU_REGS_RCX];
|
||||
u32 msr_index = kvm_rcx_read(vcpu);
|
||||
gpa_t bitmap;
|
||||
|
||||
if (!nested_cpu_has(vmcs12, CPU_BASED_USE_MSR_BITMAPS))
|
||||
@ -5373,9 +5336,6 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
|
||||
if (kvm_state->format != 0)
|
||||
return -EINVAL;
|
||||
|
||||
if (kvm_state->flags & KVM_STATE_NESTED_EVMCS)
|
||||
nested_enable_evmcs(vcpu, NULL);
|
||||
|
||||
if (!nested_vmx_allowed(vcpu))
|
||||
return kvm_state->vmx.vmxon_pa == -1ull ? 0 : -EINVAL;
|
||||
|
||||
@ -5417,6 +5377,9 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
|
||||
if (kvm_state->vmx.vmxon_pa == -1ull)
|
||||
return 0;
|
||||
|
||||
if (kvm_state->flags & KVM_STATE_NESTED_EVMCS)
|
||||
nested_enable_evmcs(vcpu, NULL);
|
||||
|
||||
vmx->nested.vmxon_ptr = kvm_state->vmx.vmxon_pa;
|
||||
ret = enter_vmx_operation(vcpu);
|
||||
if (ret)
|
||||
@ -5460,9 +5423,6 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
|
||||
if (!(kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
|
||||
return 0;
|
||||
|
||||
vmx->nested.nested_run_pending =
|
||||
!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
|
||||
|
||||
if (nested_cpu_has_shadow_vmcs(vmcs12) &&
|
||||
vmcs12->vmcs_link_pointer != -1ull) {
|
||||
struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
|
||||
@ -5480,14 +5440,20 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
if (nested_vmx_check_vmentry_prereqs(vcpu, vmcs12) ||
|
||||
nested_vmx_check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
|
||||
if (nested_vmx_check_controls(vcpu, vmcs12) ||
|
||||
nested_vmx_check_host_state(vcpu, vmcs12) ||
|
||||
nested_vmx_check_guest_state(vcpu, vmcs12, &exit_qual))
|
||||
return -EINVAL;
|
||||
|
||||
vmx->nested.dirty_vmcs12 = true;
|
||||
vmx->nested.nested_run_pending =
|
||||
!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
|
||||
|
||||
ret = nested_vmx_enter_non_root_mode(vcpu, false);
|
||||
if (ret)
|
||||
if (ret) {
|
||||
vmx->nested.nested_run_pending = 0;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -227,7 +227,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
}
|
||||
break;
|
||||
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
|
||||
if (!(data & (pmu->global_ctrl_mask & ~(3ull<<62)))) {
|
||||
if (!(data & pmu->global_ovf_ctrl_mask)) {
|
||||
if (!msr_info->host_initiated)
|
||||
pmu->global_status &= ~data;
|
||||
pmu->global_ovf_ctrl = data;
|
||||
@ -297,6 +297,12 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
|
||||
pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
|
||||
(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
|
||||
pmu->global_ctrl_mask = ~pmu->global_ctrl;
|
||||
pmu->global_ovf_ctrl_mask = pmu->global_ctrl_mask
|
||||
& ~(MSR_CORE_PERF_GLOBAL_OVF_CTRL_OVF_BUF |
|
||||
MSR_CORE_PERF_GLOBAL_OVF_CTRL_COND_CHGD);
|
||||
if (kvm_x86_ops->pt_supported())
|
||||
pmu->global_ovf_ctrl_mask &=
|
||||
~MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI;
|
||||
|
||||
entry = kvm_find_cpuid_entry(vcpu, 7, 0);
|
||||
if (entry &&
|
||||
|
@ -1692,6 +1692,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
case MSR_IA32_SYSENTER_ESP:
|
||||
msr_info->data = vmcs_readl(GUEST_SYSENTER_ESP);
|
||||
break;
|
||||
case MSR_IA32_POWER_CTL:
|
||||
msr_info->data = vmx->msr_ia32_power_ctl;
|
||||
break;
|
||||
case MSR_IA32_BNDCFGS:
|
||||
if (!kvm_mpx_supported() ||
|
||||
(!msr_info->host_initiated &&
|
||||
@ -1822,6 +1825,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
case MSR_IA32_SYSENTER_ESP:
|
||||
vmcs_writel(GUEST_SYSENTER_ESP, data);
|
||||
break;
|
||||
case MSR_IA32_POWER_CTL:
|
||||
vmx->msr_ia32_power_ctl = data;
|
||||
break;
|
||||
case MSR_IA32_BNDCFGS:
|
||||
if (!kvm_mpx_supported() ||
|
||||
(!msr_info->host_initiated &&
|
||||
@ -1891,7 +1897,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
break;
|
||||
case MSR_IA32_CR_PAT:
|
||||
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
|
||||
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
|
||||
if (!kvm_pat_valid(data))
|
||||
return 1;
|
||||
vmcs_write64(GUEST_IA32_PAT, data);
|
||||
vcpu->arch.pat = data;
|
||||
@ -2288,7 +2294,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
|
||||
min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
|
||||
#endif
|
||||
opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
|
||||
VM_EXIT_SAVE_IA32_PAT |
|
||||
VM_EXIT_LOAD_IA32_PAT |
|
||||
VM_EXIT_LOAD_IA32_EFER |
|
||||
VM_EXIT_CLEAR_BNDCFGS |
|
||||
@ -3619,14 +3624,13 @@ static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
|
||||
|
||||
if (WARN_ON_ONCE(!is_guest_mode(vcpu)) ||
|
||||
!nested_cpu_has_vid(get_vmcs12(vcpu)) ||
|
||||
WARN_ON_ONCE(!vmx->nested.virtual_apic_page))
|
||||
WARN_ON_ONCE(!vmx->nested.virtual_apic_map.gfn))
|
||||
return false;
|
||||
|
||||
rvi = vmx_get_rvi();
|
||||
|
||||
vapic_page = kmap(vmx->nested.virtual_apic_page);
|
||||
vapic_page = vmx->nested.virtual_apic_map.hva;
|
||||
vppr = *((u32 *)(vapic_page + APIC_PROCPRI));
|
||||
kunmap(vmx->nested.virtual_apic_page);
|
||||
|
||||
return ((rvi & 0xf0) > (vppr & 0xf0));
|
||||
}
|
||||
@ -4827,7 +4831,7 @@ static int handle_cpuid(struct kvm_vcpu *vcpu)
|
||||
|
||||
static int handle_rdmsr(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 ecx = vcpu->arch.regs[VCPU_REGS_RCX];
|
||||
u32 ecx = kvm_rcx_read(vcpu);
|
||||
struct msr_data msr_info;
|
||||
|
||||
msr_info.index = ecx;
|
||||
@ -4840,18 +4844,16 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu)
|
||||
|
||||
trace_kvm_msr_read(ecx, msr_info.data);
|
||||
|
||||
/* FIXME: handling of bits 32:63 of rax, rdx */
|
||||
vcpu->arch.regs[VCPU_REGS_RAX] = msr_info.data & -1u;
|
||||
vcpu->arch.regs[VCPU_REGS_RDX] = (msr_info.data >> 32) & -1u;
|
||||
kvm_rax_write(vcpu, msr_info.data & -1u);
|
||||
kvm_rdx_write(vcpu, (msr_info.data >> 32) & -1u);
|
||||
return kvm_skip_emulated_instruction(vcpu);
|
||||
}
|
||||
|
||||
static int handle_wrmsr(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct msr_data msr;
|
||||
u32 ecx = vcpu->arch.regs[VCPU_REGS_RCX];
|
||||
u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
|
||||
| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);
|
||||
u32 ecx = kvm_rcx_read(vcpu);
|
||||
u64 data = kvm_read_edx_eax(vcpu);
|
||||
|
||||
msr.data = data;
|
||||
msr.index = ecx;
|
||||
@ -4922,7 +4924,7 @@ static int handle_wbinvd(struct kvm_vcpu *vcpu)
|
||||
static int handle_xsetbv(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u64 new_bv = kvm_read_edx_eax(vcpu);
|
||||
u32 index = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
u32 index = kvm_rcx_read(vcpu);
|
||||
|
||||
if (kvm_set_xcr(vcpu, index, new_bv) == 0)
|
||||
return kvm_skip_emulated_instruction(vcpu);
|
||||
@ -5723,8 +5725,16 @@ void dump_vmcs(void)
|
||||
if (secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
|
||||
pr_err("TSC Multiplier = 0x%016llx\n",
|
||||
vmcs_read64(TSC_MULTIPLIER));
|
||||
if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW)
|
||||
pr_err("TPR Threshold = 0x%02x\n", vmcs_read32(TPR_THRESHOLD));
|
||||
if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW) {
|
||||
if (secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
|
||||
u16 status = vmcs_read16(GUEST_INTR_STATUS);
|
||||
pr_err("SVI|RVI = %02x|%02x ", status >> 8, status & 0xff);
|
||||
}
|
||||
pr_cont("TPR Threshold = 0x%02x\n", vmcs_read32(TPR_THRESHOLD));
|
||||
if (secondary_exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
|
||||
pr_err("APIC-access addr = 0x%016llx ", vmcs_read64(APIC_ACCESS_ADDR));
|
||||
pr_cont("virt-APIC addr = 0x%016llx\n", vmcs_read64(VIRTUAL_APIC_PAGE_ADDR));
|
||||
}
|
||||
if (pin_based_exec_ctrl & PIN_BASED_POSTED_INTR)
|
||||
pr_err("PostedIntrVec = 0x%02x\n", vmcs_read16(POSTED_INTR_NV));
|
||||
if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT))
|
||||
@ -6856,30 +6866,6 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
}
|
||||
|
||||
static bool guest_cpuid_has_pmu(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct kvm_cpuid_entry2 *entry;
|
||||
union cpuid10_eax eax;
|
||||
|
||||
entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
|
||||
if (!entry)
|
||||
return false;
|
||||
|
||||
eax.full = entry->eax;
|
||||
return (eax.split.version_id > 0);
|
||||
}
|
||||
|
||||
static void nested_vmx_procbased_ctls_update(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
bool pmu_enabled = guest_cpuid_has_pmu(vcpu);
|
||||
|
||||
if (pmu_enabled)
|
||||
vmx->nested.msrs.procbased_ctls_high |= CPU_BASED_RDPMC_EXITING;
|
||||
else
|
||||
vmx->nested.msrs.procbased_ctls_high &= ~CPU_BASED_RDPMC_EXITING;
|
||||
}
|
||||
|
||||
static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
@ -6968,7 +6954,6 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
|
||||
if (nested_vmx_allowed(vcpu)) {
|
||||
nested_vmx_cr_fixed1_bits_update(vcpu);
|
||||
nested_vmx_entry_exit_ctls_update(vcpu);
|
||||
nested_vmx_procbased_ctls_update(vcpu);
|
||||
}
|
||||
|
||||
if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
|
||||
@ -7028,7 +7013,8 @@ static inline int u64_shl_div_u64(u64 a, unsigned int shift,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
|
||||
static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
|
||||
bool *expired)
|
||||
{
|
||||
struct vcpu_vmx *vmx;
|
||||
u64 tscl, guest_tscl, delta_tsc, lapic_timer_advance_cycles;
|
||||
@ -7051,10 +7037,9 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
|
||||
|
||||
/* Convert to host delta tsc if tsc scaling is enabled */
|
||||
if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio &&
|
||||
u64_shl_div_u64(delta_tsc,
|
||||
delta_tsc && u64_shl_div_u64(delta_tsc,
|
||||
kvm_tsc_scaling_ratio_frac_bits,
|
||||
vcpu->arch.tsc_scaling_ratio,
|
||||
&delta_tsc))
|
||||
vcpu->arch.tsc_scaling_ratio, &delta_tsc))
|
||||
return -ERANGE;
|
||||
|
||||
/*
|
||||
@ -7067,7 +7052,8 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
|
||||
return -ERANGE;
|
||||
|
||||
vmx->hv_deadline_tsc = tscl + delta_tsc;
|
||||
return delta_tsc == 0;
|
||||
*expired = !delta_tsc;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
|
||||
@ -7104,9 +7090,7 @@ static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
struct vmcs12 *vmcs12;
|
||||
struct vcpu_vmx *vmx = to_vmx(vcpu);
|
||||
gpa_t gpa;
|
||||
struct page *page = NULL;
|
||||
u64 *pml_address;
|
||||
gpa_t gpa, dst;
|
||||
|
||||
if (is_guest_mode(vcpu)) {
|
||||
WARN_ON_ONCE(vmx->nested.pml_full);
|
||||
@ -7126,15 +7110,13 @@ static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
|
||||
gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS) & ~0xFFFull;
|
||||
dst = vmcs12->pml_address + sizeof(u64) * vmcs12->guest_pml_index;
|
||||
|
||||
page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->pml_address);
|
||||
if (is_error_page(page))
|
||||
if (kvm_write_guest_page(vcpu->kvm, gpa_to_gfn(dst), &gpa,
|
||||
offset_in_page(dst), sizeof(gpa)))
|
||||
return 0;
|
||||
|
||||
pml_address = kmap(page);
|
||||
pml_address[vmcs12->guest_pml_index--] = gpa;
|
||||
kunmap(page);
|
||||
kvm_release_page_clean(page);
|
||||
vmcs12->guest_pml_index--;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
@ -142,8 +142,11 @@ struct nested_vmx {
|
||||
* pointers, so we must keep them pinned while L2 runs.
|
||||
*/
|
||||
struct page *apic_access_page;
|
||||
struct page *virtual_apic_page;
|
||||
struct page *pi_desc_page;
|
||||
struct kvm_host_map virtual_apic_map;
|
||||
struct kvm_host_map pi_desc_map;
|
||||
|
||||
struct kvm_host_map msr_bitmap_map;
|
||||
|
||||
struct pi_desc *pi_desc;
|
||||
bool pi_pending;
|
||||
u16 posted_intr_nv;
|
||||
@ -169,7 +172,7 @@ struct nested_vmx {
|
||||
} smm;
|
||||
|
||||
gpa_t hv_evmcs_vmptr;
|
||||
struct page *hv_evmcs_page;
|
||||
struct kvm_host_map hv_evmcs_map;
|
||||
struct hv_enlightened_vmcs *hv_evmcs;
|
||||
};
|
||||
|
||||
@ -257,6 +260,8 @@ struct vcpu_vmx {
|
||||
|
||||
unsigned long host_debugctlmsr;
|
||||
|
||||
u64 msr_ia32_power_ctl;
|
||||
|
||||
/*
|
||||
* Only bits masked by msr_ia32_feature_control_valid_bits can be set in
|
||||
* msr_ia32_feature_control. FEATURE_CONTROL_LOCKED is always included
|
||||
|
@ -1100,15 +1100,15 @@ EXPORT_SYMBOL_GPL(kvm_get_dr);
|
||||
|
||||
bool kvm_rdpmc(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
u32 ecx = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
u32 ecx = kvm_rcx_read(vcpu);
|
||||
u64 data;
|
||||
int err;
|
||||
|
||||
err = kvm_pmu_rdpmc(vcpu, ecx, &data);
|
||||
if (err)
|
||||
return err;
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, (u32)data);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDX, data >> 32);
|
||||
kvm_rax_write(vcpu, (u32)data);
|
||||
kvm_rdx_write(vcpu, data >> 32);
|
||||
return err;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvm_rdpmc);
|
||||
@ -1174,6 +1174,9 @@ static u32 emulated_msrs[] = {
|
||||
MSR_PLATFORM_INFO,
|
||||
MSR_MISC_FEATURES_ENABLES,
|
||||
MSR_AMD64_VIRT_SPEC_CTRL,
|
||||
MSR_IA32_POWER_CTL,
|
||||
|
||||
MSR_K7_HWCR,
|
||||
};
|
||||
|
||||
static unsigned num_emulated_msrs;
|
||||
@ -1262,31 +1265,49 @@ static int do_get_msr_feature(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
|
||||
{
|
||||
if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
|
||||
return false;
|
||||
|
||||
if (efer & EFER_SVME && !guest_cpuid_has(vcpu, X86_FEATURE_SVM))
|
||||
return false;
|
||||
|
||||
if (efer & (EFER_LME | EFER_LMA) &&
|
||||
!guest_cpuid_has(vcpu, X86_FEATURE_LM))
|
||||
return false;
|
||||
|
||||
if (efer & EFER_NX && !guest_cpuid_has(vcpu, X86_FEATURE_NX))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
|
||||
}
|
||||
bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
|
||||
{
|
||||
if (efer & efer_reserved_bits)
|
||||
return false;
|
||||
|
||||
if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
|
||||
return false;
|
||||
|
||||
if (efer & EFER_SVME && !guest_cpuid_has(vcpu, X86_FEATURE_SVM))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
return __kvm_valid_efer(vcpu, efer);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvm_valid_efer);
|
||||
|
||||
static int set_efer(struct kvm_vcpu *vcpu, u64 efer)
|
||||
static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
{
|
||||
u64 old_efer = vcpu->arch.efer;
|
||||
u64 efer = msr_info->data;
|
||||
|
||||
if (!kvm_valid_efer(vcpu, efer))
|
||||
return 1;
|
||||
if (efer & efer_reserved_bits)
|
||||
return false;
|
||||
|
||||
if (is_paging(vcpu)
|
||||
&& (vcpu->arch.efer & EFER_LME) != (efer & EFER_LME))
|
||||
return 1;
|
||||
if (!msr_info->host_initiated) {
|
||||
if (!__kvm_valid_efer(vcpu, efer))
|
||||
return 1;
|
||||
|
||||
if (is_paging(vcpu) &&
|
||||
(vcpu->arch.efer & EFER_LME) != (efer & EFER_LME))
|
||||
return 1;
|
||||
}
|
||||
|
||||
efer &= ~EFER_LMA;
|
||||
efer |= vcpu->arch.efer & EFER_LMA;
|
||||
@ -2279,6 +2300,18 @@ static void kvmclock_sync_fn(struct work_struct *work)
|
||||
KVMCLOCK_SYNC_PERIOD);
|
||||
}
|
||||
|
||||
/*
|
||||
* On AMD, HWCR[McStatusWrEn] controls whether setting MCi_STATUS results in #GP.
|
||||
*/
|
||||
static bool can_set_mci_status(struct kvm_vcpu *vcpu)
|
||||
{
|
||||
/* McStatusWrEn enabled? */
|
||||
if (guest_cpuid_is_amd(vcpu))
|
||||
return !!(vcpu->arch.msr_hwcr & BIT_ULL(18));
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
{
|
||||
u64 mcg_cap = vcpu->arch.mcg_cap;
|
||||
@ -2310,9 +2343,14 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
if ((offset & 0x3) == 0 &&
|
||||
data != 0 && (data | (1 << 10)) != ~(u64)0)
|
||||
return -1;
|
||||
|
||||
/* MCi_STATUS */
|
||||
if (!msr_info->host_initiated &&
|
||||
(offset & 0x3) == 1 && data != 0)
|
||||
return -1;
|
||||
(offset & 0x3) == 1 && data != 0) {
|
||||
if (!can_set_mci_status(vcpu))
|
||||
return -1;
|
||||
}
|
||||
|
||||
vcpu->arch.mce_banks[offset] = data;
|
||||
break;
|
||||
}
|
||||
@ -2456,13 +2494,16 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
vcpu->arch.arch_capabilities = data;
|
||||
break;
|
||||
case MSR_EFER:
|
||||
return set_efer(vcpu, data);
|
||||
return set_efer(vcpu, msr_info);
|
||||
case MSR_K7_HWCR:
|
||||
data &= ~(u64)0x40; /* ignore flush filter disable */
|
||||
data &= ~(u64)0x100; /* ignore ignne emulation enable */
|
||||
data &= ~(u64)0x8; /* ignore TLB cache disable */
|
||||
data &= ~(u64)0x40000; /* ignore Mc status write enable */
|
||||
if (data != 0) {
|
||||
|
||||
/* Handle McStatusWrEn */
|
||||
if (data == BIT_ULL(18)) {
|
||||
vcpu->arch.msr_hwcr = data;
|
||||
} else if (data != 0) {
|
||||
vcpu_unimpl(vcpu, "unimplemented HWCR wrmsr: 0x%llx\n",
|
||||
data);
|
||||
return 1;
|
||||
@ -2736,7 +2777,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
case MSR_K8_SYSCFG:
|
||||
case MSR_K8_TSEG_ADDR:
|
||||
case MSR_K8_TSEG_MASK:
|
||||
case MSR_K7_HWCR:
|
||||
case MSR_VM_HSAVE_PA:
|
||||
case MSR_K8_INT_PENDING_MSG:
|
||||
case MSR_AMD64_NB_CFG:
|
||||
@ -2900,6 +2940,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
|
||||
case MSR_MISC_FEATURES_ENABLES:
|
||||
msr_info->data = vcpu->arch.msr_misc_features_enables;
|
||||
break;
|
||||
case MSR_K7_HWCR:
|
||||
msr_info->data = vcpu->arch.msr_hwcr;
|
||||
break;
|
||||
default:
|
||||
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
|
||||
return kvm_pmu_get_msr(vcpu, msr_info->index, &msr_info->data);
|
||||
@ -3079,9 +3122,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
case KVM_CAP_MAX_VCPUS:
|
||||
r = KVM_MAX_VCPUS;
|
||||
break;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
r = KVM_USER_MEM_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_PV_MMU: /* obsolete */
|
||||
r = 0;
|
||||
break;
|
||||
@ -5521,9 +5561,9 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
|
||||
unsigned int bytes,
|
||||
struct x86_exception *exception)
|
||||
{
|
||||
struct kvm_host_map map;
|
||||
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
|
||||
gpa_t gpa;
|
||||
struct page *page;
|
||||
char *kaddr;
|
||||
bool exchanged;
|
||||
|
||||
@ -5540,12 +5580,11 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
|
||||
if (((gpa + bytes - 1) & PAGE_MASK) != (gpa & PAGE_MASK))
|
||||
goto emul_write;
|
||||
|
||||
page = kvm_vcpu_gfn_to_page(vcpu, gpa >> PAGE_SHIFT);
|
||||
if (is_error_page(page))
|
||||
if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
|
||||
goto emul_write;
|
||||
|
||||
kaddr = kmap_atomic(page);
|
||||
kaddr += offset_in_page(gpa);
|
||||
kaddr = map.hva + offset_in_page(gpa);
|
||||
|
||||
switch (bytes) {
|
||||
case 1:
|
||||
exchanged = CMPXCHG_TYPE(u8, kaddr, old, new);
|
||||
@ -5562,13 +5601,12 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
|
||||
default:
|
||||
BUG();
|
||||
}
|
||||
kunmap_atomic(kaddr);
|
||||
kvm_release_page_dirty(page);
|
||||
|
||||
kvm_vcpu_unmap(vcpu, &map, true);
|
||||
|
||||
if (!exchanged)
|
||||
return X86EMUL_CMPXCHG_FAILED;
|
||||
|
||||
kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
|
||||
kvm_page_track_write(vcpu, gpa, new, bytes);
|
||||
|
||||
return X86EMUL_CONTINUE;
|
||||
@ -6558,7 +6596,7 @@ static int complete_fast_pio_out(struct kvm_vcpu *vcpu)
|
||||
static int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size,
|
||||
unsigned short port)
|
||||
{
|
||||
unsigned long val = kvm_register_read(vcpu, VCPU_REGS_RAX);
|
||||
unsigned long val = kvm_rax_read(vcpu);
|
||||
int ret = emulator_pio_out_emulated(&vcpu->arch.emulate_ctxt,
|
||||
size, port, &val, 1);
|
||||
if (ret)
|
||||
@ -6593,8 +6631,7 @@ static int complete_fast_pio_in(struct kvm_vcpu *vcpu)
|
||||
}
|
||||
|
||||
/* For size less than 4 we merge, else we zero extend */
|
||||
val = (vcpu->arch.pio.size < 4) ? kvm_register_read(vcpu, VCPU_REGS_RAX)
|
||||
: 0;
|
||||
val = (vcpu->arch.pio.size < 4) ? kvm_rax_read(vcpu) : 0;
|
||||
|
||||
/*
|
||||
* Since vcpu->arch.pio.count == 1 let emulator_pio_in_emulated perform
|
||||
@ -6602,7 +6639,7 @@ static int complete_fast_pio_in(struct kvm_vcpu *vcpu)
|
||||
*/
|
||||
emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
|
||||
vcpu->arch.pio.port, &val, 1);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, val);
|
||||
kvm_rax_write(vcpu, val);
|
||||
|
||||
return kvm_skip_emulated_instruction(vcpu);
|
||||
}
|
||||
@ -6614,12 +6651,12 @@ static int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size,
|
||||
int ret;
|
||||
|
||||
/* For size less than 4 we merge, else we zero extend */
|
||||
val = (size < 4) ? kvm_register_read(vcpu, VCPU_REGS_RAX) : 0;
|
||||
val = (size < 4) ? kvm_rax_read(vcpu) : 0;
|
||||
|
||||
ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size, port,
|
||||
&val, 1);
|
||||
if (ret) {
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, val);
|
||||
kvm_rax_write(vcpu, val);
|
||||
return ret;
|
||||
}
|
||||
|
||||
@ -6854,10 +6891,20 @@ static unsigned long kvm_get_guest_ip(void)
|
||||
return ip;
|
||||
}
|
||||
|
||||
static void kvm_handle_intel_pt_intr(void)
|
||||
{
|
||||
struct kvm_vcpu *vcpu = __this_cpu_read(current_vcpu);
|
||||
|
||||
kvm_make_request(KVM_REQ_PMI, vcpu);
|
||||
__set_bit(MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI_BIT,
|
||||
(unsigned long *)&vcpu->arch.pmu.global_status);
|
||||
}
|
||||
|
||||
static struct perf_guest_info_callbacks kvm_guest_cbs = {
|
||||
.is_in_guest = kvm_is_in_guest,
|
||||
.is_user_mode = kvm_is_user_mode,
|
||||
.get_guest_ip = kvm_get_guest_ip,
|
||||
.handle_intel_pt_intr = kvm_handle_intel_pt_intr,
|
||||
};
|
||||
|
||||
static void kvm_set_mmio_spte_mask(void)
|
||||
@ -7133,11 +7180,11 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
|
||||
if (kvm_hv_hypercall_enabled(vcpu->kvm))
|
||||
return kvm_hv_hypercall(vcpu);
|
||||
|
||||
nr = kvm_register_read(vcpu, VCPU_REGS_RAX);
|
||||
a0 = kvm_register_read(vcpu, VCPU_REGS_RBX);
|
||||
a1 = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
a2 = kvm_register_read(vcpu, VCPU_REGS_RDX);
|
||||
a3 = kvm_register_read(vcpu, VCPU_REGS_RSI);
|
||||
nr = kvm_rax_read(vcpu);
|
||||
a0 = kvm_rbx_read(vcpu);
|
||||
a1 = kvm_rcx_read(vcpu);
|
||||
a2 = kvm_rdx_read(vcpu);
|
||||
a3 = kvm_rsi_read(vcpu);
|
||||
|
||||
trace_kvm_hypercall(nr, a0, a1, a2, a3);
|
||||
|
||||
@ -7178,7 +7225,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
|
||||
out:
|
||||
if (!op_64_bit)
|
||||
ret = (u32)ret;
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, ret);
|
||||
kvm_rax_write(vcpu, ret);
|
||||
|
||||
++vcpu->stat.hypercalls;
|
||||
return kvm_skip_emulated_instruction(vcpu);
|
||||
@ -8280,23 +8327,23 @@ static void __get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
|
||||
emulator_writeback_register_cache(&vcpu->arch.emulate_ctxt);
|
||||
vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
|
||||
}
|
||||
regs->rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
|
||||
regs->rbx = kvm_register_read(vcpu, VCPU_REGS_RBX);
|
||||
regs->rcx = kvm_register_read(vcpu, VCPU_REGS_RCX);
|
||||
regs->rdx = kvm_register_read(vcpu, VCPU_REGS_RDX);
|
||||
regs->rsi = kvm_register_read(vcpu, VCPU_REGS_RSI);
|
||||
regs->rdi = kvm_register_read(vcpu, VCPU_REGS_RDI);
|
||||
regs->rsp = kvm_register_read(vcpu, VCPU_REGS_RSP);
|
||||
regs->rbp = kvm_register_read(vcpu, VCPU_REGS_RBP);
|
||||
regs->rax = kvm_rax_read(vcpu);
|
||||
regs->rbx = kvm_rbx_read(vcpu);
|
||||
regs->rcx = kvm_rcx_read(vcpu);
|
||||
regs->rdx = kvm_rdx_read(vcpu);
|
||||
regs->rsi = kvm_rsi_read(vcpu);
|
||||
regs->rdi = kvm_rdi_read(vcpu);
|
||||
regs->rsp = kvm_rsp_read(vcpu);
|
||||
regs->rbp = kvm_rbp_read(vcpu);
|
||||
#ifdef CONFIG_X86_64
|
||||
regs->r8 = kvm_register_read(vcpu, VCPU_REGS_R8);
|
||||
regs->r9 = kvm_register_read(vcpu, VCPU_REGS_R9);
|
||||
regs->r10 = kvm_register_read(vcpu, VCPU_REGS_R10);
|
||||
regs->r11 = kvm_register_read(vcpu, VCPU_REGS_R11);
|
||||
regs->r12 = kvm_register_read(vcpu, VCPU_REGS_R12);
|
||||
regs->r13 = kvm_register_read(vcpu, VCPU_REGS_R13);
|
||||
regs->r14 = kvm_register_read(vcpu, VCPU_REGS_R14);
|
||||
regs->r15 = kvm_register_read(vcpu, VCPU_REGS_R15);
|
||||
regs->r8 = kvm_r8_read(vcpu);
|
||||
regs->r9 = kvm_r9_read(vcpu);
|
||||
regs->r10 = kvm_r10_read(vcpu);
|
||||
regs->r11 = kvm_r11_read(vcpu);
|
||||
regs->r12 = kvm_r12_read(vcpu);
|
||||
regs->r13 = kvm_r13_read(vcpu);
|
||||
regs->r14 = kvm_r14_read(vcpu);
|
||||
regs->r15 = kvm_r15_read(vcpu);
|
||||
#endif
|
||||
|
||||
regs->rip = kvm_rip_read(vcpu);
|
||||
@ -8316,23 +8363,23 @@ static void __set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
|
||||
vcpu->arch.emulate_regs_need_sync_from_vcpu = true;
|
||||
vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
|
||||
|
||||
kvm_register_write(vcpu, VCPU_REGS_RAX, regs->rax);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RBX, regs->rbx);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RCX, regs->rcx);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDX, regs->rdx);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RSI, regs->rsi);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RDI, regs->rdi);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RSP, regs->rsp);
|
||||
kvm_register_write(vcpu, VCPU_REGS_RBP, regs->rbp);
|
||||
kvm_rax_write(vcpu, regs->rax);
|
||||
kvm_rbx_write(vcpu, regs->rbx);
|
||||
kvm_rcx_write(vcpu, regs->rcx);
|
||||
kvm_rdx_write(vcpu, regs->rdx);
|
||||
kvm_rsi_write(vcpu, regs->rsi);
|
||||
kvm_rdi_write(vcpu, regs->rdi);
|
||||
kvm_rsp_write(vcpu, regs->rsp);
|
||||
kvm_rbp_write(vcpu, regs->rbp);
|
||||
#ifdef CONFIG_X86_64
|
||||
kvm_register_write(vcpu, VCPU_REGS_R8, regs->r8);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R9, regs->r9);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R10, regs->r10);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R11, regs->r11);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R12, regs->r12);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R13, regs->r13);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R14, regs->r14);
|
||||
kvm_register_write(vcpu, VCPU_REGS_R15, regs->r15);
|
||||
kvm_r8_write(vcpu, regs->r8);
|
||||
kvm_r9_write(vcpu, regs->r9);
|
||||
kvm_r10_write(vcpu, regs->r10);
|
||||
kvm_r11_write(vcpu, regs->r11);
|
||||
kvm_r12_write(vcpu, regs->r12);
|
||||
kvm_r13_write(vcpu, regs->r13);
|
||||
kvm_r14_write(vcpu, regs->r14);
|
||||
kvm_r15_write(vcpu, regs->r15);
|
||||
#endif
|
||||
|
||||
kvm_rip_write(vcpu, regs->rip);
|
||||
|
@@ -345,6 +345,16 @@ static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu)
	__this_cpu_write(current_vcpu, NULL);
}

static inline bool kvm_pat_valid(u64 data)
{
	if (data & 0xF8F8F8F8F8F8F8F8ull)
		return false;
	/* 0, 1, 4, 5, 6, 7 are valid values.  */
	return (data | ((data & 0x0202020202020202ull) << 1)) == data;
}

void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu);
void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu);

#endif
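The two masks above encode the whole PAT rule: every byte of IA32_PAT must be one of the memory types 0, 1, 4, 5, 6 or 7, so any byte above 7 is rejected by the first mask, and the second expression rejects the reserved encodings 2 and 3 (bit 1 set without bit 2). A minimal standalone sketch (ordinary userspace C, not kernel code) that exercises the same expression:

/*
 * Mirrors kvm_pat_valid() above to show which per-byte values it accepts.
 * The 0x0007040600070406 constant is the architectural power-on PAT value.
 */
#include <stdint.h>
#include <stdio.h>

static int pat_valid(uint64_t data)
{
        if (data & 0xF8F8F8F8F8F8F8F8ull)       /* any byte > 7 */
                return 0;
        /* bit 1 without bit 2 would flip the value, i.e. types 2 and 3 */
        return (data | ((data & 0x0202020202020202ull) << 1)) == data;
}

int main(void)
{
        for (uint64_t v = 0; v < 8; v++)
                printf("type %llu -> %s\n", (unsigned long long)v,
                       pat_valid(v) ? "valid" : "invalid");

        printf("default PAT: %d\n", pat_valid(0x0007040600070406ull)); /* 1 */
        printf("reserved byte: %d\n", pat_valid(0x0007040600070403ull)); /* 0 */
        return 0;
}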
|
@@ -227,6 +227,32 @@ enum {
	READING_SHADOW_PAGE_TABLES,
};

#define KVM_UNMAPPED_PAGE	((void *) 0x500 + POISON_POINTER_DELTA)

struct kvm_host_map {
	/*
	 * Only valid if the 'pfn' is managed by the host kernel (i.e. There is
	 * a 'struct page' for it. When using mem= kernel parameter some memory
	 * can be used as guest memory but they are not managed by host
	 * kernel).
	 * If 'pfn' is not managed by the host kernel, this field is
	 * initialized to KVM_UNMAPPED_PAGE.
	 */
	struct page *page;
	void *hva;
	kvm_pfn_t pfn;
	kvm_pfn_t gfn;
};

/*
 * Used to check if the mapping is valid or not. Never use 'kvm_host_map'
 * directly to check for that.
 */
static inline bool kvm_vcpu_mapped(struct kvm_host_map *map)
{
	return !!map->hva;
}

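This is the structure behind the "memory not backed by struct page" item in the pull description: the map remembers whether it came from kmap() or memremap() so the unmap side can undo the right thing. A sketch of how KVM-internal code is expected to use the pair, mirroring the emulator_cmpxchg_emulated() conversion earlier in this merge; the wrapper function and its error handling are illustrative, not kernel code that exists:

/*
 * Illustrative only: 'vcpu' and 'gpa' come from the surrounding
 * (hypothetical) caller, error handling is reduced to the minimum.
 */
static int example_write_guest_u32(struct kvm_vcpu *vcpu, gpa_t gpa, u32 val)
{
        struct kvm_host_map map;
        u32 *p;

        if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
                return -EFAULT;         /* gfn has no usable backing */

        p = map.hva + offset_in_page(gpa);
        *p = val;

        /* 'true' marks the page dirty and releases the pfn accordingly. */
        kvm_vcpu_unmap(vcpu, &map, true);
        return 0;
}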
/*
 * Sometimes a large or cross-page mmio needs to be broken up into separate
 * exits for userspace servicing.
@@ -733,7 +759,9 @@ struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu);
struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn);
kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
int kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa, struct kvm_host_map *map);
struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn);
void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map, bool dirty);
unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn);
unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *writable);
int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data, int offset,
@ -1242,11 +1270,21 @@ struct kvm_device_ops {
|
||||
*/
|
||||
void (*destroy)(struct kvm_device *dev);
|
||||
|
||||
/*
|
||||
* Release is an alternative method to free the device. It is
|
||||
* called when the device file descriptor is closed. Once
|
||||
* release is called, the destroy method will not be called
|
||||
* anymore as the device is removed from the device list of
|
||||
* the VM. kvm->lock is held.
|
||||
*/
|
||||
void (*release)(struct kvm_device *dev);
|
||||
|
||||
int (*set_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
|
||||
int (*get_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
|
||||
int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
|
||||
long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
|
||||
unsigned long arg);
|
||||
int (*mmap)(struct kvm_device *dev, struct vm_area_struct *vma);
|
||||
};
|
||||
|
||||
void kvm_device_get(struct kvm_device *dev);
|
||||
@@ -1307,6 +1345,16 @@ static inline bool vcpu_valid_wakeup(struct kvm_vcpu *vcpu)
}
#endif /* CONFIG_HAVE_KVM_INVALID_WAKEUPS */

#ifdef CONFIG_HAVE_KVM_NO_POLL
/* Callback that tells if we must not poll */
bool kvm_arch_no_poll(struct kvm_vcpu *vcpu);
#else
static inline bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
{
	return false;
}
#endif /* CONFIG_HAVE_KVM_NO_POLL */

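The inline fallback keeps existing architectures unchanged; an architecture that wants to veto halt polling selects the new HAVE_KVM_NO_POLL Kconfig symbol (added later in this merge) and supplies its own kvm_arch_no_poll(), which kvm_vcpu_block() now consults before spinning. A rough sketch of such an override, with a made-up overcommit heuristic; 'steal_percent' is not a real field:

/*
 * Illustrative arch-side override; only the hook itself is real.
 */
bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
{
        /*
         * Polling in the halt path only burns CPU that other vCPUs could
         * use, so skip it once the host looks overcommitted.
         */
        return vcpu->arch.steal_percent > 10;   /* hypothetical heuristic */
}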
|
||||
#ifdef CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL
|
||||
long kvm_arch_vcpu_async_ioctl(struct file *filp,
|
||||
unsigned int ioctl, unsigned long arg);
|
||||
|
@@ -30,6 +30,7 @@ struct perf_guest_info_callbacks {
	int (*is_in_guest)(void);
	int (*is_user_mode)(void);
	unsigned long (*get_guest_ip)(void);
	void (*handle_intel_pt_intr)(void);
};

#ifdef CONFIG_HAVE_HW_BREAKPOINT
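For context, a heavily hedged sketch of the consumer side: when a Processor Trace ToPA PMI arrives while the registered callbacks report guest context, the host PMI path can hand the event to the hypervisor through the new hook instead of treating it as a host perf event. Only the callback structure and the perf_guest_cbs registration mirror the kvm_guest_cbs hunk earlier in this merge; the helper name and call site below are assumptions, not the actual arch/x86 perf code:

/*
 * Illustrative dispatch only.  perf_guest_cbs is the pointer registered
 * via perf_register_guest_info_callbacks(&kvm_guest_cbs).
 */
static void example_handle_topa_pmi(void)
{
        if (perf_guest_cbs && perf_guest_cbs->is_in_guest() &&
            perf_guest_cbs->handle_intel_pt_intr) {
                /* Let KVM turn this into a PMI for the guest. */
                perf_guest_cbs->handle_intel_pt_intr();
                return;
        }

        /* Otherwise the PMI belongs to host-side PT tracing. */
}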
|
@ -986,8 +986,13 @@ struct kvm_ppc_resize_hpt {
|
||||
#define KVM_CAP_HYPERV_ENLIGHTENED_VMCS 163
|
||||
#define KVM_CAP_EXCEPTION_PAYLOAD 164
|
||||
#define KVM_CAP_ARM_VM_IPA_SIZE 165
|
||||
#define KVM_CAP_MANUAL_DIRTY_LOG_PROTECT 166
|
||||
#define KVM_CAP_MANUAL_DIRTY_LOG_PROTECT 166 /* Obsolete */
|
||||
#define KVM_CAP_HYPERV_CPUID 167
|
||||
#define KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 168
|
||||
#define KVM_CAP_PPC_IRQ_XIVE 169
|
||||
#define KVM_CAP_ARM_SVE 170
|
||||
#define KVM_CAP_ARM_PTRAUTH_ADDRESS 171
|
||||
#define KVM_CAP_ARM_PTRAUTH_GENERIC 172
|
||||
|
||||
#ifdef KVM_CAP_IRQ_ROUTING
|
||||
|
||||
@ -1145,6 +1150,7 @@ struct kvm_dirty_tlb {
|
||||
#define KVM_REG_SIZE_U256 0x0050000000000000ULL
|
||||
#define KVM_REG_SIZE_U512 0x0060000000000000ULL
|
||||
#define KVM_REG_SIZE_U1024 0x0070000000000000ULL
|
||||
#define KVM_REG_SIZE_U2048 0x0080000000000000ULL
|
||||
|
||||
struct kvm_reg_list {
|
||||
__u64 n; /* number of regs */
|
||||
@ -1211,6 +1217,8 @@ enum kvm_device_type {
|
||||
#define KVM_DEV_TYPE_ARM_VGIC_V3 KVM_DEV_TYPE_ARM_VGIC_V3
|
||||
KVM_DEV_TYPE_ARM_VGIC_ITS,
|
||||
#define KVM_DEV_TYPE_ARM_VGIC_ITS KVM_DEV_TYPE_ARM_VGIC_ITS
|
||||
KVM_DEV_TYPE_XIVE,
|
||||
#define KVM_DEV_TYPE_XIVE KVM_DEV_TYPE_XIVE
|
||||
KVM_DEV_TYPE_MAX,
|
||||
};
|
||||
|
||||
@ -1434,12 +1442,15 @@ struct kvm_enc_region {
|
||||
#define KVM_GET_NESTED_STATE _IOWR(KVMIO, 0xbe, struct kvm_nested_state)
|
||||
#define KVM_SET_NESTED_STATE _IOW(KVMIO, 0xbf, struct kvm_nested_state)
|
||||
|
||||
/* Available with KVM_CAP_MANUAL_DIRTY_LOG_PROTECT */
|
||||
/* Available with KVM_CAP_MANUAL_DIRTY_LOG_PROTECT_2 */
|
||||
#define KVM_CLEAR_DIRTY_LOG _IOWR(KVMIO, 0xc0, struct kvm_clear_dirty_log)
|
||||
|
||||
/* Available with KVM_CAP_HYPERV_CPUID */
|
||||
#define KVM_GET_SUPPORTED_HV_CPUID _IOWR(KVMIO, 0xc1, struct kvm_cpuid2)
|
||||
|
||||
/* Available with KVM_CAP_ARM_SVE */
|
||||
#define KVM_ARM_VCPU_FINALIZE _IOW(KVMIO, 0xc2, int)
|
||||
|
||||
/* Secure Encrypted Virtualization command */
|
||||
enum sev_cmd_id {
|
||||
/* Guest initialization commands */
|
||||
|
@@ -152,7 +152,8 @@ struct kvm_s390_vm_cpu_subfunc {
	__u8 pcc[16];		/* with MSA4 */
	__u8 ppno[16];		/* with MSA5 */
	__u8 kma[16];		/* with MSA8 */
	__u8 reserved[1808];
	__u8 kdsa[16];		/* with MSA9 */
	__u8 reserved[1792];
};

/* kvm attributes for crypto */
|
7
tools/testing/selftests/kvm/.gitignore
vendored
@ -1,9 +1,14 @@
|
||||
/x86_64/cr4_cpuid_sync_test
|
||||
/x86_64/evmcs_test
|
||||
/x86_64/hyperv_cpuid
|
||||
/x86_64/kvm_create_max_vcpus
|
||||
/x86_64/platform_info_test
|
||||
/x86_64/set_sregs_test
|
||||
/x86_64/smm_test
|
||||
/x86_64/state_test
|
||||
/x86_64/sync_regs_test
|
||||
/x86_64/vmx_close_while_nested_test
|
||||
/x86_64/vmx_set_nested_state_test
|
||||
/x86_64/vmx_tsc_adjust_test
|
||||
/x86_64/state_test
|
||||
/clear_dirty_log_test
|
||||
/dirty_log_test
|
||||
|
@ -20,6 +20,8 @@ TEST_GEN_PROGS_x86_64 += x86_64/evmcs_test
|
||||
TEST_GEN_PROGS_x86_64 += x86_64/hyperv_cpuid
|
||||
TEST_GEN_PROGS_x86_64 += x86_64/vmx_close_while_nested_test
|
||||
TEST_GEN_PROGS_x86_64 += x86_64/smm_test
|
||||
TEST_GEN_PROGS_x86_64 += x86_64/kvm_create_max_vcpus
|
||||
TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
|
||||
TEST_GEN_PROGS_x86_64 += dirty_log_test
|
||||
TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
|
||||
|
||||
|
@@ -314,7 +314,7 @@ static void run_test(enum vm_guest_mode mode, unsigned long iterations,
#ifdef USE_CLEAR_DIRTY_LOG
	struct kvm_enable_cap cap = {};

	cap.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT;
	cap.cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2;
	cap.args[0] = 1;
	vm_enable_cap(vm, &cap);
#endif
@@ -430,7 +430,7 @@ int main(int argc, char *argv[])
	int opt, i;

#ifdef USE_CLEAR_DIRTY_LOG
	if (!kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT)) {
	if (!kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2)) {
		fprintf(stderr, "KVM_CLEAR_DIRTY_LOG not available, skipping tests\n");
		exit(KSFT_SKIP);
	}
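Outside the selftest harness the same opt-in looks like the sketch below: enable the VM-wide capability, fetch the bitmap with KVM_GET_DIRTY_LOG (which no longer write-protects the pages), then re-protect the ranges actually handled with KVM_CLEAR_DIRTY_LOG. 'vm_fd', 'slot' and the bitmap buffer are assumed to come from the surrounding VMM, and error handling is dropped:

/*
 * Bare-ioctl sketch of the manual dirty-log protect flow.
 */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static void clear_first_64_pages(int vm_fd, __u32 slot, void *bitmap)
{
        struct kvm_enable_cap cap = { .cap = KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2,
                                      .args = { 1 } };
        struct kvm_dirty_log log = { .slot = slot, .dirty_bitmap = bitmap };
        struct kvm_clear_dirty_log clear = {
                .slot = slot,
                .first_page = 0,
                .num_pages = 64,        /* 64-page granularity expected */
                .dirty_bitmap = bitmap,
        };

        ioctl(vm_fd, KVM_ENABLE_CAP, &cap);     /* opt in, VM-wide */

        /* GET now only fetches the bitmap; pages stay write-enabled... */
        ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);

        /* ...until userspace explicitly re-protects what it has handled. */
        ioctl(vm_fd, KVM_CLEAR_DIRTY_LOG, &clear);
}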
|
@ -118,6 +118,10 @@ void vcpu_events_get(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_vcpu_events *events);
|
||||
void vcpu_events_set(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_vcpu_events *events);
|
||||
void vcpu_nested_state_get(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_nested_state *state);
|
||||
int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_nested_state *state, bool ignore_error);
|
||||
|
||||
const char *exit_reason_str(unsigned int exit_reason);
|
||||
|
||||
|
@ -1250,6 +1250,38 @@ void vcpu_events_set(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
ret, errno);
|
||||
}
|
||||
|
||||
void vcpu_nested_state_get(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_nested_state *state)
|
||||
{
|
||||
struct vcpu *vcpu = vcpu_find(vm, vcpuid);
|
||||
int ret;
|
||||
|
||||
TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
|
||||
|
||||
	ret = ioctl(vcpu->fd, KVM_GET_NESTED_STATE, state);
	TEST_ASSERT(ret == 0,
		"KVM_GET_NESTED_STATE failed, ret: %i errno: %i",
|
||||
ret, errno);
|
||||
}
|
||||
|
||||
int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid,
|
||||
struct kvm_nested_state *state, bool ignore_error)
|
||||
{
|
||||
struct vcpu *vcpu = vcpu_find(vm, vcpuid);
|
||||
int ret;
|
||||
|
||||
TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
|
||||
|
||||
ret = ioctl(vcpu->fd, KVM_SET_NESTED_STATE, state);
|
||||
if (!ignore_error) {
|
||||
TEST_ASSERT(ret == 0,
|
||||
"KVM_SET_NESTED_STATE failed, ret: %i errno: %i",
|
||||
ret, errno);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
/*
|
||||
* VM VCPU System Regs Get
|
||||
*
|
||||
|
70
tools/testing/selftests/kvm/x86_64/kvm_create_max_vcpus.c
Normal file
@ -0,0 +1,70 @@
|
||||
/*
|
||||
* kvm_create_max_vcpus
|
||||
*
|
||||
* Copyright (C) 2019, Google LLC.
|
||||
*
|
||||
* This work is licensed under the terms of the GNU GPL, version 2.
|
||||
*
|
||||
* Test for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_VCPU_ID.
|
||||
*/
|
||||
|
||||
#define _GNU_SOURCE /* for program_invocation_short_name */
|
||||
#include <fcntl.h>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
|
||||
#include "test_util.h"
|
||||
|
||||
#include "kvm_util.h"
|
||||
#include "asm/kvm.h"
|
||||
#include "linux/kvm.h"
|
||||
|
||||
void test_vcpu_creation(int first_vcpu_id, int num_vcpus)
|
||||
{
|
||||
struct kvm_vm *vm;
|
||||
int i;
|
||||
|
||||
printf("Testing creating %d vCPUs, with IDs %d...%d.\n",
|
||||
num_vcpus, first_vcpu_id, first_vcpu_id + num_vcpus - 1);
|
||||
|
||||
vm = vm_create(VM_MODE_P52V48_4K, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
|
||||
|
||||
for (i = 0; i < num_vcpus; i++) {
|
||||
int vcpu_id = first_vcpu_id + i;
|
||||
|
||||
/* This asserts that the vCPU was created. */
|
||||
vm_vcpu_add(vm, vcpu_id, 0, 0);
|
||||
}
|
||||
|
||||
kvm_vm_free(vm);
|
||||
}
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
int kvm_max_vcpu_id = kvm_check_cap(KVM_CAP_MAX_VCPU_ID);
|
||||
int kvm_max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
|
||||
|
||||
printf("KVM_CAP_MAX_VCPU_ID: %d\n", kvm_max_vcpu_id);
|
||||
printf("KVM_CAP_MAX_VCPUS: %d\n", kvm_max_vcpus);
|
||||
|
||||
/*
|
||||
* Upstream KVM prior to 4.8 does not support KVM_CAP_MAX_VCPU_ID.
|
||||
* Userspace is supposed to use KVM_CAP_MAX_VCPUS as the maximum ID
|
||||
* in this case.
|
||||
*/
|
||||
if (!kvm_max_vcpu_id)
|
||||
kvm_max_vcpu_id = kvm_max_vcpus;
|
||||
|
||||
TEST_ASSERT(kvm_max_vcpu_id >= kvm_max_vcpus,
|
||||
"KVM_MAX_VCPU_ID (%d) must be at least as large as KVM_MAX_VCPUS (%d).",
|
||||
kvm_max_vcpu_id, kvm_max_vcpus);
|
||||
|
||||
test_vcpu_creation(0, kvm_max_vcpus);
|
||||
|
||||
if (kvm_max_vcpu_id > kvm_max_vcpus)
|
||||
test_vcpu_creation(
|
||||
kvm_max_vcpu_id - kvm_max_vcpus, kvm_max_vcpus);
|
||||
|
||||
return 0;
|
||||
}
|
280
tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
Normal file
280
tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
Normal file
@ -0,0 +1,280 @@
|
||||
/*
|
||||
* vmx_set_nested_state_test
|
||||
*
|
||||
* Copyright (C) 2019, Google LLC.
|
||||
*
|
||||
* This work is licensed under the terms of the GNU GPL, version 2.
|
||||
*
|
||||
* This test verifies the integrity of calling the ioctl KVM_SET_NESTED_STATE.
|
||||
*/
|
||||
|
||||
#include "test_util.h"
|
||||
#include "kvm_util.h"
|
||||
#include "processor.h"
|
||||
#include "vmx.h"
|
||||
|
||||
#include <errno.h>
|
||||
#include <linux/kvm.h>
|
||||
#include <string.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <unistd.h>
|
||||
|
||||
/*
|
||||
* Mirror of VMCS12_REVISION in arch/x86/kvm/vmx/vmcs12.h. If that value
|
||||
* changes this should be updated.
|
||||
*/
|
||||
#define VMCS12_REVISION 0x11e57ed0
|
||||
#define VCPU_ID 5
|
||||
|
||||
void test_nested_state(struct kvm_vm *vm, struct kvm_nested_state *state)
|
||||
{
|
||||
volatile struct kvm_run *run;
|
||||
|
||||
vcpu_nested_state_set(vm, VCPU_ID, state, false);
|
||||
run = vcpu_state(vm, VCPU_ID);
|
||||
vcpu_run(vm, VCPU_ID);
|
||||
TEST_ASSERT(run->exit_reason == KVM_EXIT_SHUTDOWN,
|
||||
"Got exit_reason other than KVM_EXIT_SHUTDOWN: %u (%s),\n",
|
||||
run->exit_reason,
|
||||
exit_reason_str(run->exit_reason));
|
||||
}
|
||||
|
||||
void test_nested_state_expect_errno(struct kvm_vm *vm,
|
||||
struct kvm_nested_state *state,
|
||||
int expected_errno)
|
||||
{
|
||||
volatile struct kvm_run *run;
|
||||
int rv;
|
||||
|
||||
rv = vcpu_nested_state_set(vm, VCPU_ID, state, true);
|
||||
TEST_ASSERT(rv == -1 && errno == expected_errno,
|
||||
"Expected %s (%d) from vcpu_nested_state_set but got rv: %i errno: %s (%d)",
|
||||
strerror(expected_errno), expected_errno, rv, strerror(errno),
|
||||
errno);
|
||||
run = vcpu_state(vm, VCPU_ID);
|
||||
vcpu_run(vm, VCPU_ID);
|
||||
TEST_ASSERT(run->exit_reason == KVM_EXIT_SHUTDOWN,
|
||||
"Got exit_reason other than KVM_EXIT_SHUTDOWN: %u (%s),\n",
|
||||
run->exit_reason,
|
||||
exit_reason_str(run->exit_reason));
|
||||
}
|
||||
|
||||
void test_nested_state_expect_einval(struct kvm_vm *vm,
|
||||
struct kvm_nested_state *state)
|
||||
{
|
||||
test_nested_state_expect_errno(vm, state, EINVAL);
|
||||
}
|
||||
|
||||
void test_nested_state_expect_efault(struct kvm_vm *vm,
|
||||
struct kvm_nested_state *state)
|
||||
{
|
||||
test_nested_state_expect_errno(vm, state, EFAULT);
|
||||
}
|
||||
|
||||
void set_revision_id_for_vmcs12(struct kvm_nested_state *state,
|
||||
u32 vmcs12_revision)
|
||||
{
|
||||
/* Set revision_id in vmcs12 to vmcs12_revision. */
|
||||
*(u32 *)(state->data) = vmcs12_revision;
|
||||
}
|
||||
|
||||
void set_default_state(struct kvm_nested_state *state)
|
||||
{
|
||||
memset(state, 0, sizeof(*state));
|
||||
state->flags = KVM_STATE_NESTED_RUN_PENDING |
|
||||
KVM_STATE_NESTED_GUEST_MODE;
|
||||
state->format = 0;
|
||||
state->size = sizeof(*state);
|
||||
}
|
||||
|
||||
void set_default_vmx_state(struct kvm_nested_state *state, int size)
|
||||
{
|
||||
memset(state, 0, size);
|
||||
state->flags = KVM_STATE_NESTED_GUEST_MODE |
|
||||
KVM_STATE_NESTED_RUN_PENDING |
|
||||
KVM_STATE_NESTED_EVMCS;
|
||||
state->format = 0;
|
||||
state->size = size;
|
||||
state->vmx.vmxon_pa = 0x1000;
|
||||
state->vmx.vmcs_pa = 0x2000;
|
||||
state->vmx.smm.flags = 0;
|
||||
set_revision_id_for_vmcs12(state, VMCS12_REVISION);
|
||||
}
|
||||
|
||||
void test_vmx_nested_state(struct kvm_vm *vm)
|
||||
{
|
||||
/* Add a page for VMCS12. */
|
||||
const int state_sz = sizeof(struct kvm_nested_state) + getpagesize();
|
||||
struct kvm_nested_state *state =
|
||||
(struct kvm_nested_state *)malloc(state_sz);
|
||||
|
||||
/* The format must be set to 0. 0 for VMX, 1 for SVM. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->format = 1;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* We cannot virtualize anything if the guest does not have VMX
|
||||
* enabled.
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* We cannot virtualize anything if the guest does not have VMX
|
||||
* enabled. We expect KVM_SET_NESTED_STATE to return 0 if vmxon_pa
|
||||
* is set to -1ull.
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = -1ull;
|
||||
test_nested_state(vm, state);
|
||||
|
||||
/* Enable VMX in the guest CPUID. */
|
||||
vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
|
||||
|
||||
/* It is invalid to have vmxon_pa == -1ull and SMM flags non-zero. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = -1ull;
|
||||
state->vmx.smm.flags = 1;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/* It is invalid to have vmxon_pa == -1ull and vmcs_pa != -1ull. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = -1ull;
|
||||
state->vmx.vmcs_pa = 0;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* Setting vmxon_pa == -1ull and vmcs_pa == -1ull exits early without
|
||||
* setting the nested state.
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = -1ull;
|
||||
state->vmx.vmcs_pa = -1ull;
|
||||
test_nested_state(vm, state);
|
||||
|
||||
/* It is invalid to have vmxon_pa set to a non-page aligned address. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = 1;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* It is invalid to have KVM_STATE_NESTED_SMM_GUEST_MODE and
|
||||
* KVM_STATE_NESTED_GUEST_MODE set together.
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->flags = KVM_STATE_NESTED_GUEST_MODE |
|
||||
KVM_STATE_NESTED_RUN_PENDING;
|
||||
state->vmx.smm.flags = KVM_STATE_NESTED_SMM_GUEST_MODE;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* It is invalid to have any of the SMM flags set besides:
|
||||
* KVM_STATE_NESTED_SMM_GUEST_MODE
|
||||
* KVM_STATE_NESTED_SMM_VMXON
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.smm.flags = ~(KVM_STATE_NESTED_SMM_GUEST_MODE |
|
||||
KVM_STATE_NESTED_SMM_VMXON);
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/* Outside SMM, SMM flags must be zero. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->flags = 0;
|
||||
state->vmx.smm.flags = KVM_STATE_NESTED_SMM_GUEST_MODE;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/* Size must be large enough to fit kvm_nested_state and vmcs12. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->size = sizeof(*state);
|
||||
test_nested_state(vm, state);
|
||||
|
||||
/* vmxon_pa cannot be the same address as vmcs_pa. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = 0;
|
||||
state->vmx.vmcs_pa = 0;
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/* The revision id for vmcs12 must be VMCS12_REVISION. */
|
||||
set_default_vmx_state(state, state_sz);
|
||||
set_revision_id_for_vmcs12(state, 0);
|
||||
test_nested_state_expect_einval(vm, state);
|
||||
|
||||
/*
|
||||
* Test that if we leave nesting the state reflects that when we get
|
||||
* it again.
|
||||
*/
|
||||
set_default_vmx_state(state, state_sz);
|
||||
state->vmx.vmxon_pa = -1ull;
|
||||
state->vmx.vmcs_pa = -1ull;
|
||||
state->flags = 0;
|
||||
test_nested_state(vm, state);
|
||||
vcpu_nested_state_get(vm, VCPU_ID, state);
|
||||
TEST_ASSERT(state->size >= sizeof(*state) && state->size <= state_sz,
|
||||
"Size must be between %d and %d. The size returned was %d.",
|
||||
sizeof(*state), state_sz, state->size);
|
||||
TEST_ASSERT(state->vmx.vmxon_pa == -1ull, "vmxon_pa must be -1ull.");
|
||||
TEST_ASSERT(state->vmx.vmcs_pa == -1ull, "vmcs_pa must be -1ull.");
|
||||
|
||||
free(state);
|
||||
}
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
struct kvm_vm *vm;
|
||||
struct kvm_nested_state state;
|
||||
struct kvm_cpuid_entry2 *entry = kvm_get_supported_cpuid_entry(1);
|
||||
|
||||
if (!kvm_check_cap(KVM_CAP_NESTED_STATE)) {
|
||||
printf("KVM_CAP_NESTED_STATE not available, skipping test\n");
|
||||
exit(KSFT_SKIP);
|
||||
}
|
||||
|
||||
/*
|
||||
* AMD currently does not implement set_nested_state, so for now we
|
||||
* just early out.
|
||||
*/
|
||||
if (!(entry->ecx & CPUID_VMX)) {
|
||||
fprintf(stderr, "nested VMX not enabled, skipping test\n");
|
||||
exit(KSFT_SKIP);
|
||||
}
|
||||
|
||||
vm = vm_create_default(VCPU_ID, 0, 0);
|
||||
|
||||
/* Passing a NULL kvm_nested_state causes a EFAULT. */
|
||||
test_nested_state_expect_efault(vm, NULL);
|
||||
|
||||
/* 'size' cannot be smaller than sizeof(kvm_nested_state). */
|
||||
set_default_state(&state);
|
||||
state.size = 0;
|
||||
test_nested_state_expect_einval(vm, &state);
|
||||
|
||||
/*
|
||||
* Setting the flags 0xf fails the flags check. The only flags that
|
||||
* can be used are:
|
||||
* KVM_STATE_NESTED_GUEST_MODE
|
||||
* KVM_STATE_NESTED_RUN_PENDING
|
||||
* KVM_STATE_NESTED_EVMCS
|
||||
*/
|
||||
set_default_state(&state);
|
||||
state.flags = 0xf;
|
||||
test_nested_state_expect_einval(vm, &state);
|
||||
|
||||
/*
|
||||
* If KVM_STATE_NESTED_RUN_PENDING is set then
|
||||
* KVM_STATE_NESTED_GUEST_MODE has to be set as well.
|
||||
*/
|
||||
set_default_state(&state);
|
||||
state.flags = KVM_STATE_NESTED_RUN_PENDING;
|
||||
test_nested_state_expect_einval(vm, &state);
|
||||
|
||||
/*
|
||||
* TODO: When SVM support is added for KVM_SET_NESTED_STATE
|
||||
* add tests here to support it like VMX.
|
||||
*/
|
||||
if (entry->ecx & CPUID_VMX)
|
||||
test_vmx_nested_state(vm);
|
||||
|
||||
kvm_vm_free(vm);
|
||||
return 0;
|
||||
}
|
@@ -57,3 +57,6 @@ config HAVE_KVM_VCPU_ASYNC_IOCTL

config HAVE_KVM_VCPU_RUN_PID_CHANGE
       bool

config HAVE_KVM_NO_POLL
       bool
|
@ -56,7 +56,7 @@
|
||||
__asm__(".arch_extension virt");
|
||||
#endif
|
||||
|
||||
DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
|
||||
DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
|
||||
static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
|
||||
|
||||
/* Per-CPU variable containing the currently running vcpu. */
|
||||
@ -224,9 +224,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
|
||||
case KVM_CAP_MAX_VCPUS:
|
||||
r = KVM_MAX_VCPUS;
|
||||
break;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
r = KVM_USER_MEM_SLOTS;
|
||||
break;
|
||||
case KVM_CAP_MSI_DEVID:
|
||||
if (!kvm)
|
||||
r = -EINVAL;
|
||||
@ -360,8 +357,10 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
|
||||
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||
{
|
||||
int *last_ran;
|
||||
kvm_host_data_t *cpu_data;
|
||||
|
||||
last_ran = this_cpu_ptr(vcpu->kvm->arch.last_vcpu_ran);
|
||||
cpu_data = this_cpu_ptr(&kvm_host_data);
|
||||
|
||||
/*
|
||||
* We might get preempted before the vCPU actually runs, but
|
||||
@ -373,18 +372,21 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
|
||||
}
|
||||
|
||||
vcpu->cpu = cpu;
|
||||
vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);
|
||||
vcpu->arch.host_cpu_context = &cpu_data->host_ctxt;
|
||||
|
||||
kvm_arm_set_running_vcpu(vcpu);
|
||||
kvm_vgic_load(vcpu);
|
||||
kvm_timer_vcpu_load(vcpu);
|
||||
kvm_vcpu_load_sysregs(vcpu);
|
||||
kvm_arch_vcpu_load_fp(vcpu);
|
||||
kvm_vcpu_pmu_restore_guest(vcpu);
|
||||
|
||||
if (single_task_running())
|
||||
vcpu_clear_wfe_traps(vcpu);
|
||||
else
|
||||
vcpu_set_wfe_traps(vcpu);
|
||||
|
||||
vcpu_ptrauth_setup_lazy(vcpu);
|
||||
}
|
||||
|
||||
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
|
||||
@ -393,6 +395,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
|
||||
kvm_vcpu_put_sysregs(vcpu);
|
||||
kvm_timer_vcpu_put(vcpu);
|
||||
kvm_vgic_put(vcpu);
|
||||
kvm_vcpu_pmu_restore_host(vcpu);
|
||||
|
||||
vcpu->cpu = -1;
|
||||
|
||||
@ -545,6 +548,9 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
|
||||
if (likely(vcpu->arch.has_run_once))
|
||||
return 0;
|
||||
|
||||
if (!kvm_arm_vcpu_is_finalized(vcpu))
|
||||
return -EPERM;
|
||||
|
||||
vcpu->arch.has_run_once = true;
|
||||
|
||||
if (likely(irqchip_in_kernel(kvm))) {
|
||||
@ -1121,6 +1127,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
|
||||
if (unlikely(!kvm_vcpu_initialized(vcpu)))
|
||||
break;
|
||||
|
||||
r = -EPERM;
|
||||
if (!kvm_arm_vcpu_is_finalized(vcpu))
|
||||
break;
|
||||
|
||||
r = -EFAULT;
|
||||
if (copy_from_user(®_list, user_list, sizeof(reg_list)))
|
||||
break;
|
||||
@@ -1174,6 +1184,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,

		return kvm_arm_vcpu_set_events(vcpu, &events);
	}
	case KVM_ARM_VCPU_FINALIZE: {
		int what;

		if (!kvm_vcpu_initialized(vcpu))
			return -ENOEXEC;

		if (get_user(what, (const int __user *)argp))
			return -EFAULT;

		return kvm_arm_vcpu_finalize(vcpu, what);
	}
	default:
		r = -EINVAL;
	}
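From userspace the new ioctl closes the SVE configuration window: the vCPU is created with the SVE feature bit, its vector lengths may be adjusted, and only after KVM_ARM_VCPU_FINALIZE are the SVE registers accessible and KVM_RUN allowed. A simplified sketch using the constants from the arm64 uapi headers touched by this merge; a real VMM would also program KVM_REG_ARM64_SVE_VLS before finalizing, and 'vcpu_fd'/'init' come from the caller:

/*
 * Userspace-side sketch of the SVE init + finalize flow (arm64 headers).
 */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int init_and_finalize_sve(int vcpu_fd, struct kvm_vcpu_init *init)
{
        int feature = KVM_ARM_VCPU_SVE;

        init->features[0] |= 1 << KVM_ARM_VCPU_SVE;
        if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, init))
                return -1;

        /* Vector lengths could be restricted here via KVM_REG_ARM64_SVE_VLS. */

        /* After this, SVE regs are accessible and KVM_RUN is permitted. */
        return ioctl(vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature);
}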
@ -1554,11 +1575,11 @@ static int init_hyp_mode(void)
|
||||
}
|
||||
|
||||
for_each_possible_cpu(cpu) {
|
||||
kvm_cpu_context_t *cpu_ctxt;
|
||||
kvm_host_data_t *cpu_data;
|
||||
|
||||
cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);
|
||||
kvm_init_host_cpu_context(cpu_ctxt, cpu);
|
||||
err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
|
||||
cpu_data = per_cpu_ptr(&kvm_host_data, cpu);
|
||||
kvm_init_host_cpu_context(&cpu_data->host_ctxt, cpu);
|
||||
err = create_hyp_mappings(cpu_data, cpu_data + 1, PAGE_HYP);
|
||||
|
||||
if (err) {
|
||||
kvm_err("Cannot map host CPU state: %d\n", err);
|
||||
@ -1669,6 +1690,10 @@ int kvm_arch_init(void *opaque)
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
err = kvm_arm_init_sve();
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
if (!in_hyp_mode) {
|
||||
err = init_hyp_mode();
|
||||
if (err)
|
||||
|
@ -51,9 +51,9 @@
|
||||
#include <linux/slab.h>
|
||||
#include <linux/sort.h>
|
||||
#include <linux/bsearch.h>
|
||||
#include <linux/io.h>
|
||||
|
||||
#include <asm/processor.h>
|
||||
#include <asm/io.h>
|
||||
#include <asm/ioctl.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <asm/pgtable.h>
|
||||
@ -1135,11 +1135,11 @@ EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
|
||||
|
||||
#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
|
||||
/**
|
||||
* kvm_get_dirty_log_protect - get a snapshot of dirty pages, and if any pages
|
||||
* kvm_get_dirty_log_protect - get a snapshot of dirty pages
|
||||
* and reenable dirty page tracking for the corresponding pages.
|
||||
* @kvm: pointer to kvm instance
|
||||
* @log: slot id and address to which we copy the log
|
||||
* @is_dirty: flag set if any page is dirty
|
||||
* @flush: true if TLB flush is needed by caller
|
||||
*
|
||||
* We need to keep it in mind that VCPU threads can write to the bitmap
|
||||
* concurrently. So, to avoid losing track of dirty pages we keep the
|
||||
@ -1224,6 +1224,7 @@ EXPORT_SYMBOL_GPL(kvm_get_dirty_log_protect);
|
||||
* and reenable dirty page tracking for the corresponding pages.
|
||||
* @kvm: pointer to kvm instance
|
||||
* @log: slot id and address from which to fetch the bitmap of dirty pages
|
||||
* @flush: true if TLB flush is needed by caller
|
||||
*/
|
||||
int kvm_clear_dirty_log_protect(struct kvm *kvm,
|
||||
struct kvm_clear_dirty_log *log, bool *flush)
|
||||
@@ -1251,7 +1252,7 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
	if (!dirty_bitmap)
		return -ENOENT;

	n = kvm_dirty_bitmap_bytes(memslot);
	n = ALIGN(log->num_pages, BITS_PER_LONG) / 8;

	if (log->first_page > memslot->npages ||
	    log->num_pages > memslot->npages - log->first_page ||
@@ -1264,8 +1265,8 @@ int kvm_clear_dirty_log_protect(struct kvm *kvm,
		return -EFAULT;

	spin_lock(&kvm->mmu_lock);
	for (offset = log->first_page,
	     i = offset / BITS_PER_LONG, n = log->num_pages / BITS_PER_LONG; n--;
	for (offset = log->first_page, i = offset / BITS_PER_LONG,
	     n = DIV_ROUND_UP(log->num_pages, BITS_PER_LONG); n--;
	     i++, offset += BITS_PER_LONG) {
		unsigned long mask = *dirty_bitmap_buffer++;
		atomic_long_t *p = (atomic_long_t *) &dirty_bitmap[i];
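The loop-bound change fixes an off-by-one for clear requests that do not end on a 64-page boundary (allowed at the end of a memslot), and the byte count now tracks the requested range instead of the whole slot bitmap. A quick arithmetic check, written as plain C rather than kernel code:

/*
 * A clear request for 65 pages used to process only one 64-bit word of the
 * bitmap; rounding up processes two, and only 16 bytes need copying.
 */
#include <stdio.h>

#define BITS_PER_LONG           64
#define ALIGN(x, a)             ((((x) + (a) - 1) / (a)) * (a))
#define DIV_ROUND_UP(n, d)      (((n) + (d) - 1) / (d))

int main(void)
{
        unsigned long num_pages = 65;

        printf("words (old): %lu\n", num_pages / BITS_PER_LONG);              /* 1 */
        printf("words (new): %lu\n", DIV_ROUND_UP(num_pages, BITS_PER_LONG)); /* 2 */
        printf("bytes copied: %lu\n", ALIGN(num_pages, BITS_PER_LONG) / 8);   /* 16 */
        return 0;
}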
@ -1742,6 +1743,70 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(gfn_to_page);
|
||||
|
||||
static int __kvm_map_gfn(struct kvm_memory_slot *slot, gfn_t gfn,
|
||||
struct kvm_host_map *map)
|
||||
{
|
||||
kvm_pfn_t pfn;
|
||||
void *hva = NULL;
|
||||
struct page *page = KVM_UNMAPPED_PAGE;
|
||||
|
||||
if (!map)
|
||||
return -EINVAL;
|
||||
|
||||
pfn = gfn_to_pfn_memslot(slot, gfn);
|
||||
if (is_error_noslot_pfn(pfn))
|
||||
return -EINVAL;
|
||||
|
||||
if (pfn_valid(pfn)) {
|
||||
page = pfn_to_page(pfn);
|
||||
hva = kmap(page);
|
||||
} else {
|
||||
hva = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
|
||||
}
|
||||
|
||||
if (!hva)
|
||||
return -EFAULT;
|
||||
|
||||
map->page = page;
|
||||
map->hva = hva;
|
||||
map->pfn = pfn;
|
||||
map->gfn = gfn;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
|
||||
{
|
||||
return __kvm_map_gfn(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn, map);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvm_vcpu_map);
|
||||
|
||||
void kvm_vcpu_unmap(struct kvm_vcpu *vcpu, struct kvm_host_map *map,
|
||||
bool dirty)
|
||||
{
|
||||
if (!map)
|
||||
return;
|
||||
|
||||
if (!map->hva)
|
||||
return;
|
||||
|
||||
if (map->page)
|
||||
kunmap(map->page);
|
||||
else
|
||||
memunmap(map->hva);
|
||||
|
||||
if (dirty) {
|
||||
kvm_vcpu_mark_page_dirty(vcpu, map->gfn);
|
||||
kvm_release_pfn_dirty(map->pfn);
|
||||
} else {
|
||||
kvm_release_pfn_clean(map->pfn);
|
||||
}
|
||||
|
||||
map->hva = NULL;
|
||||
map->page = NULL;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kvm_vcpu_unmap);
|
||||
|
||||
struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
|
||||
{
|
||||
kvm_pfn_t pfn;
|
||||
@ -2255,7 +2320,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
|
||||
u64 block_ns;
|
||||
|
||||
start = cur = ktime_get();
|
||||
if (vcpu->halt_poll_ns) {
|
||||
if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
|
||||
ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
|
||||
|
||||
++vcpu->stat.halt_attempted_poll;
|
||||
@@ -2886,6 +2951,16 @@ out:
}
#endif

static int kvm_device_mmap(struct file *filp, struct vm_area_struct *vma)
{
	struct kvm_device *dev = filp->private_data;

	if (dev->ops->mmap)
		return dev->ops->mmap(dev, vma);

	return -ENODEV;
}
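Only devices that implement the new hook expose an mmap; everything else keeps returning -ENODEV. A sketch of a device backend wiring up the two new ops (.release also lands in this merge); the 'example_device' names are placeholders, with the POWER9 XIVE device being the real first user:

/*
 * Illustrative kvm_device_ops using the new .mmap and .release hooks.
 */
static int example_device_mmap(struct kvm_device *dev,
                               struct vm_area_struct *vma)
{
        /*
         * Hand device-owned pages (e.g. interrupt management pages) to
         * userspace; remap_pfn_range() or a vm_operations_struct would
         * go here.
         */
        return -EINVAL;         /* placeholder */
}

static void example_device_release(struct kvm_device *dev)
{
        /*
         * Runs on the last fd close instead of ->destroy; the core has
         * already unlinked the device from the VM's list under kvm->lock.
         */
        kfree(dev);
}

static struct kvm_device_ops example_device_ops = {
        .name    = "example",
        .release = example_device_release,
        .mmap    = example_device_mmap,
};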
|
||||
static int kvm_device_ioctl_attr(struct kvm_device *dev,
|
||||
int (*accessor)(struct kvm_device *dev,
|
||||
struct kvm_device_attr *attr),
|
||||
@ -2930,6 +3005,13 @@ static int kvm_device_release(struct inode *inode, struct file *filp)
|
||||
struct kvm_device *dev = filp->private_data;
|
||||
struct kvm *kvm = dev->kvm;
|
||||
|
||||
if (dev->ops->release) {
|
||||
mutex_lock(&kvm->lock);
|
||||
list_del(&dev->vm_node);
|
||||
dev->ops->release(dev);
|
||||
mutex_unlock(&kvm->lock);
|
||||
}
|
||||
|
||||
kvm_put_kvm(kvm);
|
||||
return 0;
|
||||
}
|
||||
@ -2938,6 +3020,7 @@ static const struct file_operations kvm_device_fops = {
|
||||
.unlocked_ioctl = kvm_device_ioctl,
|
||||
.release = kvm_device_release,
|
||||
KVM_COMPAT(kvm_device_ioctl),
|
||||
.mmap = kvm_device_mmap,
|
||||
};
|
||||
|
||||
struct kvm_device *kvm_device_from_filp(struct file *filp)
|
||||
@ -3046,7 +3129,7 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
|
||||
case KVM_CAP_CHECK_EXTENSION_VM:
|
||||
case KVM_CAP_ENABLE_CAP_VM:
|
||||
#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
|
||||
case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT:
|
||||
case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2:
|
||||
#endif
|
||||
return 1;
|
||||
#ifdef CONFIG_KVM_MMIO
|
||||
@ -3065,6 +3148,8 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
|
||||
#endif
|
||||
case KVM_CAP_MAX_VCPU_ID:
|
||||
return KVM_MAX_VCPU_ID;
|
||||
case KVM_CAP_NR_MEMSLOTS:
|
||||
return KVM_USER_MEM_SLOTS;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
@ -3082,7 +3167,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
|
||||
{
|
||||
switch (cap->cap) {
|
||||
#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
|
||||
case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT:
|
||||
case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2:
|
||||
if (cap->flags || (cap->args[0] & ~1))
|
||||
return -EINVAL;
|
||||
kvm->manual_dirty_log_protect = cap->args[0];
|
||||
|