Commit Graph

2159 Commits

Reiji Watanabe
ea9ca904d2 KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest
KVM does not yet support userspace modifying PMCR_EL0.N (with
the previous patch, KVM ignores what is written by userspace).
Add support for userspace to limit PMCR_EL0.N.

Disallow userspace from setting PMCR_EL0.N to a value greater
than the host value, as KVM doesn't support more event counters
than what the host HW implements. Also, make this register
immutable after the VM has started running. To maintain the
existing expectations, instead of returning an error, KVM
returns success for these two cases.

Finally, ignore writes to read-only bits that are cleared on
vCPU reset, and RES{0,1} bits (including writable bits that
KVM doesn't support yet), as those bits shouldn't be modified
(at least with the current KVM).

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-8-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
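
A rough standalone C sketch of the behaviour described above; the struct and function names are invented for illustration and are not KVM's:

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative stand-in for the relevant KVM state; not the real structs. */
    struct demo_vm {
        bool    has_run_once;   /* the VM has started running */
        uint8_t host_pmcr_n;    /* PMCR_EL0.N implemented by the host PMU */
        uint8_t guest_pmcr_n;   /* value advertised to the guest */
    };

    /*
     * Userspace write of PMCR_EL0.N, following the rules above: the write
     * is silently ignored (but still reported as success) once the VM has
     * run, or when it asks for more counters than the host implements.
     */
    static int demo_set_pmcr_n(struct demo_vm *vm, uint8_t new_n)
    {
        if (vm->has_run_once)
            return 0;               /* immutable after first run, not an error */

        if (new_n > vm->host_pmcr_n)
            return 0;               /* ignored, not an error */

        vm->guest_pmcr_n = new_n;
        return 0;
    }
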
Raghavendra Rao Ananta
27131b199f KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run
For unimplemented counters, the registers PM{C,I}NTEN{SET,CLR}
and PMOVS{SET,CLR} are expected to have the corresponding bits RAZ.
Hence, to keep KVM's PMU emulation correct, mask out the RES0 bits.
Defer this work to the point where userspace can no longer change
the number of advertised PMCs.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-7-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
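
A small standalone C illustration of the masking idea; the helper names are made up, and only the bit layout (one bit per event counter plus bit 31 for the cycle counter) follows the architecture:

    #include <stdint.h>

    /*
     * For N implemented event counters, only bits [N-1:0] plus bit 31 (the
     * cycle counter) may be set in PM{C,I}NTEN{SET,CLR} and PMOVS{SET,CLR};
     * the remaining bits are RES0.
     */
    static uint64_t demo_counter_mask(unsigned int nr_counters)
    {
        uint64_t mask = 1ULL << 31;                 /* cycle counter bit (C) */

        if (nr_counters)
            mask |= (1ULL << nr_counters) - 1;      /* event counter bits [N-1:0] */

        return mask;
    }

    /* Sanitize a shadow register value before the guest first runs. */
    static uint64_t demo_sanitize_counter_reg(uint64_t val, unsigned int nr_counters)
    {
        return val & demo_counter_mask(nr_counters);
    }
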
Raghavendra Rao Ananta
a45f41d754 KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR}
For unimplemented counters, the bits in the PM{C,I}NTEN{SET,CLR} and
PMOVS{SET,CLR} registers are expected to be RAZ. To honor this,
explicitly implement the {get,set}_user functions for these
registers to mask out unimplemented counters for userspace reads
and writes.

Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-6-rananta@google.com
[Oliver: drop unnecessary locking]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Raghavendra Rao Ananta
4d20debf9c KVM: arm64: PMU: Set PMCR_EL0.N for vCPU based on the associated PMU
The number of PMU event counters is indicated in PMCR_EL0.N.
For a vCPU with PMUv3 configured, the value is set to the same
value as the current PE on every vCPU reset.  Unless the vCPU is
pinned to PEs that have the PMU associated with the guest from the
initial vCPU reset, the value might be different from the PMU's
PMCR_EL0.N on heterogeneous PMU systems.

Fix this by setting the vCPU's PMCR_EL0.N to the PMU's PMCR_EL0.N
value. Track PMCR_EL0.N per guest, as only one PMU can be set
for the guest (PMCR_EL0.N must be the same for all vCPUs of the
guest), and this makes updating the value convenient.

To achieve this, the patch introduces a helper,
kvm_arm_pmu_get_max_counters(), that reads the maximum number of
counters from the arm_pmu associated with the VM. Make the function
global, as upcoming patches will need to know the value when
setting the guest's PMCR_EL0.N from userspace.

KVM does not yet support userspace modifying PMCR_EL0.N.
The following patch will add support for that.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-5-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:30 +00:00
Reiji Watanabe
57fc267f1b KVM: arm64: PMU: Add a helper to read a vCPU's PMCR_EL0
Add a helper to read a vCPU's PMCR_EL0, and use it whenever KVM
reads a vCPU's PMCR_EL0.

Currently, the PMCR_EL0 value is tracked per vCPU. The following
patches will make (only) PMCR_EL0.N tracked per guest. The new
helper will be useful for combining the PMCR_EL0.N field (tracked
per guest) with the other fields (tracked per vCPU) to provide
the value of PMCR_EL0.

No functional change intended.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-4-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:29 +00:00
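
A standalone C sketch of combining the per-guest N field with the per-vCPU bits; PMCR_EL0.N occupies bits [15:11], but the names below are illustrative rather than the kernel's:

    #include <stdint.h>

    #define DEMO_PMCR_N_SHIFT 11
    #define DEMO_PMCR_N_MASK  (0x1fULL << DEMO_PMCR_N_SHIFT)  /* PMCR_EL0.N, bits [15:11] */

    /*
     * Produce the PMCR_EL0 value seen by the guest: per-vCPU bits with the
     * N field replaced by the value tracked per guest.
     */
    static uint64_t demo_read_pmcr(uint64_t vcpu_pmcr, uint64_t guest_n)
    {
        return (vcpu_pmcr & ~DEMO_PMCR_N_MASK) |
               ((guest_n << DEMO_PMCR_N_SHIFT) & DEMO_PMCR_N_MASK);
    }
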
Reiji Watanabe
4277335797 KVM: arm64: Select default PMU in KVM_ARM_VCPU_INIT handler
Future changes to KVM's sysreg emulation will rely on having a valid PMU
instance to determine the number of implemented counters (PMCR_EL0.N).
This is earlier than when userspace is expected to modify the vPMU
device attributes, where the default is selected today.

Select the default PMU when handling KVM_ARM_VCPU_INIT such that it is
available in time for sysreg emulation.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Link: https://lore.kernel.org/r/20231020214053.2144305-3-rananta@google.com
[Oliver: rewrite changelog]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 22:59:29 +00:00
Reiji Watanabe
1616ca6f3c KVM: arm64: PMU: Introduce helpers to set the guest's PMU
Introduce new helper functions to set the guest's PMU
(kvm->arch.arm_pmu) either to a default probed instance or to a
caller-requested one, and use them wherever the guest's PMU needs
to be set. These helpers will make it easier for the following
patches to modify the relevant code.

No functional change intended.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231020214053.2144305-2-rananta@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-10-24 19:07:33 +00:00
Jean-Philippe Brucker
373beef00f KVM: arm64: nvhe: Ignore SVE hint in SMCCC function ID
When SVE is enabled, the host may set bit 16 in SMCCC function IDs, a
hint that indicates an unused SVE state. At the moment NVHE doesn't
account for this bit when inspecting the function ID, and rejects most
calls. Clear the hint bit before comparing function IDs.

About version compatibility: the host's PSCI driver initially probes
the firmware for an SMCCC version number. If the firmware implements a
recent enough version of the protocol (1.3), subsequent SMCCC calls
have the hint bit set. Since the hint bit was reserved in earlier
versions of the protocol, clearing it is fine regardless of the
version in use.

When a new hint is added to the protocol in the future, it will be added
to ARM_SMCCC_CALL_HINTS and NVHE will handle it straight away. This
patch only clears known hints and leaves reserved bits as is, because
future SMCCC versions could use reserved bits as modifiers for the
function ID, rather than hints.

Fixes: cfa7ff959a ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint")
Reported-by: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230911145254.934414-4-jean-philippe@linaro.org
2023-09-12 13:07:37 +01:00
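
A standalone C illustration of the hint-clearing step; bit 16 is the SMCCC v1.3 SVE hint, while the macro names and the PSCI CPU_ON example are placeholders chosen for the sketch:

    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_SMCCC_SVE_HINT   (1u << 16)     /* SMCCC v1.3 "SVE state not in use" hint */
    #define DEMO_KNOWN_HINTS      DEMO_SMCCC_SVE_HINT
    #define DEMO_PSCI_CPU_ON_32   0x84000003u    /* an example function ID */

    /* Compare function IDs with the known hint bits stripped first. */
    static bool demo_is_psci_cpu_on(uint32_t func_id)
    {
        return (func_id & ~DEMO_KNOWN_HINTS) == DEMO_PSCI_CPU_ON_32;
    }
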
Marc Zyngier
3579dc742f KVM: arm64: Properly return allocated EL2 VA from hyp_alloc_private_va_range()
Marek reports that his RPi4 spits out a warning at boot time,
right at the point where the GICv2 virtual CPU interface gets
mapped.

Upon investigation, it seems that we never return the allocated
VA and use whatever was on the stack at this point. Yes, this
is good stuff, and Marek was pretty lucky that he ended up with
a VA that intersected with something that was already mapped.

On my setup, this random value is plausible enough for the mapping
to take place. Who knows what happens...

Fixes: f156a7d13f ("KVM: arm64: Remove size-order align in the nVHE hyp private VA range")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/79b0ad6e-0c2a-f777-d504-e40e8123d81d@samsung.com
Link: https://lore.kernel.org/r/20230828153121.4179627-1-maz@kernel.org
2023-09-12 12:58:25 +01:00
Linus Torvalds
0c02183427 ARM:
* Clean up vCPU targets, always returning generic v8 as the preferred target
 
 * Trap forwarding infrastructure for nested virtualization (used for traps
   that are taken from an L2 guest and are needed by the L1 hypervisor)
 
 * FEAT_TLBIRANGE support to only invalidate specific ranges of addresses
   when collapsing a table PTE to a block PTE.  This avoids the guest
   refilling the TLBs again for addresses that aren't covered by the table PTE.
 
 * Fix vPMU issues related to handling of PMUver.
 
 * Don't unnecessarily align non-stack allocations in the EL2 VA space
 
 * Drop HCR_VIRT_EXCP_MASK, which was never used...
 
 * Don't use smp_processor_id() in kvm_arch_vcpu_load(),
   but the cpu parameter instead
 
 * Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
 
 * Remove prototypes without implementations
 
 RISC-V:
 
 * Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
 
 * Added ONE_REG interface for SATP mode
 
 * Added ONE_REG interface to enable/disable multiple ISA extensions
 
 * Improved error codes returned by ONE_REG interfaces
 
 * Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
 
 * Added get-reg-list selftest for KVM RISC-V
 
 s390:
 
 * PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
   Allows a PV guest to use crypto cards. Card access is governed by
   the firmware and once a crypto queue is "bound" to a PV VM every
   other entity (PV or not) loses access until it is no longer bound.
   Enablement is done via flags when creating the PV VM.
 
 * Guest debug fixes (Ilya)
 
 x86:
 
 * Clean up KVM's handling of Intel architectural events
 
 * Intel bugfixes
 
 * Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug
   registers and generate/handle #DBs
 
 * Clean up LBR virtualization code
 
 * Fix a bug where KVM fails to set the target pCPU during an IRTE update
 
 * Fix fatal bugs in SEV-ES intrahost migration
 
 * Fix a bug where the recent (architecturally correct) change to reinject
   #BP and skip INT3 broke SEV guests (can't decode INT3 to skip it)
 
 * Retry APIC map recalculation if a vCPU is added/enabled
 
 * Overhaul emergency reboot code to bring SVM up to par with VMX, tie the
   "emergency disabling" behavior to KVM actually being loaded, and move all of
   the logic within KVM
 
 * Fix user-triggerable WARNs in SVM where KVM incorrectly assumes the TSC
   ratio MSR cannot diverge from the default when TSC scaling is disabled,
   and clean up related code
 
 * Add a framework to allow "caching" feature flags so that KVM can check if
   the guest can use a feature without needing to search guest CPUID
 
 * Rip out the ancient MMU_DEBUG crud and replace the useful bits with
   CONFIG_KVM_PROVE_MMU
 
 * Fix KVM's handling of !visible guest roots to avoid premature triple fault
   injection
 
 * Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface
   that is needed by external users (currently only KVMGT), and fix a variety
   of issues in the process
 
 This last item had a silly one-character bug in the topic branch that
 was sent to me.  Because it caused pretty bad selftest failures in
 some configurations, I decided to squash in the fix.  So, while the
 exact commit ids haven't been in linux-next, the code has (from the
 kvm-x86 tree).
 
 Generic:
 
 * Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
   action specific data without needing to constantly update the main handlers.
 
 * Drop unused function declarations
 
 Selftests:
 
 * Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
 
 * Add support for printf() in guest code and convert all guest asserts to use
   printf-based reporting
 
 * Clean up the PMU event filter test and add new testcases
 
 * Include x86 selftests in the KVM x86 MAINTAINERS entry
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1
 CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj
 5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM
 v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n
 hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu
 RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g==
 =W+4p
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Clean up vCPU targets, always returning generic v8 as the preferred
     target

   - Trap forwarding infrastructure for nested virtualization (used for
     traps that are taken from an L2 guest and are needed by the L1
     hypervisor)

   - FEAT_TLBIRANGE support to only invalidate specific ranges of
     addresses when collapsing a table PTE to a block PTE. This avoids
     the guest refilling the TLBs again for addresses that aren't
     covered by the table PTE.

   - Fix vPMU issues related to handling of PMUver.

   - Don't unnecessarily align non-stack allocations in the EL2 VA space

   - Drop HCR_VIRT_EXCP_MASK, which was never used...

   - Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
     parameter instead

   - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()

   - Remove prototypes without implementations

  RISC-V:

   - Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest

   - Added ONE_REG interface for SATP mode

   - Added ONE_REG interface to enable/disable multiple ISA extensions

   - Improved error codes returned by ONE_REG interfaces

   - Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V

   - Added get-reg-list selftest for KVM RISC-V

  s390:

   - PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)

     Allows a PV guest to use crypto cards. Card access is governed by
     the firmware and once a crypto queue is "bound" to a PV VM every
     other entity (PV or not) loses access until it is no longer bound.
     Enablement is done via flags when creating the PV VM.

   - Guest debug fixes (Ilya)

  x86:

   - Clean up KVM's handling of Intel architectural events

   - Intel bugfixes

   - Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
     debug registers and generate/handle #DBs

   - Clean up LBR virtualization code

   - Fix a bug where KVM fails to set the target pCPU during an IRTE
     update

   - Fix fatal bugs in SEV-ES intrahost migration

   - Fix a bug where the recent (architecturally correct) change to
     reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
     skip it)

   - Retry APIC map recalculation if a vCPU is added/enabled

   - Overhaul emergency reboot code to bring SVM up to par with VMX, tie
     the "emergency disabling" behavior to KVM actually being loaded,
     and move all of the logic within KVM

   - Fix user-triggerable WARNs in SVM where KVM incorrectly assumes the
     TSC ratio MSR cannot diverge from the default when TSC scaling is
     disabled, and clean up related code

   - Add a framework to allow "caching" feature flags so that KVM can
     check if the guest can use a feature without needing to search
     guest CPUID

   - Rip out the ancient MMU_DEBUG crud and replace the useful bits with
     CONFIG_KVM_PROVE_MMU

   - Fix KVM's handling of !visible guest roots to avoid premature
     triple fault injection

   - Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
     API surface that is needed by external users (currently only
     KVMGT), and fix a variety of issues in the process

  Generic:

   - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
     events to pass action specific data without needing to constantly
     update the main handlers.

   - Drop unused function declarations

  Selftests:

   - Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs

   - Add support for printf() in guest code and convert all guest asserts
     to use printf-based reporting

   - Clean up the PMU event filter test and add new testcases

   - Include x86 selftests in the KVM x86 MAINTAINERS entry"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
  KVM: x86/mmu: Include mmu.h in spte.h
  KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
  KVM: x86/mmu: Disallow guest from using !visible slots for page tables
  KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
  KVM: x86/mmu: Harden new PGD against roots without shadow pages
  KVM: x86/mmu: Add helper to convert root hpa to shadow page
  drm/i915/gvt: Drop final dependencies on KVM internal details
  KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
  KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
  KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
  KVM: x86/mmu: Assert that correct locks are held for page write-tracking
  KVM: x86/mmu: Rename page-track APIs to reflect the new reality
  KVM: x86/mmu: Drop infrastructure for multiple page-track modes
  KVM: x86/mmu: Use page-track notifiers iff there are external users
  KVM: x86/mmu: Move KVM-only page-track declarations to internal header
  KVM: x86: Remove the unused page-track hook track_flush_slot()
  drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
  KVM: x86: Add a new page-track hook to handle memslot deletion
  drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
  KVM: x86: Reject memslot MOVE operations if KVMGT is attached
  ...
2023-09-07 13:52:20 -07:00
Paolo Bonzini
0d15bf966d Common KVM changes for 6.6:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
    action specific data without needing to constantly update the main handlers.
 
  - Drop unused function declarations
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmTudpYSHHNlYW5qY0Bn
 b29nbGUuY29tAAoJEGCRIgFNDBL5xJUQAKnMVEV+7gRtfV5KCJFRTNAMxo4zSIt/
 K6QX+x/SwUriXj4nTAlvAtju1xz4nwTYBABKj3bXEaLpVjIUIbnEzEGuTKKK6XY9
 UyJKVgafwLuKLWPYN/5Zv5SCO7DmVC9W3lVMtchgt7gFcRxtZhmEn53boHhrhan0
 /2L5XD6N9rd81Zmd/rQkJNRND7XY3HkvDSnfmsRI/rfFUglCUHBDp4c2Wkmz+Dnb
 ux7N37si5OTbRVp18VzbLg1jalstDEm36ZQ7tLkvIbNbZV6pV93/ZmcTmsOruTeN
 gHVr6/RXmKKwgO3wtZ9DKL6oMcoh20yoT+vqhbaihVssLPGPusk7S2cCQ7529u8/
 Oda+w67MMdbE46N9CmB56fkpwNvn9nLCoQFhMhXBWhPJVNmorpiR6drHKqLy5zCq
 lrsWGqXU/DXA2PwdsztfIIMVeALawzExHu9ayppcKwb4S8TLJhLma7dT+EvwUxuV
 hoswaIT7Tq2ptZ34Fo5/vEz+90u2wi7LynHrNdTs7NLsW+WI/jab7KxKc+mf5WYh
 KuMzqmmPXmWRFupFeDa61YY5PCvMddDeac/jCYL/2cr73RA8bUItivwt5mEg5nOW
 9NEU+cLbl1s8g2KfxwhvodVkbhiNGf8MkVpE5skHHh9OX8HYzZUa/s6uUZO1O0eh
 XOk+fa9KWabt
 =n819
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.6' of https://github.com/kvm-x86/linux into HEAD

Common KVM changes for 6.6:

 - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
   action specific data without needing to constantly update the main handlers.

 - Drop unused function declarations
2023-08-31 13:19:55 -04:00
Paolo Bonzini
e0fb12c673 KVM/arm64 updates for Linux 6.6
- Add support for TLB range invalidation of Stage-2 page tables,
   avoiding unnecessary invalidations. Systems that do not implement
   range invalidation still rely on a full invalidation when dealing
   with large ranges.
 
 - Add infrastructure for forwarding traps taken from an L2 guest to
   the L1 guest, with L0 acting as the dispatcher, another baby step
   towards the full nested support.
 
 - Simplify the way we deal with the (long deprecated) 'CPU target',
   resulting in a much needed cleanup.
 
 - Fix another set of PMU bugs, both on the guest and host sides,
   as we seem to never have any shortage of those...
 
 - Relax the alignment requirements of EL2 VA allocations for
   non-stack allocations, as we were otherwise wasting a lot of that
   precious VA space.
 
 - The usual set of non-functional cleanups, although I note the lack
   of spelling fixes...
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmTsXrUPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDZpIQAJUM1rNEOJ8ExYRfoG1LaTfcOm5TD6D1IWlO
 uCUx4xLMBudw/55HusmUSdiomQ3Xg5UdRaU7vX5OYwPbdoWebjEUfgdP3jCA/TiW
 mZTMv3x9hOvp+EOS/UnS469cERvg1/KfwcdOQsWL0HsCFZnu2XmQHWPD++vovLNp
 F1892ij875mC6C6mOR60H2nyjIiCuqWh/8eKBkp65CARCbFDYxWhqBnmcmTvoquh
 E87pQDPdtgXc0KlOWCABh5bYOu1WGVEXE5f3ixtdY9cQakkSI3NkFKw27/mIWS4q
 TCsagByNnPFDXTglb1dJopNdluLMFi1iXhRJX78R/PYaHxf4uFafWcQk1U7eDdLg
 1kPANggwYe4KNAQZUvRhH7lIPWHCH0r4c1qHV+FsiOZVoDOSKHo4RW1ZFtirJSNW
 LNJMdk+8xyae0S7z164EpZB/tpFttX4gl3YvUT/T+4gH8+CRFAaoAlK39CoGDPpk
 f+P2GE1Z5YupF16YjpZtBnan55KkU1b6eORl5zpnAtoaz5WGXqj1t4qo0Q6e9WB9
 X4rdDVhH7vRUmhjmSP6PuEygb84hnITLdGpkH2BmWj/4uYuCN+p+U2B2o/QdMJoo
 cPxdflLOU/+1gfAFYPtHVjVKCqzhwbw3iLXQpO12gzRYqE13rUnAr7RuGDf5fBVC
 LW7Pv81o
 =DKhx
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 6.6

- Add support for TLB range invalidation of Stage-2 page tables,
  avoiding unnecessary invalidations. Systems that do not implement
  range invalidation still rely on a full invalidation when dealing
  with large ranges.

- Add infrastructure for forwarding traps taken from an L2 guest to
  the L1 guest, with L0 acting as the dispatcher, another baby step
  towards the full nested support.

- Simplify the way we deal with the (long deprecated) 'CPU target',
  resulting in a much needed cleanup.

- Fix another set of PMU bugs, both on the guest and host sides,
  as we seem to never have any shortage of those...

- Relax the alignment requirements of EL2 VA allocations for
  non-stack allocations, as we were otherwise wasting a lot of that
  precious VA space.

- The usual set of non-functional cleanups, although I note the lack
  of spelling fixes...
2023-08-31 13:18:53 -04:00
Linus Torvalds
727dbda16b hardening updates for v6.6-rc1
- Carve out the new CONFIG_LIST_HARDENED as a more focused subset of
   CONFIG_DEBUG_LIST (Marco Elver).
 
 - Fix kallsyms lookup failure under Clang LTO (Yonghong Song).
 
 - Clarify documentation for CONFIG_UBSAN_TRAP (Jann Horn).
 
 - Flexible array member conversion not carried in other tree (Gustavo
   A. R. Silva).
 
 - Various strlcpy() and strncpy() removals not carried in other trees
   (Azeem Shaikh, Justin Stitt).
 
 - Convert nsproxy.count to refcount_t (Elena Reshetova).
 
 - Add handful of __counted_by annotations not carried in other trees,
   as well as an LKDTM test.
 
 - Fix build failure with gcc-plugins on GCC 14+.
 
 - Fix selftests to respect SKIP for signal-delivery tests.
 
 - Fix CFI warning for paravirt callback prototype.
 
 - Clarify documentation for seq_show_option_n() usage.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmTs6ZAWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJkpjD/9AeST5Imc2t0t71Qd+wPxW3jT3
 kDZPlHH8wHmuxSpRscX82m21SozvEMvybo6Cp7FSH4qr863FnBWMlo8acr7rKxUf
 0f7Y9qgY/hKADiVx5p0pbnCgcy+l4pwsxIqVCGuhjvNCbWHrdGqLM4UjIfaVz5Ws
 +55a/C3S1KVwB1s1+6to43jtKqQAx6yrqYWOaT3wEfCzHC87f9PUHhIGnFQVwPGP
 WpjQI/BQKpH7+MDCoJOPrZqXaE/4lWALxR6+5BBheGbvLoWifpJEYHX6bDUzkgBz
 liQDkgr4eAw5EXSOS7mX3EApfeMKakznJt9Mcmn0h3pPRlM3ZSVD64Xrou2Brpje
 exS2JRuh6HwIiXY9nTHc6YMGcAWG1syAR/hM2fQdujM0CWtBUk9+kkuYWsqF6nIK
 3tOxYLB/Ph4p+tShd+v5R3mEmp/6snYRKJoUk+9Fk67i54NnK4huyxaCO4zui+ML
 3vHuGp8KgFHUjJaYmYXHs3TRZnKSFUkPGc4MbpiGtmJ9zhfSwlhhF+yfBJCsvmTf
 ZajA+sPupT4OjLxU6vUD/ZNkXAEjWzktyX2v9YBA7FHh7SqPtX9ARRIxh417AjEJ
 tBPHhW/iRw9ftBIAKDmI7gPLynngd/zvjhvk6O5egHYjjgRM1/WAJZ4V26XR6+hf
 TWfQb7VRzdZIqwOEUA==
 =9ZWP
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "As has become normal, changes are scattered around the tree (either
  explicitly maintainer Acked or for trivial stuff that went ignored):

   - Carve out the new CONFIG_LIST_HARDENED as a more focused subset of
     CONFIG_DEBUG_LIST (Marco Elver)

   - Fix kallsyms lookup failure under Clang LTO (Yonghong Song)

   - Clarify documentation for CONFIG_UBSAN_TRAP (Jann Horn)

   - Flexible array member conversion not carried in other tree (Gustavo
     A. R. Silva)

   - Various strlcpy() and strncpy() removals not carried in other trees
     (Azeem Shaikh, Justin Stitt)

   - Convert nsproxy.count to refcount_t (Elena Reshetova)

   - Add handful of __counted_by annotations not carried in other trees,
     as well as an LKDTM test

   - Fix build failure with gcc-plugins on GCC 14+

   - Fix selftests to respect SKIP for signal-delivery tests

   - Fix CFI warning for paravirt callback prototype

   - Clarify documentation for seq_show_option_n() usage"

* tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  LoadPin: Annotate struct dm_verity_loadpin_trusted_root_digest with __counted_by
  kallsyms: Change func signature for cleanup_symbol_name()
  kallsyms: Fix kallsyms_selftest failure
  nsproxy: Convert nsproxy.count to refcount_t
  integrity: Annotate struct ima_rule_opt_list with __counted_by
  lkdtm: Add FAM_BOUNDS test for __counted_by
  Compiler Attributes: counted_by: Adjust name and identifier expansion
  um: refactor deprecated strncpy to memcpy
  um: vector: refactor deprecated strncpy
  alpha: Replace one-element array with flexible-array member
  hardening: Move BUG_ON_DATA_CORRUPTION to hardening options
  list: Introduce CONFIG_LIST_HARDENED
  list_debug: Introduce inline wrappers for debug checks
  compiler_types: Introduce the Clang __preserve_most function attribute
  gcc-plugins: Rename last_stmt() for GCC 14+
  selftests/harness: Actually report SKIP for signal tests
  x86/paravirt: Fix tlb_remove_table function callback prototype warning
  EISA: Replace all non-returning strlcpy with strscpy
  perf: Replace strlcpy with strscpy
  um: Remove strlcpy declaration
  ...
2023-08-28 12:59:45 -07:00
Marc Zyngier
1f66f1246b Merge branch kvm-arm64/6.6/misc into kvmarm-master/next
* kvm-arm64/6.6/misc:
  : .
  : Misc KVM/arm64 updates for 6.6:
  :
  : - Don't unnecessarily align non-stack allocations in the EL2 VA space
  :
  : - Drop HCR_VIRT_EXCP_MASK, which was never used...
  :
  : - Don't use smp_processor_id() in kvm_arch_vcpu_load(),
  :   but the cpu parameter instead
  :
  : - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
  :
  : - Remove prototypes without implementations
  : .
  KVM: arm64: Remove size-order align in the nVHE hyp private VA range
  KVM: arm64: Remove unused declarations
  KVM: arm64: Remove redundant kvm_set_pfn_accessed() from user_mem_abort()
  KVM: arm64: Drop HCR_VIRT_EXCP_MASK
  KVM: arm64: Use the known cpu id instead of smp_processor_id()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-28 09:30:32 +01:00
Marc Zyngier
50a40ff7d3 Merge branch kvm-arm64/6.6/pmu-fixes into kvmarm-master/next
* kvm-arm64/6.6/pmu-fixes:
  : .
  : Another set of PMU fixes, courtesy of Reiji Watanabe.
  : From the cover letter:
  :
  : "This series fixes a couple of PMUver related handling of
  : vPMU support.
  :
  : On systems where the PMUVer is not uniform across all PEs,
  : KVM currently does not advertise PMUv3 to the guest,
  : even if userspace successfully runs KVM_ARM_VCPU_INIT with
  : KVM_ARM_VCPU_PMU_V3."
  :
  : Additionally, a fix for an obscure counter oversubscription
  : issue happening when the host profiles the guest's EL0.
  : .
  KVM: arm64: pmu: Guard PMU emulation definitions with CONFIG_KVM
  KVM: arm64: pmu: Resync EL0 state on counter rotation
  KVM: arm64: PMU: Don't advertise STALL_SLOT_{FRONTEND,BACKEND}
  KVM: arm64: PMU: Don't advertise the STALL_SLOT event
  KVM: arm64: PMU: Avoid inappropriate use of host's PMUVer
  KVM: arm64: PMU: Disallow vPMU on non-uniform PMUVer

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-28 09:29:11 +01:00
Marc Zyngier
d58335d10f Merge branch kvm-arm64/tlbi-range into kvmarm-master/next
* kvm-arm64/tlbi-range:
  : .
  : FEAT_TLBIRANGE support, courtesy of Raghavendra Rao Ananta.
  : From the cover letter:
  :
  : "In certain code paths, KVM/ARM currently invalidates the entire VM's
  : page-tables instead of just invalidating a necessary range. For example,
  : when collapsing a table PTE to a block PTE, instead of iterating over
  : each PTE and flushing them, KVM uses 'vmalls12e1is' TLBI operation to
  : flush all the entries. This is inefficient since the guest would have
  : to refill the TLBs again, even for the addresses that aren't covered
  : by the table entry. The performance impact would scale poorly if many
  : addresses in the VM are going through this remapping.
  :
  : For architectures that implement FEAT_TLBIRANGE, KVM can replace such
  : inefficient paths by performing the invalidations only on the range of
  : addresses that are in scope. This series tries to achieve the same in
  : the areas of stage-2 map, unmap and write-protecting the pages."
  : .
  KVM: arm64: Use TLBI range-based instructions for unmap
  KVM: arm64: Invalidate the table entries upon a range
  KVM: arm64: Flush only the memslot after write-protect
  KVM: arm64: Implement kvm_arch_flush_remote_tlbs_range()
  KVM: arm64: Define kvm_tlb_flush_vmid_range()
  KVM: arm64: Implement __kvm_tlb_flush_vmid_range()
  arm64: tlb: Implement __flush_s2_tlb_range_op()
  arm64: tlb: Refactor the core flush algorithm of __flush_tlb_range
  KVM: Move kvm_arch_flush_remote_tlbs_memslot() to common code
  KVM: Allow range-based TLB invalidation from common code
  KVM: Remove CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
  KVM: arm64: Use kvm_arch_flush_remote_tlbs()
  KVM: Declare kvm_arch_flush_remote_tlbs() globally
  KVM: Rename kvm_arch_flush_remote_tlb() to kvm_arch_flush_remote_tlbs()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-28 09:29:02 +01:00
Marc Zyngier
c1907626dd Merge branch kvm-arm64/nv-trap-forwarding into kvmarm-master/next
* kvm-arm64/nv-trap-forwarding: (30 commits)
  : .
  : This implements the so-called "trap forwarding" infrastructure, which
  : gets used when we take a trap from an L2 guest that the L1 guest
  : wants to see for itself.
  : .
  KVM: arm64: nv: Add trap description for SPSR_EL2 and ELR_EL2
  KVM: arm64: nv: Select XARRAY_MULTI to fix build error
  KVM: arm64: nv: Add support for HCRX_EL2
  KVM: arm64: Move HCRX_EL2 switch to load/put on VHE systems
  KVM: arm64: nv: Expose FGT to nested guests
  KVM: arm64: nv: Add switching support for HFGxTR/HDFGxTR
  KVM: arm64: nv: Expand ERET trap forwarding to handle FGT
  KVM: arm64: nv: Add SVC trap forwarding
  KVM: arm64: nv: Add trap forwarding for HDFGxTR_EL2
  KVM: arm64: nv: Add trap forwarding for HFGITR_EL2
  KVM: arm64: nv: Add trap forwarding for HFGxTR_EL2
  KVM: arm64: nv: Add fine grained trap forwarding infrastructure
  KVM: arm64: nv: Add trap forwarding for CNTHCTL_EL2
  KVM: arm64: nv: Add trap forwarding for MDCR_EL2
  KVM: arm64: nv: Expose FEAT_EVT to nested guests
  KVM: arm64: nv: Add trap forwarding for HCR_EL2
  KVM: arm64: nv: Add trap forwarding infrastructure
  KVM: arm64: Restructure FGT register switching
  KVM: arm64: nv: Add FGT registers
  KVM: arm64: Add missing HCR_EL2 trap bits
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-28 09:28:53 +01:00
Vincent Donnefort
f156a7d13f KVM: arm64: Remove size-order align in the nVHE hyp private VA range
commit f922c13e77 ("KVM: arm64: Introduce
pkvm_alloc_private_va_range()") and commit 92abe0f81e ("KVM: arm64:
Introduce hyp_alloc_private_va_range()") added an alignment for the
start address of any allocation into the nVHE hypervisor private VA
range.

This alignment (order of the size of the allocation) is intended to enable
efficient stack verification (if the PAGE_SHIFT bit is zero, the stack
pointer is on the guard page and a stack overflow occurred).

But this is only necessary for stack allocations and can waste a lot of
VA space. So instead add stack-specific functions that handle the guard
page requirements, while other users (e.g. fixmap) only get page
alignment.

Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811112037.1147863-1-vdonnefort@google.com
2023-08-26 12:00:54 +01:00
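
The stack-overflow check that motivates the alignment can be sketched as follows, assuming a one-page stack sitting directly above its guard page; the names are illustrative:

    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_PAGE_SHIFT 12
    #define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

    /*
     * With the stack page placed just above its guard page and aligned so
     * that the two differ only in the PAGE_SHIFT bit, a cleared PAGE_SHIFT
     * bit in SP means SP sits on the guard page, i.e. the stack overflowed.
     */
    static bool demo_hyp_stack_overflowed(uintptr_t sp)
    {
        return !(sp & DEMO_PAGE_SIZE);
    }
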
Marc Zyngier
c948a0a2f5 KVM: arm64: nv: Add trap description for SPSR_EL2 and ELR_EL2
Having carved a hole for SP_EL1, we are now missing the entries
for SPSR_EL2 and ELR_EL2. Add them back.

Reported-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-08-23 20:03:25 +01:00
Marc Zyngier
b1f778a223 KVM: arm64: pmu: Resync EL0 state on counter rotation
Huang Shijie reports that, when profiling a guest from the host
with a number of events that exceeds the number of available
counters, the reported counts are wildly inaccurate. Without
the counter oversubscription, the reported counts are correct.

Their investigation indicates that upon counter rotation (which
takes place on the back of a timer interrupt), we fail to
re-apply the guest EL0 enabling, leading to the counting of host
events instead of guest events.

In order to solve this, add yet another hook between the host PMU
driver and KVM, re-applying the guest EL0 configuration if the
right conditions apply (the host is VHE, we are in interrupt
context, and we interrupted a running vcpu). This triggers a new
vcpu request which will apply the correct configuration on guest
reentry.

With this, we have the correct counts, even when the counters are
oversubscribed.

Reported-by: Huang Shijie <shijie@os.amperecomputing.com>
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Huang Shijie <shijie@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230809013953.7692-1-shijie@os.amperecomputing.com
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230820090108.177817-1-maz@kernel.org
2023-08-22 13:35:51 +01:00
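
A rough sketch of the hook-plus-request flow described above; the structures and the request flag are stand-ins, not the actual KVM request machinery:

    #include <stdbool.h>
    #include <stddef.h>

    /* Stand-in vCPU with a pending-request bitmap; purely illustrative. */
    struct demo_vcpu {
        unsigned long requests;
    };

    #define DEMO_REQ_RESYNC_PMU_EL0 (1UL << 0)

    /* Host PMU rotation hook: queue a resync only under the stated conditions. */
    static void demo_pmu_rotation_hook(struct demo_vcpu *running_vcpu,
                                       bool host_is_vhe, bool in_irq)
    {
        if (host_is_vhe && in_irq && running_vcpu != NULL)
            running_vcpu->requests |= DEMO_REQ_RESYNC_PMU_EL0;
    }

    /* On guest reentry: consume the request and re-apply guest EL0 enabling. */
    static bool demo_consume_resync(struct demo_vcpu *vcpu)
    {
        if (!(vcpu->requests & DEMO_REQ_RESYNC_PMU_EL0))
            return false;

        vcpu->requests &= ~DEMO_REQ_RESYNC_PMU_EL0;
        return true;    /* caller re-applies the guest EL0 counter configuration */
    }
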
Reiji Watanabe
64b81000b6 KVM: arm64: PMU: Don't advertise STALL_SLOT_{FRONTEND,BACKEND}
Don't advertise the STALL_SLOT_{FRONT,BACK}END events to the guest,
similar to the STALL_SLOT event, as when any of these three events
is implemented, all three of them should be implemented,
according to the Arm ARM.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230819043947.4100985-5-reijiw@google.com
2023-08-20 09:42:16 +01:00
Reiji Watanabe
8c694f557f KVM: arm64: PMU: Don't advertise the STALL_SLOT event
Currently, KVM hides the STALL_SLOT event for guests if the
host PMU version is PMUv3p4 or newer, as PMMIR_EL1 is handled
as RAZ for the guests. But, this should be based on the guests'
PMU version (instead of the host PMU version), as an older PMU
that doesn't support PMMIR_EL1 could support the STALL_SLOT
event, according to the Arm ARM. Exposing the STALL_SLOT event
without PMMIR_EL1 won't be very useful anyway though.

Stop advertising the STALL_SLOT event for guests unconditionally,
rather than fixing or keeping the inaccurate check just to
advertise the event in a case where it is not very useful.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230819043947.4100985-4-reijiw@google.com
2023-08-20 09:42:16 +01:00
Reiji Watanabe
335ca49ff3 KVM: arm64: PMU: Avoid inappropriate use of host's PMUVer
Avoid using the PMUVer of the host's PMU hardware to determine
the PMU event mask, except in one case, as the value of host's
PMUVer may differ from the value of ID_AA64DFR0_EL1.PMUVer for
the guest.

The exception case is when using the PMUVer to determine the
valid range of events for KVM_ARM_VCPU_PMU_V3_FILTER, as it has
been allowing userspace to specify events that are valid for
the PMU hardware, regardless of the value of the guest's
ID_AA64DFR0_EL1.PMUVer.  KVM will use a valid range of events
based on the value of the guest's ID_AA64DFR0_EL1.PMUVer,
in order to effectively filter events that the guest attempts
to program though.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230819043947.4100985-3-reijiw@google.com
2023-08-20 09:42:16 +01:00
Reiji Watanabe
ec3eb9ed60 KVM: arm64: PMU: Disallow vPMU on non-uniform PMUVer
Disallow userspace from configuring vPMU for guests on systems
where the PMUVer is not uniform across all PEs.
KVM has not been advertising PMUv3 to the guests with vPMU on
such systems anyway, and such systems would be extremely
uncommon and unlikely to even use KVM.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230819043947.4100985-2-reijiw@google.com
2023-08-20 09:42:16 +01:00
Sean Christopherson
3e1efe2b67 KVM: Wrap kvm_{gfn,hva}_range.pte in a per-action union
Wrap kvm_{gfn,hva}_range.pte in a union so that future notifier events can
pass event specific information up and down the stack without needing to
constantly expand and churn the APIs.  Lockless aging of SPTEs will pass
around a bitmap, and support for memory attributes will pass around the
new attributes for the range.

Add a "KVM_NO_ARG" placeholder to simplify handling events without an
argument (creating a dummy union variable is mildly annoying).

Opportunistically drop explicit zero-initialization of the "pte" field, as
omitting the field (now a union) has the same effect.

Cc: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/all/CAOUHufagkd2Jk3_HrVoFFptRXM=hX2CV8f+M-dka-hJU4bP8kw@mail.gmail.com
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/r/20230729004144.1054885-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-08-17 11:26:53 -07:00
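
The shape of the change can be shown with a small standalone struct; the member and type names are illustrative of the "event-specific argument in a union" idea rather than the exact KVM definitions:

    #include <stdint.h>

    /* Event-specific payload carried through the notifier handlers. */
    union demo_notifier_arg {
        uint64_t pte;           /* existing user: pte-change events */
        uint64_t attributes;    /* future user mentioned above (memory attributes) */
    };

    struct demo_gfn_range {
        uint64_t start;
        uint64_t end;
        union demo_notifier_arg arg;    /* replaces the old bare 'pte' member */
    };
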
Randy Dunlap
60046980bf KVM: arm64: nv: Select XARRAY_MULTI to fix build error
populate_nv_trap_config() uses xa_store_range(), which is only built
when XARRAY_MULTI is set, so select that symbol to prevent the build error.

aarch64-linux-ld: arch/arm64/kvm/emulate-nested.o: in function `populate_nv_trap_config':
emulate-nested.c:(.init.text+0x17c): undefined reference to `xa_store_range'

Fixes: e58ec47bf6 ("KVM: arm64: nv: Add trap forwarding infrastructure")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: kvmarm@lists.linux.dev
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230816210949.17117-1-rdunlap@infradead.org
2023-08-17 10:24:00 +01:00
Marc Zyngier
03fb54d0aa KVM: arm64: nv: Add support for HCRX_EL2
HCRX_EL2 has an interesting effect on HFGITR_EL2, as it conditions
the traps of TLBI*nXS.

Expand the FGT support to add a new Fine Grained Filter that will
get checked when the instruction gets trapped, allowing the shadow
register to override the trap as needed.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-29-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier
a63cf31139 KVM: arm64: Move HCRX_EL2 switch to load/put on VHE systems
Although the nVHE behaviour requires HCRX_EL2 to be switched
on each switch between host and guest, there is nothing in
this register that would affect a VHE host.

It is thus possible to save/restore this register on load/put
on VHE systems, avoiding unnecessary sysreg access on the hot
path. Additionally, it avoids unnecessary traps when running
with NV.

To achieve this, simply move the read/writes to the *_common()
helpers, which are called on load/put on VHE, and more eagerly
on nVHE.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-28-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier
0a5d28433a KVM: arm64: nv: Expose FGT to nested guests
Now that we have FGT support, expose the feature to NV guests.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-27-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier
d4d2dacc7c KVM: arm64: nv: Add switching support for HFGxTR/HDFGxTR
Now that we can evaluate the FGT registers, allow them to be merged
with the hypervisor's own configuration (in the case of HFG{RW}TR_EL2)
or simply set for HFGITR_EL2, HDFGRTR_EL2 and HDFGWTR_EL2.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-26-maz@kernel.org
2023-08-17 10:00:28 +01:00
Marc Zyngier
ea3b27d8de KVM: arm64: nv: Expand ERET trap forwarding to handle FGT
We already handle ERET being trapped from an L1 guest in hyp context.
However, with FGT, we can also have ERET being trapped from L2, and
this needs to be reinjected into L1.

Add the required exception routing.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-25-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
a77b31dce4 KVM: arm64: nv: Add SVC trap forwarding
HFGITR_EL2 allows SVC instructions to be trapped to EL2. Allow these
traps to be forwarded. Take this opportunity to deny any 32bit activity
when NV is enabled.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-24-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
d0be0b2ede KVM: arm64: nv: Add trap forwarding for HDFGxTR_EL2
... and finally, the Debug version of FGT, with its *enormous*
list of trapped registers.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-23-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
039f9f12de KVM: arm64: nv: Add trap forwarding for HFGITR_EL2
Similarly, implement the trap forwarding for instructions affected
by HFGITR_EL2.

Note that the TLBI*nXS instructions should be affected by HCRX_EL2,
which will be dealt with down the line. Also, ERET* and SVC traps
are handled separately.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-22-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
5a24ea7869 KVM: arm64: nv: Add trap forwarding for HFGxTR_EL2
Implement the trap forwarding for traps described by HFGxTR_EL2,
reusing the Fine Grained Traps infrastructure previously implemented.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-21-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
15b4d82d69 KVM: arm64: nv: Add fine grained trap forwarding infrastructure
Fine Grained Traps are fun. Not.

Implement the fine grained trap forwarding, reusing the Coarse Grained
Traps infrastructure previously implemented.

Each sysreg/instruction inserted in the xarray gets a FGT group
(vaguely equivalent to a register number), a bit number in that register,
and a polarity.

It is then pretty easy to check the FGT state at handling time, just
like we do for the coarse version (it is just faster).

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-20-maz@kernel.org
2023-08-17 10:00:27 +01:00
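
A toy encoding of the per-sysreg FGT descriptor (group, bit, polarity) and the check performed at trap time; the layout below is invented for illustration and is not the kernel's actual encoding:

    #include <stdbool.h>
    #include <stdint.h>

    /* Pack a descriptor: which FGT register (group), which bit, and its polarity. */
    #define DEMO_FGT_GROUP_SHIFT 0
    #define DEMO_FGT_BIT_SHIFT   8
    #define DEMO_FGT_POL_SHIFT   16

    static uint32_t demo_fgt_desc(uint8_t group, uint8_t bit, bool positive)
    {
        return ((uint32_t)group    << DEMO_FGT_GROUP_SHIFT) |
               ((uint32_t)bit      << DEMO_FGT_BIT_SHIFT)   |
               ((uint32_t)positive << DEMO_FGT_POL_SHIFT);
    }

    /*
     * At trap time, look at the bit of the guest hypervisor's FGT register
     * selected by the descriptor's group and decide whether to forward the
     * trap to L1.
     */
    static bool demo_fgt_must_forward(uint32_t desc, uint64_t guest_fgt_reg)
    {
        uint8_t bit      = (desc >> DEMO_FGT_BIT_SHIFT) & 0xff;
        bool    positive = (desc >> DEMO_FGT_POL_SHIFT) & 0x1;
        bool    set      = (guest_fgt_reg >> bit) & 0x1;

        /* Positive polarity: forward when the bit is set; negative: when clear. */
        return positive ? set : !set;
    }
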
Marc Zyngier
e880bd3363 KVM: arm64: nv: Add trap forwarding for CNTHCTL_EL2
Describe the CNTHCTL_EL2 register, and associate it with all the sysregs
it allows to be trapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-19-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
cb31632c44 KVM: arm64: nv: Add trap forwarding for MDCR_EL2
Describe the MDCR_EL2 register, and associate it with all the sysregs
it allows to be trapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-18-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
a0b70fb00d KVM: arm64: nv: Expose FEAT_EVT to nested guests
Now that we properly implement FEAT_EVT (as we correctly forward
traps), expose it to guests.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-17-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
d0fc0a2519 KVM: arm64: nv: Add trap forwarding for HCR_EL2
Describe the HCR_EL2 register, and associate it with all the sysregs
it allows to be trapped.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-16-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
e58ec47bf6 KVM: arm64: nv: Add trap forwarding infrastructure
A significant part of what an NV hypervisor needs to do is to decide
whether a trap from an L2+ guest has to be forwarded to an L1 guest
or handled locally. This is done by checking for the trap bits that
the guest hypervisor has set and acting accordingly, as described by
the architecture.

A previous approach was to sprinkle a bunch of checks in all the
system register accessors, but this is pretty error-prone and doesn't
help in getting an overview of what is happening.

Instead, implement a set of global tables that describe a trap bit,
combinations of trap bits, behaviours on trap, and what bits must
be evaluated on a system register trap.

Although this is painful to describe, it allows each and every
control bit to be specified in a static manner. To make it efficient,
the table is inserted in an xarray that is global to the system,
and checked each time we trap a system register while running
an L2 guest.

Add the basic infrastructure for now, while additional patches will
implement configuration registers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20230815183903.2735724-15-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
e930694e61 KVM: arm64: Restructure FGT register switching
As we're about to majorly extend the handling of FGT registers,
restructure the code to actually save/restore the registers
as required. This is made easy thanks to the previous addition
of the EL2 registers, allowing us to use the host context for
this purpose.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-14-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
50d2fe4648 KVM: arm64: nv: Add FGT registers
Add the 5 registers covering FEAT_FGT. The AMU-related registers
are currently left out as we don't have a plan for them. Yet.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-13-maz@kernel.org
2023-08-17 10:00:27 +01:00
Marc Zyngier
484f86824a KVM: arm64: Correctly handle ACCDATA_EL1 traps
As we blindly reset some HFGxTR_EL2 bits to 0, we also randomly trap
unsuspecting sysregs that have their trap bits with a negative
polarity.

ACCDATA_EL1 is one such register that can be accessed by the guest,
causing a splat on the host as we don't have a proper handler for
it.

Adding such a handler addresses the issue, though there are a number
of other registers missing as the current architecture documentation
doesn't describe them yet.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230815183903.2735724-11-maz@kernel.org
2023-08-17 10:00:27 +01:00
Raghavendra Rao Ananta
7657ea920c KVM: arm64: Use TLBI range-based instructions for unmap
The current implementation of the stage-2 unmap walker traverses
the given range and, as a part of break-before-make, performs
TLB invalidations with a DSB for every PTE. A large number of these
invalidations could cause a performance bottleneck on some systems.

Hence, if the system supports FEAT_TLBIRANGE, defer the TLB
invalidations until the entire walk is finished, and then
use range-based instructions to invalidate the TLBs in one go.
Condition deferred TLB invalidation on the system supporting FWB,
as the optimization is entirely pointless when the unmap walker
needs to perform CMOs.

Rename stage2_put_pte() to stage2_unmap_put_pte() as the function
now serves the stage-2 unmap walker specifically, rather than
being generic.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-15-rananta@google.com
2023-08-17 09:40:35 +01:00
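
The "defer, then flush once" pattern can be sketched in standalone C; the walker body and the TLBI helper are placeholders:

    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_GRANULE 4096ULL

    /* Placeholder for a range-based stage-2 TLB invalidation. */
    static void demo_tlbi_range(uint64_t ipa, uint64_t size)
    {
        (void)ipa;
        (void)size;
    }

    /*
     * Unmap [ipa, ipa + size): when range TLBI is available and FWB means no
     * per-entry CMOs are needed, skip the per-PTE invalidation (and its DSB)
     * and issue a single range-based invalidation once the walk is done.
     */
    static void demo_unmap_range(uint64_t ipa, uint64_t size,
                                 bool has_tlbirange, bool has_fwb)
    {
        bool defer = has_tlbirange && has_fwb;

        for (uint64_t off = 0; off < size; off += DEMO_GRANULE) {
            /* ... clear the PTE covering ipa + off ... */
            if (!defer)
                demo_tlbi_range(ipa + off, DEMO_GRANULE);   /* per-PTE TLBI + DSB */
        }

        if (defer)
            demo_tlbi_range(ipa, size);     /* one deferred, range-based TLBI */
    }
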
Raghavendra Rao Ananta
defc8cc7ab KVM: arm64: Invalidate the table entries upon a range
Currently, during operations such as a hugepage collapse,
KVM would flush the entire VM's context using the 'vmalls12e1is'
TLBI operation. Specifically, if the VM is faulting on many
hugepages (say after dirty-logging), this creates a performance
penalty for the guest, as TLB entries for pages that were already
faulted in earlier would have to be refilled again.

Instead, leverage kvm_tlb_flush_vmid_range() for table entries.
If the system supports it, only the required range will be
flushed. Otherwise, it will fall back to the previous mechanism.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-14-rananta@google.com
2023-08-17 09:40:35 +01:00
Raghavendra Rao Ananta
3756b6f2bb KVM: arm64: Flush only the memslot after write-protect
After write-protecting the region, KVM currently invalidates
all of the TLB entries using kvm_flush_remote_tlbs(). Instead,
scope the invalidation only to the targeted memslot. If
supported, the architecture will use the range-based TLBI
instructions to flush the memslot, or else fall back to flushing
all of the TLBs.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-13-rananta@google.com
2023-08-17 09:40:35 +01:00
Raghavendra Rao Ananta
c42b6f0b1c KVM: arm64: Implement kvm_arch_flush_remote_tlbs_range()
Implement kvm_arch_flush_remote_tlbs_range() for arm64
to invalidate the given range in the TLB.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-12-rananta@google.com
2023-08-17 09:40:35 +01:00
Raghavendra Rao Ananta
117940aa6e KVM: arm64: Define kvm_tlb_flush_vmid_range()
Implement the helper kvm_tlb_flush_vmid_range() that acts
as a wrapper for range-based TLB invalidations. For the
given VMID, use the range-based TLBI instructions to do
the job or fall back to invalidating all the TLB entries.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-11-rananta@google.com
2023-08-17 09:40:35 +01:00
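
A sketch of the wrapper-with-fallback behaviour described above; the capability flag and helpers are stand-ins for the real checks and TLBI calls:

    #include <stdbool.h>
    #include <stdint.h>

    /* Placeholders for the two invalidation flavours. */
    static void demo_tlbi_vmid_range(uint16_t vmid, uint64_t ipa, uint64_t size)
    {
        (void)vmid; (void)ipa; (void)size;
    }

    static void demo_tlbi_vmid_all(uint16_t vmid)
    {
        (void)vmid;
    }

    /*
     * Range-based invalidation for a VMID when FEAT_TLBIRANGE is available,
     * otherwise fall back to invalidating every entry for that VMID.
     */
    static void demo_flush_vmid_range(uint16_t vmid, uint64_t ipa, uint64_t size,
                                      bool has_tlbirange)
    {
        if (has_tlbirange)
            demo_tlbi_vmid_range(vmid, ipa, size);
        else
            demo_tlbi_vmid_all(vmid);
    }
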
Raghavendra Rao Ananta
6354d15052 KVM: arm64: Implement __kvm_tlb_flush_vmid_range()
Define __kvm_tlb_flush_vmid_range() (for VHE and nVHE)
to flush a range of stage-2 page-tables using IPA in one go.
If the system supports FEAT_TLBIRANGE, the following patches
would conveniently replace global TLBI such as vmalls12e1is
in the map, unmap, and dirty-logging paths with ripas2e1is
instead.

Signed-off-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230811045127.3308641-10-rananta@google.com
2023-08-17 09:40:35 +01:00