With the transition to kvm_arch_vcpu_run_pid_change() to handle
the "run once" activities, it becomes obvious that has_run_once
is now an exact shadow of vcpu->pid.
Replace vcpu->arch.has_run_once with a new vcpu_has_run_once()
helper that directly checks for vcpu->pid, and get rid of the
now unused field.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
The kvm_arch_vcpu_run_pid_change() helper gets called on each PID
change. The kvm_vcpu_first_run_init() helper gets run on the...
first run(!) of a vcpu.
As it turns out, the first run of a vcpu also triggers a PID change
event (vcpu->pid is initially NULL).
Use this property to merge these two helpers and get rid of another
arm64-specific oddity.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Restructure kvm_vcpu_first_run_init() to set the has_run_once
flag after having completed all the "run once" activities.
This includes moving the flip of the userspace irqchip static key
to a point where nothing can fail.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Having kvm_arch_vcpu_run_pid_change() inline doesn't bring anything
to the table. Move it next to kvm_vcpu_first_run_init(), which will
be convenient for what is next to come.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
We currently have to maintain a mapping the thread_info structure
at EL2 in order to be able to check the TIF_FOREIGN_FPSTATE flag.
In order to eventually get rid of this, start with a vcpu flag that
shadows the thread flag on each entry into the hypervisor.
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Generally, it doesn't make sense to return the recommended maximum number
of vCPUs which exceeds the maximum possible number of vCPUs.
Note: ARM64 is special as the value returned by KVM_CAP_MAX_VCPUS differs
depending on whether it is a system-wide ioctl or a per-VM one. Previously,
KVM_CAP_NR_VCPUS didn't have this difference and it seems preferable to
keep the status quo. Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
which is what gets returned by system-wide KVM_CAP_MAX_VCPUS.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211116163443.88707-2-vkuznets@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move x86's perf guest callbacks into common KVM, as they are semantically
identical to arm64's callbacks (the only other such KVM callbacks).
arm64 will convert to the common versions in a future patch.
Implement the necessary arm64 arch hooks now to avoid having to provide
stubs or a temporary #define (from x86) to avoid arm64 compilation errors
when CONFIG_GUEST_PERF_EVENTS=y.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211111020738.2512932-13-seanjc@google.com
KVM/arm64 fixes for 5.16, take #1
- Fix the host S2 finalization by solely iterating over the memblocks
instead of the whole IPA space
- Tighten the return value of kvm_vcpu_preferred_target() now that
32bit support is long gone
- Make sure the extraction of ESR_ELx.EC is limited to the architected
bits
- Comment fixups
KVM/arm64 updates for Linux 5.16
- More progress on the protected VM front, now with the full
fixed feature set as well as the limitation of some hypercalls
after initialisation.
- Cleanup of the RAZ/WI sysreg handling, which was pointlessly
complicated
- Fixes for the vgic placement in the IPA space, together with a
bunch of selftests
- More memcg accounting of the memory allocated on behalf of a guest
- Timer and vgic selftests
- Workarounds for the Apple M1 broken vgic implementation
- KConfig cleanups
- New kvmarm.mode=none option, for those who really dislike us
* kvm-arm64/pkvm/fixed-features: (22 commits)
: .
: Add the pKVM fixed feature that allows a bunch of exceptions
: to either be forbidden or be easily handled at EL2.
: .
KVM: arm64: pkvm: Give priority to standard traps over pvm handling
KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array()
KVM: arm64: pkvm: Move kvm_handle_pvm_restricted around
KVM: arm64: pkvm: Consolidate include files
KVM: arm64: pkvm: Preserve pending SError on exit from AArch32
KVM: arm64: pkvm: Handle GICv3 traps as required
KVM: arm64: pkvm: Drop sysregs that should never be routed to the host
KVM: arm64: pkvm: Drop AArch32-specific registers
KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WI
KVM: arm64: pkvm: Use a single function to expose all id-regs
KVM: arm64: Fix early exit ptrauth handling
KVM: arm64: Handle protected guests at 32 bits
KVM: arm64: Trap access to pVM restricted features
KVM: arm64: Move sanitized copies of CPU features
KVM: arm64: Initialize trap registers for protected VMs
KVM: arm64: Add handlers for protected VM System Registers
KVM: arm64: Simplify masking out MTE in feature id reg
KVM: arm64: Add missing field descriptor for MDCR_EL2
KVM: arm64: Pass struct kvm to per-EC handlers
KVM: arm64: Move early handlers to per-EC handlers
...
Signed-off-by: Marc Zyngier <maz@kernel.org>
* kvm-arm64/memory-accounting:
: .
: Sprinkle a bunch of GFP_KERNEL_ACCOUNT all over the code base
: to better track memory allocation made on behalf of a VM.
: .
KVM: arm64: Add memcg accounting to KVM allocations
KVM: arm64: vgic: Add memcg accounting to vgic allocations
Signed-off-by: Marc Zyngier <maz@kernel.org>
Inspired by commit 254272ce65 ("kvm: x86: Add memcg accounting to KVM
allocations"), it would be better to make arm64 KVM consistent with
common kvm codes.
The memory allocations of VM scope should be charged into VM process
cgroup, hence change GFP_KERNEL to GFP_KERNEL_ACCOUNT.
There remain a few cases since these allocations are global, not in VM
scope.
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210907123112.10232-3-justin.he@arm.com
Protected VMs have more restricted features that need to be
trapped. Moreover, the host should not be trusted to set the
appropriate trapping registers and their values.
Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
cptr_el2 at EL2 for protected guests, based on the values of the
guest's feature id registers.
No functional change intended as trap handlers introduced in the
previous patch are still not hooked in to the guest exit
handlers.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211010145636.1950948-9-tabba@google.com
Add system register handlers for protected VMs. These cover Sys64
registers (including feature id registers), and debug.
No functional change intended as these are not hooked in yet to
the guest exit handlers introduced earlier. So when trapping is
triggered, the exit handlers let the host handle it, as before.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211010145636.1950948-8-tabba@google.com
* kvm-arm64/misc-5.16:
: .
: - Allow KVM to be disabled from the command-line
: - Clean up CONFIG_KVM vs CONFIG_HAVE_KVM
: .
KVM: arm64: Depend on HAVE_KVM instead of OF
KVM: arm64: Unconditionally include generic KVM's Kconfig
KVM: arm64: Allow KVM to be disabled from the command line
Signed-off-by: Marc Zyngier <maz@kernel.org>
Although KVM can be compiled out of the kernel, it cannot be disabled
at runtime. Allow this possibility by introducing a new mode that
will prevent KVM from initialising.
This is useful in the (limited) circumstances where you don't want
KVM to be available (what is wrong with you?), or when you want
to install another hypervisor instead (good luck with that).
Reviewed-by: David Brazdil <dbrazdil@google.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Scull <ascull@google.com>
Link: https://lore.kernel.org/r/20211001170553.3062988-1-maz@kernel.org
If the __pkvm_prot_finalize hypercall returns an error, we WARN but fail
to propagate the failure code back to kvm_arch_init().
Pass a pointer to a zero-initialised return variable so that failure
to finalise the pKVM protections on a host CPU can be reported back to
KVM.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-5-will@kernel.org
The stub hypercalls provide mechanisms to reset and replace the EL2 code,
so uninstall them once pKVM has been initialised in order to ensure the
integrity of the hypervisor code.
To ensure pKVM initialisation remains functional, split cpu_hyp_reinit()
into two helper functions to separate usage of the stub from usage of
pkvm hypercalls either side of __pkvm_init on the boot CPU.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211008135839.1193-4-will@kernel.org
By switching from kfree() to kvfree() in kvm_arch_free_vm() Arm64 can
use the common variant. This can be accomplished by adding another
macro __KVM_HAVE_ARCH_VM_FREE, which will be used only by x86 for now.
Further simplification can be achieved by adding __kvm_arch_free_vm()
doing the common part.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-Id: <20210903130808.30142-5-jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* kvm-arm64/pkvm-fixed-features-prologue:
: Rework a bunch of common infrastructure as a prologue
: to Fuad Tabba's protected VM fixed feature series.
KVM: arm64: Upgrade trace_kvm_arm_set_dreg32() to 64bit
KVM: arm64: Add config register bit definitions
KVM: arm64: Add feature register flag definitions
KVM: arm64: Track value of cptr_el2 in struct kvm_vcpu_arch
KVM: arm64: Keep mdcr_el2's value as set by __init_el2_debug
KVM: arm64: Restore mdcr_el2 from vcpu
KVM: arm64: Refactor sys_regs.h,c for nVHE reuse
KVM: arm64: Fix names of config register fields
KVM: arm64: MDCR_EL2 is a 64-bit register
KVM: arm64: Remove trailing whitespace in comment
KVM: arm64: placeholder to check if VM is protected
Signed-off-by: Marc Zyngier <maz@kernel.org>
* kvm-arm64/mmu/vmid-cleanups:
: Cleanup the stage-2 configuration by providing a single helper,
: and tidy up some of the ordering requirements for the VMID
: allocator.
KVM: arm64: Upgrade VMID accesses to {READ,WRITE}_ONCE
KVM: arm64: Unify stage-2 programming behind __load_stage2()
KVM: arm64: Move kern_hyp_va() usage in __load_guest_stage2() into the callers
Signed-off-by: Marc Zyngier <maz@kernel.org>
Switch KVM/arm64 to the generic entry code, courtesy of Oliver Upton
* kvm-arm64/generic-entry:
KVM: arm64: Use generic KVM xfer to guest work function
entry: KVM: Allow use of generic KVM entry w/o full generic support
KVM: arm64: Record number of signal exits as a vCPU stat
Signed-off-by: Marc Zyngier <maz@kernel.org>
PSCI fixes from Oliver Upton:
- Plug race on reset
- Ensure that a pending reset is applied before userspace accesses
- Reject PSCI requests with illegal affinity bits
* kvm-arm64/psci/cpu_on:
selftests: KVM: Introduce psci_cpu_on_test
KVM: arm64: Enforce reserved bits for PSCI target affinities
KVM: arm64: Handle PSCI resets before userspace touches vCPU state
KVM: arm64: Fix read-side race on updates to vcpu reset state
Signed-off-by: Marc Zyngier <maz@kernel.org>
Prevent kmemleak from peeking into the HYP data, which is fatal
in protected mode.
* kvm-arm64/mmu/kmemleak-pkvm:
KVM: arm64: Unregister HYP sections from kmemleak in protected mode
arm64: Move .hyp.rodata outside of the _sdata.._edata range
Signed-off-by: Marc Zyngier <maz@kernel.org>
KVM/arm64 fixes for 5.14, take #2
- Plug race between enabling MTE and creating vcpus
- Fix off-by-one bug when checking whether an address range is RAM
Signed-off-by: Marc Zyngier <maz@kernel.org>
Track the baseline guest value for cptr_el2 in struct
kvm_vcpu_arch, similar to the other registers that control traps.
Use this value when setting cptr_el2 for the guest.
Currently this value is unchanged (CPTR_EL2_DEFAULT), but future
patches will set trapping bits based on features supported for
the guest.
No functional change intended.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210817081134.2918285-9-tabba@google.com
The CPU_ON PSCI call takes a payload that KVM uses to configure a
destination vCPU to run. This payload is non-architectural state and not
exposed through any existing UAPI. Effectively, we have a race between
CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
ran again before the VMM saves its state, the requested PC and context
ID are lost. When restored, the target vCPU will be runnable and start
executing at its old PC.
We can avoid this race by making sure the reset payload is serviced
before userspace can access a vCPU's state.
Fixes: 358b28f09f ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-3-oupton@google.com
Booting a KVM host in protected mode with kmemleak quickly results
in a pretty bad crash, as kmemleak doesn't know that the HYP sections
have been taken away. This is specially true for the BSS section,
which is part of the kernel BSS section and registered at boot time
by kmemleak itself.
Unregister the HYP part of the BSS before making that section
HYP-private. The rest of the HYP-specific data is obtained via
the page allocator or lives in other sections, none of which is
subjected to kmemleak.
Fixes: 90134ac9ca ("KVM: arm64: Protect the .hyp sections from the host")
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org # 5.13
Link: https://lore.kernel.org/r/20210802123830.2195174-3-maz@kernel.org
When enabling KVM_CAP_ARM_MTE the ioctl checks that there are no VCPUs
created to ensure that the capability is enabled before the VM is
running. However no locks are held at that point so it is
(theoretically) possible for another thread in the VMM to create VCPUs
between the check and actually setting mte_enabled. Close the race by
taking kvm->lock.
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Fixes: 673638f434 ("KVM: arm64: Expose KVM_ARM_CAP_MTE")
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210729160036.20433-1-steven.price@arm.com
Pull kvm updates from Paolo Bonzini:
"This covers all architectures (except MIPS) so I don't expect any
other feature pull requests this merge window.
ARM:
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration and
apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
PPC:
- Support for the H_RPT_INVALIDATE hypercall
- Conversion of Book3S entry/exit to C
- Bug fixes
S390:
- new HW facilities for guests
- make inline assembly more robust with KASAN and co
x86:
- Allow userspace to handle emulation errors (unknown instructions)
- Lazy allocation of the rmap (host physical -> guest physical
address)
- Support for virtualizing TSC scaling on VMX machines
- Optimizations to avoid shattering huge pages at the beginning of
live migration
- Support for initializing the PDPTRs without loading them from
memory
- Many TLB flushing cleanups
- Refuse to load if two-stage paging is available but NX is not (this
has been a requirement in practice for over a year)
- A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
CR0/CR4/EFER, using the MMU mode everywhere once it is computed
from the CPU registers
- Use PM notifier to notify the guest about host suspend or hibernate
- Support for passing arguments to Hyper-V hypercalls using XMM
registers
- Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap
on AMD processors
- Hide Hyper-V hypercalls that are not included in the guest CPUID
- Fixes for live migration of virtual machines that use the Hyper-V
"enlightened VMCS" optimization of nested virtualization
- Bugfixes (not many)
Generic:
- Support for retrieving statistics without debugfs
- Cleanups for the KVM selftests API"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits)
KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled
kvm: x86: disable the narrow guest module parameter on unload
selftests: kvm: Allows userspace to handle emulation errors.
kvm: x86: Allow userspace to handle emulation errors
KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on
KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT
KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic
KVM: x86: Enhance comments for MMU roles and nested transition trickiness
KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE
KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU
KVM: x86/mmu: Use MMU's role to determine PTTYPE
KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers
KVM: x86/mmu: Add a helper to calculate root from role_regs
KVM: x86/mmu: Add helper to update paging metadata
KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0
KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls
KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper
KVM: x86/mmu: Get nested MMU's root level from the MMU's role
...
Pull arm64 updates from Will Deacon:
"There's a reasonable amount here and the juicy details are all below.
It's worth noting that the MTE/KASAN changes strayed outside of our
usual directories due to core mm changes and some associated changes
to some other architectures; Andrew asked for us to carry these [1]
rather that take them via the -mm tree.
Summary:
- Optimise SVE switching for CPUs with 128-bit implementations.
- Fix output format from SVE selftest.
- Add support for versions v1.2 and 1.3 of the SMC calling
convention.
- Allow Pointer Authentication to be configured independently for
kernel and userspace.
- PMU driver cleanups for managing IRQ affinity and exposing event
attributes via sysfs.
- KASAN optimisations for both hardware tagging (MTE) and out-of-line
software tagging implementations.
- Relax frame record alignment requirements to facilitate 8-byte
alignment with KASAN and Clang.
- Cleanup of page-table definitions and removal of unused memory
types.
- Reduction of ARCH_DMA_MINALIGN back to 64 bytes.
- Refactoring of our instruction decoding routines and addition of
some missing encodings.
- Move entry code moved into C and hardened against harmful compiler
instrumentation.
- Update booting requirements for the FEAT_HCX feature, added to v8.7
of the architecture.
- Fix resume from idle when pNMI is being used.
- Additional CPU sanity checks for MTE and preparatory changes for
systems where not all of the CPUs support 32-bit EL0.
- Update our kernel string routines to the latest Cortex Strings
implementation.
- Big cleanup of our cache maintenance routines, which were
confusingly named and inconsistent in their implementations.
- Tweak linker flags so that GDB can understand vmlinux when using
RELR relocations.
- Boot path cleanups to enable early initialisation of per-cpu
operations needed by KCSAN.
- Non-critical fixes and miscellaneous cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (150 commits)
arm64: tlb: fix the TTL value of tlb_get_level
arm64: Restrict undef hook for cpufeature registers
arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS
arm64: insn: avoid circular include dependency
arm64: smp: Bump debugging information print down to KERN_DEBUG
drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe()
perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number
arm64: suspend: Use cpuidle context helpers in cpu_suspend()
PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter()
arm64: Convert cpu_do_idle() to using cpuidle context helpers
arm64: Add cpuidle context save/restore helpers
arm64: head: fix code comments in set_cpu_boot_mode_flag
arm64: mm: drop unused __pa(__idmap_text_start)
arm64: mm: fix the count comments in compute_indices
arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan
arm64: mm: Pass original fault address to handle_mm_fault()
arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK]
arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT
arm64/mm: Drop SWAPPER_INIT_MAP_SIZE
arm64: Conditionally configure PTR_AUTH key of the kernel.
...
Additional CPU sanity checks for MTE and preparatory changes for systems
where not all of the CPUs support 32-bit EL0.
* for-next/cpufeature:
arm64: Restrict undef hook for cpufeature registers
arm64: Kill 32-bit applications scheduled on 64-bit-only CPUs
KVM: arm64: Kill 32-bit vCPUs on systems with mismatched EL0 support
arm64: Allow mismatched 32-bit EL0 support
arm64: cpuinfo: Split AArch32 registers out into a separate struct
arm64: Check if GMID_EL1.BS is the same on all CPUs
arm64: Change the cpuinfo_arm64 member type for some sysregs to u64
KVM/arm64 support for MTE, courtesy of Steven Price.
It allows the guest to use memory tagging, and offers
a new userspace API to save/restore the tags.
* kvm-arm64/mmu/mte:
KVM: arm64: Document MTE capability and ioctl
KVM: arm64: Add ioctl to fetch/store tags in a guest
KVM: arm64: Expose KVM_ARM_CAP_MTE
KVM: arm64: Save/restore MTE registers
KVM: arm64: Introduce MTE VM feature
arm64: mte: Sync tags for pages where PTE is untagged
Signed-off-by: Marc Zyngier <maz@kernel.org>
The VMM may not wish to have it's own mapping of guest memory mapped
with PROT_MTE because this causes problems if the VMM has tag checking
enabled (the guest controls the tags in physical RAM and it's unlikely
the tags are correct for the VMM).
Instead add a new ioctl which allows the VMM to easily read/write the
tags from guest memory, allowing the VMM's mapping to be non-PROT_MTE
while the VMM can still read/write the tags for the purpose of
migration.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210621111716.37157-6-steven.price@arm.com
arm64 cache management function cleanup from Fuad Tabba,
shared with the arm64 tree.
* arm64/for-next/caches:
arm64: Rename arm64-internal cache maintenance functions
arm64: Fix cache maintenance function comments
arm64: sync_icache_aliases to take end parameter instead of size
arm64: __clean_dcache_area_pou to take end parameter instead of size
arm64: __clean_dcache_area_pop to take end parameter instead of size
arm64: __clean_dcache_area_poc to take end parameter instead of size
arm64: __flush_dcache_area to take end parameter instead of size
arm64: dcache_by_line_op to take end parameter instead of size
arm64: __inval_dcache_area to take end parameter instead of size
arm64: Fix comments to refer to correct function __flush_icache_range
arm64: Move documentation of dcache_by_line_op
arm64: assembler: remove user_alt
arm64: Downgrade flush_icache_range to invalidate
arm64: Do not enable uaccess for invalidate_icache_range
arm64: Do not enable uaccess for flush_icache_range
arm64: Apply errata to swsusp_arch_suspend_exit
arm64: assembler: add conditional cache fixups
arm64: assembler: replace `kaddr` with `addr`
Signed-off-by: Marc Zyngier <maz@kernel.org>
Commit 26778aaa13 ("KVM: arm64: Commit pending PC adjustemnts before
returning to userspace") fixed the PC updating issue by forcing an explicit
synchronisation of the exception state on vcpu exit to userspace.
However, we forgot to take into account the case where immediate_exit is
set by userspace and KVM_RUN will exit immediately. Fix it by resolving all
pending PC updates before returning to userspace.
Since __kvm_adjust_pc() relies on a loaded vcpu context, I moved the
immediate_exit checking right after vcpu_load(). We will get some overhead
if immediate_exit is true (which should hopefully be rare).
Fixes: 26778aaa13 ("KVM: arm64: Commit pending PC adjustemnts before returning to userspace")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210526141831.1662-1-yuzenghui@huawei.com
Cc: stable@vger.kernel.org # 5.11
Although naming across the codebase isn't that consistent, it
tends to follow certain patterns. Moreover, the term "flush"
isn't defined in the Arm Architecture reference manual, and might
be interpreted to mean clean, invalidate, or both for a cache.
Rename arm64-internal functions to make the naming internally
consistent, as well as making it consistent with the Arm ARM, by
specifying whether it applies to the instruction, data, or both
caches, whether the operation is a clean, invalidate, or both.
Also specify which point the operation applies to, i.e., to the
point of unification (PoU), coherency (PoC), or persistence
(PoP).
This commit applies the following sed transformation to all files
under arch/arm64:
"s/\b__flush_cache_range\b/caches_clean_inval_pou_macro/g;"\
"s/\b__flush_icache_range\b/caches_clean_inval_pou/g;"\
"s/\binvalidate_icache_range\b/icache_inval_pou/g;"\
"s/\b__flush_dcache_area\b/dcache_clean_inval_poc/g;"\
"s/\b__inval_dcache_area\b/dcache_inval_poc/g;"\
"s/__clean_dcache_area_poc\b/dcache_clean_poc/g;"\
"s/\b__clean_dcache_area_pop\b/dcache_clean_pop/g;"\
"s/\b__clean_dcache_area_pou\b/dcache_clean_pou/g;"\
"s/\b__flush_cache_user_range\b/caches_clean_inval_user_pou/g;"\
"s/\b__flush_icache_all\b/icache_inval_all_pou/g;"
Note that __clean_dcache_area_poc is deliberately missing a word
boundary check at the beginning in order to match the efistub
symbols in image-vars.h.
Also note that, despite its name, __flush_icache_range operates
on both instruction and data caches. The name change here
reflects that.
No functional change intended.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210524083001.2586635-19-tabba@google.com
Signed-off-by: Will Deacon <will@kernel.org>