linux/virt/kvm
Paolo Bonzini b85524314a KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is passed to the guest
Initializing the contents of the folio on fallocate() is unnecessarily
restrictive.  It means that the page is registered with the firmware and
then it cannot be touched anymore.  In particular, this loses the
possibility of using fallocate() to pre-allocate the page for SEV-SNP
guests, because kvm_arch_gmem_prepare() then fails.

It's only when the guest actually accesses the page (and therefore
kvm_gmem_get_pfn() is called) that the page must be cleared from any
stale host data and registered with the firmware.  The up-to-date flag
is clear if this has to be done (i.e. it is the first access and
kvm_gmem_populate() has not been called).

All in all, there are enough differences between kvm_gmem_get_pfn() and
kvm_gmem_populate() that it's better to separate the two flows completely.
Extract the bulk of kvm_gmem_get_folio(), which takes a folio and ends up
setting its up-to-date flag, into a new function kvm_gmem_prepare_folio();
these steps are now done only by the non-__-prefixed kvm_gmem_get_pfn().
As a bonus, __kvm_gmem_get_pfn() loses its ugly "bool prepare" argument.
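
The split described above can be sketched as a small userspace model. This
is not the kernel code: every model_* name, the fixed-size array standing in
for the filemap, and the 0xAA "stale data" pattern are assumptions made for
illustration. It only shows the intended ordering: fallocate() allocates,
the first guest access clears and prepares, and populate marks the page
up to date without going through the prepare path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_NR_PAGES  4

/* Hypothetical stand-in for a folio in the guest_memfd filemap. */
struct model_folio {
    bool present;        /* allocated and in the "filemap" */
    bool uptodate;       /* models the folio up-to-date flag */
    int  prepare_calls;  /* how often the arch hook ran */
    unsigned char data[MODEL_PAGE_SIZE];
};

static struct model_folio filemap[MODEL_NR_PAGES];

/* Stand-in for kvm_arch_gmem_prepare(): register the page with firmware. */
static int model_arch_prepare(struct model_folio *f)
{
    f->prepare_calls++;
    return 0;
}

/* fallocate(): allocate only; contents stay stale and unprepared. */
int model_fallocate(size_t index)
{
    if (index >= MODEL_NR_PAGES)
        return -1;
    if (!filemap[index].present) {
        memset(filemap[index].data, 0xAA, MODEL_PAGE_SIZE); /* stale host data */
        filemap[index].present = true;
    }
    return 0;
}

/* Models kvm_gmem_prepare_folio(): clear, prepare, then set the flag. */
static int model_prepare_folio(struct model_folio *f)
{
    int r;

    if (f->uptodate)
        return 0;                        /* already prepared or populated */
    memset(f->data, 0, MODEL_PAGE_SIZE); /* drop stale host data */
    r = model_arch_prepare(f);
    if (r)
        return r;
    f->uptodate = true;
    return 0;
}

/* Models kvm_gmem_get_pfn(): the only path that prepares a folio. */
int model_get_pfn(size_t index)
{
    if (model_fallocate(index))
        return -1;
    return model_prepare_folio(&filemap[index]);
}

/* Models kvm_gmem_populate(): install initial contents, skip prepare. */
int model_populate(size_t index, const unsigned char *src)
{
    if (model_fallocate(index))
        return -1;
    if (filemap[index].uptodate)
        return -1;                       /* already handed to the guest */
    memcpy(filemap[index].data, src, MODEL_PAGE_SIZE);
    filemap[index].uptodate = true;
    return 0;
}
```

In this model, calling model_fallocate() and then model_get_pfn() on the
same index runs the arch hook exactly once, and a second model_get_pfn()
is a no-op because the up-to-date flag is already set.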

One difference is that fallocate(PUNCH_HOLE) can now race with a
page fault.  Potentially this causes a page to be prepared and inserted
into the filemap even after fallocate(PUNCH_HOLE).  This is harmless, as it can be
fixed by another hole punching operation, and can be avoided by clearing
the private-page attribute prior to invoking fallocate(PUNCH_HOLE).
This way, the page fault will cause an exit to user space.
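
The ordering suggested above might look like this from user space. It is a
sketch under assumptions: discard_private_range() is a hypothetical helper,
the caller is assumed to know which guest_memfd offset backs the given GPA,
and the installed <linux/kvm.h> is assumed to be new enough to define
KVM_SET_MEMORY_ATTRIBUTES (otherwise the helper fails with ENOSYS).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <linux/falloc.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: make [gpa, gpa + size) shared first, then punch the
 * matching hole in the guest_memfd.  Returns 0 on success, -1 with errno
 * set by the failing call otherwise.
 */
int discard_private_range(int vm_fd, int gmem_fd, uint64_t gpa,
                          uint64_t gmem_offset, uint64_t size)
{
#ifdef KVM_SET_MEMORY_ATTRIBUTES
    struct kvm_memory_attributes attrs = {
        .address    = gpa,
        .size       = size,
        .attributes = 0,  /* KVM_MEMORY_ATTRIBUTE_PRIVATE cleared */
        .flags      = 0,
    };

    /* Step 1: a guest fault on this range now exits to user space. */
    if (ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs) < 0)
        return -1;
#else
    (void)vm_fd;
    (void)gpa;
    errno = ENOSYS;
    return -1;
#endif

    /* Step 2: free the backing pages; no fault can re-prepare them. */
    return fallocate(gmem_fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     (off_t)gmem_offset, (off_t)size);
}
```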

The previous semantics, where fallocate() could be used to prepare
the pages in advance of running the guest, can be accessed with
KVM_PRE_FAULT_MEMORY.
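
A minimal sketch of such a pre-fault request, assuming a vCPU file
descriptor and headers that define KVM_PRE_FAULT_MEMORY; prefault_range()
is a hypothetical wrapper, and retry handling for interrupted or partial
requests is left out.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical wrapper: ask KVM to fault in [gpa, gpa + size) ahead of
 * running the guest.  Returns 0 on success, -1 with errno set otherwise.
 */
int prefault_range(int vcpu_fd, uint64_t gpa, uint64_t size)
{
#ifdef KVM_PRE_FAULT_MEMORY
    struct kvm_pre_fault_memory range;

    memset(&range, 0, sizeof(range));  /* flags and padding must be zero */
    range.gpa  = gpa;
    range.size = size;

    return ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range) < 0 ? -1 : 0;
#else
    /* Headers predate the ioctl; report it as unsupported. */
    (void)vcpu_fd;
    (void)gpa;
    (void)size;
    errno = ENOSYS;
    return -1;
#endif
}
```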

For now, accessing a page in one VM will attempt to call
kvm_arch_gmem_prepare() in all VMs that have bound the guest_memfd.
Cleaning this up is left to a separate patch.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-26 14:46:14 -04:00
..
async_pf.c Revert "KVM: async_pf: avoid recursive flushing of work items" 2024-06-03 08:55:55 -07:00
async_pf.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 2019-06-19 17:09:56 +02:00
binary_stats.c KVM: stats: remove dead stores 2021-08-13 03:35:15 -04:00
coalesced_mmio.c KVM: destruct kvm_io_device while unregistering it from kvm_io_bus 2023-06-13 14:18:09 -07:00
coalesced_mmio.h
dirty_ring.c KVM: Discard zero mask with function kvm_dirty_ring_reset 2024-06-20 17:20:11 -04:00
eventfd.c Generic: 2024-01-17 13:03:37 -08:00
guest_memfd.c KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is passed to the guest 2024-07-26 14:46:14 -04:00
irqchip.c KVM: Setup empty IRQ routing when creating a VM 2024-06-11 14:18:34 -07:00
Kconfig KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_* 2024-07-26 14:46:14 -04:00
kvm_main.c KVM generic changes for 6.11 2024-07-16 09:51:36 -04:00
kvm_mm.h KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start() 2024-04-11 12:58:53 -07:00
Makefile.kvm KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory 2023-11-14 08:01:03 -05:00
pfncache.c KVM: Validate hva in kvm_gpc_activate_hva() to fix __kvm_gpc_refresh() WARN 2024-06-28 08:31:46 -07:00
vfio.c KVM: Treat the device list as an rculist 2024-04-25 13:19:55 +01:00
vfio.h