Pull char / misc / other smaller driver subsystem updates from Greg KH:
"Here is the large set of char, misc, and other driver subsystem
updates for 5.19-rc1. The merge request for this has been delayed as I
wanted to get lots of linux-next testing due to some late arrivals of
changes for the habannalabs driver.
Highlights of this merge are:
- habanalabs driver updates for new hardware types and fixes and
other updates
- IIO driver tree merge which includes loads of new IIO drivers and
cleanups and additions
- PHY driver tree merge with new drivers and small updates to
existing ones
- interconnect driver tree merge with fixes and updates
- soundwire driver tree merge with some small fixes
- coresight driver tree merge with small fixes and updates
- mhi bus driver tree merge with lots of updates and new device
support
- firmware driver updates
- fpga driver updates
- lkdtm driver updates (with a merge conflict, more on that below)
- extcon driver tree merge with small updates
- lots of other tiny driver updates and fixes and cleanups, full
details in the shortlog.
All of these have been in linux-next for almost 2 weeks with no
reported problems"
* tag 'char-misc-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (387 commits)
habanalabs: use separate structure info for each error collect data
habanalabs: fix missing handle shift during mmap
habanalabs: remove hdev from hl_ctx_get args
habanalabs: do MMU prefetch as deferred work
habanalabs: order memory manager messages
habanalabs: return -EFAULT on copy_to_user error
habanalabs: use NULL for eventfd
habanalabs: update firmware header
habanalabs: add support for notification via eventfd
habanalabs: add topic to memory manager buffer
habanalabs: handle race in driver fini
habanalabs: add device memory scrub ability through debugfs
habanalabs: use unified memory manager for CB flow
habanalabs: unified memory manager new code for CB flow
habanalabs/gaudi: set arbitration timeout to a high value
habanalabs: add put by handle method to memory manager
habanalabs: hide memory manager page shift
habanalabs: Add separate poll interval value for protocol
habanalabs: use get_task_pid() to take PID
habanalabs: add prefetch flag to the MAP operation
...
Pull misc hardening updates from Gustavo Silva:
"Replace a few open-coded instances with size_t saturating arithmetic
helpers"
* tag 'size_t-saturating-helpers-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
virt: acrn: Prefer array_size and struct_size over open coded arithmetic
afs: Prefer struct_size over open coded arithmetic
Pull AMD SEV-SNP support from Borislav Petkov:
"The third AMD confidential computing feature called Secure Nested
Paging.
Add to confidential guests the necessary memory integrity protection
against malicious hypervisor-based attacks like data replay, memory
remapping and others, thus achieving a stronger isolation from the
hypervisor.
At the core of the functionality is a new structure called a reverse
map table (RMP) with which the guest has a say in which pages get
assigned to it and gets notified when a page which it owns, gets
accessed/modified under the covers so that the guest can take an
appropriate action.
In addition, add support for the whole machinery needed to launch a
SNP guest, details of which is properly explained in each patch.
And last but not least, the series refactors and improves parts of the
previous SEV support so that the new code is accomodated properly and
not just bolted on"
* tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
x86/entry: Fixup objtool/ibt validation
x86/sev: Mark the code returning to user space as syscall gap
x86/sev: Annotate stack change in the #VC handler
x86/sev: Remove duplicated assignment to variable info
x86/sev: Fix address space sparse warning
x86/sev: Get the AP jump table address from secrets page
x86/sev: Add missing __init annotations to SEV init routines
virt: sevguest: Rename the sevguest dir and files to sev-guest
virt: sevguest: Change driver name to reflect generic SEV support
x86/boot: Put globals that are accessed early into the .data section
x86/boot: Add an efi.h header for the decompressor
virt: sevguest: Fix bool function returning negative value
virt: sevguest: Fix return value check in alloc_shared_pages()
x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate()
virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement
virt: sevguest: Add support to get extended report
virt: sevguest: Add support to derive key
virt: Add SEV-SNP guest driver
x86/sev: Register SEV-SNP guest request platform device
x86/sev: Provide support for SNP guest request NAEs
...
The GHCB specification section 2.7 states that when SEV-SNP is enabled,
a guest should not rely on the hypervisor to provide the address of the
AP jump table. Instead, if a guest BIOS wants to provide an AP jump
table, it should record the address in the SNP secrets page so the guest
operating system can obtain it directly from there.
Fix this on the guest kernel side by having SNP guests use the AP jump
table address published in the secrets page rather than issuing a GHCB
request to get it.
[ mroth:
- Improve error handling when ioremap()/memremap() return NULL
- Don't mix function calls with declarations
- Add missing __init
- Tweak commit message ]
Fixes: 0afb6b660a ("x86/sev: Use SEV-SNP AP creation to start secondary CPUs")
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220422135624.114172-3-michael.roth@amd.com
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
So, use the array_size() helper to do the arithmetic instead of the
argument "count * size" in the vzalloc() function.
Also, take the opportunity to add a flexible array member of struct
vm_memory_region_op to the vm_memory_region_batch structure. And then,
change the code accordingly and use the struct_size() helper to do the
arithmetic instead of the argument "size + size * count" in the kzalloc
function.
This code was detected with the help of Coccinelle and audited and fixed
manually.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Acked-by: Fei Li <fei1.li@intel.com>
Signed-off-by: Len Baker <len.baker@gmx.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
During patch review, it was decided the SNP guest driver name should not
be SEV-SNP specific, but should be generic for use with anything SEV.
However, this feedback was missed and the driver name, and many of the
driver functions and structures, are SEV-SNP name specific. Rename the
driver to "sev-guest" (to match the misc device that is created) and
update some of the function and structure names, too.
While in the file, adjust the one pr_err() message to be a dev_err()
message so that the message, if issued, uses the driver name.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/307710bb5515c9088a19fd0b930268c7300479b2.1650464054.git.thomas.lendacky@amd.com
The new efi_secret module exposes the confidential computing (coco)
EFI secret area via securityfs interface.
When the module is loaded (and securityfs is mounted, typically under
/sys/kernel/security), a "secrets/coco" directory is created in
securityfs. In it, a file is created for each secret entry. The name
of each such file is the GUID of the secret entry, and its content is
the secret data.
This allows applications running in a confidential computing setting to
read secrets provided by the guest owner via a secure secret injection
mechanism (such as AMD SEV's LAUNCH_SECRET command).
Removing (unlinking) files in the "secrets/coco" directory will zero out
the secret in memory, and remove the filesystem entry. If the module is
removed and loaded again, that secret will not appear in the filesystem.
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://lore.kernel.org/r/20220412212127.154182-3-dovmurik@linux.ibm.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Version 2 of GHCB specification defines Non-Automatic-Exit (NAE) to get
extended guest report which is similar to the SNP_GET_REPORT ioctl. The
main difference is related to the additional data that will be returned.
That additional data returned is a certificate blob that can be used by
the SNP guest user. The certificate blob layout is defined in the GHCB
specification. The driver simply treats the blob as a opaque data and
copies it to userspace.
[ bp: Massage commit message, cast 1st arg of access_ok() ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-46-brijesh.singh@amd.com
The SNP_GET_DERIVED_KEY ioctl interface can be used by the SNP guest to
ask the firmware to provide a key derived from a root key. The derived
key may be used by the guest for any purposes it chooses, such as a
sealing key or communicating with the external entities.
See SEV-SNP firmware spec for more information.
[ bp: No need to memset "req" - it will get overwritten. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://lore.kernel.org/r/20220307213356.2797205-45-brijesh.singh@amd.com
The SEV-SNP specification provides the guest a mechanism to communicate
with the PSP without risk from a malicious hypervisor who wishes to
read, alter, drop or replay the messages sent. The driver uses
snp_issue_guest_request() to issue GHCB SNP_GUEST_REQUEST or
SNP_EXT_GUEST_REQUEST NAE events to submit the request to PSP.
The PSP requires that all communication should be encrypted using key
specified through a struct snp_guest_platform_data descriptor.
Userspace can use SNP_GET_REPORT ioctl() to query the guest attestation
report.
See SEV-SNP spec section Guest Messages for more details.
[ bp: Remove the "what" from the commit message, massage. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-44-brijesh.singh@amd.com
Pull random number generator fixes from Jason Donenfeld:
- If a hardware random number generator passes a sufficiently large
chunk of entropy to random.c during early boot, we now skip the
"fast_init" business and let it initialize the RNG.
This makes CONFIG_RANDOM_TRUST_BOOTLOADER=y actually useful.
- We already have the command line `random.trust_cpu=0/1` option for
RDRAND, which let distros enable CONFIG_RANDOM_TRUST_CPU=y while
placating concerns of more paranoid users.
Now we add `random.trust_bootloader=0/1` so that distros can
similarly enable CONFIG_RANDOM_TRUST_BOOTLOADER=y.
- Re-add a comment that got removed by accident in the recent revert.
- Add the spec-compliant ACPI CID for vmgenid, which Microsoft added to
the vmgenid spec at Ard's request during earlier review.
- Restore build-time randomness via the latent entropy plugin, which
was lost when we transitioned to using a hash function.
* tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: mix build-time latent entropy into pool at init
virt: vmgenid: recognize new CID added by Hyper-V
random: re-add removed comment about get_random_{u32,u64} reseeding
random: treat bootloader trust toggle the same way as cpu trust toggle
random: skip fast_init if hwrng provides large chunk of entropy
Pull char/misc and other driver updates from Greg KH:
"Here is the big set of char/misc and other small driver subsystem
updates for 5.18-rc1.
Included in here are merges from driver subsystems which contain:
- iio driver updates and new drivers
- fsi driver updates
- fpga driver updates
- habanalabs driver updates and support for new hardware
- soundwire driver updates and new drivers
- phy driver updates and new drivers
- coresight driver updates
- icc driver updates
Individual changes include:
- mei driver updates
- interconnect driver updates
- new PECI driver subsystem added
- vmci driver updates
- lots of tiny misc/char driver updates
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits)
firmware: google: Properly state IOMEM dependency
kgdbts: fix return value of __setup handler
firmware: sysfb: fix platform-device leak in error path
firmware: stratix10-svc: add missing callback parameter on RSU
arm64: dts: qcom: add non-secure domain property to fastrpc nodes
misc: fastrpc: Add dma handle implementation
misc: fastrpc: Add fdlist implementation
misc: fastrpc: Add helper function to get list and page
misc: fastrpc: Add support to secure memory map
dt-bindings: misc: add fastrpc domain vmid property
misc: fastrpc: check before loading process to the DSP
misc: fastrpc: add secure domain support
dt-bindings: misc: add property to support non-secure DSP
misc: fastrpc: Add support to get DSP capabilities
misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP
misc: fastrpc: separate fastrpc device from channel context
dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells
dt-bindings: nvmem: make "reg" property optional
nvmem: brcm_nvram: parse NVRAM content into NVMEM cells
nvmem: dt-bindings: Fix the error of dt-bindings check
...
In the Windows spec for VM Generation ID, the originally specified CID
is longer than allowed by the ACPI spec. Hyper-V has added "VMGENCTR" as
a second valid CID that is conformant, while retaining the original CID
for compatibility with Windows guests.
Add this new CID to the list recognized by the driver.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
acrn_vm_ram_map can't pin the user pages with VM_PFNMAP flag
by calling get_user_pages_fast(), the PA(physical pages)
may be mapped by kernel driver and set PFNMAP flag.
This patch fixes logic to setup EPT mapping for PFN mapped RAM region
by checking the memory attribute before adding EPT mapping for them.
Fixes: 88f537d5e8 ("virt: acrn: Introduce EPT mapping management")
Signed-off-by: Yonghua Huang <yonghua.huang@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Link: https://lore.kernel.org/r/20220228022212.419406-1-yonghua.huang@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
VM Generation ID is a feature from Microsoft, described at
<https://go.microsoft.com/fwlink/?LinkId=260709>, and supported by
Hyper-V and QEMU. Its usage is described in Microsoft's RNG whitepaper,
<https://aka.ms/win10rng>, as:
If the OS is running in a VM, there is a problem that most
hypervisors can snapshot the state of the machine and later rewind
the VM state to the saved state. This results in the machine running
a second time with the exact same RNG state, which leads to serious
security problems. To reduce the window of vulnerability, Windows
10 on a Hyper-V VM will detect when the VM state is reset, retrieve
a unique (not random) value from the hypervisor, and reseed the root
RNG with that unique value. This does not eliminate the
vulnerability, but it greatly reduces the time during which the RNG
system will produce the same outputs as it did during a previous
instantiation of the same VM state.
Linux has the same issue, and given that vmgenid is supported already by
multiple hypervisors, we can implement more or less the same solution.
So this commit wires up the vmgenid ACPI notification to the RNG's newly
added add_vmfork_randomness() function.
It can be used from qemu via the `-device vmgenid,guid=auto` parameter.
After setting that, use `savevm` in the monitor to save the VM state,
then quit QEMU, start it again, and use `loadvm`. That will trigger this
driver's notify function, which hands the new UUID to the RNG. This is
described in <https://git.qemu.org/?p=qemu.git;a=blob;f=docs/specs/vmgenid.txt>.
And there are hooks for this in libvirt as well, described in
<https://libvirt.org/formatdomain.html#general-metadata>.
Note, however, that the treatment of this as a UUID is considered to be
an accidental QEMU nuance, per
<https://github.com/libguestfs/virt-v2v/blob/master/docs/vm-generation-id-across-hypervisors.txt>,
so this driver simply treats these bytes as an opaque 128-bit binary
blob, as per the spec. This doesn't really make a difference anyway,
considering that's how it ends up when handed to the RNG in the end.
Cc: Alexander Graf <graf@amazon.com>
Cc: Adrian Catangiu <adrian@parity.io>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Tested-by: Souradeep Chakrabarti <souradch.linux@gmail.com> # With Hyper-V's virtual hardware
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Pull bitmap updates from Yury Norov:
- introduce for_each_set_bitrange()
- use find_first_*_bit() instead of find_next_*_bit() where possible
- unify for_each_bit() macros
* tag 'bitmap-5.17-rc1' of git://github.com/norov/linux:
vsprintf: rework bitmap_list_string
lib: bitmap: add performance test for bitmap_print_to_pagebuf
bitmap: unify find_bit operations
mm/percpu: micro-optimize pcpu_is_populated()
Replace for_each_*_bit_from() with for_each_*_bit() where appropriate
find: micro-optimize for_each_{set,clear}_bit()
include/linux: move for_each_bit() macros from bitops.h to find.h
cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
tools: sync tools/bitmap with mother linux
all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
cpumask: use find_first_and_bit()
lib: add find_first_and_bit()
arch: remove GENERIC_FIND_FIRST_BIT entirely
include: move find.h from asm_generic to linux
bitops: move find_bit_*_le functions from le.h to find.h
bitops: protect find_first_{,zero}_bit properly
find_first{,_zero}_bit is a more effective analogue of 'next' version if
start == 0. This patch replaces 'next' with 'first' where things look
trivial.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Add KUnit tests for the contiguous physical memory regions merging
functionality from the Nitro Enclaves misc device logic.
We can build the test binary with the following configuration:
CONFIG_KUNIT=y
CONFIG_NITRO_ENCLAVES=m
CONFIG_NITRO_ENCLAVES_MISC_DEV_TEST=y
and install the nitro_enclaves module to run the testcases.
We'll see the following message using dmesg if everything goes well:
[...] # Subtest: ne_misc_dev_test
[...] 1..1
[...] (NULL device *): Physical mem region address is not 2 MiB aligned
[...] (NULL device *): Physical mem region size is not multiple of 2 MiB
[...] (NULL device *): Physical mem region address is not 2 MiB aligned
[...] ok 1 - ne_misc_dev_test_merge_phys_contig_memory_regions
[...] ok 1 - ne_misc_dev_test
Reviewed-by: Andra Paraschiv <andraprs@amazon.com>
Signed-off-by: Longpeng <longpeng2@huawei.com>
Link: https://lore.kernel.org/r/20211107140918.2106-5-longpeng2@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There can be cases when there are more memory regions that need to be
set for an enclave than the maximum supported number of memory regions
per enclave. One example can be when the memory regions are backed by 2
MiB hugepages (the minimum supported hugepage size).
Let's merge the adjacent regions if they are physically contiguous. This
way the final number of memory regions is less than before merging and
could potentially avoid reaching maximum.
Reviewed-by: Andra Paraschiv <andraprs@amazon.com>
Signed-off-by: Longpeng <longpeng2@huawei.com>
Link: https://lore.kernel.org/r/20211107140918.2106-2-longpeng2@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ACRN hypervisor can emulate a virtual device within hypervisor for a
Guest VM. The emulated virtual device can work without the ACRN
userspace after creation. The hypervisor do the emulation of that device.
To support the virtual device creating/destroying, HSM provides the
following ioctls:
- ACRN_IOCTL_CREATE_VDEV
Pass data struct acrn_vdev from userspace to the hypervisor, and inform
the hypervisor to create a virtual device for a User VM.
- ACRN_IOCTL_DESTROY_VDEV
Pass data struct acrn_vdev from userspace to the hypervisor, and inform
the hypervisor to destroy a virtual device of a User VM.
These new APIs will be used by user space code vm_add_hv_vdev and
vm_remove_hv_vdev in
https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/core/vmmapi.c
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Link: https://lore.kernel.org/r/20210923084128.18902-3-fei1.li@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
MMIO device passthrough enables an OS in a virtual machine to directly
access a MMIO device in the host. It promises almost the native
performance, which is required in performance-critical scenarios of
ACRN.
HSM provides the following ioctls:
- Assign - ACRN_IOCTL_ASSIGN_MMIODEV
Pass data struct acrn_mmiodev from userspace to the hypervisor, and
inform the hypervisor to assign a MMIO device to a User VM.
- De-assign - ACRN_IOCTL_DEASSIGN_PCIDEV
Pass data struct acrn_mmiodev from userspace to the hypervisor, and
inform the hypervisor to de-assign a MMIO device from a User VM.
These new APIs will be used by user space code vm_assign_mmiodev and
vm_deassign_mmiodev in
https://github.com/projectacrn/acrn-hypervisor/blob/master/devicemodel/core/vmmapi.c
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Link: https://lore.kernel.org/r/20210923084128.18902-2-fei1.li@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ACRN hypervisor has scenarios which could run a real-time guest VM.
The real-time guest VM occupies dedicated CPU cores, be assigned with
dedicated PCI devices. It can run without the Service VM after boot up.
hcall_destroy_vm() returns failure when a real-time guest VM refuses.
The clearing of flag ACRN_VM_FLAG_DESTROYED causes some kernel resource
double-freed in a later acrn_vm_destroy().
Do hcall_destroy_vm() before resource release to drop this chance to
destroy the VM if hypercall fails.
Fixes: 9c5137aedd ("virt: acrn: Introduce VM management interfaces")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Signed-off-by: Fei Li <fei1.li@intel.com>
Link: https://lore.kernel.org/r/20210722062736.15050-1-fei1.li@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A failing usercopy of the slot uid will lead to a stale entry in the
file descriptor table as put_unused_fd() won't release it. This enables
userland to refer to a dangling 'file' object through that still valid
file descriptor, leading to all kinds of use-after-free exploitation
scenarios.
Exchanging put_unused_fd() for close_fd(), ksys_close() or alike won't
solve the underlying issue, as the file descriptor might have been
replaced in the meantime, e.g. via userland calling close() on it
(leading to a NULL pointer dereference in the error handling code as
'fget(enclave_fd)' will return a NULL pointer) or by dup2()'ing a
completely different file object to that very file descriptor, leading
to the same situation: a dangling file descriptor pointing to a freed
object -- just in this case to a file object of user's choosing.
Generally speaking, after the call to fd_install() the file descriptor
is live and userland is free to do whatever with it. We cannot rely on
it to still refer to our enclave object afterwards. In fact, by abusing
userfaultfd() userland can hit the condition without any racing and
abuse the error handling in the nitro code as it pleases.
To fix the above issues, defer the call to fd_install() until all
possible errors are handled. In this case it's just the usercopy, so do
it directly in ne_create_vm_ioctl() itself.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210429165941.27020-2-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an
eventfd signal when written to by a User VM. ACRN userspace can register
any arbitrary I/O address with a corresponding eventfd and then pass the
eventfd to a specific end-point of interest for handling.
Vhost is a kernel-level virtio server which uses eventfd for signalling.
To support vhost on ACRN, ioeventfd is introduced in HSM.
A new I/O client dedicated to ioeventfd is associated with a User VM
during VM creation. HSM provides ioctls to associate an I/O region with
a eventfd. The I/O client signals a eventfd once its corresponding I/O
region is matched with an I/O request.
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yu Wang <yu1.wang@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Link: https://lore.kernel.org/r/20210207031040.49576-16-shuo.a.liu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>