linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-18 01:51:53 +00:00

Author	SHA1	Message	Date
Yang, Sheng	002c7f7c32	KVM: VMX: Add cpu consistency check All the physical CPUs on the board should support the same VMX feature set. Add check_processor_compatibility to kvm_arch_ops for the consistency check. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:22 +02:00
Rusty Russell	39214915f5	KVM: kvm_vm_ioctl_get_dirty_log restore "nothing dirty" optimization kvm_vm_ioctl_get_dirty_log scans bitmap to see if it's all zero, but doesn't use that information. Avi says: Looks like it was used to guard kvm_mmu_slot_remove_write_access(); optimizing the case where the guest just leaves the screen alone (which it usually does, especially in benchmarks). I'd rather reinstate that optimization. See `90cb0529dd` where the damage was done. It's pretty simple: if the bitmap is all zero, we don't need to do anything to clean it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Rusty Russell	b114b0804d	KVM: Use alignment properties of vcpu to simplify FPU ops Now we use a kmem cache for allocating vcpus, we can get the 16-byte alignment required by fxsave & fxrstor instructions, and avoid manually aligning the buffer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Rusty Russell	c16f862d02	KVM: Use kmem cache for allocating vcpus Avi wants the allocations of vcpus centralized again. The easiest way is to add a "size" arg to kvm_init_arch, and expose the thus-prepared cache to the modules. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Laurent Vivier	e7d5d76cae	KVM: Remove kvm_{read,write}_guest() ... in favor of the more general emulator_{read,write}_*. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Laurent Vivier	cebff02b11	KVM: Change the emulator_{read,write,cmpxchg}_* functions to take a vcpu ... instead of a x86_emulate_ctxt, so that other callers can use it easily. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Rusty Russell	3077c4513c	KVM: Remove three magic numbers There are several places where hardcoded numbers are used in place of the easily-available constant, which is poor form. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:21 +02:00
Rusty Russell	9bd01506ee	KVM: fx_init() needs preemption disabled while it plays with the FPU state Now that kvm generally runs with preemption enabled, we need to protect the fpu initialization sequence. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Shaohua Li	11ec280471	KVM: Convert vm lock to a mutex This allows the kvm mmu to perform sleepy operations, such as memory allocation. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Avi Kivity	15ad71460d	KVM: Use the scheduler preemption notifiers to make kvm preemptible Current kvm disables preemption while the new virtualization registers are in use. This of course is not very good for latency sensitive workloads (one use of virtualization is to offload user interface and other latency insensitive stuff to a container, so that it is easier to analyze the remaining workload). This patch re-enables preemption for kvm; preemption is now only disabled when switching the registers in and out, and during the switch to guest mode and back. Contains fixes from Shaohua Li <shaohua.li@intel.com>. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Jeff Dike	519ef35341	KVM: add hypercall nr to kvm_run Add the hypercall number to kvm_run and initialize it. This changes the ABI, but as this particular ABI was unusable before this no users are affected. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Rusty Russell	fb3f0f51d9	KVM: Dynamically allocate vcpus This patch converts the vcpus array in "struct kvm" to a pointer array, and changes the "vcpu_create" and "vcpu_setup" hooks into one "vcpu_create" call which does the allocation and initialization of the vcpu (calling back into the kvm_vcpu_init core helper). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Gregory Haskins	a2fa3e9f52	KVM: Remove arch specific components from the general code struct kvm_vcpu has vmx-specific members; remove them to a private structure. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Rusty Russell	c820c2aa27	KVM: load_pdptrs() cleanups load_pdptrs can be handed an invalid cr3, and it should not oops. This can happen because we injected #gp in set_cr3() after we set vcpu->cr3 to the invalid value, or from kvm_vcpu_ioctl_set_sregs(), or memory configuration changes after the guest did set_cr3(). We should also copy the pdpte array once, before checking and assigning, otherwise an SMP guest can potentially alter the values between the check and the set. Finally one nitpick: ret = 1 should be done as late as possible: this allows GCC to check for unset "ret" should the function change in future. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:20 +02:00
Shaohua Li	fe55188194	KVM: Move gfn_to_page out of kmap/unmap pairs gfn_to_page might sleep with swap support. Move it out of the kmap calls. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:19 +02:00
Rusty Russell	310bc76c2b	KVM: Return if the pdptrs are invalid when the guest turns on PAE. Don't fall through and turn on PAE in this case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:19 +02:00
Rusty Russell	7075bc816c	KVM: Use standard CR8 flags, and fix TPR definition Intel manual (and KVM definition) say the TPR is 4 bits wide. Also fix CR8_RESEVED_BITS typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:19 +02:00
Jeff Dike	8fc0d085f5	KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:19 +02:00
Rusty Russell	9eb829ced8	KVM: Trivial: Use standard BITMAP macros, open-code userspace-exposed header Creating one's own BITMAP macro seems suboptimal: if we use manual arithmetic in the one place exposed to userspace, we can use standard macros elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:18 +02:00
Rusty Russell	66aee91aaa	KVM: Use standard CR4 flags, tighten checking On this machine (Intel), writing to the CR4 bits 0x00000800 and 0x00001000 causes a GPF. The Intel manual is a little unclear, but AFAICT they're reserved, too. Also fix spelling of CR4_RESEVED_BITS. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:18 +02:00
Rusty Russell	f802a307cb	KVM: Use standard CR3 flags, tighten checking The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR3_RESEVED_BITS, fix spelling and tighten it for the non-PAE case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:18 +02:00
Rusty Russell	707d92fa72	KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.h The kernel now has asm/cpu-features.h: use those macros instead of inventing our own. Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:18 +02:00
Rusty Russell	9a2b85c620	KVM: Trivial: Avoid hardware_disable predeclaration Don't pre-declare hardware_disable: shuffle the reboot hook down. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:18 +02:00
Eddie Dong	65619eb5a8	KVM: In-kernel string pio write support Add string pio write support to support some version of Windows. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:17 +02:00
Qing He	dad3795d2b	KVM: SMP: Add vcpu_id field in struct vcpu This patch adds a `vcpu_id' field in `struct vcpu', so we can differentiate BSP and APs without pointer comparison or arithmetic. Signed-off-by: Qing He <qing.he@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:17 +02:00
Nguyen Anh Quynh	cd0d913797	KVM: Fix nopage() in kvm_main.c nopage() in kvm_main.c should only store the type of mmap() fault if the pointers are not NULL. This patch fixes the problem. Signed-off-by: Nguyen Anh Quynh <aquynh@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-10-13 10:18:17 +02:00
Avi Kivity	6ec8a856e4	KVM: Avoid calling smp_call_function_single() with interrupts disabled When taking a cpu down, we need to hardware_disable() it. Unfortunately, the CPU_DYING notifier is called with interrupts disabled, which means we can't use smp_call_function_single(). Fortunately, the CPU_DYING notifier is always called on the dying cpu, so we don't need to use the function at all and can simply call hardware_disable() directly. Tested-by: Paolo Ornati <ornati@fastwebnet.it> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-19 10:13:49 -07:00
Avi Kivity	4c981b43d7	KVM: Fix removal of nx capability from guest cpuid Testing the wrong bit caused kvm not to disable nx on the guest when it is disabled on the host (an mmu optimization relies on the nx bits being the same in the guest and host). This allows Windows to boot when nx is disabled on the host (e.g. when host pae is disabled). Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-25 14:31:13 +03:00
Avi Kivity	7cfa4b0a43	Revert "KVM: Avoid useless memory write when possible" This reverts commit `a3c870bdce`. While it does save useless updates, it (probably) defeats the fork detector, causing a massive performance loss. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-25 14:30:56 +03:00
Rusty Russell	5e58cfe41c	KVM: Fix unlikely kvm_create vs decache_vcpus_on_cpu race We add the kvm to the vm_list before initializing the vcpu mutexes, which can be mutex_trylock()'ed by decache_vcpus_on_cpu(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-25 14:29:34 +03:00
Avi Kivity	b0fcd903e6	KVM: Correctly handle writes crossing a page boundary Writes that are contiguous in virtual memory may not be contiguous in physical memory; so split writes that straddle a page boundary. Thanks to Aurelien for reporting the bug, patient testing, and a fix to this very patch. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-25 14:29:17 +03:00
Avi Kivity	35f3f28613	KVM: x86 emulator: implement rdmsr and wrmsr Allow real-mode emulation of rdmsr and wrmsr. This allows smp Windows to boot, presumably for its sipi trampoline. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-20 20:16:29 +03:00
Avi Kivity	90cb0529dd	KVM: Fix memory slot management functions for guest smp The memory slot management functions were oriented against vcpu 0, where they should be kvm-wide. This causes hangs starting X on guest smp. Fix by making the functions (and resultant tail in the mmu) non-vcpu-specific. Unfortunately this reduces the efficiency of the mmu object cache a bit. We may have to revisit this later. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-20 20:16:29 +03:00
Avi Kivity	cec9ad279b	KVM: Use CPU_DYING for disabling virtualization Only at the CPU_DYING stage can we be sure that no user process will be scheduled onto the cpu and oops when trying to use virtualization extensions. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:51 +03:00
Avi Kivity	4267c41a45	KVM: Tune hotplug/suspend IPIs The hotplug IPIs can be called from the cpu on which we are currently running, so use on_cpu(). Similarly, drop on_each_cpu() for the suspend/resume callbacks, as we're in atomic context here and only one cpu is up anyway. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:51 +03:00
Avi Kivity	1b6c016818	KVM: Keep track of which cpus have virtualization enabled By keeping track of which cpus have virtualization enabled, we prevent double-enable or double-disable during hotplug, which is a very fatal oops. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:51 +03:00
Avi Kivity	e495606dd0	KVM: Clean up #includes Remove unnecessary ones, and rearrange the remaining in the standard order. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:49 +03:00
Avi Kivity	d6d2816849	KVM: Remove kvmfs in favor of the anonymous inodes source kvm uses a pseudo filesystem, kvmfs, to generate inodes, a job that the new anonymous inodes source does much better. Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:49 +03:00
Luca Tettamanti	a3c870bdce	KVM: Avoid useless memory write when possible When writing to normal memory and the memory area is unchanged the write can be safely skipped, avoiding the costly kvm_mmu_pte_write. Signed-Off-By: Luca Tettamanti <kronos.it@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:48 +03:00
Eddie Dong	74906345ff	KVM: Add support for in-kernel pio handlers Useful for the PIC and PIT. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:48 +03:00
Gregory Haskins	2eeb2e94eb	KVM: Adds support for in-kernel mmio handlers Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:47 +03:00
Avi Kivity	d9e368d612	KVM: Flush remote tlbs when reducing shadow pte permissions When a vcpu causes a shadow tlb entry to have reduced permissions, it must also clear the tlb on remote vcpus. We do that by: - setting a bit on the vcpu that requests a tlb flush before the next entry - if the vcpu is currently executing, we send an ipi to make sure it exits before we continue Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:46 +03:00
Avi Kivity	39c3b86e5c	KVM: Keep an upper bound of initialized vcpus That way, we don't need to loop for KVM_MAX_VCPUS for a single vcpu vm. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:46 +03:00
Avi Kivity	d3bef15f84	KVM: Move duplicate halt handling code into kvm_main.c Will soon have a third user. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:46 +03:00
Avi Kivity	120e9a453b	KVM: Fix adding an smp virtual machine to the vm list If we add the vm once per vcpu, we corrupt the list if the guest has multiple vcpus. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:45 +03:00
Avi Kivity	7b53aa5650	KVM: Fix vcpu freeing for guest smp A vcpu can pin up to four mmu shadow pages, which means the freeing loop will never terminate. Fix by first unpinning shadow pages on all vcpus, then freeing shadow pages. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:45 +03:00
Nguyen Anh Quynh	313899477f	KVM: Remove unnecessary initialization and checks in mark_page_dirty() Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:45 +03:00
Avi Kivity	d3d25b048b	KVM: MMU: Use slab caches for shadow pages and their headers Use slab caches instead of a simple custom list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:43 +03:00
Eddie Dong	2cc51560ae	KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit MSR_EFER.LME/LMA bits are automatically saved/restored by VMX hardware, KVM only needs to save NX/SCE bits at time of heavy weight VM Exit. But clearing NX bits in host environment may cause system hang if the host page table is using EXB bits, thus we leave NX bits as they are. If Host NX=1 and guest NX=0, we can do guest page table EXB bits check before inserting a shadow pte (though no guest is expecting to see this kind of gp fault). If host NX=0, we present no Execute-Disable feature to the guest, thus no host NX=0, guest NX=1 combination. This patch reduces raw vmexit time by ~27%. Me: fix compile warnings on i386. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:42 +03:00
Matthew Gregan	2dc7094b56	KVM: Implement IA32_EBL_CR_POWERON msr Attempting to boot the default 'bsd' kernel of OpenBSD 4.1 i386 in a guest fails early in the kernel init inside p3_get_bus_clock while trying to read the IA32_EBL_CR_POWERON MSR. KVM logs an 'unhandled MSR' message and the guest kernel faults. This patch is sufficient to allow OpenBSD to boot, after which it seems to run fine. I'm not sure if this is the correct solution for dealing with this particular MSR, but it works for me. Signed-off-by: Matthew Gregan <kinetik@flim.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2007-07-16 12:05:40 +03:00
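
Note on b0fcd903e6 (writes crossing a page boundary): the commit message says that bytes contiguous in virtual memory may not be contiguous in physical memory, so a write that straddles a page boundary has to be split. The following is a minimal Python sketch of that splitting idea only, not the kernel's C code. The names `split_write` and `PAGE_SIZE`, and the 4 KiB page size, are illustrative assumptions and do not come from the patch.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def split_write(addr, data, page_size=PAGE_SIZE):
    """Yield (address, chunk) pairs such that no chunk crosses a page boundary.

    Hypothetical helper for illustration; each yielded chunk could then be
    translated and written to its own physical page independently.
    """
    offset = 0
    while offset < len(data):
        cur = addr + offset
        # Bytes left before the end of the page that contains `cur`.
        room = page_size - (cur % page_size)
        chunk = data[offset:offset + room]
        yield cur, chunk
        offset += len(chunk)


# A 6-byte write starting 2 bytes before a page boundary.
chunks = list(split_write(0x1FFE, b"abcdef"))
```

Here the write is broken into one piece that ends exactly at the boundary and one that starts on the next page; a write that fits inside a single page comes back as a single chunk.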

138 Commits