linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-17 09:31:50 +00:00

Author	SHA1	Message	Date
James Hogan	f74a8e224e	MIPS: KVM: Add count frequency KVM register Expose the KVM guest CP0_Count frequency to userland via a new KVM_REG_MIPS_COUNT_HZ register accessible with the KVM_{GET,SET}_ONE_REG ioctls. When the frequency is altered the bias is adjusted such that the guest CP0_Count doesn't jump discontinuously or lose any timer interrupts. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:02:54 +02:00
James Hogan	f82393426a	MIPS: KVM: Add master disable count interface Expose two new virtual registers to userland via the KVM_{GET,SET}_ONE_REG ioctls. KVM_REG_MIPS_COUNT_CTL is for timer configuration fields and just contains a master disable count bit. This can be used by userland to freeze the timer in order to read a consistent state from the timer count value and timer interrupt pending bit. This cannot be done with the CP0_Cause.DC bit because the timer interrupt pending bit (TI) is also in CP0_Cause so it would be impossible to stop the timer without also risking a race with an hrtimer interrupt and having to explicitly check whether an interrupt should have occurred. When the timer is re-enabled it resumes without losing time, i.e. the CP0_Count value jumps to what it would have been had the timer not been disabled, which would also be impossible to do from userland with CP0_Cause.DC. The timer interrupt also cannot be lost, i.e. if a timer interrupt would have occurred had the timer not been disabled it is queued when the timer is re-enabled. This works by storing the nanosecond monotonic time when the master disable is set, and using it for various operations instead of the current monotonic time (e.g. when recalculating the bias when the CP0_Count is set), until the master disable is cleared again, i.e. the timer state is read/written as it would have been at that time. This state is exposed to userland via the read-only KVM_REG_MIPS_COUNT_RESUME virtual register so that userland can determine the exact time the master disable took effect. This should allow userland to atomically save the state of the timer, and later restore it. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:02:45 +02:00
James Hogan	eda3d33c68	MIPS: KVM: Override guest kernel timer frequency directly The KVM_HOST_FREQ Kconfig symbol was used by KVM guest kernels to override the timer frequency calculation to a value based on the host frequency. Now that the KVM timer emulation is implemented independent of the host timer frequency and defaults to 100MHz, adjust the working of CONFIG_KVM_HOST_FREQ to match. The Kconfig symbol now specifies the guest timer frequency directly, and has been renamed accordingly to KVM_GUEST_TIMER_FREQ. It now defaults to 100MHz too and the help text is updated to make it clear that a zero value will allow the normal timer frequency calculation to take place (based on the emulated RTC). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:02:23 +02:00
James Hogan	e30492bbe9	MIPS: KVM: Rewrite count/compare timer emulation Previously the emulation of the CPU timer was just enough to get a Linux guest running but some shortcuts were taken: - The guest timer interrupt was hard coded to always happen every 10 ms rather than being timed to when CP0_Count would match CP0_Compare. - The guest's CP0_Count register was based on the host's CP0_Count register. This isn't very portable and fails on cores without a CP_Count register implemented such as Ingenic XBurst. It also meant that the guest's CP0_Cause.DC bit to disable the CP0_Count register took no effect. - The guest's CP0_Count register was emulated by just dividing the host's CP0_Count register by 4. This resulted in continuity problems when used as a clock source, since when the host CP0_Count overflows from 0x7fffffff to 0x80000000, the guest CP0_Count transitions discontinuously from 0x1fffffff to 0xe0000000. Therefore rewrite & fix emulation of the guest timer based on the monotonic kernel time (i.e. ktime_get()). Internally a 32-bit count_bias value is added to the frequency scaled nanosecond monotonic time to get the guest's CP0_Count. The frequency of the timer is initialised to 100MHz and cannot yet be changed, but a later patch will allow the frequency to be configured via the KVM_{GET,SET}_ONE_REG ioctl interface. The timer can now be stopped via the CP0_Cause.DC bit (by the guest or via the KVM_SET_ONE_REG ioctl interface), at which point the current CP0_Count is stored and can be read directly. When it is restarted the bias is recalculated such that the CP0_Count value is continuous. Due to the nature of hrtimer interrupts any read of the guest's CP0_Count register while it is running triggers a check for whether the hrtimer has expired, so that the guest/userland cannot observe the CP0_Count passing CP0_Compare without queuing a timer interrupt. This is also taken advantage of when stopping the timer to ensure that a pending timer interrupt is queued. This replaces the implementation of: - Guest read of CP0_Count - Guest write of CP0_Count - Guest write of CP0_Compare - Guest write of CP0_Cause - Guest read of HWR 2 (CC) with RDHWR - Host read of CP0_Count via KVM_GET_ONE_REG ioctl interface - Host write of CP0_Count via KVM_SET_ONE_REG ioctl interface - Host write of CP0_Compare via KVM_SET_ONE_REG ioctl interface - Host write of CP0_Cause via KVM_SET_ONE_REG ioctl interface Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:01:48 +02:00
James Hogan	3a0ba77408	MIPS: KVM: Migrate hrtimer to follow VCPU When a VCPU is scheduled in on a different CPU, refresh the hrtimer used for emulating count/compare so that it gets migrated to the same CPU. This should prevent a timer interrupt occurring on a different CPU to where the guest it relates to is running, which would cause the guest timer interrupt not to be delivered until after the next guest exit. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:01:33 +02:00
James Hogan	c73c99b0df	MIPS: KVM: Fix timer race modifying guest CP0_Cause The hrtimer callback for guest timer timeouts sets the guest's CP0_Cause.TI bit to indicate to the guest that a timer interrupt is pending, however there is no mutual exclusion implemented to prevent this occurring while the guest's CP0_Cause register is being read-modify-written elsewhere. When this occurs the setting of the CP0_Cause.TI bit is undone and the guest misses the timer interrupt and doesn't reprogram the CP0_Compare register for the next timeout. Currently another timer interrupt will be triggered again in another 10ms anyway due to the way timers are emulated, but after the MIPS timer emulation is fixed this would result in Linux guest time standing still and the guest scheduler not being invoked until the guest CP0_Count has looped around again, which at 100MHz takes just under 43 seconds. Currently this is the only asynchronous modification of guest registers, therefore it is fixed by adjusting the implementations of the kvm_set_c0_guest_cause(), kvm_clear_c0_guest_cause(), and kvm_change_c0_guest_cause() macros which are used for modifying the guest CP0_Cause register to use ll/sc to ensure atomic modification. This should work in both UP and SMP cases without requiring interrupts to be disabled. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:01:25 +02:00
James Hogan	044f0f03ec	MIPS: KVM: Deliver guest interrupts after local_irq_disable() When about to run the guest, deliver guest interrupts after disabling host interrupts. This should prevent an hrtimer interrupt from being handled after delivering guest interrupts, and therefore not delivering the guest timer interrupt until after the next guest exit. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:01:10 +02:00
James Hogan	16fd5c1de4	MIPS: KVM: Add CP0_HWREna KVM register access Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0 HWREna register. This is so that userland can save and restore its value so that RDHWR instructions don't have to be emulated by the guest. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:01:03 +02:00
James Hogan	7767b7d2f7	MIPS: KVM: Add CP0_UserLocal KVM register access Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0 UserLocal register. This is so that userland can save and restore its value. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:00:57 +02:00
James Hogan	f8be02daca	MIPS: KVM: Add CP0_Count/Compare KVM register access Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0 Count and Compare registers. These registers are special in that writing to them has side effects (adjusting the time until the next timer interrupt) and reading of Count depends on the time. Therefore add a couple of callbacks so that different implementations (trap & emulate or VZ) can implement them differently depending on what the hardware provides. The trap & emulate versions mostly duplicate what happens when a T&E guest reads or writes these registers, so it inherits the same limitations which can be fixed in later patches. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:00:44 +02:00
James Hogan	48a3c4e4cd	MIPS: KVM: Move KVM_{GET,SET}_ONE_REG definitions into kvm_host.h Move the KVM_{GET,SET}_ONE_REG MIPS register id definitions out of kvm_mips.c to kvm_host.h so that they can be shared between multiple source files. This allows register access to be indirected depending on the underlying implementation (trap & emulate or VZ). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:00:35 +02:00
James Hogan	fb6df0cdf0	MIPS: KVM: Add CP0_EPC KVM register access Contrary to the comment, the guest CP0_EPC register cannot be set via kvm_regs, since it is distinct from the guest PC. Add the EPC register to the KVM_{GET,SET}_ONE_REG ioctl interface. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David Daney <david.daney@cavium.com> Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:00:26 +02:00
James Hogan	b5dfc6c106	MIPS: KVM: Use tlb_write_random When MIPS KVM needs to write a TLB entry for the guest it reads the CP0_Random register, uses it to generate the CP_Index, and writes the TLB entry using the TLBWI instruction (tlb_write_indexed()). However there's an instruction for that, TLBWR (tlb_write_random()) so use that instead. This happens to also fix an issue with Ingenic XBurst cores where the same TLB entry is replaced each time preventing forward progress on stores due to alternating between TLB load misses for the instruction fetch and TLB store misses. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 13:00:02 +02:00
James Hogan	facaaec1a7	MIPS: KVM: Use local_flush_icache_range to fix RI on XBurst MIPS KVM uses mips32_SyncICache to synchronise the icache with the dcache after dynamically modifying guest instructions or writing guest exception vector. However this uses rdhwr to get the SYNCI step, which causes a reserved instruction exception on Ingenic XBurst cores. It would seem to make more sense to use local_flush_icache_range() instead which does the same thing but is more portable. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 12:59:54 +02:00
James Hogan	90f91356c7	MIPS: Export local_flush_icache_range for KVM Export the local_flush_icache_range function pointer for GPL modules so that it can be used by KVM for syncing the icache after binary translation of trapping instructions. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 12:59:28 +02:00
James Hogan	7006e2dfda	MIPS: KVM: Allocate at least 16KB for exception handlers Each MIPS KVM guest has its own copy of the KVM exception vector. This contains the TLB refill exception handler at offset 0x000, the general exception handler at offset 0x180, and interrupt exception handlers at offset 0x200 in case Cause_IV=1. A common handler is copied to offset 0x2000 and offset 0x3000 is used for temporarily storing k1 during entry from guest. However the amount of memory allocated for this purpose is calculated as 0x200 rounded up to the next page boundary, which is insufficient if 4KB pages are in use. This can lead to the common handler at offset 0x2000 being overwritten and infinitely recursive exceptions on the next exit from the guest. Increase the minimum size from 0x200 to 0x4000 to cover the full use of the page. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-30 12:59:13 +02:00
Paolo Bonzini	146b2cfe0c	1. Several minor fixes and cleanups for KVM: 2. Fix flag check for gdb support 3. Remove unnecessary vcpu start 4. Remove code duplication for sigp interrupts 5. Better DAT handling for the TPROT instruction 6. Correct addressing exception for standby memory -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTiD4+AAoJEBF7vIC1phx88poQALJkg1EZLePQBwuqe3XKeUnV AxbtP1T5++Pv+JpXj/mTmI4ToE3U+t66iObCZ8GAWZtXNr47yJQRetTqSU1G8BFe eLpuWhyaQD31awp5M9sWIAzNjqnO4vR0f6kbVeoXbRmPzdZY19QaQb5QIRFMA/GT hSYlNoftRfM/e2ZRFUvqiMIKFHs2r/f9T+7SMRhu3YeZo42N3KOnIwFQrLCoaP+x 8GBmxfl0OaEU9Rer6/MvyJVdMHD5ap4UyreaB5PLVtUNGIrcAJ2vjQGbF78bRonA 9SwJ0mMEfzJoH6gZXDEby1e/p7MJ80wHf9jMoy2u2qtyvNuQIyBWS1eco7LH5dpd pxXcViLC8proiXfFK/6N4ra5cr3auMAmA5iX1LjD+ZxhAjlA+WTcC31jSfueDn33 mBpDASLcoJXbPoEhWiAjP8wkgfUAQnr5CPu5lrmloOrj5llYx0FL2gDhX5dXe7op 5orTAcEtfSxqosK/Av2oQnmUU61BkI4UuBwCbpqM/1w8oKcYX4jUE5Yx/Kmej80Z MtpYUnBDmJaYPogddr8q5QcnRyLEoRUKfKu5lDnlLaP1oi7vr+WBR+BkNLl0/o7o +XVdruc7OtC/Ahtep8unkLV5C8j1siXergjvHXhHjpGNsIQ2tfD8LO55ST4vYkUd GAosCgM/6wt8YmK67qc+ =84hG -----END PGP SIGNATURE----- Merge tag 'kvm-s390-20140530' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next 1. Several minor fixes and cleanups for KVM: 2. Fix flag check for gdb support 3. Remove unnecessary vcpu start 4. Remove code duplication for sigp interrupts 5. Better DAT handling for the TPROT instruction 6. Correct addressing exception for standby memory	2014-05-30 12:57:10 +02:00
Matthew Rosato	5a5e65361f	KVM: s390: Intercept the tprot instruction Based on original patch from Jeng-fang (Nick) Wang When standby memory is specified for a guest Linux, but no virtual memory has been allocated on the Qemu host backing that guest, the guest memory detection process encounters a memory access exception which is not thrown from the KVM handle_tprot() instruction-handler function. The access exception comes from sie64a returning EFAULT, which then passes an addressing exception to the guest. Unfortunately this does not the proper PSW fixup (nullifying vs. suppressing) so the guest will get a fault for the wrong address. Let's just intercept the tprot instruction all the time to do the right thing and not go the page fault handler path for standby memory. tprot is only used by Linux during startup so some exits should be ok. Without this patch, standby memory cannot be used with KVM. Signed-off-by: Nick Wang <jfwang@us.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-30 09:39:40 +02:00
David Hildenbrand	3192c63950	KVM: s390: a VCPU is already started when delivering interrupts This patch removes the start of a VCPU when delivering a RESTART interrupt. Interrupt delivery is called from kvm_arch_vcpu_ioctl_run. So the VCPU is already considered started - no need to call kvm_s390_vcpu_start. This function will early exit anyway. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-30 09:39:39 +02:00
David Hildenbrand	2de3bfc25a	KVM: s390: check the given debug flags, not the set ones This patch fixes a minor bug when updating the guest debug settings. We should check the given debug flags, not the already set ones. Doesn't do any harm but too many (for now unused) flags could be set internally without error. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-30 09:39:38 +02:00
Jens Freimann	22ff4a3366	KVM: s390: clean up interrupt injection in sigp code We have all the logic to inject interrupts available in kvm_s390_inject_vcpu(), so let's use it instead of injecting irqs manually to the list in sigp code. SIGP stop is special because we have to check the action_flags before injecting the interrupt. As the action_flags are not available in kvm_s390_inject_vcpu() we leave the code for the stop order code untouched for now. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-05-30 09:39:37 +02:00
Thomas Huth	a0465f9ae4	KVM: s390: Enable DAT support for TPROT handler The TPROT instruction can be used to check the accessability of storage for any kind of logical addresses. So far, our handler only supported real addresses. This patch now also enables support for addresses that have to be translated via DAT first. And while we're at it, change the code to use the common KVM function gfn_to_hva_prot() to check for the validity and writability of the memory page. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-05-30 09:39:36 +02:00
Thomas Huth	9fbc02760d	KVM: s390: Add a generic function for translating guest addresses This patch adds a function for translating logical guest addresses into physical guest addresses without touching the memory at the given location. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2014-05-30 09:39:35 +02:00
Deng-Cheng Zhu	356d4c2040	MIPS: KVM: remove the stale memory alias support function unalias_gfn The memory alias support has been removed since `a1f4d39500` (KVM: Remove memory alias support). So remove unalias_gfn from the MIPS port. Reviewed-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-29 21:38:21 +02:00
Christoffer Dall	d6d7a95c1b	arm: Fix compile warning for psci Commit `e71246a23a` changes psci_init from a function returning a void to an int, but does not change the non CONFIG_ARM_PSCI implementation to return a value, which causes a compile warning. Just return 0. Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org> Cc: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-27 15:58:49 +02:00
Paolo Bonzini	04092204bb	Changed for the 3.16 merge window. This includes KVM support for PSCI v0.2 and also includes generic Linux support for PSCI v0.2 (on hosts that advertise that feature via their DT), since the latter depends on headers introduced by the former. Finally there's a small patch from Marc that enables Cortex-A53 support. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJTgjKZAAoJEEtpOizt6ddyi2gIAIO93NUZqQfJ/HmGo0cqvTsA OixRhSdAIgigzIavuGfp8UvTtk8WH82Qo6G7sy8/UveT1uoc3hZkfxkASLfjELw4 rKhfMGwfXifC5zrgQ9+h1CM77lLpWMU1+PAqUOO2TZXjlOHZxSpx5AdfY03aGxvb sL+ovj02eGXB0IxR7dNI4XPIRS7ny+2OOzoKKH4u6ogQlwm96pptQ634sWUSM+mB dedJBLZHEattn5GLh+QnvDvdrROgE5wR/Ji4PX2YdXoaeEjiz0dcmLxJknj7zhj8 jwq4NulpV2FQx5gc/KgpifCtmRo87mWgKNm6A2xJjRigcn00ekeAlADcwA4sppo= =J4JR -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next Changed for the 3.16 merge window. This includes KVM support for PSCI v0.2 and also includes generic Linux support for PSCI v0.2 (on hosts that advertise that feature via their DT), since the latter depends on headers introduced by the former. Finally there's a small patch from Marc that enables Cortex-A53 support.	2014-05-27 15:58:14 +02:00
Nadav Amit	9b88ae99d2	KVM: x86: MOV CR/DR emulation should ignore mod MOV CR/DR instructions ignore the mod field (in the ModR/M byte). As the SDM states: "The 2 bits in the mod field are ignored". Accordingly, the second operand of these instructions is always a general purpose register. The current emulator implementation does not do so. If the mod bits do not equal 3, it expects the second operand to be in memory. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-27 10:22:56 +02:00
Paolo Bonzini	fc57ac2c9c	KVM: lapic: sync highest ISR to hardware apic on EOI When Hyper-V enlightenments are in effect, Windows prefers to issue an Hyper-V MSR write to issue an EOI rather than an x2apic MSR write. The Hyper-V MSR write is not handled by the processor, and besides being slower, this also causes bugs with APIC virtualization. The reason is that on EOI the processor will modify the highest in-service interrupt (SVI) field of the VMCS, as explained in section 29.1.4 of the SDM; every other step in EOI virtualization is already done by apic_send_eoi or on VM entry, but this one is missing. We need to do the same, and be careful not to muck with the isr_count and highest_isr_cache fields that are unused when virtual interrupt delivery is enabled. Cc: stable@vger.kernel.org Reviewed-by: Yang Zhang <yang.z.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-27 10:21:09 +02:00
Marc Zyngier	1252b33136	arm64: KVM: Enable minimalistic support for Cortex-A53 In order to allow KVM to run on Cortex-A53 implementations, wire the minimal support required. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2014-05-25 20:05:30 +02:00
Nadav Amit	1f85411255	KVM: vmx: DR7 masking on task switch emulation is wrong The DR7 masking which is done on task switch emulation should be in hex format (clearing the local breakpoints enable bits 0,2,4 and 6). Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:47:18 +02:00
Dave Hansen	65a7f03f6b	x86: fix page fault tracing when KVM guest support enabled I noticed on some of my systems that page fault tracing doesn't work: cd /sys/kernel/debug/tracing echo 1 > events/exceptions/enable cat trace; # nothing shows up I eventually traced it down to CONFIG_KVM_GUEST. At least in a KVM VM, enabling that option breaks page fault tracing, and disabling fixes it. I tried on some old kernels and this does not appear to be a regression: it never worked. There are two page-fault entry functions today. One when tracing is on and another when it is off. The KVM code calls do_page_fault() directly instead of calling the traced version: > dotraplinkage void __kprobes > do_async_page_fault(struct pt_regs *regs, unsigned long > error_code) > { > enum ctx_state prev_state; > > switch (kvm_read_and_reset_pf_reason()) { > default: > do_page_fault(regs, error_code); > break; > case KVM_PV_REASON_PAGE_NOT_PRESENT: I'm also having problems with the page fault tracing on bare metal (same symptom of no trace output). I'm unsure if it's related. Steven had an alternative to this which has zero overhead when tracing is off where this includes the standard noops even when tracing is disabled. I'm unconvinced that the extra complexity of his apporach: http://lkml.kernel.org/r/20140508194508.561ed220@gandalf.local.home is worth it, expecially considering that the KVM code is already making page fault entry slower here. This solution is dirt-simple. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Gleb Natapov <gleb@redhat.com> Cc: kvm@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:47:17 +02:00
Paolo Bonzini	ae9fedc793	KVM: x86: get CPL from SS.DPL CS.RPL is not equal to the CPL in the few instructions between setting CR0.PE and reloading CS. And CS.DPL is also not equal to the CPL for conforming code segments. However, SS.DPL is always equal to the CPL except for the weird case of SYSRET on AMD processors, which sets SS.DPL=SS.RPL from the value in the STAR MSR, but force CPL=3 (Intel instead forces SS.DPL=SS.RPL=CPL=3). So this patch: - modifies SVM to update the CPL from SS.DPL rather than CS.RPL; the above case with SYSRET is not broken further, and the way to fix it would be to pass the CPL to userspace and back - modifies VMX to always return the CPL from SS.DPL (except forcing it to 0 if we are emulating real mode via vm86 mode; in vm86 mode all DPLs have to be 3, but real mode does allow privileged instructions). It also removes the CPL cache, which becomes a duplicate of the SS access rights cache. This fixes doing KVM_IOCTL_SET_SREGS exactly after setting CR0.PE=1 but before CS has been reloaded. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:47:17 +02:00
Paolo Bonzini	5045b46803	KVM: x86: check CS.DPL against RPL during task switch Table 7-1 of the SDM mentions a check that the code segment's DPL must match the selector's RPL. This was not done by KVM, fix it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:47:17 +02:00
Paolo Bonzini	fb5e336b97	KVM: x86: drop set_rflags callback Not needed anymore now that the CPL is computed directly during task switch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:47:16 +02:00
Paolo Bonzini	2356aaeb2f	KVM: x86: use new CS.RPL as CPL during task switch During task switch, all of CS.DPL, CS.RPL, SS.DPL must match (in addition to all the other requirements) and will be the new CPL. So far this worked by carefully setting the CS selector and flag before doing the task switch; setting CS.selector will already change the CPL. However, this will not work once we get the CPL from SS.DPL, because then you will have to set the full segment descriptor cache to change the CPL. ctxt->ops->cpl(ctxt) will then return the old CPL during the task switch, and the check that SS.DPL == CPL will fail. Temporarily assume that the CPL comes from CS.RPL during task switch to a protected-mode task. This is the same approach used in QEMU's emulation code, which (until version 2.0) manually tracks the CPL. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-22 17:45:38 +02:00
Michael Mueller	fda902cb83	KVM: s390: split SIE state guest prefix field This patch splits the SIE state guest prefix at offset 4 into a prefix bit field. Additionally it provides the access functions: - kvm_s390_get_prefix() - kvm_s390_set_prefix() to access the prefix per vcpu. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:31 +02:00
Michael Mueller	570126d370	s390/sclp: add sclp_get_ibc function The patch adds functionality to retrieve the IBC configuration by means of function sclp_get_ibc(). Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:30 +02:00
David Hildenbrand	4953919fee	KVM: s390: interpretive execution of SIGP EXTERNAL CALL If the sigp interpretation facility is installed, most SIGP EXTERNAL CALL operations will be interpreted instead of intercepted. A partial execution interception will occurr at the sending cpu only if the target cpu is in the wait state ("W" bit in the cpuflags set). Instruction interception will only happen in error cases (e.g. cpu addr invalid). As a sending cpu might set the external call interrupt pending flags at the target cpu at every point in time, we can't handle this kind of interrupt using our kvm interrupt injection mechanism. The injection will be done automatically by the SIE when preparing the start of the target cpu. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> CC: Thomas Huth <thuth@linux.vnet.ibm.com> [Adopt external call injection to check for sigp interpretion] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:28 +02:00
Alexander Yarygin	d26b8655f0	KVM: s390: Use intercept_insn decoder in trace event The current trace definition doesn't work very well with the perf tool. Perf shows a "insn_to_mnemonic not found" message. Let's handle the decoding completely in a parseable format. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:27 +02:00
Alexander Yarygin	05db1f6e03	KVM: s390: decoder of SIE intercepted instructions This patch adds a new decoder of SIE intercepted instructions. The decoder implemented as a macro and potentially can be used in both kernelspace and userspace. Note that this simplified instruction decoder is only intended to be used with the subset of instructions that may cause a SIE intercept. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:26 +02:00
Alexander Yarygin	6de1bf88df	KVM: s390: Use trace tables from sie.h. Use the symbolic translation tables from sie.h for decoding diag, sigp and sie exit codes. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:24 +02:00
Alexander Yarygin	ceae283bb2	KVM: s390: add sie exit reasons tables This patch defines tables of reasons for exiting from SIE mode in a new sie.h header file. Tables contain SIE intercepted codes, intercepted instructions and program interruptions codes. Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:23 +02:00
Thomas Huth	f22166dcfd	KVM: s390: Improved MVPG partial execution handler Use the new helper function kvm_arch_fault_in_page() for faulting-in the guest pages and only inject addressing errors when we've really hit a bad address (and return other error codes to userspace instead). Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:22 +02:00
Thomas Huth	fa576c583d	KVM: s390: Introduce helper function for faulting-in a guest page Rework the function kvm_arch_fault_in_sync() to become a proper helper function for faulting-in a guest page. Now it takes the guest address as a parameter and does not ignore the possible error code from gmap_fault() anymore (which could cause undetected error conditions before). Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:20 +02:00
Thomas Huth	684135e096	KVM: s390: Avoid endless loops of specification exceptions If the new PSW for program interrupts is invalid, the VM ends up in an endless loop of specification exceptions. Since there is not much left we can do in this case, we should better drop to userspace instead so that the crash can be reported to the user. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:19 +02:00
Thomas Huth	a3fb577e48	KVM: s390: Improve is_valid_psw() As a program status word is also invalid (and thus generates an specification exception) if the instruction address is not even, we should test this in is_valid_psw(), too. This patch also exports the function so that it becomes available for other parts of the S390 KVM code as well. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:18 +02:00
Martin Schwidefsky	3a801517ad	KVM: s390: correct locking for s390_enable_skey Use the mm semaphore to serialize multiple invocations of s390_enable_skey. The second CPU faulting on a storage key operation needs to wait for the completion of the page table update. Taking the mm semaphore writable has the positive side-effect that it prevents any host faults from taking place which does have implications on keys vs PGSTE. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2014-05-16 14:57:17 +02:00
Ashwin Chaugule	c814ca029e	ARM: Check if a CPU has gone offline PSCIv0.2 adds a new function called AFFINITY_INFO, which can be used to query if a specified CPU has actually gone offline. Calling this function via cpu_kill ensures that a CPU has quiesced after a call to cpu_die. This helps prevent the CPU from doing arbitrary bad things when data or instructions are clobbered (as happens with kexec) in the window between a CPU announcing that it is dead and said CPU leaving the kernel. Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-15 10:16:30 -04:00
Ashwin Chaugule	e71246a23a	PSCI: Add initial support for PSCIv0.2 functions The PSCIv0.2 spec defines standard values of function IDs and introduces a few new functions. Detect version of PSCI and appropriately select the right PSCI functions. Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2014-05-15 10:16:00 -04:00
Jan Kiszka	d9f89b88f5	KVM: x86: Fix CR3 reserved bits check in long mode Regression of `346874c9`: PAE is set in long mode, but that does not mean we have valid PDPTRs. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-12 20:04:01 +02:00

1 2 3 4 5 ...

95116 Commits