linux/arch
Vitaly Kuznetsov 49097762fa Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
Guest crashes are observed on a Cascade Lake system when 'perf top' is
launched on the host, e.g.

 BUG: unable to handle kernel paging request at fffffe0000073038
 PGD 7ffa7067 P4D 7ffa7067 PUD 7ffa6067 PMD 7ffa5067 PTE ffffffffff120
 Oops: 0000 [#1] SMP PTI
 CPU: 1 PID: 1 Comm: systemd Not tainted 4.18.0+ #380
...
 Call Trace:
  serial8250_console_write+0xfe/0x1f0
  call_console_drivers.constprop.0+0x9d/0x120
  console_unlock+0x1ea/0x460

Call traces are different but the crash is imminent. The problem was
blindly bisected to the commit 041bc42ce2 ("KVM: VMX: Micro-optimize
vmexit time when not exposing PMU"). It was also confirmed that the
issue goes away if PMU is exposed to the guest.

With some instrumentation of the guest we can see what is being switched
(when we do atomic_switch_perf_msrs()):

 vmx_vcpu_run: switching 2 msrs
 vmx_vcpu_run: switching MSR38f guest: 70000000d host: 70000000f
 vmx_vcpu_run: switching MSR3f1 guest: 0 host: 2

The current guess is that PEBS (MSR_IA32_PEBS_ENABLE, 0x3f1) is to blame.
Regardless of whether PMU is exposed to the guest or not, PEBS needs to
be disabled upon switch.

This reverts commit 041bc42ce2.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200619094046.654019-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-19 08:13:40 -04:00
..
alpha Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
arc treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
arm Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
arm64 Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
c6x This time around we have 4 lines of diff in the core framework, removing a 2020-06-10 11:42:19 -07:00
csky mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
h8300 This time around we have 4 lines of diff in the core framework, removing a 2020-06-10 11:42:19 -07:00
hexagon treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
ia64 treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
m68k Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
microblaze mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mips KVM: MIPS: Fix a build error for !CPU_LOONGSON64 2020-06-15 05:24:30 -04:00
nds32 mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
nios2 nios2 update for v5.8-rc1 2020-06-12 11:55:11 -07:00
openrisc OpenRISC updates for 5.8 2020-06-13 10:54:09 -07:00
parisc treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
powerpc Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
riscv RISC-V Patches for the 5.8 Merge Window, Part 2 2020-06-11 12:55:20 -07:00
s390 Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
sh treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
sparc treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
um treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
unicore32 This time around we have 4 lines of diff in the core framework, removing a 2020-06-10 11:42:19 -07:00
x86 Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU" 2020-06-19 08:13:40 -04:00
xtensa mmap locking API: convert mmap_sem API comments 2020-06-09 09:39:14 -07:00
.gitignore
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00