linux/arch
Borislav Petkov 9a90ed065a x86/thermal: Fix LVT thermal setup for SMI delivery mode
There are machines out there with added value crap^WBIOS which provide an
SMI handler for the local APIC thermal sensor interrupt. Out of reset,
the BSP on those machines has something like 0x200 in that APIC register
(timestamps left in because this whole issue is timing sensitive):

  [    0.033858] read lvtthmr: 0x330, val: 0x200

which means:

 - bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled
 - bits [10:8] have 010b which means SMI delivery mode.

Now, later during boot, when the kernel programs the local APIC, it
soft-disables it temporarily through the spurious vector register:

  setup_local_APIC:

  	...

	/*
	 * If this comes from kexec/kcrash the APIC might be enabled in
	 * SPIV. Soft disable it before doing further initialization.
	 */
	value = apic_read(APIC_SPIV);
	value &= ~APIC_SPIV_APIC_ENABLED;
	apic_write(APIC_SPIV, value);

which means (from the SDM):

"10.4.7.2 Local APIC State After It Has Been Software Disabled

...

* The mask bits for all the LVT entries are set. Attempts to reset these
bits will be ignored."

And this happens too:

  [    0.124111] APIC: Switch to symmetric I/O mode setup
  [    0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0
  [    0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0

This results in CPU 0 soft lockups depending on the placement in time
when the APIC soft-disable happens. Those soft lockups are not 100%
reproducible and the reason for that can only be speculated as no one
tells you what SMM does. Likely, it confuses the SMM code that the APIC
is disabled and the thermal interrupt doesn't doesn't fire at all,
leading to CPU 0 stuck in SMM forever...

Now, before

  4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")

due to how the APIC_LVTTHMR was read before APIC initialization in
mcheck_intel_therm_init(), it would read the value with the mask bit 16
clear and then intel_init_thermal() would replicate it onto the APs and
all would be peachy - the thermal interrupt would remain enabled.

But that commit moved that reading to a later moment in
intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too
late and with its interrupt mask bit set.

Thus, revert back to the old behavior of reading the thermal LVT
register before the APIC gets initialized.

Fixes: 4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")
Reported-by: James Feeney <james@nurealm.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@zn.tnic
2021-05-31 22:32:26 +02:00
..
alpha quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
arc ARC: mm: Use max_high_pfn as a HIGHMEM zone border 2021-05-10 12:38:59 -07:00
arm A few fixes for irqchip drivers: 2021-05-23 06:28:20 -10:00
arm64 ARM: SoC fixes for 5.13 2021-05-20 14:46:26 -10:00
csky arch/csky patches for 5.13-rc1 2021-05-03 12:58:31 -07:00
h8300 arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300 2021-05-06 19:24:11 -07:00
hexagon Merge branch 'akpm' (patches from Andrew) 2021-05-07 00:34:51 -07:00
ia64 quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
m68k Merge branch 'for-v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-05-21 06:12:52 -10:00
microblaze quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
mips quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
nds32 tracing updates for 5.13 2021-05-03 11:19:54 -07:00
nios2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-05-02 09:14:01 -07:00
openrisc OpenRISC fixes for 5.13 2021-05-21 06:06:19 -10:00
parisc quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
powerpc powerpc fixes for 5.13 #4 2021-05-23 06:07:33 -10:00
riscv riscv: remove unused handle_exception symbol 2021-05-06 09:40:16 -07:00
s390 quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
sh \n 2021-05-20 06:20:15 -10:00
sparc quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
um Merge branch 'akpm' (patches from Andrew) 2021-05-07 00:34:51 -07:00
x86 x86/thermal: Fix LVT thermal setup for SMI delivery mode 2021-05-31 22:32:26 +02:00
xtensa quota: Disable quotactl_path syscall 2021-05-17 14:39:56 +02:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
Kconfig Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com> 2021-05-01 18:50:44 -07:00