linux/arch/x86
Igor Mammedov 89f898c1e1 x86: Fix list/memory corruption on CPU hotplug
currently if AP wake up is failed, master CPU marks AP as not
present in do_boot_cpu() by calling set_cpu_present(cpu, false).
That leads to following list corruption on the next physical CPU
hotplug:

[  418.107336] WARNING: CPU: 1 PID: 45 at lib/list_debug.c:33 __list_add+0xbe/0xd0()
[  418.115268] list_add corruption. prev->next should be next (ffff88003dc57600), but was ffff88003e20c3a0. (prev=ffff88003e20c3a0).
[  418.123693] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT ipt_REJECT cfg80211 xt_conntrack rfkill ee
[  418.138979] CPU: 1 PID: 45 Comm: kworker/u10:1 Not tainted 3.14.0-rc6+ #387
[  418.149989] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  418.165750] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  418.166433]  0000000000000021 ffff880038ca7988 ffffffff8159b22d 0000000000000021
[  418.176460]  ffff880038ca79d8 ffff880038ca79c8 ffffffff8106942c ffff880038ca79e8
[  418.177453]  ffff88003e20c3a0 ffff88003dc57600 ffff88003e20c3a0 00000000ffffffea
[  418.178445] Call Trace:
[  418.185811]  [<ffffffff8159b22d>] dump_stack+0x49/0x5c
[  418.186440]  [<ffffffff8106942c>] warn_slowpath_common+0x8c/0xc0
[  418.187192]  [<ffffffff81069516>] warn_slowpath_fmt+0x46/0x50
[  418.191231]  [<ffffffff8136ef51>] ? acpi_ns_get_node+0xb7/0xc7
[  418.193889]  [<ffffffff812f796e>] __list_add+0xbe/0xd0
[  418.196649]  [<ffffffff812e2aa9>] kobject_add_internal+0x79/0x200
[  418.208610]  [<ffffffff812e2e18>] kobject_add_varg+0x38/0x60
[  418.213831]  [<ffffffff812e2ef4>] kobject_add+0x44/0x70
[  418.229961]  [<ffffffff813e2c60>] device_add+0xd0/0x550
[  418.234991]  [<ffffffff813f0e95>] ? pm_runtime_init+0xe5/0xf0
[  418.250226]  [<ffffffff813e32be>] device_register+0x1e/0x30
[  418.255296]  [<ffffffff813e82a3>] register_cpu+0xe3/0x130
[  418.266539]  [<ffffffff81592be5>] arch_register_cpu+0x65/0x150
[  418.285845]  [<ffffffff81355c0d>] acpi_processor_hotadd_init+0x5a/0x9b
...
Which is caused by the fact that generic_processor_info() allocates
logical CPU id by calling:

 cpu = cpumask_next_zero(-1, cpu_present_mask);

which returns id of previously failed to wake up CPU, since its
bit is cleared by do_boot_cpu() and as result register_cpu()
tries to register another CPU with the same id as already
present but failed to be onlined CPU.

Taking in account that AP will not do anything if master CPU
failed to wake it up, there is no reason to mark that AP as not
present and break next cpu hotplug attempts. As a side effect of
not marking AP as not present, user would be allowed to online
it again later.

Also fix memory corruption in acpi_unmap_lsapic()

if during CPU hotplug master CPU failed to wake up AP
it set percpu x86_cpu_to_apicid to BAD_APICID=0xFFFF for AP.

However following attempt to unplug that CPU will lead to
out of bound write access to __apicid_to_node[] which is
32768 items long on x86_64 kernel.

So with above fix of cpu_present_mask make sure that a present
CPU has a valid APIC ID by not setting x86_cpu_to_apicid
to BAD_APICID in do_boot_cpu() on failure and allow
acpi_processor_remove()->acpi_unmap_lsapic() cleanly remove CPU.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1401975765-22328-2-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 16:33:07 +02:00
..
boot asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
configs ACPI: Remove Kconfig symbol ACPI_PROCFS 2014-02-19 00:27:37 +01:00
crypto crypto: ghash-clmulni-intel - use C implementation for setkey() 2014-04-01 17:22:47 +08:00
ia32
include x86_64: expand kernel stack to 16K 2014-05-30 11:52:51 -07:00
kernel x86: Fix list/memory corruption on CPU hotplug 2014-06-05 16:33:07 +02:00
kvm Small fixes for x86, slightly larger fixes for PPC, and a forgotten s390 patch. 2014-05-28 08:08:03 -07:00
lguest asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
lib x86: Fix typo preventing msr_set/clear_bit from having an effect 2014-05-09 08:42:32 -07:00
math-emu asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
mm arch/x86/mm/kmemcheck/kmemcheck.c: use kstrtoint() instead of sscanf() 2014-04-08 16:48:52 -07:00
net net: filter: x86: fix JIT address randomization 2014-05-13 18:31:13 -04:00
oprofile x86, oprofile, nmi: Fix CPU hotplug callback registration 2014-03-20 13:43:43 +01:00
pci CPU hotplug notifiers registration fixes for 3.15-rc1 2014-04-07 14:55:46 -07:00
platform asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
power asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
realmode Merge commit 'f4bcd8ccddb02833340652e9f46f5127828eb79d' into x86/build 2014-01-29 09:07:00 -08:00
syscalls x86/build: Supress "Nothing to be done for ..." messages 2014-04-14 11:44:36 +02:00
tools x86/build: Supress "Nothing to be done for ..." messages 2014-04-14 11:44:36 +02:00
um x86: Remove CONFIG_X86_OOSTORE 2014-03-11 10:16:18 -07:00
vdso x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET 2014-05-21 16:14:04 -07:00
video
xen asmlinkage, x86: Add explicit __visible to arch/x86/* 2014-05-05 16:07:44 -07:00
.gitignore
Kbuild
Kconfig Merge git://git.infradead.org/users/eparis/audit 2014-04-12 12:38:53 -07:00
Kconfig.cpu Merge branch 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-02 13:15:58 -07:00
Kconfig.debug x86/efi: Dump the EFI page table 2014-03-04 16:17:17 +00:00
Makefile x86-64, build: Fix stack protector Makefile breakage with 32-bit userland 2014-05-07 14:14:44 -07:00
Makefile_32.cpu
Makefile.um