linux/arch
Boqun Feng 19ab58d19e powerpc, hotplug: Avoid to touch non-existent cpumasks.
We observed a kernel oops when running a PPC guest with config NR_CPUS=4
and qemu option "-smp cores=1,threads=8":

[   30.634781] Unable to handle kernel paging request for data at
address 0xc00000014192eb17
[   30.636173] Faulting instruction address: 0xc00000000003e5cc
[   30.637069] Oops: Kernel access of bad area, sig: 11 [#1]
[   30.637877] SMP NR_CPUS=4 NUMA pSeries
[   30.638471] Modules linked in:
[   30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
4.7.0-07963-g9714b26 #1
[   30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
[   30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
0000000000000000
[   30.642001] REGS: c00000001e2ab8e0 TRAP: 0300   Not tainted
(4.7.0-07963-g9714b26)
[   30.643139] MSR: 8000000102803033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22004084  XER: 00000000
[   30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
[   30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
[   30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
[   30.655528] Call Trace:
[   30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
[   30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
[   30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
[   30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
[   30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
[   30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
[   30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
[   30.664017] Instruction dump:
[   30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
[   30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 <7d4048a8> 7d4a3878 7d4049ad 40c2fff4
[   30.666854] ---[ end trace 32643b7195717741 ]---

The reason of this is that in __cpu_disable(), when we try to set the
cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
one, we don't check whether the current configuration employs those
sibling CPUs(hw threads). And if a CPU is not employed by a
configuration, the percpu structures cpu_{sibling,core}_mask are not
allocated, therefore accessing those cpumasks will result in problems as
above.

This patch fixes this problem by adding an addition check on whether the
id is no less than nr_cpu_ids in the sibling CPU iteration code.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
..
alpha RTC for 4.8 2016-08-05 09:48:22 -04:00
arc dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
arm DeviceTree fixes for 4.8: 2016-08-18 19:31:08 -07:00
arm64 arm64: Fix shift warning in arch/arm64/mm/dump.c 2016-08-18 12:38:11 +01:00
avr32 dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
blackfin dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
c6x dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
cris dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
frv RTC for 4.8 2016-08-05 09:48:22 -04:00
h8300 h8300: Add missing include file to asm/io.h 2016-08-13 08:53:56 -07:00
hexagon dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
ia64 Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
m32r mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
m68k m68knommu: fix user a5 register being overwritten 2016-08-08 12:38:47 +10:00
metag metag: Drop show_mem() from mem_init() 2016-08-09 13:41:30 +01:00
microblaze dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
mips MIPS: KVM: Propagate kseg0/mapped tlb fault errors 2016-08-12 12:01:30 +02:00
mn10300 RTC for 4.8 2016-08-05 09:48:22 -04:00
nios2 dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
openrisc dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
parisc parisc: Fix order of EREFUSED define in errno.h 2016-08-20 13:33:53 +02:00
powerpc powerpc, hotplug: Avoid to touch non-existent cpumasks. 2016-08-22 11:09:33 +10:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2016-08-16 15:50:22 -07:00
score treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sh These changes improve device tree support (including builtin DTB), add 2016-08-06 09:00:05 -04:00
sparc Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
tile tile: support static_key usage in non-module __exit sections 2016-08-04 08:50:07 -04:00
um Merge branch 'for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2016-08-04 19:37:59 -04:00
unicore32 unicore32: mm: Add missing parameter to arch_vma_access_permitted 2016-08-13 08:53:18 -07:00
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-18 15:09:41 -07:00
xtensa dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
.gitignore
Kconfig Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00