linux/arch/x86/kernel
Yinghai Lu 20e6926dcb x86, ACPI, mm: Revert movablemem_map support
Tim found:

  WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
  Hardware name: S2600CP
  sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
  smpboot: Booting Node   1, Processors  #1
  Modules linked in:
  Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
  Call Trace:
    set_cpu_sibling_map+0x279/0x449
    start_secondary+0x11d/0x1e5

Don Morris reproduced on a HP z620 workstation, and bisected it to
commit e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock
is ready")

It turns out movable_map has some problems, and it breaks several things

1. numa_init is called several times, NOT just for srat. so those
	nodes_clear(numa_nodes_parsed)
	memset(&numa_meminfo, 0, sizeof(numa_meminfo))
   can not be just removed.  Need to consider sequence is: numaq, srat, amd, dummy.
   and make fall back path working.

2. simply split acpi_numa_init to early_parse_srat.
   a. that early_parse_srat is NOT called for ia64, so you break ia64.
   b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
	     set_apicid_to_node(i, NUMA_NO_NODE)
     still left in numa_init. So it will just clear result from early_parse_srat.
     it should be moved before that....
   c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
       early before override from INITRD is settled.

3. that patch TITLE is total misleading, there is NO x86 in the title,
   but it changes critical x86 code. It caused x86 guys did not
   pay attention to find the problem early. Those patches really should
   be routed via tip/x86/mm.

4. after that commit, following range can not use movable ram:
  a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
  b. initrd... it will be freed after booting, so it could be on movable...
  c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
	anymore.
  d. init_mem_mapping: can not put page table high anymore.
  e. initmem_init: vmemmap can not be high local node anymore. That is
     not good.

If node is hotplugable, the mem related range like page table and
vmemmap could be on the that node without problem and should be on that
node.

We have workaround patch that could fix some problems, but some can not
be fixed.

So just remove that offending commit and related ones including:

 f7210e6c4a ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
    protect movablecore_map in memblock_overlaps_region().")

 01a178a94e ("acpi, memory-hotplug: support getting hotplug info from
    SRAT")

 27168d38fa ("acpi, memory-hotplug: extend movablemem_map ranges to
    the end of node")

 e8d1955258 ("acpi, memory-hotplug: parse SRAT before memblock is
    ready")

 fb06bc8e5f ("page_alloc: bootmem limit with movablecore_map")

 42f47e27e7 ("page_alloc: make movablemem_map have higher priority")

 6981ec3114 ("page_alloc: introduce zone_movable_limit[] to keep
    movable limit for nodes")

 34b71f1e04 ("page_alloc: add movable_memmap kernel parameter")

 4d59a75125 ("x86: get pg_data_t's memory from other node")

Later we should have patches that will make sure kernel put page table
and vmemmap on local node ram instead of push them down to node0.  Also
need to find way to put other kernel used ram to local node ram.

Reported-by: Tim Gardner <tim.gardner@canonical.com>
Reported-by: Don Morris <don.morris@hp.com>
Bisected-by: Don Morris <don.morris@hp.com>
Tested-by: Don Morris <don.morris@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-02 09:34:39 -08:00
..
acpi cpu_hotplug: clear apicid to node when the cpu is hotremoved 2013-02-23 17:50:13 -08:00
apic Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-26 19:45:29 -08:00
cpu Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-26 19:40:37 -08:00
kprobes hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
.gitignore
alternative.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-10-01 10:47:45 -07:00
amd_gart_64.c x86, mm: use pfn_range_is_mapped() with gart 2012-11-17 11:59:10 -08:00
amd_nb.c Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 16:07:45 -07:00
apb_timer.c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 20:11:07 -08:00
aperture_64.c x86/gart: Fix kmemleak warning 2012-06-06 11:58:38 +02:00
apm_32.c ACPI and power management updates for 3.9-rc1 2013-02-20 11:26:56 -08:00
asm-offsets_32.c
asm-offsets_64.c x32: If configured, add x32 system calls to system call tables 2012-02-20 12:52:06 -08:00
asm-offsets.c x86, um/x86: switch to generic sys_execve and kernel_execve 2012-09-30 22:53:32 -04:00
audit_64.c
bootflag.c
check.c x86: kernel/check.c simple_strtoul cleanup 2012-05-15 15:36:41 -07:00
cpuid.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
crash_dump_32.c x86: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:15 +08:00
crash_dump_64.c
crash.c x86/kexec: crash_vmclear_local_vmcss needs __rcu 2012-12-11 19:55:23 -02:00
devicetree.c x86: dt: Use linear irq domain for ioapic(s) 2012-08-21 22:16:57 +02:00
doublefault_32.c
dumpstack_32.c x86: Move call to print_modules() out of show_regs() 2012-06-20 14:33:48 +02:00
dumpstack_64.c x86: Move call to print_modules() out of show_regs() 2012-06-20 14:33:48 +02:00
dumpstack.c taint: add explicit flag to show whether lock dep is still OK. 2013-01-21 17:17:57 +10:30
e820.c x86, mm: Let "memmap=" take more entries one time 2012-11-17 11:59:51 -08:00
early_printk.c Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()" 2012-07-22 15:47:52 +02:00
early-quirks.c
entry_32.S Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
entry_64.S Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-02-23 18:50:11 -08:00
ftrace.c x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols 2012-11-16 16:42:09 -08:00
head32.c x86: Merge early kernel reserve for 32bit and 64bit 2013-01-29 19:32:58 -08:00
head64.c Merge branch 'x86/microcode' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-22 19:22:52 -08:00
head_32.S Merge branch 'x86/microcode' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-22 19:22:52 -08:00
head_64.S Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-27 16:16:39 -08:00
head.c x86: Make sure we can boot in the case the BDA contains pure garbage 2013-02-27 13:38:57 -08:00
hpet.c x86, hpet: Introduce x86_msi_ops.setup_hpet_msi 2013-01-28 10:48:30 +01:00
hw_breakpoint.c
i386_ksyms_32.c x86-32: Add support for 64bit get_user() 2013-02-07 15:07:28 -08:00
i387.c x86/i387.c: Initialize thread xstate only on CPU0 only once 2012-11-14 15:28:11 -08:00
i8237.c
i8253.c
i8259.c x86/irq/i8259: Fix incorrect comment 2012-08-22 09:34:24 +02:00
io_delay.c
ioport.c x86: get rid of pt_regs argument of iopl(2) 2013-02-03 18:16:24 -05:00
irq_32.c x86: Use common threadinfo allocator 2012-05-08 14:08:44 +02:00
irq_64.c
irq_work.c
irq.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-10-01 11:13:33 -07:00
irqinit.c x86, 386 removal: Remove support for IRQ 13 FPU error reporting 2012-12-17 11:42:40 -08:00
jump_label.c
kdebugfs.c arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case 2012-07-26 15:07:20 +02:00
kgdb.c kgdb,x86: fix warning about unused variable 2012-10-12 06:37:34 -05:00
kvm.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-21 18:06:55 -08:00
kvmclock.c Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2013-02-24 13:07:18 -08:00
ldt.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
machine_kexec_32.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
machine_kexec_64.c x86, kexec, 64bit: Only set ident mapping for ram. 2013-01-29 15:26:35 -08:00
Makefile Merge branch 'x86/microcode' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-22 19:22:52 -08:00
microcode_amd.c x86, microcode, AMD: Add support for family 16h processors 2012-11-20 22:23:28 -08:00
microcode_core_early.c x86/microcode_core_early.c: Define interfaces for early loading ucode 2013-01-31 13:19:12 -08:00
microcode_core.c x86/microcode_intel.h: Define functions and macros for early loading ucode 2013-01-31 13:18:50 -08:00
microcode_intel_early.c x86/microcode_intel_early.c: Early update ucode on Intel's CPU 2013-01-31 13:19:18 -08:00
microcode_intel_lib.c x86/microcode_intel_lib.c: Early update ucode on Intel's CPU 2013-01-31 13:19:14 -08:00
microcode_intel.c x86/microcode_intel.h: Define functions and macros for early loading ucode 2013-01-31 13:18:50 -08:00
mmconf-fam10h_64.c
module.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-07-24 13:34:56 -07:00
mpparse.c Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-29 20:14:53 -07:00
msr.c x86/msr: Add capabilities check 2013-01-24 17:37:51 +01:00
nmi_selftest.c x86/nmi: Clean up register_nmi_handler() usage 2012-06-20 14:23:17 +02:00
nmi.c x86/nmi: export local_touch_nmi() symbol for modules 2013-01-17 22:25:57 +08:00
paravirt_patch_32.c
paravirt_patch_64.c
paravirt-spinlocks.c
paravirt.c x86, pvops: Remove hooks for {rd,wr}msr_safe_regs 2012-06-07 11:41:08 -07:00
pci-calgary_64.c x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
pci-dma.c x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIES 2013-01-24 17:34:18 +01:00
pci-iommu_table.c
pci-nommu.c X86: integrate CMA with DMA-mapping subsystem 2012-05-21 15:09:38 +02:00
pci-swiotlb.c X86 & IA64: adapt for dma_map_ops changes 2012-03-28 16:36:31 +02:00
pcspeaker.c
perf_regs.c perf: Fix off by one test in perf_reg_value() 2012-09-19 17:08:40 +02:00
probe_roms.c x86/pci/probe_roms: Add missing __iomem annotation to pci_map_biosrom() 2012-09-05 10:52:25 +02:00
process_32.c flagday: don't pass regs to copy_thread() 2012-11-28 23:43:42 -05:00
process_64.c x86/process: Change %8s to %s for pr_warn() in release_thread() 2013-01-24 18:11:03 +01:00
process.c Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux 2013-02-18 22:34:11 +01:00
ptrace.c x86: ptrace.c only needs export.h and not the full module.h 2013-02-14 12:56:12 -08:00
pvclock.c x86/kvm: Fix pvclock vsyscall fixmap 2013-02-28 08:50:11 +02:00
quirks.c X86: drivers: remove __dev* attributes. 2013-01-03 15:57:04 -08:00
reboot_fixups_32.c
reboot.c efi: Make 'efi_enabled' a function to query EFI facilities 2013-01-30 11:51:59 -08:00
relocate_kernel_32.S
relocate_kernel_64.S
resource.c
rtc.c x86/time/rtc: Don't print extended CMOS year when reading RTC 2013-01-24 14:56:35 +01:00
setup_percpu.c x86: Add read_mostly declaration/definition to variables from smp.h 2012-06-14 12:42:11 +02:00
setup.c x86, ACPI, mm: Revert movablemem_map support 2013-03-02 09:34:39 -08:00
signal.c x86: convert to ksignal 2013-02-14 09:21:17 -05:00
smp.c x86/reboot: Update nonmi_ipi parameter 2012-05-14 11:49:38 +02:00
smpboot.c x86 idle: remove mwait_idle() and "idle=mwait" cmdline param 2013-02-10 03:03:41 -05:00
stacktrace.c
step.c ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL 2013-01-22 10:08:00 -08:00
sys_x86_64.c x86: Fix a typo 2013-01-24 16:22:10 +01:00
syscall_32.c
syscall_64.c x32: If configured, add x32 system calls to system call tables 2012-02-20 12:52:06 -08:00
tboot.c Revert "x86-64/efi: Use EFI to deal with platform wall clock (again)" 2012-12-15 15:20:41 -08:00
tce_64.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
test_nx.c
test_rodata.c x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c 2012-04-20 13:51:38 -07:00
time.c MCA: delete all remaining traces of microchannel bus support. 2012-05-17 19:06:13 -04:00
tls.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 14:28:26 -07:00
tls.h
topology.c x86, topology: Debug CPU0 hotplug 2012-11-14 15:28:11 -08:00
trace_clock.c tracing,x86: Add a TSC trace_clock 2012-11-13 15:48:27 -05:00
traps.c x86, 64bit: Use a #PF handler to materialize early mappings on demand 2013-01-29 15:20:06 -08:00
tsc_sync.c x86/tsc: Reduce the TSC sync check time for core-siblings 2012-02-22 11:49:40 +01:00
tsc.c Merge branch 'fortglx/3.9/time' of git://git.linaro.org/people/jstultz/linux into timers/core 2013-02-04 11:03:03 +01:00
uprobes.c uprobes: Change handle_swbp() to expose bp_vaddr to handler_chain() 2013-02-08 17:47:11 +01:00
verify_cpu.S
vm86_32.c x86: get rid of pt_regs argument in vm86/vm86old 2013-02-03 18:16:24 -05:00
vmlinux.lds.S x86, realmode: Move ACPI wakeup to unified realmode code 2012-05-08 11:46:05 -07:00
vsmp_64.c x86/apic/x2apic: Limit the vector reservation to the user specified mask 2012-07-06 11:00:22 +02:00
vsyscall_64.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2012-12-16 15:40:50 -08:00
vsyscall_emu_64.S
vsyscall_trace.h
x86_init.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-21 18:06:55 -08:00
x8664_ksyms_64.c x86: Improve __phys_addr performance by making use of carry flags and inlining 2012-11-16 16:42:08 -08:00
xsave.c x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space 2012-09-25 15:42:18 -07:00