linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-16 17:12:06 +00:00

Author	SHA1	Message	Date
Pekka Enberg	e5b2bb5527	x86: unify free_init_pages() and free_initmem() Impact: unification This patch introduces a common arch/x86/mm/init.c and moves the identical free_init_pages() and free_initmem() functions to the file. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <1236078906.2675.18.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-03 12:21:18 +01:00
Pekka Enberg	e087edd8c0	x86: make sure initmem is writable on 64-bit Impact: unification This patch ports commit `3c1df68b84` ("x86: make sure initmem is writable") to the 64-bit version to unify implementations of free_init_pages(). Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <1236078904.2675.17.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-03 12:21:18 +01:00
Yinghai Lu	0fc59d3a01	x86: fix init_memory_mapping() to handle small ranges Impact: fix failed EFI bootup in certain circumstances Ying Huang found init_memory_mapping() has problem with small ranges less than 2M when he tried to direct map the EFI runtime code out of max_low_pfn_mapped. It turns out we never considered that case and didn't check the range... Reported-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Brian Maly <bmaly@redhat.com> LKML-Reference: <49ACDDED.1060508@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-03-03 08:50:22 +01:00
Tejun Heo	458a3e644c	x86: update populate_extra_pte() and add populate_extra_pmd() Impact: minor change to populate_extra_pte() and addition of pmd flavor Update populate_extra_pte() to return pointer to the pte_t for the specified address and add populate_extra_pmd() which only populates till the pmd and returns pointer to the pmd entry for the address. For 64bit, pud/pmd/pte fill functions are separated out from set_pte_vaddr[_pud]() and used for set_pte_vaddr[_pud]() and populate_extra_{pte\|pmd}(). Signed-off-by: Tejun Heo <tj@kernel.org>	2009-02-24 11:57:21 +09:00
Steven Rostedt	1623963097	ftrace, x86: make kernel text writable only for conversions Impact: keep kernel text read only Because dynamic ftrace converts the calls to mcount into and out of nops at run time, we needed to always keep the kernel text writable. But this defeats the point of CONFIG_DEBUG_RODATA. This patch converts the kernel code to writable before ftrace modifies the text, and converts it back to read only afterward. The kernel text is converted to read/write, stop_machine is called to modify the code, then the kernel text is converted back to read only. The original version used SYSTEM_STATE to determine when it was OK or not to change the code to rw or ro. Andrew Morton pointed out that using SYSTEM_STATE is a bad idea since there is no guarantee to what its state will actually be. Instead, I moved the check into the set_kernel_text_* functions themselves, and use a local variable to determine when it is OK to change the kernel text RW permissions. [ Update: Ingo Molnar suggested moving the prototypes to cacheflush.h ] Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-20 14:30:06 -05:00
Tejun Heo	11124411aa	x86: convert to the new dynamic percpu allocator Impact: use new dynamic allocator, unified access to static/dynamic percpu memory Convert to the new dynamic percpu allocator. * implement populate_extra_pte() for both 32 and 64 * update setup_per_cpu_areas() to use pcpu_setup_static() * define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr() * define config HAVE_DYNAMIC_PER_CPU_AREA Signed-off-by: Tejun Heo <tj@kernel.org>	2009-02-20 16:29:09 +09:00
Gary Hade	f5495506c3	x86: remove kernel_physical_mapping_init() from init section Impact: fix crash with memory hotplug enabled kernel_physical_mapping_init() is called during memory hotplug so it does not belong in the init section. If the kernel is built with CONFIG_DEBUG_SECTION_MISMATCH=y on the make command line, arch/x86/mm/init_64.c is compiled with the -fno-inline-functions-called-once gcc option defeating inlining of kernel_physical_mapping_init() within init_memory_mapping(). When kernel_physical_mapping_init() is not inlined it is placed in the .init.text section according to the __init in it's current declaration. A later call to kernel_physical_mapping_init() during a memory hotplug operation encounters an int3 trap because the .init.text section memory has been freed. This patch eliminates the crash caused by the int3 trap by moving the non-inlined kernel_physical_mapping_init() from .init.text to .meminit.text. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-20 00:31:43 +01:00
Arjan van de Ven	e8de1481fd	resource: allow MMIO exclusivity for device drivers Device drivers that use pci_request_regions() (and similar APIs) have a reasonable expectation that they are the only ones accessing their device. As part of the e1000e hunt, we were afraid that some userland (X or some bootsplash stuff) was mapping the MMIO region that the driver thought it had exclusively via /dev/mem or via various sysfs resource mappings. This patch adds the option for device drivers to cause their reserved regions to the "banned from /dev/mem use" list, so now both kernel memory and device-exclusive MMIO regions are banned. NOTE: This is only active when CONFIG_STRICT_DEVMEM is set. In addition to the config option, a kernel parameter iomem=relaxed is provided for the cases where developers want to diagnose, in the field, drivers issues from userspace. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-01-07 11:12:32 -08:00
Gary Hade	c04fc586c1	mm: show node to memory section relationship with symlinks in sysfs Show node to memory section relationship with symlinks in sysfs Add /sys/devices/system/node/nodeX/memoryY symlinks for all the memory sections located on nodeX. For example: /sys/devices/system/node/node1/memory135 -> ../../memory/memory135 indicates that memory section 135 resides on node1. Also revises documentation to cover this change as well as updating Documentation/ABI/testing/sysfs-devices-memory to include descriptions of memory hotremove files 'phys_device', 'phys_index', and 'state' that were previously not described there. In addition to it always being a good policy to provide users with the maximum possible amount of physical location information for resources that can be hot-added and/or hot-removed, the following are some (but likely not all) of the user benefits provided by this change. Immediate: - Provides information needed to determine the specific node on which a defective DIMM is located. This will reduce system downtime when the node or defective DIMM is swapped out. - Prevents unintended onlining of a memory section that was previously offlined due to a defective DIMM. This could happen during node hot-add when the user or node hot-add assist script onlines _all_ offlined sections due to user or script inability to identify the specific memory sections located on the hot-added node. The consequences of reintroducing the defective memory could be ugly. - Provides information needed to vary the amount and distribution of memory on specific nodes for testing or debugging purposes. Future: - Will provide information needed to identify the memory sections that need to be offlined prior to physical removal of a specific node. Symlink creation during boot was tested on 2-node x86_64, 2-node ppc64, and 2-node ia64 systems. Symlink creation during physical memory hot-add tested on a 2-node x86_64 system. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
Ingo Molnar	90accd6fab	Merge branch 'linus' into x86/memory-corruption-check	2008-11-20 09:03:38 +01:00
Gary Hade	fe8b868ecc	x86: remove debug code from arch_add_memory() Impact: remove incorrect WARN_ON(1) Gets rid of dmesg spam created during physical memory hot-add which will very likely confuse users. The change removes what appears to be debugging code which I assume was unintentionally included in: x86: arch/x86/mm/init_64.c printk fixes commit `10f22dde55` Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-29 09:29:22 +01:00
Yinghai Lu	f96f57d91c	x86: fix init_memory_mapping for [dc000000 - e0000000) - v2 Impact: change over-mapping to precise mapping, fix /proc/meminfo output v2: fix less than 1G ram system handling when gart aperture is 0xdc000000 - 0xe0000000 it return 0xc0000000 - 0xe0000000 that is not right. this patch fix that will get exact mapping on 256g sytem with that aperture after patch LBSuse:~ # cat /proc/meminfo MemTotal: 264742432 kB MemFree: 263920628 kB Buffers: 1416 kB Cached: 24468 kB ... DirectMap4k: 5760 kB DirectMap2M: 3205120 kB DirectMap1G: 265289728 kB it is consistent to LBSuse:~ # cat /sys/kernel/debug/kernel_page_tables .. ---[ Low Kernel Mapping ]--- 0xffff880000000000-0xffff880000200000 2M RW GLB x pte 0xffff880000200000-0xffff880040000000 1022M RW PSE GLB x pmd 0xffff880040000000-0xffff8800c0000000 2G RW PSE GLB NX pud 0xffff8800c0000000-0xffff8800d7e00000 382M RW PSE GLB NX pmd 0xffff8800d7e00000-0xffff8800d7fa0000 1664K RW GLB NX pte 0xffff8800d7fa0000-0xffff8800d8000000 384K pte 0xffff8800d8000000-0xffff8800dc000000 64M pmd 0xffff8800dc000000-0xffff8800e0000000 64M RW PSE GLB NX pmd 0xffff8800e0000000-0xffff880100000000 512M pmd 0xffff880100000000-0xffff880800000000 28G RW PSE GLB NX pud 0xffff880800000000-0xffff880824600000 582M RW PSE GLB NX pmd 0xffff880824600000-0xffff8808247f0000 1984K RW GLB NX pte 0xffff8808247f0000-0xffff880824800000 64K RW PCD GLB NX pte 0xffff880824800000-0xffff880840000000 440M RW PSE GLB NX pmd 0xffff880840000000-0xffff884000000000 223G RW PSE GLB NX pud 0xffff884000000000-0xffff884028000000 640M RW PSE GLB NX pmd 0xffff884028000000-0xffff884040000000 384M pmd 0xffff884040000000-0xffff888000000000 255G pud 0xffff888000000000-0xffffc20000000000 58880G pgd Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-28 20:54:47 +01:00
Yinghai Lu	11a6b0c933	x86: 64 bit print out absent pages num too so users are not confused with memhole causing big total ram we don't need to worry about 32 bit, because memhole is always above max_low_pfn. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-28 16:50:49 +01:00
Shaohua Li	60817c9b31	x86, memory hotplug: remove wrong -1 in calling init_memory_mapping() Impact: fix crash with memory hotplug Shuahua Li found: \| I just did some experiments on a desktop for memory hotplug and this bug \| triggered a crash in my test. \| \| Yinghai's suggestion also fixed the bug. We don't need to round it, just remove that extra -1 Signed-off-by: Yinghai <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-28 09:33:17 +01:00
Yinghai Lu	3afa39493d	x86: keep the /proc/meminfo page count correct Impact: get correct page count in /proc/meminfo found page count in /proc/meminfo is nor correct on 1G system in VirtualBox 2.0.4 # cat /proc/meminfo MemTotal: 1017508 kB MemFree: 822700 kB Buffers: 1456 kB Cached: 26632 kB SwapCached: 0 kB ... Hugepagesize: 2048 kB DirectMap4k: 4032 kB DirectMap2M: 18446744073709549568 kB with this patch get: ... DirectMap4k: 4032 kB DirectMap2M: 1044480 kB which is consistent to kernel_page_tables ---[ Low Kernel Mapping ]--- 0xffff880000000000-0xffff880000001000 4K RW PCD GLB x pte 0xffff880000001000-0xffff88000009f000 632K RW GLB x pte 0xffff88000009f000-0xffff8800000a0000 4K RW PCD GLB x pte 0xffff8800000a0000-0xffff880000200000 1408K RW GLB x pte 0xffff880000200000-0xffff88003fe00000 1020M RW PSE GLB x pmd 0xffff88003fe00000-0xffff88003fff0000 1984K RW GLB NX pte 0xffff88003fff0000-0xffff880040000000 64K pte 0xffff880040000000-0xffff888000000000 511G pud 0xffff888000000000-0xffffc20000000000 58880G pgd Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-27 18:55:26 +01:00
Arjan van de Ven	304e629bf4	x86: corruption check: run the corruption checks from a work queue Impact: change the implementation of the debug feature the periodic corruption checks are better off run from a work queue; there's nothing time critical about them and this way the amount of interrupt-context work is reduced. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-27 18:09:45 +01:00
Jan Beulich	5e72d9e485	x86-64: fix combining of regions in init_memory_mapping() When nr_range gets decremented, the same slot must be considered for coalescing with its new successor again. The issue is apparently pretty benign to native code, but surfaces as a boot time crash in our forward ported Xen tree (where the page table setup overall works differently than in native). Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-13 10:21:16 +02:00
Jeremy Fitzhardinge	a32ad46267	x86-64: don't check for map replacement The check prevents flags on mappings from being changed, which is not desireable. There's no need to check for replacing a mapping, and x86-32 does not do this check. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-13 10:21:05 +02:00
Jeremy Fitzhardinge	1494177942	x86: add early_memremap() early_ioremap() is also used to map normal memory when constructing the linear memory mapping. However, since we sometimes need to be able to distinguish between actual IO mappings and normal memory mappings, add a early_memremap() call, which maps with PAGE_KERNEL (as opposed to PAGE_KERNEL_IO for early_ioremap()), and use it when constructing pagetables. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-13 10:21:01 +02:00
Jeremy Fitzhardinge	be43d72835	x86: add _PAGE_IOMAP pte flag for IO mappings Use one of the software-defined PTE bits to indicate that a mapping is intended for an IO address. On native hardware this is irrelevent, since a physical address is a physical address. But in a virtual environment, physical addresses are also virtualized, so there needs to be some way to distinguish between pseudo-physical addresses and actual hardware addresses; _PAGE_IOMAP indicates this intent. By default, __supported_pte_mask masks out _PAGE_IOMAP, so it doesn't even appear in the final pagetable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-13 10:20:56 +02:00
Ingo Molnar	46eaa67020	x86: memory corruption check - cleanup Move the prototypes from the generic kernel.h header to the more appropriate include/asm-x86/bios_ebda.h header file. Also, remove the check from the power management code - this is a pure x86 matter for now. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-12 15:09:23 +02:00
Ingo Molnar	a9b9e81c91	Merge branch 'linus' into x86/memory-corruption-check	2008-10-12 15:05:39 +02:00
Ingo Molnar	0afe2db213	Merge branch 'x86/unify-cpu-detect' into x86-v28-for-linus-phase4-D Conflicts: arch/x86/kernel/cpu/common.c arch/x86/kernel/signal_64.c include/asm-x86/cpufeature.h	2008-10-11 20:23:20 +02:00
Ingo Molnar	3dd392a407	Merge branch 'linus' into x86/pat2 Conflicts: arch/x86/mm/init_64.c	2008-10-10 19:30:08 +02:00
Suresh Siddha	b27a43c1e9	x86, cpa: make the kernel physical mapping initialization a two pass sequence, fix Jeremy Fitzhardinge wrote: > I'd noticed that current tip/master hasn't been booting under Xen, and I > just got around to bisecting it down to this change. > > commit 065ae73c5462d42e9761afb76f2b52965ff45bd6 > Author: Suresh Siddha <suresh.b.siddha@intel.com> > > x86, cpa: make the kernel physical mapping initialization a two pass sequence > > This patch is causing Xen to fail various pagetable updates because it > ends up remapping pagetables to RW, which Xen explicitly prohibits (as > that would allow guests to make arbitrary changes to pagetables, rather > than have them mediated by the hypervisor). Instead of making init a two pass sequence, to satisfy the Intel's TLB Application note (developer.intel.com/design/processor/applnots/317080.pdf Section 6 page 26), we preserve the original page permissions when fragmenting the large mappings and don't touch the existing memory mapping (which satisfies Xen's requirements). Only open issue is: on a native linux kernel, we will go back to mapping the first 0-1GB kernel identity mapping as executable (because of the static mapping setup in head_64.S). We can fix this in a different patch if needed. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:21 +02:00
Suresh Siddha	28dd033f43	x86: fix pagetable init 64-bit breakage Fix _end alignment check - can trigger a crash if _end happens to be on a page boundary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:20 +02:00
Suresh Siddha	8311eb84bf	x86, cpa: remove cpa pool code Interrupt context no longer splits large page in cpa(). So we can do away with cpa memory pool code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:16 +02:00
Suresh Siddha	0b8fdcbcd2	x86, cpa: dont use large pages for kernel identity mapping with DEBUG_PAGEALLOC Don't use large pages for kernel identity mapping with DEBUG_PAGEALLOC. This will remove the need to split the large page for the allocated kernel page in the interrupt context. This will simplify cpa code(as we don't do the split any more from the interrupt context). cpa code simplication in the subsequent patches. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:14 +02:00
Suresh Siddha	a2699e477b	x86, cpa: make the kernel physical mapping initialization a two pass sequence In the first pass, kernel physical mapping will be setup using large or small pages but uses the same PTE attributes as that of the early PTE attributes setup by early boot code in head_[32\|64].S After flushing TLB's, we go through the second pass, which setups the direct mapped PTE's with the appropriate attributes (like NX, GLOBAL etc) which are runtime detectable. This two pass mechanism conforms to the TLB app note which says: "Software should not write to a paging-structure entry in a way that would change, for any linear address, both the page size and either the page frame or attributes." Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-10 19:29:13 +02:00
Hugh Dickins	bb577f980e	x86: add periodic corruption check Perodically check for corruption in low phusical memory. Don't bother checking at fault time, since it won't show anything useful. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-07 17:40:00 +02:00
Ingo Molnar	deed05b7c0	x86, init_64.c: cleanup Clean up comments. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-05 10:23:47 +02:00
Yinghai Lu	bd220a24a9	x86: move nonx_setup etc from common.c to init_64.c like 32 bit put it in init_32.c Signed-off-by: Yinghai <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-05 10:23:47 +02:00
Ingo Molnar	ea1c9de45e	Merge branch 'x86/urgent' into x86/cleanups	2008-08-25 11:10:42 +02:00
Jan Beulich	9482ac6e34	x86: fix two modpost warnings in mm/init_64.c early_io{re,un}map() are __init and hence can't be called from __meminit functions. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-22 07:51:54 +02:00
Jan Beulich	8ae3a5a8df	x86: fix 1:1 mapping init on 64-bit (memory hotplug case) While I don't have a hotplug capable system at hand, I think two issues need fixing: - pud_phys (in kernel_physical_ampping_init()) would remain uninitialized in the after_bootmem case - the locking done just around phys_pmd_{init,update}() would leave out pgd updates, and it was needlessly covering code portions that do allocations (perhaps using a more friendly gfp value in alloc_low_page() would then be possible) Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-22 07:51:53 +02:00
Ingo Molnar	7393423dd9	Merge branch 'linus' into x86/cleanups	2008-08-20 11:52:15 +02:00
Marcin Slusarz	8d6ea9674c	x86: fix section mismatch warning - spp_getpage() WARNING: vmlinux.o(.text+0x17a3e): Section mismatch in reference from the function set_pte_vaddr_pud() to the function .init.text:spp_getpage() The function set_pte_vaddr_pud() references the function __init spp_getpage(). This is often because set_pte_vaddr_pud lacks a __init annotation or the annotation of spp_getpage is wrong. spp_getpage is called from __init (__init_extra_mapping) and non __init (set_pte_vaddr_pud) functions, so it can't be __init. Unfortunately it calls alloc_bootmem_pages which is __init, but does it only when bootmem allocator is available (after_bootmem == 0). So annotate it accordingly. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com>	2008-08-15 19:16:06 +02:00
Hugh Dickins	a06de63000	x86: fix /proc/meminfo DirectMap Do we actually want these DirectMap lines in the x86 /proc/meminfo? I can see they're interesting to CPA developers and TLB optimizers, but they don't fit its usual "where has all my memory gone?" usage. If they are to stay, here are some fixes. 1. On x86_32 without PAE, they're not 2M but 4M pages: no need to mess with the internal enum, but show the right name to users. 2. Many machines can never show anything but 0 for DirectMap1G, so suppress that line unless direct_gbpages are really enabled. 3. The unit in /proc/meminfo is kB not number of pages: HugePages messed that up, but they're an example to regret not to follow. 4. Once we use kB, it's easy to see that 1GB has gone missing (which explains why CONFIG_CPA_DEBUG=y soon wraps DirectMap2M negative): because head_64.S's level2_ident_pgt entries were not counted. My fix is not ideal, but works for more and for less than 1G, and avoids interfering with early bootup pagetable contortions. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 15:27:55 +02:00
Ingo Molnar	6de9c70882	Merge branch 'linus' into x86/cleanups	2008-08-11 12:57:01 +02:00
Johannes Weiner	8dad322f54	x86: use generic show_mem() Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() - dirty pages, writeback pages, mapped pages, slab pages, pagetable pages, printed by show_free_areas() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:10 -07:00
Joerg Roedel	d86bb0dac7	x86: convert init_64.c from round_up to roundup Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 15:39:21 +02:00
Yinghai Lu	1f067167a8	x86: seperate memtest from init_64.c it's separate functionality that deserves its own file. This also prepares 32-bit memtest support. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 14:10:27 +02:00
Jack Steiner	e22146e610	x86: fix kernel_physical_mapping_init() for large x86 systems Fix bug in kernel_physical_mapping_init() that causes kernel page table to be built incorrectly for systems with greater than 512GB of memory. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 18:27:36 +02:00
Ingo Molnar	5806b81ac1	Merge branch 'auto-ftrace-next' into tracing/for-linus Conflicts: arch/x86/kernel/entry_32.S arch/x86/kernel/process_32.c arch/x86/kernel/process_64.c arch/x86/lib/Makefile include/asm-x86/irqflags.h kernel/Makefile kernel/sched.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-14 16:11:52 +02:00
Yinghai Lu	9958e810f8	x86: max_low_pfn_mapped fix, #3 optimization: try to merge the range with same page size in init_memory_mapping, to get the best possible linear mappings set up. thus when GBpages is not there, we could do 2M pages. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-13 08:19:16 +02:00
Yinghai Lu	f361a450bf	x86: introduce max_low_pfn_mapped for 64-bit when more than 4g memory is installed, don't map the big hole below 4g. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-11 10:24:04 +02:00
Ingo Molnar	bac0c9103b	Merge branch 'tracing/ftrace' into auto-ftrace-next	2008-07-10 11:43:00 +02:00
Yinghai Lu	7b16eb8930	x86: overmapped fix when 4K pages on tail, 64-bit fix phys_pmd_init to make sure not to return bigger value than end. also print out range split:1G/2M/4K in init_memory_mapping(). Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-10 08:46:40 +02:00
Yinghai Lu	c2e6d65bce	x86: not overmap more than the end of RAM in init_memory_mapping - 64bit handle head and tail that are not aligned to big pages (2MB/1GB boundary). with this patch, on system that support gbpages, change: last_map_addr: 1080000000 end: 1078000000 to: last_map_addr: 1078000000 end: 1078000000 Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 10:43:26 +02:00
Yinghai Lu	49c980df55	x86: fix vmemmap printout check Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: "Nick Piggin" <npiggin@suse.de> Cc: "Mark McLoughlin" <markmc@redhat.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: "Eduardo Habkost" <ehabkost@redhat.com> Cc: "Vegard Nossum" <vegard.nossum@gmail.com> Cc: "Stephen Tweedie" <sct@redhat.com> Cc: "Jeremy Fitzhardinge" <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 10:43:24 +02:00
Yinghai Lu	b50efd2a55	x86: introduce page_size_mask for 64bit prepare for overmapped patch also printout last_map_addr together with end Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 09:37:45 +02:00
Jack Steiner	3a9e189d69	x86: map UV chipset space - pagetable Add boot-time function for creating additional 2MB page table entries for mapping chipset specific cached/uncached ranges. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-09 07:43:23 +02:00
Jeremy Fitzhardinge	574977a2ed	x86_64/setup: unconditionally populate the pgd When allocating a new pud, unconditionally populate the pgd (why did we bother to create a new pud if we weren't going to populate it?). This will only happen if the pgd slot was empty, since any existing pud will be reused. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:27 +02:00
Jeremy Fitzhardinge	22b45144f6	x86/paravirt: groundwork for 64-bit Xen support, fix #2 Ingo Molnar wrote: > that fixed the build but now we've got a boot crash with this config: > > time.c: Detected 2010.304 MHz processor. > spurious 8259A interrupt: IRQ7. > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<0000000000000000>] > PGD 0 > Thread overran stack, or stack corrupted > Oops: 0010 [1] SMP > CPU 0 > I don't know if this will fix this bug, but it's definitely a bugfix. It was trashing random pages by overwriting them with pagetables... Don't trash a large pmd's data when mapping physical memory. This is a bugfix for "x86_64: adjust mapping of physical pagetables to work with Xen". Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:16:02 +02:00
Eduardo Habkost	0814e0bace	x86, 64-bit: split set_pte_vaddr() We will need to set a pte on l3_user_pgt. Extract set_pte_vaddr_pud() from set_pte_vaddr(), that will accept the l3 page table as parameter. This change should be a no-op for existing code. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:09 +02:00
Jeremy Fitzhardinge	7c934d3990	x86, 64-bit: create small vmemmap mappings if PSE not available If PSE is not available, then fall back to 4k page mappings for the vmemmap area. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:08 +02:00
Jeremy Fitzhardinge	4f9c11dd49	x86, 64-bit: adjust mapping of physical pagetables to work with Xen This makes a few of changes to the construction of the initial pagetables to work better with paravirt_ops/Xen. The main areas are: 1. Support non-PSE mapping of memory, since Xen doesn't currently allow 2M pages to be mapped in guests. 2. Make sure that the ioremap alias of all pages are dropped before attaching the new page to the pagetable. This avoids having writable aliases of pagetable pages. 3. Preserve existing pagetable entries, rather than overwriting. Its possible that a fair amount of pagetable has already been constructed, so reuse what's already in place rather than ignoring and overwriting it. The algorithm relies on the invariant that any page which is part of the kernel pagetable is itself mapped in the linear memory area. This way, it can avoid using ioremap on a pagetable page. The invariant holds because it maps memory from low to high addresses, and also allocates memory from low to high. Each allocated page can map at least 2M of address space, so the mapped area will always progress much faster than the allocated area. It relies on the early boot code mapping enough pages to get started. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:11:07 +02:00
Yinghai Lu	c987d12f84	x86: remove end_pfn in 64bit and use max_pfn directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:38 +02:00
Yinghai Lu	d86623a0d5	x86: add table_top check for alloc_low_page in 64 bit that range is from find_e820_area, so don't try to use end_pfn to see if out of boundary...use table_top instead to avoid possible strange result while cross the boundary... also change early_printk to printk, because init_memory_mapping is after early param parsing, and console=uart8250 already working at that time. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:36 +02:00
Yinghai Lu	1a0db38e5f	x86: get max_pfn_mapped in init_memory_mapping so don't shift that in the loop Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:35 +02:00
Jeremy Fitzhardinge	4583ed514e	x86, 64-bit: unify early_ioremap The 32-bit early_ioremap will work equally well for 64-bit, so just use it. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:28 +02:00
Jeremy Fitzhardinge	bb23e403e5	x86, 64-bit: use p??_populate() to attach pages to pagetable Use the _populate() functions to attach new pages to a pagetable, to make sure the right paravirt_ops calls get called. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 13:10:27 +02:00
Yinghai Lu	6a07a0edac	x86: fix compile warning in init_64.c len is long and ret is only for NUMA Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:23 +02:00
Yinghai Lu	346cafecde	x86: clean up min_low_pfn for 32bit we already had early_res support, so don't need to track min_low_pfn. keep it to 0 always. also use init_bootmem_node instead of init_bootmem, so don't touch min_low_pfn. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:21 +02:00
Yinghai Lu	1f75d7e32e	x86: introduce initmem_init for 64 bit Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:50:14 +02:00
Ingo Molnar	6236af82d8	Merge branch 'x86/fixmap' into x86/devel Conflicts: arch/x86/mm/init_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 12:24:29 +02:00
Ingo Molnar	3de352bbd8	Merge branch 'x86/mpparse' into x86/devel Conflicts: arch/x86/Kconfig arch/x86/kernel/io_apic_32.c arch/x86/kernel/setup_64.c arch/x86/mm/init_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 11:14:58 +02:00
Yinghai Lu	064d25f120	x86: merge setup_memory_map with e820 ... and kill e820_32/64.c and e820_32/64.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:38:25 +02:00
Yinghai Lu	d2dbf34332	x86: clean up reserve_bootmem_generic() and port it to 32-bit 1. add reserve_bootmem_generic for 32bit 2. change len to unsigned long 3. make early_res_to_bootmem to use it Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:36:17 +02:00
Bernhard Walle	8b2ef1d728	x86: add flags parameter to reserve_bootmem_generic() This patch adds a 'flags' parameter to reserve_bootmem_generic() like it already has been added in reserve_bootmem() with commit `72a7fe3967`. It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively change the behaviour. Since the change is x86-specific, I don't think it's necessary to add a new API for migration. There are only 4 users of that function. The change is necessary for the next patch, using reserve_bootmem_generic() for crashkernel reservation. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 10:34:54 +02:00
Ingo Molnar	6924d1ab8b	Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel	2008-07-08 09:16:56 +02:00
Andi Kleen	ce0c0e50f9	x86, generic: CPA add statistics about state of direct mapping v4 Add information about the mapping state of the direct mapping to /proc/meminfo. I chose /proc/meminfo because that is where all the other memory statistics are too and it is a generally useful metric even outside debugging situations. A lot of split kernel pages means the kernel will run slower. This way we can see how many large pages are really used for it and how many are split. Useful for general insight into the kernel. v2: Add hotplug locking to 64bit to plug a very obscure theoretical race. 32bit doesn't need it because it doesn't support hotadd for lowmem. Fix some typos v3: Rename dpages_cnt Add CONFIG ifdef for count update as requested by tglx Expand description v4: Fix stupid bugs added in v3 Move update_page_count to pageattr.c Signed-off-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-08 08:11:45 +02:00
Andrew Morton	27df66a406	arch/x86/mm/init_64.c: early_memtest(): fix types fix this warning: arch/x86/mm/init_64.c: In function 'early_memtest': arch/x86/mm/init_64.c:524: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-03 10:15:51 +02:00
Daniel J Blueman	0b1faeef5f	x86: section/warning fixes WARNING: arch/x86/mm/built-in.o(.text+0x3a1): Section mismatch in reference from the function set_pte_phys() to the function .init.text:spp_getpage() The function set_pte_phys() references the function __init spp_getpage(). This is often because set_pte_phys lacks a __init annotation or the annotation of spp_getpage is wrong. arch/x86/mm/init_64.c: In function 'early_memtest': arch/x86/mm/init_64.c:520: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Cc: "Linus Torvalds" <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-26 15:33:09 +02:00
Jeremy Fitzhardinge	d494a96125	x86: implement set_pte_vaddr Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:09:54 +02:00
Jeremy Fitzhardinge	7c7e6e07e2	x86: unify __set_fixmap In both cases, I went with the 32-bit behaviour. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-20 15:09:51 +02:00
Ingo Molnar	064a32d82c	Merge branch 'linus' into x86/memtest	2008-06-16 11:19:53 +02:00
Ingo Molnar	1791a78c0b	Merge branch 'linus' into x86/cleanups	2008-06-16 11:17:50 +02:00
Ingo Molnar	e765ee90da	Merge branch 'linus' into tracing/ftrace	2008-06-16 11:15:58 +02:00
Kevin Winchester	511631011d	x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest Changed the call to find_e820_area_size to pass u64 instead of unsigned long. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Hugh Dickins	2884f110d5	x86: fix bad pmd ffff810000207xxx(9090909090909090) OGAWA Hirofumi and Fede have reported rare pmd_ERROR messages: mm/memory.c:127: bad pmd ffff810000207xxx(9090909090909090). Initialization's cleanup_highmap was leaving alignment filler behind in the pmd for MODULES_VADDR: when vmalloc's guard page would occupy a new page table, it's not allocated, and then module unload's vfree hits the bad 9090 pmd entry left over. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-04 13:11:47 +02:00
Thomas Gleixner	11034d5597	x86: init64.c include initrd.h free_initrd_mem needs a prototype, which is in linux/initrd.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-05-25 08:58:27 +02:00
Pekka Paalanen	72b59d67f8	x86_64: fix kernel rodata NX setting Without CONFIG_DYNAMIC_FTRACE, mark_rodata_ro() would mark a wrong number of pages as no-execute. The bug was introduced in the patch "ftrace: dont write protect kernel text". The symptom was machine reboot after a CPU hotplug. Signed-off-by: Pekka Paalanen <pq@iki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 21:53:07 +02:00
Steven Rostedt	8f0f996e80	ftrace: dont write protect kernel text Dynamic ftrace cant work when the kernel has its text write protected. This patch keeps the kernel from being write protected when dynamic ftrace is in place. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 21:16:22 +02:00
Yinghai Lu	0327318445	x86_64: simplify the memtest parameter setting use CONFIG_MEMTEST only. if it is set, will have memtest=0 (disabled) need to have memtest=4 in command line to test more patterns. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-12 21:28:15 +02:00
Jeremy Fitzhardinge	180c06efce	hotplug-memory: make online_page() common All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Yinghai Lu	c2b91e2eec	x86_64/mm: check and print vmemmap allocation continuous On big systems with lots of memory, don't print out too much during bootup, and make it easy to find if it is continuous. on 256G 8 sockets system will get [ffffe20000000000-ffffe20002bfffff] PMD -> [ffff810001400000-ffff810003ffffff] on node 0 [ffffe2001c700000-ffffe2001c7fffff] potential offnode page_structs [ffffe20002c00000-ffffe2001c7fffff] PMD -> [ffff81000c000000-ffff8100255fffff] on node 0 [ffffe20038700000-ffffe200387fffff] potential offnode page_structs [ffffe2001c800000-ffffe200387fffff] PMD -> [ffff810820200000-ffff81083c1fffff] on node 1 [ffffe20040000000-ffffe2007fffffff] PUD ->ffff811027a00000 on node 2 [ffffe20038800000-ffffe2003fffffff] PMD -> [ffff811020200000-ffff8110279fffff] on node 2 [ffffe20054700000-ffffe200547fffff] potential offnode page_structs [ffffe20040000000-ffffe200547fffff] PMD -> [ffff811027c00000-ffff81103c3fffff] on node 2 [ffffe20070700000-ffffe200707fffff] potential offnode page_structs [ffffe20054800000-ffffe200707fffff] PMD -> [ffff811820200000-ffff81183c1fffff] on node 3 [ffffe20080000000-ffffe200bfffffff] PUD ->ffff81202fa00000 on node 4 [ffffe20070800000-ffffe2007fffffff] PMD -> [ffff812020200000-ffff81202f9fffff] on node 4 [ffffe2008c700000-ffffe2008c7fffff] potential offnode page_structs [ffffe20080000000-ffffe2008c7fffff] PMD -> [ffff81202fc00000-ffff81203c3fffff] on node 4 [ffffe200a8700000-ffffe200a87fffff] potential offnode page_structs [ffffe2008c800000-ffffe200a87fffff] PMD -> [ffff812820200000-ffff81283c1fffff] on node 5 [ffffe200c0000000-ffffe200ffffffff] PUD ->ffff813037a00000 on node 6 [ffffe200a8800000-ffffe200bfffffff] PMD -> [ffff813020200000-ffff8130379fffff] on node 6 [ffffe200c4700000-ffffe200c47fffff] potential offnode page_structs [ffffe200c0000000-ffffe200c47fffff] PMD -> [ffff813037c00000-ffff81303c3fffff] on node 6 [ffffe200c4800000-ffffe200e07fffff] PMD -> [ffff813820200000-ffff81383c1fffff] on node 7 instead of a very long print out... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-26 22:51:09 +02:00
Yinghai Lu	8b3cd09ed2	x86_64: make reserve_bootmem_generic() use new reserve_bootmem() "mm: make reserve_bootmem can crossed the nodes" provides new reserve_bootmem(), let reserve_bootmem_generic() use that. reserve_bootmem_generic() is used to reserve initramdisk, so this way we can make sure even when bootloader or kexec load ranges cross the node memory boundaries, reserve_bootmem still works. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-26 22:51:08 +02:00
Linus Torvalds	bf16ae2509	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-pat: generic: add ioremap_wc() interface wrapper /dev/mem: make promisc the default pat: cleanups x86: PAT use reserve free memtype in mmap of /dev/mem x86: PAT phys_mem_access_prot_allowed for dev/mem mmap x86: PAT avoid aliasing in /dev/mem read/write devmem: add range_is_allowed() check to mmap of /dev/mem x86: introduce /dev/mem restrictions with a config option	2008-04-25 12:48:08 -07:00
Ingo Molnar	70c9f590ff	x86: remove set_fixmap() warning set_fixmap()+clear_fixmap() is safe. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Ingo Molnar	82a355f5a2	x86: make __set_fixmap() non-init Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-25 19:54:07 +02:00
Arjan van de Ven	ae531c26c5	x86: introduce /dev/mem restrictions with a config option This patch introduces a restriction on /dev/mem: Only non-memory can be read or written unless the newly introduced config option is set. The X server needs access to /dev/mem for the PCI space, but it doesn't need access to memory; both the file permissions and SELinux permissions of /dev/mem just make X effectively super-super powerful. With the exception of the BIOS area, there's just no valid app that uses /dev/mem on actual memory. Other popular users of /dev/mem are rootkits and the like. (note: mmap access of memory via /dev/mem was already not allowed since a really long time) People who want to use /dev/mem for kernel debugging can enable the config option. The restrictions of this patch have been in the Fedora and RHEL kernels for at least 4 years without any problems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-24 23:40:47 +02:00
Glauber Costa	85c246ee16	x86: move definition to pci-dma.c Move dma_ops structure definition to pci-dma.c, where it belongs. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-19 19:19:57 +02:00
Andi Kleen	cc61503219	x86: account overlapped mappings in max_pfn_mapped When end_pfn is not aligned to 2MB (or 1GB) then the kernel might map more memory than end_pfn. Account this in max_pfn_mapped. Signed-off-by: Andi Kleen <ak@suse.de> Cc: andreas.herrmann3@amd.com Cc: mingo@elte.hu Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Thomas Gleixner	67794292c8	x86: replace the now useless max_pfn_mapped define Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:30 +02:00
Yinghai Lu	dcfe946520	x86: fix memtest print out Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:21 +02:00
Yinghai Lu	c64df70793	x86: memtest bootparam Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:21 +02:00
Yinghai Lu	272b9cad6e	x86: early memtest to find bad ram do simple memtest after init_memory_mapping use find_e820_area_size to find all ram range that is not reserved. and do some simple bits test to find some bad ram. if find some bad ram, use reserve_early to exclude that range. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Johannes Weiner	1415d160c7	x86: Remove redundant display of free swap space in show_mem() Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:58 +02:00
Mathieu Desnoyers	4e4eee0e01	x86: enhance DEBUG_RODATA support for hotplug and kprobes Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andi Kleen <andi@firstfloor.org> Cc: pageexec@freemail.hu Cc: akpm@linux-foundation.org CC: Andi Kleen <andi@firstfloor.org> CC: pageexec@freemail.hu CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:58 +02:00
Andi Kleen	ef9257668e	x86: do kernel direct mapping at boot using GB pages The AMD Fam10h CPUs support new Gigabyte page table entry for mapping 1GB at a time. Use this for the kernel direct mapping. Only done for 64bit because i386 does not support GB page tables. This only applies to the data portion of the direct mapping; the kernel text mapping stays with 2MB pages because the AMD Fam10h microarchitecture does not support GB ITLBs and AMD recommends against using GB mappings for code. Can be disabled with disable_gbpages on the kernel command line [ tglx@linutronix.de: simplify enable code ] [ Yinghai Lu <yinghai.lu@sun.com>: boot fix on 256 GB RAM ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	00d1c5e057	x86: add gbpages switches These new controls toggle experimental support for a new CPU feature, the straightforward extension of largepages from the pmd level to the pud level, which allows 1GB (kernel) TLBs instead of 2MB TLBs. Turn it off by default, as this code has not been tested well enough yet. Use the CONFIG_DIRECT_GBPAGES=y .config option or gbpages on the boot line can be used to enable it. If enabled in the .config then nogbpages boot option disables it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	88f3aec7af	x86: fix spontaneous reboot with allyesconfig bzImage recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (4010241024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 `8646224` 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00
Yinghai Lu	3b57bc461f	x86: remove double-checking empty zero pages debug so far no one complained about that. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:55 +01:00
Thomas Gleixner	31eedd823c	x86: zap invalid and unused pmds in early boot The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting from __START_KERNEL_map. The kernel itself only needs _text to _end mapped in the high alias. On relocatible kernels the ASM setup code adjusts the compile time created high mappings to the relocation. This creates invalid pmd entries for negative offsets: 0xffffffff80000000 -> pmd entry: ffffffffff2001e3 It points outside of the physical address space and is marked present. This starts at the virtual address __START_KERNEL_map and goes up to the point where the first valid physical address (0x0) is mapped. Zap the mappings before _text and after _end right away in early boot. This removes also the invalid entries. Furthermore it simplifies the range check for high aliases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Harvey Harrison	7bfeab9af9	x86: include proper prototypes for rodata_test extern should not appear in C files. Also, the definitions do not match the prototype currently, not sure what way you want to go with this, I've switched the prototype to return int, but I can see going to the void return as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Thomas Gleixner	76ebd0548d	x86: introduce page pool in cpa DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	bfc734b246	x86: avoid unused variable warning in mm/init_64.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Bernhard Walle	72a7fe3967	Introduce flags for reserve_bootmem() This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. This patch: Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts. Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: <linux-arch@vger.kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:25 -08:00
Arjan van de Ven	984bb80d94	x86: mark the .rodata section also NX The .rodata section shouldn't just be read-only, but also non-executable. This is free since we've broken up the 2MB page already anyway. also update test_nx to check for this. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Andi Kleen	d4f71f7969	x86: switch direct mapping setup over to set_pte Use set_pte() for setting up the 2MB pages in the direct mapping. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	bde1965ce8	x86: remove now unused clear_kernel_mapping Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Andi Kleen	31422c51e0	x86: rename LARGE_PAGE_SIZE to PMD_PAGE_SIZE Fix up all users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Yinghai Lu	24a5da73f4	x86_64: make bootmap_start page align v6 boot oopses when a system has 64 or 128 GB of RAM installed: Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711() BUG: unable to handle kernel NULL pointer dereference at 000000000000005f IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f PGD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6 RIP: 0010:[<ffffffff802bfe55>] [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP: 0000:ffff810824c57e60 EFLAGS: 00010246 RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08 RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460 RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80 R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee FS: 0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000) Stack: ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000 00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff 0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5 Call Trace: [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34 [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711 [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1 [<ffffffff8020ccf8>] ? child_rip+0xa/0x12 [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1 [<ffffffff8020ccee>] ? child_rip+0x0/0x12 Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0 RIP [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP <ffff810824c57e60> CR2: 000000000000005f ---[ end trace 02c2d78def82877a ]--- Kernel panic - not syncing: Attempted to kill init! it turns out some variables near end of bss are corrupted already. in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end and setup_node_bootmem() will use that page 0xd40000 for bootmap Bootmem setup node 0 0000000000000000-0000000828000000 NODE_DATA [000000000008a485 - 0000000000091484] bootmap [0000000000d406f4 - 0000000000e456f3] pages 105 Bootmem setup node 1 0000000828000000-0000001028000000 NODE_DATA [0000000828000000 - 0000000828006fff] bootmap [0000000828007000 - 0000000828106fff] pages 100 Bootmem setup node 2 0000001028000000-0000001828000000 NODE_DATA [0000001028000000 - 0000001028006fff] bootmap [0000001028007000 - 0000001028106fff] pages 100 Bootmem setup node 3 0000001828000000-0000002028000000 NODE_DATA [0000001828000000 - 0000001828006fff] bootmap [0000001828007000 - 0000001828106fff] pages 100 setup_node_bootmem() makes NODE_DATA cacheline aligned, and bootmap is page-aligned. the patch updates find_e820_area() to make sure we can meet the alignment constraints. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Yinghai Lu	25eff8d4cd	x86_64: add debug name for early_res helps debugging problems in this rather murky area of code. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Yinghai Lu	9198715763	x86: fix overlap between pagetable with bss section one early crash on one 8 node 256g machine: Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009bc00 (usable) BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable) BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data) BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS) BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000004020000000 (usable) Early serial console at I/O port 0x3f8 (options '115200n8') console [uart0] enabled end_pfn_map = 67239936 Kernel panic - not syncing: Duplicated early reservation d40000-e42000 Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3 Call Trace: [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10 [<ffffffff80221657>] clear_local_APIC+0x5/0xcf [<ffffffff80221726>] disable_local_APIC+0x5/0x17 [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c [<ffffffff80235293>] panic+0x94/0x13e [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34 [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e [<ffffffff80b978be>] start_kernel+0x6f/0x2c7 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3 it turns out there is overlap between pgtable and bss... in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end need to round up table_start to PAGE_SIZE. also make the panic more informative. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Ingo Molnar	10f22dde55	x86: arch/x86/mm/init_64.c printk fixes Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Thomas Gleixner	14a62c34b1	x86: unify ioremap Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	86f03989d9	x86: cpa: fix the self-test Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	ee01f1122c	x86: init memory debugging debug incorrect/late access to init memory, by permanently unmapping the init memory ranges. Depends on CONFIG_DEBUG_PAGEALLOC=y. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Arjan van de Ven	1a4872529e	x86: move misplaced rodata check call It looks like a mismerge put the rodata self-check in the wrong spot; move it to the right place after marking the .rodata section read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Arjan van de Ven	edeed30589	x86: add testcases for RODATA and NX protections/attributes Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which lead to 1 page less being marked read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Arjan van de Ven	3c1df68b84	x86: make sure initmem is writable When we free initmem, various rodata and CPA checks may have left memory read only.. this patch ensures that the memory is writable before we free it. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d7c8f21a8c	x86: cpa: move flush to cpa The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Ingo Molnar	f62d0f008e	x86: cpa: set_memory_notpresent() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	6d238cc4dc	x86: convert CPA users to the new set_page_ API This patch converts various users of change_page_attr() to the new, more intent driven set_page_/set_memory_ API set. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Andi Kleen	1a2b441231	x86: fix early_ioremap() on 64-bit Fix early_ioremap() on x86-64 I had ACPI failures on several machines since a few days. Symptom was NUMA nodes not getting detected or worse cores not getting detected. They all came from ACPI not being able to read various of its tables. I finally bisected it down to Jeremy's "put _PAGE_GLOBAL into PAGE_KERNEL" change. With that the fix was fairly obvious. The problem was that early_ioremap() didn't use a "_all" flush that would affect the global PTEs too. So with global bits getting used everywhere now an early_ioremap would not actually flush a mapping if something else was mapped previously on that slot (which can happen with early_iounmap inbetween) This patch changes all flushes in init_64.c to be __flush_tlb_all() and fixes the problem here. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:54 +01:00
Andi Kleen	0c42f39276	c_p_a(): do a simple self test at boot When CONFIG_DEBUG_RODATA is enabled undo the ro mapping and redo it again. This gives some simple testing for change_page_attr(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:42 +01:00
Andi Kleen	7517527891	x86: replace hard coded reservations in 64-bit early boot code with dynamic table On x86-64 there are several memory allocations before bootmem. To avoid them stomping on each other they used to be all hard coded in bad_area(). Replace this with an array that is filled as needed. This cleans up the code considerably and allows to expand its use. Cc: peterz@infradead.org Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:17 +01:00
Ingo Molnar	f2633105cd	x86: debug: double-check the empty zero page temporary debugging - remove before this hits v2.6.25. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:36 +01:00
Yinghai Lu	48ddb154cf	x86: not clear empty_zero_page again empty_zero_page is in .bss section, and it is cleared in clear_bss by x86_64_start_kernel(). So don't clear that again in mem_init Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:36 +01:00
Jeremy Fitzhardinge	27ec161f93	x86: kill mk_pte_huge It only has a single use, which can be trivially replaced. Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:09 +01:00
Joerg Roedel	929fd58913	x86: some whitespace cleanups in paging code This patch does some whitespace cleanups in the paging code to fix some checkpatch.pl warnings of my formerly merged cleanup patches. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:08 +01:00
Joerg Roedel	40842bf50e	x86: use __PAGE_KERNEL* instead of _KERNPG_TABLE This minor cleanup replaces _KERNPG_TABLE with the __PAGE_KERNEL* for 2MB PTEs in the 64-bit memory initialization code. The __PAGE_KERNEL* defines are more appropriate for PTEs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:02 +01:00
Christoph Lameter	b263295dbf	x86: 64-bit, make sparsemem vmemmap the only memory model Use sparsemem as the only memory model for UP, SMP and NUMA. Measurements indicate that DISCONTIGMEM has a higher overhead than sparsemem. And FLATMEMs benefits are minimal. So I think its best to simply standardize on sparsemem. Results of page allocator tests (test can be had via git from slab git tree branch tests) Measurements in cycle counts. 1000 allocations were performed and then the average cycle count was calculated. Order FlatMem Discontig SparseMem 0 639 665 641 1 567 647 593 2 679 774 692 3 763 967 781 4 961 1501 962 5 1356 2344 1392 6 2224 3982 2336 7 4869 7225 5074 8 12500 14048 12732 9 27926 28223 28165 10 58578 58714 58682 (Note that FlatMem is an SMP config and the rest NUMA configurations) Memory use: SMP Sparsemem ------------- Kernel size: text data bss dec hex filename 3849268 397739 1264856 5511863 541ab7 vmlinux total used free shared buffers cached Mem: 8242252 41164 8201088 0 352 11512 -/+ buffers/cache: 29300 8212952 Swap: 9775512 0 9775512 SMP Flatmem ----------- Kernel size: text data bss dec hex filename 3844612 397739 1264536 5506887 540747 vmlinux So 4.5k growth in text size vs. FLATMEM. total used free shared buffers cached Mem: 8244052 40544 8203508 0 352 11484 -/+ buffers/cache: 28708 8215344 2k growth in overall memory use after boot. NUMA discontig: text data bss dec hex filename 3888124 470659 1276504 5635287 55fcd7 vmlinux total used free shared buffers cached Mem: 8256256 56908 8199348 0 352 11496 -/+ buffers/cache: 45060 8211196 Swap: 9775512 0 9775512 NUMA sparse: text data bss dec hex filename 3896428 470659 1276824 5643911 561e87 vmlinux 8k text growth. Given that we fully inline virt_to_page and friends now that is rather good. total used free shared buffers cached Mem: 8264720 57240 8207480 0 352 11516 -/+ buffers/cache: 45372 8219348 Swap: 9775512 0 9775512 The total available memory is increased by 8k. This patch makes sparsemem the default and removes discontig and flatmem support from x86. [ akpm@linux-foundation.org: allnoconfig build fix ] Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:47 +01:00
Thomas Gleixner	aaa64e04f9	x86: move numa related declarations More stuff shuffeled to the correct place Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:17 +01:00
Thomas Gleixner	718fc13b46	x86: move debug related declarations to kdebug.h Move them and fixup some users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:17 +01:00
KAMEZAWA Hiroyuki	b6fd6ecb83	memory hotplug x86_64: fix section mismatch in init_memory_mapping() Changes __meminit to __init_refok. WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to .init.text:find_e820_area (between 'init_memory_mapping' and 'arch_add_memory') Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Linus Torvalds	6a22c57b8d	Revert "x86_64: allocate sparsemem memmap above 4G" This reverts commit `2e1c49db4c`. First off, testing in Fedora has shown it to cause boot failures, bisected down by Martin Ebourne, and reported by Dave Jobes. So the commit will likely be reverted in the 2.6.23 stable kernels. Secondly, in the 2.6.24 model, x86-64 has now grown support for SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the bug is not visible any more, it's become invisible due to the code just being irrelevant and no longer enabled on the only architecture that this ever affected. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Martin Ebourne <fedora@ebourne.me.uk> Cc: Zou Nan hai <nanhai.zou@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 14:05:37 -07:00
KAMEZAWA Hiroyuki	48e94196a5	fix memory hot remove not configured case. Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess. This patch cleans up them. This is against 2.6.23-rc6-mm1. - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case. - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(), which returns -EINVAL. - removed remove_pages() only used in powerpc. - removed no-op remove_memory() in i386, sh, sparc64, x86_64. - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it to return -EINVAL. Note: Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other archs if there are requirements and testers. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:43:02 -07:00
Christoph Lameter	0889eba5b3	x86_64: SPARSEMEM_VMEMMAP 2M page size support x86_64 uses 2M page table entries to map its 1-1 kernel space. We also implement the virtual memmap using 2M page table entries. So there is no additional runtime overhead over FLATMEM, initialisation is slightly more complex. As FLATMEM still references memory to obtain the mem_map pointer and SPARSEMEM_VMEMMAP uses a compile time constant, SPARSEMEM_VMEMMAP should be superior. With this SPARSEMEM becomes the most efficient way of handling virt_to_page, pfn_to_page and friends for UP, SMP and NUMA on x86_64. [apw@shadowen.org: code resplit, style fixups] [apw@shadowen.org: vmemmap x86_64: ensure end of section memmap is initialised] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <ak@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:51 -07:00
Thomas Gleixner	95119fbd87	x86_64: move mm Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-11 11:17:18 +02:00

... 2 3 4 5 6

292 Commits