linux

Author	SHA1	Message	Date
venkatesh.pallipadi@intel.com	d7677d4034	x86: PAT use reserve free memtype in ioremap and iounmap Use reserve_memtype and free_memtype interfaces in ioremap/iounmap to avoid aliasing. If there is an existing alias for the region, inherit the memory type from the alias. If there are conflicting aliases for the entire region, then fail ioremap. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	3a96ce8cac	x86: PAT make ioremap_change_attr non-static Make ioremap_change_attr() non-static and use prot_val in place of ioremap_mode. This interface is used in subsequent PAT patches. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Ingo Molnar	55c626820a	x86: revert ucminus change Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
venkatesh.pallipadi@intel.com	2e5d9c857d	x86: PAT infrastructure patch Sets up pat_init() infrastructure. PAT MSR has following setting (bits PAT|PCD|PWT): 000 WB _PAGE_CACHE_WB; 001 WC _PAGE_CACHE_WC; 010 UC- _PAGE_CACHE_UC_MINUS; 011 UC _PAGE_CACHE_UC. We are effectively changing WT from boot time setting to WC. UC_MINUS is used to provide backward compatibility to existing /dev/mem users(X). reserve_memtype and free_memtype are new interfaces for maintaining alias-free mapping. It is currently implemented in a simple way with a linked list and not optimized. reserve and free tracks the effective memory type, as a result of PAT and MTRR setting rather than what is actually requested in PAT. pat_init piggy backs on mtrr_init as the rules for setting both pat and mtrr are same. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Yinghai Lu	272b9cad6e	x86: early memtest to find bad ram do simple memtest after init_memory_mapping use find_e820_area_size to find all ram range that is not reserved. and do some simple bits test to find some bad ram. if find some bad ram, use reserve_early to exclude that range. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:19 +02:00
Alexey Starikovskiy	ce3fe6b2bf	x86: use get_bios_ebda in mpparse_64.c Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:41:05 +02:00
Johannes Weiner	1415d160c7	x86: Remove redundant display of free swap space in show_mem() Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:58 +02:00
Yinghai Lu	9a79cf9c1a	x86: sort address_markers for dump_pagetables otherwise Vmemmap and High Kernel Mapping string is not showing up. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:58 +02:00
Mathieu Desnoyers	4e4eee0e01	x86: enhance DEBUG_RODATA support for hotplug and kprobes Standardize DEBUG_RODATA, removing special cases for hotplug and kprobes. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Andi Kleen <andi@firstfloor.org> Cc: pageexec@freemail.hu Cc: akpm@linux-foundation.org CC: Andi Kleen <andi@firstfloor.org> CC: pageexec@freemail.hu CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:58 +02:00
Ingo Molnar	9fc34113f6	x86: debug pmd_bad() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:52 +02:00
Ingo Molnar	ba748d221e	x86: warn about RAM pages in ioremap() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:52 +02:00
Ingo Molnar	bdd3cee2e4	x86: ioremap(), extend check to all RAM pages Suggested by Jan Beulich. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jan Beulich <jbeulich@novell.com>	2008-04-17 17:40:51 +02:00
Thomas Gleixner	e3100c82ab	x86: check physical address range in ioremap Roland Dreier reported in http://lkml.org/lkml/2008/2/27/194 [ 8425.915139] BUG: unable to handle kernel paging request at ffffc20001a0a000 [ 8425.919087] IP: [<ffffffff8021dacc>] clflush_cache_range+0xc/0x25 [ 8425.919087] PGD 1bf80e067 PUD 1bf80f067 PMD 1bb497067 PTE 80000047000ee17b This is on a Intel machine with 36bit physical address space. The PTE entry references 47000ee000, which is outside of it. Add a check for the physical address space and warn/printk about the stupid caller. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:51 +02:00
Ian Campbell	c92a7a54d6	x86: reduce arch/x86/mm/ioremap.o size > Don't we have a special section for page-aligned data so it doesn't > waste most of two pages? We have .bss.page_aligned and it seems appropriate to use it. text data bss dec hex filename - 3388 8236 4 11628 2d6c ../build-32/arch/x86/mm/ioremap.o + 3388 48 4100 7536 1d70 ../build-32/arch/x86/mm/ioremap.o Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: Matt Mackall <mpm@selenic.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Yinghai Lu	04adf11435	x86: remove never used nodenumer in pda Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Yinghai Lu	beafe91f1c	x86: get apic_id later in acpi_numa_processor_affinity_init we don't need get that so early. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:47 +02:00
Andi Kleen	ef9257668e	x86: do kernel direct mapping at boot using GB pages The AMD Fam10h CPUs support new Gigabyte page table entry for mapping 1GB at a time. Use this for the kernel direct mapping. Only done for 64bit because i386 does not support GB page tables. This only applies to the data portion of the direct mapping; the kernel text mapping stays with 2MB pages because the AMD Fam10h microarchitecture does not support GB ITLBs and AMD recommends against using GB mappings for code. Can be disabled with disable_gbpages on the kernel command line [ tglx@linutronix.de: simplify enable code ] [ Yinghai Lu <yinghai.lu@sun.com>: boot fix on 256 GB RAM ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	00d1c5e057	x86: add gbpages switches These new controls toggle experimental support for a new CPU feature, the straightforward extension of largepages from the pmd level to the pud level, which allows 1GB (kernel) TLBs instead of 2MB TLBs. Turn it off by default, as this code has not been tested well enough yet. Use the CONFIG_DIRECT_GBPAGES=y .config option or gbpages on the boot line can be used to enable it. If enabled in the .config then nogbpages boot option disables it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
H. Peter Anvin	fe770bf031	x86: clean up the page table dumper and add 32-bit support Clean up the page table dumper (fix boundary conditions, table driven address ranges, some formatting changes since it is no longer using the kernel log but a separate virtual file), and generalize to 32 bits. [ mingo@elte.hu: x86: fix the pagetable dumper ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Arjan van de Ven	926e5392ba	x86: add code to dump the (kernel) page tables for visual inspection by kernel developers This patch adds code to the kernel to have an (optional) /proc/kernel_page_tables debug file that basically dumps the kernel pagetables; this allows us kernel developers to verify that nothing fishy is going on and that the various mappings are set up correctly. This was quite useful in finding various change_page_attr() bugs, and is very likely to be useful in the future as well. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: mingo@elte.hu Cc: tglx@tglx.de Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
H. Peter Anvin	2596e0fae0	x86: unify arch/x86/mm/Makefile Unify arch/x86/mm/Makefile between 32 and 64 bits. All configuration variables that are protected by Kconfig constraints have been put in the common part of the Makefile; however, the NUMA files are totally different between 32 and 64 bits and are handled via an ifdef. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Thomas Gleixner	ee7ae7a198	x86: add debug info to DEBUG_PAGEALLOC Add debug information for DEBUG_PAGEALLOC to get some statistics about the pool usage and split status. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	b4e0409a36	x86: check vmlinux limits, 64-bit these build-time and link-time checks would have prevented the vmlinux size regression. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:45 +02:00
Andrew Morton	9c312058b2	Avoid false positive warnings in kmap_atomic_prot() with DEBUG_HIGHMEM I believe http://bugzilla.kernel.org/show_bug.cgi?id=10318 is a false positive. There's no way in which networking will be using highmem pages here, so it won't be taking the KM_USER0 kmap slot, so there's no point in performing these checks. Cc: Pawel Staszewski <pstaszewski@artcom.pl> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Really sad. We lose almost all real-life coverage of the debug tests with this patch. Now it will only report problems for the cases where people actually end up using a HIGHMEM page, not when they just _might_ use one. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-28 13:08:14 -07:00
Ingo Molnar	3085354de6	x86: prefetch fix #2 Linus noticed a second bug and an uncleanliness: - we'd return on any instruction fetch fault - we'd use both the value of 16 and the PF_INSTR symbol which are the same and make no sense the cleanup nicely unifies this piece of logic. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 22:00:16 +01:00
Christoph Lameter	25e59881f1	x86: stricter check in follow_huge_addr() The first page of the compound page is determined in follow_huge_addr() but then PageCompound() only checks if the page is part of a compound page. PageHead() allows checking if this is indeed the first page of the compound. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:45 +01:00
Ingo Molnar	bc713dcf35	x86: fix prefetch workaround some early Athlon XP's and Opterons generate bogus faults on prefetch instructions. The workaround for this regressed over .24 - reinstate it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:44 +01:00
Suresh Siddha	d546b67a94	x86: fix performance drop for glx fix the 3D performance drop reported at: http://bugzilla.kernel.org/show_bug.cgi?id=10328 fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with WC attribute. Recent changes in page attribute code made both ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This breaks the graphics performance, as the effective memory type is UC instead of expected WC. The correct way to fix this is to add ioremap_wc() (which uses UC- in the absence of PAT kernel support and WC with PAT) and change all the fb drivers to use this new ioremap_wc() API. We can take this correct and longer route for post 2.6.25. For now, revert back to the UC- behavior for ioremap/ioremap_nocache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Yinghai Lu	76c324182b	x86: fix trim mtrr not to setup_memory two times we could call find_max_pfn() directly instead of setup_memory() to get max_pfn needed for mtrr trimming. otherwise setup_memory() is called two times... that is duplicated... [ mingo@elte.hu: both Thomas and me simulated a double call to setup_bootmem_allocator() and can confirm that it is a real bug which can hang in certain configs. It's not been reported yet but that is probably due to the relatively scarce nature of MTRR-trimming systems. ] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Linus Torvalds	b9e76a0074	x86-32: Pass the full resource data to ioremap() It appears that 64-bit PCI resources cannot possibly ever have worked on x86-32 even when the RESOURCES_64BIT config option was set, because any driver that tried to [pci_]ioremap() the resource would have been unable to do so because the high 32 bits would have been silently dropped on the floor by the ioremap() routines that only used "unsigned long". Change them to use "resource_size_t" instead, which properly encodes the whole 64-bit resource data if RESOURCES_64BIT is enabled. Acked-by: H. Peter Anvin <hpa@kernel.org> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-24 11:22:39 -07:00
Yinghai Lu	37bff62e98	x86_64: free_bootmem should take phys so use nodedata_phys directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Thomas Gleixner	985a34bd75	x86: remove quicklists quicklists cause a serious memory leak on 32-bit x86, as documented at: http://bugzilla.kernel.org/show_bug.cgi?id=9991 the reason is that the quicklist pool is a special-purpose cache that grows out of proportion. It is not accounted for anywhere and users have no way to even realize that it's the quicklists that are causing RAM usage spikes. It was supposed to be a relatively small pool, but as demonstrated by KOSAKI Motohiro, they can grow as large as: Quicklists: 1194304 kB given how much trouble this code has caused historically, and given that Andrew objected to its introduction on x86 (years ago), the best option at this point is to remove them. [ any performance benefits of caching constructed pgds should be implemented in a more generic way (possibly within the page allocator), while still allowing constructed pages to be allocated by other workloads. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:55 +01:00
Ingo Molnar	9a46d7e5b6	x86: ioremap, remove WARN_ON() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:54 +01:00
Yinghai Lu	7c9e92b6cd	x86: not set node to cpu_to_node if the node is not online resolve boot problem reported by Mel Gorman: http://lkml.org/lkml/2008/2/13/404 init_cpu_to_node will use cpu->apic (from MADT or mptable) and apic->node(from SRAT or AMD config space with k8_bus_64.c) to have cpu->node mapping, and later identify_cpu will overwrite them again...(with nearby_node...) this patch checks if the node is online, otherwise it will not update cpu_node map. so keep cpu_node map to online node before identify_cpu..., to prevent possible error. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Rafael J. Wysocki	9b5cf48b06	x86: revert "x86: CPA: avoid split of alias mappings" Revert: commit 8be8f54bae Author: Thomas Gleixner <tglx@linutronix.de> Date: Sat Feb 23 20:43:21 2008 +0100 x86: CPA: avoid split of alias mappings because it clearly mishandles the case when __change_page_attr(), called from __change_page_attr_set_clr(), changes cpa->processed to 1 and cpa_process_alias(cpa) is executed right after that. This crashes my x86-64 test box early in the boot process (ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-03 14:18:27 +01:00
Thomas Gleixner	8be8f54bae	x86: CPA: avoid split of alias mappings avoid over-eager large page splitup. When the target area needs to be split or is split already (ioremap) then the current code enforces the split of large mappings in the alias regions even if we could avoid it. Use a separate variable processed in the cpa_data structure to carry the number of pages which have been processed instead of reusing the numpages variable. This keeps numpages intact and gives the alias code a chance to keep large mappings intact. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	b16bf712f4	x86: fix leak on ioremap_page_range() failure Jan Beulich noticed during code review that if a driver's ioremap() fails (say due to -ENOMEM) then we might leak the struct vm_area. Free it properly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	88f3aec7af	x86: fix spontaneous reboot with allyesconfig bzImage recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (40*1024*1024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 8646224 42572719 2899baf vmlinux 40 MB is 41943040 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00
Yinghai Lu	3b57bc461f	x86: remove double-checking empty zero pages debug so far no one complained about that. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:55 +01:00
Ingo Molnar	92cb54a37a	x86: make DEBUG_PAGEALLOC and CPA more robust Use PF_MEMALLOC to prevent recursive calls in the DEBUG_PAGEALLOC case. This makes the code simpler and more robust against allocation failures. This fixes the following fallback to non-mmconfig: http://lkml.org/lkml/2008/2/20/551 http://bugzilla.kernel.org/show_bug.cgi?id=10083 Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-26 12:55:50 +01:00
Rafael J. Wysocki	8a235efad5	Hibernation: Handle DEBUG_PAGEALLOC on x86 Make hibernation work with CONFIG_DEBUG_PAGEALLOC set on x86, by checking if the pages to be copied are marked as present in the kernel mapping and temporarily marking them as present if that's not the case. No functional modifications are introduced if CONFIG_DEBUG_PAGEALLOC is unset. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2008-02-21 02:15:28 -05:00
Linus Torvalds	5d9c4a7de6	Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp: fix missing casts that produced a warning. agp: add support for 662/671 to agp driver fix historic ioremap() abuse in AGP agp/sis: Suspend support for SiS AGP agp/sis: Clear bit 2 from aperture size byte as well	2008-02-19 18:29:57 -08:00
Arjan van de Ven	156fbc3fbe	x86: fix page_is_ram() thinko page_is_ram() has a special case for the 640k-1M bios area, however due to a thinko the special case checks the e820 table entry and not the memory the user has asked for. This patch fixes the bug. [ mingo@elte.hu: this too is better solved in the e820 space, but those fixes are too intrusive for v2.6.25. ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:34 +01:00
Arjan van de Ven	d8a9e6a51e	x86: fix WARN_ON() message: teach page_is_ram() about the special 4Kb bios data page This patch teaches page_is_ram() about the fact that the first 4Kb of memory are special on x86, even though the E820 table normally doesn't exclude it. This fixes the WARN_ON() reported by Laurent Riffard who was also very helpful in diagnosing the issue. [ mingo@elte.hu: we are working on doing this properly in the e820 space, but for 2.6.25 this is the better fix. ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:34 +01:00
Sam Ravnborg	d01b9ad56e	x86: fix section mismatch in srat_64.c:reserve_hotadd reserve_hotadd() are only used by __init acpi_numa_memory_affinity_init(). Annotate reserve_hotadd() with __init is the trivial fix. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:31 +01:00
Andi Kleen	8e31c2ac11	x86: CPA: remove BUG_ON for LRU/Compound pages New implementation does not use lru for anything so there is no need to reject pages that are in the LRU. Similar for compound pages (which were checked because they also use page->lru) [ tglx@linutronix.de: removed unused variable ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:29 +01:00
Arjan van dev Ven	fcea424d31	fix historic ioremap() abuse in AGP Several AGP drivers right now use ioremap_nocache() on kernel ram in order to turn a page of regular memory uncached. There are two problems with this: 1) This is a total nightmare for the ioremap() implementation to keep various mappings of the same page coherent. 2) It's a total nightmare for the AGP code since it adds a ton of complexity in terms of keeping track of 2 different pointers to the same thing, in terms of error handling etc etc. This patch fixes this by making the AGP drivers use the new set_memory_XX APIs instead. Note: amd-k7-agp.c is built on Alpha too, and generic.c is built on ia64 as well, which do not yet have the set_memory_*() APIs, so for them some we have a few ugly #ifdefs - hopefully they'll be fixed soon. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-02-19 14:46:39 +10:00
Yinghai Lu	b7ad149d62	x86: reenable support for system without on node0 One system doesn't have RAM for node0 installed. SRAT: PXM 0 -> APIC 0 -> Node 0 SRAT: PXM 0 -> APIC 1 -> Node 0 SRAT: PXM 1 -> APIC 2 -> Node 1 SRAT: PXM 1 -> APIC 3 -> Node 1 SRAT: Node 1 PXM 1 0-a0000 SRAT: Node 1 PXM 1 0-dd000000 SRAT: Node 1 PXM 1 0-123000000 ACPI: SLIT: nodes = 2 10 13 13 10 mapped APIC to ffffffffff5fb000 ( fee00000) Bootmem setup node 1 0000000000000000-0000000123000000 NODE_DATA [000000000000e000 - 0000000000014fff] bootmap [0000000000015000 - 00000000000395ff] pages 25 Could not find start_pfn for node 0 Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #14 Call Trace: [<ffffffff80bab498>] free_area_init_node+0x22/0x381 [<ffffffff8045ffc5>] generic_swap+0x0/0x17 [<ffffffff80bab0cc>] find_zone_movable_pfns_for_nodes+0x54/0x271 [<ffffffff80baba5f>] free_area_init_nodes+0x239/0x287 [<ffffffff80ba6311>] paging_init+0x46/0x4c [<ffffffff80b9dda5>] setup_arch+0x3c3/0x44e [<ffffffff80b978be>] start_kernel+0x6f/0x2c7 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3 This happens because node 0 is not online, but the node state in mm/page_alloc.c has node 0 set. nodemask_t node_states[NR_NODE_STATES] __read_mostly = { [N_POSSIBLE] = NODE_MASK_ALL, [N_ONLINE] = { { [0] = 1UL } }, So we need to clear node_online_map before initializing the memory. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	f34b439f34	x86: CPA: avoid double checking of alias ranges When the CPA code is called with an virtual address in the range of the direct mapping or the high alias then we do not need to run through the alias check for this range. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	af96e4438a	x86: CPA no alias checking for _NX NX settings are not required to be consistent across alias mappings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	31eedd823c	x86: zap invalid and unused pmds in early boot The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting from __START_KERNEL_map. The kernel itself only needs _text to _end mapped in the high alias. On relocatible kernels the ASM setup code adjusts the compile time created high mappings to the relocation. This creates invalid pmd entries for negative offsets: 0xffffffff80000000 -> pmd entry: ffffffffff2001e3 It points outside of the physical address space and is marked present. This starts at the virtual address __START_KERNEL_map and goes up to the point where the first valid physical address (0x0) is mapped. Zap the mappings before _text and after _end right away in early boot. This removes also the invalid entries. Furthermore it simplifies the range check for high aliases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	c31c7d4844	x86: CPA, fix alias checks c_p_a() did not discover all aliases correctly. (such as when called on vmalloc()-ed areas or ioremap()-ed areas) Push the alias checks to the lower, physical level and consistently discover all aliases that might exist: the low direct mappings and the high linear kernel-text mappings (on 64-bit). Thanks to Andi Kleen for pointing out that this was buggy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Ingo Molnar	f8d8406bcb	x86: cpa, fix out of date comment Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:21 +01:00
Thomas Gleixner	69b1415e93	x86: cpa: ensure page alignment the cpa API is page aligned - warn about any weird alignments. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Harvey Harrison	7bfeab9af9	x86: include proper prototypes for rodata_test extern should not appear in C files. Also, the definitions do not match the prototype currently, not sure what way you want to go with this, I've switched the prototype to return int, but I can see going to the void return as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Adrian Bunk	cae30f8270	x86: make dump_pagetable() static dump_pagetable() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:19 +01:00
Andi Kleen	5d3c8b21e2	x86: CPA: fix gbpages support in try_preserve_large_page [ mingo@elte.hu: while gbpages cannot be enabled on mainline currently, keep the code uptodate and this fix is easy enough. ] Use correct page sizes and masks for GB pages in try_preserve_large_page() This prevents a boot hang on a GB capable system with CONFIG_DIRECT_GBPAGES enabled. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-13 16:20:35 +01:00
Jeremy Fitzhardinge	37cc8d7f96	x86/early_ioremap: don't assume we're using swapper_pg_dir At the early stages of boot, before the kernel pagetable has been fully initialized, a Xen kernel will still be running off the Xen-provided pagetables rather than swapper_pg_dir[]. Therefore, readback cr3 to determine the base of the pagetable rather than assuming swapper_pg_dir[]. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Tested-by: Jody Belka <knew-linux@pimb.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-13 16:20:35 +01:00
Thomas Gleixner	81772fea41	x86: remove over noisy debug printk pageattr-test.c contains a noisy debug printk that people reported. The condition under which it prints (randomly tapping into a mem_map[] hole and not being able to c_p_a() there) is valid behavior and not interesting to report. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-11 11:24:24 -08:00
Thomas Gleixner	fac8493960	x86: cpa, strict range check in try_preserve_large_page() Right now, we check only the first 4k page for static required protections. This does not take overlapping regions into account. So we might end up setting the wrong permissions/protections for other parts of this large page. This can be optimized further, but correctness is the important part. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	eb5b5f024c	x86: cpa, use page pool Switch the split page code to use the page pool. We do this unconditionally to avoid different behaviour with and without DEBUG_PAGEALLOC enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	76ebd0548d	x86: introduce page pool in cpa DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Ian Campbell	b6fbb669c8	x86: fix early_ioremap pagetable ops Some important parts of f6df72e71e got dropped along the way, reintroduce them. Only affects paravirt guests. Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:09 +01:00
Ian Campbell	551889a6e2	x86: construct 32-bit boot time page tables in native format. Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled kernel are in PAE format. early_ioremap is updated to use the standard page table accessors. Clear any mappings beyond max_low_pfn from the boot page tables in native_pagetable_setup_start because the initial mappings can extend beyond the range of physical memory and into the vmalloc area. Derived from patches by Eric Biederman and H. Peter Anvin. [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ] Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika Penttilä <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	bfc734b246	x86: avoid unused variable warning in mm/init_64.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Harvey Harrison	da7bfc50f5	x86: sparse warnings in pageattr.c Adjust the definition of lookup_address to take an unsigned long level argument. Adjust callers in xen/mmu.c that pass in a dummy variable. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:08 +01:00
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a later patch. For everybody else it will be a (struct page *). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. 
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
Bernhard Walle	72a7fe3967	Introduce flags for reserve_bootmem() This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. This patch: Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts. Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: <linux-arch@vger.kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:25 -08:00
Ingo Molnar	58d5d0d8dd	x86: fix deadlock, make pgd_lock irq-safe lockdep just caught this one: ================================= [ INFO: inconsistent lock state ] 2.6.24 #38 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes: (pgd_lock){-+..}, at: [<ffffffff8022a9ea>] mm_init+0x1da/0x250 {in-softirq-W} state was registered at: [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 394559 hardirqs last enabled at (394559): [<ffffffff80267f0a>] get_page_from_freelist+0x30a/0x4c0 hardirqs last disabled at (394558): [<ffffffff80267d25>] get_page_from_freelist+0x125/0x4c0 softirqs last enabled at (393952): [<ffffffff80232f8e>] __do_softirq+0xce/0xe0 softirqs last disabled at (393945): [<ffffffff8020c57c>] call_softirq+0x1c/0x30 other info that might help us debug this: no locks held by swapper/1. stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.24 #38 Call Trace: [<ffffffff8024e1fb>] print_usage_bug+0x18b/0x190 [<ffffffff8024f55d>] mark_lock+0x53d/0x560 [<ffffffff8024fffa>] __lock_acquire+0x3ca/0xed0 [<ffffffff80250ba8>] lock_acquire+0xa8/0xe0 [<ffffffff8022a9ea>] ? mm_init+0x1da/0x250 [<ffffffff809bcd10>] _spin_lock+0x30/0x70 [<ffffffff8022a9ea>] mm_init+0x1da/0x250 [<ffffffff8022aa99>] mm_alloc+0x39/0x50 [<ffffffff8028b95a>] bprm_mm_init+0x2a/0x1a0 [<ffffffff8028d12b>] do_execve+0x7b/0x220 [<ffffffff80209776>] sys_execve+0x46/0x70 [<ffffffff8020c214>] kernel_execve+0x64/0xd0 [<ffffffff8020901e>] ? _stext+0x1e/0x20 [<ffffffff802090ba>] init_post+0x9a/0xf0 [<ffffffff809bc5f6>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8024f75a>] ? trace_hardirqs_on+0xba/0xd0 [<ffffffff8020c1a8>] ? child_rip+0xa/0x12 [<ffffffff8020bcbc>] ? restore_args+0x0/0x44 [<ffffffff8020c19e>] ? child_rip+0x0/0x12 turns out that pgd_lock has been used on 64-bit x86 in an irq-unsafe way for almost two years, since commit `8c914cb704`. 
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-06 22:39:45 +01:00
Ingo Molnar	971a52d66a	x86: delay CPA self-test and repeat it delay the CPA self-test so that any impact (corruption) of user-space pagetables can be triggered. Repeat the test every 30 seconds. this would have prevented the bug fixed by `8cb2a7c1e9`, at its source. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Arjan van de Ven	cc842b82cc	x86: remove suprious ifdefs from pageattr.c The .rodata section really should just be read only; the config option is there to make breaking up the 2Mb page an option (so people whos machines give more performance for the 2Mb case can opt to do so). But when the page gets split anyway, this is no longer an issue, so clean up the code and remove the ifdefs Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Arjan van de Ven	984bb80d94	x86: mark the .rodata section also NX The .rodata section shouldn't just be read-only, but also non-executable. This is free since we've broken up the 2MB page already anyway. also update test_nx to check for this. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Ingo Molnar	2d684cd6d9	x86: remove X2 workaround With the spurious handler fix, the X2 does not lock up anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:44 +01:00
Thomas Gleixner	d8b57bb700	x86: make spurious fault handler aware of large mappings In very rare cases, on certain CPUs, we could end up in the spurious fault handler and ignore a large pud/pmd mapping. The resulting pte pointer points into the mapped physical space and dereferencing it will fault recursively. Make the code aware of large mappings and do the permission check on the pmd/pud entry, when a large pud/pmd mapping is detected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:43 +01:00
Hugh Dickins	8cb2a7c1e9	stop c_p_a corrupting the pds When change_page_attr splits a large page on x86_32 (without PAE), it is currently corrupting every process's page directory: fix that by removing the thinko which passes down a physical instead of a virtual address. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 14:37:14 -08:00
Benjamin Herrenschmidt	5e5419734c	add mm argument to pte/pmd/pud/pgd_free (with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:18 -08:00
Thomas Gleixner	7b610eec7a	x86: cpa, micro-optimization Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:10 +01:00
Ingo Molnar	87f7f8fe32	x86: cpa, clean up code flow Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:10 +01:00
Ingo Molnar	beaff6333b	x86: cpa, eliminate CPA_ enum Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Ingo Molnar	9df84993cb	x86: cpa, cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	f07333fd14	x86: implement gbpages support in change_page_attr() Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	b536022227	x86: support gbpages in pagetable dump Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	c2f71ee214	x86: add gbpages support to lookup_address [ tglx@linutronix.de: fix bootup crash on sparse mappings. ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	d4f71f7969	x86: switch direct mapping setup over to set_pte Use set_pte() for setting up the 2MB pages in the direct mapping. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Thomas Gleixner	7bfb72e847	x86: fix page-present check in cpa_flush_range pte_present() might return true for PROT_NONE mappings. Explicitely check the present bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Ingo Molnar	6ce9fc17d9	x86: remove cpa warning this race is legit and can happen on SMP systems. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Andi Kleen	bde1965ce8	x86: remove now unused clear_kernel_mapping Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	64f351d197	x86: cpa selftest, skip non present entries pud and pmd entries in the RAM area might be marked as non present. Do not try to modify them in the selftest. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	07cf89c05f	x86: CPA fix pagetable split Move the readout of the large entry into the spinlock section to prevent an unlikely but possible race. Mark the pmd/pud entry present after the split. We preserved the non present bit in the new split mapping. Remove the stale gfp_flags double initialization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Andi Kleen	31422c51e0	x86: rename LARGE_PAGE_SIZE to PMD_PAGE_SIZE Fix up all users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	9a14aefc1d	x86: cpa, fix lookup_address lookup_address() returns a wrong level and a wrong pointer to a non existing pte, when pmd or pud entries are marked !present. This happens for example due to boot time mapping of GART into the low memory space. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Ingo Molnar	34508f66b6	x86: AMD Athlon X2 hard hang fix An Athlon 64 X2 test system showed hard hangs shortly after marking the kernel text read-only, if we tried to preserve largepages and changed the PSE entry from RW to RO. The pagetable code itself is correct, it's the CPU that locked up hard (and not even the NMI watchdog could punch through that hard hang). So be conservative and always do splitups - like we did in the past. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	65e074dffa	x86: cpa, preserve large pages if possible When CPA is called on a range which fits into a large page mapping, avoid to split the page when: 1) There is no change of attributes 2) The range to change is a complete large mapping Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	f4ae5da0e8	x86: cpa, check if we changed anything and tlb flushing is necessary Flush tlbs only when there was a real change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	72e458dfa6	x86: introduce struct cpa_data The number of arguments which need to be transported is increasing and we want to add flush optimizations and large page preserving. Create struct cpa data and pass a pointer instead of increasing the number of arguments further. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Andi Kleen	6bb8383beb	x86: cpa, only flush the cache if the caching attributes have changed We only need to flush the caches in cpa() if the caching attributes have changed. Otherwise only flush the TLBs. This checks the PAT bits too although they are currently not used by the kernel. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	331e406588	x86: CPA return early when requested feature is not available Mask out the not supported bits (e.g. NX). If the clr/set masks are empty after the mask return without changing anything. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	f56d005d30	x86: no CPA on iounmap When an ioremap is unmapped, do not change the page attributes. There might be another mapping of the same physical address. PAT might detect a conflicting mapping attribute for no good reason. The mapping is removed anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	75ab43bfce	x86: ioremap remove the range check of cpa Now that cpa works on non-direct mappings as well, we can safely remove the range check in ioremap_change_attr(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	e66aadbe6c	x86: simplify __ioremap Remove tons of castings which make the code hard to read. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	63c1dcf4bc	x86: CPA use the existing pfn in split as well When splitting large pages, we get the pfn from the existing entry instead of calculating it ourself. This removes the last remaining range restriction of the cpa code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	626c2c9d06	x86: use the pfn from the page when change its attributes When changing the attributes of a pte, we should use the PFN from the existing PTE rather than going through hoops calculating what we think it might have been; this is both fragile and totally unneeded. It also makes it more hairy to call any of these functions on non-direct maps for no good reason whatsover. With this change, __change_page_attr() no longer takes a pfn as argument, which simplifies all the callers. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@tglx.de>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	cc0f21bbc1	x86: teach the static_protection function about high mappings Right now, enforcing that the high mapping of the kernel text doesn't get the NX bit is done deep in the guts of CPA, rather than in the static_protection() function that enforces all other per-arch sanity checks. This patch moves this sanity check into the central static_protection() function instead, and makes it apply ONLY to the kernel text, not to all other areas in the high mapping. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:05 +01:00
Jeremy Fitzhardinge	a67ad9c9f8	x86: revert "defer cr3 reload when doing pud_clear()" Revert "defer cr3 reload when doing pud_clear()" since I'm going to replace it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:02 +01:00
Jeremy Fitzhardinge	e618c9579c	x86: unify PAE/non-PAE pgd_ctor The constructors for PAE and non-PAE pgd_ctors are more or less identical, and can be made into the same function. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:02 +01:00
H. Peter Anvin	f832ff18e8	x86: use _ASM_EXTABLE macro in arch/x86/mm/init_32.c Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding __ex_table entires in arch/x86/mm/init_32.c. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:58 +01:00
Harvey Harrison	cf89ec924d	x86: reduce ifdef sections in fault.c Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:56 +01:00
Yinghai Lu	6118f76fb7	x86: print out node_data addr and bootmap_start addr print out node_data addr and bootmap_start addr. helpful for debugging early crashes on high-end NUMA systems. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:56 +01:00
Thomas Gleixner	b50516fc20	x86: CPA remove bogus NX clear In split_large_page we clear the NX bit for the new split ptes, but we need to preserve the original setting of it for the split ptes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:47:55 +01:00
Ingo Molnar	38cb47ba01	x86: relax RAM check in ioremap() Kevin Winchester reported the loss of direct rendering, due to: [ 0.588184] agpgart: Detected AGP bridge 0 [ 0.588184] agpgart: unable to get memory for graphics translation table. [ 0.588184] agpgart: agp_backend_initialize() failed. [ 0.588207] agpgart-amd64: probe of 0000:00:00.0 failed with error -12 and bisected it down to: commit `266b9f8727` Author: Thomas Gleixner <tglx@linutronix.de> Date: Wed Jan 30 13:34:06 2008 +0100 x86: fix ioremap RAM check this check was too strict and caused an ioremap() failure. the problem is due to the somewhat unclean way of how the GART code reserves a memory range for its aperture, and how it utilizes it later on. Allow RAM pages to be ioremap()-ed too, as long as they are reserved. Bisected-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:54 +01:00
Rafael J. Wysocki	a6eb84bc1e	suspend: cleanup reference to swsusp_pg_dir[] swsusp_pg_dir[] is used for suspend, but not for hibernation. clean-up the ifdefs which worked by accident, while implying the opposite. Delete the __nosavedata, which also implied the opposite. Some day we may optimize CONFIG_ACPI_SLEEP to build minimal kernels for just hibernate or just suspend but not both, but today isn't that day. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2008-02-01 18:30:59 -05:00
Harvey Harrison	93809be8b1	x86: fixes for lookup_address args Signedness mismatches in level argument. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:43 +01:00
Yinghai Lu	9347e0b0ce	x86: remove unneeded round_up Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:42 +01:00
Yinghai Lu	24a5da73f4	x86_64: make bootmap_start page align v6 boot oopses when a system has 64 or 128 GB of RAM installed: Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711() BUG: unable to handle kernel NULL pointer dereference at 000000000000005f IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f PGD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6 RIP: 0010:[<ffffffff802bfe55>] [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP: 0000:ffff810824c57e60 EFLAGS: 00010246 RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08 RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460 RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80 R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee FS: 0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000) Stack: ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000 00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff 0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5 Call Trace: [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34 [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711 [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1 [<ffffffff8020ccf8>] ? child_rip+0xa/0x12 [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1 [<ffffffff8020ccee>] ? 
child_rip+0x0/0x12 Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0 RIP [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP <ffff810824c57e60> CR2: 000000000000005f ---[ end trace 02c2d78def82877a ]--- Kernel panic - not syncing: Attempted to kill init! it turns out some variables near end of bss are corrupted already. in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end and setup_node_bootmem() will use that page 0xd40000 for bootmap Bootmem setup node 0 0000000000000000-0000000828000000 NODE_DATA [000000000008a485 - 0000000000091484] bootmap [0000000000d406f4 - 0000000000e456f3] pages 105 Bootmem setup node 1 0000000828000000-0000001028000000 NODE_DATA [0000000828000000 - 0000000828006fff] bootmap [0000000828007000 - 0000000828106fff] pages 100 Bootmem setup node 2 0000001028000000-0000001828000000 NODE_DATA [0000001028000000 - 0000001028006fff] bootmap [0000001028007000 - 0000001028106fff] pages 100 Bootmem setup 
node 3 0000001828000000-0000002028000000 NODE_DATA [0000001828000000 - 0000001828006fff] bootmap [0000001828007000 - 0000001828106fff] pages 100 setup_node_bootmem() makes NODE_DATA cacheline aligned, and bootmap is page-aligned. the patch updates find_e820_area() to make sure we can meet the alignment constraints. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Yinghai Lu	25eff8d4cd	x86_64: add debug name for early_res helps debugging problems in this rather murky area of code. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Ingo Molnar	5aa0508508	x86: uninline __pte_free_tlb() and __pmd_free_tlb() this also removes an include file dependency. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:48 +01:00
Huang, Ying	1fd6a53ddc	x86: early_ioremap_reset fix 2 This patch fixes a bug of early_ioremap_reset(), which had been fixed before by "convert the boot time page table to the kernels native format" patch. But that patch has been reverted now. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:45 +01:00
Huang, Ying	5827040df0	x86: change_page_attr_clear fix This patch replaces __change_page_attr_set_clr() with change_page_attr_set_clr() in change_page_attr_clear() to flush the TLB/cache properly. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:43 +01:00
Yinghai Lu	afadcd788f	x86: fix nodemap_size according to nodeid bits memnode.map is s16 array because of nodeid is 16 bit now. so need to increase the nodemap_size according to that bits. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Yinghai Lu	9198715763	x86: fix overlap between pagetable with bss section one early crash on one 8 node 256g machine: Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009bc00 (usable) BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable) BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data) BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS) BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000004020000000 (usable) Early serial console at I/O port 0x3f8 (options '115200n8') console [uart0] enabled end_pfn_map = 67239936 Kernel panic - not syncing: Duplicated early reservation d40000-e42000 Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3 Call Trace: [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10 [<ffffffff80221657>] clear_local_APIC+0x5/0xcf [<ffffffff80221726>] disable_local_APIC+0x5/0x17 [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c [<ffffffff80235293>] panic+0x94/0x13e [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34 [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e [<ffffffff80b978be>] start_kernel+0x6f/0x2c7 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3 it turns out there is overlap between pgtable and bss... 
in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end need to round up table_start to PAGE_SIZE. also make the panic more informative. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Joachim Deguara	bb4a1d644a	x86: add PCI IDs to k8topology_64.c This just adds the PCI IDs of AMD's family 10h and 11h CPU's northbridges to k8topology discovery. Signed-off-by: Joachim Deguara <joachim.deguara@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Jeremy Fitzhardinge	f6df72e71e	x86: fix early_ioremap pagetable ops Put appropriate pagetable update hooks in so that paravirt knows what's going on in there. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	e3ed910db2	x86: use the same pgd_list for PAE and 64-bit Use a standard list threaded through page->lru for maintaining the pgd list on PAE. This is the same as 64-bit, and seems saner than using a non-standard list via page->index. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	6194ba6ff6	x86: don't special-case pmd allocations as much In x86 PAE mode, stop treating pmds as a special case. Previously they were always allocated and freed with the pgd. This modifies the code to be the same as 64-bit mode, where they are allocated on demand. This is a step on the way to unifying 32/64-bit pagetable allocation as much as possible. There is a complicating wart, however. When you install a new reference to a pmd in the pgd, the processor isn't guaranteed to see it unless you reload cr3. Since reloading cr3 also has the side-effect of flushing the tlb, this is an expense that we want to avoid wherever possible. This patch simply avoids reloading cr3 unless the update is to the current pagetable. Later patches will optimise this further. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	fd40d6e318	x86: shrink some ifdefs in fault.c The change from current to tsk in do_page_fault is safe as this is set at the very beginning of the function. Removes a likely() annotation from the 64-bit version, this could have instead been added to 32-bit. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	5b727a3b01	x86: ignore spurious faults When changing a kernel page from RO->RW, it's OK to leave stale TLB entries around, since doing a global flush is expensive and they pose no security problem. They can, however, generate a spurious fault, which we should catch and simply return from (which will have the side-effect of reloading the TLB to the current PTE). This can occur when running under Xen, because it frequently changes kernel pages from RW->RO->RW to implement Xen's pagetable semantics. It could also occur when using CONFIG_DEBUG_PAGEALLOC, since it avoids doing a global TLB flush after changing page permissions. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	b406ac61e9	x86: remove nx_enabled from fault.c On !PAE 32-bit, _PAGE_NX will be 0, making is_prefetch always return early. The test is sufficient on PAE as __supported_pte_mask is updated in the same places as nx_enabled in init_32.c which also takes disable_nx into account. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	c61e211d99	x86: unify fault_32\|64.c Unify includes in moved fault.c. Modify Makefiles to pick up unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	f8c2ee224d	x86: unify fault_32\|64.c with ifdefs Elimination of these ifdefs can be done in a unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	1156e098c5	x86: unify fault_32\|64.c by ifdef'd function bodies It's about time to get on with unifying these files, elimination of the ugly ifdefs can occur in the unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	d7d119d777	x86: arch/x86/mm/init_32.c printk fixes printk fixes. NOP in terms of functionality, but strings got a bit larger due to the KERN_ markers that were added. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	8550eb9982	x86: arch/x86/mm/init_32.c cleanup Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	10f22dde55	x86: arch/x86/mm/init_64.c printk fixes Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Thomas Gleixner	14a62c34b1	x86: unify ioremap Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	19f0dda91e	x86: unify page fault oops printing This changes the oops dumping format for page faults to be similar between X86_32 and 64. This is the first user of printk_address on X86_32. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	b3279c7fd7	x86: introduce show_fault_oops helper to fault_32\|64.c This will help when unifying the oops dumping code on 32/64 bit. No functional changes. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	35f3266ffb	x86: add is_errata100 helper to fault_32\|64.c Further towards unifying these files, add another helper in same spirit as is_errata93. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Harvey Harrison	29caf2f98c	x86: add is_f00f_bug helper to fault_32\|64.c Further towards unifying these files, add another helper in same spirit as is_errata93. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Thomas Gleixner	0879750f5d	x86: cpa cleanup the 64-bit alias math Cleanup the address calculations, which are necessary to identify the high/low alias mappings of the kernel on 64 bit machines. Instead of calling __pa/__va back and forth, calculate the physical address once and base the other calculations on it. Add understandable constants so we can use the already available within() helper. Also add comments, which help mere mortals to understand what this code does. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Ingo Molnar	86f03989d9	x86: cpa: fix the self-test Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	ee01f1122c	x86: init memory debugging debug incorrect/late access to init memory, by permanently unmapping the init memory ranges. Depends on CONFIG_DEBUG_PAGEALLOC=y. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Arjan van de Ven	1a4872529e	x86: move misplaced rodata check call It looks like a mismerge put the rodata self-check in the wrong spot; move it to the right place after marking the .rodata section read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	4c61afcdb2	x86: fix clflush_page_range logic only present ptes must be flushed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Thomas Gleixner	3b233e52f7	x86: optimize clflush clflush is sufficient to be issued on one CPU. The invalidation is broadcast throughout the coherence domain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	cd8ddf1a28	x86: clflush_page_range needs mfence clflush is an unordered operation with respect to other memory traffic, including other CLFLUSH instructions. This needs proper fencing with mfence. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	af1e6844d6	x86: cpa: rename global_flush_tlb() to cpa_flush_all() The function name global_flush_tlb() suggests something different from what the function really does. Rename it to cpa_flush_all(), which is an understandable counterpart to cpa_flush_range(). no global visibility of the old API anymore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	57a6a46aa2	x86: cpa: implement clflush optimization Use clflush on CPUs which support this. clflush is only used when the page attribute operation has been successful. On CPUs which do not support clflush and in the case of error the old fashioned global_flush_tlb() is called. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	56744546b3	x86: cpa use the new set_clr function Convert cpa_set and cpa_clear to call the new set_clr function. Separate out the debug helpers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	ff31452b6e	x86: cpa create set_and_clr function Create a set_and_clr function to avoid the duplicate loops. Allows also to do combined operations for optimization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	72932c7ad2	x86: cpa move the flush into set and clear functions To avoid the modification of the flush code for the clflush implementation, move the flush into the set and clear functions and provide helper functions for the debugging code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Arjan van de Ven	edeed30589	x86: add testcases for RODATA and NX protections/attributes Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which led to 1 page less being marked read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Ingo Molnar	adafdf6a4e	x86: ioremap KERN_INFO Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	6eade8ff46	x86: cpa: clean up change_page_attr_set/clear() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Ingo Molnar	4692a1450b	x86: cpa: fix loop Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	a72a08a4b6	x86: cpa: fix split thinko Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	3c1df68b84	x86: make sure initmem is writable When we free initmem, various rodata and CPA checks may have left memory read only.. this patch ensures that the memory is writable before we free it. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	488fd99588	x86: fix pageattr-selftest In Ingo's testing, he found a bug in the CPA selftest code. What would happen is that the test would call change_page_attr_addr on a range of memory, part of which was read only, part of which was writable. The only thing the test wanted to change was the global bit... What actually happened was that the selftest would take the permissions of the first page, and then the change_page_attr_addr call would then set the permissions of the entire range to this first page. In the rodata section case, this resulted in pages after the .rodata becoming read only... which made the kernel rather unhappy in many interesting ways. This is just another example of how dangerous the cpa API is (was); this patch changes the test to use the incremental clear/set APIs instead, and it changes the clear/set implementation to work on a 1 page at a time basis. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d7c8f21a8c	x86: cpa: move flush to cpa The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	d1028a154c	x86: make various pageattr.c functions static change_page_attr_add is only used in pageattr.c now, so we can make this function static. change_page_attr() isn't used anywhere at all anymore; this function is a really bad API anyway so just remove the bloat entirely. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Ingo Molnar	f62d0f008e	x86: cpa: set_memory_notpresent() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d806e5ee20	x86: cpa: convert ioremap to new API Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	5f8681529c	x86: fix ioremap API Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	266b9f8727	x86: fix ioremap RAM check Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	950f9d95be	x86: fix the missing BIOS area check in page_is_ram page_is_ram has a FIXME since ages, which reminds to sanity check the BIOS area between 640k and 1M, which is sometimes falsely reported as RAM in the e820 tables. Implement the sanity check. Move the BIOS range defines from pageattr.c into e820.h to avoid duplicate defines. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	5f5192b9fe	x86: move page_is_ram() function Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	e1271f686a	x86: deprecate change_page_attr() for drivers With the introduction of the new API, no driver or non-archcore code needs to use c-p-a anymore, so this patch also deprecates the EXPORT_SYMBOL of CPA (it's a horrible API after all). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	6d238cc4dc	x86: convert CPA users to the new set_page_ API This patch converts various users of change_page_attr() to the new, more intent driven set_page_/set_memory_ API set. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	75cbade8ea	x86: a new API for drivers/etc to control cache and other page attributes Right now, if drivers or other code want to change, say, a cache attribute of a page, the only API they have is change_page_attr(). c-p-a is a really bad API for this, because it forces the caller to know ALL the attributes he wants for the page, not just the 1 thing he wants to change. So code that wants to set a page uncachable, needs to be aware of the NX status as well etc etc etc. This patch introduces a set of new APIs for this, set_pages_<attr> and set_memory_<attr>, that offer a logical change to the user, and leave all attributes not implied by the requested logical change alone. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Ingo Molnar	e81d5dc41b	x86: cpa: move clflush_cache_range() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	e64c8aa0c5	x86: unify ioremap_32 and _64 Unify the now identical ioremap_32.c and ioremap_64.c into the same ioremap.c file. No code changed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	240d3a7c47	x86: unify ioremap Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	e4c1b977f0	x86: use remove_vm_area in ioremap_32 error path When ioremap_page_range fails, then we can use remove_vm_area instead of vunmap safely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	4b40fcee13	x86: __iomem annotations Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	e9332cacd7	x86: switch to change_page_attr_addr in ioremap_32.c Use change_page_attr_addr() instead of change_page_attr(), which simplifies the code significantly and matches the 64bit implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	74ff2857f0	x86: make c_p_a unconditional in ioremap Make c_p_a unconditional for ioremap and iounmap. This ensures complete consistency of the flags which are handed to ioremap_page_range and the real flags in the mappings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	f87519e8f4	x86: introduce max_pfn_mapped 64bit uses end_pfn_map and 32bit uses max_low_pfn. There are several files which have #ifdef'ed defines which map either to end_pfn_map or max_low_pfn. Replace this by a universal define and clean up all the other instances. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	3cbd09e482	x86: cleanup ioremap includes Get rid of the duplicate define of ISA_START/END_ADDRESS and use the same headers in 32 and 64 bit code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	91eebf40b3	x86: style cleanup of ioremap code Fix the coding style before going further. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	1aaf74e919	x86: fix ioremap pgprot inconsistency The pgprot flags which are handed into ioremap_page_range() are different to those which are set in change_page_attr(). The ioremap_page_range flags are executable, while the c_p_a flags are not. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:05 +01:00
Thomas Gleixner	a40343497e	x86: fix ioremap pgprot inconsistency The pgprot flags which are handed into ioremap_page_range() are different to those which are set in change_page_attr(). The ioremap_page_range flags are executable, while the c_p_a flags are not. Also make the mappings global (which is a NOP currently on 32bit, although CPUs from PPRO+ onwards support it, but that's a separate fix.) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:04 +01:00
Arjan van de Ven	ed724be65f	x86: turn the check_exec function into function that What the check_exec() function really is trying to do is enforce certain bits in the pgprot that are required by the x86 architecture, but that callers might not be aware of (such as NX bit exclusion of the BIOS area for BIOS based PCI access; it's not uncommon to ioremap the BIOS region for various purposes and normally ioremap() memory has the NX bit set). This patch turns the check_exec() function into static_protections() which also is now used to make sure the kernel text area remains non-NX and that the .rodata section remains read-only. If the architecture ends up requiring more such mandatory prot settings for specific areas, this is now a reasonable place to add these. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Huang, Ying	1c17f4d615	x86: ioremap_nocache fix This patch fixes a bug in ioremap_nocache. ioremap_nocache() will call __ioremap() with flags != 0 to do the real work, which will call change_page_attr_addr() if phys_addr + size - 1 < (end_pfn_map << PAGE_SHIFT). But some pages between 0 ~ end_pfn_map << PAGE_SHIFT are not mapped by the identity map, which makes change_page_attr_addr fail. This patch is based on latest x86 git and has been tested on x86_64 platform. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Huang, Ying	4c881ca181	x86: fix NX bit handling in change_page_attr() This patch fixes a bug of change_page_attr/change_page_attr_addr on Intel i386/x86_64 CPUs. After changing page attribute to be executable with these functions, the page remains un-executable on Intel i386/x86_64 CPU. Because on Intel i386/x86_64 CPU, only if the "NX" bits of all three level page tables are cleared (PAE is enabled), the corresponding page is executable (refer to section 4.13.2 of Intel 64 and IA-32 Architectures Software Developer's Manual). So, the bug is fixed through clearing the "NX" bit of PMD when splitting the huge PMD. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Ingo Molnar	8192206df0	x86: change cpa to pfn based change CPA to pfn based. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Ingo Molnar	687c4825b6	x86: keep the BIOS area executable keep the BIOS area executable. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Thomas Gleixner	30551bb3ce	x86: add PG_LEVEL enum this way PG_LEVEL_1GB will be an easy change. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Ingo Molnar	e4b71dcf54	x86: clean up arch/x86/mm/pageattr.c do some leftover cleanups in the now unified arch/x86/mm/pageattr.c file. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:04 +01:00
Ingo Molnar	4554ab95c2	x86: re-add clflush_cache_range() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	b195bc00ef	x86: unify pageattr_32.c and pageattr_64.c unify the now perfectly identical pageattr_32/64.c files - no code changed. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	6050be70d8	x86: prepare for pageattr.c unification Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	44af6c41e6	x86: backmerge 64-bit details into 32-bit pageattr.c backmerge 64-bit details into 32-bit pageattr.c. the pageattr_32.c and pageattr_64.c files are now identical. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	ace63e3743	x86: add kernel_map_pages() to 64-bit needed for DEBUG_PAGEALLOC support and for unification. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	f5a50ce1bf	x86: return -EINVAL in __change_page_attr(), instead of 0 careful: might change driver behavior - but this is the right return value. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	674d67269e	x86: clean up differences between 64-bit and 32-bit Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	6faa4c53b2	x86: 64-bit, add the new split_large_page() function Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	44136717e0	x86: 64-bit pageattr.c, prepare for unification Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:03 +01:00
Ingo Molnar	d9db847f29	x86: change 64-bit pageattr to use set_pte_atomic() NOP change - same as set_pte(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:02 +01:00
Ingo Molnar	0d82494ebd	x86: change 64-bit __change_page_attr() to struct page Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:02 +01:00
Ingo Molnar	34eff1d75b	x86: simplify __change_page_attr() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:01 +01:00
Ingo Molnar	a5c6251488	x86: clean up and simplify 64-bit split_large_page() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:00 +01:00
Ingo Molnar	5e5224a77e	x86: unify header part of pageattr_64.c Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:00 +01:00
Ingo Molnar	d6ee09a2a0	x86: simplify pageattr_64.c simplify pageattr_64.c. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:59 +01:00
Ingo Molnar	a5f55035f6	x86: prepare for the unification of the cpa code prepare for the unification of the cpa code, by unifying the lookup_address() logic between 32-bit and 64-bit. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:59 +01:00
Ingo Molnar	bbb09f5cfc	x86: prepare for the unification of the cpa code prepare for the unification of the cpa code, by unifying the lookup_address() logic between 32-bit and 64-bit. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:59 +01:00
Ingo Molnar	55ce29ba16	x86: cpa self-test, WARN_ON() add a WARN_ON() to the cpa-self-test failure branch. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:58 +01:00
Ingo Molnar	12d6f21eac	x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y get more testing of the c_p_a() code done by not turning off PSE on DEBUG_PAGEALLOC. this simplifies the early pagetable setup code, and tests the largepage-splitup code quite heavily. In the end, all the largepages will be split up pretty quickly, so there's no difference to how DEBUG_PAGEALLOC worked before. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:58 +01:00
Ingo Molnar	9a3dc7804e	x86: cpa: simplify locking further simplify cpa locking: since the largepage-split is a slowpath, use the pgd_lock for the whole operation, instead of the mmap_sem. This also makes it suitable for DEBUG_PAGEALLOC purposes again. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:57 +01:00
Ingo Molnar	7afe15b9d8	x86: simplify cpa largepage split, #3 simplify cpa largepage split: push the reference protection bits into the largepage-splitting function. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:57 +01:00
Ingo Molnar	5508a74896	x86: cpa self-test fixes cpa self-test fixes. change_page_attr_addr() was buggy, it passed in a virtual address as a physical one. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:56 +01:00
Ingo Molnar	bb5c2dbd57	x86: further cpa largepage-split cleanups further cpa largepage-split cleanups: make the splitup isolated functionality, without leaking details back into __change_page_attr(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:56 +01:00
Ingo Molnar	97f99fedf2	x86: simplify 32-bit cpa largepage splitting simplify 32-bit cpa largepage splitting: do a pure split and repeat the pte lookup to get the new pte modified. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:55 +01:00
Ingo Molnar	78c94abaea	x86: simplify the 32-bit cpa code simplify the 32-bit cpa code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:55 +01:00
Andi Kleen	1a2b441231	x86: fix early_ioremap() on 64-bit Fix early_ioremap() on x86-64 I had ACPI failures on several machines since a few days. Symptom was NUMA nodes not getting detected or worse cores not getting detected. They all came from ACPI not being able to read various of its tables. I finally bisected it down to Jeremy's "put _PAGE_GLOBAL into PAGE_KERNEL" change. With that the fix was fairly obvious. The problem was that early_ioremap() didn't use a "_all" flush that would affect the global PTEs too. So with global bits getting used everywhere now an early_ioremap would not actually flush a mapping if something else was mapped previously on that slot (which can happen with early_iounmap inbetween) This patch changes all flushes in init_64.c to be __flush_tlb_all() and fixes the problem here. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:54 +01:00
Andi Kleen	934d15854d	x86: remove set_kernel_exec() The SMP trampoline always runs in real mode, so making it executable in the page tables doesn't make much sense because it executes before page tables are set up. That was the only user of set_kernel_exec(). Remove set_kernel_exec(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:53 +01:00
Andi Kleen	895bdc2995	x86: c_p_a() make it more robust against use of PAT bits Use the page table level instead of the PSE bit to check if the PTE is for a 4K page or not. This makes the code more robust when the PAT bit is changed because the PAT bit on 4K pages is in the same position as the PSE bit. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:52 +01:00
Andi Kleen	3c86882341	x86: c_p_a() fix: reorder TLB / cache flushes to follow Intel recommendation Intel recommends to first flush the TLBs and then the caches on caching attribute changes. c_p_a() previously did it the other way round. Reorder that. The procedure is still not fully compliant to the Intel documentation because Intel recommends a all CPU synchronization step between the TLB flushes and the cache flushes. However on all new Intel CPUs this is now meaningless anyways because they support Self-Snoop and can skip the cache flush step anyway. [ mingo@elte.hu: decoupled from clflush and ported it to x86.git ] Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:52 +01:00
Andi Kleen	6ba9b7d8f0	x86: fix c_p_a() boot crash fix: > hm, i just found a failing 64-bit .config while testing your CPA > patchset: > > [ 1.916541] CPA mapping 4k 0 large 2048 gb 0 x 0[0-0] miss 0 > [ 1.919874] Unable to handle kernel paging request at 000000000335aea8 RIP: > [ 1.919874] [<ffffffff8021d2d3>] change_page_attr+0x3/0x61 > [ 1.919874] PGD 0 > [ 1.919874] Oops: 0000 [1] > [ 1.919874] CPU 0 This handles addresses which don't have a mem_map entry. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:52 +01:00
Andi Kleen	c93c82bbea	x86: shrink __PAGE_KERNEL/__PAGE_KERNEL_EXEC on non PAE kernels No need to make it 64bit there. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:50 +01:00
Andi Kleen	a3ae91b0a0	x86: cpa: remove unnecessary masking of address virt_to_page does not care about the bits below the page granularity. So don't mask them. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:50 +01:00
Andi Kleen	5b016432a7	x86: cpa: use wbinvd() macro instead of inline assembly in 64bit c_p_a() Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:50 +01:00
Ingo Molnar	0e3a954929	x86: early_ioremap_init(), enhance warnings enhance the debug warning in early_ioremap_init(). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:49 +01:00
Ingo Molnar	d690b2afd5	x86: add early_ioremap() leak detection Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:47 +01:00
Huang, Ying	793b24a2dd	x86: make early_ioremap_debug early_param This patch makes "early_ioremap_debug" an early parameter, because "early_ioremap/early_iounmap" is only used during early boot stage. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:45 +01:00
Ingo Molnar	d18d6d65ef	x86: early_ioremap(), debugging add early_ioremap() debug printouts via the early_ioremap_debug boot option. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:45 +01:00
Ingo Molnar	bd796ed023	x86: add debug warnings to early_ioremap() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:45 +01:00
Ingo Molnar	1b42f51630	x86: enhance early_ioremap() - allow nesting of up to 4 levels Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:45 +01:00
Huang, Ying	64a8f852a2	x86: early_ioremap_reset fix This patch fixes a bug of early_ioremap_reset. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:44 +01:00
Huang, Ying	beacfaac3f	x86 32-bit boot: rename bt_ioremap() to early_ioremap() This patch renames bt_ioremap to early_ioremap, which is used in x86_64. This makes it easier to merge i386 and x86_64 usage. [ mingo@elte.hu: fix ] Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:44 +01:00
Huang, Ying	4716e79c99	x86: replace boot_ioremap() with enhanced bt_ioremap() - remove boot_ioremap() This patch replaces boot_ioremap invocation with bt_ioremap and removes the boot_ioremap implementation. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:44 +01:00
Huang, Ying	0947b2f31c	i386 boot: replace boot_ioremap with enhanced bt_ioremap - enhance bt_ioremap This patch makes it possible for bt_ioremap() to be used before paging_init(), via providing an early implementation of set_fixmap() that can be used before paging_init(). This way boot_ioremap() can be replaced by bt_ioremap(). Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:44 +01:00
Siddha, Suresh B	4138cc3418	x86: set strong uncacheable where UC is really desired Also use _PAGE_PWT for all the mappings which need uncache mapping. Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs), we now use PAT3 which is strong uncacheable. This makes it consistent with pgprot_noncached() Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Joerg Roedel	fbd3bfd87f	x86: use __PAGE_KERNEL_EXEC in ioremap_64.c This patch replaces the manual permission setup for pages in ioremap_64.c with the pre-defined __PAGE_KERNEL_EXEC value. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Thomas Gleixner	718f94974d	x86: cleanup boot_ioremap_32.c Coding style cleanup before modifying the file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Ingo Molnar	851339b1ff	x86: clean up arch/x86/mm/pageattr-test.c fix 15 checkpatch warnings. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Andi Kleen	fa2d8369a1	x86: c_p_a(), add simple self test at boot Since change_page_attr() is tricky code it is good to have some regression test code. This patch maps and unmaps some random pages in the direct mapping at boot and then dumps the state and does some simple sanity checks. Add it with a CONFIG option. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Ingo Molnar	f0646e43ac	x86: return the page table level in lookup_address() based on this patch from Andi Kleen: \| Subject: CPA: Return the page table level in lookup_address() \| From: Andi Kleen <ak@suse.de> \| \| Needed for the next change. \| \| And change all the callers. and ported it to x86.git. Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:43 +01:00
Andi Kleen	4c3c4b4513	x86: clean up pte_exec - Rename it to pte_exec() from pte_exec_kernel(). There is nothing kernel specific in there. - Move it into the common file because _PAGE_NX is 0 on !PAE and then pte_exec() will be always evaluate to true. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:42 +01:00
Harvey Harrison	e66a95127d	x86: add dump_pagetable helper to X86_32 Similar to x86 64-bit. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:42 +01:00
Andi Kleen	0c42f39276	c_p_a(): do a simple self test at boot When CONFIG_DEBUG_RODATA is enabled undo the ro mapping and redo it again. This gives some simple testing for change_page_attr(). Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:42 +01:00
Ingo Molnar	b4416a1be8	x86: clean up arch/x86/mm/pageattr_64.c clean up arch/x86/mm/pageattr_64.c. no code changed: text data bss dec hex filename 1751 16 0 1767 6e7 pageattr_64.o.before 1751 16 0 1767 6e7 pageattr_64.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:41 +01:00
Ingo Molnar	9f4c815ce7	x86: clean up arch/x86/mm/pageattr_32.c clean up arch/x86/mm/pageattr_32.c. no code changed: text data bss dec hex filename 1255 40 0 1295 50f pageattr_32.o.before 1255 40 0 1295 50f pageattr_32.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:41 +01:00
Jeremy Fitzhardinge	508bebbb1f	x86: allocate and initialize unshared pmds If SHARED_KERNEL_PMD is false, then we need to allocate and initialize the kernel pmd. We can easily piggy-back this onto the existing pmd prepopulation code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:40 +01:00
Jeremy Fitzhardinge	8fe3deef01	x86: preallocate pmds at pgd creation time In PAE mode, an update to the pgd requires a cr3 reload to make sure the processor notices the changes. Since this also has the side-effect of flushing the tlb, its an expensive operation which we want to avoid where possible. This patch mitigates the cost of installing the initial set of pmds on process creation by preallocating them when the pgd is allocated. This avoids up to three tlb flushes during exec, as it creates the new process address space while the pagetable is in active use. The pmds will be freed as part of the normal pagetable teardown in free_pgtables, which is called in munmap and process exit. However, free_pgtables will only free parts of the pagetable which actually contain mappings, so stray pmds may still be attached to the pgd at pgd_free time. We must mop them up to prevent a memory leak. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:40 +01:00
Jeremy Fitzhardinge	a5a19c63f4	x86: demacro asm-x86/pgalloc_32.h Convert macros into inline functions, for better type-checking. This patch required a little bit of fiddling with headers in order to make __(pte\|pmd)_free_tlb inline rather than macros. asm-generic/tlb.h includes asm/pgalloc.h, though it doesn't directly use any pgalloc definitions. I removed this include to avoid an include cycle, but it may cause secondary compile failures by things depending on the indirect inclusion; arch/x86/mm/hugetlbpage.c was one such place; there may be others. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:39 +01:00
Jeremy Fitzhardinge	6c435456dc	x86: add mm parameter to paravirt_alloc_pd Add mm to paravirt_alloc_pd, partly to make it consistent with paravirt_alloc_pt, and because later changes will make use of it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:39 +01:00
Sam Ravnborg	1e296f578a	x86: fix section mismatch warning in srat_64.c Fix the following warnings: WARNING: arch/x86/mm/built-in.o(.text+0x1abc): Section mismatch: reference to .init.data:nodes_parsed in 'unparse_node' WARNING: arch/x86/mm/built-in.o(.text+0x1ac6): Section mismatch: reference to .cpuinit.data:apicid_to_node in 'unparse_node' WARNING: arch/x86/mm/built-in.o(.text+0x1ad2): Section mismatch: reference to .cpuinit.data:apicid_to_node in 'unparse_node' unparse_node are static and only used by acpi_scan_nodes which is already annotated __init. So we annotate unparse_node with __init. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:37 +01:00
Minoru Usui	9a1b62fe85	x86: fix NUMA emulation on 64-bit I found a small bug of NUMA emulation code for x86_64. (CONFIG_NUMA_EMU) If machine is non-NUMA, find_node_by_addr() should return NUMA_NO_NODE, but current implementation code returns existent maximum NUMA node number + 1. This is not existent NUMA node number. However, this behaviour does not affect NUMA emulation fortunately, because acpi_fake_nodes() that is caller of find_node_by_addr() gets pxm (proximity domain) by node_to_pxm() from non-existent NUMA node number that was returned by find_node_by_addr(). node_to_pxm() returns PXM_INVAL that means illegal or non-existent NUMA node number. Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:35 +01:00
travis@sgi.com	1ce357129a	x86: early cpu_to_node fix in numa_64.c Both of these references to cpu_to_node() can potentially occur before the "late" cpu_to_node map is setup. Therefore, they should be changed to use early_cpu_to_node(). Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:33 +01:00
travis@sgi.com	4323838215	x86: change size of node ids from u8 to s16 Change the size of node ids for X86_64 from u8 to s16 to accommodate more than 32k nodes and allow for NUMA_NO_NODE (-1) to be sign extended to int. Cc: David Rientjes <rientjes@google.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:25 +01:00
Mel Gorman	1b000a5dbe	x86: make NUMA work on 32-bit The DISCONTIG memory model on x86 32 bit uses a remap allocator early in boot. The objective is that portions of every node are mapped in to the kernel virtual area (KVA) in place of ZONE_NORMAL so that node-local allocations can be made for pgdat and mem_map structures. With SPARSEMEM, the amount that is set aside is insufficient for all the mem_maps to be allocated. During the boot process, it falls back to using the bootmem allocator. This breaks assumptions that SPARSEMEM makes about the layout of the mem_map in memory and results in a VM_BUG_ON triggering due to pfn_to_page() returning garbage values. This patch only enables the remap allocator for use with DISCONTIG. Without SRAT support, a compile-error occurs because ACPI table parsing functions are only available in x86-64. This patch also adds no-op stubs and prints a warning message. What likely needs to be done is sharing the table parsing functions between 32 and 64 bit if they are compatible. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:25 +01:00

562 Commits