Commit Graph

2507 Commits

Author SHA1 Message Date
Yinghai Lu
e8c27ac919 x86, numa, 32-bit: print out debug info on all kvas
also fix the print out of node_remap_end_vaddr

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-03 13:26:26 +02:00
Yinghai Lu
0596152388 x86, 32-bit: change propagate_e820_map() back to find_max_pfn()
we don't need to call memory_present that early.
numa and sparse will call memory_present later and might
even fail, it will call memory_present for the full range.

also for sparse it will call alloc_bootmem ... before we set up bootmem.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-03 13:26:25 +02:00
Yinghai Lu
b66cd72073 x86: set node_remap_size[0] in fallback path
... otherwise alloc_remap will not get node_mem_map from kva area, and
alloc_node_mem_map has to alloc_bootmem_node to get mem_map.
It will use two low address copies ...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-03 13:26:25 +02:00
Yinghai Lu
ba924c81dd x86, numa, 32-bit: increase max_elements to 1024
so every element will represent 64M instead of 256M.

AMD opteron could have HW memory hole remapping, so could have
[0, 8g + 64M) on node0. Reduce element size to 64M to keep that on node 0

Later we need to use find_e820_area() to allocate memory_node_map like
on 64-bit. But need to move memory_present out of populate_mem_map...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-03 13:26:24 +02:00
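The element-size arithmetic in ba924c81dd above can be checked with a short sketch. It assumes the map spans a 64G physical address range, which is what 256 elements of 256M (or 1024 elements of 64M) add up to; the constant and function names are illustrative and are not kernel identifiers.

```python
MIB = 1 << 20
GIB = 1 << 30

# Assumption: the per-element map covers 64G in total
# (256 * 256M == 1024 * 64M == 64G).
ADDRESS_SPACE = 64 * GIB


def element_size(max_elements):
    """Bytes represented by one map element."""
    return ADDRESS_SPACE // max_elements


def is_element_aligned(addr, max_elements):
    """True if addr falls exactly on an element boundary."""
    return addr % element_size(max_elements) == 0


# End of node 0 from the commit message: [0, 8g + 64M)
node0_end = 8 * GIB + 64 * MIB

print(element_size(256) // MIB, is_element_aligned(node0_end, 256))
print(element_size(1024) // MIB, is_element_aligned(node0_end, 1024))
```

With 256M elements the node 0 boundary lands in the middle of an element, so that element cannot be assigned cleanly to one node; with 64M elements the boundary is aligned.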
Pavel Machek
f529626a86 suspend-vs-iommu: prevent suspend if we could not resume
iommu/gart support misses suspend/resume code, which can do bad stuff,
including memory corruption on resume.  Prevent system suspend in case we
would be unable to resume.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Tested-by: Patrick <ragamuffin@datacomm.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-02 13:02:48 +02:00
Miquel van Smoorenburg
db9f600b96 x86: pci-dma.c: use __GFP_NO_OOM instead of __GFP_NORETRY
On Wed, 2008-05-28 at 04:47 +0200, Andi Kleen wrote:
> > So...  why not just remove the setting of __GFP_NORETRY?  Why is it
> > wrong to oom-kill things in this case?
>
> When the 16MB zone overflows (which can be common in some workloads)
> calling the OOM killer is pretty useless because it has barely any
> real user data [only exception would be the "only 16MB" case Alan
> mentioned]. Killing random processes in this case is bad.
>
> I think for 16MB __GFP_NORETRY is ok because there should be
> nothing freeable in there so looping is useless. Only exception would be the
> "only 16MB total" case again but I'm not sure 2.6 supports that at all
> on x86.
>
> On the other hand d_a_c() does more allocations than just 16MB, especially
> on 64bit and the other zones need different strategies.

Okay, so how about this then ?

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-02 12:14:58 +02:00
Bertram Felgenhauer
75b19b790b pci, x86: add workaround for bug in ASUS A7V600 BIOS (rev 1005)
This BIOS claims the VIA 8237 south bridge to be compatible with VIA 586,
which it is not.

Without this patch, I get the following warning while booting,
among others,

| PCI: Using IRQ router VIA [1106/3227] at 0000:00:11.0
| ------------[ cut here ]------------
| WARNING: at arch/x86/pci/irq.c:265 pirq_via586_get+0x4a/0x60()
| Modules linked in:
| Pid: 1, comm: swapper Not tainted 2.6.26-rc4-00015-g1ec7d99 #1
|  [<c0119fd4>] warn_on_slowpath+0x54/0x70
|  [<c02246e0>] ? vt_console_print+0x210/0x2b0
|  [<c02244d0>] ? vt_console_print+0x0/0x2b0
|  [<c011a413>] ? __call_console_drivers+0x43/0x60
|  [<c011a482>] ? _call_console_drivers+0x52/0x80
|  [<c011aa89>] ? release_console_sem+0x1c9/0x200
|  [<c0291d21>] ? raw_pci_read+0x41/0x70
|  [<c0291e8f>] ? pci_read+0x2f/0x40
|  [<c029151a>] pirq_via586_get+0x4a/0x60
|  [<c02914d0>] ? pirq_via586_get+0x0/0x60
|  [<c029178d>] pcibios_lookup_irq+0x15d/0x430
|  [<c03b895a>] pcibios_irq_init+0x17a/0x3e0
|  [<c03a66f0>] ? kernel_init+0x0/0x250
|  [<c03a6763>] kernel_init+0x73/0x250
|  [<c03b87e0>] ? pcibios_irq_init+0x0/0x3e0
|  [<c0114d00>] ? schedule_tail+0x10/0x40
|  [<c0102dee>] ? ret_from_fork+0x6/0x1c
|  [<c03a66f0>] ? kernel_init+0x0/0x250
|  [<c03a66f0>] ? kernel_init+0x0/0x250
|  [<c010324b>] kernel_thread_helper+0x7/0x1c
|  =======================
| ---[ end trace 4eaa2a86a8e2da22 ]---

and IRQ trouble later,

| irq 10: nobody cared (try booting with the "irqpoll" option)

Now that's a VIA 8237 chip, so pirq_via586_get shouldn't be called
at all; adding this workaround to via_router_probe() fixes the
problem for me.

Amazingly I have a 2.6.23.8 kernel that somehow works fine ... I'll
never understand why.

Signed-off-by: Bertram Felgenhauer <int-e@gmx.de>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-02 11:29:10 +02:00
Yinghai Lu
9a73aa81ff x86: 32bit numa srat fix early_ioremap leak
on two node system (16g RAM) with numa config I got this crash:

get_memcfg_from_srat: assigning address to rsdp
RSD PTR  v0 [ACPIAM]
ACPI: Too big length in RSDT: 92
failed to get NUMA memory information from SRAT table
NUMA - single node, flat memory mode
Node: 0, start_pfn: 0, end_pfn: 153
 Setting physnode_map array to node 0 for pfns:
 0
...
Pid: 0, comm: swapper Not tainted 2.6.26-rc4 #4
 [<80b41289>] hlt_loop+0x0/0x3
 [<8011efa0>] ? alloc_remap+0x50/0x70
 [<8079e32e>] alloc_node_mem_map+0x5e/0xa0
 [<8012e77b>] ? printk+0x1b/0x20
 [<80b590f6>] free_area_init_node+0xc6/0x470
 [<80b588fc>] ? __alloc_bootmem_node+0x2c/0x50
 [<80b58ad8>] ? find_min_pfn_for_node+0x38/0x70
 [<8012e77b>] ? printk+0x1b/0x20
 [<80b597c4>] free_area_init_nodes+0x254/0x2d0
 [<80b544d7>] zone_sizes_init+0x97/0xa0
 [<80b48a03>] setup_arch+0x383/0x530
 [<8012e77b>] ? printk+0x1b/0x20
 [<80b41aa4>] start_kernel+0x64/0x350
 [<80b412d8>] i386_start_kernel+0x8/0x10
 =======================

this patch increases the acpi table limit to 32.
Also match early_ioremap() with early_iounmap().

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-31 09:55:56 +02:00
Yinghai Lu
a5481280b2 x86: extend e820 early_res support 32bit -fix #5
reserve early numa kva, so it will not clash with new RAMDISK

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-31 09:55:53 +02:00
Yinghai Lu
163872950d x86: extend e820 early_res support 32bit -fix #4
reserve_early pgdata for 32bit numa

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-31 09:55:50 +02:00
Yinghai Lu
f0d43100f1 x86: extend e820 early_res support 32bit -fix #3
introduce init_pg_table_start, so xen PV could specify the value.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-31 09:55:47 +02:00
Rusty Russell
a16ffe93c4 lguest: fix ugly <NULL> in /proc/interrupts
Before:
	root@ubuntu:~# cat /proc/interrupts
	           CPU0
	  1:       1672    lguest-<NULL>    virtio0
	  2:          1    lguest-<NULL>    virtio1
	  ...
After:
	root@ubuntu:~# cat /proc/interrupts
	           CPU0
	  1:       2889    lguest-level     virtio0
	  2:          9    lguest-level     virtio1

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-30 15:09:43 +10:00
Yinghai Lu
3945e2c9ab x86: extend e820 early_res support 32bit - fix #2
remove extra -1 in reserve_early calling
panic if can not find space for new RAMDISK

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27 10:20:12 +02:00
Sam Ravnborg
73531905ed Kconfig: introduce ARCH_DEFCONFIG to DEFCONFIG_LIST
init/Kconfig contains a list of configs that are searched
for if 'make *config' is used with no .config present.
Extend this list to look at the config identified by
ARCH_DEFCONFIG.

With this change we now try the defconfig targets last.

This fixes a regression reported
by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2008-05-25 23:03:18 +02:00
Thomas Gleixner
85cc35fa72 x86: fix mpparse fallout
UP builds with LOCAL_APIC=y and IO_APIC=n fail with a missing
reference to mp_bus_not_pci. Distangle the mpparse code some more and
move the ioapic specific bus check into a separate function.

This code needs some urgent un#ifdef surgery all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 21:21:36 +02:00
Alexey Starikovskiy
136ef671df x86: allow MPPARSE to be deselected in SMP configs
2008-05-25 12:01:26 +02:00
Alexey Starikovskiy
8732fc4b23 x86: move mp_bus_not_pci from mpparse.c
2008-05-25 12:01:26 +02:00
Alexey Starikovskiy
ce6444d39f x86: mp_bus_id_to_pci_bus is not needed
2008-05-25 12:01:25 +02:00
Alexey Starikovskiy
bab4b27c00 x86: move smp_found_config
2008-05-25 12:01:25 +02:00
Alexey Starikovskiy
f391835290 x86: move pic_mode to apic_32.c
2008-05-25 12:01:25 +02:00
Alexey Starikovskiy
b3e2416465 x86: Set pic_mode only if local apic code is present
2008-05-25 12:01:25 +02:00
Yinghai Lu
bf62f3981c x86: move e820_mark_nosave_regions to e820.c
and make e820_mark_nosave_regions to take limit_pfn to use max_low_pfn
for 32bit and end_pfn for 64bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 11:35:53 +02:00
Alexey Starikovskiy
aafbdf71f1 x86: fix mpparse/acpi interaction
Sitsofe Wheeler reported boot problems on linux-next.

It looks like the same issue as found by Soeren Sandmann in 7575217f656a93,
"x86: initialize all fields of mp_irqs[mp_irq_entries]".

But his fix is also not complete, as dstapic is used before it is assigned.

Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Bisected-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:13 +02:00
Soeren Sandmann
59f4519ad7 x86: initialize all fields of mp_irqs[mp_irq_entries]
Commit "x86: make config_irqsrc not MPspec specific" introduced some uses
of uninitialized fields in mp_config_acpi_legacy_irqs(). I need the
following patch to get sched-devel/master to boot.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:13 +02:00
Alexey Starikovskiy
2fddb6e28e x86: make config_irqsrc not MPspec specific
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:13 +02:00
Alexey Starikovskiy
ec2cd0a22e x86: make struct config_ioapic not MPspec specific
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:13 +02:00
Alexey Starikovskiy
5f8951487d x86: make mp_ioapic_routing definition local
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Alexey Starikovskiy
11113f84c7 x86: complete move ACPI from mpparse.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Alexey Starikovskiy
32c5061265 x86: move es7000_plat out of mpparse.c
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Yinghai Lu
11a62a0560 x86: cleanup print out for mptable
the new output is:

 MPTABLE: OEM ID: SUN
 MPTABLE: Product ID: 4600 M2
 MPTABLE: APIC at: 0x

instead of it all in one line with <6> and double Product ID...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:12 +02:00
Thomas Gleixner
4a139a7fde x86: include pci.h in e820_64.c
global pci_mem_start needs a declaration. include pci.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Thomas Gleixner
a91eea6df3 x86: fix shadow variables of global end_pfn in e820_64.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Thomas Gleixner
7f028bc0fd x86: move mp_ioapic_routing to mpparse and make it static
mpparse is the only user of mp_ioapic_routing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:12 +02:00
Yinghai Lu
ba5b14cc03 x86: extend e820 early_res support 32bit - fix
use find_e820_area to find address for new RAMDISK, instead of using ram blindly

also print out low ram and bootmap info

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:12 +02:00
Yinghai Lu
a4c81cf684 x86: extend e820 early_res support 32bit
move early_res related from e820_64.c to e820.c
make ebda detection to be done in head32.c
remove smp_alloc_memory, because we have fixed trampoline address now.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

 arch/x86/kernel/e820.c              |  214 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/e820_64.c           |  196 --------------------------------
 arch/x86/kernel/head32.c            |   76 ++++++++++++
 arch/x86/kernel/setup_32.c          |  109 +++---------------
 arch/x86/kernel/smpboot.c           |   17 --
 arch/x86/kernel/trampoline.c        |    2
 arch/x86/mach-voyager/voyager_smp.c |    9 -
 include/asm-x86/e820.h              |    6 +
 include/asm-x86/e820_64.h           |    9 -
 include/asm-x86/smp.h               |    1
 10 files changed, 320 insertions(+), 319 deletions(-)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
Paul Jackson
69c9189320 x86 boot: add code to add BIOS provided EFI memory entries to kernel
Add to the kernel's boot memory map 'memmap' entries found in
the EFI memory descriptors passed in from the BIOS.

On EFI systems, up to E820MAX == 128 memory map entries can
be passed via the legacy E820 interface (limited by the size
of the 'zeropage').  These entries can be duplicated in the
EFI descriptors also passed from the BIOS, and possibly more
entries passed by the EFI interface, which does not have the
E820MAX limit on number of memory map entries.

This code doesn't worry about the likely duplicate, overlapping
or (unlikely) conflicting entries between the EFI map and the
E820 map.  It just dumps all the EFI entries into the memmap[]
array (which already has the E820 entries) and lets the existing
routine sanitize_e820_map() sort the mess out.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
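As a rough illustration of why 69c9189320 above can simply append the EFI entries and leave cleanup to sanitize_e820_map(), here is a simplified sketch of that kind of sanitizing pass. It is not the kernel routine: it assumes entries are (start, size, type) tuples and that overlaps are resolved in favor of the highest-numbered type, and the name `sanitize_map` is made up for this example.

```python
def sanitize_map(entries):
    """Return a sorted, non-overlapping copy of a memory map.

    entries: iterable of (start, size, type) tuples.
    Assumption for this sketch: where entries overlap, the
    highest-numbered type wins.
    """
    # Every start and end address is a point where the map may change.
    points = sorted({p for start, size, _ in entries if size > 0
                     for p in (start, start + size)})
    result = []
    for lo, hi in zip(points, points[1:]):
        # Types of all entries covering the interval [lo, hi).
        types = [t for start, size, t in entries
                 if size > 0 and start <= lo and hi <= start + size]
        if not types:
            continue  # hole: no entry describes this range
        winner = max(types)
        if result and result[-1][2] == winner \
                and result[-1][0] + result[-1][1] == lo:
            # Same type and adjacent: extend the previous entry.
            prev_start, prev_size, _ = result.pop()
            result.append((prev_start, prev_size + (hi - lo), winner))
        else:
            result.append((lo, hi - lo, winner))
    return result


# An E820-style entry plus a duplicated, overlapping EFI-style entry.
combined = [(0, 100, 1), (50, 100, 2), (50, 100, 2), (300, 10, 1)]
print(sanitize_map(combined))
```

Duplicate entries collapse into one, and the overlapping range is reported once with the higher type.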
Paul Jackson
5b7eb2e9ef x86 boot: longer comment explaining sanitize_e820_map routine
Elaborate on the comment for sanitize_e820_map(), explaining more what
it does, what it inputs, and what it returns.  Rearrange the placement of
this comment to fit kernel conventions, before the routine's code rather
than buried inside it.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
Paul Jackson
6e9bcc796b x86 boot: change sanitize_e820_map parameter from byte to int to allow bigger memory maps
The map size counter passed into, and back out of, sanitize_e820_map(),
was an eight bit type (char or u8), as derived from its origins in
legacy BIOS E820 structures.  This patch changes that type to an 'int',
to allow this sanitize routine to also be used on larger maps (larger
than the 256 count that fits in a char).  The legacy BIOS E820 interface
of course does not change; that remains at 8 bits for this count, holding
up to E820MAX == 128 entries.  But the kernel internals can handle more
when those additional memory map entries are passed from the BIOS via
EFI interfaces.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
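The motivation for 6e9bcc796b above can be seen by emulating an eight-bit counter. This is only an illustration of unsigned 8-bit wraparound; the helper name is hypothetical.

```python
def store_count_u8(count):
    """Emulate storing an entry count in an unsigned 8-bit field."""
    return count & 0xFF


# 128 entries (E820MAX) fit; a larger map silently wraps around.
print(store_count_u8(128))
print(store_count_u8(300))
```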
Paul Jackson
028b785888 x86 boot: extend some internal memory map arrays to handle larger EFI input
Extend internal boot time memory tables to allow for up to
three entries per node, which may be larger than the 128 E820MAX
entries handled by the legacy BIOS E820 interface.  The EFI
interface, if present, is capable of passing memory map
entries for these larger node counts.

This patch requires an earlier patch that rewrote code depending
on these array sizes from using E820MAX explicitly to size loops,
to instead using ARRAY_SIZE() of the applicable array.

Another patch following this one will provide the code to pick
up additional memory entries passed via the EFI interface from
the BIOS and insert them in the following, now enlarged, arrays.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
Paul Jackson
c3965bd151 x86 boot: proper use of ARRAY_SIZE instead of repeated E820MAX constant
This patch is motivated by a subsequent patch which will allow for more
memory map entries on EFI supported systems than can be passed via the x86
legacy BIOS E820 interface.  The legacy interface is limited to E820MAX ==
128 memory entries, and that "E820MAX" manifest constant was used as the
size for several arrays and loops over those arrays.

The primary change in this patch is to change code loop sizes over those
arrays from using the constant E820MAX, to using the ARRAY_SIZE() macro
evaluated for the array being looped.  That way, a subsequent patch can
change the size of some of these arrays, without breaking this code.

This patch also adds a parameter to the sanitize_e820_map() routine,
which had an implicit size for the array passed it of E820MAX entries.
This new parameter explicitly passes the size of said array.  Once again,
this will allow a subsequent patch to change that array size for some
calls to sanitize_e820_map() without breaking the code.

As part of enhancing the sanitize_e820_map() interface this way, I further
combined the unnecessarily distinct x86_32 and x86_64 declarations for
this routine into a single, commonly used, declaration.

This patch in itself should make no difference to the resulting kernel
binary.

[ mingo@elte.hu: merged to -tip ]

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
Paul Jackson
b25e31cec7 x86 boot: minor code format fixes in e820 and efi routines
Standardize a few pointer declarations to not have the
extra space after the '*' character.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:11 +02:00
Paul Jackson
e9197bf011 x86 boot: remove some unused extern function declarations
Remove three extern declarations for routines
that don't exist.  Fix a typo in a comment.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:10 +02:00
Yinghai Lu
3f03c54a34 x86: mtrr cleanup for converting continuous to discrete layout - fix #2
disable the noisy print out.
also use the one the less spare mtrr reg.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:10 +02:00
Yinghai Lu
8004dd965b x86: amd opteron TOM2 mask val fix
there is a typo in the mask value, need to remove that extra 0,
to avoid 4bit clearing.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:10 +02:00
Yinghai Lu
b79cd8f126 x86: make e820.c to have common functions
remove the duplicated copy of these functions.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:10 +02:00
Yinghai Lu
833e78bfee x86: process fam 10h like k8 with fixed mtrr setting
otherwise fixed MTRR for family 10h may not be changed.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:10 +02:00
Yinghai Lu
12031a624a x86: mtrr cleanup for converting continuous to discrete - auto detect v4
Loop through mtrr chunk_size and gran_size from 1M to 2G to find out
the optimal value so user does not need to add mtrr_chunk_size and
mtrr_gran_size to the kernel command line.

If optimal value is not found, print out the whole list to help select a less
optimal value.

Add mtrr_spare_reg_nr= so user could set 2 instead of 1, if the card
needs more entries.

v2: find the one with more spare entries
v3: fix hole_basek offset
v4: tight the compare between range and range_new
    loop stop with 4g

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Gabriel C <nix.or.die@googlemail.com>
Cc: Mika Fischer <mika.fischer@zoopnet.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:10 +02:00
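The search described in 12031a624a above can be sketched as a loop over power-of-two sizes from 1M to 2G. This is a hedged outline, not the kernel code: the `fits` callback stands in for whatever check decides that a layout is acceptable, and trying only chunk sizes that are at least the granularity is an assumption of this sketch.

```python
MIB = 1 << 20
GIB = 1 << 30


def power_of_two_sizes(lo=1 * MIB, hi=2 * GIB):
    """Yield lo, 2*lo, 4*lo, ... up to and including hi."""
    size = lo
    while size <= hi:
        yield size
        size <<= 1


def find_sizes(fits):
    """Return the first (chunk_size, gran_size) accepted by fits, or None.

    fits is a hypothetical callback taking (chunk_size, gran_size).
    """
    for gran in power_of_two_sizes():
        # Assumption: a chunk is never smaller than the granularity.
        for chunk in power_of_two_sizes(lo=gran):
            if fits(chunk, gran):
                return chunk, gran
    return None


# Example: pretend only chunk=256M with gran=64M is acceptable.
print(find_sizes(lambda chunk, gran: chunk == 256 * MIB and gran == 64 * MIB))
```

When `find_sizes` returns None, a caller could fall back to listing every candidate pair, which corresponds to the "print out all list" behaviour the message mentions.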
Yinghai Lu
f5098d62c1 x86: mtrr cleanup for converting continuous to discrete layout v8 - fix
v9: address format change requests by Ingo
    more case handling in range_to_var_with_hole

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:10 +02:00
Yinghai Lu
8a374026c2 x86: fix trimming e820 with MTRR holes. - fix
v2: process hole then end_pfn
    fix update_memory_range with whole cover comparing

Signed-off-by: Yinghai Lu <yinghai.lu@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:10 +02:00
Yinghai Lu
42651f1582 x86: fix trimming e820 with MTRR holes.
converting MTRR layout from continuous to discrete, some time could run out of
MTRRs. So add gran_sizek to prevent that by dumping small RAM piece less than
gran_sizek.

previous trimming only can handle highest_pfn from mtrr to end_pfn from e820.
when have more than 4g RAM installed, there will be holes below 4g. so need to
check ram below 4g is covered well.

need to be applied after
	[PATCH] x86: mtrr cleanup for converting continuous to discrete layout v7

Signed-off-by: Yinghai Lu <yinghai.lu@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:09 +02:00
Yinghai Lu
95ffa2438d x86: mtrr cleanup for converting continuous to discrete layout, v8
some BIOS like to use continuous MTRR layout, and X driver can not add
WB entries for graphical cards when 4g or more RAM installed.

the patch will change MTRR to discrete.

mtrr_chunk_size= could be used to have smaller continuous block to hold holes.
default is 256m, could be set according to size of graphics card memory.

mtrr_gran_size= could be used to send smallest mtrr block to avoid running out of MTRRs

v2: fix -1 for UC checking
v3: default to disable, and need use enable_mtrr_cleanup to enable this feature
    skip the var state change warning.
    remove next_basek in range_to_mtrr()
v4: correct warning mask.
v5: CONFIG_MTRR_SANITIZER
v6: fix 1g, 2g, 512 alignment with extra hole
v7: gran_sizek to prevent running out of MTRRs.
v8: fix hole_basek calculation caused when removing next_basek
    gran_sizek using when basek is 0.

need to apply
	[PATCH] x86: fix trimming e820 with MTRR holes.
right after this one.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:09 +02:00
Alexander van Heukelum
0dbfafa5fc x86: move i386 memory setup code to e820_32.c
The x86_64 code has centralized the memory setup code in
e820_64.c. This patch copies that approach to i386:

- early_param("mem", ...) parsing is moved from
setup_32.c to e820_32.c.

- setup_memory_map() and finish_e820_parsing() are
factored out from setup_arch(), and declarations
are added to e820_32.h.

- print_memory_map() is made static and removed from
e820_32.h.

- user_defined_memmap is marked as __initdata.

Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 10:55:09 +02:00
Thomas Gleixner
0da72a4aeb x86: fix sparse warning in mtrr/generic.c
arch/x86/kernel/cpu/mtrr/generic.c:216:12: warning: symbol 'lo' shadows an earlier one

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-25 10:55:09 +02:00
Linus Torvalds
eb90d81d03 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: prevent PGE flush from interruption/preemption
  x86: use explicit copy in vdso_gettimeofday()
  namespacecheck: automated fixes
  x86/xen: fix arbitrary_virt_to_machine()
  x86: don't read maxlvt before checking if APIC is mapped
  x86: disable TSC for sched_clock() when calibration failed
  x86: distangle user disabled TSC from unstable
  x86: fix setup of cyc2ns in tsc_64.c
2008-05-24 10:20:00 -07:00
Linus Torvalds
e6b027a398 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] clarify license of freq_table.c
  [CPUFREQ] Remove documentation of removed ondemand tunable.
  [CPUFREQ] Crusoe: longrun cpufreq module reports false min freq
  [CPUFREQ] powernow-k8: improve error messages
2008-05-23 09:24:52 -07:00
Harvey Harrison
7fafd91d85 x86: fix integer as NULL pointer warning
arch/x86/boot/printf.c:59:10: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-23 08:11:06 -07:00
Andi Kleen
a1289643ad x86: use explicit copy in vdso_gettimeofday()
Jeremy's gcc 3.4 seems to be unable to inline an 8 byte memcpy.  But the
vdso doesn't support external references.  Copy the structure members
of struct timezone explicitly instead.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 14:08:06 +02:00
Ingo Molnar
2ddfd20e7c namespacecheck: automated fixes
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-23 14:08:06 +02:00
Jan Beulich
de067814d6 x86/xen: fix arbitrary_virt_to_machine()
While I realize that the function isn't currently being used, I still
think an obvious mistake like this should be corrected.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 14:08:06 +02:00
Chuck Ebbert
2584a82dee x86: don't read maxlvt before checking if APIC is mapped
A check for unmapped apic was added before reading maxlvt but the early
read of maxlvt wasn't removed.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
2008-05-23 14:08:06 +02:00
Thomas Gleixner
74dc51a3de x86: disable TSC for sched_clock() when calibration failed
When the TSC calibration fails then TSC is still used in
sched_clock(). Disable it completely in that case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
2008-05-23 14:08:06 +02:00
Thomas Gleixner
9ccc906c97 x86: distangle user disabled TSC from unstable
tsc_enabled is set to 0 from the command line switch "notsc" and from
the mark_tsc_unstable code. Separate those functionalities and replace
tsc_enable with tsc_disable. This also makes the native_sched_clock()
decision when to use TSC understandable.

Preparatory patch to solve the sched_clock() issue on 32 bit.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-23 14:08:06 +02:00
Thomas Gleixner
b6db80ee13 x86: fix setup of cyc2ns in tsc_64.c
When the TSC is calibrated against the PIT due to the nonavailability
of PMTIMER/HPET or due to SMI interference then the setup of the per
CPU cyc2ns variables is skipped. This is unlikely to happen but it
would definitely render sched_clock() unusable.

This was introduced with commit 53d517cdba

    x86: scale cyc_2_nsec according to CPU frequency

Update the per CPU cyc2ns variables in all exit paths of tsc_calibrate.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
2008-05-23 14:08:06 +02:00
Tony Camuso
a167607255 PCI: Correct last two HP entries in the bfsort whitelist
Greetings.

There is a code flaw in the bfsort whitelist, where there are redundant
entries for the same two HP systems, DL385 G2 and DL585 G2. This patch
replaces those redundant entries with the correct ones. The correct
entries are for large-volume systems, the DL360 and DL380.

-----------------------------------------------------------------------

commit ec69f0374c3b0ad7ea991b0e9ac00377acfe5b1a
Author: Tony Camuso <tony.camuso@hp.com>
Date:   Wed May 14 07:09:28 2008 -0400

     Replace Redundant Whitelist Entries with the Correct Ones

     The ProLiant DL385 G2 and the DL585 G2 are entered redundantly
     in the dmi_system_id table. What should have been there are the
     DL360 and DL380. This patch simply replaces the redundant
     entries with the correct entries.

 arch/x86/pci/common.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

     Signed-off-by: Tony Camuso <tony.camuso@hp.com>
     Signed-off-by: Pat Schoeller <patrick.schoeller@hp.com>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-22 18:16:24 +02:00
Linus Torvalds
737b0fbf44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: correct mailing list address
  PCI: Correct last two HP entries in the bfsort whitelist
2008-05-20 10:55:04 -07:00
Linus Torvalds
e23a5f6687 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  [PATCH] return to old errno choice in mkdir() et.al.
  [Patch] fs/binfmt_elf.c: fix wrong return values
  [PATCH] get rid of leak in compat_execve()
  [Patch] fs/binfmt_elf.c: fix a wrong free
  [PATCH] avoid multiplication overflows and signedness issues for max_fds
  [PATCH] dup_fd() part 4 - race fix
  [PATCH] dup_fd() - part 3
  [PATCH] dup_fd() part 2
  [PATCH] dup_fd() fixes, part 1
  [PATCH] take init_files to fs/file.c
2008-05-19 16:37:45 -07:00
maximilian attems
667ad4f701 [CPUFREQ] Crusoe: longrun cpufreq module reports false min freq
The longrun cpufreq module reports a false minimum frequency 3MHz on
300-600MHz Crusoe processor.  This may be due to a calculation bug
in the module.

Original patch from Kaz Sasayama <kazssym@hypercore.co.jp>
submitted as http://bugs.debian.org/468149 patch ported to x86

Cc: Kaz Sasayama <kazssym@hypercore.co.jp>
Signed-off-by: maximilian attems <max@stro.at>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-05-19 18:17:28 -04:00
Mark Langsdorf
eba9fe93a2 [CPUFREQ] powernow-k8: improve error messages
The most common error with powernow-k8 is an ACPI _PSS error
caused either by failure to load the ACPI processor module
or a bad parse of the _PSS object.  Make the error message
returned to the user in these situations more straightforward
and easier to understand.

-Mark Langsdorf
Operating System Research Center
AMD

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-05-19 18:17:27 -04:00
Linus Torvalds
88d53766bd Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: LAPIC: ignore pending timers if LVTT is disabled
  KVM: Update MAINTAINERS for new mailing lists
  KVM: Fix kvm_vcpu_block() task state race
  KVM: ia64: Set KVM_IOAPIC_NUM_PINS to 48
  KVM: ia64: fix GVMM module including position-dependent objects
  KVM: ia64: Define new kvm_fpreg structure to replace ia64_fpreg
  KVM: PIT: take inject_pending into account when emulating hlt
  s390: KVM guest: fix compile error
  KVM: x86 emulator: fix writes to registers with modrm encodings
2008-05-19 13:53:21 -07:00
Tony Camuso
8d64c781f0 PCI: Correct last two HP entries in the bfsort whitelist
Replace Redundant Whitelist Entries with the Correct Ones

The ProLiant DL385 G2 and the DL585 G2 are entered redundantly in the
dmi_system_id table. What should have been there are the DL360 and DL380. This
patch simply replaces the redundant entries with the correct entries.

Signed-off-by: Tony Camuso <tony.camuso@hp.com>
Signed-off-by: Pat Schoeller <patrick.schoeller@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-19 12:21:36 -07:00
Marcelo Tosatti
54aaacee35 KVM: LAPIC: ignore pending timers if LVTT is disabled
Only use the APIC pending timers count to break out of HLT emulation if
the timer vector is enabled.

Certain configurations of Windows simply mask out the vector without
disabling the timer.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:39:39 +03:00
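The condition described in 54aaacee35 can be sketched in user-space C. This is only an illustration, not the KVM code: `lvtt_enabled()` and `timer_should_wake()` are hypothetical names, and the only hardware fact relied on is that bit 16 of a local APIC LVT entry is its mask bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 16 of a local APIC LVT entry is the mask bit. */
#define LVT_MASKED (1u << 16)

/* The timer vector counts as enabled when the mask bit is clear. */
static bool lvtt_enabled(uint32_t lvtt)
{
    return (lvtt & LVT_MASKED) == 0;
}

/*
 * Hypothetical helper: pending timer ticks should end HLT emulation
 * only if the timer interrupt could actually be delivered.
 */
static bool timer_should_wake(uint32_t lvtt, int pending)
{
    return pending > 0 && lvtt_enabled(lvtt);
}
```

With a masked LVTT value such as 0x000100EF, `timer_should_wake()` reports false even when ticks are pending, which mirrors the Windows configuration mentioned in the message.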
Marcelo Tosatti
eedaa4e2af KVM: PIT: take inject_pending into account when emulating hlt
Otherwise hlt emulation fails if PIT is not injecting IRQ's.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:34:15 +03:00
Avi Kivity
107d6d2efa KVM: x86 emulator: fix writes to registers with modrm encodings
A register destination encoded with a mod=3 encoding left dst.ptr NULL.
Normally we don't trap writes to registers, but in the case of smsw, we do.

Fix by pointing dst.ptr at the destination register.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-18 14:34:14 +03:00
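For readers unfamiliar with the "mod=3 encoding" mentioned in 107d6d2efa, the following is a minimal sketch of ModRM decoding. The struct and function names are illustrative assumptions and are not taken from the KVM emulator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
};

/* Split a raw ModRM byte into its three fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;

    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* mod == 3 selects a register operand; any other value selects memory. */
static bool modrm_is_register(struct modrm m)
{
    return m.mod == 3;
}
```

For example, the byte 0xE0 decodes to mod=3, reg=4, rm=0, which is the register form of a /4 instruction such as smsw. This is the case in which the destination is a register rather than memory.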
Thomas Gleixner
e9623b3559 x86: disable mwait for AMD family 10H/11H CPUs
The previous revert of 0c07ee38c9 left
out the mwait disable condition for AMD family 10H/11H CPUs.

Andreas Herrmann said:

It depends on the CPU. For AMD CPUs that support MWAIT this is wrong.
Family 0x10 and 0x11 CPUs will enter C1 on HLT. Powersavings then
depend on a clock divisor and current Pstate of the core.

If all cores of a processor are in halt state (C1) the processor can
enter the C1E (C1 enhanced) state. If mwait is used this will never
happen.

Thus HLT saves more power than MWAIT here.

It might be best to switch off the mwait flag for these AMD CPU
families like it was introduced with commit
f039b75471 (x86: Don't use MWAIT on AMD
Family 10)

Re-add the AMD families 10H/11H check and disable the mwait usage for
those.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-17 22:57:20 +02:00
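The family check discussed in e9623b3559 can be illustrated with a small sketch. The family computation follows the standard CPUID leaf 1 layout (base family in EAX bits 11-8, extended family in bits 27-20, added when the base family is 0xF). The policy function `prefer_hlt_over_mwait()` is a hypothetical name and not the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute the display family from CPUID leaf 1 EAX. */
static unsigned int cpuid_family(uint32_t eax)
{
    unsigned int base = (eax >> 8) & 0xF;
    unsigned int ext  = (eax >> 20) & 0xFF;

    /* The extended family is only added when the base family is 0xF. */
    return base == 0xF ? base + ext : base;
}

/*
 * Hypothetical policy: on AMD families 0x10 and 0x11, HLT allows the
 * package to reach C1E, so MWAIT should not be used for idle.
 */
static bool prefer_hlt_over_mwait(bool is_amd, unsigned int family)
{
    return is_amd && (family == 0x10 || family == 0x11);
}
```

An EAX value of 0x00100F22 has base family 0xF and extended family 0x01, so it yields family 0x10 and would select HLT under this policy.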
Avi Kivity
31f4d870b0 x86: fix crash on cpu hotplug on pat-incapable machines
pat_disable() is __init, which means it goes away after booting is complete.
Unfortunately it is used by the hotplug code if the machine is not
pat-capable, causing a crash.

Fix by marking pat_disable() as __cpuinit.

Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-17 22:57:20 +02:00
Ingo Molnar
a738d897b7 x86: remove mwait capability C-state check
Vegard Nossum reports:

| powertop shows between 200-400 wakeups/second with the description
| "<kernel IPI>: Rescheduling interrupts" when all processors have load (e.g.
| I need to run two busy-loops on my 2-CPU system for this to show up).
|
| The bisect resulted in this commit:
|
| commit 0c07ee38c9
| Date:   Wed Jan 30 13:33:16 2008 +0100
|
|     x86: use the correct cpuid method to detect MWAIT support for C states

remove the functional effects of this patch and make mwait unconditional.

A future patch will turn off mwait on specific CPUs where that causes
power to be wasted.

Bisected-by: Vegard Nossum <vegard.nossum@gmail.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-17 22:57:20 +02:00
Al Viro
f52111b154 [PATCH] take init_files to fs/file.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-05-16 17:22:20 -04:00
Linus Torvalds
4ef7e3e90f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: user_regset_view table fix for ia32 on 64-bit
  x86: arch/x86/mm/pat.c - fix warning
  x86: fix csum_partial() export
  x86: early_init_centaur(): use set_cpu_cap()
  x86: fix app crashes after SMP resume
  x86: wakeup.lds.S - section ordering fix
  x86: [VOYAGER] fix duplicate phys_cpu_present_map symbol
  x86/pci: fix broken ISA DMA
2008-05-13 12:33:56 -07:00
Roland McGrath
1f465f4e47 x86: user_regset_view table fix for ia32 on 64-bit
The user_regset_view table for the 32-bit regsets on the 64-bit build had
the wrong sizes for the FP regsets.  This bug had no user-visible effect
(just on kernel modules using the user_regset interfaces and the like).
But the fix is trivial and risk-free.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:40:20 +02:00
Pranith Kumar
afc8534380 x86: arch/x86/mm/pat.c - fix warning
fix this warning:

 arch/x86/mm/pat.c: In function `phys_mem_access_prot_allowed':
 arch/x86/mm/pat.c:558: warning: long long unsigned int format, long
 unsigned int arg (arg 6)
 arch/x86/mm/pat.c: In function `map_devmem':
 arch/x86/mm/pat.c:580: warning: long long unsigned int format, long
 unsigned int arg (arg 6)

Signed-off-by: D Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:39:30 +02:00
Ingo Molnar
89804c022f x86: fix csum_partial() export
Fix this symbol export problem:

    Building modules, stage 2.
    MODPOST 193 modules
    ERROR: "csum_partial" [fs/reiserfs/reiserfs.ko] undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

This is due to a known weakness of symbol exports: if a symbol's
only in-core user is an EXPORT_SYMBOL from a lib-y section, the
symbol is not linked in.

The solution is to move the export to x8664_ksyms_64.c - but the real
solution would be to fix kbuild.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:38:47 +02:00
Andrew Morton
8c45a4e4f2 x86: early_init_centaur(): use set_cpu_cap()
arch/x86/kernel/setup_64.c:954: warning: passing argument 2 of 'set_bit' from incompatible pointer type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:37:38 +02:00
Hugh Dickins
61165d7a03 x86: fix app crashes after SMP resume
After resume on a 2cpu laptop, kernel builds collapse with a sed hang,
sh or make segfault (often on 20295564), real-time signal to cc1 etc.

Several hurdles to jump, but a manually-assisted bisect led to -rc1's
d2bcbad5f3 x86: do not zap_low_mappings
in __smp_prepare_cpus.  Though the low mappings were removed at bootup,
they were left behind (with Global flags helping to keep them in TLB)
after resume or cpu online, causing the crashes seen.

Reinstate zap_low_mappings (with local __flush_tlb_all) for each cpu_up
on x86_32.  This used to be serialized by smp_commenced_mask: that's now
gone, but a low_mappings flag will do.  No need for native_smp_cpus_done
to repeat the zap: let mem_init zap BSP's low mappings just like on UP.

(In passing, fix error code from native_cpu_up: do_boot_cpu returns a
variety of diagnostic values, Dprintk what it says but convert to -EIO.
And save_pg_dir separately before zap_low_mappings: doesn't matter now,
but zapping twice in succession wiped out resume's swsusp_pg_dir.)

That worked well on the duo and one quad, but wouldn't boot 3rd or 4th
cpu on P4 Xeon, oopsing just after unlock_ipi_call_lock.  The TLB flush
IPI now being sent reveals a long-standing bug: the booting cpu has its
APIC readied in smp_callin at the top of start_secondary, but isn't put
into the cpu_online_map until just before that unlock_ipi_call_lock.

So native_smp_call_function_mask to online cpus would send_IPI_allbutself,
including the cpu just coming up, though it has been excluded from the
count to wait for: by the time it handles the IPI, the call data on
native_smp_call_function_mask's stack may well have been overwritten.

So fall back to send_IPI_mask while cpu_online_map does not match
cpu_callout_map: perhaps there's a better APICological fix to be
made at the start_secondary end, but I wouldn't know that.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-13 19:36:12 +02:00
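The fallback rule at the end of 61165d7a03 can be restated as a toy decision function. The enum and function below are hypothetical and model CPU maps as plain 64-bit masks. The real code works on cpumask types and also has to consider which CPUs the caller asked for.

```c
#include <stdint.h>

enum ipi_method {
    IPI_ALLBUTSELF_SHORTHAND, /* broadcast to every CPU except the sender */
    IPI_EXPLICIT_MASK         /* send only to the CPUs named in a mask */
};

/*
 * While a CPU has been called out but is not yet online, the two maps
 * differ. A broadcast would then reach a CPU that is not counted among
 * the waiters, so an explicit mask is the safer choice.
 */
static enum ipi_method pick_ipi_method(uint64_t online_map,
                                       uint64_t callout_map)
{
    return online_map == callout_map ? IPI_ALLBUTSELF_SHORTHAND
                                     : IPI_EXPLICIT_MASK;
}
```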
Venki Pallipadi
77db988564 x86/PCI: X86_PAT & mprotect
Some versions of X used the mprotect workaround to change caching type from UC
to WB, so that it can then use mtrr to program WC for that region [1].  Change
the mmap of pci space through /sys or /proc interfaces from UC to UC_MINUS.
With this change, X will not need to use mprotect workaround to get WC type
since the MTRR mapping type will be honored.

The bug in mprotect that clobbers PAT bits is fixed in a follow on patch. So,
this X workaround will stop working as well.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-13 09:51:54 -07:00
Takashi Iwai
4a367f3a9d x86/PCI: fix broken ISA DMA
Rene Herman reported:

> commit 8779f2fc3b
>
> "x86: don't try to allocate from DMA zone at first"
>
> breaks all of ISA DMA. Or all of ALSA ISA DMA at least. All
> ISA soundcards are silent following that commit -- no error
> messages, everything appears fine, just silence.

That patch is buggy. We had an implicit assumption that
dev = NULL for ISA devices that require 24bit DMA.

The recent work on x86 dma_alloc_coherent() breaks the ISA DMA buffer
allocation, which is represented by "dev = NULL" and requires 24bit
DMA implicitly.

Bisected-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-13 09:51:53 -07:00
Cyrill Gorcunov
8c6b0ef2ea x86: wakeup.lds.S - section ordering fix
To allow the linker to catch overlapping sections, we have to declare
them in the appropriate order.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-12 21:27:51 +02:00
James Bottomley
f8955ebe3e x86: [VOYAGER] fix duplicate phys_cpu_present_map symbol
The phys_cpu_present_map is an expected symbol in the SMP harness.
Unfortunately, x86 recently moved this and a few others to
kernel/setup.c where it doesn't quite work because voyager has to
define its own.  Use CONFIG_X86_LOCAL_APIC to isolate these
definitions and fix up another area in setup.c where CONFIG_X86_SMP
should be used instead of CONFIG_SMP.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: toralf.foerster@gmx.de
Cc: Mike Travis <travis@sgi.com>
Cc: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-12 21:27:51 +02:00
Takashi Iwai
8965eb1938 x86/pci: fix broken ISA DMA
Rene Herman reported:

> commit 8779f2fc3b
>
> "x86: don't try to allocate from DMA zone at first"
>
> breaks all of ISA DMA. Or all of ALSA ISA DMA at least. All
> ISA soundcards are silent following that commit -- no error
> messages, everything appears fine, just silence.

That patch is buggy. We had an implicit assumption that
dev = NULL for ISA devices that require 24bit DMA.

The recent work on x86 dma_alloc_coherent() breaks the ISA DMA buffer
allocation, which is represented by "dev = NULL" and requires 24bit
DMA implicitly.

Bisected-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-12 21:27:50 +02:00
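The implicit convention that 8965eb1938 restores ("dev = NULL" means an ISA device needing 24-bit DMA) can be sketched as follows. `struct toy_device` and both helpers are assumptions made for illustration and do not reproduce the kernel's dma_alloc_coherent().

```c
#include <stddef.h>
#include <stdint.h>

#define ISA_DMA_MASK 0x00FFFFFFull /* 24-bit addressing, i.e. the low 16 MB */

/* Hypothetical stand-in for a device with a coherent DMA mask. */
struct toy_device {
    uint64_t coherent_dma_mask;
};

/* A NULL device is treated as an ISA device limited to 24-bit DMA. */
static uint64_t effective_dma_mask(const struct toy_device *dev)
{
    if (dev == NULL)
        return ISA_DMA_MASK;
    return dev->coherent_dma_mask;
}

/* The buffer must come from the DMA zone when the mask is 24-bit or less. */
static int needs_dma_zone(const struct toy_device *dev)
{
    return effective_dma_mask(dev) <= ISA_DMA_MASK;
}
```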
Linus Torvalds
3e1b83ab39 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: rdc: leds build/config fix
  x86: sysfs cpu?/topology is empty in 2.6.25 (32-bit Intel system)
  x86: revert commit 709f744 ("x86: bitops asm constraint fixes")
  x86: restrict keyboard io ports reservation to make ipmi driver work
  x86: fix fpu restore from sig return
  x86: remove spew print out about bus to node mapping
  x86: revert printk format warning change which is for linux-next
  x86: cleanup PAT cpu validation
  x86: geode: define geode_has_vsa2() even if CONFIG_MGEODE_LX is not set
  x86: GEODE: cache results from geode_has_vsa2() and uninline
  x86: revert geode config dependency
2008-05-10 21:10:48 -07:00
Ingo Molnar
82fd866701 x86: rdc: leds build/config fix
select NEW_LEDS for now until the Kconfig dependencies have been
fixed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Helge Wagner
9096bd7a66 x86: restrict keyboard io ports reservation to make ipmi driver work
On some of our (single board computer) boards (x86) we are using an
IPMI controller that uses I/O ports 0x62 and 0x66 for a KCS (keyboard
controller style) IPMI system interface.

Trying to load the openipmi driver fails, because the ports
(0x62/0x66) are reserved for keyboard. keyboard reserves the full
range 0x60-0x6F while it doesn't need to.

Reserve only ports 0x60 and 0x64 for the legacy PS/2 i8042 keyboard
controller instead of 0x60-0x6F to allow the openipmi driver to work.

[ tglx: added 64bit fixup ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-10 19:31:45 +02:00
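The effect of 9096bd7a66 on the IPMI ports can be checked with a small model of the reservations. The tables and the `port_reserved()` helper are illustrative assumptions. The kernel itself tracks these ranges through its resource tree.

```c
#include <stdbool.h>
#include <stddef.h>

/* An inclusive range of I/O ports. */
struct port_range {
    unsigned int start;
    unsigned int end;
};

/* Old reservation: the whole 0x60-0x6F block. */
static const struct port_range kbd_old[] = { { 0x60, 0x6F } };

/* New reservation: only the two ports the i8042 controller uses. */
static const struct port_range kbd_new[] = { { 0x60, 0x60 }, { 0x64, 0x64 } };

/* Return true if the port falls inside any reserved range. */
static bool port_reserved(const struct port_range *ranges, size_t count,
                          unsigned int port)
{
    for (size_t i = 0; i < count; i++) {
        if (port >= ranges[i].start && port <= ranges[i].end)
            return true;
    }
    return false;
}
```

With the old table, ports 0x62 and 0x66 are reported as reserved. With the new table they are free for the KCS interface, while 0x60 and 0x64 remain claimed by the keyboard.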
Suresh Siddha
fd3c3ed5d1 x86: fix fpu restore from sig return
If the task never used fpu, initialize the fpu before restoring the FP
state from the signal handler context. This will allocate the fpu
state, if the task never needed it before.

Reported-and-bisected-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Frederik Deweerdt <deweerdt@free.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Yinghai Lu
0646153921 x86: remove spew print out about bus to node mapping
Jeff Garzik pointed out that this printout is not needed.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:45 +02:00
Thomas Gleixner
5ecddcebfb x86: revert printk format warning change which is for linux-next
commit 62179849b4
    x86: fix setup printk format warning

is for linux-next and not for .26

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-10 19:31:44 +02:00
Linus Torvalds
8d53910856 Revert "PCI: remove default PCI expansion ROM memory allocation"
This reverts commit 9f8daccaa0, which was
reported to break X startup (xf86-video-ati-6.8.0). See

	http://bugs.freedesktop.org/show_bug.cgi?id=15523

for details.

Reported-by: Laurence Withers <l@lwithers.me.uk>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08 19:02:55 -07:00
Linus Torvalds
f589274533 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  [ALSA] soc at91 minor bug fixes
  [ALSA] soc - at91-pcm - Fix line wrapping
  pcspkr: fix dependancies
2008-05-08 10:58:45 -07:00
Thomas Gleixner
8d4a430085 x86: cleanup PAT cpu validation
Move the scattered checks for PAT support to a single function. It's
moved to addon_cpuid_features.c as this file is shared between 32 and
64 bit.

Remove the manipulation of the PAT feature bit and just disable PAT in
the PAT layer, based on the PAT bit provided by the CPU and the
current CPU version/model white list.

Change the boot CPU check so it works on Voyager somewhere in the
future as well :) Also panic, when a secondary has PAT disabled but
the primary one has already switched to PAT. We have no way to undo
that.

The white list is kept for now to ensure that we can rely on known to
work CPU types and concentrate on the software induced problems
instead of fighting CPU errata and subtle wreckage caused by not yet
verified CPUs. Once the PAT code has stabilized enough, we can remove
the white list and open the can of worms.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08 15:43:51 +02:00
Andres Salomon
547acec7ec x86: GEODE: cache results from geode_has_vsa2() and uninline
This moves geode_has_vsa2 into a .c file, caches the result we get from
the VSA virtual registers, and causes the function to no longer be inline.

[akpm@linux-foundation.org: cleanup]

Signed-off-by: Andres Salomon <dilinger@debian.org>
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-08 15:43:50 +02:00
Thomas Gleixner
ac44cc96fb x86: revert geode config dependency
commit e26a28d190
    x86: olpc build fix

was a fix to a patch that was withdrawn/delayed and then erroneously
committed to x86.git. Revert it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-08 15:43:50 +02:00
Stas Sergeev
e5e1d3cb20 pcspkr: fix dependancies
fix pcspkr dependencies: make the pcspkr platform
drivers depend on a platform device, and
not the other way around.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
CC: Vojtech Pavlik <vojtech@suse.cz>
CC: Michael Opdenacker <michael-lists@free-electrons.com>
[fixed for 2.6.26-rc1 by tiwai]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-05-07 12:42:03 +02:00
Hugh Dickins
aeed5fce37 x86: fix PAE pmd_bad bootup warning
Fix warning from pmd_bad() at bootup on a HIGHMEM64G HIGHPTE x86_32.

That came from 9fc34113f6 x86: debug pmd_bad();
but we understand now that the typecasting was wrong for PAE in the previous
version: pagetable pages above 4GB looked bad and stopped Arjan from booting.

And revert that cded932b75 x86: fix pmd_bad
and pud_bad to support huge pages.  It was the wrong way round: we shouldn't
weaken every pmd_bad and pud_bad check to let huge pages slip through - in
part they check that we _don't_ have a huge page where it's not expected.

Put the x86 pmd_bad() and pud_bad() definitions back to what they have long
been: they can be improved (x86_32 should use PTE_MASK, to stop PAE thinking
junk in the upper word is good; and x86_64 should follow x86_32's stricter
comparison, to stop thinking any subset of required bits is good); but that
should be a later patch.

Fix Hans' good observation that follow_page() will never find pmd_huge()
because that would have already failed the pmd_bad test: test pmd_huge in
between the pmd_none and pmd_bad tests.  Tighten x86's pmd_huge() check?
No, once it's a hugepage entry, it can get quite far from a good pmd: for
example, PROT_NONE leaves it with only ACCESSED of the KERN_PGTABLE bits.

However... though follow_page() contains this and another test for huge
pages, so it's nice to keep it working on them, where does it actually get
called on a huge page?  get_user_pages() checks is_vm_hugetlb_page(vma) to
call alternative hugetlb processing, as do unmap_vmas() and others.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Earlier-version-tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-06 13:08:58 -07:00
Linus Torvalds
bb896afe20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes:
  sched: default to n for GROUP_SCHED and FAIR_GROUP_SCHED
  sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
  sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
  sched: fix cpu clock
  sched: fair-group: fix a Div0 error of the fair group scheduler
  sched: fix missing locking in sched_domains code
  sched: make clock sync tunable by architecture code
  sched: fix debugging
  sched: fix sched_info_switch not being called according to documentation
  sched: fix hrtick_start_fair and CPU-Hotplug
  sched: fix SCHED_FAIR wake-idle logic error
  sched: fix RT task-wakeup logic
  sched: add statics, don't return void expressions
  sched: add debug checks to idle functions
  sched: remove old sched doc
  sched: make rt_sched_class, idle_sched_class static
  sched: optimize calc_delta_mine()
  sched: fix normalized sleeper
2008-05-05 17:31:14 -07:00
Ingo Molnar
a5574cf65b sched, x86: add HAVE_UNSTABLE_SCHED_CLOCK
add the HAVE_UNSTABLE_SCHED_CLOCK, for architectures to select.

the next change utilizes it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-05 23:56:18 +02:00
Linus Torvalds
108c196184 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86 PCI: call dmi_check_pciprobe()
  x86/pci: add pci=skip_isa_align command lines.
  x86/pci: remove flag in pci_cfg_space_size_ext
  x86: fix section mismatch in pci_scan_bus
2008-05-05 12:39:10 -07:00
Yinghai Lu
0df18ff366 x86 PCI: call dmi_check_pciprobe()
this change:

| commit 08f1c192c3
| Author: Muli Ben-Yehuda <muli@il.ibm.com>
| Date:   Sun Jul 22 00:23:39 2007 +0300
|
|    x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdata
|
|    This patch introduces struct pci_sysdata to x86 and x86-64, and
|    converts the existing two users (NUMA, Calgary) to use it.
|
|    This lays the groundwork for having other users of sysdata, such as
|    the PCI domains work.
|
|    The Calgary bits are tested, the NUMA bits just look ok.

replaces pcibios_scan_root with pci_scan_bus_parented...

but in pcibios_scan_root we have a DMI check:

    dmi_check_system(pciprobe_dmi_table);

when we have several peer root buses this could be called multiple
times (which is bad), so move that call to pci_access_init().

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-05 09:24:00 -07:00
Yinghai Lu
13a6ddb08e x86/pci: add pci=skip_isa_align command lines.
so we don't align the io port start address for pci cards.

also move the dmi check out of acpi.c, because it has nothing to do with acpi.
it could spare some calling when we have several peer root buses.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-05 09:22:08 -07:00
Linus Torvalds
45ea2103d8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes:
  x86: fix setup printk format warning
  x86: olpc build fix
  x86: video/fbdev.c: add MODULE_LICENSE
  x86: fix up bootparam.h for userspace inclusion
  x86: relocs ELF handling - use SELFMAG instead of numeric constant
  x86: vdso ELF handling - use SELFMAG instead of numeric constant
  x86: remove dell reboot dmi quirk board name match
  x86: es7000 build fix
  x86: make additional_cpus static
  x86: make start_secondary() static
  kbuild, suspend, x86: fix rebuild of wakeup.bin
  uml: fix gcc problem
  x86: undo visws/numaq build changes
2008-05-04 17:11:43 -07:00
Randy Dunlap
62179849b4 x86: fix setup printk format warning
Fix x86 setup printk format warning:

next-20080430/arch/x86/kernel/setup.c:172: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ssize_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:46 +02:00
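The class of warning fixed in 62179849b4 can be reproduced in user space. The sketch below shows one common remedy, the `%zd` conversion for `ssize_t`. Whether the kernel patch used this specifier or a cast is not stated in the message, so treat the choice as an assumption. `format_size()` is a hypothetical helper.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a signed size. "%zd" matches ssize_t, whereas "%lu" expects
 * unsigned long and triggers the warning quoted in the commit message.
 */
static int format_size(char *buf, size_t buflen, ssize_t value)
{
    return snprintf(buf, buflen, "size: %zd", value);
}
```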
Thomas Gleixner
e26a28d190 x86: olpc build fix
CONFIG_OLPC needs to depend on MGEODE_LX

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:46 +02:00
Adrian Bunk
7b04fa014c x86: video/fbdev.c: add MODULE_LICENSE
Add the missing MODULE_LICENSE("GPL").

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:46 +02:00
Cyrill Gorcunov
8bd1796ded x86: relocs ELF handling - use SELFMAG instead of numeric constant
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:45 +02:00
Cyrill Gorcunov
ecb783eae1 x86: vdso ELF handling - use SELFMAG instead of numeric constant
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: mingo@elte.hu
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:45 +02:00
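The two SELFMAG commits above replace a hard-coded length with the named constant. A minimal user-space sketch of the same idea, assuming a Linux system whose `<elf.h>` defines `ELFMAG` and `SELFMAG`, might look like this. `has_elf_magic()` is a hypothetical helper and not a function from relocs.c or the vdso code.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/*
 * Compare the start of e_ident against the ELF magic. SELFMAG is the
 * length of ELFMAG, so no numeric constant is needed here.
 */
static bool has_elf_magic(const unsigned char *ident)
{
    return memcmp(ident, ELFMAG, SELFMAG) == 0;
}
```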
Ben
163ea310b6 x86: remove dell reboot dmi quirk board name match
http://bugzilla.kernel.org/show_bug.cgi?id=10547

Newer Dell OptiPlex 745s hang before rebooting after 'sudo reboot'.

A patch for some versions of the OptiPlex was proposed here --
http://lkml.org/lkml/2007/6/5/59 -- and is included in 2.6.23 and
later kernels, according to
http://lxr.linux.no/linux+v2.6.23/arch/i386/kernel/reboot.c . However,
the DMI_BOARD_NAME ("0WF810") is too restrictive. Newer OptiPlex
machines have a DMI_BOARD_NAME of "0RF703".  I therefore suggest
adding another clause to reboot.c, similar to the one in the original
patch, but matching a DMI_BOARD_NAME of "0RF703".

On further inspection, it seems that there are other DMI_BOARD_NAMEs
for this same machine. They seem to change from time to time, which
means that the current code is fragile. Moreover, using bios reboot
should not break non-SFF OptiPlex 745s, and so a reasonable fix is to
simply drop the match on DMI_BOARD_NAME.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:45 +02:00
Ingo Molnar
e37ee42caa x86: es7000 build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-04 20:04:45 +02:00
Adrian Bunk
c5562faeaa x86: make additional_cpus static
This patch makes the needlessly global additional_cpus static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-04 20:04:45 +02:00
Adrian Bunk
dbe55f4797 x86: make start_secondary() static
start_secondary() needlessly became global.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-04 20:04:45 +02:00
Sam Ravnborg
4c6214c75a kbuild, suspend, x86: fix rebuild of wakeup.bin
In kernel/acpi/realmode/Makefile use the 'always'
variable to say that wakeup.bin should always
be made.

In acpi/Makefile we then do not need to specify the
requested target and we avoid the message from make:

   `arch/x86/kernel/acpi/realmode/wakeup.bin' is up to date.

Add wakeup.lds to list of targets to avoid rebuilding
wakeup.bin - from Roland McGrath.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@suse.cz>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:45 +02:00
Thomas Gleixner
48b83d2425 x86: undo visws/numaq build changes
arch/x86/pci/Makefile_32 has a nasty detail. VISWS and NUMAQ build
override the generic pci-y rules. This needs a proper cleanup, but
that needs more thoughts. Undo

commit 895d30935e
    x86: numaq fix
    do not override the existing pci-y rule when adding visws or
    numaq rules.

There is also a stupid init function ordering problem vs. acpi.o

Add comments to the Makefile to avoid tripping over this again.

Remove the srat stub code in discontig_32.c to allow a proper NUMAQ
build.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-04 20:04:45 +02:00
Glauber Costa
b8ba5f10c5 x86: KVM guest: make setup_secondary_clock definition dependent on local apic
Since the pv_apic_ops are only present if CONFIG_X86_LOCAL_APIC is compiled
in, kvmclock failed to build without this option.  This patch fixes this.

Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:45:12 +03:00
Avi Kivity
93df766322 KVM: MMU: Allow more than PAGES_PER_HPAGE write protections per large page
nonpae guests can call rmap_write_protect twice per page (for page tables)
or four times per page (for page directories), triggering a bogus warning.

Remove the warning.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:49 +03:00
Andrea Arcangeli
bc1a34f1bf KVM: avoid fx_init() schedule in atomic
This makes sure not to schedule in atomic context during fx_init. I also
changed the name of fpu_init to fx_finit to avoid clashing with the
fpu_init that is already used in the kernel; this makes grep
simpler if nothing else.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:48 +03:00
Jan Kiszka
b4f14abd95 KVM: Avoid spurious exceptions after setting registers
Clear pending exceptions when setting new register values. This avoids
spurious exceptions after restoring a vcpu state or after
reset-on-triple-fault.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:47 +03:00
Marcelo Tosatti
ece15babfa KVM: PIT: support mode 4
The in-kernel PIT emulation ignores pending timers if operating under
mode 4, which for example DragonFlyBSD uses (and Plan9 too, apparently).

Mode 4 seems to be similar to one-shot mode, other than the fact that it
starts counting after the next CLK pulse once programmed, while mode 1
starts counting immediately, so add a FIXME to enhance precision.

Fixes sourceforge bug 1952988.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:46 +03:00
Avi Kivity
dc7457ea52 KVM: x86 emulator: disable writeback on lmsw
The recent changes allowing memory operands with lmsw and smsw left
lmsw with writeback enabled.  Since lmsw has no ordinary destination
operand, the dst pointer was not initialized, resulting in an oops.

Close the hole by disabling writeback for lmsw.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:45 +03:00
Izik Eidus
3fe913e7c5 KVM: x86: task switch: fix wrong bit setting for the busy flag
The busy bit is bit 1 of the type field, not bit 8.
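
As an illustration only (not the KVM code itself), the difference between the two bit positions can be modelled in a few lines. The sketch assumes the descriptor type is held in a 4-bit field, so the names TYPE_FIELD_MASK and set_type_bit are hypothetical helpers; the type values 0b1001 (available 32-bit TSS) and 0b1011 (busy 32-bit TSS) are the architectural encodings.

```python
# Model of a 4-bit segment descriptor type field (assumption: the field
# is 4 bits wide, so anything above bit 3 is lost on assignment).
TYPE_FIELD_MASK = 0xF

TSS_AVAILABLE_32 = 0b1001  # type 9: available 32-bit TSS
TSS_BUSY_32 = 0b1011       # type 11: busy 32-bit TSS


def set_type_bit(seg_type, bit):
    """Set `bit` in the type value, then truncate to the 4-bit field."""
    return (seg_type | (1 << bit)) & TYPE_FIELD_MASK


# Setting bit 1 turns an available TSS into a busy one.
marked_busy = set_type_bit(TSS_AVAILABLE_32, 1)

# Setting bit 8 falls outside the field, so the type is unchanged.
still_available = set_type_bit(TSS_AVAILABLE_32, 8)
```

Under this model, setting bit 8 silently leaves the TSS marked as available, which matches the kind of bug the commit subject describes.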

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:43 +03:00
Sheng Yang
1439442c7b KVM: VMX: Enable EPT feature for KVM
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:42 +03:00
Sheng Yang
b7ebfb0509 KVM: VMX: Prepare an identity page table for EPT in real mode
[aliguori: plug leak]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:41 +03:00
Sheng Yang
1ac593c97e KVM: MMU: Remove #ifdef CONFIG_X86_64 to support 4 level EPT
Currently the EPT level is 4 for both PAE and x86_64. This patch removes the #ifdef
around allocating and freeing root_hpa to support EPT.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:39 +03:00
Sheng Yang
7b52345e2c KVM: MMU: Add EPT support
Enable kvm_set_spte() to generate EPT entries.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:38 +03:00
Sheng Yang
67253af52e KVM: Add kvm_x86_ops get_tdp_level()
The function get_tdp_level() provides the number of TDP levels for EPT and
NPT, replacing the NPT-specific macro.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 14:44:34 +03:00
Sheng Yang
8c6d6adc6b KVM: MMU: Move some definitions to a header file
Move some definitions to mmu.h in order to allow building common table
entries between EPT and non-EPT.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 12:26:38 +03:00
Sheng Yang
d56f546db9 KVM: VMX: EPT Feature Detection
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-05-04 12:26:38 +03:00
Ulrich Drepper
d35c7b0e54 unified (weak) sys_pipe implementation
This replaces the duplicated arch-specific versions of "sys_pipe()" with
one unified implementation.  This removes almost 250 lines of duplicated
code.

It's marked __weak, so that *if* an architecture wants to override the
default implementation it can do so by simply having its own replacement
version, since many architectures use alternate calling conventions for
the 'pipe()' system call for legacy reasons (ie traditional UNIX
implementations often return the two file descriptors in registers)

I still haven't changed the cris version even though Linus says the BKL
isn't needed.  The arch maintainer can easily do it if there are really
no obstacles.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-03 13:50:33 -07:00
Roman Zippel
6f6d6a1a6a rename div64_64 to div64_u64
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide.  Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-01 08:03:58 -07:00
Linus Torvalds
6d98ca7364 x86: Mark OPTIMIZE_INLINING broken
So Ingo finally did figure out why UML broke with this option: UML
passes gcc the -fno-unit-at-a-time flag, and apparently that wreaks
havoc with gcc's inlining.

We could turn off -fno-unit-at-a-time for UML for gcc4+ (which is what
x86 does), but there's bad blood about this whole option, and it does
show that the thing is just fragile as heck.

So let tempers cool, and disable the thing, and we can revisit the
decision later.

Cc: Adrian Bunk <bunk@kernel.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-30 20:07:22 -07:00
Ingo Molnar
895d30935e x86: numaq fix
do not override the existing pci-y rule when adding visws or
numaq rules.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-30 23:15:35 +02:00
Ingo Molnar
6b8e1c7ec4 x86: 8K stacks by default
Switch back to 8K stacks as the safer default. Out-of-memory
situations are less problematic than silent and hard to debug
stack corruption.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Andres Salomon
cb8ab687c3 x86: ioremap ram check fix
bdd3cee2e4 (x86: ioremap(), extend check
to all RAM pages) breaks OLPC's ioremap call.  The ioremap that OLPC uses is:

        romsig = ioremap(0xffffffc0, 16);

The commit that breaks it is basically:

-       for (pfn = phys_addr >> PAGE_SHIFT; pfn < max_pfn_mapped &&
-            (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+       for (pfn = phys_addr >> PAGE_SHIFT;
+                               (pfn << PAGE_SHIFT) < last_addr; pfn++) {
+

Previously, the 'pfn < max_pfn_mapped' check would've caused us to not
enter the loop.  Removing that check means we loop infinitely, because
pfn is 0xfffff and last_addr is 0xffffffcf.
The remaining check that is used to exit the loop is not sufficient:
when pfn<<PAGE_SHIFT is 0xfffff000, that is less than 0xffffffcf; when
we increment pfn and the shift overflows (pfn == 0x100000), pfn<<PAGE_SHIFT
ends up being 0.  That, of course, is less than last_addr.  In effect,
pfn<<PAGE_SHIFT is always lower than last_addr.

The simple fix for this is to limit the last_addr check to the PAGE_MASK;
a patch is below.
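
The following is not the patch itself but a small model of the loop condition described above, written to make the wraparound easy to reproduce. It assumes a 32-bit unsigned long (modelled with WORD_MASK) and 4 KiB pages; the function name pages_visited and its iteration limit are hypothetical, and the masked comparison is one reading of "limit the last_addr check to the PAGE_MASK".

```python
PAGE_SHIFT = 12
WORD_MASK = 0xFFFFFFFF  # model a 32-bit unsigned long
PAGE_MASK = ~((1 << PAGE_SHIFT) - 1) & WORD_MASK


def pages_visited(phys_addr, size, mask_last_addr, limit=8):
    """Count loop iterations; return None if the loop would not terminate.

    `limit` is a stand-in for "forever" so the model itself always ends.
    """
    last_addr = (phys_addr + size - 1) & WORD_MASK
    if mask_last_addr:
        last_addr &= PAGE_MASK
    pfn = phys_addr >> PAGE_SHIFT
    count = 0
    # Same shape as the loop in the commit: shift, compare, increment.
    while ((pfn << PAGE_SHIFT) & WORD_MASK) < last_addr:
        count += 1
        if count >= limit:
            return None
        pfn = (pfn + 1) & WORD_MASK
    return count


# OLPC case: ioremap(0xffffffc0, 16)
unmasked = pages_visited(0xFFFFFFC0, 16, mask_last_addr=False)
masked = pages_visited(0xFFFFFFC0, 16, mask_last_addr=True)
```

Without the mask, the shifted value wraps from 0xfffff000 to 0 and the comparison keeps succeeding, so the model gives up and returns None. With the mask applied, last_addr becomes 0xfffff000 and the loop exits immediately.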

Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Ingo Molnar
5de8f68b43 x86: optimize inlining off
default to inline optimizing off.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Ingo Molnar
acbaa93e3d x86: CONFIG_X86_ELAN fix
move the X86_CPU section out of the !X86_ELAN branch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Ingo Molnar
c9af1e3323 x86: Kconfig fix
Andrew noticed that OPTIMIZE_INLINING appeared in the toplevel
menu - fix it.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Suresh Siddha
de33c442ed x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().

To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default (non-write-combining) mapping requests.

Next steps:
	a) change all the video drivers using ioremap() or ioremap_nocache()
	   and adding a WC MTRR using mtrr_add() to use ioremap_wc()

	b) for strict usage, we can go back to strong uc semantics
	   for ioremap() and ioremap_nocache() after some grace period for
	   completing step-a.

	c) user level X server needs to use the appropriate method for setting
	   up WC mapping (like using resourceX_wc sysfs file instead of
	   adding MTRR for WC and using /dev/mem or resourceX under /sys)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:35 +02:00
Sam Ravnborg
b9b39bfba5 x86: use defconfigs from x86/configs/*
Daniel Drake <dsd@gentoo.org> reported:

In 2.6.23, if you unpacked a kernel source tarball and then
ran "make menuconfig" you'd be presented with this message:
    # using defaults found in arch/i386/defconfig

and the default options would be set.

The same thing in 2.6.24 does not give you any "using defaults" message, and
the default config options within menuconfig are rather blank (e.g. no PCI
support). You can work around this by explicitly running "make defconfig"
before menuconfig, but it would be nice to have the behaviour the way it was
for 2.6.23 (and the way it still is for other archs).

Fixed by adding an x86-specific defconfig list to Kconfig.

Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=10470
Tested-by: dsd@gentoo.org
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Ingo Molnar
2544a873ab revert: "x86: ioremap(), extend check to all RAM pages"
Vegard Nossum reported a large (150 seconds) delay during bootup,
and bisected it to "x86: ioremap(), extend check to all RAM pages"
(commit bdd3cee2e4). Revert this commit for now.

Bisected-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Jeremy Fitzhardinge
a4c863f497 x86: don't bother printing compat vdso address
The kernel prints the compat vdso address regardless of whether compat
vdso mode is enabled or not, which is confusing.  Given that this
isn't very interesting information anyway, just remove the printk.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Gerhard Mack <gmack@innerfire.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Andi Kleen
f6c133f7d5 fix: x86: support for new UV apic
Don't warn in read_apic_id() when preemptible but only one CPU online.

Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Vegard Nossum
575ca7351b x86: fix early-BUG message
The .asciz directive takes any number of strings, but each one is zero-
terminated, and string pasting is not done as in C. That results in only the
first line being output.

Replace .asciz with multiple .ascii directives and terminate with .asciz.
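
To make the byte layout concrete, here is a rough model of the two directive forms, not the assembler source from the patch. The helpers asciz, ascii_ and c_string are hypothetical names that mimic how GNU as emits bytes for .asciz and .ascii and how a consumer reading a NUL-terminated string would see them; the message text is made up.

```python
def asciz(*strings):
    """Model of .asciz: every string gets its own trailing zero byte."""
    return b"".join(s.encode("ascii") + b"\0" for s in strings)


def ascii_(*strings):
    """Model of .ascii: strings are emitted back to back, no terminator."""
    return b"".join(s.encode("ascii") for s in strings)


def c_string(data):
    """Read bytes the way a NUL-terminated string consumer would."""
    return data.split(b"\0", 1)[0].decode("ascii")


lines = ["first line\n", "second line\n"]

# One .asciz with several strings: a NUL follows the first line.
broken = asciz(*lines)

# Several .ascii directives, terminated by a final .asciz.
fixed = ascii_(*lines[:-1]) + asciz(lines[-1])
```

In this model, c_string(broken) stops after the first line, while c_string(fixed) returns the whole message.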

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Dmitri Vorobiev
b4cdc4300d x86: iommu_sac_force can become static
The iommu_sac_force variable is needlessly defined as a global,
and this patch makes it static. Additionally, this variable
need not be explicitly initialized.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Dmitri Vorobiev
4412620fc2 x86: add proper header for reboot_force
This patch fixes one sparse warning by including the appropriate
header for the reboot_force symbol.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00
Ingo Molnar
3e8f7e35f3 x86 VISWS: build fix
the 'reboot_force' flag is a notion that non-PC subarchitectures do
not have.

also, unify the X86_BIOS_REBOOT option between 32-bit and 64-bit
and get rid of a few unnecessary Kconfig and Makefile complications
that way.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-30 23:15:34 +02:00