Commit Graph

4324 Commits

Author SHA1 Message Date
Andreas Herrmann
b98103a559 x86: hpet: print HPET registers during setup (if hpet=verbose is used)
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Mark Hounschell <markh@compro.net>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 18:01:14 +01:00
Ingo Molnar
2702e0a46c Merge branch 'linus' into timers/hpet 2009-02-22 17:59:49 +01:00
Hannes Eder
fc6fcdfbb8 x86: kexec/i386: fix sparse warnings: Using plain integer as NULL pointer
Fix these sparse warnings:

  arch/x86/kernel/machine_kexec_32.c:124:22: warning: Using plain integer as NULL pointer
  arch/x86/kernel/traps.c:950:24: warning: Using plain integer as NULL pointer

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Cc: trivial@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 09:27:11 +01:00
Jiri Slaby
6defa2fe20 x86_64: Fix S3 fail path
As acpi_enter_sleep_state can fail, take this into account in
do_suspend_lowlevel and don't return to the do_suspend_lowlevel's
caller. This would break (currently) fpu status and preempt count.

Technically, this means use `call' instead of `jmp' and `jmp' to
the `resume_point' after the `call' (i.e. if
acpi_enter_sleep_state returns=fails). `resume_point' will handle
the restore of fpu and preempt count gracefully.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-21 21:58:18 -05:00
Jiri Slaby
e6bd6760c9 x86_64: acpi/wakeup_64 cleanup
- remove %ds re-set, it's already set in wakeup_long64
- remove double labels and alignment (ENTRY already adds both)
- use meaningful resume point labelname
- skip alignment while jumping from wakeup_long64 to the resume point
- remove .size, .type and unused labels
[v2]
- added ENDPROCs

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-21 21:58:18 -05:00
H. Peter Anvin
cc3ca22063 x86, mce: remove incorrect __cpuinit for mce_cpu_features()
Impact: Bug fix on UP

Checkin 6ec68bff3c:
    x86, mce: reinitialize per cpu features on resume

introduced a call to mce_cpu_features() in the resume path, in order
for the MCE machinery to get properly reinitialized after a resume.
However, this function (and its successors) was flagged __cpuinit,
which becomes __init on UP configurations (on SMP suspend/resume
requires CPU hotplug and so this would not be seen.)

Remove the offending __cpuinit annotations for mce_cpu_features() and
its successor functions.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-20 23:40:40 -08:00
Ingo Molnar
d951734654 x86, mm: rename TASK_SIZE64 => TASK_SIZE_MAX
Impact: cleanup

Rename TASK_SIZE64 to TASK_SIZE_MAX, and provide the
define on 32-bit too. (mapped to TASK_SIZE)

This allows 32-bit code to make use of the (former-) TASK_SIZE64
symbol as well, in a clean way.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-21 00:09:44 +01:00
Steven Rostedt
90c7ac49aa ftrace: immediately stop code modification if failure is detected
Impact: fix to prevent NMI lockup

If the page fault handler produces a WARN_ON in the modifying of
text, and the system is set up to have a high frequency of NMIs,
we can lock up the system on a failure to modify code.

The modifying of code with NMIs allows all NMIs to modify the code
if it is about to run. This prevents a modifier on one CPU from
modifying code running in NMI context on another CPU. The modifying
is done through stop_machine, so only NMIs must be considered.

But if the write causes the page fault handler to produce a warning,
the print can slow it down enough that as soon as it is done
it will take another NMI before going back to the process context.
The new NMI will perform the write again causing another print and
this will hang the box.

This patch turns off the writing as soon as a failure is detected
and does not wait for it to be turned off by the process context.
This will keep NMIs from getting stuck in this back and forth
of print outs.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 14:30:18 -05:00
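The fail-stop idea in commit 90c7ac49aa can be illustrated with a minimal user-space sketch. It is not the kernel's code: the names mod_code_write, mod_code_status, try_mod_code() and the always_fail() stand-in writer are hypothetical, and C11 atomics stand in for whatever synchronization the kernel uses. The point shown is only that the context which sees the failure clears the "write pending" flag itself, so later interrupting contexts do not retry.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state: true while a text update is pending. */
static atomic_bool mod_code_write;
/* Hypothetical state: result of the most recent write attempt. */
static atomic_int mod_code_status;

/* Hypothetical writer callback: returns 0 on success, nonzero on failure. */
typedef int (*write_fn)(void);

/*
 * Called by the updater and by any interrupting (NMI-like) context.
 * If a write fails, the pending flag is cleared immediately, so no
 * other context repeats the failing write.
 */
static void try_mod_code(write_fn do_write)
{
	int status;

	if (!atomic_load(&mod_code_write))
		return;

	status = do_write();
	atomic_store(&mod_code_status, status);

	/* On failure, stop all further attempts right away. */
	if (status)
		atomic_store(&mod_code_write, false);
}

/* Stand-in for a write that always faults; counts how often it is tried. */
static int attempts;

static int always_fail(void)
{
	attempts++;
	return -1;
}
```

With the flag set and always_fail() as the writer, a second call to try_mod_code() returns without attempting the write again, which is the behaviour the commit message describes.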
Steven Rostedt
1623963097 ftrace, x86: make kernel text writable only for conversions
Impact: keep kernel text read only

Because dynamic ftrace converts the calls to mcount into and out of
nops at run time, we needed to always keep the kernel text writable.

But this defeats the point of CONFIG_DEBUG_RODATA. This patch converts
the kernel code to writable before ftrace modifies the text, and converts
it back to read only afterward.

The kernel text is converted to read/write, stop_machine is called to
modify the code, then the kernel text is converted back to read only.

The original version used SYSTEM_STATE to determine when it was OK
or not to change the code to rw or ro. Andrew Morton pointed out that
using SYSTEM_STATE is a bad idea since there is no guarantee as to what
its state will actually be.

Instead, I moved the check into the set_kernel_text_* functions
themselves, and use a local variable to determine when it is
OK to change the kernel text RW permissions.

[ Update: Ingo Molnar suggested moving the prototypes to cacheflush.h ]

Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 14:30:06 -05:00
Alok Kataria
fdb17aeb28 x86, vmi: TSC going backwards check in vmi clocksource, cleanup
clean up vmi_read_cycles to use max()

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zach Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 19:31:03 +01:00
Ingo Molnar
609162850d Merge branches 'x86/asm', 'x86/cleanups' and 'x86/headers' into x86/core 2009-02-20 17:40:50 +01:00
Ingo Molnar
3b6f7b9beb Merge branch 'x86/urgent' into x86/core 2009-02-20 17:40:43 +01:00
Vegard Nossum
ecab22aa6d x86: use symbolic constants for MSR_IA32_MISC_ENABLE bits
Impact: Cleanup. No functional changes.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 12:07:43 +01:00
Ingo Molnar
64b36ca7f4 Merge branches 'tracing/function-graph-tracer' and 'linus' into tracing/core 2009-02-20 11:35:57 +01:00
Tejun Heo
11124411aa x86: convert to the new dynamic percpu allocator
Impact: use new dynamic allocator, unified access to static/dynamic
        percpu memory

Convert to the new dynamic percpu allocator.

* implement populate_extra_pte() for both 32 and 64
* update setup_per_cpu_areas() to use pcpu_setup_static()
* define __addr_to_pcpu_ptr() and __pcpu_ptr_to_addr()
* define config HAVE_DYNAMIC_PER_CPU_AREA

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:09 +09:00
Rusty Russell
b36128c830 alloc_percpu: change percpu_ptr to per_cpu_ptr
Impact: cleanup

There are two allocated per-cpu accessor macros with almost identical
spelling.  The original and far more popular is per_cpu_ptr (44
files), so change over the other 4 files.

tj: kill percpu_ptr() and update UP too

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: mingo@redhat.com
Cc: lenb@kernel.org
Cc: cpufreq@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:29:08 +09:00
Lai Jiangshan
42f8faecf7 x86: use percpu data for 4k hardirq and softirq stacks
Impact: economize memory for large NR_CPUS

percpu data is set up earlier than irq, so we can use percpu data
to economize memory.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-02-20 16:26:10 +09:00
Alok N Kataria
48ffc70b67 x86, vmi: TSC going backwards check in vmi clocksource
Impact: fix time warps under vmware

Similar to the check for TSC going backwards in the TSC clocksource,
we also need this check for VMI clocksource.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
2009-02-20 07:53:08 +01:00
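The "going backwards" check described in commit 48ffc70b67 amounts to never returning a cycle count smaller than the last one accepted. A minimal sketch follows, assuming a hypothetical helper clamp_cycles() that takes the new reading and the last accepted value; the actual VMI clocksource code is not shown in this log and may differ.

```c
#include <stdint.h>

/*
 * Return the new reading, or the last accepted value if the counter
 * appears to have gone backwards. Hypothetical helper for illustration.
 */
static uint64_t clamp_cycles(uint64_t now, uint64_t last)
{
	return now >= last ? now : last;
}
```

This is the same result as taking the maximum of the two values, which is how the later cleanup commit fdb17aeb28 describes it.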
H. Peter Anvin
f6d1826dfa x86, mce: use %ll instead of %L for 64-bit numbers
Impact: Cleanup

The standard spelling of a printf pattern for long long is "ll", not
"L", which is for long double.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 15:44:58 -08:00
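As a small illustration of the point made in commit f6d1826dfa, the sketch below formats a 64-bit value with the standard "ll" length modifier. The helper name format_u64_hex() is made up for this example, and it uses user-space snprintf() rather than the kernel's printk().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hex. "ll" is the standard length modifier
 * for long long; "L" is reserved for long double conversions.
 */
static int format_u64_hex(char *buf, size_t len, unsigned long long val)
{
	return snprintf(buf, len, "%llx", val);
}
```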
Andi Kleen
b79109c3bb x86, mce: separate correct machine check poller and fatal exception handler
Impact: cleanup, performance enhancement

The machine check poller is diverging more and more from the fatal
exception handler. Instead of adding more special cases separate the code
paths completely. The corrected poll path is actually quite simple,
and this doesn't result in much code duplication.

This makes both handlers much easier to read and results in
cleaner code flow.  The exception handler now only needs to care
about uncorrected errors, which also simplifies the handling of multiple
errors. The corrected poller also now always runs in standard interrupt
context and does not need to do anything special to handle NMI context.

Minor behaviour changes:
- MCG status is now not cleared on polling.
- Only the banks which had corrected errors get cleared on polling
- The exception handler only clears banks with errors now

v2: Forward port to new patch order. Add "uc" argument.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:52:20 -08:00
Andi Kleen
b5f2fa4ea0 x86, mce: factor out duplicated struct mce setup into one function
Impact: cleanup

This merely factors out duplicated code to set up
the initial struct mce state into a single function.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:51:39 -08:00
Andi Kleen
0d7482e3d7 x86, mce: implement dynamic machine check banks support
Impact: cleanup; making code future proof; memory saving on small systems

This patch replaces the hardcoded max number of machine check banks with 
dynamic allocation depending on what the CPU reports. The sysfs
data structures and the banks array are dynamically allocated.

There is still a hard bank limit (128) because the mcelog protocol uses
banks >= 128 as pseudo banks to escape other events. But we expect
that 128 banks is beyond any reasonable CPU for now.

This supersedes an earlier patch by Venki, but it solves the problem
more completely by making the limit fully dynamic (up to the 128
boundary).

This saves some memory on machines with fewer than 6 banks because
they won't need sysdevs for unused ones and also allows
using sysfs to control these banks on possible future CPUs with
more than 6 banks.

This is an updated patch addressing Venki's comments.  I also added in
another patch from Thomas which fixed the error allocation path (that
patch was previously separated)

Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-19 14:50:58 -08:00
Ingo Molnar
e9ce0c37c2 Merge branch 'x86/untangle2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers 2009-02-19 18:15:01 +01:00
Linus Torvalds
bcf8951fc2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
  x86, mce: use force_sig_info to kill process in machine check
  x86, mce: reinitialize per cpu features on resume
  x86, rcu: fix strange load average and ksoftirqd behavior
2009-02-19 09:14:35 -08:00
Ingo Molnar
4cd0332db7 Merge branch 'mainline/function-graph' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/function-graph-tracer 2009-02-19 12:13:33 +01:00
Ingo Molnar
72c26c9a26 Merge branch 'linus' into tracing/blktrace
Conflicts:
	block/blktrace.c

Semantic merge:
	kernel/trace/blktrace.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-19 09:00:35 +01:00
Steven Rostedt
712406a6bf tracing/function-graph-tracer: make arch generic push pop functions
There is nothing really arch specific about the push and pop functions
used by the function graph tracer. This patch moves them to generic
code.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-18 13:43:04 -05:00
Huang Ying
ef41df4344 x86, mce: fix a race condition in mce_read()
Impact: bugfix

Consider the following situation:

before: mcelog.next == 1, mcelog.entry[0].finished = 1

+--------------------------------------------------------------------------
R                   W1                  W2                  W3

read mcelog.next (1)
                    mcelog.next++ (2)
                    (working on entry 1,
                    finished == 0)

mcelog.next = 0
                                        mcelog.next++ (1)
                                        (working on entry 0)
                                                           mcelog.next++ (2)
                                                           (working on entry 1)
                        <----------------- race ---------------->
                    (done on entry 1,
                    finished = 1)
                                                           (done on entry 1,
                                                           finished = 1)

To fix the race condition, a cmpxchg loop is added to mce_read() to
ensure no new MCE record can be added between mcelog.next reading and
mcelog.next = 0.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:33:05 -08:00
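The cmpxchg loop described in commit ef41df4344 can be sketched in user space with C11 atomics. This is an illustration under assumptions, not the kernel's mce_read(): the struct layout, LOG_LEN, drain_log() and reserve_slot() are hypothetical names, and atomic_compare_exchange_strong() stands in for the kernel's cmpxchg(). The reader resets next to 0 only if no writer advanced it since the reader last looked; otherwise it processes the newly added entries and tries again.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define LOG_LEN 32 /* hypothetical capacity, not the kernel's value */

struct entry {
	atomic_bool finished; /* set by a writer once its record is complete */
};

struct log {
	atomic_uint next; /* index of the next free slot */
	struct entry entry[LOG_LEN];
};

/*
 * Reader: consume entries [0, next) and reset next to 0, but only if no
 * writer reserved a new slot in the meantime. Returns entries consumed.
 */
static unsigned drain_log(struct log *l)
{
	unsigned consumed = 0;
	unsigned start = 0;
	unsigned next = atomic_load(&l->next);

	for (;;) {
		for (unsigned i = start; i < next; i++) {
			/* A real reader would copy the record out here. */
			atomic_store(&l->entry[i].finished, false);
			consumed++;
		}

		/* Swing next to 0 only if it still equals what we processed. */
		unsigned expected = next;
		if (atomic_compare_exchange_strong(&l->next, &expected, 0))
			break;

		/* A writer got in: handle the new entries and retry. */
		start = next;
		next = expected;
	}
	return consumed;
}

/* Writer: reserve one slot, or return -1 if the log is full. */
static int reserve_slot(struct log *l)
{
	unsigned cur = atomic_load(&l->next);

	do {
		if (cur >= LOG_LEN)
			return -1;
	} while (!atomic_compare_exchange_weak(&l->next, &cur, cur + 1));

	return (int)cur;
}
```

Because the reset is conditional on next being unchanged, the window between reading next and writing 0, which is where writers W2 and W3 collide in the diagram above, no longer exists in this sketch.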
Andi Kleen
d6b75584a3 x86, mce: disable machine checks on offlined CPUs
Impact: Lower priority bug fix

Offlined CPUs could still get machine checks, but the machine check handler
cannot handle them properly, leading to an unconditional crash. Disable
machine checks on CPUs that are going down.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:56 -08:00
Andi Kleen
5b4408fdaa x86, mce: don't set up mce sysdev devices with mce=off
Impact: bug fix, in this case the resume handler shouldn't run which
	avoids incorrectly reenabling machine checks on resume

When MCEs are completely disabled on the command line don't set
up the sysdev devices for them either.

Includes a comment fix from Thomas Gleixner.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:50 -08:00
Andi Kleen
52d168e28b x86, mce: switch machine check polling to per CPU timer
Impact: Higher priority bug fix

The machine check poller runs a single timer and then broadcasts an
IPI to all CPUs to check them. This leads to unnecessary
synchronization between CPUs. The original CPU running the timer has
to wait potentially a long time for all other CPUs to answer. This is
also real time unfriendly and in general inefficient.

This was especially a problem on systems with a lot of events where
the poller runs at a higher frequency after processing some events.
More and more CPU time could be wasted this way, to
the point of significantly slowing down machines.

The machine check polling is actually fully independent per CPU, so
there's no reason to not just do this all with per CPU timers.  This
patch implements that.

Also switch the poller to use standard timers instead of work
queues. It was using work queues to be able to execute a user program
on an event, but mce_notify_user() handles this case now with a
separate callback. So instead always run the poll code in a
standard per CPU timer, which means that in the common case of not
having to execute a trigger there will be less overhead.

This allows cleaning up the initialization significantly, because
standard timers are already up when machine checks get init'ed.  No
multiple initialization functions.

Thanks to Thomas Gleixner for some help.

Cc: thockin@google.com
v2: Use del_timer_sync() on cpu shutdown and don't try to handle
migrated timers.
v3: Add WARN_ON for timer running on unexpected CPU

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:44 -08:00
Andi Kleen
9bd9840580 x86, mce: always use separate work queue to run trigger
Impact: Needed for bug fix in next patch

This relaxes the requirement that mce_notify_user has to run in process
context. Useful for future changes, but also leads to cleaner
behaviour now. Now instead mce_notify_user can be called directly
from interrupt (but not NMI) context.

The work queue only uses a single global work struct, which can be done safely
because it is always free to reuse before the trigger function is executed.
This way no events can be lost.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:41 -08:00
Andi Kleen
123aa76ec0 x86, mce: don't disable machine checks during code patching
Impact: low priority bug fix

This removes part of a patch I added myself some time ago. After some
consideration the patch was a bad idea. In particular it stopped machine check
exceptions during code patching.

To quote the comment:

        * MCEs only happen when something got corrupted and in this
        * case we must do something about the corruption.
        * Ignoring it is worse than a unlikely patching race.
        * Also machine checks tend to be broadcast and if one CPU
        * goes into machine check the others follow quickly, so we don't
        * expect a machine check to cause undue problems during to code
        * patching.

So undo the machine check related parts of
8f4e956b31. NMIs are still disabled.

This only removes code; the only addition is a new comment.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:38 -08:00
Andi Kleen
973a2dd1d5 x86, mce: disable machine checks on suspend
Impact: Bug fix

During suspend it is not reliable to process machine check
exceptions, because CPUs disappear but can still get machine check
broadcasts.  Also the system is slightly more likely to
machine check them, but the handler is typically not in a position
to handle them in a meaningful way.

So disable them during suspend and enable them during resume.

Also make sure they are always disabled on hot-unplugged CPUs.

This new code assumes that suspend always hotunplugs all
non BP CPUs.

v2: Remove the WARN_ONs Thomas objected to.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:32:14 -08:00
Andi Kleen
07db1c140e x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
Impact: Bugfix

The ifdef for the apic clear on shutdown for the 64bit intel thermal
vector was incorrect and never triggered. Fix that.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:34 -08:00
Andi Kleen
380851bc6b x86, mce: use force_sig_info to kill process in machine check
Impact: bug fix (with tolerant == 3)

do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.

Based on an earlier patch by Ying Huang who debugged the problem.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:31 -08:00
Andi Kleen
6ec68bff3c x86, mce: reinitialize per cpu features on resume
Impact: Bug fix

This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.

Call the respective initialization functions on resume.

v2: Remove ancient init because they don't have a resume device anyways.
    Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-17 15:24:28 -08:00
Linus Torvalds
35010334aa Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, vm86: fix preemption bug
  x86, olpc: fix model detection without OFW
  x86, hpet: fix for LS21 + HPET = boot hang
  x86: CPA avoid repeated lazy mmu flush
  x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
  x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
  x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
  x86/cpa: make sure cpa is safe to call in lazy mmu mode
  x86, ptrace, mm: fix double-free on race
2009-02-17 14:27:39 -08:00
Ingo Molnar
9be1b56a3e x86, apic: separate 32-bit setup functionality out of apic_32.c
Impact: build fix, cleanup

A couple of arch setup callbacks were mistakenly in apic_32.c, breaking
the build.

Also simplify the code a bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 23:12:48 +01:00
Paul E. McKenney
bf51935f3e x86, rcu: fix strange load average and ksoftirqd behavior
Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

Shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.

Reported-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 22:47:45 +01:00
Ingo Molnar
2a05180fe2 x86, apic: move remaining APIC drivers to arch/x86/kernel/apic/*
Move the 32-bit extended-arch APIC drivers to arch/x86/kernel/apic/
too, and rename apic_64.c to probe_64.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 20:35:47 +01:00
Ingo Molnar
f62bae5009 x86, apic: move APIC drivers to arch/x86/kernel/apic/*
arch/x86/kernel/ is getting a bit crowded, and the APIC
drivers are scattered into various different files.

Move them to arch/x86/kernel/apic/*, and also remove
the 'gen' prefix from those which had it.

Also move APIC related functionality: the IO-APIC driver,
the NMI and the IPI code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 18:17:36 +01:00
Ingo Molnar
be163a159b x86, apic: rename 'genapic' to 'apic'
Impact: cleanup

Now that all APIC code is consolidated there's nothing 'gen' about
apics anymore - so rename 'struct genapic' to 'struct apic'.

This shortens the code and is nicer to read as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:57 +01:00
Ingo Molnar
ab6fb7c0b0 x86, apic: remove ->store_NMI_vector()
Impact: cleanup

It's not used by anything anymore.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:56 +01:00
Ingo Molnar
cb81eaedf1 x86, numaq_32: clean up, misc
Impact: cleanup

 - misc other cleanups that change the md5 signature
 - consolidate global variables
 - remove unnecessary __numaq_mps_oem_check() wrapper
 - make numaq_mps_oem_check static
 - update copyrights
 - misc other cleanups pointed out by checkpatch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:53:54 +01:00
Ingo Molnar
36afc3af04 x86, numaq_32: clean up
Impact: cleanup

- refactor smp_dump_qct()
- tidy up include files, remove duplicates
- misc other cleanups, pointed out by checkpatch

No code changed:

md5:
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.before.asm
   9c0bc01a53558c77df0f2ebcda7e11a9  numaq_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:51 +01:00
Ingo Molnar
7da18ed924 x86, es7000: misc cleanups
These are cleanups that change the md5 signature:

 - asm/ => linux/ include conversion
 - simplify the code flow of find_unisys_acpi_oem_table()
 - move ACPI methods into one #ifdef block
 - remove 0/NULL initialization of statics
 - simplify/standardize printouts
 - update copyrights
 - more cleanups, pointed out by checkpatch

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2693	    192	     44	   2929	    b71	es7000_32.o.before
   2688	    192	     44	   2924	    b6c	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:50 +01:00
Ingo Molnar
352887d1c9 x86, es7000: remove dead code, clean up
Impact: cleanup

 - a number of structure definitions were stale
 - remove needless wrappers around apic definitions
 - fix details noticed by checkpatch

No code changed:

md5:
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.before.asm
   029d8fde0aaf6e934ea63bd8b36430fd  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:49 +01:00
Ingo Molnar
d3185b37df x86, es7000: remove externs
Impact: cleanup

In the subarch times there were a number of externs between
various bits of the ES7000 code. Now that there's a single
es7000-platform support file, the externs can be removed and
the functions can be changed to statics.

Beyond the cleanup factor, this also shrinks the size of the
kernel image a bit:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2693	    192	     44	   2929	    b71	es7000_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:48 +01:00
Ingo Molnar
b9e0d1aa97 x86, apic: remove apicid_cluster()
There were multiple definitions of apicid_cluster() scattered around
in APIC drivers - but the definitions are equivalent to the already
existing generic APIC_CLUSTER() method.

So remove apicid_cluster() and change all users to APIC_CLUSTER().

No code changed:

md5:
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.before.asm
   1b8244ba8d3d6a454593ce10f09dfa58  summit_32.o.after.asm

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:47 +01:00
Ingo Molnar
2c4ce18c95 x86, es7000: clean up
No code changed:

arch/x86/kernel/es7000_32.o:

   text	   data	    bss	    dec	    hex	filename
   2813	    192	     44	   3049	    be9	es7000_32.o.before
   2813	    192	     44	   3049	    be9	es7000_32.o.after

md5:
   a593d98a882bf534622c70d9568497ac  es7000_32.o.before.asm
   a593d98a882bf534622c70d9568497ac  es7000_32.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00
Ingo Molnar
2f205bc47f x86, apic: clean up the cpu_2_logical_apicid declaration
extern declarations were scattered in 4 files - consolidate them
into apic.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:46 +01:00
Ingo Molnar
77313190d1 x86, apic: clean up arch/x86/kernel/bigsmp_32.c
Impact: cleanup

- remove unnecessary indirections that were artifacts of the subarch code
- clean up include file section
- clean up various small details

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:45 +01:00
Ingo Molnar
5c615feb90 x86, apic: remove stale references to APIC_DEFINITION
Impact: cleanup

APIC_DEFINITION was a hack from the x86 subarch times, it has no
meaning anymore - remove it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:45 +01:00
Ingo Molnar
e641f5f525 x86, apic: remove duplicate asm/apic.h inclusions
Impact: cleanup

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
7b6aa335ca x86, apic: remove genapic.h
Impact: cleanup

Remove genapic.h and remove all references to it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:44 +01:00
Ingo Molnar
28aa29eeb3 remove: genapic prepare
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 17:52:42 +01:00
Ingo Molnar
7d01d32d3b x86, apic: fix build fallout of genapic changes
- make oprofile build
- select X86_X2APIC from X86_UV - it relies on it
- export genapic for oprofile modular build

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 13:13:25 +01:00
Yinghai Lu
c1eeb2de41 x86: fold apic_ops into genapic
Impact: cleanup

Make it simpler: there is no need to have one extra struct.

v2: fix the sgi_uv build

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Yinghai Lu
06cd9a7dc8 x86: add x2apic config
Impact: cleanup

so that x2apic can be deselected,
and INTR_REMAP will select x2apic

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 12:22:20 +01:00
Ingo Molnar
494df596f9 Merge branches 'x86/acpi', 'x86/apic', 'x86/cpudetect', 'x86/headers', 'x86/paravirt', 'x86/urgent' and 'x86/xen'; commit 'v2.6.29-rc5' into x86/core 2009-02-17 12:07:00 +01:00
Yinghai Lu
98c061b6cf x86: make APIC_init_uniprocessor() more like smp_prepare_cpus()
Impact: cleanup

1. move localise_nmi_watchdog() later
2. change setup_boot_APIC_clock() to setup_boot_clock() for 64-bit

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 09:37:04 +01:00
Yinghai Lu
3bd25d0fa3 x86: pre init pirq_entries[]
Impact: cleanup

set default value early - this allows the removal of a number
of dynamic initialization codepaths, and an #ifdef.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 09:36:58 +01:00
Rusty Russell
a0abd520fd cpumask: fix powernow-k8: partial revert of 2fdf66b491
Impact: fix powernow-k8 when acpi=off (or other error).

There was a spurious change introduced into powernow-k8 in this patch:
so that we try to "restore" the cpus_allowed we never saved.  We revert
that file.

See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when
acpi=off" from Yinghai for the bug report.

Cc: Mike Travis <travis@sgi.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 17:31:59 +10:30
Ingo Molnar
72b623c736 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/power-tracer 2009-02-15 20:43:03 +01:00
Yinghai Lu
88d0f550d7 x86: make 32bit to call enable_IO_APIC early like 64bit
Impact: cleanup

So we remove some #ifdefs.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 13:23:46 +01:00
Thomas Gleixner
be716615fe x86, vm86: fix preemption bug
Commit 3d2a71a596 ("x86, traps: converge
do_debug handlers") changed the preemption disable logic of do_debug()
so handle_vm86_trap() is called with preemption disabled, resulting in:

 BUG: sleeping function called from invalid context at include/linux/kernel.h:155
 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin
 Pid: 3005, comm: dosemu.bin Tainted: G        W  2.6.29-rc1 #51
 Call Trace:
  [<c050d669>] copy_to_user+0x33/0x108
  [<c04181f4>] save_v86_state+0x65/0x149
  [<c0418531>] handle_vm86_trap+0x20/0x8f
  [<c064e345>] do_debug+0x15b/0x1a4
  [<c064df1f>] debug_stack_correct+0x27/0x2c
  [<c040365b>] sysenter_do_call+0x12/0x2f
 BUG: scheduling while atomic: dosemu.bin/3005/0x10000001

Restore the original calling convention and reenable preemption before
calling handle_vm86_trap().

Reported-by: Michal Suchanek <hramrach@centrum.cz>
Cc: stable@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 10:46:13 +01:00
Yinghai Lu
f6db44df5b x86: fix typo in filter_cpuid_features()
Impact: fix wrong disabling of cpu features

An AMD system got this strange output:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5

but in /proc/cpuinfo I have:

 cpuid level	: 5

On an Intel system:

 CPU: CPU feature monitor disabled due to lack of CPUID level 0x5
 CPU: CPU feature dca disabled due to lack of CPUID level 0x9

but in /proc/cpuinfo I have:

 cpuid level     : 11

It turns out there is a typo, and we should use the level member in df.
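
The check described above can be sketched as follows. This is a minimal illustration, not the kernel's code: the struct layout, the helper name feature_needs_disable() and the omission of extended CPUID levels are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified table entry: a feature bit index and the
 * CPUID leaf ("level") that the feature depends on. */
struct cpuid_dependent_feature {
	uint32_t feature;
	uint32_t level;
};

/* Returns 1 if the feature has to be disabled because the CPU's highest
 * supported CPUID leaf is below the leaf the feature requires. */
static int feature_needs_disable(const struct cpuid_dependent_feature *df,
				 uint32_t cpuid_level)
{
	/* Compare the required leaf (df->level), not the feature index
	 * (df->feature), against the CPU's CPUID level. */
	return df->level > cpuid_level;
}
```

With a required level of 5 and a CPU reporting cpuid level 5, the sketch keeps the feature enabled. Comparing the feature index instead would disable it whenever that index happens to exceed the CPUID level, which is the symptom shown in the logs above.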

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-15 09:03:29 +01:00
Chris Ball
e49590b6dd x86, olpc: fix model detection without OFW
Impact: fix "garbled display, laptop is unusable" bug

Commit e51a1ac2df ("x86, olpc: fix endian
bug in openfirmware workaround") breaks model comparison on OLPC; the value
0xc2 needs to be scaled up by olpc_board().

The pre-patch version was wrong, but accidentally worked anyway
(big-endian 0xc2 is big enough to satisfy all other board revisions,
but little endian 0xc2 is not).

Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andres Salomon <dilinger@queued.net>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-14 23:05:25 +01:00
Jeremy Fitzhardinge
0341c14da4 x86: use _types.h headers in asm where available
In general, the only definitions that assembly files can use
are in _types.h headers (where available), so convert them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-13 11:35:01 -08:00
Dimitri Sivanich
c466ed2e43 x86, UV: set full apicid in uv_hub_send_ipi
The uv_hub_send_ipi() function needs to set the full apicid in the
UVH_IPI_INT mmr.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 19:13:13 +01:00
Jason Baron
b5f9fd0f8a tracing: convert c/p state power tracer to use tracepoints
Convert the c/p state "power" tracer to use tracepoints. Avoids a
function call when the tracer is disabled.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-13 09:06:18 -05:00
Ingo Molnar
1c511f740f Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', 'tracing/urgent' and 'linus' into tracing/core 2009-02-13 10:25:18 +01:00
Ingo Molnar
7032e86967 Merge branches 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-02-13 09:47:32 +01:00
Ingo Molnar
a56cdcb662 Merge branches 'x86/acpi', 'x86/asm', 'x86/cpudetect', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/header-fixes', 'x86/headers' and 'x86/minor-fixes' into x86/core 2009-02-13 09:46:36 +01:00
Ingo Molnar
ab639f3593 Merge branch 'core/percpu' into x86/core 2009-02-13 09:45:09 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
john stultz
b13e24644c x86, hpet: fix for LS21 + HPET = boot hang
Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21
systems that had the HPET enabled in the BIOS, resulting in boot hangs
for x86_64.

Specifically commit b8ce335906, which
merges the i386 and x86_64 HPET code.

Prior to this commit, when we setup the HPET timers in x86_64, we did
the following:

	hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL |
                    HPET_TN_32BIT, HPET_T0_CFG);

However after the i386/x86_64 HPET merge, we do the following:

	cfg = hpet_readl(HPET_Tn_CFG(timer));
	cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC |
			HPET_TN_SETVAL | HPET_TN_32BIT;
	hpet_writel(cfg, HPET_Tn_CFG(timer));

However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register
boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This
causes the periodic interrupt to be not so periodic, and that results in
the boot time hang I reported earlier in the delay calibration.

My fix: Always disable HPET_TN_LEVEL when setting up periodic mode.
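
A minimal sketch of that fix, assuming the usual HPET timer configuration bit values (they are restated here for the example and should be checked against the real headers). The helper hpet_periodic_cfg() is hypothetical; it computes the value that would be written back to the timer's config register.

```c
#include <stdint.h>

/* Timer-N configuration bits (assumed values for this sketch). */
#define HPET_TN_LEVEL    0x0002u
#define HPET_TN_ENABLE   0x0004u
#define HPET_TN_PERIODIC 0x0008u
#define HPET_TN_SETVAL   0x0040u
#define HPET_TN_32BIT    0x0100u

/* Hypothetical helper: derive the periodic-mode config word from
 * whatever value the firmware left in the register. */
static uint32_t hpet_periodic_cfg(uint32_t cfg)
{
	/* Clear level-triggered mode first, so a BIOS default cannot
	 * survive the read-modify-write. */
	cfg &= ~HPET_TN_LEVEL;
	cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC |
	       HPET_TN_SETVAL | HPET_TN_32BIT;
	return cfg;
}
```

The point of the sketch is the ordering: a read-modify-write that only ORs bits in keeps any stale bit the BIOS set, so the unwanted bit has to be masked out explicitly.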

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 09:15:46 +01:00
Thomas Gleixner
34b0900d32 x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
Impact: Catch cases where lazy MMU state is active in a preemptible context

arch_flush_lazy_mmu_cpu() has been changed to disable preemption so
the checks in enter/leave will never trigger. Put the preemptible()
check into arch_flush_lazy_mmu_cpu() to catch such cases.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12 23:11:58 +01:00
Jeremy Fitzhardinge
d85cf93da6 x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
Impact: avoid access to percpu vars in preemptible context

They are intended to be used whenever there's the possibility
that there's some stale state which is going to be overwritten
with a queued update, or to force a state change when we may be
in lazy mode.  Either way, we could end up calling it with
preemption enabled, so wrap the functions in their own little
preempt-disable section so they can be safely called in any
context (though preemption should never be enabled if we're actually
in a lazy state).

(Move out of line to avoid #include dependencies.)
    
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-02-12 23:11:58 +01:00
H. Peter Anvin
7445250927 x86: merge sys_rt_sigreturn between 32 and 64 bits
Impact: cleanup

With the recent changes in the 32-bit code to make system calls which
use struct pt_regs take a pointer, sys_rt_sigreturn() has become
identical between 32 and 64 bits, and both are empty wrappers around
do_rt_sigreturn().  Remove both wrappers and rename both to
sys_rt_sigreturn().

Cc: Brian Gerst <brgerst@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 16:31:40 -08:00
Brian Gerst
b12bdaf11f x86: use regparm(3) for passed-in pt_regs pointer
Some syscalls need to access the pt_regs structure, either to copy
user register state or to modify it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to take the pointer as an argument instead of relying on
the assumption that the pt_regs structure overlaps the function
arguments.

Drop the use of regparm(1) due to concern about gcc bugs, and to move
in the direction of the eventual removal of regparm(0) for asmlinkage.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 14:00:56 -08:00
Jaswinder Singh Rajput
ba1511bf7f x86: kernel/mpparse.c fix compilation warnings
arch/x86/kernel/mpparse.c: In function ‘smp_scan_config’:
 arch/x86/kernel/mpparse.c:696: warning: format ‘%08lx’ expects type ‘long unsigned int’, but argument 3 has type ‘phys_addr_t’
 arch/x86/kernel/mpparse.c: In function ‘update_mp_table’:
 arch/x86/kernel/mpparse.c:1014: warning: format ‘%lx’ expects type ‘long unsigned int’, but argument 2 has type ‘phys_addr_t’
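
One common way to silence this class of warning is to cast the value to the widest integer type and use the matching conversion. The sketch below is a userspace illustration only: the phys_addr_t typedef and the helper format_phys_addr() are stand-ins, and it does not claim to be the change this commit made.

```c
#include <stdio.h>
#include <stdint.h>

/* Stand-in for the kernel's phys_addr_t, whose width depends on the
 * configuration (an assumption for this sketch). */
typedef uint64_t phys_addr_t;

/* Format a physical address portably: cast to unsigned long long and
 * use %llx. Returns the number of characters written. */
static int format_phys_addr(char *buf, size_t len, phys_addr_t addr)
{
	return snprintf(buf, len, "%08llx", (unsigned long long)addr);
}
```

Because the cast fixes the argument type regardless of how wide phys_addr_t is, the same format string works for both 32-bit and 64-bit physical addresses.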

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 21:01:08 +01:00
Linus Torvalds
94dba89533 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: fix TIMER_ABSTIME for process wide cpu timers
  timers: split process wide cpu clocks/timers, fix
  x86: clean up hpet timer reinit
  timers: split process wide cpu clocks/timers, remove spurious warning
  timers: split process wide cpu clocks/timers
  signal: re-add dead task accumulation stats.
  x86: fix hpet timer reinit for x86_64
  sched: fix nohz load balancer on cpu offline
2009-02-11 08:24:32 -08:00
Linus Torvalds
9ce04f9238 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ptrace, x86: fix the usage of ptrace_fork()
  i8327: fix outb() parameter order
  x86: fix math_emu register frame access
  x86: math_emu info cleanup
  x86: include correct %gs in a.out core dump
  x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
  x86: find nr_irqs_gsi with mp_ioapic_routing
  x86: add clflush before monitor for Intel 7400 series
  x86: disable intel_iommu support by default
  x86: don't apply __supported_pte_mask to non-present ptes
  x86: fix grammar in user-visible BIOS warning
  x86/Kconfig.cpu: make Kconfig help readable in the console
  x86, 64-bit: print DMI info in the oops trace
2009-02-11 08:23:22 -08:00
Markus Metzger
9f339e7028 x86, ptrace, mm: fix double-free on race
ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 15:44:20 +01:00
Brian Gerst
9c8bb6b534 x86: drop -fno-stack-protector annotations after pt_regs fixes
Now that no functions rely on struct pt_regs being passed by value,
various "no stack protector" annotations can be dropped.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:45 +01:00
Brian Gerst
253f29a4ae x86: pass in pt_regs pointer for syscalls that need it
Some syscalls need to access the pt_regs structure, either to copy
user register state or to modify it.  This patch adds stubs to load
the address of the pt_regs struct into the %eax register, and changes
the syscalls to regparm(1) to receive the pt_regs pointer as the
first argument.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:45 +01:00
Brian Gerst
aa78bcfa01 x86: use pt_regs pointer in do_device_not_available()
The generic exception handler (error_code) passes in the pt_regs
pointer and the error code (unused in this case).  The commit
"x86: fix math_emu register frame access" changed this to pass by
value, which doesn't work correctly with stack protector enabled.
Change it back to use the pt_regs pointer.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 12:40:44 +01:00
Tejun Heo
5c79d2a517 x86: fix x86_32 stack protector bugs
Impact: fix x86_32 stack protector

Brian Gerst found out that %gs was being initialized to stack_canary
instead of stack_canary - 20, which basically gave the same canary
value for all threads.  Fixing this also exposed the following bugs.

* cpu_idle() didn't call boot_init_stack_canary()

* stack canary switching in switch_to() was being done too late making
  the initial run of a new thread use the old stack canary value.

Fix all of them and while at it update comment in cpu_idle() about
calling boot_init_stack_canary().

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 11:33:49 +01:00
Ingo Molnar
160d8dac12 x86, apic: make generic_apic_probe() generally available
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 11:27:39 +01:00
Ingo Molnar
d5b5a232b2 Merge branch 'x86/apic' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/apic 2009-02-11 10:49:40 +01:00
Alok Kataria
0e81cb59c7 x86, apic: fix initialization of wakeup_cpu
With refactoring of wake_cpu macros the 32bit code in tip doesn't
execute generic_apic_probe if CONFIG_X86_32_NON_STANDARD is not set.

Even on an x86 STANDARD CPU we need to execute the generic_apic_probe
function, as we rely on this function to execute the update_genapic
quirk which initializes apic->wakeup_cpu.

Failing to do so results in a call through a NULL function pointer in do_boot_cpu.

The stack trace without the patch goes like this.

Booting processor 1 APIC 0x1 ip 0x6000
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<(null)>] (null)
*pdpt = 0000000000839001 *pde = 0000000000c97067 *pte = 0000000000000163
Oops: 0000 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 1, comm: swapper Not tainted (2.6.29-rc4-tip #18) VMware Virtual Platform
EIP: 0062:[<00000000>] EFLAGS: 00010293 CPU: 0
EIP is at 0x0
EAX: 00000001 EBX: 00006000 ECX: c077ed00 EDX: 00006000
ESI: 00000001 EDI: 00000001 EBP: ef04cf40 ESP: ef04cf1c
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 006a
Process swapper (pid: 1, ti=ef04c000 task=ef050000 task.ti=ef04c000)
Stack:
 c0644e52 00000000 ef04cf24 ef04cf24 c064468d c0886dc0 00000000 c0702aea
 ef055480 00000001 00000101 dead4ead ffffffff ffffffff c08af530 00000000
 c0709715 ef04cf60 ef04cf60 00000001 00000000 00000000 dead4ead ffffffff
Call Trace:
 [<c0644e52>] ? native_cpu_up+0x2de/0x45b
 [<c064468d>] ? do_fork_idle+0x0/0x19
 [<c0645c5e>] ? _cpu_up+0x88/0xe8
 [<c0645d20>] ? cpu_up+0x42/0x4e
 [<c07e7462>] ? kernel_init+0x99/0x14b
 [<c07e73c9>] ? kernel_init+0x0/0x14b
 [<c040375f>] ? kernel_thread_helper+0x7/0x10
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 006a:ef04cf1c

I think we should call generic_apic_probe unconditionally for 32 bit now.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 10:48:14 +01:00
Steven Rostedt
f47a454db9 tracing, x86: fix constraint for parent variable
The constraint used for retrieving and restoring the parent function
pointer is incorrect. The parent variable is a pointer, and the
address of the pointer is modified by the asm statement and not
the pointer itself. It is incorrect to pass it in as an output
constraint since the asm will never update the pointer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 10:06:13 +01:00
Ingo Molnar
4040068dce Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-11 10:03:53 +01:00
Ingo Molnar
d524e03207 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-11 10:03:11 +01:00
Steven Rostedt
e3944bfac9 tracing, x86: fix fixup section to return to original code
Impact: fix to prevent a kernel crash on fault

If for some reason the pointer to the parent function on the
stack takes a fault, the fix up code will not return back to
the original faulting code. This can lead to unpredictable
results and perhaps even a kernel panic.

A fault should not happen, but if it does, we should simply
disable the tracer, warn, and continue running the kernel.
It should not lead to a kernel crash.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10 13:07:13 -05:00
Steven Rostedt
966657883f tracing, x86: fix constraint for parent variable
The constraint used for retrieving and restoring the parent function
pointer is incorrect. The parent variable is a pointer, and the
address of the pointer is modified by the asm statement and not
the pointer itself. It is incorrect to pass it in as an output
constraint since the asm will never update the pointer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-10 11:53:23 -05:00
Ingo Molnar
f9915bfef3 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-10 13:25:42 +01:00
Clemens Ladisch
b52af40923 i8327: fix outb() parameter order
In i8237A_resume(), when resetting the DMA controller, the parameters to
dma_outb() were mixed up.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
[ cleaned up the file a tiny bit. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 13:13:23 +01:00
Tejun Heo
60a5317ff0 x86: implement x86_32 stack protector
Impact: stack protector for x86_32

Implement stack protector for x86_32.  GDT entry 28 is used for it.
It's set to point to stack_canary-20 and has a length of 24 bytes.
CONFIG_CC_STACKPROTECTOR turns off CONFIG_X86_32_LAZY_GS and sets %gs
to the stack canary segment on entry.  As %gs is otherwise unused by
the kernel, the canary can be anywhere.  It's defined as a percpu
variable.
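
The offset arithmetic can be sketched as below, assuming gcc's x86_32 convention of reading the canary at %gs:20. The helper names are hypothetical and addresses are modelled as plain integers.

```c
#include <stdint.h>

#define CANARY_OFFSET 20u   /* gcc reads the x86_32 canary at %gs:20 */
#define CANARY_SIZE    4u   /* one 32-bit word */
#define SEGMENT_LEN   24u   /* offset plus canary size */

/* Segment base that makes %gs:20 land exactly on the canary. */
static uintptr_t canary_segment_base(uintptr_t canary_addr)
{
	return canary_addr - CANARY_OFFSET;
}

/* Address that a compiler-generated %gs:20 access resolves to. */
static uintptr_t canary_access_addr(uintptr_t segment_base)
{
	return segment_base + CANARY_OFFSET;
}
```

If the base were set to the canary's own address instead, the access would land 20 bytes past the canary, which is why the base has to be biased by the offset.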

x86_32 exception handlers take register frame on stack directly as
struct pt_regs.  With -fstack-protector turned on, gcc copies the
whole structure after the stack canary and (of course) doesn't copy
back on return thus losing all changes.  For now, -fno-stack-protector
is added to all files which contain those functions.  We definitely
need something better.
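
The by-value problem can be illustrated in plain C. In standard C a struct parameter is always the callee's own object, and the old handlers only worked because gcc happened to modify the on-stack argument in place. The struct and function names below are hypothetical and heavily reduced.

```c
/* Hypothetical, heavily reduced register frame. */
struct regs_sketch {
	unsigned long ax;
	unsigned long ip;
};

/* By value: the handler edits its own copy, so the caller's frame
 * is left untouched. */
static void handler_by_value(struct regs_sketch regs)
{
	regs.ax = 42;
}

/* By pointer: the handler edits the caller's frame in place. */
static void handler_by_pointer(struct regs_sketch *regs)
{
	regs->ax = 42;
}
```

Passing a pointer makes the intent explicit and does not depend on how the compiler lays out or copies the argument.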

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:42:01 +01:00
Tejun Heo
ccbeed3a05 x86: make lazy %gs optional on x86_32
Impact: pt_regs changed, lazy gs handling made optional, add slight
        overhead to SAVE_ALL, simplifies error_code path a bit

On x86_32, %gs hasn't been used by the kernel and is handled lazily.  pt_regs
doesn't have a place for it and gs is saved/loaded only when necessary.
In preparation for stack protector support, this patch makes lazy %gs
handling optional by doing the following.

* Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.

* Save and restore %gs along with other registers in entry_32.S unless
  LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
  even when LAZY_GS.  However, it adds no overhead to common exit path
  and simplifies entry path with error code.

* Define different user_gs accessors depending on LAZY_GS and add
  lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
  lazy_*_gs() ops are used to save, load and clear %gs lazily.

* Define ELF_CORE_COPY_KERNEL_REGS() which always reads %gs directly.

xen and lguest changes need to be verified.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:42:00 +01:00
Tejun Heo
d9a89a26e0 x86: add %gs accessors for x86_32
Impact: cleanup

On x86_32, %gs is handled lazily.  It's not saved and restored on
kernel entry/exit but only when necessary which usually is during task
switch but there are few other places.  Currently, it's done by
calling savesegment() and loadsegment() explicitly.  Define
get_user_gs(), set_user_gs() and task_user_gs() and use them instead.

While at it, clean up register access macros in signal.c.

This cleans up code a bit and will help future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:58 +01:00
Tejun Heo
f0d96110f9 x86: use asm .macro instead of cpp #define in entry_32.S
Impact: cleanup

Use .macro instead of cpp #define where appropriate.  This cleans up
code and will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:57 +01:00
Ingo Molnar
92e2d50846 Merge branch 'x86/urgent' into core/percpu
Conflicts:
	arch/x86/kernel/acpi/boot.c
2009-02-10 00:41:02 +01:00
Ingo Molnar
5d96218b4a Merge branch 'x86/uaccess' into core/percpu 2009-02-10 00:40:48 +01:00
Tejun Heo
d315760ffa x86: fix math_emu register frame access
do_device_not_available() is the handler for #NM and it declares that
it takes an unsigned long and calls math_emu(), which takes a long
argument and surprisingly expects the stack frame starting at the zero
argument would match struct math_emu_info, which isn't true regardless
of configuration in the current code.

This patch makes do_device_not_available() take struct pt_regs like
other exception handlers and initialize struct math_emu_info with
pointer to it and pass pointer to the math_emu_info to math_emulate()
like normal C functions do.  This way, unless gcc makes a copy of
struct pt_regs in do_device_not_available(), the register frame is
correctly accessed regardless of kernel configuration or compiler
used.

This doesn't fix all math_emu problems but it at least gets it
somewhat working.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:39:14 +01:00
Jeremy Fitzhardinge
ca97ab9016 x86: unstatic ioapic entry funcs
Unstatic ioapic_write_entry and setup_ioapic_entry functions so that
the Xen code can do its own ioapic routing setup.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-09 14:04:31 -08:00
Jeremy Fitzhardinge
c3e137d1e8 x86: add mp_find_ioapic_pin
Add mp_find_ioapic_pin() to find an IO APIC's specific pin from a GSI,
and use this function within acpi/boot.  Make it non-static so other
code can use it too.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-09 14:04:26 -08:00
Jeremy Fitzhardinge
4924e228ae x86: unstatic mp_find_ioapic so it can be used elsewhere
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-09 14:04:19 -08:00
Linus Torvalds
6707fbb56c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS table
  [CPUFREQ] Make ignore_nice_load setting of ondemand work as expected.
2009-02-09 13:58:22 -08:00
Ingo Molnar
249d51b53a Merge commit 'v2.6.29-rc4' into core/percpu
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-09 14:58:11 +01:00
Yinghai Lu
b825e6cc7b x86, es7000: fix ACPI table mappings
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:35:37 +01:00
Yinghai Lu
7d97277b75 acpi/x86: introduce __apci_map_table, v4
To prevent wrongly overwriting a fixmap entry that is still in use.

ACPI used to rely on low mappings being all linearly mapped and
grew a habit: it never really unmapped certain kinds of tables
after use.

This can cause problems - for example the hypothetical case
when some spurious access still references it.

v2: remove prev_map and prev_size in __acpi_map_table
v3: let acpi_os_unmap_memory() call early_iounmap too, so remove the extra calls to
early_acpi_os_unmap_memory
v4: fix typo in one acpi_get_table_with_size call

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:35:07 +01:00
Jeremy Fitzhardinge
05876f88ed acpi: remove final __acpi_map_table mapping before setting acpi_gbl_permanent_mmap
On x86, __acpi_map_table uses early_ioremap() to create the mapping,
replacing the previous mapping with a new one.  Once enough of the
kernel is up and running it switches to using normal ioremap().  At
that point, we need to clean up the final mapping to avoid a warning
from the early_ioremap subsystem.

This can be removed after all the instances in the ACPI code are fixed
that rely on early-ioremap's implicit overmapping of previously
mapped tables.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:34:46 +01:00
Jeremy Fitzhardinge
eecb9a697f x86: always explicitly map acpi memory
Always map acpi tables, rather than assuming we can use the normal
linear mapping to access the acpi tables.  This is necessary in a
virtual environment where the linear mappings are to pseudo-physical
memory, but the acpi tables exist at a real physical address.  It
doesn't hurt to map in the normal non-virtual case, so just do it
unconditionally.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:34:12 +01:00
Jeremy Fitzhardinge
1c14fa4937 x86: use early_ioremap in __acpi_map_table
__acpi_map_table() effectively reimplements early_ioremap().  Rather
than have that duplication, just implement it in terms of
early_ioremap().

However, unlike early_ioremap(), __acpi_map_table() just maintains a
single mapping which gets replaced each call, and has no corresponding
unmap function.  Implement this by just removing the previous mapping
each time it's called.  Unfortunately, this will leave a stray mapping
at the end.
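
The single-mapping behaviour can be modelled with a counter. This is a toy sketch, not the kernel code: sketch_map_table() is a hypothetical name, and the increments and decrements stand in for early_ioremap() and early_iounmap().

```c
static int live_mappings;   /* simulated count of live early mappings */
static int have_prev;       /* whether an earlier mapping exists */

/* Replace the previous mapping with a new one on every call. */
static void sketch_map_table(void)
{
	if (have_prev)
		live_mappings--;   /* stands in for early_iounmap(prev) */
	live_mappings++;           /* stands in for early_ioremap(new) */
	have_prev = 1;
}
```

After any number of calls exactly one mapping remains live, which corresponds to the stray mapping mentioned above.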

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:33:51 +01:00
Alok Kataria
55a8ba4b7f x86, vmi: put a missing paravirt_release_pmd in pgd_dtor
Commit 6194ba6ff6 ("x86: don't special-case
pmd allocations as much") made changes to the way we handle pmd allocations,
and while doing that it dropped a call to paravirt_release_pmd on the
pgd page from the pgd_dtor code path.

As a result of this missing release, the hypervisor is now unaware of the
pgd page being freed, and as a result it ends up tracking this page as a
page table page.

After this the guest may start using the same page for other purposes, and
depending on what use the page is put to, it may result in various performance
and/or functional issues (hangs, reboots).

Since this release is only required for VMI, I now release the pgd page from
the (vmi)_pgd_free hook.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
2009-02-09 13:10:13 +01:00
Yinghai Lu
3f4a739c6a x86: find nr_irqs_gsi with mp_ioapic_routing
Impact: find right nr_irqs_gsi on some systems.

One test-system has gap between gsi's:

[    0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
[    0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
[    0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
[    0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
...
[    0.000000] nr_irqs_gsi: 38

So nr_irqs_gsi is not right: some IRQs for MSI will overlap with IO-APIC GSIs.

We need to get the value from acpi_probe_gsi() when the ACPI IO-APIC is used.
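
The arithmetic behind the log can be sketched as follows, using only the three GSI ranges shown above. The struct and helper names are hypothetical, and the sketch makes no claim about how acpi_probe_gsi() is implemented.

```c
/* Hypothetical, reduced description of one IO-APIC's GSI range. */
struct ioapic_gsi_range {
	int gsi_base;
	int gsi_end;   /* inclusive */
};

/* Summing pin counts ignores gaps between the ranges. */
static int count_pins(const struct ioapic_gsi_range *r, int n)
{
	int total = 0;

	for (int i = 0; i < n; i++)
		total += r[i].gsi_end - r[i].gsi_base + 1;
	return total;
}

/* Taking the highest GSI plus one covers the gaps as well. */
static int highest_gsi_plus_one(const struct ioapic_gsi_range *r, int n)
{
	int max = -1;

	for (int i = 0; i < n; i++)
		if (r[i].gsi_end > max)
			max = r[i].gsi_end;
	return max + 1;
}
```

For the ranges 0-23, 48-54 and 56-62, counting pins gives 24 + 7 + 7 = 38, the value in the log, while the highest GSI plus one gives 63. Any IRQ number handed out between 38 and 62 would therefore collide with a real GSI.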

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 12:42:59 +01:00
Ingo Molnar
eca217b36e Merge branch 'x86/paravirt' into x86/apic
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
2009-02-09 12:16:59 +01:00
Jeremy Fitzhardinge
7c1d7cdcef x86: unify do_IRQ()
With the differences in interrupt handling hoisted into handle_irq(),
do_IRQ is more or less identical between 32 and 64 bit, so unify it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 12:16:05 +01:00
Jeremy Fitzhardinge
9b2b76a334 x86: add handle_irq() to allow interrupt injection
Xen uses a different interrupt path, so introduce handle_irq() to
allow interrupts to be inserted into the normal interrupt path.  This
is handled slightly differently on 32 and 64-bit.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 12:15:57 +01:00
Ingo Molnar
726c0d95b6 x86: early_printk.c - fix pgtable.h unification fallout
arch/x86/kernel/early_printk.c: In function ‘early_dbgp_init’:
 arch/x86/kernel/early_printk.c:827: error: ‘PAGE_KERNEL_NOCACHE’ undeclared (first use in this function)
 arch/x86/kernel/early_printk.c:827: error: (Each undeclared identifier is reported only once
 arch/x86/kernel/early_printk.c:827: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 11:32:17 +01:00
Ingo Molnar
790c7ebbe9 Merge branch 'jsgf/x86/unify' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/headers 2009-02-09 11:19:29 +01:00
Pallipadi, Venkatesh
e736ad548d x86: add clflush before monitor for Intel 7400 series
For Intel 7400 series CPUs, the recommendation is to use a clflush on the
monitored address just before monitor and mwait pair [1].

This clflush makes sure that there are no false wakeups from mwait when the
monitored address was recently written to.

[1] "MONITOR/MWAIT Recommendations for Intel Xeon Processor 7400 series"
    section in specification update document of 7400 series
    http://download.intel.com/design/xeon/specupdt/32033601.pdf

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 11:15:15 +01:00
Frederic Weisbecker
3861a17bcc tracing/function-graph-tracer: drop the kernel_text_address check
When the function graph tracer picks a return address, it ensures this address
is really a kernel text one by calling __kernel_text_address().

Actually this path has never been taken. Its role was more likely to debug the tracer
at the beginning of its development, but this check is wasteful since it is called
for every traced function.

The fault check is already sufficient.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:51:38 +01:00
Frederic Weisbecker
1292211058 tracing/power: move the power trace headers to a dedicated file
Impact: cleanup

Move the power tracer headers to trace/power.h to keep ftrace.h and the power bits
easier to maintain as separate topics.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:51:38 +01:00
Ingo Molnar
44b0635481 Merge branch 'tip/tracing/core/devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Conflicts:
	kernel/trace/trace_hw_branches.c
2009-02-09 10:35:12 +01:00
Ingo Molnar
4ad476e11f Merge commit 'v2.6.29-rc4' into tracing/core 2009-02-09 10:32:48 +01:00
Brian Gerst
2add8e235c x86: use linker to offset symbols by __per_cpu_load
Impact: cleanup and bug fix

Use the linker to create symbols for certain per-cpu variables
that are offset by __per_cpu_load.  This allows the removal of
the runtime fixup of the GDT pointer, which fixes a bug with
resume reported by Jiri Slaby.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:30:30 +01:00
Arjan van de Ven
2c344e9d6e x86: don't pretend that non-framepointer stack traces are reliable
Without frame pointers enabled, the x86 stack traces should not
pretend to be reliable; instead they should just be what they are:
unreliable.

The effect of this is that they have a '?' printed in the stacktrace,
to warn the reader that these entries are guesses rather than known
based on more reliable information.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 09:45:29 +01:00
Yinghai Lu
cc6c50066e x86: find nr_irqs_gsi with mp_ioapic_routing
Impact: find right nr_irqs_gsi on some systems.

One test-system has gap between gsi's:

[    0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
[    0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
[    0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
[    0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
...
[    0.000000] nr_irqs_gsi: 38

So nr_irqs_gsi is not right: some IRQs used for MSI will collide with io_apic GSIs.

We need to get that value with acpi_probe_gsi when the ACPI io_apic is used.
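
As a hedged illustration of the gap problem, the sketch below contrasts summing the per-IOAPIC pin counts with taking the highest GSI plus one. It assumes only the three IOAPICs shown in the log above; the struct and function names are hypothetical and do not reproduce the kernel's acpi_probe_gsi.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one IOAPIC's GSI range (inclusive). */
struct gsi_range {
	unsigned int gsi_base;
	unsigned int gsi_end;
};

/* Ranges copied from the boot log above. */
static const struct gsi_range ioapics[] = {
	{  0, 23 },
	{ 48, 54 },
	{ 56, 62 },
};

/* Counting pins ignores the gaps between the ranges. */
static unsigned int count_pins(const struct gsi_range *r, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += r[i].gsi_end - r[i].gsi_base + 1;
	return total;
}

/* Highest GSI + 1 covers every GSI number, gaps included. */
static unsigned int highest_gsi_plus_one(const struct gsi_range *r, size_t n)
{
	unsigned int max = 0;

	for (size_t i = 0; i < n; i++)
		if (r[i].gsi_end + 1 > max)
			max = r[i].gsi_end + 1;
	return max;
}
```

With these ranges the pin count is 38, matching the logged nr_irqs_gsi, while the highest GSI plus one is 63, so GSIs 48-62 would otherwise fall in the range handed out to MSI.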

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 09:22:09 +01:00
Yinghai Lu
f72dccace7 x86: check_timer cleanup
Impact: make check-timer more robust potentially solve boot fragility

For edge trigger io-apic routing, we already unmasked the pin via
setup_IO_APIC_irq(), so don't unmask it again.

Also call local_irq_disable() between timer_irq_works(), because it
calls local_irq_enable() inside.

Also remove not needed apic version reading for 64-bit

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 09:21:29 +01:00
Yinghai Lu
abcaa2b831 x86: use NR_IRQS_LEGACY to replace 16
Impact: cleanup

also could kill platform_legacy_irq

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 09:21:28 +01:00
Yinghai Lu
f1ee5548a6 x86/irq: optimize nr_irqs
Impact: make nr_irqs depend more on cards used in a system

depend on nr_irq_gsi more, and have a ratio for MSI.

v2: make nr_irqs less than NR_VECTORS * nr_cpu_ids
    aka if only one cpu, we only can support nr_irqs = NR_VECTORS

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 09:21:27 +01:00
Ingo Molnar
5ba1ae92b6 Merge branches 'timers/clockevents', 'timers/hpet', 'timers/hrtimers' and 'timers/urgent' into timers/core 2009-02-08 20:14:11 +01:00
Steven Rostedt
a81bd80a0b ring-buffer: use generic version of in_nmi
Impact: clean up

Now that a generic in_nmi is available, this patch removes the
special code in the ring_buffer and implements the in_nmi generic
version instead.

With this change, I was also able to rename the "arch_ftrace_nmi_enter"
back to "ftrace_nmi_enter" and remove the code from the ring buffer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:03:33 -05:00
Steven Rostedt
9a5fd90227 ftrace: change function graph tracer to use new in_nmi
The function graph tracer piggy backed onto the dynamic ftracer
to use the in_nmi custom code for dynamic tracing. The problem
was (as Andrew Morton pointed out) it really only wanted to bail
out if the context of the current CPU was in NMI context. But the
dynamic ftrace in_nmi custom code was true if _any_ CPU happened
to be in NMI context.

Now that we have a generic in_nmi interface, this patch changes
the function graph code to use it instead of the dynamic ftrace
custom code.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:02:55 -05:00
Steven Rostedt
4e6ea1440c ftrace, x86: rename in_nmi variable
Impact: clean up

The in_nmi variable in x86 arch ftrace.c is a misnomer.
Andrew Morton pointed out that the in_nmi variable is incremented
by all CPUS. It can be set when another CPU is running an NMI.

Since this is actually intentional, the fix is to rename it to
what it really is: "nmi_running"

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:01:21 -05:00
Steven Rostedt
78d904b46a ring-buffer: add NMI protection for spinlocks
Impact: prevent deadlock in NMI

The ring buffers are not yet totally lockless with writing to
the buffer. When a writer crosses a page, it grabs a per cpu spinlock
to protect against a reader. The spinlocks taken by a writer are not
to protect against other writers, since a writer can only write to
its own per cpu buffer. The spinlocks protect against readers that
can touch any cpu buffer. The writers are made to be reentrant
with the spinlocks disabling interrupts.

The problem arises when an NMI writes to the buffer, and that write
crosses a page boundary. If it grabs a spinlock, it can be racing
with another writer (since disabling interrupts does not protect
against NMIs) or with a reader on the same CPU. Luckily, most of the
users are not reentrant and protect against this issue. But if a
user of the ring buffer becomes reentrant (which is what the ring
buffers do allow), if the NMI also writes to the ring buffer then
we risk the chance of a deadlock.

This patch moves the ftrace_nmi_enter called by nmi_enter() to the
ring buffer code. It replaces the current ftrace_nmi_enter that is
used by arch specific code to arch_ftrace_nmi_enter and updates
the Kconfig to handle it.

When an NMI is called, it will set a per cpu variable in the ring buffer
code and will clear it when the NMI exits. If a write to the ring buffer
crosses page boundaries inside an NMI, a trylock is used on the spin
lock instead. If the spinlock fails to be acquired, then the entry
is discarded.
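
The trylock-in-NMI idea can be sketched in user space as follows. This is a minimal model, assuming a single lock and a single "in NMI" flag; the names are hypothetical and this is not the ring buffer's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-cpu state; not the kernel's names. */
static atomic_flag buffer_lock = ATOMIC_FLAG_INIT; /* simple spinlock */
static bool in_nmi_context;                        /* set on NMI entry */

static bool lock_try(void)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set_explicit(&buffer_lock,
						  memory_order_acquire);
}

static void lock_spin(void)
{
	while (!lock_try())
		; /* busy-wait */
}

static void lock_release(void)
{
	atomic_flag_clear_explicit(&buffer_lock, memory_order_release);
}

/*
 * Returns true if the entry was written, false if it was discarded.
 * Outside NMI context the writer may spin; inside an NMI it must not,
 * because the lock holder could be the code the NMI interrupted.
 */
static bool write_across_page(void)
{
	if (in_nmi_context) {
		if (!lock_try())
			return false; /* discard rather than deadlock */
	} else {
		lock_spin();
	}
	/* ... move to the next page and write the entry here ... */
	lock_release();
	return true;
}
```

Discarding an entry loses one event, which is the trade-off accepted here in exchange for never spinning on a lock that the interrupted context may hold.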

This bug appeared in the ftrace work in the RT tree, where event tracing
is reentrant. This workaround solved the deadlocks that appeared there.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:00:17 -05:00
Len Brown
2d29c6a075 Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release 2009-02-07 01:34:56 -05:00
Jeremy Fitzhardinge
fb08b20fe7 x86: Fix compile error in arch/x86/kernel/early_printk.c
Fix compile problem:

  CC      arch/x86/kernel/early_printk.o
In file included from /home/jeremy/hg/xen/paravirt/linux/arch/x86/kernel/early_printk.c:17:
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h: In function 'pmd_page':
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:516: error: implicit declaration of function '__pfn_to_section'
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:516: warning: initialization makes pointer from integer without a cast
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:516: error: implicit declaration of function '__section_mem_map_addr'
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:516: warning: return makes pointer from integer without a cast
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h: In function 'pud_page':
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:586: warning: initialization makes pointer from integer without a cast
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:586: warning: return makes pointer from integer without a cast
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h: In function 'pgd_page':
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:625: warning: initialization makes pointer from integer without a cast
/home/jeremy/hg/xen/paravirt/linux/arch/x86/include/asm/pgtable.h:625: warning: return makes pointer from integer without a cast

This is a cyclic dependency between asm/pgtable.h and linux/mmzone.h
when using CONFIG_SPARSEMEM.  Rather than hacking up the headers some
more, remove asm/pgtable.h, since early_printk.c doesn't actually need
it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-06 14:05:42 -08:00
Pavel Emelyanov
ff08f76d73 x86: clean up hpet timer reinit
Implement Linus's suggestion: introduce the hpet_cnt_ahead()
helper function to compare hpet time values - like other
wrapping counter comparisons are abstracted away elsewhere.
(jiffies, ktime_t, etc.)

Reported-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-06 15:07:13 +01:00
Ingo Molnar
4f179d1218 x86, numaq: cleanups
Also move xquad_portio over to where it's allocated.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:14 +01:00
Ingo Molnar
9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Yinghai Lu
c5e9548203 x86: move default_ipi_xx back to ipi.c
Impact: cleanup

only leave _default_ipi_xx etc in .h

Beyond the cleanup factor, this saves a bit of code size as well:

    text	   data	    bss	    dec	            hex	filename
 7281931	1630144	1463304	10375379	 9e50d3	vmlinux.before
 7281753	1630144	1463304	10375201	 9e5021	vmlinux.after

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:56 +01:00
Ingo Molnar
fdbecd9fd1 x86, apic: explain the purpose of max_physical_apicid
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:55 +01:00
Ingo Molnar
65a4e574d2 smp, generic: introduce arch_disable_smp_support() instead of disable_ioapic_setup()
Impact: cleanup

disable_ioapic_setup() in init/main.c is ugly as the function is
x86-specific. The #ifdef inline prototype there is ugly too.

Replace it with a generic arch_disable_smp_support() function - which
has a weak alias for non-x86 architectures and for non-ioapic x86 builds.
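
A weak default of this kind can be sketched as below, assuming a GCC- or Clang-compatible compiler. The flag and the caller are hypothetical and only illustrate the linking behaviour; they are not the code in init/main.c.

```c
#include <assert.h>

static int smp_support_disabled; /* hypothetical flag, for illustration only */

/*
 * Generic default: a weak symbol that does nothing. An architecture that
 * needs extra work supplies a non-weak definition with the same name in
 * another object file, and the linker picks that one instead.
 */
__attribute__((weak)) void arch_disable_smp_support(void)
{
}

/* Generic caller, loosely modelled on a "nosmp"-style option handler. */
static void handle_nosmp(void)
{
	smp_support_disabled = 1;
	arch_disable_smp_support();
}
```

The caller needs no #ifdef: builds without an override simply link against the empty default.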

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:54 +01:00
Mark Langsdorf
732553e567 [CPUFREQ] powernow-k8: Get transition latency from ACPI _PSS table
At this time, the PowerNow! driver for K8 uses an experimentally
derived formula to calculate transition latency.  The value it
provides is orders of magnitude too large on modern systems.
This patch replaces the formula with ACPI _PSS latency values
for more accuracy and better performance.

I've tested it on two 2nd generation Opteron systems, a 3rd
generation Opteron system, and a Turion X2 without seeing any
stability problems.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-02-05 12:25:26 -05:00
H. Peter Anvin
327641da8e Merge branch 'core/percpu' into x86/paravirt 2009-02-04 16:58:26 -08:00
Alex Chiang
4560839939 x86: fix grammar in user-visible BIOS warning
Fix user-visible grammo.

Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 01:14:38 +01:00
Pavel Emelyanov
a6a95406c6 x86: fix hpet timer reinit for x86_64
There's a small problem with hpet_rtc_reinit function - it checks
for the:

	hpet_readl(HPET_COUNTER) - hpet_t1_cmp > 0

to continue increasing both the HPET_T1_CMP (register) and the
hpet_t1_cmp (variable).

But since the HPET_COUNTER is always 32-bit, if the hpet_t1_cmp
is 64-bit this condition will always be FALSE once the latter hits
the 32-bit boundary, and we can have a situation, when we don't
increase the HPET_T1_CMP register high enough.

The result - timer stops ticking, since HPET_T1_CMP becomes less,
than the COUNTER and never increased again.

The solution is (based on Linus's suggestion) to not compare 64-bits
(on 64-bit x86), but to do the comparison on 32-bit signed
integers.
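
A minimal sketch of such a wrap-safe comparison is shown below. The helper name is hypothetical, and the cast of an out-of-range unsigned value to int32_t is implementation-defined in ISO C, although it behaves as two's-complement wraparound on the usual compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns nonzero if counter value c1 is ahead of c2, treating both as
 * 32-bit counters that wrap. The unsigned subtraction wraps modulo 2^32,
 * and the signed cast turns "c2 is behind c1" into a negative number.
 */
static inline int cnt_ahead(uint32_t c1, uint32_t c2)
{
	return (int32_t)(c2 - c1) < 0;
}
```

For example, a counter that has just wrapped to 3 is still treated as ahead of a comparator value of 0xfffffff0, which a plain 64-bit subtraction would get wrong.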

Reported-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 01:04:16 +01:00
Kyle McMartin
48ec4d9537 x86, 64-bit: print DMI info in the oops trace
This patch echoes what we already do on 32-bit since
90f7d25c6b, and prints the DMI
product name in show_regs, so that system specific problems can be
easily identified.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-04 22:10:12 +01:00
Ingo Molnar
bb960a1e42 Merge branch 'core/xen' into x86/urgent 2009-02-04 14:54:56 +01:00
Thomas Renninger
62663ea822 ACPI: cpufreq: Remove deprecated /proc/acpi/processor/../performance proc entries
They were long enough set deprecated...

Update Documentation/cpu-freq/users-guide.txt:
The deprecated files listed there seem not to have existed for some time
already.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-02-04 00:12:24 -05:00
Huang Ying
f5deb79679 x86: kexec: Use one page table in x86_64 machine_kexec
Impact: reduce kernel BSS size by 7 pages, improve code readability

Two page tables are used in current x86_64 kexec implementation. One
is used to jump from kernel virtual address to identity map address,
the other is used to map all physical memory. In fact, on x86_64,
there is no conflict between kernel virtual address space and physical
memory space, so just one page table is sufficient. The page table
pages used to map control page are dynamically allocated to save
memory if kexec image is not loaded. ASM code used to map control page
is replaced by C code too.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-03 18:29:18 -08:00
Borislav Petkov
858770619d x86: APIC: enable workaround on AMD Fam10h CPUs
Impact: fix to enable APIC for AMD Fam10h on chipsets with a missing/b0rked
	ACPI MP table (MADT)

Booting a 32bit kernel on an AMD Fam10h CPU running on chipsets with
missing/b0rked MP table leads to a hang pretty early in the boot process
due to the APIC not being initialized. Fix that by falling back to the
default APIC base address in 32bit code, as it is done in the 64bit
codepath.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-03 18:09:33 -08:00
Ingo Molnar
dc573f9b20 Merge branches 'tracing/ftrace', 'tracing/kmemtrace' and 'linus' into tracing/core 2009-02-03 06:25:38 +01:00
Martin Hicks
a67798cd7b x86: push old stack address on irqstack for unwinder
Impact: Fixes dumpstack and KDB on 64 bits

This re-adds the old stack pointer to the top of the irqstack to help
with unwinding.  It was removed in commit d99015b1ab
as part of the save_args out-of-line work.

Both dumpstack and KDB require this information.

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-02 21:18:03 -08:00
Yinghai Lu
ef3892bd63 x86, percpu: fix kexec with vmlinux
Impact: fix regression with kexec with vmlinux

Split data.init into data.init, percpu, data.init2 sections
instead of letting data.init wrap the percpu section.

Thus kexec loading will be happy, because sections will not
overlap.

Before the patch we have:

Elf file type is EXEC (Executable file)
Entry point 0x200000
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                 0x0000000000ca6000 0x0000000000ca6000  R E    200000
  LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                 0x000000000014dfe0 0x000000000014dfe0  RWE    200000
  LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                 0x0000000000000888 0x0000000000000888  RWE    200000
  LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                 0x0000000000073086 0x0000000000a2d938  RWE    200000
  LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                 0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
  NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
   01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
   02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
   03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs .bss
   04     .data.percpu
   05     .notes

After patch we've got:

Elf file type is EXEC (Executable file)
Entry point 0x200000
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                 0x0000000000ca6000 0x0000000000ca6000  R E    200000
  LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                 0x000000000014dfe0 0x000000000014dfe0  RWE    200000
  LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                 0x0000000000000888 0x0000000000000888  RWE    200000
  LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                 0x0000000000073086 0x0000000000073086  RWE    200000
  LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                 0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
  LOAD           0x000000000163d000 0xffffffff8123d000 0x000000000123d000
                 0x0000000000000000 0x00000000007e6938  RWE    200000
  NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                 0x0000000000000024 0x0000000000000024         4

 Section to Segment mapping:
  Segment Sections...
   00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
   01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
   02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
   03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs
   04     .data.percpu
   05     .bss
   06     .notes
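
The overlap can be checked from the PhysAddr and MemSiz columns of the LOAD entries above. The sketch below is an illustration only: the struct and function are hypothetical, and it assumes the loader rejects segments whose physical ranges intersect, which the commit text implies but does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Physical placement of one LOAD segment: PhysAddr and MemSiz. */
struct load_seg {
	uint64_t paddr;
	uint64_t memsz;
};

/* LOAD entries copied from the "before" listing above. */
static const struct load_seg before[] = {
	{ 0x0000000000200000, 0x0000000000ca6000 },
	{ 0x0000000000ea6000, 0x000000000014dfe0 },
	{ 0x0000000000ff4000, 0x0000000000000888 },
	{ 0x0000000000ff6000, 0x0000000000a2d938 }, /* includes .bss */
	{ 0x000000000106a000, 0x00000000001d2ce0 }, /* .data.percpu */
};

/* LOAD entries copied from the "after" listing above. */
static const struct load_seg after[] = {
	{ 0x0000000000200000, 0x0000000000ca6000 },
	{ 0x0000000000ea6000, 0x000000000014dfe0 },
	{ 0x0000000000ff4000, 0x0000000000000888 },
	{ 0x0000000000ff6000, 0x0000000000073086 },
	{ 0x000000000106a000, 0x00000000001d2ce0 }, /* .data.percpu */
	{ 0x000000000123d000, 0x00000000007e6938 }, /* .bss */
};

/* True if any two segments occupy overlapping physical ranges. */
static bool any_overlap(const struct load_seg *s, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (s[i].paddr < s[j].paddr + s[j].memsz &&
			    s[j].paddr < s[i].paddr + s[i].memsz)
				return true;
	return false;
}
```

Before the patch, the segment at 0xff6000 extends to 0x1a23938 in memory and so covers the percpu segment at 0x106a000; after the patch it ends at 0x1069086 and .bss starts at 0x123d000, past the end of percpu.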

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-03 06:11:18 +01:00
Jeremy Fitzhardinge
664c795472 x86/vmi: fix interrupt enable/disable/save/restore calling convention.
Zach says:
> Enable/Disable have no clobbers at all.
> Save clobbers only return value, %eax
> Restore also clobbers nothing.

This is precisely compatible with the calling convention, so we can
just call them directly without wrapping.

(Compile tested only.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-02-02 08:06:33 -08:00
Yinghai Lu
10b888d6ce irq, x86: fix lock status with numa_migrate_irq_desc
Eric Paris reported:

> I have an hp dl785g5 which is unable to successfully run
> 2.6.29-0.66.rc3.fc11.x86_64 or 2.6.29-rc2-next-20090126.  During bootup
> (early in userspace daemons starting) I get the below BUG, which quickly
> renders the machine dead.  I assume it is because sparse_irq_lock never
> gets released when the BUG kills that task.

Adjust lock sequence when migrating a descriptor with
CONFIG_NUMA_MIGRATE_IRQ_DESC enabled.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-01 11:36:31 +01:00
Dave Jones
9a8ecae87a x86: add cache descriptors for Intel Core i7
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-01 11:06:50 +01:00
Linus Torvalds
f6490438fc Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, ds, bts: cleanup/fix DS configuration
  ring-buffer: reset timestamps when ring buffer is reset
  trace: set max latency variable to zero on default
  trace: stop all recording to ring buffer on ftrace_dump
  trace: print ftrace_dump at KERN_EMERG log level
  ring_buffer: reset write when reserve buffer fail
  tracing/function-graph-tracer: fix a regression while suspend to disk
  ring-buffer: fix alignment problem
2009-01-31 15:53:30 -08:00
Linus Torvalds
e81cfd214f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 setup: fix asm constraints in vesa_store_edid
  xen: make sysfs files behave as their names suggest
  x86: tone down mtrr_trim_uncached_memory() warning
  x86: correct the CPUID pattern for MSR_IA32_MISC_ENABLE availability
2009-01-31 15:52:46 -08:00
James Bottomley
92ab78315c x86/Voyager: make it build and boot
[
  mingo@elte.hu: these fixes are a subset of changes cherry-picked from:

     git://git.kernel.org:/pub/scm/linux/kernel/git/jejb/voyager-2.6.git

  They fix various problems that recent x86 changes caused in the Voyager
  subarchitecture: both APIC changes and cpumask changes and certain
  cleanups caused subarch assumptions to break.

  Most of these changes are obsolete as the subarch code has been removed
  from the x86 development tree - but we merge them upstream to make Voyager
  build and boot.
]

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-31 18:26:07 +01:00
Jeremy Fitzhardinge
11e3a840cd x86: split loading percpu segments from loading gdt
Impact: split out a function, no functional change

Xen needs to be able to access percpu data from very early on.  For
various reasons, it cannot also load the gdt at that time.   It does,
however, have a perfectly functional gdt at that point, so there's no
pressing need to reload the gdt.

Split the function to load the segment registers off, so Xen can call
it directly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-31 14:28:54 +09:00
Brian Gerst
552be871e6 x86: pass in cpu number to switch_to_new_gdt()
Impact: cleanup, prepare for xen boot fix.

Xen needs to call this function very early to setup the GDT and
per-cpu segments.  Remove the call to smp_processor_id() and just
pass in the cpu number.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-31 14:28:50 +09:00
Cliff Wickman
2749ebe320 x86: UV fix uv_flush_send_and_wait()
Impact: fix possible tlb mis-flushing on UV

uv_flush_send_and_wait() should return a pointer if the broadcast
remote tlb shootdown requests fail. That causes the conventional IPI
method of shootdown to be used.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-31 14:23:37 +09:00
Ingo Molnar
647ad94fc0 x86, apic: clean up spurious vector sanity check
Move the spurious vector sanity check to the place where it's
defined - out of a .c file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-31 04:21:20 +01:00
Ingo Molnar
8f47e16348 x86: update copyrights
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-31 04:21:18 +01:00
Ingo Molnar
d1de36f5b5 x86, apic: clean up header section
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-31 04:21:17 +01:00
Jeremy Fitzhardinge
da5de7c22e x86/paravirt: use callee-saved convention for pte_val/make_pte/etc
Impact: Optimization

In the native case, pte_val, make_pte, etc are all just identity
functions, so there's no need to clobber a lot of registers over them.

(This changes the 32-bit callee-save calling convention to return both
EAX and EDX so functions can return 64-bit values.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-30 14:51:45 -08:00
Jeremy Fitzhardinge
ecb93d1ccd x86/paravirt: add register-saving thunks to reduce caller register pressure
Impact: Optimization

One of the problems with inserting a pile of C calls where previously
there were none is that the register pressure is greatly increased.
The C calling convention says that the caller must expect a certain
set of registers may be trashed by the callee, and that the callee can
use those registers without restriction.  This includes the function
argument registers, and several others.

This patch seeks to alleviate this pressure by introducing wrapper
thunks that will do the register saving/restoring, so that the
callsite doesn't need to worry about it, but the callee function can
be conventional compiler-generated code.  In many cases (particularly
performance-sensitive cases) the callee will be in assembler anyway,
and need not use the compiler's calling convention.

Standard calling convention is:
	 arguments	    return	scratch
x86-32	 eax edx ecx	    eax		?
x86-64	 rdi rsi rdx rcx    rax		r8 r9 r10 r11

The thunk preserves all argument and scratch registers.  The return
register is not preserved, and is available as a scratch register for
unwrapped callee code (and of course the return value).

Wrapped function pointers are themselves wrapped in a struct
paravirt_callee_save structure, in order to get some warning from the
compiler when functions with mismatched calling conventions are used.

The most common paravirt ops, both statically and dynamically, are
interrupt enable/disable/save/restore, so handle them first.  This is
particularly easy since their calls are handled specially anyway.

XXX Deal with VMI.  What's their calling convention?

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-30 14:51:45 -08:00
Jeremy Fitzhardinge
b8aa287f77 x86: fix paravirt clobber in entry_64.S
Impact: Fix latent bug

The clobber is trying to say that anything except RDI is available for
clobbering, but actually clobbers everything.  This hasn't mattered
because the clobbers were basically ignored, but subsequent patches
will rely on them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-30 14:51:44 -08:00
Jeremy Fitzhardinge
41edafdb78 x86/pvops: add a paravirt_ident functions to allow special patching
Impact: Optimization

Several paravirt ops implementations simply return their arguments,
the most obvious being the make_pte/pte_val class of operations on
native.

On 32-bit, the identity function is literally a no-op, as the calling
convention uses the same registers for the first argument and return.
On 64-bit, it can be implemented with a single "mov".

This patch adds special identity functions for 32 and 64 bit argument,
and machinery to recognize them and replace them with either nops or a
mov as appropriate.
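
The recognition step can be sketched as a pointer comparison. Everything below is hypothetical (function names, the op typedef, the non-identity example) and only models the idea; the real patching machinery rewrites call sites and is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identity helpers; the kernel's actual names may differ. */
static uint32_t ident_32(uint32_t x) { return x; }
static uint64_t ident_64(uint64_t x) { return x; }

/* Hypothetical non-identity op, e.g. one a hypervisor backend might install. */
static uint64_t hv_pte_val(uint64_t x) { return x ^ 1; }

/* An op table slot, as a plain function pointer for illustration. */
typedef uint64_t (*pte_val_fn)(uint64_t);

/*
 * A patcher can compare the installed pointer against the known identity
 * function and, on a match, replace the indirect call with a nop or a mov.
 */
static int is_identity_op(pte_val_fn fn)
{
	return fn == ident_64;
}
```

Only ops that point at the shared identity function qualify, which is why dedicated identity functions are needed rather than per-op trivial implementations.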

At the moment, the only users for the identity functions are the
pagetable entry conversion functions.

The result is a measurable improvement on pagetable-heavy benchmarks
(2-3%, reducing the pvops overhead from 5 to 2%).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-30 14:51:44 -08:00
H. Peter Anvin
9b7ed8faa0 Merge branch 'core/percpu' into x86/paravirt 2009-01-30 14:50:57 -08:00
Ingo Molnar
6b64ee02da x86, apic, 32-bit: add self-IPI methods
Impact: fix rare crash on 32-bit

The 32-bit APIC drivers had their send_IPI_self vectors set to NULL,
but ioapic_retrigger_irq() depends on it being always set. Fix it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-30 23:42:18 +01:00
Ingo Molnar
c43e0e46ad Merge branch 'linus' into core/percpu
Conflicts:
	kernel/irq/handle.c
2009-01-30 18:23:30 +01:00
Yinghai Lu
26f7ef14a7 x86: don't treat bigsmp as non-standard
just like 64 bit switch from flat logical APIC messages to
flat physical mode automatically.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-30 15:24:37 +01:00
Yinghai Lu
43f39890db x86: seperate default_send_IPI_mask_sequence/allbutself from logical
Impact: 32-bit should use logical version

There are two versions of default_send_IPI_mask_sequence/allbutself:
one in ipi.h and one in ipi.c for 32-bit.

it seems .h version overwrote ipi.c for a while.

restore it so 32 bit could use its old logical version.
also remove duplicated functions in .c

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-30 15:21:24 +01:00
Yinghai Lu
1ff2f20de3 x86: fix compiling with 64bit with def_to_bigsmp
only need to do cut off with 32bit

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-30 15:21:23 +01:00
Randy Dunlap
5872fb94f8 Documentation: move DMA-mapping.txt to Doc/PCI/
Move DMA-mapping.txt to Documentation/PCI/.

DMA-mapping.txt was supposed to be moved from Documentation/ to
Documentation/PCI/.  The 00-INDEX files in those two directories
were updated, along with a few other text files, but the file
itself somehow escaped being moved, so move it and update more
text files and source files with its new location.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
cc:	Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-29 18:19:29 -08:00
Yasuaki Ishimatsu
39ba5d43fc x86: unify PM-Timer messages
Impact: Cleans up printk formatting

When LOCAL APIC was calibrated, the debug message is displayed as follows.

	CPU0: Intel(R) Xeon(R) CPU            5110  @ 1.60GHz stepping 06
	Using local APIC timer interrupts.
	calibrating APIC timer ...
	... lapic delta = 3773131
	... PM timer delta = 812434
	APIC calibration not consistent with PM Timer: 226ms instead of 100ms
	APIC delta adjusted to PM-Timer: 1662420 (3773131)
	TSC delta adjusted to PM-Timer: 159592409 (362220564)
	..... delta 1662420
	..... mult: 71411249
	..... calibration result: 265987
	..... CPU clock speed is 1595.0924 MHz.
	..... host bus clock speed is 265.0987 MHz.

There are three spellings of PM-Timer (PM-Timer, PM Timer, and PM timer)
in this message. This patch unifies them as PM-Timer.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-29 17:07:09 -08:00
Yasuaki Ishimatsu
754ef0cd65 x86: fix debug message of CPU clock speed
Impact: Fixes incorrect printk

The LOCAL APIC calibration is corrected by the PM-Timer when an SMI occurs while the LOCAL APIC is being calibrated.
In this case, the LOCAL APIC debug message (boot with apic=debug) is displayed correctly;
however, the CPU clock speed debug message is displayed incorrectly.

When an SMI occurred on my machine, which has a 1.6GHz CPU, the CPU clock speed was displayed
as 3622.0205 MHz, as follows.

	CPU0: Intel(R) Xeon(R) CPU            5110  @ 1.60GHz stepping 06
	Using local APIC timer interrupts.
	calibrating APIC timer ...
	... lapic delta = 3773130
	... PM timer delta = 812434
	APIC calibration not consistent with PM Timer: 226ms instead of 100ms
	APIC delta adjusted to PM-Timer: 1662420 (3773130)
	..... delta 1662420
	..... mult: 71411249
	..... calibration result: 265987
	..... CPU clock speed is 3622.0205 MHz.  =====>  here
	..... host bus clock speed is 265.0987 MHz.

This patch fixes the CPU clock speed so that it is displayed correctly, as follows.

	CPU0: Intel(R) Xeon(R) CPU            5110  @ 1.60GHz stepping 06
	Using local APIC timer interrupts.
	calibrating APIC timer ...
	... lapic delta = 3773131
	... PM timer delta = 812434
	APIC calibration not consistent with PM Timer: 226ms instead of 100ms
	APIC delta adjusted to PM-Timer: 1662420 (3773131)
	TSC delta adjusted to PM-Timer: 159592409 (362220564)
	..... delta 1662420
	..... mult: 71411249
	..... calibration result: 265987
	..... CPU clock speed is 1595.0924 MHz.
	..... host bus clock speed is 265.0987 MHz.
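
The two "CPU clock speed" values in the logs above are consistent with an integer split of the TSC delta into whole MHz and a four-digit remainder. The following is a minimal sketch of that arithmetic, not the kernel's code: HZ=1000 and a calibration window of HZ/10 ticks are assumptions inferred from the log, and `split_mhz`, `struct mhz` and the constant names are hypothetical.

```c
#include <assert.h>

/* Assumed values: HZ is inferred from the "host bus clock speed" line,
 * CAL_LOOPS is the number of timer ticks in the 100ms window (HZ / 10). */
#define SKETCH_HZ        1000L
#define SKETCH_CAL_LOOPS (SKETCH_HZ / 10)

/* Hypothetical result type: printed as "%ld.%04ld MHz". */
struct mhz {
	long whole;
	long frac;
};

/* Split a delta measured over the calibration window into the two
 * numbers that would be printed in the debug message. */
static struct mhz split_mhz(long delta)
{
	long per_tick = delta / SKETCH_CAL_LOOPS; /* counts per timer tick */
	long unit = 1000000L / SKETCH_HZ;         /* counts per tick at 1 MHz */
	struct mhz r;

	assert(delta >= 0);
	r.whole = per_tick / unit;
	r.frac = per_tick % unit;
	return r;
}
```

Under these assumptions the unadjusted TSC delta 362220564 yields 3622 and 205 (printed as 3622.0205), while the PM-Timer-adjusted delta 159592409 yields 1595 and 924 (printed as 1595.0924), which matches the before and after logs.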

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-29 17:06:03 -08:00
Yinghai Lu
4272ebfbef x86: allow more than 8 cpus to be used on 32-bit
X86_PC is the only remaining 'sub' architecture, so we don't need
it anymore.

This also cleans up a few spurious references to X86_PC in the
driver space - those certainly should be X86.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-30 00:20:22 +01:00
Suresh Siddha
fbeb2ca022 x86: unify genapic code, unify subarchitectures, remove old subarchitecture code, xapic fix
xapic fix for 32-bit platforms with fewer than 8 CPUs.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 21:25:28 +01:00
Cyrill Gorcunov
0a1e8869f4 x86: trampoline_64.S - use predefined constants with simplification
Impact: cleanup

We may use macros from processor-flags.h instead
of hardcoding bits. Actually it's not direct mapping
of old instructions with new ones -- BTS does change
CF flag while MOV does not. But I didn't find any
dependency on CF in this code.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:37:01 +01:00
Ingo Molnar
e0c7ae376a x86: rename X86_GENERICARCH to X86_32_NON_STANDARD
X86_GENERICARCH is a misnomer - it contains non-PC 32-bit architectures
that are not included in the default build.

Rename it to X86_32_NON_STANDARD.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:20 +01:00
Ingo Molnar
550fe4f198 x86/Voyager: remove X86_FIND_SMP_CONFIG Kconfig quirk
x86/Voyager had this Kconfig quirk:

 config X86_FIND_SMP_CONFIG
	def_bool y
	depends on X86_MPPARSE || X86_VOYAGER

Which splits off the find_smp_config() callback into a build-time quirk.

Voyager should use the existing x86_quirks.mach_find_smp_config() callback
to introduce SMP-config quirks. NUMAQ-32 and VISWS already use this.
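
The difference between a build-time quirk and a runtime callback can be sketched as follows. This is an illustration only: the struct layout, the function names and the "nonzero means handled" return convention are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime quirk table. */
struct quirks {
	/* Returns nonzero if the platform handled the scan itself. */
	int (*find_smp_config)(unsigned int reserve);
};

static int default_scans; /* counts how often the default path ran */

static void default_find_smp_config(unsigned int reserve)
{
	(void)reserve;
	default_scans++;
}

/* Generic entry point: the platform hook gets the first chance,
 * so no Kconfig symbol is needed to select an implementation. */
static void find_smp_config(const struct quirks *q, unsigned int reserve)
{
	assert(q != NULL);
	if (q->find_smp_config && q->find_smp_config(reserve))
		return;
	default_find_smp_config(reserve);
}

/* Hypothetical platform hook that takes over the scan entirely. */
static int platform_find_smp_config(unsigned int reserve)
{
	(void)reserve;
	return 1;
}
```

A platform without special needs leaves the pointer NULL and gets the default scan; a platform with a quirk registers its hook at runtime, and the same kernel image serves both.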

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:04 +01:00
Ingo Molnar
f095df0a0c x86/Voyager: remove X86_BIOS_REBOOT Kconfig quirk
Voyager has this Kconfig quirk:

config X86_BIOS_REBOOT
	bool
	depends on !X86_VOYAGER
	default y

Voyager should use the existing machine_ops.emergency_restart reboot
quirk mechanism instead of a build-time quirk.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:03 +01:00
Ingo Molnar
c0b5842a45 x86: generalize boot_cpu_id
x86/Voyager can boot on non-zero processors. While that can probably
be fixed by properly remapping the physical CPU IDs, keep boot_cpu_id
for now for easier transition - and expand it to all of x86.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:01 +01:00
Ingo Molnar
3e5095d152 x86: replace CONFIG_X86_SMP with CONFIG_SMP
The x86/Voyager subarch used to have this distinction between
 'x86 SMP support' and 'Voyager SMP support':

 config X86_SMP
	bool
	depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)

This is a pointless distinction - Voyager can (and already does) use
smp_ops to implement various SMP quirks it has - and it can be extended
more to cover all the specialities of Voyager.

So remove this complication in the Kconfig space.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:17:00 +01:00
Ingo Molnar
6bda2c8b32 x86: remove subarchitecture support
Remove the 32-bit subarchitecture support code.

All subarchitectures but Voyager have been converted. Voyager will be
done later or will be removed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:52 +01:00
Ingo Molnar
1164dd0099 x86: move mach-default/*.h files to asm/
We are getting rid of subarchitecture support - move the hook files
to asm/. (These are now stale and should be replaced with more explicit
runtime mechanisms - but the transition is simpler this way.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:51 +01:00
Ingo Molnar
7b38725318 x86: remove subarchitecture support code
Remove remaining bits of the subarchitecture code. Now that all the
special platforms are runtime probed and runtime handled, we can remove
these facilities.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:50 +01:00
Ingo Molnar
d53e2f2855 x86, smp: remove mach_ipi.h
Move mach_ipi.h definitions into genapic.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:49 +01:00
Ingo Molnar
9f4187f0a3 x86, bigsmp: consolidate header code
Move all the asm/bigsmp/*.h definitions into bigsmp_32.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:48 +01:00
Ingo Molnar
b3daa3a1a5 x86, bigsmp: consolidate code
Move all code to arch/x86/kernel/bigsmp_32.c.

With this it ceases to rely on any build-time subarch features.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:47 +01:00
Ingo Molnar
61b90b7ca1 x86, NUMAQ: Consolidate code
Move all NUMAQ code into arch/x86/kernel/numaq.c.

With this it ceases to rely on any build-time subarch features.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:46 +01:00
Ingo Molnar
2e096df8ed x86, ES7000: Consolidate code
Move all ES7000 code into arch/x86/kernel/es7000_32.c.

With this it ceases to rely on any build-time subarch features.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:45 +01:00
Ingo Molnar
1dcdd3d15e x86: remove mach_apic.h
Spread mach_apic.h definitions into genapic.h. (with some knock-on effects
on smp.h and apic.h.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:42 +01:00
Ingo Molnar
7c20dcc545 x86, summit: consolidate code, fix
Build fix for !NUMA Summit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 14:16:41 +01:00
Ingo Molnar
bf3647c44b x86: tone down mtrr_trim_uncached_memory() warning
kerneloops.org is reporting a lot of these warnings that come due to
vmware not setting up any MTRRs for emulated CPUs:

| Reported 709 times (14696 total reports)
| BIOS bug (often in VMWare) where the MTRR's are set up incorrectly
| or not at all
|
| This warning was last seen in version 2.6.29-rc2-git1, and first
| seen in 2.6.24.
|
| More info:
|   http://www.kerneloops.org/searchweek.php?search=mtrr_trim_uncached_memory

Keep a one-liner KERN_INFO about it - so that we have some notice if empty
MTRRs are caused by native hardware/BIOS weirdness.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-29 11:45:35 +01:00
Ingo Molnar
b11b867f78 x86, summit: consolidate code
Consolidate all the Summit code into a single file:
arch/x86/kernel/summit_32.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:38 +01:00
Ingo Molnar
328386d7ab x86, smp: refactor ->wake_cpu
- remove macro wrappers

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:37 +01:00
Ingo Molnar
1f75ed0c13 x86: remove mach_apicdef.h
Move its definitions into apic.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:36 +01:00
Ingo Molnar
fb5b33c9f6 x86: eliminate asm/mach-*/mach_mpparse.h
Move the definition to mpparse.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:35 +01:00
Ingo Molnar
0939e4fd35 x86, smp: eliminate asm/mach-default/mach_wakecpu.h
Spread mach_wakecpu.h's definitions into apic.h and genapic.h
and remove mach_wakecpu.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:35 +01:00
Ingo Molnar
25dc004903 x86, smp: refactor ->inquire_remote_apic() methods
Nothing exciting - a few subarches don't want APIC remote reads to
be performed - the others are content with the default method.

 - extend the generic code to handle NULL methods

 - clear out dummy methods and replace them with NULL

 - clean up: remove wrapper macros, etc.
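
The "NULL method" pattern named in the list above can be sketched like this. It is a reduced, hypothetical driver template (the struct, the driver instances and the counter are invented for illustration), not the actual genapic definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced driver template with one optional method. */
struct apic_driver {
	const char *name;
	void (*inquire_remote_apic)(int apicid); /* NULL: not wanted */
};

static int remote_reads; /* counts remote reads actually performed */

static void default_inquire_remote_apic(int apicid)
{
	(void)apicid;
	remote_reads++;
}

static const struct apic_driver driver_default = {
	.name = "default",
	.inquire_remote_apic = default_inquire_remote_apic,
};

/* A driver that opts out sets NULL instead of an empty dummy function. */
static const struct apic_driver driver_quiet = {
	.name = "quiet",
	.inquire_remote_apic = NULL,
};

/* Generic caller: the NULL check replaces both the dummy methods
 * and the wrapper macros that used to hide the indirection. */
static void inquire_remote_apic(const struct apic_driver *apic, int apicid)
{
	assert(apic != NULL);
	if (apic->inquire_remote_apic)
		apic->inquire_remote_apic(apicid);
}
```

The design choice is that "no behaviour" is visible in the template itself as a NULL entry, rather than being buried in a per-subarch stub.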

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:34 +01:00
Ingo Molnar
3d5f597e93 x86, smp: remove ->restore_NMI_vector()
Nothing actually restores the NMI vector - so remove this
logic altogether.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:34 +01:00
Ingo Molnar
7bd06ec63a x86, smp: refactor ->store/restore_NMI_vector() methods
Only NUMAQ does something substantial here, because it initializes
via NMIs (not via INIT as standard SMP startup) - so it needs to
store and restore the NMI vector.

 - extend the generic code to handle NULL methods

 - clear out dummy methods and replace them with NULL

 - clean up: remove wrapper macros, etc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:33 +01:00
Ingo Molnar
333344d943 x86, smp: refactor ->smp_callin_clear_local_apic() methods
Only NUMAQ does something substantial here, because it initializes
via NMIs (not via INIT as standard SMP startup) - so it needs to
reset the APIC.

 - extend the generic code to handle NULL methods

 - clear out dummy methods and replace them with NULL

 - clean up: remove wrapper macros, etc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:33 +01:00
Ingo Molnar
a965936643 x86, smp: refactor ->wait_for_init_deassert()
- spread out the namespace on a per APIC driver basis

 - handle a NULL ->wait_for_init_deassert() as a 'dont wait' default method

 - remove NUMAQ and Summit handlers

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:32 +01:00
Ingo Molnar
abfa584c8d x86: set ->trampoline_phys_low/high on 64-bit too
64-bit x86 has zero for ->trampoline_phys_low/high, but the smpboot
code can use these values - so it's better to set them up to their
correct values.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:32 +01:00
Ingo Molnar
dac5f4121d x86, apic: untangle the send_IPI_*() jungle
Our send_IPI_*() methods and definitions are a twisted mess: the same
symbol is defined to different things depending on .config details,
in a non-transparent way.

 - spread out the quirks into separately named per apic driver methods

 - prefix the standard PC methods with default_

 - get rid of wrapper macro obfuscation

 - clean up various details

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:31 +01:00
Ingo Molnar
debccb3e77 x86, apic: refactor ->cpu_mask_to_apicid*()
- spread out the namespace on a per driver basis

 - clean up the functions

 - get rid of macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:30 +01:00
Ingo Molnar
5b8127277b x86, apic: refactor ->apic_id_mask & APIC_ID_MASK
- spread out the namespace on a per driver basis

 - get rid of wrapper macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:29 +01:00
Ingo Molnar
ca6c8ed464 x86, apic: refactor ->get_apic_id() & GET_APIC_ID()
- spread out the namespace on a per driver basis

 - get rid of macro wrappers

 - small cleanups

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:29 +01:00
Ingo Molnar
9c7642470e x86: consolidate the ->mps_oem_check() code
- spread out the mps_oem_check() namespace on a per APIC driver basis

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:28 +01:00
Ingo Molnar
1322a2e2db x86, mpparse: call the generic quirk handlers early
Call all the registered MPS quirk handlers early. These methods scan
low RAM typically for specific signatures so are safe to be called
early.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:28 +01:00
Ingo Molnar
cb8cc442dc x86, apic: refactor ->phys_pkg_id()
Refactor the ->phys_pkg_id() methods:

 - namespace separation

 - macro wrapper removal

 - open-coded calls to the methods in the generic code

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:27 +01:00
Ingo Molnar
d4c9a9f3d4 x86, apic: unify phys_pkg_id()
- unify the call signature of 64-bit to that of 32-bit

 - clean up the types all around

 - clean up namespace contamination

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:26 +01:00
Ingo Molnar
b0b20e5a3a x86, es7000: clean up es7000_enable_apic_mode()
- eliminate the needless es7000_enable_apic_mode() complication which
  was not apparent prior to the namespace cleanups

- clean up the control flow in es7000_enable_apic_mode()

- other cleanups

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:26 +01:00
Ingo Molnar
4904033302 x86: refactor ->enable_apic_mode() subarch methods
Only ES7000 has a real ->enable_apic_mode() method, the other
subarchitectures define it but keep it empty.

So mark the vector as NULL, extend the generic code to handle
NULL ->enable_apic_mode() entries and remove all the empty
handlers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:26 +01:00
Ingo Molnar
a27a621001 x86: refactor ->check_phys_apicid_present() subarch methods
- spread out the namespace to per driver methods

 - extend it to 64-bit as well so that we can use
   apic->check_phys_apicid_present() unconditionally

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:25 +01:00
Ingo Molnar
d83093b504 x86: refactor ->setup_portio_remap() subarch methods
Only NUMAQ has a real ->setup_portio_remap() method, the other
subarchitectures define it but keep it empty.

So mark the vector as NULL, extend the generic code to handle
NULL ->setup_portio_remap() entries and remove all the empty
handlers.

Also move the NUMAQ method from the header file into the
 apic driver .c file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:25 +01:00
Ingo Molnar
8058714a41 x86, apic: clean up ->apicid_to_cpu_present()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:24 +01:00
Ingo Molnar
a21769a446 x86, apic: clean up ->cpu_present_to_apicid()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:24 +01:00
Ingo Molnar
5257c5111c x86, apic: clean up ->cpu_to_logical_apicid()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:23 +01:00
Ingo Molnar
3f57a318c3 x86, apic: clean up ->apicid_to_node()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:23 +01:00
Ingo Molnar
33a201fac6 x86, apic: streamline the ->multi_timer_check() quirk
only NUMAQ uses this quirk: to prevent the timer IRQ from being added
on secondary nodes.

All other genapic templates can have a NULL ->multi_timer_check()
callback.

Also, extend the generic code to treat a NULL pointer accordingly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:22 +01:00
Ingo Molnar
72ce016583 x86, apic: clean up ->setup_apic_routing()
- separate the namespace

 - remove macros

 - remove namespace clash on 64-bit

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:22 +01:00
Ingo Molnar
d190cb87c4 x86, apic: clean up ->ioapic_phys_id_map()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:21 +01:00
Ingo Molnar
a5c4329622 x86, apic: clean up ->init_apic_ldr()
- separate the namespace

 - remove macros

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:21 +01:00
Ingo Molnar
e2d40b1878 x86, apic: clean up ->vector_allocation_domain()
- separate the namespace

 - remove macros

 - move the default vector-allocation-domain to mach-generic

 - fix whitespace damage

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:20 +01:00
Ingo Molnar
2e867b17cc x86, apic: remove no_balance_irq and no_ioapic_check flags
These flags are completely unused. (the in-kernel IRQ balancer has
been removed from the upstream kernel.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:20 +01:00
Ingo Molnar
d1d7cae8fd x86, apic: clean up check_apicid*() callbacks
Clean up these methods - to make it clearer which function is
used in which case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:19 +01:00
Ingo Molnar
bdb1a9b62f x86, apic: rename genapic::apic_destination_logical to genapic::dest_logical
This field name was unreasonably long - shorten it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:19 +01:00
Ingo Molnar
0b06e734bf x86: clean up the APIC_DEST_LOGICAL logic
Impact: cleanup

The bigsmp and es7000 subarchitectures un-defined APIC_DEST_LOGICAL in
a rather nasty way by re-defining it to zero. That is infinitely
fragile and makes it very hard to see what the code really does in
a given context. The very same constant has different meanings and
values - depending on which subarch is enabled.

Untangle this mess by never undefining the constant, but instead
propagating the right values into the genapic driver templates.
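
A minimal sketch of the idea: the constant keeps one meaning everywhere, and each driver template carries the value it actually wants. The reduced struct, the driver instances and `prepare_icr` are hypothetical, and the numeric value of the constant is an assumption for illustration.

```c
#include <assert.h>

/* Assumed value; the point is that it is never redefined to zero. */
#define APIC_DEST_LOGICAL 0x00800u

/* Hypothetical, reduced driver template. */
struct apic_driver {
	const char *name;
	unsigned int dest_logical; /* bits ORed into the command value */
};

/* A logical-mode driver propagates the constant ... */
static const struct apic_driver flat_driver = { "flat", APIC_DEST_LOGICAL };
/* ... a physical-mode driver propagates 0 instead of #undef-ing it. */
static const struct apic_driver bigsmp_driver = { "bigsmp", 0 };

/* Generic code reads the per-driver field and never the bare constant. */
static unsigned int prepare_icr(const struct apic_driver *apic,
				unsigned int vector)
{
	assert(vector <= 0xff);
	return apic->dest_logical | vector;
}
```

With this layout, reading a call site no longer requires knowing which subarchitecture header happened to redefine the macro.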

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:18 +01:00
Ingo Molnar
08125d3eda x86: rename ->ESR_DISABLE to ->disable_esr
the ->ESR_DISABLE shouting variant was used to enable the esr_disable
macro wrappers. Those ugly macros are removed now so we can rename
->ESR_DISABLE to ->disable_esr

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:18 +01:00
Ingo Molnar
f6f52baf26 x86: clean up esr_disable() methods
Impact: cleanup

Most subarchitectures want to disable the APIC ESR (Error Status Register),
because they generally have hardware hacks that wrap standard CPUs into
a bigger system and hence the APIC bus is quite non-standard and weirdnesses
(lockups) have been seen with ESR reporting.

Remove the esr_disable macros and put the desired flag into each
subarchitecture's genapic template directly.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:17 +01:00
Ingo Molnar
fe402e1f2b x86, apic: clean up / remove TARGET_CPUS
Impact: cleanup

use apic->target_cpus() directly instead of the TARGET_CPUS wrapper.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:17 +01:00
Ingo Molnar
9b5bc8dc12 x86, apic: remove IRQ_DEST_MODE / IRQ_DELIVERY_MODE
Remove the wrapper macros IRQ_DEST_MODE and IRQ_DELIVERY_MODE.

The typical 32-bit and the 64-bit build all dereference via the genapic,
so it's pointless to hide that indirection via these ugly macros.

Furthermore, it also obscures subarchitecture details.

So replace it with apic->irq_dest_mode / etc. accesses.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:13 +01:00
Ingo Molnar
f8987a1093 x86, genapic: rename int_delivery_mode, et. al.
int_delivery_mode is supposed to mean 'interrupt delivery mode', but
it's quite a misnomer as 'int' we usually think of as an integer type ...

The standard naming for such attributes is 'irq' - so rename the following
fields and macros:

 int_delivery_mode => irq_delivery_mode
 INT_DELIVERY_MODE => IRQ_DELIVERY_MODE
 int_dest_mode     => irq_dest_mode
 INT_DEST_MODE     => IRQ_DEST_MODE

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:13 +01:00
Ingo Molnar
7ed248daa5 x86: clean up apic->apic_id_registered() methods
Impact: cleanup

x86 subarchitectures each defined an "apic_id_registered()" method,
which could be an inline function depending on which subarch we build
for, and which was also the name of a genapic field.

Untangle this namespace spaghetti by giving each of the instances
a separate name.

Also remove wrapper macro obfuscation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:12 +01:00
Ingo Molnar
306db03b0d x86: clean up apic->acpi_madt_oem_check methods
Impact: refactor code

x86 subarchitectures each defined an "acpi_madt_oem_check()" method,
which could be an inline function, or an extern, or a static function,
and which was also the name of a genapic field.

Untangle this namespace spaghetti by setting ->acpi_madt_oem_check()
to NULL on those subarchitectures that have no detection quirks,
and rename the other ones (summit, es7000) that do.

Also change default_acpi_madt_oem_check() to handle NULL entries,
and clean its control flow up as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:12 +01:00
Ingo Molnar
504a3c3ad4 x86: clean up apic_x2apic_cluster
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:09 +01:00
Ingo Molnar
05c155c235 x86: clean up apic_x2apic_phys
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:08 +01:00
Ingo Molnar
c796732991 x86: clean up apic_x2apic_uv_x
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:08 +01:00
Ingo Molnar
4c3e51e05a x86: clean up genapic_phys_flat
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:07 +01:00
Ingo Molnar
f2f05ee8b8 x86: clean up genapic_flat
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:07 +01:00
Ingo Molnar
c8d46cf06d x86: rename 'genapic' to 'apic'
Rename genapic-> to apic-> references because in a future change we'll
open-code all the indirect calls (instead of obscuring them via macros),
so we want this reference to be as short as possible.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:06 +01:00
Ingo Molnar
74b6eb6b93 Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-01-28 23:13:53 +01:00
Peter Zijlstra
8f6d86dc41 x86: cpu_init(): remove ugly #ifdef construct around debug register clear
Impact: Cleanup

While I was looking through the new and improved bootstrap code - great
work that, thanks! I found the below a slight improvement.

Remove unnecessary ugly #ifdef construct around debug register clear.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-27 14:54:44 -08:00
Cyrill Gorcunov
8902528237 x86: ftrace - simplify wait_for_nmi
Get rid of 'waited' stack variable.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 14:31:17 +01:00
Hiroshi Shimamoto
bd0838fc48 x86: intel_cacheinfo: fix compiler warning
fix the following warning:

  CC      arch/x86/kernel/cpu/intel_cacheinfo.o
  arch/x86/kernel/cpu/intel_cacheinfo.c:314: warning: 'cpuid4_cache_lookup' defined but not used

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 13:57:41 +01:00
Ingo Molnar
4369f1fb7c Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c

Semantic conflict:

	arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 12:03:24 +01:00
Ingo Molnar
3ddeb51d9c Merge branch 'linus' into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c
2009-01-27 12:01:51 +01:00
Tejun Heo
cf3997f507 x86: clean up indentation in setup_per_cpu_areas()
Impact: cosmetic cleanup

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 14:25:05 +09:00
James Bottomley
22f25138c3 x86: fix build breakage on Voyager
Impact: build fix

x86_cpu_to_apicid and x86_bios_cpu_apicid aren't defined for Voyager.
An earlier patch forgot to conditionalize early percpu clearing.  Fix it.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 14:21:37 +09:00
Brian Gerst
2697fbd5fa x86: load new GDT after setting up boot cpu per-cpu area
Impact: sync 32 and 64-bit code

Merge load_gs_base() into switch_to_new_gdt().  Load the GDT and
per-cpu state for the boot cpu when its new area is set up.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
b2d2f4312b x86: initialize per-cpu GDT segment in per-cpu setup
Impact: cleanup

Rename init_gdt() to setup_percpu_segment(), and move it to
setup_percpu.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
89c9c4c58e x86: make Voyager use x86 per-cpu setup.
Impact: standardize all x86 platforms on same setup code

With the preceding changes, Voyager can use the same per-cpu setup
code as all the other x86 platforms.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
34019be1cd x86: don't assume boot cpu is #0
Impact: minor cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
1688401a0f x86: move this_cpu_offset
Impact: Small cleanup

Define BOOT_PERCPU_OFFSET and use it for this_cpu_offset and
__per_cpu_offset initializers.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
996db817e3 x86: only compile setup_percpu.o on SMP
Impact: Minor build optimization

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
ec70de8b04 x86: move apic variables to apic.c
Impact: Code movement

Move the variable definitions to apic.c.  Ifdef the copying of
the two early per-cpu variables, since Voyager doesn't use them.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
74631a248d x86: always page-align per-cpu area start and size
Impact: cleanup

The way the code is written, align is always PAGE_SIZE.  Simplify
the code by removing the align variable.
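
For reference, rounding a size up to a page boundary can be sketched as below. The 4096-byte page size and the names `SKETCH_PAGE_SIZE` and `page_align` are assumptions for illustration; the kernel has its own macro for this.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, must be a power of two */

/* Round size up to the next multiple of the page size. */
static unsigned long page_align(unsigned long size)
{
	/* Guard against wrap-around in the addition below. */
	assert(size <= ~0UL - (SKETCH_PAGE_SIZE - 1));
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

Because the alignment is a fixed constant, a separate align variable carries no information, which is the simplification the commit describes.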

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
2f2f52bad7 x86: move setup_cpu_local_masks()
Impact: Code movement, no functional change.

Move setup_cpu_local_masks() to kernel/cpu/common.c, where the
masks are defined.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
6470aff619 x86: move 64-bit NUMA code
Impact: Code movement, no functional change.

Move the 64-bit NUMA code from setup_percpu.c to numa_64.c

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
0d77e7f04d x86: merge setup_per_cpu_maps() into setup_per_cpu_areas()
Impact: minor optimization

Eliminates the need for two loops over possible cpus.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Linus Torvalds
3386c05bdb Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: add and use INIT_WORK_ON_STACK
  rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR
  relay: fix lock imbalance in relay_late_setup_files
  oprofile: fix uninitialized use of struct op_entry
  rcu: move Kconfig menu
  softlock: fix false panic which can occur if softlockup_thresh is reduced
  rcu: add __cpuinit to rcu_init_percpu_data()
2009-01-26 09:47:56 -08:00
Linus Torvalds
1e70c7f7a9 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimers: fix inconsistent lock state on resume in hres_timers_resume
  time-sched.c: tick_nohz_update_jiffies should be static
  locking, hpet: annotate false positive warning
  kernel/fork.c: unused variable 'ret'
  itimers: remove the per-cpu-ish-ness
2009-01-26 09:47:43 -08:00
Linus Torvalds
810ee58de2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (29 commits)
  xen: unitialised return value in xenbus_write_transaction
  x86: fix section mismatch warning
  x86: unmask CPUID levels on Intel CPUs, fix
  x86: work around PAGE_KERNEL_WC not getting WC in iomap_atomic_prot_pfn.
  x86: use standard PIT frequency
  xen: handle highmem pages correctly when shrinking a domain
  x86, mm: fix pte_free()
  xen: actually release memory when shrinking domain
  x86: unmask CPUID levels on Intel CPUs
  x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h>
  x86: fix PTE corruption issue while mapping RAM using /dev/mem
  x86: mtrr fix debug boot parameter
  x86: fix page attribute corruption with cpa()
  Revert "x86: signal: change type of paramter for sys_rt_sigreturn()"
  x86: use early clobbers in usercopy*.c
  x86: remove kernel_physical_mapping_init() from init section
  fix: crash: IP: __bitmap_intersects+0x48/0x73
  cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  work_on_cpu: Use our own workqueue.
  work_on_cpu: don't try to get_online_cpus() in work_on_cpu.
  ...
2009-01-26 09:47:28 -08:00
H. Peter Anvin
30a0fb947a x86: correct the CPUID pattern for MSR_IA32_MISC_ENABLE availability
Impact: re-enable CPUID unmasking on affected processors

As far as I am capable of discerning from the documentation,
MSR_IA32_MISC_ENABLE should be available for all family 0xf CPUs, as
well as family 6 for model >= 0xd (newer Pentium M).

The documentation on this isn't ideal, so we need to be on the lookout
for errors, still.
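
The pattern described above can be written as a small predicate. This sketch follows the wording of this message only (family 0xf, or family 6 with model >= 0xd); the function name is hypothetical and the actual check in the patch may be expressed differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: is MSR_IA32_MISC_ENABLE expected to exist
 * for this CPUID family/model, per the description above? */
static bool has_misc_enable(unsigned int family, unsigned int model)
{
	assert(family <= 0xff && model <= 0xff);
	return family == 0xf || (family == 6 && model >= 0xd);
}
```

For example, family 6 model 0xd qualifies while family 6 model 0xc does not, so the MSR read is skipped on the older part.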

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-26 09:40:58 -08:00
Rakib Mullick
659d2618b3 x86: fix section mismatch warning
Here the function vmi_activate() calls an init function, activate_vmi(), which
causes the following section mismatch warnings:

  LD      arch/x86/kernel/built-in.o
WARNING: arch/x86/kernel/built-in.o(.text+0x13ba9): Section mismatch
in reference from the function vmi_activate() to the function
.init.text:vmi_time_init()
The function vmi_activate() references
the function __init vmi_time_init().
This is often because vmi_activate lacks a __init
annotation or the annotation of vmi_time_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bd1): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_bsp_init()
The function vmi_activate() references
the function __devinit vmi_time_bsp_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_bsp_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bdb): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_ap_init()
The function vmi_activate() references
the function __devinit vmi_time_ap_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_ap_init is wrong.

Fix it by marking vmi_activate() as __init too.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:27:18 +01:00
Ingo Molnar
d5e397cb49 x86: improve early fault/irq printout
Impact: add a stack dump to early IRQs/faults

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:22:00 +01:00
Ingo Molnar
34707bcd04 x86, debug: remove early_printk() #ifdefs from head_32.S
Impact: cleanup

Remove such constructs:

 #ifdef CONFIG_EARLY_PRINTK
        call early_printk
 #else
        call printk
 #endif

Not only are they ugly, they are also pointless: a call to printk()
maps to early_printk during early bootup anyway, if CONFIG_EARLY_PRINTK
is enabled.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:18:43 +01:00
Ingo Molnar
99fb4d349d x86: unmask CPUID levels on Intel CPUs, fix
Impact: fix boot hang on pre-model-15 Intel CPUs

rdmsrl_safe() does not work in very early bootup code yet, because we
don't have the pagefault handler installed yet, so the exception section
does not get parsed. rdmsr_safe() will just crash and hang the bootup.

So limit the MSR_IA32_MISC_ENABLE MSR read to those CPU types that
support it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 12:36:24 +01:00
H. Peter Anvin
b38b066590 x86: filter CPU features dependent on unavailable CPUID levels
Impact: Fixes potential crashes on misconfigured systems.

Some CPU features require specific CPUID levels to be available in
order to function, as they contain information about the operation of
a specific feature.  However, some BIOSes and virtualization software
provide the ability to mask CPUID levels in order to support legacy
operating systems.  We try to enable such CPUID levels when we know
how to do it, but for the remaining cases, filter out such CPU
features when there is no way for us to support them.

Do this in one place, in the CPUID code, with a table-driven approach.
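
A table-driven filter of this kind might look roughly like the sketch below.
The feature bits, table contents, and function name are made up for
illustration, and the sketch ignores the distinction between basic and
extended CPUID levels.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_A (1u << 0)
#define FEAT_B (1u << 1)

struct cpuid_dependent_feature {
	uint32_t feature;	/* feature bit that depends on a CPUID level */
	uint32_t level;		/* CPUID level that must be available */
};

/* Assumed example table: each feature needs the listed level. */
static const struct cpuid_dependent_feature dep_table[] = {
	{ FEAT_A, 0x00000006 },
	{ FEAT_B, 0x0000000d },
};

/* Clear every feature whose required level exceeds max_level. */
static uint32_t filter_features(uint32_t features, uint32_t max_level)
{
	size_t i;

	for (i = 0; i < sizeof(dep_table) / sizeof(dep_table[0]); i++) {
		if (max_level < dep_table[i].level)
			features &= ~dep_table[i].feature;
	}
	return features;
}
```

Keeping the dependencies in one table means a new dependent feature only
needs a new table row, not another special case in the CPUID code.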

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:08:05 -08:00
H. Peter Anvin
75a048119e x86: handle PAT more like other CPU features
Impact: Cleanup

When PAT was originally introduced, it was handled specially for a few
reasons:

- PAT bugs are hard to track down, so we wanted to maintain a
  whitelist of CPUs.
- The i386 and x86-64 CPUID code was not yet unified.

Both of these are now obsolete, so handle PAT like any other feature,
including ordinary feature blacklisting due to known bugs.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:07:45 -08:00
Hiroshi Shimamoto
98e3d45eda x86: signal: use {get|put}_user_try and catch
Impact: use new framework

Use {get|put}_user_try, catch, and _ex in arch/x86/kernel/signal.c.

Note: this patch contains "WARNING: line over 80 characters", because I
indent each newly introduced block to avoid editing mistakes.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 17:17:38 -08:00
Ian Campbell
b041cf22dd x86: rename arch/x86/kernel/pci-swiotlb_64.c => pci-swiotlb.c
The file is used for 32 and 64 bit since:

  commit cfb80c9eae
  Author: Jeremy Fitzhardinge <jeremy@goop.org>
  Date:   Tue Dec 16 12:17:36 2008 -0800

    x86: unify pci iommu setup and allow swiotlb to compile for 32 bit

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 12:06:44 +01:00
Brian Gerst
3819cd489e x86: remove include of apic.h from hardirq_64.h
Impact: cleanup

APIC definitions aren't needed here.  Remove the include and fix
up the fallout.

tj: added include to mce_intel_64.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-23 11:03:29 +09:00
Brian Gerst
03d2989df9 x86: remove idle_timestamp from 32bit irq_cpustat_t
Impact: bogus irq_cpustat field removed

idle_timestamp is left over from the removed irqbalance code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-23 11:03:28 +09:00
Jeremy Fitzhardinge
ab897d2013 x86/pvops: remove pte_flags pvop
pte_flags() was introduced as a new pvop in order to extract just the
flags portion of a pte, which is a potentially cheaper operation than
extracting the page number as well.  It turns out this operation is
not needed, because simply using a mask to extract the flags from a
pte is sufficient for all current users.
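
As a rough illustration of extracting the flags with a mask, assuming a
64-bit pte with 4 KiB pages and the page frame number in bits 12..51 (the
macro values and helper names below are assumptions of this sketch, not
taken from the patch):

```c
#include <stdint.h>

/* Assumed layout: 4 KiB pages, page frame number in bits 12..51. */
#define PAGE_SHIFT	12
#define PTE_PFN_MASK	0x000ffffffffff000ULL
#define PTE_FLAGS_MASK	(~PTE_PFN_MASK)

/* Flags are simply everything outside the page-frame-number bits. */
static inline uint64_t pte_flags_of(uint64_t pteval)
{
	return pteval & PTE_FLAGS_MASK;
}

/* For comparison: extracting the page frame number needs mask and shift. */
static inline uint64_t pte_pfn_of(uint64_t pteval)
{
	return (pteval & PTE_PFN_MASK) >> PAGE_SHIFT;
}
```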

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22 23:35:20 +01:00
Markus Metzger
ba2607fe9c x86, ds, bts: cleanup/fix DS configuration
Cleanup the cpuid check for DS configuration.

This also fixes a Corei7 CPUID enumeration bug.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22 14:35:00 +01:00
Thomas Gleixner
336f6c322d debugobjects: add and use INIT_WORK_ON_STACK
Impact: Fix debugobjects warning

debugobject enabled kernels spit out a warning in hpet code due to a
workqueue which is initialized on stack.

Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it
in hpet.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-01-22 10:02:07 +01:00
H. Peter Anvin
066941bd4e x86: unmask CPUID levels on Intel CPUs
Impact: Fixes crashes with misconfigured BIOSes on XSAVE hardware

Avuton Olrich reported early boot crashes with v2.6.28 and
bisected it down to dc1e35c6e9
("x86, xsave: enable xsave/xrstor on cpus with xsave support").

If the CPUID limit bit in MSR_IA32_MISC_ENABLE is set, clear it to
make all CPUID information available.  This is required for some
features to work, in particular XSAVE.
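
The bit manipulation can be sketched on a plain value as below. The helper
name is hypothetical and the MSR is modeled as an ordinary variable; real
code would read and write the MSR and then re-read the maximum CPUID level.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 22 of IA32_MISC_ENABLE limits the maximum reported CPUID level. */
#define MISC_ENABLE_LIMIT_CPUID	(1ULL << 22)

/*
 * Hypothetical helper: clears the limit bit in a copy of the MSR value.
 * Returns true when the value changed and would need to be written back.
 */
static bool unmask_cpuid_levels(uint64_t *misc_enable)
{
	if (!(*misc_enable & MISC_ENABLE_LIMIT_CPUID))
		return false;

	*misc_enable &= ~MISC_ENABLE_LIMIT_CPUID;
	return true;
}
```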

Reported-and-bisected-by: Avuton Olrich <avuton@gmail.com>
Tested-by: Avuton Olrich <avuton@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-22 09:24:02 +01:00
Nick Piggin
03b486322e x86: make UV support configurable
Make X86 SGI Ultraviolet support configurable. Saves about 13K of text size
on my modest config.

   text    data     bss     dec     hex filename
6770537 1158680  694356 8623573  8395d5 vmlinux
6757492 1157664  694228 8609384  835e68 vmlinux.nouv

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 13:00:42 +01:00
Thomas Renninger
731f1872f4 x86: mtrr fix debug boot parameter
while looking at:

  http://bugzilla.kernel.org/show_bug.cgi?id=11541

I realized that the mtrr.show param cannot work, because
the code is processed much too early.

This patch:
 - Declares mtrr.show as early_param
 - Stays consistent with the previous param (which I doubt
   ever worked), so mtrr.show=1 would still work
 - Declares mtrr_show as initdata

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 12:26:42 +01:00
Ingo Molnar
198030782c Merge branch 'x86/mm' into core/percpu
Conflicts:
	arch/x86/mm/fault.c
2009-01-21 10:39:51 +01:00
Ingo Molnar
55f4949f57 x86, mm: move tlb.c to arch/x86/mm/
Impact: cleanup

Now that it's unified, move the (SMP) TLB flushing code from arch/x86/kernel/
to arch/x86/mm/, where it belongs logically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 10:16:19 +01:00
Ingo Molnar
3eb3963fd1 Merge branch 'cpus4096' into core/percpu
Conflicts:
	arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/x86/kernel/tlb_32.c

Merge it here because both the cpumask changes and the ongoing percpu
work are touching the TLB code. The percpu changes take precedence, as
they eliminate tlb_32.c altogether.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 10:14:17 +01:00
Ingo Molnar
552b8aa4d1 Revert "x86: signal: change type of paramter for sys_rt_sigreturn()"
This reverts commit 4217458daf.

Justin Madru bisected to this commit; it was causing weird Firefox
crashes.

The reason is that GCC mis-optimizes (re-uses) the on-stack parameters of
the calling frame, which corrupts the syscall return pt_regs state and
thus corrupts user-space register state.

So we go back to the slightly less clean but more optimization-safe
method of getting to pt_regs. Also add a comment to explain this.

Resolves: http://bugzilla.kernel.org/show_bug.cgi?id=12505

Reported-and-bisected-by: Justin Madru <jdm64@gawab.com>
Tested-by: Justin Madru <jdm64@gawab.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 09:43:18 +01:00
Tejun Heo
16c2d3f895 x86: rename tlb_64.c to tlb.c
Impact: file rename

tlb_64.c is now the tlb code for both 32 and 64.  Rename it to tlb.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
02cf94c370 x86: make x86_32 use tlb_64.c
Impact: less contention when issuing invalidate IPI, cleanup

Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
multiple IPI vectors for tlb shootdown to reduce contention.  This
patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
code paths.
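
One way to picture how several invalidate vectors reduce contention is to
map each sending CPU onto one of the 8 slots. The sketch below assumes a
simple modulo mapping; the base vector value and the helper names are
illustrative only.

```c
/* The commit allocates 8 invalidate IPIs; the base vector is assumed. */
#define NUM_INVALIDATE_TLB_VECTORS	8
#define INVALIDATE_TLB_VECTOR_START	0xf0

/* Slot used by a sending CPU: CPUs in different slots do not contend. */
static unsigned int tlb_flush_slot(unsigned int cpu)
{
	return cpu % NUM_INVALIDATE_TLB_VECTORS;
}

/* IPI vector the sending CPU would use for its shootdown request. */
static unsigned int tlb_flush_vector(unsigned int cpu)
{
	return INVALIDATE_TLB_VECTOR_START + tlb_flush_slot(cpu);
}
```

With a single vector every sender serializes on the same state; with 8
slots, only senders that map to the same slot have to wait for each other.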

Note that the usage of asmlinkage is inconsistent for x86_32 and 64
and calls for further cleanup.  This has been noted with a FIXME
comment in tlb_64.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
6dd01bedee x86: prepare for tlb merge
Impact: clean up, ipi vector number reordering for x86_32

Make the following changes to prepare for tlb merge.

* reorder x86_32 ipi vectors

* adjust tlb_32.c and tlb_64.c such that their logic coincides exactly
	- on spurious invalidate ipi, tlb_32 acks the irq
	- tlb_64 now has proper memory barriers around clearing
          flush_cpumask (no change in generated code)

* unexport flush_tlb_page from tlb_32.c, there's no user

* use unsigned int for cpu id

* drop unnecessary includes from tlb_64.c

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
bdbcdd4888 x86: uv cleanup
Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Brian Gerst
d650a51485 x86: merge irq_regs.h
Impact: cleanup, better irq_regs code generation for x86_64

Make 64-bit use the same optimizations as 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Brian Gerst
0dd76d736e x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32
Impact: cleanup

%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus.  Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Brian Gerst
06deef892c x86: clean up gdt_page definition
Impact: cleanup && more compact percpu area layout with future changes

Move 64-bit GDT to page-aligned section and clean up comment
formatting.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Jiri Kosina
afb33f8c0d x86: remove byte locks
Impact: cleanup

Remove byte locks implementation, which was introduced by Jeremy in
8efcbab6 ("paravirt: introduce a "lock-byte" spinlock implementation"),
but turned out to be dead code that is not used by any in-kernel
virtualization guest (Xen uses its own variant of spinlocks implementation
and KVM is not planning to move to byte locks).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 17:14:28 +01:00
Markus Metzger
ce5e5540c0 x86, ds, bts: cleanup DS configuration
Cleanup the cpuid check for DS configuration.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 13:04:39 +01:00
Markus Metzger
b1818748b0 x86, ftrace, hw-branch-tracer: dump trace on oops
Dump the branch trace on an oops (based on ftrace_dump_on_oops).

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 13:03:48 +01:00
Ingo Molnar
0ce1c38368 Merge commit 'v2.6.29-rc2' into x86/mm 2009-01-20 09:23:28 +01:00
Ingo Molnar
5766b842b2 x86, cpumask: fix tlb flush race
Impact: fix bootup crash

The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on
the stack - hence it is not constant anymore during the TLB flush.

That way it could race and some static sanity checks would trigger:

[  238.154287] ------------[ cut here ]------------
[  238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130!
[  238.156039] invalid opcode: 0000 [#1] SMP
[  238.156039] last sysfs file: /sys/class/net/eth2/address
[  238.156039] Modules linked in:
[  238.156039]
[  238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6
[  238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2
[  238.156039] EIP is at native_flush_tlb_others+0x35/0x158
[  238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000
[  238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0
[  238.156039]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000)
[  238.156039] Stack:
[  238.156039]  ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200
[  238.156039]  f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3
[  238.156039]  bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071
[  238.156039] Call Trace:
[  238.156039]  [<c0118e9c>] ? flush_tlb_others+0x52/0x5b
[  238.156039]  [<c0119435>] ? flush_tlb_mm+0x7f/0x8b
[  238.156039]  [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55
[  238.156039]  [<c01c3051>] ? exit_mmap+0x124/0x170
[  238.156039]  [<c013e965>] ? mmput+0x40/0xf5
[  238.156039]  [<c01e4788>] ? flush_old_exec+0x640/0x94b
[  238.156039]  [<c01ddb4e>] ? fsnotify_access+0x37/0x39
[  238.156039]  [<c01e3435>] ? kernel_read+0x39/0x4b
[  238.156039]  [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb
[  238.156039]  [<c01c0af9>] ? might_fault+0x51/0x9c
[  238.156039]  [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f
[  238.156039]  [<c010a406>] ? native_sched_clock+0x5d/0x60
[  238.156039]  [<c01e2fda>] ? search_binary_handler+0xab/0x2c4
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4
[  238.156039]  [<c01e4076>] ? do_execve+0x21c/0x2ee
[  238.156039]  [<c01029b7>] ? sys_execve+0x51/0x8c
[  238.156039]  [<c0103eaf>] ? sysenter_do_call+0x12/0x43

Fix it by not assuming that the cpumask is constant.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 09:13:15 +01:00
Ingo Molnar
8f5d36ed5b Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu 2009-01-20 08:23:45 +01:00
Brian Gerst
0d974d4592 x86: remove pda.h
Impact: cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
2009-01-20 12:29:20 +09:00
Brian Gerst
947e76cdc3 x86: move stack_canary into irq_stack
Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section.  If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.
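
The overlap can be pictured with a union like the one sketched below. The
stack size is an assumed value and the field names are illustrative; the
point is only that a canary placed at offset 40 ends at byte 48 of the
same storage as the bottom of the stack.

```c
#include <stddef.h>
#include <stdint.h>

#define IRQ_STACK_SIZE	16384	/* assumed size, for illustration */

union irq_stack_union {
	char irq_stack[IRQ_STACK_SIZE];
	/*
	 * gcc expects the canary at %gs:40, so the first 40 bytes are
	 * padding and the canary occupies bytes 40..47.
	 */
	struct {
		char gs_base[40];
		uint64_t stack_canary;
	};
};

/* Compile-time check that the canary really sits at offset 40. */
_Static_assert(offsetof(union irq_stack_union, stack_canary) == 40,
	       "stack canary must be at offset 40");
```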

tj: * updated subject
    * dropped asm relocation of irq_stack_ptr
    * updated comments a bit
    * rebased on top of stack canary changes

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:20 +09:00
Brian Gerst
8c7e58e690 x86: rework __per_cpu_load adjustments
Impact: cleanup

Use cpu_number to determine if the adjustment is necessary.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:20 +09:00
Brian Gerst
8ce031972b x86: remove pda_init()
Impact: cleanup

Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized.  Remove all other calls, since the segments are
already initialized in head_64.S.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:19 +09:00
Tejun Heo
c6e50f93db x86: cleanup stack protector
Impact: cleanup

Make the following cleanups.

* remove duplicate comment from boot_init_stack_canary() which fits
  better in the other place - cpu_idle().

* move stack_canary offset check from __switch_to() to
  boot_init_stack_canary().

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:19 +09:00
Ingo Molnar
bfa318ad52 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 00:17:01 +01:00
Mike Travis
7285908185 cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19 22:36:13 +01:00
Ingo Molnar
5cdc5e9e69 x86: fully honor "nolapic", fix
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19 20:49:37 +01:00
Michael Ellerman
422e79a8b3 x86: Remove never-called arch_setup_msi_irq()
Since commit 75c46fa, "x64, x2apic/intr-remap: MSI and MSI-X
support for interrupt remapping infrastructure", x86 has had an
implementation of arch_setup_msi_irqs().

That implementation does not call arch_setup_msi_irq(), instead it calls
setup_irq(). No other x86 code calls arch_setup_msi_irq().

That leaves only arch_setup_msi_irqs() in drivers/pci/msi.c, but that
routine is overridden by the x86 version of arch_setup_msi_irqs().

So arch_setup_msi_irq() is dead code, remove it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-18 22:16:46 -08:00
Leonardo Potenza
c7f8562a51 x86: fix section mismatch warnings in kernel/setup_percpu.c
The function setup_cpu_local_masks() has been marked __init, in
order to remove the following section mismatch messages:

WARNING: vmlinux.o(.text+0x3c2c7): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2d3): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2df): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2eb): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 23:59:22 +01:00
Mike Travis
b2b815d80a x86: put trigger in to detect mismatched apic versions
Impact: add debug warning

Fire off one message if two APICs are discovered with different
APIC versions. (this code is only called during CPU init)

The goal of this is to pave the way for the removal of the apic_version[]
array. We don't expect any APIC version incompatibilities in the x86
landscape of systems [if so we don't handle them very well and probably
never will handle deep APIC version asymmetries well], but it's prudent
to have a debug check for one kernel cycle nevertheless.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 21:15:27 +01:00
Ingo Molnar
b2b062b816 Merge branch 'core/percpu' into stackprotector
Conflicts:
	arch/x86/include/asm/pda.h
	arch/x86/include/asm/system.h

Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 18:37:14 +01:00
Brian Gerst
c2558e0eba x86-64: Move isidle from PDA to per-cpu.
tj: s/isidle/is_idle/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:59 +09:00
Brian Gerst
e7a22c1ebc x86-64: Move nodenumber from PDA to per-cpu.
tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:59 +09:00
Brian Gerst
5689553076 x86-64: Move irqcount from PDA to per-cpu.
tj: s/irqcount/irq_count/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
3d1e42a7cf x86-64: Move oldrsp from PDA to per-cpu.
tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
      is still referenced in the file
    * s/oldrsp/old_rsp/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9af45651f1 x86-64: Move kernelstack from PDA to per-cpu.
Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
c6f5e0acd5 x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
ea9279066d x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.
tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
92d65b2371 x86-64: Convert exception stacks to per-cpu
Move the exception stacks to per-cpu, removing specific allocation code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
26f80bd6a9 x86-64: Convert irqstacks to per-cpu
Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a separate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9eb912d1aa x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:57 +09:00
Brian Gerst
1b437c8c73 x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:57 +09:00
Mike Travis
cef30b3a84 x86: put trigger in to detect mismatched apic versions.
Fire off one message if two APICs are discovered with different
APIC versions.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-16 15:58:13 -08:00
Mike Travis
6eb714c63e cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>
2009-01-16 15:31:15 -08:00
Rafael J. Wysocki
5d8b532af9 ACPI suspend: Fix compilation warnings in drivers/acpi/sleep.c
Fix two compilation warnings in drivers/acpi/sleep.c, one triggered
by unsetting CONFIG_SUSPEND and the other triggered by unsetting
CONFIG_HIBERNATION, by moving some code under the appropriate
#ifdefs .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16 18:15:31 -05:00
Len Brown
88d998c264 Merge branch 'misc' into release 2009-01-16 14:45:34 -05:00
Masami Hiramatsu
5a4ccaf37f kprobes: check CONFIG_FREEZER instead of CONFIG_PM
Check CONFIG_FREEZER instead of CONFIG_PM because kprobe booster
depends on freeze_processes() and thaw_processes() when CONFIG_PREEMPT=y.

This fixes a linkage error which occurs when CONFIG_PREEMPT=y, CONFIG_PM=y
and CONFIG_FREEZER=n.

Reported-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16 14:32:17 -05:00
Tejun Heo
cd3adf5230 x86_64: initialize this_cpu_off to __per_cpu_load
On x86_64, if get_per_cpu_var() is used before the per cpu area is set up
(this happens if lockdep is turned on), it needs this_cpu_off to point
to __per_cpu_load.  Initialize accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:58 +01:00
Tejun Heo
a338af2c64 x86: fix build bug introduced during merge
EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build
failure.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:43 +01:00
Ingo Molnar
6dbde35308 percpu: add optimized generic percpu accessors
It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.
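
The semantics of these accessors can be modeled in user space roughly as
follows. This is only a behavioral sketch: the _model names, NR_CPUS value,
and current_cpu variable are all hypothetical stand-ins, and it shows
neither the single-instruction x86 implementation nor the preemption
handling of the generic fallback.

```c
#define NR_CPUS	4			/* assumed CPU count */

static int current_cpu;			/* stand-in for the running CPU id */

/* One slot per CPU; the accessors always touch the current CPU's slot. */
#define DEFINE_PER_CPU_MODEL(type, name)	type per_cpu__##name[NR_CPUS]

#define percpu_read_model(var)		(per_cpu__##var[current_cpu])
#define percpu_write_model(var, val)	\
	do { per_cpu__##var[current_cpu] = (val); } while (0)
#define percpu_add_model(var, val)	\
	do { per_cpu__##var[current_cpu] += (val); } while (0)
#define percpu_sub_model(var, val)	\
	do { per_cpu__##var[current_cpu] -= (val); } while (0)

/* Example per-CPU variable used to exercise the accessors. */
static DEFINE_PER_CPU_MODEL(unsigned long, counter);
```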

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:31 +01:00
Tejun Heo
004aa322f8 x86: misc clean up after the percpu update
Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:26 +01:00
Tejun Heo
49357d19e4 x86: convert pda ops to wrappers around x86 percpu accessors
pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:22 +01:00
Tejun Heo
b12d8db8fb x86: make pda a percpu variable
[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:03 +01:00
Tejun Heo
9939ddaff5 x86: merge 64 and 32 SMP percpu handling
Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they now behave identically.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:58 +01:00
Tejun Heo
1a51e3a0ae x86: fold pda into percpu area on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to a strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve a pda-sized chunk at the start of the percpu
area.
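
The layout constraint can be sketched in plain C.  The struct below is
a hypothetical stand-in for the pda, with illustrative field names and
sizes chosen so that the canary lands at byte 40; it is not the
kernel's definition.  Placing it first in the area makes the offset
from the area base equal to the offset inside the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pda; fields are illustrative only. */
struct demo_pda {
	uint64_t current_task;  /* offset  0 */
	uint64_t data_offset;   /* offset  8 */
	uint64_t kernel_stack;  /* offset 16 */
	uint64_t old_rsp;       /* offset 24 */
	int32_t  irq_count;     /* offset 32 */
	uint32_t cpu_number;    /* offset 36 */
	uint64_t stack_canary;  /* offset 40 */
};

/* Hypothetical per-CPU area with the pda reserved at the very start. */
struct demo_percpu_area {
	struct demo_pda pda;            /* must come first */
	unsigned char   other_vars[64]; /* everything else follows */
};

/* Fail the build if a layout change moves the canary. */
static_assert(offsetof(struct demo_pda, stack_canary) == 40,
	      "canary must sit 40 bytes into the pda");

/* Offset of the canary measured from the base of the per-CPU area. */
static size_t canary_offset_in_area(void)
{
	return offsetof(struct demo_percpu_area, pda) +
	       offsetof(struct demo_pda, stack_canary);
}
```

If the pda were placed anywhere but first, canary_offset_in_area()
would no longer be 40 and a fixed segment-relative offset would miss
the field.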

After this change, for the boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() needs
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:46 +01:00
Tejun Heo
c8f3329a0d x86: use static _cpu_pda array
_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:40 +01:00
Tejun Heo
f32ff5388d x86: load pointer to pda into %gs while bringing up a CPU
[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded the address of a zero page into %gs
for temporary use until pda is loaded, but the address of the actual pda is
available at that point.  Load the real address directly instead.

This will help unify percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:19:26 +01:00
Tejun Heo
3e5d8f9784 x86: make percpu symbols zerobased on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accommodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:14 +01:00
Tejun Heo
a698c823e1 x86: make vmlinux_32.lds.S use PERCPU() macro
Make vmlinux_32.lds.S use the generic PERCPU() macro instead of open
coding it.  This will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:09 +01:00
Mike Travis
c90aa894f0 x86: cleanup early setup_percpu references
[ Based on original patch from Christoph Lameter and Mike Travis. ]

  * Ruggedize some calls in setup_percpu.c to prevent mishaps
    in early calls, particularly for non-critical functions.

  * Clean up DEBUG_PER_CPU_MAPS usages and some comments.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:18:23 +01:00
Tejun Heo
f10fcd4712 x86: make early_per_cpu() an lvalue and use it
Make early_per_cpu() an lvalue as per_cpu() is and use it where
applicable.
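
One way an accessor macro can be an lvalue is to expand to a
dereference of a conditionally chosen address.  The sketch below is a
userspace illustration with hypothetical names (demo_*); it assumes an
early table that is used until a pointer is cleared, after which the
regular storage is used.  It is not the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4  /* hypothetical CPU count */

/* Boot-time table, used while the pointer below is non-NULL. */
static int demo_apicid_early[DEMO_NR_CPUS];
static int *demo_apicid_early_ptr = demo_apicid_early;

/* Stand-in for the real per-CPU storage used later on. */
static int demo_apicid_percpu[DEMO_NR_CPUS];

/* Expands to an lvalue: "*(cond ? &a[i] : &b[i])" can be assigned to. */
#define demo_early_per_cpu(cpu)                      \
	(*(demo_apicid_early_ptr                     \
		? &demo_apicid_early_ptr[(cpu)]      \
		: &demo_apicid_percpu[(cpu)]))

/* Writes through the macro, whichever storage is currently active. */
static void demo_set_apicid(int cpu, int id)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_early_per_cpu(cpu) = id;
}

/* Copy early values over and retire the early table. */
static void demo_switch_to_percpu(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_apicid_percpu[cpu] = demo_apicid_early[cpu];
	demo_apicid_early_ptr = NULL;
}
```

Callers can then write "demo_early_per_cpu(cpu) = id" without caring
which storage is live, instead of open coding the pointer check.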

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:18:17 +01:00
Ingo Molnar
af2519fb22 Merge branch 'linus' into core/iommu
Conflicts:
	arch/ia64/include/asm/dma-mapping.h
	arch/ia64/include/asm/machvec.h
	arch/ia64/include/asm/machvec_sn2.h
2009-01-16 10:09:10 +01:00
Cliff Wickman
18c07cf530 x86, UV: cpu_relax in uv_wait_completion
The function uv_wait_completion() spins on reads of a memory-mapped
register, waiting for completion of BAU hardware replies.

It should call "cpu_relax()" between those reads to improve performance
on hyperthreaded configurations.
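
The pattern is a polling loop with a pause hint between reads.  The
following is a minimal userspace sketch, assuming GCC or Clang inline
assembly; demo_cpu_relax() and demo_wait_completion() are hypothetical
names, and an atomic int stands in for the memory-mapped register.

```c
#include <assert.h>
#include <stdatomic.h>

/* Busy-wait hint: "pause" on x86, otherwise just a compiler barrier. */
static inline void demo_cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("pause" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");
#endif
}

/*
 * Poll *status until it equals done_value.  Returns the number of
 * polls that saw a different value, or -1 if max_polls was exhausted.
 */
static long demo_wait_completion(atomic_int *status, int done_value,
				 long max_polls)
{
	assert(max_polls > 0);
	for (long polls = 0; polls < max_polls; polls++) {
		if (atomic_load_explicit(status, memory_order_acquire)
		    == done_value)
			return polls;
		demo_cpu_relax();  /* yield pipeline resources to a sibling */
	}
	return -1;
}
```

The bounded poll count is only there to keep the sketch testable; a
real wait loop would decide its own timeout policy.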

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 23:48:20 +01:00
Jan Beulich
4a13ad0bd8 x86: avoid early crash in disable_local_APIC()
E.g. when called due to an early panic.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 23:48:19 +01:00
Ingo Molnar
5cd7376200 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 15:46:12 +01:00
Ingo Molnar
49a93bc978 Merge branch 'linus' into cpus4096 2009-01-15 15:45:31 +01:00
Ingo Molnar
7f268f4352 Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu 2009-01-15 13:18:57 +01:00
Ingo Molnar
54da5b3d44 x86: fix broken flush_tlb_others_ipi(), fix
Impact: cleanup

Use the proper type.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 13:04:58 +01:00
Jan Beulich
a08c4743ed x86: avoid early crash in disable_local_APIC()
E.g. when called due to an early panic.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 12:04:40 +01:00
Jan Beulich
f11826385b x86: fully honor "nolapic"
Impact: widen the effect of the 'nolapic' boot parameter

"nolapic" should not only suppress SMP and use of the LAPIC, but it
also ought to have the effect of disabling all IO-APIC related activity
as well as PCI MSI and HT-IRQs.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 12:04:40 +01:00
Heiko Carstens
e55380edf6 [CVE-2009-0029] Rename old_readdir to sys_old_readdir
This way it matches the generic system call name convention.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:15 +01:00
Ingo Molnar
e46d51787e Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096 2009-01-14 12:13:45 +01:00
Frederik Deweerdt
09b3ec7315 x86, tlb flush_data: replace per_cpu with an array
Impact: micro-optimization, memory reduction

On x86_64 flush tlb data is stored in per_cpu variables. This is
unnecessary because only the first NUM_INVALIDATE_TLB_VECTORS entries
are accessed.

This patch aims at making the code less confusing (there's nothing
really "per_cpu") by using a plain array. It also would save some memory
on most distros out there (Ubuntu x86_64 has NR_CPUS=64 by default).
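
A plain array sized by the number of invalidate vectors might look like
the sketch below.  The value 8 and the modulo mapping from sender CPU
to slot are assumptions made for illustration, as are all demo_* names;
the commit text only states that the first NUM_INVALIDATE_TLB_VECTORS
entries are accessed.

```c
#include <assert.h>

#define DEMO_NUM_INVALIDATE_VECTORS 8  /* assumed value, illustration only */

/* Hypothetical per-slot flush state. */
struct demo_flush_state {
	unsigned long flush_va;
	int           pending;
};

/* One entry per vector, independent of how many CPUs exist. */
static struct demo_flush_state demo_flush_state[DEMO_NUM_INVALIDATE_VECTORS];

/* Map a sender CPU to its slot (assumed mapping: cpu modulo vectors). */
static struct demo_flush_state *demo_flush_slot(unsigned int sender_cpu)
{
	unsigned int slot = sender_cpu % DEMO_NUM_INVALIDATE_VECTORS;

	assert(slot < DEMO_NUM_INVALIDATE_VECTORS);
	return &demo_flush_state[slot];
}
```

With this shape the storage stays at eight entries whether the kernel
is built for 8 or 64 CPUs.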

[ Ravikiran G Thirumalai also pointed out that the correct alignment
  is ____cacheline_internodealigned_in_smp, so that there's no
  bouncing on vsmp. ]

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:04:53 +01:00
Jaswinder Singh Rajput
c2c21745ec x86: replacing mp_config_intsrc with mpc_intsrc
Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 11:58:35 +01:00
Jaswinder Singh Rajput
b5ba7e6d1e x86: replacing mp_config_ioapic with mpc_ioapic
Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 11:58:27 +01:00
Suresh Siddha
a4a0acf8e1 x86: fix broken flush_tlb_others_ipi()
This commit broke flush_tlb_others_ipi() causing boot hangs on a
16 logical cpu system:

>	commit 4595f9620c
>	Author: Rusty Russell <rusty@rustcorp.com.au>
>	Date:   Sat Jan 10 21:58:09 2009 -0800
>
>	    x86: change flush_tlb_others to take a const struct cpumask

This change resulted in sending the invalidate tlb vector to the
sender itself causing the hang. flush_tlb_others_ipi() should exclude
the sender itself from the destination list.
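
The exclusion step can be sketched with a 64-bit mask standing in for
a cpumask.  demo_ipi_targets() is a hypothetical helper for
illustration, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the sending CPU's bit so it never sends the IPI to itself. */
static uint64_t demo_ipi_targets(uint64_t requested, unsigned int sender_cpu)
{
	assert(sender_cpu < 64);
	return requested & ~(UINT64_C(1) << sender_cpu);
}
```

On a 16 cpu mask of 0xFFFF with CPU 3 as the sender, the destination
list becomes 0xFFF7.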

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 08:51:06 +01:00
Ingo Molnar
4a922a969c x86, cpufreq: remove leftover copymask_copy()
Impact: fix potential boot crash on MAXSMP

Remove code left over by:

  50c668d: Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read

That cmd.cpumask is not allocated anymore. No impact on default !MAXSMP
kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 16:11:00 +01:00
Yinghai Lu
4a046d1754 x86: arch_probe_nr_irqs
Impact: save RAM with large NR_CPUS, get smaller nr_irqs

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-12 17:39:24 -08:00
Andi Kleen
42bb8cc5e8 x86: hpet: allow force enable on ICH10 HPET
Intel "Smackover" x58 BIOSes don't enable the HPET, so at least allow
force enabling it.  The register layout is the same as in other
recent ICHs, so all the code can be reused.

Using numerical PCI-ID because it's unlikely the PCI-ID will be used
anywhere else.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:53:24 +01:00
Ingo Molnar
e8cea892df Revert "i386: add TRACE_IRQS_OFF for the nmi"
This reverts commit e0c7317557.

This patch was wrong, as lockdep (and thus the irq state tracer)
aren't nmi safe. People are already seeing lockdep warnings due
to this.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:36:59 +01:00
Ingo Molnar
50c668d678 Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write"
This reverts commit 7503bfbae8.

Dieter Ries reported bootup soft-hangs and bisected it back to
this commit, and reverting this commit gave him a working system.

The commit introduces work_on_cpu() use into the cpufreq code,
but that is subtly problematic from a lock hierarchy POV: the
hotplug-cpu lock is a highlevel lock that is taken before
lowlevel locks, and in this codepath we are called with the
policy lock taken.

Dieter did not have lockdep enabled so we don't have a nice stack
trace proof for this, but using work_on_cpu() in such a lowlevel
place certainly looks wrong, so we revert the patch.

work_on_cpu() needs to be reworked to be more generally usable.

Reported-by: Dieter Ries <clip2@gmx.de>
Tested-by: Dieter Ries <clip2@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:23 +01:00
Jaswinder Singh Rajput
2bc1379712 x86: fix apic.c build error on latest git
Fix this by reintroducing asm/smp.h include in apic.c - later on
I will fix this by removing non-smp data from smp.h

Also fix the __inquire_remote_apic() prototype/inline.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:23 +01:00
Jaswinder Singh Rajput
4884d8e6a0 x86: fix mpparse.c build error on latest git
Fix this by reintroducing asm/smp.h include in mpparse.c - later on
I will fix this by removing non-smp data from smp.h.

Reported-by: Petr Titera <P.Titera@century.cz>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:22 +01:00
Peter Zijlstra
6d612b0f94 locking, hpet: annotate false positive warning
Alexander Beregalov reported that this warning is caused by the HPET code:

> hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
> hpet0: 3 comparators, 64-bit 14.318180 MHz counter
> ODEBUG: object is on stack, but not annotated
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:251 __debug_object_init+0x2a4/0x352()

> Bisected down to 26afe5f2fb
> (x86: HPET_MSI Initialise per-cpu HPET timers)

The commit is fine - but the on-stack workqueue entry needs annotation.

Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 13:33:20 +01:00
Jaswinder Singh Rajput
3b9dc9f2f1 x86: module_64.c fix style problems
Impact: cleanup

Fix:

 ERROR: trailing whitespace
 ERROR: code indent should use tabs where possible
 WARNING: %Ld/%Lu are not-standard C, use %lld/%llu
 WARNING: printk() should include KERN_ facility level
 ERROR: spaces required around that '=' (ctx:VxW)

 total: 13 errors, 2 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:23:01 +01:00
Jaswinder Singh Rajput
e17029ad69 x86: module_32.c fix style problems
Impact: cleanup

Fix:

 ERROR: code indent should use tabs where possible
 ERROR: trailing whitespace
 ERROR: spaces required around that '=' (ctx:VxW)

 total: 3 errors, 0 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:55 +01:00
Jaswinder Singh Rajput
448dd2fa3e x86: msr.c fix style problems
Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>

 total: 0 errors, 1 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:50 +01:00
Jaswinder Singh Rajput
dd3feda774 x86: microcode_intel.c fix style problems
Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 ERROR: trailing whitespace
 ERROR: "(foo*)" should be "(foo *)"

 total: 3 errors, 1 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:40 +01:00
Mike Travis
9594949b06 irq: change references from NR_IRQS to nr_irqs
Impact: preparation, cleanup, add KERN_INFO printk

Modify references from NR_IRQS to nr_irqs as the latter will become
variable-sized based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:29 +01:00
Mike Travis
f9b90566cd x86: reduce stack usage in init_intel_cacheinfo
Impact: reduce stack usage.

init_intel_cacheinfo() does not use the cpumask so define a subset
of struct _cpuid4_info (_cpuid4_info_regs) that can be used instead.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:16 +01:00
Mike Travis
a1c33bbeb7 x86: cleanup remaining cpumask_t code in mce_amd_64.c
Impact: Reduce memory usage, use new cpumask API.

Use cpumask_var_t for 'cpus' cpumask in struct threshold_bank and update
remaining old cpumask_t functions to new cpumask API.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:12 +01:00
Mike Travis
0e21990ae7 SGI UV cpumask: use static temp cpumask in flush_tlb
Impact: Improve tlb flush performance for UV

Calling alloc_cpumask_var a zillion times a second does affect
performance.  Replace with static cpumask.

Note: when CONFIG_X86_UV is defined, this extra PER_CPU memory
will be optimized out for non-UV configs as is_uv_system() will
then return a constant 0.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:09 +01:00
Rusty Russell
4595f9620c x86: change flush_tlb_others to take a const struct cpumask
Impact: reduce stack usage, use new cpumask API.

This is made a little more tricky by uv_flush_tlb_others which
actually alters its argument, for an IPI to be sent to the remaining
cpus in the mask.

I solve this by allocating a cpumask_var_t for this case and falling back
to IPI should this fail.

To eliminate temporaries in the caller, all flush_tlb_others implementations
now do the this-cpu-elimination step themselves.

Note also the curious "cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask)"
which has been there since pre-git and yet f->flush_cpumask is always zero
at this point.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:06 +01:00
Mike Travis
7f7ace0cda cpumask: update irq_desc to use cpumask_var_t
Impact: reduce memory usage, use new cpumask API.

Replace the affinity and pending_masks with cpumask_var_t's.  This adds
to the significant size reduction done with the SPARSE_IRQS changes.

The added functions (init_alloc_desc_masks & init_copy_desc_masks) are
in the include file so they can be inlined (and optimized out for the
!CONFIG_CPUMASKS_OFFSTACK case.)  [Naming chosen to be consistent with
the other init*irq functions, as well as the backwards arg declaration
of "from, to" instead of the more common "to, from" standard.]

Includes a slight change to the declaration of struct irq_desc to embed
the pending_mask within ifdef(CONFIG_SMP) to be consistent with other
references, and some small changes to Xen.

Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: virtualization@lists.osdl.org
Cc: xen-devel@lists.xensource.com
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
2009-01-11 19:12:46 +01:00
Benjamin LaHaise
7106a5ab89 x86-64: remove locked instruction from switch_to()
Impact: micro-optimization

The patch below removes an unnecessary locked instruction from
switch_to().  TIF_FORK is only ever set in copy_thread() on initial
process creation, and gets cleared during the first scheduling of the
process.  As such, it is safe to use an unlocked test for the flag
within switch_to().

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11 05:05:33 +01:00
Ian Campbell
0b8698ab58 swiotlb: range_needs_mapping should take a physical address.
The swiotlb_arch_range_needs_mapping() hook should take a physical
address rather than a virtual address in order to support highmem pages.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11 04:54:34 +01:00
Ingo Molnar
0811a433c6 Merge branch 'linus' into core/iommu 2009-01-11 00:51:06 +01:00
Jaswinder Singh Rajput
fb8fd077fb x86: smp.h move cpu_callout_mask and cpu_callout_map declaration to cpumask.h
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-10 23:57:20 +01:00
Jaswinder Singh Rajput
068790334c x86: smp.h move cpu_callin_mask and cpu_callin_map declaration to cpumask.h
Impact: cleanup

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-10 23:57:09 +01:00
Ingo Molnar
1de8cd3cb9 Merge branch 'linus' into x86/cleanups 2009-01-10 23:56:42 +01:00
Linus Torvalds
3d14bdad40 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
  x86: fix section mismatch warnings in mcheck/mce_amd_64.c
  x86: offer frame pointers in all build modes
  x86: remove duplicated #include's
  x86: k8 numa register active regions later
  x86: update Alan Cox's email addresses
  x86: rename all fields of mpc_table mpc_X to X
  x86: rename all fields of mpc_oemtable oem_X to X
  x86: rename all fields of mpc_bus mpc_X to X
  x86: rename all fields of mpc_cpu mpc_X to X
  x86: rename all fields of mpc_intsrc mpc_X to X
  x86: rename all fields of mpc_lintsrc mpc_X to X
  x86: rename all fields of mpc_iopic mpc_X to X
  x86: irqinit_64.c init_ISA_irqs should be static
  Documentation/x86/boot.txt: payload length was changed to payload_length
  x86: setup_percpu.c fix style problems
  x86: irqinit_64.c fix style problems
  x86: irqinit_32.c fix style problems
  x86: i8259.c fix style problems
  x86: irq_32.c fix style problems
  x86: ioport.c fix style problems
  ...
2009-01-10 06:13:09 -08:00
Linus Torvalds
4e9b1c184c Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  [IA64] fix typo in cpumask_of_pcibus()
  x86: fix x86_32 builds for summit and es7000 arch's
  cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
  cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  cpumask: use cpumask_var_t in acpi-cpufreq.c
  cpumask: use work_on_cpu in acpi/cstate.c
  cpumask: convert struct cpufreq_policy to cpumask_var_t
  cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
  x86: cleanup remaining cpumask_t ops in smpboot code
  cpumask: update pci_bus_show_cpuaffinity to use new cpumask API
  cpumask: update local_cpus_show to use new cpumask API
  ia64: cpumask fix for is_affinity_mask_valid()
2009-01-10 06:12:18 -08:00
Andi Kleen
8659c406ad x86: only scan the root bus in early PCI quirks
We found a situation on Linus' machine where the Nvidia timer quirk hit on
an Intel chipset system.  The problem is that the system has a fancy Nvidia
card with its own PCI bridge, and the early-quirks code looking for any
NVidia bridge triggered on it incorrectly.  By luck this didn't lead to a
boot failure, only to the timer routing code selecting the wrong timer
first and some ugly messages.  It might lead to real problems on other
systems.

I checked all the devices which are currently checked for by early_quirks
and it turns out they are all located in the root bus zero.

So change the early-quirks loop to only scan bus 0.  This incidentally also
saves quite some unnecessary scanning work, because early_quirks no longer
goes through all the non-root busses.

The graphics card is not on bus 0, so it is not matched anymore.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-09 12:46:22 -08:00
Len Brown
b2576e1d44 Merge branch 'linus' into release 2009-01-09 03:39:43 -05:00
Len Brown
3cc8a5f4ba Merge branch 'suspend' into release 2009-01-09 03:38:15 -05:00
Len Brown
d0302bc62a Merge branch 'misc' into release
Conflicts:
	include/acpi/acpixf.h

Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-09 03:37:48 -05:00
Zhao Yakui
237889bf0a ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt"
Some boxes have both an RSDT and an XSDT table. Unfortunately, the
following errors sometimes occur when the XSDT table is used:
   a. 32/64X address mismatch
   b. The 32/64X FACS address mismatch

   For such cases the boot option "acpi=rsdt" is provided so that the
RSDT is tried instead of the XSDT table when the system doesn't work well.

http://bugzilla.kernel.org/show_bug.cgi?id=8246

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-09 01:41:58 -05:00
Zhao Yakui
13b40a1a06 ACPI: Avoid array address overflow when _CST MWAIT hint bits are set
The Cx Register address obtained from the _CST object is used as the MWAIT
hint if the register type is FFixedHW.  It is also used to check whether
the Cx type is supported or not.

On some boxes the following Cx state package is obtained from _CST object:
    >{
                ResourceTemplate ()
                {
                    Register (FFixedHW,
                        0x01,               // Bit Width
                        0x02,               // Bit Offset
                        0x0000000000889759, // Address
                        0x03,               // Access Size
                        )
                },

                0x03,
                0xF5,
                0x015E }

   In such a case we should use bits [7:4] of the Cx address to check whether
the Cx type is supported or not.

Mask the MWAIT hint to avoid array address overflow.
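
The masking can be sketched as below.  The macro and function names
are hypothetical (demo_*) and the code is an illustration of extracting
bits [7:4], not the patch itself.  For the address 0x889759 shown
above, shifting alone gives 0x88975, which is far too large to use as
an index, while the masked field is 5.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MWAIT_SUBSTATE_SIZE 4     /* bits [3:0] hold the sub-state */
#define DEMO_MWAIT_CSTATE_MASK   0xFu  /* bits [7:4] hold the C-state  */

/* Extract bits [7:4] of the Cx address, ignoring any higher bits. */
static unsigned int demo_mwait_cstate_field(uint64_t cx_address)
{
	unsigned int field = (unsigned int)
		((cx_address >> DEMO_MWAIT_SUBSTATE_SIZE) &
		 DEMO_MWAIT_CSTATE_MASK);

	assert(field <= DEMO_MWAIT_CSTATE_MASK);
	return field;
}
```

Because the result is at most 0xF, anything derived from it stays
within a small, known range regardless of what firmware puts in the
upper address bits.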

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-09 01:28:01 -05:00
Fernando Carrijo
c19a28e119 remove lots of double-semicolons
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:14 -08:00
Jaswinder Singh Rajput
1eb1b3b65d x86: rename all fields of mpf_intel mpf_X to X
Impact: cleanup, solve 80 columns wrap problems

It would be cleaner to rename all the mpf->mpf_X fields to
mpf->X - that alone would save 4 characters per usage site.
(we already know that it's an 'mpf' entity -
no need to duplicate that in the field too)

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-08 15:37:37 +01:00