Commit Graph

2914 Commits

Author SHA1 Message Date
Robert Richter
ee5789dbcc perf, x86: Share IBS macros between perf and oprofile
Moving IBS macros from oprofile to <asm/perf_event.h> to make it
available to perf. No additional changes.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1316597423-25723-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:57:11 +02:00
Don Zickus
b227e23399 x86, nmi: Add in logic to handle multiple events and unknown NMIs
Previous patches allow the NMI subsystem to process multipe NMI events
in one NMI.  As previously discussed this can cause issues when an event
triggered another NMI but is processed in the current NMI.  This causes the
next NMI to go unprocessed and become an 'unknown' NMI.

To handle this, we first have to flag whether or not the NMI handler handled
more than one event or not.  If it did, then there exists a chance that
the next NMI might be already processed.  Once the NMI is flagged as a
candidate to be swallowed, we next look for a back-to-back NMI condition.

This is determined by looking at the %rip from pt_regs.  If it is the same
as the previous NMI, it is assumed the cpu did not have a chance to jump
back into a non-NMI context and execute code and instead handled another NMI.

If both of those conditions are true then we will swallow any unknown NMI.

There still exists a chance that we accidentally swallow a real unknown NMI,
but for now things seem better.

An optimization has also been added to the nmi notifier rountine.  Because x86
can latch up to one NMI while currently processing an NMI, we don't have to
worry about executing _all_ the handlers in a standalone NMI.  The idea is
if multiple NMIs come in, the second NMI will represent them.  For those
back-to-back NMI cases, we have the potentail to drop NMIs.  Therefore only
execute all the handlers in the second half of a detected back-to-back NMI.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:57:01 +02:00
Don Zickus
9c48f1c629 x86, nmi: Wire up NMI handlers to new routines
Just convert all the files that have an nmi handler to the new routines.
Most of it is straight forward conversion.  A couple of places needed some
tweaking like kgdb which separates the debug notifier from the nmi handler
and mce removes a call to notify_die.

[Thanks to Ying for finding out the history behind that mce call

https://lkml.org/lkml/2010/5/27/114

And Boris responding that he would like to remove that call because of it

https://lkml.org/lkml/2011/9/21/163]

The things that get converted are the registeration/unregistration routines
and the nmi handler itself has its args changed along with code removal
to check which list it is on (most are on one NMI list except for kgdb
which has both an NMI routine and an NMI Unknown routine).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Corey Minyard <minyard@acm.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Jack Steiner <steiner@sgi.com>
Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:56:57 +02:00
Don Zickus
c9126b2ee8 x86, nmi: Create new NMI handler routines
The NMI handlers used to rely on the notifier infrastructure.  This worked
great until we wanted to support handling multiple events better.

One of the key ideas to the nmi handling is to process _all_ the handlers for
each NMI.  The reason behind this switch is because NMIs are edge triggered.
If enough NMIs are triggered, then they could be lost because the cpu can
only latch at most one NMI (besides the one currently being processed).

In order to deal with this we have decided to process all the NMI handlers
for each NMI.  This allows the handlers to determine if they recieved an
event or not (the ones that can not determine this will be left to fend
for themselves on the unknown NMI list).

As a result of this change it is now possible to have an extra NMI that
was destined to be received for an already processed event.  Because the
event was processed in the previous NMI, this NMI gets dropped and becomes
an 'unknown' NMI.  This of course will cause printks that scare people.

However, we prefer to have extra NMIs as opposed to losing NMIs and as such
are have developed a basic mechanism to catch most of them.  That will be
a later patch.

To accomplish this idea, I unhooked the nmi handlers from the notifier
routines and created a new mechanism loosely based on doIRQ.  The reason
for this is the notifier routines have a couple of shortcomings.  One we
could't guarantee all future NMI handlers used NOTIFY_OK instead of
NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
handled in each routine (most only handle one, perf can handle more than one).
Third, I wanted to eventually display which nmi handlers are registered in
the system in /proc/interrupts to help see who is generating NMIs.

The patch below just implements the new infrastructure but doesn't wire it up
yet (that is the next patch).  Its design is based on doIRQ structs and the
atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
(as the notifier routines have soaked it) but it should be double checked in
case I copied the code wrong.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:56:52 +02:00
Gleb Natapov
144d31e6f1 perf, intel: Use GO/HO bits in perf-ctr
Intel does not have guest/host-only bit in perf counters like AMD
does.  To support GO/HO bits KVM needs to switch EVENTSELn values
(or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
configured to count only in a guest mode it stays disabled in a host,
but VMX is configured to switch it to enabled value during guest entry.

This patch adds GO/HO tracking to Intel perf code and provides interface
for KVM to get a list of MSRs that need to be switched on a guest entry.

Only cpus with architectural PMU (v1 or later) are supported with this
patch.  To my knowledge there is not p6 models with VMX but without
architectural PMU and p4 with VMX are rare and the interface is general
enough to support them if need arise.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 06:56:42 +02:00
Joerg Roedel
011af85784 perf, amd: Use GO/HO bits in perf-ctr
The AMD perf-counters support counting in guest or host-mode
only. Make use of that feature when user-space specified
guest/host-mode only counting.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1317816084-18026-3-git-send-email-gleb@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-06 13:00:31 +02:00
Linus Torvalds
a7f934d4f1 asm alternatives: remove incorrect alignment notes
On x86-64, they were just wasteful: with the explicitly added (now
unnecessary) padding, the size of the alternatives structure was 16
bytes, and an alignment of 8 bytes didn't hurt much.

However, it was still silly, since the natural size and alignment for
the structure is actually just 12 bytes, 4-byte aligned since commit
59e97e4d6f ("x86: Make alternative instruction pointers relative").
So removing the padding, and removing the extra alignment is just a good
idea.

On x86-32, the alignment of 4 bytes was correct, but was incorrectly
hardcoded as 8 bytes in <asm/alternative-asm.h>.  That header file had
used to be an x86-64 only header file, but various unification efforts
have made it be used for x86-32 too (ie the unification of rwlock and
rwsem).

That in turn caused x86-32 boot failures, because the extra alignment
would result in random zero-filled words in the altinstructions section,
causing oopses early at boot when doing alternative instruction
replacement.

So just remove all the alignment noise entirely.  It's wrong, and it's
unnecessary.  The section itself is already properly aligned by the
linker scripts, and all additions to the section had better be of the
proper 12-byte format, keeping it aligned.  So if the align directive
were to ever make a difference, that would be an indication of a serious
bug to begin with.

Reported-by: Werner Landgraf <w.landgraf@ru.r>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-15 13:28:33 -07:00
Linus Torvalds
768b56f598 Merge branch 'kvm-updates/3.1' of git://github.com/avikivity/kvm
* 'kvm-updates/3.1' of git://github.com/avikivity/kvm:
  KVM: Fix instruction size issue in pvclock scaling
2011-09-07 07:45:43 -07:00
Duncan Sands
3b217116ed KVM: Fix instruction size issue in pvclock scaling
Commit de2d1a524e ("KVM: Fix register corruption in pvclock_scale_delta")
introduced a mul instruction that may have only a memory operand; the
assembler therefore cannot select the correct size:

   pvclock.s:229: Error: no instruction mnemonic suffix given and no register
operands; can't size instruction

In this example the assembler is:

         #APP
         mul -48(%rbp) ; shrd $32, %rdx, %rax
         #NO_APP

A simple solution is to use mulq.

Signed-off-by: Duncan Sands <baldrick@free.fr>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-08-30 14:42:30 +03:00
NeilBrown
f5b9409973 All Arch: remove linkage for sys_nfsservctl system call
The nfsservctl system call is now gone, so we should remove all
linkage for it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-26 15:09:58 -07:00
Linus Torvalds
4762e252f4 Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/tracing: Fix tracing config option properly
  xen: Do not enable PV IPIs when vector callback not present
  xen/x86: replace order-based range checking of M2P table by linear one
  xen: xen-selfballoon.c needs more header files
2011-08-22 11:25:44 -07:00
Jan Beulich
ccbcdf7cf1 xen/x86: replace order-based range checking of M2P table by linear one
The order-based approach is not only less efficient (requiring a shift
and a compare, typical generated code looking like this

	mov	eax, [machine_to_phys_order]
	mov	ecx, eax
	shr	ebx, cl
	test	ebx, ebx
	jnz	...

whereas a direct check requires just a compare, like in

	cmp	ebx, [machine_to_phys_nr]
	jae	...

), but also slightly dangerous in the 32-on-64 case - the element
address calculation can wrap if the next power of two boundary is
sufficiently far away from the actual upper limit of the table, and
hence can result in user space addresses being accessed (with it being
unknown what may actually be mapped there).

Additionally, the elimination of the mistaken use of fls() here (should
have been __fls()) fixes a latent issue on x86-64 that would trigger
if the code was run on a system with memory extending beyond the 44-bit
boundary.

CC: stable@kernel.org
Signed-off-by: Jan Beulich <jbeulich@novell.com>
[v1: Based on Jeremy's feedback]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-08-17 10:26:48 -04:00
Linus Torvalds
06e727d2a5 Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip:
  x86-64: Rework vsyscall emulation and add vsyscall= parameter
  x86-64: Wire up getcpu syscall
  x86: Remove unnecessary compile flag tweaks for vsyscall code
  x86-64: Add vsyscall:emulate_vsyscall trace event
  x86-64: Add user_64bit_mode paravirt op
  x86-64, xen: Enable the vvar mapping
  x86-64: Work around gold bug 13023
  x86-64: Move the "user" vsyscall segment out of the data segment.
  x86-64: Pad vDSO to a page boundary
2011-08-12 20:46:24 -07:00
Andy Lutomirski
3ae36655b9 x86-64: Rework vsyscall emulation and add vsyscall= parameter
There are three choices:

vsyscall=native: Vsyscalls are native code that issues the
corresponding syscalls.

vsyscall=emulate (default): Vsyscalls are emulated by instruction
fault traps, tested in the bad_area path.  The actual contents of
the vsyscall page is the same as the vsyscall=native case except
that it's marked NX.  This way programs that make assumptions about
what the code in the page does will not be confused when they read
that code.

vsyscall=none: Trying to execute a vsyscall will segfault.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-08-10 19:26:46 -05:00
Andy Lutomirski
fce8dc0642 x86-64: Wire up getcpu syscall
getcpu is available as a vdso entry and an emulated vsyscall.
Programs that for some reason don't want to use the vdso should
still be able to call getcpu without relying on the slow emulated
vsyscall.  It costs almost nothing to expose it as a real syscall.

We also need this for the following patch in vsyscall=native mode.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/6b19f55bdb06a0c32c2fa6dba9b6f222e1fde999.1312988155.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-08-10 19:26:46 -05:00
Andy Lutomirski
318f5a2a67 x86-64: Add user_64bit_mode paravirt op
Three places in the kernel assume that the only long mode CPL 3
selector is __USER_CS.  This is not true on Xen -- Xen's sysretq
changes cs to the magic value 0xe033.

Two of the places are corner cases, but as of "x86-64: Improve
vsyscall emulation CS and RIP handling"
(c9712944b2), vsyscalls will segfault
if called with Xen's extra CS selector.  This causes a panic when
older init builds die.

It seems impossible to make Xen use __USER_CS reliably without
taking a performance hit on every system call, so this fixes the
tests instead with a new paravirt op.  It's a little ugly because
ptrace.h can't include paravirt.h.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-08-04 16:13:49 -07:00
H. Peter Anvin
17b0436077 Merge commit 'v3.0' into x86/vdso 2011-08-04 16:13:20 -07:00
Linus Torvalds
33f35f2a4e x86: don't include xen/xen.h in <asm/io.h> unless XEN is enabled
Dmitry Kasatkin reports:
  "kernel-devel package with kernel headers have no <include/xen>
   directory if XEN is disabled.  Modules which inclide asm/io.h won't
   compile.

   XEN related content is behind the CONFIG_XEN flag in the io.h.  And
   <xen/xen.h> should be also behind CONFIG_XEN flag."

So move the include of <xen/xen.h> down into the section that is
conditional on CONFIG_XEN.

Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03 22:00:38 -10:00
Linus Torvalds
35e51fe82d Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  cpuidle: stop depending on pm_idle
  x86 idle: move mwait_idle_with_hints() to where it is used
  cpuidle: replace xen access to x86 pm_idle and default_idle
  cpuidle: create bootparam "cpuidle.off=1"
  mrst_pmu: driver for Intel Moorestown Power Management Unit
2011-08-03 21:54:15 -10:00
Len Brown
4bfc8288bc x86 idle: move mwait_idle_with_hints() to where it is used
...and make it static

no functional change

cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 19:06:36 -04:00
Arun Sharma
7847777a45 atomic: cleanup asm-generic atomic*.h inclusion
After changing all consumers of atomics to include <linux/atomic.h>, we
ran into some compile time errors due to this dependency chain:

linux/atomic.h
  -> asm/atomic.h
    -> asm-generic/atomic-long.h

where atomic-long.h could use funcs defined later in linux/atomic.h
without a prototype.  This patches moves the code that includes
asm-generic/atomic*.h to linux/atomic.h.

Archs that need <asm-generic/atomic64.h> need to select
CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it
unconditionally).

Compile tested on i386 and x86_64 with allnoconfig.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Arun Sharma
f24219b4e9 atomic: move atomic_add_unless to generic code
This is in preparation for more generic atomic primitives based on
__atomic_add_unless.

Signed-off-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Akinobu Mita
148817ba09 asm-generic: add another generic ext2 atomic bitops
The majority of architectures implement ext2 atomic bitops as
test_and_{set,clear}_bit() without spinlock.

This adds this type of generic implementation in ext2-atomic-setbit.h and
use it wherever possible.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:46 -07:00
Mike Frysinger
0e9a6cb5e6 ptrace: unify show_regs() prototype
[ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ]
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:43 -07:00
Linus Torvalds
fa8f53ace4 Merge branch 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, olpc-xo15-sci: Enable EC wakeup capability
  x86, olpc: Fix dependency on POWER_SUPPLY
  x86, olpc: Add XO-1.5 SCI driver
  x86, olpc: Add XO-1 RTC driver
  x86, olpc-xo1-sci: Propagate power supply/battery events
  x86, olpc-xo1-sci: Add lid switch functionality
  x86, olpc-xo1-sci: Add GPE handler and ebook switch functionality
  x86, olpc: EC SCI wakeup mask functionality
  x86, olpc: Add XO-1 SCI driver and power button control
  x86, olpc: Add XO-1 suspend/resume support
  x86, olpc: Rename olpc-xo1 to olpc-xo1-pm
  x86, olpc: Move CS5536-related constants to cs5535.h
  x86, olpc: Add missing elements to device tree
2011-07-26 11:11:54 -07:00
Linus Torvalds
ff0c4ad2c3 Merge branch 'for-upstream' of git://openrisc.net/jonas/linux
* 'for-upstream' of git://openrisc.net/jonas/linux: (24 commits)
  OpenRISC: Add MAINTAINERS entry
  OpenRISC: Miscellaneous
  OpenRISC: Library routines
  OpenRISC: Headers
  OpenRISC: Traps
  OpenRISC: Module support
  OpenRISC: GPIO
  OpenRISC: Scheduling/Process management
  OpenRISC: Idle/Power management
  OpenRISC: System calls
  OpenRISC: IRQ
  OpenRISC: Timekeeping
  OpenRISC: DMA
  OpenRISC: PTrace
  OpenRISC: Build infrastructure
  OpenRISC: Signal handling
  OpenRISC: Memory management
  OpenRISC: Device tree
  OpenRISC: Boot code
  iomap: make IOPORT/PCI mapping functions conditional
  ...
2011-07-24 09:55:18 -07:00
Linus Torvalds
5fabc487c9 Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  KVM: IOMMU: Disable device assignment without interrupt remapping
  KVM: MMU: trace mmio page fault
  KVM: MMU: mmio page fault support
  KVM: MMU: reorganize struct kvm_shadow_walk_iterator
  KVM: MMU: lockless walking shadow page table
  KVM: MMU: do not need atomicly to set/clear spte
  KVM: MMU: introduce the rules to modify shadow page table
  KVM: MMU: abstract some functions to handle fault pfn
  KVM: MMU: filter out the mmio pfn from the fault pfn
  KVM: MMU: remove bypass_guest_pf
  KVM: MMU: split kvm_mmu_free_page
  KVM: MMU: count used shadow pages on prepareing path
  KVM: MMU: rename 'pt_write' to 'emulate'
  KVM: MMU: cleanup for FNAME(fetch)
  KVM: MMU: optimize to handle dirty bit
  KVM: MMU: cache mmio info on page fault path
  KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code
  KVM: MMU: do not update slot bitmap if spte is nonpresent
  KVM: MMU: fix walking shadow page table
  KVM guest: KVM Steal time registration
  ...
2011-07-24 09:07:03 -07:00
Linus Torvalds
c61264f98c Merge branch 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen/trace: use class for multicall trace
  xen/trace: convert mmu events to use DECLARE_EVENT_CLASS()/DEFINE_EVENT()
  xen/multicall: move *idx fields to start of mc_buffer
  xen/multicall: special-case singleton hypercalls
  xen/multicalls: add unlikely around slowpath in __xen_mc_entry()
  xen/multicalls: disable MC_DEBUG
  xen/mmu: tune pgtable alloc/release
  xen/mmu: use extend_args for more mmuext updates
  xen/trace: add tlb flush tracepoints
  xen/trace: add segment desc tracing
  xen/trace: add xen_pgd_(un)pin tracepoints
  xen/trace: add ptpage alloc/release tracepoints
  xen/trace: add mmu tracepoints
  xen/trace: add multicall tracing
  xen/trace: set up tracepoint skeleton
  xen/multicalls: remove debugfs stats
  trace/xen: add skeleton for Xen trace events
2011-07-24 09:06:47 -07:00
Xiao Guangrong
c2a2ac2b56 KVM: MMU: lockless walking shadow page table
Use rcu to protect shadow pages table to be freed, so we can safely walk it,
it should run fastly and is needed by mmio page fault

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24 11:50:38 +03:00
Xiao Guangrong
c37079586f KVM: MMU: remove bypass_guest_pf
The idea is from Avi:
| Maybe it's time to kill off bypass_guest_pf=1.  It's not as effective as
| it used to be, since unsync pages always use shadow_trap_nonpresent_pte,
| and since we convert between the two nonpresent_ptes during sync and unsync.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24 11:50:33 +03:00
Xiao Guangrong
bebb106a5a KVM: MMU: cache mmio info on page fault path
If the page fault is caused by mmio, we can cache the mmio info, later, we do
not need to walk guest page table and quickly know it is a mmio fault while we
emulate the mmio instruction

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-24 11:50:26 +03:00
Glauber Costa
d910f5c106 KVM guest: KVM Steal time registration
This patch implements the kvm bits of the steal time infrastructure.
The most important part of it, is the steal time clock. It is an
continuous clock that shows the accumulated amount of steal time
since vcpu creation. It is supposed to survive cpu offlining/onlining.

[marcelo: fix build with CONFIG_KVM_GUEST=n]

Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric B Munson <emunson@mgebm.net>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Avi Kivity <avi@redhat.com>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-24 11:49:36 +03:00
Linus Torvalds
148a7b1725 Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Add support for cmpxchg_double
2011-07-23 10:35:43 -07:00
Linus Torvalds
9d0715630e Merge branch 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-clocksource-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  clocksource: apb: Share APB timer code with other platforms
2011-07-23 10:34:47 -07:00
Linus Torvalds
8e204874db Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-64, vdso: Do not allocate memory for the vDSO
  clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option
  x86, vdso: Drop now wrong comment
  Document the vDSO and add a reference parser
  ia64: Replace clocksource.fsys_mmio with generic arch data
  x86-64: Move vread_tsc and vread_hpet into the vDSO
  clocksource: Replace vread with generic arch data
  x86-64: Add --no-undefined to vDSO build
  x86-64: Allow alternative patching in the vDSO
  x86: Make alternative instruction pointers relative
  x86-64: Improve vsyscall emulation CS and RIP handling
  x86-64: Emulate legacy vsyscalls
  x86-64: Fill unused parts of the vsyscall page with 0xcc
  x86-64: Remove vsyscall number 3 (venosys)
  x86-64: Map the HPET NX
  x86-64: Remove kernel.vsyscall64 sysctl
  x86-64: Give vvars their own page
  x86-64: Document some of entry_64.S
  x86-64: Fix alignment of jiffies variable
2011-07-22 17:05:15 -07:00
Linus Torvalds
3e0b8df79d Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Correct UV2 BAU destination timeout
  x86, UV: Correct failed topology memory leak
  x86, UV: Remove cpumask_t from the stack
  x86, UV: Rename hubmask to pnmask
  x86, UV: Correct reset_with_ipi()
  x86, UV: Allow for non-consecutive sockets
  x86, UV: Inline header file functions
  x86, UV: Fix smp_processor_id() use in a preemptable region
  x66, UV: Enable 64-bit ACPI MFCG support for SGI UV2 platform
  x86, UV: Clean up uv_mmrs.h
2011-07-22 17:04:55 -07:00
Linus Torvalds
9e39264ed4 Merge branch 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, numa: Implement pfn -> nid mapping granularity check
  x86, mm: s/PAGES_PER_ELEMENT/PAGES_PER_SECTION/
2011-07-22 17:04:18 -07:00
Linus Torvalds
7c6582b28a Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mce: Use mce_sysdev_ prefix to group functions
  x86, mce: Use mce_chrdev_ prefix to group functions
  x86, mce: Cleanup mce_read()
  x86, mce: Cleanup mce_create()/remove_device()
  x86, mce: Check the result of ancient_init()
  x86, mce: Introduce mce_gather_info()
  x86, mce: Replace MCM_ with MCI_MISC_
  x86, mce: Replace MCE_SELF_VECTOR by irq_work
  x86, mce, severity: Clean up trivial coding style problems
  x86, mce, severity: Cleanup severity table
  x86, mce, severity: Make formatting a bit more readable
  x86, mce, severity: Fix two severities table signatures
2011-07-22 17:03:40 -07:00
Linus Torvalds
35b004cce1 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, intel, power: Correct the MSR_IA32_ENERGY_PERF_BIAS message
  x86, msr: Fix typo in ENERGY_PERF_BIAS_POWERSAVE
  x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS
2011-07-22 17:02:54 -07:00
Linus Torvalds
eb47418dc5 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix write lock scalability 64-bit issue
  x86: Unify rwsem assembly implementation
  x86: Unify rwlock assembly implementation
  x86, asm: Fix binutils 2.16 issue with __USER32_CS
  x86, asm: Cleanup thunk_64.S
  x86, asm: Flip RESTORE_ARGS arguments logic
  x86, asm: Flip SAVE_ARGS arguments logic
  x86, asm: Thin down SAVE/RESTORE_* asm macros
2011-07-22 17:02:24 -07:00
Linus Torvalds
52de84f3f3 Merge branch 'timers-rtc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-rtc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Serialize EFI time accesses on rtc_lock
  x86: Serialize SMP bootup CMOS accesses on rtc_lock
  rtc: stmp3xxx: Remove UIE handlers
  rtc: stmp3xxx: Get rid of mach-specific accessors
  rtc: stmp3xxx: Initialize drvdata before registering device
  rtc: stmp3xxx: Port stmp-functions to mxs-equivalents
  rtc: stmp3xxx: Restore register definitions
  rtc: vt8500: Use define instead of hardcoded value for status bit
2011-07-22 16:52:39 -07:00
Linus Torvalds
a99a7d1436 Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mips: Fix i8253 clockevent fallout
  i8253: Cleanup outb/inb magic
  arm: Footbridge: Use common i8253 clockevent
  mips: Use common i8253 clockevent
  x86: Use common i8253 clockevent
  i8253: Create common clockevent implementation
  i8253: Export i8253_lock unconditionally
  pcpskr: MIPS: Make config dependencies finer grained
  pcspkr: Cleanup Kconfig dependencies
  i8253: Move remaining content and delete asm/i8253.h
  i8253: Consolidate definitions of PIT_LATCH
  x86: i8253: Consolidate definitions of global_clock_event
  i8253: Alpha, PowerPC: Remove unused asm/8253pit.h
  alpha: i8253: Cleanup remaining users of i8253pit.h
  i8253: Remove I8253_LOCK config
  i8253: Make pcsp sound driver use the shared i8253_lock
  i8253: Make pcspkr input driver use the shared i8253_lock
  i8253: Consolidate all kernel definitions of i8253_lock
  i8253: Unify all kernel declarations of i8253_lock
  i8253: Create linux/i8253.h and use it in all 8253 related files
2011-07-22 16:51:56 -07:00
Linus Torvalds
4d4abdcb1d Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits)
  perf: Remove the nmi parameter from the oprofile_perf backend
  x86, perf: Make copy_from_user_nmi() a library function
  perf: Remove perf_event_attr::type check
  x86, perf: P4 PMU - Fix typos in comments and style cleanup
  perf tools: Make test use the preset debugfs path
  perf tools: Add automated tests for events parsing
  perf tools: De-opt the parse_events function
  perf script: Fix display of IP address for non-callchain path
  perf tools: Fix endian conversion reading event attr from file header
  perf tools: Add missing 'node' alias to the hw_cache[] array
  perf probe: Support adding probes on offline kernel modules
  perf probe: Add probed module in front of function
  perf probe: Introduce debuginfo to encapsulate dwarf information
  perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
  perf probe: Remove redundant dwarf functions
  perf probe: Move strtailcmp to string.c
  perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
  tracing/kprobe: Update symbol reference when loading module
  tracing/kprobes: Support module init function probing
  kprobes: Return -ENOENT if probe point doesn't exist
  ...
2011-07-22 16:44:39 -07:00
Linus Torvalds
6d16d6d9bb Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n
  iommu/amd: Don't use MSI address range for DMA addresses
  iommu/amd: Move missing parts to drivers/iommu
  iommu: Move iommu Kconfig entries to submenu
  x86/ia64: intel-iommu: move to drivers/iommu/
  x86: amd_iommu: move to drivers/iommu/
  msm: iommu: move to drivers/iommu/
  drivers: iommu: move to a dedicated folder
  x86/amd-iommu: Store device alias as dev_data pointer
  x86/amd-iommu: Search for existind dev_data before allocting a new one
  x86/amd-iommu: Allow dev_data->alias to be NULL
  x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions
  x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines
  x86/amd-iommu: Store ATS state in dev_data
  x86/amd-iommu: Store devid in dev_data
  x86/amd-iommu: Introduce global dev_data_list
  x86/amd-iommu: Remove redundant device_flush_dte() calls
  iommu-api: Add missing header file

Fix up trivial conflicts (independent additions close to each other) in
drivers/Makefile and include/linux/pci.h
2011-07-22 16:39:42 -07:00
Linus Torvalds
17413f5acd Merge branch 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: Fixup __this_cpu_xchg* operations
2011-07-22 15:07:35 -07:00
Linus Torvalds
acb41c0f92 Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  pci/of: Consolidate pci_bus_to_OF_node()
  pci/of: Consolidate pci_device_to_OF_node()
  x86/devicetree: Use generic PCI <-> OF matching
  microblaze/pci: Move the remains of pci_32.c to pci-common.c
  microblaze/pci: Remove powermac originated cruft
  pci/of: Match PCI devices to OF nodes dynamically
2011-07-22 14:54:02 -07:00
Linus Torvalds
a7e1aabb28 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: Fix in/out emulation
  lguest: Fix translation count about wikipedia's cpuid page
  lguest: Fix three simple typos in comments
  lguest: update comments
  lguest: Simplify device initialization.
  lguest: don't rewrite vmcall instructions
  lguest: remove remaining vmcall
  lguest: use a special 1:1 linear pagetable mode until first switch.
  lguest: Do not exit on non-fatal errors
2011-07-22 13:45:50 -07:00
Jonas Bonn
a4e05276a1 asm-generic: move archictures to common delay.h
This patch moves the in-tree architectures that were using the 'generic'
delay.h over to using the header file in asm-generic.

This is not done using the generic-y mechanism as none of these arch's
have started using that mechanism yet.  This is a trivial change to make
later when the arch begins using generic-y.

Note the subtle change to the avr32 and SH architectures where the argument
to __const_udelay was previously using the rounded down constant value
instead of the rounded up value.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
2011-07-22 18:46:24 +02:00
Rusty Russell
9f54288def lguest: update comments
Also removes a long-unused #define and an extraneous semicolon.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-22 14:39:50 +09:30