Pull x86 cleanups from Ingo Molnar:
"Misc cleanups"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, cpu, amd: Fix a shadowed variable situation
um, x86: Fix vDSO build
x86: Delete non-required instances of include <linux/init.h>
x86, realmode: Pointer walk cleanups, pull out invariant use of __pa()
x86/traps: Clean up error exception handler definitions
Pull x86/build changes from Ingo Molnar:
"Misc smaller improvements"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, boot: Move intcall() to the .inittext section
x86, boot: Use .code16 instead of .code16gcc
x86, sparse: Do not force removal of __user when calling copy_to/from_user_nocheck()
Pull x86/asm changes from Ingo Molnar:
"Misc optimizations"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Slightly tweak the access_ok() C variant for better code
x86: Replace assembly access_ok() with a C variant
x86-64, copy_user: Use leal to produce 32-bit results
x86-64, copy_user: Remove zero byte check before copy user buffer.
We don't support LWP yet, don't give the impression that we do:
represent the LWP state as opaque 128 bytes, the way Linux sees it
currently.
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-ecarmjtfKpanpAapfck6dj6g@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler changes from Ingo Molnar:
- Add the initial implementation of SCHED_DEADLINE support: a real-time
scheduling policy where tasks that meet their deadlines and
periodically execute their instances in less than their runtime quota
see real-time scheduling and won't miss any of their deadlines.
Tasks that go over their quota get delayed (Available to privileged
users for now)
- Clean up and fix preempt_enable_no_resched() abuse all around the
tree
- Do sched_clock() performance optimizations on x86 and elsewhere
- Fix and improve auto-NUMA balancing
- Fix and clean up the idle loop
- Apply various cleanups and fixes
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
sched: Fix __sched_setscheduler() nice test
sched: Move SCHED_RESET_ON_FORK into attr::sched_flags
sched: Fix up attr::sched_priority warning
sched: Fix up scheduler syscall LTP fails
sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls
sched/core: Fix htmldocs warnings
sched/deadline: No need to check p if dl_se is valid
sched/deadline: Remove unused variables
sched/deadline: Fix sparse static warnings
m68k: Fix build warning in mac_via.h
sched, thermal: Clean up preempt_enable_no_resched() abuse
sched, net: Fixup busy_loop_us_clock()
sched, net: Clean up preempt_enable_no_resched() abuse
sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
sched/preempt, locking: Rework local_bh_{dis,en}able()
sched/clock, x86: Avoid a runtime condition in native_sched_clock()
sched/clock: Fix up clear_sched_clock_stable()
sched/clock, x86: Use a static_key for sched_clock_stable
sched/clock: Remove local_irq_disable() from the clocks
sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
...
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Add Intel RAPL energy counter support (Stephane Eranian)
- Clean up uprobes (Oleg Nesterov)
- Optimize ring-buffer writes (Peter Zijlstra)
Tooling side changes, user visible:
- 'perf diff':
- Add column colouring improvements (Ramkumar Ramachandra)
- 'perf kvm':
- Add guest related improvements, including allowing to specify a
directory with guest specific /proc information (Dongsheng Yang)
- Add shell completion support (Ramkumar Ramachandra)
- Add '-v' option (Dongsheng Yang)
- Support --guestmount (Dongsheng Yang)
- 'perf probe':
- Support showing source code, asking for variables to be collected
at probe time and other 'perf probe' operations that use DWARF
information.
This supports only binaries with debugging information at this
time, detached debuginfo (aka debuginfo packages) support should
come in later patches (Masami Hiramatsu)
- 'perf record':
- Rename --no-delay option to --no-buffering, better reflecting its
purpose and freeing up '--delay' to take the place of
'--initial-delay', so that 'record' and 'stat' are consistent
(Arnaldo Carvalho de Melo)
- Default the -t/--thread option to no inheritance (Adrian Hunter)
- Make per-cpu mmaps the default (Adrian Hunter)
- 'perf report':
- Improve callchain processing performance (Frederic Weisbecker)
- Retain bfd reference to lookup source line numbers, greatly
optimizing, among other use cases, 'perf report -s srcline'
(Adrian Hunter)
- Improve callchain processing performance even more (Namhyung Kim)
- Add a perf.data file header window in the 'perf report' TUI,
associated with the 'i' hotkey, providing a counterpart to the
--header option in the stdio UI (Namhyung Kim)
- 'perf script':
- Add an option in 'perf script' to print the source line number
(Adrian Hunter)
- Add --header/--header-only options to 'script' and 'report', the
default is not tho show the header info, but as this has been the
default for some time, leave a single line explaining how to
obtain that information (Jiri Olsa)
- Add options to show comm, fork, exit and mmap PERF_RECORD_ events
(Namhyung Kim)
- Print callchains and symbols if they exist (David Ahern)
- 'perf timechart'
- Add backtrace support to CPU info
- Print pid along the name
- Add support for CPU topology
- Add new option --highlight'ing threads, be it by name or, if a
numeric value is provided, that run more than given duration
(Stanislav Fomichev)
- 'perf top':
- Make 'perf top -g' refer to callchains, for consistency with
other tools (David Ahern)
- 'perf trace':
- Handle old kernels where the "raw_syscalls" tracepoints were
called plain "syscalls" (David Ahern)
- Remove thread summary coloring, by Pekka Enberg.
- Honour -m option in 'trace', the tool was offering the option to
set the mmap size, but wasn't using it when doing the actual mmap
on the events file descriptors (Jiri Olsa)
- generic:
- Backport libtraceevent plugin support (trace-cmd repository, with
plugins for jbd2, hrtimer, kmem, kvm, mac80211, sched_switch,
function, xen, scsi, cfg80211 (Jiri Olsa)
- Print session information only if --stdio is given (Namhyung Kim)
Tooling side changes, developer visible (plumbing):
- Improve 'perf probe' exit path, release resources (Masami
Hiramatsu)
- Improve libtraceevent plugins exit path, allowing the registering
of an unregister handler to be called at exit time (Namhyung Kim)
- Add an alias to the build test makefile (make -C tools/perf
build-test) (Namhyung Kim)
- Get rid of die() and friends (good riddance!) in libtraceevent
(Namhyung Kim)
- Fix cross build problems related to pkgconfig and CROSS_COMPILE not
being propagated to the feature tests, leading to features being
tested in the host and then being enabled on the target (Mark
Rutland)
- Improve forked workload error reporting by sending the errno in the
signal data queueing integer field, using sigqueue and by doing the
signal setup in the evlist methods, removing open coded equivalents
in various tools (Arnaldo Carvalho de Melo)
- Do more auto exit cleanup chores in the 'evlist' destructor, so
that the tools don't have to all do that sequence (Arnaldo Carvalho
de Melo)
- Pack 'struct perf_session_env' and 'struct trace' (Arnaldo Carvalho
de Melo)
- Add test for building detached source tarballs (Arnaldo Carvalho de
Melo)
- Move some header files (tools/perf/ to tools/include/ to make them
available to other tools/ dwelling codebases (Namhyung Kim)
- Move logic to warn about kptr_restrict'ed kernels to separate
function in 'report' (Arnaldo Carvalho de Melo)
- Move hist browser selection code to separate function (Arnaldo
Carvalho de Melo)
- Move histogram entries collapsing to separate function (Arnaldo
Carvalho de Melo)
- Introduce evlist__for_each() & friends (Arnaldo Carvalho de Melo)
- Automate setup of FEATURE_CHECK_(C|LD)FLAGS-all variables (Jiri
Olsa)
- Move arch setup into seprate Makefile (Jiri Olsa)
- Make libtraceevent install target quieter (Jiri Olsa)
- Make tests/make output more compact (Jiri Olsa)
- Ignore generated files in feature-checks (Chunwei Chen)
- Introduce pevent_filter_strerror() in libtraceevent, similar in
purpose to libc's strerror() function (Namhyung Kim)
- Use perf_data_file methods to write output file in 'record' and
'inject' (Jiri Olsa)
- Use pr_*() functions where applicable in 'report' (Namhyumg Kim)
- Add 'machine' 'addr_location' struct to have full picture (machine,
thread, map, symbol, addr) for a (partially) resolved address,
reducing function signatures (Arnaldo Carvalho de Melo)
- Reduce code duplication in the histogram entry creation/insertion
(Arnaldo Carvalho de Melo)
- Auto allocate annotation histogram data structures (Arnaldo
Carvalho de Melo)
- No need to test against NULL before calling free, also set freed
memory in struct pointers to NULL, to help fixing use after free
bugs (Arnaldo Carvalho de Melo)
- Rename some struct DSO binary_type related members and methods, to
clarify its purpose and need for differentiation (symtab_type, ie
one is about the files .text, CFI, etc, i.e. its binary contents,
and the other is about where the symbol table came from (Arnaldo
Carvalho de Melo)
- Convert to new topic libraries, starting with an API one (sysfs,
debugfs, etc), renaming liblk in the process (Borislav Petkov)
- Get rid of some more panic() like error handling in libtraceevent.
(Namhyung Kim)
- Get rid of panic() like calls in libtraceevent (Namyung Kim)
- Start carving out symbol parsing routines (perf, just moving
routines to topic files in tools/lib/symbol/, tools that want to
use it need to integrate it directly, ie no
tools/lib/symbol/Makefile is provided (Arnaldo Carvalho de Melo)
- Assorted refactoring patches, moving code around and adding utility
evlist methods that will be used in the IPT patchset (Adrian
Hunter)
- Assorted mmap_pages handling fixes (Adrian Hunter)
- Several man pages typo fixes (Dongsheng Yang)
- Get rid of several die() calls in libtraceevent (Namhyung Kim)
- Use basename() in a more robust way, to avoid problems related to
different system library implementations for that function
(Stephane Eranian)
- Remove open coded management of short_name_allocated member (Adrian
Hunter)
- Several cleanups in the "dso" methods, constifying some parameters
and renaming some fields to clarify its purpose (Arnaldo Carvalho
de Melo)
- Add per-feature check flags, fixing libunwind related build
problems on some architectures (Jean Pihet)
- Do not disable source line lookup just because of one failure.
(Adrian Hunter)
- Several 'perf kvm' man page corrections (Dongsheng Yang)
- Correct the message in feature-libnuma checking, swowing the right
devel package names for various distros (Dongsheng Yang)
- Polish 'readn()' function and introduce its counterpart,
'writen()' (Jiri Olsa)
- Start moving timechart state from global variables to a 'perf_tool'
derived 'timechart' struct (Arnaldo Carvalho de Melo)
... and lots of fixes and improvements I forgot to list"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits)
perf tools: Remove unnecessary callchain cursor state restore on unmatch
perf callchain: Spare double comparison of callchain first entry
perf tools: Do proper comm override error handling
perf symbols: Export elf_section_by_name and reuse
perf probe: Release all dynamically allocated parameters
perf probe: Release allocated probe_trace_event if failed
perf tools: Add 'build-test' make target
tools lib traceevent: Unregister handler when xen plugin is unloaded
tools lib traceevent: Unregister handler when scsi plugin is unloaded
tools lib traceevent: Unregister handler when jbd2 plugin is is unloaded
tools lib traceevent: Unregister handler when cfg80211 plugin is unloaded
tools lib traceevent: Unregister handler when mac80211 plugin is unloaded
tools lib traceevent: Unregister handler when sched_switch plugin is unloaded
tools lib traceevent: Unregister handler when kvm plugin is unloaded
tools lib traceevent: Unregister handler when kmem plugin is unloaded
tools lib traceevent: Unregister handler when hrtimer plugin is unloaded
tools lib traceevent: Unregister handler when function plugin is unloaded
tools lib traceevent: Add pevent_unregister_print_function()
tools lib traceevent: Add pevent_unregister_event_handler()
tools lib traceevent: fix pointer-integer size mismatch
...
Pull IRQ changes from Ingo Molnar:
"The only change in this cycle is a CPU hotplug related spurious
warning fix"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()
x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs
Pull strong stackprotector support from Ingo Molnar:
"This tree adds a CONFIG_CC_STACKPROTECTOR_STRONG=y, a new, stronger
stack canary checking method supported by the newest GCC versions (4.9
and later).
Here's the 'intensity comparison' between the various protection
modes:
- defconfig
11430641 kernel text size
36110 function bodies
- defconfig + CONFIG_CC_STACKPROTECTOR_REGULAR
11468490 kernel text size (+0.33%)
1015 of 36110 functions are stack-protected (2.81%)
- defconfig + CONFIG_CC_STACKPROTECTOR_STRONG via this patch
11692790 kernel text size (+2.24%)
7401 of 36110 functions are stack-protected (20.5%)
the strong model comes with non-trivial costs, which is why we
preserved the 'regular' and 'none' models as well"
* 'core-stackprotector-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG
stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures
Pull core locking changes from Ingo Molnar:
- futex performance increases: larger hashes, smarter wakeups
- mutex debugging improvements
- lots of SMP ordering documentation updates
- introduce the smp_load_acquire(), smp_store_release() primitives.
(There are WIP patches that make use of them - not yet merged)
- lockdep micro-optimizations
- lockdep improvement: better cover IRQ contexts
- liblockdep at last. We'll continue to monitor how useful this is
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
futexes: Fix futex_hashsize initialization
arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
futexes: Avoid taking the hb->lock if there's nothing to wake up
futexes: Document multiprocessor ordering guarantees
futexes: Increase hash table size for better performance
futexes: Clean up various details
arch: Introduce smp_load_acquire(), smp_store_release()
arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
mutexes: Give more informative mutex warning in the !lock->owner case
powerpc: Full barrier for smp_mb__after_unlock_lock()
rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
Documentation/memory-barriers.txt: Document ACCESS_ONCE()
Documentation/memory-barriers.txt: Prohibit speculative writes
Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
...
Pull core debug changes from Ingo Molnar:
"Currently there are two methods to set the panic_timeout: via
'panic=X' boot commandline option, or via /proc/sys/kernel/panic.
This tree adds a third panic_timeout configuration method:
configuration via Kconfig, via CONFIG_PANIC_TIMEOUT=X - useful to
distros that generally want their kernel defaults to come with the
.config.
CONFIG_PANIC_TIMEOUT defaults to 0, which was the previous default
value of panic_timeout.
Doing that unearthed a few arch trickeries regarding arch-special
panic_timeout values and related complications - hopefully all
resolved to the satisfaction of everyone"
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
powerpc: Clean up panic_timeout usage
MIPS: Remove panic_timeout settings
panic: Make panic_timeout configurable
Rip it all out so people do not waste time making updates
to broken/dead code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJS3Vq5AAoJEKurIx+X31iBVw8P/Rj2B3kJb+qRy4U6dtAPMsUL
Rk3jrhuRI88Gho+/7FdDk46L2WFtl+GGsn0vMll1fufeCIvz5mX+FZSqv1JKQQ1T
sXBRXs5MINanHppL/i9bnTzITehu+YIdoyE4zGo9/noczPIa0/wUNtrPrCFpxGcZ
vwfATwBsAEKC2d9ZoDRiYPtsS8jIafVWI+lnMF122TTFvQLnEpkg1e0ieqdYd+3F
w+kusP+CuXwLJa6v760RoPuku9QHKQmqe6AaDbCL4AY0Q9pNMLlnWTyC7Q5on6LK
8aynuMDzBtoNvYBHtd9PGpiK3OA1vF6BbOnFm+iWddI0XTQImJE9L19gLYcuai08
gUZw5HgTwnxVt5r4Uuk9AVF5wbKJ1pXZE6OeDhD0T03o3cxAdLWXkYWK33tV+qLt
J71F06sf/BlokBompwwuzQZsjFbDCC+YOJQYKRtqBN4vY8YD4wmyUBAodGtMsgzK
n3ZiE8X02ro20InTiBcTmDHxWEQPEvrcwFkz2jsacpGVnO6oKrgrxGHo6/12hXnP
2u6xxl3XqwZ7HF0i+e+IJ612geA2atxuoW0/H2qVXJuCo9peyFWTD4wqnCD6nQk5
WbxlGgvokgyTG5YdWy6UtZNroTpP7FWFVYqXawbZnw8lrNjFd+sZ9gM6BvBGrWy3
GD0Zvo1k0Pd75WlA1BpQ
=YzMH
-----END PGP SIGNATURE-----
Merge tag 'please-pull-rm_xen' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 Xen removal from Tony Luck:
"Nobody has been maintaining xen in ia64 for a long time. Rip it all
out so people do not waste time making updates to broken/dead code"
* tag 'please-pull-rm_xen' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
ia64/xen: Remove Xen support for ia64
- A few cleanups and minor bug fixes.
- Kill SMP single function call IPI.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJS3T7AAAoJEKHZs+irPybfT9EP/ifU15XsWYS4CBfHekanhSB8
Ghp6IL7dXH3BecQ/1ynlGDsw3bPxOhVJ3t4YQXkKx1PcGcef9UefJJU2xNEXSeTE
ZljAmiO0cbNTzS96dNQC/2JzkFqEU5ObTa708iOjay7LdqxFItXtanLZ5rFNDW3U
uHuSwX/C+dl1vYuA7kSglCy6r+rFc8CbsFPKShhI6RiIooqPoIoOPNTOlTM2o5AJ
YExgVi74EiOkrwLrtJroD5mlNevBFTZFt0eUYLd9S+k7M5BONCynKqqO70UPBCxC
K/uf6AHHrrG+4b8AOM9Ko7Q3P66V8w4+8g/WWeb/Ghv/SBymcs3zpmdSsF5pyzJ7
szKPHd1vXm+V5tsc1VWs7DCp4droM56Hn+iyzSd/NeW4jVRAsoUPRIZqkxdtzeds
R96IbTz0LpoN96ZV0IXzul9TwJiBg/jsO7OSlGsQZVq8WX5P3q8UsbHF/g8qv4pt
qOqbo9ZJvVT7+aKNG+kDpjMMQCqUUrcpf/vEWz/O6I9hY5zkIFYaNoHTx31wz4K0
Deiig+3VzpCO9UpN3KKhYg709xF0UTTS4SNL92lsvD/mFxCckJjJR1bDGJS5NOjF
apof95fJxVwgjNh/mUhIjQiYpgic7W9RMED3lYUNGeNc6R7sELRDzY+gtTES72qr
21ABm9be2zIpCCAfnKvx
=rbJ3
-----END PGP SIGNATURE-----
Merge tag 'metag-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull Metag architecture changes from James Hogan:
- A few cleanups and minor bug fixes.
- Kill SMP single function call IPI.
* tag 'metag-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
metag/smp: Make boot_secondary() static
metag: topology: export 'cpu_core_map'
smp, metag: kill SMP single function call interrupt
metag: smp: don't set irq regs in do_IPI()
metag: dma: remove dead code in dma_alloc_init()
Pull m68k updates from Geert Uytterhoeven:
- Zorro bus cleanups and UAPI revival
- Bootinfo cleanups and UAPI revival
- Kexec support
- Memory size reductions and bug fixes for multi-platform kernels
- Polled interrupt support for Atari EtherNAT, EtherNEC and NetUSBee
- Machine-specific random_get_entropy()
- Defconfig updates and cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (46 commits)
m68k/mac: Make SCC reset work more reliably
m68k/irq - Use polled IRQ flag for MFP timer cascaded interrupts
m68k: Update defconfigs for v3.13-rc1
m68k/defconfig: Enable EARLY_PRINTK
m68k/mm: kmap spelling/grammar fixes
m68k: Convert arch/m68k/kernel/traps.c to pr_*()
m68k: Convert arch/m68k/mm/fault.c to pr_*()
m68k/mm: Check for mm != NULL in do_page_fault() debug code
m68k/defconfig: Disable /sbin/hotplug fork-bomb by default
m68k/atari: Hide RTC_PORT() macro from rtc-cmos
m68k/amiga,atari: Fix specifying multiple debug= parameters
m68k/defconfig: Use ext4 for ext2/ext3 file systems
m68k: Add support to export bootinfo in procfs
m68k: Add kexec support
m68k/mac: Mark Mac IIsi ADB driver BROKEN
m68k/amiga: Provide mach_random_get_entropy()
m68k: Add infrastructure for machine-specific random_get_entropy()
m68k/atari: Call paging_init() before nf_init()
m68k: Remove superfluous inclusions of <asm/bootinfo.h>
m68k/UAPI: Use proper types (endianness/size) in <asm/bootinfo*.h>
...
Pull s390 updates from Martin Schwidefsky:
"The bulk of the s390 updates for v3.14.
New features are the perf support for the CPU-Measurement Sample
Facility and the EP11 support for the crypto cards. And the normal
cleanups and bug-fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (44 commits)
s390/cpum_sf: fix printk format warnings
s390: Fix misspellings using 'codespell' tool
s390/qdio: bridgeport support - CHSC part
s390: delete new instances of __cpuinit usage
s390/compat: fix PSW32_USER_BITS definition
s390/zcrypt: add support for EP11 coprocessor cards
s390/mm: optimize randomize_et_dyn for !PF_RANDOMIZE
s390: use IS_ENABLED to check if a CONFIG is set to y or m
s390/cio: use device_lock to synchronize calls to the ccwgroup driver
s390/cio: use device_lock to synchronize calls to the ccw driver
s390/cio: fix unlocked access of online member
s390/cpum_sf: Add flag to process full SDBs only
s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function
s390/cpum_sf: Filter perf events based event->attr.exclude_* settings
s390/cpum_sf: Detect KVM guest samples
s390/cpum_sf: Add helper to read TOD from trailer entries
s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks
s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur
s390/pci: reenable per default
s390/pci/dma: fix accounting of allocated_pages
...
Make KVM_MMU_AUDIT kconfig help text readable and collapse
two spaces between words down to one space.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Seems that commit 210b160701
(KVM: s390: Removed SIE_INTERCEPT_UCONTROL) lost a hunk when we
reworked our patch queue to rework the async_fp code. We now
ignore faults on the sie instruction (guest accesses non-existing
memory) instead of sending a fault into the guest. This leads to
hang situations with the old virtio transport that checks for
descriptor memory after guest memory. Instead of bailing out this
code now goes wild...
Lets re-add the check.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull perf fixes from Ingo Molnar:
- an s2ram related fix on AMD systems
- a perf fault handling bug that is relatively old but which has become
much easier to trigger in v3.13 after commit e00b12e64b ("perf/x86:
Further optimize copy_from_user_nmi()")
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
x86, mm, perf: Allow recursive faults from interrupts
For SCC initialization we cannot assume that the control register is in
the correct state to accept a register pointer. So first read from the
control register in order to "sync" up.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Pull networking fixes from David Miller:
1) The value choosen for the new SO_MAX_PACING_RATE socket option on
parisc was very poorly choosen, let's fix it while we still can.
From Eric Dumazet.
2) Our generic reciprocal divide was found to handle some edge cases
incorrectly, part of this is encoded into the BPF as deep as the JIT
engines themselves. Just use a real divide throughout for now.
From Eric Dumazet.
3) Because the initial lookup is lockless, the TCP metrics engine can
end up creating two entries for the same lookup key. Fix this by
doing a second lookup under the lock before we actually create the
new entry. From Christoph Paasch.
4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.
5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
From Dan Carpenter.
6) Netlink socket dumping uses the wrong socket state for timewait
sockets. Fix from Neal Cardwell.
7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
Engelmayer.
8) Multicast forwarding in ipv4 can overflow the per-rule reference
counts, causing all multicast traffic to cease. Fix from Hannes
Frederic Sowa.
9) via-rhine needs to stop all TX queues when it resets the device,
from Richard Weinberger.
10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions. From
Gerald Schaefer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
parisc: fix SO_MAX_PACING_RATE typo
ipv6: simplify detection of first operational link-local address on interface
tcp: metrics: Avoid duplicate entries with the same destination-IP
net: rds: fix per-cpu helper usage
e1000e: Fix compilation warning when !CONFIG_PM_SLEEP
bpf: do not use reciprocal divide
be2net: add dma_mapping_error() check for dma_map_page()
bnx2x: Don't release PCI bars on shutdown
net,via-rhine: Fix tx_timeout handling
batman-adv: fix batman-adv header overhead calculation
qlge: Fix vlan netdev features.
net: avoid reference counter overflows on fib_rules in multicast forwarding
dm9601: add USB IDs for new dm96xx variants
MAINTAINERS: add virtio-dev ML for virtio
ieee802154: Fix memory leak in ieee802154_add_iface()
net: usbnet: fix SG initialisation
inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
cxgb4: silence shift wrapping static checker warning
The s390 bpf jit compiler emits the signed divide instructions "dr" and "d"
for unsigned divisions.
This can cause problems: the dividend will be zero extended to a 64 bit value
and the divisor is the 32 bit signed value as specified A or X accumulator,
even though A and X are supposed to be treated as unsigned values.
The divide instrunctions will generate an exception if the result cannot be
expressed with a 32 bit signed value.
This is the case if e.g. the dividend is 0xffffffff and the divisor either 1
or also 0xffffffff (signed: -1).
To avoid all these issues simply use unsigned divide instructions.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SO_MAX_PACING_RATE definition on parisc got a typo.
Its not too late to fix it, before 3.13 is official.
Fixes: 62748f32d5 ("net: introduce SO_MAX_PACING_RATE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJS174dAAoJEBvWZb6bTYbywN0P/jiaK+ySw4YTxGggkRhi1WHy
KnELWnPTpA1duucioBS8KSZDXN0SSf4KLio/ArOJee/uU9loY4HWMwpKjBTZkEU4
lWI0WIagllxqBnxfFgdAVS6B/EUqie3qulZ8Hi72OoAEzUnLDsGAIIBzR5fdb1jg
xkmdtPB2Lg+c+Lsuhg0MJegoymoLhEF82k/rBUEAzpjsL5N9DUt4n2rVLbECkfzm
kLC31epho+PaSAmepAk5dbfh1H7ea4OV5HZeyyuIZqGVUgc95OU7VV3eIK+pI7Vu
C6PvHaOnF6VHvnMN4tFSDRvADsb9TnxQN9zYiC10opjzdTwUiA+Y+jbIjyRkAtNY
JfDRSIh3+w93K0sv5eICBp3cXcy584I+N5VC3EY0imv4DZNvlQfEWu1t9qSBhaAQ
DQnIRhY1BpfK4Uu9MVBu8lTeo0VuefupFZ3vqlEQJ8ht2jgnYdQQMOaPNuzOP7NS
WbOoDWGVfpfWN09mCV9OJ9WKjxO0nUeS4/xBqWIhrQKQY3oSvdjxIbqw7E3lj4Sp
0uwzGqbwcJanYR0IYNmBfuyRtwk8SqgvIGGLtG4cnlAAXuwaG/v/+FpVCTCstB9n
Ljxp9Y8dqAyTjrdw4Irp2rsIhbJpW6Ob4zcVeAfcNWjut+DBhbWl6VSe0k48A06h
gO+vkx+qAg7FiF/vnRVN
=ifJc
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Paolo Bonzini:
"Fix for a brown paper bag bug. Thanks to Drew Jones for noticing"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: fix apic_base enable check
This patch adds all the MPX instructions to x86 opcode map, so the x86
instruction decoder can decode MPX instructions.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/1389518403-7715-4-git-send-email-qiaowei.ren@intel.com
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Set guest activity state in L1's VMCS according to the VCPUs mp_state.
This ensures we report the correct state in case we L2 executed HLT or
if we put L2 into HLT state and it was now woken up by an event.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When we suspend the guest in HLT state, the nested run is no longer
pending - we emulated it completely. So only set nested_run_pending
after checking the activity state.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This simplifies the code and also stops issuing warning about writing to
unhandled MSRs when VMX is disabled or the Feature Control MSR is
locked - we do handle them all according to the spec.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Already used by nested SVM for tracing nested vmexit: kvm_nested_vmexit
marks exits from L2 to L0 while kvm_nested_vmexit_inject marks vmexits
that are reflected to L1.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of fixing up the vmcs12 after the nested vmexit, pass key
parameters already when calling nested_vmx_vmexit. This will help
tracing those vmexits.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When userspace sets MSR_IA32_FEATURE_CONTROL to 0, make sure we leave
root and non-root mode, fully disabling VMX. The register state of the
VCPU is undefined after this step, so userspace has to set it to a
proper state afterward.
This enables to reboot a VM while it is running some hypervisor code.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to the SDM, only bits 0-3 of DR6 "may" be cleared by "certain"
debug exception. So do update them on #DB exception in KVM, but leave
the rest alone, only setting BD and BS in addition to already set bits
in DR6. This also aligns us with kvm_vcpu_check_singlestep.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In contrast to VMX, SVM dose not automatically transfer DR6 into the
VCPU's arch.dr6. So if we face a DR6 read, we must consult a new vendor
hook to obtain the current value. And as SVM now picks the DR6 state
from its VMCB, we also need a set callback in order to write updates of
DR6 back.
Fixes a regression of 020df0794f.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever we change arch.dr7, we also have to call kvm_update_dr7. In
case guest debugging is off, this will synchronize the new state into
hardware.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off: Peter Lieven <pl@kamp.de>
Signed-off: Gleb Natapov
Signed-off: Vadim Rozenfeld <vrozenfe@redhat.com>
After some consideration I decided to submit only Hyper-V reference
counters support this time. I will submit iTSC support as a separate
patch as soon as it is ready.
v1 -> v2
1. mark TSC page dirty as suggested by
Eric Northup <digitaleric@google.com> and Gleb
2. disable local irq when calling get_kernel_ns,
as it was done by Peter Lieven <pl@amp.de>
3. move check for TSC page enable from second patch
to this one.
v3 -> v4
Get rid of ref counter offset.
v4 -> v5
replace __copy_to_user with kvm_write_guest
when updateing iTSC page.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a cleanup proposed by coccinelle. It replaces memcpy with struct
assignment on intel-mid's sfi layer.
Generated by: coccinelle/misc/memcpy-assign.cocci
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1389917588-9785-1-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch does cleanup on all intel mid platform code that uses
gpio_get_by_name() function. From now on they should check for any error
code instead of only hardcoded -1.
There are no functional changes from this change.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1389913624-9149-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When Intel MID finds a match between SFI table from FW and registered
SFI devices, it will always register a device regardless the platform
code was successful or not.
This patch adds an extra option for platform code to return error code
and abort device registration on SFI table parsing.
This patch does not contain any functional changes for current intel
mid platform code.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1389913624-9149-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
If we aren't going to use the local APIC anyway, we obviously don't
care about its timer frequency.
Link: http://lkml.kernel.org/r/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bin Gao <bin.gao@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This reverts commit 2f7dc60275.
The above commit breaks the mapping type for Device memory because
pgprot_default already contains a Normal memory type. pgprot_default is
also not initialised early enough for earlyprintk resulting in an
inconsistent memory mapping with 64K PAGE_SIZE configuration.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Adds pinctrl driver for Broadcom Capri (BCM281xx) SoCs.
v4: - PINCTRL selected in Kconfig, PINCTRL_CAPRI selected in bcm_defconfig
- make use of regmap
- change CAPRI_PIN_UPDATE from macro to inline function.
- Handle pull-up strength arg in Ohm instead of enum
v3: Re-work driver to be based on generic pin config. Moved config selection
from Kconfig to bcm_defconfig.
v2: Use hyphens instead of underscore in DT property names.
Signed-off-by: Sherman Yin <syin@broadcom.com>
Reviewed-by: Christian Daudt <bcm@fixthebug.org>
Reviewed-by: Matt Porter <matt.porter@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Fengguang Wu's kbuild test robot reported the following new m68k warnings:
In file included from drivers/nubus/nubus.c:22:0:
>> arch/m68k/include/asm/mac_via.h:262:47: warning: 'struct irq_desc' declared inside parameter list [enabled by default]
>> arch/m68k/include/asm/mac_via.h:262:47: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
Caused by the reworking of the generic local_bh{dis,en}able() code.
To fix it, forward declare 'struct irq_desc'.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: c795eb55e740 ("sched/preempt, locking: Rework local_bh_{dis,en}able()")
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20140112212456.GQ7572@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On AMD family 10h we see following error messages while waking up from
S3 for all non-boot CPUs leading to a failed IBS initialization:
Enabling non-boot CPUs ...
smpboot: Booting Node 0 Processor 1 APIC 0x1
[Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
perf: IBS APIC setup failed on cpu #1
process: Switch to broadcast mode on CPU1
CPU1 is up
...
ACPI: Waking up from system sleep state S3
Reason for this is that during suspend the LVT offset for the IBS
vector gets lost and needs to be reinialized while resuming.
The offset is read from the IBSCTL msr. On family 10h the offset needs
to be 1 as offset 0 is used for the MCE threshold interrupt, but
firmware assings it for IBS to 0 too. The kernel needs to reprogram
the vector. The msr is a readonly node msr, but a new value can be
written via pci config space access. The reinitialization is
implemented for family 10h in setup_ibs_ctl() which is forced during
IBS setup.
This patch fixes IBS setup after waking up from S3 by adding
resume/supend hooks for the boot cpu which does the offset
reinitialization.
Marking it as stable to let distros pick up this fix.
Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org> v3.2..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Waiman managed to trigger a PMI while in a emulate_vsyscall() fault,
the PMI in turn managed to trigger a fault while obtaining a stack
trace. This triggered the sig_on_uaccess_error recursive fault logic
and killed the process dead.
Fix this by explicitly excluding interrupts from the recursive fault
logic.
Reported-and-Tested-by: Waiman Long <waiman.long@hp.com>
Fixes: e00b12e64b ("perf/x86: Further optimize copy_from_user_nmi()")
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140110200603.GJ7572@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On SoCs that have the calibration MSRs available, either there is no
PIT, HPET or PMTIMER to calibrate against, or the PIT/HPET/PMTIMER is
driven from the same clock as the TSC, so calibration is redundant and
just slows down the boot.
TSC rate is caculated by this formula:
<maximum core-clock to bus-clock ratio> * <maximum resolved frequency>
The ratio and the resolved frequency ID can be obtained from MSR.
See Intel 64 and IA-32 System Programming Guid section 16.12 and 30.11.5
for details.
Signed-off-by: Bin Gao <bin.gao@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
When a cpu is downed on a system, the irqs on the cpu are assigned to
other cpus. It is possible, however, that when a cpu is downed there
aren't enough free vectors on the remaining cpus to account for the
vectors from the cpu that is being downed.
This results in an interesting "overflow" condition where irqs are
"assigned" to a CPU but are not handled.
For example, when downing cpus on a 1-64 logical processor system:
<snip>
[ 232.021745] smpboot: CPU 61 is now offline
[ 238.480275] smpboot: CPU 62 is now offline
[ 245.991080] ------------[ cut here ]------------
[ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
[ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
[ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
[ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
[ 246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 246.057371] 0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
[ 246.065728] ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
[ 246.074073] 0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
[ 246.082430] Call Trace:
[ 246.085174] <IRQ> [<ffffffff8164fbf6>] dump_stack+0x46/0x58
[ 246.091633] [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
[ 246.098352] [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
[ 246.104786] [<ffffffff815710d6>] dev_watchdog+0x246/0x250
[ 246.110923] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[ 246.119097] [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
[ 246.125224] [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
[ 246.132137] [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[ 246.140308] [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
[ 246.146933] [<ffffffff81059a80>] __do_softirq+0xe0/0x220
[ 246.152976] [<ffffffff8165fedc>] call_softirq+0x1c/0x30
[ 246.158920] [<ffffffff810045f5>] do_softirq+0x55/0x90
[ 246.164670] [<ffffffff81059d35>] irq_exit+0xa5/0xb0
[ 246.170227] [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
[ 246.177324] [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
[ 246.184041] <EOI> [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
[ 246.191559] [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
[ 246.198374] [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
[ 246.204900] [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
[ 246.210846] [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
[ 246.217371] [<ffffffff81646b47>] rest_init+0x77/0x80
[ 246.223028] [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
[ 246.229165] [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
[ 246.235787] [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
[ 246.242990] [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
[ 246.249610] ---[ end trace fb74fdef54d79039 ]---
[ 246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
[ 246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
Last login: Mon Nov 11 08:35:14 from 10.18.17.119
[root@(none) ~]# [ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[ 246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[ 249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
(last lines keep repeating. ixgbe driver is dead until module reload.)
If the downed cpu has more vectors than are free on the remaining cpus on the
system, it is possible that some vectors are "orphaned" even though they are
assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the
watchdog fired and notified that something was wrong.
This patch adds a function, check_vectors(), to compare the number of vectors
on the CPU going down and compares it to the number of vectors available on
the system. If there aren't enough vectors for the CPU to go down, an
error is returned and propogated back to userspace.
v2: Do not need to look at percpu irqs
v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
Priority Mode
v4: Additional changes suggested by Gong Chen.
v5/v6/v7/v8: Updated comment text
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Janet Morgan <janet.morgan@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ruiv Wang <ruiv.wang@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
Pull locking fixes from Ingo Molnar:
"Two fixes from lockdep coverage of seqlocks, which fix deadlocks on
lockdep-enabled ARM systems"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched_clock: Disable seqlock lockdep usage in sched_clock()
seqlock: Use raw_ prefix instead of _no_lockdep
Pull ARM fixes from Russell King:
"Another few fixes for ARM, nothing major here"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
ARM: 7939/1: traps: fix opcode endianness when read from user memory
ARM: 7937/1: perf_event: Silence sparse warning
ARM: 7934/1: DT/kernel: fix arch_match_cpu_phys_id to avoid erroneous match
Revert "ARM: 7908/1: mm: Fix the arm_dma_limit calculation"
At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c
He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want to support all Intel MID (Mobile Internet Device) platforms
with a single config selection. This patch removes deprecated
CONFIG_X86_MDFLD and X86_WANT_INTEL_MID options in favor of having
CONFIG_X86_INTEL_MID only.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387244246-20714-1-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This code was partially based on Mark Brown's previous work.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-4-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Fei Yang <fei.yang@intel.com>
Cc: Mark F. Brown <mark.f.brown@intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch adds Clovertrail support on intel-mid and makes it more
flexible to support other SoCs.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
In order make the driver more portable and support other Intel MID
(Mobile Internet Device) platforms we need to move Medfield code from
intel-mid.c core to its own mfld.c file.
This patch contains no functional changes.
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Make disabled_cpu_apicid static and read_mostly, and fix a couple of
typos.
Reported-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/20140115182511.GA22737@gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Race conditions are theoretically possible between the PCI device addition
and removal in the PPC64 PCI error recovery driver and the generic PCI bus
rescan and device removal that can be triggered via sysfs.
To avoid those race conditions make PPC64 PCI error recovery driver use
global PCI rescan-remove locking around PCI device addition and removal.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add disable_cpu_apicid kernel parameter. To use this kernel parameter,
specify an initial APIC ID of the corresponding CPU you want to
disable.
This is mostly used for the kdump 2nd kernel to disable BSP to wake up
multiple CPUs without causing system reset or hang due to sending INIT
from AP to BSP.
Kdump users first figure out initial APIC ID of the BSP, CPU0 in the
1st kernel, for example from /proc/cpuinfo and then set up this kernel
parameter for the 2nd kernel using the obtained APIC ID.
However, doing this procedure at each boot time manually is awkward,
which should be automatically done by user-land service scripts, for
example, kexec-tools on fedora/RHEL distributions.
This design is more flexible than disabling BSP in kernel boot time
automatically in that in kernel boot time we have no choice but
referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning
that the method is not applicable to the systems without such BIOS
tables.
One assumption behind this design is that users get initial APIC ID of
the BSP in still healthy state and so BSP is uniquely kept in
CPU0. Thus, through the kernel parameter, only one initial APIC ID can
be specified.
In a comparison with disabled_cpu_apicid, we use read_apic_id(), not
boot_cpu_physical_apicid, because on some platforms, the variable is
modified to the apicid reported as BSP through MP table and this
function is executed with the temporarily modified
boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel
parameter doesn't work well for apicids of APs.
Fixing the wrong handling of boot_cpu_physical_apicid requires some
reviews and tests beyond some platforms and it could take some
time. The fix here is a kind of workaround to focus on the main topic
of this patch.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
After the previous patch from Marcelo, the comment before this write
became obsolete. In fact, the write is unnecessary. The calls to
kvm_write_tsc ultimately result in a master clock update as soon as
all TSCs agree and the master clock is re-enabled. This master
clock update will rewrite tsc_timestamp.
So, together with the comment, delete the dead write too.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gpio-samsung.h header file introduced by commit 93177be0910c
("ARM: S3C[24|64]xx: move includes back under <mach/> scope")
is required only by S3C[24|64]xx machines. Include them conditionally
to avoid the following build errors for other machine configurations.
drivers/gpio/gpio-samsung.c:35:31: fatal error: mach/gpio-samsung.h: No such file or directory
arch/arm/plat-samsung/pm-gpio.c:22:31: fatal error: mach/gpio-samsung.h: No such file or directory
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
To fix a problem related to different resolution of TSC and system clock,
the offset in TSC units is approximated by
delta = vcpu->hv_clock.tsc_timestamp - vcpu->last_guest_tsc
(Guest TSC value at (Guest TSC value at last VM-exit)
the last kvm_guest_time_update
call)
Delta is then later scaled using mult,shift pair found in hv_clock
structure (which is correct against tsc_timestamp in that
structure).
However, if a frequency change is performed between these two points,
this delta is measured using different TSC frequencies, but scaled using
mult,shift pair for one frequency only.
The end result is an incorrect delta.
The bug which this code works around is not the only cause for
clock backwards events. The global accumulator is still
necessary, so remove the max_kernel_ns fix and rely on the
global accumulator for no clock backwards events.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch updates the Armada 370/XP SATA node with the new compatible
string "marvell,armada-370-sata".
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: Lior Amsalem <alior@marvell.com>
Cc: stable@vger.kernel.org # v3.6+
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit e66d2ae7c6 moved the assignment
vcpu->arch.apic_base = value above a condition with
(vcpu->arch.apic_base ^ value), causing that check
to always fail. Use old_value, vcpu->arch.apic_base's
old value, in the condition instead.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Having u32 and struct cpuinfo_x86 * by the same name is not very smart,
although it was ok in this case due to the limited scope of u32 c and it
being used only once in there.
Fix this.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1389786735-16751-1-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Limit PIT timer frequency similarly to the limit applied by
LAPIC timer.
Cc: stable@kernel.org
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, Loongson-2 call protected_blast_icache_range() and others
call protected_loongson23_blast_icache_range(), but I think the correct
behavior should be the opposite. BTW, Loongson-3's cache-ops is
compatible with MIPS64, but not compatible with Loongson-2. So, rename
xxx_loongson23_yyy things to xxx_loongson2_yyy.
The patch fixes early boot hang with 3.13-rc1, introduced in commit
14bd8c0820 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over
arch/mips").
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds the workaround for erratum 793 as a precaution in case not
every BIOS implements it. This addresses CVE-2013-6885.
Erratum text:
[Revision Guide for AMD Family 16h Models 00h-0Fh Processors,
document 51810 Rev. 3.04 November 2013]
793 Specific Combination of Writes to Write Combined Memory Types and
Locked Instructions May Cause Core Hang
Description
Under a highly specific and detailed set of internal timing
conditions, a locked instruction may trigger a timing sequence whereby
the write to a write combined memory type is not flushed, causing the
locked instruction to stall indefinitely.
Potential Effect on System
Processor core hang.
Suggested Workaround
BIOS should set MSR
C001_1020[15] = 1b.
Fix Planned
No fix planned
[ hpa: updated description, fixed typo in MSR name ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20140114230711.GS29865@pd.tnic
Tested-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The help text for RANDOMIZE_BASE_MAX_OFFSET was confusing. This has been
clarified, and updated to be an export-only tunable.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20131210202745.GA2961@www.outflux.net
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Previously the custom GPIO header for the S3C24xx would in turn
bring in the custom pin control implementation from
<plat/gpio-cfg.h>. This is not good as it mixes up two
subsystems and makes the dependencies hard to track. Make
the dependency explicit by explicitly including the pin
control header where needed.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Cc: Sylwester Nawrocki <sylvester.nawrocki@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-samsung-soc@vger.kernel.org
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
When refactoring and breaking out the includes for the
machine-specific GPIO configuration, two files were created
in <linux/platform_data/gpio-samsung-s3c[24|64]xx.h>, but as
that namespace shall be used for defining data exchanged
between machines and drivers, using it for these broad macros
and config settings is wrong.
Move the headers back into the machine-local
<mach/gpio-samsung.h> file and think about the next step.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Cc: Sylwester Nawrocki <sylvester.nawrocki@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: linux-samsung-soc@vger.kernel.org
Acked-by: Mark Brown <broonie@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Introduce function for the "Perform network-subchannel operation"
CHSC command with operation code "bridgeport information",
and bit definitions for "characteristics" pertaning to this command.
Signed-off-by: Eugene Crosser <eugene.crosser@ru.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull clocksource/clockevent updates from Daniel Lezcano:
* Axel Lin removed an unused structure defining the ids for the
bcm kona driver.
* Ezequiel Garcia enabled the timer divider only when the 25MHz
timer is not used for the armada 370 XP.
* Jingoo Han removed a pointless platform data initialization for
the sh_mtu and sh_mtu2.
* Laurent Pinchart added the clk_prepare/clk_unprepare for sh_cmt.
* Linus Walleij added a useful warning in clk_of when no clocks
are found while the old behavior was to silently hang at boot time.
* Maxime Ripard added the high speed timer drivers for the
Allwinner SoCs (A10, A13, A20). He increased the rating, shared the
irq across all available cpus and fixed the clockevent's irq
initialization for the sun4i.
* Michael Opdenacker removed the usage of the IRQF_DISABLED for the
all the timers driver located in drivers/clocksource.
* Stephen Boyd switched to sched_clock_register for the
arm_global_timer, cadence_ttc, sun4i and orion timers.
Conflicts:
drivers/clocksource/clksrc-of.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently we do a read, a dummy write and a final read to fetch
the error code. The value from the final read is taken.
This is not the recommended way and leads to corrupted/lost ESR
values.
Intel(c) 64 and IA-32 Architectures Software Developer's Manual,
Combined Volumes 1, 2ABC, 3ABC, Section 10.5.3 states:
Before attempt to read from the ESR, software should first
write to it. (The value written does not affect the values read
subsequently; only zero may be written in x2APIC mode.) This
write clears any previously logged errors and updates the ESR
with any errors detected since the last write to the ESR.
This write also rearms the APIC error interrupt triggering
mechanism.
This patch removes the first read such that we are conform with
the manual.
On my (very old) Pentium MMX SMP system this patch fixes the
issue that APIC errors:
a) are not always reported and
b) are reported with false error numbers.
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: seiji.aguchi@hds.com
Cc: rientjes@google.com
Cc: konrad.wilk@oracle.com
Cc: bp@alien8.de
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1389685487-20872-1-git-send-email-richard@nod.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Checkin:
93ea02bb84 arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
... unfortunately left some Kbuild files out of order, which caused
unnecessary merge conflicts, in particular with checkin:
e3fec2f74f lib: Add missing arch generic-y entries for asm-generic/hash.h
Put them back in order to make the upcoming merges cleaner.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20140114164420.d296fbcc4be3a5f126c86069@canb.auug.org.au
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Miller <davem@davemloft.net>
This reverts commit bdbb0a1376.
Tejun writes:
I'm sorry but can you please revert the whole series?
get_active() waiting while a node is deactivated has potential
to lead to deadlock and that deactivate/reactivate interface is
something fundamentally flawed and that cgroup will have to work
with the remove_self() like everybody else. IOW, I think the
first posting was correct.
Cc: Tejun Heo <tj@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We've grown a bunch of microcode loader files all prefixed with
"microcode_". They should be under cpu/ because this is strictly
CPU-related functionality so do that and drop the prefix since they're
in their own directory now which gives that prefix. :)
While at it, drop MICROCODE_INTEL_LIB config item and stash the
functionality under CONFIG_MICROCODE_INTEL as it was its only user.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
The original idea to use the microcode cache for the APs doesn't pan out
because we do memory allocation there very early and with IRQs disabled
and we don't want to involve GFP_ATOMIC allocations. Not if it can be
helped.
Thus, extend the caching of the BSP patch approach to the APs and
iterate over the ucode in the initrd instead of using the cache. We
still save the relevant patches to it but later, right before we
jettison the initrd.
While at it, fix early ucode loading on 32-bit too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
We want to use those in AMD's early loading path too. Also, add a
native_wrmsrl variant.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
The ramdisk can possibly get relocated if the whole image is not mapped.
And since we're going over it in the microcode loader and fishing out
the relevant microcode patches, we want access it at its new location.
Thus, export it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
With various drivers wanting to inject idle time; we get people
calling idle routines outside of the idle loop proper.
Therefore we need to be extra careful about not missing
TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations.
While looking at this, I also realized there's a small window in the
existing idle loop where we can miss TIF_NEED_RESCHED; when it hits
right after the tif_need_resched() test at the end of the loop but
right before the need_resched() test at the start of the loop.
So move preempt_fold_need_resched() out of the loop where we're
guaranteed to have TIF_NEED_RESCHED set.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The patch "s390/perf: add support for the CPU-Measurement Sampling
Facility" added a new instance of the __cpuinit macro usage.
We removed this a couple versions ago; we now want to remove
the compat no-op stubs. Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
PSW32_USER_BITS should define the primary address space for user space
instead of the home address space.
Symptom of this bug is that gdb doesn't work in compat mode.
The bug was introduced with e258d719ff "s390/uaccess: always run the kernel
in home space" and f26946d7ec "s390/compat: make psw32_user_bits a constant
value again".
Cc: stable@vger.kernel.org # v3.13+
Reported-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use a ring-buffer like multi-version object structure which allows
always having a coherent object; we use this to avoid having to
disable IRQs while reading sched_clock() and avoids a problem when
getting an NMI while changing the cyc2ns data.
MAINLINE PRE POST
sched_clock_stable: 1 1 1
(cold) sched_clock: 329841 331312 257223
(cold) local_clock: 301773 310296 309889
(warm) sched_clock: 38375 38247 25280
(warm) local_clock: 100371 102713 85268
(warm) rdtsc: 27340 27289 24247
sched_clock_stable: 0 0 0
(cold) sched_clock: 382634 372706 301224
(cold) local_clock: 396890 399275 399870
(warm) sched_clock: 38194 38124 25630
(warm) local_clock: 143452 148698 129629
(warm) rdtsc: 27345 27365 24307
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fengguang Wu's 0day kernel build service reported the following build warning:
arch/x86/kernel/apic/io_apic.c:2211
smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))'
because irq is defined as an unsigned int instead of an int.
Fix this trivial error by redefining irq as a signed int. The
remaining consumers of the int are okay.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-prarit@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 64681787 (arm64: let the core code deal with preempt_count)
changed the code, but left the comments unchanged, fix it.
Signed-off-by: Neil Zhang <zhangwm@marvell.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
There are no __cycles_2_ns() users outside of arch/x86/kernel/tsc.c,
so move it there.
There are no cycles_2_ns() users.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-01lslnavfgo3kmbo4532zlcj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>