Commit Graph

18834 Commits

Author SHA1 Message Date
Cliff Wickman
a26fd71953 x86/uv: Update the UV3 TLB shootdown logic
Update of TLB shootdown code for UV3.

Kernel function native_flush_tlb_others() calls
uv_flush_tlb_others() on UV to invalidate tlb page definitions
on remote cpus. The UV systems have a hardware 'broadcast assist
unit' which can be used to broadcast shootdown messages to all
cpu's of selected nodes.

The behavior of the BAU has changed only slightly with UV3:

  - UV3 is recognized with is_uv3_hub().
  - UV2 functions and structures (uv2_xxx) are in most cases
    simply renamed to uv2_3_xxx.
  - Some UV2 error workarounds are not needed for UV3.
    (see uv_bau_message_interrupt and enable_timeouts)

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/E1WkgWh-0001yJ-3K@eag09.americas.sgi.com
[ Removed a few linebreak uglies. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 14:17:20 +02:00
Dimitri Sivanich
5f40f7d938 x86/UV: Set n_lshift based on GAM_GR_CONFIG MMR for UV3
The value of n_lshift for UV is currently set based on the
socket m_val.

For UV3, set the n_lshift value based on the GAM_GR_CONFIG MMR.
This will allow bios to control the n_lshift value independent
of the socket m_val. Then n_lshift can be assigned a fixed value
across a multi-partition system, allowing for a fixed common
global physical address format that is independent of socket
m_val.

Cleanup unneeded macros.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Link: http://lkml.kernel.org/r/20140331143700.GB29916@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-01 12:10:44 +02:00
Linus Torvalds
176ab02d49 Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 LTO changes from Peter Anvin:
 "More infrastructure work in preparation for link-time optimization
  (LTO).  Most of these changes is to make sure symbols accessed from
  assembly code are properly marked as visible so the linker doesn't
  remove them.

  My understanding is that the changes to support LTO are still not
  upstream in binutils, but are on the way there.  This patchset should
  conclude the x86-specific changes, and remaining patches to actually
  enable LTO will be fed through the Kbuild tree (other than keeping up
  with changes to the x86 code base, of course), although not
  necessarily in this merge window"

* 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  Kbuild, lto: Handle basic LTO in modpost
  Kbuild, lto: Disable LTO for asm-offsets.c
  Kbuild, lto: Add a gcc-ld script to let run gcc as ld
  Kbuild, lto: add ld-version and ld-ifversion macros
  Kbuild, lto: Drop .number postfixes in modpost
  Kbuild, lto, workaround: Don't warn for initcall_reference in modpost
  lto: Disable LTO for sys_ni
  lto: Handle LTO common symbols in module loader
  lto, workaround: Add workaround for initcall reordering
  lto: Make asmlinkage __visible
  x86, lto: Disable LTO for the x86 VDSO
  initconst, x86: Fix initconst mistake in ts5500 code
  initconst: Fix initconst mistake in dcdbas
  asmlinkage: Make trace_hardirqs_on/off_caller visible
  asmlinkage, x86: Fix 32bit memcpy for LTO
  asmlinkage Make __stack_chk_failed and memcmp visible
  asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage
  asmlinkage: Make main_extable_sort_needed visible
  asmlinkage, mutex: Mark __visible
  asmlinkage: Make trace_hardirq visible
  ...
2014-03-31 14:13:25 -07:00
Linus Torvalds
e06df6a7ea Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 kaslr update from Ingo Molnar:
 "This adds kernel module load address randomization"

* 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, kaslr: fix module lock ordering problem
  x86, kaslr: randomize module base load address
2014-03-31 12:34:49 -07:00
Linus Torvalds
c0fc3cbac0 Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 hyperv change from Ingo Molnar:
 "Skip the timer_irq_works() check on hyperv systems"

* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, hyperv: Bypass the timer_irq_works() check
2014-03-31 12:28:38 -07:00
Linus Torvalds
d9fcca40eb Merge branch 'x86-hash-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 hashing changes from Ingo Molnar:
 "Small fixes and cleanups to the librarized arch_fast_hash() methods,
  used by the net/openvswitch code"

* 'x86-hash-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, hash: Simplify switch, add __init annotation
  x86, hash: Swap arguments passed to crc32_u32()
  x86, hash: Fix build failure with older binutils
2014-03-31 12:27:32 -07:00
Linus Torvalds
7cc3afdf43 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI changes from Ingo Molnar:
 "The main changes:

  - Add debug code to the dump EFI pagetable - Borislav Petkov

  - Make 1:1 runtime mapping robust when booting on machines with lots
    of memory - Borislav Petkov

  - Move the EFI facilities bits out of 'x86_efi_facility' and into
    efi.flags which is the standard architecture independent place to
    keep EFI state, by Matt Fleming.

  - Add 'EFI mixed mode' support: this allows 64-bit kernels to be
    booted from 32-bit firmware.  This needs a bootloader that supports
    the 'EFI handover protocol'.  By Matt Fleming"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86, efi: Abstract x86 efi_early calls
  x86/efi: Restore 'attr' argument to query_variable_info()
  x86/efi: Rip out phys_efi_get_time()
  x86/efi: Preserve segment registers in mixed mode
  x86/boot: Fix non-EFI build
  x86, tools: Fix up compiler warnings
  x86/efi: Re-disable interrupts after calling firmware services
  x86/boot: Don't overwrite cr4 when enabling PAE
  x86/efi: Wire up CONFIG_EFI_MIXED
  x86/efi: Add mixed runtime services support
  x86/efi: Firmware agnostic handover entry points
  x86/efi: Split the boot stub into 32/64 code paths
  x86/efi: Add early thunk code to go from 64-bit to 32-bit
  x86/efi: Build our own EFI services pointer table
  efi: Add separate 32-bit/64-bit definitions
  x86/efi: Delete dead code when checking for non-native
  x86/mm/pageattr: Always dump the right page table in an oops
  x86, tools: Consolidate #ifdef code
  x86/boot: Cleanup header.S by removing some #ifdefs
  efi: Use NULL instead of 0 for pointer
  ...
2014-03-31 12:26:05 -07:00
Linus Torvalds
ad8946fbf9 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 debug cleanup from Ingo Molnar:
 "A single trivial cleanup"

* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  i386: Remove unneeded test of 'task' in dump_trace() (again)
2014-03-31 12:25:28 -07:00
Linus Torvalds
918d80a136 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu handling changes from Ingo Molnar:
 "Bigger changes:

   - Intel CPU hardware-enablement: new vector instructions support
     (AVX-512), by Fenghua Yu.

   - Support the clflushopt instruction and use it in appropriate
     places.  clflushopt is similar to clflush but with more relaxed
     ordering, by Ross Zwisler.

   - MSR accessor cleanups, by Borislav Petkov.

   - 'forcepae' boot flag for those who have way too much time to spend
     on way too old Pentium-M systems and want to live way too
     dangerously, by Chris Bainbridge"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpu: Add forcepae parameter for booting PAE kernels on PAE-disabled Pentium M
  Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC
  x86, intel: Make MSR_IA32_MISC_ENABLE bit constants systematic
  x86, Intel: Convert to the new bit access MSR accessors
  x86, AMD: Convert to the new bit access MSR accessors
  x86: Add another set of MSR accessor functions
  x86: Use clflushopt in drm_clflush_virt_range
  x86: Use clflushopt in drm_clflush_page
  x86: Use clflushopt in clflush_cache_range
  x86: Add support for the clflushopt instruction
  x86, AVX-512: Enable AVX-512 States Context Switch
  x86, AVX-512: AVX-512 Feature Detection
2014-03-31 12:00:45 -07:00
Linus Torvalds
26a5c0dfbc Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Various smaller cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, pageattr: Correct WBINVD spelling in comment
  x86, crash: Unify ifdef
  x86, boot: Correct max ramdisk size name
2014-03-31 12:00:10 -07:00
Linus Torvalds
54cad6270a Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build change from Ingo Molnar:
 "Explicitly disable x87 FPU instructions, to catch mistaken floating
  point use at build time, instead of crashing or misbehaving during run
  time"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Disable generation of traditional x87 instructions
2014-03-31 11:59:31 -07:00
Linus Torvalds
6ed7705167 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic changes from Ingo Molnar:
 "An xAPIC CPU hotplug race fix, plus cleanups and minor fixes"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Plug racy xAPIC access of CPU hotplug code
  x86/apic: Always define nox2apic and define it as initdata
  x86/apic: Remove unused function prototypes
  x86/apic: Switch wait_for_init_deassert() to a bool flag
  x86/apic: Only use default_wait_for_init_deassert()
2014-03-31 11:58:45 -07:00
Linus Torvalds
3a88fe3b74 Merge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 acpi numa fix from Ingo Molnar:
 "A single NUMA CPU hotplug fix"

* 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, acpi: Fix bug in associating hot-added CPUs with corresponding NUMA node
2014-03-31 11:58:08 -07:00
Linus Torvalds
971eae7c99 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "Bigger changes:

   - sched/idle restructuring: they are WIP preparation for deeper
     integration between the scheduler and idle state selection, by
     Nicolas Pitre.

   - add NUMA scheduling pseudo-interleaving, by Rik van Riel.

   - optimize cgroup context switches, by Peter Zijlstra.

   - RT scheduling enhancements, by Thomas Gleixner.

  The rest is smaller changes, non-urgnt fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
  sched: Clean up the task_hot() function
  sched: Remove double calculation in fix_small_imbalance()
  sched: Fix broken setscheduler()
  sparc64, sched: Remove unused sparc64_multi_core
  sched: Remove unused mc_capable() and smt_capable()
  sched/numa: Move task_numa_free() to __put_task_struct()
  sched/fair: Fix endless loop in idle_balance()
  sched/core: Fix endless loop in pick_next_task()
  sched/fair: Push down check for high priority class task into idle_balance()
  sched/rt: Fix picking RT and DL tasks from empty queue
  trace: Replace hardcoding of 19 with MAX_NICE
  sched: Guarantee task priority in pick_next_task()
  sched/idle: Remove stale old file
  sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED
  cpuidle/arm64: Remove redundant cpuidle_idle_call()
  cpuidle/powernv: Remove redundant cpuidle_idle_call()
  sched, nohz: Exclude isolated cores from load balancing
  sched: Fix select_task_rq_fair() description comments
  workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
  sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
  ...
2014-03-31 11:21:19 -07:00
Linus Torvalds
8c292f1174 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar:
 "Main changes:

  Kernel side changes:

   - Add SNB/IVB/HSW client uncore memory controller support (Stephane
     Eranian)

   - Fix various x86/P4 PMU driver bugs (Don Zickus)

  Tooling, user visible changes:

   - Add several futex 'perf bench' microbenchmarks (Davidlohr Bueso)

   - Speed up thread map generation (Don Zickus)

   - Introduce 'perf kvm --list-cmds' command line option for use by
     scripts (Ramkumar Ramachandra)

   - Print the evsel name in the annotate stdio output, prep to fix
     support outputting annotation for multiple events, not just for the
     first one (Arnaldo Carvalho de Melo)

   - Allow setting preferred callchain method in .perfconfig (Jiri Olsa)

   - Show in what binaries/modules 'perf probe's are set (Masami
     Hiramatsu)

   - Support distro-style debuginfo for uprobe in 'perf probe' (Masami
     Hiramatsu)

  Tooling, internal changes and fixes:

   - Use tid in mmap/mmap2 events to find maps (Don Zickus)

   - Record the reason for filtering an address_location (Namhyung Kim)

   - Apply all filters to an addr_location (Namhyung Kim)

   - Merge al->filtered with hist_entry->filtered in report/hists
     (Namhyung Kim)

   - Fix memory leak when synthesizing thread records (Namhyung Kim)

   - Use ui__has_annotation() in 'report' (Namhyung Kim)

   - hists browser refactorings to reuse code accross UIs (Namhyung Kim)

   - Add support for the new DWARF unwinder library in elfutils (Jiri
     Olsa)

   - Fix build race in the generation of bison files (Jiri Olsa)

   - Further streamline the feature detection display, trimming it a bit
     to show just the libraries detected, using VF=1 gets a more verbose
     output, showing the less interesting feature checks as well (Jiri
     Olsa).

   - Check compatible symtab type before loading dso (Namhyung Kim)

   - Check return value of filename__read_debuglink() (Stephane Eranian)

   - Move some hashing and fs related code from tools/perf/util/ to
     tools/lib/ so that it can be used by more tools/ living utilities
     (Borislav Petkov)

   - Prepare DWARF unwinding code for using an elfutils alternative
     unwinding library (Jiri Olsa)

   - Fix DWARF unwind max_stack processing (Jiri Olsa)

   - Add dwarf unwind 'perf test' entry (Jiri Olsa)

   - 'perf probe' improvements including memory leak fixes, sharing the
     intlist class with other tools, uprobes/kprobes code sharing and
     use of ref_reloc_sym (Masami Hiramatsu)

   - Shorten sample symbol resolving by adding cpumode to struct
     addr_location (Arnaldo Carvalho de Melo)

   - Fix synthesizing mmaps for threads (Don Zickus)

   - Fix invalid output on event group stdio report (Namhyung Kim)

   - Fixup header alignment in 'perf sched latency' output (Ramkumar
     Ramachandra)

   - Fix off-by-one error in 'perf timechart record' argv handling
     (Ramkumar Ramachandra)

  Tooling, cleanups:

   - Remove unused thread__find_map function (Jiri Olsa)

   - Remove unused simple_strtoul() function (Ramkumar Ramachandra)

  Tooling, documentation updates:

   - Update function names in debug messages (Ramkumar Ramachandra)

   - Update some code references in design.txt (Ramkumar Ramachandra)

   - Clarify load-latency information in the 'perf mem' docs (Andi
     Kleen)

   - Clarify x86 register naming in 'perf probe' docs (Andi Kleen)"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (96 commits)
  perf tools: Remove unused simple_strtoul() function
  perf tools: Update some code references in design.txt
  perf evsel: Update function names in debug messages
  perf tools: Remove thread__find_map function
  perf annotate: Print the evsel name in the stdio output
  perf report: Use ui__has_annotation()
  perf tools: Fix memory leak when synthesizing thread records
  perf tools: Use tid in mmap/mmap2 events to find maps
  perf report: Merge al->filtered with hist_entry->filtered
  perf symbols: Apply all filters to an addr_location
  perf symbols: Record the reason for filtering an address_location
  perf sched: Fixup header alignment in 'latency' output
  perf timechart: Fix off-by-one error in 'record' argv handling
  perf machine: Factor machine__find_thread to take tid argument
  perf tools: Speed up thread map generation
  perf kvm: introduce --list-cmds for use by scripts
  perf ui hists: Pass evsel to hpp->header/width functions explicitly
  perf symbols: Introduce thread__find_cpumode_addr_location
  perf session: Change header.misc dump from decimal to hex
  perf ui/tui: Reuse generic __hpp__fmt() code
  ...
2014-03-31 11:13:25 -07:00
Linus Torvalds
462bf234a8 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The biggest change is the MCS spinlock generalization changes from Tim
  Chen, Peter Zijlstra, Jason Low et al.  There's also lockdep
  fixes/enhancements from Oleg Nesterov, in particular a false negative
  fix related to lockdep_set_novalidate_class() usage"

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  locking/mutex: Fix debug checks
  locking/mutexes: Add extra reschedule point
  locking/mutexes: Introduce cancelable MCS lock for adaptive spinning
  locking/mutexes: Unlock the mutex without the wait_lock
  locking/mutexes: Modify the way optimistic spinners are queued
  locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner()
  locking: Move mcs_spinlock.h into kernel/locking/
  m68k: Skip futex_atomic_cmpxchg_inatomic() test
  futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test
  Revert "sched/wait: Suppress Sparse 'variable shadowing' warning"
  lockdep: Change lockdep_set_novalidate_class() to use _and_name
  lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
  lockdep: Don't create the wrong dependency on hlock->check == 0
  lockdep: Make held_lock->check and "int check" argument bool
  locking/mcs: Allow architecture specific asm files to be used for contended case
  locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
  sched/wait: Suppress Sparse 'variable shadowing' warning
  hung_task/Documentation: Fix hung_task_warnings description
  locking/mcs: Allow architectures to hook in to contended paths
  locking/mcs: Micro-optimize the MCS code, add extra comments
  ...
2014-03-31 10:59:39 -07:00
Artem Fetishev
825600c0f2 x86: fix boot on uniprocessor systems
On x86 uniprocessor systems topology_physical_package_id() returns -1
which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
which leads to GPF in rapl_pmu_init().

See arch/x86/kernel/cpu/perf_event_intel_rapl.c.

It turns out that physical_package_id and core_id can actually be
retreived for uniprocessor systems too.  Enabling them also fixes
rapl_pmu code.

Signed-off-by: Artem Fetishev <artem_fetishev@epam.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-28 13:56:58 -07:00
Jason Wang
ca3ba2a2f4 x86, hyperv: Bypass the timer_irq_works() check
This patch bypass the timer_irq_works() check for hyperv guest since:

- It was guaranteed to work.
- timer_irq_works() may fail sometime due to the lpj calibration were inaccurate
  in a hyperv guest or a buggy host.

In the future, we should get the tsc frequency from hypervisor and use preset
lpj instead.

[ hpa: I would prefer to not defer things to "the future" in the future... ]

Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: http://lkml.kernel.org/r/1393558229-14755-1-git-send-email-jasowang@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-27 11:02:45 -07:00
Matt Fleming
204b0a1a4b x86, efi: Abstract x86 efi_early calls
The ARM EFI boot stub doesn't need to care about the efi_early
infrastructure that x86 requires in order to do mixed mode thunking. So
wrap everything up in an efi_call_early() macro.

This allows x86 to do the necessary indirection jumps to call whatever
firmware interface is necessary (native or mixed mode), but also allows
the ARM folks to mask the fact that they don't support relocation in the
boot stub and need to pass 'sys_table_arg' to every function.

[ hpa: there are no object code changes from this patch ]

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/20140326091011.GB2958@console-pimps.org
Cc: Roy Franz <roy.franz@linaro.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-26 11:30:03 -07:00
David Vrabel
5926f87fda Revert "xen: properly account for _PAGE_NUMA during xen pte translations"
This reverts commit a9c8e4beee.

PTEs in Xen PV guests must contain machine addresses if _PAGE_PRESENT
is set and pseudo-physical addresses is _PAGE_PRESENT is clear.

This is because during a domain save/restore (migration) the page
table entries are "canonicalised" and uncanonicalised". i.e., MFNs are
converted to PFNs during domain save so that on a restore the page
table entries may be rewritten with the new MFNs on the destination.
This canonicalisation is only done for PTEs that are present.

This change resulted in writing PTEs with MFNs if _PAGE_PROTNONE (or
_PAGE_NUMA) was set but _PAGE_PRESENT was clear.  These PTEs would be
migrated as-is which would result in unexpected behaviour in the
destination domain.  Either a) the MFN would be translated to the
wrong PFN/page; b) setting the _PAGE_PRESENT bit would clear the PTE
because the MFN is no longer owned by the domain; or c) the present
bit would not get set.

Symptoms include "Bad page" reports when munmapping after migrating a
domain.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org>        [3.12+]
2014-03-25 11:11:42 +00:00
Kees Cook
9dd721c6db x86, kaslr: fix module lock ordering problem
There was a potential lock ordering problem with the module kASLR patch
("x86, kaslr: randomize module base load address"). This patch removes
the usage of the module_mutex and creates a new mutex to protect the
module base address offset value.

Chain exists of:
  text_mutex --> kprobe_insn_slots.mutex --> module_mutex

[    0.515561]  Possible unsafe locking scenario:
[    0.515561]
[    0.515561]        CPU0                    CPU1
[    0.515561]        ----                    ----
[    0.515561]   lock(module_mutex);
[    0.515561]                                lock(kprobe_insn_slots.mutex);
[    0.515561]                                lock(module_mutex);
[    0.515561]   lock(text_mutex);
[    0.515561]
[    0.515561]  *** DEADLOCK ***

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-24 10:18:26 -07:00
Chris Bainbridge
69f2366c94 x86, cpu: Add forcepae parameter for booting PAE kernels on PAE-disabled Pentium M
Many Pentium M systems disable PAE but may have a functionally usable PAE
implementation. This adds the "forcepae" parameter which bypasses the boot
check for PAE, and sets the CPU as being PAE capable. Using this parameter
will taint the kernel with TAINT_CPU_OUT_OF_SPEC.

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Link: http://lkml.kernel.org/r/20140307114040.GA4997@localhost
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-03-20 16:31:54 -07:00
Dave Jones
8c90487cdc Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC
Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC, so we can repurpose
the flag to encompass a wider range of pushing the CPU beyond its
warrany.

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Link: http://lkml.kernel.org/r/20140226154949.GA770@redhat.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-03-20 16:28:09 -07:00
Jan Beulich
7a5917e978 x86, hash: Simplify switch, add __init annotation
Minor cleanups:

- simplify switch statement
- add __init annotation to setup_arch_fast_hash()

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F09CE020000780011FBEF@nat28.tlf.novell.com
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-19 16:51:04 -07:00
Jan Beulich
c5cdfdf909 x86, hash: Swap arguments passed to crc32_u32()
... to match the function's parameters. While reportedly commutative,
using the proper order allows for leveraging the instruction permitting
the source operand to be in memory.

[ hpa: This code originated in the dpdk toolkit.  This was a bug in dpdk
  which has recently been fixed in part due to an earlier version of
  this patch. ]

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F09B6020000780011FBEB@nat28.tlf.novell.com
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-19 16:51:04 -07:00
Jan Beulich
06325190bd x86, hash: Fix build failure with older binutils
Just like for other ISA extension instruction uses we should check
whether the assembler actually supports them. The fallback here simply
is to encode an instruction  with fixed operands (%eax and %ecx).

[ hpa: tagging for -stable as a build fix ]

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F0996020000780011FBE7@nat28.tlf.novell.com
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> # v3.14
2014-03-19 16:51:04 -07:00
Linus Torvalds
7c1cfacca2 PCI updates for v3.14:
Resource management
     - Revert "Insert GART region into resource map"
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTKdCYAAoJEFmIoMA60/r88lMP/3Njn9KD69FNU+Hphs5ExbQG
 OtqudAWiJBPAabsvXAJNHh/osCJJcPgewzKGOX9dboEP3eL/bCibT2dKgXIjxRav
 WULAlINj6x1h3K+/+VlBVsW0W/DcnK7zIQ0O7Qjqh8f0E2v7+nB5/tOOQ6gQeqxH
 kp7jzBxdTUfVeZtfwY80ccHM2zI/nqm6/p5PobMt3G8vsdamxNGDjHDlOlqARgNs
 /nP26/XusS9Uw0MI7wwuALgLNLNMbLWMmKBfz1n7Gw3wUvWvQeK+ib67gyTgoVT7
 VVDuFn8bg3hMqVwlCrBU8SnS83/dQlSl0fze0gL4UMPH+AGxHnTPCeirOkEhvtDY
 krVuiber7PAkj5dz0ApRR9nawHSh20Y7WJxLDYrpILo8C3e4pXcGRVsm4d8e3GuP
 zYwD/6OH6SgMdozc456Jb1qSaXDfLfVffYKcjLS+5bPstP+QIFSUdJdhRrsOKZGn
 VuIvqXioKeBYe4QqJYGt7cMWdRXTqkrYXOV5N1qPOqBloc83sRdQWtSqHivISma9
 rIeyBCAA30SYtr9FF0n/up8Mw9AhNukfcKuouWUuDV/FeMRbMbblvw8Vgnq8eXY9
 fLqUMOTPnE0nz5hScHPfTolQ6ATghh/sk6EOkD4gvvfS7+Ul5oiduE+rBOb2eY2Q
 M6/H1MT1OiNKcRROt0q6
 =yDUb
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI resource management fix from Bjorn Helgaas:
 "This is a fix for an AGP regression exposed by e501b3d87f ("agp:
  Support 64-bit APBASE"), which we merged in v3.14-rc1.

  We've warned about the conflict between the GART and PCI resources and
  cleared out the PCI resource for a long time, but after e501b3d87f,
  we still *use* that cleared-out PCI resource.  I think the GART
  resource is incorrect, so this patch removes it"

* tag 'pci-v3.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  Revert "[PATCH] Insert GART region into resource map"
2014-03-19 16:15:54 -07:00
Bjorn Helgaas
707d4eefbd Revert "[PATCH] Insert GART region into resource map"
This reverts commit 56dd669a13, which makes the GART visible in
/proc/iomem.  This fixes a regression: e501b3d87f ("agp: Support 64-bit
APBASE") exposed an existing problem with a conflict between the GART
region and a PCI BAR region.

The GART addresses are bus addresses, not CPU addresses, and therefore
should not be inserted in iomem_resource.

On many machines, the GART region is addressable by the CPU as well as by
an AGP master, but CPU addressability is not required by the spec.  On some
of these machines, the GART is mapped by a PCI BAR, and in that case, the
PCI core automatically inserts it into iomem_resource, just as it does for
all BARs.

Inserting it here means we'll have a conflict if the PCI core later tries
to claim the GART region, so let's drop the insertion here.

The conflict indirectly causes X failures, as reported by Jouni in the
bugzilla below.  We detected the conflict even before e501b3d87f, but
after it the AGP code (fix_northbridge()) uses the PCI resource (which is
zeroed because of the conflict) instead of reading the BAR again.

Conflicts:
	arch/x86_64/kernel/aperture.c

Fixes: e501b3d87f agp: Support 64-bit APBASE
Link: https://bugzilla.kernel.org/show_bug.cgi?id=72201
Reported-and-tested-by: Jouni Mettälä <jtmettala@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-03-18 14:26:12 -06:00
Matt Fleming
9a11040ff9 x86/efi: Restore 'attr' argument to query_variable_info()
In the thunk patches the 'attr' argument was dropped to
query_variable_info(). Restore it otherwise the firmware will return
EFI_INVALID_PARAMETER.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-03-17 21:55:04 +00:00
Matt Fleming
3f4a7836e3 x86/efi: Rip out phys_efi_get_time()
Dan reported that phys_efi_get_time() is doing kmalloc(..., GFP_KERNEL)
under a spinlock which is very clearly a bug. Since phys_efi_get_time()
has no users let's just delete it instead of trying to fix it.

Note that since there are no users of phys_efi_get_time(), it is not
possible to actually trigger a GFP_KERNEL alloc under the spinlock.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-03-17 21:54:28 +00:00
Matt Fleming
e10848a26a x86/efi: Preserve segment registers in mixed mode
I was triggering a #GP(0) from userland when running with
CONFIG_EFI_MIXED and CONFIG_IA32_EMULATION, from what looked like
register corruption. Turns out that the mixed mode code was trashing the
contents of %ds, %es and %ss in __efi64_thunk().

Save and restore the contents of these segment registers across the call
to __efi64_thunk() so that we don't corrupt the CPU context.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-03-17 21:54:17 +00:00
Linus Torvalds
b44eeb4d47 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc smaller fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix leak in uncore_type_init failure paths
  perf machine: Use map as success in ip__resolve_ams
  perf symbols: Fix crash in elf_section_by_name
  perf trace: Decode architecture-specific signal numbers
2014-03-16 10:41:21 -07:00
Linus Torvalds
a4ecdf82f8 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "Two x86 fixes: Suresh's eager FPU fix, and a fix to the NUMA quirk for
  AMD northbridges.

  This only includes Suresh's fix patch, not the "mostly a cleanup"
  patch which had __init issues"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/amd/numa: Fix northbridge quirk to assign correct NUMA node
  x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU
2014-03-14 18:07:51 -07:00
Daniel J Blueman
847d7970de x86/amd/numa: Fix northbridge quirk to assign correct NUMA node
For systems with multiple servers and routed fabric, all
northbridges get assigned to the first server. Fix this by also
using the node reported from the PCI bus. For single-fabric
systems, the northbriges are on PCI bus 0 by definition, which
are on NUMA node 0 by definition, so this is invarient on most
systems.

Tested on fam10h and fam15h single and multi-fabric systems and
candidate for stable.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1394710981-3596-1-git-send-email-daniel@numascale.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-14 11:05:36 +01:00
Stephane Eranian
81827ed8d8 perf/x86/uncore: Fix missing end markers for SNB/IVB/HSW IMC PMU
This patch fixes a bug with the SNB/IVB/HSW uncore
mmeory controller support.

The PCI Ids tables for the memory controller were missing end
markers. That could cause random crashes on boot during or after
PCI device registration.

Signed-off-by: Stephane Erainan <eranian@google.com>
Cc: peterz@infradead.org
Cc: zheng.z.yan@intel.com
Cc: bp@alien8.de
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/20140313120436.GA14236@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
--
2014-03-14 09:25:25 +01:00
Linus Torvalds
53611c0ce9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "I know this is a bit more than you want to see, and I've told the
  wireless folks under no uncertain terms that they must severely scale
  back the extent of the fixes they are submitting this late in the
  game.

  Anyways:

   1) vmxnet3's netpoll doesn't perform the equivalent of an ISR, which
      is the correct implementation, like it should.  Instead it does
      something like a NAPI poll operation.  This leads to crashes.

      From Neil Horman and Arnd Bergmann.

   2) Segmentation of SKBs requires proper socket orphaning of the
      fragments, otherwise we might access stale state released by the
      release callbacks.

      This is a 5 patch fix, but the initial patches are giving
      variables and such significantly clearer names such that the
      actual fix itself at the end looks trivial.

      From Michael S.  Tsirkin.

   3) TCP control block release can deadlock if invoked from a timer on
      an already "owned" socket.  Fix from Eric Dumazet.

   4) In the bridge multicast code, we must validate that the
      destination address of general queries is the link local all-nodes
      multicast address.  From Linus Lüssing.

   5) The x86 BPF JIT support for negative offsets puts the parameter
      for the helper function call in the wrong register.  Fix from
      Alexei Starovoitov.

   6) The descriptor type used for RTL_GIGA_MAC_VER_17 chips in the
      r8169 driver is incorrect.  Fix from Hayes Wang.

   7) The xen-netback driver tests skb_shinfo(skb)->gso_type bits to see
      if a packet is a GSO frame, but that's not the correct test.  It
      should use skb_is_gso(skb) instead.  Fix from Wei Liu.

   8) Negative msg->msg_namelen values should generate an error, from
      Matthew Leach.

   9) at86rf230 can deadlock because it takes the same lock from it's
      ISR and it's hard_start_xmit method, without disabling interrupts
      in the latter.  Fix from Alexander Aring.

  10) The FEC driver's restart doesn't perform operations in the correct
      order, so promiscuous settings can get lost.  Fix from Stefan
      Wahren.

  11) Fix SKB leak in SCTP cookie handling, from Daniel Borkmann.

  12) Reference count and memory leak fixes in TIPC from Ying Xue and
      Erik Hugne.

  13) Forced eviction in inet_frag_evictor() must strictly make sure all
      frags are deleted, otherwise module unload (f.e.  6lowpan) can
      crash.  Fix from Florian Westphal.

  14) Remove assumptions in AF_UNIX's use of csum_partial() (which it
      uses as a hash function), which breaks on PowerPC.  From Anton
      Blanchard.

      The main gist of the issue is that csum_partial() is defined only
      as a value that, once folded (f.e.  via csum_fold()) produces a
      correct 16-bit checksum.  It is legitimate, therefore, for
      csum_partial() to produce two different 32-bit values over the
      same data if their respective alignments are different.

  15) Fix endiannes bug in MAC address handling of ibmveth driver, also
      from Anton Blanchard.

  16) Error checks for ipv6 exthdrs offload registration are reversed,
      from Anton Nayshtut.

  17) Externally triggered ipv6 addrconf routes should count against the
      garbage collection threshold.  Fix from Sabrina Dubroca.

  18) The PCI shutdown handler added to the bnx2 driver can wedge the
      chip if it was not brought up earlier already, which in particular
      causes the firmware to shut down the PHY.  Fix from Michael Chan.

  19) Adjust the sanity WARN_ON_ONCE() in qdisc_list_add() because as
      currently coded it can and does trigger in legitimate situations.
      From Eric Dumazet.

  20) BNA driver fails to build on ARM because of a too large udelay()
      call, fix from Ben Hutchings.

  21) Fair-Queue qdisc holds locks during GFP_KERNEL allocations, fix
      from Eric Dumazet.

  22) The vlan passthrough ops added in the previous release causes a
      regression in source MAC address setting of outgoing headers in
      some circumstances.  Fix from Peter Boström"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (70 commits)
  ipv6: Avoid unnecessary temporary addresses being generated
  eth: fec: Fix lost promiscuous mode after reconnecting cable
  bonding: set correct vlan id for alb xmit path
  at86rf230: fix lockdep splats
  net/mlx4_en: Deregister multicast vxlan steering rules when going down
  vmxnet3: fix building without CONFIG_PCI_MSI
  MAINTAINERS: add networking selftests to NETWORKING
  net: socket: error on a negative msg_namelen
  MAINTAINERS: Add tools/net to NETWORKING [GENERAL]
  packet: doc: Spelling s/than/that/
  net/mlx4_core: Load the IB driver when the device supports IBoE
  net/mlx4_en: Handle vxlan steering rules for mac address changes
  net/mlx4_core: Fix wrong dump of the vxlan offloads device capability
  xen-netback: use skb_is_gso in xenvif_start_xmit
  r8169: fix the incorrect tx descriptor version
  tools/net/Makefile: Define PACKAGE to fix build problems
  x86: bpf_jit: support negative offsets
  bridge: multicast: enable snooping on general queries only
  bridge: multicast: add sanity check for general query destination
  tcp: tcp_release_cb() should release socket ownership
  ...
2014-03-13 20:38:36 -07:00
H. Peter Anvin
0b131be8d4 x86, intel: Make MSR_IA32_MISC_ENABLE bit constants systematic
Replace somewhat arbitrary constants for bits in MSR_IA32_MISC_ENABLE
with verbose but systematic ones.  Add _BIT defines for all the rest
of them, too.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:55:46 -07:00
Borislav Petkov
c0a639ad0b x86, Intel: Convert to the new bit access MSR accessors
... and save some lines of code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394384725-10796-4-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:35:09 -07:00
Borislav Petkov
8f86a7373a x86, AMD: Convert to the new bit access MSR accessors
... and save us a bunch of code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394384725-10796-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:35:03 -07:00
Borislav Petkov
22085a66c2 x86: Add another set of MSR accessor functions
We very often need to set or clear a bit in an MSR as a result of doing
some sort of a hardware configuration. Add generic versions of that
repeated functionality in order to save us a bunch of duplicated code in
the early CPU vendor detection/config code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394384725-10796-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:34:45 -07:00
Borislav Petkov
b82ad3d394 x86, pageattr: Correct WBINVD spelling in comment
It is WBINVD, for INValiDate and not "wbindv". Use caps for instruction
names, while at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394633584-5509-4-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:32:45 -07:00
Borislav Petkov
5314feebab x86, crash: Unify ifdef
Merge two back-to-back CONFIG_X86_32 ifdefs into one.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394633584-5509-3-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:32:44 -07:00
Borislav Petkov
3e920b532a x86, boot: Correct max ramdisk size name
The name in struct bootparam is ->initrd_addr_max and not ramdisk_max.
Fix that.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394633584-5509-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:32:42 -07:00
Linus Torvalds
18f2af2d68 The ARM patch fixes a build breakage with randconfig. The x86 one
fixes Windows guests on AMD processors.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTIJkTAAoJEBvWZb6bTYbyqMcP/1lQTqmo5rux8CBzW6QbVmDg
 VTof8XGqGbZdEZUymeSQZgemRMyXfxDXP3Mx+ayWvPCfpKRp4VUyOdQtxiF50fUY
 PgwamCcmlgCbn+S5nm7xjo6EyCZ5CH/5WT7/9evz6jRaVsNODbreqiRS6ERmIdPo
 qNOptOtOxPbojAhN3kpuKobvq98Go+YMZMswrC/shMKOFPSLZWYazNAyedbRXiGA
 UvcRKRtQI6LzSIzN/3mcDJW4gtX2StAlOgZFzOL2Ft+V8DdR+VTIT/TFRQQEdYWd
 o6E9Xw1Wu11H7uXuXzvkGvwunECEqLChI6UomLw8YdnnMwy4VivuOIiA48yNU5Nc
 QBhKS/HsZUMurv8EGRfuvQyL9NTwlrPnTBfwApn0MoEKWIjmMivb8HnIeUKsm/pq
 LE+/Kyahru5kWzwMG5xbUU9ET+QRGIYz/fIf9I1qpPeY1sXDrcEBzrvFmOAgJ9Yh
 oZpWEv9J6e7aJfVSRXwvYDmzNMpLTH2M/q/X/HlEah5CNOML9IUGS13eExqaI17U
 h+D32tzq/9XuIKMTawwCRrUHonlE9sMwCyFA371n0/tR48+carUtZ7dVG36b5FHK
 b0FTX8YkJcGITe4EuVBXuTbm3mA9wYhVKXo2yDElxfedeEEYR21/hiQoLBEQvjk8
 Df3qbm+bahCzh7k9ydKy
 =gJNC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "The ARM patch fixes a build breakage with randconfig.  The x86 one
  fixes Windows guests on AMD processors"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: fix cr8 intercept window
  ARM: KVM: fix non-VGIC compilation
2014-03-12 17:27:23 -07:00
Radim Krčmář
596f3142d2 KVM: SVM: fix cr8 intercept window
We always disable cr8 intercept in its handler, but only re-enable it
if handling KVM_REQ_EVENT, so there can be a window where we do not
intercept cr8 writes, which allows an interrupt to disrupt a higher
priority task.

Fix this by disabling intercepts in the same function that re-enables
them when needed. This fixes BSOD in Windows 2008.

Cc: <stable@vger.kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-03-12 18:21:10 +01:00
Stephane Eranian
4191c29f05 perf/x86/uncore: Fix compilation warning in snb_uncore_imc_init_box()
This patch fixes a compilation problem (unused variable) with the
new SNB/IVB/HSW uncore IMC code.

[ In -v2 we simplify the fix as suggested by Peter Zjilstra. ]

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140311235329.GA28624@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-12 10:49:13 +01:00
Alexei Starovoitov
fdfaf64e75 x86: bpf_jit: support negative offsets
Commit a998d43423 claimed to introduce negative offset support to x86 jit,
but it couldn't be working, since at the time of the execution
of LD+ABS or LD+IND instructions via call into
bpf_internal_load_pointer_neg_helper() the %edx (3rd argument of this func)
had junk value instead of access size in bytes (1 or 2 or 4).

Store size into %edx instead of %ecx (what original commit intended to do)

Fixes: a998d43423 ("bpf jit: Let the x86 jit handle negative offsets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jan Seiffert <kaffeemonster@googlemail.com>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 23:25:22 -04:00
Suresh Siddha
731bd6a93a x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU
For non-eager fpu mode, thread's fpu state is allocated during the first
fpu usage (in the context of device not available exception). This
(math_state_restore()) can be a blocking call and hence we enable
interrupts (which were originally disabled when the exception happened),
allocate memory and disable interrupts etc.

But the eager-fpu mode, call's the same math_state_restore() from
kernel_fpu_end(). The assumption being that tsk_used_math() is always
set for the eager-fpu mode and thus avoid the code path of enabling
interrupts, allocating fpu state using blocking call and disable
interrupts etc.

But the below issue was noticed by Maarten Baert, Nate Eldredge and
few others:

If a user process dumps core on an ecrypt fs while aesni-intel is loaded,
we get a BUG() in __find_get_block() complaining that it was called with
interrupts disabled; then all further accesses to our ecrypt fs hang
and we have to reboot.

The aesni-intel code (encrypting the core file that we are writing) needs
the FPU and quite properly wraps its code in kernel_fpu_{begin,end}(),
the latter of which calls math_state_restore(). So after kernel_fpu_end(),
interrupts may be disabled, which nobody seems to expect, and they stay
that way until we eventually get to __find_get_block() which barfs.

For eager fpu, most the time, tsk_used_math() is true. At few instances
during thread exit, signal return handling etc, tsk_used_math() might
be false.

In kernel_fpu_end(), for eager-fpu, call math_state_restore()
only if tsk_used_math() is set. Otherwise, don't bother. Kernel code
path which cleared tsk_used_math() knows what needs to be done
with the fpu state.

Reported-by: Maarten Baert <maarten-baert@hotmail.com>
Reported-by: Nate Eldredge <nate@thatsmathematics.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/1391410583.3801.6.camel@europa
Cc: George Spelvin <linux@horizon.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-11 12:32:52 -07:00
Dave Jones
09df7c4c80 x86: Remove CONFIG_X86_OOSTORE
This was an optimization that made memcpy type benchmarks a little
faster on ancient (Circa 1998) IDT Winchip CPUs.  In real-life
workloads, it wasn't even noticable, and I doubt anyone is running
benchmarks on 16 year old silicon any more.

Given this code has likely seen very little use over the last decade,
let's just remove it.

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-11 10:16:18 -07:00
Bjorn Helgaas
36fc5500bb sched: Remove unused mc_capable() and smt_capable()
Remove mc_capable() and smt_capable().  Neither is used.

Both were added by 5c45bf279d ("sched: mc/smt power savings sched
policy").  Uses of both were removed by 8e7fbcbc22 ("sched: Remove stale
power aware scheduling remnants and dysfunctional knobs").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20140304210737.16893.54289.stgit@bhelgaas-glaptop.roam.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-11 12:05:45 +01:00