Commit Graph

100065 Commits

Author SHA1 Message Date
Alexei Starovoitov
72b603ee8c bpf: x86: add missing 'shift by register' instructions to x64 eBPF JIT
'shift by register' operations are supported by eBPF interpreter, but were
accidently left out of x64 JIT compiler. Fix it and add a testcase.

Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Fixes: 622582786c ("net: filter: x86: internal BPF JIT")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25 17:33:56 -07:00
Linus Torvalds
26d189b82f arm64 fixes for merge window fallout
This small set of fixes addresses a few issues introduced during the
 merge window, including:
 
   - Fix typo in I-cache detection that was causing us to treat all
     I-caches as aliasing
   - Hook up memfd_create and getrandom syscalls for native and compat
   - Revert a temporary hack for defconfig builds in -next (the audit
     tree changes didn't make it in this merge window)
   - A couple of UEFI fixes for TEXT_OFFSET fuzzing and /memreserve/
   - A simple sparsemem fix for 48-bit physical addressing
   - Small defconfig updates to get autotesters working with X-gene
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJT9j4lAAoJEC379FI+VC/ZjVAP/jxBpU5V8i4aUiVVbG4efcTL
 elpu74kPek4LgyI55YZPdxo6Hc5BX/STC06hq6s//vFwSNf0tS+12sOCMpjHkhfU
 oRkuQR4TDgTAo4BvrONHYbwSyQclWS+VhKXp4NwTTIwmA5Mg9jgRQCp+VHGTPzZC
 UK3nObqVVVZh6BHzjEIMfy25mkChULdm8A6IG8ciNSrHxKVNMTAJTE1eKE/92ccM
 cs18zlymRPl02amf/tUgTFxHBgRoFZcaFpM7Qge3AUZY95U70kpEiPGLQNfJ2UBH
 NfcWr9LkdMGXULYdfO98akn7boIFW7W+BuHASCfp+di4odcQDjYQvDAvg1NZ3OCi
 C3IyISYyxBjP9kgShXrIAeDtDP9vjy6aj/E/rhq3ByNSewdP+f9roVq1oapuNs9F
 lAerIiqC8+LMLBan0EOuyaj3MOEGUh26qOUaKY8HEmlDXDJD5ceGsB3dROAx2mtV
 FhU7kqp717ETCloxtmJuZUrIlL+e3SZznnT4aCduL3/S9Jd5wdBTCUdqEoGnwBy/
 rkSjaP9D4QbBP3vMPaK7ANv2aRpjo3U2AuBsqL7iSjdKjj+IqQfNn5qIsTFNfVup
 zOUcecVSdntZj19ejbUj7+TGxEV+9er+HPEtqKo7Pii6zl+Mf8HsqsScAq1FZbw/
 2r/W9FODi0gN4WgKnNb1
 =BqdY
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "This small set of fixes addresses a few issues introduced during the
  merge window, including:

   - fix typo in I-cache detection that was causing us to treat all
     I-caches as aliasing
   - hook up memfd_create and getrandom syscalls for native and compat
   - revert a temporary hack for defconfig builds in -next (the audit
     tree changes didn't make it in this merge window)
   - a couple of UEFI fixes for TEXT_OFFSET fuzzing and /memreserve/
   - a simple sparsemem fix for 48-bit physical addressing
   - small defconfig updates to get autotesters working with X-gene"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
  arm64: mm: update max pa bits to 48
  arm64: ignore DT memreserve entries when booting in UEFI mode
  arm64: configs: Enable X-Gene SATA and ethernet in defconfig
  arm64: align randomized TEXT_OFFSET on 4 kB boundary
  asm-generic: add memfd_create system call to unistd.h
  arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
  arm64: fix typo in I-cache policy detection
2014-08-22 09:08:20 -07:00
Linus Torvalds
d1433d55c7 Add memfd_create syscall to ia64
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT8j/FAAoJEKurIx+X31iBt9wP/Ro7VFSF2MD23ahFY/CEl+1t
 iJGNr3/1x+UcfYy6MK3L9xPJoSIu42aSDP00oExE45ThQZsdHD+gx0mwoW3GSoPk
 jWmXAbJLzXwRljAGvlu2ppecLxauPX3Lh14iCbRR2CdtQhCTSrsSwIuG/+iikv5X
 BAb14ovIjqhNDeZDIcUr1Mc9lAanDjIcvxbnV94el27LJ48sWgjSPCx00JQk3lo5
 +U1EJ9Ae66ARbtSOqfnv4MClT41iVwAWKtmraGS+f85/CKpWmKyTrEyMyqdO8fyO
 aJn6tS8d43rT/9CqVxKeDXk/Ltmthlj+aJKz5LEalamE7auWAp+egE8fBH7xdag5
 RJqr0oUyrPGSRr/KM+O0sfHTXBTC8UX5O83xuBD3ch9TEgL3LQ9J7ng3blMeaer1
 FnnAUwjoQo61fmsc0M8IJHo6OvOfMx9ekzU3uZr0eVlg6GUC/OBYh44v+zeYI+/t
 Z/m6H4ChOsL4+Ftsyb8ZvMswCD9UW1nQpqAlDhN9HevXgTgZDIERkAInallruy6P
 Cwve0eIDtJp9cIxp+nx6V8rw4Vl1Gdf43vQJzSmsbZP/qpOYW1a9f8wNBQ/Stz3P
 Od+8j+DYbS4fOZYPQ6lGXSKnSOfBBUMhwBTg6X1Y+Qyxd71er+Sm1fkcsg97cXQk
 wQua5ovu3mgCkPHW2ApM
 =/GJm
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 update from Tony Luck:
 "Add memfd_create syscall to ia64"

* tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Wire up memfd_create() system call
2014-08-21 14:06:56 -07:00
Linus Torvalds
f8d08a1bb4 Microblaze patches for 3.17-rc2
- Wire-up seccomp/getrandom/memfd_create syscalls
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iEYEABECAAYFAlP1rDwACgkQykllyylKDCG+8gCfQPCI3+UBtiLrhY0RLSqZiUs1
 De4An2BtpNkS96vtBZBNJzR1cmbP6Vtb
 =BC1f
 -----END PGP SIGNATURE-----

Merge tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze update from Michal Simek:
 "Wire-up seccomp/getrandom/memfd_create syscalls"

* tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Wire-up memfd_create syscall
  microblaze: Wire-up getrandom syscall
  microblaze: Wire-up seccomp syscall
2014-08-21 14:06:18 -07:00
Michal Simek
83c43c498a microblaze: Wire-up memfd_create syscall
Add new memfd_create syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2014-08-21 10:19:28 +02:00
Michal Simek
53133453a9 microblaze: Wire-up getrandom syscall
Add new getrandom syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2014-08-21 10:07:05 +02:00
Michal Simek
b760949144 microblaze: Wire-up seccomp syscall
Add new seccomp syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
2014-08-21 10:07:04 +02:00
Linus Torvalds
e9de42d8ee Reverting a 3.16 patch, fixing two bugs in device assignment
(one has a CVE), and fixing some problems introduced during the merge window
 (the CMA bug came in via Andrew, the x86 ones via yours truly).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT9IfLAAoJEBvWZb6bTYbyAG8P/2GLPast76I9Pc269UNACV0f
 vNgJfSAH97PrEQtVzCurqb0RKHMKcZ5XyYmKh5TvzlbWYXnqJuJr5TrIh0gsuxn9
 DaBKVgeXBTd43OCRXJKw6SgkKlnf+yfQeASLRwjQgVCqsvNR/rKksEPjAhVqQJIJ
 PlRYKeBc7SA8bPUG64GDtF3yP9e/KG5ItGudj4eUADtadPmyldJbTWl0zLwY7jvJ
 /qcSxRgwqUsIS0c8xE5rlByxuWQ43RF+MfohNttNUjXD/dhvJo07NpkPUS6TsqHf
 x1VyWPuIY1zB/WghKutI8oZxS14iUs1l0LL9egS7fc4sYQqQ7+HHLaJnEMloTXqF
 GYfwmnyz53ocR1M4dgCPyBi0uxM3ydRzbSnsToR2kzVdS3WKu5O8GfjkE2zooEaA
 OP77OsSxtl5mLD68ZtubmLt8ttYCiWOEIOzviUSoJjPv0gUE07oAjecp7C8nKDCP
 lUxM2JZ01SLSzRf3uSlrNfRpeyMWVmYhyiG3lqLmph9FfP7p4donbIdh/QA0W7Nj
 E2GEEv3lCUZp7+TnOydsiWNVwv026dDanh5QLSuCvfCqf+xhNMJbrRzlbUpfGAsm
 89XasETdOnAqIO9VOOBLAKE2wrMEx+9vT2G0Dv3e+3IedGwLuM7/53X4zXUIB8ys
 L9C7kZwci9+X3qIExWJI
 =lhLd
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Reverting a 3.16 patch, fixing two bugs in device assignment (one has
  a CVE), and fixing some problems introduced during the merge window
  (the CMA bug came in via Andrew, the x86 ones via yours truly)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  virt/kvm/assigned-dev.c: Set 'dev->irq_source_id' to '-1' after free it
  Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"
  KVM: x86: do not check CS.DPL against RPL during task switch
  KVM: x86: Avoid emulating instructions on #UD mistakenly
  PC, KVM, CMA: Fix regression caused by wrong get_order() use
  kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)
2014-08-20 18:22:10 -05:00
Will Deacon
44b375070f Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
For some reason, the audit patches didn't make it out of -next this
merge window, so revert our temporary hack and let the audit guys deal
with fixing up -next.

This reverts commit 2a8f45b040.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-19 22:05:45 +01:00
Ganapatrao Kulkarni
07a15dd55a arm64: mm: update max pa bits to 48
Now that we support 48-bit physical addressing, update MAX_PHYSMEM_BITS
accordingly.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-19 20:23:02 +01:00
Leif Lindholm
86c8b27a01 arm64: ignore DT memreserve entries when booting in UEFI mode
UEFI provides its own method for marking regions to reserve, via the
memory map which is also used to initialise memblock. So when using the
UEFI memory map, ignore any memreserve entries present in the DT.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-19 20:22:03 +01:00
Mark Brown
49d947face arm64: configs: Enable X-Gene SATA and ethernet in defconfig
Currently when run on an APM platform the ARMv8 defconfig has no viable
options for rootfs other than ramdisk which is rather limiting. Since
we already have both SATA and the bits needed for NFS root enabled we just
need to enable the relevant drivers so do that, helping enable direct
testing of upstream.

If the configuration ends up becoming too big we can consider modularising
some of the drivers and asking people to use an initramfs but for now this
is not an issue.

Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-19 19:26:09 +01:00
Ard Biesheuvel
4190312beb arm64: align randomized TEXT_OFFSET on 4 kB boundary
When booting via UEFI, the kernel Image is loaded at a 4 kB boundary and
the embedded EFI stub is executed in place. The EFI stub relocates the
Image to reside TEXT_OFFSET bytes above a 2 MB boundary, and jumps into
the kernel proper.

In AArch64, PC relative symbol references are emitted using adrp/add or
adrp/ldr pairs, where the offset into a 4 kB page is resolved using a
separate :lo12: relocation. This implicitly assumes that the code will
always be executed at the same relative offset with respect to a 4 kB
boundary, or the references will point to the wrong address.

This means we should link the kernel at a 4 kB aligned base address in
order to remain compatible with the base address the UEFI loader uses
when doing the initial load of Image. So update the code that generates
TEXT_OFFSET to choose a multiple of 4 kB.

At the same time, update the code so it chooses from the interval [0..2MB)
as the author originally intended.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-19 19:26:09 +01:00
Davidlohr Bueso
f325f1643a frv: Define cpu_relax_lowlatency()
3a6bfbc91d "(arch,locking: Ciao arch_mutex_cpu_relax()") broke
building the frv arch.  Fixes errors such as:

  kernel/locking/mcs_spinlock.h:87:2: error: implicit declaration of function 'cpu_relax_lowlatency'

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Compile-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-19 09:40:08 -05:00
Paolo Bonzini
0d234daf7e Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"
This reverts commit 682367c494,
which causes 32-bit SMP Windows 7 guests to panic.

SeaBIOS has a limit on the number of MTRRs that it can handle,
and this patch exceeded the limit.  Better revert it.
Thanks to Nadav Amit for debugging the cause.

Cc: stable@nongnu.org
Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-19 15:12:28 +02:00
Paolo Bonzini
9a4cfb27f7 KVM: x86: do not check CS.DPL against RPL during task switch
This reverts the check added by commit 5045b46803 (KVM: x86: check CS.DPL
against RPL during task switch, 2014-05-15).  Although the CS.DPL=CS.RPL
check is mentioned in table 7-1 of the SDM as causing a #TSS exception,
it is not mentioned in table 6-6 that lists "invalid TSS conditions"
which cause #TSS exceptions. In fact it causes some tests to fail, which
pass on bare-metal.

Keep the rest of the commit, since we will find new uses for it in 3.18.

Reported-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-19 15:12:28 +02:00
Nadav Amit
3a6095a017 KVM: x86: Avoid emulating instructions on #UD mistakenly
Commit d40a6898e5 mistakenly caused instructions which are not marked as
EmulateOnUD to be emulated upon #UD exception. The commit caused the check of
whether the instruction flags include EmulateOnUD to never be evaluated. As a
result instructions whose emulation is broken may be emulated.  This fix moves
the evaluation of EmulateOnUD so it would be evaluated.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Tweak operand order in &&, remove EmulateOnUD where it's now superfluous.
 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-19 15:12:28 +02:00
Alexey Kardashevskiy
c04fa5831d PC, KVM, CMA: Fix regression caused by wrong get_order() use
fc95ca7284 claims that there is no
functional change but this is not true as it calls get_order() (which
takes bytes) where it should have called order_base_2() and the kernel
stops on VM_BUG_ON().

This replaces get_order() with order_base_2() (round-up version of ilog2).

Suggested-by: Paul Mackerras <paulus@samba.org>
Cc: Alexander Graf <agraf@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-19 15:11:57 +02:00
Will Deacon
a97a42c476 arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
arch/arm/ just grew support for the new memfd_create and getrandom
syscalls, so add them to our compat layer too.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-18 19:47:04 +01:00
Ard Biesheuvel
a3a80544ac arm64: fix typo in I-cache policy detection
This removes an unfortunately placed semi-colon resulting in all instruction
caches being classified as AIVIVT.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-08-18 19:47:03 +01:00
Tony Luck
703e6a6ed6 [IA64] Wire up memfd_create() system call
Yet another system call. This one added by:

   commit 9183df25fe
   shm: add memfd_create() syscall

Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-08-18 10:29:52 -07:00
Linus Torvalds
49899007b9 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull idle update from Len Brown:
 "Two Intel-platform-specific updates to intel_idle, and a cosmetic
  tweak to the turbostat utility"

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: tweak whitespace in output format
  intel_idle: Broadwell support
  intel_idle: Disable Baytrail Core and Module C6 auto-demotion
2014-08-16 09:25:34 -06:00
Len Brown
8c058d53f6 intel_idle: Disable Baytrail Core and Module C6 auto-demotion
Power efficiency improves on Baytrail (Intel Atom Processor E3000)
when Linux disables C6 auto-demotion.

Based on work by Srinidhi Kasagar <srinidhi.kasagar@intel.com>.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
2014-08-15 17:06:14 -04:00
Linus Torvalds
a11c5c9ef6 PCI changes for the v3.17 merge window (part 2):
Miscellaneous
     - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT7PyAAAoJEFmIoMA60/r8kjQQALr/8oEfZoVcjgCb7waWOr25
 hUTnrI6GBIAh/50hoBiPq0ouPCAKVv66+CUhuhFkLP7oJz+rMU0B9hfUvdLfmCpH
 7ppaallkllT9nPFIr7h5RUWLXsoQyuHmCYmSrUCcnlT2LPgU0dN72YWElLisEM6Z
 Pldg3933xyIQaCWviHjGEjWb7NvC+JY4pTkV5iyqGgU8Ale/eFYtLLSfdBEjIbGv
 VDirYZmKELYeuncZPrTAsp4IENRMZn702wwDakMSODVMEWtJB5h4yrBawqQDlFP5
 9ztIX6n9p9zkdVKbYZlx/Xwv6SYEnYXLxauVQMSO3Nck7Z10R5Ud+5uuCg/6mWH8
 AQI4UV5bbJcg7zHgocTG9XLFLFPoPtD2JT6k6UT1LeUAiAOqcSzhRO+/qJBmJOWZ
 Zv+EHXPlxBrl0zNifut6ZQrY17teuItVtmha70a/9W3PjnIx3KecqLcTwdTvDsOY
 IAyH8WMZrBKpPpsczSmfE93i2Z1QRS91HEAOeSMxl/98dcDTdllYZS7spjoDll2f
 xmpGDbpriLSCu2XsGHfTC9RbqA7CyuFlHggJSQDkT/5Esli0sCs7eweTuK3RVvPu
 t6bUHK3yElb6x9qMZhb5q6l72wSMlGMishTdaxEHmqrEA8PtaIFodmVX2T/Zel5n
 GHN6bysPqDItNR2v/3JX
 =jJGu
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas:
 "Part two of the PCI changes for v3.17:

    - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine)

  It's a mechanical change that removes uses of the
  DEFINE_PCI_DEVICE_TABLE macro.  I waited until later in the merge
  window to reduce conflicts, but it's possible you'll still see a few"

* tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
2014-08-14 18:10:33 -06:00
Linus Torvalds
179c0ac67b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull Sparc fixes from David Miller:
 "Hook up the memfd syscall, and properly claim all PCI resources
  discovered when building the PCI device tree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Hook up memfd_create system call.
  sparc64: Properly claim resources as each PCI bus is probed.
  sparc64: Skip bogus PCI bridge ranges.
  sparc64: Expand PCI bridge probing debug logging.
2014-08-14 17:28:08 -06:00
Linus Torvalds
3b7b3e6ec5 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:
 - make clean also considers $(extra-m) and $(extra-) to be consistent
 - cleanup and fixes in scripts/Makefile.host
 - allow to override the name of the Python 2 executable with make
   PYTHON=... (only needed for ia64 in practice)
 - option to split debugingo into *.dwo files to save disk space if the
   compiler supports it (CONFIG_DEBUG_INFO_SPLIT)
 - option to use dwarf4 debuginfo if the compiler supports it
   (CONFIG_DEBUG_INFO_DWARF4)
 - fix for disabling certain warnings with clang
 - fix for unneeded rebuild with dash when a command contains
   backslashes

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: Fix handling of backslashes in *.cmd files
  kbuild, LLVMLinux: Supress warnings unless W=1-3
  Kbuild: Add a option to enable dwarf4 v2
  kbuild: Support split debug info v4
  kbuild: allow to override Python command name
  kbuild: clean-up and bug fix of scripts/Makefile.host
  kbuild: clean up scripts/Makefile.host
  kbuild: drop shared library support from Makefile.host
  kbuild: fix a bug of C++ host program handling
  kbuild: fix a typo in scripts/Makefile.host
  scripts/Makefile.clean: clean also $(extra-m) and $(extra-)
2014-08-14 11:12:46 -06:00
Linus Torvalds
1d508f8ace Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull more powerpc updates from Ben Herrenschmidt:
 "Here are some more powerpc bits for 3.17, essentially fixes.

  The biggest series, also aimed at -stable, is from Aneesh and is the
  result of weeks and weeks of debugging to find out why the heck or THP
  implementation was occasionally triggering multi-hit errors in our
  level 1 TLB.  It ended up being a combination of issues including
  subtleties as to how we should invalidate those special 'MPSS' pages
  we use to allow the use of 16M pages inside 4K/64K "base page size"
  segments (you really have to love our MMU !)

  Another interesting one in the "OMG" category is the series from
  Michael adding memory barriers to spin_is_locked().  That's also the
  result of many days of debugging to figure out why the semaphore code
  would occasionally crash in ways that made no sense.  It ended up
  being some creative lock stacking that was defeated by the fact that
  our locks allow a load inside the locked section to be re-ordered with
  the load of the lock value itself (I'm still of two mind about whether
  to kill that once and for all by putting a heavier barrier back into
  our lock implementation...).  The fixes come with a long explanation
  in the cset comments, feel free to read it if you feel like having a
  headache today"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
  powerpc/thp: Add tracepoints to track hugepage invalidate
  powerpc/mm: Use read barrier when creating real_pte
  powerpc/thp: Use ACCESS_ONCE when loading pmdp
  powerpc/thp: Invalidate with vpn in loop
  powerpc/thp: Handle combo pages in invalidate
  powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
  powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
  powerpc/thp: Add write barrier after updating the valid bit
  powerpc: reorder per-cpu NUMA information's initialization
  powerpc/perf/hv-24x7: Use kmem_cache_free
  powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
  powerpc: Hard disable interrupts in xmon
  powerpc: remove duplicate definition of TEXASR_FS
  powerpc/pseries: Avoid deadlock on removing ddw
  powerpc/pseries: Failure on removing device node
  powerpc/boot: Use correct zlib types for comparison
  powerpc/powernv: Interface to register/unregister opal dump region
  printk: Add function to return log buffer address and size
  powerpc: Add POWER8 features to CPU_FTRS_POSSIBLE/ALWAYS
  powerpc/ppc476: Disable BTAC
  ...
2014-08-14 10:14:07 -06:00
Linus Torvalds
ae36e95cf8 The branch contains the following device tree changes the v3.17 merge
window:
 
 Group changes to the device tree. In preparation for adding device tree
 overlay support, OF_DYNAMIC is reworked so that a set of device tree
 changes can be prepared and applied to the tree all at once. OF_RECONFIG
 notifiers see the most significant change here so that users always get
 a consistent view of the tree. Notifiers generation is moved from before
 a change to after it, and notifiers for a group of changes are emitted
 after the entire block of changes have been applied
 
 Automatic console selection from DT. Console drivers can now use
 of_console_check() to see if the device node is specified as a console
 device. If so then it gets added as a preferred console. UART devices
 get this support automatically when uart_add_one_port() is called.
 
 DT unit tests no longer depend on pre-loaded data in the device tree.
 Data is loaded dynamically at the start of unit tests, and then unloaded
 again when the tests have completed.
 
 Also contains a few bugfixes for reserved regions and early memory setup.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT6M7XAAoJEMWQL496c2LNTTgP/2rXyrTTGZpK/qrLKHWKYHvr
 XL7tcTkhA0OLU64E37fB+xtDXyBYnLsharuwUFSd1LlL1Wnc1cZN5ORRlMJbmjUR
 Wvwl0A8/mkhGl4tzzKgJ4z4rMJXvlZfJRpnVoRB5FOn90LI7k/jsf5rIwF/6S90B
 6D6II0r4RG9ku1m7g70cToxcIFCzp0V+eu2tym9GnhsyGKlunPT9iNiTpwfVhPAj
 QUvMPKIQXReOv6xDU5q6E07839IMf7SdAvciBTHGnCDD7sGziHvnBIShj/2vTDAF
 27sGRKrWUnnuxRUMOoIudiYyeHXIdt1WXp6FsS/ztVI37Ijh9YPShJwWGhQDppnp
 4tcoSdefqw9IRUajAVWsB0RUW/tCL4a7tggWofylOA6itDi+HZnw6YafE1G1YzxF
 q8OFo9uqLcmFQfHDJpk+sdtXoMZzOgrxlEscsGsQ8kd2Uoe8+chgR9EY1sqPkWVF
 Zw0FJCTB6spBlsnXeboBGrnvpbPkacwhvesIFO0IANy4j4R55xlEeTcs1fe3ubUf
 UhNyyFdnCAUA7e5CAabcAQYsdbEKG6v0Il3H6xts36c8lXCSFXVgNcw5mdCpFCcQ
 DZ3/1FGSVzkYNXX8hlzIn1W0dqvwn4x0ZbnpXUJrGUBpSmjt9qkXx0XbdeFwartU
 7X4tNGS1jfNYhOdP87jF
 =8IFz
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux

Pull device tree updates from Grant Likely:
 "The branch contains the following device tree changes the v3.17 merge
  window:

  Group changes to the device tree.  In preparation for adding device
  tree overlay support, OF_DYNAMIC is reworked so that a set of device
  tree changes can be prepared and applied to the tree all at once.
  OF_RECONFIG notifiers see the most significant change here so that
  users always get a consistent view of the tree.  Notifiers generation
  is moved from before a change to after it, and notifiers for a group
  of changes are emitted after the entire block of changes have been
  applied

  Automatic console selection from DT.  Console drivers can now use
  of_console_check() to see if the device node is specified as a console
  device.  If so then it gets added as a preferred console.  UART
  devices get this support automatically when uart_add_one_port() is
  called.

  DT unit tests no longer depend on pre-loaded data in the device tree.
  Data is loaded dynamically at the start of unit tests, and then
  unloaded again when the tests have completed.

  Also contains a few bugfixes for reserved regions and early memory
  setup"

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux: (21 commits)
  of: Fixing OF Selftest build error
  drivers: of: add automated assignment of reserved regions to client devices
  of: Use proper types for checking memory overflow
  of: typo fix in __of_prop_dup()
  Adding selftest testdata dynamically into live tree
  of: Add todo tasklist for Devicetree
  of: Transactional DT support.
  of: Reorder device tree changes and notifiers
  of: Move dynamic node fixups out of powerpc and into common code
  of: Make sure attached nodes don't carry along extra children
  of: Make devicetree sysfs update functions consistent.
  of: Create unlocked versions of node and property add/remove functions
  OF: Utility helper functions for dynamic nodes
  of: Move CONFIG_OF_DYNAMIC code into a separate file
  of: rename of_aliases_mutex to just of_mutex
  of/platform: Fix of_platform_device_destroy iteration of devices
  of: Migrate of_find_node_by_name() users to for_each_node_by_name()
  tty: Update hypervisor tty drivers to use core stdout parsing code.
  arm/versatile: Add the uart as the stdout device.
  of: Enable console on serial ports specified by /chosen/stdout-path
  ...
2014-08-14 09:53:39 -06:00
Linus Torvalds
4dacb91c7d Xen bug fixes for 3.17-rc0
- Fix ARM build.
 - Fix boot crash with PVH guests.
 - Improve reliability of resume/migration.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJT6y4zAAoJEFxbo/MsZsTRqI4H/3c+TXRjeB5787wkm3Lya1JO
 5+DsbbtlM13PT+Oi0Zop7xUYMPya7Xkwix0FvFvJ/q0/NFMf6RDJREtRy7atRkkp
 IRKccsGA/OH0ETcmQJOGHErnK51cAjMrm+ilJHIIYd6btEVCuP212G9f8CVGMyIK
 r0U6sVUxTJ5TPOLjvv5e8Rz+iTtP+OXvxaf2+rcqAf1QAJJ6S6+ipJZjZemX8OR/
 IoUPiee5Ou/ogQekt0f473vJrxX5nP7wLdJ2F7YA/A6F/HQHNbYDgQXf2IXrqFmb
 DpkpXxaowZGKXwnwqHUr3pVDdeMpvYJKZDPZmTQq2htrXufHIiY/E3qr9wGjqxk=
 =iVy1
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.17-b-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen bugfixes from David Vrabel:
 - fix ARM build
 - fix boot crash with PVH guests
 - improve reliability of resume/migration

* tag 'stable/for-linus-3.17-b-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: use vmap() to map grant table pages in PVH guests
  x86/xen: resume timer irqs early
  arm/xen: remove duplicate arch_gnttab_init() function
2014-08-14 09:40:44 -06:00
David S. Miller
10cf15e1d1 sparc: Hook up memfd_create system call.
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 22:00:09 -07:00
David S. Miller
f1d25d37d3 sparc64: Properly claim resources as each PCI bus is probed.
Perform a pci_claim_resource() on all valid resources discovered
during the OF device tree scan.

Based almost entirely upon the PCI OF bus probing code which does
the same thing there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 21:17:49 -07:00
David S. Miller
4afba24e5f sparc64: Skip bogus PCI bridge ranges.
It seems that when a PCI Express bridge is not in use and has no devices
behind it, the ranges property is bogus.  Specifically the size property
is of the form [0xffffffff:...], and if you add this size to the resource
start address the 64-bit calculation will overflow.

Just check specifically for this size value signature and skip them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 21:17:49 -07:00
David S. Miller
93a6423bd8 sparc64: Expand PCI bridge probing debug logging.
Dump the various aspects of the PCI bridge probed at boot time, most
importantly the bridge number ranges, and the ranges property.

This helps diagnose PCI resource issues and other problems by giving
ofpci_debug=1 on the boot command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 21:17:49 -07:00
Linus Torvalds
f0094b28f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Several networking final fixes and tidies for the merge window:

   1) Changes during the merge window unintentionally took away the
      ability to build bluetooth modular, fix from Geert Uytterhoeven.

   2) Several phy_node reference count bug fixes from Uwe Kleine-König.

   3) Fix ucc_geth build failures, also from Uwe Kleine-König.

   4) Fix klog false positivies when netlink messages go to network
      taps, by properly resetting the network header.  Fix from Daniel
      Borkmann.

   5) Sizing estimate of VF netlink messages is too small, from Jiri
      Benc.

   6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.

   7) VLAN untagging is erroneously dependent upon whether the VLAN
      module is loaded or not, but there are generic dependencies that
      matter wrt what can be expected as the SKB enters the stack.
      Make the basic untagging generic code, and do it unconditionally.
      From Vlad Yasevich.

   8) xen-netfront only has so many slots in it's transmit queue so
      linearize packets that have too many frags.  From Zoltan Kiss.

   9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian
      Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits)
  net: bcmgenet: correctly resume adapter from Wake-on-LAN
  net: bcmgenet: update UMAC_CMD only when link is detected
  net: bcmgenet: correctly suspend and resume PHY device
  net: bcmgenet: request and enable main clock earlier
  net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call
  xen-netfront: Fix handling packets on compound pages with skb_linearize
  net: fec: Support phys probed from devicetree and fixed-link
  smsc: replace WARN_ON() with WARN_ON_SMP()
  xen-netback: Don't deschedule NAPI when carrier off
  net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile
  wan: wanxl: Remove typedefs from struct names
  m68k/atari: EtherNEC - ethernet support (ne)
  net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call
  hdlc: Remove typedefs from struct names
  airo_cs: Remove typedef local_info_t
  atmel: Remove typedef atmel_priv_ioctl
  com20020_cs: Remove typedef com20020_dev_t
  ethernet: amd: Remove typedef local_info_t
  net: Always untag vlan-tagged traffic on input.
  drivers: net: Add APM X-Gene SoC ethernet driver support.
  ...
2014-08-13 18:27:40 -06:00
Linus Torvalds
13b102bf48 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull Sparc fixes from David Miller:
 "Sparc bug fixes, one of which was preventing successful SMP boots with
  mainline"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix pcr_ops initialization and usage bugs.
  sparc64: Do not disable interrupts in nmi_cpu_busy()
  sparc: Hook up seccomp and getrandom system calls.
  sparc: fix decimal printf format specifiers prefixed with 0x
2014-08-13 18:26:50 -06:00
Linus Torvalds
81c02a21b2 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic updates from Thomas Gleixner:
 "This is a major overhaul to the x86 apic subsystem consisting of the
  following parts:

   - Remove obsolete APIC driver abstractions (David Rientjes)

   - Use the irqdomain facilities to dynamically allocate IRQs for
     IOAPICs.  This is a prerequisite to enable IOAPIC hotplug support,
     and it also frees up wasted vectors (Jiang Liu)

   - Misc fixlets.

  Despite the hickup in Ingos previous pull request - caused by the
  missing fixup for the suspend/resume issue reported by Borislav - I
  strongly recommend that this update finds its way into 3.17.  Some
  history for you:

  This is preparatory work for physical IOAPIC hotplug.  The first
  attempt to support this was done by Yinghai and I shot it down because
  it just added another layer of obscurity and complexity to the already
  existing mess without tackling the underlying shortcomings of the
  current implementation.

  After quite some on- and offlist discussions, I requested that the
  design of this functionality must use generic infrastructure, i.e.
  irq domains, which provide all the mechanisms to dynamically map linux
  interrupt numbers to physical interrupts.

  Jiang picked up the idea and did a great job of consolidating the
  existing interfaces to manage the x86 (IOAPIC) interrupt system by
  utilizing irq domains.

  The testing in tip, Linux-next and inside of Intel on various machines
  did not unearth any oddities until Borislav exposed it to one of his
  oddball machines.  The issue was resolved quickly, but unfortunately
  the fix fell through the cracks and did not hit the tip tree before
  Ingo sent the pull request.  Not entirely Ingos fault, I also assumed
  that the fix was already merged when Ingo asked me whether he could
  send it.

  Nevertheless this work has a proper design, has undergone several
  rounds of review and the final fallout after applying it to tip and
  integrating it into Linux-next has been more than moderate.  It's the
  ground work not only for IOAPIC hotplug, it will also allow us to move
  the lowlevel vector allocation into the irqdomain hierarchy, which
  will benefit other architectures as well.  Patches are posted already,
  but they are on hold for two weeks, see below.

  I really appreciate the competence and responsiveness Jiang has shown
  in course of this endavour.  So I'm sure that any fallout of this will
  be addressed in a timely manner.

  FYI, I'm vanishing for 2 weeks into my annual kids summer camp kitchen
  duty^Wvacation, while you folks are drooling at KS/LinuxCon :) But HPA
  will have a look at the hopefully zero fallout until I'm back"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation
  x86/apic/vsmp: Make is_vsmp_box() static
  x86, apic: Remove enable_apic_mode callback
  x86, apic: Remove setup_portio_remap callback
  x86, apic: Remove multi_timer_check callback
  x86, apic: Replace noop_check_apicid_used
  x86, apic: Remove check_apicid_present callback
  x86, apic: Remove mps_oem_check callback
  x86, apic: Remove smp_callin_clear_local_apic callback
  x86, apic: Replace trampoline physical addresses with defaults
  x86, apic: Remove x86_32_numa_cpu_node callback
  x86: intel-mid: Use the new io_apic interfaces
  x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
  x86, irq: Clean up irqdomain transition code
  x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled
  x86, irq, SFI: Release IOAPIC pin when PCI device is disabled
  x86, irq, mpparse: Release IOAPIC pin when PCI device is disabled
  x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled
  x86, irq: Introduce helper functions to release IOAPIC pin
  x86, irq: Simplify the way to handle ISA IRQ
  ...
2014-08-13 18:23:32 -06:00
Linus Torvalds
d27c0d9018 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/efix fixes from Peter Anvin:
 "Two EFI-related Kconfig changes, which happen to touch immediately
  adjacent lines in Kconfig and thus collapse to a single patch"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
  x86/efi: Fix 3DNow optimization build failure in EFI stub
2014-08-13 18:21:35 -06:00
Linus Torvalds
7453f33b2e Merge branch 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/xsave changes from Peter Anvin:
 "This is a patchset to support the XSAVES instruction required to
  support context switch of supervisor-only features in upcoming
  silicon.

  This patchset missed the 3.16 merge window, which is why it is based
  on 3.15-rc7"

* 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, xsave: Add forgotten inline annotation
  x86/xsaves: Clean up code in xstate offsets computation in xsave area
  x86/xsave: Make it clear that the XSAVE macros use (%edi)/(%rdi)
  Define kernel API to get address of each state in xsave area
  x86/xsaves: Enable xsaves/xrstors
  x86/xsaves: Call booting time xsaves and xrstors in setup_init_fpu_buf
  x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time
  x86/xsaves: Add xsaves and xrstors support for booting time
  x86/xsaves: Clear reserved bits in xsave header
  x86/xsaves: Use xsave/xrstor for saving and restoring user space context
  x86/xsaves: Use xsaves/xrstors for context switch
  x86/xsaves: Use xsaves/xrstors to save and restore xsave area
  x86/xsaves: Define a macro for handling xsave/xrstor instruction fault
  x86/xsaves: Define macros for xsave instructions
  x86/xsaves: Change compacted format xsave area header
  x86/alternative: Add alternative_input_2 to support alternative with two features and input
  x86/xsaves: Add a kernel parameter noxsaves to disable xsaves/xrstors
2014-08-13 18:20:04 -06:00
Linus Torvalds
fd1cf90580 Metag architecture changes for v3.17
Just a couple of minor static analysis fixes, removal of a NULL check
 that should never happen, and fix an error check where an unsigned value
 was being checked to see if it was negative.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT6JHIAAoJEGwLaZPeOHZ6dfoP/0NrVlO+ldKALXEaPXvLyiWN
 MAhWt8xMHoTsgGlnbnr4Gi6mDDuSnH9CvCoelE/Htn4UOnNbH1bZ/XKJ9gyfnoIv
 NM0xlInWd9dN1PObgcLONBG5KF9vSb8yMhzCZ+GJYZY7mWmjp7641csLL+ksC5uD
 +qkl6+/RsEAXNHC9Hnzib0zpUf/q+u1ZWa5LmvoUTtCuOfdCEeAN+oCuM7+/d9vd
 iyrzXTzdLOn8OA6biNWHH9OVzBEvmAvkJbhccBN1VOUi85OWnmCMrBkB11T9hUSp
 +O2lNfAJYZfqf51OMh8OmdBhcN+GxJf8YIgJqlfzbEzKDRCqhUvvYIDI0tXmTKIC
 CSkifBSWfyi8/+Nrijs+8Z/q7EJJ3VHX2AFwQlZM1eH4cZR4MM0AmSAGsu+vwGsA
 QwkHRcKKqGJUXGusw5+ezPYy7Esn23KkWIBjucZm2gRc1zb+KqB8+TnypJMIpl5U
 Kf8yWahptEH9fTFjqu557E29Oe0lb9fKRGPovdLfcQ9jMNHaZHoix19+kMO+dol2
 LgKhvt1OEC/NCZf/FjFwOvkdRWmkbOLyn2NJinmCGpAmT88LUI/vcpGHpjWgdxhd
 svDd/hFxIUnv10E9wPf27UMrS5zVKJ1loPfQ8N6m89/q3Mx1Q+zErgeq6TWfWBAm
 UJG8U2dv4JeIWUHmcSQ+
 =3kq/
 -----END PGP SIGNATURE-----

Merge tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull metag architecture updates from James Hogan:
 "Just a couple of minor static analysis fixes, removal of a NULL check
  that should never happen, and fix an error check where an unsigned
  value was being checked to see if it was negative"

* tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: cachepart: Fix failure check
  metag: hugetlbpage: Remove null pointer checks that could never happen
2014-08-13 18:18:09 -06:00
Aneesh Kumar K.V
9e813308a5 powerpc/thp: Add tracepoints to track hugepage invalidate
Add tracepoint to track hugepage invalidate. This help us
in debugging difficult to track bugs.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:42 +10:00
Aneesh Kumar K.V
85c1fafd72 powerpc/mm: Use read barrier when creating real_pte
On ppc64 we support 4K hash pte with 64K page size. That requires
us to track the hash pte slot information on a per 4k basis. We do that
by storing the slot details in the second half of pte page. The pte bit
_PAGE_COMBO is used to indicate whether the second half need to be
looked while building real_pte. We need to use read memory barrier while
doing that so that load of hidx is not reordered w.r.t _PAGE_COMBO
check. On the store side we already do a lwsync in __hash_page_4K

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:41 +10:00
Aneesh Kumar K.V
7e467245bf powerpc/thp: Use ACCESS_ONCE when loading pmdp
We would get wrong results in compiler recomputed old_pmd. Avoid
that by using ACCESS_ONCE

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:41 +10:00
Aneesh Kumar K.V
969b7b208f powerpc/thp: Invalidate with vpn in loop
As per ISA, for 4k base page size we compare 14..65 bits of VA specified
with the entry_VA in tlb. That implies we need to make sure we do a
tlbie with all the possible 4k va we used to access the 16MB hugepage.
With 64k base page size we compare 14..57 bits of VA. Hence we cannot
ignore the lower 24 bits of va while tlbie .We also cannot tlb
invalidate a 16MB entry with just one tlbie instruction because
we don't track which va was used to instantiate the tlb entry.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:40 +10:00
Aneesh Kumar K.V
fc04795575 powerpc/thp: Handle combo pages in invalidate
If we changed base page size of the segment, either via sub_page_protect
or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
table entries. We do a lazy hash page table flush for all mapped pages
in the demoted segment. This happens when we handle hash page fault for
these pages.

We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
that implies that we could possibly have older 64K hash pte entries in
the hash page table and we need to invalidate those entries.

Use _PAGE_COMBO to determine the page size with which we should
invalidate the hash table entries on unmap.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:39 +10:00
Aneesh Kumar K.V
629149fae4 powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
If we changed base page size of the segment, either via sub_page_protect
or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
table entries. We do a lazy hash page table flush for all mapped pages
in the demoted segment. This happens when we handle hash page fault
for these pages.

We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
that implies that we could possibly have older 64K hash pte entries in
the hash page table and we need to invalidate those entries.

Handle this correctly for 16M pages

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:39 +10:00
Aneesh Kumar K.V
fa1f8ae80f powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
The segment identifier and segment size will remain the same in
the loop, So we can compute it outside. We also change the
hugepage_invalidate interface so that we can use it the later patch

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:38 +10:00
Aneesh Kumar K.V
b0aa44a3df powerpc/thp: Add write barrier after updating the valid bit
With hugepages, we store the hpte valid information in the pte page
whose address is stored in the second half of the PMD. Use a
write barrier to make sure clearing pmd busy bit and updating
hpte valid info are ordered properly.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:37 +10:00
Nishanth Aravamudan
2fabf084b6 powerpc: reorder per-cpu NUMA information's initialization
There is an issue currently where NUMA information is used on powerpc
(and possibly ia64) before it has been read from the device-tree, which
leads to large slab consumption with CONFIG_SLUB and memoryless nodes.

NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate
after start_secondary(), similar to ia64, which is invoked via
smp_init().

Commit 6ee0578b4d ("workqueue: mark init_workqueues() as
early_initcall()") made init_workqueues() be invoked via
do_pre_smp_initcalls(), which is obviously before the secondary
processors are online.

Additionally, the following commits changed init_workqueues() to use
cpu_to_node to determine the node to use for kthread_create_on_node:

bce903809a ("workqueue: add wq_numa_tbl_len and
wq_numa_possible_cpumask[]")
f3f90ad469 ("workqueue: determine NUMA node of workers accourding to
the allowed cpumask")

Therefore, when init_workqueues() runs, it sees all CPUs as being on
Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to
a high number of slab deactivations
(http://www.spinics.net/lists/linux-mm/msg67489.html).

Fix this by initializing the powerpc-specific CPU<->node/local memory
node mapping as early as possible, which on powerpc is
do_init_bootmem(). Currently that function initializes the mapping for
the boot CPU, but we extend it to setup the mapping for all possible
CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the
per-cpu values for all possible CPUs. That ensures that before the
early_initcalls run (and really as early as possible), the per-cpu NUMA
mapping is accurate.

While testing memoryless nodes on PowerKVM guests with a fix to the
workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a
guest topology of:

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
node 1 size: 16336 MB
node 1 free: 15329 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

the slab consumption decreases from

Slab:             932416 kB
SUnreclaim:       902336 kB

to

Slab:             395264 kB
SUnreclaim:       359424 kB

And we a corresponding increase in the slab efficiency from

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                       337 MB   11.28%  100.00%
task_struct                         288 MB    9.93%  100.00%

to

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                        37 MB  100.00%  100.00%
task_struct                          31 MB  100.00%  100.00%

Powerpc didn't support memoryless nodes until recently (64bb80d87f
"powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c27226119
"powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also
helped improve memory consumption with these kind of environments.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:05 +10:00
Himangi Saraogi
d658972284 powerpc/perf/hv-24x7: Use kmem_cache_free
Free memory allocated using kmem_cache_zalloc using kmem_cache_free
rather than kfree.

The Coccinelle semantic patch that makes this change is as follows:

// <smpl>
@@
expression x,E,c;
@@

 x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
 ... when != x = E
     when != &x
?-kfree(x)
+kmem_cache_free(c,x)
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:04 +10:00
Thomas Falcon
587870e865 powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
A buffer returned by H_VTERM_PARTNER_INFO contains device information
in big endian format, causing problems for little endian architectures.
This patch ensures that they are in cpu endian.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:04 +10:00