Commit Graph

425955 Commits

Author SHA1 Message Date
Weijie Yang
f893ab41e4 mm/swap: fix race on swap_info reuse between swapoff and swapon
swapoff clear swap_info's SWP_USED flag prematurely and free its
resources after that.  A concurrent swapon will reuse this swap_info
while its previous resources are not cleared completely.

These late freed resources are:
 - p->percpu_cluster
 - swap_cgroup_ctrl[type]
 - block_device setting
 - inode->i_flags &= ~S_SWAPFILE

This patch clears the SWP_USED flag after all its resources are freed,
so that swapon can reuse this swap_info by alloc_swap_info() safely.

[akpm@linux-foundation.org: tidy up code comment]
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06 13:48:51 -08:00
Shaohua Li
579f82901f swap: add a simple detector for inappropriate swapin readahead
This is a patch to improve swap readahead algorithm.  It's from Hugh and
I slightly changed it.

Hugh's original changelog:

swapin readahead does a blind readahead, whether or not the swapin is
sequential.  This may be ok on harddisk, because large reads have
relatively small costs, and if the readahead pages are unneeded they can
be reclaimed easily - though, what if their allocation forced reclaim of
useful pages? But on SSD devices large reads are more expensive than
small ones: if the readahead pages are unneeded, reading them in caused
significant overhead.

This patch adds very simplistic random read detection.  Stealing the
PageReadahead technique from Konstantin Khlebnikov's patch, avoiding the
vma/anon_vma sophistications of Shaohua Li's patch, swapin_nr_pages()
simply looks at readahead's current success rate, and narrows or widens
its readahead window accordingly.  There is little science to its
heuristic: it's about as stupid as can be whilst remaining effective.

The table below shows elapsed times (in centiseconds) when running a
single repetitive swapping load across a 1000MB mapping in 900MB ram
with 1GB swap (the harddisk tests had taken painfully too long when I
used mem=500M, but SSD shows similar results for that).

Vanilla is the 3.6-rc7 kernel on which I started; Shaohua denotes his
Sep 3 patch in mmotm and linux-next; HughOld denotes my Oct 1 patch
which Shaohua showed to be defective; HughNew this Nov 14 patch, with
page_cluster as usual at default of 3 (8-page reads); HughPC4 this same
patch with page_cluster 4 (16-page reads); HughPC0 with page_cluster 0
(1-page reads: no readahead).

HDD for swapping to harddisk, SSD for swapping to VertexII SSD.  Seq for
sequential access to the mapping, cycling five times around; Rand for
the same number of random touches.  Anon for a MAP_PRIVATE anon mapping;
Shmem for a MAP_SHARED anon mapping, equivalent to tmpfs.

One weakness of Shaohua's vma/anon_vma approach was that it did not
optimize Shmem: seen below.  Konstantin's approach was perhaps mistuned,
50% slower on Seq: did not compete and is not shown below.

HDD        Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon     73921   76210   75611   76904   78191  121542
Seq Shmem    73601   73176   73855   72947   74543  118322
Rand Anon   895392  831243  871569  845197  846496  841680
Rand Shmem 1058375 1053486  827935  764955  764376  756489

SSD        Vanilla Shaohua HughOld HughNew HughPC4 HughPC0
Seq Anon     24634   24198   24673   25107   21614   70018
Seq Shmem    24959   24932   25052   25703   22030   69678
Rand Anon    43014   26146   28075   25989   26935   25901
Rand Shmem   45349   45215   28249   24268   24138   24332

These tests are, of course, two extremes of a very simple case: under
heavier mixed loads I've not yet observed any consistent improvement or
degradation, and wider testing would be welcome.

Shaohua Li:

Test shows Vanilla is slightly better in sequential workload than Hugh's
patch.  I observed with Hugh's patch sometimes the readahead size is
shrinked too fast (from 8 to 1 immediately) in sequential workload if
there is no hit.  And in such case, continuing doing readahead is good
actually.

I don't prepare a sophisticated algorithm for the sequential workload
because so far we can't guarantee sequential accessed pages are swap out
sequentially.  So I slightly change Hugh's heuristic - don't shrink
readahead size too fast.

Here is my test result (unit second, 3 runs average):
	Vanilla		Hugh		New
Seq	356		370		360
Random	4525		2447		2444

Attached graph is the swapin/swapout throughput I collected with 'vmstat
2'.  The first part is running a random workload (till around 1200 of
the x-axis) and the second part is running a sequential workload.
swapin and swapout throughput are almost identical in steady state in
both workloads.  These are expected behavior.  while in Vanilla, swapin
is much bigger than swapout especially in random workload (because wrong
readahead).

Original patches by: Shaohua Li and Konstantin Khlebnikov.

[fengguang.wu@intel.com: swapin_nr_pages() can be static]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06 13:48:51 -08:00
Zongxun Wang
fb951eb5e1 ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters
Even if using the same jbd2 handle, we cannot rollback a transaction.
So once some error occurs after successfully allocating clusters, the
allocated clusters will never be used and it means they are lost.  For
example, call ocfs2_claim_clusters successfully when expanding a file,
but failed in ocfs2_insert_extent.  So we need free the allocated
clusters if they are not used indeed.

Signed-off-by: Zongxun Wang <wangzongxun@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06 13:48:51 -08:00
Randy Dunlap
277cba1d28 Documentation/kernel-parameters.txt: fix memmap= language
Clean up descriptions of memmap= boot options.

Add periods (full stops), drop commas, change "used" to "reserved" or
"marked".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andiry Xu <andiry.xu@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-06 13:48:51 -08:00
Linus Torvalds
ef42c58a5b Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This lot provides:

   * Bugfixes for armada irq controller
   * Updates to renesas irq chip
   * Support for the TI-NSPIRE irq controller

  Not strictly a bug fix only pull request, but important updates for
  some of the arm Socs which I completely forgot to send last week.

  Seems like my obliviousness is getting worse, I just can't remember
  when it started"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: Add support for TI-NSPIRE irqchip
  irqchip: renesas-irqc: Enable mask on suspend
  irqchip: renesas-irqc: Use lazy disable
  irqchip: armada-370-xp: fix MSI race condition
  irqchip: armada-370-xp: fix IPI race condition
2014-02-05 16:02:53 -08:00
Linus Torvalds
1cd731df09 Bug-fixes:
- Revert "xen/grant-table: Avoid m2p_override during mapping" as it broke Xen ARM build.
  - Fix CR4 not being set on AP processors in Xen PVH mode.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS8AyQAAoJEFjIrFwIi8fJbD4IAJssMuaLI5CRsSWBgDFHHDFt
 srVJpDOYQiDr/TxkwFCVcL4sFy9Htb3KMArU4eIBl6uMqQbGa+3rHyXcHYI219YY
 XH3D8RG+9JChwsxtaeUEzwx1C8ehcygD34vtdcoQXa7eBuEi4TL3HeLifR+HrXKO
 UdFrTA34FmvpVFbSuRXkZh5sd6ca9et9xHuQHM8SIY6pVokY6xaEYOp17tfPZpwM
 7A6LFjUjXeugHC2L3+/H8UOHA9nSZQvnMiZOWq2Cusc2Dt2V7emzgk2wcc2CHttf
 EA6GbtiJzHqMPmt5EjubI9hHdSMB31HpY4hnQE38+ucl+BwiSdRE9z2Rm4TYClg=
 =IX4M
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull Xen fixes from Konrad Rzeszutek Wilk:
 "Bug-fixes:
   - Revert "xen/grant-table: Avoid m2p_override during mapping" as it
     broke Xen ARM build.
   - Fix CR4 not being set on AP processors in Xen PVH mode"

* tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/pvh: set CR4 flags for APs
  Revert "xen/grant-table: Avoid m2p_override during mapping"
2014-02-05 16:01:11 -08:00
Linus Torvalds
251aa0fddd Wire up new sched_setattr and sched_getattr syscalls
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJS7+jjAAoJEKurIx+X31iBw/0P/0LIyilIt4ZAlBCXknvLORAM
 /EnyEl2GqP+0PBC0t0tH0h9JmmVGs13Bj2u+GZ85/bcMpFLCuOyZG5H7j8m+c0Ei
 JrzwrBxooDEBXqHc5B6WToTpeVUzkEk7euu5WXWp2ZLu3ZhRUbZqomr/PWeptTp7
 PnmtdYHbRYobenhqsjCkNKyDEwo0tJqE9JiLR6K2ibd/9yfvPXwXBH1pK5gU2BvO
 qtEtTdZrUNZIq3nGFvw7KphTZRSEHujxmfdZ2XTKaCjg3V7q/73EWsxS6Y464/mo
 7eSxQ1pfYfw8iYZU3UzCP/1ixkokjKTHhWtHQzaZlAa8fic4TDbu1b5bum1rcwI/
 DiFb6fIzbE+2Ow6vth5WJsI6F10Nu2ZaC5ztoTnpqaPh/fP/51mZuNUC3x9rlngt
 5tG3+g0sRoHyQMXEW7MBb4SvPeGHPCQ2ZV0Sb2c0BEN+KmvmxF4d5JmUBOYxs19Y
 fcGbyWMBnCFmieZ1ilwSWb+odP9bKHTwKbMueUWolIbO4280yL4bzwSLmxKqXcKB
 dckmh06tqya7eUTn4H13lntyJll/P1WOZm3ejWXz1x0Sn4idjsJVgxHhNYruN+QV
 IdayZiYuhY1RlzD4I8bwVI67hIjyP0nOAMJmf3wX3I2CU01tyuN1rOxa/0jmR/Y1
 OQISiVYMwSpK5kvXaMC7
 =G2gr
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-ia64-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 update from Tony Luck:
 "Wire up new sched_setattr and sched_getattr syscalls"

* tag 'please-pull-ia64-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Wire up new sched_setattr and sched_getattr syscalls
2014-02-05 16:00:27 -08:00
Linus Torvalds
8352650a5c Merge git://git.infradead.org/users/willy/linux-nvme
Pull NVMe driver update from Matthew Wilcox:
 "Looks like I missed the merge window ...  but these are almost all
  bugfixes anyway (the ones that aren't have been baking for months)"

* git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Namespace use after free on surprise removal
  NVMe: Correct uses of INIT_WORK
  NVMe: Include device and queue numbers in interrupt name
  NVMe: Add a pci_driver shutdown method
  NVMe: Disable admin queue on init failure
  NVMe: Dynamically allocate partition numbers
  NVMe: Async IO queue deletion
  NVMe: Surprise removal handling
  NVMe: Abort timed out commands
  NVMe: Schedule reset for failed controllers
  NVMe: Device resume error handling
  NVMe: Cache dev->pci_dev in a local pointer
  NVMe: Fix lockdep warnings
  NVMe: compat SG_IO ioctl
  NVMe: remove deprecated IRQF_DISABLED
  NVMe: Avoid shift operation when writing cq head doorbell
2014-02-05 15:53:26 -08:00
Linus Torvalds
71c27a8c67 regulator: Fixes for v3.14
A couple of driver fixes here but the main thing is a fix to the checks
 for deferred probe non-DT systems with fully specified regulators which
 had been broken by a device tree fix which meant that we wouldn't insert
 optional regulators.  This had slipped through the cracks since very few
 systems do that in the first place and those that do it in mainline
 don't need optional regulators anyway.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJS8iRIAAoJELSic+t+oim9p98P/A7ILwXsvhM3fIN9jGOPZPks
 Q2yynqjVnEsfAJ9h4sa8cMKkQcWlBJ2+/M3AH03dnoZ5q+vA76ujNrnJHYtu5jH1
 oyVAqP6gmArrGsGe2eO9NQ+Cgjh1zPc3/aREZEopKQbDMxNnDr5b5juzAGlifWBG
 +kDjTdTiNW9eJ7dJJdHh2Y+OZEnFqxRNtbboK6bayrKtUqI8bZjYgsVSyX7US7Lp
 yX36cGi/iLbdI5FgJHDIdPMjZRO5fqPbG4C1ktghT8liD9DtUXMaMNSQYzjOy8Np
 z15E4U7CB1uEn5rz/Lk3mOGqp+G4ttMd7ZIfC18faAgZWRwdY5gqFH9X0t4rVRUN
 C8oQex0qzYWyzoRvix6gpSa9yrc2sUvv2mEHGSGJRO7mMbjJCVj3Hkan7iFgG1jU
 dbQVh67Ww29Vjyh83L4Pw4KaT12LQUUiu01N7nUcYoqJQO+leJMy40qF7tMTl/bq
 nJbrf2uqGmkvZJ0u5hyI4N2lZICzzUHR6ySsTx3qKFUaudYWoopyplfQ5YubukUV
 q2fADCBxfR3zwA2PzOP612bOVK1uFx+/EtRLkn5JgI87MbNdbgvQzF4qi/p5MF4V
 qYhvLRcUzAI82FPSYju894nwwsEyy4B6pxItdtQgr0hgsjmVD2FPZmdxehosF4IA
 xWfmkocBjDdX7VAIo0Ld
 =2an0
 -----END PGP SIGNATURE-----

Merge tag 'regulator-v3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A couple of driver fixes here but the main thing is a fix to the
  checks for deferred probe non-DT systems with fully specified
  regulators which had been broken by a device tree fix which meant that
  we wouldn't insert optional regulators.

  This had slipped through the cracks since very few systems do that in
  the first place and those that do it in mainline don't need optional
  regulators anyway"

* tag 'regulator-v3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: s2mps11: Fix NULL pointer of_node value when using platform data
  regulator: core: Correct default return value for full constraints
  regulator: ab3100: cast fix
2014-02-05 15:52:26 -08:00
Linus Torvalds
4293242db1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a number of concurrency issues on s390 where multiple users
  of the same crypto transform may clobber each other's results"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: s390 - fix des and des3_ede ctr concurrency issue
  crypto: s390 - fix des and des3_ede cbc concurrency issue
  crypto: s390 - fix concurrency issue in aes-ctr mode
2014-02-05 15:51:42 -08:00
Ingo Molnar
f8f2023482 x86: Disable CONFIG_X86_DECODER_SELFTEST in allmod/allyesconfigs
It can take some time to validate the image, make sure
{allyes|allmod}config doesn't enable it.

I'd say randconfig will cover it often enough, and the failure is also
borderline build coverage related: you cannot really make the decoder
test fail via source level changes, only with changes in the build
environment, so I agree with Andi that we can disable this one too.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Paul Gortmaker paul.gortmaker@windriver.com>
Suggested-and-acked-by: Andi Kleen andi@firstfloor.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-05 14:10:30 -08:00
Linus Torvalds
c4ad8f98be execve: use 'struct filename *' for executable name passing
This changes 'do_execve()' to get the executable name as a 'struct
filename', and to free it when it is done.  This is what the normal
users want, and it simplifies and streamlines their error handling.

The controlled lifetime of the executable name also fixes a
use-after-free problem with the trace_sched_process_exec tracepoint: the
lifetime of the passed-in string for kernel users was not at all
obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
the pathname allocation lifetime with the execve() having finished,
which in turn meant that the trace point that happened after
mm_release() of the old process VM ended up using already free'd memory.

To solve the kernel string lifetime issue, this simply introduces
"getname_kernel()" that works like the normal user-space getname()
function, except with the source coming from kernel memory.

As Oleg points out, this also means that we could drop the tcomm[] array
from 'struct linux_binprm', since the pathname lifetime now covers
setup_new_exec().  That would be a separate cleanup.

Reported-by: Igor Zhbanov <i.zhbanov@samsung.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-05 12:54:53 -08:00
Linus Torvalds
878a876b2e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "Filipe is fixing compile and boot problems with our crc32c rework, and
  Josef has disabled snapshot aware defrag for now.

  As the number of snapshots increases, we're hitting OOM.  For the
  short term we're disabling things until a bigger fix is ready"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: use late_initcall instead of module_init
  Btrfs: use btrfs_crc32c everywhere instead of libcrc32c
  Btrfs: disable snapshot aware defrag for now
2014-02-04 12:26:56 -08:00
Linus Torvalds
d7512f79fd NFS client bugfixes for Linux 3.14
Highlights:
 
 - Fix NFSv3 acl regressions
 - Fix NFSv4 memory corruption due to slot table abuse in nfs4_proc_open_confirm
 - nfs4_destroy_session must call rpc_destroy_waitqueue
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJS8RmYAAoJEGcL54qWCgDynYEP/1Ip6/IoaqOfT5kAYo05tw30
 Wqk0vNZWqdkuK/T8kUQmcbaN6Gt29QRhhJzDBhIrQ+7n8LsWwiZ3zoi9m3p2Ry7U
 M+xWcE6P7Cc44w3XLfll/xKb3ktfhXFQUeTHxsQ3X7uBsBP8TcfYabWHFrrkSe2B
 aubj/JqebLm2Uz7Zb03T4JF6LKd+3uHBr7SDK03zTUaPtn+bcVGBwS9MlFTt17Ma
 2zMFMM4TNNPDGJWJMvI460BYB4nySINCA4I+78sSLOQbiDc6yZd1Zq/Ngtvs+AlW
 /+m3eEY3ljcawCSAHMkoef4G5ljHV8cndtr3WBtAUjGq/H6V8eGdW4PB/A2UGYRH
 Vdcxrm6Bn7IDRJ6baq72DuX1ZyTDPBllRz0m0l08w8p2cuK7lFDSF7/HS2MNOb2i
 8Ai02CG9+aWZCz1uuF0vngKViP+hfmJqrLr1OgLuSggPd1kbqU/+HnNGtLmDfi/0
 K0ceuMvIuMlahlYgb7XbGFN7RIcbIYXXhBM2dUQo+/TtGvfjHVI8EmmlC7CtEXdW
 TeVc9gxWgKIlF1p06aoWFzdbBkK3CtNCXgh8vQpFRSqvoAQC/mBVQW9Iank77hRA
 94wiNZVLKaEO+l2WmfuGLhZ9kkLb/Iq5NVA5IUGWgjyT1yJsR5faTFfRhM+yjABE
 IwQF5ntXqxz82nWCRbSl
 =kKEV
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights:

   - Fix NFSv3 acl regressions
   - Fix NFSv4 memory corruption due to slot table abuse in
     nfs4_proc_open_confirm
   - nfs4_destroy_session must call rpc_destroy_waitqueue"

* tag 'nfs-for-3.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  fs: get_acl() must be allowed to return EOPNOTSUPP
  NFSv3: Fix return value of nfs3_proc_setacls
  NFSv3: Remove unused function nfs3_proc_set_default_acl
  NFSv4.1: nfs4_destroy_session must call rpc_destroy_waitqueue
  NFSv4: Fix memory corruption in nfs4_proc_open_confirm
  nfs: fix setting of ACLs on file creation.
2014-02-04 12:26:16 -08:00
Linus Torvalds
12b13835a0 kbuild: don't enable DEBUG_INFO when building for COMPILE_TEST
It really isn't very interesting to have DEBUG_INFO when doing compile
coverage stuff (you wouldn't want to run the result anyway, that's kind
of the whole point of COMPILE_TEST), and it currently makes the build
take longer and use much more disk space for "all{yes,mod}config".

There's somewhat active discussion about this still, and we might end up
with some new config option for things like this (Andi points out that
the silly X86_DECODER_SELFTEST option also slows down the normal
coverage tests hugely), but I'm starting the ball rolling with this
simple one-liner.

DEBUG_INFO isn't that noticeable if you have tons of memory and a good
IO subsystem, but it hurts you a lot if you don't - for very little
upside for the common use.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-04 12:20:01 -08:00
Mark Brown
a6a671e1b6 Merge remote-tracking branches 'regulator/fix/ab3100' and 'regulator/fix/s2mps11' into regulator-linus 2014-02-04 12:58:19 +00:00
Mark Brown
00e6747024 Merge remote-tracking branch 'regulator/fix/core' into regulator-linus 2014-02-04 12:58:17 +00:00
Trond Myklebust
88a78a912e Merge branch 'acl_fixes' into linux-next 2014-02-03 17:13:45 -05:00
Trond Myklebust
789b663ae3 fs: get_acl() must be allowed to return EOPNOTSUPP
posix_acl_xattr_get requires get_acl() to return EOPNOTSUPP if the
filesystem cannot support acls. This is needed for NFS, which can't
know whether or not the server supports acls until it tries to get/set
one.
This patch converts posix_acl_chmod and posix_acl_create to deal with
EOPNOTSUPP return values from get_acl().

Reported-by: Russell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20140130140834.GW15937@n2100.arm.linux.org.uk
Cc: Al Viro viro@zeniv.linux.org.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-03 17:12:37 -05:00
Mukesh Rathor
afca50132c xen/pvh: set CR4 flags for APs
During bootup in the 'probe_page_size_mask' these CR4 flags are
set in there. But for AP processors they are not set as we do not
use 'secondary_startup_64' which the baremetal kernels uses.
Instead do it in this function which we use in Xen PVH during our
startup for AP processors.

As such fix it up to make sure we have that flag set.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-03 15:44:18 -05:00
Trond Myklebust
8f493b9cfc NFSv3: Fix return value of nfs3_proc_setacls
nfs3_proc_setacls is used internally by the NFSv3 create operations
to set the acl after the file has been created. If the operation
fails because the server doesn't support acls, then it must return '0',
not -EOPNOTSUPP.

Reported-by: Russell King <linux@arm.linux.org.uk>
Link: http://lkml.kernel.org/r/20140201010328.GI15937@n2100.arm.linux.org.uk
Cc: Christoph Hellwig <hch@lst.de>
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-03 13:14:23 -05:00
Trond Myklebust
d4c42fb493 NFSv3: Remove unused function nfs3_proc_set_default_acl
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-03 13:13:50 -05:00
Filipe David Borba Manana
60efa5eb2e Btrfs: use late_initcall instead of module_init
It seems that when init_btrfs_fs() is called, crc32c/crc32c-intel might
not always be already initialized, which results in the call to crypto_alloc_shash()
returning -ENOENT, as experienced by Ahmet who reported this.

Therefore make sure init_btrfs_fs() is called after crc32c is initialized (which
is at initialization level 6, module_init), by using late_initcall (which is at
initialization level 7) instead of module_init for btrfs.

Reported-and-Tested-by: Ahmet Inan <ainan@mathematik.uni-freiburg.de>
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-02-03 09:01:28 -08:00
Filipe David Borba Manana
0b947aff15 Btrfs: use btrfs_crc32c everywhere instead of libcrc32c
After the commit titled "Btrfs: fix btrfs boot when compiled as built-in",
LIBCRC32C requirement was removed from btrfs' Kconfig. This made it not
possible to build a kernel with btrfs enabled (either as module or built-in)
if libcrc32c is not enabled as well. So just replace all uses of libcrc32c
with the equivalent function in btrfs hash.h - btrfs_crc32c.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-02-03 09:01:27 -08:00
Josef Bacik
8101c8dbf6 Btrfs: disable snapshot aware defrag for now
It's just broken and it's taking a lot of effort to fix it, so for now just
disable it so people can defrag in peace.  Thanks,

Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-02-03 09:01:27 -08:00
Konrad Rzeszutek Wilk
e85fc98055 Revert "xen/grant-table: Avoid m2p_override during mapping"
This reverts commit 08ece5bb23.

As it breaks ARM builds and needs more attention
on the ARM side.

Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-03 06:44:49 -05:00
Linus Torvalds
38dbfb59d1 Linus 3.14-rc1 2014-02-02 16:42:13 -08:00
Linus Torvalds
69048e0188 Merge branch 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "The three major changes in this patchset is a implementation for
  flexible userspace memory maps, cache-flushing fixes (again), and a
  long-discussed ABI change to make EWOULDBLOCK the same value as
  EAGAIN.

  parisc has been the only platform where we had EWOULDBLOCK != EAGAIN
  to keep HP-UX compatibility.  Since we will probably never implement
  full HP-UX support, we prefer to drop this compatibility to make it
  easier for us with Linux userspace programs which mostly never checked
  for both values.  We don't expect major fall-outs because of this
  change, and if we face some, we will simply rebuild the necessary
  applications in the debian archives"

* 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: add flexible mmap memory layout support
  parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
  parisc: convert uapi/asm/stat.h to use native types only
  parisc: wire up sched_setattr and sched_getattr
  parisc: fix cache-flushing
  parisc/sti_console: prefer Linux fonts over built-in ROM fonts
2014-02-02 16:32:53 -08:00
Mikulas Patocka
1c0b8a7a62 hpfs: optimize quad buffer loading
HPFS needs to load 4 consecutive 512-byte sectors when accessing the
directory nodes or bitmaps.  We can't switch to 2048-byte block size
because files are allocated in the units of 512-byte sectors.

Previously, the driver would allocate a 2048-byte area using kmalloc,
copy the data from four buffers to this area and eventually copy them
back if they were modified.

In the current implementation of the buffer cache, buffers are allocated
in the pagecache.  That means that 4 consecutive 512-byte buffers are
stored in consecutive areas in the kernel address space.  So, we don't
need to allocate extra memory and copy the content of the buffers there.

This patch optimizes the code to avoid copying the buffers.  It checks
if the four buffers are stored in contiguous memory - if they are not,
it falls back to allocating a 2048-byte area and copying data there.

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-02 16:24:07 -08:00
Mikulas Patocka
2cbe5c76fc hpfs: remember free space
Previously, hpfs scanned all bitmaps each time the user asked for free
space using statfs.  This patch changes it so that hpfs scans the
bitmaps only once, remembes the free space and on next invocation of
statfs it returns the value instantly.

New versions of wine are hammering on the statfs syscall very heavily,
making some games unplayable when they're stored on hpfs, with load
times in minutes.

This should be backported to the stable kernels because it fixes
user-visible problem (excessive level load times in wine).

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-02 16:24:07 -08:00
Helge Deller
9dabf60dc4 parisc: add flexible mmap memory layout support
Add support for the flexible mmap memory layout (as described in
http://lwn.net/Articles/91829). This is especially very interesting on
parisc since we currently only support 32bit userspace (even with a
64bit Linux kernel).

Signed-off-by: Helge Deller <deller@gmx.de>
2014-02-02 21:00:13 +01:00
Guy Martin
f5a408d53e parisc: Make EWOULDBLOCK be equal to EAGAIN on parisc
On Linux, only parisc uses a different value for EWOULDBLOCK which
causes a lot of troubles for applications not checking for both values.
Since the hpux compat is long dead, make EWOULDBLOCK behave the same as
all other architectures.

Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
Signed-off-by: Helge Deller <deller@gmx.de>
2014-02-02 20:57:42 +01:00
Helge Deller
9391bc777b parisc: convert uapi/asm/stat.h to use native types only
The stat.h header file is exported to userspace. Some userspace
applications failed to compile due to missing/unknown types, so we
better convert it to use native types only (like it's done on other
architectures too).

Signed-off-by: Helge Deller <deller@gmx.de>
2014-02-02 20:57:33 +01:00
Helge Deller
998bbb2fc0 parisc: wire up sched_setattr and sched_getattr
Signed-off-by: Helge Deller <deller@gmx.de>
2014-02-02 20:57:25 +01:00
Helge Deller
57737c49dd parisc: fix cache-flushing
This commit:
f8dae00684: parisc: Ensure full cache coherency for kmap/kunmap
caused negative caching side-effects, e.g. hanging processes with expect and
too many inequivalent alias messages from flush_dcache_page() on Debian 5 systems.

This patch now partly reverts it and has been in production use on our debian buildd
makeservers since a week without any major problems.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Helge Deller <deller@gmx.de>
2014-02-02 20:57:16 +01:00
Helge Deller
8a10bc9d27 parisc/sti_console: prefer Linux fonts over built-in ROM fonts
The built-in ROM fonts lack many necessary ASCII characters, which is
why it makes sens to prefer the Linux fonts instead if they are
available.  This makes consoles on STI graphics cards which are not
supported by the stifb driver (e.g. Visualize FXe) looks much nicer.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v3.13
2014-02-02 20:56:47 +01:00
Linus Torvalds
602456bf16 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull hwmon kconfig fixes from Jean Delvare.

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors
  hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors
2014-02-02 11:30:57 -08:00
Linus Torvalds
7b383bef25 Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull SLAB changes from Pekka Enberg:
 "Random bug fixes that have accumulated in my inbox over the past few
  months"

* 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  mm: Fix warning on make htmldocs caused by slab.c
  mm: slub: work around unneeded lockdep warning
  mm: sl[uo]b: fix misleading comments
  slub: Fix possible format string bug.
  slub: use lockdep_assert_held
  slub: Fix calculation of cpu slabs
  slab.h: remove duplicate kmalloc declaration and fix kernel-doc warnings
2014-02-02 11:30:08 -08:00
Linus Torvalds
87af5e5c22 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: introduce -s to dump counters
  tools/power turbostat: remove unused command line option
  turbostat: Add option to report joules consumed per sample
  turbostat: run on HSX
  turbostat: Add a .gitignore to ignore the compiled turbostat binary
  turbostat: Clean up error handling; disambiguate error messages; use err and errx
  turbostat: Factor out common function to open file and exit on failure
  turbostat: Add a helper to parse a single int out of a file
  turbostat: Check return value of fscanf
  turbostat: Use GCC's CPUID functions to support PIC
  turbostat: Don't attempt to printf an off_t with %zx
  turbostat: Don't put unprocessed uapi headers in the include path
2014-02-02 11:28:48 -08:00
Linus Torvalds
e4c0da2165 ARM: SoC fixes for 3.14-rc1
Here's a set of patches for (hopefully) -rc1. Some of them are fixes,
 but a good number of them also do things such as enable new drivers in
 the defconfigs for platforms that have such devices, increases coverage
 of the multiplatform defconfig and some DTS changes that plumbs up some
 of the devices that now have bindings and driver support.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS7VYsAAoJEIwa5zzehBx3FKIQAIVIkjlKAtvRpg+P/BYWigXq
 6jG2xuBhwf1Yciu70ttvfRnR9bGJc+0o5w/qjr7ARSFp7GkDlWXtIwj+1EwGSEES
 hB/f2GcBfNLJOI0rHvvudz+7HlpiyfB6iMj3U25hnpShrSTxbOHDC6pU6pYjUzBy
 T1AYTmfa0vwuy9TNodOr4dVYlbJc3YA0V2sBTusXDFwm03XTdCd7E7RJIr/MRDEN
 96PtaNrfstmxeTjN0NBuL2IdsZNR2mn/cgWTL1lVqVE8+z7QyvmPDqmWD5iCVQaM
 +6mJbXL4jzM1CyB8Qf89sNtTaFVjwbgvbtuljPv/7q/jhf2rvixaMzywCv28FjB+
 uJCaLh1r/HiR61tlw0yX3ixPdKFJqyWV9vF5+uuwmEQCCPpFWGzTcNaZ7NRiYzEt
 yqt1ZRDhEAli8iHt96LxMwLIklaDGvuxLtHsv4a5djEoaO/yqRTNZXDjwIsIzkpD
 N8vUwPf7Pnw+3/5iGVTQdZaPzmrNjqTOLwz0s7ys2avVsRsNNi68hMuIj+qrNlzu
 /tJhhh0u9SONRM3c6D9mtrZPD7Gs8QgMbT9TyZ2/ILQKuo4wO9F1GDN93MjZI5oq
 1rX/QUeYItWQamzHGmI8ZlAx4YlsxxB90eJ8CaG2F8sVtumklrFRb43Vf9NRsyo/
 XDrjoS/cdi7PAO0wemkg
 =ymAC
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Here's a set of patches for (hopefully) -rc1.  Some of them are fixes,
  but a good number of them also do things such as enable new drivers in
  the defconfigs for platforms that have such devices, increases
  coverage of the multiplatform defconfig and some DTS changes that
  plumbs up some of the devices that now have bindings and driver
  support.

  The commit dates are recent; we've mostly collected these fixes in the
  last few days but I also had to rebuild the branch yesterday to sort
  out some internal conflicts which reset the timestamps.  The changes
  should have been tested by each platform maintainer already (and few
  of them have cross-platform impact) so I'm personally not too
  concerned by it at this time"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
  ARM: multi_v7_defconfig: remove redundant entries and re-enable TI_EDMA
  ARM: multi_v7_defconfig: add mvebu drivers
  clocksource: kona: Add basic use of external clock
  drivers: bus: fix CCI driver kcalloc call parameters swap
  ARM: dts: bcm28155-ap: Fix Card Detection GPIO
  ARM: multi_v7_defconfig: Select CONFIG_AT803X_PHY
  ARM: keystone: config: fix build warning when CONFIG_DMADEVICES is not set
  MAINTAINERS: ARM: SiRF: use regex patterns to involve all SiRF drivers
  ARM: dts: zynq: Add SDHCI nodes
  ARM: hisi: don't select SMP
  ARM: tegra: rebuild tegra_defconfig to add DEBUG_FS
  ARM: multi_v7: copy most options from tegra_defconfig
  ARM: iop32x: fix power off handling for the EM7210 board
  ARM: integrator: restore static map on the CP
  ARM: msm_defconfig: Enable MSM clock drivers
  ARM: dts: msm: Add clock controller nodes and hook into uart
  ARM: OMAP4+: move errata initialization to omap4_pm_init_early
  ARM: OMAP4460: cpuidle: Extend PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD on cpuidle
  ARM: mvebu: fix compilation warning on Armada 370 (i.e. non-SMP)
  ARM: shmobile: r8a7790.dtsi: ficx i2c[0-3] clock reference
  ...
2014-02-02 11:11:06 -08:00
Keith Busch
9ac27090f6 NVMe: Namespace use after free on surprise removal
An nvme block device may have open references when the device is
removed. New commands may still be sent on the removed device, so we
need to ref count the opens, return errors for new commands, and not
free the namespace and nvme_dev until all references are closed.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-02-02 13:31:15 -05:00
Jean Delvare
632007e201 hwmon: Fix SENSORS_TMP102 dependencies to eliminate build errors
Similar to what was done for the lm75 driver.

Add depends on THERMAL since that is what provides the
register/unregister functions above, but only if THERMAL_OF was
selected as this is an optional feature of the driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
2014-02-02 17:59:07 +01:00
Jean Delvare
920130a9de hwmon: Fix SENSORS_LM75 dependencies to eliminate build errors
Based on an earlier attempt by Randy Dunlap.

Fix SENSORS_LM75 dependencies to eliminate build errors:

drivers/built-in.o: In function `lm75_remove':
lm75.c:(.text+0x12bd8c): undefined reference to `thermal_zone_of_sensor_unregister'
drivers/built-in.o: In function `lm75_probe':
lm75.c:(.text+0x12c123): undefined reference to `thermal_zone_of_sensor_register'

Add depends on THERMAL since that is what provides the
register/unregister functions above, but only if THERMAL_OF was
selected as this is an optional feature of the driver.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
2014-02-02 17:59:07 +01:00
Andy Shevchenko
3b4d5c7fec tools/power turbostat: introduce -s to dump counters
The new option allows just run turbostat and get dump of counter values. It's
useful when we have something more than one program to test.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-02-01 15:24:28 -05:00
Andy Shevchenko
f591c38b91 tools/power turbostat: remove unused command line option
The -s is not used, let's remove it, and update quick help accordingly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-02-01 15:22:31 -05:00
Trond Myklebust
20b9a90245 NFSv4.1: nfs4_destroy_session must call rpc_destroy_waitqueue
There may still be timers active on the session waitqueues. Make sure
that we kill them before freeing the memory.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-01 15:13:39 -05:00
Trond Myklebust
17ead6c85c NFSv4: Fix memory corruption in nfs4_proc_open_confirm
nfs41_wake_and_assign_slot() relies on the task->tk_msg.rpc_argp and
task->tk_msg.rpc_resp always pointing to the session sequence arguments.

nfs4_proc_open_confirm tries to pull a fast one by reusing the open
sequence structure, thus causing corruption of the NFSv4 slot table.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-02-01 15:13:39 -05:00
Linus Torvalds
5cb480f6b4 Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull misc kbuild changes from Michal Marek:
 "The non-critical part of kbuild is small this time:
   - Three fixes for make deb-pkg
   - A new coccinelle check

  One of the deb-pkg fixes is a leftover from the last merge window,
  hence the merge commit"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  deb-pkg: Fix building for MIPS big-endian or ARM OABI
  deb-pkg: Fix cross-building linux-headers package
  scripts: Coccinelle script for pm_runtime_* return checks with IS_ERR_VALUE
  deb-pkg: Inhibit initramfs builders if CONFIG_BLK_DEV_INITRD is not set
2014-02-01 11:03:16 -08:00
Pali Rohár
1bda2ac071 afs: proc cells and rootcell are writeable
Both proc files are writeable and used for configuring cells. But
there is missing correct mode flag for writeable files. Without
this patch both proc files are read only.

[ It turns out they aren't really read-only, since root can write to
  them even if the write bit isn't set due to CAP_DAC_OVERRIDE ]

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-01 10:59:39 -08:00
Heiko Carstens
5a5e75f471 tile: remove compat_sys_lookup_dcookie declaration to fix compile error
With commit d8d14bd09c ("fs/compat: fix lookup_dcookie() parameter
handling") I changed the type of the len parameter of the
lookup_dcookie() syscall.

However I missed that there was still a stale declaration in
arch/tile/..  which now causes a compile error on tile:

  In file included from fs/dcookies.c:28:0:
  include/linux/compat.h:425:17: error: conflicting types for 'compat_sys_lookup_dcookie'
  fs/dcookies.c:207:1: error: conflicting types for 'compat_sys_lookup_dcookie'

Simply remove the declaration in the tile architecture, which is only a
leftover from before the different compat lookup_dcookie() versions have
been merged.  The correct declaration is now in include/linux/compat.h

The build error was reported by Fenguang's build bot.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-01 10:55:15 -08:00