Commit Graph

721807 Commits

Author SHA1 Message Date
David Howells
ad6a942a9e afs: Update the cache index structure
Update the cache index structure in the following ways:

 (1) Don't use the volume name followed by the volume type as levels in the
     cache index.  Volumes can be renamed.  Use the volume ID instead.

 (2) Don't store the VLDB data for a volume in the tree.  If the volume
     database should be cached locally, then it should be done in a separate
     tree.

 (3) Expand the volume ID stored in the cache to 64 bits.

 (4) Expand the file/vnode ID stored in the cache to 96 bits.

 (5) Increment the cache structure version number to 1.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:17 +00:00
David Howells
91a90380ef afs: Add some protocol defs
Add some protocol definitions, including max field lengths, flag defs, an
XDR-encoded UUID def, more VL operation IDs and more fileserver abort
codes.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:17 +00:00
David Howells
9ed900b116 afs: Push the net ns pointer to more places
Push the network namespace pointer to more places in AFS, including the
afs_server structure (which doesn't hold a ref on the netns).

In particular, afs_put_cell() now takes requires a net ns parameter so that
it can safely alter the netns after decrementing the cell usage count - the
cell will be deallocated by a background thread after being cached for a
period, which means that it's not safe to access it after reducing its
usage count.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:17 +00:00
David Howells
49566f6f06 afs: Note the cell in the superblock info also
Keep a reference to the cell in the superblock info structure in addition
to the volume and net pointers.  This will make it easier to clean up in a
future patch in which afs_put_volume() will need the cell pointer.

Whilst we're at it, make the cell and volume getting functions return a
pointer to the object got to make the call sites look neater.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:16 +00:00
David Howells
59fa1c4a9f afs: Fix server reaping
Fix server reaping and make sure it's all done before we start trying to
purge cells, given that servers currently pin cells.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:16 +00:00
David Howells
e3b2ffe0f0 afs: Close the rxrpc socket only after purging the servers
Close the rxrpc socket only after we've purged the server records (and also
cell and volume records which might refer to servers) so that we can give
up the callbacks on each server.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:16 +00:00
David Howells
f044c8847b afs: Lay the groundwork for supporting network namespaces
Lay the groundwork for supporting network namespaces (netns) to the AFS
filesystem by moving various global features to a network-namespace struct
(afs_net) and providing an instance of this as a temporary global variable
that everything uses via accessor functions for the moment.

The following changes have been made:

 (1) Store the netns in the superblock info.  This will be obtained from
     the mounter's nsproxy on a manual mount and inherited from the parent
     superblock on an automount.

 (2) The cell list is made per-netns.  It can be viewed through
     /proc/net/afs/cells and also be modified by writing commands to that
     file.

 (3) The local workstation cell is set per-ns in /proc/net/afs/rootcell.
     This is unset by default.

 (4) The 'rootcell' module parameter, which sets a cell and VL server list
     modifies the init net namespace, thereby allowing an AFS root fs to be
     theoretically used.

 (5) The volume location lists and the file lock manager are made
     per-netns.

 (6) The AF_RXRPC socket and associated I/O bits are made per-ns.

The various workqueues remain global for the moment.

Changes still to be made:

 (1) /proc/fs/afs/ should be moved to /proc/net/afs/ and a symlink emplaced
     from the old name.

 (2) A per-netns subsys needs to be registered for AFS into which it can
     store its per-netns data.

 (3) Rather than the AF_RXRPC socket being opened on module init, it needs
     to be opened on the creation of a superblock in that netns.

 (4) The socket needs to be closed when the last superblock using it is
     destroyed and all outstanding client calls on it have been completed.
     This prevents a reference loop on the namespace.

 (5) It is possible that several namespaces will want to use AFS, in which
     case each one will need its own UDP port.  These can either be set
     through /proc/net/afs/cm_port or the kernel can pick one at random.
     The init_ns gets 7001 by default.

Other issues that need resolving:

 (1) The DNS keyring needs net-namespacing.

 (2) Where do upcalls go (eg. DNS request-key upcall)?

 (3) Need something like open_socket_in_file_ns() syscall so that AFS
     command line tools attempting to operate on an AFS file/volume have
     their RPC calls go to the right place.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:38:16 +00:00
David Howells
5e4def2038 Pass mode to wait_on_atomic_t() action funcs and provide default actions
Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an
extra argument and make it 'unsigned int throughout.

Also, consolidate a bunch of identical action functions into a default
function that can do the appropriate thing for the mode.

Also, change the argument name in the bit_wait*() function declarations to
reflect the fact that it's the mode and not the bit number.

[Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait
should be done differently, though he's not immediately sure as to how]

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
cc: Ingo Molnar <mingo@kernel.org>
2017-11-13 15:38:16 +00:00
David Howells
81445e63e6 Merge remote-tracking branch 'tip/timers/core' into afs-next
These AFS patches need the timer_reduce() patch from timers/core.

Signed-off-by: David Howells <dhowells@redhat.com>
2017-11-13 15:36:33 +00:00
Ilya Dryomov
3cfa3b16dd rbd: default to single-major device number scheme
It's been 3.5 years, let's turn it on by default.  Support in rbd(8)
utility goes back to pre-firefly, "rbd map" has been loading the module
with single_major=Y ever since.  However, if the module is already
loaded (whether by hand or at boot time), we end up with single_major=N.
Also, some people don't install rbd(8) and use the sysfs interface
directly.

(With single-major=N, a major number is consumed for every mapping,
imposing a limit of ~240 rbd images per host.  single-major=Y allows
mapping thousands of rbd images on a single machine.)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
2017-11-13 16:33:08 +01:00
Ingo Molnar
8e7df2b5b7 timer/debug: Change /proc/timer_list from 0444 to 0400
While it uses %pK, there's still few reasons to read this file
as non-root.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-13 16:04:06 +01:00
Takashi Iwai
76727c2c3b ASoC: Updates for v4.15
The biggest thing this release has been the conversion of the AC98 bus
 to the driver model, that's been a long time coming so thanks to Robert
 Jarzmik for his dedication there.  Due to there being some AC97 MFD
 there's a few fairly large changes in input and the MFD layer, mainly to
 the wm97xx driver.
 
 There's also some drivers/drm changes to support the new AMD Stoney
 platform, these are shared with the DRM subsystem and should be being
 merged via both.
 
 Within the subsystem the overwhelming bulk of the changes is in the
 Intel drivers which continue to need lots of cleanups and fixes, this
 release they've also gained support for their open source firmware.
 There's also some large changs in the core as Morimoto-san continues to
 mirror operations into the component level in preparation for conversion
 of drivers to that.
 
  - The AC97 bus has finally caught up with the driver model thanks to
    some dedicated and persistent work from Robert Jarzmik.
  - Continued work from Morimoto-san on moving us towards being able to
    use components for everything.
  - Lots of cleanups for the Intel platform code, including support for
    their open source audio firmware.
  - Support for scaling MCLK with sample rate in simple-card.
  - Support for AMD Stoney platform.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAloJhwMTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0KzbB/9tXryXYz3dnKVlm9rk+Cq0Xy4TrUNk
 WY+Il+Di1b6CQJbAm9GSacJxR+siupZCjGC5roHznj/AA2l0RuxJXpxG40Db8ZX+
 bDR7mIWtuTUJHazqXltafj9ydElRKVpOGPAi5YJhhW5bXQ3SR9fFy0D3mdcT02v4
 SyMExhOMz+mdnuBhbWx9kqJ9LPzCs0ow+R4uoRgAQxpFXPBGtq06sMkK86lGfsl/
 iRM36J6FIeIQQfSHG/dkkpoybVax43z4OH7G1IL2FOU7miwkjZh/TTh/xHTd86Mc
 OOuGu4hB+MjvccSOa9HSrOqFjxtkZipstwqYVWoYQcUoIVpcg0YRk7TG
 =5KBY
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v4.15' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v4.15

The biggest thing this release has been the conversion of the AC98 bus
to the driver model, that's been a long time coming so thanks to Robert
Jarzmik for his dedication there.  Due to there being some AC97 MFD
there's a few fairly large changes in input and the MFD layer, mainly to
the wm97xx driver.

There's also some drivers/drm changes to support the new AMD Stoney
platform, these are shared with the DRM subsystem and should be being
merged via both.

Within the subsystem the overwhelming bulk of the changes is in the
Intel drivers which continue to need lots of cleanups and fixes, this
release they've also gained support for their open source firmware.
There's also some large changs in the core as Morimoto-san continues to
mirror operations into the component level in preparation for conversion
of drivers to that.

 - The AC97 bus has finally caught up with the driver model thanks to
   some dedicated and persistent work from Robert Jarzmik.
 - Continued work from Morimoto-san on moving us towards being able to
   use components for everything.
 - Lots of cleanups for the Intel platform code, including support for
   their open source audio firmware.
 - Support for scaling MCLK with sample rate in simple-card.
 - Support for AMD Stoney platform.
2017-11-13 15:45:57 +01:00
Takashi Iwai
c429bda21f Merge branch 'for-next' into for-linus
Pull 4.15 updates to take over the previous urgent fixes.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-11-13 15:43:13 +01:00
Arvind Yadav
71192a6887 irqchip/gic-v3: pr_err() strings should end with newlines
pr_err() messages should end with a new-line to avoid other messages
being concatenated.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-13 14:35:14 +00:00
Arvind Yadav
6b15e0853a irqchip/s3c24xx: pr_err() strings should end with newlines
pr_err() messages should end with a new-line to avoid other messages
being concatenated.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-13 14:35:08 +00:00
Masahiro Yamada
859fd5860c sh: select KBUILD_DEFCONFIG depending on ARCH
You can not select KBUILD_DEFCONFIG depending on any CONFIG option
because include/config/auto.conf is not included when building config
targets.  So, CONFIG_SUPERH32 is never set during the configuration,
then cayman_defconfig is always chosen.

This commit provides a sensible way to choose shx3/cayman_defconfig.

arch/sh/Kconfig sets either SUPERH32 or SUPERH64 depending on ARCH
environment, like follows:

  config SUPERH32
          def_bool ARCH = "sh"

          ...

  config SUPERH64
          def_bool ARCH = "sh64"

It should make sense to choose the default defconfig by ARCH,
like arch/sparc/Makefile.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13 22:58:18 +09:00
Nick Desaulniers
86a9df597c kbuild: fix linker feature test macros when cross compiling with Clang
I was not seeing my linker flags getting added when using ld-option when
cross compiling with Clang. Upon investigation, this seems to be due to
a difference in how GCC vs Clang handle cross compilation.

GCC is configured at build time to support one backend, that is implicit
when compiling.  Clang is explicit via the use of `-target <triple>` and
ships with all supported backends by default.

GNU Make feature test macros that compile then link will always fail
when cross compiling with Clang unless Clang's triple is passed along to
the compiler. For example:

$ clang -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
unknown architecture of input file `temp.o' is incompatible with
aarch64 output
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to
0000000000400078
$ echo $?
1

$ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to 00000000004002e4
$ echo $?
0

This causes conditional checks that invoke $(CC) without the target
triple, then $(LD) on the result, to always fail.

Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13 22:54:24 +09:00
Masahiro Yamada
e17c400ae1 kbuild: shrink .cache.mk when it exceeds 1000 lines
The cache files are only cleaned away by "make clean".  If you continue
incremental builds, the cache files will grow up little by little.
It is not a big deal in general use cases because compiler flags do not
change quite often.

However, if you do build-test for various architectures, compilers, and
kernel configurations, you will end up with huge cache files soon.

When the cache file exceeds 1000 lines, shrink it down to 500 by "tail".
The Least Recently Added lines are cut. (not Least Recently Used)
I hope it will work well enough.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
2017-11-13 22:54:24 +09:00
Masahiro Yamada
433dc2ebe7 kbuild: do not call cc-option before KBUILD_CFLAGS initialization
Some $(call cc-option,...) are invoked very early, even before
KBUILD_CFLAGS, etc. are initialized.

The returned string from $(call cc-option,...) depends on
KBUILD_CPPFLAGS, KBUILD_CFLAGS, and GCC_PLUGINS_CFLAGS.

Since they are exported, they are not empty when the top Makefile
is recursively invoked.

The recursion occurs in several places.  For example, the top
Makefile invokes itself for silentoldconfig.  "make tinyconfig",
"make rpm-pkg" are the cases, too.

In those cases, the second call of cc-option from the same line
runs a different shell command due to non-pristine KBUILD_CFLAGS.

To get the same result all the time, KBUILD_* and GCC_PLUGINS_CFLAGS
must be initialized before any call of cc-option.  This avoids
garbage data in the .cache.mk file.

Move all calls of cc-option below the config targets because target
compiler flags are unnecessary for Kconfig.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
2017-11-13 22:54:24 +09:00
Douglas Anderson
4e56207130 kbuild: Cache a few more calls to the compiler
These are a few stragglers that I left out of the original patch to
cache calls to the C compiler ("kbuild: Add a cache for generated
variables") because they bleed out into the main Makefile and thus
uglify things a little bit.  The idea is the same here, though.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13 22:54:23 +09:00
Douglas Anderson
3298b690b2 kbuild: Add a cache for generated variables
While timing a "no-op" build of the kernel (incrementally building the
kernel even though nothing changed) in the Chrome OS build system I
found that it was much slower than I expected.

Digging into things a bit, I found that quite a bit of the time was
spent invoking the C compiler even though we weren't actually building
anything.  Currently in the Chrome OS build system the C compiler is
called through a number of wrappers (one of which is written in
python!) and can take upwards of 100 ms to invoke even if we're not
doing anything difficult, so these invocations of the compiler were
taking a lot of time.  Worse the invocations couldn't seem to take
advantage of the multiple cores on my system.

Certainly it seems like we could make the compiler invocations in the
Chrome OS build system faster, but only to a point.  Inherently
invoking a program as big as a C compiler is a fairly heavy
operation.  Thus even if we can speed the compiler calls it made sense
to track down what was happening.

It turned out that all the compiler invocations were coming from
usages like this in the kernel's Makefile:

KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,)

Due to the way cc-option and similar statements work the above
contains an implicit call to the C compiler.  ...and due to the fact
that we're storing the result in KBUILD_CFLAGS, a simply expanded
variable, the call will happen every time the Makefile is parsed, even
if there are no users of KBUILD_CFLAGS.

Rather than redoing this computation every time, it makes a lot of
sense to cache the result of all of the Makefile's compiler calls just
like we do when we compile a ".c" file to a ".o" file.  Conceptually
this is quite a simple idea.  ...and since the calls to invoke the
compiler and similar tools are centrally located in the Kbuild.include
file this doesn't even need to be super invasive.

Implementing the cache in a simple-to-use and efficient way is not
quite as simple as it first sounds, though.  To get maximum speed we
really want the cache in a format that make can natively understand
and make doesn't really have an ability to load/parse files. ...but
make _can_ import other Makefiles, so the solution is to store the
cache in Makefile format.  This requires coming up with a valid/unique
Makefile variable name for each value to be cached, but that's
solvable with some cleverness.

After this change, we'll automatically create a ".cache.mk" file that
will contain our cached variables.  We'll load this on each invocation
of make and will avoid recomputing anything that's already in our
cache.  The cache is stored in a format that it shouldn't need any
invalidation since anything that might change should affect the "key"
and any old cached value won't be used.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-13 22:54:23 +09:00
Masahiro Yamada
a7d34df3d1 kbuild: add forward declaration of default target to Makefile.asm-generic
$(kbuild-file) and Kbuild.include are included before the default
target "all".

We will add a target into Kbuild.include.  In advance, add a forward
declaration of the default target.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
2017-11-13 22:54:17 +09:00
John Crispin
b54fcf6ae1 MIPS: pci: Make use of the BIT() macro inside the mt7620 driver
There are a few defines that manully shift a bit. Change these to using
the BIT() macro.

Signed-off-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15322/
Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13 13:01:50 +00:00
John Crispin
8593b18ad3 MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver
Switch the printk() call to the prefered pr_warn() api.

Fixes: 7e5873d375 ("MIPS: pci: Add MT7620a PCIE driver")
Signed-off-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.5+
Patchwork: https://patchwork.linux-mips.org/patch/15321/
Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13 13:01:49 +00:00
John Crispin
ab74abcee5 MIPS: pci: Remove duplicate define in mt7620 driver
An invalid and duplicate define has gone unnoticed for some time. lets
remove it. The correct define is 3 lines below.

Fixes: 7e5873d375 ("MIPS: pci: Add MT7620a PCIE driver")
Signed-off-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15320/
Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13 13:01:37 +00:00
Andy Shevchenko
73ed298b06 platform/x86: Revert intel_pmc_ipc: Use MFD framework to create dependent devices
Heikki discovered a runtime issue with this patch. Taking into
consideration we have no time to test any fix right now, revert the
commit 43aaf4f03f.

Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-11-13 14:57:20 +02:00
Nicholas Piggin
4722476bce powerpc/64s: mm_context.addr_limit is only used on hash
Radix keeps no meaningful state in addr_limit, so remove it from radix
code and rename to slb_addr_limit to make it clear it applies to hash
only.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:35:43 +11:00
Nicholas Piggin
85e3f1adcb powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
Radix VA space allocations test addresses against mm->task_size which
is 512TB, even in cases where the intention is to limit allocation to
below 128TB.

This results in mmap with a hint address below 128TB but address +
length above 128TB succeeding when it should fail (as hash does after
the previous patch).

Set the high address limit to be considered up front, and base
subsequent allocation checks on that consistently.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:35:29 +11:00
Nicholas Piggin
35602f82d0 powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
While mapping hints with a length that cross 128TB are disallowed,
MAP_FIXED allocations that cross 128TB are allowed. These are failing
on hash (on radix they succeed). Add an additional case for fixed
mappings to expand the addr_limit when crossing 128TB.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:35:06 +11:00
Nicholas Piggin
effc1b2508 powerpc/64s/hash: Fix fork() with 512TB process address space
Hash unconditionally resets the addr_limit to default (128TB) when the
mm context is initialised. If a process has > 128TB mappings when it
forks, the child will not get the 512TB addr_limit, so accesses to
valid > 128TB mappings will fail in the child.

Fix this by only resetting the addr_limit to default if it was 0. Non
zero indicates it was duplicated from the parent (0 means exec()).

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:34:47 +11:00
Nicholas Piggin
6a72dc038b powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
When allocating VA space with a hint that crosses 128TB, the SLB
addr_limit variable is not expanded if addr is not > 128TB, but the
slice allocation looks at task_size, which is 512TB. This results in
slice_check_fit() incorrectly succeeding because the slice_count
truncates off bit 128 of the requested mask, so the comparison to the
available mask succeeds.

Fix this by using mm->context.addr_limit instead of mm->task_size for
testing allocation limits. This causes such allocations to fail.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:34:19 +11:00
Michael Ellerman
7ece370996 powerpc/64s/hash: Fix 512T hint detection to use >= 128T
Currently userspace is able to request mmap() search between 128T-512T
by specifying a hint address that is greater than 128T. But that means
a hint of 128T exactly will return an address below 128T, which is
confusing and wrong.

So fix the logic to check the hint is greater than *or equal* to 128T.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of Nick's bigger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-13 23:34:06 +11:00
Mathias Kresin
05a67cc258 MIPS: ralink: Fix typo in mt7628 pinmux function
There is a typo inside the pinmux setup code. The function is called
refclk and not reclk.

Fixes: 53263a1c68 ("MIPS: ralink: add mt7628an support")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.19+
Patchwork: https://patchwork.linux-mips.org/patch/16047/
Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13 11:46:37 +00:00
Mathias Kresin
8ef4b43cd3 MIPS: ralink: Fix MT7628 pinmux
According to the datasheet the REFCLK pin is shared with GPIO#37 and
the PERST pin is shared with GPIO#36.

Fixes: 53263a1c68 ("MIPS: ralink: add mt7628an support")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: John Crispin <john@phrozen.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.19+
Patchwork: https://patchwork.linux-mips.org/patch/16046/
Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-13 11:44:47 +00:00
Benjamin Herrenschmidt
f23ab3efb1 powerpc: Fix DABR match on hash based systems
Commit 398a719d34 ("powerpc/mm: Update bits used to skip hash_page")
mistakenly dropped the DSISR_DABRMATCH bit from the mask of bit tested
to skip trying to hash a page.

As a result, the DABR matches would no longer be detected.

This adds it back. We open code it in the 2 places where it matters
rather than fold it into DSISR_BAD_FAULT_32S/64S because this isn't
technically a bad fault and while we would never hit it with the
current code, I prefer if page_fault_is_bad() didn't trigger on these.

Fixes: 398a719d34 ("powerpc/mm: Update bits used to skip hash_page")
Cc: stable@vger.kernel.org # v4.14
Tested-by: Pedro Miraglia Franco de Carvalho <pedromfc@br.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2017-11-13 22:12:48 +11:00
Eric Biggers
b11270853f libceph: don't WARN() if user tries to add invalid key
The WARN_ON(!key->len) in set_secret() in net/ceph/crypto.c is hit if a
user tries to add a key of type "ceph" with an invalid payload as
follows (assuming CONFIG_CEPH_LIB=y):

    echo -e -n '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' \
	| keyctl padd ceph desc @s

This can be hit by fuzzers.  As this is merely bad input and not a
kernel bug, replace the WARN_ON() with return -EINVAL.

Fixes: 7af3ea189a ("libceph: stop allocating a new cipher on every crypto request")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:12:44 +01:00
David Disseldorp
7c08428979 rbd: set discard_alignment to zero
RBD devices are currently incorrectly initialised with the block queue
discard_alignment set to the underlying RADOS object size.

As per Documentation/ABI/testing/sysfs-block:
  The discard_alignment parameter indicates how many bytes the beginning
  of the device is offset from the internal allocation unit's natural
  alignment.

Correcting the discard_alignment parameter from the RADOS object size to
zero (the blk_set_default_limits() default) has no effect on how discard
requests are propagated through the block layer - @alignment in
__blkdev_issue_discard() remains zero. However, it does fix the UNMAP
granularity alignment value advertised to SCSI initiators via the Block
Limits VPD.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:12:44 +01:00
Jeff Layton
ec1dff25b0 ceph: silence sparse endianness warning in encode_caps_cb
sparse warns:

  fs/ceph/mds_client.c:2887:34: warning: incorrect type in assignment (different base types)
  fs/ceph/mds_client.c:2887:34:    expected restricted __le32 [assigned] [usertype] flock_len
  fs/ceph/mds_client.c:2887:34:    got int

At this point, it's just being used as a flag. It gets
overwritten later if the rest of the encoding succeeds.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:58 +01:00
Jeff Layton
8130256517 ceph: remove the bump of i_version
Eventually, we'll want to wire cephfs up to use the change attribute
that the cluster tracks instead, but for now this is unneeded.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:51 +01:00
Jeff Layton
080a330e1d ceph: present consistent fsid, regardless of arch endianness
Since its inception, ceph has presented the fsid as an opaque value
without any sort of endianness conversion. This means that the value
presented is different on architectures of different endianness.

While the value that should be stuffed into f_fsid is poorly-defined,
I think it would be best to strive for consistency here between
architectures, and clients (we need to present this properly to the
userland client as well).

Change ceph_statfs to convert the opaque words to host-endian before
doing the xor. On an upgrade, a big-endian box may see a different fsid
than it did before, but little-endian arches should see no change with
this patch.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:42 +01:00
Jeff Layton
c8a96a31cb ceph: clean up spinlocking and list handling around cleanup_cap_releases()
Functions that release a lock taken in a parent frame are notoriously
hard to follow. Split cleanup_cap_releases into two functions, one to
detach the cap releases from the session (which should be called with
the spinlock held), and another to dispose of those caps.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:42 +01:00
Ilya Dryomov
9568c93eca rbd: get rid of rbd_mapping::read_only
It is redundant -- rw/ro state is stored in hd_struct and managed by
the block layer.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:41 +01:00
Ilya Dryomov
1de797bb24 rbd: fix and simplify rbd_ioctl_set_ro()
->open_count/-EBUSY check is bogus and wrong: when an open device is
set read-only, blkdev_write_iter() refuses further writes with -EPERM.
This is standard behaviour and all other block devices allow this.

set_disk_ro() call is also problematic: we affect the entire device
when called on a single partition.

All rbd_ioctl_set_ro() needs to do is refuse ro -> rw transition for
mapped snapshots.  Everything else can be handled by generic code.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:41 +01:00
Colin Ian King
bb0581f01c ceph: remove unused and redundant variable dropping
Variable dropping is set but never read and hence is redundant
and can be removed. Cleans up clang warning:

  fs/ceph/caps.c:1170:2: warning: Value stored to 'dropping' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:40 +01:00
Gustavo A. R. Silva
18370b36b2 ceph: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
[idryomov@gmail.com: amended "Older OSDs" comment]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:39 +01:00
Ilya Dryomov
76bd6ec498 ceph: -EINVAL on decoding failure in ceph_mdsc_handle_fsmap()
Don't set ->mdsmap_err to -ENOENT unconditionally, and drop unneeded
return statement while at it.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:39 +01:00
Yan, Zheng
933ad2c9c8 ceph: disable cached readdir after dropping positive dentry
Ideally CEPH_CAP_FILE_SHARED should have been revoked before
postive dentry get dropped. But if something goes wrong, later
cached readdir may dereference the dropped dentry.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:29 +01:00
Thomas Meyer
7271efa79f ceph: fix bool initialization/comparison
Bool initializations should use true and false. Bool tests don't need
comparisons.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:28 +01:00
Yan, Zheng
b3f8d68f38 ceph: handle 'session get evicted while there are file locks'
When session get evicted, all file locks associated with the session
get released remotely by mds. File locks tracked by kernel become
stale. In this situation, set an error flag on inode. The flag makes
further file locks return -EIO.

Another option to handle this situation is cleanup file locks tracked
kernel. I do not choose it because it is inconvenient to notify user
program about the error.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:28 +01:00
Yan, Zheng
4deb14a259 ceph: optimize flock encoding during reconnect
Don't malloc if there is no flock.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-11-13 12:11:27 +01:00