Commit Graph

58800 Commits

Author SHA1 Message Date
Kees Cook
8880fa32c5 pstore/ram: Run without kernel crash dump region
The ram pstore backend has always had the crash dumper frontend enabled
unconditionally. However, it was possible to effectively disable it
by setting a record_size=0. All the machinery would run (storing dumps
to the temporary crash buffer), but 0 bytes would ultimately get stored
due to there being no przs allocated for dumps. Commit 89d328f637
("pstore/ram: Correctly calculate usable PRZ bytes"), however, assumed
that there would always be at least one allocated dprz for calculating
the size of the temporary crash buffer. This was, of course, not the
case when record_size=0, and would lead to a NULL deref trying to find
the dprz buffer size:

BUG: unable to handle kernel NULL pointer dereference at (null)
...
IP: ramoops_probe+0x285/0x37e (fs/pstore/ram.c:808)

        cxt->pstore.bufsize = cxt->dprzs[0]->buffer_size;

Instead, we need to only enable the frontends based on the success of the
prz initialization and only take the needed actions when those zones are
available. (This also fixes a possible error in detecting if the ftrace
frontend should be enabled.)

Reported-and-tested-by: Yaro Slav <yaro330@gmail.com>
Fixes: 89d328f637 ("pstore/ram: Correctly calculate usable PRZ bytes")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2019-05-31 01:19:06 -07:00
Pi-Hsun Shih
a9fb94a99b pstore: Set tfm to NULL on free_buf_for_compression
Set tfm to NULL on free_buf_for_compression() after crypto_free_comp().

This avoid a use-after-free when allocate_buf_for_compression()
and free_buf_for_compression() are called twice. Although
free_buf_for_compression() freed the tfm, allocate_buf_for_compression()
won't reinitialize the tfm since the tfm pointer is not NULL.

Fixes: 95047b0519 ("pstore: Refactor compression initialization")
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2019-05-31 00:32:06 -07:00
Linus Torvalds
2e2c122001 This pull request contains the following fixes for UBIFS:
- Build errors wrt. xattrs
 - Mismerge which lead to a wrong Kconfig ifdef
 - Missing endianness conversion
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAlzhsxcWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wbuJEACe3ko0SOhNd0cP9jmHKguIuE10
 FceRVwzhcpJVAJvmoMcBlI36N9NhA+n/qSV6s03a58sX1ECFfeydAyT2pKPXMWiI
 h4gtNkAedPvLMeTRmP6gUFtvtMCKYWVeeqNZZoVjS/caIsb/v3fQZORflQ2uybxv
 +zdrZlrAKl9pP4j2C20mQD9z71jlTEiTys9zppVGcpeK1ByXRpDOTUcK9evVntyy
 k72VkWAnscYupfHehrBCLU04UZ4rI3zs9sk/Vyl6sa3ZiuwFDsJUjQc+Tq2D7ZKL
 XtenVr+qQ/4BMPQGFyXUk/YQCgqlTkdRLxIhMSyn2M2qOsVad57xKxTdB1HB2USp
 pfO4Ki0EsYcPuz+okMVTDfzPARUBA1RFJT0bOLqpVGP15ou52EOsqTwenxsyZCty
 AXPXXMtZO9qO00GdD0AcolgiMJGSQYsivkeQU7qgP7/jYDdKZ1NIVLtUUUBXm36g
 RvdfYJ+gTlaKWVTIimG9BouXnkxCDFmBcBJBXCfWSZXUD7dtlDOImgMGNmBiv+07
 l7mpTarJv2TB01AHaz1obv4iS5+5py5EM0HDPAOC2htslc1AsZnnrRQRN/Pohkvy
 36JSxxfiKZTZn0s9Syqg/i/cw90wuu2E8dWqJGDtVQD77U5rMaEuNxoqVqFd7ozR
 VV2Ry7SjX6Q0NqUnTQ==
 =0wS1
 -----END PGP SIGNATURE-----

Merge tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBIFS fixes from Richard Weinberger:

 - build errors wrt xattrs

 - mismerge which lead to a wrong Kconfig ifdef

 - missing endianness conversion

* tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubifs: Convert xattr inum to host order
  ubifs: Use correct config name for encryption
  ubifs: Fix build error without CONFIG_UBIFS_FS_XATTR
2019-05-19 15:22:03 -07:00
Linus Torvalds
cb6f8739fb Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "A few final bits:

   - large changes to vmalloc, yielding large performance benefits

   - tweak the console-flush-on-panic code

   - a few fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  panic: add an option to replay all the printk message in buffer
  initramfs: don't free a non-existent initrd
  fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
  mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock
  mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro
  mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro
  mm/vmalloc.c: keep track of free blocks for vmap allocation
2019-05-19 12:15:32 -07:00
Linus Torvalds
ff8583d6e4 Kbuild updates for v5.2 (2nd)
- remove unneeded use of cc-option, cc-disable-warning, cc-ldoption
 
  - exclude tracked files from .gitignore
 
  - re-enable -Wint-in-bool-context warning
 
  - refactor samples/Makefile
 
  - stop building immediately if syncconfig fails
 
  - do not sprinkle error messages when $(CC) does not exist
 
  - move arch/alpha/defconfig to the configs subdirectory
 
  - remove crappy header search path manipulation
 
  - add comment lines to .config to clarify the end of menu blocks
 
  - check uniqueness of module names (adding new warnings intentionally)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJc4KbfAAoJED2LAQed4NsGQW4P/3Iu+hM91EXqzdLqit99ML0x
 9hI3Ap0Xd5HvPy+j0xYmAeXyBYolGOVcwGNLv2WMPxOkGFxcBTfVmBgYvugboX30
 dwLMSXnL6MWL3xFVERTmRPl1STNwqHTnGal7Pru4HGUNnYUZJ9f9BGxoh0Qz55DJ
 j5ZQ9RhnYS+DmEo6OmKJepoiM77Pf3W+prCegoZ+5ZtWnYd622p9d/rjiimR94RU
 e5SFvGpSWbaUdaigjUqpA/GYW/iLy0T36xIUe20FQKe/8wstGOq61qKuPfgvxktA
 y1ajRfQbZ/8D3lJyIU7AOZ9xgoY05lmbTEPb6eP8WygG7chKRakX/vy1+AHmDp9D
 1K0IuwVdvFN2Qh5eLYkdtjX0tEUxX+m+CRv2NS75mNY/O3sTa7Yu6ZyZfCFRvGft
 ZOU6Axskr/L0f048WLs0TGpDD8wyi9d2VZywKH+7Kr9KImeqGU0kRD/JzgvCFmcy
 6lAfkptIMIL+VgSeN1igbJz6RjlGfTFvXPymvPzAcrTsVaxGHtjia/SC32h09Nw6
 gS2vD2lv/sO0EaSUjwlwrHC0HVj3lEX3cIA0RGabbfbht1D7mnDjOyi7HWnbs80D
 RHT4LCUuuPTufqCe+MQiQb08mcOSKcd58JdjZ1nfI/z/ME1zVkioMh63TFUPOfSX
 T8zcEjd0wxdeLnyNj6fK
 =oDeB
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - remove unneeded use of cc-option, cc-disable-warning, cc-ldoption

 - exclude tracked files from .gitignore

 - re-enable -Wint-in-bool-context warning

 - refactor samples/Makefile

 - stop building immediately if syncconfig fails

 - do not sprinkle error messages when $(CC) does not exist

 - move arch/alpha/defconfig to the configs subdirectory

 - remove crappy header search path manipulation

 - add comment lines to .config to clarify the end of menu blocks

 - check uniqueness of module names (adding new warnings intentionally)

* tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (24 commits)
  kconfig: use 'else ifneq' for Makefile to improve readability
  kbuild: check uniqueness of module names
  kconfig: Terminate menu blocks with a comment in the generated config
  kbuild: add LICENSES to KBUILD_ALLDIRS
  kbuild: remove 'addtree' and 'flags' magic for header search paths
  treewide: prefix header search paths with $(srctree)/
  media: prefix header search paths with $(srctree)/
  media: remove unneeded header search paths
  alpha: move arch/alpha/defconfig to arch/alpha/configs/defconfig
  kbuild: terminate Kconfig when $(CC) or $(LD) is missing
  kbuild: turn auto.conf.cmd into a mandatory include file
  .gitignore: exclude .get_maintainer.ignore and .gitattributes
  kbuild: add all Clang-specific flags unconditionally
  kbuild: Don't try to add '-fcatch-undefined-behavior' flag
  kbuild: add some extra warning flags unconditionally
  kbuild: add -Wvla flag unconditionally
  arch: remove dangling asm-generic wrappers
  samples: guard sub-directories with CONFIG options
  kbuild: re-enable int-in-bool-context warning
  MAINTAINERS: kbuild: Add pattern for scripts/*vmlinux*
  ...
2019-05-19 11:53:58 -07:00
Linus Torvalds
c4d36b63b2 Some bug fixes, and an update to the URL's for the final version of
Unicode 12.1.0.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlzg2qIACgkQ8vlZVpUN
 gaPYfgf+IadtbeBYbKcHQzw8OZ9UxyGv7+Havs523XzhdceiflpOCe3lpSFYuX6w
 0CtRIWEOqFzQLRHUlX/sRywYLvrvg3irDuYJGTzny9MfPEI+tVCkYi877PPAOOR3
 4xay7dF63mcCpmhRxgw1fLx5I/hYpzrDnw5dGe7jjgU6SrR5FpE2GmEmbfBW087W
 zAuRDkuTv03J791Kgpq4TAMNla9F2/oedq/2B0v43+TnKfYe1SEIDm3VusUvSfoI
 ++8CfRrRJ3D7vCmtFSuieFFb91HtHMqdv0gaad2cY1DfvJLVm7gP1uPL7QahA5Dq
 VfhvFfqshqN9Jizr6BqP5GPOj2J9cw==
 =QK+R
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Some bug fixes, and an update to the URL's for the final version of
  Unicode 12.1.0"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: avoid panic during forced reboot due to aborted journal
  ext4: fix block validity checks for journal inodes using indirect blocks
  unicode: update to Unicode 12.1.0 final
  unicode: add missing check for an error return from utf8lookup()
  ext4: fix miscellaneous sparse warnings
  ext4: unsigned int compared against zero
  ext4: fix use-after-free in dx_release()
  ext4: fix data corruption caused by overlapping unaligned and aligned IO
  jbd2: fix potential double free
  ext4: zero out the unused memory region in the extent tree block
2019-05-19 11:43:16 -07:00
Linus Torvalds
d8848eefc1 minor cleanup and fixes, one for stable, also adds SEEK_HOLE support
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlzfKpcACgkQiiy9cAdy
 T1G35AwAh/uXfq+F8Zk9EqqE8akRo6X0HDLzps5ICriQdYI2r5I+hUDLoRwNn2AX
 qeife0CBfcOyib2jlGV7rYjIPJ6SNAIUPLzMvupgXfMQABt+aOeFiQ6KQz/cYgd4
 bQEaLhOiCt7REc8vmhL6hKXjbypiO4wwLPl29kEjHr/ZNQUFCg8MBCpKY5uCpCQe
 paLfVusSzKTjuxRinSvUWSApASQ2GU9ys1hid3PN6/MUt1vD+IXnchWRUXBZ4sNi
 ztZgUNyH7AvdwoY+sM0hdT+jEehl4eTtDP1ZaeGz5Dhq4Fpe86OrfXBsySscUCgq
 wpjioh67ihSg9If8ZETfYJ2srn1VhyME01l30yuJB0HgRLebWXFXe4QDCbPYnUFh
 9wWZavRUlQuiKx9x2mszHBL5L3r6mZXdOHEumm2DV8TJFTjlxDt6jhdxsZA3HAzB
 6Rz0F24F/Dez4ONdYS0eO8am1CScW1tTJU9+6xyW2/XP3YZunx8B6SxU7lu3nzNN
 jqZ7pk7o
 =JhcA
 -----END PGP SIGNATURE-----

Merge tag '5.2-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Minor cleanup and fixes, one for stable, four rdma (smbdirect)
  related. Also adds SEEK_HOLE support"

* tag '5.2-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add support for SEEK_DATA and SEEK_HOLE
  Fixed https://bugzilla.kernel.org/show_bug.cgi?id=202935 allow write on the same file
  cifs: Allocate memory for all iovs in smb2_ioctl
  cifs: Don't match port on SMBDirect transport
  cifs:smbd Use the correct DMA direction when sending data
  cifs:smbd When reconnecting to server, call smbd_destroy() after all MIDs have been called
  cifs: use the right include for signal_pending()
  smb3: trivial cleanup to smb2ops.c
  cifs: cleanup smb2ops.c and normalize strings
  smb3: display session id in debug data
2019-05-19 11:38:18 -07:00
Jiufei Xue
ec084de929 fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
synchronize_rcu() didn't wait for call_rcu() callbacks, so inode wb
switch may not go to the workqueue after synchronize_rcu().  Thus
previous scheduled switches was not finished even flushing the
workqueue, which will cause a NULL pointer dereferenced followed below.

  VFS: Busy inodes after unmount of vdd. Self-destruct in 5 seconds.  Have a nice day...
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000278
    evict+0xb3/0x180
    iput+0x1b0/0x230
    inode_switch_wbs_work_fn+0x3c0/0x6a0
    worker_thread+0x4e/0x490
    ? process_one_work+0x410/0x410
    kthread+0xe6/0x100
    ret_from_fork+0x39/0x50

Replace the synchronize_rcu() call with a rcu_barrier() to wait for all
pending callbacks to finish.  And inc isw_nr_in_flight after call_rcu()
in inode_switch_wbs() to make more sense.

Link: http://lkml.kernel.org/r/20190429024108.54150-1-jiufei.xue@linux.alibaba.com
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Suggested-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18 15:52:26 -07:00
Masahiro Yamada
9cc342f6c4 treewide: prefix header search paths with $(srctree)/
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].

To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.

Having whitespaces after -I does not matter since commit 48f6e3cf5b
("kbuild: do not drop -I without parameter").

[1]: https://patchwork.kernel.org/patch/9632347/

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-05-18 11:49:57 +09:00
Jan Kara
2c1d0e3631 ext4: avoid panic during forced reboot due to aborted journal
Handling of aborted journal is a special code path different from
standard ext4_error() one and it can call panic() as well. Commit
1dc1097ff6 ("ext4: avoid panic during forced reboot") forgot to update
this path so fix that omission.

Fixes: 1dc1097ff6 ("ext4: avoid panic during forced reboot")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # 5.1
2019-05-17 17:37:18 -04:00
Linus Torvalds
bf8a9a4755 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs mount updates from Al Viro:
 "Propagation of new syscalls to other architectures + cosmetic change
  from Christian (fscontext didn't follow the convention for anon inode
  names)"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
  uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]
  uapi, fsopen: use square brackets around "fscontext" [ver #2]
2019-05-17 09:46:31 -07:00
Linus Torvalds
a6a4b66bd8 for-linus-20190516
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlzd7MoQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphgjD/4nCFARTigr4yFIuhATpyA00CEoafjVuMZt
 h2GX+NdU4IyUkEmMqWFzqX3T3OXHvgio2mR5YrCSR/e5Ju0C6QpwvaSzom163WlD
 BSC0ZtOwvLyE3TB/vGTvjjpnTlTvC7wYokTeS0L7dHQVUWWBXLwOXrO7YUpaTWTr
 XDauk0GAJFLzgAH8HnNgV8bfeyPvDNzdn79Ryp/FXMqjbN8Diu6CkPhjYhSs8Yz2
 F9g0zRSfYNKqDouJl+HfYwseUGl4udJ7tb4wsHNI7LwFA/prN7k35/ASErgmeFDn
 0QPJZv14bYgKMRUMwnLpEvpA/B2hRSr8/JwN+RqIL8EXHjokFUqTNeUcWJwqktwS
 pvcTUk7CBK58rBnNKz8sUlDjGfntu6P9zC5+p57aHkP8+ZhI2T22wh2spCVE2oX4
 1+GqdgTbqJMmhwoEEuD2zfBENr4wyshjOLdr8E3B6dtRQOveh0vUd3LX6NrfWXQK
 OCK8gpjazOVFJBFmqoGCbbvd31k9YFrQCS9vtIiHE1Fw9EIDX70BasV+iBl3tGWS
 23/I9SL5j370mmcIR2y/iPxgGoFp2e9v0V20DlmN2OC/Qa6HG/XqUg9jk8kYLSS4
 aOyEVuoI5X1K+0BqAmLdXWBpVwx12ebRRbid6HUun1dNTrSwETepWdr3+nmGTZ7/
 +V26xrj2RA==
 =SOm1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190516' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A small set of fixes for io_uring.

  This contains:

   - smp_rmb() cleanup for io_cqring_events() (Jackie)

   - io_cqring_wait() simplification (Jackie)

   - removal of dead 'ev_flags' passing (me)

   - SQ poll CPU affinity verification fix (me)

   - SQ poll wait fix (Roman)

   - SQE command prep cleanup and fix (Stefan)"

* tag 'for-linus-20190516' of git://git.kernel.dk/linux-block:
  io_uring: use wait_event_interruptible for cq_wait conditional wait
  io_uring: adjust smp_rmb inside io_cqring_events
  io_uring: fix infinite wait in khread_park() on io_finish_async()
  io_uring: remove 'ev_flags' argument
  io_uring: fix failure to verify SQ_AFF cpu
  io_uring: fix race condition reading SQE data
2019-05-16 19:10:37 -07:00
Linus Torvalds
0d74471924 AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXN3U4fu3V2unywtrAQKXoA/+LPCttnaq9GpTiVlfPXsBHnbphs0gjZtK
 gN4utmmsLx7sKLCwxRYdOhTSMlShby+hd9FZ71rsEppiF6hATXyIwK7UPD8+D83l
 4eNT7RECPaQtBGDsw4tYd1AA8OVRM6v/+r2AhWpwYrZXIEOkYKJ0st9HLz63M64X
 HOPOVabEVBlTmsbKRULBgZdFPXhQiZWsJHPINkIegPi21KETLb2KVBEciBNKI1iX
 Jb5eAb8tO1a1y2vJNG/YJn1HzVo0gMzzo/dTgmaIkyu+5ULGkqFk/OLZHJ6rwLwd
 peqIzbdtmBNpd43u942zbo2Tx3jegIa9y5dg/WT2NnIUJ5FAfysxXiMi1AnhSbjc
 NRRwUVK1XBZCGZeGBIKtfY1CfgRGAq2rmr1MyXgt++Vciz9BekXgzB7GAqmMHvA0
 6Ud5j6oCqrQhxt/mIPvfJcqnuguuTgadwgHoani/366t0gCRT/lxswbrh4Nv86l9
 CDSRFEBjkSJAwV07xqX37ppoMtP+iECHl6elkLf5HYh9vFI223fK9Jpo6eoUsRj8
 YLYOLtV0LBeZ2Wj4+rFxbDVvUCWz4Gh3hr/YyLwtyVmyuoE0LstHIn51WZ646NOa
 l0Jyjf1Q/CqF98uP65H2USTxInNnZRV2Du8qmtv4MELITPm1MfGUk/nd9oMipGaq
 smKgUe0M7cA=
 =NstZ
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-b-20190516' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS callback promise fixes from David Howells:
 "This series fixes a bunch of problems in callback promise handling,
  where a callback promise indicates a promise on the part of the server
  to notify the client in the event of some sort of change to a file or
  volume. In the event of a break, the client has to go and refetch the
  client status from the server and discard any cached permission
  information as the ACL might have changed.

  The problem in the current code is that changes made by other clients
  aren't always noticed, primarily because the file status information
  and the callback information aren't updated in the same critical
  section, even if these are carried in the same reply from an RPC
  operation, and so the AFS_VNODE_CB_PROMISED flag is unreliable.

  Arranging for them to be done in the same critical section during
  reply decoding is tricky because of the FS.InlineBulkStatus op - which
  has all the statuses in the reply arriving and then all the callbacks,
  so they have to be buffered. It simplifies things a lot to move the
  critical section out of the decode phase and do it after the RPC
  function returns.

  Also new inodes (either newly fetched or newly created) aren't
  properly managed against a callback break happening before we get the
  local inode up and running.

  Fix this by:

   - There's now a combined file status and callback record (struct
     afs_status_cb) to carry both plus some flags.

   - Each operation wrapper function allocates sufficient afs_status_cb
     records for all the vnodes it is interested in and passes them into
     RPC operations to be filled in from the reply.

   - The FileStatus and CallBack record decoders no longer apply the
     new/revised status and callback information to the inode/vnode at
     the point of decoding and instead store the information into the
     record from (2).

   - afs_vnode_commit_status() then revises the file status, detects
     deletion and notes callback information inside of a single critical
     section. It also checks the callback break counters and cancels the
     callback promise if they changed during the operation.

     [*] Note that "callback break counters" are counters of server
     events that cancel one or more callback promises that the client
     thinks it has. The client counts the events and compares the
     counters before and after an operation to see if the callback
     promise it thinks it just got evaporated before it got recorded
     under lock.

   - Volume and server callback break counters are passed into
     afs_iget() allowing callback breaks concurrent with inode set up to
     be detected and the callback promise thence to be cancelled.

   - AFS validation checks are now done under RCU conditions using a
     read lock on cb_lock. This requires vnode->cb_interest to be made
     RCU safe.

   - If the checks in (6) fail, the callback breaker is then called
     under write lock on the cb_lock - but only if the callback break
     counter didn't change from the value read before the checks were
     made.

   - Results from FS.InlineBulkStatus that correspond to inodes we
     currently have in memory are now used to update those inodes'
     status and callback information rather than being discarded. This
     requires those inodes to be looked up before the RPC op is made and
     all their callback break values saved.

  To aid in this, the following changes have also been made:

   - Don't pass the vnode into the reply delivery functions or the
     decoders. The vnode shouldn't be altered anywhere in those paths.
     The only exception, for the moment, is for the call done hook for
     file lock ops that wants access to both the vnode and the call -
     this can be fixed at a later time.

   - Get rid of the call->reply[] void* array and replace it with named
     and typed members. This avoids confusion since different ops were
     mapping different reply[] members to different things.

   - Fix an order-1 kmalloc allocation in afs_do_lookup() and replace it
     with kvcalloc().

   - Always get the reply time. Since callback, lock and fileserver
     record expiry times are calculated for several RPCs, make this
     mandatory.

   - Call afs_pages_written_back() from the operation wrapper rather
     than from the delivery function.

   - Don't store the version and type from a callback promise in a reply
     as the information in them is of very limited use"

* tag 'afs-fixes-b-20190516' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix application of the results of a inline bulk status fetch
  afs: Pass pre-fetch server and volume break counts into afs_iget5_set()
  afs: Fix unlink to handle YFS.RemoveFile2 better
  afs: Clear AFS_VNODE_CB_PROMISED if we detect callback expiry
  afs: Make vnode->cb_interest RCU safe
  afs: Split afs_validate() so first part can be used under LOOKUP_RCU
  afs: Don't save callback version and type fields
  afs: Fix application of status and callback to be under same lock
  afs: Always get the reply time
  afs: Fix order-1 allocation in afs_do_lookup()
  afs: Get rid of afs_call::reply[]
  afs: Don't pass the vnode pointer through into the inline bulk status op
2019-05-16 17:18:41 -07:00
Linus Torvalds
227747fb9e AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXN2BC/u3V2unywtrAQJm9w/+L7ufbRkj6XGVongmhf4n+auBQXMJ4jec
 zN6bjWrp/SN9kJfOqOKA+sk9s3cCOCV8SF/2eM5P8DJNtrB6aXlg590u1wSkOp99
 FdSM8Fy7v4bTwW9hCBhvcFpC+layVUEv/WAsCCIZi94W+H43XFY4QM79cqoqIx8r
 nTLu9EcjWFpUoBIAYEU0x/h4IA5Cyl6CUw3YZhZYaGoLLfi9EZkgBLlUU+6OXpDO
 Uepzn1gnpXMCNsiBE/Hr9LR0pfOTtzdJuNADrppRnbPfky8RsPE8tuk6kT6301U1
 IxG66SafYsvbQGzyIdfTydl022DFj5LOtCPFtfALviJqdBOGE/zPPnrBPinHg4oJ
 40P2tIJ/+Ksz5cPzmkA1KanSXaQ2v0sLBVdQJ7yt5EFuAMzj/roWpiPmEmQd6KqB
 ixZdZLehKFPaAB5cR41fHV1jB30HN7oakwqCoYmXd1Chu3AlB15yV9WZMSqjPS8P
 pkNC/X5mU5hDnZUx9e3Fbu8LqoGOjnGvDn5jOxihdKfaGu3A4OlbSerIUbRHvnT8
 u8XDPoq4j61f04MiI9z/bPDFTRYyycIQPcHYQpi4MJt9lSkkydP217P60BJsUv2n
 NIPYwgI7VIse0Gdo8shIg+RnSnJaKHT9Sf86h8pyDFO6wZp/GVVqPSdjjU+Lv5fv
 CZGJ7PCYcfs=
 =2q2Y
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20190516' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull misc AFS fixes from David Howells:
 "This fixes a set of miscellaneous issues in the afs filesystem,
  including:

   - leak of keys on file close.

   - broken error handling in xattr functions.

   - missing locking when updating VL server list.

   - volume location server DNS lookup whereby preloaded cells may not
     ever get a lookup and regular DNS lookups to maintain server lists
     consume power unnecessarily.

   - incorrect error propagation and handling in the fileserver
     iteration code causes operations to sometimes apparently succeed.

   - interruption of server record check/update side op during
     fileserver iteration causes uninterruptible main operations to fail
     unexpectedly.

   - callback promise expiry time miscalculation.

   - over invalidation of the callback promise on directories.

   - double locking on callback break waking up file locking waiters.

   - double increment of the vnode callback break counter.

  Note that it makes some changes outside of the afs code, including:

   - an extra parameter to dns_query() to allow the dns_resolver key
     just accessed to be immediately invalidated. AFS is caching the
     results itself, so the key can be discarded.

   - an interruptible version of wait_var_event().

   - an rxrpc function to allow the maximum lifespan to be set on a
     call.

   - a way for an rxrpc call to be marked as non-interruptible"

* tag 'afs-fixes-20190516' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix double inc of vnode->cb_break
  afs: Fix lock-wait/callback-break double locking
  afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set
  afs: Fix calculation of callback expiry time
  afs: Make dynamic root population wait uninterruptibly for proc_cells_lock
  afs: Make some RPC operations non-interruptible
  rxrpc: Allow the kernel to mark a call as being non-interruptible
  afs: Fix error propagation from server record check/update
  afs: Fix the maximum lifespan of VL and probe calls
  rxrpc: Provide kernel interface to set max lifespan on a call
  afs: Fix "kAFS: AFS vnode with undefined type 0"
  afs: Fix cell DNS lookup
  Add wait_var_event_interruptible()
  dns_resolver: Allow used keys to be invalidated
  afs: Fix afs_cell records to always have a VL server list record
  afs: Fix missing lock when replacing VL server list
  afs: Fix afs_xattr_get_yfs() to not try freeing an error value
  afs: Fix incorrect error handling in afs_xattr_get_acl()
  afs: Fix key leak in afs_release() and afs_evict_inode()
2019-05-16 17:00:13 -07:00
Linus Torvalds
1d9d7cbf28 On the filesystem side we have:
- a fix to enforce quotas set above the mount point (Luis Henriques)
 
 - support for exporting snapshots through NFS (Zheng Yan)
 
 - proper statx implementation (Jeff Layton).  statx flags are mapped
   to MDS caps, with AT_STATX_{DONT,FORCE}_SYNC taken into account.
 
 - some follow-up dentry name handling fixes, in particular elimination
   of our hand-rolled helper and the switch to __getname() as suggested
   by Al (Jeff Layton)
 
 - a set of MDS client cleanups in preparation for async MDS requests
   in the future (Jeff Layton)
 
 - a fix to sync the filesystem before remounting (Jeff Layton)
 
 On the rbd side, work is on-going on object-map and fast-diff image
 features.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAlzdgEkTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi2w0B/9AsskuQezu8HP0NumCNfdgfI02r6d1
 1ZixMp6q8AAtOZYHP0bmiLzaETwC3+sRkD+8nX5DWuFISyjkTlRn8f7wnoziWkBT
 bBmL21fufkSKXN41VFCdolAbUPCKuA8+Fr7YE2hCl517ejbf47W+htv7+a56eTiR
 iAiDyVYokB8sj7WTVW6ET4HJTvJly1Z4QUNmy9Ljfzc8AvL2LFLOe6FRsJtIThdx
 aE00RX9EQsKO2v9ROd6jDmZocg50TvFmgF14A5GFfMmFrxJuri2yEI4iZd3hSKu2
 yZ+fBWmRy4E9w5E20qufrM+bSVjA+Zi7aiTMriaBm54aYtflgJ5gxhFI
 =68dZ
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.2-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "On the filesystem side we have:

   - a fix to enforce quotas set above the mount point (Luis Henriques)

   - support for exporting snapshots through NFS (Zheng Yan)

   - proper statx implementation (Jeff Layton). statx flags are mapped
     to MDS caps, with AT_STATX_{DONT,FORCE}_SYNC taken into account.

   - some follow-up dentry name handling fixes, in particular
     elimination of our hand-rolled helper and the switch to __getname()
     as suggested by Al (Jeff Layton)

   - a set of MDS client cleanups in preparation for async MDS requests
     in the future (Jeff Layton)

   - a fix to sync the filesystem before remounting (Jeff Layton)

  On the rbd side, work is on-going on object-map and fast-diff image
  features"

* tag 'ceph-for-5.2-rc1' of git://github.com/ceph/ceph-client: (29 commits)
  ceph: flush dirty inodes before proceeding with remount
  ceph: fix unaligned access in ceph_send_cap_releases
  libceph: make ceph_pr_addr take an struct ceph_entity_addr pointer
  libceph: fix unaligned accesses in ceph_entity_addr handling
  rbd: don't assert on writes to snapshots
  rbd: client_mutex is never nested
  ceph: print inode number in __caps_issued_mask debugging messages
  ceph: just call get_session in __ceph_lookup_mds_session
  ceph: simplify arguments and return semantics of try_get_cap_refs
  ceph: fix comment over ceph_drop_caps_for_unlink
  ceph: move wait for mds request into helper function
  ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request
  ceph: after an MDS request, do callback and completions
  ceph: use pathlen values returned by set_request_path_attr
  ceph: use __getname/__putname in ceph_mdsc_build_path
  ceph: use ceph_mdsc_build_path instead of clone_dentry_name
  ceph: fix potential use-after-free in ceph_mdsc_build_path
  ceph: dump granular cap info in "caps" debugfs file
  ceph: make iterate_session_caps a public symbol
  ceph: fix NULL pointer deref when debugging is enabled
  ...
2019-05-16 16:24:01 -07:00
David Howells
39db9815da afs: Fix application of the results of a inline bulk status fetch
Fix afs_do_lookup() such that when it does an inline bulk status fetch op,
it will update inodes that are already extant (something that afs_iget()
doesn't do) and to cache permits for each inode created (thereby avoiding a
follow up FS.FetchStatus call to determine this).

Extant inodes need looking up in advance so that their cb_break counters
before and after the operation can be compared.  To this end, the inode
pointers are cached so that they don't need looking up again after the op.

Fixes: 5cf9dd55a0 ("afs: Prospectively look up extra files when doing a single lookup")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
b835915325 afs: Pass pre-fetch server and volume break counts into afs_iget5_set()
Pass the server and volume break counts from before the status fetch
operation that queried the attributes of a file into afs_iget5_set() so
that the new vnode's break counters can be initialised appropriately.

This allows detection of a volume or server break that happened whilst we
were fetching the status or setting up the vnode.

Fixes: c435ee3455 ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
a38a75581e afs: Fix unlink to handle YFS.RemoveFile2 better
Make use of the status update for the target file that the YFS.RemoveFile2
RPC op returns to correctly update the vnode as to whether the file was
actually deleted or just had nlink reduced.

Fixes: 30062bd13e ("afs: Implement YFS support in the fs client")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
61c347ba55 afs: Clear AFS_VNODE_CB_PROMISED if we detect callback expiry
Fix afs_validate() to clear AFS_VNODE_CB_PROMISED on a vnode if we detect
any condition that causes the callback promise to be broken implicitly,
including server break (cb_s_break), volume break (cb_v_break) or callback
expiry.

Fixes: ae3b7361dc ("afs: Fix validation/callback interaction")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
f642404a04 afs: Make vnode->cb_interest RCU safe
Use RCU-based freeing for afs_cb_interest struct objects and use RCU on
vnode->cb_interest.  Use that change to allow afs_check_validity() to use
read_seqbegin_or_lock() instead of read_seqlock_excl().

This also requires the caller of afs_check_validity() to hold the RCU read
lock across the call.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
c925bd0ac4 afs: Split afs_validate() so first part can be used under LOOKUP_RCU
Split afs_validate() so that the part that decides if the vnode is still
valid can be used under LOOKUP_RCU conditions from afs_d_revalidate().

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
David Howells
7c71245866 afs: Don't save callback version and type fields
Don't save callback version and type fields as the version is about the
format of the callback information and the type is relative to the
particular RPC call.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 22:23:21 +01:00
Linus Torvalds
4e785e8d99 configs updates for Linux 5.2:
- a fix for an error path use after free (YueHaibing)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlzdHsQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPcPBAAj2ZWXip3ouxUxO7dTQYHEx3M0otpI8+r7cX0d3ud
 sJR7GLcXfl/+wj5D34TFFLdUFQFezshy3jBKQE2dD6/b8DL9qMv5BSgI3Z96h27h
 S9jPO5n5pSoHqnfVxNaG3OZBOhWMVOlD3YMBlikcjkKz0ad4ZxscRIPxDXranLwk
 nRtYMLYAAqctBYlXWjqG5zz/kzEWvI0hzJsrihunHQpQZs1398NTUvqCTRn/rsBl
 RfCJUG+mIe54JiO+umTRHPiriMjsCUFxiW6tDzyPnM82lQHnXR/qkN7NeReHu6FR
 unxjK0Yxu2zi/E4daTx/GEZM1mGGyyamwxbYYQ2obQeG34R9WmtJpE7d385rHJC8
 H3oOihvRjYrUTlsilPkISUB+GbtDXpuh7Ij6zm+ypZ6J+Lug1SrP0HmH8ti2nRbD
 tCxai02BcS4BivfIxidx1q8JYBSg1KFLXjz5O7HjpxGmw893l2IhbBAvy1HEhpP7
 nOuebnLDdfCbdWtDRfH3Wa9wotR3Y3nuGrmeD4z9aZ1qyH8acWmSP9Oq6z/UGpbN
 4N55eFqKFZZxSJQfFNqcGd7cS70D92h1MM/mpWgEDD1qvBe1kAxD2SfGIm2559J2
 tZbxSv8PjXYzWXaAVmHsagRVbB4rhSRMWswGey0YEkDG+viuIqpPV5Fx++RzP3pf
 6wQ=
 =lNOc
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-5.2' of git://git.infradead.org/users/hch/configfs

Pull configfs update from Christoph Hellwig:

 - a fix for an error path use after free (YueHaibing)

* tag 'configfs-for-5.2' of git://git.infradead.org/users/hch/configfs:
  configfs: fix possible use-after-free in configfs_register_group
2019-05-16 11:46:58 -07:00
Christian Brauner
1cdc415f10 uapi, fsopen: use square brackets around "fscontext" [ver #2]
Make the name of the anon inode fd "[fscontext]" instead of "fscontext".
This is minor but most core-kernel anon inode fds already carry square
brackets around their name:

[eventfd]
[eventpoll]
[fanotify]
[io_uring]
[pidfd]
[signalfd]
[timerfd]
[userfaultfd]

For the sake of consistency lets do the same for the fscontext anon inode
fd that comes with the new mount api.

Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-16 12:23:45 -04:00
David Howells
fd711586bb afs: Fix double inc of vnode->cb_break
When __afs_break_callback() clears the CB_PROMISED flag, it increments
vnode->cb_break to trigger a future refetch of the status and callback -
however it also calls afs_clear_permits(), which also increments
vnode->cb_break.

Fix this by removing the increment from afs_clear_permits().

Whilst we're at it, fix the conditional call to afs_put_permits() as the
function checks to see if the argument is NULL, so the check is redundant.

Fixes: be080a6f43 ("afs: Overhaul permit caching");
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
a58823ac45 afs: Fix application of status and callback to be under same lock
When applying the status and callback in the response of an operation,
apply them in the same critical section so that there's no race between
checking the callback state and checking status-dependent state (such as
the data version).

Fix this by:

 (1) Allocating a joint {status,callback} record (afs_status_cb) before
     calling the RPC function for each vnode for which the RPC reply
     contains a status or a status plus a callback.  A flag is set in the
     record to indicate if a callback was actually received.

 (2) These records are passed into the RPC functions to be filled in.  The
     afs_decode_status() and yfs_decode_status() functions are removed and
     the cb_lock is no longer taken.

 (3) xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() no longer
     update the vnode.

 (4) xdr_decode_AFSCallBack() and xdr_decode_YFSCallBack() no longer update
     the vnode.

 (5) vnodes, expected data-version numbers and callback break counters
     (cb_break) no longer need to be passed to the reply delivery
     functions.

     Note that, for the moment, the file locking functions still need
     access to both the call and the vnode at the same time.

 (6) afs_vnode_commit_status() is now given the cb_break value and the
     expected data_version and the task of applying the status and the
     callback to the vnode are now done here.

     This is done under a single taking of vnode->cb_lock.

 (7) afs_pages_written_back() is now called by afs_store_data() rather than
     by the reply delivery function.

     afs_pages_written_back() has been moved to before the call point and
     is now given the first and last page numbers rather than a pointer to
     the call.

 (8) The indicator from YFS.RemoveFile2 as to whether the target file
     actually got removed (status.abort_code == VNOVNODE) rather than
     merely dropping a link is now checked in afs_unlink rather than in
     xdr_decode_YFSFetchStatus().

Supplementary fixes:

 (*) afs_cache_permit() now gets the caller_access mask from the
     afs_status_cb object rather than picking it out of the vnode's status
     record.  afs_fetch_status() returns caller_access through its argument
     list for this purpose also.

 (*) afs_inode_init_from_status() now uses a write lock on cb_lock rather
     than a read lock and now sets the callback inside the same critical
     section.

Fixes: c435ee3455 ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
c7226e407b afs: Fix lock-wait/callback-break double locking
__afs_break_callback() holds vnode->lock around its call of
afs_lock_may_be_available() - which also takes that lock.

Fix this by not taking the lock in __afs_break_callback().

Also, there's no point checking the granted_locks and pending_locks queues;
it's sufficient to check lock_state, so move that check out of
afs_lock_may_be_available() into __afs_break_callback() to replace the
queue checks.

Fixes: e8d6c55412 ("AFS: implement file locking")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
4571577f16 afs: Always get the reply time
Always ask for the reply time from AF_RXRPC as it's used to calculate the
callback expiry time and lock expiry times, so it's needed by most FS
operations.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
d9052dda8a afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set
Don't invalidate the callback promise on a directory if the
AFS_VNODE_DIR_VALID flag is not set (which indicates that the directory
contents are invalid, due to edit failure, callback break, page reclaim).

The directory will be reloaded next time the directory is accessed, so
clearing the callback flag at this point may race with a reload of the
directory and cancel it's recorded callback promise.

Fixes: f3ddee8dc4 ("afs: Fix directory handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
87182759cd afs: Fix order-1 allocation in afs_do_lookup()
afs_do_lookup() will do an order-1 allocation to allocate status records if
there are more than 39 vnodes to stat.

Fix this by allocating an array of {status,callback} records for each vnode
we want to examine using vmalloc() if larger than a page.

This not only gets rid of the order-1 allocation, but makes it easier to
grow beyond 50 records for YFS servers.  It also allows us to move to
{status,callback} tuples for other calls too and makes it easier to lock
across the application of the status and the callback to the vnode.

Fixes: 5cf9dd55a0 ("afs: Prospectively look up extra files when doing a single lookup")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
781070551c afs: Fix calculation of callback expiry time
Fix the calculation of the expiry time of a callback promise, as obtained
from operations like FS.FetchStatus and FS.FetchData.

The time should be based on the timestamp of the first DATA packet in the
reply and the calculation needs to turn the ktime_t timestamp into a
time64_t.

Fixes: c435ee3455 ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
ffba718e93 afs: Get rid of afs_call::reply[]
Replace the afs_call::reply[] array with a bunch of typed members so that
the compiler can use type-checking on them.  It's also easier for the eye
to see what's going on.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
3b05e528cb afs: Make dynamic root population wait uninterruptibly for proc_cells_lock
Make dynamic root population wait uninterruptibly for proc_cells_lock.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
fefb2483dc afs: Don't pass the vnode pointer through into the inline bulk status op
Don't pass the vnode pointer through into the inline bulk status op.  We
want to process the status records outside of it anyway.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:21 +01:00
David Howells
20b8391fff afs: Make some RPC operations non-interruptible
Make certain RPC operations non-interruptible, including:

 (*) Set attributes
 (*) Store data

     We don't want to get interrupted during a flush on close, flush on
     unlock, writeback or an inode update, leaving us in a state where we
     still need to do the writeback or update.

 (*) Extend lock
 (*) Release lock

     We don't want to get lock extension interrupted as the file locks on
     the server are time-limited.  Interruption during lock release is less
     of an issue since the lock is time-limited, but it's better to
     complete the release to avoid a several-minute wait to recover it.

     *Setting* the lock isn't a problem if it's interrupted since we can
      just return to the user and tell them they were interrupted - at
      which point they can elect to retry.

 (*) Silly unlink

     We want to remove silly unlink files if we can, rather than leaving
     them for the salvager to clear up.

Note that whilst these calls are no longer interruptible, they do have
timeouts on them, so if the server stops responding the call will fail with
something like ETIME or ECONNRESET.

Without this, the following:

	kAFS: Unexpected error from FS.StoreData -512

appears in dmesg when a pending store data gets interrupted and some
processes may just hang.

Additionally, make the code that checks/updates the server record ignore
failure due to interruption if the main call is uninterruptible and if the
server has an address list.  The next op will check it again since the
expiration time on the old list has past.

Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:20 +01:00
David Howells
b960a34b73 rxrpc: Allow the kernel to mark a call as being non-interruptible
Allow kernel services using AF_RXRPC to indicate that a call should be
non-interruptible.  This allows kafs to make things like lock-extension and
writeback data storage calls non-interruptible.

If this is set, signals will be ignored for operations on that call where
possible - such as waiting to get a call channel on an rxrpc connection.

It doesn't prevent UDP sendmsg from being interrupted, but that will be
handled by packet retransmission.

rxrpc_kernel_recv_data() isn't affected by this since that never waits,
preferring instead to return -EAGAIN and leave the waiting to the caller.

Userspace initiated calls can't be set to be uninterruptible at this time.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:20 +01:00
David Howells
0ab4c95948 afs: Fix error propagation from server record check/update
afs_check/update_server_record() should be setting fc->error rather than
fc->ac.error as they're called from within the cursor iteration function.

afs_fs_cursor::error is where the error code of the attempt to call the
operation on multiple servers is integrated and is the final result,
whereas afs_addr_cursor::error is used to hold the error from individual
iterations of the call loop.  (Note there's also an afs_vl_cursor which
also wraps afs_addr_cursor for accessing VL servers rather than file
servers).

Fix this by setting fc->error in the afs_check/update_server_record() so
that any error incurred whilst talking to the VL server correctly
propagates to the final result.

This results in:

	kAFS: Unexpected error from FS.StoreData -512

being seen, even though the store-data op is non-interruptible.  The error
is actually coming from the server record update getting interrupted.

Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:20 +01:00
David Howells
94f699c9cd afs: Fix the maximum lifespan of VL and probe calls
If an older AFS server doesn't support an operation, it may accept the call
and then sit on it forever, happily responding to pings that make kafs
think that the call is still alive.

Fix this by setting the maximum lifespan of Volume Location service calls
in particular and probe calls in general so that they don't run on
endlessly if they're not supported.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 16:25:20 +01:00
David Howells
51eba99970 afs: Fix "kAFS: AFS vnode with undefined type 0"
Under some circumstances afs_select_fileserver() can return without setting
an error in fc->error.  The problem is in the no_more_servers segment where
the accumulated errors from attempts to contact various servers are
integrated into an afs_error-type variable 'e'.  The resultant error code
is, however, then abandoned.

Fix this by getting the error out of e.error and putting it in 'error' so
that the next part will store it into fc->error.

Not doing this causes a report like the following:

    kAFS: AFS vnode with undefined type 0
    kAFS: A=0 m=0 s=0 v=0
    kAFS: vnode 20000025:1:1

because the code following the server selection loop then sees what it
thinks is a successful invocation because fc.error is 0.  However, it can't
apply the status record because it's all zeros.

The report is followed on the first instance with a trace looking something
like:

     dump_stack+0x67/0x8e
     afs_inode_init_from_status.isra.2+0x21b/0x487
     afs_fetch_status+0x119/0x1df
     afs_iget+0x130/0x295
     afs_get_tree+0x31d/0x595
     vfs_get_tree+0x1f/0xe8
     fc_mount+0xe/0x36
     afs_d_automount+0x328/0x3c3
     follow_managed+0x109/0x20a
     lookup_fast+0x3bf/0x3f8
     do_last+0xc3/0x6a4
     path_openat+0x1af/0x236
     do_filp_open+0x51/0xae
     ? _raw_spin_unlock+0x24/0x2d
     ? __alloc_fd+0x1a5/0x1b7
     do_sys_open+0x13b/0x1e8
     do_syscall_64+0x7d/0x1b3
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 4584ae96ae ("afs: Fix missing net error handling")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 15:48:20 +01:00
Jackie Liu
fdb288a679 io_uring: use wait_event_interruptible for cq_wait conditional wait
The previous patch has ensured that io_cqring_events contain
smp_rmb memory barriers, Now we can use wait_event_interruptible
to keep the code simple.

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-16 08:10:27 -06:00
Jackie Liu
dc6ce4bc2b io_uring: adjust smp_rmb inside io_cqring_events
Whenever smp_rmb is required to use io_cqring_events,
keep smp_rmb inside the function io_cqring_events.

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-16 08:10:25 -06:00
Roman Penyaev
2bbcd6d3b3 io_uring: fix infinite wait in khread_park() on io_finish_async()
This fixes couple of races which lead to infinite wait of park completion
with the following backtraces:

  [20801.303319] Call Trace:
  [20801.303321]  ? __schedule+0x284/0x650
  [20801.303323]  schedule+0x33/0xc0
  [20801.303324]  schedule_timeout+0x1bc/0x210
  [20801.303326]  ? schedule+0x3d/0xc0
  [20801.303327]  ? schedule_timeout+0x1bc/0x210
  [20801.303329]  ? preempt_count_add+0x79/0xb0
  [20801.303330]  wait_for_completion+0xa5/0x120
  [20801.303331]  ? wake_up_q+0x70/0x70
  [20801.303333]  kthread_park+0x48/0x80
  [20801.303335]  io_finish_async+0x2c/0x70
  [20801.303336]  io_ring_ctx_wait_and_kill+0x95/0x180
  [20801.303338]  io_uring_release+0x1c/0x20
  [20801.303339]  __fput+0xad/0x210
  [20801.303341]  task_work_run+0x8f/0xb0
  [20801.303342]  exit_to_usermode_loop+0xa0/0xb0
  [20801.303343]  do_syscall_64+0xe0/0x100
  [20801.303349]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

  [20801.303380] Call Trace:
  [20801.303383]  ? __schedule+0x284/0x650
  [20801.303384]  schedule+0x33/0xc0
  [20801.303386]  io_sq_thread+0x38a/0x410
  [20801.303388]  ? __switch_to_asm+0x40/0x70
  [20801.303390]  ? wait_woken+0x80/0x80
  [20801.303392]  ? _raw_spin_lock_irqsave+0x17/0x40
  [20801.303394]  ? io_submit_sqes+0x120/0x120
  [20801.303395]  kthread+0x112/0x130
  [20801.303396]  ? kthread_create_on_node+0x60/0x60
  [20801.303398]  ret_from_fork+0x35/0x40

 o kthread_park() waits for park completion, so io_sq_thread() loop
   should check kthread_should_park() along with khread_should_stop(),
   otherwise if kthread_park() is called before prepare_to_wait()
   the following schedule() never returns:

   CPU#0                    CPU#1

   io_sq_thread_stop():     io_sq_thread():

                               while(!kthread_should_stop() && !ctx->sqo_stop) {

      ctx->sqo_stop = 1;
      kthread_park()

	                            prepare_to_wait();
                                    if (kthread_should_stop() {
				    }
                                    schedule();   <<< nobody checks park flag,
				                  <<< so schedule and never return

 o if the flag ctx->sqo_stop is observed by the io_sq_thread() loop
   it is quite possible, that kthread_should_park() check and the
   following kthread_parkme() is never called, because kthread_park()
   has not been yet called, but few moments later is is called and
   waits there for park completion, which never happens, because
   kthread has already exited:

   CPU#0                    CPU#1

   io_sq_thread_stop():     io_sq_thread():

      ctx->sqo_stop = 1;
                               while(!kthread_should_stop() && !ctx->sqo_stop) {
                                   <<< observe sqo_stop and exit the loop
			       }

			       if (kthread_should_park())
			           kthread_parkme();  <<< never called, since was
					              <<< never parked

      kthread_park()           <<< waits forever for park completion

In the current patch we quit the loop by only kthread_should_park()
check (kthread_park() is synchronous, so kthread_should_stop() is
never observed), and we abandon ->sqo_stop flag, since it is racy.
At the end of the io_sq_thread() we unconditionally call parmke(),
since we've exited the loop by the park flag.

Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-05-16 08:10:24 -06:00
David Howells
d5c32c89b2 afs: Fix cell DNS lookup
Currently, once configured, AFS cells are looked up in the DNS at regular
intervals - which is a waste of resources if those cells aren't being
used.  It also leads to a problem where cells preloaded, but not
configured, before the network is brought up end up effectively statically
configured with no VL servers and are unable to get any.

Fix this by not doing the DNS lookup until the first time a cell is
touched.  It is waited for if we don't have any cached records yet,
otherwise the DNS lookup to maintain the record is done in the background.

This has the downside that the first time you touch a cell, you now have to
wait for the upcall to do the required DNS lookups rather than them already
being cached.

Further, the record is not replaced if the old record has at least one
server in it and the new record doesn't have any.

Fixes: 0a5143f2f8 ("afs: Implement VL server rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
2019-05-16 12:58:23 +01:00
Ronnie Sahlberg
dece44e381 cifs: add support for SEEK_DATA and SEEK_HOLE
Add llseek op for SEEK_DATA and SEEK_HOLE.
Improves xfstests/285,286,436,445,448 and 490

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-15 22:27:53 -05:00
Kovtunenko Oleksandr
9ab70ca653 Fixed https://bugzilla.kernel.org/show_bug.cgi?id=202935 allow write on the same file
Copychunk allows source and target to be on the same file.
For details on restrictions see MS-SMB2 3.3.5.15.6

Signed-off-by: Kovtunenko Oleksandr <alexander198961@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-15 22:27:53 -05:00
Long Li
2c87d6a94d cifs: Allocate memory for all iovs in smb2_ioctl
An IOCTL uses up to 2 iovs. The 1st iov is the command itself, the 2nd iov is
optional data for that command. The 1st iov is always allocated on the heap
but the 2nd iov may point to a variable on the stack. This will trigger an
error when passing the 2nd iov for RDMA I/O.

Fix this by allocating a buffer for the 2nd iov.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2019-05-15 22:27:53 -05:00
Long Li
3b24911571 cifs: Don't match port on SMBDirect transport
SMBDirect manages its own ports in the transport layer, there is no need to
check the port to find a connection.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2019-05-15 22:27:45 -05:00
Linus Torvalds
700a800a94 This pull consists mostly of nfsd container work:
Scott Mayhew revived an old api that communicates with a userspace
 daemon to manage some on-disk state that's used to track clients across
 server reboots.  We've been using a usermode_helper upcall for that, but
 it's tough to run those with the right namespaces, so a daemon is much
 friendlier to container use cases.
 
 Trond fixed nfsd's handling of user credentials in user namespaces.  He
 also contributed patches that allow containers to support different sets
 of NFS protocol versions.
 
 The only remaining container bug I'm aware of is that the NFS reply
 cache is shared between all containers.  If anyone's aware of other gaps
 in our container support, let me know.
 
 The rest of this is miscellaneous bugfixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAlzcWNcVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+DUEP/0WD3jKNAHFV3M5YQPAI9fz/iCND
 Db/A4oWP5qa6JmwmHe61il29QeGqkeFr/NPexgzM3Xw2E39d7RBXBeWyVDuqb0wr
 6SCXjXibTsuAHg11nR8Xf0P5Vej3rfGbG6up5lLCIDTEZxVpWoaBJnM8+3bewuCj
 XbeiDW54oiMbmDjon3MXqVAIF/z7LjorecJ+Yw5+0Jy7KZ6num9Kt8+fi7qkEfFd
 i5Bp9KWgzlTbJUJV4EX3ZKN3zlGkfOvjoo2kP3PODPVMB34W8jSLKkRSA1tDWYZg
 43WhBt5OODDlV6zpxSJXehYKIB4Ae469+RRaIL4F+ORRK+AzR0C/GTuOwJiG+P3J
 n95DX5WzX74nPOGQJgAvq4JNpZci85jM3jEK1TR2M7KiBDG5Zg+FTsPYVxx5Sgah
 Akl/pjLtHQPSdBbFGHn5TsXU+gqWNiKsKa9663tjxLb8ldmJun6JoQGkAEF9UJUn
 dzv0UxyHeHAblhSynY+WsUR+Xep9JDo/p5LyFK4if9Sd62KeA1uF/MFhAqpKZF81
 mrgRCqW4sD8aVTBNZI06pZzmcZx4TRr2o+Oj5KAXf6Yk6TJRSGfnQscoMMBsTLkw
 VK1rBQ/71TpjLHGZZZEx1YJrkVZAMmw2ty4DtK2f9jeKO13bWmUpc6UATzVufHKA
 C1rUZXJ5YioDbYDy
 =TUdw
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "This consists mostly of nfsd container work:

  Scott Mayhew revived an old api that communicates with a userspace
  daemon to manage some on-disk state that's used to track clients
  across server reboots. We've been using a usermode_helper upcall for
  that, but it's tough to run those with the right namespaces, so a
  daemon is much friendlier to container use cases.

  Trond fixed nfsd's handling of user credentials in user namespaces. He
  also contributed patches that allow containers to support different
  sets of NFS protocol versions.

  The only remaining container bug I'm aware of is that the NFS reply
  cache is shared between all containers. If anyone's aware of other
  gaps in our container support, let me know.

  The rest of this is miscellaneous bugfixes"

* tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
  nfsd: update callback done processing
  locks: move checks from locks_free_lock() to locks_release_private()
  nfsd: fh_drop_write in nfsd_unlink
  nfsd: allow fh_want_write to be called twice
  nfsd: knfsd must use the container user namespace
  SUNRPC: rsi_parse() should use the current user namespace
  SUNRPC: Fix the server AUTH_UNIX userspace mappings
  lockd: Pass the user cred from knfsd when starting the lockd server
  SUNRPC: Temporary sockets should inherit the cred from their parent
  SUNRPC: Cache the process user cred in the RPC server listener
  nfsd: Allow containers to set supported nfs versions
  nfsd: Add custom rpcbind callbacks for knfsd
  SUNRPC: Allow further customisation of RPC program registration
  SUNRPC: Clean up generic dispatcher code
  SUNRPC: Add a callback to initialise server requests
  SUNRPC/nfs: Fix return value for nfs4_callback_compound()
  nfsd: handle legacy client tracking records sent by nfsdcld
  nfsd: re-order client tracking method selection
  nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
  nfsd: un-deprecate nfsdcld
  ...
2019-05-15 18:21:43 -07:00
Richard Weinberger
4dd0481584 ubifs: Convert xattr inum to host order
UBIFS stores inode numbers as LE64 integers.
We have to convert them to host oder, otherwise
BE hosts won't be able to use the integer correctly.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 9ca2d73264 ("ubifs: Limit number of xattrs per inode")
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-05-15 21:56:48 +02:00
Richard Weinberger
76aa349441 ubifs: Use correct config name for encryption
CONFIG_UBIFS_FS_ENCRYPTION is gone, fscrypt is now
controlled via CONFIG_FS_ENCRYPTION.
This problem slipped into the tree because of a mis-merge on
my side.

Reported-by: Eric Biggers <ebiggers@kernel.org>
Fixes: eea2c05d92 ("ubifs: Remove #ifdef around CONFIG_FS_ENCRYPTION")
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-05-15 21:56:48 +02:00