Commit Graph

630054 Commits

Author SHA1 Message Date
Vlastimil Babka
2d19309cf8 fs/select: add vmalloc fallback for select(2)
The select(2) syscall performs a kmalloc(size, GFP_KERNEL) where size grows
with the number of fds passed. We had a customer report page allocation
failures of order-4 for this allocation. This is a costly order, so it might
easily fail, as the VM expects such allocation to have a lower-order fallback.

Such trivial fallback is vmalloc(), as the memory doesn't have to be physically
contiguous and the allocation is temporary for the duration of the syscall
only. There were some concerns, whether this would have negative impact on the
system by exposing vmalloc() to userspace. Although an excessive use of vmalloc
can cause some system wide performance issues - TLB flushes etc. - a large
order allocation is not for free either and an excessive reclaim/compaction can
have a similar effect. Also note that the size is effectively limited by
RLIMIT_NOFILE which defaults to 1024 on the systems I checked. That means the
bitmaps will fit well within single page and thus the vmalloc() fallback could
be only excercised for processes where root allows a higher limit.

Note that the poll(2) syscall seems to use a linked list of order-0 pages, so
it doesn't need this kind of fallback.

[eric.dumazet@gmail.com: fix failure path logic]
[akpm@linux-foundation.org: use proper type for size]
Link: http://lkml.kernel.org/r/20160927084536.5923-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Darrick J. Wong
25f4c41415 block: implement (some of) fallocate for block devices
After much discussion, it seems that the fallocate feature flag
FALLOC_FL_ZERO_RANGE maps nicely to SCSI WRITE SAME; and the feature
FALLOC_FL_PUNCH_HOLE maps nicely to the devices that have been whitelisted
for zeroing SCSI UNMAP.  Punch still requires that FALLOC_FL_KEEP_SIZE is
set.  A length that goes past the end of the device will be clamped to the
device size if KEEP_SIZE is set; or will return -EINVAL if not.  Both
start and length must be aligned to the device's logical block size.

Since the semantics of fallocate are fairly well established already, wire
up the two pieces.  The other fallocate variants (collapse range, insert
range, and allocate blocks) are not supported.

Link: http://lkml.kernel.org/r/147518379992.22791.8849838163218235007.stgit@birch.djwong.org
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com> # tweaked header
Cc: Brian Foster <bfoster@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Darrick J. Wong
28b2be203e block: require write_same and discard requests align to logical block size
Make sure that the offset and length arguments that we're using to
construct WRITE SAME and DISCARD requests are actually aligned to the
logical block size.  Failure to do this causes other errors in other parts
of the block layer or the SCSI layer because disks don't support partial
logical block writes.

Link: http://lkml.kernel.org/r/147518379026.22791.4437508871355153928.stgit@birch.djwong.org
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mike Snitzer <snitzer@redhat.com> # tweaked header
Cc: Brian Foster <bfoster@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Darrick J. Wong
22dd6d3566 block: invalidate the page cache when issuing BLKZEROOUT
Patch series "fallocate for block devices", v11.

This is a patchset to fix page cache coherency with BLKZEROOUT and
implement fallocate for block devices.

The first patch is a fix to the existing BLKZEROOUT ioctl to invalidate
the page cache if the zeroing command to the underlying device succeeds.
Without this patch we still have the pagecache coherence bug that's been
in the kernel forever.

The second patch changes the internal block device functions to reject
attempts to discard or zeroout that are not aligned to the logical block
size.  Previously, we only checked that the start/len parameters were
512-byte aligned, which caused kernel BUG_ONs for unaligned IOs to 4k-LBA
devices.

The third patch creates an fallocate handler for block devices, wires up
the FALLOC_FL_PUNCH_HOLE flag to zeroing-discard, and connects
FALLOC_FL_ZERO_RANGE to write-same so that we can have a consistent
fallocate interface between files and block devices.  It also allows the
combination of PUNCH_HOLE and NO_HIDE_STALE to invoke non-zeroing discard.

Test cases for the new block device fallocate are now in xfstests as
generic/349-351.

This patch (of 3):

Invalidate the page cache (as a regular O_DIRECT write would do) to avoid
returning stale cache contents at a later time.

Link: http://lkml.kernel.org/r/147518378313.22791.16649519283678515021.stgit@birch.djwong.org
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Guozhonghua
0cc482ee41 ocfs2: fix memory leak in dlm_migrate_request_handler()
In the dlm_migrate_request_handler(), when `ret' is -EEXIST, the mle
should be freed, otherwise the memory will be leaked.

Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4A3D3522A@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: Guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Mark Fasheh <mfasheh@versity.com>
Cc: Eric Ren <zren@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 15:06:30 -07:00
Al Viro
1689c73a73 Fix off-by-one in __pipe_get_pages()
it actually worked only when requested area ended on the page boundary...

Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11 10:40:01 -07:00
Linus Torvalds
6b5e09a748 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Netfilter list handling fix, from Linus.

 2) RXRPC/AFS bug fixes from David Howells (oops on call to serviceless
    endpoints, build warnings, missing notifications, etc.) From David
    Howells.

 3) Kernel log message missing newlines, from Colin Ian King.

 4) Don't enter direct reclaim in netlink dumps, the idea is to use a
    high order allocation first and fallback quickly to a 0-order
    allocation if such a high-order one cannot be done cheaply and
    without reclaim. From Eric Dumazet.

 5) Fix firmware download errors in btusb bluetooth driver, from Ethan
    Hsieh.

 6) Missing Kconfig deps for QCOM_EMAC, from Geert Uytterhoeven.

 7) Fix MDIO_XGENE dup Kconfig entry. From Laura Abbott.

 8) Constrain ipv6 rtr_solicits sysctl values properly, from Maciej
    Żenczykowski.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
  netfilter: Fix slab corruption.
  be2net: Enable VF link state setting for BE3
  be2net: Fix TX stats for TSO packets
  be2net: Update Copyright string in be_hw.h
  be2net: NCSI FW section should be properly updated with ethtool for BE3
  be2net: Provide an alternate way to read pf_num for BEx chips
  wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent()
  net: macb: NULL out phydev after removing mdio bus
  xen-netback: make sure that hashes are not send to unaware frontends
  Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion
  MAINTAINERS: add myself as a maintainer of xen-netback
  ipv6 addrconf: disallow rtr_solicits < -1
  Bluetooth: btusb: Fix atheros firmware download error
  drivers: net: phy: Correct duplicate MDIO_XGENE entry
  ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM
  net: ethernet: mediatek: remove hwlro property in the device tree
  net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi
  net: ethernet: mediatek: get the chip id by ETHDMASYS registers
  net: bgmac: Fix errant feature flag check
  netlink: do not enter direct reclaim from netlink_dump()
  ...
2016-10-11 08:10:19 -07:00
Linus Torvalds
bd3769bfed netfilter: Fix slab corruption.
Use the correct pattern for singly linked list insertion and
deletion.  We can also calculate the list head outside of the
mutex.

Fixes: e3b37f11e6 ("netfilter: replace list_head with single linked list")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 75 deletions(-)
2016-10-11 04:44:37 -04:00
Linus Torvalds
101105b171 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 ">rename2() work from Miklos + current_time() from Deepa"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Replace current_fs_time() with current_time()
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: proc: Delete inode time initializations in proc_alloc_inode()
  vfs: Add current_time() api
  vfs: add note about i_op->rename changes to porting
  fs: rename "rename2" i_op to "rename"
  vfs: remove unused i_op->rename
  fs: make remaining filesystems use .rename2
  libfs: support RENAME_NOREPLACE in simple_rename()
  fs: support RENAME_NOREPLACE for local filesystems
  ncpfs: fix unused variable warning
2016-10-10 20:16:43 -07:00
Al Viro
3873691e5a Merge remote-tracking branch 'ovl/rename2' into for-linus 2016-10-10 23:02:51 -04:00
Linus Torvalds
35ff96dfd3 MTD updates for 4.9-rc1
NAND:
 
  * Add the infrastructure to automate NAND timings configuration
  * Provide a generic DT property to maximize ECC strength
  * Some refactoring in the core bad block table handling, to help with
    improving some of the logic in error cases.
  * Minor cleanups and fixes
 
 MTD:
 
  * Add APIs for handling page pairing; this is necessary for reliably
    supporting MLC and TLC NAND flash, where paired-page disturbance affects
    reliability. Upper layers (e.g., UBI) should make use of these in the near
    future.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX+cr9AAoJEFySrpd9RFgtSmcP/AvaNRXlrmKbZKg07kpJj3Ja
 XtxhftUwz7ncbjls99TD6ObGxWThIJ8U3oLsI5yoofJWiik5KaUk4jXUIVkGF5hm
 m1cDUX4biCwctdJzG03jboquFgwKP/atxxFCvEigauW3EafmUL4KrkrQ/bqOu7qN
 TDDyDL2K+v96lR2lYhCxWMZHcwK2ORGxbxdxfTqVE/NMLk217gHcrJEfJISPodfb
 A9dU/h7gLYF49E5L04Cko1I5HTnyhGPjQGIB/h8dIUlxtrzy1NRGG3IYo5gkdbve
 7yRSzbQB0jokcFdz1kg2SLXJZRArs9pYWUkFGGnYFaDGuFentyySaKgs+SO9gJHG
 wY48IL+RFlR0PF2PKVSdXLf9vgcjoVg9Oi2X5Ap4QJDZaTQqf0P0uz4aRTUbphQx
 /zY6X4Z6DWUXmLncz2tJ+ruwGoEZaUdXvX3/2ov0UnDjZ+w8hGuNNscE6xrrnKGf
 S9qiGOkxamS1Sg+jy2IWb/KBkkZgDXRkt02HecPJtV6kkA0fyLe281FiQU2/BHsb
 +aPA2zavaNMY+UGSkPci2kMZ0lMuIWxTDmH1L1XnscLsowHGrPd9D04zdeZNEP74
 mSnMrldlCt8xWD3xKV6Knh9AgPkXCB9MrsumrG9/RTpplnvEAPfic2YU7hRftKkn
 BJZXNhX1pg4qqhcHT0O5
 =/PuY
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20161008' of git://git.infradead.org/linux-mtd

Pull MTD updates from Brian Norris:
 "I've not been very active this cycle, so these are mostly from Boris,
  for the NAND flash subsystem.

  NAND:

   - Add the infrastructure to automate NAND timings configuration

   - Provide a generic DT property to maximize ECC strength

   - Some refactoring in the core bad block table handling, to help with
     improving some of the logic in error cases.

   - Minor cleanups and fixes

  MTD:

   - Add APIs for handling page pairing; this is necessary for reliably
     supporting MLC and TLC NAND flash, where paired-page disturbance
     affects reliability. Upper layers (e.g., UBI) should make use of
     these in the near future"

* tag 'for-linus-20161008' of git://git.infradead.org/linux-mtd: (35 commits)
  mtd: nand: fix trivial spelling error
  mtdpart: Propagate _get/put_device()
  mtd: nand: Provide nand_cleanup() function to free NAND related resources
  mtd: Kill the OF_MTD Kconfig option
  mtd: nand: mxc: Test CONFIG_OF instead of CONFIG_OF_MTD
  mtd: nand: Fix nand_command_lp() for 8bits opcodes
  mtd: nand: sunxi: Support ECC maximization
  mtd: nand: Support maximizing ECC when using software BCH
  mtd: nand: Add an option to maximize the ECC strength
  mtd: nand: mxc: Add timing setup for v2 controllers
  mtd: nand: mxc: implement onfi get/set features
  mtd: nand: sunxi: switch from manual to automated timing config
  mtd: nand: automate NAND timings selection
  mtd: nand: Expose data interface for ONFI mode 0
  mtd: nand: Add function to convert ONFI mode to data_interface
  mtd: nand: convert ONFI mode into data interface
  mtd: nand: Introduce nand_data_interface
  mtd: nand: Create a NAND reset function
  mtd: nand: remove unnecessary 'extern' from function declarations
  MAINTAINERS: Add maintainer entry for Ingenic JZ4780 NAND driver
  ...
2016-10-10 17:39:51 -07:00
Linus Torvalds
97d2116708 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs xattr updates from Al Viro:
 "xattr stuff from Andreas

  This completes the switch to xattr_handler ->get()/->set() from
  ->getxattr/->setxattr/->removexattr"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Remove {get,set,remove}xattr inode operations
  xattr: Stop calling {get,set,remove}xattr inode operations
  vfs: Check for the IOP_XATTR flag in listxattr
  xattr: Add __vfs_{get,set,remove}xattr helpers
  libfs: Use IOP_XATTR flag for empty directory handling
  vfs: Use IOP_XATTR flag for bad-inode handling
  vfs: Add IOP_XATTR inode operations flag
  vfs: Move xattr_resolve_name to the front of fs/xattr.c
  ecryptfs: Switch to generic xattr handlers
  sockfs: Get rid of getxattr iop
  sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
  kernfs: Switch to generic xattr handlers
  hfs: Switch to generic xattr handlers
  jffs2: Remove jffs2_{get,set,remove}xattr macros
  xattr: Remove unnecessary NULL attribute name check
2016-10-10 17:11:50 -07:00
Linus Torvalds
30066ce675 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.9:

  API:
   - The crypto engine code now supports hashes.

  Algorithms:
   - Allow keys >= 2048 bits in FIPS mode for RSA.

  Drivers:
   - Memory overwrite fix for vmx ghash.
   - Add support for building ARM sha1-neon in Thumb2 mode.
   - Reenable ARM ghash-ce code by adding import/export.
   - Reenable img-hash by adding import/export.
   - Add support for multiple cores in omap-aes.
   - Add little-endian support for sha1-powerpc.
   - Add Cavium HWRNG driver for ThunderX SoC"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
  crypto: caam - treat SGT address pointer as u64
  crypto: ccp - Make syslog errors human-readable
  crypto: ccp - clean up data structure
  crypto: vmx - Ensure ghash-generic is enabled
  crypto: testmgr - add guard to dst buffer for ahash_export
  crypto: caam - Unmap region obtained by of_iomap
  crypto: sha1-powerpc - little-endian support
  crypto: gcm - Fix IV buffer size in crypto_gcm_setkey
  crypto: vmx - Fix memory corruption caused by p8_ghash
  crypto: ghash-generic - move common definitions to a new header file
  crypto: caam - fix sg dump
  hwrng: omap - Only fail if pm_runtime_get_sync returns < 0
  crypto: omap-sham - shrink the internal buffer size
  crypto: omap-sham - add support for export/import
  crypto: omap-sham - convert driver logic to use sgs for data xmit
  crypto: omap-sham - change the DMA threshold value to a define
  crypto: omap-sham - add support functions for sg based data handling
  crypto: omap-sham - rename sgl to sgl_tmp for deprecation
  crypto: omap-sham - align algorithms on word offset
  crypto: omap-sham - add context export/import stubs
  ...
2016-10-10 14:04:16 -07:00
Linus Torvalds
6763afe4b9 dlm for 4.9
This includes a bug fix for a bad memory access during workqueue
 cleanup, which can happen while shutting down the dlm networking
 layer.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJX+63GAAoJEDgbc8f8gGmq4hIP/R63HIQkCQJOCrV34gdrk4tN
 7+mwqKkQeWfDYEB0TI+B/iwMsEqtE12Wob6lN9P1pYlTp1OOXulj/jV3xBcENMkM
 trxmcscCwKcVnLvkW1cVqKfLdswFEQZv95g0CVIAaLghI3v39Sf5WDVcaw+L6IEv
 7ko5vet2OY5eJm6vJEDXTJbxWQ3itkOvIrD8f50jA6IFkrgfJQ0oFkPnfVFTU0Mp
 g1v2w05voWINoHQ3b1AfTz7iAYUIv94CnVDYgyIwsqUW2M133Tvw2Cj78rK0EqJa
 t2vr/8+8twm0D2NDq7xX4BLLKkrlchhDVQWNivVHw9DGOtjXdMu3r7WmgDM8FvW6
 NethPt45xLNHtcP/GvE1z7YzUus50aSwjWIRHyRUhMDD/8QwIwCiw3FBx2H3TRuf
 OXZCXOpCXFxM63uOAkZaI9xCni0SWCVCdHxFEmd1VF41t4RNc4jbpGWSe1jqacLx
 a1Dpf/mLClQ1WOgLtnVQ+/OeRhayVnnhRD3u2XxqWl3oZkU4QqJp8cTSeq3gVbgE
 m9ZDi/MozN9amkGECkI7JlyRHgk73ZM0KPCcYgl9v0v9l6R1i8Bi8xvs5YS+i/Je
 e5jyDM2VtcqoGY9lzSQMMbqz6Lrv7aagrDYGPts/togYuKT/925L6BiIVoHIZtfy
 H+NKEqCPMq70mRDjbd/S
 =4las
 -----END PGP SIGNATURE-----

Merge tag 'dlm-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm fix from David Teigland:
 "This includes a bug fix for a bad memory access during workqueue
  cleanup, which can happen while shutting down the dlm networking
  layer"

* tag 'dlm-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: free workqueues after the connections
2016-10-10 13:58:06 -07:00
Linus Torvalds
8dfb790b15 The big ticket item here is support for rbd exclusive-lock feature,
with maintenance operations offloaded to userspace (Douglas Fuller,
 Mike Christie and myself).  Another block device bullet is a series
 fixing up layering error paths (myself).
 
 On the filesystem side, we've got patches that improve our handling of
 buffered vs dio write races (Neil Brown) and a few assorted fixes from
 Zheng.  Also included a couple of random cleanups and a minor CRUSH
 update.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJX+PjZAAoJEEp/3jgCEfOLVuoH/RwtFLIb6/KZUYtBOrVVrTun
 kReRlfq2xKYrGGtyQEqSuz7fBdwT1LVCVcL8kC4GFD4R67o+tNMAr6PfM/7pZABj
 HRoRLgSZ9FLw4W5n0VpBIznih75QUbCdXiTCtH9eorMHU5q1YpTvVHHlF9W9Pm2I
 eNGnBWpGyHVeiK66mpUCH+EQKQ4GkAVD9rneTNqLHgq2yotHkVl1j258+DL6JRGs
 OBoh3RmNQaGOAS37Lss8erCSusAGEcAeGV6ubuK2lFUKyR41EkD3I0xkhNSPe+CD
 RifFcpVziIeTu//cLgl0nnHGtmUytD7HgJubaPthArKIOen9ZDAfEkgI0o+JI2A=
 =45O7
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client

Pull Ceph updates from Ilya Dryomov:
 "The big ticket item here is support for rbd exclusive-lock feature,
  with maintenance operations offloaded to userspace (Douglas Fuller,
  Mike Christie and myself). Another block device bullet is a series
  fixing up layering error paths (myself).

  On the filesystem side, we've got patches that improve our handling of
  buffered vs dio write races (Neil Brown) and a few assorted fixes from
  Zheng. Also included a couple of random cleanups and a minor CRUSH
  update"

* tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client: (39 commits)
  crush: remove redundant local variable
  crush: don't normalize input of crush_ln iteratively
  libceph: ceph_build_auth() doesn't need ceph_auth_build_hello()
  libceph: use CEPH_AUTH_UNKNOWN in ceph_auth_build_hello()
  ceph: fix description for rsize and rasize mount options
  rbd: use kmalloc_array() in rbd_header_from_disk()
  ceph: use list_move instead of list_del/list_add
  ceph: handle CEPH_SESSION_REJECT message
  ceph: avoid accessing / when mounting a subpath
  ceph: fix mandatory flock check
  ceph: remove warning when ceph_releasepage() is called on dirty page
  ceph: ignore error from invalidate_inode_pages2_range() in direct write
  ceph: fix error handling of start_read()
  rbd: add rbd_obj_request_error() helper
  rbd: img_data requests don't own their page array
  rbd: don't call rbd_osd_req_format_read() for !img_data requests
  rbd: rework rbd_img_obj_exists_submit() error paths
  rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback()
  rbd: move bumping img_request refcount into rbd_obj_request_submit()
  rbd: mark the original request as done if stat request fails
  ...
2016-10-10 13:52:05 -07:00
Linus Torvalds
fed41f7d03 Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull splice fixups from Al Viro:
 "A couple of fixups for interaction of pipe-backed iov_iter with
  O_DIRECT reads + constification of a couple of primitives in uio.h
  missed by previous rounds.

  Kudos to davej - his fuzzing has caught those bugs"

* 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  [btrfs] fix check_direct_IO() for non-iovec iterators
  constify iov_iter_count() and iter_is_iovec()
  fix ITER_PIPE interaction with direct_IO
2016-10-10 13:38:49 -07:00
Linus Torvalds
abb5a14fa2 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted misc bits and pieces.

  There are several single-topic branches left after this (rename2
  series from Miklos, current_time series from Deepa Dinamani, xattr
  series from Andreas, uaccess stuff from from me) and I'd prefer to
  send those separately"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
  proc: switch auxv to use of __mem_open()
  hpfs: support FIEMAP
  cifs: get rid of unused arguments of CIFSSMBWrite()
  posix_acl: uapi header split
  posix_acl: xattr representation cleanups
  fs/aio.c: eliminate redundant loads in put_aio_ring_file
  fs/internal.h: add const to ns_dentry_operations declaration
  compat: remove compat_printk()
  fs/buffer.c: make __getblk_slow() static
  proc: unsigned file descriptors
  fs/file: more unsigned file descriptors
  fs: compat: remove redundant check of nr_segs
  cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
  cifs: don't use memcpy() to copy struct iov_iter
  get rid of separate multipage fault-in primitives
  fs: Avoid premature clearing of capabilities
  fs: Give dentry to inode_change_ok() instead of inode
  fuse: Propagate dentry down to inode_change_ok()
  ceph: Propagate dentry down to inode_change_ok()
  xfs: Propagate dentry down to inode_change_ok()
  ...
2016-10-10 13:04:49 -07:00
Linus Torvalds
911f9dab30 Merge branch 'pcmcia' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM pcmcia updates from Russell King:
 "These updates lay the foundations for more generic soc_common PCMCIA
  support, which will result in several of the board specific drivers
  being elimated.

  As the dependencies for this are complex, the preliminary work is
  being submitted now, with the remainder scheduled for the next merge
  window"

* 'pcmcia' of git://git.armlinux.org.uk/~rmk/linux-arm:
  pcmcia: soc_common: add driver-data pointer
  pcmcia: soc_common: add support for voltage sense GPIOs
  pcmcia: soc_common: constify pcmcia_low_level ops pointer
  pcmcia: soc_common: switch to a per-socket cpufreq notifier
  pcmcia: soc_common: add support for Vcc and Vpp regulators
  pcmcia: soc_common: add CF socket state helper
  pcmcia: soc_common: restore previous socket state on error
  pcmcia: soc_common: add support for reset and bus enable GPIOs
  pcmcia: soc_common: request legacy detect GPIO with active low
  pcmcia: soc_common: ignore invalid interrupts
  pcmcia: soc_common: switch to using gpio_descs
  pcmcia: soc_common: use devm_gpio_request_one()
2016-10-10 11:34:28 -07:00
Linus Torvalds
ae50a840e7 nios2 update for v4.9-rc1
nios2: use of_property_read_bool
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX+vyvAAoJEFWoEK+e3syCL4wP/2cPQcRtu/pJy6B6Ll/pwryT
 r2la6NdFCFUl6KH7rTxyjxkz48dc1qf/SB1IKGPhg+qWRTRygivNVFghcwA5do6n
 I5fubkjlg/ZXZNsmxfQRu2zOM/bmV2a/qq7dREQenQlBK5isILB0t+VUzgmwX3vh
 fdq1fgKDONo+KKFxPJNS+HKbiYAAgtEHNhG/2J+lVoGWSAmwQ9XIfdD+k4oQrrQc
 hHRaTHY/JISZwtjlMF5jAoHCdodjoXVuuIylXt/ClwIEfYJuZRWCx0e0ObDuHdf5
 JqtovnmdwQTkJThQhAFjhcVjB2XoHKgpRvEm5GHWY8S0ecS7eBV5nbud+e4fAFmB
 KP/THrweb2QYGCcl8Efz7k7ZLkRreT0VATBuDB/zNVedFxLtwpxig7c3DaYSs9bu
 6jXCmG6l1eJ11gxJvrBbIW222XPkeGftj3/J+YEAD3XcCQG/mNnUumwHT42vbaMO
 pmhVzlsO3S41OXEzpyxj2e6uHMpKPN+kEvfnjPlnmqA2qJneGupqpsrtzgr8VMts
 XtED/HAsVAvT5JaO+3Hu181BPW3pGECdo3V/fPszdDq/A3+iqhMUItfWTEhfOb9Z
 CZckod1Ba1LioqzdCCZBY7vp9IQO9DQgzLlotdDJqePFBLGlKF9mpgrvZF21rw99
 NDyWiWOOer+eZJDlknHS
 =wYAp
 -----END PGP SIGNATURE-----

Merge tag 'nios2-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2

Pull nios2 update from Ley Foon Tan:
 "Use of_property_read_bool() instead of open-coding it"

* tag 'nios2-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
  nios2: use of_property_read_bool
2016-10-10 11:31:19 -07:00
Linus Torvalds
057a056ced CRIS changes for 4.9
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iEYEABECAAYFAlf2FsoACgkQ31LbvUHyf1foXgCePSJdFlFpcz+JHKc8PsA8yV8o
 UN8AnRG267G5QiSnRn8xj9Tq2ftqv4PC
 =xlut
 -----END PGP SIGNATURE-----

Merge tag 'cris-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris

Pull CRIS updates from Jesper Nilsson.

* tag 'cris-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jesper/cris:
  cris: return of class_create should be considered
  CRIS: defconfig: remove MTDRAM_ABS_POS
  CRIS v32: remove some double unlocks
  Fix typos
  cris: migrate exception table users off module.h and onto extable.h
  cris: v10: axisflashmap: remove unused ifdefs
  cris: use generic io.h
  cris: fix Kconfig mismatch when building with CONFIG_PCI
  cris: cardbus: fix header include path
  cris: add dev88_defconfig
  cris: irq: stop loop from accessing array out of bounds
  cris: fasttimer: fix mixed declarations and code compile warning
  cris: intmem: fix pointer comparison compile warning
  cris: intmem: fix device_initcall compile warning
2016-10-10 11:28:35 -07:00
Linus Torvalds
93c26d7dc0 Merge branch 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull protection keys syscall interface from Thomas Gleixner:
 "This is the final step of Protection Keys support which adds the
  syscalls so user space can actually allocate keys and protect memory
  areas with them. Details and usage examples can be found in the
  documentation.

  The mm side of this has been acked by Mel"

* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pkeys: Update documentation
  x86/mm/pkeys: Do not skip PKRU register if debug registers are not used
  x86/pkeys: Fix pkeys build breakage for some non-x86 arches
  x86/pkeys: Add self-tests
  x86/pkeys: Allow configuration of init_pkru
  x86/pkeys: Default to a restrictive init PKRU
  pkeys: Add details of system call use to Documentation/
  generic syscalls: Wire up memory protection keys syscalls
  x86: Wire up protection keys system calls
  x86/pkeys: Allocation/free syscalls
  x86/pkeys: Make mprotect_key() mask off additional vm_flags
  mm: Implement new pkey_mprotect() system call
  x86/pkeys: Add fault handling for PF_PK page fault bit
2016-10-10 11:01:51 -07:00
Linus Torvalds
5fa0eb0b4d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 updates from Thomas Gleixner:
 "A pile of regression fixes and updates:

   - address the fallout of the patches which made the cpuid - nodeid
     relation permanent: Handling of invalid APIC ids and preventing
     pointless warning messages.

   - force eager FPU when protection keys are enabled. Protection keys
     are not generating FPU exceptions so they cannot work with the lazy
     FPU mechanism.

   - prevent force migration of interrupts which are not part of the CPU
     vector domain.

   - handle the fact that APIC ids are not updated in the ACPI/MADT
     tables on physical CPU hotplug

   - remove bash-isms from syscall table generator script

   - use the hypervisor supplied APIC frequency when running on VMware"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pkeys: Make protection keys an "eager" feature
  x86/apic: Prevent pointless warning messages
  x86/acpi: Prevent LAPIC id 0xff from being accounted
  arch/x86: Handle non enumerated CPU after physical hotplug
  x86/unwind: Fix oprofile module link error
  x86/vmware: Skip lapic calibration on VMware
  x86/syscalls: Remove bash-isms in syscall table generator
  x86/irq: Prevent force migration of irqs which are not in the vector domain
2016-10-10 10:59:07 -07:00
Al Viro
cd27e45504 [btrfs] fix check_direct_IO() for non-iovec iterators
looking for duplicate ->iov_base makes sense only for
iovec-backed iterators; for kvec-backed ones it's pointless,
for bvec-backed ones it's pointless and broken on 32bit (we
walk through an array of struct bio_vec accessing them as if
they were struct iovec; works by accident on 64bit, but on
32bit it'll blow up) and for pipe-backed ones it's pointless
and ends up oopsing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-10 13:58:16 -04:00
Al Viro
b57332b410 constify iov_iter_count() and iter_is_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-10 13:57:37 -04:00
Al Viro
c3a6902404 fix ITER_PIPE interaction with direct_IO
by making sure we call iov_iter_advance() on original
iov_iter even if direct_IO (done on its copy) has returned 0.
It's a no-op for old iov_iter flavours and does the right thing
(== truncation of the stuff we'd allocated, but not filled) in
ITER_PIPE case.  Failures (e.g. -EIO) get caught and dealt with
by cleanup in generic_file_read_iter().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-10 13:36:06 -04:00
Linus Torvalds
c48ce9f190 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling updates from Thomas Gleixner:

 - handle uretprobe placement proper on little endian PPC64

 - fix buffer handling in libtraceevent

 - add a missing pointer derefence in perf probe

 - fix the build of host tools in cross builds

 - fix Intel PT timestamp handling

 - synchronize memcpy, cpufeatures and bpf headers with the kernel headers

 - support for vendor supplied JSON files describing PMU events

 - a new set of tool tips

 - initial work for clang/llvm support

 - address some style issues found by cppcheck

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  tools build: Add feature detection for g++
  tools build: Support compiling C++ source file
  perf top/report: Add tips about a list option
  perf report/top: Add a tip about system-wide collection from all CPUs
  perf report/top: Add a tip about source line numbers with overhead
  tools: Synchronize tools/include/uapi/linux/bpf.h
  tools: Synchronize tools/arch/x86/include/asm/cpufeatures.h
  perf bench mem: Sync memcpy assembly sources with the kernel
  perf jevents: Fix Intel JSON fixed counter conversions
  tools lib traceevent: Fix kbuffer_read_at_offset()
  perf intel-pt: Fix MTC timestamp calculation for large MTC periods
  perf intel-pt: Fix estimated timestamps for cycle-accurate mode
  perf uretprobe ppc64le: Fix probe location
  perf pmu-events: Add Skylake frontend MSR support
  perf pmu-events: Fix fixed counters on Intel
  perf tools: Make alias matching case-insensitive
  perf tools: Allow period= in perf stat CPU event descriptions.
  perf tools: Add README for info on parsing JSON/map files
  perf list jevents: Add support for event list topics
  perf list: Support long jevents descriptions
  ...
2016-10-10 10:33:58 -07:00
Linus Torvalds
84ed2da02f Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
 "A revert of a commit which pointelessly widened a preempt disabled
  section which in turn caused might_sleep() to trigger.

  The patch intended to prevent usage of smp_processor_id() in
  preemptible context, but the usage in that case is fine because the
  thread is pinned on a single cpu and therefore cannot be migrated off"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "sched/core: Do not use smp_processor_id() with preempt enabled in smpboot_thread_fn()"
2016-10-10 10:29:59 -07:00
Linus Torvalds
daba2b314a Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Two small kerneldoc fixes from Julia Lawall"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/metag-ext: Improve function-level documentation
  irqchip/vic: Improve function-level documentation
2016-10-10 10:24:41 -07:00
Linus Torvalds
604a830d4f Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A single fix for a regression introduced in 4.8 which causes the
  trace/perf clock to return random nonsense if CONFIG_DEBUG_TIMEKEEPING
  is set"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix __ktime_get_fast_ns() regression
2016-10-10 10:23:18 -07:00
Linus Torvalds
563873318d Merge branch 'printk-cleanups'
Merge my system logging cleanups, triggered by the broken '\n' patches.

The line continuation handling has been broken basically forever, and
the code to handle the system log records was both confusing and
dubious.  And it would do entirely the wrong thing unless you always had
a terminating newline, partly because it couldn't actually see whether a
message was marked KERN_CONT or not (but partly because the LOG_CONT
handling in the recording code was rather confusing too).

This re-introduces a real semantically meaningful KERN_CONT, and fixes
the few places I noticed where it was missing.  There are probably more
missing cases, since KERN_CONT hasn't actually had any semantic meaning
for at least four years (other than the checkpatch meaning of "no log
level necessary, this is a continuation line").

This also allows the combination of KERN_CONT and a log level.  In that
case the log level will be ignored if the merging with a previous line
is successful, but if a new record is needed, that new record will now
get the right log level.

That also means that you can at least in theory combine KERN_CONT with
the "pr_info()" style helpers, although any use of pr_fmt() prefixing
would make that just result in a mess, of course (the prefix would end
up in the middle of a continuing line).

* printk-cleanups:
  printk: make reading the kernel log flush pending lines
  printk: re-organize log_output() to be more legible
  printk: split out core logging code into helper function
  printk: reinstate KERN_CONT for printing continuation lines
2016-10-10 09:29:50 -07:00
Marcelo Ricardo Leitner
3a8db79889 dlm: free workqueues after the connections
After backporting commit ee44b4bc05 ("dlm: use sctp 1-to-1 API")
series to a kernel with an older workqueue which didn't use RCU yet, it
was noticed that we are freeing the workqueues in dlm_lowcomms_stop()
too early as free_conn() will try to access that memory for canceling
the queued works if any.

This issue was introduced by commit 0d737a8cfd as before it such
attempt to cancel the queued works wasn't performed, so the issue was
not present.

This patch fixes it by simply inverting the free order.

Cc: stable@vger.kernel.org
Fixes: 0d737a8cfd ("dlm: fix race while closing connections")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2016-10-10 09:54:00 -05:00
Herbert Xu
c3afafa478 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Merge the crypto tree to pull in vmx ghash fix.
2016-10-10 11:19:47 +08:00
Linus Torvalds
24532f7681 Merge branch 'for-4.9/block-smp' of git://git.kernel.dk/linux-block
Pull blk-mq CPU hotplug update from Jens Axboe:
 "This is the conversion of blk-mq to the new hotplug state machine"

* 'for-4.9/block-smp' of git://git.kernel.dk/linux-block:
  blk-mq: fixup "Convert to new hotplug state machine"
  blk-mq: Convert to new hotplug state machine
  blk-mq/cpu-notif: Convert to new hotplug state machine
2016-10-09 17:32:20 -07:00
Linus Torvalds
12e3d3cdd9 Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block
Pull blk-mq irq/cpu mapping updates from Jens Axboe:
 "This is the block-irq topic branch for 4.9-rc. It's mostly from
  Christoph, and it allows drivers to specify their own mappings, and
  more importantly, to share the blk-mq mappings with the IRQ affinity
  mappings. It's a good step towards making this work better out of the
  box"

* 'for-4.9/block-irq' of git://git.kernel.dk/linux-block:
  blk_mq: linux/blk-mq.h does not include all the headers it depends on
  blk-mq: kill unused blk_mq_create_mq_map()
  blk-mq: get rid of the cpumask in struct blk_mq_tags
  nvme: remove the post_scan callout
  nvme: switch to use pci_alloc_irq_vectors
  blk-mq: provide a default queue mapping for PCI device
  blk-mq: allow the driver to pass in a queue mapping
  blk-mq: remove ->map_queue
  blk-mq: only allocate a single mq_map per tag_set
  blk-mq: don't redistribute hardware queues on a CPU hotplug event
2016-10-09 17:29:33 -07:00
Linus Torvalds
48915c2cbc . various fixes and cleanups for request-based DM core
. add support for delaying the requeue of requests; used by DM multipath
   when all paths have failed and 'queue_if_no_path' is enabled
 
 . DM cache improvements to speedup the loading metadata and the writing
   of the hint array
 
 . fix potential for a dm-crypt crash on device teardown
 
 . remove dm_bufio_cond_resched() and just using cond_resched()
 
 . change DM multipath to return a reservation conflict error
   immediately; rather than failing the path and retrying (potentially
   indefinitely)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX7n9KAAoJEMUj8QotnQNab74IANm+rW2uYdpLNCxWUmcaih0d
 BK8dLS/Mz35S0TRSekvynuBcPx18VP2Zueulc+aHTWcT4sj79l6KnVYT9g6c98rL
 zzcv10QTteqhiiWwFmPHsZgv5dW8Y5wiRdt+SqcQ5sAHMFci6C05gzp9caNu7VTs
 fbcLUdyYm40y3j84Lx/+ABXgnBhq+40OTtdnYSkEmLtdscPLzwpHgPmMctkrEl7e
 7mqGC1KbDDzartqOZOeGP2P2qOCNN21qA+8ctMw9Xyze33uwvj7Vx6cro6e28wMm
 ZClY9XNGlfuW9dCNtFR9o6NXS6NIK30UJbKqyZPPsK+70JrOgzh6GzQnwSXdyNs=
 =7SkG
 -----END PGP SIGNATURE-----

Merge tag 'dm-4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - various fixes and cleanups for request-based DM core

 - add support for delaying the requeue of requests; used by DM
   multipath when all paths have failed and 'queue_if_no_path' is
   enabled

 - DM cache improvements to speedup the loading metadata and the writing
   of the hint array

 - fix potential for a dm-crypt crash on device teardown

 - remove dm_bufio_cond_resched() and just using cond_resched()

 - change DM multipath to return a reservation conflict error
   immediately; rather than failing the path and retrying (potentially
   indefinitely)

* tag 'dm-4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (24 commits)
  dm mpath: always return reservation conflict without failing over
  dm bufio: remove dm_bufio_cond_resched()
  dm crypt: fix crash on exit
  dm cache metadata: switch to using the new cursor api for loading metadata
  dm array: introduce cursor api
  dm btree: introduce cursor api
  dm cache policy smq: distribute entries to random levels when switching to smq
  dm cache: speed up writing of the hint array
  dm array: add dm_array_new()
  dm mpath: delay the requeue of blk-mq requests while all paths down
  dm mpath: use dm_mq_kick_requeue_list()
  dm rq: introduce dm_mq_kick_requeue_list()
  dm rq: reduce arguments passed to map_request() and dm_requeue_original_request()
  dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
  dm: convert wait loops to use autoremove_wake_function()
  dm: use signal_pending_state() in dm_wait_for_completion()
  dm: rename task state function arguments
  dm: add two lockdep_assert_held() statements
  dm rq: simplify dm_old_stop_queue()
  dm mpath: check if path's request_queue is dying in activate_path()
  ...
2016-10-09 17:16:18 -07:00
Linus Torvalds
b9044ac829 Merge of primary rdma-core code for 4.9
- Updates to mlx5
 - Updates to mlx4 (two conflicts, both minor and easily resolved)
 - Updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper
   resolution is to keep the code in cxgb4_main.c as it is in Linus'
   tree as attach_uld was refactored and moved into cxgb4_uld.c)
 - Improvements to uAPI (moved vendor specific API elements to uAPI area)
 - Add hns-roce driver and hns and hns-roce ACPI reset support
 - Conversion of all rdma code away from deprecated
   create_singlethread_workqueue
 - Security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
   staging)
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJX+AwSAAoJELgmozMOVy/d0WkQAKxPzVccMWwHv28iZI4ey13u
 JwE+VoCNpCAZAVuEgzK5zzFdNHPvAk2jU93H4apA7dfXJBXPatVuj9Lnk+ieEEnW
 tbFwJjBpbQ3Zol3+SPfAHnsVMbtax+xmd6WDKExPXXEDl1L6rutwL3KKfmgWEitg
 ysX7XOJCiSdyM0hcg4T6UPB9a3jGPff9NLu0oGamV+yoUk5Y0WGoVFxHZ4MKcw8t
 OkFBYIxGz4SGwq2tulStuH03HteURX594KngtrA8dyq6l1R2GlGRv+bkJAUEIWUv
 aA0ow3VWusOM6fT+jLXPCv8iUwIXM8tR/U6F7X+cmORUUtWvCl+uCUVid113j/aN
 BK+Af2nJnfoJ5cDBPsD+bC76l5gQycNZO/Qh8op2kmgJtD+6OpGM3cBXsHx53+kk
 0wloJ2lKCGShWxNj+ig8n8rR/rhhs/x3vV3ouCVWNMbOUgOSN3eYHxmK3wGFW4nd
 Qx+WYCjj9Yi/J6nmUDcfEQ4NWPR22Q2+0ENAabfhLhV6mDloAO5ILHd4GDqC3IA9
 UtxlVjf4ZonaiLnTQQzCnDMGVVk6tT8FJ9D42s0ScwjbdYwjyCW9/rs/g2EhcprR
 Cc+AmjqLviCWGtzBSFO0SijqQon8lcQOwdLw61CdFFvPa/mlLdf1rbx9ArIyNVKn
 JSrbr3CGyoqyYj6qaEO5
 =LC+S
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull main rdma updates from Doug Ledford:
 "This is the main pull request for the rdma stack this release.  The
  code has been through 0day and I had it tagged for linux-next testing
  for a couple days.

  Summary:

   - updates to mlx5

   - updates to mlx4 (two conflicts, both minor and easily resolved)

   - updates to iw_cxgb4 (one conflict, not so obvious to resolve,
     proper resolution is to keep the code in cxgb4_main.c as it is in
     Linus' tree as attach_uld was refactored and moved into
     cxgb4_uld.c)

   - improvements to uAPI (moved vendor specific API elements to uAPI
     area)

   - add hns-roce driver and hns and hns-roce ACPI reset support

   - conversion of all rdma code away from deprecated
     create_singlethread_workqueue

   - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
     staging)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits)
  staging/lustre: Disable InfiniBand support
  iw_cxgb4: add fast-path for small REG_MR operations
  cxgb4: advertise support for FR_NSMR_TPTE_WR
  IB/core: correctly handle rdma_rw_init_mrs() failure
  IB/srp: Fix infinite loop when FMR sg[0].offset != 0
  IB/srp: Remove an unused argument
  IB/core: Improve ib_map_mr_sg() documentation
  IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets
  IB/mthca: Move user vendor structures
  IB/nes: Move user vendor structures
  IB/ocrdma: Move user vendor structures
  IB/mlx4: Move user vendor structures
  IB/cxgb4: Move user vendor structures
  IB/cxgb3: Move user vendor structures
  IB/mlx5: Move and decouple user vendor structures
  IB/{core,hw}: Add constant for node_desc
  ipoib: Make ipoib_warn ratelimited
  IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue
  IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue
  IB/ipoib: Remove deprecated create_singlethread_workqueue
  ...
2016-10-09 17:04:33 -07:00
Linus Torvalds
1fde76f173 Minor updates for rxe driver
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJX+BuzAAoJELgmozMOVy/dUvwP/juvXUJg7Xh3N0PXo0+Qb8Xc
 TG3a8EdumYZtDF8jualKiuJboGUdkMN4EDWhfdPxxW3lGBc8EQP7vac2H3NCcE3L
 ntRYrEeaDF8x5eUSyIQf2hs9k7HVQ+gkR8KqbeVUFh8iCwNnQ+wjvBXB+WUcqioW
 mp+S0//v8t1r1zPyUE7NgBo8EBVWzapiiGrDObNMke2oZxa4CsKsPCUMzSr9USW+
 /oSQBqcAZQ6LL8ykvx2oov2jt+bcWKco5fxRb8R7+buZknC4jX98f70WkLUne1+q
 oG3fKf1cUmORAjWelNDSxfwz8uWzkZifHqa+5UQHKVR9PcFJIQUKuW/5InMxkU3+
 HG+t7YiwCKOX7+99zUC5GHWFilmAOJma567VzfEgzcJjNeRtoCVG0a7smACWKA/y
 JWUoLUQd5c9dvg46CDG+qqk79tINYmRdtgvJmht2AhCAczRYI8iMIaw9Fh7yC5Dg
 JvUVjLLVtdNjJSP0ixY6RJB4aQVbdVSzo4UUJKXQkaJM6KWRd5HYGcr8rUXn8HEf
 PjHNcPEByOJBfH2Fg759nKDPaVAN4r/8nDHZpLfEfnzzEcXWIUk4vDcCYHxN69RF
 8EZZ681c4YE/WW49Jr/JulKKRsrZn1xB4FJF5gJmAOvb0WHoJPf82YEr3kcUUOQK
 HaECbnzwYqGRQe5tNFiD
 =/P6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull more rdma updates from Doug Ledford:
 "Minor updates for rxe driver"

[ Starting to do merge window pulls again - the current -git tree does
  appear to have some netfilter use-after-free issues, but I've sent
  off the report to the proper channels, and I don't want to delay merge
  window activity any more ]

* tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/rxe: improved debug prints & code cleanup
  rdma_rxe: Ensure rdma_rxe init occurs at correct time
  IB/rxe: Properly honor max IRD value for rd/atomic.
  IB/{rxe,core,rdmavt}: Fix kernel crash for reg MR
  IB/rxe: Fix sending out loopback packet on netdev interface.
  IB/rxe: Avoid scheduling tasklet for userspace QP
2016-10-09 16:30:03 -07:00
Linus Torvalds
bfd8d3f23b printk: make reading the kernel log flush pending lines
That will mean that any possible subsequent continuation will now be
broken up onto a line of its own (since reading the log has finalized
the beginning og the line), but if user space has activated system
logging (or if there's a kernel message dump going on) that is the right
thing to do.

And now that we actually get the continuation flags _right_ for this
all, the user space logger that is reading the kernel messages can
actually see the continuation marker.  Not that anybody seems to really
bother with it (or care), but in theory user space can do its own
message stitching.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:40 -07:00
Linus Torvalds
5e467652ff printk: re-organize log_output() to be more legible
Avoid some duplicate logic now that we can return early, and update the
comments for the new LOG_CONT world order.

This also stops the continuation flushing from just using random record
flags for the flushing action, instead taking the flags from the proper
original line and updating them as we add continuations to it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:40 -07:00
Linus Torvalds
c362c7ff84 printk: split out core logging code into helper function
The code that actually decides how to log the message (whether to put it
directly into the record log, whether to append it to an existing
buffered log, or whether to start a new buffered log) is fairly
non-obvious code in the middle of the vprintk_emit() function.

Splitting that code up into a helper function makes it easier to
understand, but perhaps more importantly also allows for the code to
just return early out of the helper function once it has made the
decision about where the new log content goes.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:39 -07:00
Linus Torvalds
4bcc595ccd printk: reinstate KERN_CONT for printing continuation lines
Long long ago the kernel log buffer was a buffered stream of bytes, very
much like stdio in user space.  It supported log levels by scanning the
stream and noticing the log level markers at the beginning of each line,
but if you wanted to print a partial line in multiple chunks, you just
did multiple printk() calls, and it just automatically worked.

Except when it didn't, and you had very confusing output when different
lines got all mixed up with each other.  Then you got fragment lines
mixing with each other, or with non-fragment lines, because it was
traditionally impossible to tell whether a printk() call was a
continuation or not.

To at least help clarify the issue of continuation lines, we added a
KERN_CONT marker back in 2007 to mark continuation lines:

  4749252776 ("printk: add KERN_CONT annotation").

That continuation marker was initially an empty string, and didn't
actuall make any semantic difference.  But it at least made it possible
to annotate the source code, and have check-patch notice that a printk()
didn't need or want a log level marker, because it was a continuation of
a previous line.

To avoid the ambiguity between a continuation line that had that
KERN_CONT marker, and a printk with no level information at all, we then
in 2009 made KERN_CONT be a real log level marker which meant that we
could now reliably tell the difference between the two cases.

  5fd29d6ccb ("printk: clean up handling of log-levels and newlines")

and we could take advantage of that to make sure we didn't mix up
continuation lines with lines that just didn't have any loglevel at all.

Then, in 2012, the kernel log buffer was changed to be a "record" based
log, where each line was a record that has a loglevel and a timestamp.

You can see the beginning of that conversion in commits

  e11fea92e1 ("kmsg: export printk records to the /dev/kmsg interface")
  7ff9554bb5 ("printk: convert byte-buffer to variable-length record buffer")

with a number of follow-up commits to fix some painful fallout from that
conversion.  Over all, it took a couple of months to sort out most of
it.  But the upside was that you could have concurrent readers (and
writers) of the kernel log and not have lines with mixed output in them.

And one particular pain-point for the record-based kernel logging was
exactly the fragmentary lines that are generated in smaller chunks.  In
order to still log them as one recrod, the continuation lines need to be
attached to the previous record properly.

However the explicit continuation record marker that is actually useful
for this exact case was actually removed in aroundm the same time by commit

  61e99ab8e3 ("printk: remove the now unnecessary "C" annotation for KERN_CONT")

due to the incorrect belief that KERN_CONT wasn't meaningful.  The
ambiguity between "is this a continuation line" or "is this a plain
printk with no log level information" was reintroduced, and in fact
became an even bigger pain point because there was now the whole
record-level merging of kernel messages going on.

This patch reinstates the KERN_CONT as a real non-empty string marker,
so that the ambiguity is fixed once again.

But it's not a plain revert of that original removal: in the four years
since we made KERN_CONT an empty string again, not only has the format
of the log level markers changed, we've also had some usage changes in
this area.

For example, some ACPI code seems to use KERN_CONT _together_ with a log
level, and now uses both the KERN_CONT marker and (for example) a
KERN_INFO marker to show that it's an informational continuation of a
line.

Which is actually not a bad idea - if the continuation line cannot be
attached to its predecessor, without the log level information we don't
know what log level to assign to it (and we traditionally just assigned
it the default loglevel).  So having both a log level and the KERN_CONT
marker is not necessarily a bad idea, but it does mean that we need to
actually iterate over potentially multiple markers, rather than just a
single one.

Also, since KERN_CONT was still conceptually needed, and encouraged, but
didn't actually _do_ anything, we've also had the reverse problem:
rather than having too many annotations it has too few, and there is bit
rot with code that no longer marks the continuation lines with the
KERN_CONT marker.

So this patch not only re-instates the non-empty KERN_CONT marker, it
also fixes up the cases of bit-rot I noticed in my own logs.

There are probably other cases where KERN_CONT will be needed to be
added, either because it is new code that never dealt with the need for
KERN_CONT, or old code that has bitrotted without anybody noticing.

That said, we should strive to avoid the need for KERN_CONT.  It does
result in real problems for logging, and should generally not be seen as
a good feature.  If we some day can get rid of the feature entirely,
because nobody does any fragmented printk calls, that would be lovely.

But until that point, let's at mark the code that relies on the hacky
multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
it also annotates code as "maybe this would be good to fix some day".

(That said, particularly during single-threaded bootup, the downsides of
KERN_CONT are very limited.  Things get much hairier when you have
multiple threads going on and user level reading and writing logs too).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09 12:23:38 -07:00
David S. Miller
d24cd733ba Merge branch 'be2net-fixes'
Sriharsha Basavapatna says:

====================
be2net: patch-set

The following patch set contains a few bug fixes.
Please consider applying this to the net-next tree.
Thanks.

Patch-1 Obtains proper PF number for BEx chips
Patch-2 Fixes a FW update issue seen with BEx chips
Patch-3 Updates copyright string
Patch-4 Fixes TX stats for TSO packets
Patch-5 Enables VF link state setting for BE3
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:45 -04:00
Suresh Reddy
dc6e8511ff be2net: Enable VF link state setting for BE3
The VF link state setting feature now works on BE3 chips too from
FW ver 11.1.192.0 onwards.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:39 -04:00
Sriharsha Basavapatna
f3d6ad8480 be2net: Fix TX stats for TSO packets
TX stats update does not take into account headers which get duplicated
when the TSO packet is split into segments by HW. Fix this for both
tunneled (vxlan) and non-tunneled TSO packets.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:38 -04:00
Sriharsha Basavapatna
77b696cba9 be2net: Update Copyright string in be_hw.h
This patch updates the year and company name in the copyright string
in be_hw.h.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:38 -04:00
Sriharsha Basavapatna
f5ef017e11 be2net: NCSI FW section should be properly updated with ethtool for BE3
The driver has a check to ensure that NCSI FW section is updated only
if the current FW version in the card supports it. This FW version check
is done using memcmp() which obviously fails in some cases. Fix this by
breaking up the version string into integer version components and
comparing them.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:38 -04:00
Sriharsha Basavapatna
6ee080bb09 be2net: Provide an alternate way to read pf_num for BEx chips
The driver gets the pf_num for Skyhawk and Lancer using
GET_FUNC_CONFIG FW command. But since that command is not
supported in BEx, we need to get it from some other command.
Otherwise TPE recovery would fail since all NIC PFs would
end up with a func num of 0. There's a pci function number
field in the response  of GET_CNTL_ATTRIBUTES command that
can be read to get the same info for BEx adapters.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-09 09:30:38 -04:00
Brian Norris
69db4aa44f Introduction of the MTD pairing scheme concept.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX2rbJAAoJEGXtNgF+CLcAnGMP/06Eydb50SiBv7YQRVJTF8Rk
 K0lPQOJBXX+32xXJYKAj0UIRNsd5yxdH1tmWJFOxUV4etI1K9CtlCMLYoLgqPrPf
 SVXMBtfZvWmK8D5Mee9BroQIlSOQnzbYO3zAFY8pV8SFZ8y9SMXvPM6Z/2tTlMvK
 uds1x0t1DHGXLcAwZo4Ctvtxy+x4wS1RmfdMQhWYGSlbAqDzbjwK3Dceq6JmzJ1i
 UxoYHqXdjAF6h67twADHre5vMwdDgYEgEHZHomXtWpHFslWYpWCQUWLLSnWG3h4d
 ia3Q2xTFPxKg90PKurGrLffwfo0KnpwieqLYqHZ70d9pgepgqv2N547kPeYPNKIS
 VdbDqakY7V4P6OZ+ykUJ3VcfDPRXh0b0+V9AVyGPIXH4m03YXlGsKigN1tYHid/K
 I3DeZqcCX35/Fc4iSD9k+HYlIplx6Ijo3JgpnRVlircczloKk+00mZ2THV5/pYcn
 vfgxgjeKaTKeG+zcSehfmYFZO1SuvhSFO4/qXLgYd8LWav8QWADQ8H3C2nn/ZlMk
 krb9S0iAQE9nmAY1ZPUz0hreaaYCEZVst/KAJWAkGW0Laj29gW6PAj/lQb+fG5d4
 4aSn4qnPrXQ7P76NAihbtzYGCTFwtlhoBXDXhv3A5aoqI5QaMI6QQ+LkYJMRAn0s
 HY4ExJEeC/lUe5QiF0Q7
 =GnV0
 -----END PGP SIGNATURE-----

Merge tag '4.9/mtd-pairing-scheme' of github.com:linux-nand/linux

Introduction of the MTD pairing scheme concept.
2016-10-08 20:56:54 -07:00
Al Viro
e55f1d1d13 Merge remote-tracking branch 'jk/vfs' into work.misc 2016-10-08 11:06:08 -04:00
Al Viro
f334bcd94b Merge remote-tracking branch 'ovl/misc' into work.misc 2016-10-08 11:00:01 -04:00