linux/net
Linus Torvalds 7f427d3a60 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull parallel filesystem directory handling update from Al Viro.

This is the main parallel directory work by Al that makes the vfs layer
able to do lookup and readdir in parallel within a single directory.
That's a big change, since this used to be all protected by the
directory inode mutex.

The inode mutex is replaced by an rwsem, and serialization of lookups of
a single name is done by a "in-progress" dentry marker.

The series begins with xattr cleanups, and then ends with switching
filesystems over to actually doing the readdir in parallel (switching to
the "iterate_shared()" that only takes the read lock).

A more detailed explanation of the process from Al Viro:
 "The xattr work starts with some acl fixes, then switches ->getxattr to
  passing inode and dentry separately.  This is the point where the
  things start to get tricky - that got merged into the very beginning
  of the -rc3-based #work.lookups, to allow untangling the
  security_d_instantiate() mess.  The xattr work itself proceeds to
  switch a lot of filesystems to generic_...xattr(); no complications
  there.

  After that initial xattr work, the series then does the following:

   - untangle security_d_instantiate()

   - convert a bunch of open-coded lookup_one_len_unlocked() to calls of
     that thing; one such place (in overlayfs) actually yields a trivial
     conflict with overlayfs fixes later in the cycle - overlayfs ended
     up switching to a variant of lookup_one_len_unlocked() sans the
     permission checks.  I would've dropped that commit (it gets
     overridden on merge from #ovl-fixes in #for-next; proper resolution
     is to use the variant in mainline fs/overlayfs/super.c), but I
     didn't want to rebase the damn thing - it was fairly late in the
     cycle...

   - some filesystems had managed to depend on lookup/lookup exclusion
     for *fs-internal* data structures in a way that would break if we
     relaxed the VFS exclusion.  Fixing hadn't been hard, fortunately.

   - core of that series - parallel lookup machinery, replacing
     ->i_mutex with rwsem, making lookup_slow() take it only shared.  At
     that point lookups happen in parallel; lookups on the same name
     wait for the in-progress one to be done with that dentry.

     Surprisingly little code, at that - almost all of it is in
     fs/dcache.c, with fs/namei.c changes limited to lookup_slow() -
     making it use the new primitive and actually switching to locking
     shared.

   - parallel readdir stuff - first of all, we provide the exclusion on
     per-struct file basis, same as we do for read() vs lseek() for
     regular files.  That takes care of most of the needed exclusion in
     readdir/readdir; however, these guys are trickier than lookups, so
     I went for switching them one-by-one.  To do that, a new method
     '->iterate_shared()' is added and filesystems are switched to it
     as they are either confirmed to be OK with shared lock on directory
     or fixed to be OK with that.  I hope to kill the original method
     come next cycle (almost all in-tree filesystems are switched
     already), but it's still not quite finished.

   - several filesystems get switched to parallel readdir.  The
     interesting part here is dealing with dcache preseeding by readdir;
     that needs minor adjustment to be safe with directory locked only
     shared.

     Most of the filesystems doing that got switched to in those
     commits.  Important exception: NFS.  Turns out that NFS folks, with
     their, er, insistence on VFS getting the fuck out of the way of the
     Smart Filesystem Code That Knows How And What To Lock(tm) have
     grown the locking of their own.  They had their own homegrown
     rwsem, with lookup/readdir/atomic_open being *writers* (sillyunlink
     is the reader there).  Of course, with VFS getting the fuck out of
     the way, as requested, the actual smarts of the smart filesystem
     code etc. had become exposed...

   - do_last/lookup_open/atomic_open cleanups.  As the result, open()
     without O_CREAT locks the directory only shared.  Including the
     ->atomic_open() case.  Backmerge from #for-linus in the middle of
     that - atomic_open() fix got brought in.

   - then comes NFS switch to saner (VFS-based ;-) locking, killing the
     homegrown "lookup and readdir are writers" kinda-sorta rwsem.  All
     exclusion for sillyunlink/lookup is done by the parallel lookups
     mechanism.  Exclusion between sillyunlink and rmdir is a real rwsem
     now - rmdir being the writer.

     Result: NFS lookups/readdirs/O_CREAT-less opens happen in parallel
     now.

   - the rest of the series consists of switching a lot of filesystems
     to parallel readdir; in a lot of cases ->llseek() gets simplified
     as well.  One backmerge in there (again, #for-linus - rockridge
     fix)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (74 commits)
  ext4: switch to ->iterate_shared()
  hfs: switch to ->iterate_shared()
  hfsplus: switch to ->iterate_shared()
  hostfs: switch to ->iterate_shared()
  hpfs: switch to ->iterate_shared()
  hpfs: handle allocation failures in hpfs_add_pos()
  gfs2: switch to ->iterate_shared()
  f2fs: switch to ->iterate_shared()
  afs: switch to ->iterate_shared()
  befs: switch to ->iterate_shared()
  befs: constify stuff a bit
  isofs: switch to ->iterate_shared()
  get_acorn_filename(): deobfuscate a bit
  btrfs: switch to ->iterate_shared()
  logfs: no need to lock directory in lseek
  switch ecryptfs to ->iterate_shared
  9p: switch to ->iterate_shared()
  fat: switch to ->iterate_shared()
  romfs, squashfs: switch to ->iterate_shared()
  more trivial ->iterate_shared conversions
  ...
2016-05-17 11:01:31 -07:00
..
6lowpan 6lowpan: iphc: fix SAM/DAM bit comment 2016-03-10 19:51:29 +01:00
9p net/9p: convert to new CQ API 2016-03-10 20:54:09 -05:00
802
8021q vlan: propagate gso_max_segs 2016-03-17 21:05:01 -04:00
appletalk
atm
ax25 ax25: add link layer header validation function 2016-03-09 22:13:01 -05:00
batman-adv batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node 2016-04-29 19:46:11 +08:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
bridge bridge: fix igmp / mld query parsing 2016-05-06 12:55:13 -04:00
caif net: caif: fix misleading indentation 2016-03-14 13:09:50 -04:00
can
ceph libceph: make authorizer destruction independent of ceph_auth_client 2016-04-25 20:54:13 +02:00
core Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2016-05-04 16:35:31 -04:00
dcb
dccp tcp/dccp: remove obsolete WARN_ON() in icmp handlers 2016-03-17 21:06:40 -04:00
decnet decnet: Do not build routes to devices without decnet private data. 2016-04-10 23:01:30 -04:00
dns_resolver
dsa net: dsa: refine netdev event notifier 2016-03-14 16:05:32 -04:00
ethernet eth: Pull header from first fragment via eth_get_headlen 2016-02-24 13:58:05 -05:00
hsr
ieee802154 ieee802154: 6lowpan: fix return of netdev notifier 2016-02-23 20:29:40 +01:00
ipv4 net/route: enforce hoplimit max value 2016-05-14 15:33:32 -04:00
ipv6 net/route: enforce hoplimit max value 2016-05-14 15:33:32 -04:00
ipx
irda
iucv
kcm kcm: Add receive message timeout 2016-03-09 16:36:15 -05:00
key
l2tp net: l2tp: fix reversed udp6 checksum flags 2016-05-01 19:32:16 -04:00
l3mdev net: l3mdev: address selection should only consider devices in L3 domain 2016-02-26 14:22:26 -05:00
lapb
llc net: fix infoleak in llc 2016-05-04 16:18:48 -04:00
mac80211 mac80211: fix statistics leak if dev_alloc_name() fails 2016-04-27 10:06:58 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
mpls mpls: find_outdev: check for err ptr in addition to NULL check 2016-04-08 12:43:20 -04:00
netfilter nf_conntrack: avoid kernel pointer value leak in slab name 2016-05-14 15:04:43 -04:00
netlabel netlabel: do not initialise statics to NULL 2016-03-07 11:08:26 -05:00
netlink netlink: don't send NETLINK_URELEASE for unbound sockets 2016-04-10 23:32:23 -04:00
netrom
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
openvswitch openvswitch: Fix cached ct with helper. 2016-05-11 15:14:56 -04:00
packet packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface 2016-04-14 00:46:39 -04:00
phonet
rds RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock. 2016-05-03 16:03:44 -04:00
rfkill Here's another round of updates for -next: 2016-03-01 17:03:27 -05:00
rose
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
sched net sched: ife action fix late binding 2016-05-10 23:50:15 -04:00
sctp sctp: avoid refreshing heartbeat timer too often 2016-04-10 22:22:34 -04:00
sunrpc Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-04-14 18:15:40 -07:00
switchdev switchdev: Adding complete operation to deferred switchdev ops 2016-04-24 14:23:32 -04:00
tipc tipc: only process unicast on intended node 2016-05-01 21:03:30 -04:00
unix Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-02-23 00:09:14 -05:00
vmw_vsock VSOCK: do not disconnect socket when peer has shutdown SEND only 2016-05-05 23:31:29 -04:00
wimax
wireless nl80211: check netlink protocol in socket release notification 2016-04-12 15:39:06 +02:00
x25 net: fix a kernel infoleak in x25 module 2016-05-09 22:45:33 -04:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2016-05-04 16:35:31 -04:00
compat.c
Kconfig Make DST_CACHE a silent config option 2016-03-21 22:56:38 -04:00
Makefile kcm: Kernel Connection Multiplexor module 2016-03-09 16:36:14 -05:00
socket.c ->getxattr(): pass dentry and inode as separate arguments 2016-04-11 00:48:00 -04:00
sysctl_net.c