linux

Author	SHA1	Message	Date
J. Bruce Fields	082d4bd72a	nfsd4: "backfill" using write_bytes_to_xdr_buf Normally xdr encoding proceeds in a single pass from start of a buffer to end, but sometimes we have to write a few bytes to an earlier position. Use write_bytes_to_xdr_buf for these cases rather than saving a pointer to write to. We plan to rewrite xdr_reserve_space to handle encoding across page boundaries using a scratch buffer, and don't want to risk writing to a pointer that was contained in a scratch buffer. Also it will no longer be safe to calculate lengths by subtracting two pointers, so use xdr_buf offsets instead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:49 -04:00
J. Bruce Fields	1fcea5b20b	nfsd4: use xdr_truncate_encode Now that lengths are reliable, we can use xdr_truncate instead of open-coding it everywhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:48 -04:00
J. Bruce Fields	6ac90391c6	nfsd4: keep xdr buf length updated Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-28 14:52:38 -04:00
J. Bruce Fields	dd97fddedc	nfsd4: no need for encode_compoundres to adjust lengths xdr_reserve_space should now be calculating the length correctly as we go, so there's no longer any need to fix it up here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-28 14:52:37 -04:00
J. Bruce Fields	f46d382a74	nfsd4: remove ADJUST_ARGS It's just uninteresting debugging code at this point. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-28 14:52:36 -04:00
J. Bruce Fields	d3f627c815	nfsd4: use xdr_stream throughout compound encoding Note this makes ADJUST_ARGS useless; we'll remove it in the following patch. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-28 14:52:35 -04:00
J. Bruce Fields	ddd1ea5636	nfsd4: use xdr_reserve_space in attribute encoding This is a cosmetic change for now; no change in behavior. Note we're just depending on xdr_reserve_space to do the bounds checking for us, we're not really depending on its adjustment of iovec or xdr_buf lengths yet, as those are fixed up by as necessary after the fact by read-link operations and by nfs4svc_encode_compoundres. However we do have to update xdr->iov on read-like operations to prevent xdr_reserve_space from messing with the already-fixed-up length of the the head. When the attribute encoding fails partway through we have to undo the length adjustments made so far. We do it manually for now, but later patches will add an xdr_truncate_encode() helper to handle cases like this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-28 14:52:34 -04:00
J. Bruce Fields	5f4ab94587	nfsd4: allow space for final error return This post-encoding check should be taking into account the need to encode at least an out-of-space error to the following op (if any). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-27 11:09:09 -04:00
J. Bruce Fields	07d1f80207	nfsd4: fix encoding of out-of-space replies If nfsd4_check_resp_size() returns an error then we should really be truncating the reply here, otherwise we may leave extra garbage at the end of the rpc reply. Also add a warning to catch any cases where our reply-size estimates may be wrong in the case of a non-idempotent operation. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-27 11:09:08 -04:00
J. Bruce Fields	1802a67894	nfsd4: reserve head space for krb5 integ/priv info Currently if the nfs-level part of a reply would be too large, we'll return an error to the client. But if the nfs-level part fits and leaves no room for krb5p or krb5i stuff, then we just drop the request entirely. That's no good. Instead, reserve some slack space at the end of the buffer and make sure we fail outright if we'd come close. The slack space here is a massive overstimate of what's required, we should probably try for a tighter limit at some point. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:47 -04:00
J. Bruce Fields	2d124dfaad	nfsd4: move proc_compound xdr encode init to helper Mechanical transformation with no change of behavior. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:46 -04:00
J. Bruce Fields	d518465866	nfsd4: tweak nfsd4_encode_getattr to take xdr_stream Just change the nfsd4_encode_getattr api. Not changing any code or adding any new functionality yet. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:46 -04:00
J. Bruce Fields	4aea24b2ff	nfsd4: embed xdr_stream in nfsd4_compoundres This is a mechanical transformation with no change in behavior. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:45 -04:00
J. Bruce Fields	e372ba60de	nfsd4: decoding errors can still be cached and require space Currently a non-idempotent op reply may be cached if it fails in the proc code but not if it fails at xdr decoding. I doubt there are any xdr-decoding-time errors that would make this a problem in practice, so this probably isn't a serious bug. The space estimates should also take into account space required for encoding of error returns. Again, not a practical problem, though it would become one after future patches which will tighten the space estimates. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:44 -04:00
J. Bruce Fields	f34e432b67	nfsd4: fix write reply size estimate The write reply also includes count and stable_how. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:43 -04:00
J. Bruce Fields	622f560e6a	nfsd4: read size estimate should include padding Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:42 -04:00
J. Bruce Fields	24906f3237	nfsd4: allow larger 4.1 session drc slots The client is actually asking for 2532 bytes. I suspect that's a mistake. But maybe we can allow some more. In theory lock needs more if it might return a maximum-length lockowner in the denied case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:41 -04:00
J. Bruce Fields	5b648699af	nfsd4: READ, READDIR, etc., are idempotent OP_MODIFIES_SOMETHING flags operations that we should be careful not to initiate without being sure we have the buffer space to encode a reply. None of these ops fall into that category. We could probably remove a few more, but this isn't a very important problem at least for ops whose reply size is easy to estimate. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-23 09:03:41 -04:00
NeilBrown	8658452e4a	nfsd: Only set PF_LESS_THROTTLE when really needed. PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks and live-locks while writing to the page cache in a loop-back NFS mount situation. It therefore makes sense to only set PF_LESS_THROTTLE in this situation. We now know when a request came from the local-host so it could be a loop-back mount. We already know when we are handling write requests, and when we are doing anything else. So combine those two to allow nfsd to still be throttled (like any other process) in every situation except when it is known to be problematic. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:59:19 -04:00
Christoph Hellwig	abf1135b6e	nfsd: remove nfsd4_free_slab No need for a kmem_cache_destroy wrapper in nfsd, just do proper goto based unwinding. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:52:57 -04:00
Benoit Taine	d40aa3372f	nfsd: Remove assignments inside conditions Assignments should not happen inside an if conditional, but in the line before. This issue was reported by checkpatch. The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ identifier i1; expression e1; statement S; @@ -if(!(i1 = e1)) S +i1 = e1; +if(!i1) +S // </smpl> It has been tested by compilation. Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:52:23 -04:00
J. Bruce Fields	f35ea0d4b6	Merge 3.15 bugfixes for 3.16	2014-05-22 15:48:15 -04:00
J. Bruce Fields	cbf7a75bc5	nfsd4: fix delegation cleanup on error We're not cleaning up everything we need to on error. In particular, we're not removing our lease. Among other problems this can cause the struct nfs4_file used as fl_owner to be referenced after it has been destroyed. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 12:17:17 -04:00
Kinglong Mee	368fe39b50	NFSD: Don't clear SUID/SGID after root writing data We're clearing the SUID/SGID bits on write by hand in nfsd_vfs_write, even though the subsequent vfs_writev() call will end up doing this for us (through file system write methods eventually calling file_remove_suid(), e.g., from __generic_file_aio_write). So, remove the redundant nfsd code. The only change in behavior is when the write is by root, in which case we previously cleared SUID/SGID, but will now leave it alone. The new behavior is the behavior of every filesystem we've checked. It seems better to be consistent with local filesystem behavior. And the security advantage seems limited as root could always restore these bits by hand if it wanted. SUID/SGID is not cleared after writing data with (root, local ext4), File: ‘test’ Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: 803h/2051d Inode: 1200137 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:admin_home_t:s0 Access: 2014-04-18 21:36:31.016029014 +0800 Modify: 2014-04-18 21:36:31.016029014 +0800 Change: 2014-04-18 21:36:31.026030285 +0800 Birth: - File: ‘test’ Size: 5 Blocks: 8 IO Block: 4096 regular file Device: 803h/2051d Inode: 1200137 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root) Context: unconfined_u:object_r:admin_home_t:s0 Access: 2014-04-18 21:36:31.016029014 +0800 Modify: 2014-04-18 21:36:31.040032065 +0800 Change: 2014-04-18 21:36:31.040032065 +0800 Birth: - With no_root_squash, (root, remote ext4), SUID/SGID are cleared, File: ‘test’ Size: 0 Blocks: 0 IO Block: 262144 regular empty file Device: 24h/36d Inode: 786439 Links: 1 Access: (4777/-rwsrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test) Context: system_u:object_r:nfs_t:s0 Access: 2014-04-18 21:45:32.155805097 +0800 Modify: 2014-04-18 21:45:32.155805097 +0800 Change: 2014-04-18 21:45:32.168806749 +0800 Birth: - File: ‘test’ Size: 5 Blocks: 8 IO Block: 262144 regular file Device: 24h/36d Inode: 786439 Links: 1 Access: (0777/-rwxrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test) Context: system_u:object_r:nfs_t:s0 Access: 2014-04-18 21:45:32.155805097 +0800 Modify: 2014-04-18 21:45:32.184808783 +0800 Change: 2014-04-18 21:45:32.184808783 +0800 Birth: - Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 12:17:16 -04:00
J. Bruce Fields	27b11428b7	nfsd4: warn on finding lockowner without stateid's The current code assumes a one-to-one lockowner<->lock stateid correspondance. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 11:11:21 -04:00
J. Bruce Fields	a1b8ff4c97	nfsd4: remove lockowner when removing lock stateid The nfsv4 state code has always assumed a one-to-one correspondance between lock stateid's and lockowners even if it appears not to in some places. We may actually change that, but for now when FREE_STATEID releases a lock stateid it also needs to release the parent lockowner. Symptoms were a subsequent LOCK crashing in find_lockowner_str when it calls same_lockowner_ino on a lockowner that unexpectedly has an empty so_stateids list. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-21 11:11:21 -04:00
J. Bruce Fields	5513a510fa	nfsd4: fix corruption on setting an ACL. As of `06f9cc12ca` "nfsd4: don't create unnecessary mask acl", any non-trivial ACL will be left with an unitialized entry, and a trivial ACL may write one entry beyond what's allocated. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-15 15:36:04 -04:00
Kinglong Mee	9fa1959e97	NFSD: Get rid of empty function nfs4_state_init Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 14:59:52 -04:00
Kinglong Mee	f3e41ec5ef	NFSD: Use simple_read_from_buffer for coping data to userspace Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 14:59:52 -04:00
J. Bruce Fields	dd15073a26	Merge 3.15 bugfix for 3.16	2014-05-08 14:59:06 -04:00
Christoph Hellwig	5409e46f1b	nfsd: clean up fh_auth usage Use fh_fsid when reffering to the fsid part of the filehandle. The variable length auth field envisioned in nfsfh wasn't ever implemented. Also clean up some lose ends around this and document the file handle format better. Btw, why do we even export nfsfh.h to userspace? The file handle very much is kernel private, and nothing in nfs-utils include the header either. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 12:43:03 -04:00
Kinglong Mee	ecc7455d8e	NFSD: cleanup unneeded including linux/export.h commit `4ac7249ea5` have remove all EXPORT_SYMBOL, linux/export.h is not needed, just clean it. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 12:43:02 -04:00
Kinglong Mee	aa07c713ec	NFSD: Call ->set_acl with a NULL ACL structure if no entries After setting ACL for directory, I got two problems that caused by the cached zero-length default posix acl. This patch make sure nfsd4_set_nfs4_acl calls ->set_acl with a NULL ACL structure if there are no entries. Thanks for Christoph Hellwig's advice. First problem: ............ hang ........... Second problem: [ 1610.167668] ------------[ cut here ]------------ [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239! [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE) rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi [last unloaded: nfsd] [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE 3.15.0-rc1+ #15 [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti: ffff88005a944000 [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>] [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293 [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX: 0000000000000000 [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI: ffff880068233300 [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09: 0000000000000000 [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12: ffff880068233300 [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15: ffff880068233300 [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000) knlGS:0000000000000000 [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4: 00000000000006f0 [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1610.168320] Stack: [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0 0000000000000000 [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08 0000000000000002 [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5 000000065a945b68 [ 1610.168320] Call Trace: [ 1610.168320] [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd] [ 1610.168320] [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd] [ 1610.168320] [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0 [ 1610.168320] [<ffffffffa0327962>] ? nfsd_setuser_and_check_port+0x52/0x80 [nfsd] [ 1610.168320] [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30 [ 1610.168320] [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd] [ 1610.168320] [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110 [nfsd] [ 1610.168320] [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd] [ 1610.168320] [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd] [ 1610.168320] [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc] [ 1610.168320] [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc] [ 1610.168320] [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd] [ 1610.168320] [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 1610.168320] [<ffffffff810a5202>] kthread+0xd2/0xf0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0 [ 1610.168320] [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40 [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce 41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c [ 1610.168320] RIP [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd] [ 1610.168320] RSP <ffff88005a945b00> [ 1610.257313] ---[ end trace 838254e3e352285b ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 12:42:21 -04:00
Trond Myklebust	14bcab1a39	NFSd: Clean up nfs4_preprocess_stateid_op Move the state locking and file descriptor reference out from the callers and into nfs4_preprocess_stateid_op() itself. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-07 11:05:48 -04:00
Trond Myklebust	50cc62317d	NFSd: Mark nfs4_free_lockowner and nfs4_free_openowner as static functions They do not need to be used outside fs/nfsd/nfs4state.c Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:57 -04:00
Christoph Hellwig	6f226e2ab1	nfsd: remove <linux/nfsd/debug.h> There is almost nothing left it in, just merge it into the only file that includes it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:56 -04:00
Christoph Hellwig	7f94423e8f	nfsd: move <linux/nfsd/stats.h> to fs/nfsd There are no legitimate users outside of fs/nfsd, so move it there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:55 -04:00
Christoph Hellwig	d430e8d530	nfsd: move <linux/nfsd/export.h> to fs/nfsd There are no legitimate users outside of fs/nfsd, so move it there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:54 -04:00
Christoph Hellwig	9c69de4c94	nfsd: remove <linux/nfsd/nfsfh.h> The only real user of this header is fs/nfsd/nfsfh.h, so merge the two. Various lockѕ source files used it to indirectly get other sunrpc or nfs headers, so fix those up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:53 -04:00
Trond Myklebust	4dd86e150f	NFSd: Remove 'inline' designation for free_client() It is large, it is used in more than one place, and it is not performance critical. Let gcc figure out whether it should be inlined... Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:53 -04:00
Trond Myklebust	4cb57e3032	NFSd: call rpc_destroy_wait_queue() from free_client() Mainly to ensure that we don't leave any hanging timers. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 12:38:49 -04:00
Trond Myklebust	5694c93e6c	NFSd: Move default initialisers from create_client() to alloc_client() Aside from making it clearer what is non-trivial in create_client(), it also fixes a bug whereby we can call free_client() before idr_init() has been called. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 12:38:46 -04:00
J. Bruce Fields	fc208d026b	Revert "nfsd4: fix nfs4err_resource in 4.1 case" Since we're still limiting attributes to a page, the result here is that a large getattr result will return NFS4ERR_REP_TOO_BIG/TOO_BIG_TO_CACHE instead of NFS4ERR_RESOURCE. Both error returns are wrong, and the real bug here is the arbitrary limit on getattr results, fixed by as-yet out-of-tree patches. But at a minimum we can make life easier for clients by sticking to one broken behavior in released kernels instead of two.... Trond says: one immediate consequence of this patch will be that NFSv4.1 clients will now report EIO instead of EREMOTEIO if they hit the problem. That may make debugging a little less obvious. Another consequence will be that if we ever do try to add client side handling of NFS4ERR_REP_TOO_BIG, then we now have to deal with the “handle existing buggy server” syndrome. Reported-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-04-18 14:46:45 +02:00
Jeff Layton	3758cf7e14	nfsd: set timeparms.to_maxval in setup_callback_client ...otherwise the logic in the timeout handling doesn't work correctly. Spotted-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-04-18 14:34:31 +02:00
Linus Torvalds	75ff24fa52	Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker - xdr fixes (a larger xdr rewrite has been posted but I decided it would be better to queue it up for 3.16). - miscellaneous fixes and cleanup from all over (thanks especially to Kinglong Mee)" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits) nfsd4: don't create unnecessary mask acl nfsd: revert v2 half of "nfsd: don't return high mode bits" nfsd4: fix memory leak in nfsd4_encode_fattr() nfsd: check passed socket's net matches NFSd superblock's one SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp SUNRPC: New helper for creating client with rpc_xprt NFSD: Free backchannel xprt in bc_destroy NFSD: Clear wcc data between compound ops nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ nfsd4: fix nfs4err_resource in 4.1 case nfsd4: fix setclientid encode size nfsd4: remove redundant check from nfsd4_check_resp_size nfsd4: use more generous NFS4_ACL_MAX nfsd4: minor nfsd4_replay_cache_entry cleanup nfsd4: nfsd4_replay_cache_entry should be static nfsd4: update comments with obsolete function name rpc: Allow xdr_buf_subsegment to operate in-place NFSD: Using free_conn free connection SUNRPC: fix memory leak of peer addresses in XPRT ...	2014-04-08 18:28:14 -07:00
Linus Torvalds	7df934526c	Merge branch 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull renameat2 system call from Miklos Szeredi: "This adds a new syscall, renameat2(), which is the same as renameat() but with a flags argument. The purpose of extending rename is to add cross-rename, a symmetric variant of rename, which exchanges the two files. This allows interesting things, which were not possible before, for example atomically replacing a directory tree with a symlink, etc... This also allows overlayfs and friends to operate on whiteouts atomically. Andy Lutomirski also suggested a "noreplace" flag, which disables the overwriting behavior of rename. These two flags, RENAME_EXCHANGE and RENAME_NOREPLACE are only implemented for ext4 as an example and for testing" * 'cross-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ext4: add cross rename support ext4: rename: split out helper functions ext4: rename: move EMLINK check up ext4: rename: create ext4_renament structure for local vars vfs: add cross-rename vfs: lock_two_nondirectories: allow directory args security: add flags to rename hooks vfs: add RENAME_NOREPLACE flag vfs: add renameat2 syscall vfs: rename: use common code for dir and non-dir vfs: rename: move d_move() up vfs: add d_is_dir()	2014-04-04 14:03:05 -07:00
J. Bruce Fields	06f9cc12ca	nfsd4: don't create unnecessary mask acl Any setattr of the ACL attribute, even if it sets just the basic 3-ACE ACL exactly as it was returned from a file with only mode bits, creates a mask entry, and it is only the mask, not group, entry that is changed by subsequent modifications of the mode bits. So, for example, it's surprising that GROUP@ is left without read or write permissions after a chmod 0666: touch test chmod 0600 test nfs4_getfacl test A::OWNER@:rwatTcCy A::GROUP@:tcy A::EVERYONE@:tcy nfs4_getfacl test \| nfs4_setfacl -S - test # chmod 0666 test nfs4_getfacl test A::OWNER@:rwatTcCy A::GROUP@:tcy D::GROUP@:rwa A::EVERYONE@:rwatcy So, let's stop creating the unnecessary mask ACL. A mask will still be created on non-trivial ACLs (ACLs with actual named user and group ACEs), so the odd posix-acl behavior of chmod modifying only the mask will still be left in that case; but that's consistent with local behavior. Reported-by: Soumya Koduri <skoduri@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-04-04 10:13:23 -04:00
J. Bruce Fields	082f31a216	nfsd: revert v2 half of "nfsd: don't return high mode bits" This reverts the part of commit `6e14b46b91` that changes NFSv2 behavior. Mark Lord found that it broke nfs-root for Linux clients, because it broke NFSv2. In fact, from RFC 1094: "Notice that the file type is specified both in the mode bits and in the file type. This is really a bug in the protocol and will be fixed in future versions." So NFSv2 clients really are expected to depend on the high bits of the mode. Cc: stable@kernel.org Reported-by: Mark Lord <mlord@pobox.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-04-04 10:13:07 -04:00
Wang YanQing	8f6c5ffc89	kernel/groups.c: remove return value of set_groups After commit `6307f8fee2` ("security: remove dead hook task_setgroups"), set_groups will always return zero, so we could just remove return value of set_groups. This patch reduces code size, and simplfies code to use set_groups, because we don't need to check its return value any more. [akpm@linux-foundation.org: remove obsolete claims from set_groups() comment] Signed-off-by: Wang YanQing <udknight@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-04-03 16:21:05 -07:00
Miklos Szeredi	520c8b1650	vfs: add renameat2 syscall Add new renameat2 syscall, which is the same as renameat with an added flags argument. Pass flags to vfs_rename() and to i_op->rename() as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: J. Bruce Fields <bfields@redhat.com>	2014-04-01 17:08:42 +02:00
Yan, Zheng	18df11d0ea	nfsd4: fix memory leak in nfsd4_encode_fattr() fh_put() does not free the temporary file handle. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-31 17:08:23 -04:00
Stanislav Kinsbursky	3064639423	nfsd: check passed socket's net matches NFSd superblock's one There could be a case, when NFSd file system is mounted in network, different to socket's one, like below: "ip netns exec" creates new network and mount namespace, which duplicates NFSd mount point, created in init_net context. And thus NFS server stop in nested network context leads to RPCBIND client destruction in init_net. Then, on NFSd start in nested network context, rpc.nfsd process creates socket in nested net and passes it into "write_ports", which leads to RPCBIND sockets creation in init_net context because of the same reason (NFSd monut point was created in init_net context). An attempt to register passed socket in nested net leads to panic, because no RPCBIND client present in nexted network namespace. This patch add check that passed socket's net matches NFSd superblock's one. And returns -EINVAL error to user psace otherwise. v2: Put socket on exit. Reported-by: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-31 16:58:16 -04:00
Kinglong Mee	d531c008d7	NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp Besides checking rpc_xprt out of xs_setup_bc_tcp, increase it's reference (it's important). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:36 -04:00
Kinglong Mee	2336745e87	NFSD: Clear wcc data between compound ops Testing NFS4.0 by pynfs, I got some messeages as, "nfsd: inode locked twice during operation." When one compound RPC contains two or more ops that locks the filehandle,the second op will cause the message. As two SETATTR ops, after the first SETATTR, nfsd will not call fh_put() to release current filehandle, it means filehandle have unlocked with fh_post_saved = 1. The second SETATTR find fh_post_saved = 1, and printk the message. v2: introduce helper fh_clear_wcc(). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:34 -04:00
Trond Myklebust	a8a7c6776f	nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ RFC5661 obsoletes NFS4ERR_STALE_STATEID in favour of NFS4ERR_BAD_STATEID. Note that because nfsd encodes the clientid boot time in the stateid, we can hit this error case in certain scenarios where the Linux client state management thread exits early, before it has finished recovering all state. Reported-by: Idan Kedar <idank@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:33 -04:00
J. Bruce Fields	1bc49d83c3	nfsd4: fix nfs4err_resource in 4.1 case encode_getattr, for example, can return nfserr_resource to indicate it ran out of buffer space. That's not a legal error in the 4.1 case. And in the 4.1 case, if we ran out of buffer space, we should have exceeded a session limit too. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:56 -04:00
J. Bruce Fields	480efaee08	nfsd4: fix setclientid encode size Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:55 -04:00
J. Bruce Fields	1bed92cb3c	nfsd4: remove redundant check from nfsd4_check_resp_size cstate->slot and ->session are each set together in nfsd4_sequence. If one is non-NULL, so is the other. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:54 -04:00
J. Bruce Fields	d139977d00	nfsd4: use more generous NFS4_ACL_MAX Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:53 -04:00
J. Bruce Fields	0da7b19cc3	nfsd4: minor nfsd4_replay_cache_entry cleanup Maybe this is comment true, who cares? Handle this like any other error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:52 -04:00
J. Bruce Fields	3ca2eb9814	nfsd4: nfsd4_replay_cache_entry should be static This isn't actually used anywhere else. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:51 -04:00
J. Bruce Fields	067e1ace46	nfsd4: update comments with obsolete function name Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:50 -04:00
Kinglong Mee	3f42d2c428	NFSD: Using free_conn free connection Connection from alloc_conn must be freed through free_conn, otherwise, the reference of svc_xprt will never be put. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:23:40 -04:00
J. Bruce Fields	fbb74a34a5	nfsd: typo in nfsd_rename comment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:02:11 -04:00
Kinglong Mee	4daeed25ad	NFSD: simplify saved/current fh uses in nfsd4_proc_compound Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:01:40 -04:00
Kinglong Mee	2b90563598	NFSD: Traverse unconfirmed client through hash-table When stopping nfsd, I got BUG messages, and soft lockup messages, The problem is cuased by double rb_erase() in nfs4_state_destroy_net() and destroy_client(). This patch just let nfsd traversing unconfirmed client through hash-table instead of rbtree. [ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at (null) [ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0 [ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC [ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi [last unloaded: nfsd] [ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF O 3.14.0-rc8+ #2 [ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti: ffff8800797f6000 [ 2325.022982] RIP: 0010:[<ffffffff8133c18c>] [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] RSP: 0018:ffff8800797f7d98 EFLAGS: 00010246 [ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX: 0000000000000000 [ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI: ffff880079f4c810 [ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09: ffff88007964fc70 [ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12: ffff880079f4c800 [ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15: ffff880079f4c860 [ 2325.022982] FS: 0000000000000000(0000) GS:ffff88007f900000(0000) knlGS:0000000000000000 [ 2325.022982] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4: 00000000000006e0 [ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2325.022982] Stack: [ 2325.022982] ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8 ffff8800797f7da8 [ 2325.022982] ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0 ffff880079c1f010 [ 2325.022982] ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2 ffff8800797f7df0 [ 2325.022982] Call Trace: [ 2325.022982] [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd] [ 2325.022982] [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220 [nfsd] [ 2325.022982] [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd] [ 2325.022982] [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd] [ 2325.022982] [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc] [ 2325.022982] [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd] [ 2325.022982] [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd] [ 2325.022982] [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 2325.022982] [<ffffffff810a8232>] kthread+0xd2/0xf0 [ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 2325.022982] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0 [ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10 0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b 50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01 [ 2325.022982] RIP [<ffffffff8133c18c>] rb_erase+0x14c/0x390 [ 2325.022982] RSP <ffff8800797f7d98> [ 2325.022982] CR2: 0000000000000000 [ 2325.022982] ---[ end trace 28c27ed011655e57 ]--- [ 228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558] [ 228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0 snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore i2c_core ata_generic pata_acpi [ 228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF O 3.14.0-rc8+ #2 [ 228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti: ffff880074616000 [ 228.064539] RIP: 0010:[<ffffffff8133ba17>] [<ffffffff8133ba17>] rb_next+0x27/0x50 [ 228.064539] RSP: 0018:ffff880074617de0 EFLAGS: 00000282 [ 228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX: 0000000000000014 [ 228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI: ffff880074478010 [ 228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09: 0000000000000012 [ 228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12: ffffea0001d11a00 [ 228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15: ffff880074617d50 [ 228.064539] FS: 0000000000000000(0000) GS:ffff88007f800000(0000) knlGS:0000000000000000 [ 228.064539] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4: 00000000000006f0 [ 228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 228.064539] Stack: [ 228.064539] ffff880074617e28 ffffffffa01ab7db ffff880074617df0 ffff880074617df0 [ 228.064539] ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0 0000000000000000 [ 228.064539] 0000000000000000 ffff880074617e48 ffffffffa01840b8 ffffffff81cc26c0 [ 228.064539] Call Trace: [ 228.064539] [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220 [nfsd] [ 228.064539] [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd] [ 228.064539] [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd] [ 228.064539] [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc] [ 228.064539] [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd] [ 228.064539] [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd] [ 228.064539] [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd] [ 228.064539] [<ffffffff810a8232>] kthread+0xd2/0xf0 [ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 228.064539] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0 [ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40 [ 228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48 8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b 50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7 Fixes: `ac55fdc408` (nfsd: move the confirmed and unconfirmed hlists...) Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 10:41:40 -04:00
Jeff Layton	e874f9f8e0	svcrpc: explicitly reject compounds that are not padded out to 4-byte multiple We have a WARN_ON in the nfsd4_decode_write() that tells us when the client has sent a request that is not padded out properly according to RFC4506. A WARN_ON really isn't appropriate in this case though since this indicates a client bug, not a server one. Move this check out to the top-level compound decoder and have it just explicitly return an error. Also add a dprintk() that shows the client address and xid to help track down clients and frames that trigger it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:58 -04:00
J. Bruce Fields	9f67f18993	nfsd: notify_change needs elevated write count Looks like this bug has been here since these write counts were introduced, not sure why it was just noticed now. Thanks also to Jan Kara for pointing out the problem. Cc: stable@vger.kernel.org Reported-by: Matthew Rahtz <mrahtz@rapitasystems.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:48 -04:00
J. Bruce Fields	a11fcce154	nfsd4: fix test_stateid error reply encoding If the entire operation fails then there's nothing to encode. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:39 -04:00
J. Bruce Fields	04819bf644	nfsd4: leave reply buffer space for failed setattr This fixes an ommission from `18032ca062` "NFSD: Server implementation of MAC Labeling", which increased the size of the setattr error reply without increasing COMPOUND_ERR_SLACK_SPACE. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:09 -04:00
J. Bruce Fields	798df33879	nfsd4: make set of large acl return efbig, not resource If a client attempts to set an excessively large ACL, return NFS4ERR_FBIG instead of NFS4ERR_RESOURCE. I'm not sure FBIG is correct, but I'm positive RESOURCE is wrong (it isn't even a well-defined error any more for NFS versions since 4.1). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:08 -04:00
J. Bruce Fields	4c69d5855a	nfsd4: session needs room for following op to error out Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:30:59 -04:00
J. Bruce Fields	de3997a7ee	nfsd4: buffer-length check for SUPPATTR_EXCLCREAT This was an omission from `8c18f2052e` "nfsd41: SUPPATTR_EXCLCREAT attribute". Cc: Benny Halevy <bhalevy@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:30:42 -04:00
J. R. Okajima	1406b916f4	nfsd: fix lost nfserrno() call in nfsd_setattr() There is a regression in `208d0ac` 2014-01-07 nfsd4: break only delegations when appropriate which deletes an nfserrno() call in nfsd_setattr() (by accident, probably), and NFSD becomes ignoring an error from VFS. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-02-18 12:25:57 -05:00
J. Bruce Fields	09bdc2d70d	nfsd4: fix acl buffer overrun `4ac7249ea5` "nfsd: use get_acl and ->set_acl" forgets to set the size in the case get_acl() succeeds, so _posix_to_nfsv4_one() can then write past the end of its allocation. Symptoms were slab corruption warnings. Also, some minor cleanup while we're here. (Among other things, note that the first few lines guarantee that pacl is non-NULL.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-02-11 13:48:11 -05:00
Linus Torvalds	d9894c228b	Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: - Handle some loose ends from the vfs read delegation support. (For example nfsd can stop breaking leases on its own in a fewer places where it can now depend on the vfs to.) - Make life a little easier for NFSv4-only configurations (thanks to Kinglong Mee). - Fix some gss-proxy problems (thanks Jeff Layton). - miscellaneous bug fixes and cleanup * 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits) nfsd: consider CLAIM_FH when handing out delegation nfsd4: fix delegation-unlink/rename race nfsd4: delay setting current_fh in open nfsd4: minor nfs4_setlease cleanup gss_krb5: use lcm from kernel lib nfsd4: decrease nfsd4_encode_fattr stack usage nfsd: fix encode_entryplus_baggage stack usage nfsd4: simplify xdr encoding of nfsv4 names nfsd4: encode_rdattr_error cleanup nfsd4: nfsd4_encode_fattr cleanup minor svcauth_gss.c cleanup nfsd4: better VERIFY comment nfsd4: break only delegations when appropriate NFSD: Fix a memory leak in nfsd4_create_session sunrpc: get rid of use_gssp_lock sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt sunrpc: don't wait for write before allowing reads from use-gss-proxy file nfsd: get rid of unused function definition Define op_iattr for nfsd4_open instead using macro NFSD: fix compile warning without CONFIG_NFSD_V3 ...	2014-01-30 10:18:43 -08:00
Ming Chen	ed47b062ce	nfsd: consider CLAIM_FH when handing out delegation CLAIM_FH was added by NFSv4.1. It is the same as CLAIM_NULL except that it uses only current FH to identify the file to be opened. The NFS client is using CLAIM_FH if the FH is available when opening a file. Currently, we cannot get any delegation if we stat a file before open it because the server delegation code does not recognize CLAIM_FH. We tested this patch and found delegation can be handed out now when claim is CLAIM_FH. See http://marc.info/?l=linux-nfs&m=136369847801388&w=2 and http://www.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues#New_open_claim_types Signed-off-by: Ming Chen <mchen@cs.stonybrook.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-27 18:02:42 -05:00
J. Bruce Fields	4335723e8e	nfsd4: fix delegation-unlink/rename race If a file is unlinked or renamed between the time when we do the local open and the time when we get the delegation, then we will return to the client indicating that it holds a delegation even though the file no longer exists under the name it was open under. But a client performing an open-by-name, when it is returned a delegation, must be able to assume that the file is still linked at the name it was opened under. So, hold the parent i_mutex for longer to prevent concurrent renames or unlinks. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-27 13:59:16 -05:00
J. Bruce Fields	c0e6bee480	nfsd4: delay setting current_fh in open This is basically a no-op, to simplify a following patch. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-27 13:59:16 -05:00
J. Bruce Fields	e873088f29	nfsd4: minor nfs4_setlease cleanup As far as I can tell, this list is used only under the state lock, so we may as well do this in the simpler order. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-27 13:59:15 -05:00
Christoph Hellwig	4ac7249ea5	nfsd: use get_acl and ->set_acl Remove the boilerplate code to marshall and unmarhall ACL objects into xattrs and operate on the posix_acl objects directly. Also move all the ACL handling code into nfs?acl.c where it belongs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-26 08:26:41 -05:00
J. Bruce Fields	d50e61361c	nfsd4: decrease nfsd4_encode_fattr stack usage A struct svc_fh is 320 bytes on x86_64, it'd be better not to have these on the stack. kmalloc'ing them probably isn't ideal either, but this is the simplest thing to do. If it turns out to be a problem in the readdir case then we could add a svc_fh to nfsd4_readdir and pass that in. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-24 15:58:21 -05:00
J. Bruce Fields	068c34c0ce	nfsd: fix encode_entryplus_baggage stack usage We stick an extra svc_fh in nfsd3_readdirres to save the need to kmalloc, though maybe it would be fine to kmalloc instead. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-23 13:50:27 -05:00
J. Bruce Fields	3554116d3a	nfsd4: simplify xdr encoding of nfsv4 names We can simplify the idmapping code if it does its own encoding and returns nfs errors. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-08 12:18:53 -05:00
J. Bruce Fields	87915c6472	nfsd4: encode_rdattr_error cleanup There's a simpler way to write this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-07 16:01:18 -05:00
J. Bruce Fields	6b6d8137f1	nfsd4: nfsd4_encode_fattr cleanup Remove some pointless goto's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-07 16:01:17 -05:00
J. Bruce Fields	41ae6e714a	nfsd4: better VERIFY comment This confuses me every time. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-07 16:01:15 -05:00
J. Bruce Fields	208d0acc49	nfsd4: break only delegations when appropriate As a temporary fix, nfsd was breaking all leases on unlink, link, rename, and setattr. Now that we can distinguish between leases and delegations, we can be nicer and break only the delegations, and not bother lease-holders with operations they don't care about. And we get to delete some code while we're at it. Note that in the presence of delegations the vfs calls here all return -EWOULDBLOCK instead of blocking, so nfsd threads will not get stuck waiting for delegation returns. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-07 16:01:15 -05:00
Kinglong Mee	60810e5489	NFSD: Fix a memory leak in nfsd4_create_session If failed after calling alloc_session but before init_session, nfsd will call __free_session to free se_slots in session. But, session->se_fchannel.maxreqs is not initialized (value is zero). So that, the memory malloced for slots will be lost in free_session_slots for maxreqs is zero. This path sets the information for channel in alloc_session after mallocing slots succeed, instead in init_session. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-06 15:33:54 -05:00
Kinglong Mee	73ca65904c	nfsd: get rid of unused function definition Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-06 12:23:33 -05:00
Kinglong Mee	3ff69309fe	Define op_iattr for nfsd4_open instead using macro Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-06 12:23:32 -05:00
Kinglong Mee	ff88825fbb	NFSD: fix compile warning without CONFIG_NFSD_V3 Without CONFIG_NFSD_V3, compile will get warning as, fs/nfsd/nfssvc.c: In function 'nfsd_svc': >> fs/nfsd/nfssvc.c:246:60: warning: array subscript is above array bounds [-Warray-bounds] return (nfsd_versions[2] != NULL) \|\| (nfsd_versions[3] != NULL); ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-06 12:23:31 -05:00
Kinglong Mee	8ef667140c	NFSD: Don't start lockd when only NFSv4 is running When starting without nfsv2 and nfsv3, nfsd does not need to start lockd (and certainly doesn't need to fail because lockd failed to register with the portmapper). Reported-by: Gareth Williams <gareth@garethwilliams.me.uk> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 18:18:50 -05:00
Kinglong Mee	7e55b59b2f	SUNRPC/NFSD: Support a new option for ignoring the result of svc_register NFSv4 clients can contact port 2049 directly instead of needing the portmapper. Therefore a failure to register to the portmapper when starting an NFSv4-only server isn't really a problem. But Gareth Williams reports that an attempt to start an NFSv4-only server without starting portmap fails: #rpc.nfsd -N 2 -N 3 rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused) rpc.nfsd: unable to set any sockets for nfsd Add a flag to svc_version to tell the rpc layer it can safely ignore an rpcbind failure in the NFSv4-only case. Reported-by: Gareth Williams <gareth@garethwilliams.me.uk> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 18:18:49 -05:00
Kinglong Mee	8a891633b8	NFSD: fix bad length checking for backchannel the length for backchannel checking should be multiplied by sizeof(__be32). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 18:18:48 -05:00
Kinglong Mee	f403e450e8	NFSD: fix a leak which can cause CREATE_SESSION failures check_forechannel_attrs gets drc memory, so nfsd must put it when check_backchannel_attrs fails. After many requests with bad back channel attrs, nfsd will deny any client's CREATE_SESSION forever. A new test case named CSESS29 for pynfs will send in another mail. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 18:18:47 -05:00
Kinglong Mee	2ce02b6b6c	Add missing recording of back channel attrs in nfsd4_session commit `5b6feee960` forgot recording the back channel attrs in nfsd4_session. nfsd just check the back channel attars by check_backchannel_attrs, but do not record it in nfsd4_session in the latest kernel. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 18:18:46 -05:00
Kinglong Mee	dfeecc829e	nfsd: get rid of unused macro definition Since defined in Linux-2.6.12-rc2, READTIME has not been used. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 13:44:24 -05:00
Kinglong Mee	eba1c99ce4	nfsd: clean up unnecessary temporary variable in nfsd4_decode_fattr host_err was only used for nfs4_acl_new. This patch delete it, and return nfserr_jukebox directly. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 13:44:23 -05:00
Kinglong Mee	43212cc7df	nfsd: using nfsd4_encode_noop for encoding destroy_session/free_stateid Get rid of the extra code, using nfsd4_encode_noop for encoding destroy_session and free_stateid. And, delete unused argument (fr_status) int nfsd4_free_stateid. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 13:44:22 -05:00
Kinglong Mee	a9f7b4a06c	nfsd: clean up an xdr reserved space calculation We should use XDR_LEN to calculate reserved space in case the oid is not a multiple of 4. RESERVE_SPACE actually rounds up for us, but it's probably better to be careful here. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-03 13:44:12 -05:00
Kinglong Mee	b9b284df6c	nfsd: get rid of unused function definition commit `557ce2646e` "nfsd41: replace page based DRC with buffer based DRC" have remove unused nfsd4_set_statp, but miss the function definition. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-02 17:53:23 -05:00
Kinglong Mee	a8bb84bc9e	nfsd: calculate the missing length of bitmap in EXCHANGE_ID commit `58cd57bfd9` "nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID" miss calculating the length of bitmap for spo_must_enforce and spo_must_allow. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-01-02 17:44:52 -05:00
Stanislav Kholmanskikh	c4fa6d7c59	nfsd: revoking of suid/sgid bits after chown() in a consistent way There is an inconsistency in the handling of SUID/SGID file bits after chown() between NFS and other local file systems. Local file systems (for example, ext3, ext4, xfs, btrfs) revoke SUID/SGID bits after chown() on a regular file even if the owner/group of the file has not been changed: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file ~# chown root file; ls -l file -rwxr-Sr-- 1 root root 0 Dec 6 04:49 file but NFS doesn't do that: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file ~# chown root file; ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 04:49 file NFS does that only if the owner/group has been changed: ~# touch file; chmod ug+s file; chmod u+x file ~# ls -l file -rwsr-Sr-- 1 root root 0 Dec 6 05:02 file ~# chown bin file; ls -l file -rwxr-Sr-- 1 bin root 0 Dec 6 05:02 file See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/chown.html "If the specified file is a regular file, one or more of the S_IXUSR, S_IXGRP, or S_IXOTH bits of the file mode are set, and the process has appropriate privileges, it is implementation-defined whether the set-user-ID and set-group-ID bits are altered." So both variants are acceptable by POSIX. This patch makes NFS to behave like local file systems. Signed-off-by: Stanislav Kholmanskikh <stanislav.kholmanskikh@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-12 11:37:18 -05:00
Jeff Layton	a0ef5e1968	nfsd: don't try to reuse an expired DRC entry off the list Currently when we are processing a request, we try to scrape an expired or over-limit entry off the list in preference to allocating a new one from the slab. This is unnecessarily complicated. Just use the slab layer. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-11 11:27:04 -05:00
Christoph Hellwig	2d8498dbf8	nfsd: start documenting some XDR handling functions Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-10 20:37:47 -05:00
Albert Fluegel	6e14b46b91	nfsd: don't return high mode bits The Linux NFS server replies among other things to a "Check access permission" the following: NFS: File type = 2 (Directory) NFS: Mode = 040755 A netapp server replies here: NFS: File type = 2 (Directory) NFS: Mode = 0755 The RFC 1813 i read: fattr3 struct fattr3 { ftype3 type; mode3 mode; uint32 nlink; ... For the mode bits only the lowest 9 are defined in the RFC As far as I can tell, knfsd has always done this, so apparently it's harmless. Nevertheless, it appears to be wrong. Note this is already correct in the NFSv4 case, only v2 and v3 need fixing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-10 20:35:58 -05:00
Jeff Layton	781c2a5a5f	nfsd: when reusing an existing repcache entry, unhash it first The DRC code will attempt to reuse an existing, expired cache entry in preference to allocating a new one. It'll then search the cache, and if it gets a hit it'll then free the cache entry that it was going to reuse. The cache code doesn't unhash the entry that it's going to reuse however, so it's possible for it end up designating an entry for reuse and then subsequently freeing the same entry after it finds it. This leads it to a later use-after-free situation and usually some list corruption warnings or an oops. Fix this by simply unhashing the entry that we intend to reuse. That will mean that it's not findable via a search and should prevent this situation from occurring. Cc: stable@vger.kernel.org # v3.10+ Reported-by: Christoph Hellwig <hch@infradead.org> Reported-by: g. artim <gartim@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-12-10 20:34:44 -05:00
J. Bruce Fields	365da4adeb	nfsd4: fix xdr decoding of large non-write compounds This fixes a regression from `247500820e` "nfsd4: fix decoding of compounds across page boundaries". The previous code was correct: argp->pagelist is initialized in nfs4svc_deocde_compoundargs to rqstp->rq_arg.pages, and is therefore a pointer to the page after the page we are currently decoding. The reason that patch nevertheless fixed a problem with decoding compounds containing write was a bug in the write decoding introduced by `5a80a54d21` "nfsd4: reorganize write decoding", after which write decoding no longer adhered to the rule that argp->pagelist point to the next page. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-19 18:06:54 -05:00
Christoph Hellwig	987da47910	nfsd: make sure to balance get/put_write_access Use a straight goto error label style in nfsd_setattr to make sure we always do the put_write_access call after we got it earlier. Note that the we have been failing to do that in the case nfsd_break_lease() returns an error, a bug introduced into 2.6.38 with `6a76bebefe` "nfsd4: break lease on nfsd setattr". Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-18 12:06:48 -05:00
Christoph Hellwig	818e5a22e9	nfsd: split up nfsd_setattr Split out two helpers to make the code more readable and easier to verify for correctness. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-18 12:06:47 -05:00
Linus Torvalds	449bf8d03c	Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: "This includes miscellaneous bugfixes and cleanup and a performance fix for write-heavy NFSv4 workloads. (The most significant nfsd-relevant change this time is actually in the delegation patches that went through Viro, fixing a long-standing bug that can cause NFSv4 clients to miss updates made by non-nfs users of the filesystem. Those enable some followup nfsd patches which I have queued locally, but those can wait till 3.14)" * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (24 commits) nfsd: export proper maximum file size to the client nfsd4: improve write performance with better sendspace reservations svcrpc: remove an unnecessary assignment sunrpc: comment typo fix Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation" nfsd4: fix discarded security labels on setattr NFSD: Add support for NFS v4.2 operation checking nfsd4: nfsd_shutdown_net needs state lock NFSD: Combine decode operations for v4 and v4.1 nfsd: -EINVAL on invalid anonuid/gid instead of silent failure nfsd: return better errors to exportfs nfsd: fh_update should error out in unexpected cases nfsd4: need to destroy revoked delegations in destroy_client nfsd: no need to unhash_stid before free nfsd: remove_stid can be incorporated into nfs4_put_delegation nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid nfsd: nfs4_free_stid nfsd: fix Kconfig syntax sunrpc: trim off EC bytes in GSSAPI v2 unwrap gss_krb5: document that we ignore sequence number ...	2013-11-16 12:04:02 -08:00
Christoph Hellwig	aea240f416	nfsd: export proper maximum file size to the client I noticed that we export a way to high value for the maxfilesize attribute when debugging a client issue. The issue didn't turn out to be related to it, but I think we should export it, so that clients can limit what write sizes they accept before hitting the server. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-14 15:18:47 -05:00
J. Bruce Fields	6ff40decff	nfsd4: improve write performance with better sendspace reservations Currently the rpc code conservatively refuses to accept rpc's from a client if the sum of its worst-case estimates of the replies it owes that client exceed the send buffer space. Unfortunately our estimate of the worst-case reply for an NFSv4 compound is always the maximum read size. This can unnecessarily limit the number of operations we handle concurrently, for example in the case most operations are writes (which have small replies). We can do a little better if we check which ops the compound contains. This is still a rough estimate, we'll need to improve on it some day. Reported-by: Shyam Kaushik <shyamnfs1@gmail.com> Tested-by: Shyam Kaushik <shyamnfs1@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-13 16:12:54 -05:00
J. Bruce Fields	27ac0ffeac	locks: break delegations on any attribute modification NFSv4 uses leases to guarantee that clients can cache metadata as well as data. Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: David Howells <dhowells@redhat.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:44 -05:00
J. Bruce Fields	146a8595c6	locks: break delegations on link Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:43 -05:00
J. Bruce Fields	8e6d782cab	locks: break delegations on rename Cc: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:43 -05:00
J. Bruce Fields	b21996e36c	locks: break delegations on unlink We need to break delegations on any operation that changes the set of links pointing to an inode. Start with unlink. Such operations also hold the i_mutex on a parent directory. Breaking a delegation may require waiting for a timeout (by default 90 seconds) in the case of a unresponsive NFS client. To avoid blocking all directory operations, we therefore drop locks before waiting for the delegation. The logic then looks like: acquire locks ... test for delegation; if found: take reference on inode release locks wait for delegation break drop reference on inode retry It is possible this could never terminate. (Even if we take precautions to prevent another delegation being acquired on the same inode, we could get a different inode on each retry.) But this seems very unlikely. The initial test for a delegation happens after the lock on the target inode is acquired, but the directory inode may have been acquired further up the call stack. We therefore add a "struct inode **" argument to any intervening functions, which we use to pass the inode back up to the caller in the case it needs a delegation synchronously broken. Cc: David Howells <dhowells@redhat.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:42 -05:00
J. Bruce Fields	617588d518	locks: introduce new FL_DELEG lock flag For now FL_DELEG is just a synonym for FL_LEASE. So this patch doesn't change behavior. Next we'll modify break_lease to treat FL_DELEG leases differently, to account for the fact that NFSv4 delegations should be broken in more situations than Windows oplocks. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-11-09 00:16:41 -05:00
J. Bruce Fields	b78800baee	Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation" This reverts commit `7ebe40f203`. We forgot the nfs4_put_delegation call in fs/nfsd/nfs4callback.c which should not be unhashing the stateid. This lead to warnings from the idr code when we tried to removed id's twice. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-04 17:46:50 -05:00
J. Bruce Fields	3378b7f40d	nfsd4: fix discarded security labels on setattr Security labels in setattr calls are currently ignored because we forget to set label->len. Cc: stable@vger.kernel.org Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-11-01 13:40:46 -04:00
Anna Schumaker	8217d146ab	NFSD: Add support for NFS v4.2 operation checking The server does allow NFS over v4.2, even if it doesn't add any new operations yet. I also switch to using constants to represent the last operation for each minor version since this makes the code cleaner and easier to understand at a quick glance. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-30 18:25:04 -04:00
J. Bruce Fields	e50a26dc78	nfsd4: nfsd_shutdown_net needs state lock A comment claims the caller should take it, but that's not being done. Note we don't want it around the cancel_delayed_work_sync since that may wait on work which holds the client lock. Reported-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-30 10:35:59 -04:00
Anna Schumaker	e1a90ebd8b	NFSD: Combine decode operations for v4 and v4.1 We were using a different array of function pointers to represent each minor version. This makes adding a new minor version tedious, since it needs a step to copy, paste and modify a new version of the same functions. This patch combines the v4 and v4.1 arrays into a single instance and will check minor version support inside each decoder function. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-30 10:04:08 -04:00
J. Bruce Fields	6f6cc3205c	nfsd: -EINVAL on invalid anonuid/gid instead of silent failure If we're going to refuse to accept these it would be polite of us to at least say so.... This introduces a slight complication since we need to grandfather in exportfs's ill-advised use of -1 uid and gid on its test_export. If it turns out there are other users passing down -1 we may need to do something else. Best might be to drop the checks entirely, but I'm not sure if other parts of the kernel might assume that a task can't run as uid or gid -1. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-29 17:46:14 -04:00
J. Bruce Fields	427d6c6646	nfsd: return better errors to exportfs Someone noticed exportfs happily accepted exports that would later be rejected when mountd tried to give them to the kernel. Fix this. This is a regression from `4c1e1b34d5` "nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids". Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: stable@vger.kernel.org Reported-by: Yin.JianHong <jiyin@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-29 17:45:30 -04:00
J. Bruce Fields	49e7372063	nfsd: fh_update should error out in unexpected cases The reporter saw a NULL dereference when a filesystem's ->mknod returned success but left the dentry negative, and then nfsd tried to dereference d_inode (in this case because the CREATE was followed by a GETATTR in the same nfsv4 compound). fh_update already checks for this and another broken case, but for some reason it returns success and leaves nfsd trying to soldier on. If it failed we'd avoid the crash. There's only so much we can do with a buggy filesystem, but it's easy enough to bail out here, so let's do that. Reported-by: Antti Tönkyrä <daedalus@pingtimeout.net> Tested-by: Antti Tönkyrä <daedalus@pingtimeout.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-29 17:43:52 -04:00
Benny Halevy	956c4fee44	nfsd4: need to destroy revoked delegations in destroy_client [use list_splice_init] Signed-off-by: Benny Halevy <bhalevy@primarydata.com> [bfields: no need for recall_lock here] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-29 12:00:48 -04:00
Benny Halevy	01a87d91fc	nfsd: no need to unhash_stid before free idr_remove is about to be called before kmem_cache_free so unhashing it is redundant Signed-off-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-29 10:03:27 -04:00
Benny Halevy	7ebe40f203	nfsd: remove_stid can be incorporated into nfs4_put_delegation All calls to nfs4_put_delegation are preceded with remove_stid. Signed-off-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-28 15:58:32 -04:00
Benny Halevy	5d7dab83e3	nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid In the out_free: path, the newly allocated stid must be removed rather than unhashed so it can never be found. Signed-off-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-28 15:58:20 -04:00
Benny Halevy	9857df815f	nfsd: nfs4_free_stid Make it symmetric to nfs4_alloc_stid. Signed-off-by: Benny Halevy <bhalevy@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-28 15:43:06 -04:00
Christoph Hellwig	cce6de908e	nfsd: fix Kconfig syntax The description text for CONFIG_NFSD_V4_SECURITY_LABEL has an unpaired quote sign which breaks syntax highlighting for the nfsd Kconfig file. Remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-10-26 15:37:26 -04:00
Al Viro	a6a9f18f0a	nfsd: switch to %p[dD] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-24 23:34:51 -04:00
Al Viro	97e47fa11d	nfsd: switch to %p[dD] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-02 16:18:24 -04:00
Linus Torvalds	26935fb06e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 4 from Al Viro: "list_lru pile, mostly" This came out of Andrew's pile, Al ended up doing the merge work so that Andrew didn't have to. Additionally, a few fixes. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (42 commits) super: fix for destroy lrus list_lru: dynamically adjust node arrays shrinker: Kill old ->shrink API. shrinker: convert remaining shrinkers to count/scan API staging/lustre/libcfs: cleanup linux-mem.h staging/lustre/ptlrpc: convert to new shrinker API staging/lustre/obdclass: convert lu_object shrinker to count/scan API staging/lustre/ldlm: convert to shrinkers to count/scan API hugepage: convert huge zero page shrinker to new shrinker API i915: bail out earlier when shrinker cannot acquire mutex drivers: convert shrinkers to new count/scan API fs: convert fs shrinkers to new scan/count API xfs: fix dquot isolation hang xfs-convert-dquot-cache-lru-to-list_lru-fix xfs: convert dquot cache lru to list_lru xfs: rework buffer dispose list tracking xfs-convert-buftarg-lru-to-generic-code-fix xfs: convert buftarg LRU to generic code fs: convert inode and dentry shrinking to be node aware vmscan: per-node deferred work ...	2013-09-12 15:01:38 -07:00
Linus Torvalds	cf596766fc	Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "This was a very quiet cycle! Just a few bugfixes and some cleanup" * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: rpc: let xdr layer allocate gssproxy receieve pages rpc: fix huge kmalloc's in gss-proxy rpc: comment on linux_cred encoding, treat all as unsigned rpc: clean up decoding of gssproxy linux creds svcrpc: remove unused rq_resused nfsd4: nfsd4_create_clid_dir prints uninitialized data nfsd4: fix leak of inode reference on delegation failure Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR" sunrpc: prepare NFS for 2038 nfsd4: fix setlease error return nfsd: nfs4_file_get_access: need to be more careful with O_RDWR	2013-09-10 20:04:59 -07:00
Dave Chinner	1ab6c4997e	fs: convert fs shrinkers to new scan/count API Convert the filesystem shrinkers to use the new API, and standardise some of the behaviours of the shrinkers at the same time. For example, nr_to_scan means the number of objects to scan, not the number of objects to free. I refactored the CIFS idmap shrinker a little - it really needs to be broken up into a shrinker per tree and keep an item count with the tree root so that we don't need to walk the tree every time the shrinker needs to count the number of objects in the tree (i.e. all the time under memory pressure). [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree] [assorted fixes folded in] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-10 18:56:31 -04:00
Al Viro	301f0268b6	nfsd: racy access to ->d_name in nsfd4_encode_path() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-03 22:50:28 -04:00
J. Bruce Fields	248f807b47	nfsd4: nfsd4_create_clid_dir prints uninitialized data Take the easy way out and just remove the printk. Reported-by: David Howells <dhowells@redhat.com>	2013-08-30 17:30:52 -04:00
J. Bruce Fields	bf7bd3e98b	nfsd4: fix leak of inode reference on delegation failure This fixes a regression from `68a3396178` "nfsd4: shut down more of delegation earlier". After that commit, nfs4_set_delegation() failures result in nfs4_put_delegation being called, but nfs4_put_delegation doesn't free the nfs4_file that has already been set by alloc_init_deleg(). This can result in an oops on later unmounting the exported filesystem. Note also delaying the fi_had_conflict check we're able to return a better error (hence give 4.1 clients a better idea why the delegation failed; though note CONFLICT isn't an exact match here, as that's supposed to indicate a current conflict, but all we know here is that there was one recently). Reported-by: Toralf Förster <toralf.foerster@gmx.de> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-30 17:30:52 -04:00
J. Bruce Fields	3477565e6a	Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR" This reverts commit `df66e75395`. nfsd4_lock can get a read-only or write-only reference when only a read-write open is available. This is normal. Cc: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-30 17:30:45 -04:00
J. Bruce Fields	b8297cec2d	Linux 3.11-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJSCDSjAAoJEHm+PkMAQRiGDXMIAI7Loae0Oqb1eoeJkvjyZsBS OJDeeEcn+k58VbxVHyRdc7hGo4yI4tUZm172SpnOaM8sZ/ehPU7zBrwJK2lzX334 /jAM3uvVPfxA2nu0I4paNpkED/NQ8NRRsYE1iTE8dzHXOH6dA3mgp5qfco50rQvx rvseXpME4KIAJEq4jnyFZF5+nuHiPueM9JftPmSSmJJ3/KY9kY1LESovyWd7ttg1 jYSVPFal9J0E+tl2UQY5g9H16GqhhjYn+39Iei6Q5P4bL4ZubQgTRQTN9nyDc06Z ezQtGoqZ8kEz/2SyRlkda6PzjSEhgXlc8mCL5J7AW+dMhTHHx2IrosjiCA80kG8= =c0rK -----END PGP SIGNATURE----- Merge tag 'v3.11-rc5' into for-3.12 branch For testing purposes I want some nfs and nfsd bugfixes (specifically, `58cd57bfd9` and previous nfsd patches, and Trond's `4f3cc4809a`).	2013-08-30 16:42:49 -04:00
Weston Andros Adamson	58cd57bfd9	nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID - don't BUG_ON() when not SP4_NONE - calculate recv and send reserve sizes correctly Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-07 12:06:07 -04:00
J. Bruce Fields	c47205914c	nfsd4: Fix MACH_CRED NULL dereference Fixes a NULL-dereference on attempts to use MACH_CRED protection over auth_sys. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-07 12:05:51 -04:00
J. Bruce Fields	b1948a641d	nfsd4: fix setlease error return This actually makes a difference in the 4.1 case, since we use the status to decide what reason to give the client for the delegation refusal (see nfsd4_open_deleg_none_ext), and in theory a client might choose suboptimal behavior if we give the wrong answer. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-26 17:02:07 -04:00
Harshula Jayasuriya	df66e75395	nfsd: nfs4_file_get_access: need to be more careful with O_RDWR If fi_fds = {non-NULL, NULL, non-NULL} and oflag = O_WRONLY the WARN_ON_ONCE(!(fp->fi_fds[oflag] \|\| fp->fi_fds[O_RDWR])) doesn't trigger when it should. Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-23 12:26:51 -04:00
Harshula Jayasuriya	e4daf1ffbe	nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file The following call chain: ------------------------------------------------------------ nfs4_get_vfs_file - nfsd_open - dentry_open - do_dentry_open - __get_file_write_access - get_write_access - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY; ------------------------------------------------------------ can result in the following state: ------------------------------------------------------------ struct nfs4_file { ... fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0}, fi_access = {{ counter = 0x1 }, { counter = 0x0 }}, ... ------------------------------------------------------------ 1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NULL, hence nfsd_open() is called where we get status set to an error and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented. 2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented. Thus we leave a landmine in the form of the nfs4_file data structure in an incorrect state. 3) Eventually, when __nfs4_file_put_access() is called it finds fi_access[O_WRONLY] being non-zero, it decrements it and calls nfs4_file_put_fd() which tries to fput -ETXTBSY. ------------------------------------------------------------ ... [exception RIP: fput+0x9] RIP: ffffffff81177fa9 RSP: ffff88062e365c90 RFLAGS: 00010282 RAX: ffff880c2b3d99cc RBX: ffff880c2b3d9978 RCX: 0000000000000002 RDX: dead000000100101 RSI: 0000000000000001 RDI: ffffffffffffffe6 RBP: ffff88062e365c90 R8: ffff88041fe797d8 R9: ffff88062e365d58 R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd] #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd] #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd] #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd] #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd] #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd] #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd] #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc] #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc] #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd] #19 [ffff88062e365ee8] kthread at ffffffff81090886 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a ------------------------------------------------------------ Cc: stable@vger.kernel.org Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-23 12:15:32 -04:00
Linus Torvalds	61f98b0fca	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Just three minor bugfixes" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: svcrdma: underflow issue in decode_write_list() nfsd4: fix minorversion support interface lockd: protect nlm_blocked access in nlmsvc_retry_blocked	2013-07-17 13:43:55 -07:00
J. Bruce Fields	35f7a14fc1	nfsd4: fix minorversion support interface You can turn on or off support for minorversions using e.g. echo "-4.2" >/proc/fs/nfsd/versions However, the current implementation is a little wonky. For example, the above will turn off 4.2 support, but it will also turn on 4.1 support. This didn't matter as long as we only had 2 minorversions, which was true till very recently. And do a little cleanup here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-12 16:48:52 -04:00
Linus Torvalds	0ff08ba5d0	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: "Changes this time include: - 4.1 enabled on the server by default: the last 4.1-specific issues I know of are fixed, so we're not going to find the rest of the bugs without more exposure. - Experimental support for NFSv4.2 MAC Labeling (to allow running selinux over NFS), from Dave Quigley. - Fixes for some delicate cache/upcall races that could cause rare server hangs; thanks to Neil Brown and Bodo Stroesser for extreme debugging persistence. - Fixes for some bugs found at the recent NFS bakeathon, mostly v4 and v4.1-specific, but also a generic bug handling fragmented rpc calls" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: support minorversion 1 by default nfsd4: allow destroy_session over destroyed session svcrpc: fix failures to handle -1 uid's sunrpc: Don't schedule an upcall on a replaced cache entry. net/sunrpc: xpt_auth_cache should be ignored when expired. sunrpc/cache: ensure items removed from cache do not have pending upcalls. sunrpc/cache: use cache_fresh_unlocked consistently and correctly. sunrpc/cache: remove races with queuing an upcall. nfsd4: return delegation immediately if lease fails nfsd4: do not throw away 4.1 lock state on last unlock nfsd4: delegation-based open reclaims should bypass permissions svcrpc: don't error out on small tcp fragment svcrpc: fix handling of too-short rpc's nfsd4: minor read_buf cleanup nfsd4: fix decoding of compounds across page boundaries nfsd4: clean up nfs4_open_delegation NFSD: Don't give out read delegations on creates nfsd4: allow client to send no cb_sec flavors nfsd4: fail attempts to request gss on the backchannel nfsd4: implement minimal SP4_MACH_CRED ...	2013-07-11 10:17:13 -07:00
Linus Torvalds	be0c5d8c0b	NFS client updates for Linux 3.11 Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC +4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS R2mLTRhKu1DKguTzO13f =wGjn -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Feature highlights include: - Add basic client support for NFSv4.2 - Add basic client support for Labeled NFS (selinux for NFSv4.2) - Fix the use of credentials in NFSv4.1 stateful operations, and add support for NFSv4.1 state protection. Bugfix highlights: - Fix another NFSv4 open state recovery race - Fix an NFSv4.1 back channel session regression - Various rpc_pipefs races - Fix another issue with NFSv3 auth negotiation Please note that Labeled NFS does require some additional support from the security subsystem. The relevant changesets have all been reviewed and acked by James Morris." * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits) NFS: Set NFS_CS_MIGRATION for NFSv4 mounts NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs nfs: have NFSv3 try server-specified auth flavors in turn nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it nfs: move server_authlist into nfs_try_mount_request nfs: refactor "need_mount" code out of nfs_try_mount SUNRPC: PipeFS MOUNT notification optimization for dying clients SUNRPC: split client creation routine into setup and registration SUNRPC: fix races on PipeFS UMOUNT notifications SUNRPC: fix races on PipeFS MOUNT notifications NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize NFS: Improve legacy idmapping fallback NFSv4.1 end back channel session draining NFS: Apply v4.1 capabilities to v4.2 NFSv4.1: Clean up layout segment comparison helper names NFSv4.1: layout segment comparison helpers should take 'const' parameters NFSv4: Move the DNS resolver into the NFSv4 module rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set ...	2013-07-09 12:09:43 -07:00
J. Bruce Fields	d109148111	nfsd4: support minorversion 1 by default We now have minimal minorversion 1 support; turn it on by default. This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions". Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-08 19:48:10 -04:00
J. Bruce Fields	f0f51f5cdd	nfsd4: allow destroy_session over destroyed session RFC 5661 allows a client to destroy a session using a compound associated with the destroyed session, as long as the DESTROY_SESSION op is the last op of the compound. We attempt to allow this, but testing against a Solaris client (which does destroy sessions in this way) showed that we were failing the DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference count on the session (held by us) represented another rpc in progress over this session. Fix this by noting that in this case the expected reference count is 1, not 0. Also, note as long as the session holds a reference to the compound we're destroying, we can't free it here--instead, delay the free till the final put in nfs4svc_encode_compoundres. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-08 19:46:38 -04:00
J. Bruce Fields	d08d32e6e5	nfsd4: return delegation immediately if lease fails This case shouldn't happen--the administrator shouldn't really allow other applications access to the export until clients have had the chance to reclaim their state--but if it does then we should set the "return this lease immediately" bit on the reply. That still leaves some small races, but it's the best the protocol allows us to do in the case a lease is ripped out from under us.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:07 -04:00
J. Bruce Fields	0a262ffb75	nfsd4: do not throw away 4.1 lock state on last unlock This reverts commit `eb2099f31b` "nfsd4: release lockowners on last unlock in 4.1 case". Trond identified language in rfc 5661 section 8.2.4 which forbids this behavior: Stateids associated with byte-range locks are an exception. They remain valid even if a LOCKU frees all remaining locks, so long as the open file with which they are associated remains open, unless the client frees the stateids via the FREE_STATEID operation. And bakeathon 2013 testing found a 4.1 freebsd client was getting an incorrect BAD_STATEID return from a FREE_STATEID in the above situation and then failing. The spec language honestly was probably a mistake but at this point with implementations already following it we're probably stuck with that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:06 -04:00
J. Bruce Fields	89f6c3362c	nfsd4: delegation-based open reclaims should bypass permissions We saw a v4.0 client's create fail as follows: - open create succeeds and gets a read delegation - client attempts to set mode on new file, gets DELAY while server recalls delegation. - client attempts a CLAIM_DELEGATE_CUR open using the delegation, gets error because of new file mode. This probably can't happen on a recent kernel since we're no longer giving out delegations on create opens. Nevertheless, it's a bug--reclaim opens should bypass permission checks. Reported-by: Steve Dickson <steved@redhat.com> Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:05 -04:00
J. Bruce Fields	590b743143	nfsd4: minor read_buf cleanup The code to step to the next page seems reasonably self-contained. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:03 -04:00
J. Bruce Fields	247500820e	nfsd4: fix decoding of compounds across page boundaries A freebsd NFSv4.0 client was getting rare IO errors expanding a tarball. A network trace showed the server returning BAD_XDR on the final getattr of a getattr+write+getattr compound. The final getattr started on a page boundary. I believe the Linux client ignores errors on the post-write getattr, and that that's why we haven't seen this before. Cc: stable@vger.kernel.org Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:29:40 -04:00
J. Bruce Fields	99c415156c	nfsd4: clean up nfs4_open_delegation The nfs4_open_delegation logic is unecessarily baroque. Also stop pretending we support write delegations in several places. Some day we will support write delegations, but when that happens adding back in these flag parameters will be the easy part. For now they're just confusing. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:07 -04:00
Steve Dickson	9a0590aec3	NFSD: Don't give out read delegations on creates When an exclusive create is done with the mode bits set (aka open(testfile, O_CREAT \| O_EXCL, 0777)) this causes a OPEN op followed by a SETATTR op. When a read delegation is given in the OPEN, it causes the SETATTR to delay with EAGAIN until the delegation is recalled. This patch caused exclusive creates to give out a write delegation (which turn into no delegation) which allows the SETATTR seamlessly succeed. Signed-off-by: Steve Dickson <steved@redhat.com> [bfields: do this for any CREATE, not just exclusive; comment] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:07 -04:00
J. Bruce Fields	57569a7070	nfsd4: allow client to send no cb_sec flavors In testing I notice that some of the pynfs tests forget to send any cb_sec flavors, and that we haven't necessarily errored out in that case before. I'll fix pynfs, but am also inclined to default to trying AUTH_NONE in that case in case this is something clients actually do. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:07 -04:00
J. Bruce Fields	b78724b705	nfsd4: fail attempts to request gss on the backchannel We don't support gss on the backchannel. We should state that fact up front rather than just letting things continue and later making the client try to figure out why the backchannel isn't working. Trond suggested instead returning NFS4ERR_NOENT. I think it would be tricky for the client to distinguish between the case "I don't support gss on the backchannel" and "I can't find that in my cache, please create another context and try that instead", and I'd prefer something that currently doesn't have any other meaning for this operation, hence the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:06 -04:00
J. Bruce Fields	57266a6e91	nfsd4: implement minimal SP4_MACH_CRED Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring the client-provided spo_must_* arrays and just enforcing credential checks for the minimum required operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:06 -04:00
J. Bruce Fields	0dc1531aca	svcrpc: store gss mech in svc_cred Store a pointer to the gss mechanism used in the rq_cred and cl_cred. This will make it easier to enforce SP4_MACH_CRED, which needs to compare the mechanism used on the exchange_id with that used on protected operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:23:06 -04:00
Jeff Layton	1c8c601a8c	locks: protect most of the file_lock handling with i_lock Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:42 +04:00
Al Viro	ac6614b764	[readdir] constify ->actor Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:05 +04:00
Al Viro	bb6f619b3a	[readdir] introduce ->iterate(), ctx->pos, dir_emit() New method - ->iterate(file, ctx). That's the replacement for ->readdir(); it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and calls dir_emit(ctx, ...) instead of filldir(data, ...). It does not update file->f_pos (or look at it, for that matter); iterate_dir() does the update. Note that dir_emit() takes the offset from ctx->pos (and eventually filldir_t will lose that argument). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:46:47 +04:00
Al Viro	5c0ba4e076	[readdir] introduce iterate_dir() and dir_context iterate_dir(): new helper, replacing vfs_readdir(). struct dir_context: contains the readdir callback (and will get more stuff in it), embedded into whatever data that callback wants to deal with; eventually, we'll be passing it to ->readdir() replacement instead of (data,filldir) pair. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:46:46 +04:00
Jim Rees	1a9357f443	nfsd: avoid undefined signed overflow In C, signed integer overflow results in undefined behavior, but unsigned overflow wraps around. So do the subtraction first, then cast to signed. Reported-by: Joakim Tjernlund <joakim.tjernlund@transmode.se> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-21 11:02:03 -04:00
J. Bruce Fields	ba4e55bb67	nfsd4: fix compile in !CONFIG_NFSD_V4_SECURITY_LABEL case Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-15 10:35:18 -04:00
David Quigley	18032ca062	NFSD: Server implementation of MAC Labeling Implement labeled NFS on the server: encoding and decoding, and writing and reading, of file labels. Enabled with CONFIG_NFSD_V4_SECURITY_LABEL. Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-15 09:27:02 -04:00
Steve Dickson	4bdc33ed5b	NFSDv4.2: Add NFS v4.2 support to the NFS server This enables NFSv4.2 support for the server. To enable this code do the following: echo "+4.2" >/proc/fs/nfsd/versions after the nfsd kernel module is loaded. On its own this does nothing except allow the server to respond to compounds with minorversion set to 2. All the new NFSv4.2 features are optional, so this is perfectly legal. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-13 10:11:47 -04:00
J. Bruce Fields	4f540e29dc	nfsd4: store correct client minorversion for >=4.2 This code assumes that any client using exchange_id is using NFSv4.1, but with the introduction of 4.2 that will no longer true. This main effect of this is that client callbacks will use the same minorversion as that used on the exchange_id. Note that clients are forbidden from mixing 4.1 and 4.2 compounds. (See rfc 5661, section 2.7, #13: "A client MUST NOT attempt to use a stateid, filehandle, or similar returned object from the COMPOUND procedure with minor version X for another COMPOUND procedure with minor version Y, where X != Y.") However, we do not currently attempt to enforce this except in the case of mixing zero minor version with non-zero minor versions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-13 10:11:45 -04:00
Zhao Hongjiang	eb48241bb4	nfsd: get rid of the unused functions in vfs The fh_lock_parent(), nfsd_truncate(), nfsd_notify_change() and nfsd_sync_dir() fuctions are neither implemented nor used, just remove them. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-13 10:11:44 -04:00
Steve Dickson	4488cc96c5	NFS: Add NFSv4.2 protocol constants Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com> Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg> Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg> Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg> Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-13 10:05:43 -04:00
Linus Torvalds	2dbd3cac87	Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Small fixes for two bugs and two warnings" * 'for-3.10' of git://linux-nfs.org/~bfields/linux: nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error SUNRPC: fix decoding of optional gss-proxy xdr fields SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning nfsd4: don't allow owner override on 4.1 CLAIM_FH opens	2013-05-10 09:28:55 -07:00
Jeff Layton	7255e716b1	nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error Toralf reported the following oops to the linux-nfs mailing list: -----------------[snip]------------------ NFSD: unable to generate recoverydir name (-2). NFSD: disabling legacy clientid tracking. Reboot recovery will not function correctly! BUG: unable to handle kernel NULL pointer dereference at 000003c8 IP: [<f90a3d91>] nfsd4_client_tracking_exit+0x11/0x50 [nfsd] pdpt = 000000002ba33001 pde = 0000000000000000 Oops: 0000 [#1] SMP Modules linked in: loop nfsd auth_rpcgss ipt_MASQUERADE xt_owner xt_multiport ipt_REJECT xt_tcpudp xt_recent xt_conntrack nf_conntrack_ftp xt_limit xt_LOG iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables af_packet pppoe pppox ppp_generic slhc bridge stp llc tun arc4 iwldvm mac80211 coretemp kvm_intel uvcvideo sdhci_pci sdhci mmc_core videobuf2_vmalloc videobuf2_memops usblp videobuf2_core i915 iwlwifi psmouse videodev cfg80211 kvm fbcon bitblit cfbfillrect acpi_cpufreq mperf evdev softcursor font cfbimgblt i2c_algo_bit cfbcopyarea intel_agp intel_gtt drm_kms_helper snd_hda_codec_conexant drm agpgart fb fbdev tpm_tis thinkpad_acpi tpm nvram e1000e rfkill thermal ptp wmi pps_core tpm_bios 8250_pci processor 8250 ac snd_hda_intel snd_hda_codec snd_pcm battery video i2c_i801 snd_page_alloc snd_timer button serial_core i2c_core snd soundcore thermal_sys hwmon aesni_intel ablk_helper cryp td lrw aes_i586 xts gf128mul cbc fuse nfs lockd sunrpc dm_crypt dm_mod hid_monterey hid_microsoft hid_logitech hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech hid_generic usbhid hid sr_mod cdrom sg [last unloaded: microcode] Pid: 6374, comm: nfsd Not tainted 3.9.1 #6 LENOVO 4180F65/4180F65 EIP: 0060:[<f90a3d91>] EFLAGS: 00010202 CPU: 0 EIP is at nfsd4_client_tracking_exit+0x11/0x50 [nfsd] EAX: 00000000 EBX: fffffffe ECX: 00000007 EDX: 00000007 ESI: eb9dcb00 EDI: eb2991c0 EBP: eb2bde38 ESP: eb2bde34 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 80050033 CR2: 000003c8 CR3: 2ba80000 CR4: 000407f0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process nfsd (pid: 6374, ti=eb2bc000 task=eb2711c0 task.ti=eb2bc000) Stack: fffffffe eb2bde4c f90a3e0c f90a7754 fffffffe eb0a9c00 eb2bdea0 f90a41ed eb2991c0 1b270000 eb2991c0 eb2bde7c f9099ce9 eb2bde98 0129a020 eb29a020 eb2bdecc eb2991c0 eb2bdea8 f9099da5 00000000 eb9dcb00 00000001 67822f08 Call Trace: [<f90a3e0c>] legacy_recdir_name_error+0x3c/0x40 [nfsd] [<f90a41ed>] nfsd4_create_clid_dir+0x15d/0x1c0 [nfsd] [<f9099ce9>] ? nfsd4_lookup_stateid+0x99/0xd0 [nfsd] [<f9099da5>] ? nfs4_preprocess_seqid_op+0x85/0x100 [nfsd] [<f90a4287>] nfsd4_client_record_create+0x37/0x50 [nfsd] [<f909d6ce>] nfsd4_open_confirm+0xfe/0x130 [nfsd] [<f90980b1>] ? nfsd4_encode_operation+0x61/0x90 [nfsd] [<f909d5d0>] ? nfsd4_free_stateid+0xc0/0xc0 [nfsd] [<f908fd0b>] nfsd4_proc_compound+0x41b/0x530 [nfsd] [<f9081b7b>] nfsd_dispatch+0x8b/0x1a0 [nfsd] [<f857b85d>] svc_process+0x3dd/0x640 [sunrpc] [<f908165d>] nfsd+0xad/0x110 [nfsd] [<f90815b0>] ? nfsd_destroy+0x70/0x70 [nfsd] [<c1054824>] kthread+0x94/0xa0 [<c1486937>] ret_from_kernel_thread+0x1b/0x28 [<c1054790>] ? flush_kthread_work+0xd0/0xd0 Code: 86 b0 00 00 00 90 c5 0a f9 c7 04 24 70 76 0a f9 e8 74 a9 3d c8 eb ba 8d 76 00 55 89 e5 53 66 66 66 66 90 8b 15 68 c7 0a f9 85 d2 <8b> 88 c8 03 00 00 74 2c 3b 11 77 28 8b 5c 91 08 85 db 74 22 8b EIP: [<f90a3d91>] nfsd4_client_tracking_exit+0x11/0x50 [nfsd] SS:ESP 0068:eb2bde34 CR2: 00000000000003c8 ---[ end trace 09e54015d145c9c6 ]--- The problem appears to be a regression that was introduced in commit `9a9c6478` "nfsd: make NFSv4 recovery client tracking options per net". Prior to that commit, it was safe to pass a NULL net pointer to nfsd4_client_tracking_exit in the legacy recdir case, and legacy_recdir_name_error did so. After that comit, the net pointer must be valid. This patch just fixes legacy_recdir_name_error to pass in a valid net pointer to that function. Cc: <stable@vger.kernel.org> # v3.8+ Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-09 12:58:36 -04:00
J. Bruce Fields	9f415eb255	nfsd4: don't allow owner override on 4.1 CLAIM_FH opens The Linux client is using CLAIM_FH to implement regular opens, not just recovery cases, so it depends on the server to check permissions correctly. Therefore the owner override, which may make sense in the delegation recovery case, isn't right in the CLAIM_FH case. Symptoms: on a client with `49f9a0fafd` "NFSv4.1: Enable open-by-filehandle", Bryan noticed this: touch test.txt chmod 000 test.txt echo test > test.txt succeeding. Cc: stable@kernel.org Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-05-03 16:36:14 -04:00
Linus Torvalds	1db772216f	Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from J Bruce Fields: "Highlights include: - Some more DRC cleanup and performance work from Jeff Layton - A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the server using credentials from Active Directory often fail due to limitations of the svcgssd upcall interface. This replacement lifts those limitations. The existing upcall is still supported for backwards compatibility. - More NFSv4.1 support: at this point, if a user with a current client who upgrades from 4.0 to 4.1 should see no regressions. In theory we do everything a 4.1 server is required to do. Patches for a couple minor exceptions are ready for 3.11, and with those and some more testing I'd like to turn 4.1 on by default in 3.11." Fix up semantic conflict as per Stephen Rothwell and linux-next: Commit `030d794bf4` ("SUNRPC: Use gssproxy upcall for server RPCGSS authentication") adds two new users of "PDE(inode)->data", but we're supposed to use "PDE_DATA(inode)" instead since commit `d9dda78bad` ("procfs: new helper - PDE_DATA(inode)"). The old PDE() macro is no longer available since commit `c30480b92c` ("proc: Make the PROC_I() and PDE() macros internal to procfs") * 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits) NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo() nfsd: make symbol nfsd_reply_cache_shrinker static svcauth_gss: fix error return code in rsc_parse() nfsd4: don't remap EISDIR errors in rename svcrpc: fix gss-proxy to respect user namespaces SUNRPC: gssp_procedures[] can be static SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case nfsd4: better error return to indicate SSV non-support nfsd: fix EXDEV checking in rename SUNRPC: Use gssproxy upcall for server RPCGSS authentication. SUNRPC: Add RPC based upcall mechanism for RPCGSS auth SUNRPC: conditionally return endtime from import_sec_context SUNRPC: allow disabling idle timeout SUNRPC: attempt AF_LOCAL connect on setup nfsd: Decode and send 64bit time values nfsd4: put_client_renew_locked can be static nfsd4: remove unused macro nfsd4: remove some useless code nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED ...	2013-05-03 10:59:39 -07:00
Linus Torvalds	20b4fb4852	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...	2013-05-01 17:51:54 -07:00
Chuck Lever	676e4ebd5f	NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly If nfsd4_do_encode_secinfo() can't find GSS info that matches an export security flavor, it assumes the flavor is not a GSS pseudoflavor, and simply puts it on the wire. However, if this XDR encoding logic is given a legitimate GSS pseudoflavor but the RPC layer says it does not support that pseudoflavor for some reason, then the server leaks GSS pseudoflavor numbers onto the wire. I confirmed this happens by blacklisting rpcsec_gss_krb5, then attempted a client transition from the pseudo-fs to a Kerberos-only share. The client received a flavor list containing the Kerberos pseudoflavor numbers, rather than GSS tuples. The encoder logic can check that each pseudoflavor in flavs[] is less than MAXFLAVOR before writing it into the buffer, to prevent this. But after "nflavs" is written into the XDR buffer, the encoder can't skip writing flavor information into the buffer when it discovers the RPC layer doesn't support that flavor. So count the number of valid flavors as they are written into the XDR buffer, then write that count into a placeholder in the XDR buffer when all recognized flavors have been encoded. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-30 19:18:21 -04:00
Chuck Lever	ed9411a004	NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo() Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-30 19:18:21 -04:00
Wei Yongjun	c8c797f9fd	nfsd: make symbol nfsd_reply_cache_shrinker static symbol 'nfsd_reply_cache_shrinker' only used within this file. It should be static. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-30 18:19:34 -04:00
J. Bruce Fields	2a6cf944c2	nfsd4: don't remap EISDIR errors in rename We're going out of our way here to remap an error to make rfc 3530 happy--but the rfc itself (nor rfc 1813, which has similar language) gives no justification. And disagrees with local filesystem behavior, with Linux and posix man pages, and knfsd's implemented behavior for v2 and v3. And the documented behavior seems better, in that it gives a little more information--you could implement the 3530 behavior using the posix behavior, but not the other way around. Also, the Linux client makes no attempt to remap this error in the v4 case, so it can end up just returning EEXIST to the application in a case where it should return EISDIR. So honestly I think the rfc's are just buggy here--or in any case it doesn't see worth the trouble to remap this error. Reported-by: Frank S Filz <ffilz@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-30 15:44:20 -04:00
Linus Torvalds	8728f986fe	NFS client bugfixes and cleanups for 3.10 - NLM: stable fix for NFSv2/v3 blocking locks - NFSv4.x: stable fixes for the delegation recall error handling code - NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck Lever - SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck - NFSv4.x assorted state management and reboot recovery bugfixes - NFSv4.1: In cases where we have already looked up a file, and hold a valid filehandle, use the new open-by-filehandle operation instead of opening by name. - Allow the NFSv4.1 callback thread to freeze - NFSv4.x: ensure that file unlock waits for readahead to complete - NFSv4.1: ensure that the RPC layer doesn't override the NFS session table size negotiation by limiting the number of slots. - NFSv4.x: Fix SETATTR spec compatibility issues -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRfu+cAAoJEGcL54qWCgDylxkP/24cXOLHMKMYnIab0cQYIW2m SQgADGE+MqgTVlWjGVWublVMDY1R51iINsksAjxMtXYt50FdBJEqV2uxIGi4VnbR nR9eppqQ6vk5e6r5+aZyVmWKoLFnJ4MBF6OpPUZB5mf8iH/fiixmSYLvseopPdDj bjHwCxg+xEgew5EhQF/xqkEfkAp2NN84xUksTWb9uDIW2c3SJweY/ZVR2Zsqpugm oqYVtrSLvNKqINQG8OP10s+mRWULwoqapF+kEHlxNbRy26C0zlbXPaneSgYzqHsY OyRkAT7uJJqStYlqdW7k+DhyNMB+T33WAGJpWQlfJGYk5d/n0rtBJDVo0hfhCSQr VkOXiO9J08NMbelCu4+0CJii7h5GCaqpuJEEmNL6AlF/TJVkIQJuRaG2+WDmEtO2 oYd4UfXlAbUuts1SW7u/yyN/yrjVTm1tZYRBqn2VJdqh1s8dMxEWPct2Yn314mpS ODAnbDkEhtWlc9cloSRnwKec5WcxMZb19IJeK9ZvHm7PfIu/QHtj6Ren8s1//bZI OMQxC/Vf/wcjMdNtr7QdMNxWG1aK8DL9mYP5XwCrkZ560LIrtxmhyqYeoGAfgO5u +K/gKmQwjsaPhEa8jbP2/wI0II9yKPWj/fVwqhbhqaBUx5GA2iAKcdpPP6JAMAti +PXkLTtkyrIgSNwzl63S =Hgot -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes and cleanups from Trond Myklebust: - NLM: stable fix for NFSv2/v3 blocking locks - NFSv4.x: stable fixes for the delegation recall error handling code - NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck Lever - SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck - NFSv4.x assorted state management and reboot recovery bugfixes - NFSv4.1: In cases where we have already looked up a file, and hold a valid filehandle, use the new open-by-filehandle operation instead of opening by name. - Allow the NFSv4.1 callback thread to freeze - NFSv4.x: ensure that file unlock waits for readahead to complete - NFSv4.1: ensure that the RPC layer doesn't override the NFS session table size negotiation by limiting the number of slots. - NFSv4.x: Fix SETATTR spec compatibility issues * tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (67 commits) NFSv4: Warn once about servers that incorrectly apply open mode to setattr NFSv4: Servers should only check SETATTR stateid open mode on size change NFSv4: Don't recheck permissions on open in case of recovery cached open NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot NFSv4: Ensure the LOCK call cannot use the delegation stateid NFSv4: Use the open stateid if the delegation has the wrong mode nfs: Send atime and mtime as a 64bit value NFSv4: Record the OPEN create mode used in the nfs4_opendata structure NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports SUNRPC: Allow rpc_create() to request that TCP slots be unlimited SUNRPC: Fix a livelock problem in the xprt->backlog queue NFSv4: Fix handling of revoked delegations by setattr NFSv4 release the sequence id in the return on close case nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks NFS: Ensure that NFS file unlock waits for readahead to complete NFS: Add functionality to allow waiting on all outstanding reads to complete ...	2013-04-30 11:28:08 -07:00
Jeff Layton	398c33aaa4	nfsd: convert nfs4_alloc_stid() to use idr_alloc_cyclic() Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-29 18:28:41 -07:00
J. Bruce Fields	b1df763723	Merge branch 'nfs-for-next' of git://linux-nfs.org/~trondmy/nfs-2.6 into for-3.10 Note conflict: Chuck's patches modified (and made static) gss_mech_get_by_OID, which is still needed by gss-proxy patches. The conflict resolution is a bit minimal; we may want some more cleanup.	2013-04-29 16:23:34 -04:00
J. Bruce Fields	dd30333cf5	nfsd4: better error return to indicate SSV non-support As 4.1 becomes less experimental and SSV still isn't implemented, we have to admit it's not going to be, and return some sensible error rather than just saying "our server's broken". Discussion in the ietf group hasn't turned up any objections to using NFS4ERR_ENC_ALG_UNSUPP for that purpose. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 16:18:15 -04:00
J. Bruce Fields	aa387d6ce1	nfsd: fix EXDEV checking in rename We again check for the EXDEV a little later on, so the first check is redundant. This check is also slightly racier, since a badly timed eviction from the export cache could leave us with the two fh_export pointers pointing to two different cache entries which each refer to the same underlying export. It's better to compare vfsmounts as the later check does, but that leaves a minor security hole in the case where the two exports refer to two different directories especially if (for example) they have different root-squashing options. So, compare ex_path.dentry too. Reported-by: Joe Habermann <joe.habermann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 16:18:15 -04:00
Bryan Schumaker	bf8d909705	nfsd: Decode and send 64bit time values The seconds field of an nfstime4 structure is 64bit, but we are assuming that the first 32bits are zero-filled. So if the client tries to set atime to a value before the epoch (touch -t 196001010101), then the server will save the wrong value on disk. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-23 14:49:37 -04:00
Fengguang Wu	ba138435d1	nfsd4: put_client_renew_locked can be static Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-16 22:15:00 -04:00
J. Bruce Fields	9aeb5aeeb0	nfsd4: remove unused macro Cleanup a piece I forgot to remove in `9411b1d4c7` "nfsd4: cleanup handling of nfsv4.0 closed stateid's". Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-16 21:51:55 -04:00
fanchaoting	53584f6652	nfsd4: remove some useless code The "list_empty(&oo->oo_owner.so_stateids)" is aways true, so remove it. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-16 10:59:31 -04:00
J. Bruce Fields	3bd64a5ba1	nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED A 4.1 server must notify a client that has had any state revoked using the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag. The client can figure out exactly which state is the problem using CHECK_STATEID and then free it using FREE_STATEID. The status flag will be unset once all such revoked stateids are freed. Our server's only recallable state is delegations. So we keep with each 4.1 client a list of delegations that have timed out and been recalled, but haven't yet been freed by FREE_STATEID. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-16 10:59:30 -04:00
J. Bruce Fields	23340032e6	nfsd4: clean up validate_stateid The logic here is better expressed with a switch statement. While we're here, CLOSED stateids (or stateids of an unkown type--which would indicate a server bug) should probably return nfserr_bad_stateid, though this behavior shouldn't affect any non-buggy client. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 17:42:28 -04:00
J. Bruce Fields	06b332a522	nfsd4: check backchannel attributes on create_session Make sure the client gives us an adequate backchannel. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 16:53:56 -04:00
J. Bruce Fields	55c760cfc4	nfsd4: fix forechannel attribute negotiation Negotiation of the 4.1 session forechannel attributes is a mess. Fix: - Move it all into check_forechannel_attrs instead of spreading it between that, alloc_session, and init_forechannel_attrs. - set a minimum "slotsize" so that our drc memory limits apply even for small maxresponsesize_cached. This also fixes some bugs when slotsize becomes <= 0. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 16:43:44 -04:00
J. Bruce Fields	373cd4098a	nfsd4: cleanup check_forechannel_attrs Pass this struct by reference, not by value, and return an error instead of a boolean to allow for future additions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 15:49:50 -04:00
Al Viro	75ef9de126	constify a bunch of struct file_operations instances Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-04-09 14:16:20 -04:00
J. Bruce Fields	0c7c3e67ab	nfsd4: don't close read-write opens too soon Don't actually close any opens until we don't need them at all. This means being left with write access when it's not really necessary, but that's better than putting a file that might still have posix locks held on it, as we have been. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:57 -04:00
J. Bruce Fields	eb2099f31b	nfsd4: release lockowners on last unlock in 4.1 case In the 4.1 case we're supposed to release lockowners as soon as they're no longer used. It would probably be more efficient to reference count them, but that's slightly fiddly due to the need to have callbacks from locks.c to take into account lock merging and splitting. For most cases just scanning the inode's lock list on unlock for matching locks will be sufficient. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:56 -04:00
J. Bruce Fields	bbc9c36c31	nfsd4: more sessions/open-owner-replay cleanup More logic that's unnecessary in the 4.1 case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:56 -04:00
J. Bruce Fields	3d74e6a5b6	nfsd4: no need for replay_owner in sessions case The replay_owner will never be used in the sessions case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:55 -04:00
J. Bruce Fields	c383747ef6	nfsd4: remove some redundant comments Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:54 -04:00
Wei Yongjun	2c44a23471	nfsd: use kmem_cache_free() instead of kfree() memory allocated by kmem_cache_alloc() should be freed using kmem_cache_free(), not kfree(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-09 09:08:47 -04:00
J. Bruce Fields	9411b1d4c7	nfsd4: cleanup handling of nfsv4.0 closed stateid's Closed stateid's are kept around a little while to handle close replays in the 4.0 case. So we stash them in the last-used stateid in the oo_last_closed_stateid field of the open owner. We can free that in encode_seqid_op_tail once the seqid on the open owner is next incremented. But we don't want to do that on the close itself; so we set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the first time through encode_seqid_op_tail, then when we see that flag set next time we free it. This is unnecessarily baroque. Instead, just move the logic that increments the seqid out of the xdr code and into the operation code itself. The justification given for the current placement is that we need to wait till the last minute to be sure we know whether the status is a sequence-id-mutating error or not, but examination of the code shows that can't actually happen. Reported-by: Yanchuan Nian <ycnian@gmail.com> Tested-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-08 09:55:32 -04:00
J. Bruce Fields	41d22663cb	nfsd4: remove unused nfs4_check_deleg argument Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-04 13:25:17 -04:00
J. Bruce Fields	e8c69d17d1	nfsd4: make del_recall_lru per-network-namespace If nothing else this simplifies the nfs4_state_shutdown_net logic a tad. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-04 13:25:16 -04:00
J. Bruce Fields	68a3396178	nfsd4: shut down more of delegation earlier Once we've unhashed the delegation, it's only hanging around for the benefit of an oustanding recall, which only needs the encoded filehandle, stateid, and dl_retries counter. No point keeping the file around any longer, or keeping it hashed. This also fixes a race: calls to idr_remove should really be serialized by the caller, but the nfs4_put_delegation call from the callback code isn't taking the state lock. (Better might be to cancel the callback before destroying the delegation, and remove any need for reference counting--but I don't see an easy way to cancel an rpc call.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-04 13:25:15 -04:00
J. Bruce Fields	8be2d2344c	nfsd4: minor cb_recall simplification Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-04 13:25:14 -04:00
fanchaoting	ff7c4b3693	nfsd: remove /proc/fs/nfs when create /proc/fs/nfs/exports error when create /proc/fs/nfs/exports error, we should remove /proc/fs/nfs, if don't do it, it maybe cause Memory leak. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Reviewed-by: chendt.fnst <chendt.fnst@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 15:30:07 -04:00
fanchaoting	b022032e19	nfsd: don't run get_file if nfs4_preprocess_stateid_op return error we should return error status directly when nfs4_preprocess_stateid_op return error. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 15:19:06 -04:00
Jeff Layton	89876f8c0d	nfsd: convert the file_hashtbl to a hlist We only ever traverse the hash chains in the forward direction, so a double pointer list head isn't really necessary. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 15:11:04 -04:00
J. Bruce Fields	66b2b9b2b0	nfsd4: don't destroy in-use session This changes session destruction to be similar to client destruction in that attempts to destroy a session while in use (which should be rare corner cases) result in DELAY. This simplifies things somewhat and helps meet a coming 4.2 requirement. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:40 -04:00
J. Bruce Fields	221a687669	nfsd4: don't destroy in-use clients When a setclientid_confirm or create_session confirms a client after a client reboot, it also destroys any previous state held by that client. The shutdown of that previous state must be careful not to free the client out from under threads processing other requests that refer to the client. This is a particular problem in the NFSv4.1 case when we hold a reference to a session (hence a client) throughout compound processing. The server attempts to handle this by unhashing the client at the time it's destroyed, then delaying the final free to the end. But this still leaves some races in the current code. I believe it's simpler just to fail the attempt to destroy the client by returning NFS4ERR_DELAY. This is a case that should never happen anyway. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:39 -04:00
J. Bruce Fields	4f6e6c1773	nfsd4: simplify bind_conn_to_session locking The locking here is very fiddly, and there's no reason for us to be setting cstate->session, since this is the only op in the compound. Let's just take the state lock and drop the reference counting. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:39 -04:00
J. Bruce Fields	abcdff09a0	nfsd4: fix destroy_session race destroy_session uses the session and client without continuously holding any reference or locks. Put the whole thing under the state lock for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:38 -04:00
J. Bruce Fields	bfa85e83a8	nfsd4: clientid lookup cleanup Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:37 -04:00
J. Bruce Fields	c0293b0131	nfsd4: destroy_clientid simplification I'm not sure what the check for clientid expiry was meant to do here. The check for a matching session is redundant given the previous check for state: a client without state is, in particular, a client without sessions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:36 -04:00
J. Bruce Fields	1ca507920d	nfsd4: remove some dprintk's E.g. printk's that just report the return value from an op are uninteresting as we already do that in the main proc_compound loop. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:36 -04:00
J. Bruce Fields	0eb6f20aa5	nfsd4: STALE_STATEID cleanup Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:35 -04:00
J. Bruce Fields	78389046f7	nfsd4: warn on odd create_session state This should never happen. (Note: the comparable case in setclientid_confirm can happen, since updating a client record can result in both confirmed and unconfirmed records with the same clientid.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:34 -04:00
ycnian@gmail.com	491402a787	nfsd: fix bug on nfs4 stateid deallocation NFS4_OO_PURGE_CLOSE is not handled properly. To avoid memory leak, nfs4 stateid which is pointed by oo_last_closed_stid is freed in nfsd4_close(), but NFS4_OO_PURGE_CLOSE isn't cleared meanwhile. So the stateid released in THIS close procedure may be freed immediately in the coming encoding function. Sorry that Signed-off-by was forgotten in last version. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:34 -04:00
Yanchuan Nian	9c6bdbb8dd	nfsd: remove unused macro in nfsv4 lk_rflags is never used anywhere, and rflags is not defined in struct nfsd4_lock. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:33 -04:00
J. Bruce Fields	2e4b7239a6	nfsd4: fix use-after-free of 4.1 client on connection loss Once we drop the lock here there's nothing keeping the client around: the only lock still held is the xpt_lock on this socket, but this socket no longer has any connection with the client so there's no way for other code to know we're still using the client. The solution is simple: all nfsd4_probe_callback does is set a few variables and queue some work, so there's no reason we can't just keep it under the lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:32 -04:00
J. Bruce Fields	b0a9d3ab57	nfsd4: fix race on client shutdown Dropping the session's reference count after the client's means we leave a window where the session's se_client pointer is NULL. An xpt_user callback that encounters such a session may then crash: [ 303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318 [ 303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] PGD 37811067 PUD 3d498067 PMD 0 [ 303.959061] Oops: 0002 [#8] PREEMPT SMP [ 303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio [ 303.959061] CPU 0 [ 303.959061] Pid: 264, comm: nfsd Tainted: G D 3.8.0-ARCH+ #156 Bochs Bochs [ 303.959061] RIP: 0010:[<ffffffff81481a8e>] [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] RSP: 0018:ffff880037877dd8 EFLAGS: 00010202 [ 303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278 [ 303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318 [ 303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002 [ 303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00 [ 303.959061] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 303.959061] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0 [ 303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0) [ 303.959061] Stack: [ 303.959061] ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278 [ 303.959061] ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f [ 303.959061] 0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00 [ 303.959061] Call Trace: [ 303.959061] [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd] [ 303.959061] [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc] [ 303.959061] [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc] [ 303.959061] [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd] [ 303.959061] [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 303.959061] [<ffffffff8107ae00>] kthread+0xc0/0xd0 [ 303.959061] [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100 [ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70 [ 303.959061] [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0 [ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70 [ 303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f [ 303.959061] RIP [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] RSP <ffff880037877dd8> [ 303.959061] CR2: 0000000000000318 [ 304.001218] ---[ end trace 2d809cd4a7931f5a ]--- [ 304.001903] note: nfsd[264] exited with preempt_count 2 Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:48:31 -04:00
J. Bruce Fields	9d313b17db	nfsd4: handle seqid-mutating open errors from xdr decoding If a client sets an owner (or group_owner or acl) attribute on open for create, and the mapping of that owner to an id fails, then we return BAD_OWNER. But BAD_OWNER is a seqid-mutating error, so we can't shortcut the open processing that case: we have to at least look up the owner so we can find the seqid to bump. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:53 -04:00
J. Bruce Fields	b600de7ab9	nfsd4: remove BUG_ON This BUG_ON just crashes the thread a little earlier than it would otherwise--it doesn't seem useful. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:45 -04:00
Jeff Layton	0733c7ba1e	nfsd: scale up the number of DRC hash buckets with cache size We've now increased the size of the duplicate reply cache by quite a bit, but the number of hash buckets has not changed. So, we've gone from an average hash chain length of 16 in the old code to 4096 when the cache is its largest. Change the code to scale out the number of buckets with the max size of the cache. At the same time, we also need to fix the hash function since the existing one isn't really suitable when there are more than 256 buckets. Move instead to use the stock hash_32 function for this. Testing on a machine that had 2048 buckets showed that this gave a smaller longest:average ratio than the existing hash function: The formula here is longest hash bucket searched divided by average number of entries per bucket at the time that we saw that longest bucket: old hash: 68/(39258/2048) == 3.547404 hash_32: 45/(33773/2048) == 2.728807 Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:25 -04:00
Jeff Layton	98d821bda1	nfsd: keep stats on worst hash balancing seen so far The typical case with the DRC is a cache miss, so if we keep track of the max number of entries that we've ever walked over in a search, then we should have a reasonable estimate of the longest hash chain that we've ever seen. With that, we'll also keep track of the total size of the cache when we see the longest chain. In the case of a tie, we prefer to track the smallest total cache size in order to properly gauge the worst-case ratio of max vs. avg chain length. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:25 -04:00
Jeff Layton	a2f999a37e	nfsd: add new reply_cache_stats file in nfsdfs For presenting statistics relating to duplicate reply cache. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:24 -04:00
Jeff Layton	6c6910cd4d	nfsd: track memory utilization by the DRC Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:23 -04:00
Jeff Layton	9dc56143c2	nfsd: break out comparator into separate function Break out the function that compares the rqstp and checksum against a reply cache entry. While we're at it, track the efficacy of the checksum over the NFS data by tracking the cases where we would have incorrectly matched a DRC entry if we had not tracked it or the length. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:22 -04:00
Jeff Layton	0b9ea37f24	nfsd: eliminate one of the DRC cache searches The most common case is to do a search of the cache, followed by an insert. In the case where we have to allocate an entry off the slab, then we end up having to redo the search, which is wasteful. Better optimize the code for the common case by eliminating the initial search of the cache and always preallocating an entry. In the case of a cache hit, we'll end up just freeing that entry but that's preferable to an extra search. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 11:47:22 -04:00
Chuck Lever	a77c806fb9	SUNRPC: Refactor nfsd4_do_encode_secinfo() Clean up. This matches a similar API for the client side, and keeps ULP fingers out the of the GSS mech switch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-29 15:43:33 -04:00
J. Bruce Fields	64a817cfbd	nfsd4: reject "negative" acl lengths Since we only enforce an upper bound, not a lower bound, a "negative" length can get through here. The symptom seen was a warning when we attempt to a kmalloc with an excessive size. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-26 16:18:27 -04:00
Kent Overstreet	e49dbbf3e7	nfsd: fix bad offset use vfs_writev() updates the offset argument - but the code then passes the offset to vfs_fsync_range(). Since offset now points to the offset after what was just written, this is probably not what was intended Introduced by `face15025f` "nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes". Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: stable@vger.kernel.org Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-22 16:55:15 -04:00
Jeff Layton	ac534ff2d5	nfsd: fix startup order in nfsd_reply_cache_init If we end up doing "goto out_nomem" in this function, we'll call nfsd_reply_cache_shutdown. That will attempt to walk the LRU list and free entries, but that list may not be initialized yet if the server is starting up for the first time. It's also possible for the shrinker to kick in before we've initialized the LRU list. Rearrange the initialization so that the LRU list_head and cache size are initialized before doing any of the allocations that might fail. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-18 17:21:30 -04:00
Jeff Layton	a517b608fa	nfsd: only unhash DRC entries that are in the hashtable It's not safe to call hlist_del() on a newly initialized hlist_node. That leads to a NULL pointer dereference. Only do that if the entry is hashed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-18 14:58:32 -04:00
Tejun Heo	ebd6c70714	nfsd: convert to idr_alloc() idr_get_new*() and friends are about to be deprecated. Convert to the new idr_alloc() interface. Only compile-tested. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: J. Bruce Fields <bfields@redhat.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-13 15:21:45 -07:00
Tejun Heo	801cb2d62d	nfsd: remove unused get_new_stid() get_new_stid() is no longer used since commit `3abdb60712` ("nfsd4: simplify idr allocation"). Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-13 15:21:45 -07:00
Eric W. Biederman	7f78e03513	fs: Limit sys_mount to only request filesystem modules. Modify the request_module to prefix the file system type with "fs-" and add aliases to all of the filesystems that can be built as modules to match. A common practice is to build all of the kernel code and leave code that is not commonly needed as modules, with the result that many users are exposed to any bug anywhere in the kernel. Looking for filesystems with a fs- prefix limits the pool of possible modules that can be loaded by mount to just filesystems trivially making things safer with no real cost. Using aliases means user space can control the policy of which filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf with blacklist and alias directives. Allowing simple, safe, well understood work-arounds to known problematic software. This also addresses a rare but unfortunate problem where the filesystem name is not the same as it's module name and module auto-loading would not work. While writing this patch I saw a handful of such cases. The most significant being autofs that lives in the module autofs4. This is relevant to user namespaces because we can reach the request module in get_fs_type() without having any special permissions, and people get uncomfortable when a user specified string (in this case the filesystem type) goes all of the way to request_module. After having looked at this issue I don't think there is any particular reason to perform any filtering or permission checks beyond making it clear in the module request that we want a filesystem module. The common pattern in the kernel is to call request_module() without regards to the users permissions. In general all a filesystem module does once loaded is call register_filesystem() and go to sleep. Which means there is not much attack surface exposed by loading a filesytem module unless the filesystem is mounted. In a user namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT, which most filesystems do not set today. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Kees Cook <keescook@google.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-03 19:36:31 -08:00
Linus Torvalds	b6669737d3	Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from J Bruce Fields: "Miscellaneous bugfixes, plus: - An overhaul of the DRC cache by Jeff Layton. The main effect is just to make it larger. This decreases the chances of intermittent errors especially in the UDP case. But we'll need to watch for any reports of performance regressions. - Containerized nfsd: with some limitations, we now support per-container nfs-service, thanks to extensive work from Stanislav Kinsbursky over the last year." Some notes about conflicts, since there were two non-data semantic conflicts here: - idr_remove_all() had been added by a memory leak fix, but has since become deprecated since idr_destroy() does it for us now. - xs_local_connect() had been added by this branch to make AF_LOCAL connections be synchronous, but in the meantime Trond had changed the calling convention in order to avoid a RCU dereference. There were a couple of more obvious actual source-level conflicts due to the hlist traversal changes and one just due to code changes next to each other, but those were trivial. * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits) SUNRPC: make AF_LOCAL connect synchronous nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum svcrpc: fix rpc server shutdown races svcrpc: make svc_age_temp_xprts enqueue under sv_lock lockd: nlmclnt_reclaim(): avoid stack overflow nfsd: enable NFSv4 state in containers nfsd: disable usermode helper client tracker in container nfsd: use proper net while reading "exports" file nfsd: containerize NFSd filesystem nfsd: fix comments on nfsd_cache_lookup SUNRPC: move cache_detail->cache_request callback call to cache_read() SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function SUNRPC: rework cache upcall logic SUNRPC: introduce cache_detail->cache_request callback NFS: simplify and clean cache library NFS: use SUNRPC cache creation and destruction helper for DNS cache nfsd4: free_stid can be static nfsd: keep a checksum of the first 256 bytes of request sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer sunrpc: fix comment in struct xdr_buf definition ...	2013-02-28 18:02:55 -08:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
J. Bruce Fields	4f4a4fadde	nfsd: handle vfs_getattr errors in acl protocol We're currently ignoring errors from vfs_getattr. The correct thing to do is to do the stat in the main service procedure not in the response encoding. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-26 02:46:09 -05:00
Al Viro	3dadecce20	switch vfs_getattr() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-26 02:46:08 -05:00
Linus Torvalds	94f2f14234	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace and namespace infrastructure changes from Eric W Biederman: "This set of changes starts with a few small enhnacements to the user namespace. reboot support, allowing more arbitrary mappings, and support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the user namespace root. I do my best to document that if you care about limiting your unprivileged users that when you have the user namespace support enabled you will need to enable memory control groups. There is a minor bug fix to prevent overflowing the stack if someone creates way too many user namespaces. The bulk of the changes are a continuation of the kuid/kgid push down work through the filesystems. These changes make using uids and gids typesafe which ensures that these filesystems are safe to use when multiple user namespaces are in use. The filesystems converted for 3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The changes for these filesystems were a little more involved so I split the changes into smaller hopefully obviously correct changes. XFS is the only filesystem that remains. I was hoping I could get that in this release so that user namespace support would be enabled with an allyesconfig or an allmodconfig but it looks like the xfs changes need another couple of days before it they are ready." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits) cifs: Enable building with user namespaces enabled. cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t cifs: Convert struct cifs_sb_info to use kuids and kgids cifs: Modify struct smb_vol to use kuids and kgids cifs: Convert struct cifsFileInfo to use a kuid cifs: Convert struct cifs_fattr to use kuid and kgids cifs: Convert struct tcon_link to use a kuid. cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t cifs: Convert from a kuid before printing current_fsuid cifs: Use kuids and kgids SID to uid/gid mapping cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc cifs: Use BUILD_BUG_ON to validate uids and gids are the same size cifs: Override unmappable incoming uids and gids nfsd: Enable building with user namespaces enabled. nfsd: Properly compare and initialize kuids and kgids nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids nfsd: Modify nfsd4_cb_sec to use kuids and kgids nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion nfsd: Convert nfsxdr to use kuids and kgids nfsd: Convert nfs3xdr to use kuids and kgids ...	2013-02-25 16:00:49 -08:00
Zhang Yanfei	697ce9be7d	fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used The three variables are calculated from nr_free_buffer_pages so change their types to unsigned long in case of overflow. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-23 17:50:22 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Jeff Layton	56edc86b5a	nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum kbuild test robot says: tree: git://linux-nfs.org/~bfields/linux.git for-3.9 head: `deb4534f4f` commit: `01a7decf75` [32/44] nfsd: keep a checksum of the first 256 bytes of request config: i386-randconfig-x088 (attached as .config) All warnings: fs/nfsd/nfscache.c: In function 'nfsd_cache_csum': >> fs/nfsd/nfscache.c:266:9: warning: comparison of distinct pointer types lacks a cast [enabled by default] vim +266 fs/nfsd/nfscache.c 250 __wsum csum; 251 struct xdr_buf buf = &rqstp->rq_arg; 252 const unsigned char p = buf->head[0].iov_base; 253 size_t csum_len = min_t(size_t, buf->head[0].iov_len + buf->page_len, 254 RC_CSUMLEN); 255 size_t len = min(buf->head[0].iov_len, csum_len); 256 257 /* rq_arg.head first / 258 csum = csum_partial(p, len, 0); 259 csum_len -= len; 260 261 / Continue into page array */ 262 idx = buf->page_base / PAGE_SIZE; 263 base = buf->page_base & ~PAGE_MASK; 264 while (csum_len) { 265 p = page_address(buf->pages[idx]) + base; > 266 len = min(PAGE_SIZE - base, csum_len); 267 csum = csum_partial(p, len, csum); 268 csum_len -= len; 269 base = 0; 270 ++idx; 271 } 272 return csum; 273 } 274 Signed-off-by: Jeff Layton <jlayton@redhat.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-17 11:02:23 -05:00
Stanislav Kinsbursky	deb4534f4f	nfsd: enable NFSv4 state in containers Currently, NFSd is ready to operate in network namespace based containers. So let's drop check for "init_net" and make it able to fly. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 11:21:02 -05:00
Stanislav Kinsbursky	71a5030693	nfsd: disable usermode helper client tracker in container This tracker uses khelper kthread to execute binaries. Execution itself is done from kthread context - i.e. global root is used. This is not suitable for containers with own root. So, disable this tracker for a while. Note: one of possible solutions can be pass "init" callback to khelper, which will swap root to desired one. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 11:21:01 -05:00
Stanislav Kinsbursky	96d851c4d2	nfsd: use proper net while reading "exports" file Functuon "exports_open" is used for both "/proc/fs/nfs/exports" and "/proc/fs/nfsd/exports" files. Now NFSd filesystem is containerised, so proper net can be taken from superblock for "/proc/fs/nfsd/exports" reader. But for "/proc/fs/nfsd/exports" only current->nsproxy->net_ns can be used. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 11:21:01 -05:00
Stanislav Kinsbursky	11f779421a	nfsd: containerize NFSd filesystem This patch makes NFSD file system superblock to be created per net. This makes possible to get proper network namespace from superblock instead of using hard-coded "init_net". Note: NFSd fs super-block holds network namespace. This garantees, that network namespace won't disappear from underneath of it. This, obviously, means, that in case of kill of a container's "init" (which is not a mount namespace, but network namespace creator) netowrk namespace won't be destroyed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 11:21:00 -05:00
Jeff Layton	1ac8362977	nfsd: fix comments on nfsd_cache_lookup Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 10:43:48 -05:00
Stanislav Kinsbursky	2d4383383b	SUNRPC: rework cache upcall logic For most of SUNRPC caches (except NFS DNS cache) cache_detail->cache_upcall is redundant since all that it's implementations are doing is calling sunrpc_cache_pipe_upcall() with proper function address argument. Cache request function address is now stored on cache_detail structure and thus all the code can be simplified. Now, for those cache details, which doesn't have cache_upcall callback (the only one, which still has is nfs_dns_resolve_template) sunrpc_cache_pipe_upcall will be called instead. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 10:43:46 -05:00
Stanislav Kinsbursky	73fb847a44	SUNRPC: introduce cache_detail->cache_request callback This callback will allow to simplify upcalls in further patches in this series. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 10:43:45 -05:00
Eric W. Biederman	6fab877900	nfsd: Properly compare and initialize kuids and kgids Use uid_eq(uid, GLOBAL_ROOT_UID) instead of !uid. Use gid_eq(gid, GLOBAL_ROOT_GID) instead of !gid. Use uid_eq(uid, INVALID_UID) instead of uid == -1 Use gid_eq(uid, INVALID_GID) instead of gid == -1 Use uid = GLOBAL_ROOT_UID instead of uid = 0; Use gid = GLOBAL_ROOT_GID instead of gid = 0; Use !uid_eq(uid1, uid2) instead of uid1 != uid2. Use !gid_eq(gid1, gid2) instead of gid1 != gid2. Use uid_eq(uid1, uid2) instead of uid1 == uid2. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:09 -08:00
Eric W. Biederman	4c1e1b34d5	nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:08 -08:00
Eric W. Biederman	03bc6d1cc1	nfsd: Modify nfsd4_cb_sec to use kuids and kgids Change uid and gid in struct nfsd4_cb_sec to be of type kuid_t and kgid_t. In nfsd4_decode_cb_sec when reading uids and gids off the wire convert them to kuids and kgids, and if they don't convert to valid kuids or valid kuids ignore RPC_AUTH_UNIX and don't fill in any of the fields. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:07 -08:00
Eric W. Biederman	ab8e4aee0a	nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion In struct nfs4_ace remove the member who and replace it with an anonymous union holding who_uid and who_gid. Allowing typesafe storage uids and gids. Add a helper pace_gt for sorting posix_acl_entries. In struct posix_user_ace_state to replace uid with a union of kuid_t uid and kgid_t gid. Remove all initializations of the deprecated posic_acl_entry e_id field. Which is not present when user namespaces are enabled. Split find_uid into two functions find_uid and find_gid that work in a typesafe manner. In nfs4xdr update nfsd4_encode_fattr to deal with the changes in struct nfs4_ace. Rewrite nfsd4_encode_name to take a kuid_t and a kgid_t instead of a generic id and flag if it is a group or a uid. Replace the group flag with a test for a valid gid. Modify nfsd4_encode_user to take a kuid_t and call the modifed nfsd4_encode_name. Modify nfsd4_encode_group to take a kgid_t and call the modified nfsd4_encode_name. Modify nfsd4_encode_aclname to take an ace instead of taking the fields of an ace broken out. This allows it to detect if the ace is for a user or a group and to pass the appropriate value while still being typesafe. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:06 -08:00
Eric W. Biederman	7c19723e99	nfsd: Convert nfsxdr to use kuids and kgids When reading uids and gids off the wire convert them to kuids and kgids. If the conversion results in an invalid result don't set the ATTR_UID or ATTR_GID. When putting kuids and kgids onto the wire first convert them to uids and gids the other side will understand. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:05 -08:00
Eric W. Biederman	458878a705	nfsd: Convert nfs3xdr to use kuids and kgids When reading uids and gids off the wire convert them to kuids and kgids. When putting kuids and kgids onto the wire first convert them to uids and gids the other side will understand. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:16:04 -08:00
Eric W. Biederman	e097258f2e	nfsd: Remove nfsd_luid, nfsd_lgid, nfsd_ruid and nfsd_rgid These trivial macros that don't currently do anything are the last vestiages of an old attempt at uid mapping that was removed from the kernel in September of 2002. Remove them to make it clear what the code is currently doing. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:15:51 -08:00
Eric W. Biederman	65e10f6d0a	nfsd: Convert idmap to use kuids and kgids Convert nfsd_map_name_to_uid to return a kuid_t value. Convert nfsd_map_name_to_gid to return a kgid_t value. Convert nfsd_map_uid_to_name to take a kuid_t parameter. Convert nfsd_map_gid_to_name to take a kgid_t paramater. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:15:49 -08:00
Eric W. Biederman	b5663898ec	nfsd: idmap use u32 not uid_t as the intermediate type u32 and uid_t have the same size and semantics so this change should have no operational effect. This just removes the WTF factor when looking at variables that hold both uids and gids whos type is uid_t. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:15:37 -08:00
Eric W. Biederman	6c1810e040	nfsd: Remove declaration of nonexistent nfs4_acl_permisison Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-02-13 06:15:35 -08:00
Fengguang Wu	e56a316214	nfsd4: free_stid can be static Reported-by: Fengguang Wu <fengguang.wu@intel.com>	2013-02-11 16:22:50 -05:00
Jeff Layton	01a7decf75	nfsd: keep a checksum of the first 256 bytes of request Now that we're allowing more DRC entries, it becomes a lot easier to hit problems with XID collisions. In order to mitigate those, calculate a checksum of up to the first 256 bytes of each request coming in and store that in the cache entry, along with the total length of the request. This initially used crc32, but Chuck Lever and Jim Rees pointed out that crc32 is probably more heavyweight than we really need for generating these checksums, and recommended looking at using the same routines that are used to generate checksums for IP packets. On an x86_64 KVM guest measurements with ftrace showed ~800ns to use csum_partial vs ~1750ns for crc32. The difference probably isn't terribly significant, but for now we may as well use csum_partial. Signed-off-by: Jeff Layton <jlayton@redhat.com> Stones-thrown-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-08 16:02:26 -05:00
Jeff Layton	5976687a2b	sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-05 09:41:14 -05:00
J. Bruce Fields	3abdb60712	nfsd4: simplify idr allocation We don't really need to preallocate at all; just allocate and initialize everything at once, but leave the sc_type field initially 0 to prevent finding the stateid till it's fully initialized. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-05 09:41:12 -05:00
majianpeng	2d32b29a1c	nfsd: Fix memleak When free nfs-client, it must free the ->cl_stateids. Cc: stable@kernel.org Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-05 09:40:47 -05:00
Jeff Layton	b4e7f2c945	nfsd: register a shrinker for DRC cache entries Since we dynamically allocate them now, allow the system to call us up to release them if it gets low on memory. Since these entries aren't replaceable, only free ones that are expired or that are over the cap. The the seeks value is set to '1' however to indicate that freeing the these entries is low-cost. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:13 -05:00
Jeff Layton	aca8a23de6	nfsd: add recurring workqueue job to clean the cache It's not sufficient to only clean the cache when requests come in. What if we have a flurry of activity and then the server goes idle? Add a workqueue job that will clean the cache every RC_EXPIRE period. Care is taken to only run this when we expect to have entries expiring. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:12 -05:00
Jeff Layton	2c6b691c05	nfsd: when updating an entry with RC_NOCACHE, just free it There's no need to keep entries around that we're declaring RC_NOCACHE. Ditto if there's a problem with the entry. With this change too, there's no need to test for RC_UNUSED in the search function. If the entry's in the hash table then it's either INPROG or DONE. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:11 -05:00
Jeff Layton	13cc8a78e8	nfsd: remove the cache_disabled flag With the change to dynamically allocate entries, the cache is never disabled on the fly. Remove this flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:11 -05:00
Jeff Layton	0338dd1572	nfsd: dynamically allocate DRC entries The existing code keeps a fixed-size cache of 1024 entries. This is much too small for a busy server, and wastes memory on an idle one. This patch changes the code to dynamically allocate and free these cache entries. A cap on the number of entries is retained, but it's much larger than the existing value and now scales with the amount of low memory in the machine. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:10 -05:00
Jeff Layton	0ee0bf7ee5	nfsd: track the number of DRC entries in the cache Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:09 -05:00
Jeff Layton	56c2548b2d	nfsd: always move DRC entries to the end of LRU list when updating timestamp ...otherwise, we end up with the list ordering wrong. Currently, it's not a problem since we skip RC_INPROG entries, but keeping the ordering strict will be necessary for a later patch that adds a cache cleaner. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 17:19:09 -05:00
Jeff Layton	2eeb9b2abc	nfsd: initialize the exp->ex_uuid field in svc_export_init commit `885c91f746` in Bruce's tree was causing oopses for me: general protection fault: 0000 [#1] SMP Modules linked in: nfsd(OF) nfs_acl(OF) auth_rpcgss(OF) lockd(OF) sunrpc(OF) kvm_amd kvm microcode i2c_piix4 virtio_net virtio_balloon cirrus drm_kms_helper ttm drm virtio_blk i2c_core CPU 0 Pid: 564, comm: exportfs Tainted: GF O 3.8.0-0.rc5.git2.1.fc19.x86_64 #1 Bochs Bochs RIP: 0010:[<ffffffff811b1509>] [<ffffffff811b1509>] kfree+0x49/0x280 RSP: 0018:ffff88007a3d7c50 EFLAGS: 00010203 RAX: 01adaf8dadadad80 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001 RDX: ffffffff7fffffff RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b6b RBP: ffff88007a3d7c80 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000 R10: 0000000000000018 R11: 0000000000000000 R12: ffff88006a117b50 R13: ffffffffa01a589c R14: ffff8800631b0f50 R15: 01ad998dadadad80 FS: 00007fcaa3616740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f5d84b6fdd8 CR3: 0000000064db4000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process exportfs (pid: 564, threadinfo ffff88007a3d6000, task ffff88006af28000) Stack: ffff88007a3d7c80 ffff88006a117b68 ffff88006a117b50 0000000000000000 ffff8800631b0f50 ffff88006a117b50 ffff88007a3d7ca0 ffffffffa01a589c ffff880036be1148 ffff88007a3d7cf8 ffff88007a3d7e28 ffffffffa01a6a98 Call Trace: [<ffffffffa01a589c>] svc_export_put+0x5c/0x70 [nfsd] [<ffffffffa01a6a98>] svc_export_parse+0x328/0x7e0 [nfsd] [<ffffffffa016f1c7>] cache_do_downcall+0x57/0x70 [sunrpc] [<ffffffffa016f25e>] cache_downcall+0x7e/0x100 [sunrpc] [<ffffffffa016f338>] cache_write_procfs+0x58/0x90 [sunrpc] [<ffffffffa016f2e0>] ? cache_downcall+0x100/0x100 [sunrpc] [<ffffffff8123b0e5>] proc_reg_write+0x75/0xb0 [<ffffffff811ccecf>] vfs_write+0x9f/0x170 [<ffffffff811cd089>] sys_write+0x49/0xa0 [<ffffffff816e0919>] system_call_fastpath+0x16/0x1b Code: 66 66 66 90 48 83 fb 10 0f 86 c3 00 00 00 48 89 df 49 bf 00 00 00 00 00 ea ff ff e8 f2 12 ea ff 48 c1 e8 0c 48 c1 e0 06 49 01 c7 <49> 8b 07 f6 c4 80 0f 85 1d 02 00 00 49 8b 07 a8 80 0f 84 ee 01 RIP [<ffffffff811b1509>] kfree+0x49/0x280 RSP <ffff88007a3d7c50> I think Majianpeng's patch is correct, but incomplete. In order for it to be safe to free the ex_uuid unconditionally in svc_export_put, we need to make sure it's initialized to NULL in the init routine. Cc: majianpeng <majianpeng@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:24 -05:00
Jeff Layton	a4a3ec3291	nfsd: break out hashtable search into separate function Later, we'll need more than one call site for this, so break it out into a new function. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:24 -05:00
Jeff Layton	d1a0774de6	nfsd: clean up and clarify the cache expiration code Add a preprocessor constant for the expiry time of cache entries, and move the test for an expired entry into a function. Note that the current code does not test for RC_INPROG. It just assumes that it won't take more than 2 minutes to fill out an in-progress entry. I'm not sure how valid that assumption is though, so let's just ensure that we never consider an RC_INPROG entry to be expired. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:23 -05:00
Jeff Layton	25e6b8b0e1	nfsd: remove redundant test from nfsd_reply_cache_free Entries can only get a c_type of RC_REPLBUFF iff they are RC_DONE. Therefore the test for RC_DONE isn't necessary here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:22 -05:00
Jeff Layton	f09841fdfa	nfsd: add alloc and free functions for DRC entries Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:22 -05:00
Jeff Layton	8a8bc40d9b	nfsd: create a dedicated slabcache for DRC entries Currently we use kmalloc() which wastes a little bit of memory on each allocation since it's a power of 2 allocator. Since we're allocating a 1024 of these now, and may need even more later, let's create a new slabcache for them. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:21 -05:00
Jeff Layton	09662d58d5	nfsd: get rid of RC_INTR The reply cache code never returns this status. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:20 -05:00
Jeff Layton	6dc8889589	nfsd: remove unneeded spinlock in nfsd_cache_update The locking rules for cache entries say that locking the cache_lock isn't needed if you're just touching the current entry. Earlier in this function we set rp->c_state to RC_UNUSED without any locking, so I believe it's ok to do the same here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:19 -05:00
Jeff Layton	7b9e8522a6	nfsd: fix IPv6 address handling in the DRC Currently, it only stores the first 16 bytes of any address. struct sockaddr_in6 is 28 bytes however, so we're currently ignoring the last 12 bytes of the address. Expand the c_addr field to a sockaddr_in6, and cast it to a sockaddr_in as necessary. Also fix the comparitor to use the existing RPC helpers for this. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-04 09:16:19 -05:00
majianpeng	885c91f746	nfsd: Fix memleak in svc_export_put In func svc_export_parse, the uuid which used kmemdup to alloc will be changed in func export_update.So the later kfree don't free this memory. And it can't be free in func svc_export_parse because other place still used.So put this operation in func svc_export_put. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-29 16:50:03 -05:00
J. Bruce Fields	ff89be87c7	nfsd4: require version 4 when enabling or disabling minorversion The current code will allow silly things like: echo "+2 +3 +4 +7.1">/proc/fs/nfsd/versions Reported-by: Fan Chaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-23 18:25:01 -05:00
Stanislav Kinsbursky	bca0ec6511	nfsd: fix unused "nn" variable warning in free_client() If CONFIG_LOCKDEP is disabled, then there would be a warning like this: CC [M] fs/nfsd/nfs4state.o fs/nfsd/nfs4state.c: In function ‘free_client’: fs/nfsd/nfs4state.c:1051:19: warning: unused variable ‘nn’ [-Wunused-variable] So, let's add "maybe_unused" tag to this variable. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-23 18:17:40 -05:00
Yanchuan Nian	266533c6df	nfsd: Don't unlock the state while it's not locked In the procedure of CREATE_SESSION, the state is locked after alloc_conn_from_crses(). If the allocation fails, the function goes to "out_free_session", and then "out" where there is an unlock function. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-23 18:17:37 -05:00
Yanchuan Nian	74b70dded3	nfsd: Pass correct slot number to nfsd4_put_drc_mem() In alloc_session(), numslots is the correct slot number used by the session. But the slot number passed to nfsd4_put_drc_mem() is the one from nfs client. Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-23 18:17:36 -05:00
J. Bruce Fields	84822d0b3b	nfsd4: simplify nfsd4_encode_fattr interface slightly It seems slightly simpler to make nfsd4_encode_fattr rather than its callers responsible for advancing the write pointer on success. (Also: the count == 0 check in the verify case looks superfluous. Running out of buffer space is really the only reason fattr encoding should fail with eresource.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-01-23 18:17:35 -05:00
Kees Cook	f987c90257	fs/nfsd: remove depends on CONFIG_EXPERIMENTAL The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. CC: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-21 14:39:05 -08:00
J. Bruce Fields	10532b560b	Revert "nfsd: warn on odd reply state in nfsd_vfs_read" This reverts commit `79f77bf9a4`. This is obviously wrong, and I have no idea how I missed seeing the warning in testing: I must just not have looked at the right logs. The caller bumps rq_resused/rq_next_page, so it will always be hit on a large enough read. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-21 17:07:45 -08:00
J. Bruce Fields	24ffb93872	nfsd4: don't leave freed stateid hashed Note the stateid is hashed early on in init_stid(), but isn't currently being unhashed on error paths. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:28 -05:00
J. Bruce Fields	a1dc695582	nfsd4: free_stateid can use the current stateid Cc: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:27 -05:00
J. Bruce Fields	afc59400d6	nfsd4: cleanup: replace rq_resused count by rq_next_page pointer It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:16 -05:00
J. Bruce Fields	79f77bf9a4	nfsd: warn on odd reply state in nfsd_vfs_read As far as I can tell this shouldn't currently happen--or if it does, something is wrong and data is going to be corrupted. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 21:55:46 -05:00
J. Bruce Fields	d5f50b0c29	nfsd4: fix oops on unusual readlike compound If the argument and reply together exceed the maximum payload size, then a reply with a read-like operation can overlow the rq_pages array. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 21:55:21 -05:00
J. Bruce Fields	9b3234b922	nfsd4: disable zero-copy on non-final read ops To ensure ordering of read data with any following operations, turn off zero copy if the read is not the final operation in the compound. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 16:02:41 -05:00
Bryan Schumaker	18d9a2ca2e	NFSD: Correct the size calculation in fault_inject_write If len == 0 we end up with size = (0 - 1), which could cause bad things to happen in copy_from_user(). Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 18:24:22 -05:00
Bryan Schumaker	0a5c33e23c	NFSD: Pass correct buffer size to rpc_ntop I honestly have no idea where I got 129 from, but it's a much bigger value than the actual buffer size (INET6_ADDRSTRLEN). Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 18:24:21 -05:00
Stanislav Kinsbursky	88c4766617	nfsd: pass proper net to nfsd_destroy() from NFSd kthreads Since NFSd service is per-net now, we have to pass proper network context in nfsd_shutdown() from NFSd kthreads. The simplest way I found is to get proper net from one of transports with permanent sockets. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:42 -05:00
Stanislav Kinsbursky	541e864f00	nfsd: simplify service shutdown Function nfsd_shutdown is called from two places: nfsd_last_thread (when last kernel thread is exiting) and nfsd_svc (in case of kthreads starting error). When calling from nfsd_svc(), we can be sure that per-net resources are allocated, so we don't need to check per-net nfsd_net_up boolean flag. This allows us to remove nfsd_shutdown function at all and move check for per-net nfsd_net_up boolean flag to nfsd_last_thread. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:42 -05:00
Stanislav Kinsbursky	4539f14981	nfsd: replace boolean nfsd_up flag by users counter Since we have generic NFSd resurces, we have to introduce some way how to allocate and destroy those resources on first per-net NFSd start and on last per-net NFSd stop respectively. This patch replaces global boolean nfsd_up flag (which is unused now) by users counter and use it to determine either we need to allocate generic resources or destroy them. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:41 -05:00
Stanislav Kinsbursky	903d9bf0ed	nfsd: simplify NFSv4 state init and shutdown This patch moves nfsd_startup_generic() and nfsd_shutdown_generic() calls to nfsd_startup_net() and nfsd_shutdown_net() respectively, which allows us to call nfsd_startup_net() instead of nfsd_startup() and makes the code look clearer. It also modifies nfsd_svc() and nfsd_shutdown() to check nn->nfsd_net_up instead of global nfsd_up. The latter is now used only for generic resources shutdown and is currently useless. It will replaced by NFSd users counter later in this series. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:40 -05:00
Stanislav Kinsbursky	bda9cac1db	nfsd: introduce helpers for generic resources init and shutdown NFSd have per-net resources and resources, used globally. Let's move generic resources init and shutdown to separated functions since they are going to be allocated on first NFSd service start and destroyed after last NFSd service shutdown. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:39 -05:00
Stanislav Kinsbursky	9dd9845f08	nfsd: make NFSd service structure allocated per net This patch makes main step in NFSd containerisation. There could be different approaches to how to make NFSd able to handle incoming RPC request from different network namespaces. The two main options are: 1) Share NFSd kthreads betwween all network namespaces. 2) Create separated pool of threads for each namespace. While first approach looks more flexible, second one is simpler and non-racy. This patch implements the second option. To make it possible to allocate separate pools of threads, we have to make it possible to allocate separate NFSd service structures per net. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:39 -05:00
Stanislav Kinsbursky	b9c0ef8571	nfsd: make NFSd service boot time per-net This is simple: an NFSd service can be started at different times in different network environments. So, its "boot time" has to be assigned per net. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:38 -05:00
Stanislav Kinsbursky	2c2fe2909e	nfsd: per-net NFSd up flag introduced This patch introduces introduces per-net "nfsd_net_up" boolean flag, which has the same purpose as general "nfsd_up" flag - skip init or shutdown of per-net resources in case of they are inited on shutted down respectively. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:37 -05:00
Stanislav Kinsbursky	6ff50b3dea	nfsd: move per-net startup code to separated function NFSd resources are partially per-net and partially globally used. This patch splits resources init and shutdown and moves per-net code to separated functions. Generic and per-net init and shutdown are called sequentially for a while. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:36 -05:00
Stanislav Kinsbursky	081603520b	nfsd: pass net to __write_ports() and down Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:36 -05:00
Stanislav Kinsbursky	3938a0d5eb	nfsd: pass net to nfsd_set_nrthreads() Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:35 -05:00
Stanislav Kinsbursky	d41a9417cd	nfsd: pass net to nfsd_svc() Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:34 -05:00
Stanislav Kinsbursky	6777436b0f	nfsd: pass net to nfsd_create_serv() Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:34 -05:00
Stanislav Kinsbursky	db42d1a76a	nfsd: pass net to nfsd_startup() and nfsd_shutdown() Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:33 -05:00
Stanislav Kinsbursky	db6e182c17	nfsd: pass net to nfsd_init_socks() Precursor patch. Hard-coded "init_net" will be replaced by proper one in future. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:32 -05:00
Stanislav Kinsbursky	f7fb86c6e6	nfsd: use "init_net" for portmapper There could be a situation, when NFSd was started in one network namespace, but stopped in another one. This will trigger kernel panic, because RPCBIND client is stored on per-net NFSd data, and will be NULL on NFSd shutdown. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:32 -05:00
Neil Brown	7007c90fb9	nfsd: avoid permission checks on EXCLUSIVE_CREATE replay With NFSv4, if we create a file then open it we explicit avoid checking the permissions on the file during the open because the fact that we created it ensures we should be allow to open it (the create and the open should appear to be a single operation). However if the reply to an EXCLUSIVE create gets lots and the client resends the create, the current code will perform the permission check - because it doesn't realise that it did the open already.. This patch should fix this. Note that I haven't actually seen this cause a problem. I was just looking at the code trying to figure out a different EXCLUSIVE open related issue, and this looked wrong. (Fix confirmed with pynfs 4.0 test OPEN4--bfields) Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> [bfields: use OWNER_OVERRIDE and update for 4.1] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:31 -05:00
Stanislav Kinsbursky	9a9c6478a8	nfsd: make NFSv4 recovery client tracking options per net Pointer to client tracking operations - client_tracking_ops - have to be containerized, because different environment can support different trackers (for example, legacy tracker currently is not suported in container). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-10 16:25:30 -05:00
J. Bruce Fields	9b2ef62b15	nfsd4: lockt, release_lockowner should renew clients Fix nfsd4_lockt and release_lockowner to lookup the referenced client, so that it can renew it, or correctly return "expired", as appropriate. Also share some code while we're here. Reported-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:51:12 -05:00
Bryan Schumaker	6c1e82a4b7	NFSD: Forget state for a specific client Write the client's ip address to any state file and all appropriate state for that client will be forgotten. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:59:03 -05:00
Bryan Schumaker	d7cc431edd	NFSD: Add a custom file operations structure for fault injection Controlling the read and write functions allows me to add in "forget client w.x.y.z", since we won't be limited to reading and writing only u64 values. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:59:02 -05:00
Bryan Schumaker	184c18471f	NFSD: Reading a fault injection file prints a state count I also log basic information that I can figure out about the type of state (such as number of locks for each client IP address). This can be useful for checking that state was actually dropped and later for checking if the client was able to recover. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:59:01 -05:00
Bryan Schumaker	8ce54e0d82	NFSD: Fault injection operations take a per-client forget function The eventual goal is to forget state based on ip address, so it makes sense to call this function in a for-each-client loop until the correct amount of state is forgotten. I also use this patch as an opportunity to rename the forget function from "func()" to "forget()". Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:59:00 -05:00
Bryan Schumaker	269de30f10	NFSD: Clean up forgetting and recalling delegations Once I have a client, I can easily use its delegation list rather than searching the file hash table for delegations to remove. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:58:59 -05:00
Bryan Schumaker	4dbdbda84f	NFSD: Clean up forgetting openowners Using "forget_n_state()" forces me to implement the code needed to forget a specific client's openowners. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:58:58 -05:00
Bryan Schumaker	fc29171f5b	NFSD: Clean up forgetting locks I use the new "forget_n_state()" function to iterate through each client first when searching for locks. This may slow down forgetting locks a little bit, but it implements most of the code needed to forget a specified client's locks. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:58:56 -05:00
Bryan Schumaker	44e34da60b	NFSD: Clean up forgetting clients I added in a generic for-each loop that takes a pass over the client_lru list for the current net namespace and calls some function. The next few patches will update other operations to use this function as well. A value of 0 still means "forget everything that is found". Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:58:55 -05:00
Bryan Schumaker	043958395a	NFSD: Lock state before calling fault injection function Each function touches state in some way, so getting the lock earlier can help simplify code. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:58:54 -05:00
J. Bruce Fields	e5f9570319	nfsd4: discard some unused nfsd4_verify xdr code Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-03 09:43:51 -05:00
Bryan Schumaker	f3c7521fe5	NFSD: Fold fault_inject.h into state.h There were only a small number of functions in this file and since they all affect stored state I think it makes sense to put them in state.h instead. I also dropped most static inline declarations since there are no callers when fault injection is not enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 13:01:02 -05:00
Stanislav Kinsbursky	5284b44e43	nfsd: make NFSv4 grace time per net Grace time is a part of NFSv4 state engine, which is constructed per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:39:47 -05:00
Stanislav Kinsbursky	3d7337115d	nfsd: make NFSv4 lease time per net Lease time is a part of NFSv4 state engine, which is constructed per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:39:46 -05:00
Stanislav Kinsbursky	864aee5c6f	nfsd: remove redundant declarations This is a cleanup patch. Functions nfsd_pool_stats_open() and nfsd_pool_stats_release() are declared in fs/nfsd/nfsd.h. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:55 -05:00
Stanislav Kinsbursky	f141f79d70	nfsd: recovery - make in_grace per net Flag in_grace is a part of client tracking state, which is network namesapce aware. So let'a replace global static variable with per-net one. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:54 -05:00
Stanislav Kinsbursky	3a0733692f	nfsd: recovery - make rec_file per net Opening and closing of this file is done in client tracking init and exit operations. Client tracking is done in network namespace context already. So let's make this file opened and closed per network context - this will simlify it's management. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:53 -05:00
Stanislav Kinsbursky	f252bc6806	nfsd: call state init and shutdown twice Split NFSv4 state init and shutdown into two different calls: per-net one and generic one. Per-net cwinit/shutdown pair have to be called for any namespace, generic pair - only once on NSFd kthreads start and shutdown respectively. Refresh of diff-nfsd-call-state-init-twice Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:53 -05:00
Stanislav Kinsbursky	d85ed44305	nfsd: cleanup NFSd state start a bit This patch renames nfs4_state_start_net() into nfs4_state_create_net(), where get_net() now performed. Also it introduces new nfs4_state_start_net(), which is now responsible for state creation and initializing all per-net data and which is now called from nfs4_state_start(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:52 -05:00
Stanislav Kinsbursky	4dce0ac906	nfsd: cleanup NFSd state shutdown a bit This patch renames __nfs4_state_shutdown_net() into nfs4_state_shutdown_net(), __nfs4_state_shutdown() into nfs4_state_shutdown_net() and moves all network related shutdown operations to nfs4_state_shutdown_net(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:51 -05:00
Stanislav Kinsbursky	4e37a7c207	nfsd: make delegations shutdown network namespace aware NFSv4 delegations are stored in global list. But they are nfs4_client dependent, which is network namespace aware already. State shutdown and laundromat are done per network namespace as well. So, delegations unhash have to be done in network namespace context. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:50 -05:00
Stanislav Kinsbursky	c9a4962881	nfsd: make client_lock per net This lock protects the client lru list and session hash table, which are allocated per network namespace already. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:50 -05:00
Stanislav Kinsbursky	ec28e02ca5	nfsd4: remove state lock from nfs4_state_shutdown Protection of __nfs4_state_shutdown() with nfs4_lock_state() looks redundant. This function is called by the last NFSd thread on it's exit and state lock protects actually two functions (del_recall_lru is protected by recall_lock): 1) nfsd4_client_tracking_exit 2) __nfs4_state_shutdown_net "nfsd4_client_tracking_exit" doesn't require state lock protection, because it's state can be modified only by tracker callbacks. Here a re they: 1) create: is called only from nfsd4_proc_compound. 2) remove: is called from either nfsd4_proc_compound or nfs4_laundromat. 3) check: is called only from nfsd4_proc_compound. 4) grace_done; called only from nfs4_laundromat. nfsd4_proc_compound is called onll by NFSd kthread, which is exiting right now. nfs4_laundromat is called by laundry_wq. But laundromat_work was canceled already. "__nfs4_state_shutdown_net" also doesn't require state lock protection, because all NFSd kthreads are dead, and no race can happen with NFSd start, because "nfsd_up" flag is still set. Moreover, all Nfsd shutdown is protected with global nfsd_mutex. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:49 -05:00
J. Bruce Fields	dba88ba55a	nfsd4: remove state lock from nfsd4_load_reboot_recovery_data That function is only called under nfsd_mutex: we know that because the only caller is nfsd_svc, via nfsd_svc nfsd_startup nfs4_state_start nfsd4_client_tracking_init client_tracking_ops->init == nfsd4_load_reboot_recovery_data The shared state accessed here includes: - user_recovery_dirname: used here, modified only by nfs4_reset_recoverydir, which can be verified to only be called under nfsd_mutex. - filesystem state, protected by i_mutex (handwaving slightly here) - rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other than here, used only from code called from nfsd or laundromat threads, both of which should be started only after this runs (see nfsd_svc) and stopped before this could run again (see nfsd_shutdown, called from nfsd_last_thread). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-28 10:13:48 -05:00
J. Bruce Fields	a36b1725b3	nfsd4: return badname, not inval, on "." or "..", or "/" The spec requires badname, not inval, in these cases. Some callers want us to return enoent, but I can see no justification for that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-27 16:41:48 -05:00
J. Bruce Fields	063b0fb9fa	nfsd4: downgrade some fs/nfsd/nfs4state.c BUG's Linus has pointed out that indiscriminate use of BUG's can make it harder to diagnose bugs because they can bring a machine down, often before we manage to get any useful debugging information to the logs. (Consider, for example, a BUG() that fires in a workqueue, or while holding a spinlock). Most of these BUG's won't do much more than kill an nfsd thread, but it would still probably be safer to get out the warning without dying. There's still more of this to do in nfsd/. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:16 -05:00
J. Bruce Fields	ffe1137ba7	nfsd4: delay filling in write iovec array till after xdr decoding Our server rejects compounds containing more than one write operation. It's unclear whether this is really permitted by the spec; with 4.0, it's possibly OK, with 4.1 (which has clearer limits on compound parameters), it's probably not OK. No client that we're aware of has ever done this, but in theory it could be useful. The source of the limitation: we need an array of iovecs to pass to the write operation. In the worst case that array of iovecs could have hundreds of elements (the maximum rwsize divided by the page size), so it's too big to put on the stack, or in each compound op. So we instead keep a single such array in the compound argument. We fill in that array at the time we decode the xdr operation. But we decode every op in the compound before executing any of them. So once we've used that array we can't decode another write. If we instead delay filling in that array till the time we actually perform the write, we can reuse it. Another option might be to switch to decoding compound ops one at a time. I considered doing that, but it has a number of other side effects, and I'd rather fix just this one problem for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:15 -05:00
J. Bruce Fields	70cc7f75b1	nfsd4: move more write parameters into xdr argument In preparation for moving some of this elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:14 -05:00
J. Bruce Fields	5a80a54d21	nfsd4: reorganize write decoding In preparation for moving some of it elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:14 -05:00
J. Bruce Fields	8a61b18c9b	nfsd4: simplify reading of opnum The comment here is totally bogus: - OP_WRITE + 1 is RELEASE_LOCKOWNER. Maybe there was some older version of the spec in which that served as a sort of OP_ILLEGAL? No idea, but it's clearly wrong now. - In any case, I can't see that the spec says anything about what to do if the client sends us less ops than promised. It's clearly nutty client behavior, and we should do whatever's easiest: returning an xdr error (even though it won't be consistent with the error on the last op returned) seems fine to me. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:13 -05:00
J. Bruce Fields	447bfcc936	nfsd4: no, we're not going to check tags for utf8 Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:08:12 -05:00
J. Bruce Fields	57d276d71a	nfsd: fix v4 reply caching Very embarassing: `1091006c5e` "nfsd: turn on reply cache for NFSv4" missed a line, effectively leaving the reply cache off in the v4 case. I thought I'd tested that, but I guess not. This time, wrote a pynfs test to confirm it works. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-26 09:05:19 -05:00
Stanislav Kinsbursky	0912128149	nfsd: make laundromat network namespace aware This patch moves laundromat_work to nfsd per-net context, thus allowing to run multiple laundries. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:51 -05:00
Stanislav Kinsbursky	12760c6685	nfsd: pass nfsd_net instead of net to grace enders Passing net context looks as overkill. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:50 -05:00
Stanislav Kinsbursky	3320fef19b	nfsd: use service net instead of hard-coded init_net This patch replaces init_net by SVC_NET(), where possible and also passes proper context to nested functions where required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:50 -05:00
Stanislav Kinsbursky	73758fed71	nfsd: make close_lru list per net This list holds nfs4 clients (open) stateowner queue for last close replay, which are network namespace aware. So let's make this list per network namespace too. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:49 -05:00
Stanislav Kinsbursky	5ed58bb243	nfsd: make client_lru list per net This list holds nfs4 clients queue for lease renewal, which are network namespace aware. So let's make this list per network namespace too. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:48 -05:00
Stanislav Kinsbursky	1872de0e81	nfsd: make sessionid_hashtbl allocated per net This hash holds established sessions state and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:47 -05:00
Stanislav Kinsbursky	20e9e2bc98	nfsd: make lockowner_ino_hashtbl allocated per net This hash holds file lock owners and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:47 -05:00
Stanislav Kinsbursky	9b53113740	nfsd: make ownerstr_hashtbl allocated per net This hash holds open owner state and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:46 -05:00
Stanislav Kinsbursky	a99454aa4f	nfsd: make unconf_name_tree per net This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:45 -05:00
Stanislav Kinsbursky	0a7ec37727	nfsd: make unconf_id_hashtbl allocated per net This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:45 -05:00
Stanislav Kinsbursky	382a62e76c	nfsd: make conf_name_tree per net This tree holds nfs4_clients info, which are network namespace aware. So let's make it per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:44 -05:00
Stanislav Kinsbursky	8daae4dc0d	nfsd: make conf_id_hashtbl allocated per net This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:43 -05:00
Stanislav Kinsbursky	52e19c09a1	nfsd: make reclaim_str_hashtbl allocated per net This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash is used only by legacy tracker. So let's allocate hash in tracker init. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:43 -05:00
Stanislav Kinsbursky	c212cecfa2	nfsd: make nfs4_client network namespace dependent And use it's net where possible. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:42 -05:00
Stanislav Kinsbursky	7f2210fa6b	nfsd: use service net instead of hard-coded net where possible Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-15 07:40:41 -05:00
Fengguang Wu	2b4cf668a7	nfsd4: get_backchannel_cred should be static Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-14 11:23:00 -05:00
Fengguang Wu	135ae8270d	nfsd4: init_session should be declared static Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-14 11:23:00 -05:00
Jeff Layton	7e4f015d81	nfsd: release the legacy reclaimable clients list in grace_done The current code holds on to this list until nfsd is shut down, but it's never touched once the grace period ends. Release that memory back into the wild when the grace period ends. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:12 -05:00
Jeff Layton	2216d449a9	nfsd: get rid of cl_recdir field Remove the cl_recdir field from the nfs4_client struct. Instead, just compute it on the fly when and if it's needed, which is now only when the legacy client tracking code is in effect. The error handling in the legacy client tracker is also changed to handle the case where md5 is unavailable. In that case, we'll warn the admin with a KERN_ERR message and disable the client tracking. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	ac55fdc408	nfsd: move the confirmed and unconfirmed hlists to a rbtree The current code requires that we md5 hash the name in order to store the client in the confirmed and unconfirmed trees. Change it instead to store the clients in a pair of rbtrees, and simply compare the cl_names directly instead of hashing them. This also necessitates that we add a new flag to the clp->cl_flags field to indicate which tree the client is currently in. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	0ce0c2b5d2	nfsd: don't search for client by hash on legacy reboot recovery gracedone When nfsd starts, the legacy reboot recovery code creates a tracking struct for each directory in the v4recoverydir. When the grace period ends, it basically does a "readdir" on the directory again, and matches each dentry in there to an existing client id to see if it should be removed or not. If the matching client doesn't exist, or hasn't reclaimed its state then it will remove that dentry. This is pretty inefficient since it involves doing a lot of hash-bucket searching. It also means that we have to keep relying on being able to search for a nfs4_client by md5 hashed cl_recdir name. Instead, add a pointer to the nfs4_client that indicates the association between the nfs4_client_reclaim and nfs4_client. When a reclaim operation comes in, we set the pointer to make that association. On gracedone, the legacy client tracker will keep the recdir around iff: 1/ there is a reclaim record for the directory ...and... 2/ there's an association between the reclaim record and a client record -- that is, a create or check operation was performed on the client that matches that directory. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	772a9bbbb5	nfsd: make nfs4_client_to_reclaim return a pointer to the reclaim record Later callers will need to make changes to the record. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	ce30e5392f	nfsd: break out reclaim record removal into separate function We'll need to be able to call this from nfs4recover.c eventually. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	278c931cb0	nfsd: have nfsd4_find_reclaim_client take a char * argument Currently, it takes a client pointer, but later we're going to need to search for these records without knowing whether a matching client even exists. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:11 -05:00
Jeff Layton	8b0554e9a2	nfsd: warn about impending removal of nfsdcld upcall Let's shoot for removing the nfsdcld upcall in 3.10. Most likely, no one is actually using it so I don't expect this warning to fire often (except maybe on misconfigured systems). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:10 -05:00
Jeff Layton	f3aa7e24c9	nfsd: pass info about the legacy recoverydir in environment variables The usermodehelper upcall program can then decide to use this info as a (one-way) transition mechanism to the new scheme. When a "check" upcall occurs and the client doesn't exist in the database, we can look to see whether the directory exists. If it does, then we'd add the client to the database, remove the legacy recdir, and return success to the kernel to allow the recovery to proceed. For gracedone, we simply pass the v4recovery "topdir" so that the upcall can clean it out prior to returning to the kernel. A module parm is also added to disable the legacy conversion if the admin chooses. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:10 -05:00
Jeff Layton	2d77bf0a55	nfsd: change heuristic for selecting the client_tracking_ops First, try to use the new usermodehelper upcall. It should succeed or fail quickly, so there's little cost to doing so. If it fails, and the legacy tracking dir exists, use that. If it doesn't exist then fall back to using nfsdcld. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:10 -05:00
Jeff Layton	2873d2147e	nfsd: add a usermodehelper upcall for NFSv4 client ID tracking Add a new client tracker upcall type that uses call_usermodehelper to call out to a program. This seems to be the preferred method of calling out to usermode these days for seldom-called upcalls. It's simple and doesn't require a running daemon, so it should "just work" as long as the binary is installed. The client tracking exit operation is also changed to check for a NULL pointer before running. The UMH upcall doesn't need to do anything at module teardown time. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-12 18:55:10 -05:00
Jeff Layton	a0af710a65	nfsd: remove unused argument to nfs4_has_reclaimed_state Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-10 14:56:54 -05:00
Jeff Layton	698d8d875a	nfsd: fix error handling in nfsd4_remove_clid_dir If the credential save fails, then we'll leak our mnt_want_write_file reference. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-10 14:52:03 -05:00
J. Bruce Fields	12fc3e92d4	nfsd4: backchannel should use client-provided security flavor For now this only adds support for AUTH_NULL. (Previously we assumed AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:40:05 -05:00
J. Bruce Fields	57725155dc	nfsd4: common helper to initialize callback work I've found it confusing having the only references to nfsd4_do_callback_rpc() in a different file. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:40:04 -05:00
J. Bruce Fields	cb73a9f464	nfsd4: implement backchannel_ctl operation This operation is mandatory for servers to implement. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:39:58 -05:00
J. Bruce Fields	c6bb3ca27d	nfsd4: use callback security parameters in create_session We're currently ignoring the callback security parameters specified in create_session, and just assuming the client wants auth_sys, because that's all the current linux client happens to care about. But this could cause us callbacks to fail to a client that wanted something different. For now, all we're doing is no longer ignoring the uid and gid passed in the auth_sys case. Further patches will add support for auth_null and gss (and possibly use more of the auth_sys information; the spec wants us to use exactly the credential we're passed, though it's hard to imagine why a client would care). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:35 -05:00
J. Bruce Fields	acb2887e04	nfsd4: clean up callback security parsing Move the callback parsing into a separate function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:35 -05:00
J. Bruce Fields	face15025f	nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes NFSv4 shares the same struct file across multiple writes. (And we'd like NFSv2 and NFSv3 to do that as well some day.) So setting O_SYNC on the struct file as a way to request a synchronous write doesn't work. Instead, do a vfs_fsync_range() in that case. Reported-by: Peter Staubach <pstaubach@exagrid.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:34 -05:00
J. Bruce Fields	fae5096ad2	nfsd: assume writeable exportabled filesystems have f_sync I don't really see how you could claim to support nfsd and not support fsync somehow. And in practice a quick look through the exportable filesystems suggests the only ones without an ->fsync are read-only (efs, isofs, squashfs) or in-memory (shmem). Also, performing a write and then returning an error if the sync fails (as we would do here in the wgather case) seems unhelpful to clients. Also remove an incorrect comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:33 -05:00
J. Bruce Fields	7fa10cd12d	nfsd4: don't BUG in delegation break callback These conditions would indeed indicate bugs in the code, but if we want to hear about them we're likely better off warning and returning than immediately dying while holding file_lock_lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:33 -05:00
J. Bruce Fields	7c1f8b65af	nfsd4: remove unused init_session return Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:31 -05:00
J. Bruce Fields	ae7095a7c4	nfsd4: helper function for getting mounted_on ino Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:31:31 -05:00
Yanchuan Nian	3c40794b2d	nfs: fix wrong object type in lockowner_slab The object type in the cache of lockowner_slab is wrong, and it is better to fix it. Cc: stable@vger.kernel.org Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:30:57 -05:00
Wei Yongjun	01f6c8fd94	nfsd4: remove unused variable in nfsd4_delegreturn() The variable inode is initialized but never used otherwise, so remove the unused variable. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:22:31 -05:00
Namjae Jeon	216b6cbdcb	exportfs: add FILEID_INVALID to indicate invalid fid_type This commit adds FILEID_INVALID = 0xff in fid_type to indicate invalid fid_type It avoids using magic number 255 Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-11-07 19:22:30 -05:00
J. Bruce Fields	f474af7051	UAPI Disintegration 2012-10-09 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAUHPmWxOxKuMESys7AQKN4w//XDwALfbf0MXIw+gwyRiUtJe9mGexvI6X 1R4FWU9a3ImzEZP4cWnmPGT2wmC/x007DcIvx8cyvbdlSuqtR2i/DC+HbWabiLRn nJS7Eer1BJvLv5dn6NmXMEz7yB4Z46+frcmBs3WQeR0sqBMDm+rjQzCqECznO8Jc VtCbox+VR2DuWcM++YECTblYEH3Z+doDXUN2eBaD8L9x3klPbPXD7OcRyOnry8w+ ynmUTKKyH4+hpxDakYrObPIg+vFCxb4QRck1mlgA4wbvb3eqjhM0oOCYJ8GvmILA vdFYztWCjkiuOl5djtXBlsClX8SAMOBYlRed+R1GvjNCSR+WCWrFJJ2F8qoQ1w87 9ts2/8qrozS8luTB475SkT2uLdJkIUKX89Oh+dWeE8YkbPnRPj5lNAdtNY5QSyDq VaRpIo+YfmZygyvHJQlAXBuZ0mvzcPzArfcPgSVTD3B7xTEGVu/45V7SnQX5os/V v39ySPXMdGOIdvK51gw7OtZl64uqrEKu39PyYDX/GUADflp/CHD0J7PJrQePbsH9 AQolVZDIxTfKqYQnUdL8+C8Zc24RowEzz3c2+aO89MSzwGqev3q8sXRVbW/Iqryg p+V3nHe+ipKcga5tOBlPr9KDtDd7j3xN2yaIwf5/QyO1OHBpjAZP1gjSVDcUcwpi svYy4kPn3PA= =etoL -----END PGP SIGNATURE----- nfs: disintegrate UAPI for nfs This is to complete part of the Userspace API (UAPI) disintegration for which the preparatory patches were pulled recently. After these patches, userspace headers will be segregated into: include/uapi/linux/.../foo.h for the userspace interface stuff, and: include/linux/.../foo.h for the strictly kernel internal stuff. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-09 18:35:22 -04:00
Linus Torvalds	aab174f0df	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...	2012-10-02 20:25:04 -07:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
J. Bruce Fields	0d22f68f02	nfsd4: don't allow reclaims of expired clients When a confirmed client expires, we normally also need to expire any stable storage record which would allow that client to reclaim state on the next boot. We forgot to do this in some cases. (For example, in destroy_clientid, and in the cases in exchange_id and create_session that destroy and existing confirmed client.) But in most other cases, there's really no harm to calling nfsd4_client_record_remove(), because it is a no-op in the case the client doesn't have an existing The single exception is destroying a client on shutdown, when we want to keep the stable storage records so we can recognize which clients will be allowed to reclaim when we come back up. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:04 -04:00
J. Bruce Fields	6a3b156342	nfsd4: remove redundant callback probe Both nfsd4_init_conn and alloc_init_session are probing the callback channel, harmless but pointless. Also, nfsd4_init_conn should probably be probing in the "unknown" case as well. In fact I don't see any harm to just doing it unconditionally when we get a new backchannel connection. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:03 -04:00
J. Bruce Fields	8f9d3d3b7c	nfsd4: expire old client earlier Before we had to delay expiring a client till we'd found out whether the session and connection allocations would succeed. That's no longer necessary. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:03 -04:00
J. Bruce Fields	81f0b2a496	nfsd4: separate session allocation and initialization This will allow some further simplification. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:02 -04:00
J. Bruce Fields	a827bcb242	nfsd4: clean up session allocation Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:01 -04:00
J. Bruce Fields	1377b69e68	nfsd4: minor free_session cleanup Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:00 -04:00
J. Bruce Fields	e1ff371f9d	nfsd4: new_conn_from_crses should only allocate Do the initialization in the caller, and clarify that the only failure ever possible here was due to allocation. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:40:00 -04:00
J. Bruce Fields	3ba6367124	nfsd4: separate connection allocation and initialization It'll be useful to have connection allocation and initialization as separate functions. Also, note we'd been ignoring the alloc_conn error return in bind_conn_to_session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:39:59 -04:00
J. Bruce Fields	4973050148	nfsd4: reject bad forechannel attrs earlier This could simplify the logic a little later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:39:58 -04:00
J. Bruce Fields	d15c077e44	nfsd4: enforce per-client sessions/no-sessions distinction Something like creating a client with setclientid and then trying to confirm it with create_session may not crash the server, but I'm not completely positive of that, and in any case it's obviously bad client behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:39:58 -04:00
J. Bruce Fields	c116a0af76	nfsd4: set cl_minorversion at create time And remove some mostly obsolete comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:39:57 -04:00
J. Bruce Fields	68eb35081e	nfsd4: don't pin clientids to pseudoflavors I added cr_flavor to the data compared in same_creds without any justification, in `d5497fc693` "nfsd4: move rq_flavor into svc_cred". Recent client changes then started making mount -osec=krb5 server:/export /mnt/ echo "hello" >/mnt/TMP umount /mnt/ mount -osec=krb5i server:/export /mnt/ echo "hello" >/mnt/TMP to fail due to a clid_inuse on the second open. Mounting sequentially like this with different flavors probably isn't that common outside artificial tests. Also, the real bug here may be that the server isn't just destroying the former clientid in this case (because it isn't good enough at recognizing when the old state is gone). But it prompted some discussion and a look back at the spec, and I think the check was probably wrong. Fix and document. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-01 17:39:14 -04:00
Al Viro	cb0942b812	make get_file() return its argument simplifies a bunch of callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:25 -04:00
J. Bruce Fields	6e67b5d184	nfsd4: fix bind_conn_to_session xdr comment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-25 13:26:42 -04:00
Eric W. Biederman	5f3a4a28ec	userns: Pass a userns parameter into posix_acl_to_xattr and posix_acl_from_xattr - Pass the user namespace the uid and gid values in the xattr are stored in into posix_acl_from_xattr. - Pass the user namespace kuid and kgid values should be converted into when storing uid and gid values in an xattr in posix_acl_to_xattr. - Modify all callers of posix_acl_from_xattr and posix_acl_to_xattr to pass in &init_user_ns. In the short term this change is not strictly needed but it makes the code clearer. In the longer term this change is necessary to be able to mount filesystems outside of the initial user namespace that natively store posix acls in the linux xattr format. Cc: Theodore Tso <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:35 -07:00
J. Bruce Fields	fac7a17b5f	nfsd4: cast readlink() bug argument As we already do in readv, writev. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:19 -04:00
Malahal Naineni	9959ba0c24	NFSD: pass null terminated buf to kstrtouint() The 'buf' is prepared with null termination with intention of using it for this purpose, but 'name' is passed instead! Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:19 -04:00
Namjae Jeon	8c8651b8e2	nfsd: remove duplicate init in nfsd4_cb_recall remove duplicate init in nfsd4_cb_recall Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:18 -04:00
J. Bruce Fields	ef79859e04	nfsd4: eliminate redundant nfs4_free_stateid Somehow we ended up with identical functions "nfs4_free_stateid" and "free_generic_stateid". Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:17 -04:00
Julia Lawall	92566e287d	fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR Change the call to PTR_ERR to access the value just tested by IS_ERR. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression e,e1; @@ ( if (IS_ERR(e)) { ... PTR_ERR(e) ... } \| if (IS_ERR(e=e1)) { ... PTR_ERR(e) ... } \| if (IS_ERR(e)) { ... PTR_ERR(e1) ... } ) // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 17:46:17 -04:00
J. Bruce Fields	eccf50c129	nfsd: remove unused listener-removal interfaces You can use nfsd/portlist to give nfsd additional sockets to listen on. In theory you can also remove listening sockets this way. But nobody's ever done that as far as I can tell. Also this was partially broken in 2.6.25, by `a217813f90` "knfsd: Support adding transports by writing portlist file". (Note that we decide whether to take the "delfd" case by checking for a digit--but what's actually expected in that case is something made by svc_one_sock_name(), which won't begin with a digit.) So, let's just rip out this stuff. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 10:55:19 -04:00
J. Bruce Fields	cf9182e90b	nfsd4: fix nfs4 stateid leak Processes that open and close multiple files may end up setting this oo_last_closed_stid without freeing what was previously pointed to. This can result in a major leak, visible for example by watching the nfsd4_stateids line of /proc/slabinfo. Reported-by: Cyril B. <cbay@excellency.fr> Tested-by: Cyril B. <cbay@excellency.fr> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 10:55:14 -04:00
J. Bruce Fields	5b444cc9a4	svcrpc: remove handling of unknown errors from svc_recv svc_recv() returns only -EINTR or -EAGAIN. If we really want to worry about the case where it has a bug that causes it to return something else, we could stick a WARN() in svc_recv. But it's silly to require every caller to have all this boilerplate to handle that case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:42:00 -04:00
J. Bruce Fields	a10fded18e	nfsd: allow configuring nfsd to listen on 5-digit ports Note a 16-bit value can require up to 5 digits. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:07:50 -04:00
J. Bruce Fields	38af2cabb6	nfsd: remove redundant "port" argument "port" in all these functions is always NFS_PORT. nfsd can already be run on a nonstandard port using the "nfsd/portlist" interface. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:07:49 -04:00
Jeff Layton	21179d81f1	knfsd: don't allocate file_locks on the stack struct file_lock is pretty large and really ought not live on the stack. On my x86_64 machine, they're almost 200 bytes each. (gdb) p sizeof(struct file_lock) $1 = 192 ...allocate them dynamically instead. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 14:08:39 -04:00
Jeff Layton	5592a3f397	knfsd: remove bogus BUG_ON() call from nfsd4_locku The code checks for a NULL filp and handles it gracefully just before this BUG_ON. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 14:08:38 -04:00
J. Bruce Fields	da5c80a935	nfsd4: nfsd_process_n_delegations should be static Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 13:59:39 -04:00
Bryan Schumaker	24ff99c6fe	NFSD: Swap the struct nfs4_operation getter and setter stateid_setter should be matched to op_set_currentstateid, rather than op_get_currentstateid. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:53:25 -04:00
J. Bruce Fields	95c7a20aeb	nfsd: do_nfsd_create verf argument is a u32 The types here are actually a bit of a mess. For now cast as we do in the v4 case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:39:49 -04:00
J. Bruce Fields	87f26f9b08	nfsd4: declare nfs4_recoverydir properly Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:39:49 -04:00
J. Bruce Fields	9c0b0ff799	nfsd4: nfsaclsvc_encode_voidres static Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:39:49 -04:00
Jeff Layton	1696c47ce2	nfsd: trivial comment updates locks.c doesn't use the BKL anymore and there is no fi_perfile field. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:39:42 -04:00
J. Bruce Fields	39307655a1	nfsd4: fix security flavor of NFSv4.0 callback Commit `d5497fc693` "nfsd4: move rq_flavor into svc_cred" forgot to remove cl_flavor from the client, leaving two places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored. After that patch, the latter was the one that was updated, but the former was the one that the callback used. Symptoms were a long delay on utime(). This is because the utime() generated a setattr which recalled a delegation, but the cb_recall was ignored by the client because it had the wrong security flavor. Cc: stable@vger.kernel.org Tested-by: Jamie Heilman <jamie@audible.transient.net> Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:38:36 -04:00
Linus Torvalds	a0e881b7c1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second vfs pile from Al Viro: "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the deadlock reproduced by xfstests 068), symlink and hardlink restriction patches, plus assorted cleanups and fixes. Note that another fsfreeze deadlock (emergency thaw one) is not dealt with - the series by Fernando conflicts a lot with Jan's, breaks userland ABI (FIFREEZE semantics gets changed) and trades the deadlock for massive vfsmount leak; this is going to be handled next cycle. There probably will be another pull request, but that stuff won't be in it." Fix up trivial conflicts due to unrelated changes next to each other in drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c} * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) delousing target_core_file a bit Documentation: Correct s_umount state for freeze_fs/unfreeze_fs fs: Remove old freezing mechanism ext2: Implement freezing btrfs: Convert to new freezing mechanism nilfs2: Convert to new freezing mechanism ntfs: Convert to new freezing mechanism fuse: Convert to new freezing mechanism gfs2: Convert to new freezing mechanism ocfs2: Convert to new freezing mechanism xfs: Convert to new freezing code ext4: Convert to new freezing mechanism fs: Protect write paths by sb_start_write - sb_end_write fs: Skip atime update on frozen filesystem fs: Add freezing handling to mnt_want_write() / mnt_drop_write() fs: Improve filesystem freezing handling switch the protection of percpu_counter list to spinlock nfsd: Push mnt_want_write() outside of i_mutex btrfs: Push mnt_want_write() outside of i_mutex fat: Push mnt_want_write() outside of i_mutex ...	2012-08-01 10:26:23 -07:00
Linus Torvalds	08843b79fb	Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from J. Bruce Fields: "This has been an unusually quiet cycle--mostly bugfixes and cleanup. The one large piece is Stanislav's work to containerize the server's grace period--but that in itself is just one more step in a not-yet-complete project to allow fully containerized nfs service. There are a number of outstanding delegation, container, v4 state, and gss patches that aren't quite ready yet; 3.7 may be wilder." * 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (35 commits) NFSd: make boot_time variable per network namespace NFSd: make grace end flag per network namespace Lockd: move grace period management from lockd() to per-net functions LockD: pass actual network namespace to grace period management functions LockD: manage grace list per network namespace SUNRPC: service request network namespace helper introduced NFSd: make nfsd4_manager allocated per network namespace context. LockD: make lockd manager allocated per network namespace LockD: manage grace period per network namespace Lockd: add more debug to host shutdown functions Lockd: host complaining function introduced LockD: manage used host count per networks namespace LockD: manage garbage collection timeout per networks namespace LockD: make garbage collector network namespace aware. LockD: mark host per network namespace on garbage collect nfsd4: fix missing fault_inject.h include locks: move lease-specific code out of locks_delete_lock locks: prevent side-effects of locks_release_private before file_lock is initialized NFSd: set nfsd_serv to NULL after service destruction NFSd: introduce nfsd_destroy() helper ...	2012-07-31 14:42:28 -07:00
Jan Kara	4a55c1017b	nfsd: Push mnt_want_write() outside of i_mutex When mnt_want_write() starts to handle freezing it will get a full lock semantics requiring proper lock ordering. So push mnt_want_write() call consistently outside of i_mutex. CC: linux-nfs@vger.kernel.org CC: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-31 01:02:51 +04:00
Stanislav Kinsbursky	2c142baa7b	NFSd: make boot_time variable per network namespace NFSd's boot_time represents grace period start point in time. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	a51c84ed50	NFSd: make grace end flag per network namespace Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	5ccb0066f2	LockD: pass actual network namespace to grace period management functions Passed network namespace replaced hard-coded init_net Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	9695c7057f	SUNRPC: service request network namespace helper introduced This is a cleanup patch - makes code looks simplier. It replaces widely used rqstp->rq_xprt->xpt_net by introduced SVC_NET(rqstp). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:21 -04:00
Stanislav Kinsbursky	5e1533c788	NFSd: make nfsd4_manager allocated per network namespace context. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:21 -04:00
J. Bruce Fields	99dbb8fe09	nfsd4: fix missing fault_inject.h include Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:30:12 -04:00
Stanislav Kinsbursky	57c8b13e3c	NFSd: set nfsd_serv to NULL after service destruction In nfsd_destroy(): if (destroy) svc_shutdown_net(nfsd_serv, net); svc_destroy(nfsd_server); svc_shutdown_net(nfsd_serv, net) calls nfsd_last_thread(), which sets nfsd_serv to NULL, causing a NULL dereference on the following line. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-25 09:21:31 -04:00
Stanislav Kinsbursky	19f7e2ca44	NFSd: introduce nfsd_destroy() helper Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-25 09:21:30 -04:00
J. Bruce Fields	a007c4c3e9	nfsd: add get_uint for u32's I don't think there's a practical difference for the range of values these interfaces should see, but it would be safer to be unambiguous. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-25 09:18:27 -04:00
Stanislav Kinsbursky	a6d88f293e	NFSd: fix locking in nfsd_forget_delegations() This patch adds recall_lock hold to nfsd_forget_delegations() to protect nfsd_process_n_delegations() call. Also, looks like it would be better to collect delegations to some local on-stack list, and then unhash collected list. This split allows to simplify locking, because delegation traversing is protected by recall_lock, when delegation unhash is protected by client_mutex. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-25 09:18:27 -04:00
Vivek Trivedi	5559b50acd	nfsd4: fix cr_principal comparison check in same_creds This fixes a wrong check for same cr_principal in same_creds Introduced by `8fbba96e5b` "nfsd4: stricter cred comparison for setclientid/exchange_id". Cc: stable@vger.kernel.org Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-25 09:05:30 -04:00
Al Viro	765927b2d5	switch dentry_open() to struct path, make it grab references itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-23 00:01:29 +04:00
Al Viro	312b63fba9	don't pass nameidata * to vfs_create() all we want is a boolean flag, same as the method gets now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:34:50 +04:00
J. Bruce Fields	7f2e7dc0fd	nfsd: share some function prototypes Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-10 16:41:35 -04:00
J. Bruce Fields	d91d0b5690	nfsd: allow owner_override only for regular files We normally allow the owner of a file to override permissions checks on IO operations, since: - the client will take responsibility for doing an access check on open; - the permission checks offer no protection against malicious clients--if they can authenticate as the file's owner then they can always just change its permissions; - checking permission on each IO operation breaks the usual posix rule that permission is checked only on open. However, we've never allowed the owner to override permissions on readdir operations, even though the above logic would also apply to directories. I've never heard of this causing a problem, probably because a) simultaneously opening and creating a directory (with restricted mode) isn't possible, and b) opening a directory, then chmod'ing it, is rare. Our disallowal of owner-override on directories appears to be an accident, though--the readdir itself succeeds, and then we fail just because lookup_one_len() calls in our filldir methods fail. I'm not sure what the easiest fix for that would be. For now, just make this behavior obvious by denying the override right at the start. This also fixes some odd v4 behavior: with the rdattr_error attribute requested, it would perform the readdir but return an ACCES error with each entry. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-10 16:41:35 -04:00
J. Bruce Fields	74dbafaf5d	nfsd4: release openowners on free in >=4.1 case We don't need to keep openowners around in the >=4.1 case, because they aren't needed to handle CLOSE replays any more (that's a problem for sessions). And doing so causes unexpected failures on a subsequent destroy_clientid to fail. We probably also need something comparable for lock owners on last unlock. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-10 16:41:34 -04:00
J. Bruce Fields	2930d381d2	nfsd4: our filesystems are normally case sensitive Actually, xfs and jfs can optionally be case insensitive; we'll handle that case in later patches. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-10 15:20:57 -04:00
J. Bruce Fields	4af825041b	nfsd4: process_open2 cleanup Note we can simplify the error handling a little by doing the truncate earlier. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-20 08:59:43 -04:00
J. Bruce Fields	e1aaa8916f	nfsd4: nfsd4_lock() cleanup Share a little common logic. And note the comments here are a little out of date (e.g. we don't always create new state in the "new" case any more.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-20 08:59:42 -04:00
J. Bruce Fields	9068bed1a3	nfsd4: remove unnecessary comment For the most part readers of cl_cb_state only need a value that is "eventually" right. And the value is set only either 1) in response to some change of state, in which case it's set to UNKNOWN and then a callback rpc is sent to probe the real state, or b) in the handling of a response to such a callback. UNKNOWN is therefore always a "temporary" state, and for the other states we're happy to accept last writer wins. So I think we're OK here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-20 08:59:41 -04:00
Chuck Lever	7df302f75e	NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID According to RFC 5661, the TEST_STATEID operation is not allowed to return NFS4ERR_STALE_STATEID. In addition, RFC 5661 says: 15.1.16.5. NFS4ERR_STALE_STATEID (Error Code 10023) A stateid generated by an earlier server instance was used. This error is moot in NFSv4.1 because all operations that take a stateid MUST be preceded by the SEQUENCE operation, and the earlier server instance is detected by the session infrastructure that supports SEQUENCE. I triggered NFS4ERR_STALE_STATEID while testing the Linux client's NOGRACE recovery. Bruce suggested an additional test that could be useful to client developers. Lastly, RFC 5661, section 18.48.3 has this: o Special stateids are always considered invalid (they result in the error code NFS4ERR_BAD_STATEID). An explicit check is made for those state IDs to avoid printk noise. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-20 08:59:40 -04:00
Weston Andros Adamson	2411967305	nfsd: probe the back channel on new connections Initiate a CB probe when a new connection with the correct direction is added to a session (IFF backchannel is marked as down). Without this a BIND_CONN_TO_SESSION has no effect on the internal backchannel state, which causes the server to reply to every SEQUENCE op with the SEQ4_STATUS_CB_PATH_DOWN flag set until DESTROY_SESSION. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-20 08:59:39 -04:00
J. Bruce Fields	bc2df47a40	nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels Most frequent symptom was a BUG triggering in expire_client, with the server locking up shortly thereafter. Introduced by `508dc6e110` "nfsd41: free_session/free_client must be called under the client_lock". Cc: stable@kernel.org Cc: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-14 13:54:08 -04:00
Linus Torvalds	419f431949	Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux Pull the rest of the nfsd commits from Bruce Fields: "... and then I cherry-picked the remainder of the patches from the head of my previous branch" This is the rest of the original nfsd branch, rebased without the delegation stuff that I thought really needed to be redone. I don't like rebasing things like this in general, but in this situation this was the lesser of two evils. * 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits) nfsd4: fix, consolidate client_has_state nfsd4: don't remove rebooted client record until confirmation nfsd4: remove some dprintk's and a comment nfsd4: return "real" sequence id in confirmed case nfsd4: fix exchange_id to return confirm flag nfsd4: clarify that renewing expired client is a bug nfsd4: simpler ordering of setclientid_confirm checks nfsd4: setclientid: remove pointless assignment nfsd4: fix error return in non-matching-creds case nfsd4: fix setclientid_confirm same_cred check nfsd4: merge 3 setclientid cases to 2 nfsd4: pull out common code from setclientid cases nfsd4: merge last two setclientid cases nfsd4: setclientid/confirm comment cleanup nfsd4: setclientid remove unnecessary terms from a logical expression nfsd4: move rq_flavor into svc_cred nfsd4: stricter cred comparison for setclientid/exchange_id nfsd4: move principal name into svc_cred nfsd4: allow removing clients not holding state nfsd4: rearrange exchange_id logic to simplify ...	2012-06-01 08:32:58 -07:00
Linus Torvalds	a00b6151a2	Merge branch 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields. * 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits) nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek SUNRPC: split upcall function to extract reusable parts nfsd: allocate id-to-name and name-to-id caches in per-net operations. nfsd: make name-to-id cache allocated per network namespace context nfsd: make id-to-name cache allocated per network namespace context nfsd: pass network context to idmap init/exit functions nfsd: allocate export and expkey caches in per-net operations. nfsd: make expkey cache allocated per network namespace context nfsd: make export cache allocated per network namespace context nfsd: pass pointer to export cache down to stack wherever possible. nfsd: pass network context to export caches init/shutdown routines Lockd: pass network namespace to creation and destruction routines NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches nfsd: pass pointer to expkey cache down to stack wherever possible. nfsd: use hash table from cache detail in nfsd export seq ops nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops nfsd: use exp_put() for svc_export_cache put nfsd: use cache detail pointer from svc_export structure on cache put nfsd: add link to owner cache detail to svc_export structure nfsd: use passed cache_detail pointer expkey_parse() ...	2012-05-31 18:18:11 -07:00
J. Bruce Fields	6eccece90b	nfsd4: fix, consolidate client_has_state Whoops: first, I reimplemented the already-existing has_resources without noticing; second, I got the test backwards. I did pick a better name, though. Combine the two.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:39 -04:00
J. Bruce Fields	b9831b59f3	nfsd4: don't remove rebooted client record until confirmation In the NFSv4.1 client-reboot case we're currently removing the client's previous state in exchange_id. That's wrong--we should be waiting till the confirming create_session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:34 -04:00
J. Bruce Fields	32f16b3823	nfsd4: remove some dprintk's and a comment The comment is redundant, and if we really want dprintk's here they'd probably be better in the common (check-slot_seqid) code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:31 -04:00
J. Bruce Fields	778df3f0fe	nfsd4: return "real" sequence id in confirmed case The client should ignore the returned sequence_id in the case where the CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed case "1" is always the right response. So it shouldn't actually matter what we return here. We could continue returning 1 just to catch clients ignoring the spec here, but I'd rather be generous. Other things equal, returning the existing sequence_id seems more informative. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:27 -04:00
J. Bruce Fields	0f1ba0ef21	nfsd4: fix exchange_id to return confirm flag Otherwise nfsd4_set_ex_flags writes over the return flags. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:21 -04:00
J. Bruce Fields	7447758be7	nfsd4: clarify that renewing expired client is a bug This can't happen: - cl_time is zeroed only by unhash_client_locked, which is only ever called under both the state lock and the client lock. - every caller of renew_client() should have looked up a (non-expired) client and then called renew_client() all without dropping the state lock. - the only other caller of renew_client_locked() is release_session_client(), which first checks under the client_lock that the cl_time is nonzero. So make it clear that this is a bug, not something we handle. I can't quite bring myself to make this a BUG(), though, as there are a lot of renew_client() callers, and returning here is probably safer than a BUG(). We'll consider making it a BUG() after some more cleanup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:14 -04:00
J. Bruce Fields	90d700b779	nfsd4: simpler ordering of setclientid_confirm checks The cases here divide into two main categories: - if there's an uncomfirmed record with a matching verifier, then this is a "normal", succesful case: we're either creating a new client, or updating an existing one. - otherwise, this is a weird case: a replay, or a server reboot. Reordering to reflect that makes the code a bit more concise and the logic a lot easier to understand. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:07 -04:00
J. Bruce Fields	f3d03b9202	nfsd4: setclientid: remove pointless assignment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:06 -04:00
J. Bruce Fields	8695b90ac3	nfsd4: fix error return in non-matching-creds case Note CLID_INUSE is for the case where two clients are trying to use the same client-provided long-form client identifiers. But what we're looking at here is the server-returned shorthand client id--if those clash there's a bug somewhere. Fix the error return, pull the check out into common code, and do the check unconditionally in all cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:04 -04:00
J. Bruce Fields	788c1eba50	nfsd4: fix setclientid_confirm same_cred check New clients are created only by nfsd4_setclientid(), which always gives any new client a unique clientid. The only exception is in the "callback update" case, in which case it may create an unconfirmed client with the same clientid as a confirmed client. In that case it also checks that the confirmed client has the same credential. Therefore, it is pointless for setclientid_confirm to check whether a confirmed and unconfirmed client with the same clientid have matching credentials--they're guaranteed to. Instead, it should be checking whether the credential on the setclientid_confirm matches either of those. Otherwise, it could be anyone sending the setclientid_confirm. Granted, I can't see why anyone would, but still it's probalby safer to check. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:03 -04:00
J. Bruce Fields	34b232bb37	nfsd4: merge 3 setclientid cases to 2 Boy, is this simpler. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:02 -04:00
J. Bruce Fields	8f9307119d	nfsd4: pull out common code from setclientid cases Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:01 -04:00
J. Bruce Fields	ad72aae5ad	nfsd4: merge last two setclientid cases The code here is mostly the same. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:00 -04:00
J. Bruce Fields	63db46328a	nfsd4: setclientid/confirm comment cleanup Be a little more concise. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:59 -04:00
J. Bruce Fields	e98479b8d6	nfsd4: setclientid remove unnecessary terms from a logical expression Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	d5497fc693	nfsd4: move rq_flavor into svc_cred Move the rq_flavor into struct svc_cred, and use it in setclientid and exchange_id comparisons as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	8fbba96e5b	nfsd4: stricter cred comparison for setclientid/exchange_id The typical setclientid or exchange_id will probably be performed with a credential that maps to either root or nobody, so comparing just uid's is unlikely to be useful. So, use everything else we can get our hands on. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:57 -04:00
J. Bruce Fields	03a4e1f6dd	nfsd4: move principal name into svc_cred Instead of keeping the principal name associated with a request in a structure that's private to auth_gss and using an accessor function, move it to svc_cred. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:55 -04:00
J. Bruce Fields	631fc9ea05	nfsd4: allow removing clients not holding state RFC 5661 actually says we should allow an exchange_id to remove a matching client, even if the exchange_id comes from a different principal, if the victim client lacks any state. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:55 -04:00
J. Bruce Fields	136e658d62	nfsd4: rearrange exchange_id logic to simplify Minor cleanup: it's simpler to have separate code paths for the update and non-update cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:54 -04:00
J. Bruce Fields	2dbb269dfe	nfsd4: exchange_id cleanup: comments Make these comments a bit more concise and uniform. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:53 -04:00
J. Bruce Fields	83e08fd46c	nfsd4: exchange_id cleanup: local shorthands for repeated tests Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:52 -04:00
J. Bruce Fields	1a308118c2	nfsd4: allow an EXCHANGE_ID to kill a 4.0 client Following rfc 5661 section 2.4.1, we can permit a 4.1 client to remove an established 4.0 client's state. (But we don't allow updates.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:52 -04:00
J. Bruce Fields	ea236d0704	nfsd4: exchange_id: check creds before killing confirmed client We mustn't allow a client to destroy another client with established state unless it has the right credential. And some minor cleanup. (Note: our comparison of credentials is actually pretty bogus currently; that will need to be fixed in another patch.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:51 -04:00
J. Bruce Fields	2786cc3a05	nfsd4: exchange_id error cleanup There's no point to the dprintk here as the main proc_compound loop already does this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:50 -04:00
J. Bruce Fields	11ae681052	nfsd4: exchange_id has a pointless copy We just verified above that these two verifiers are already the same. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:50 -04:00
Weston Andros Adamson	5fb35a3a9b	nfsd: return 0 on reads of fault injection files debugfs read operations were returning the contents of an uninitialized u64. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:48 -04:00
Jeff Layton	ce0fc43c5a	nfsd: wrap all accesses to st_deny_bmap Handle the st_deny_bmap in a similar fashion to the st_access_bmap. Add accessor functions and use those instead of bare bitops. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:48 -04:00
Jeff Layton	82c5ff1b14	nfsd: wrap accesses to st_access_bmap Currently, we do this for the most part with "bare" bitops, but eventually we'll need to expand the share mode code to handle access and deny modes on other nodes. In order to facilitate that code in the future, move to some generic accessor functions. For now, these are mostly static inlines, but eventually we'll want to move these to "real" functions that are able to handle multi-node configurations or have a way to "swap in" new operations to be done in lieu of or in conjunction with these atomic bitops. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:47 -04:00
Jeff Layton	3a3286147f	nfsd: make test_share a bool return All of the callers treat the return that way already. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:46 -04:00
Jeff Layton	5ae037e599	nfsd: consolidate set_access and set_deny These functions are identical. Also, rename them to bmap_to_share_mode to better reflect what they do, and have them just return the result instead of passing in a pointer to the storage location. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:46 -04:00
Chuck Lever	f07ea10dc8	NFSD: SETCLIENTID_CONFIRM returns NFS4ERR_CLID_INUSE too often According to RFC 3530bis, the only items SETCLIENTID_CONFIRM processing should be concerned with is the clientid, clientid verifier, and principal. The client's IP address is not supposed to be interesting. And, NFS4ERR_CLID_INUSE is meant only for principal mismatches. I triggered this logic with a prototype UCS client -- one that uses the same nfs_client_id4 string for all servers. The client mounted our server via its IPv4, then via its IPv6 address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:45 -04:00
Stanislav Kinsbursky	786185b5f8	SUNRPC: move per-net operations from svc_destroy() The idea is to separate service destruction and per-net operations, because these are two different things and the mix looks ugly. Notes: 1) For NFS server this patch looks ugly (sorry for that). But these place will be rewritten soon during NFSd containerization. 2) LockD per-net counter increase int lockd_up() was moved prior to make_socks() to make lockd_down_net() call safe in case of error. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:40 -04:00
Stanislav Kinsbursky	9793f7c889	SUNRPC: new svc_bind() routine introduced This new routine is responsible for service registration in a specified network context. The idea is to separate service creation from per-net operations. Note also: since registering service with svc_bind() can fail, the service will be destroyed and during destruction it will try to unregister itself from rpcbind. In this case unregistration has to be skipped. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:39 -04:00
Weston Andros Adamson	e7a0444aef	nfsd: add IPv6 addr escaping to fs_location hosts The fs_location->hosts list is split on colons, but this doesn't work when IPv6 addresses are used (they contain colons). This patch adds the function nfsd4_encode_components_esc() to allow the caller to specify escape characters when splitting on 'sep'. In order to fix referrals, this patch must be used with the mountd patch that similarly fixes IPv6 [] escaping. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:38 -04:00
J. Bruce Fields	45eaa1c1a1	nfsd4: fix change attribute endianness Though actually this doesn't matter much, as NFSv4.0 clients are required to treat the change attribute as opaque. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:38 -04:00
J. Bruce Fields	d1829b3824	nfsd4: fix free_stateid return endianness Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:37 -04:00
J. Bruce Fields	57b7b43b40	nfsd4: int/__be32 fixes In each of these cases there's a simple unambiguous correct choice, and no actual bug. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:37 -04:00
J. Bruce Fields	bc1b542be9	nfsd4: preserve __user annotation on cld downcall msg Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:36 -04:00
J. Bruce Fields	2355c59644	nfsd4: fix missing "static" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:35 -04:00
J. Bruce Fields	bfa4b36525	nfsd: state.c should include current_stateid.h OK, admittedly I'm mainly just trying to shut sparse up. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:35 -04:00
Linus Torvalds	644473e9c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...	2012-05-23 17:42:39 -07:00
Eric W. Biederman	ae2975bc34	userns: Convert group_info values from gid_t to kgid_t. As a first step to converting struct cred to be all kuid_t and kgid_t values convert the group values stored in group_info to always be kgid_t values. Unless user namespaces are used this change should have no effect. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-03 03:27:21 -07:00
Randy Dunlap	8a7dc4b04b	nfsd: fix nfs4recover.c printk format warning Fix printk format warnings -- both items are size_t, so use %zu to print them. fs/nfsd/nfs4recover.c:580:3: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'size_t' fs/nfsd/nfs4recover.c:580:3: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'unsigned int' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-30 12:28:48 -07:00
Jeff Layton	b108fe6b08	nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek They're equivalent, but SEEK_SET is more informative... Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-25 15:38:23 -04:00
Linus Torvalds	c6f5c93098	Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from J. Bruce Fields: "One bugfix, and one minor header fix from Jeff Layton while we're here" * 'for-3.4' of git://linux-nfs.org/~bfields/linux: nfsd: include cld.h in the headers_install target nfsd: don't fail unchecked creates of non-special files	2012-04-19 14:54:52 -07:00
Al Viro	efe39651f0	nfsd: fix compose_entry_fh() failure exits Restore the original logics ("fail on mountpoints, negatives and in case of fh_compose() failures"). Since commit 8177e (nfsd: clean up readdirplus encoding) that got broken - rv = fh_compose(fhp, exp, dchild, &cd->fh); if (rv) goto out; if (!dchild->d_inode) goto out; rv = 0; out: is equivalent to rv = fh_compose(fhp, exp, dchild, &cd->fh); out: and the second check has no effect whatsoever... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 10:12:02 -04:00
Al Viro	afcf6792af	nfsd: fix error value on allocation failure in nfsd4_decode_test_stateid() PTR_ERR(NULL) is going to be 0... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 10:12:01 -04:00
Al Viro	02f5fde5df	nfsd: fix endianness breakage in TEST_STATEID handling ->ts_id_status gets nfs errno, i.e. it's already big-endian; no need to apply htonl() to it. Broken by commit 174568 (NFSD: Added TEST_STATEID operation) last year... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 10:12:01 -04:00
Al Viro	04da6e9d63	nfsd: fix error values returned by nfsd4_lockt() when nfsd_open() fails nfsd_open() already returns an NFS error value; only vfs_test_lock() result needs to be fed through nfserrno(). Broken by commit 55ef12 (nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT) three years ago... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 10:12:01 -04:00
Al Viro	96f6f98501	nfsd: fix b0rken error value for setattr on read-only mount ..._want_write() returns -EROFS on failure, _not_ an NFS error value. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 10:12:00 -04:00
Stanislav Kinsbursky	f69adb2fe2	nfsd: allocate id-to-name and name-to-id caches in per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:11 -04:00
Stanislav Kinsbursky	9e75a4dee0	nfsd: make name-to-id cache allocated per network namespace context Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	c2e76ef5e0	nfsd: make id-to-name cache allocated per network namespace context Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	43ec1a20bf	nfsd: pass network context to idmap init/exit functions These functions will be called from per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	5717e01284	nfsd: allocate export and expkey caches in per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:47 -04:00
Stanislav Kinsbursky	e5f06f720e	nfsd: make expkey cache allocated per network namespace context This patch also changes svcauth_unix_purge() function: added network namespace as a parameter and thus loop over all networks was replaced by only one call for ip map cache purge. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:46 -04:00
Stanislav Kinsbursky	b3853e0ea1	nfsd: make export cache allocated per network namespace context This patch also changes prototypes of nfsd_export_flush() and exp_rootfh(): network namespace parameter added. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:11 -04:00
Stanislav Kinsbursky	2a75cfa64e	nfsd: pass pointer to export cache down to stack wherever possible. This cache will be per-net soon. And it's easier to get the pointer to desired per-net instance only once and then pass it down instead of discovering it in every place were required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:10:19 -04:00
Stanislav Kinsbursky	b89109bef4	nfsd: pass network context to export caches init/shutdown routines These functions will be called from per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 18:01:33 -04:00
Stanislav Kinsbursky	e3f70eadb7	Lockd: pass network namespace to creation and destruction routines v2: dereference of most probably already released nlm_host removed in nlmclnt_done() and reclaimer(). These routines are called from locks reclaimer() kernel thread. This thread works in "init_net" network context and currently relays on persence on lockd thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay on current network context. So let's pass corrent one into them. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:06 -04:00
Stanislav Kinsbursky	f890edbbef	NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches These dereferences to global static caches are redundant. They also prevents converting these caches into per-net ones. So this patch is cleanup + precursor of patch set,a which will make them per-net. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:05 -04:00
Stanislav Kinsbursky	c89172e36e	nfsd: pass pointer to expkey cache down to stack wherever possible. This cache will be per-net soon. And it's easier to get the pointer to desired per-net instance only once and then pass it down instead of discovering it in every place were required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:05 -04:00
Stanislav Kinsbursky	83e0ed700d	nfsd: use hash table from cache detail in nfsd export seq ops Hard-code is redundant and will prevent from making caches per net ns. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:04 -04:00
Stanislav Kinsbursky	f2c7ea10f9	nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops Global svc_export_cache cache is going to be replaced with per-net instance. So prepare the ground for it. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:03 -04:00
Stanislav Kinsbursky	a09581f294	nfsd: use exp_put() for svc_export_cache put This patch replaces cache_put() call for svc_export_cache by exp_put() call. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:02 -04:00
Stanislav Kinsbursky	db3a353263	nfsd: add link to owner cache detail to svc_export structure Without info about owner cache datail it won't be able to find out, which per-net cache detail have to be. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:01 -04:00
Stanislav Kinsbursky	d4bb527e9e	nfsd: use passed cache_detail pointer expkey_parse() Using of hard-coded svc_expkey_cache pointer in expkey_parse() looks redundant. Moreover, global cache will be replaced with per-net instance soon. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:00 -04:00
Jeff Layton	33dcc481ed	nfsd: don't use locks_in_grace to determine whether to call nfs4_grace_end It's possible that lockd or another lock manager might still be on the list after we call nfsd4_end_grace. If the laundromat thread runs again at that point, then we could end up calling nfsd4_end_grace more than once. That's not only inefficient, but calling nfsd4_recdir_purge_old more than once could be problematic. Fix this by adding a new global "grace_ended" flag and use that to determine whether we've already called nfsd4_grace_end. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:00 -04:00
Jeff Layton	03af42c59e	nfsd: trivial: remove unused variable from nfsd4_lock ..."fp" is set but never used. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:54:58 -04:00
J. Bruce Fields	9dc4e6c4d1	nfsd: don't fail unchecked creates of non-special files Allow a v3 unchecked open of a non-regular file succeed as if it were a lookup; typically a client in such a case will want to fall back on a local open, so succeeding and giving it the filehandle is more useful than failing with nfserr_exist, which makes it appear that nothing at all exists by that name. Similarly for v4, on an open-create, return the same errors we would on an attempt to open a non-regular file, instead of returning nfserr_exist. This fixes a problem found doing a v4 open of a symlink with O_RDONLY\|O_CREAT, which resulted in the current client returning EEXIST. Thanks also to Trond for analysis. Cc: stable@kernel.org Reported-by: Orion Poplawski <orion@cora.nwra.com> Tested-by: Orion Poplawski <orion@cora.nwra.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:49:52 -04:00
Linus Torvalds	71db34fc43	Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from Bruce Fields: Highlights: - Benny Halevy and Tigran Mkrtchyan implemented some more 4.1 features, moving us closer to a complete 4.1 implementation. - Bernd Schubert fixed a long-standing problem with readdir cookies on ext2/3/4. - Jeff Layton performed a long-overdue overhaul of the server reboot recovery code which will allow us to deprecate the current code (a rather unusual user of the vfs), and give us some needed flexibility for further improvements. - Like the client, we now support numeric uid's and gid's in the auth_sys case, allowing easier upgrades from NFSv2/v3 to v4.x. Plus miscellaneous bugfixes and cleanup. Thanks to everyone! There are also some delegation fixes waiting on vfs review that I suppose will have to wait for 3.5. With that done I think we'll finally turn off the "EXPERIMENTAL" dependency for v4 (though that's mostly symbolic as it's been on by default in distro's for a while). And the list of 4.1 todo's should be achievable for 3.5 as well: http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues though we may still want a bit more experience with it before turning it on by default. * 'for-3.4' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled nfsd4: use auth_unix unconditionally on backchannel nfsd: fix NULL pointer dereference in cld_pipe_downcall nfsd4: memory corruption in numeric_name_to_id() sunrpc: skip portmap calls on sessions backchannel nfsd4: allow numeric idmapping nfsd: don't allow legacy client tracker init for anything but init_net nfsd: add notifier to handle mount/unmount of rpc_pipefs sb nfsd: add the infrastructure to handle the cld upcall nfsd: add a header describing upcall to nfsdcld nfsd: add a per-net-namespace struct for nfsd sunrpc: create nfsd dir in rpc_pipefs nfsd: add nfsd4_client_tracking_ops struct and a way to set it nfsd: convert nfs4_client->cl_cb_flags to a generic flags field NFSD: Fix nfs4_verifier memory alignment NFSD: Fix warnings when NFSD_DEBUG is not defined nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) nfsd: rename 'int access' to 'int may_flags' in nfsd_open() ext4: return 32/64-bit dir name hash according to usage type fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash ...	2012-03-29 14:53:25 -07:00
Jeff Layton	797a9d797f	nfsd: only register cld pipe notifier when CONFIG_NFSD_V4 is enabled Otherwise, we get a warning or error similar to this when building with CONFIG_NFSD_V4 disabled: ERROR: "nfsd4_cld_block" [fs/nfsd/nfsd.ko] undefined! Fix this by wrapping the calls to rpc_pipefs_notifier_register and ..._unregister in another function and providing no-op replacements when CONFIG_NFSD_V4 is disabled. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-29 08:01:07 -04:00
J. Bruce Fields	4ca1f872cd	nfsd4: use auth_unix unconditionally on backchannel This isn't actually correct, but it works with the Linux client, and agrees with the behavior we used to have before commit `80fc015bdf`. Later patches will implement the spec-mandated behavior (which is to use the security parameters explicitly given by the client in create_session or backchannel_ctl). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-28 19:14:36 -04:00
Jeff Layton	21f72c9f0a	nfsd: fix NULL pointer dereference in cld_pipe_downcall If we find that "cup" is NULL in this case, then we obviously don't want to dereference it. What we really want to print in this case is the xid that we copied off earlier. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-28 10:10:24 -04:00
Dan Carpenter	3af706135b	nfsd4: memory corruption in numeric_name_to_id() "id" is type is a uid_t (32 bits) but on 64 bit systems strict_strtoul() modifies 64 bits of data. We should use kstrtouint() instead. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-28 10:10:23 -04:00
J. Bruce Fields	e9541ce8ef	nfsd4: allow numeric idmapping Mimic the client side by providing a module parameter that turns off idmapping in the auth_sys case, for backwards compatibility with NFSv2 and NFSv3. Unlike in the client case, we don't have any way to negotiate, since the client can return an error to us if it doesn't like the id that we return to it in (for example) a getattr call. However, it has always been possible for servers to return numeric id's, and as far as we're aware clients have always been able to handle them. Also, in the auth_sys case clients already need to have numeric id's the same between client and server. Therefore we believe it's safe to default this to on; but the module parameter is available to return to previous behavior if this proves to be a problem in some unexpected setup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:48 -04:00
Jeff Layton	cc27e0d407	nfsd: don't allow legacy client tracker init for anything but init_net This code isn't set up for containers, so don't allow it to be used for anything but init_net. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:48 -04:00
Jeff Layton	813fd320c1	nfsd: add notifier to handle mount/unmount of rpc_pipefs sb In the event that rpc_pipefs isn't mounted when nfsd starts, we must register a notifier to handle creating the dentry once it is mounted, and to remove the dentry on unmount. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:48 -04:00
Jeff Layton	f3f8014862	nfsd: add the infrastructure to handle the cld upcall ...and add a mechanism for switching between the "legacy" tracker and the new one. The decision is made by looking to see whether the v4recoverydir exists. If it does, then the legacy client tracker is used. If it's not, then the kernel will create a "cld" pipe in rpc_pipefs. That pipe is used to talk to a daemon for handling the upcall. Most of the data structures for the new client tracker are handled on a per-namespace basis, so this upcall should be essentially ready for containerization. For now however, nfsd just starts it by calling the initialization and exit functions for init_net. I'm making the assumption that at some point in the future we'll be able to determine the net namespace from the nfs4_client. Until then, this patch hardcodes init_net in those places. I've sprinkled some "FIXME" comments around that code to attempt to make it clear where we'll need to fix that up later. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:48 -04:00
Jeff Layton	7ea34ac15e	nfsd: add a per-net-namespace struct for nfsd Eventually, we'll need this when nfsd gets containerized fully. For now, create a struct on a per-net-namespace basis that will just hold a pointer to the cld_net structure. That struct will hold all of the per-net data that we need for the cld tracker. Eventually we can add other pernet objects to struct nfsd_net. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:47 -04:00
Jeff Layton	2a4317c554	nfsd: add nfsd4_client_tracking_ops struct and a way to set it Abstract out the mechanism that we use to track clients into a set of client name tracking functions. This gives us a mechanism to plug in a new set of client tracking functions without disturbing the callers. It also gives us a way to decide on what tracking scheme to use at runtime. For now, this just looks like pointless abstraction, but later we'll add a new alternate scheme for tracking clients on stable storage. Note too that this patch anticipates the eventual containerization of this code by passing in struct net pointers in places. No attempt is made to containerize the legacy client tracker however. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:47 -04:00
Jeff Layton	a52d726bbd	nfsd: convert nfs4_client->cl_cb_flags to a generic flags field We'll need a way to flag the nfs4_client as already being recorded on stable storage so that we don't continually upcall. Currently, that's recorded in the cl_firststate field of the client struct. Using an entire u32 to store a flag is rather wasteful though. The cl_cb_flags field is only using 2 bits right now, so repurpose that to a generic flags field. Rename NFSD4_CLIENT_KILL to NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback flags. Add a mask that we can use for existing checks that look to see whether any flags are set, so that the new flags don't interfere. Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag, and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-26 11:49:47 -04:00
J. Bruce Fields	1df00640c9	Merge nfs containerization work from Trond's tree The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.	2012-03-26 11:48:54 -04:00
Linus Torvalds	f63d395d47	NFS client updates for Linux 3.4 New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73 xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu 5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH 7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK 3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6 WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1 VBm+EI00Pvmm5+fUjIlp =LuHE -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates for Linux 3.4 from Trond Myklebust: "New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery)." * tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits) NFS: fix sb->s_id in nfs debug prints xprtrdma: Remove assumption that each segment is <= PAGE_SIZE xprtrdma: The transport should not bug-check when a dup reply is received pnfs-obj: autologin: Add support for protocol autologin NFS: Remove nfs4_setup_sequence from generic rename code NFS: Remove nfs4_setup_sequence from generic unlink code NFS: Remove nfs4_setup_sequence from generic read code NFS: Remove nfs4_setup_sequence from generic write code NFS: Fix more NFS debug related build warnings SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined nfs: non void functions must return a value SUNRPC: Kill compiler warning when RPC_DEBUG is unset SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG NFS: Use cond_resched_lock() to reduce latencies in the commit scans NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner NFS: ncommit count is being double decremented SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() Try using machine credentials for RENEW calls NFSv4.1: Fix a few issues in filelayout_commit_pagelist NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code ...	2012-03-23 08:53:47 -07:00
Al Viro	88187398cc	debugfs-related mode_t whack-a-mole all of those should be umode_t... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-20 21:29:53 -04:00
Al Viro	68ac1234fb	switch touch_atime to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-20 21:29:41 -04:00
Chuck Lever	ab4684d156	NFSD: Fix nfs4_verifier memory alignment Clean up due to code review. The nfs4_verifier's data field is not guaranteed to be u32-aligned. Casting an array of chars to a u32 * is considered generally hazardous. We can fix most of this by using a __be32 array to generate the verifier's contents and then byte-copying it into the verifier field. However, there is one spot where there is a backwards compatibility constraint: the do_nfsd_create() call expects a verifier which is 32-bit aligned. Fix this spot by forcing the alignment of the create verifier in the nfsd4_open args structure. Also, sizeof(nfs4_verifer) is the size of the in-core verifier data structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd verifier. The two are not interchangeable, even if they happen to have the same value. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-20 15:36:15 -04:00
Trond Myklebust	8f199b8262	NFSD: Fix warnings when NFSD_DEBUG is not defined Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-20 15:34:19 -04:00
J. Bruce Fields	62b9510cb3	nfsd: merge cookie collision fixes from ext4 tree These changes fix readdir loops on ext4 filesystems with dir_index turned on. I'm pulling them from Ted's tree as I'd like to give them some extra nfsd testing, and expect to be applying (potentially conflicting) patches to the same code before the next merge window. From the nfs-ext4-premerge branch of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-19 12:35:05 -04:00
Bernd Schubert	06effdbb49	nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes) Use 32-bit or 64-bit llseek() hashes for directory offsets depending on the NFS version. NFSv2 gets 32-bit hashes only. NOTE: This patch got rather complex as Christoph asked to set the filp->f_mode flag in the open call or immediatly after dentry_open() in nfsd_open() to avoid races. Personally I still do not see a reason for that and in my opinion FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it follows directly after nfsd_open() without a chance of races. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields<bfields@redhat.com>	2012-03-18 22:44:50 -04:00
Bernd Schubert	999448a8c0	nfsd: rename 'int access' to 'int may_flags' in nfsd_open() Just rename this variable, as the next patch will add a flag and 'access' as variable name would not be correct any more. Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: J. Bruce Fields<bfields@redhat.com>	2012-03-18 22:44:49 -04:00
J. Bruce Fields	8546ee518c	nfsd4: make sure set CB_PATH_DOWN sequence flag set Make sure this is set whenever there is no callback channel. If a client does not set up a callback channel at all, then it will get this flag set from the very start. That's OK, it can just ignore the flag if it doesn't care. If a client does care, I think it's better to inform it of the problem as early as possible. Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-09 17:05:01 -05:00
J. Bruce Fields	59deeb9e5a	nfsd4: reduce do_open_lookup() stack usage I get 320 bytes for struct svc_fh on x86_64, really a little large to be putting on the stack; kmalloc() instead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:37 -05:00
J. Bruce Fields	41fd1e42f8	nfsd4: delay setting current filehandle till success Compound processing stops on error, so the current filehandle won't be used on error. Thus the order here doesn't really matter. It'll be more convenient to do it later, though. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:36 -05:00
Benny Halevy	508dc6e110	nfsd41: free_session/free_client must be called under the client_lock The session client is manipulated under the client_lock hence both free_session and nfsd4_del_conns must be called under this lock. This patch adds a BUG_ON that checks this condition in the respective functions and implements the missing locks. nfsd4_{get,put}_session helpers were moved to the C file that uses them so to prevent use from external files and an unlocked version of nfsd4_put_session is provided for external use from nfs4xdr.c Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:35 -05:00
Benny Halevy	e27f49c33b	nfsd41: refactor nfsd4_deleg_xgrade_none_ext logic out of nfsd4_process_open2 Handle the case where the nfsv4.1 client asked to uprade or downgrade its delegations and server returns no delegation. In this case, op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT and op_why_no_deleg is set respectively to WND4_NOT_SUPP_{UP,DOWN}GRADE Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:35 -05:00
Benny Halevy	4aa8913cb0	nfsd41: refactor nfs4_open_deleg_none_ext logic out of nfs4_open_delegation When a 4.1 client asks for a delegation and the server returns none op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT and op_why_no_deleg is set to either WND4_CONTENTION or WND4_RESOURCE. Or, if the client sent a NFS4_SHARE_WANT_CANCEL (which it is not supposed to ever do until our server supports delegations signaling), op_why_no_deleg is set to WND4_CANCELLED. Note that for WND4_CONTENTION and WND4_RESOURCE, the xdr layer is hard coded at this time to encode boolean FALSE for ond_server_will_push_deleg / ond_server_will_signal_avail. Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:34 -05:00
J. Bruce Fields	a8ae08ebf1	nfsd4: fix recovery-entry leak nfsd startup failure Another leak on error Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:32 -05:00
Jeff Layton	a6d6b7811c	nfsd4: fix recovery-dir leak on nfsd startup failure The current code never calls nfsd4_shutdown_recdir if nfs4_state_start returns an error. Also, it's better to go ahead and consolidate these functions since one is just a trivial wrapper around the other. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:25 -05:00
J. Bruce Fields	393d8ed80f	nfsd4: purge stable client records with insufficient state To escape having your stable storage record purged at the end of the grace period, it's not sufficient to simply have performed a setclientid_confirm; you also need to meet the same requirements as someone creating a new record: either you should have done an open or open reclaim (in the 4.0 case) or a reclaim_complete (in the 4.1 case). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:24 -05:00
J. Bruce Fields	1255a8f36c	nfsd4: don't set cl_firststate on first reclaim in 4.1 case We set cl_firststate when we first decide that a client will be permitted to reclaim state on next boot. This happens: - for new 4.0 clients, when they confirm their first open - for returning 4.0 clients, when they reclaim their first open - for 4.1+ clients, when they perform reclaim_complete We also use cl_firststate to decide whether a reclaim_complete has already been performed, in the 4.1+ case. We were setting it on 4.1 open reclaims, which caused spurious COMPLETE_ALREADY errors on RECLAIM_COMPLETE from an nfs4.1 client with anything to reclaim. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-03-06 18:13:23 -05:00
Benny Halevy	d24433cdc9	nfsd41: implement NFS4_SHARE_WANT_NO_DELEG, NFS4_OPEN_DELEGATE_NONE_EXT, why_no_deleg Respect client request for not getting a delegation in NFSv4.1 Appropriately return delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT and WND4_NOT_WANTED reason. [nfsd41: add missing break when encoding op_why_no_deleg] Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-17 18:38:53 -05:00
Bryan Schumaker	03cfb42025	NFSD: Clean up the test_stateid function When I initially wrote it, I didn't understand how lists worked so I wrote something that didn't use them. I think making a list of stateids to test is a more straightforward implementation, especially compared to especially compared to decoding stateids while simultaneously encoding a reply to the client. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-17 18:38:52 -05:00
Benny Halevy	2c8bd7e0d1	nfsd41: split out share_access want and signal flags while decoding Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-17 18:38:42 -05:00
Benny Halevy	00b5f95a26	nfsd41: share_access_to_flags should consider only nfs4.x share_access flags Currently, it will not correctly ignore any nfsv4.1 signal flags if the client sends them. Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-17 11:50:36 -05:00
Tigran Mkrtchyan	37c593c573	nfsd41: use current stateid by value Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:45 -05:00
Tigran Mkrtchyan	9428fe1abb	nfsd41: consume current stateid on DELEGRETURN and OPENDOWNGRADE Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:44 -05:00
Tigran Mkrtchyan	1e97b5190d	nfsd41: handle current stateid in SETATTR and FREE_STATEID Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:43 -05:00
Tigran Mkrtchyan	d14710532f	nfsd41: mark LOOKUP, LOOKUPP and CREATE to invalidate current stateid Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:42 -05:00
Tigran Mkrtchyan	8307111476	nfsd41: save and restore current stateid with current fh Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:41 -05:00
Tigran Mkrtchyan	80e01cc1e2	nfsd41: mark PUTFH, PUTPUBFH and PUTROOTFH to clear current stateid Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:41 -05:00
Tigran Mkrtchyan	30813e2773	nfsd41: consume current stateid on read and write Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:40 -05:00
Tigran Mkrtchyan	62cd4a591c	nfsd41: handle current stateid on lock and locku Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:39 -05:00
Tigran Mkrtchyan	8b70484c67	nfsd41: handle current stateid in open and close Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:38 -05:00
Tigran Mkrtchyan	19ff0f288c	nfsd4: initialize current stateid at compile time Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-15 11:20:29 -05:00
J. Bruce Fields	bf5c43c8f1	nfsd4: check for uninitialized slot This fixes an oops when a buggy client tries to use an initial seqid of 0 on a new slot, which we may misinterpret as a replay. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-14 17:01:58 -05:00
J. Bruce Fields	73e79482b4	nfsd4: rearrange struct nfsd4_slot Combine two booleans into a single flag field, move the smaller fields to the end. (In practice this doesn't make the struct any smaller. But we'll be adding another flag here soon.) Remove some debugging code that doesn't look useful, while we're in the neighborhood. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-14 17:01:29 -05:00
J. Bruce Fields	f6d82485e9	nfsd4: fix sessions slotid wraparound logic From RFC 5661 2.10.6.1: "If the previous sequence ID was 0xFFFFFFFF, then the next request for the slot MUST have the sequence ID set to zero." While we're there, delete some redundant comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-13 16:15:18 -05:00
J. Bruce Fields	508f922756	nfsd: fix default iosize calculation on 32bit The rpc buffers will be allocated out of low memory, so we should really only be taking that into account. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-03 15:33:17 -05:00
J. Bruce Fields	87b0fc7deb	nfsd: cleanup setting of default max_block_size Move calculation of the default into a helper function. Get rid of an unused variable "err" while we're there. Thanks to Mi Jinlong for catching an arithmetic error in a previous version. Cc: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-03 15:32:41 -05:00
Dan Carpenter	3476964dba	nfsd: remove some unneeded checks We check for zero length strings in the caller now, so these aren't needed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-03 14:26:42 -05:00
Trond Myklebust	a613fa168a	SUNRPC: constify the rpc_program Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:20 -05:00
Stanislav Kinsbursky	4cb54ca206	SUNRPC: search for service transports in network namespace context Service transports are parametrized by network namespace. And thus lookup of transport instance have to take network namespace into account. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@redhat.com>	2012-01-31 19:28:19 -05:00
Stanislav Kinsbursky	246590f56c	SUNRPC: register service stats /proc entries in passed network namespace context This patch makes it possible to create NFSd program entry ("/proc/net/rpc/nfsd") in passed network namespace context instead of hard-coded "init_net". Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:18 -05:00
Stanislav Kinsbursky	5ecebb7c7f	SUNRPC: unregister service on creation in current network namespace On service shutdown we can be sure, that no more users of it left except current. Thus it looks like using current network namespace context is safe in this case. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:14 -05:00
Stanislav Kinsbursky	f2ac4dc911	SUNRPC: parametrize rpc_uaddr2sockaddr() by network context Parametrize rpc_uaddr2sockaddr() by network context and thus force it's callers to pass in network context instead of using hard-coded "init_net". Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:12 -05:00
Stanislav Kinsbursky	90100b1766	SUNRPC: parametrize rpc_pton() by network context Parametrize rpc_pton() by network context and thus force it's callers to pass in network context instead of using hard-coded "init_net". Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:12 -05:00
Stanislav Kinsbursky	0157d021d2	SUNRPC: handle RPC client pipefs dentries by network namespace aware routines v2: 1) "Over-put" of PipeFS mount point fixed. Fix is ugly, but allows to bisect the patch set. And it will be removed later in the series. This patch makes RPC clients PipeFs dentries allocations in it's owner network namespace context. RPC client pipefs dentries creation logic has been changed: 1) Pipefs dentries creation by sb was moved to separated function, which will be used for handling PipeFS mount notification. 2) Initial value of RPC client PipeFS dir dentry is set no NULL now. RPC client pipefs dentries cleanup logic has been changed: 1) Cleanup is done now in separated rpc_remove_pipedir() function, which takes care about pipefs superblock locking. Also this patch removes slashes from cb_program.pipe_dir_name and from NFS_PIPE_DIRNAME to make rpc_d_lookup_sb() work. This doesn't affect vfs_path_lookup() results in nfs4blocklayout_init() since this slash is cutted off anyway in link_path_walk(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 18:20:25 -05:00
Linus Torvalds	0b48d42235	Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux * 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: nfsd4_create_clid_dir return value is unused NFSD: Change name of extended attribute containing junction svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown svcrpc: fix double-free on shutdown of nfsd after changing pool mode nfsd4: be forgiving in the absence of the recovery directory nfsd4: fix spurious 4.1 post-reboot failures NFSD: forget_delegations should use list_for_each_entry_safe NFSD: Only reinitilize the recall_lru list under the recall lock nfsd4: initialize special stateid's at compile time NFSd: use network-namespace-aware cache registering routines SUNRPC: create svc_xprt in proper network namespace svcrpc: update outdated BKL comment nfsd41: allow non-reclaim open-by-fh's in 4.1 svcrpc: avoid memory-corruption on pool shutdown svcrpc: destroy server sockets all at once svcrpc: make svc_delete_xprt static nfsd: Fix oops when parsing a 0 length export nfsd4: Use kmemdup rather than duplicating its implementation nfsd4: add a separate (lockowner, inode) lookup nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error ...	2012-01-14 12:26:41 -08:00
Linus Torvalds	57eccf1c2a	Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Change the default setting of the nfs4_disable_idmapping parameter NFSv4: Save the owner/group name string when doing open NFS: Remove pNFS bloat from the generic write path pnfs-obj: Must return layout on IO error pnfs-obj: pNFS errors are communicated on iodata->pnfs_error NFS: Cache state owners after files are closed NFS: Clean up nfs4_find_state_owners_locked() NFSv4: include bitmap in nfsv4 get acl data nfs: fix a minor do_div portability issue NFSv4.1: cleanup comment and debug printk NFSv4.1: change nfs4_free_slot parameters for dynamic slots NFSv4.1: cleanup init and reset of session slot tables NFSv4.1: fix backchannel slotid off-by-one bug nfs: fix regression in handling of context= option in NFSv4 NFS - fix recent breakage to NFS error handling. NFS: Retry mounting NFSROOT SUNRPC: Clean up the RPCSEC_GSS service ticket requests	2012-01-10 14:57:40 -08:00
Linus Torvalds	98793265b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits) Kconfig: acpi: Fix typo in comment. misc latin1 to utf8 conversions devres: Fix a typo in devm_kfree comment btrfs: free-space-cache.c: remove extra semicolon. fat: Spelling s/obsolate/obsolete/g SCSI, pmcraid: Fix spelling error in a pmcraid_err() call tools/power turbostat: update fields in manpage mac80211: drop spelling fix types.h: fix comment spelling for 'architectures' typo fixes: aera -> area, exntension -> extension devices.txt: Fix typo of 'VMware'. sis900: Fix enum typo 'sis900_rx_bufer_status' decompress_bunzip2: remove invalid vi modeline treewide: Fix comment and string typo 'bufer' hyper-v: Update MAINTAINERS treewide: Fix typos in various parts of the kernel, and fix some comments. clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR gpio: Kconfig: drop unknown symbol 'CS5535_GPIO' leds: Kconfig: Fix typo 'D2NET_V2' sound: Kconfig: drop unknown symbol ARCH_CLPS7500 ... Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new kconfig additions, close to removed commented-out old ones)	2012-01-08 13:21:22 -08:00
Al Viro	d8c9584ea2	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-06 23:16:53 -05:00
J. Bruce Fields	7a6ef8c723	nfsd4: nfsd4_create_clid_dir return value is unused Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-01-05 15:38:41 -05:00
Chuck Lever	9b4146e855	NFSD: Change name of extended attribute containing junction As of fedfs-utils-0.8.0, user space stores all NFS junction information in a single extended attribute: "trusted.junction.nfs". Both FedFS and NFS basic junctions are stored in this one attribute, and the intention is that all future forms of NFS junction metadata will be stored in this attribute. Other protocols may use a different extended attribute. Thus NFSD needs to look only for that one extended attribute. The "trusted.junction.type" xattr is deprecated. fedfs-utils-0.8.0 will continue to attach a "trusted.junction.type" xattr to junctions, but future fedfs-utils releases may no longer do that. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-01-05 15:35:57 -05:00
J. Bruce Fields	b8548894bd	nfsd4: be forgiving in the absence of the recovery directory If the recovery directory doesn't exist, then behavior after a reboot will be suboptimal. But it's unnecessarily harsh to then prevent the nfsv4 server from working at all. Instead just print a warning (already done in nfsd4_init_recdir()) and soldier on. Tested-by: Lior <lior@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-01-05 15:23:19 -05:00
Trond Myklebust	68c97153fb	SUNRPC: Clean up the RPCSEC_GSS service ticket requests Instead of hacking specific service names into gss_encode_v1_msg, we should just allow the caller to specify the service name explicitly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@redhat.com>	2012-01-05 10:42:38 -05:00
Al Viro	175a4eb7ea	fs: propagate umode_t, misc bits Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:10 -05:00
Al Viro	2a79f17e4a	vfs: mnt_drop_write_file() new helper (wrapper around mnt_drop_write()) to be used in pair with mnt_want_write_file(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:40 -05:00
Al Viro	bad0dcffc2	new helpers: fh_{want,drop}_write() A bunch of places in nfsd does mnt_{want,drop}_write on vfsmount of export of given fhandle. Switched to obvious inlined helpers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:35 -05:00
Al Viro	a561be7100	switch a bunch of places to mnt_want_write_file() it's both faster (in case when file has been opened for write) and cleaner. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:35 -05:00
J. Bruce Fields	aec39680b0	nfsd4: fix spurious 4.1 post-reboot failures In the NFSv4.1 case, this could cause a spurious "NFSD: failed to write recovery record (err -17); please check that /var/lib/nfs/v4recovery exists and is writable. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reported-by: Steve Dickson <SteveD@redhat.com>	2012-01-02 17:32:59 -05:00
Bryan Schumaker	2d3475c0ad	NFSD: forget_delegations should use list_for_each_entry_safe Otherwise the for loop could try to use a file recently removed from the file_hashtbl list and oops. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Tested-by: Casey Bodley <cbodley@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-14 17:38:00 -05:00
Bryan Schumaker	39c4cc0fcc	NFSD: Only reinitilize the recall_lru list under the recall lock unhash_delegation() will grab the recall lock before calling list_del_init() in each of these places. This patch removes the redundant calls. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-13 17:11:45 -05:00
J. Bruce Fields	f32f3c2d3f	nfsd4: initialize special stateid's at compile time Stateid's with "other" ("opaque") field all zeros or all ones are reserved. We define all_ones separately on the off chance there will be more such some day, though currently all the other special stateid's have zero other field. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-12 15:27:00 -05:00
Stanislav Kinsbursky	f5c8593b94	NFSd: use network-namespace-aware cache registering routines v2: cache_register_net() and cache_unregister_net() GPL exports added This is a cleanup patch. Hope, some day generic cache_register() and cache_unregister() will be removed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-07 15:27:46 -05:00
Mi Jinlong	0cf99b91c6	nfsd41: allow non-reclaim open-by-fh's in 4.1 With NFSv4.0 it was safe to assume that open-by-filehandles were always reclaims. With NFSv4.1 there are non-reclaim open-by-filehandle operations, so we should ensure we're only insisting on reclaims in the OPEN_CLAIM_PREVIOUS case. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-06 16:19:04 -05:00
Sasha Levin	b2ea70afad	nfsd: Fix oops when parsing a 0 length export expkey_parse() oopses when handling a 0 length export. This is easily triggerable from usermode by writing 0 bytes into '/proc/[proc id]/net/rpc/nfsd.fh/channel'. Below is the log: [ 1402.286893] BUG: unable to handle kernel paging request at ffff880077c49fff [ 1402.287632] IP: [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1 [ 1402.287632] PGD 2206063 PUD 1fdfd067 PMD 1ffbc067 PTE 8000000077c49160 [ 1402.287632] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 1402.287632] CPU 1 [ 1402.287632] Pid: 20198, comm: trinity Not tainted 3.2.0-rc2-sasha-00058-gc65cd37 #6 [ 1402.287632] RIP: 0010:[<ffffffff812b4b99>] [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1 [ 1402.287632] RSP: 0018:ffff880077f0fd68 EFLAGS: 00010292 [ 1402.287632] RAX: ffff880077c49fff RBX: 00000000ffffffea RCX: 0000000001043400 [ 1402.287632] RDX: 0000000000000000 RSI: ffff880077c4a000 RDI: ffffffff82283de0 [ 1402.287632] RBP: ffff880077f0fe18 R08: 0000000000000001 R09: ffff880000000000 [ 1402.287632] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880077c4a000 [ 1402.287632] R13: ffffffff82283de0 R14: 0000000001043400 R15: ffffffff82283de0 [ 1402.287632] FS: 00007f25fec3f700(0000) GS:ffff88007d400000(0000) knlGS:0000000000000000 [ 1402.287632] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1402.287632] CR2: ffff880077c49fff CR3: 0000000077e1d000 CR4: 00000000000406e0 [ 1402.287632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1402.287632] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1402.287632] Process trinity (pid: 20198, threadinfo ffff880077f0e000, task ffff880077db17b0) [ 1402.287632] Stack: [ 1402.287632] ffff880077db17b0 ffff880077c4a000 ffff880077f0fdb8 ffffffff810b411e [ 1402.287632] ffff880000000000 ffff880077db17b0 ffff880077c4a000 ffffffff82283de0 [ 1402.287632] 0000000001043400 ffffffff82283de0 ffff880077f0fde8 ffffffff81111f63 [ 1402.287632] Call Trace: [ 1402.287632] [<ffffffff810b411e>] ? lock_release+0x1af/0x1bc [ 1402.287632] [<ffffffff81111f63>] ? might_fault+0x97/0x9e [ 1402.287632] [<ffffffff81111f1a>] ? might_fault+0x4e/0x9e [ 1402.287632] [<ffffffff81a8bcf2>] cache_do_downcall+0x3e/0x4f [ 1402.287632] [<ffffffff81a8c950>] cache_write.clone.16+0xbb/0x130 [ 1402.287632] [<ffffffff81a8c9df>] ? cache_write_pipefs+0x1a/0x1a [ 1402.287632] [<ffffffff81a8c9f8>] cache_write_procfs+0x19/0x1b [ 1402.287632] [<ffffffff8118dc54>] proc_reg_write+0x8e/0xad [ 1402.287632] [<ffffffff8113fe81>] vfs_write+0xaa/0xfd [ 1402.287632] [<ffffffff8114142d>] ? fget_light+0x35/0x9e [ 1402.287632] [<ffffffff8113ff8b>] sys_write+0x48/0x6f [ 1402.287632] [<ffffffff81bbdb92>] system_call_fastpath+0x16/0x1b [ 1402.287632] Code: c0 c9 c3 55 48 63 d2 48 89 e5 48 8d 44 32 ff 41 57 41 56 41 55 41 54 53 bb ea ff ff ff 48 81 ec 88 00 00 00 48 89 b5 58 ff ff ff [ 1402.287632] 38 0a 0f 85 89 02 00 00 c6 00 00 48 8b 3d 44 4a e5 01 48 85 [ 1402.287632] RIP [<ffffffff812b4b99>] expkey_parse+0x28/0x2e1 [ 1402.287632] RSP <ffff880077f0fd68> [ 1402.287632] CR2: ffff880077c49fff [ 1402.287632] ---[ end trace 368ef53ff773a5e3 ]--- Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: linux-nfs@vger.kernel.org Cc: stable@kernel.org Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-06 16:18:37 -05:00
Justin P. Mattock	42b2aa86c6	treewide: Fix typos in various parts of the kernel, and fix some comments. The below patch fixes some typos in various parts of the kernel, as well as fixes some comments. Please let me know if I missed anything, and I will try to get it changed and resent. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-12-02 14:57:31 +01:00
Thomas Meyer	67114fe610	nfsd4: Use kmemdup rather than duplicating its implementation The semantic patch that makes this change is available in scripts/coccinelle/api/memdup.cocci. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-25 18:44:22 -05:00
J. Bruce Fields	009673b439	nfsd4: add a separate (lockowner, inode) lookup Address the possible performance regression mentioned in "nfsd4: hash lockowners to simplify RELEASE_LOCKOWNER" by providing a separate (lockowner, inode) hash. Really, I doubt this matters much, but I think it's likely we'll change these data structures here and I'd rather that the need for (owner, inode) lookups be well-documented. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-15 19:26:08 -05:00
J. Bruce Fields	353de31b86	nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-15 19:26:07 -05:00
J. Bruce Fields	16bfdaafa2	nfsd4: share open and lock owner hash tables Now that they're used in the same way, it's a little simpler to put open and lock owners in the same hash table, and I can't see a reason not to. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-08 11:28:45 -05:00
J. Bruce Fields	06f1f864d4	nfsd4: hash lockowners to simplify RELEASE_LOCKOWNER Hash lockowners on just the owner string rather than on (owner, inode). This makes the owner-string lookup needed for RELEASE_LOCKOWNER simpler (currently it's doing at a linear search through the entire hash table!). That may come at the expense of making (owner, inode) lookups more expensive if a client reuses the same lockowner across multiple files. We might add a separate lookup for that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:48 -05:00
Bryan Schumaker	c7e8472cf8	NFSD: Remove unnecessary whitespace The close parenthesis was hard to find with it spaced so far over. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> [bfields@redhat.com: get all these lines under 80 chars while we're here] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:48 -05:00
Bryan Schumaker	7208339607	NFSD: Call nfsd4_init_slabs() from init_nfsd() init_nfsd() was calling free_slabs() during cleanup code, but the call to init_slabs() was hidden in nfsd4_state_init(). This could be confusing to people unfamiliar with the code. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:47 -05:00
Bryan Schumaker	65178db42a	NFSD: Added fault injection Fault injection on the NFS server makes it easier to test the client's state manager and recovery threads. Simulating errors on the server is easier than finding the right conditions that cause them naturally. This patch uses debugfs to add a simple framework for fault injection to the server. This framework is a config option, and can be enabled through CONFIG_NFSD_FAULT_INJECTION. Assuming you have debugfs mounted to /sys/debug, a set of files will be created in /sys/debug/nfsd/. Writing to any of these files will cause the corresponding action and write a log entry to dmesg. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:47 -05:00
J. Bruce Fields	64a284d07c	nfsd4: maintain one seqid stream per (lockowner, file) Instead of creating a new lockowner and stateid for every open_to_lockowner call, reuse the existing lockowner if it exists. Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:47 -05:00
J. Bruce Fields	684e563858	nfsd4: cleanup lock clientid handling in sessions case I'd rather the "ignore clientid in sessions case" rule be enforced in just one place. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:47 -05:00
J. Bruce Fields	b93d87c198	nfsd4: fix lockowner matching Lockowners are looked up by file as well as by owner, but we were forgetting to do a comparison on the file. This could cause an incorrect result from lockt. (Note looking up the inode from the lockowner is pretty awkward here. The data structures need fixing.) Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-07 21:10:47 -05:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Linus Torvalds	6736c04799	Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (25 commits) nfs: set vs_hidden on nfs4_callback_version4 (try #2) pnfs-obj: Support for RAID5 read-4-write interface. pnfs-obj: move to ore 03: Remove old raid engine pnfs-obj: move to ore 02: move to ORE pnfs-obj: move to ore 01: ore_layout & ore_components pnfs-obj: Rename objlayout_io_state => objlayout_io_res pnfs-obj: Get rid of objlayout_{alloc,free}_io_state pnfs-obj: Return PNFS_NOT_ATTEMPTED in case of read/write_pagelist pnfs-obj: Remove redundant EOF from objlayout_io_state nfs: Remove unused variable from write.c nfs: Fix unused variable warning from file.c NFS: Remove no-op less-than-zero checks on unsigned variables. NFS: Clean up nfs4_xdr_dec_secinfo() NFS: Fix documenting comment for nfs_create_request() NFS4: fix cb_recallany decode error nfs4: serialize layoutcommit SUNRPC: remove rpcbind clients destruction on module cleanup SUNRPC: remove rpcbind clients creation during service registering NFSd: call svc rpcbind cleanup explicitly SUNRPC: cleanup service destruction ...	2011-11-04 12:27:43 -07:00
Trond Myklebust	31cbecb4ab	Merge branch 'osd-devel' into nfs-for-next	2011-11-02 23:56:40 -04:00
Benny Halevy	fc0d14fe2d	nfsd4: typo logical vs bitwise negate in nfsd4_decode_share_access Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-11-01 18:06:43 -04:00
Paul Gortmaker	143cb494cb	fs: add module.h to files that were implicitly using it Some files were using the complete module.h infrastructure without actually including the header at all. Fix them up in advance so once the implicit presence is removed, we won't get failures like this: CC [M] fs/nfsd/nfssvc.o fs/nfsd/nfssvc.c: In function 'nfsd_create_serv': fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function) fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once fs/nfsd/nfssvc.c:335: error: for each function it appears in.) fs/nfsd/nfssvc.c: In function 'nfsd': fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit' make[3]: *** [fs/nfsd/nfssvc.o] Error 1 Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:31 -04:00
Paul Gortmaker	afeacc8c1f	fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros These files were getting <linux/module.h> via an implicit include path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:31 -04:00
Stanislav Kinsbursky	16d0587090	NFSd: call svc rpcbind cleanup explicitly We have to call svc_rpcb_cleanup() explicitly from nfsd_last_thread() since this function is registered as service shutdown callback and thus nobody else will done it for us. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-10-25 13:19:40 +02:00
Mi Jinlong	345c284290	nfs41: implement DESTROY_CLIENTID operation According to rfc5661 18.50, implement DESTROY_CLIENTID operation. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-24 04:24:30 -04:00
Benny Halevy	92bac8c5d6	nfsd4: typo logical vs bitwise negate for want_mask Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-24 04:24:29 -04:00
Benny Halevy	c668fc6dfc	nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL \| NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED RFC5661 says: The client may set one or both of OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_RESRC_AVAIL and OPEN4_SHARE_ACCESS_WANT_PUSH_DELEG_WHEN_UNCONTENDED. Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-24 04:24:28 -04:00
Benny Halevy	fc0c3dd13b	nfsd4: seq->status_flags may be used unitialized Reported-by: Gopala Suryanarayana <gsuryanarayana@vmware.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-24 04:24:28 -04:00
Benny Halevy	5423732a71	nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-24 04:24:27 -04:00
J. Bruce Fields	8b289b2c23	nfsd4: implement new 4.1 open reclaim types Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-19 11:52:12 -04:00
J. Bruce Fields	a8d86cd75b	nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround `0c12eaffdf` "nfsd: don't break lease on CLAIM_DELEGATE_CUR" was a temporary workaround for a problem fixed properly in the vfs layer by `778fc546f7` "locks: fix tracking of inprogress lease breaks", so we can revert that change (but keeping some minor cleanup from that commit). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-19 11:42:03 -04:00
J. Bruce Fields	856121b2e8	nfsd4: warn on open failure after create If we create the object and then return failure to the client, we're left with an unexpected file in the filesystem. I'm trying to eliminate such cases but not 100% sure I have so an assertion might be helpful for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:50:08 -04:00
J. Bruce Fields	4cdc951b86	nfsd4: preallocate open stateid in process_open1() As with the nfs4_file, we'd prefer to find out about any failure before creating a new file rather than after. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:50:07 -04:00
J. Bruce Fields	996e09385c	nfsd4: do idr preallocation with stateid allocation Move idr preallocation out of stateid initialization, into stateid allocation, so that we no longer have to handle any errors from the former. This is a little subtle due to the way the idr code manages these preallocated items--document that in comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:50:07 -04:00
J. Bruce Fields	32513b40ef	nfsd4: preallocate nfs4_file in process_open1() Creating a new file is an irrevocable step--once it's visible in the filesystem, other processes may have seen it and done something with it, and unlinking it wouldn't simply undo the effects of the create. Therefore, in the case where OPEN creates a new file, we shouldn't do the create until we know that the rest of the OPEN processing will succeed. For example, we should preallocate a struct file in case we need it until waiting to allocate it till process_open2(), which is already too late. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:50:00 -04:00
J. Bruce Fields	d29b20cd58	nfsd4: clean up open owners on OPEN failure If process_open1() creates a new open owner, but the open later fails, the current code will leave the open owner around. It won't be on the close_lru list, and the client isn't expected to send a CLOSE, so it will hang around as long as the client does. Similarly, if process_open1() removes an existing open owner from the close lru, anticipating that an open owner that previously had no associated stateid's now will, but the open subsequently fails, then we'll again be left with the same leak. Fix both problems. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:33:57 -04:00
J. Bruce Fields	bcf130f9df	nfsd4: simplify process_open1 logic No change in behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:33:51 -04:00
J. Bruce Fields	3557e43b8f	nfsd4: make is_open_owner boolean Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:09:37 -04:00
J. Bruce Fields	a50d2ad172	nfsd4: centralize renew_client() calls There doesn't seem to be any harm to renewing the client a bit earlier, when it is looked up. That saves us from having to sprinkle renew_client calls over quite so many places. Also remove a redundant comment and do a little cleanup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 17:09:37 -04:00
Dan Carpenter	01cd4afadb	nfsd4: typo logical vs bitwise negate This should be a bitwise negate here. It silences a Sparse warning: fs/nfsd/nfs4xdr.c:693:16: warning: dubious: x & !y Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-17 08:35:09 -04:00
J. Bruce Fields	b6d2f1ca3c	nfsd4: more robust ignoring of WANT bits in OPEN Mask out the WANT bits right at the start instead of on each use. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-11 12:15:15 -04:00
J. Bruce Fields	a084daf512	nfsd4: move name-length checks to xdr Again, these checks are better in the xdr code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-11 12:15:01 -04:00
J. Bruce Fields	04f9e664b2	nfsd4: move access/deny validity checks to xdr code I'd rather put more of these sorts of checks into standardized xdr decoders for the various types rather than have them cluttering up the core logic in nfs4proc.c and nfs4state.c. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-11 08:53:12 -04:00
J. Bruce Fields	c30e92df30	nfsd4: ignore WANT bits in open downgrade We don't use WANT bits yet--and sending them can probably trigger a BUG() further down. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-10 18:05:20 -04:00
J. Bruce Fields	b31b30e5c7	nfsd4: cleanup state.h comments These comments are mostly out of date. Reported-by: Bryan Schumaker <bjschuma@netapp.com>	2011-10-10 18:04:46 -04:00
J. Bruce Fields	6409a5a65d	nfsd4: clean up downgrading code In response to some review comments, get rid of the somewhat obscure for-loop with bitops, and improve a comment. Reported-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-10 18:04:45 -04:00
J. Bruce Fields	71c3bcd713	nfsd4: fix state lock usage in LOCKU In commit `5ec094c109` "nfsd4: extend state lock over seqid replay logic" I modified the exit logic of all the seqid-based procedures except nfsd4_locku(). Fix the oversight. The result of the bug was a double-unlock while handling the LOCKU procedure, and a warning like: [ 142.150014] WARNING: at kernel/mutex-debug.c:78 debug_mutex_unlock+0xda/0xe0() ... [ 142.152927] Pid: 742, comm: nfsd Not tainted 3.1.0-rc1-SLIM+ #9 [ 142.152927] Call Trace: [ 142.152927] [<ffffffff8105fa4f>] warn_slowpath_common+0x7f/0xc0 [ 142.152927] [<ffffffff8105faaa>] warn_slowpath_null+0x1a/0x20 [ 142.152927] [<ffffffff810960ca>] debug_mutex_unlock+0xda/0xe0 [ 142.152927] [<ffffffff813e4200>] __mutex_unlock_slowpath+0x80/0x140 [ 142.152927] [<ffffffff813e42ce>] mutex_unlock+0xe/0x10 [ 142.152927] [<ffffffffa03bd3f5>] nfs4_lock_state+0x35/0x40 [nfsd] [ 142.152927] [<ffffffffa03b0b71>] nfsd4_proc_compound+0x2a1/0x690 [nfsd] [ 142.152927] [<ffffffffa039f9fb>] nfsd_dispatch+0xeb/0x230 [nfsd] [ 142.152927] [<ffffffffa02b1055>] svc_process_common+0x345/0x690 [sunrpc] [ 142.152927] [<ffffffff81058d10>] ? try_to_wake_up+0x280/0x280 [ 142.152927] [<ffffffffa02b16e2>] svc_process+0x102/0x150 [sunrpc] [ 142.152927] [<ffffffffa039f0bd>] nfsd+0xbd/0x160 [nfsd] [ 142.152927] [<ffffffffa039f000>] ? 0xffffffffa039efff [ 142.152927] [<ffffffff8108230c>] kthread+0x8c/0xa0 [ 142.152927] [<ffffffff813e8694>] kernel_thread_helper+0x4/0x10 [ 142.152927] [<ffffffff81082280>] ? kthread_worker_fn+0x190/0x190 [ 142.152927] [<ffffffff813e8690>] ? gs_change+0x13/0x13 Reported-by: Bryan Schumaker <bjschuma@netapp.com> Tested-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-10-10 18:04:45 -04:00
J. Bruce Fields	38c2f4b12a	nfsd4: look up stateid's per clientid Use a separate stateid idr per client, and lookup a stateid by first finding the client, then looking up the stateid relative to that client. Also some minor refactoring. This allows us to improve error returns: we can return expired when the clientid is not found and bad_stateid when the clientid is found but not the stateid, as opposed to returning expired for both cases. I hope this will also help to replace the state lock mostly by a per-client lock, but that hasn't been done yet. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-26 17:35:28 -04:00
J. Bruce Fields	36279ac10c	nfsd4: assume test_stateid always has session Test_stateid is 4.1-only and only allowed after a sequence operation, so this check is unnecessary. Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-26 17:35:27 -04:00
J. Bruce Fields	6136d2b409	nfsd4: use idr for stateid's The idr system is designed exactly for generating id and looking up integer id's. Thanks to Trond for pointing it out. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-26 17:35:26 -04:00
J. Bruce Fields	2a74aba799	nfsd4: move client * to nfs4_stateid, add init_stid helper This will be convenient. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-26 17:35:25 -04:00
J. Bruce Fields	c856694e3d	nfsd4: make op_cacheresult another flag I'm not sure why I used a new field for this originally. Also, the differences between some of these flags are a little subtle; add some comments to explain. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-20 14:45:51 -04:00
J. Bruce Fields	3d02fa29de	nfsd4: fix open downgrade, again Yet another open-management regression: - nfs4_file_downgrade() doesn't remove the BOTH access bit on downgrade, so the server's idea of the stateid's access gets out of sync with the client's. If we want to keep an O_RDWR open in this case, we should do that in the file_put_access logic rather than here. - We forgot to convert v4 access to an open mode here. This logic has proven too hard to get right. In the future we may consider: - reexamining the lock/openowner relationship (locks probably don't really need to take their own references here). - adding open upgrade/downgrade support to the vfs. - removing the atomic operations. They're redundant as long as this is all under some other lock. Also, maybe some kind of additional static checking would help catch O_/NFS4_SHARE_ACCESS confusion. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-20 14:43:39 -04:00
J. Bruce Fields	f7a4d87207	nfsd4: hash closed stateid's like any other Look up closed stateid's in the stateid hash like any other stateid rather than searching the close lru. This is simpler, and fixes a bug: currently we handle only the case of a close that is the last close for a given stateowner, but not the case of a close for a stateowner that still has active opens on other files. Thus in a case like: open(owner, file1) open(owner, file2) close(owner, file2) close(owner, file2) the final close won't be recognized as a retransmission. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-19 08:39:34 -04:00
J. Bruce Fields	d3b313a463	nfsd4: construct stateid from clientid and counter Including the full clientid in the on-the-wire stateid allows more reliable detection of bad vs. expired stateid's, simplifies code, and ensures we won't reuse the opaque part of the stateid (as we currently do when the same openowner closes and reopens the same file). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-19 06:33:57 -04:00
J. Bruce Fields	2da1cec713	nfsd4: simplify free_stateid We no longer need is_deleg_stateid, for example. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-17 10:31:16 -04:00
J. Bruce Fields	38c387b52d	nfsd4: match close replays on stateid, not open owner id Keep around an unhashed copy of the final stateid after the last close using an openowner, and when identifying a replay, match against that stateid instead of just against the open owner id. Free it the next time the seqid is bumped or the stateowner is destroyed. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-17 10:01:54 -04:00
J. Bruce Fields	dad1c067eb	nfsd4: replace oo_confirmed by flag bit I want at least one more bit here. So, let's haul out the caps lock key and add a flags field. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-16 17:44:16 -04:00
Mi Jinlong	58e7b33a58	nfsd41: try to check reply size before operation For checking the size of reply before calling a operation, we need try to get maxsize of the operation's reply. v3: using new method as Bruce said, "we could handle operations in two different ways: - For operations that actually change something (write, rename, open, close, ...), do it the way we're doing it now: be very careful to estimate the size of the response before even processing the operation. - For operations that don't change anything (read, getattr, ...) just go ahead and do the operation. If you realize after the fact that the response is too large, then return the error at that point. So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING. And for operations with OP_MODIFIES_SOMETHING set, we'd do the first thing. For operations without it set, we'd do the second." Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> [bfields@redhat.com: crash, don't attempt to handle, undefined op_rsize_bop] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-16 10:31:01 -04:00
Mi Jinlong	849a1cf13d	SUNRPC: Replace svc_addr_u by sockaddr_storage For IPv6 local address, lockd can not callback to client for missing scope id when binding address at inet6_bind: 324 if (addr_type & IPV6_ADDR_LINKLOCAL) { 325 if (addr_len >= sizeof(struct sockaddr_in6) && 326 addr->sin6_scope_id) { 327 /* Override any existing binding, if another one 328 * is supplied by user. 329 / 330 sk->sk_bound_dev_if = addr->sin6_scope_id; 331 } 332 333 / Binding to link-local address requires an interface */ 334 if (!sk->sk_bound_dev_if) { 335 err = -EINVAL; 336 goto out_unlock; 337 } Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info besides address. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-14 08:21:48 -04:00
Trond Myklebust	11fcee0293	NFSD: Add a cache for fs_locations information Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> [ cel: since this is server-side, use nfsd4_ prefix instead of nfs4_ prefix. ] [ cel: implement S_ISVTX filter in bfields-normal form ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 22:44:17 -04:00
Trond Myklebust	2f1ddda174	NFSD: Remove the ex_pathname field from struct svc_export There are no more users... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 22:44:10 -04:00
Trond Myklebust	ed748aacb8	NFSD: Cleanup for nfsd4_path() The current code is sort of hackish in that it assumes a referral is always matched to an export. When we add support for junctions that may not be the case. We can replace nfsd4_path() with a function that encodes the components directly from the dentries. Since nfsd4_path is currently the only user of the 'ex_pathname' field in struct svc_export, this has the added benefit of allowing us to get rid of that. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 22:43:42 -04:00
J. Bruce Fields	ee626a77d3	nfsd4: better stateid hashing First, we shouldn't care here about the structure of the opaque part of the stateid. Second, this hash is really dumb. (I'm not sure the replacement is much better, though--to look at it another patch.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:36 -04:00
J. Bruce Fields	69064a2764	nfsd4: use deleg changes to cleanup preprocess_stateid_op Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:36 -04:00
J. Bruce Fields	97b7e3b6d4	nfsd4: fix test_stateid for delegation stateid's Test_stateid should handle delegation stateid's as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:35 -04:00
J. Bruce Fields	f459e45359	nfsd4: hash deleg stateid's like any other It's simpler to look up delegation stateid's in the same hash table as any other stateid. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:34 -04:00
J. Bruce Fields	36d44c6038	nfsd4: share common stid-hashing helper function Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:33 -04:00
J. Bruce Fields	d5477a8db8	nfsd4: add common dl_stid field to delegation Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:30:32 -04:00
J. Bruce Fields	dcef0413da	nfsd4: move some of nfs4_stateid into a separate structure We want delegations to share more with open/lock stateid's, so first we'll pull out some of the common stuff we want to share. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:29:58 -04:00
J. Bruce Fields	91a8c04031	nfsd4: remove redundant stateid initialization Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:29:04 -04:00
J. Bruce Fields	881ea2b11e	nfsd4: rename init_stateid Note this is actually open-stateid specific. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:29:03 -04:00
J. Bruce Fields	2288d0e395	nfsd4: pass around typemask instead of flags We're only using those flags to choose lock or open stateid's at this point. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:29:00 -04:00
J. Bruce Fields	c0a5d93efb	nfsd4: split preprocess_seqid, cleanup Move most of this into helper functions. Also move the non-CONFIRM case into caller, providing a helper function for that purpose. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:27:35 -04:00
J. Bruce Fields	4d71ab8751	nfsd4: split up find_stateid Minor cleanup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:27:31 -04:00
J. Bruce Fields	4581d14099	nfsd4: rearrange to avoid a forward reference Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-13 18:25:39 -04:00
J. Bruce Fields	4665e2bac5	nfsd4: split out some free_generic_stateid code We'll use this elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-07 09:47:23 -04:00
J. Bruce Fields	fe0750e5c4	nfsd4: split stateowners into open and lockowners The stateowner has some fields that only make sense for openowners, and some that only make sense for lockowners, and I find it a lot clearer if those are separated out. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-07 09:45:49 -04:00
J. Bruce Fields	f4dee24cca	nfsd4: move CLOSE_STATE special case to caller Move the CLOSE_STATE case into the unique caller that cares about it rather than putting it in preprocess_seqid_op. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-03 23:15:28 -04:00
J. Bruce Fields	68b66e8270	nfsd4: move double-confirm test to open_confirm I don't see the point of having this check in nfs4_preprocess_seqid_op() when it's only needed by the one caller. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-03 05:01:52 -04:00
J. Bruce Fields	77eaae8d44	nfsd4: simplify check_open logic Sometimes the single-exit style is good, sometimes it's unnecessarily convoluted.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-02 19:59:29 -04:00
J. Bruce Fields	7a8711c9a6	nfsd4: share common seqid checks Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-02 19:59:24 -04:00
J. Bruce Fields	16d259418b	nfsd4: eliminate unused lt_stateowner This is used only as a local variable. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 11:35:30 -04:00
J. Bruce Fields	7c13f344cf	nfsd4: drop most stateowner refcounting Maybe we'll bring it back some day, but we don't have much real use for it now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 11:12:47 -04:00
J. Bruce Fields	fff6ca9cc4	nfsd4: eliminate impossible open replay case If open fails with any error other than nfserr_replay_me, then the main nfsd4_proc_compound() loop continues unconditionally to nfsd4_encode_operation(), which will always call encode_seqid_op_tail. Thus the condition we check for here does not occur. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 07:29:01 -04:00
J. Bruce Fields	5ec094c109	nfsd4: extend state lock over seqid replay logic There are currently a couple races in the seqid replay code: a retransmission could come while we're still encoding the original reply, or a new seqid-mutating call could come as we're encoding a replay. So, extend the state lock over the encoding (both encoding of a replayed reply and caching of the original encoded reply). I really hate doing this, and previously added the stateowner reference-counting code to avoid it (which was insufficient)--but I don't see a less complicated alternative at the moment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 07:07:59 -04:00
J. Bruce Fields	9072d5c66b	nfsd4: cleanup seqid op stateowner usage Now that the replay owner is in the cstate we can remove it from a lot of other individual operations and further simplify nfs4_preprocess_seqid_op(). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:56:03 -04:00
J. Bruce Fields	f3e4223751	nfsd4: centralize handling of replay owners Set the stateowner associated with a replay in one spot in nfs4_preprocess_seqid_op() and keep it in cstate. This allows removing a few lines of boilerplate from all the nfs4_preprocess_seqid_op() callers. Also turn ENCODE_SEQID_OP_TAIL into a function while we're here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:56:02 -04:00
J. Bruce Fields	73997dc418	nfsd4: make delegation stateid's seqid start at 1 Thanks to Casey for reminding me that 5661 gives a special meaning to a value of 0 in the stateid's seqid field, so all stateid's should start out with si_generation 1. We were doing that in the open and lock cases for minorversion 1, but not for the delegation stateid, and not for openstateid's with v4.0. It doesn't really matter much for v4.0 or for delegation stateid's (which never get the seqid field incremented), but we may as well do the same for all of them. Reported-by: Casey Bodley <cbodley@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:56:01 -04:00
J. Bruce Fields	81b829655d	nfsd4: simplify stateid generation code, fix wraparound Follow the recommendation from rfc3530bis for stateid generation number wraparound, simplify some code, and fix or remove incorrect comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:56:00 -04:00
J. Bruce Fields	b79abaddfe	nfsd4: consolidate lock & open stateid tables There's no reason to have two separate hash tables for open and lock stateid's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:56:00 -04:00
J. Bruce Fields	5fa0bbb4ee	nfsd4: simplify distinguishing lock & open stateid's The trick free_stateid is using is a little cheesy, and we'll have more uses for this field later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:55:59 -04:00
J. Bruce Fields	c2d8eb7ac6	nfsd4: remove typoed replay field Wow, I wonder how long that typo's been there. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:55:58 -04:00
J. Bruce Fields	b7d7ca3580	nfsd4: fix off-by-one-error in SEQUENCE reply The values here represent highest slotid numbers. Since slotid's are numbered starting from zero, the highest should be one less than the number of slots. Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 17:55:57 -04:00
J. Bruce Fields	c152292f9e	nfsd: remove include/linux/nfsd/syscall.h We don't need this any more. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-31 11:50:11 -04:00
J. Bruce Fields	3cc9fda40a	nfsd4: remove redundant is_open_owner check When called with OPEN_STATE, preprocess_seqid_op only returns an open stateid, hence only an open owner. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:29 -04:00
J. Bruce Fields	b34f27aa5d	nfsd4: get lock checks out of preprocess_seqid_op We've got some lock-specific code here in nfs4_preprocess_seqid_op which is only used by nfsd4_lock(). Move it to the caller. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:28 -04:00
J. Bruce Fields	9afb978400	nfsd4: simplify lock openmode check Note that the special handling for the lock stateid case is already done by nfs4_check_openmode() (as of `0292191417` "nfsd4: fix openmode checking on IO using lock stateid") so we no longer need these two cases in the caller. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:27 -04:00
J. Bruce Fields	a9004abc34	nfsd4: cleanup and consolidate seqid_mutating_err Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:26 -04:00
J. Bruce Fields	28dde241cc	nfsd4: remove HAS_SESSION This flag doesn't really buy us anything. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:25 -04:00
J. Bruce Fields	ff194bd959	nfsd4: cleanup lock/stateowner initialization Share some common code, stop doing silly things like initializing a list head immediately before adding it to a list, etc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:24 -04:00
J. Bruce Fields	506f275fff	nfsd4: name openowner data structures more clearly These appear to be generic (for both open and lock owners), but they're actually just for open owners. This has confused me more than once. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:23 -04:00
J. Bruce Fields	ddc04c4163	nfsd4: replace some macros by functions For all the usual reasons. (Type safety, readability.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:22 -04:00
J. Bruce Fields	3e77246393	nfsd4: stop using nfserr_resource for transitory errors The server is returning nfserr_resource for both permanent errors and for errors (like allocation failures) that might be resolved by retrying later. Save nfserr_resource for the former and use delay/jukebox for the latter. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:21 -04:00
Boaz Harrosh	6577aac01f	nfsd4: fix failure to end nfsd4 grace period Even if we fail to write a recovery record, we should still mark the client as having acquired its first state. Otherwise we leave 4.1 clients with indefinite ERR_GRACE returns. However, an inability to write stable storage records may cause failures of reboot recovery, and the problem should still be brought to the server administrator's attention. So, make sure the error is logged. These errors shouldn't normally be triggered on a corectly functioning server--this isn't a case where a misconfigured client could spam the logs. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:21 -04:00
J. Bruce Fields	48483bf23a	nfsd4: simplify recovery dir setting Move around some of this code, simplify a bit. Reviewed-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:21:18 -04:00
J. Bruce Fields	8e82fa8fdc	nfsd: prettify NFSD_MAY_* flag definitions Acked-by: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:20:21 -04:00
J. Bruce Fields	a043226bc1	nfsd4: permit read opens of executable-only files A client that wants to execute a file must be able to read it. Read opens over nfs are therefore implicitly allowed for executable files even when those files are not readable. NFSv2/v3 get this right by using a passed-in NFSD_MAY_OWNER_OVERRIDE on read requests, but NFSv4 has gotten this wrong ever since `dc730e1737` "nfsd4: fix owner-override on open", when we realized that the file owner shouldn't override permissions on non-reclaim NFSv4 opens. So we can't use NFSD_MAY_OWNER_OVERRIDE to tell nfsd_permission to allow reads of executable files. So, do the same thing we do whenever we encounter another weird NFS permission nit: define yet another NFSD_MAY_* flag. The industry's future standardization on 128-bit processors will be motivated primarily by the need for integers with enough bits for all the NFSD_MAY_* flags. Reported-by: Leonardo Borda <leonardoborda@gmail.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-27 14:20:20 -04:00
J. Bruce Fields	c10bd39d80	Remove include/linux/nfsd/const.h Userspace shouldn't have a use for these constants. Nothing here is used outside fs/nfsd. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:52 -04:00
J. Bruce Fields	75c096f753	nfsd4: it's OK to return nfserr_symlink The nfsd4 code has a bunch of special exceptions for error returns which map nfserr_symlink to other errors. In fact, the spec makes it clear that nfserr_symlink is to be preferred over less specific errors where possible. The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink related error returns.", which claims that these special exceptions are represent an NFSv4 break from v2/v3 tradition--when in fact the symlink error was introduced with v4. I suspect what happened was pynfs tests were written that were overly faithful to the (known-incomplete) rfc3530 error return lists, and then code was fixed up mindlessly to make the tests pass. Delete these unnecessary exceptions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:50 -04:00
J. Bruce Fields	e281d81009	nfsd4: fix incorrect comment in nfsd4_set_nfs4_acl Zero means "I don't care what kind of file this is". And that's probably what we want--acls are also settable at least on directories, and if the filesystem doesn't want them on other objects, leave it to it to complain. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:49 -04:00
J. Bruce Fields	e10f9e1413	nfsd: clean up nfsd_mode_check() Add some more comments, simplify logic, do & S_IFMT just once, name "type" more helpfully. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:48 -04:00
J. Bruce Fields	7d818a7b8f	nfsd: open-code special directory-hardlink check We allow the fh_verify caller to specify that any object except those of a given type is allowed, by passing a negative type. But only one caller actually uses it. Open-code that check in the one caller. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:47 -04:00
J. Bruce Fields	3d2544b1e4	nfsd4: clean up S_IS -> NF4 file type mapping A slightly unconventional approach to make the code more compact I could live with, but let's give the poor reader some chance. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-26 18:22:47 -04:00
J. Bruce Fields	aadab6c6f4	nfsd4: return nfserr_symlink on v4 OPEN of non-regular file Without this, an attempt to open a device special file without first stat'ing it will fail. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-19 13:25:32 -04:00
J. Bruce Fields	576163005d	nfsd4: fix seqid_mutating_error The set of errors here does not agree with the set of errors specified in the rfc! While we're there, turn this macros into a function, for the usual reasons, and move it to the one place where it's actually used. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-19 13:25:31 -04:00
Bernd Schubert	832023bffb	nfsd4: Remove check for a 32-bit cookie in nfsd4_readdir() Fan Yong <yong.fan@whamcloud.com> noticed setting FMODE_32bithash wouldn't work with nfsd v4, as nfsd4_readdir() checks for 32 bit cookies. However, according to RFC 3530 cookies have a 64 bit type and cookies are also defined as u64 in 'struct nfsd4_readdir'. So remove the test for >32-bit values. Cc: stable@kernel.org Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-16 15:19:28 -04:00
Linus Torvalds	2dad3206db	Merge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux * 'for-3.1' of git://linux-nfs.org/~bfields/linux: nfsd: don't break lease on CLAIM_DELEGATE_CUR locks: rename lock-manager ops nfsd4: update nfsv4.1 implementation notes nfsd: turn on reply cache for NFSv4 nfsd4: call nfsd4_release_compoundargs from pc_release nfsd41: Deny new lock before RECLAIM_COMPLETE done fs: locks: remove init_once nfsd41: check the size of request nfsd41: error out when client sets maxreq_sz or maxresp_sz too small nfsd4: fix file leak on open_downgrade nfsd4: remember to put RW access on stateid destruction NFSD: Added TEST_STATEID operation NFSD: added FREE_STATEID operation svcrpc: fix list-corrupting race on nfsd shutdown rpc: allow autoloading of gss mechanisms svcauth_unix.c: quiet sparse noise svcsock.c: include sunrpc.h to quiet sparse noise nfsd: Remove deprecated nfsctl system call and related code. NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND Fix up trivial conflicts in Documentation/feature-removal-schedule.txt	2011-07-25 22:49:19 -07:00
Casey Bodley	0c12eaffdf	nfsd: don't break lease on CLAIM_DELEGATE_CUR CLAIM_DELEGATE_CUR is used in response to a broken lease; allowing it to break the lease and return EAGAIN leaves the client unable to make progress in returning the delegation nfs4_get_vfs_file() now takes struct nfsd4_open for access to the claim type, and calls nfsd_open() with NFSD_MAY_NOT_BREAK_LEASE when claim type is CLAIM_DELEGATE_CUR Signed-off-by: Casey Bodley <cbodley@citi.umich.edu> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-23 14:58:17 -04:00
J. Bruce Fields	8fb47a4fbf	locks: rename lock-manager ops Both the filesystem and the lock manager can associate operations with a lock. Confusingly, one of them (fl_release_private) actually has the same name in both operation structures. It would save some confusion to give the lock-manager ops different names. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-20 20:23:19 -04:00
Al Viro	5b4b299cc7	nfsd4_list_rec_dir(): don't bother with reopening rec_file just rewind it to the beginning before vfs_readdir() and be done with that... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:44:23 -04:00
J. Bruce Fields	1091006c5e	nfsd: turn on reply cache for NFSv4 It's sort of ridiculous that we've never had a working reply cache for NFSv4. On the other hand, we may still not: our current reply cache is likely not very good, especially in the TCP case (which is the only case that matters for v4). What we really need here is some serious testing. Anyway, here's a start. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-18 09:39:01 -04:00
J. Bruce Fields	3e98abffd1	nfsd4: call nfsd4_release_compoundargs from pc_release This simplifies cleanup a bit. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-18 09:38:02 -04:00
Mi Jinlong	ab1350b2b3	nfsd41: Deny new lock before RECLAIM_COMPLETE done Before nfs41 client's RECLAIM_COMPLETE done, nfs server should deny any new locks or opens. rfc5661: " Whenever a client establishes a new client ID and before it does the first non-reclaim operation that obtains a lock, it MUST send a RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no locks to reclaim. If non-reclaim locking operations are done before the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned. " Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 19:00:40 -04:00
Mi Jinlong	ae82a8d06f	nfsd41: check the size of request Check in SEQUENCE that the request doesn't exceed maxreq_sz for the given session. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 19:00:00 -04:00
Mi Jinlong	1b74c25bc1	nfsd41: error out when client sets maxreq_sz or maxresp_sz too small According to RFC5661, 18.36.3, "if the client selects a value for ca_maxresponsesize such that a replier on a channel could never send a response,the server SHOULD return NFS4ERR_TOOSMALL in the CREATE_SESSION reply." So, error out when the client sets a maxreq_sz less than the minimum possible SEQUENCE request size, or sets a maxresp_sz less than the minimum possible SEQUENCE reply size. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:51 -04:00
J. Bruce Fields	f197c27196	nfsd4: fix file leak on open_downgrade Stateid's hold a read reference for a read open, a write reference for a write open, and an additional one of each for each read+write open. The latter wasn't getting put on a downgrade, so something like: open RW open R downgrade to R was resulting in a file leak. Also fix an imbalance in an error path. Regression from `7d94784293` "nfsd4: fix downgrade/lock logic". Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:49 -04:00
J. Bruce Fields	499f3edc23	nfsd4: remember to put RW access on stateid destruction Without this, for example, open read open read+write close will result in a struct file leak. Regression from `7d94784293` "nfsd4: fix downgrade/lock logic". Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:49 -04:00
Bryan Schumaker	1745680454	NFSD: Added TEST_STATEID operation This operation is used by the client to check the validity of a list of stateids. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:48 -04:00
Bryan Schumaker	e1ca12dfb1	NFSD: added FREE_STATEID operation This operation is used by the client to tell the server to free a stateid. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:47 -04:00
NeilBrown	49b28684fd	nfsd: Remove deprecated nfsctl system call and related code. As promised in feature-removal-schedule.txt it is time to remove the nfsctl system call. Userspace has perferred to not use this call throughout 2.6 and it has been excluded in the default configuration since 2.6.36 (9 months ago). So this patch removes all the code that was being compiled out. There are still references to sys_nfsctl in various arch systemcall tables and related code. These should be cleaned out too, probably in the next merge window. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:42 -04:00
Benny Halevy	094b5d74f4	NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND DESTROY_CLIENTID MAY be preceded with a SEQUENCE operation as long as the client ID derived from the session ID of SEQUENCE is not the same as the client ID to be destroyed. If the client IDs are the same, then the server MUST return NFS4ERR_CLIENTID_BUSY. (that's not implemented yet) If DESTROY_CLIENTID is not prefixed by SEQUENCE, it MUST be the only operation in the COMPOUND request (otherwise, the server MUST return NFS4ERR_NOT_ONLY_OP). This fixes the error return; before, we returned NFS4ERR_OP_NOT_IN_SESSION; after this patch, we return NFS4ERR_NOTSUPP. Signed-off-by: Benny Halevy <benny@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:41 -04:00
J. Bruce Fields	105f462210	nfsd4: fix break_lease flags on nfsd open Thanks to Casey Bodley for pointing out that on a read open we pass 0, instead of O_RDONLY, to break_lease, with the result that a read open is treated like a write open for the purposes of lease breaking! Reported-by: Casey Bodley <cbodley@citi.umich.edu> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-06-20 10:38:01 -04:00
Casey Bodley	7d751f6f8c	nfsd: link returns nfserr_delay when breaking lease fix for commit `4795bb37ef`, nfsd: break lease on unlink, link, and rename if the LINK operation breaks a delegation, it returns NFS4ERR_NOENT (which is not a valid error in rfc 5661) instead of NFS4ERR_DELAY. the return value of nfsd_break_lease() in nfsd_link() must be converted from host_err to err Signed-off-by: Casey Bodley <cbodley@citi.umich.edu> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-06-06 18:46:56 -04:00
Randy Dunlap	be1f4084b4	nfsd: v4 support requires CRYPTO nfsd V4 support uses crypto interfaces, so select CRYPTO to fix build errors in 2.6.39: ERROR: "crypto_destroy_tfm" [fs/nfsd/nfsd.ko] undefined! ERROR: "crypto_alloc_base" [fs/nfsd/nfsd.ko] undefined! Reported-by: Wakko Warner <wakko@animx.eu.org> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-06-06 18:37:35 -04:00
J. Bruce Fields	b084f598df	nfsd: fix dependency of nfsd on auth_rpcgss Commit `b0b0c0a26e` "nfsd: add proc file listing kernel's gss_krb5 enctypes" added an nunnecessary dependency of nfsd on the auth_rpcgss module. It's a little ad hoc, but since the only piece of information nfsd needs from rpcsec_gss_krb5 is a single static string, one solution is just to share it with an include file. Cc: stable@kernel.org Reported-by: Michael Guntsche <mike@it-loops.com> Cc: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-06-06 15:07:15 -04:00
Linus Torvalds	a74d70b63f	Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux * 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd: make local functions static NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session() NFSD: Check status from nfsd4_map_bcts_dir() NFSD: Remove setting unused variable in nfsd_vfs_read() nfsd41: error out on repeated RECLAIM_COMPLETE nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence nfsd v4.1 lOCKT clientid field must be ignored nfsd41: add flag checking for create_session nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly nfsd4: fix wrongsec handling for PUTFH + op cases nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller nfsd4: introduce OPDESC helper nfsd4: allow fh_verify caller to skip pseudoflavor checks nfsd: distinguish functions of NFSD_MAY_* flags svcrpc: complete svsk processing on cb receive failure svcrpc: take advantage of tcp autotuning SUNRPC: Don't wait for full record to receive tcp data svcrpc: copy cb reply instead of pages svcrpc: close connection if client sends short packet svcrpc: note network-order types in svc_process_calldir ...	2011-05-29 11:21:12 -07:00
Daniel Mack	c47d832bc0	nfsd: make local functions static This also fixes a number of sparse warnings. Signed-off-by: Daniel Mack <zonque@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-05-18 15:28:31 -04:00
Justin P. Mattock	70f23fd66b	treewide: fix a few typos in comments - kenrel -> kernel - whetehr -> whether - ttt -> tt - sss -> ss Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-05-10 10:16:21 +02:00
Bryan Schumaker	6ce2357f1e	NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session() Compiling gave me this warning: fs/nfsd/nfs4xdr.c: In function ‘nfsd4_decode_bind_conn_to_session’: fs/nfsd/nfs4xdr.c:427:6: warning: variable ‘dummy’ set but not used [-Wunused-but-set-variable] The local variable "dummy" wasn't being used past the READ32() macro that set it. READ_BUF() should ensure that the xdr buffer is pushed past the data read into dummy already, so nothing needs to be read in. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> [bfields@redhat.com: minor comment fixup.] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:59 -04:00
Bryan Schumaker	1db2b9dde3	NFSD: Check status from nfsd4_map_bcts_dir() Compiling gave me this warning: fs/nfsd/nfs4state.c: In function ‘nfsd4_bind_conn_to_session’: fs/nfsd/nfs4state.c:1623:9: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] The local variable "status" was being set by nfsd4_map_bcts_dir() and then ignored before calling nfsd4_new_conn(). Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:58 -04:00
Bryan Schumaker	fccb13c947	NFSD: Remove setting unused variable in nfsd_vfs_read() Compiling gave me this warning: fs/nfsd/vfs.c: In function ‘nfsd_vfs_read’: fs/nfsd/vfs.c:880:16: warning: variable ‘inode’ set but not used [-Wunused-but-set-variable] I discovered that a local variable "inode" was being set towards the beginning of nfsd_vfs_read() and then ignored for the rest of the function. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:57 -04:00
Mi Jinlong	bcecf1ccc3	nfsd41: error out on repeated RECLAIM_COMPLETE Servers are supposed to return nfserr_complete_already to clients that attempt to send multiple RECLAIM_COMPLETEs. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:56 -04:00
Mi Jinlong	868b89c3dc	nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence Make sure nfs server errors out if request contains more ops than channel allows. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> [bfields@redhat.com: use helper function] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:55 -04:00
Andy Adamson	b7c66360dc	nfsd v4.1 lOCKT clientid field must be ignored RFC 5661 Section 18.11.3 The clientid field of the owner MAY be set to any value by the client and MUST be ignored by the server. The reason the server MUST ignore the clientid field is that the server MUST derive the client ID from the session ID from the SEQUENCE operation of the COMPOUND request. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:54 -04:00
Mi Jinlong	a62573dc35	nfsd41: add flag checking for create_session Teach the NFS server to reject invalid create_session flags. Also do some minor formatting adjustments. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:53 -04:00
Mi Jinlong	ac6721a13e	nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly The NFS server uses nfsd_create_v3 to handle EXCLUSIVE4_1 opens, but that function is not prepared to handle them. Rename nfsd_create_v3() to do_nfsd_create(), and add handling of EXCLUSIVE4_1. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:52 -04:00
J. Bruce Fields	68d9318435	nfsd4: fix wrongsec handling for PUTFH + op cases When PUTFH is followed by an operation that uses the filehandle, and when the current client is using a security flavor that is inconsistent with the given filehandle, we have a choice: we can return WRONGSEC either when the current filehandle is set using the PUTFH, or when the filehandle is first used by the following operation. Follow the recommendations of RFC 5661 in making this choice. (Our current behavior prevented the client from doing security negotiation by returning WRONGSEC on PUTFH+SECINFO_NO_NAME.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-29 20:47:51 -04:00
Sachin Prabhu	1574dff899	Open with O_CREAT flag set fails to open existing files on non writable directories An open on a NFS4 share using the O_CREAT flag on an existing file for which we have permissions to open but contained in a directory with no write permissions will fail with EACCES. A tcpdump shows that the client had set the open mode to UNCHECKED which indicates that the file should be created if it doesn't exist and encountering an existing flag is not an error. Since in this case the file exists and can be opened by the user, the NFS server is wrong in attempting to check create permissions on the parent directory. The patch adds a conditional statement to check for create permissions only if the file doesn't exist. Signed-off-by: Sachin S. Prabhu <sprabhu@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-20 11:03:01 -04:00
OGAWA Hirofumi	a96e5b9080	nfsd4: Fix filp leak `23fcf2ec93` (nfsd4: fix oops on lock failure) The above patch breaks free path for stp->st_file. If stp was inserted into sop->so_stateids, we have to free stp->st_file refcount. Because stp->st_file refcount itself is taken whether or not any refcounts are taken on the stp->st_file->fi_fds[]. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-19 17:31:13 -04:00
J. Bruce Fields	4ee63624fd	nfsd4: fix struct file leak on delegation Introduced by `acfdf5c383`. Cc: stable@kernel.org Reported-by: Gerhard Heift <ml-nfs-linux-20110412-ef47@gheift.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-18 13:30:56 -04:00
Linus Torvalds	18770c7c3a	Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux * 'for-2.6.39' of git://linux-nfs.org/~bfields/linux: nfsd4: fix oops on lock failure nfsd: fix auth_domain reference leak on nlm operations	2011-04-11 15:45:17 -07:00
J. Bruce Fields	29a78a3ed7	nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller The secinfo caller actually won't want this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-11 08:42:22 -04:00
J. Bruce Fields	22b0321496	nfsd4: introduce OPDESC helper Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-11 08:42:21 -04:00
J. Bruce Fields	204f4ce754	nfsd4: allow fh_verify caller to skip pseudoflavor checks Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-11 08:42:20 -04:00
J. Bruce Fields	aea93397db	nfsd: distinguish functions of NFSD_MAY_* flags Most of the NFSD_MAY_* flags actually request permissions, but over the years we've accreted a few that modify the behavior of the permission or open code in other ways. Distinguish the two cases a little more. In particular, allow the shortcut at the start of nfsd_permission to ignore the non-permission-requesting bits. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-11 08:42:03 -04:00
J. Bruce Fields	23fcf2ec93	nfsd4: fix oops on lock failure Lock stateid's can have access_bmap 0 if they were only partially initialized (due to a failed lock request); handle that case in free_generic_stateid. ------------[ cut here ]------------ kernel BUG at fs/nfsd/nfs4state.c:380! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/kernel/mm/ksm/run Modules linked in: nfs fscache md4 nls_utf8 cifs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc nfsd lockd nfs_acl auth_rpcgss sunrpc ipv6 ppdev parport_pc parport pcnet32 mii pcspkr microcode i2c_piix4 BusLogic floppy [last unloaded: mperf] Pid: 1468, comm: nfsd Not tainted 2.6.38+ #120 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform EIP: 0060:[<e24f180d>] EFLAGS: 00010297 CPU: 0 EIP is at nfs4_access_to_omode+0x1c/0x29 [nfsd] EAX: ffffffff EBX: dd758120 ECX: 00000000 EDX: 00000004 ESI: dd758120 EDI: ddfe657c EBP: dd54dde0 ESP: dd54dde0 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process nfsd (pid: 1468, ti=dd54c000 task=ddc92580 task.ti=dd54c000) Stack: dd54ddf0 e24f19ca 00000000 ddfe6560 dd54de08 e24f1a5d dd758130 deee3a20 ddfe6560 31270000 dd54df1c e24f52fd 0000000f dd758090 e2505dd0 0be304cf dbb51d68 0000000e ddfe657c ddcd8020 dd758130 dd758128 dd7580d8 dd54de68 Call Trace: [<e24f19ca>] free_generic_stateid+0x1c/0x3e [nfsd] [<e24f1a5d>] release_lockowner+0x71/0x8a [nfsd] [<e24f52fd>] nfsd4_lock+0x617/0x66c [nfsd] [<e24e57b6>] ? nfsd_setuser+0x199/0x1bb [nfsd] [<e24e056c>] ? nfsd_setuser_and_check_port+0x65/0x81 [nfsd] [<c07a0052>] ? _cond_resched+0x8/0x1c [<c04ca61f>] ? slab_pre_alloc_hook.clone.33+0x23/0x27 [<c04cac01>] ? kmem_cache_alloc+0x1a/0xd2 [<c04835a0>] ? __call_rcu+0xd7/0xdd [<e24e0dfb>] ? fh_verify+0x401/0x452 [nfsd] [<e24f0b61>] ? nfsd4_encode_operation+0x52/0x117 [nfsd] [<e24ea0d7>] ? nfsd4_putfh+0x33/0x3b [nfsd] [<e24f4ce6>] ? nfsd4_delegreturn+0xd4/0xd4 [nfsd] [<e24ea2c9>] nfsd4_proc_compound+0x1ea/0x33e [nfsd] [<e24de6ee>] nfsd_dispatch+0xd1/0x1a5 [nfsd] [<e1d6e1c7>] svc_process_common+0x282/0x46f [sunrpc] [<e1d6e578>] svc_process+0xdc/0xfa [sunrpc] [<e24de0fa>] nfsd+0xd6/0x115 [nfsd] [<e24de024>] ? nfsd_shutdown+0x24/0x24 [nfsd] [<c0454322>] kthread+0x62/0x67 [<c04542c0>] ? kthread_worker_fn+0x114/0x114 [<c07a6ebe>] kernel_thread_helper+0x6/0x10 Code: eb 05 b8 00 00 27 4f 8d 65 f4 5b 5e 5f 5d c3 83 e0 03 55 83 f8 02 89 e5 74 17 83 f8 03 74 05 48 75 09 eb 09 b8 02 00 00 00 eb 0b <0f> 0b 31 c0 eb 05 b8 01 00 00 00 5d c3 55 89 e5 57 56 89 d6 8d EIP: [<e24f180d>] nfs4_access_to_omode+0x1c/0x29 [nfsd] SS:ESP 0068:dd54dde0 ---[ end trace 2b0bf6c6557cb284 ]--- The trace route is: -> nfsd4_lock() -> if (lock->lk_is_new) { -> alloc_init_lock_stateid() 3739: stp->st_access_bmap = 0; ->if (status && lock->lk_is_new && lock_sop) -> release_lockowner() -> free_generic_stateid() -> nfs4_access_bmap_to_omode() -> nfs4_access_to_omode() 380: BUG(); ***** This problem was introduced by `0997b17360`. Reported-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Tested-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-10 12:21:27 -04:00
J. Bruce Fields	d6c558379a	Merge branch 'for-2.6.39' into for-2.6.40	2011-04-07 15:19:21 -04:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
J. Bruce Fields	954032d252	nfsd: fix auth_domain reference leak on nlm operations This was noticed by users who performed more than 2^32 lock operations and hence made this counter overflow (eventually leading to use-after-free's). Setting rq_client to NULL here means that it won't later get auth_domain_put() when it should be. Appears to have been introduced in 2.5.42 by "[PATCH] kNFSd: Move auth domain lookup into svcauth" which moved most of the rq_client handling to common svcauth code, but left behind this one line. Cc: Neil Brown <neilb@suse.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-24 23:11:27 -04:00
Linus Torvalds	dc87c55120	Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux * 'for-2.6.39' of git://linux-nfs.org/~bfields/linux: SUNRPC: Remove resource leak in svc_rdma_send_error() nfsd: wrong index used in inner loop nfsd4: fix comment and remove unused nfsd4_file fields nfs41: make sure nfs server return right ca_maxresponsesize_cached nfsd: fix compile error svcrpc: fix bad argument in unix_domain_find nfsd4: fix struct file leak nfsd4: minor nfs4state.c reshuffling svcrpc: fix rare race on unix_domain creation nfsd41: modify the members value of nfsd4_op_flags nfsd: add proc file listing kernel's gss_krb5 enctypes gss:krb5 only include enctype numbers in gm_upcall_enctypes NFSD, VFS: Remove dead code in nfsd_rename() nfsd: kill unused macro definition locks: use assign_type()	2011-03-24 08:20:39 -07:00
Al Viro	7cc90cc3ff	don't pass 'mounting_here' flag to follow_down() it's always false now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-18 09:04:20 -04:00
Mi Jinlong	5a02ab7c3c	nfsd: wrong index used in inner loop We must not use dummy for index. After the first index, READ32(dummy) will change dummy!!!! Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> [bfields@redhat.com: Trond points out READ_BUF alone is sufficient.] Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-17 13:09:19 -04:00
J. Bruce Fields	cf507b6f8e	Merge create_session decoding fix into for-2.6.39 This needs a further fixup!	2011-03-17 13:07:25 -04:00
J. Bruce Fields	9ae78bcc00	nfsd4: fix comment and remove unused nfsd4_file fields A couple fields here were left over from a previous version of a patch, and are no longer used. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-17 12:52:33 -04:00
Mi Jinlong	d2b217439f	nfs41: make sure nfs server return right ca_maxresponsesize_cached According to rfc5661, ca_maxresponsesize_cached: Like ca_maxresponsesize, but the maximum size of a reply that will be stored in the reply cache (Section 2.10.6.1). For each channel, the server MAY decrease this value, but MUST NOT increase it. the latest kernel(2.6.38-rc8) may increase the value for ignoring request's ca_maxresponsesize_cached value. We should not ignore it. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-16 11:10:22 -04:00
J. Bruce Fields	0a5e5f122c	nfsd: fix compile error "fs/built-in.o: In function `supported_enctypes_show': nfsctl.c:(.text+0x7beb0): undefined reference to `gss_mech_get_by_name' nfsctl.c:(.text+0x7bebc): undefined reference to `gss_mech_put' " Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-14 20:57:44 -04:00
roel	3ec07aa952	nfsd: wrong index used in inner loop Index i was already used in the outer loop Cc: stable@kernel.org Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-08 19:46:10 -05:00
J. Bruce Fields	0997b17360	nfsd4: fix struct file leak Make sure we properly reference count the struct files that a lock depends on, and release them when the lock stateid is released. This fixes a major leak of struct files when using locking over nfsv4. Cc: stable@kernel.org Reported-by: Rick Koshi <nfs-bug-report@more-right-rudder.com> Tested-by: Ivo Přikryl <prikryl@eurosat.cz> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-08 19:38:27 -05:00
J. Bruce Fields	529d7b2a7f	nfsd4: minor nfs4state.c reshuffling Minor cleanup in preparation for a bugfix--moving some code to avoid forward references, etc. No change in functionality. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-08 19:38:15 -05:00
Mi Jinlong	5ece3cafbd	nfsd41: modify the members value of nfsd4_op_flags The members of nfsd4_op_flags, (ALLOWED_WITHOUT_FH \| ALLOWED_ON_ABSENT_FS) equals to ALLOWED_AS_FIRST_OP, maybe that's not what we want. OP_PUTROOTFH with op_flags = ALLOWED_WITHOUT_FH \| ALLOWED_ON_ABSENT_FS, can't appears as the first operation with out SEQUENCE ops. This patch modify the wrong value of ALLOWED_WITHOUT_FH etc which was introduced by `f9bb94c4`. Cc: stable@kernel.org Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-07 12:10:33 -05:00
Kevin Coffman	b0b0c0a26e	nfsd: add proc file listing kernel's gss_krb5 enctypes Add a new proc file which lists the encryption types supported by the kernel's gss_krb5 code. Newer MIT Kerberos libraries support the assertion of acceptor subkeys. This enctype information allows user-land (svcgssd) to request that the Kerberos libraries limit the encryption types that it uses when generating the subkeys. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-07 12:06:48 -05:00
Jesper Juhl	46d4cef9cf	NFSD, VFS: Remove dead code in nfsd_rename() Currently we have the following code in fs/nfsd/vfs.c::nfsd_rename() : ... host_err = nfsd_break_lease(odentry->d_inode); if (host_err) goto out_drop_write; if (ndentry->d_inode) { host_err = nfsd_break_lease(ndentry->d_inode); if (host_err) goto out_drop_write; } if (host_err) goto out_drop_write; ... 'host_err' is guaranteed to be 0 by the time we test 'ndentry->d_inode'. If 'host_err' becomes != 0 inside the 'if' statement, then we goto 'out_drop_write'. So, after the 'if' statement there is no way that 'host_err' can be anything but 0, so the test afterwards is just dead code. This patch removes the dead code. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-07 12:05:14 -05:00
Shan Wei	35079582e7	nfsd: kill unused macro definition These macros had never been used for several years. So, remove them. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-07 12:05:09 -05:00
J. Bruce Fields	32b007b4e1	nfsd4: fix bad pointer on failure to find delegation In case of a nonempty list, the return on error here is obviously bogus; it ends up being a pointer to the list head instead of to any valid delegation on the list. In particular, if nfsd4_delegreturn() hits this case, and you're quite unlucky, then renew_client may oops, and it may take an embarassingly long time to figure out why. Facepalm. BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 IP: [<ffffffff81292965>] nfsd4_delegreturn+0x125/0x200 ... Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-03-07 11:44:53 -05:00
Benny Halevy	2c9c8f36c3	NFSD: fix decode_cb_sequence4resok Fix bug introduced in patch `85a56480` NFSD: Update XDR decoders in NFSv4 callback client Although decode_cb_sequence4resok ignores highest slotid and target highest slotid it must account for their space in their xdr stream when calling xdr_inline_decode Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-22 15:55:09 -08:00
NeilBrown	47c85291d3	nfsd: correctly handle return value from nfsd_map_name_to_* These functions return an nfs status, not a host_err. So don't try to convert before returning. This is a regression introduced by 3c726023402a2f3b28f49b9d90ebf9e71151157d; I fixed up two of the callers, but missed these two. Cc: stable@kernel.org Reported-by: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-16 18:31:05 -05:00
J. Bruce Fields	83f6b0c182	nfsd: break lease on unlink due to rename `4795bb37ef` "nfsd: break lease on unlink, link, and rename", only broke the lease on the file that was being renamed, and didn't handle the case where the target path refers to an already-existing file that will be unlinked by a rename--in that case the target file should have any leases broken as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:19 -05:00
J. Bruce Fields	acfdf5c383	nfsd4: acquire only one lease per file Instead of acquiring one lease each time another client opens a file, nfsd can acquire just one lease to represent all of them, and reference count it to determine when to release it. This fixes a regression introduced by `c45821d263` "locks: eliminate fl_mylease callback": after that patch, only the struct file * is used to determine who owns a given lease. But since we recently converted the server to share a single struct file per open, if we acquire multiple leases on the same file from nfsd, it then becomes impossible on unlocking a lease to determine which of those leases (all of whom share the same struct file *) we meant to remove. Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous version of this patch. Tested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:19 -05:00
J. Bruce Fields	5d926e8c2f	nfsd4: modify fi_delegations under recall_lock Modify fi_delegations only under the recall_lock, allowing us to use that list on lease breaks. Also some trivial cleanup to simplify later changes. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:19 -05:00
J. Bruce Fields	65bc58f518	nfsd4: remove unused deleg dprintk's. These aren't all that useful, and get in the way of the next steps. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:19 -05:00
J. Bruce Fields	edab9782b5	nfsd4: split lease setting into separate function Splitting some code into a separate function which we'll be adding some more to. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
J. Bruce Fields	dd239cc05f	nfsd4: fix leak on allocation error Also share some common exit code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
J. Bruce Fields	22d38c4c10	nfsd4: add helper function for lease setup Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
J. Bruce Fields	6b57d9c86d	nfsd4: split up nfsd_break_deleg_cb We'll be adding some more code here soon. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
Konstantin Khorenko	3aa6e0aa8a	NFSD: memory corruption due to writing beyond the stat array If nfsd fails to find an exported via NFS file in the readahead cache, it should increment corresponding nfsdstats counter (ra_depth[10]), but due to a bug it may instead write to ra_depth[11], corrupting the following field. In a kernel with NFSDv4 compiled in the corruption takes the form of an increment of a counter of the number of NFSv4 operation 0's received; since there is no operation 0, this is harmless. In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the memory beyond nfsdstats. Signed-off-by: Konstantin Khorenko <khorenko@openvz.org> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
Benny Halevy	0af3f814cc	NFSD: use nfserr for status after decode_cb_op_status Bugs introduced in `85a5648019` "NFSD: Update XDR decoders in NFSv4 callback client" Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
J. Bruce Fields	541ce98c10	nfsd: don't leak dentry count on mnt_want_write failure The exit cleanup isn't quite right here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:31:08 -05:00
Linus Torvalds	f8206b925f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (23 commits) sanitize vfsmount refcounting changes fix old umount_tree() breakage autofs4: Merge the remaining dentry ops tables Unexport do_add_mount() and add in follow_automount(), not ->d_automount() Allow d_manage() to be used in RCU-walk mode Remove a further kludge from __do_follow_link() autofs4: Bump version autofs4: Add v4 pseudo direct mount support autofs4: Fix wait validation autofs4: Clean up autofs4_free_ino() autofs4: Clean up dentry operations autofs4: Clean up inode operations autofs4: Remove unused code autofs4: Add d_manage() dentry operation autofs4: Add d_automount() dentry operation Remove the automount through follow_link() kludge code from pathwalk CIFS: Use d_automount() rather than abusing follow_link() NFS: Use d_automount() rather than abusing follow_link() AFS: Use d_automount() rather than abusing follow_link() Add an AT_NO_AUTOMOUNT flag to suppress terminal automount ...	2011-01-16 11:31:50 -08:00
David Howells	cc53ce53c8	Add a dentry op to allow processes to be held during pathwalk transit Add a dentry op (d_manage) to permit a filesystem to hold a process and make it sleep when it tries to transit away from one of that filesystem's directories during a pathwalk. The operation is keyed off a new dentry flag (DCACHE_MANAGE_TRANSIT). The filesystem is allowed to be selective about which processes it holds and which it permits to continue on or prohibits from transiting from each flagged directory. This will allow autofs to hold up client processes whilst letting its userspace daemon through to maintain the directory or the stuff behind it or mounted upon it. The ->d_manage() dentry operation: int (d_manage)(struct path path, bool mounting_here); takes a pointer to the directory about to be transited away from and a flag indicating whether the transit is undertaken by do_add_mount() or do_move_mount() skipping through a pile of filesystems mounted on a mountpoint. It should return 0 if successful and to let the process continue on its way; -EISDIR to prohibit the caller from skipping to overmounted filesystems or automounting, and to use this directory; or some other error code to return to the user. ->d_manage() is called with namespace_sem writelocked if mounting_here is true and no other locks held, so it may sleep. However, if mounting_here is true, it may not initiate or wait for a mount or unmount upon the parameter directory, even if the act is actually performed by userspace. Within fs/namei.c, follow_managed() is extended to check with d_manage() first on each managed directory, before transiting away from it or attempting to automount upon it. follow_down() is renamed follow_down_one() and should only be used where the filesystem deliberately intends to avoid management steps (e.g. autofs). A new follow_down() is added that incorporates the loop done by all other callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS and CIFS do use it, their use is removed by converting them to use d_automount()). The new follow_down() calls d_manage() as appropriate. It also takes an extra parameter to indicate if it is being called from mount code (with namespace_sem writelocked) which it passes to d_manage(). follow_down() ignores automount points so that it can be used to mount on them. __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have that determine whether to abort or not itself. That would allow the autofs daemon to continue on in rcu-walk mode. Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't required as every tranist from that directory will cause d_manage() to be invoked. It can always be set again when necessary. ========================== WHAT THIS MEANS FOR AUTOFS ========================== Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to trigger the automounting of indirect mounts, and both of these can be called with i_mutex held. autofs knows that the i_mutex will be held by the caller in lookup(), and so can drop it before invoking the daemon - but this isn't so for d_revalidate(), since the lock is only held on _some_ of the code paths that call it. This means that autofs can't risk dropping i_mutex from its d_revalidate() function before it calls the daemon. The bug could manifest itself as, for example, a process that's trying to validate an automount dentry that gets made to wait because that dentry is expired and needs cleaning up: mkdir S ffffffff8014e05a 0 32580 24956 Call Trace: [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f [<ffffffff80057a2f>] lookup_create+0x46/0x80 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4 versus the automount daemon which wants to remove that dentry, but can't because the normal process is holding the i_mutex lock: automount D ffffffff8014e05a 0 32581 1 32561 Call Trace: [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14 [<ffffffff800e6d55>] do_rmdir+0x77/0xde [<ffffffff8005d229>] tracesys+0x71/0xe0 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 which means that the system is deadlocked. This patch allows autofs to hold up normal processes whilst the daemon goes ahead and does things to the dentry tree behind the automouter point without risking a deadlock as almost no locks are held in d_manage() and none in d_automount(). Signed-off-by: David Howells <dhowells@redhat.com> Was-Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:31 -05:00
Linus Torvalds	18bce371ae	Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits) nfsd4: fix callback restarting nfsd: break lease on unlink, link, and rename nfsd4: break lease on nfsd setattr nfsd: don't support msnfs export option nfsd4: initialize cb_per_client nfsd4: allow restarting callbacks nfsd4: simplify nfsd4_cb_prepare nfsd4: give out delegations more quickly in 4.1 case nfsd4: add helper function to run callbacks nfsd4: make sure sequence flags are set after destroy_session nfsd4: re-probe callback on connection loss nfsd4: set sequence flag when backchannel is down nfsd4: keep finer-grained callback status rpc: allow xprt_class->setup to return a preexisting xprt rpc: keep backchannel xprt as long as server connection rpc: move sk_bc_xprt to svc_xprt nfsd4: allow backchannel recovery nfsd4: support BIND_CONN_TO_SESSION nfsd4: modify session list under cl_lock Documentation: fl_mylease no longer exists ... Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The vfs-scale work touched some msnfs cases, and this merge removes support for that entirely, so the conflict was trivial to resolve.	2011-01-14 13:17:26 -08:00
J. Bruce Fields	a8f2800b4f	nfsd4: fix callback restarting Ensure a new callback is added to the client's list of callbacks at most once. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-14 14:51:31 -05:00
J. Bruce Fields	4795bb37ef	nfsd: break lease on unlink, link, and rename Any change to any of the links pointing to an entry should also break delegations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:09 -05:00
J. Bruce Fields	6a76bebefe	nfsd4: break lease on nfsd setattr Leases (delegations) should really be broken on any metadata change, not just on size change. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:08 -05:00
J. Bruce Fields	9ce137eee4	nfsd: don't support msnfs export option We've long had these pointless #ifdef MSNFS's sprinkled throughout the code--pointless because MSNFS is always defined (and we give no config option to make that easy to change). So we could just remove the ifdef's and compile the resulting code unconditionally. But as long as we're there: why not just rip out this code entirely? The only purpose is to implement the "msnfs" export option which turns on Windows-like behavior in some cases, and: - the export option isn't documented anywhere; - the userland utilities (which would need to be able to parse "msnfs" in an export file) don't support it; - I don't know how to maintain this, as I don't know what the proper behavior is; and - google shows no evidence that anyone has ever used this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:07 -05:00
J. Bruce Fields	9ee1ba5402	nfsd4: initialize cb_per_client Otherwise a callback that is aborted before it runs will result in a list_del on an uninitialized list head. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:06 -05:00
Linus Torvalds	275220f0fc	Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits) block: ensure that completion error gets properly traced blktrace: add missing probe argument to block_bio_complete block cfq: don't use atomic_t for cfq_group block cfq: don't use atomic_t for cfq_queue block: trace event block fix unassigned field block: add internal hd part table references block: fix accounting bug on cross partition merges kref: add kref_test_and_get bio-integrity: mark kintegrityd_wq highpri and CPU intensive block: make kblockd_workqueue smarter Revert "sd: implement sd_check_events()" block: Clean up exit_io_context() source code. Fix compile warnings due to missing removal of a 'ret' variable fs/block: type signature of major_to_index(int) to major_to_index(unsigned) block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p) cfq-iosched: don't check cfqg in choose_service_tree() fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors cdrom: export cdrom_check_events() sd: implement sd_check_events() sr: implement sr_check_events() ...	2011-01-13 10:45:01 -08:00
Linus Torvalds	b9d919a4ac	Merge branch 'nfs-for-2.6.38' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.38' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (89 commits) NFS fix the setting of exchange id flag NFS: Don't use vm_map_ram() in readdir NFSv4: Ensure continued open and lockowner name uniqueness NFS: Move cl_delegations to the nfs_server struct NFS: Introduce nfs_detach_delegations() NFS: Move cl_state_owners and related fields to the nfs_server struct NFS: Allow walking nfs_client.cl_superblocks list outside client.c pnfs: layout roc code pnfs: update nfs4_callback_recallany to handle layouts pnfs: add CB_LAYOUTRECALL handling pnfs: CB_LAYOUTRECALL xdr code pnfs: change lo refcounting to atomic_t pnfs: check that partial LAYOUTGET return is ignored pnfs: add layout to client list before sending rpc pnfs: serialize LAYOUTGET(openstateid) pnfs: layoutget rpc code cleanup pnfs: change how lsegs are removed from layout list pnfs: change layout state seqlock to a spinlock pnfs: add prefix to struct pnfs_layout_hdr fields pnfs: add prefix to struct pnfs_layout_segment fields ...	2011-01-11 15:11:56 -08:00
J. Bruce Fields	5ce8ba25d6	nfsd4: allow restarting callbacks If we lose the backchannel and then the client repairs the problem, resend any callbacks. We use a new cb_done flag to track whether there is still work to be done for the callback or whether it can be destroyed with the rpc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	3ff3600e7e	nfsd4: simplify nfsd4_cb_prepare Remove handling for a nonexistant case (status && !-EAGAIN). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	14a24e99f4	nfsd4: give out delegations more quickly in 4.1 case Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	229b2a0839	nfsd4: add helper function to run callbacks Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	84f5f7ccc5	nfsd4: make sure sequence flags are set after destroy_session If this loses any backchannel, make sure we have a chance to notice that and set the sequence flags. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	eea4980660	nfsd4: re-probe callback on connection loss This makes sure we set the sequence flag when necessary. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	0d7bb71907	nfsd4: set sequence flag when backchannel is down Implement the SEQ4_STATUS_CB_PATH_DOWN flag. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	77a3569d6c	nfsd4: keep finer-grained callback status Distinguish between when the callback channel is known to be down, and when it is not yet confirmed. This will be useful in the 4.1 case. Also, we don't seem to be using the fact that this field is atomic. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	dcbeaa68db	nfsd4: allow backchannel recovery Now that we have a list of connections to choose from, we can teach the callback code to just pick a suitable connection and use that, instead of insisting on forever using the connection that the first create_session was sent with. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	1d1bc8f207	nfsd4: support BIND_CONN_TO_SESSION Basic xdr and processing for BIND_CONN_TO_SESSION. This adds a connection to the list of connections associated with a session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:09 -05:00
J. Bruce Fields	4c6493785a	nfsd4: modify session list under cl_lock We want to traverse this from the callback code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:09 -05:00
Linus Torvalds	23d69b09b7	Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.	2011-01-07 16:58:04 -08:00
Nick Piggin	b7ab39f631	fs: dcache scale dentry refcount Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Takuma Umeya	6f3d772fb8	nfs4: set source address when callback is generated when callback is generated in NFSv4 server, it doesn't set the source address. When an alias IP is utilized on NFSv4 server and suppose the client is accessing via that alias IP (e.g. eth0:0), the client invokes the callback to the IP address that is set on the original device (e.g. eth0). This behavior results in timeout of xprt. The patch sets the IP address that the client should invoke callback to. Signed-off-by: Takuma Umeya <tumeya@redhat.com> [bfields@redhat.com: Simplify gen_callback arguments, use helper function] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 19:43:01 -05:00
J. Bruce Fields	3c72602340	nfsd4: return nfs errno from name_to_id functions This avoids the need for the confusing ESRCH mapping. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:11 -05:00
J. Bruce Fields	775a1905e1	nfsd4: remove outdated pathname-comments Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:10 -05:00
J. Bruce Fields	2ca72e17e5	nfsd4: move idmap and acl header files into fs/nfsd These are internal nfsd interfaces. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:09 -05:00
J. Bruce Fields	f6af99ec1b	nfsd4: name->id mapping should fail with BADOWNER not BADNAME According to rfc 3530 BADNAME is for strings that represent paths; BADOWNER is for user/group names that don't map. And the too-long name should probably be BADOWNER as well; it's effectively the same as if we couldn't map it. Cc: stable@kernel.org Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:21:36 -05:00
J. Bruce Fields	c45821d263	locks: eliminate fl_mylease callback The nfs server only supports read delegations for now, so we don't care how conflicts are determined. All we care is that unlocks are recognized as matching the leases they are meant to remove. After the last patch, a comparison of struct files will work for that purpose. So we no longer need this callback. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:28 -05:00
J. Bruce Fields	c84d500bc4	nfsd4: use a single struct file for delegations When we converted to sharing struct filess between nfs4 opens I went too far and also used the same mechanism for delegations. But keeping a reference to the struct file ensures it will outlast the lease, and allows us to remove the lease with the same file as we added it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:27 -05:00
J. Bruce Fields	e63eb93750	nfsd4: eliminate lease delete callback nfsd controls the lifetime of the lease, not the lock code, so there's no need for this callback on lease destruction. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:26 -05:00
J. Bruce Fields	da165dd60e	nfsd: remove some unnecessary dropit handling We no longer need a few of these special cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:23 -05:00
J. Bruce Fields	062304a815	nfsd: stop translating EAGAIN to nfserr_dropit We no longer need this. Also, EWOULDBLOCK is generally a synonym for EAGAIN, but that may not be true on all architectures, so map it as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:23 -05:00
J. Bruce Fields	9e701c6109	svcrpc: simpler request dropping Currently we use -EAGAIN returns to determine when to drop a deferred request. On its own, that is error-prone, as it makes us treat -EAGAIN returns from other functions specially to prevent inadvertent dropping. So, use a flag on the request instead. Returning an error on request deferral is still required, to prevent further processing, but we no longer need worry that an error return on its own could result in a drop. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:22 -05:00
J. Bruce Fields	3beb6cd1d4	nfsd: don't drop requests on -ENOMEM We never want to drop a request if we could return a JUKEBOX/DELAY error instead; so, convert to nfserr_jukebox and let nfsd_dispatch() convert that to a dropit error as a last resort if JUKEBOX/DELAY is unavailable (as in the NFSv2 case). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:20 -05:00
Kirill A. Shutemov	65e4c89455	nfsd: declare several functions of nfs4callback as static setup_callback_client(), nfsd4_release_cb() and nfsd4_process_cb_update() do not have users outside the translation unit. Let's declare it as static. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:19 -05:00
Mi Jinlong	22b6dee842	nfsd4: fix oops on secinfo_no_name result encoding The secinfo_no_name code oopses on encoding with BUG: unable to handle kernel NULL pointer dereference at 00000044 IP: [<e2bd239a>] nfsd4_encode_secinfo+0x1c/0x1c1 [nfsd] We should implement a nfsd4_encode_secinfo_no_name() instead using nfsd4_encode_secinfo(). Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-29 11:54:06 -07:00
Jens Axboe	3603b8eacc	Fix compile warnings due to missing removal of a 'ret' variable Commit `a8adbe3` forgot to remove the return variable, kill it. drivers/block/loop.c: In function 'lo_splice_actor': drivers/block/loop.c:398: warning: unused variable 'ret' [...] fs/nfsd/vfs.c: In function 'nfsd_splice_actor': fs/nfsd/vfs.c:848: warning: unused variable 'ret' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-20 09:15:19 +01:00
J. Bruce Fields	04f4ad16b2	nfsd4: implement secinfo_no_name Implementation of this operation is mandatory for NFSv4.1. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:25 -05:00
J. Bruce Fields	0ff7ab4671	nfsd4: move guts of nfsd4_lookupp into helper We'll reuse this code in secinfo_no_name. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:24 -05:00
J. Bruce Fields	56560b9ae0	nfsd4: 4.1 SECINFO should consume filehandle See the referenced spec language; an attempt by a 4.1 client to use the current filehandle after a secinfo call should result in a NOFILEHANDLE error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:23 -05:00
bookjovi@gmail.com	5b6a599f0d	nfs: add missed CONFIG_NFSD_DEPRECATED these pieces of code only make sense when CONFIG_NFSD_DEPRECATED enabled Signed-off-by: Jovi Zhang <bookjovi@gmail.com> fs/nfsd/nfsctl.c \| 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:20 -05:00
J. Bruce Fields	18b631f838	nfsd: fix offset printk's in nfsd3 read/write Thanks to dysbr01@ca.com for noticing that the debugging printk in the v3 write procedure can print >2GB offsets as negative numbers: https://bugzilla.kernel.org/show_bug.cgi?id=23342 Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:18 -05:00
J. Bruce Fields	e203d506bd	nfsd4: fix mixed 4.0/4.1 handling, 4.1 reboot Instead of failing to find client entries which don't match the minorversion, we should be finding them, then either erroring out or expiring them as appropriate. This also fixes a problem which would cause the 4.1 server to fail to recognize clients after a second reboot. Reported-by: Casey Bodley <cbodley@citi.umich.edu> Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:01 -05:00
J. Bruce Fields	6e5f15c93d	nfsd4: replace unintuitive match_clientid_establishment Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:47:41 -05:00
J. Bruce Fields	ec66ee3797	Merge commit 'v2.6.37-rc6' into for-2.6.38	2010-12-17 13:29:07 -05:00
Michał Mirosław	a8adbe378b	fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors This patch pulls calls to buf->ops->confirm() from all actors passed (also indirectly) to splice_from_pipe_feed(). Is avoiding the call to buf->ops->confirm() while splice()ing to /dev/null is an intentional optimization? No other user does that and this will remove this special case. Against current linux.git `6313e3c217`. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-17 08:56:44 +01:00
Chuck Lever	bf2695516d	SUNRPC: New xdr_streams XDR decoder API Now that all client-side XDR decoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC res *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each decoder function. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	9f06c719f4	SUNRPC: New xdr_streams XDR encoder API Now that all client-side XDR encoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC arg *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each encoder function. Also, all the client-side encoder functions return 0 now, making a return value superfluous. Take this opportunity to convert them to return void instead. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	7d93bd71cb	NFS: Repair whitespace damage in NFS PROC macro Clean up. When I was making other changes in this area, checkscript.pl complained about the use of leading blanks in the PROC macros in the xdr files. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	85a5648019	NFSD: Update XDR decoders in NFSv4 callback client Clean up. Remove old-style NFSv4 XDR macros in favor of the style now used in fs/nfs/nfs4xdr.c. These were forgotten during the recent nfs4xdr.c rewrite. Additional whitespace cleanup adds to the size of this patch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	a033db487e	NFSD: Update XDR encoders in NFSv4 callback client Clean up. Remove old-style NFSv4 XDR macros in favor of the style now used in fs/nfs/nfs4xdr.c. These were forgotten during the recent nfs4xdr.c rewrite. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:23 -05:00
Tejun Heo	afe2c511fb	workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync() cancel_rearming_delayed_work[queue]() has been superceded by cancel_delayed_work_sync() quite some time ago. Convert all the in-kernel users. The conversions are completely equivalent and trivial. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: netdev@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Alex Elder <aelder@sgi.com> Cc: xfs-masters@oss.sgi.com Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: netfilter-devel@vger.kernel.org Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: linux-nfs@vger.kernel.org	2010-12-15 10:56:11 +01:00
Neil Brown	c1ac3ffcd0	nfsd: Fix possible BUG_ON firing in set_change_info If vfs_getattr in fill_post_wcc returns an error, we don't set fh_post_change. For NFSv4, this can result in set_change_info triggering a BUG_ON. i.e. fh_post_saved being zero isn't really a bug. So: - instead of BUGging when fh_post_saved is zero, just clear ->atomic. - if vfs_getattr fails in fill_post_wcc, take a copy of i_ctime anyway. This will be used i seg_change_info, but not overly trusted. - While we are there, remove the pointless 'if' statements in set_change_info. There is no harm setting all the values. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-08 11:44:04 -05:00
Mi Jinlong	1205065764	NFS4.1: Fix bug server don't reply the right fore_channel to client at create_session At the latest kernel(2.6.37-rc1), server just initialize the forechannel at init_forechannel_attrs, but don't reflect it to reply. After initialize the session success, we should copy the forechannel info to nfsd4_create_session struct. Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Mi Jinlong	ced6dfe9fc	NFS4.1: server gets drc mem fail should reply error at create_session When server gets drc mem fail, it should reply error to client. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
J. Bruce Fields	044bc1d432	nfsd4: return serverfault on request for ssv We're refusing to support a mandatory features of 4.1, so serverfault seems the better error; see e.g.: http://www.ietf.org/mail-archive/web/nfsv4/current/msg07638.html Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Mi Jinlong	5afa040b30	NFSv4.1: Make sure nfsd can decode SP4_SSV correctly at exchange_id According to RFC, the argument of ssv_sp_parms4 is: struct ssv_sp_parms4 { state_protect_ops4 ssp_ops; sec_oid4 ssp_hash_algs<>; sec_oid4 ssp_encr_algs<>; uint32_t ssp_window; uint32_t ssp_num_gss_handles; }; If client send a exchange_id with SP4_SSV, server cann't decode the SP4_SSV's ssp_hash_algs and ssp_encr_algs arguments correctly. Because the kernel treat the two arguments as a signal sec_oid4 struct, but should be a set of sec_oid4 struct. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Dan Carpenter	43b0178eda	nfsd: fix NULL dereference in setattr() The original code would oops if this were called from nfsd4_setattr() because "filpp" is NULL. (Note this case is currently impossible, as long as we only give out read delegations.) Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:11 -05:00
Arnd Bergmann	460781b542	BKL: remove references to lock_kernel from comments Lock_kernel is gone from the code, so the comments should be updated, too. nfsd now uses lock_flocks instead of lock_kernel to protect against posix file locks. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
J. Bruce Fields	21b75b0199	nfsd4: fix 4.1 connection registration race If a connection is closed just after a sequence or create_session is sent over it, we could end up trying to register a callback that will never get called since the xprt is already marked dead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-02 17:13:52 -04:00
Christoph Hellwig	51ee4b84f5	locks: let the caller free file_lock on ->setlease failure The caller allocated it, the caller should free it. The only issue so far is that we could change the flp pointer even on an error return if the fl_change callback failed. But we can simply move the flp assignment after the fl_change invocation, as the callers don't care about the flp return value if the setlease call failed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-31 06:35:15 -07:00
J. Bruce Fields	fcf744a96c	nfsd4: initialize delegation pointer to lease The NFSv4 server was initializing the dp->dl_flock pointer by the somewhat ridiculous method of a locks_copy_lock callback. Now that setlease uses the passed-in lock instead of doing a copy, dl_flock no longer gets set, resulting in the lock leaking on delegation release, and later possible hangs (among other problems). So, initialize dl_flock and get rid of the callback. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-30 18:08:15 -07:00
Al Viro	fc14f2fef6	convert get_sb_single() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:28 -04:00
Linus Torvalds	7420a8c0de	Merge branch 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: locks: turn lock_flocks into a spinlock fasync: re-organize fasync entry insertion to allow it under a spinlock locks/nfsd: allocate file lock outside of spinlock lockd: fix nlmsvc_notify_blocked locking lockd: push lock_flocks down	2010-10-27 18:13:34 -07:00
Arnd Bergmann	c5b1f0d92c	locks/nfsd: allocate file lock outside of spinlock As suggested by Christoph Hellwig, this moves allocation of new file locks out of generic_setlease into the callers, nfs4_open_delegation and fcntl_setlease in order to allow GFP_KERNEL allocations when lock_flocks has become a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>	2010-10-27 21:41:50 +02:00
Arnd Bergmann	763641d812	lockd: push lock_flocks down lockd should use lock_flocks() instead of lock_kernel() to lock against posix locks accessing the i_flock list. This is a prerequisite to turning lock_flocks into a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>	2010-10-27 21:39:39 +02:00
Linus Torvalds	426e1f5cec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) split invalidate_inodes() fs: skip I_FREEING inodes in writeback_sb_inodes fs: fold invalidate_list into invalidate_inodes fs: do not drop inode_lock in dispose_list fs: inode split IO and LRU lists fs: switch bdev inode bdi's correctly fs: fix buffer invalidation in invalidate_list fsnotify: use dget_parent smbfs: use dget_parent exportfs: use dget_parent fs: use RCU read side protection in d_validate fs: clean up dentry lru modification fs: split __shrink_dcache_sb fs: improve DCACHE_REFERENCED usage fs: use percpu counter for nr_dentry and nr_dentry_unused fs: simplify __d_free fs: take dcache_lock inside __d_path fs: do not assign default i_ino in new_inode fs: introduce a per-cpu last_ino allocator new helper: ihold() ...	2010-10-26 17:58:44 -07:00
Linus Torvalds	4390110fef	Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits) svcrpc: svc_tcp_sendto XPT_DEAD check is redundant svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue svcrpc: assume svc_delete_xprt() called only once svcrpc: never clear XPT_BUSY on dead xprt nfsd4: fix connection allocation in sequence() nfsd4: only require krb5 principal for NFSv4.0 callbacks nfsd4: move minorversion to client nfsd4: delay session removal till free_client nfsd4: separate callback change and callback probe nfsd4: callback program number is per-session nfsd4: track backchannel connections nfsd4: confirm only on succesful create_session nfsd4: make backchannel sequence number per-session nfsd4: use client pointer to backchannel session nfsd4: move callback setup into session init code nfsd4: don't cache seq_misordered replies SUNRPC: Properly initialize sock_xprt.srcaddr in all cases SUNRPC: Use conventional switch statement when reclassifying sockets sunrpc/xprtrdma: clean up workqueue usage sunrpc: Turn list_for_each-s into the ..._entry-s ... Fix up trivial conflicts (two different deprecation notices added in separate branches) in Documentation/feature-removal-schedule.txt	2010-10-26 09:55:25 -07:00
Linus Torvalds	a4dd8dce14	Merge branch 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: net/sunrpc: Use static const char arrays nfs4: fix channel attribute sanity-checks NFSv4.1: Use more sensible names for 'initialize_mountpoint' NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure NFS: client needs to maintain list of inodes with active layouts NFS: create and destroy inode's layout cache NFSv4.1: pnfs: filelayout: introduce minimal file layout driver NFSv4.1: pnfs: full mount/umount infrastructure NFS: set layout driver NFS: ask for layouttypes during v4 fsinfo call NFS: change stateid to be a union NFSv4.1: pnfsd, pnfs: protocol level pnfs constants SUNRPC: define xdr_decode_opaque_fixed NFSD: remove duplicate NFS4_STATEID_SIZE	2010-10-26 09:52:09 -07:00
Christoph Hellwig	c37650161a	fs: add sync_inode_metadata Add a new helper to write out the inode using the writeback code, that is including the correct dirty bit and list manipulation. A few of filesystems already opencode this, and a lot of others should be using it instead of using write_inode_now which also writes out the data. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:18:19 -04:00
J. Bruce Fields	a663bdd8c5	nfsd4: fix connection allocation in sequence() We're doing an allocation under a spinlock, and ignoring the possibility of allocation failure. A better fix wouldn't require an unnecessary allocation in the common case, but we'll leave that for later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-24 21:07:07 -04:00
Andy Adamson	3c9101a057	NFSD: remove duplicate NFS4_STATEID_SIZE Already accepted by Bruce Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-10-24 18:02:53 -04:00
Linus Torvalds	092e0e7e52	Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek	2010-10-22 10:52:56 -07:00
Linus Torvalds	79f14b7c56	Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: (30 commits) BKL: remove BKL from freevxfs BKL: remove BKL from qnx4 autofs4: Only declare function when CONFIG_COMPAT is defined autofs: Only declare function when CONFIG_COMPAT is defined ncpfs: Lock socket in ncpfs while setting its callbacks fs/locks.c: prepare for BKL removal BKL: Remove BKL from ncpfs BKL: Remove BKL from OCFS2 BKL: Remove BKL from squashfs BKL: Remove BKL from jffs2 BKL: Remove BKL from ecryptfs BKL: Remove BKL from afs BKL: Remove BKL from USB gadgetfs BKL: Remove BKL from autofs4 BKL: Remove BKL from isofs BKL: Remove BKL from fat BKL: Remove BKL from ext2 filesystem BKL: Remove BKL from do_new_mount() BKL: Remove BKL from cgroup BKL: Remove BKL from NTFS ...	2010-10-22 10:52:01 -07:00
Linus Torvalds	5704e44d28	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: introduce CONFIG_BKL. dabusb: remove the BKL sunrpc: remove the big kernel lock init/main.c: remove BKL notations blktrace: remove the big kernel lock rtmutex-tester: make it build without BKL dvb-core: kill the big kernel lock dvb/bt8xx: kill the big kernel lock tlclk: remove big kernel lock fix rawctl compat ioctls breakage on amd64 and itanic uml: kill big kernel lock parisc: remove big kernel lock cris: autoconvert trivial BKL users alpha: kill big kernel lock isapnp: BKL removal s390/block: kill the big kernel lock hpet: kill BKL, add compat_ioctl	2010-10-22 10:43:11 -07:00
J. Bruce Fields	5d18c1c2a9	nfsd4: only require krb5 principal for NFSv4.0 callbacks In the sessions backchannel case, we don't need a krb5 principal name for the client; we use the already-created forechannel credentials instead. Some cleanup, while we're there: make it clearer which code here is 4.0- or sessions- specific. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:12:14 -04:00
J. Bruce Fields	8323c3b2a6	nfsd4: move minorversion to client The minorversion seems more a property of the client than the callback channel. Some time we should probably also enforce consistent minorversion usage from the client; for now, this is just a cosmetic change. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:12:02 -04:00
J. Bruce Fields	792c95dd51	nfsd4: delay session removal till free_client Have unhash_client_locked() remove client and associated sessions from global hashes, but delay further dismantling till free_client(). (After unhash_client_locked(), the only remaining references outside the destroying thread are from any connections which have xpt_user callbacks registered.) This will simplify locking on session destruction. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:56 -04:00
J. Bruce Fields	5a3c9d7134	nfsd4: separate callback change and callback probe Only one of the nfsd4_callback_probe callers actually cares about changing the callback information. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:55 -04:00
J. Bruce Fields	8b5ce5cd44	nfsd4: callback program number is per-session The callback program is allowed to depend on the session which the callback is going over. No change in behavior yet, while we still only do callbacks over a single session for the lifetime of the client. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:54 -04:00
J. Bruce Fields	d29c374cd2	nfsd4: track backchannel connections We need to keep track of which connections are available for use with the backchannel, which for the forechannel, and which for both. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:53 -04:00
J. Bruce Fields	86c3e16cc7	nfsd4: confirm only on succesful create_session Following rfc 5661, section 18.36.4: "If the session is not successfully created, then no changes are made to any client records on the server." We shouldn't be confirming or incrementing the sequence id in this case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:52 -04:00
J. Bruce Fields	ac7c46f29a	nfsd4: make backchannel sequence number per-session Currently we don't deal well with a client that has multiple sessions associated with it (even simultaneously, or serially over the lifetime of the client). In particular, we don't attempt to keep the backchannel running after the original session diseappears. We will fix that soon. Once we do that, we need the slot sequence number to be per-session; otherwise, for example, we cannot correctly handle a case like this: - All session 1 connections are lost. - The client creates session 2. We use it for the backchannel (since it's the only working choice). - The client gives us a new connection to use with session 1. - The client destroys session 2. At this point our only choice is to go back to using session 1. When we do so we must use the sequence number that is next for session 1. We therefore need to maintain multiple sequence number streams. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:51 -04:00
J. Bruce Fields	90c8145bb6	nfsd4: use client pointer to backchannel session Instead of copying the sessionid, use the new cl_cb_session pointer, which indicates which session we're using for the backchannel. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:50 -04:00
J. Bruce Fields	edd7678663	nfsd4: move callback setup into session init code The backchannel should be associated with a session, it isn't really global to the client. We do, however, want a pointer global to the client which tracks which session we're currently using for client-based callbacks. This is a first step in that direction; for now, just reshuffling of code with no significant change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:49 -04:00
J. Bruce Fields	cd5b814458	nfsd4: don't cache seq_misordered replies Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:48 -04:00
Arnd Bergmann	6de5bd128d	BKL: introduce CONFIG_BKL. With all the patches we have queued in the BKL removal tree, only a few dozen modules are left that actually rely on the BKL, and even there are lots of low-hanging fruit. We need to decide what to do about them, this patch illustrates one of the options: Every user of the BKL is marked as 'depends on BKL' in Kconfig, and the CONFIG_BKL becomes a user-visible option. If it gets disabled, no BKL using module can be built any more and the BKL code itself is compiled out. The one exception is file locking, which is practically always enabled and does a 'select BKL' instead. This effectively forces CONFIG_BKL to be enabled until we have solved the fs/lockd mess and can apply the patch that removes the BKL from fs/locks.c. Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-21 15:44:13 +02:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
J. Bruce Fields	b1e86db1de	nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink As of commit `43a9aa64a2` "NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR", we sometimes call fh_unlock on a filehandle that isn't fully initialized. We should fix up the callers, but as a quick fix it is also sufficient just to remove this assertion. Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-13 15:48:55 -04:00
J. Bruce Fields	ecec6e34e1	nfsd4: expire clients more promptly Expire clients more promptly, at the expense of possibly running the laundromat thread more frequently. Though it's not the default, I'd like it to be feasible to run with a lease time of just a few seconds, at which point a minimum 10 second wait between laundromat runs seems a little much. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-11 20:00:18 -04:00
Arnd Bergmann	b89f432133	fs/locks.c: prepare for BKL removal This prepares the removal of the big kernel lock from the file locking code. We still use the BKL as long as fs/lockd uses it and ceph might sleep, but we can flip the definition to a private spinlock as soon as that's done. All users outside of fs/lockd get converted to use lock_flocks() instead of lock_kernel() where appropriate. Based on an earlier patch to use a spinlock from Matthew Wilcox, who has attempted this a few times before, the earliest patch from over 10 years ago turned it into a semaphore, which ended up being slower than the BKL and was subsequently reverted. Someone should do some serious performance testing when this becomes a spinlock, since this has caused problems before. Using a spinlock should be at least as good as the BKL in theory, but who knows... Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Matthew Wilcox <willy@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Sage Weil <sage@newdream.net> Cc: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org	2010-10-05 11:02:04 +02:00
J. Bruce Fields	3351514215	nfsd4: return expired on unfound stateid's Commit `78155ed75f` "nfsd4: distinguish expired from stale stateids" attempted to distinguish expired and stale stateid's using time information that may not have been completely reliable, so I reverted it. That was throwing out the baby with the bathwater; we still do want to return expired, but let's do that using the simpler approach of just assuming any stateid is expired if it looks like it was given out by the current server instance, but we can't find it any more. This may help clients that are recovering from network partitions. Reported-by: Bian Naimeng <biannm@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-02 18:49:33 -04:00
J. Bruce Fields	328ead2872	nfsd4: add new connections to session As long as we're not implementing any session security, we should just automatically add any new connections that come along to the list of sessions associated with the session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:45 -04:00
J. Bruce Fields	db90681d6e	nfsd4: refactor connection allocation Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:45 -04:00
J. Bruce Fields	19cf5c026f	nfsd4: use callbacks on svc_xprt_deletion Remove connections from the list when they go down. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	c7662518c7	nfsd4: keep per-session list of connections The spec requires us in various places to keep track of the connections associated with each session. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	5b6feee960	nfsd4: clean up session allocation Changes: - make sure session memory reservation is released on failure path. - use min_t()/min() for more compact code in several places. - break alloc_init_session into smaller pieces. - miscellaneous other cleanup. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	dd93842457	nfsd4: fix alloc_init_session return type This returns an nfs error, not -ERRNO. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	c23753dac1	nfsd4: fix alloc_init_session BUILD_BUG_ON() Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	6ff8da0887	nfsd4: Move callback setup to callback queue Instead of creating the new rpc client from a regular server thread, set a flag, kick off a null call, and allow the null call to do the work of setting up the client on the callback workqueue. Use a spinlock to ensure the callback work gets a consistent view of the callback parameters. This allows, for example, changing the callback from contexts where sleeping is not allowed. I hope it will also keep the locking simple as we add more session and trunking features, by serializing most of the callback-specific work. This also closes a small race where the the new cb_ident could be used with an old connection (or vice-versa). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	fb00392326	nfsd4: remove separate cb_args struct I don't see the point of the separate struct. It seems to just be getting in the way. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	cee277d924	nfsd4: use generic callback code in null case This will eventually allow us, for example, to kick off null callback from contexts where we can't sleep. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	5878453dbd	nfsd4: generic callback code Make the recall callback code more generic, so that other callbacks will be able to use it too. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	1c8556026e	nfsd4: rename nfs4_rpc_args->nfsd4_cb_args With apologies for the gratuitous rename, the new name seems more helpful to me. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	586f36735e	nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence These two structs don't really need to be distinct as far as I can tell. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	07263f1efe	nfsd4: minor variable renaming (cb -> conn) Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get confused if variables of both types are always named cb.... Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
Pavel Emelyanov	c653ce3f0a	sunrpc: Add net to rpc_create_args Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:56 -04:00
Pavel Emelyanov	fc5d00b04a	sunrpc: Add net argument to svc_create_xprt Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:54 -04:00
Benny Halevy	2b44f1ba40	nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length The existing code adjusted it based on the worst case scenario for the returned bitmap and the best case scenario for the supported attrs attribute. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [bfields@redhat.com: removed likely/unlikely's] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 16:52:24 -04:00
Pavel Emelyanov	352114f395	sunrpc: Add net to pure API calls There are two calls that operate on ip_map_cache and are directly called from the nfsd code. Other places will be handled in a different way. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-27 10:16:11 -04:00
J. Bruce Fields	74ec1e1269	nfsd: fix /proc/net/rpc/nfsd.export/content display Note with "first" always 0, and "lastflags" initially 0, we always dump a spurious set of 0 flags at the start, among other problems. Fix. And attempt to make the code a little more obvious. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-26 14:48:25 -04:00
Pavel Emelyanov	049ef27b22	nfsd: Export get_task_comm for nfsd The git://linux-nfs.org/~bfields/linux.git nfsd-next branch doesn't compile when nfsd is a module with the following error: ERROR: "get_task_comm" [fs/nfsd/nfsd.ko] undefined! Replace the get_task_comm call with direct comm access, which is safe for current. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-23 10:34:21 -04:00
NeilBrown	1e1405673e	nfsd: allow deprecated interface to be compiled out. Add CONFIG_NFSD_DEPRECATED, default to y. Only include deprecated interface if this is defined. This allows distros to remove this interface before the official removal, and allows developers to test without it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-22 15:33:14 -04:00
NeilBrown	c67874f942	nfsd: formally deprecate legacy nfsd syscall interface The syscall interface is has been replaced by a more flexible interface since 2.6.0. It is time to work towards discarding the old interface. So add a entry in feature-removal-schedule.txt and print a warning when the interface is used. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-22 15:33:13 -04:00
NeilBrown	839049a873	nfsd/idmap: drop special request deferal in favour of improved default. The idmap code manages request deferal by waiting for a reply from userspace rather than putting the NFS request on a queue to be retried from the start. Now that the common deferal code does this there is no need for the special code in idmap. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-21 17:08:31 -04:00
NeilBrown	8ff30fa4ef	nfsd: disable deferral for NFSv4 Now that a slight delay in getting a reply to an upcall doesn't require deferring of requests, request deferral for all NFSv4 requests - the concept doesn't really fit with the v4 model. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-21 17:02:27 -04:00
J. Bruce Fields	c88739b373	Merge remote branch 'trond/bugfixes' into for-2.6.37 Without some client-side fixes, server testing is currently difficult.	2010-09-19 23:48:32 -04:00
Trond Myklebust	827e345702	SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies The NFSv4 client's callback server calls svc_gss_principal(), which is defined in the auth_rpcgss.ko The NFSv4 server has the same dependency, and in addition calls svcauth_gss_flavor(), gss_mech_get_by_pseudoflavor(), gss_pseudoflavor_to_service() and gss_mech_put() from the same module. The module auth_rpcgss itself has no dependencies aside from sunrpc, so we only need to select RPCSEC_GSS. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-09-12 19:57:50 -04:00
Linus Torvalds	4f63e3c5be	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd4: mask out non-access bits in nfs4_access_to_omode	2010-09-07 19:21:02 -07:00
NeilBrown	c5b29f885a	sunrpc: use seconds since boot in expiry cache This protects us from confusion when the wallclock time changes. We convert to and from wallclock when setting or reading expiry times. Also use seconds since boot for last_clost time. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:20 -04:00
NeilBrown	17cebf658e	sunrpc: extract some common sunrpc_cache code from nfsd Rather can duplicating this idiom twice, put it in an inline function. This reduces the usage of 'expiry_time' out side the sunrpc/cache.c code and thus the impact of a change that is about to be made to that field. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:19 -04:00
Andy Adamson	1132b26029	nfsd: remove duplicate NFS4_STATEID_SIZE declaration Use NFS4_STATEID_SIZE from include/linux/nfs4 Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:18 -04:00
J. Bruce Fields	8f34a430ac	nfsd4: mask out non-access bits in nfs4_access_to_omode This fixes an unnecessary BUG(). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-02 15:25:09 -04:00
Linus Torvalds	2547d1d20f	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd: fix NULL dereference in nfsd_statfs() nfsd4: fix downgrade/lock logic nfsd4: typo fix in find_any_file nfsd4: bad BUG() in preprocess_stateid_op	2010-08-28 14:05:55 -07:00
Takashi Iwai	f6360efb83	nfsd: fix NULL dereference in nfsd_statfs() The commit `ebabe9a900` pass a struct path to vfs_statfs introduced the struct path initialization, and this seems to trigger an Oops on my machine. fh_dentry field may be NULL and set later in fh_verify(), thus the initialization of path must be after fh_verify(). Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:23:16 -04:00
J. Bruce Fields	f632265d0f	Merge commit 'v2.6.36-rc1' into HEAD	2010-08-26 13:22:27 -04:00
J. Bruce Fields	7d94784293	nfsd4: fix downgrade/lock logic If we already had a RW open for a file, and get a readonly open, we were piggybacking on the existing RW open. That's inconsistent with the downgrade logic which blows away the RW open assuming you'll still have a readonly open. Also, make sure there is a readonly or writeonly open available for locking, again to prevent bad behavior in downgrade cases when any RW open may be lost. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:22:02 -04:00
J. Bruce Fields	18608ad49c	nfsd4: typo fix in find_any_file Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:21:09 -04:00
J. Bruce Fields	30c0e1ef0a	nfsd4: bad BUG() in preprocess_stateid_op It's OK for this function to return without setting filp--we do it in the special-stateid case. And there's a legitimate case where we can hit this, since we do permit reads on write-only stateid's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:20:51 -04:00
Linus Torvalds	763008c435	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix an Oops in the NFSv4 atomic open code NFS: Fix the selection of security flavours in Kconfig NFS: fix the return value of nfs_file_fsync() rpcrdma: Fix SQ size calculation when memreg is FRMR xprtrdma: Do not truncate iova_start values in frmr registrations. nfs: Remove redundant NULL check upon kfree() nfs: Add "lookupcache" to displayed mount options NFS: allow close-to-open cache semantics to apply to root of NFS filesystem SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)	2010-08-18 15:45:23 -07:00
Trond Myklebust	df486a2590	NFS: Fix the selection of security flavours in Kconfig Randy Dunlap reports: ERROR: "svc_gss_principal" [fs/nfs/nfs.ko] undefined! because in fs/nfs/Kconfig, NFS_V4 selects RPCSEC_GSS_KRB5 and/or in fs/nfsd/Kconfig, NFSD_V4 selects RPCSEC_GSS_KRB5. RPCSEC_GSS_KRB5 does 5 selects, but none of these is enforced/followed by the fs/nfs[d]/Kconfig configs: select SUNRPC_GSS select CRYPTO select CRYPTO_MD5 select CRYPTO_DES select CRYPTO_CBC Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: J. Bruce Fields <bfields@fieldses.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-08-17 17:42:45 -04:00
Linus Torvalds	8c8946f509	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.	2010-08-10 11:39:13 -07:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Christoph Hellwig	ebabe9a900	pass a struct path to vfs_statfs We'll need the path to implement the flags field for statvfs support. We do have it available in all callers except: - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just needs to do a caller to the lower filesystem statfs method. - sys_ustat. Add a non-exported statfs_by_dentry helper for it which doesn't won't be able to fill out the flags field later on. In addition rename the helpers for statfs vs fstatfs to do_*statfs instead of the misleading vfs prefix. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:42 -04:00
Linus Torvalds	0d9f9e122c	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd4: fix file open accounting for RDWR opens nfsd: don't allow setting maxblksize after svc created nfsd: initialize nfsd versions before creating svc net: sunrpc: removed duplicated #include nfsd41: Fix a crash when a callback is retried nfsd: fix startup/shutdown order bug nfsd: minor nfsd read api cleanup gcc-4.6: nfsd: fix initialized but not read warnings nfsd4: share file descriptors between stateid's nfsd4: fix openmode checking on IO using lock stateid nfsd4: miscellaneous process_open2 cleanup nfsd4: don't pretend to support write delegations nfsd: bypass readahead cache when have struct file nfsd: minor nfsd_svc() cleanup nfsd: move more into nfsd_startup() nfsd: just keep single lockd reference for nfsd nfsd: clean up nfsd_create_serv error handling nfsd: fix error handling in __write_ports_addxprt nfsd: fix error handling when starting nfsd with rpcbind down nfsd4: fix v4 state shutdown error paths ...	2010-08-07 14:24:41 -07:00
J. Bruce Fields	998db52c03	nfsd4: fix file open accounting for RDWR opens Commit `f9d7562fdb` "nfsd4: share file descriptors between stateid's" didn't correctly account for O_RDWR opens. Symptoms include leaked files, resulting in failures to unmount and/or warnings about orphaned inodes on reboot. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-07 09:50:05 -04:00
J. Bruce Fields	7fa53cc872	nfsd: don't allow setting maxblksize after svc created It's harmless to set this after the server is created, but also ineffective, since the value is only used at the time of svc_create_pooled(). So fail the attempt, in keeping with the pattern set by write_versions, write_{lease,grace}time and write_recoverydir. (This could break userspace that tried to write to nfsd/max_block_size between setting up sockets and starting the server. However, such code wouldn't have worked anyway, and I don't know of any examples--rpc.nfsd in nfs-utils, probably the only user of the interface, doesn't do that.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 18:00:33 -04:00
J. Bruce Fields	e844a7b980	nfsd: initialize nfsd versions before creating svc Commit `59db4a0c10` "nfsd: move more into nfsd_startup()" inadvertently moved nfsd_versions after nfsd_create_svc(). On older distributions using an rpc.nfsd that does not explicitly set the list of nfsd versions, this results in svc-create_pooled() being called with an empty versions array. The resulting incomplete initialization leads to a NULL dereference in svc_process_common() the first time a client accesses the server. Move nfsd_reset_versions() back before the svc_create_pooled(); this time, put it closer to the svc_create_pooled() call, to make this mistake more difficult in the future. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:40 -04:00
Boaz Harrosh	c18c821fd4	nfsd41: Fix a crash when a callback is retried If a callback is retried at nfsd4_cb_recall_done() due to some error, the returned rpc reply crashes here: @@ -514,6 +514,7 @@ decode_cb_sequence(struct xdr_stream xdr, struct nfsd4_cb_sequence res, u32 dummy; __be32 *p; + BUG_ON(!res); if (res->cbs_minorversion == 0) return 0; [BUG_ON added for demonstration] This is because the nfsd4_cb_done_sequence() has NULLed out the task->tk_msg.rpc_resp pointer. Also eventually the rpc would use the new slot without making sure it is free by calling nfsd41_cb_setup_sequence(). This problem was introduced by a 4.1 protocol addition patch: [`0421b5c5`] nfsd41: Backchannel: Implement cb_recall over NFSv4.1 Which was overlooking the possibility of an RPC callback retries. For not-4.1 case redoing the _prepare is harmless. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:39 -04:00
J. Bruce Fields	774f8bbd9e	nfsd: fix startup/shutdown order bug We must create the server before we can call init_socks or check the number of threads. Symptoms were a NULL pointer dereference in nfsd_svc(). Problem identified by Jeff Layton. Also fix a minor cleanup-on-error case in nfsd_startup(). Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:30 -04:00
J. Bruce Fields	039a87ca53	nfsd: minor nfsd read api cleanup Christoph points that the NFSv2/v3 callers know which case they want here, so we may as well just call the file=NULL case directly instead of making this conditional. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-30 12:54:54 -04:00
Andi Kleen	6904996101	gcc-4.6: nfsd: fix initialized but not read warnings Fixes at least one real minor bug: the nfs4 recovery dir sysctl would not return its status properly. Also I finished Al's `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") commit, it moved the IMA code, but left the old path initializer in there. The rest is just dead code removed I think, although I was not fully sure about the "is_borc" stuff. Some more review would be still good. Found by gcc 4.6's new warnings. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 19:32:17 -04:00
J. Bruce Fields	f9d7562fdb	nfsd4: share file descriptors between stateid's The vfs doesn't really allow us to "upgrade" a file descriptor from read-only to read-write, and our attempt to do so in nfs4_upgrade_open is ugly and incomplete. Move to a different scheme where we keep multiple opens, shared between open stateid's, in the nfs4_file struct. Each file will be opened at most 3 times (for read, write, and read-write), and those opens will be shared between all clients and openers. On upgrade we will do another open if necessary instead of attempting to upgrade an existing open. We keep count of the number of readers and writers so we know when to close the shared files. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-29 18:19:23 -04:00
J. Bruce Fields	0292191417	nfsd4: fix openmode checking on IO using lock stateid It is legal to perform a write using the lock stateid that was originally associated with a read lock, or with a file that was originally opened for read, but has since been upgraded. So, when checking the openmode, check the mode associated with the open stateid from which the lock was derived. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:37:12 -04:00
J. Bruce Fields	21fb4016bd	nfsd4: miscellaneous process_open2 cleanup Move more work into helper functions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:34:29 -04:00
J. Bruce Fields	c3e4808086	nfsd4: don't pretend to support write delegations The delegation code mostly pretends to support either read or write delegations. However, correct support for write delegations would require, for example, breaking of delegations (and/or implementation of cb_getattr) on stat. Currently all that stops us from handing out delegations is a subtle reference-counting issue. Avoid confusion by adding an earlier check that explicitly refuses write delegations. For now, though, I'm not going so far as to rip out existing half-support for write delegations, in case we get around to using that soon. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:05:51 -04:00
Eric Paris	2a12a9d781	fsnotify: pass a file instead of an inode to open, read, and write fanotify, the upcoming notification system actually needs a struct path so it can do opens in the context of listeners, and it needs a file so it can get f_flags from the original process. Close was the only operation that already was passing a struct file to the notification hook. This patch passes a file for access, modify, and open as well as they are easily available to these hooks. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:32 -04:00
J. Bruce Fields	fa0a21269f	nfsd: bypass readahead cache when have struct file The readahead cache compensates for the fact that the NFS server currently does an open and close on every IO operation in the NFSv2 and NFSv3 case. In the NFSv4 case we have long-lived struct files associated with client opens, so there's no need for this. In fact, concurrent IO's using trying to modify the same file->f_ra may cause problems. So, don't bother with the readahead cache in that case. Note eventually we'll likely do this in the v2/v3 case as well by keeping a cache of struct files instead of struct file_ra_state's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-27 18:15:54 -04:00
J. Bruce Fields	af4718f3f9	nfsd: minor nfsd_svc() cleanup More idiomatic to put the error case in the if clause. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:27 -04:00
J. Bruce Fields	59db4a0c10	nfsd: move more into nfsd_startup() This is just cleanup--it's harmless to call nfsd_rachache_init, nfsd_init_socks, and nfsd_reset_versions more than once. But there's no point to it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	ac77efbe2b	nfsd: just keep single lockd reference for nfsd Right now, nfsd keeps a lockd reference for each socket that it has open. This is unnecessary and complicates the error handling on startup and shutdown. Change it to just do a lockd_up when starting the first nfsd thread just do a single lockd_down when taking down the last nfsd thread. Because of the strange way the sv_count is handled this requires an extra flag to tell whether the nfsd_serv holds a reference for lockd or not. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	628b368728	nfsd: clean up nfsd_create_serv error handling There doesn't seem to be any need to reset the nfssvc_boot time if the nfsd startup failed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:25 -04:00
Jeff Layton	0cd14a061e	nfsd: fix error handling in __write_ports_addxprt __write_ports_addxprt calls nfsd_create_serv. That increases the refcount of nfsd_serv (which is tracked in sv_nrthreads). The service only decrements the thread count on error, not on success like __write_ports_addfd does, so using this interface leaves the nfsd thread count high. Fix this by having this function call svc_destroy() on error to release the reference (and possibly to tear down the service) and simply decrement the refcount without tearing down the service on success. This makes the sv_threads handling work basically the same in both __write_ports_addxprt and __write_ports_addfd. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:24 -04:00
Jeff Layton	78a8d7c8ca	nfsd: fix error handling when starting nfsd with rpcbind down The refcounting for nfsd is a little goofy. What happens is that we create the nfsd RPC service, attach sockets to it but don't actually start the threads until someone writes to the "threads" procfile. To do this, __write_ports_addfd will create the nfsd service and then will decrement the refcount when exiting but won't actually destroy the service. This is fine when there aren't errors, but when there are this can cause later attempts to start nfsd to fail. nfsd_serv will be set, and that causes __write_versions to return EBUSY. Fix this by calling svc_destroy on nfsd_serv when this function is going to return error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:23 -04:00
Jeff Layton	4ad9a344be	nfsd4: fix v4 state shutdown error paths If someone tries to shut down the laundry_wq while it isn't up it'll cause an oops. This can happen because write_ports can create a nfsd_svc before we really start the nfs server, and we may fail before the server is ever started. Also make sure state is shutdown on error paths in nfsd_svc(). Use a common global nfsd_up flag instead of nfs4_init, and create common helper functions for nfsd start/shutdown, as there will be other work that we want done only when we the number of nfsd threads transitions between zero and nonzero. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:22 -04:00
J. Bruce Fields	55b13354d7	nfsd: remove unused assignment from nfsd_link Trivial cleanup, since "dest" is never used. Reported-by: Anshul Madan <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:50:39 -04:00
Chuck Lever	43a9aa64a2	NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR Some well-known NFSv3 clients drop their directory entry caches when they receive replies with no WCC data. Without this data, they employ extra READ, LOOKUP, and GETATTR requests to ensure their directory entry caches are up to date, causing performance to suffer needlessly. In order to return WCC data, our server has to have both the pre-op and the post-op attribute data on hand when a reply is XDR encoded. The pre-op data is filled in when the incoming fh is locked, and the post-op data is filled in when the fh is unlocked. Unfortunately, for REMOVE, RMDIR, MKNOD, and MKDIR, the directory fh is not unlocked until well after the reply has been XDR encoded. This means that encode_wcc_data() does not have wcc_data for the parent directory, so none is returned to the client after these operations complete. By unlocking the parent directory fh immediately after the internal operations for each NFS procedure is complete, the post-op data is filled in before XDR encoding starts, so it can be returned to the client properly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-07 17:12:32 -04:00
J. Bruce Fields	6a85d6c769	nfsd4: comment nitpick Reported-by: "Madan, Anshul" <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-06 12:40:22 -04:00
J. Bruce Fields	cba9ba4b90	nfsd4: fix delegation recall race use-after-free When the rarely-used callback-connection-changing setclientid occurs simultaneously with a delegation recall, we rerun the recall by requeueing it on a workqueue. But we also need to take a reference on the delegation in that case, since the delegation held by the rpc itself will be released by the rpc_release callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:55 -04:00
J. Bruce Fields	ac94bf5825	nfsd4: fix deleg leak on callback error Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:53 -04:00
J. Bruce Fields	ec8acac84a	nfsd4: remove some debugging code This is overkill. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 22:29:03 -04:00
Benny Halevy	9303bbd3de	nfsd: nfs4callback encode_stateid helper function To be used also for the pnfs cb_layoutrecall callback Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd4: fix cb_recall encoding] "nfsd: nfs4callback encode_stateid helper function" forgot to reserve more space after return from the new helper. Reported-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:51 -04:00
J. Bruce Fields	4731030d58	nfsd4: translate memory errors to delay, not serverfault If the server is out of memory is better for clients to back off and retry than to just error out. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:36 -04:00
J. Bruce Fields	76407f76e0	nfsd4; fix session reference count leak Note the session has to be put() here regardless of what happens to the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:28 -04:00
Linus Torvalds	b95a568093	Merge branch 'for-2.6.35' of git://linux-nfs.org/~bfields/linux * 'for-2.6.35' of git://linux-nfs.org/~bfields/linux: nfsd4: shut down callback queue outside state lock nfsd: nfsd_setattr needs to call commit_metadata	2010-06-09 12:43:04 -07:00
J. Bruce Fields	44b56603c4	Merge branch 'for-2.6.34-incoming' into for-2.6.35-incoming	2010-06-08 20:05:18 -04:00
J. Bruce Fields	c3935e3049	nfsd4: shut down callback queue outside state lock This reportedly causes a lockdep warning on nfsd shutdown. That looks like a false positive to me, but there's no reason why this needs the state lock anyway. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-08 19:33:52 -04:00
Christoph Hellwig	b160fdabe9	nfsd: nfsd_setattr needs to call commit_metadata The conversion of write_inode_now calls to commit_metadata in commit `f501912a35` missed out the call in nfsd_setattr. But without this conversion we can't guarantee that a SETATTR request has actually been commited to disk with XFS, which causes a regression from 2.6.32 (only for NFSv2, but anyway). Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-01 19:17:50 -04:00
J. Bruce Fields	68a4b48ce6	nfsd4: don't bother storing callback reply tag We don't use this, and probably never will. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:59 -04:00
J. Bruce Fields	24a0111e40	nfsd4: fix use of op_share_access NFSv4.1 adds additional flags to the share_access argument of the open call. These flags need to be masked out in some of the existing code, but current code does that inconsistently. Tested-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:55 -04:00
J. Bruce Fields	172c85dd57	nfsd4: treat more recall errors as failures If a recall fails for some unexpected reason, instead of ignoring it and treating it like a success, it's safer to treat it as a failure, preventing further delgation grants and returning CB_PATH_DOWN. Also put put switches in a (two me) more logical order, with normal case first. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:53 -04:00
J. Bruce Fields	378b7d37f9	nfsd4: remove extra put() on callback errors Since rpc_call_async() guarantees that the release method will be called even on failure, this put is wrong. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:51 -04:00
Alexey Dobriyan	4be929be34	kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN - C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not USHORT_MAX/SHORT_MAX/SHORT_MIN. - Make SHRT_MIN of type s16, not int, for consistency. [akpm@linux-foundation.org: fix drivers/dma/timb_dma.c] [akpm@linux-foundation.org: fix security/keys/keyring.c] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:02 -07:00
Christoph Hellwig	8018ab0574	sanitize vfs_fsync calling conventions Now that the last user passing a NULL file pointer is gone we can remove the redundant dentry argument and associated hacks inside vfs_fsynmc_range. The next step will be removig the dentry argument from ->fsync, but given the luck with the last round of method prototype changes I'd rather defer this until after the main merge window. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
Christoph Hellwig	e970a573ce	nfsd: open a file descriptor for fsync in nfs4 recovery Instead of just looking up a path use do_filp_open to get us a file structure for the nfs4 recovery directory. This allows us to get rid of the last non-standard vfs_fsync caller with a NULL file pointer. [AV: should be using fput(), not filp_close()] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
J. Bruce Fields	e4e83ea47b	Revert "nfsd4: distinguish expired from stale stateids" This reverts commit `78155ed75f`. We're depending here on the boot time that we use to generate the stateid being monotonic, but get_seconds() is not necessarily. We still depend at least on boot_time being different every time, but that is a safer bet. We have a few reports of errors that might be explained by this problem, though we haven't been able to confirm any of them. But the minor gain of distinguishing expired from stale errors seems not worth the risk. Conflicts: fs/nfsd/nfs4state.c Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 19:03:50 -04:00
Pavel Emelyanov	47cee541a4	nfsd: safer initialization order in find_file() The alloc_init_file() first adds a file to the hash and then initializes its fi_inode, fi_id and fi_had_conflict. The uninitialized fi_inode could thus be erroneously checked by the find_file(), so move the hash insertion lower. The client_mutex should prevent this race in practice; however, we eventually hope to make less use of the client_mutex, so the ordering here is an accident waiting to happen. I didn't find whether the same can be true for two other fields, but the common sense tells me it's better to initialize an object before putting it into a global hash table :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 12:05:20 -04:00
J. Bruce Fields	b7299f4439	nfs4: minor callback code simplification, comment Note the position in the version array doesn't have to match the actual rpc version number--to me it seems clearer to maintain the distinction. Also document choice of rpc callback version number, as discussed in e.g. http://www.ietf.org/mail-archive/web/nfsv4/current/msg07985.html and followups. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 11:51:38 -04:00
Pavel Emelyanov	15ddb4aec5	NFSD: don't report compiled-out versions as present The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether the particular nfsd version is present/available. The problem is that once I turn off e.g. NFSD-V4 this call returns -1 which is true from the callers POV which is wrong. The proposal is to report false in that case. The bug has existed since `6658d3a7bb` "[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions". Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: stable@kernel.org Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-14 18:46:14 -04:00
J. Bruce Fields	4dc6ec00f6	nfsd4: implement reclaim_complete This is a mandatory operation. Also, here (not in open) is where we should be committing the reboot recovery information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 12:03:11 -04:00
Benny Halevy	ab707e1565	nfsd4: nfsd4_destroy_session must set callback client under the state lock nfsd4_set_callback_client must be called under the state lock to atomically set or unset the callback client and shutting down the previous one. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:59:11 -04:00
Benny Halevy	d76829889a	nfsd4: keep a reference count on client while in use Get a refcount on the client on SEQUENCE, Release the refcount and renew the client when all respective compounds completed. Do not expire the client by the laundromat while in use. If the client was expired via another path, free it when the compounds complete and the refcount reaches 0. Note that unhash_client_locked must call list_del_init on cl_lru as it may be called twice for the same client (once from nfs4_laundromat and then from expire_client) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:58:54 -04:00
Benny Halevy	07cd4909a6	nfsd4: mark_client_expired Mark the client as expired under the client_lock so it won't be renewed when an nfsv4.1 session is done, after it was explicitly expired during processing of the compound. Do not renew a client mark as expired (in particular, it is not on the lru list anymore) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:22 -04:00
Benny Halevy	46583e2597	nfsd4: introduce nfs4_client.cl_refcount Currently just initialize the cl_refcount to 1 and decrement in expire_client(), conditionally freeing the client when the refcount reaches 0. To be used later by nfsv4.1 compounds to keep the client from timing out while in use. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:03 -04:00
Benny Halevy	84d38ac9ab	nfsd4: refactor expire_client Separate out unhashing of the client and session. To be used later by the laundromat. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	36acb66bda	nfsd4: extend the client_lock to cover cl_lru To be used later on to hold a reference count on the client while in use by a nfsv4.1 compound. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	328efbab0f	nfsd4: use list_move in move_to_confirmed rather than list_del_init, list_add Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	be1fdf6c43	nfsd4: fold release_session into expire_client and grab the client lock once for all the client's sessions. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	9089f1b478	nfsd4: rename sessionid_lock to client_lock In preparation to share the lock's scope to both client and session hash tables. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
J. Bruce Fields	5d4cec2f2f	nfsd4: fix bare destroy_session null dereference It's legal to send a DESTROY_SESSION outside any session (as the only operation in a compound), in which case cstate->session will be NULL; check for that case. While we're at it, move these checks into a separate helper function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-07 19:08:47 -04:00
J. Bruce Fields	5306293c9c	Merge commit 'v2.6.34-rc6' Conflicts: fs/nfsd/nfs4callback.c	2010-05-04 11:29:05 -04:00
Benny Halevy	dbd65a7e44	nfsd4: use local variable in nfs4svc_encode_compoundres 'cs' is already computed, re-use it. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-04 10:10:36 -04:00
J. Bruce Fields	26c0c75e69	nfsd4: fix unlikely race in session replay case In the replay case, the renew_client(session->se_client); happens after we've droppped the sessionid_lock, and without holding a reference on the session; so there's nothing preventing the session being freed before we get here. Thanks to Benny Halevy for catching a bug in an earlier version of this patch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Benny Halevy <bhalevy@panasas.com>	2010-05-03 08:32:31 -04:00
Neil Brown	2bc3c1179c	nfsd4: bug in read_buf When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. We never encountered thsi in testing because typically the only operations which use more than two pages are write-like operations, which have their own decoding logic. Something like a getattr after a write may cross a page boundary, but it would be very unusual for it to cross another boundary after that. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-26 15:39:08 -04:00
Dan Carpenter	d03859a4ac	nfsd: potential ERR_PTR dereference on exp_export() error paths. We "goto finish" from several places where "exp" is an ERR_PTR. Also I changed the check for "fsid_key" so that it was consistent with the check I added. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 12:03:02 -04:00
J. Bruce Fields	5771635592	nfsd4: complete enforcement of 4.1 op ordering Enforce the rules about compound op ordering. Motivated by implementing RECLAIM_COMPLETE, for which the client is implicit in the current session, so it is important to ensure a succesful SEQUENCE proceeds the RECLAIM_COMPLETE. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:35:14 -04:00
J. Bruce Fields	4b21d0defc	nfsd4: allow 4.0 clients to change callback path The rfc allows a client to change the callback parameters, but we didn't previously implement it. Teach the callbacks to rerun themselves (by placing themselves on a workqueue) when they recognize that their rpc task has been killed and that the callback connection has changed. Then we can change the callback connection by setting up a new rpc client, modifying the nfs4 client to point at it, waiting for any work in progress to complete, and then shutting down the old client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	2bf23875f5	nfsd4: rearrange cb data structures Mainly I just want to separate the arguments used for setting up the tcp client from the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b12a05cbdf	nfsd4: cl_count is unused Now that the shutdown sequence guarantees callbacks are shut down before the client is destroyed, we no longer have a use for cl_count. We'll probably reinstate a reference count on the client some day, but it will be held by users other than callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b5a1a81e5c	nfsd4: don't sleep in lease-break callback The NFSv4 server's fl_break callback can sleep (dropping the BKL), in order to allocate a new rpc task to send a recall to the client. As far as I can tell this doesn't cause any races in the current code, but the analysis is difficult. Also, the sleep here may complicate the move away from the BKL. So, just schedule some work to do the job for us instead. The work will later also prove useful for restarting a call after the callback information is changed. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:01 -04:00
J. Bruce Fields	3c4ab2aaa9	nfsd4: indentation cleanup Looks like a put-and-paste mistake. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-19 15:12:51 -04:00
J. Bruce Fields	408b79bcc3	nfsd4: consistent session flag setting We should clear these flags on any new create_session, not just on the first one. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-16 21:47:37 -04:00
J. Bruce Fields	9045b4b9f7	nfsd4: remove probe task's reference on client Any null probe rpc will be synchronously destroyed by the rpc_shutdown_client() in expire_client(), so the rpc task cannot outlast the nfs4 client. Therefore there's no need for that task to hold a reference on the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:32 -04:00
J. Bruce Fields	3df796dbe9	nfsd4: remove dprintk I haven't found this useful. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:31 -04:00
J. Bruce Fields	147efd0dd7	nfsd4: shutdown callbacks on expiry Once we've expired the client, there's no further purpose to the callbacks; go ahead and shut down the callback client rather than waiting for the last reference to go. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:36:30 -04:00
J. Bruce Fields	227f98d98d	nfsd4: preallocate nfs4_rpc_args Instead of allocating this small structure, just include it in the delegation. The nfsd4_callback structure isn't really necessary yet, but we plan to add to it all the information necessary to perform a callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:28:11 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jeff Layton	91885258e8	nfsd: don't break lease while servicing a COMMIT This is the second attempt to fix the problem whereby a COMMIT call causes a lease break and triggers a possible deadlock. The problem is that nfsd attempts to break a lease on a COMMIT call. This triggers a delegation recall if the lease is held for a delegation. If the client is the one holding the delegation and it's the same one on which it's issuing the COMMIT, then it can't return that delegation until the COMMIT is complete. But, nfsd won't complete the COMMIT until the delegation is returned. The client and server are essentially deadlocked until the state is marked bad (due to the client not responding on the callback channel). The first patch attempted to deal with this by eliminating the open of the file altogether and simply had nfsd_commit pass a NULL file pointer to the vfs_fsync_range. That would conflict with some work in progress by Christoph Hellwig to clean up the fsync interface, so this patch takes a different approach. This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates to nfsd_open that it should not break any leases when opening the file, and has nfsd_commit set that flag on the nfsd_open call. For now, this patch leaves nfsd_commit opening the file with write access since I'm not clear on what sort of access would be more appropriate. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-22 15:37:53 -04:00
NeilBrown	61f8603d93	nfsd: factor out hash functions for export caches. Both the _lookup and the _update functions for these two caches independently calculate the hash of the key. So factor out that code for improved reuse. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-16 18:05:11 -04:00
J. Bruce Fields	e739cf1da4	Merge commit 'v2.6.34-rc1' into for-2.6.35-incoming	2010-03-09 17:22:08 -05:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
J. Bruce Fields	e7b184f199	nfsd4: document lease/grace-period limits The current documentation here is out of date, and not quite right. (Future work: some user documentation would be useful.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:10 -05:00
J. Bruce Fields	efc4bb4fdd	nfsd4: allow setting grace period time Allow explicit configuration of the grace period time as well as the lease period time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:08 -05:00
J. Bruce Fields	f013574014	nfsd4: reshuffle lease-setting code to allow reuse We'll soon allow setting the grace period, so we'll want to share this code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	f958a1320f	nfsd4: remove unnecessary lease-setting function This is another layer of indirection that doesn't really buy us anything. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	e46b498c84	nfsd4: simplify lease/grace interaction The original code here assumed we'd allow the user to change the lease any time, but only allow the change to take effect on restart. Since then we modified the code to allow setting the lease on when the server is down. Update the rest of the code to reflect that fact, clarify variable names, and add document. Also, the code insisted that the grace period always be the longer of the old and new lease periods, but that's overly conservative--as long as it lasts at least the old lease period, old clients should still know to recover in time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:02 -05:00
J. Bruce Fields	cf07d2ea43	nfsd4: simplify references to nfsd4 lease time Instead of accessing the lease time directly, some users call nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as nfs4_lease_time(). Neither layer of indirection serves any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:01 -05:00
Linus Torvalds	05c5cb31ec	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux * 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd4: fix minor memory leak svcrpc: treat uid's as unsigned nfsd: ensure sockets are closed on error Revert "sunrpc: move the close processing after do recvfrom method" Revert "sunrpc: fix peername failed on closed listener" sunrpc: remove unnecessary svc_xprt_put NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN xfs_export_operations.commit_metadata commit_metadata export operation replacing nfsd_sync_dir lockd: don't clear sm_monitored on nsm_reboot_lookup lockd: release reference to nsm_handle in nlm_host_rebooted nfsd: Use vfs_fsync_range() in nfsd_commit NFSD: Create PF_INET6 listener in write_ports SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() NFSD: Support AF_INET6 in svc_addsock() function SUNRPC: Use rpc_pton() in ip_map_parse() nfsd: 4.1 has an rfc number nfsd41: Create the recovery entry for the NFSv4.1 client nfsd: use vfs_fsync for non-directories ...	2010-03-06 11:31:38 -08:00
Wu Fengguang	42e4960868	vfs: take f_lock on modifying f_mode after open time We'll introduce FMODE_RANDOM which will be runtime modified. So protect all runtime modification to f_mode with f_lock to avoid races. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@kernel.org> [2.6.33.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:25 -08:00
Linus Torvalds	e213e26ab3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits) quota: stop using QUOTA_OK / NO_QUOTA dquot: cleanup dquot initialize routine dquot: move dquot initialization responsibility into the filesystem dquot: cleanup dquot drop routine dquot: move dquot drop responsibility into the filesystem dquot: cleanup dquot transfer routine dquot: move dquot transfer responsibility into the filesystem dquot: cleanup inode allocation / freeing routines dquot: cleanup space allocation / freeing routines ext3: add writepage sanity checks ext3: Truncate allocated blocks if direct IO write fails to update i_size quota: Properly invalidate caches even for filesystems with blocksize < pagesize quota: generalize quota transfer interface quota: sb_quota state flags cleanup jbd: Delay discarding buffers in journal_unmap_buffer ext3: quota_write cross block boundary behaviour quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota quota: split out compat_sys_quotactl support from quota.c quota: split out netlink notification support from quota.c quota: remove invalid optimization from quota_sync_all ... Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c	2010-03-05 13:20:53 -08:00
Christoph Hellwig	907f4554e2	dquot: move dquot initialization responsibility into the filesystem Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:30 +01:00
J. Bruce Fields	4ea41e2de5	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.	2010-03-04 12:04:51 -05:00
J. Bruce Fields	8d75da8afd	nfsd4: fix minor memory leak There's no need to allocate this cred more than once. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-03 16:13:29 -05:00
Al Viro	462d60577a	fix NFS4 handling of mountpoint stat RFC says we need to follow the chain of mounts if there's more than one stacked on that point. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:57 -05:00
Al Viro	8737c9305b	Switch may_open() and break_lease() to passing O_... ... instead of mixing FMODE_ and O_ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 13:00:21 -05:00
Chuck Lever	58255a4e3c	NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN The server's callback client should stop trying to connect to the client's callback server as soon as it gets ECONNREFUSED. The NFS server's callback client does not call rpc_ping(), but appears to have it's own "ping" procedure, so it wasn't covered by commit `caabea8a`. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-24 17:50:28 -08:00
Ben Myers	f501912a35	commit_metadata export operation replacing nfsd_sync_dir - Add commit_metadata export_operation to allow the underlying filesystem to decide how to commit an inode most efficiently. - Usage of nfsd_sync_dir and write_inode_now has been replaced with the commit_metadata function that takes a svc_fh. - The commit_metadata function calls the commit_metadata export_op if it's there, or else falls back to sync_inode instead of fsync and write_inode_now because only metadata need be synced here. - nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-20 13:13:44 -08:00
Chuck Ebbert	aeaa5ccd64	vfs: don't call ima_file_check() unconditionally in nfsd_open() commit `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") moved this code back to its original location but missed the "else". Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-20 00:47:31 -05:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Linus Torvalds	deb0c98c7f	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: Revert "nfsd4: fix error return when pseudoroot missing"	2010-02-08 17:08:01 -08:00
J. Bruce Fields	260c64d235	Revert "nfsd4: fix error return when pseudoroot missing" Commit `f39bde24b2` fixed the error return from PUTROOTFH in the case where there is no pseudofilesystem. This is really a case we shouldn't hit on a correctly configured server: in the absence of a root filehandle, there's no point accepting version 4 NFS rpc calls at all. But the shared responsibility between kernel and userspace here means the kernel on its own can't eliminate the possiblity of this happening. And we have indeed gotten this wrong in distro's, so new client-side mount code that attempts to negotiate v4 by default first has to work around this case. Therefore when commit `f39bde24b2` arrived at roughly the same time as the new v4-default mount code, which explicitly checked only for the previous error, the result was previously fine mounts suddenly failing. We'll fix both sides for now: revert the error change, and make the client-side mount workaround more robust. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-08 15:25:23 -05:00
Mimi Zohar	9bbb6cad01	ima: rename ima_path_check to ima_file_check ima_path_check actually deals with files! call it ima_file_check instead. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Mimi Zohar	8eb988c70e	fix ima breakage The "Untangling ima mess, part 2 with counters" patch messed up the counters. Based on conversations with Al Viro, this patch streamlines ima_path_check() by removing the counter maintaince. The counters are now updated independently, from measuring the file, in __dentry_open() and alloc_file() by calling ima_counts_get(). ima_path_check() is called from nfsd and do_filp_open(). It also did not measure all files that should have been measured. Reason: ima_path_check() got bogus value passed as mask. [AV: mea culpa] [AV: add missing nfsd bits] Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Al Viro	1e41568d73	Take ima_path_check() in nfsd past dentry_open() in nfsd_open() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Trond Myklebust	aa696a6f34	nfsd: Use vfs_fsync_range() in nfsd_commit The NFS COMMIT operation allows the client to specify the exact byte range that it wishes to sync to disk in order to optimise server performance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-29 18:53:11 -05:00
Chuck Lever	37498292aa	NFSD: Create PF_INET6 listener in write_ports Try to create a PF_INET6 listener for NFSD, if IPv6 is enabled in the kernel. Make sure nfsd_serv's reference count is decreased if __write_ports_addxprt() failed to create a listener. See __write_ports_addfd(). Our current plan is to rely on rpc.nfsd to create appropriate IPv6 listeners when server-side NFS/IPv6 support is desired. Legacy behavior, via the write_threads or write_svc kernel APIs, will remain the same -- only IPv4 listeners are created. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [bfields@citi.umich.edu: Move error-handling code to end] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-27 17:01:08 -05:00
Chuck Lever	6871790815	SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" write_ports() converts svc_create_xprt()'s ENOENT error return to EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error message that makes sense. It turns out that several of the other kernel APIs rpc.nfsd use can also return ENOENT from svc_create_xprt(), by way of lockd_up(). On the client side, an NFSv2 or NFSv3 mount request can also return the result of lockd_up(). This error may also be returned during an NFSv4 mount request, since the NFSv4 callback service uses svc_create_xprt() to create the callback listener. An ENOENT error return results in a confusing error message from the mount command. Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-26 17:59:21 -05:00
Ricardo Labiaga	8b8aae4009	nfsd41: Create the recovery entry for the NFSv4.1 client Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-14 12:24:46 -05:00
Christoph Hellwig	6a68f89ee1	nfsd: use vfs_fsync for non-directories Instead of opencoding the fsync calling sequence use vfs_fsync. This also gets rid of the useless i_mutex over the data writeout. Consolidate the remaining special code for syncing directories and document it's quirks. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	de3cab793c	nfsd4: Use FIRST_NFS4_OP in nfsd4_decode_compound() Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	c551866e64	nfsd41: nfsd4_decode_compound() does not recognize all ops The server incorrectly assumes that the operations in the array start with value 0. The first operation (OP_ACCESS) has a value of 3, causing the check in nfsd4_decode_compound to be off. Instead of comparing that the operation number is less than the number of elements in the array, the server should verify that it is less than the maximum valid operation number defined by LAST_NFS4_OP. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Linus Torvalds	93939f4e5d	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: sunrpc: fix peername failed on closed listener nfsd: make sure data is on disk before calling ->fsync nfsd: fix "insecure" export option	2010-01-06 18:10:15 -08:00
Christoph Hellwig	7211a4e859	nfsd: make sure data is on disk before calling ->fsync nfsd is not using vfs_fsync, so I missed it when changing the calling convention during the 2.6.32 window. This patch fixes it to not only start the data writeout, but also wait for it to complete before calling into ->fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-06 17:37:26 -05:00
J. Bruce Fields	3d354cbc43	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-20 20:19:51 -08:00
J. Bruce Fields	f69ac2f5a3	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-20 10:22:58 -05:00
Linus Torvalds	bac5e54c29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits) direct I/O fallback sync simplification ocfs: stop using do_sync_mapping_range cleanup blockdev_direct_IO locking make generic_acl slightly more generic sanitize xattr handler prototypes libfs: move EXPORT_SYMBOL for d_alloc_name vfs: force reval of target when following LAST_BIND symlinks (try #7) ima: limit imbalance msg Untangling ima mess, part 3: kill dead code in ima Untangling ima mess, part 2: deal with counters Untangling ima mess, part 1: alloc_file() O_TRUNC open shouldn't fail after file truncation ima: call ima_inode_free ima_inode_free IMA: clean up the IMA counts updating code ima: only insert at inode creation time ima: valid return code from ima_inode_alloc fs: move get_empty_filp() deffinition to internal.h Sanitize exec_permission_lite() Kill cached_lookup() and real_lookup() Kill path_lookup_open() ... Trivial conflicts in fs/direct-io.c	2009-12-16 12:04:02 -08:00
Al Viro	1429b3eca2	Untangling ima mess, part 3: kill dead code in ima Kill the 'update' argument of ima_path_check(), kill dead code in ima. Current rules: ima counters are bumped at the same time when the file switches from put_filp() fodder to fput() one. Which happens exactly in two places - alloc_file() and __dentry_open(). Nothing else needs to do that at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:47 -05:00
Al Viro	b65a9cfc2c	Untangling ima mess, part 2: deal with counters * do ima_get_count() in __dentry_open() * stop doing that in followups * move ima_path_check() to right after nameidata_to_filp() * don't bump counters on it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:47 -05:00
J. Bruce Fields	7663dacd92	nfsd: remove pointless paths in file headers The new .h files have paths at the top that are now out of date. While we're here, just remove all of those from fs/nfsd; they never served any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:47 -05:00
J. Bruce Fields	1557aca790	nfsd: move most of nfsfh.h to fs/nfsd Most of this can be trivially moved to a private header as well. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:46 -05:00
J. Bruce Fields	774b147828	nfsd: make V4ROOT exports read-only I can't see any use for writeable V4ROOT exports. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:44 -05:00
Steve Dickson	03a816b46d	nfsd: restrict filehandles accepted in V4ROOT case On V4ROOT exports, only accept filehandles that are the root of some export. This allows mountd to allow or deny access to individual directories and symlinks on the pseudofilesystem. Note that the checks in readdir and lookup are not enough, since a malicious host with access to the network could guess filehandles that they weren't able to obtain through lookup or readdir. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	f2ca7153ca	nfsd: allow exports of symlinks We want to allow exports of symlinks, to allow mountd to communicate to the kernel which symlinks lead to exports, and hence which symlinks need to be visible on the pseudofilesystem. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	3227fa41ab	nfsd: filter readdir results in V4ROOT case As with lookup, we treat every boject as a mountpoint and pretend it doesn't exist if it isn't exported. The preexisting code here is confusing, but I haven't yet figured out how to make it clearer. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	82ead7fe41	nfsd: filter lookup results in V4ROOT case We treat every object as a mountpoint and pretend it doesn't exist if it isn't exported. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:23 -05:00
J. Bruce Fields	3b6cee7bc4	nfsd4: don't continue "under" mounts in V4ROOT case If /A/mount/point/ has filesystem "B" mounted on top of it, and if "A" is exported, but not "B", then the nfs server has always returned to the client a filehandle for the mountpoint, instead of for the root of "B", allowing the client to see the subtree of "A" that would otherwise be hidden by B. Disable this behavior in the case of V4ROOT exports; we implement the path restrictions of V4ROOT exports by treating every directory as if it were a mountpoint, and allowing traversal only if the new directory is exported. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:23 -05:00
Steve Dickson	eb4c86c6a5	nfsd: introduce export flag for v4 pseudoroot NFSv4 differs from v2 and v3 in that it presents a single unified filesystem tree, whereas v2 and v3 exported multiple filesystem (whose roots could be found using a separate mount protocol). Our original NFSv4 server implementation asked the administrator to designate a single filesystem as the NFSv4 root, then to mount filesystems they wished to export underneath. (Often using bind mounts of already-existing filesystems.) This was conceptually simple, and allowed easy implementation, but created a serious obstacle to upgrading between v2/v3: since the paths to v4 filesystems were different, administrators would have to adjust all the paths in client-side mount commands when switching to v4. Various workarounds are possible. For example, the administrator could export "/" and designate it as the v4 root. However, the security risks of that approach are obvious, and in any case we shouldn't be requiring the administrator to take extra steps to fix this problem; instead, the server should present consistent paths across different versions by default. These patches take a modified version of that approach: we provide a new export option which exports only a subset of a filesystem. With this flag, it becomes safe for mountd to export "/" by default, with no need for additional configuration. We begin just by defining the new flag. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:00:40 -05:00
J. Bruce Fields	12045a6ee9	nfsd: let "insecure" flag vary by pseudoflavor This was an oversight; it should be among the export flags that can be allowed to vary by pseudoflavor. This allows an administrator to (for example) allow auth_sys mounts only from low ports, but allow auth_krb5 mounts to use any port. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 19:08:58 -05:00
J. Bruce Fields	e8e8753f7a	nfsd: new interface to advertise export features Soon we will add the new V4ROOT flag, and allow the INSECURE flag to vary by pseudoflavor. It would be useful for nfs-utils (for example, for improved exportfs error reporting) to be able to know when this happens. Use this new interface for that purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:51:29 -05:00
Boaz Harrosh	9a74af2133	nfsd: Move private headers to source directory Lots of include/linux/nfsd/* headers are only used by nfsd module. Move them to the source directory Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:12:12 -05:00
Boaz Harrosh	341eb18446	nfsd: Source files #include cleanups Now that the headers are fixed and carry their own wait, all fs/nfsd/ source files can include a minimal set of headers. and still compile just fine. This patch should improve the compilation speed of the nfsd module. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:12:09 -05:00
J. Bruce Fields	57ecb34feb	nfsd4: fix share mode permissions NFSv4 opens may function as locks denying other NFSv4 users the rights to open a file. We're requiring a user to have write permissions before they can deny write. We're not requiring a user to have write permissions to deny read, which is if anything a more drastic denial. What was intended was to require write permissions for DENY_READ. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:06:54 -05:00
J. Bruce Fields	864f0f61f8	nfsd: simplify fh_verify access checks All nfsd security depends on the security checks in fh_verify, and especially on nfsd_setuser(). It therefore bothers me that the nfsd_setuser call may be made from three different places, depending on whether the filehandle has already been mapped to a dentry, and on whether subtreechecking is in force. Instead, make an unconditional call in fh_verify(), so it's trivial to verify that the call always occurs. That leaves us with a redundant nfsd_setuser() call in the subtreecheck case--it needs the correct user set earlier in order to check execute permissions on the path to this filehandle--but I'm willing to accept that minor inefficiency in the subtreecheck case in return for more straightforward permission checking. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-25 17:55:46 -05:00
J. Bruce Fields	9b8b317d58	Merge commit 'v2.6.32-rc8' into HEAD	2009-11-23 12:34:58 -05:00
Petr Vandrovec	479c2553af	Fix memory corruption caused by nfsd readdir+ Commit `8177e6d6df` ("nfsd: clean up readdirplus encoding") introduced single character typo in nfs3 readdir+ implementation. Unfortunately that typo has quite bad side effects: random memory corruption, followed (on my box) with immediate spontaneous box reboot. Using 'p1' instead of 'p' fixes my Linux box rebooting whenever VMware ESXi box tries to list contents of my home directory. Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-14 12:55:55 -08:00
J. Bruce Fields	0a3adadee4	nfsd: make fs/nfsd/vfs.h for common includes None of this stuff is used outside nfsd, so move it out of the common linux include directory. Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really belongs there, so later we may remove that file entirely. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-13 13:23:02 -05:00
Benny Halevy	8c10cbdb4a	nfsd: use STATEID_FMT and STATEID_VAL for printing stateids Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-05 12:06:29 -05:00
Peter Staubach	1b7e0403c6	nfsd: register NFS_ACL with rpcbind Modify the NFS server to register the NFS_ACL services with the rpcbind daemon. This allows the client to ping for the existence of the NFS_ACL support via commands such as "rpcinfo -t <server> nfs_acl". This patch also modifies the NFS_ACL support so that responses to version 2 NULLPROC requests can be made. The changelog for the patch which turned off this functionality mentioned something about not registering the NFS_ACL as being part of some tradition. I can't find this tradition and the only other implementation which supports NFS_ACL does register them with the rpcbind daemon. Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-04 13:46:37 -05:00
Frank Filz	aba24d7158	nfsd: Fix sort_pacl in fs/nfsd/nf4acl.c to actually sort groups We have been doing some extensive testing of Linux support for ACLs on NFDS v4. We have noticed that the server rejects ACLs where the groups are out of order, for example, the following ACL is rejected: A::OWNER@:rwaxtTcCy A::user101@domain:rwaxtcy A::GROUP@:rwaxtcy A:g:group102@domain:rwaxtcy A:g:group101@domain:rwaxtcy A::EVERYONE@:rwaxtcy Examining the server code, I found that after converting an NFS v4 ACL to POSIX, sort_pacl is called to sort the user ACEs and group ACEs. Unfortunately, a minor bug causes the group sort to be skipped. Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:44 -04:00
J. Bruce Fields	efe0cb6d5a	nfsd4.1: common slot allocation size calculation We do the same calculation in a couple places; use a helper function, and add a little documentation, in the hopes of preventing bugs like that fixed in the last patch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:43 -04:00
J. Bruce Fields	dd829c4564	nfsd4.1: fix session memory use calculation Unbalanced calculations on creation and destruction of sessions could cause our estimate of cache memory used to become negative, sometimes resulting in spurious SERVERFAULT returns to client CREATE_SESSION requests. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:43 -04:00
J. Bruce Fields	e343eb0d60	Merge commit 'v2.6.32-rc5' into for-2.6.33	2009-10-27 18:45:17 -04:00
Alexey Dobriyan	828c09509b	const: constify remaining file_operations [akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-01 16:11:11 -07:00
Andy Adamson	ddc04fd4d5	nfsd41: use sv_max_mesg for forechannel max sizes ca_maxresponsesize and ca_maxrequest size include the RPC header. sv_max_mesg is sv_max_payolad plus a page for overhead and is used in svc_init_buffer to allocate server buffer space for both the request and reply. Note that this means we can service an RPC compound that requires ca_maxrequestsize (MAXWRITE) or ca_max_responsesize (MAXREAD) but that we do not support an RPC compound that requires both ca_maxrequestsize and ca_maxresponsesize. Signed-off-by: Andy Adamson <andros@netapp.com> [bfields@citi.umich.edu: more documentation updates] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:40:15 -04:00
J. Bruce Fields	f39bde24b2	nfsd4: fix error return when pseudoroot missing We really shouldn't hit this case at all, and forthcoming kernel and nfs-utils changes should eliminate this case; if it does happen, consider it a bug rather than reporting an error that doesn't really make sense for the operation (since there's no reason for a server to be accepting v4 traffic yet have no root filehandle). Also move some exp_pseudoroot code into a helper function while we're here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:21:26 -04:00
J. Bruce Fields	289ede453e	nfsd: minor nfsd_lookup cleanup Break out some of nfsd_lookup_dentry into helper functions. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:07:53 -04:00
J. Bruce Fields	fed8381126	nfsd4: cross mountpoints when looking up parents `3c394ddaa7` "nfsd4: nfsv4 clients should cross mountpoints" forgot to handle lookups of parents directories. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:07:52 -04:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
James Morris	88e9d34c72	seq_file: constify seq_operations Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:29 -07:00
Linus Torvalds	a87e84b5cd	Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux * 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits) nfsd4: nfsv4 clients should cross mountpoints nfsd: revise 4.1 status documentation sunrpc/cache: avoid variable over-loading in cache_defer_req sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req nfsd: return success for non-NFS4 nfs4_state_start nfsd41: Refactor create_client() nfsd41: modify nfsd4.1 backchannel to use new xprt class nfsd41: Backchannel: Implement cb_recall over NFSv4.1 nfsd41: Backchannel: cb_sequence callback nfsd41: Backchannel: Setup sequence information nfsd41: Backchannel: Server backchannel RPC wait queue nfsd41: Backchannel: Add sequence arguments to callback RPC arguments nfsd41: Backchannel: callback infrastructure nfsd4: use common rpc_cred for all callbacks nfsd4: allow nfs4 state startup to fail SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous nfsd4: fix null dereference creating nfsv4 callback client nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. ...	2009-09-22 07:54:33 -07:00
Alexey Dobriyan	7b021967c5	const: make lock_manager_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Steve Dickson	3c394ddaa7	nfsd4: nfsv4 clients should cross mountpoints Allow NFS v4 clients to seamlessly cross mount point without have to set either the 'crossmnt' or the 'nohide' export options. Signed-Off-By: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-21 16:02:25 -04:00
Ricardo Labiaga	b09333c464	nfsd41: Refactor create_client() Move common initialization of 'struct nfs4_client' inside create_client(). Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: Remember the auth flavor to use for callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:13 -04:00
Alexandros Batsakis	3ddc8bf5f3	nfsd41: modify nfsd4.1 backchannel to use new xprt class This patch enables the use of the nfsv4.1 backchannel. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [initialize rpc_create_args.bc_xprt too] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:13 -04:00
Ricardo Labiaga	0421b5c55a	nfsd41: Backchannel: Implement cb_recall over NFSv4.1 Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: cb_recall callback] [Share v4.0 and v4.1 back channel xdr] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Share v4.0 and v4.1 back channel xdr] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] [nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Backchannel: Add sequence arguments to callback RPC arguments] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [pulled-in definition of nfsd4_cb_done] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:12 -04:00
Benny Halevy	2af73580b7	nfsd41: Backchannel: cb_sequence callback Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1 Note: highest slot id and target highest slot id do not have to be 0 as was previously implemented. They can be greater than what the nfs server sent if the client supports a larger slot table on the backchannel. At this point we just ignore that. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [Rework the back channel xdr using the shared v4.0 and v4.1 framework.] Signed-off-by: Andy Adamson <andros@netapp.com> [fixed indentation] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: fix verification of CB_SEQUENCE highest slot id[ Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Backchannel: Remove old backchannel serialization] [nfsd41: Backchannel: First callback sequence ID should be 1] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: decode_cb_sequence does not need to actually decode ignored fields] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:56 -04:00
Ricardo Labiaga	2a1d1b5938	nfsd41: Backchannel: Setup sequence information Follows the model used by the NFS client. Setup the RPC prepare and done function pointers so that we can populate the sequence information if minorversion == 1. rpc_run_task() is then invoked directly just like existing NFS client operations do. nfsd4_cb_prepare() determines if the sequence information needs to be setup. If the slot is in use, it adds itself to the wait queue. nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue after our RPC reply has been received. It also sets the task message result pointer to NULL to clearly indicate we're done using it. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [define and initialize cl_cb_seq_nr here] [pulled out unused defintion of nfsd4_cb_done] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:56 -04:00
Ricardo Labiaga	199ff35e1c	nfsd41: Backchannel: Server backchannel RPC wait queue RPC callback requests will wait on this wait queue if the backchannel is out of slots. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
Ricardo Labiaga	132f97715c	nfsd41: Backchannel: Add sequence arguments to callback RPC arguments Follow the model we use in the client. Make the sequence arguments part of the regular RPC arguments. None of the callbacks that are soon to be implemented expect results that need to be passed back to the caller, so we don't define a separate RPC results structure. For session validation, the cb_sequence decoding will use a pointer to the sequence arguments that are part of the RPC argument. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [define struct nfsd4_cb_sequence here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
Andy Adamson	38524ab38f	nfsd41: Backchannel: callback infrastructure Keep the xprt used for create_session in cl_cb_xprt. Mark cl_callback.cb_minorversion = 1 and remember the client provided cl_callback.cb_prog rpc program number. Use it to probe the callback path. Use the client's network address to initialize as the callback's address as expected by the xprt creation routines. Define xdr sizes and code nfs4_cb_compound header to be able to send a null callback rpc. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [get callback minorversion from fore channel's] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: change bc_sock to bc_xprt] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pulled definition for cl_cb_xprt] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: set up backchannel's cb_addr] [moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
J. Bruce Fields	80fc015bdf	nfsd4: use common rpc_cred for all callbacks Callbacks are always made using the machine's identity, so we can use a single auth_generic credential shared among callbacks to all clients and let the rpc code take care of the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:34 -04:00
J. Bruce Fields	29ab23cc5d	nfsd4: allow nfs4 state startup to fail The failure here is pretty unlikely, but we should handle it anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:33 -04:00
J. Bruce Fields	886e3b7fe6	nfsd4: fix null dereference creating nfsv4 callback client On setting up the callback to the client, we attempt to use the same authentication flavor the client did. We find an rpc cred to use by calling rpcauth_lookup_credcache(), which assumes that the given authentication flavor has a credentials cache. However, this is not required to be true--in particular, auth_null does not use one. Instead, we should call the auth's lookup_cred() method. Without this, a client attempting to mount using nfsv4 and auth_null triggers a null dereference. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:33 -04:00
Benny Halevy	4be36ca0ce	nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-13 15:57:39 -04:00
Trond Myklebust	ab3bbaa8b2	Merge branch 'nfs-for-2.6.32'	2009-09-11 14:59:37 -04:00
J. Bruce Fields	aed100fafb	nfsd: fix leak on error in nfsv3 readdir Note the !dchild->d_inode case can leak the filehandle. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 15:48:00 -04:00
J. Bruce Fields	8177e6d6df	nfsd: clean up readdirplus encoding Make the return from compose_entry_fh() zero or an error, even though the returned error isn't used, just to make the meaning of the return immediately obvious. Move some repeated code out of main function into helper. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 15:47:40 -04:00
J. Bruce Fields	1be10a88ca	nfsd4: filehandle leak or error exit from fh_compose() A number of callers (nfsd4_encode_fattr(), at least) don't bother to release the filehandle returned to fh_compose() if fh_compose() returns an error. So, modify fh_compose() to release the filehandle before returning an error. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 11:59:32 -04:00
Trond Myklebust	2671a4bf35	NFSd: Fix filehandle leak in exp_pseudoroot() and nfsd4_path() nfsd4_path() allocates a temporary filehandle and then fails to free it before the function exits, leaking reference counts to the dentry and export that it refers to. Also, nfsd4_lookupp() puts the result of exp_pseudoroot() in a temporary filehandle which it releases on success of exp_pseudoroot() but not on failure; fix exp_pseudoroot to ensure that on failure it releases the filehandle before returning. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-03 16:57:57 -04:00
J. Bruce Fields	bc6c53d5a1	nfsd: move fsid_type choice out of fh_compose More trivial cleanup. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-02 23:54:48 -04:00
J. Bruce Fields	8e498751f2	nfsd: move some of fh_compose into helper functions Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-02 23:53:51 -04:00
David Howells	e0e817392b	CRED: Add some configurable debugging [try #6 ] Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking for credential management. The additional code keeps track of the number of pointers from task_structs to any given cred struct, and checks to see that this number never exceeds the usage count of the cred struct (which includes all references, not just those from task_structs). Furthermore, if SELinux is enabled, the code also checks that the security pointer in the cred struct is never seen to be invalid. This attempts to catch the bug whereby inode_has_perm() faults in an nfsd kernel thread on seeing cred->security be a NULL pointer (it appears that the credential struct has been previously released): http://www.kerneloops.org/oops.php?number=252883 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-02 21:29:01 +10:00
Andy Adamson	557ce2646e	nfsd41: replace page based DRC with buffer based DRC Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd pages in cache. Connectathon testing has shown that 1024 bytes for encoded compound operation responses past the sequence operation is sufficient, 512 bytes is a little too small. Set NFSD_SLOT_CACHE_SIZE to 1024. Allocate memory for the session DRC in the CREATE_SESSION operation to guarantee that the memory resource is available for caching responses. Allocate each slot individually in preparation for slot table size negotiation. Remove struct nfsd4_cache_entry and helper functions for the old page-based DRC. The iov_len calculation in nfs4svc_encode_compoundres is now always correct. Replay is now done in nfsd4_sequence under the state lock, so the session ref count is only bumped on non-replay. Clean up the nfs4svc_encode_compoundres session logic. The nfsd4_compound_state statp pointer is also not used. Remove nfsd4_set_statp(). Move useful nfsd4_cache_entry fields into nfsd4_slot. Signed-off-by: Andy Adamson <andros@netapp.com Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:06 -04:00
Andy Adamson	bdac86e215	nfsd41: replace nfserr_resource in pure nfs41 responses nfserr_resource is not a legal error for NFSv4.1. Replace it with nfserr_serverfault for EXCHANGE_ID and CREATE_SESSION processing. We will also need to map nfserr_resource to other errors in routines shared by NFSv4.0 and NFSv4.1 Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Andy Adamson	a8dfdaeb7a	nfsd41: use session maxreqs for sequence target and highest slotid This fixes a bug in the sequence operation reply. The sequence operation returns the highest slotid it will accept in the future in sr_highest_slotid, and the highest slotid it prefers the client to use. Since we do not re-negotiate the session slot table yet, these should both always be set to the session ca_maxrequests. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Andy Adamson	a649637c73	nfsd41: bound forechannel drc size by memory usage By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound a forechannel drc request size, clients can tailor a session to usage. For example, an I/O session (READ/WRITE only) can have a much smaller ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger ca_maxresponses to service a large in-flight data window. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Trond Myklebust	a06b1261bd	NFSD: Fix a bug in the NFSv4 'supported attrs' mandatory attribute The fact that the filesystem doesn't currently list any alternate locations does _not_ imply that the fs_locations attribute should be marked as "unsupported". Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 20:00:17 -04:00
Andy Adamson	468de9e54a	nfsd41: expand solo sequence check Compounds consisting of only a sequence operation don't need any additional caching beyond the sequence information we store in the slot entry. Fix nfsd4_is_solo_sequence to identify this case correctly. The additional check for a failed sequence in nfsd4_store_cache_entry() is redundant, since the nfsd4_is_solo_sequence call lower down catches this case. The final ce_cachethis set in nfsd4_sequence is also redundant. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-28 12:20:15 -04:00
Frank Filz	d8d0b85b11	nfsd4: remove ACE4_IDENTIFIER_GROUP flag from GROUP@ entry RFC 3530 says "ACE4_IDENTIFIER_GROUP flag MUST be ignored on entries with these special identifiers. When encoding entries with these special identifiers, the ACE4_IDENTIFIER_GROUP flag SHOULD be set to zero." It really shouldn't matter either way, but the point is that this flag is used to distinguish named users from named groups (since unix allows a group to have the same name as a user), so it doesn't really make sense to use it on a special identifier such as this.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-27 17:35:41 -04:00
Benny Halevy	aaf84eb95a	nfsd41: renew_client must be called under the state lock Until we work out the state locking so we can use a spin lock to protect the cl_lru, we need to take the state_lock to renew the client. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Do not renew state on error] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Simplify exit code] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-27 17:17:40 -04:00
Ryusei Yamaguchi	ed2d8aed52	knfsd: Replace lock_kernel with a mutex in nfsd pool stats. lock_kernel() in knfsd was replaced with a mutex. The later commit `03cf6c9f49` ("knfsd: add file to export stats about nfsd pools") did not follow that change. This patch fixes the issue. Also move the get and put of nfsd_serv to the open and close methods (instead of start and stop methods) to allow atomic check and increment of reference count in the open method (where we can still return an error). Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: Greg Banks <gnb@fmeh.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-25 12:39:37 -04:00
Frank Filz	55bb55dca0	nfsd: Fix unnecessary deny bits in NFSv4 ACL The group deny entries end up denying tcy even though tcy was just allowed by the allow entry. This appears to be due to: ace->access_mask = mask_from_posix(deny, flags); instead of: ace->access_mask = deny_mask_from_posix(deny, flags); Denying a previously allowed bit has no effect, so this shouldn't affect behavior, but it's ugly. Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-24 20:01:22 -04:00
Jeff Layton	fbf4665f41	nfsd: populate sin6_scope_id on callback address with scopeid from rq_addr on SETCLIENTID call When a SETCLIENTID call comes in, one of the args given is the svc_rqst. This struct contains an rq_addr field which holds the address that sent the call. If this is an IPv6 address, then we can use the sin6_scope_id field in this address to populate the sin6_scope_id field in the callback address. AFAICT, the rq_addr.sin6_scope_id is non-zero if and only if the client mounted the server's link-local address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:44 -04:00
Jeff Layton	7077ecbabd	nfsd: add support for NFSv4 callbacks over IPv6 The framework to add this is all in place. Now, add the code to allow support for establishing a callback channel on an IPv6 socket. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:44 -04:00
Jeff Layton	aa9a4ec770	nfsd: convert nfs4_cb_conn struct to hold address in sockaddr_storage ...rather than as a separate address and port fields. This will be necessary for implementing callbacks over IPv6. Also, convert gen_callback to use the standard rpcuaddr2sockaddr routine rather than its own private one. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:43 -04:00
Jeff Layton	363168b4ea	nfsd: make nfs4_client->cl_addr a struct sockaddr_storage It's currently a __be32, which isn't big enough to hold an IPv6 address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:43 -04:00
J. Bruce Fields	e9dc122166	Merge branch 'nfs-for-2.6.32' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 into for-2.6.32-incoming Conflicts: net/sunrpc/cache.c	2009-08-21 11:27:29 -04:00
Trond Myklebust	f884dcaead	Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:58 -04:00
Trond Myklebust	bc74b4f5e6	SUNRPC: Allow the cache_detail to specify alternative upcall mechanisms For events that are rare, such as referral DNS lookups, it makes limited sense to have a daemon constantly listening for upcalls on a channel. An alternative in those cases might simply be to run the app that fills the cache using call_usermodehelper_exec() and friends. The following patch allows the cache_detail to specify alternative upcall mechanisms for these particular cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:29 -04:00
Trond Myklebust	2da8ca26c6	NFSD: Clean up the idmapper warning... What part of 'internal use' is so hard to understand? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:26 -04:00
Chuck Lever	4116092b92	NFSD: Support IPv6 addresses in write_failover_ip() In write_failover_ip(), replace the sscanf() with a call to the common sunrpc.ko presentation address parser. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:40 -04:00
Andy Adamson	abfabf8caf	nfsd41: encode replay sequence from the slot values The sequence operation is not cached; always encode the sequence operation on a replay from the slot table and session values. This simplifies the sessions replay logic in nfsd4_proc_compound. If this is a replay of a compound that was specified not to be cached, return NFS4ERR_RETRY_UNCACHED_REP. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 16:12:34 -04:00
Andy Adamson	c8647947f8	nfsd41: rename nfsd4_enc_uncached_replay This function is only used for SEQUENCE replay. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:36 -04:00
Andy Adamson	49557cc74c	nfsd41: Use separate DRC for setclientid Instead of trying to share the generic 4.1 reply cache code for the CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION separately. The nfs41 single slot clientid DRC holds the results of create session processing. CREATE_SESSION can be preceeded by a SEQUENCE operation (an embedded CREATE_SESSION) and the create session single slot cache must be maintained. nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not implement the replay of an embedded CREATE_SESSION. The clientid DRC slot does not need the inuse, cachethis or other fields that the multiple slot session cache uses. Replace the clientid DRC cache struct nfs4_slot cache with a new nfsd4_clid_slot cache. Save the xdr struct nfsd4_create_session into the cache at the end of processing, and on a replay, replace the struct for the replay request with the cached version all while under the state lock. nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case via the normal use of encode_operation. Errors that do not change the create session cache: A create session NFS4ERR_STALE_CLIENTID error means that a client record (and associated create session slot) could not be found and therefore can't be changed. NFSERR_SEQ_MISORDERED errors do not change the slot cache. All other errors get cached. Remove the clientid DRC specific check in nfs4svc_encode_compoundres to put the session only if cstate.session is set which will now always be true. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:29 -04:00
Andy Adamson	88e588d56a	nfsd41: change check_slot_seqid parameters For separation of session slot and clientid slot processing. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:23 -04:00
Andy Adamson	5261dcf8eb	nfsd41: remove redundant forechannel max requests check This check is done in set_forechannel_maxreqs. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:15 -04:00
Andy Adamson	0c193054a4	nfsd41: hange from page to memory based drc limits NFSD_SLOT_CACHE_SIZE is the size of all encoded operation responses (excluding the sequence operation) that we want to cache. For now, keep NFSD_SLOT_CACHE_SIZE at PAGE_SIZE. It will be reduced when the DRC is changed from page based to memory based. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:05 -04:00
Andy Adamson	6a14dd1a4f	nfsd41: reserve less memory for DRC Also remove a slightly misleading comment. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:59 -04:00
Andy Adamson	b101ebbc39	nfsd41: minor set_forechannel_maxreqs cleanup Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:54 -04:00
Andy Adamson	be98d1bbd1	nfsd41: reclaim DRC memory on session free This fixes a leak which would eventually lock out new clients. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:48 -04:00
J. Bruce Fields	413d63d710	nfsd: minor write_pool_threads exit cleanup Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:41 -04:00
Eric Sesterhenn	2522a776c1	Fix memory leak in write_pool_threads kmemleak produces the following warning unreferenced object 0xc9ec02a0 (size 8): comm "cat", pid 19048, jiffies 730243 backtrace: [<c01bf970>] create_object+0x100/0x240 [<c01bfadb>] kmemleak_alloc+0x2b/0x60 [<c01bcd4b>] __kmalloc+0x14b/0x270 [<c02fd027>] write_pool_threads+0x87/0x1d0 [<c02fcc08>] nfsctl_transaction_write+0x58/0x70 [<c02fcc6f>] nfsctl_transaction_read+0x4f/0x60 [<c01c2574>] vfs_read+0x94/0x150 [<c01c297d>] sys_read+0x3d/0x70 [<c0102d6b>] sysenter_do_call+0x12/0x32 [<ffffffff>] 0xffffffff write_pool_threads() only frees nthreads on error paths, in the success case we leak it. Signed-off-by: Eric Sesterhenn <eric.sesterhenn@lsexperts.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:34 -04:00
Andy Adamson	4bd9b0f4af	nfsd41: use globals for DRC limits The version 4.1 DRC memory limit and tracking variables are server wide and session specific. Replace struct svc_serv fields with globals. Stop using the svc_serv sv_lock. Add a spinlock to serialize access to the DRC limit management variables which change on session creation and deletion (usage counter) or (future) administrative action to adjust the total DRC memory limit. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-07-14 17:52:40 -04:00
Yu Zhiguo	9208faf297	NFSv4: ACL in operations 'open' and 'create' should be used ACL in operations 'open' and 'create' is decoded but never be used. It should be set as the initial ACL for the object according to RFC3530. If error occurs when setting the ACL, just clear the ACL bit in the returned attr bitmap. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-14 12:16:47 -04:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
David Howells	033a666ccb	NFSD: Don't hold unrefcounted creds over call to nfsd_setuser() nfsd_open() gets an unrefcounted pointer to the current process's effective credentials at the top of the function, then calls nfsd_setuser() via fh_verify() - which may replace and destroy the current process's effective credentials - and then passes the unrefcounted pointer to dentry_open() - but the credentials may have been destroyed by this point. Instead, the value from current_cred() should be passed directly to dentry_open() as one of its arguments, rather than being cached in a variable. Possibly fh_verify() should return the creds to use. This is a regression introduced by `745ca2475a` "CRED: Pass credentials through dentry_open()". Signed-off-by: David Howells <dhowells@redhat.com> Tested-and-Verified-By: Steve Dickson <steved@redhat.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-03 10:21:10 -04:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Andy Adamson	ab52ae6db0	nfsd41: Backchannel: minorversion support for the back channel Prepare to share backchannel code with NFSv4.1. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 18:33:57 -07:00
Andy Adamson	ef52bff840	nfsd41: Backchannel: cleanup nfs4.0 callback encode routines Mimic the client and prepare to share the back channel xdr with NFSv4.1. Bump the number of operations in each encode routine, then backfill the number of operations. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 18:33:57 -07:00
Mike Sager	6ddbbbfe52	nfsd41: Remove ip address collision detection case Verified that cthon and pynfs exchange id tests pass (except for the two expected fails: EID8 and EID50) Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 17:43:53 -07:00
NeilBrown	671e1fcf63	nfsd: optimise the starting of zero threads when none are running. Currently, if we ask to set then number of nfsd threads to zero when there are none running, we set up all the sockets and register the service, and then tear it all down again. This is pointless. So detect that case and exit promptly. (also remove an assignment to 'error' which was never used. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com>	2009-06-18 09:42:41 -07:00
NeilBrown	82e12fe924	nfsd: don't take nfsd_mutex twice when setting number of threads. Currently when we write a number to 'threads' in nfsdfs, we take the nfsd_mutex, update the number of threads, then take the mutex again to read the number of threads. Mostly this isn't a big deal. However if we are write '0', and portmap happens to be dead, then we can get unpredictable behaviour. If the nfsd threads all got killed quickly and the last thread is waiting for portmap to respond, then the second time we take the mutex we will block waiting for the last thread. However if the nfsd threads didn't die quite that fast, then there will be no contention when we try to take the mutex again. Unpredictability isn't fun, and waiting for the last thread to exit is pointless, so avoid taking the lock twice. To achieve this, get nfsd_svc return a non-negative number of active threads when not returning a negative error. Signed-off-by: NeilBrown <neilb@suse.de>	2009-06-18 09:40:31 -07:00
Andy Adamson	5d77ddfbcb	nfsd41: sanity check client drc maxreqs Ensure the client requested maximum requests are between 1 and NFSD_MAX_SLOTS_PER_SESSION Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-16 17:13:16 -07:00
Alexandros Batsakis	6c18ba9f5e	nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct the change is valid for both the forechannel and the backchannel (currently dummy) Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-16 10:13:45 -07:00
Yu Zhiguo	b9081d90f5	NFS: kill off complicated macro 'PROC' kill off obscure macro 'PROC' of NFSv2&3 in order to make the code more clear. Among other things, this makes it simpler to grep for callers of these functions--something which has frequently caused confusion among nfs developers. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:34:32 -07:00
J. Bruce Fields	e4636d535e	nfsd: minor nfsd_vfs_write cleanup There's no need to check host_err >= 0 every time here when we could check host_err < 0 once, following the usual kernel style. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:18:34 -07:00
J. Bruce Fields	d911df7b8d	nfsd: Pull write-gathering code out of nfsd_vfs_write This is a relatively self-contained piece of code that handles a special case--move it to its own function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:54:05 -07:00
J. Bruce Fields	9d2a3f31d6	nfsd: track last inode only in use_wgather case Updating last_ino and last_dev probably isn't useful in the !use_wgather case. Also remove some pointless ifdef'd-out code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:52:47 -07:00
Trond Myklebust	48e03bc515	nfsd: Use write gathering only with NFSv2 NFSv3 and above can use unstable writes whenever they are sending more than one write, rather than relying on the flaky write gathering heuristics. More often than not, write gathering is currently getting it wrong when the NFSv3 clients are sending a single write with FILE_SYNC for efficiency reasons. This patch turns off write gathering for NFSv3/v4, and ensures that it only applies to the one case that can actually benefit: namely NFSv2. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:14:57 -07:00
J. Bruce Fields	7eef4091a6	Merge commit 'v2.6.30' into for-2.6.31	2009-06-15 18:08:07 -07:00
Al Viro	9393bd07cf	switch follow_down() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	bab77ebf51	switch follow_up() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	e64c390ca0	switch rqst_exp_parent() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	91c9fa8f75	switch rqst_exp_get_by_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	5bf3bd2b5c	switch exp_parent() to struct path ... and lose the always-NULL last argument (non-NULL case had been split off a while ago). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	55430e2ece	nfsd struct path use: exp_get_by_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:35:59 -04:00
James Morris	0b4ec6e4e0	Merge branch 'master' into next	2009-06-09 09:27:53 +10:00
Yu Zhiguo	0a93a47f04	NFSv4: kill off complicated macro 'PROC' J. Bruce Fields wrote: ... > (This is extremely confusing code to track down: note that > proc->pc_decode is set to nfs4svc_decode_compoundargs() by the PROC() > macro at the end of fs/nfsd/nfs4proc.c. Which means, for example, that > grepping for nfs4svc_decode_compoundargs() gets you nowhere. Patches to > kill off that macro would be welcomed....) the macro 'PROC' is complicated and obscure, it had better be killed off in order to make the code more clear. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-01 18:09:20 -04:00
Yu Zhiguo	3c8e03166a	NFSv4: do exact check about attribute specified Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is not supported in current environment. Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check. This bug is found when do newpynfs tests. The names of the tests that failed are following: CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s Add function do_check_fattr() to do exact check: 1, Check attribute specified is supported by the NFSv4 server or not. 2, Check FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported in current environment or not. 3, Check attribute specified is writable or not. step 1 and 3 are done in function nfsd4_decode_fattr() but removed to this function now. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-01 18:01:54 -04:00
Mimi Zohar	14dba5331b	integrity: nfsd imbalance bug fix An nfsd exported file is opened/closed by the kernel causing the integrity imbalance message. Before a file is opened, there normally is permission checking, which is done in inode_permission(). However, as integrity checking requires a dentry and mount point, which is not available in inode_permission(), the integrity (permission) checking must be called separately. In order to detect any missing integrity checking calls, we keep track of file open/closes. ima_path_check() increments these counts and does the integrity (permission) checking. As a result, the number of calls to ima_path_check()/ima_file_free() should be balanced. An extra call to fput(), indicates the file could have been accessed without first calling ima_path_check(). In nfsv3 permission checking is done once, followed by multiple reads, which do an open/close for each read. The integrity (permission) checking call should be in nfsd_permission() after the inode_permission() call, but as there is no correlation between the number of permission checking and open calls, the integrity checking call should not increment the counters, but defer it to when the file is actually opened. This patch adds: - integrity (permission) checking for nfsd exported files in nfsd_permission(). - a call to increment counts for files opened by nfsd. This patch has been updated to return the nfs error types. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-05-28 09:32:43 +10:00
Wei Yongjun	a0d24b295a	nfsd: fix hung up of nfs client while sync write data to nfs server Commit 'Short write in nfsd becomes a full write to the client' (`31dec2538e`) broken the sync write. With the following commands to reproduce: $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt $ cd /mnt $ echo aaaa > temp.txt Then nfs client is hung up. In SYNC mode the server alaways return the write count 0 to the client. This is because the value of host_err in nfsd_vfs_write() will be overwrite in SYNC mode by 'host_err=nfsd_sync(file);', and then we return host_err(which is now 0) as write count. This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 17:40:06 -04:00
Greg Banks	1dbd0d53f3	knfsd: remove unreported filehandle stats counters The file nfsfh.c contains two static variables nfsd_nr_verified and nfsd_nr_put. These are counters which are incremented as a side effect of the fh_verify() fh_compose() and fh_put() operations, i.e. at least twice per NFS call for any non-trivial workload. Needless to say this makes the cacheline that contains them (and any other innocent victims) a very hot contention point indeed under high call-rate workloads on multiprocessor NFS server. It also turns out that these counters are not used anywhere. They're not reported to userspace, they're not used in logic, they're not even exported from the object file (let alone the module). All they do is waste CPU time. So this patch removes them. Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured separately (no bonding). Workload is 640 client threads doing directory traverals with random small reads, from server RAM. Before ====== Kernel profile: % cumulative self self total time samples samples calls 1/call 1/call name 6.05 2716.00 2716.00 30406 0.09 1.02 svc_process 4.44 4706.00 1990.00 1975 1.01 1.01 spin_unlock_irqrestore 3.72 6376.00 1670.00 1666 1.00 1.00 svc_export_put 3.41 7907.00 1531.00 1786 0.86 1.02 nfsd_ofcache_lookup 3.25 9363.00 1456.00 10965 0.13 1.01 nfsd_dispatch 3.10 10752.00 1389.00 1376 1.01 1.01 nfsd_cache_lookup 2.57 11907.00 1155.00 4517 0.26 1.03 svc_tcp_recvfrom ... 2.21 15352.00 1003.00 1081 0.93 1.00 nfsd_choose_ofc <---- ^^^^ Here the function nfsd_choose_ofc() reads a global variable which by accident happened to be located in the same cacheline as nfsd_nr_verified. Call rate: nullarbor:~ # pmdumptext nfs3.server.calls ... Thu Dec 13 00:15:27 184780.663 Thu Dec 13 00:15:28 184885.881 Thu Dec 13 00:15:29 184449.215 Thu Dec 13 00:15:30 184971.058 Thu Dec 13 00:15:31 185036.052 Thu Dec 13 00:15:32 185250.475 Thu Dec 13 00:15:33 184481.319 Thu Dec 13 00:15:34 185225.737 Thu Dec 13 00:15:35 185408.018 Thu Dec 13 00:15:36 185335.764 After ===== kernel profile: % cumulative self self total time samples samples calls 1/call 1/call name 6.33 2813.00 2813.00 29979 0.09 1.01 svc_process 4.66 4883.00 2070.00 2065 1.00 1.00 spin_unlock_irqrestore 4.06 6687.00 1804.00 2182 0.83 1.00 nfsd_ofcache_lookup 3.20 8110.00 1423.00 10932 0.13 1.00 nfsd_dispatch 3.03 9456.00 1346.00 1343 1.00 1.00 nfsd_cache_lookup 2.62 10622.00 1166.00 4645 0.25 1.01 svc_tcp_recvfrom [...] 0.10 42586.00 44.00 74 0.59 1.00 nfsd_choose_ofc <--- HA!! ^^^^ Call rate: nullarbor:~ # pmdumptext nfs3.server.calls ... Thu Dec 13 01:45:28 194677.118 Thu Dec 13 01:45:29 193932.692 Thu Dec 13 01:45:30 194294.364 Thu Dec 13 01:45:31 194971.276 Thu Dec 13 01:45:32 194111.207 Thu Dec 13 01:45:33 194999.635 Thu Dec 13 01:45:34 195312.594 Thu Dec 13 01:45:35 195707.293 Thu Dec 13 01:45:36 194610.353 Thu Dec 13 01:45:37 195913.662 Thu Dec 13 01:45:38 194808.675 i.e. about a 5.3% improvement in call rate. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Reviewed-by: David Chinner <dgc@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:03 -04:00
Greg Banks	cf0a586cf4	knfsd: fix reply cache memory corruption Fix a regression in the reply cache introduced when the code was converted to use proper Linux lists. When a new entry needs to be inserted, the case where all the entries are currently being used by threads is not correctly detected. This can result in memory corruption and a crash. In the current code this is an extremely unlikely corner case; it would require the machine to have 1024 nfsd threads and all of them to be busy at the same time. However, upcoming reply cache changes make this more likely; a crash due to this problem was actually observed in field. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:02 -04:00
Greg Banks	fca4217c5b	knfsd: reply cache cleanups Make REQHASH() an inline function. Rename hash_list to cache_hash. Fix an obsolete comment. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:02 -04:00
J. Bruce Fields	8daed1e549	nfsd: silence lockdep warning Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-11 17:23:14 -04:00
Wang Chen	02cb2858db	nfsd: nfs4_stat_init cleanup Save some loop time. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-06 16:22:41 -04:00
J. Bruce Fields	b2c0cea6b1	nfsd4: check for negative dentry before use in nfsv4 readdir After `2f9092e102` "Fix i_mutex vs. readdir handling in nfsd" (and `14f7dd63` "Copy XFS readdir hack into nfsd code"), an entry may be removed between the first mutex_unlock and the second mutex_lock. In this case, lookup_one_len() will return a negative dentry. Check for this case to avoid a NULL dereference. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp> Cc: stable@kernel.org	2009-05-06 16:16:36 -04:00
Randy Dunlap	9064caae8f	nfsd: use C99 struct initializers Eliminate 56 sparse warnings like this one: fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:09:12 -04:00
J. Bruce Fields	63e4863fab	nfsd4: make recall callback an asynchronous rpc As with the probe, this removes the need for another kthread. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:08:56 -04:00
Andy Adamson	ccecee1e5e	nfsd41: slots are freed with session The session and slots are allocated all in one piece. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 14:45:02 -04:00
J. Bruce Fields	3aea09dc91	nfsd4: track recall retries in nfs4_delegation Move this out of a local variable into the nfs4_delegation object in preparation for making this an async rpc call (at which point we'll need any state like this in a common object that's preserved across function calls). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 20:11:12 -04:00
J. Bruce Fields	6707bd3d42	nfsd4: remove unused dl_trunc There's no point in keeping this field around--it's always zero. (Background: the protocol allows you to tell the client that the file is about to be truncated, as an optimization to save the client from writing back dirty pages that will just be discarded. We don't implement this hint. If we do some day, adding this field back in will be the least of the work involved.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:57:46 -04:00
J. Bruce Fields	b53d40c507	nfsd4: eliminate struct nfs4_cb_recall The nfs4_cb_recall struct is used only in nfs4_delegation, so its pointer to the containing delegation is unnecessary--we could just use container_of(). But there's no real reason to have this a separate struct at all--just move these fields to nfs4_delegation. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:50:00 -04:00
J. Bruce Fields	c237dc0303	nfsd4: rename callback struct to cb_conn I want to use the name for a struct that actually does represent a single callback. (Actually, I've never been sure it helps to a separate struct for the callback information. Some day maybe those fields could just be dumped into struct nfs4_client. I don't know.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 17:31:44 -04:00
J. Bruce Fields	e300a63ce4	nfsd4: replace callback thread by asynchronous rpc We don't really need a synchronous rpc, and moving to an asynchronous rpc allows us to do without this extra kthread. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 17:10:53 -04:00
J. Bruce Fields	3cef9ab266	nfsd4: lookup up callback cred only once Lookup the callback cred once and then use it for all subsequent callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:45:03 -04:00
J. Bruce Fields	ecdd03b791	nfsd4: create rpc callback client from server thread The code is a little simpler, and it should be easier to avoid races, if we just do all rpc client creation/destruction from nfsd or laundromat threads and do only the rpc calls themselves asynchronously. The rpc creation doesn't involve any significant waiting (it doesn't call the client, for example), so there's no reason not to do this. Also don't bother destroying the client on failure of the rpc null probe. We may want to retry the probe later anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:53 -04:00
J. Bruce Fields	e1cab5a589	nfsd4: set cb_client inside setup_callback_client This is just a minor code simplification. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:47 -04:00
J. Bruce Fields	595947acaa	nfsd4: set shorter timeout We tried to do something overly complicated with the callback rpc timeouts here. And they're wrong--the result is that by the time a single callback times out, it's already too late to tell the client (using the cb_path_down return to RENEW) that the callback is down. Use a much shorter, simpler timeout. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:40 -04:00
J. Bruce Fields	f64f79ea5f	nfsd4: setclientid_confirm callback-change fixes This setclientid_confirm case should allow the client to change callbacks, but it currently has a dummy implementation that just turns off callbacks completely. That dummy implementation isn't completely correct either, though: - There's no need to remove any client recovery directory in this case. - New clientid confirm verifiers should be generated (and returned) in setclientid; there's no need to generate a new one here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:34 -04:00
J. Bruce Fields	b8fd47aefa	nfsd: quiet compile warning Stephen Rothwell said: "Today's linux-next build (powerpc ppc64_defconfig) produced this new warning: fs/nfsd/nfs4state.c: In function 'EXPIRED_STATEID': fs/nfsd/nfs4state.c:2757: warning: comparison of distinct pointer types lacks a cast Caused by commit `78155ed75f` ("nfsd4: distinguish expired from stale stateids")." Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>	2009-04-29 11:36:17 -04:00
J. Bruce Fields	c654b8a9cb	nfsd: support ext4 i_version ext4 supports a real NFSv4 change attribute, which is bumped whenever the ctime would be updated, including times when two updates arrive within a jiffy of each other. (Note that although ext4 has space for nanosecond-precision ctime, the real resolution is lower: it actually uses jiffies as the time-source.) This ensures clients will invalidate their caches when they need to. There is some fear that keeping the i_version up-to-date could have performance drawbacks, so for now it's turned on only by a mount option. We hope to do something better eventually. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Theodore Tso <tytso@mit.edu>	2009-04-29 11:35:49 -04:00
J. Bruce Fields	3352d2c2d0	nfsd4: delete obsolete xdr comments We don't need comments to tell us these macros are ugly. And we're long past trying to share any of this code with the BSD's. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 11:35:49 -04:00
J. Bruce Fields	bc749ca4c4	nfsd: eliminate ENCODE_HEAD macro This macro doesn't serve any useful purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 11:35:49 -04:00
Chuck Lever	e06b64050e	NFSD: Stricter buffer size checking in fs/nfsd/nfsctl.c Clean up: For consistency, handle output buffer size checking in a other nfsctl functions the same way it's done for write_versions(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:30 -04:00
Chuck Lever	261758b5c3	NFSD: Stricter buffer size checking in write_versions() While it's not likely today that there are enough NFS versions to overflow the output buffer in write_versions(), we should be more careful about detecting the end of the buffer. The number of NFS versions will only increase as NFSv4 minor versions are added. Note that this API doesn't behave the same as portlist. Here we attempt to display as many versions as will fit in the buffer, and do not provide any indication that an overflow would have occurred. I don't have any good rationale for that. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:30 -04:00
Chuck Lever	3d72ab8fdd	NFSD: Stricter buffer size checking in write_recoverydir() While it's not likely a pathname will be longer than SIMPLE_TRANSACTION_SIZE, we should be more careful about just plopping it into the output buffer without bounds checking. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:30 -04:00
Chuck Lever	8435d34dbb	SUNRPC: pass buffer size to svc_sock_names() Adjust the synopsis of svc_sock_names() to pass in the size of the output buffer. Add a documenting comment. This is a cosmetic change for now. A subsequent patch will make sure the buffer length is passed to one_sock_name(), where the length will actually be useful. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	bfba9ab4c6	SUNRPC: pass buffer size to svc_addsock() Adjust the synopsis of svc_addsock() to pass in the size of the output buffer. Add a documenting comment. This is a cosmetic change for now. A subsequent patch will make sure the buffer length is passed to one_sock_name(), where the length will actually be useful. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	335c54bdc4	NFSD: Prevent a buffer overflow in svc_xprt_names() The svc_xprt_names() function can overflow its buffer if it's so near the end of the passed in buffer that the "name too long" string still doesn't fit. Of course, it could never tell if it was near the end of the passed in buffer, since its only caller passes in zero as the buffer length. Let's make this API a little safer. Change svc_xprt_names() so it always checks for a buffer overflow, and change its only caller to pass in the correct buffer length. If svc_xprt_names() does overflow its buffer, it now fails with an ENAMETOOLONG errno, instead of trying to write a message at the end of the buffer. I don't like this much, but I can't figure out a clean way that's always safe to return some of the names, and an indication that the buffer was not long enough. The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is "File name too long". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	ea068bad27	NFSD: move lockd_up() before svc_addsock() Clean up. A couple of years ago, a series of commits, finishing with commit `5680c446`, swapped the order of the lockd_up() and svc_addsock() calls in __write_ports(). At that time lockd_up() needed to know the transport protocol of the passed-in socket to start a listener on the same transport protocol. These days, lockd_up() doesn't take a protocol argument; it always starts both a UDP and TCP listener. It's now more straightforward to try the lockd_up() first, then do a lockd_down() if the svc_addsock() fails. Careful review of this code shows that the svc_sock_names() call is used only to close the just-opened socket in case lockd_up() fails. So it is no longer needed if lockd_up() is done first. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	0a5372d8a1	NFSD: Finish refactoring __write_ports() Clean up: Refactor transport name listing out of __write_ports() to make it easier to understand and maintain. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:27 -04:00
Chuck Lever	c71206a7b4	NFSD: Note an additional requirement when passing TCP sockets to portlist User space must call listen(3) on SOCK_STREAM sockets passed into /proc/fs/nfsd/portlist, otherwise that listener is ignored. Document this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:27 -04:00
Chuck Lever	0b7c2f6fc7	NFSD: Refactor socket creation out of __write_ports() Clean up: Refactor the socket creation logic out of __write_ports() to make it easier to understand and maintain. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:27 -04:00
Chuck Lever	82d565919a	NFSD: Refactor portlist socket closing into a helper Clean up: Refactor the socket closing logic out of __write_ports() to make it easier to understand and maintain. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:26 -04:00
Chuck Lever	4eb68c266c	NFSD: Refactor transport addition out of __write_ports() Clean up: Refactor transport addition out of __write_ports() to make it easier to understand and maintain. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:26 -04:00
Chuck Lever	4cd5dc751a	NFSD: Refactor transport removal out of __write_ports() Clean up: Refactor transport removal out of __write_ports() to make it easier to understand and maintain. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:26 -04:00
Bian Naimeng	78155ed75f	nfsd4: distinguish expired from stale stateids If we encode the time of client creation into the stateid instead of the time of server boot, then we can determine whether that stateid is from a previous instance of the a server, or from a client that has expired, and return an appropriate error to the client. Signed-off-by: Bian Naimeng <biannm@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-24 19:17:18 -04:00
Roel Kluin	80492e7d49	rpcgss: remove redundant test on unsigned Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-23 17:25:07 -04:00
David Woodhouse	2f9092e102	Fix i_mutex vs. readdir handling in nfsd Commit `14f7dd63` ("Copy XFS readdir hack into nfsd code") introduced a bug to generic code which had been extant for a long time in the XFS version -- it started to call through into lookup_one_len() and hence into the file systems' ->lookup() methods without i_mutex held on the directory. This patch fixes it by locking the directory's i_mutex again before calling the filldir functions. The original deadlocks which commit `14f7dd63` was designed to avoid are still avoided, because they were due to fs-internal locking, not i_mutex. While we're at it, fix the return type of nfsd_buffered_readdir() which should be a __be32 not an int -- it's an NFS errno, not a Linux errno. And return nfserrno(-ENOMEM) when allocation fails, not just -ENOMEM. Sparse would have caught that, if it wasn't so busy bitching about __cold__. Commit `05f4f678` ("nfsd4: don't do lookup within readdir in recovery code") introduced a similar problem with calling lookup_one_len() without i_mutex, which this patch also addresses. To fix that, it was necessary to fix the called functions so that they expect i_mutex to be held; that part was done by J. Bruce Fields. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Umm-I-can-live-with-that-by: Al Viro <viro@zeniv.linux.org.uk> Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp> Tested-by: J. Bruce Fields <bfields@citi.umich.edu> LKML-Reference: <8036.1237474444@jrobl> Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-04-20 23:01:16 -04:00
Al Viro	1644ccc8a9	Safer nfsd_cross_mnt() AFAICS, we have a subtle bug there: if we have crossed mountpoint and it got mount --move'd away, we'll be holding only one reference to fs containing dentry - exp->ex_path.mnt. IOW, we ought to dput() before exp_put(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-04-20 23:01:15 -04:00
Linus Torvalds	a63856252d	Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits) nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4 nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc nfsd41: Documentation/filesystems/nfs41-server.txt nfsd41: CREATE_EXCLUSIVE4_1 nfsd41: SUPPATTR_EXCLCREAT attribute nfsd41: support for 3-word long attribute bitmask nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify nfsd41: pass writable attrs mask to nfsd4_decode_fattr nfsd41: provide support for minor version 1 at rpc level nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap nfsd41: access_valid nfsd41: clientid handling nfsd41: check encode size for sessions maxresponse cached nfsd41: stateid handling nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op nfsd41: destroy_session operation nfsd41: non-page DRC for solo sequence responses nfsd41: Add a create session replay cache nfsd41: create_session operation ...	2009-04-06 13:25:56 -07:00
Benny Halevy	f0ad670d70	nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc Fixes the following compiler error: fs/nfsd/nfssvc.c: In function 'set_max_drc': fs/nfsd/nfssvc.c:240: error: 'NFSD_DRC_SIZE_SHIFT' undeclared CONFIG_NFSD_V4 is not set Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-06 09:17:53 -07:00
Benny Halevy	79fb54abd2	nfsd41: CREATE_EXCLUSIVE4_1 Implement the CREATE_EXCLUSIVE4_1 open mode conforming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 This mode allows the client to atomically create a file if it doesn't exist while setting some of its attributes. It must be implemented if the server supports persistent reply cache and/or pnfs. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:23 -07:00
Benny Halevy	8c18f2052e	nfsd41: SUPPATTR_EXCLCREAT attribute Return bitmask for supported EXCLUSIVE4_1 create attributes. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:23 -07:00
Andy Adamson	7e70570647	nfsd41: support for 3-word long attribute bitmask Also, use client minorversion to generate supported attrs Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:23 -07:00
Benny Halevy	95ec28cda3	nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify _nfsd4_verify currently skips 3 words from the encoded buffer begining. With support for 3-word attr bitmaps in nfsd41, nfsd4_encode_fattr may encode 1, 2, or 3 words, and not always 2 as it used to be, hence we need to find out where to skip using the encoded bitmap length. Note: This patch may be applied over pre-nfsd41 nfsd. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:22 -07:00
Benny Halevy	c0d6fc8a2d	nfsd41: pass writable attrs mask to nfsd4_decode_fattr In preparation for EXCLUSIVE4_1 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:22 -07:00
Benny Halevy	8daf220a6a	nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions Support enabling and disabling nfsv4.1 via /proc/fs/nfsd/versions by writing the strings "+4.1" or "-4.1" correspondingly. Use user mode nfs-utils (rpc.nfsd option) to enable. This will allow us to get rid of CONFIG_NFSD_V4_1 [nfsd41: disable support for minorversion by default] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:21 -07:00
Andy Adamson	84459a1162	nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap Separate the access bits from the want bits and enable __set_bit to work correctly with st_access_bmap. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:21 -07:00
Andy Adamson	d87a8ade95	nfsd41: access_valid For nfs41, the open share flags are used also for delegation "wants" and "signals". Check that they are valid. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:21 -07:00
Andy Adamson	60adfc50de	nfsd41: clientid handling Extract the clientid from sessionid to set the op_clientid on open. Verify that the clid for other stateful ops is zero for minorversion != 0 Do all other checks for stateful ops without sessions. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [fixed whitespace indent] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41 remove sl_session from nfsd4_open] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:20 -07:00
Andy Adamson	496c262cf0	nfsd41: check encode size for sessions maxresponse cached Calculate the space the compound response has taken after encoding the current operation. pad: add on 8 bytes for the next operation's op_code and status so that there is room to cache a failure on the next operation. Compare this length to the session se_fmaxresp_cached and return nfserr_rep_too_big_to_cache if the length is too large. Our se_fmaxresp_cached will always be a multiple of PAGE_SIZE, and so will be at least a page and will therefore hold the xdr_buf head. Signed-off-by: Andy Adamson <andros@netapp.com> [nfsd41: non-page DRC for solo sequence responses] [fixed nfsd4_check_drc_limit cosmetics] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use cstate session in nfsd4_check_drc_limit] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:20 -07:00
Andy Adamson	6668958fac	nfsd41: stateid handling When sessions are used, stateful operation sequenceid and stateid handling are not used. When sessions are used, on the first open set the seqid to 1, mark state confirmed and skip seqid processing. When sessionas are used the stateid generation number is ignored when it is zero whereas without sessions bad_stateid or stale stateid is returned. Add flags to propagate session use to all stateful ops and down to check_stateid_generation. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfsd4_has_session should return a boolean, not u32] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: pass nfsd4_compoundres * to nfsd4_process_open1] [nfsd41: calculate HAS_SESSION in nfs4_preprocess_stateid_op] [nfsd41: calculate HAS_SESSION in nfs4_preprocess_seqid_op] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:19 -07:00
Benny Halevy	dd453dfd70	nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op Currently we only use cstate->current_fh, will also be used by nfsd41 code. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:19 -07:00
Benny Halevy	e10e0cfc2f	nfsd41: destroy_session operation Implement the destory_session operation confoming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 [use sessionid_lock spin lock] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:19 -07:00
Andy Adamson	bf864a31d5	nfsd41: non-page DRC for solo sequence responses A session inactivity time compound (lease renewal) or a compound where the sequence operation has sa_cachethis set to FALSE do not require any pages to be held in the v4.1 DRC. This is because struct nfsd4_slot is already caching the session information. Add logic to the nfs41 server to not cache response pages for solo sequence responses. Return nfserr_replay_uncached_rep on the operation following the sequence operation when sa_cachethis is FALSE. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use cstate session in nfsd4_replay_cache_entry] [nfsd41: rename nfsd4_no_page_in_cache] [nfsd41 rename nfsd4_enc_no_page_replay] [nfsd41 nfsd4_is_solo_sequence] [nfsd41 change nfsd4_not_cached return] Signed-off-by: Andy Adamson <andros@netapp.com> [changed return type to bool] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41 drop parens in nfsd4_is_solo_sequence call] Signed-off-by: Andy Adamson <andros@netapp.com> [changed "== 0" to "!"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:19 -07:00
Andy Adamson	38eb76a54d	nfsd41: Add a create session replay cache Replace the nfs4_client cl_seqid field with a single struct nfs41_slot used for the create session replay cache. The CREATE_SESSION slot sets the sl_session pointer to NULL. Otherwise, the slot and it's replay cache are used just like the session slots. Fix unconfirmed create_session replay response by initializing the create_session slot sequence id to 0. A future patch will set the CREATE_SESSION cache when a SEQUENCE operation preceeds the CREATE_SESSION operation. This compound is currently only cached in the session slot table. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: revert portion of nfsd4_set_cache_entry] Signed-off-by: Andy Adamson <andros@netpp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:18 -07:00
Andy Adamson	ec6b5d7b50	nfsd41: create_session operation Implement the create_session operation confoming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 Look up the client id (generated by the server on exchange_id, given by the client on create_session). If neither a confirmed or unconfirmed client is found then the client id is stale If a confirmed cilent is found (i.e. we already received create_session for it) then compare the sequence id to determine if it's a replay or possibly a mis-ordered rpc. If the seqid is in order, update the confirmed client seqid and procedd with updating the session parameters. If an unconfirmed client_id is found then verify the creds and seqid. If both match move the client id to confirmed state and proceed with processing the create_session. Currently, we do not support persistent sessions, and RDMA. alloc_init_session generates a new sessionid and creates a session structure. NFSD_PAGES_PER_SLOT is used for the max response cached calculation, and for the counting of DRC pages using the hard limits set in struct srv_serv. A note on NFSD_PAGES_PER_SLOT: Other patches in this series allow for NFSD_PAGES_PER_SLOT + 1 pages to be cached in a DRC slot when the response size is less than NFSD_PAGES_PER_SLOT * PAGE_SIZE but xdr_buf pages are used. e.g. a READDIR operation will encode a small amount of data in the xdr_buf head, and then the READDIR in the xdr_buf pages. So, the hard limit calculation use of pages by a session is underestimated by the number of cached operations using the xdr_buf pages. Yet another patch caches no pages for the solo sequence operation, or any compound where cache_this is False. So the hard limit calculation use of pages by a session is overestimated by the number of these operations in the cache. TODO: improve resource pre-allocation and negotiate session parameters accordingly. Respect and possibly adjust backchannel attributes. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> [nfsd41: remove headerpadsz from channel attributes] Our client and server only support a headerpadsz of 0. [nfsd41: use DRC limits in fore channel init] [nfsd41: do not change CREATE_SESSION back channel attrs] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [use sessionid_lock spin lock] [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41 remove sl_session from alloc_init_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [simplify nfsd4_encode_create_session error handling] [nfsd41: fix comment style in init_forechannel_attrs] [nfsd41: allocate struct nfsd4_session and slot table in one piece] [nfsd41: no need to INIT_LIST_HEAD in alloc_init_session just prior to list_add] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:18 -07:00
Andy Adamson	14778a133e	nfsd41: clear DRC cache on free_session Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:18 -07:00
Andy Adamson	da3846a286	nfsd41: nfsd DRC logic Replay a request in nfsd4_sequence. Add a minorversion to struct nfsd4_compound_state. Pass the current slot to nfs4svc_encode_compound res via struct nfsd4_compoundres to set an NFSv4.1 DRC entry. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use cstate session in nfs4svc_encode_compoundres] [nfsd41 replace nfsd4_set_cache_entry] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:17 -07:00
Andy Adamson	c3d06f9ce8	nfsd41: hard page limit for DRC Use no more than 1/128th of the number of free pages at nfsd startup for the v4.1 DRC. This is an arbitrary default which should probably end up under the control of an administrator. Signed-off-by: Andy Adamson <andros@netapp.com> [moved added fields in struct svc_serv under CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [fix set_max_drc calculation of sv_drc_max_pages] [moved NFSD_DRC_SIZE_SHIFT's declaration up in header file] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:17 -07:00
Andy Adamson	074fe89753	nfsd41: DRC save, restore, and clear functions Cache all the result pages, including the rpc header in rq_respages[0], for a request in the slot table cache entry. Cache the statp pointer from nfsd_dispatch which points into rq_respages[0] just past the rpc header. When setting a cache entry, calculate and save the length of the nfs data minus the rpc header for rq_respages[0]. When replaying a cache entry, replace the cached rpc header with the replayed request rpc result header, unless there is not enough room in the cached results first page. In that case, use the cached rpc header. The sessions fore channel maxresponse size cached is set to NFSD_PAGES_PER_SLOT * PAGE_SIZE. For compounds we are cacheing with operations such as READDIR that use the xdr_buf->pages to hold data, we choose to cache the extra page of data rather than copying data from xdr_buf->pages into the xdr_buf->head page. [nfsd41: limit cache to maxresponsesize_cached] [nfsd41: mv nfsd4_set_statp under CONFIG_NFSD_V4_1] [nfsd41: rename nfsd4_move_pages] [nfsd41: rename page_no variable] [nfsd41: rename nfsd4_set_cache_entry] [nfsd41: fix nfsd41_copy_replay_data comment] [nfsd41: add to nfsd4_set_cache_entry] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:17 -07:00
Andy Adamson	f9bb94c4c6	nfsd41: enforce NFS4ERR_SEQUENCE_POS operation order rules for minorversion != 0 only. Signed-off-by: Andy Adamson<andros@netapp.com> [nfsd41: do not verify nfserr_sequence_pos for minorversion 0] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:16 -07:00
Benny Halevy	b85d4c01b7	nfsd41: sequence operation Implement the sequence operation conforming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26 Check for stale clientid (as derived from the sessionid). Enforce slotid range and exactly-once semantics using the slotid and seqid. If everything went well renew the client lease and mark the slot INPROGRESS. Add a struct nfsd4_slot pointer to struct nfsd4_compound_state. To be used for sessions DRC replay. [nfsd41: rename sequence catchthis to cachethis] Signed-off-by: Andy Adamson<andros@netapp.com> [pulled some code to set cstate->slot from "nfsd DRC logic"] [use sessionid_lock spin lock] [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd: add a struct nfsd4_slot pointer to struct nfsd4_compound_state] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: add nfsd4_session pointer to nfsd4_compound_state] [nfsd41: set cstate session] [nfsd41: use cstate session in nfsd4_sequence] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [simplify nfsd4_encode_sequence error handling] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:16 -07:00
Andy Adamson	a1bcecd29c	nfsd41: match clientid establishment method We need to distinguish between client names provided by NFSv4.0 clients SETCLIENTID and those provided by NFSv4.1 via EXCHANGE_ID when looking up the clientid by string. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfsd41: use boolean values for use_exchange_id argument] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: simplify match_clientid_establishment logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:15 -07:00
Andy Adamson	0733d21338	nfsd41: exchange_id operation Implement the exchange_id operation confoming to http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-28 Based on the client provided name, hash a client id. If a confirmed one is found, compare the op's creds and verifier. If the creds match and the verifier is different then expire the old client (client re-incarnated), otherwise, if both match, assume it's a replay and ignore it. If an unconfirmed client is found, then copy the new creds and verifer if need update, otherwise assume replay. The client is moved to a confirmed state on create_session. In the nfs41 branch set the exchange_id flags to EXCHGID4_FLAG_USE_NON_PNFS \| EXCHGID4_FLAG_SUPP_MOVED_REFER (pNFS is not supported, Referrals are supported, Migration is not.). Address various scenarios from section 18.35 of the spec: 1. Check for EXCHGID4_FLAG_UPD_CONFIRMED_REC_A and set EXCHGID4_FLAG_CONFIRMED_R as appropriate. 2. Return error codes per 18.35.4 scenarios. 3. Update client records or generate new client ids depending on scenario. Note: 18.35.4 case 3 probably still needs revisiting. The handling seems not quite right. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamosn <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use utsname for major_id (and copy to server_scope)] [nfsd41: fix handling of various exchange id scenarios] Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: reverse use of EXCHGID4_INVAL_FLAG_MASK_A] [simplify nfsd4_encode_exchange_id error handling] [nfsd41: embed an xdr_netobj in nfsd4_exchange_id] [nfsd41: return nfserr_serverfault for spa_how == SP4_MACH_CRED] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:15 -07:00
Andy Adamson	069b6ad4bb	nfsd41: proc stubs Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:14 -07:00
Andy Adamson	2db134eb3b	nfsd41: xdr infrastructure Define nfsd41_dec_ops vector and add it to nfsd4_minorversion for minorversion 1. Note: nfsd4_enc_ops vector is shared for v4.0 and v4.1 since we don't need to filter out obsolete ops as this is done in the decoding phase. exchange_id, create_session, destroy_session, and sequence ops are implemented as stubs returning nfserr_opnotsupp at this stage. [was nfsd41: xdr stubs] [get rid of CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:14 -07:00
Marc Eshel	5282fd724b	nfsd41: sessionid hashing Simple sessionid hashing using its monotonically increasing sequence number. Locking considerations: sessionid_hashtbl access is controlled by the sessionid_lock spin lock. It must be taken for insert, delete, and lookup. nfsd4_sequence looks up the session id and if the session is found, it calls nfsd4_get_session (still under the sessionid_lock). nfsd4_destroy_session calls nfsd4_put_session after unhashing it, so when the session's kref reaches zero it's going to get freed. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [we don't use a prime for sessionid hash table size] [use sessionid_lock spin lock] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:14 -07:00
Marc Eshel	c4bf786806	nfsd41: release_session when client is expired Signed-off-by: Benny Halevy <bhalevy@panasas.com> [add CONFIG_NFSD_V4_1 to fix v4.0 regression bug] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:13 -07:00
Marc Eshel	9fb870702d	nfsd41: introduce nfs4_client cl_sessions list [get rid of CONFIG_NFSD_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:13 -07:00
Andy Adamson	7116ed6b99	nfsd41: sessions basic data types This patch provides basic data structures representing the nfs41 sessions and slots, plus helpers for keeping a reference count on the session and freeing it. Note that our server only support a headerpadsz of 0 and it ignores backchannel attributes at the moment. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: remove headerpadsz from channel attributes] [nfsd41: embed nfsd4_channel in nfsd4_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use bool inuse for slot state] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41 remove sl_session from nfsd4_slot] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:13 -07:00
Andy Adamson	2f425878b6	nfsd: don't use the deferral service, return NFS4ERR_DELAY On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be returned. It is up to the NFSv4.1 client to resend only the operations that have not been processed. Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in nfsd4_proc_compound() only when NFSv4.1 Sessions are used. Note: this isn't an adequate solution on its own. It's acceptable as a way to get some minimal 4.1 up and working, but we're going to have to find a way to avoid returning DELAY in all common cases before 4.1 can really be considered ready. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: reverse rq_nodeferral negative logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: initialize rq_usedeferral] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:12 -07:00
Linus Torvalds	8fe74cf053	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: Remove two unneeded exports and make two symbols static in fs/mpage.c Cleanup after commit `585d3bc06f` Trim includes of fdtable.h Don't crap into descriptor table in binfmt_som Trim includes in binfmt_elf Don't mess with descriptor table in load_elf_binary() Get rid of indirect include of fs_struct.h New helper - current_umask() check_unsafe_exec() doesn't care about signal handlers sharing New locking/refcounting for fs_struct Take fs_struct handling to new file (fs/fs_struct.c) Get rid of bumping fs_struct refcount in pivot_root(2) Kill unsharing fs_struct in __set_personality()	2009-04-02 21:09:10 -07:00
Trond Myklebust	cc85906110	Merge branch 'devel' into for-linus	2009-04-01 13:28:15 -04:00
Al Viro	3e93cd6718	Take fs_struct handling to new file (fs/fs_struct.c) Pure code move; two new helper functions for nfsd and daemonize (unshare_fs_struct() and daemonize_fs_struct() resp.; for now - the same code as used to be in callers). unshare_fs_struct() exported (for nfsd, as copy_fs_struct()/exit_fs() used to be), copy_fs_struct() and exit_fs() don't need exports anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-31 23:00:26 -04:00
Benny Halevy	2076601632	nfsd: remove nfsd4_ops array size There's no need for it. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-30 17:03:11 -04:00
Andy Adamson	e354d571bb	nfsd: embed nfsd4_current_state in nfsd4_compoundres Remove the allocation of struct nfsd4_compound_state. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-29 16:20:12 -04:00
Chuck Lever	49a9072f29	SUNRPC: Remove @family argument from svc_create() and svc_create_pooled() Since an RPC service listener's protocol family is specified now via svc_create_xprt(), it no longer needs to be passed to svc_create() or svc_create_pooled(). Remove that argument from the synopsis of those functions, and remove the sv_family field from the svc_serv struct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:48 -04:00
Chuck Lever	9652ada3fb	SUNRPC: Change svc_create_xprt() to take a @family argument The sv_family field is going away. Pass a protocol family argument to svc_create_xprt() instead of extracting the family from the passed-in svc_serv struct. Again, as this is a listener socket and not an address, we make this new argument an "int" protocol family, instead of an "sa_family_t." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:36 -04:00
Chuck Lever	adbbe92956	NFSD: If port value written to /proc/fs/nfsd/portlist is invalid, return EINVAL Make sure port value read from user space by write_ports is valid before passing it to svc_find_xprt(). If it wasn't, the writer would get ENOENT instead of EINVAL. Noticed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:53:42 -04:00
Linus Torvalds	2c9e15a011	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6: (27 commits) ext2: Zero our b_size in ext2_quota_read() trivial: fix typos/grammar errors in fs/Kconfig quota: Coding style fixes quota: Remove superfluous inlines quota: Remove uppercase aliases for quota functions. nfsd: Use lowercase names of quota functions jfs: Use lowercase names of quota functions udf: Use lowercase names of quota functions ufs: Use lowercase names of quota functions reiserfs: Use lowercase names of quota functions ext4: Use lowercase names of quota functions ext3: Use lowercase names of quota functions ext2: Use lowercase names of quota functions ramfs: Remove quota call vfs: Use lowercase names of quota functions quota: Remove dqbuf_t and other cleanups quota: Remove NODQUOT macro quota: Make global quota locks cacheline aligned quota: Move quota files into separate directory ext4: quota reservation for delayed allocation ...	2009-03-27 14:48:34 -07:00
Linus Torvalds	8e9d208972	Merge branch 'bkl-removal' of git://git.lwn.net/linux-2.6 * 'bkl-removal' of git://git.lwn.net/linux-2.6: Rationalize fasync return values Move FASYNC bit handling to f_op->fasync() Use f_lock to protect f_flags Rename struct file->f_ep_lock	2009-03-26 16:14:02 -07:00
Jan Kara	90c0af05a5	nfsd: Use lowercase names of quota functions Use lowercase names of quota functions instead of old uppercase ones. CC: bfields@fieldses.org CC: neilb@suse.de Signed-off-by: Jan Kara <jack@suse.cz>	2009-03-26 02:18:37 +01:00
Sachin S. Prabhu	0953e620de	Inconsistent setattr behaviour There is an inconsistency seen in the behaviour of nfs compared to other local filesystems on linux when changing owner or group of a directory. If the directory has SUID/SGID flags set, on changing owner or group on the directory, the flags are stripped off on nfs. These flags are maintained on other filesystems such as ext3. To reproduce on a nfs share or local filesystem, run the following commands mkdir test; chmod +s+g test; chown user1 test; ls -ld test On the nfs share, the flags are stripped and the output seen is drwxr-xr-x 2 user1 root 4096 Feb 23 2009 test On other local filesystems(ex: ext3), the flags are not stripped and the output seen is drwsr-sr-x 2 user1 root 4096 Feb 23 13:57 test chown_common() called from sys_chown() will only strip the flags if the inode is not a directory. static int chown_common(struct dentry * dentry, uid_t user, gid_t group) { .. if (!S_ISDIR(inode->i_mode)) newattrs.ia_valid \|= ATTR_KILL_SUID \| ATTR_KILL_SGID \| ATTR_KILL_PRIV; .. } See: http://www.opengroup.org/onlinepubs/7990989775/xsh/chown.html "If the path argument refers to a regular file, the set-user-ID (S_ISUID) and set-group-ID (S_ISGID) bits of the file mode are cleared upon successful return from chown(), unless the call is made by a process with appropriate privileges, in which case it is implementation-dependent whether these bits are altered. If chown() is successfully invoked on a file that is not a regular file, these bits may be cleared. These bits are defined in <sys/stat.h>." The behaviour as it stands does not appear to violate POSIX. However the actions performed are inconsistent when comparing ext3 and nfs. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:59:37 -04:00
J. Bruce Fields	026722c25e	nfsd4: don't check ip address in setclientid The spec allows clients to change ip address, so we shouldn't be requiring that setclientid always come from the same address. For example, a client could reboot and get a new dhcpd address, but still present the same clientid to the server. In that case the server should revoke the client's previous state and allow it to continue, instead of (as it currently does) returning a CLID_INUSE error. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:42 -04:00
Greg Banks	03cf6c9f49	knfsd: add file to export stats about nfsd pools Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void )1) merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_()" by Harshula Jayasuriya. merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:42 -04:00
Greg Banks	8bbfa9f388	knfsd: remove the nfsd thread busy histogram Stop gathering the data that feeds the 'th' line in /proc/net/rpc/nfsd because the questionable data provided is not worth the scalability impact of calculating it. Instead, always report zeroes. The current approach suffers from three major issues: 1. update_thread_usage() increments buckets by call service time or call arrival time...in jiffies. On lightly loaded machines, call service times are usually < 1 jiffy; on heavily loaded machines call arrival times will be << 1 jiffy. So a large portion of the updates to the buckets are rounded down to zero, and the histogram is undercounting. 2. As seen previously on the nfs mailing list, the format in which the histogram is presented is cryptic, difficult to explain, and difficult to use. 3. Updating the histogram requires taking a global spinlock and dirtying the global variables nfsd_last_call, nfsd_busy, and nfsdstats twice on every RPC call, which is a significant scaling limitation. Testing on a 4 CPU 4 NIC Altix using 4 IRIX clients each doing 1K streaming reads at full line rate, shows the stats update code (inlined into nfsd()) takes about 1.7% of each CPU. This patch drops the contribution from nfsd() into the profile noise. This patch is a forward-ported version of knfsd-remove-nfsd-threadstats which has been shipping in the SGI "Enhanced NFS" product since 2006. In that time, exactly one customer has noticed that the threadstats were missing. It has been previously posted: http://article.gmane.org/gmane.linux.nfs/10376 and more recently requested to be posted again. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:41 -04:00
J. Bruce Fields	5cb031b0af	nfsd4: remove redundant check from nfsd4_open Note that we already checked for this invalid case at the top of this function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:41 -04:00
J. Bruce Fields	05f4f678b0	nfsd4: don't do lookup within readdir in recovery code The main nfsd code was recently modified to no longer do lookups from withing the readdir callback, to avoid locking problems on certain filesystems. This (rather hacky, and overdue for replacement) NFSv4 recovery code has the same problem. Fix it to build up a list of names (instead of dentries) and do the lookups afterwards. Reported symptoms were a deadlock in the xfs code (called from nfsd4_recdir_load), with /var/lib/nfs on xfs. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reported-by: David Warren <warren@atmos.washington.edu>	2009-03-18 17:38:40 -04:00
J. Bruce Fields	a1c8c4d1ff	nfsd4: support putpubfh operation Currently putpubfh returns NFSERR_OPNOTSUPP, which isn't actually allowed for v4. The right error is probably NFSERR_NOTSUPP. But let's just implement it; though rarely seen, it can be used by Solaris (with a special mount option), is mandated by the rfc, and is trivial for us to support. Thanks to Yang Hongyang for pointing out the original problem, and to Mike Eisler, Tom Talpey, Trond Myklebust, and Dave Noveck for further argument.... Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:40 -04:00
David Shaw	31dec2538e	Short write in nfsd becomes a full write to the client If a filesystem being written to via NFS returns a short write count (as opposed to an error) to nfsd, nfsd treats that as a success for the entire write, rather than the short count that actually succeeded. For example, given a 8192 byte write, if the underlying filesystem only writes 4096 bytes, nfsd will ack back to the nfs client that all 8192 bytes were written. The nfs client does have retry logic for short writes, but this is never called as the client is told the complete write succeeded. There are probably other ways it could happen, but in my case it happened with a fuse (filesystem in userspace) filesystem which can rather easily have a partial write. Here is a patch to properly return the short write count to the client. Signed-off-by: David Shaw <dshaw@jabberwocky.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:40 -04:00
Benny Halevy	1e685ec270	NFSD: return nfsv4 error code nfserr_notsupp rather than nfsv[23]'s nfserr_opnotsupp Thanks for Bill Baker at sun.com for catching this at Connectathon 2009. This bug was introduced in 2.6.27 Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:39 -04:00

... 23 24 25 26 27 ...

3046 Commits