Commit Graph

852 Commits

Author SHA1 Message Date
J. Bruce Fields
27b11428b7 nfsd4: warn on finding lockowner without stateid's
The current code assumes a one-to-one lockowner<->lock stateid
correspondance.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-21 11:11:21 -04:00
J. Bruce Fields
a1b8ff4c97 nfsd4: remove lockowner when removing lock stateid
The nfsv4 state code has always assumed a one-to-one correspondance
between lock stateid's and lockowners even if it appears not to in some
places.

We may actually change that, but for now when FREE_STATEID releases a
lock stateid it also needs to release the parent lockowner.

Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
calls same_lockowner_ino on a lockowner that unexpectedly has an empty
so_stateids list.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-21 11:11:21 -04:00
Kinglong Mee
9fa1959e97 NFSD: Get rid of empty function nfs4_state_init
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-08 14:59:52 -04:00
Trond Myklebust
14bcab1a39 NFSd: Clean up nfs4_preprocess_stateid_op
Move the state locking and file descriptor reference out from the
callers and into nfs4_preprocess_stateid_op() itself.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-07 11:05:48 -04:00
Trond Myklebust
50cc62317d NFSd: Mark nfs4_free_lockowner and nfs4_free_openowner as static functions
They do not need to be used outside fs/nfsd/nfs4state.c

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-06 17:54:57 -04:00
Trond Myklebust
4dd86e150f NFSd: Remove 'inline' designation for free_client()
It is large, it is used in more than one place, and it is not performance
critical. Let gcc figure out whether it should be inlined...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-06 17:54:53 -04:00
Trond Myklebust
4cb57e3032 NFSd: call rpc_destroy_wait_queue() from free_client()
Mainly to ensure that we don't leave any hanging timers.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-06 12:38:49 -04:00
Trond Myklebust
5694c93e6c NFSd: Move default initialisers from create_client() to alloc_client()
Aside from making it clearer what is non-trivial in create_client(), it
also fixes a bug whereby we can call free_client() before idr_init()
has been called.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-06 12:38:46 -04:00
Trond Myklebust
a8a7c6776f nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+
RFC5661 obsoletes NFS4ERR_STALE_STATEID in favour of NFS4ERR_BAD_STATEID.

Note that because nfsd encodes the clientid boot time in the stateid, we
can hit this error case in certain scenarios where the Linux client
state management thread exits early, before it has finished recovering
all state.

Reported-by: Idan Kedar <idank@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-30 10:47:33 -04:00
J. Bruce Fields
0da7b19cc3 nfsd4: minor nfsd4_replay_cache_entry cleanup
Maybe this is comment true, who cares?  Handle this like any other
error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 21:24:52 -04:00
J. Bruce Fields
3ca2eb9814 nfsd4: nfsd4_replay_cache_entry should be static
This isn't actually used anywhere else.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 21:24:51 -04:00
J. Bruce Fields
067e1ace46 nfsd4: update comments with obsolete function name
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 21:24:50 -04:00
Kinglong Mee
3f42d2c428 NFSD: Using free_conn free connection
Connection from alloc_conn must be freed through free_conn,
otherwise, the reference of svc_xprt will never be put.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 21:23:40 -04:00
Kinglong Mee
2b90563598 NFSD: Traverse unconfirmed client through hash-table
When stopping nfsd, I got BUG messages, and soft lockup messages,
The problem is cuased by double rb_erase() in nfs4_state_destroy_net()
and destroy_client().

This patch just let nfsd traversing unconfirmed client through
hash-table instead of rbtree.

[ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at
          (null)
[ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0
[ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp
llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw
e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc
snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi
[last unloaded: nfsd]
[ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF          O
3.14.0-rc8+ #2
[ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti:
ffff8800797f6000
[ 2325.022982] RIP: 0010:[<ffffffff8133c18c>]  [<ffffffff8133c18c>]
rb_erase+0x14c/0x390
[ 2325.022982] RSP: 0018:ffff8800797f7d98  EFLAGS: 00010246
[ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX:
0000000000000000
[ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI:
ffff880079f4c810
[ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09:
ffff88007964fc70
[ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12:
ffff880079f4c800
[ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15:
ffff880079f4c860
[ 2325.022982] FS:  0000000000000000(0000) GS:ffff88007f900000(0000)
knlGS:0000000000000000
[ 2325.022982] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4:
00000000000006e0
[ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 2325.022982] Stack:
[ 2325.022982]  ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8
ffff8800797f7da8
[ 2325.022982]  ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0
ffff880079c1f010
[ 2325.022982]  ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2
ffff8800797f7df0
[ 2325.022982] Call Trace:
[ 2325.022982]  [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd]
[ 2325.022982]  [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220
[nfsd]
[ 2325.022982]  [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[ 2325.022982]  [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[ 2325.022982]  [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[ 2325.022982]  [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd]
[ 2325.022982]  [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd]
[ 2325.022982]  [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 2325.022982]  [<ffffffff810a8232>] kthread+0xd2/0xf0
[ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10
0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b
50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01
[ 2325.022982] RIP  [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982]  RSP <ffff8800797f7d98>
[ 2325.022982] CR2: 0000000000000000
[ 2325.022982] ---[ end trace 28c27ed011655e57 ]---

[  228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558]
[  228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211
xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0
snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev
parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore
i2c_core ata_generic pata_acpi
[  228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF          O
3.14.0-rc8+ #2
[  228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti:
ffff880074616000
[  228.064539] RIP: 0010:[<ffffffff8133ba17>]  [<ffffffff8133ba17>]
rb_next+0x27/0x50
[  228.064539] RSP: 0018:ffff880074617de0  EFLAGS: 00000282
[  228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX:
0000000000000014
[  228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI:
ffff880074478010
[  228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09:
0000000000000012
[  228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12:
ffffea0001d11a00
[  228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15:
ffff880074617d50
[  228.064539] FS:  0000000000000000(0000) GS:ffff88007f800000(0000)
knlGS:0000000000000000
[  228.064539] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4:
00000000000006f0
[  228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  228.064539] Stack:
[  228.064539]  ffff880074617e28 ffffffffa01ab7db ffff880074617df0
ffff880074617df0
[  228.064539]  ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0
0000000000000000
[  228.064539]  0000000000000000 ffff880074617e48 ffffffffa01840b8
ffffffff81cc26c0
[  228.064539] Call Trace:
[  228.064539]  [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220
[nfsd]
[  228.064539]  [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[  228.064539]  [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[  228.064539]  [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[  228.064539]  [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd]
[  228.064539]  [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd]
[  228.064539]  [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[  228.064539]  [<ffffffff810a8232>] kthread+0xd2/0xf0
[  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[  228.064539]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[  228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48
8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b
50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7

Fixes: ac55fdc408 (nfsd: move the confirmed and unconfirmed hlists...)
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 10:41:40 -04:00
Ming Chen
ed47b062ce nfsd: consider CLAIM_FH when handing out delegation
CLAIM_FH was added by NFSv4.1.  It is the same as CLAIM_NULL except that it
uses only current FH to identify the file to be opened.

The NFS client is using CLAIM_FH if the FH is available when opening a file.
Currently, we cannot get any delegation if we stat a file before open it
because the server delegation code does not recognize CLAIM_FH.

We tested this patch and found delegation can be handed out now when claim is
CLAIM_FH.

See http://marc.info/?l=linux-nfs&m=136369847801388&w=2 and
http://www.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues#New_open_claim_types

Signed-off-by: Ming Chen <mchen@cs.stonybrook.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-27 18:02:42 -05:00
J. Bruce Fields
e873088f29 nfsd4: minor nfs4_setlease cleanup
As far as I can tell, this list is used only under the state lock, so we
may as well do this in the simpler order.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-27 13:59:15 -05:00
Kinglong Mee
60810e5489 NFSD: Fix a memory leak in nfsd4_create_session
If failed after calling alloc_session but before init_session, nfsd will call __free_session to
free se_slots in session. But, session->se_fchannel.maxreqs is not initialized (value is zero).
So that, the memory malloced for slots will be lost in free_session_slots for maxreqs is zero.

This path sets the information for channel in alloc_session after mallocing slots succeed,
instead in init_session.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 15:33:54 -05:00
Kinglong Mee
8a891633b8 NFSD: fix bad length checking for backchannel
the length for backchannel checking should be multiplied by sizeof(__be32).

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:48 -05:00
Kinglong Mee
f403e450e8 NFSD: fix a leak which can cause CREATE_SESSION failures
check_forechannel_attrs gets drc memory, so nfsd must put it when
check_backchannel_attrs fails.

After many requests with bad back channel attrs, nfsd will deny any
client's CREATE_SESSION forever.

A new test case named CSESS29 for pynfs will send in another mail.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:47 -05:00
Kinglong Mee
2ce02b6b6c Add missing recording of back channel attrs in nfsd4_session
commit 5b6feee960 forgot
recording the back channel attrs in nfsd4_session.

nfsd just check the back channel attars by check_backchannel_attrs,
but do not  record it in nfsd4_session in the latest kernel.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:46 -05:00
Linus Torvalds
449bf8d03c Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from Bruce Fields:
 "This includes miscellaneous bugfixes and cleanup and a performance fix
  for write-heavy NFSv4 workloads.

  (The most significant nfsd-relevant change this time is actually in
  the delegation patches that went through Viro, fixing a long-standing
  bug that can cause NFSv4 clients to miss updates made by non-nfs users
  of the filesystem.  Those enable some followup nfsd patches which I
  have queued locally, but those can wait till 3.14)"

* 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (24 commits)
  nfsd: export proper maximum file size to the client
  nfsd4: improve write performance with better sendspace reservations
  svcrpc: remove an unnecessary assignment
  sunrpc: comment typo fix
  Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation"
  nfsd4: fix discarded security labels on setattr
  NFSD: Add support for NFS v4.2 operation checking
  nfsd4: nfsd_shutdown_net needs state lock
  NFSD: Combine decode operations for v4 and v4.1
  nfsd: -EINVAL on invalid anonuid/gid instead of silent failure
  nfsd: return better errors to exportfs
  nfsd: fh_update should error out in unexpected cases
  nfsd4: need to destroy revoked delegations in destroy_client
  nfsd: no need to unhash_stid before free
  nfsd: remove_stid can be incorporated into nfs4_put_delegation
  nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid
  nfsd: nfs4_free_stid
  nfsd: fix Kconfig syntax
  sunrpc: trim off EC bytes in GSSAPI v2 unwrap
  gss_krb5: document that we ignore sequence number
  ...
2013-11-16 12:04:02 -08:00
J. Bruce Fields
617588d518 locks: introduce new FL_DELEG lock flag
For now FL_DELEG is just a synonym for FL_LEASE.  So this patch doesn't
change behavior.

Next we'll modify break_lease to treat FL_DELEG leases differently, to
account for the fact that NFSv4 delegations should be broken in more
situations than Windows oplocks.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-09 00:16:41 -05:00
J. Bruce Fields
b78800baee Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation"
This reverts commit 7ebe40f203.  We forgot
the nfs4_put_delegation call in fs/nfsd/nfs4callback.c which should not
be unhashing the stateid.  This lead to warnings from the idr code when
we tried to removed id's twice.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-11-04 17:46:50 -05:00
J. Bruce Fields
e50a26dc78 nfsd4: nfsd_shutdown_net needs state lock
A comment claims the caller should take it, but that's not being done.
Note we don't want it around the cancel_delayed_work_sync since that may
wait on work which holds the client lock.

Reported-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-30 10:35:59 -04:00
Benny Halevy
956c4fee44 nfsd4: need to destroy revoked delegations in destroy_client
[use list_splice_init]
Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
[bfields: no need for recall_lock here]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-29 12:00:48 -04:00
Benny Halevy
01a87d91fc nfsd: no need to unhash_stid before free
idr_remove is about to be called before kmem_cache_free so unhashing it
is redundant

Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-29 10:03:27 -04:00
Benny Halevy
7ebe40f203 nfsd: remove_stid can be incorporated into nfs4_put_delegation
All calls to nfs4_put_delegation are preceded with remove_stid.

Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-28 15:58:32 -04:00
Benny Halevy
5d7dab83e3 nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid
In the out_free: path, the newly allocated stid must be removed rather
than unhashed so it can never be found.

Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-28 15:58:20 -04:00
Benny Halevy
9857df815f nfsd: nfs4_free_stid
Make it symmetric to nfs4_alloc_stid.

Signed-off-by: Benny Halevy <bhalevy@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-10-28 15:43:06 -04:00
Al Viro
a6a9f18f0a nfsd: switch to %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-10-24 23:34:51 -04:00
Al Viro
97e47fa11d nfsd: switch to %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-10-02 16:18:24 -04:00
J. Bruce Fields
bf7bd3e98b nfsd4: fix leak of inode reference on delegation failure
This fixes a regression from 68a3396178
"nfsd4: shut down more of delegation earlier".

After that commit, nfs4_set_delegation() failures result in
nfs4_put_delegation being called, but nfs4_put_delegation doesn't free
the nfs4_file that has already been set by alloc_init_deleg().

This can result in an oops on later unmounting the exported filesystem.

Note also delaying the fi_had_conflict check we're able to return a
better error (hence give 4.1 clients a better idea why the delegation
failed; though note CONFLICT isn't an exact match here, as that's
supposed to indicate a current conflict, but all we know here is that
there was one recently).

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-30 17:30:52 -04:00
J. Bruce Fields
3477565e6a Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR"
This reverts commit df66e75395.

nfsd4_lock can get a read-only or write-only reference when only a
read-write open is available.  This is normal.

Cc: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-30 17:30:45 -04:00
J. Bruce Fields
b8297cec2d Linux 3.11-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJSCDSjAAoJEHm+PkMAQRiGDXMIAI7Loae0Oqb1eoeJkvjyZsBS
 OJDeeEcn+k58VbxVHyRdc7hGo4yI4tUZm172SpnOaM8sZ/ehPU7zBrwJK2lzX334
 /jAM3uvVPfxA2nu0I4paNpkED/NQ8NRRsYE1iTE8dzHXOH6dA3mgp5qfco50rQvx
 rvseXpME4KIAJEq4jnyFZF5+nuHiPueM9JftPmSSmJJ3/KY9kY1LESovyWd7ttg1
 jYSVPFal9J0E+tl2UQY5g9H16GqhhjYn+39Iei6Q5P4bL4ZubQgTRQTN9nyDc06Z
 ezQtGoqZ8kEz/2SyRlkda6PzjSEhgXlc8mCL5J7AW+dMhTHHx2IrosjiCA80kG8=
 =c0rK
 -----END PGP SIGNATURE-----

Merge tag 'v3.11-rc5' into for-3.12 branch

For testing purposes I want some nfs and nfsd bugfixes (specifically,
58cd57bfd9 and previous nfsd patches, and
Trond's 4f3cc4809a).
2013-08-30 16:42:49 -04:00
J. Bruce Fields
c47205914c nfsd4: Fix MACH_CRED NULL dereference
Fixes a NULL-dereference on attempts to use MACH_CRED protection over
auth_sys.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-07 12:05:51 -04:00
J. Bruce Fields
b1948a641d nfsd4: fix setlease error return
This actually makes a difference in the 4.1 case, since we use the
status to decide what reason to give the client for the delegation
refusal (see nfsd4_open_deleg_none_ext), and in theory a client might
choose suboptimal behavior if we give the wrong answer.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-26 17:02:07 -04:00
Harshula Jayasuriya
df66e75395 nfsd: nfs4_file_get_access: need to be more careful with O_RDWR
If fi_fds = {non-NULL, NULL, non-NULL} and oflag = O_WRONLY
the WARN_ON_ONCE(!(fp->fi_fds[oflag] || fp->fi_fds[O_RDWR]))
doesn't trigger when it should.

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-23 12:26:51 -04:00
Linus Torvalds
0ff08ba5d0 Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from Bruce Fields:
 "Changes this time include:

   - 4.1 enabled on the server by default: the last 4.1-specific issues
     I know of are fixed, so we're not going to find the rest of the
     bugs without more exposure.
   - Experimental support for NFSv4.2 MAC Labeling (to allow running
     selinux over NFS), from Dave Quigley.
   - Fixes for some delicate cache/upcall races that could cause rare
     server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
     debugging persistence.
   - Fixes for some bugs found at the recent NFS bakeathon, mostly v4
     and v4.1-specific, but also a generic bug handling fragmented rpc
     calls"

* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: support minorversion 1 by default
  nfsd4: allow destroy_session over destroyed session
  svcrpc: fix failures to handle -1 uid's
  sunrpc: Don't schedule an upcall on a replaced cache entry.
  net/sunrpc: xpt_auth_cache should be ignored when expired.
  sunrpc/cache: ensure items removed from cache do not have pending upcalls.
  sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
  sunrpc/cache: remove races with queuing an upcall.
  nfsd4: return delegation immediately if lease fails
  nfsd4: do not throw away 4.1 lock state on last unlock
  nfsd4: delegation-based open reclaims should bypass permissions
  svcrpc: don't error out on small tcp fragment
  svcrpc: fix handling of too-short rpc's
  nfsd4: minor read_buf cleanup
  nfsd4: fix decoding of compounds across page boundaries
  nfsd4: clean up nfs4_open_delegation
  NFSD: Don't give out read delegations on creates
  nfsd4: allow client to send no cb_sec flavors
  nfsd4: fail attempts to request gss on the backchannel
  nfsd4: implement minimal SP4_MACH_CRED
  ...
2013-07-11 10:17:13 -07:00
J. Bruce Fields
f0f51f5cdd nfsd4: allow destroy_session over destroyed session
RFC 5661 allows a client to destroy a session using a compound
associated with the destroyed session, as long as the DESTROY_SESSION op
is the last op of the compound.

We attempt to allow this, but testing against a Solaris client (which
does destroy sessions in this way) showed that we were failing the
DESTROY_SESSION with NFS4ERR_DELAY, because we assumed the reference
count on the session (held by us) represented another rpc in progress
over this session.

Fix this by noting that in this case the expected reference count is 1,
not 0.

Also, note as long as the session holds a reference to the compound
we're destroying, we can't free it here--instead, delay the free till
the final put in nfs4svc_encode_compoundres.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08 19:46:38 -04:00
J. Bruce Fields
d08d32e6e5 nfsd4: return delegation immediately if lease fails
This case shouldn't happen--the administrator shouldn't really allow
other applications access to the export until clients have had the
chance to reclaim their state--but if it does then we should set the
"return this lease immediately" bit on the reply.  That still leaves
some small races, but it's the best the protocol allows us to do in the
case a lease is ripped out from under us....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:07 -04:00
J. Bruce Fields
0a262ffb75 nfsd4: do not throw away 4.1 lock state on last unlock
This reverts commit eb2099f31b "nfsd4:
release lockowners on last unlock in 4.1 case".  Trond identified
language in rfc 5661 section 8.2.4 which forbids this behavior:

	Stateids associated with byte-range locks are an exception.
	They remain valid even if a LOCKU frees all remaining locks, so
	long as the open file with which they are associated remains
	open, unless the client frees the stateids via the FREE_STATEID
	operation.

And bakeathon 2013 testing found a 4.1 freebsd client was getting an
incorrect BAD_STATEID return from a FREE_STATEID in the above situation
and then failing.

The spec language honestly was probably a mistake but at this point with
implementations already following it we're probably stuck with that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:06 -04:00
J. Bruce Fields
99c415156c nfsd4: clean up nfs4_open_delegation
The nfs4_open_delegation logic is unecessarily baroque.

Also stop pretending we support write delegations in several places.

Some day we will support write delegations, but when that happens adding
back in these flag parameters will be the easy part.  For now they're
just confusing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
Steve Dickson
9a0590aec3 NFSD: Don't give out read delegations on creates
When an exclusive create is done with the mode bits
set (aka open(testfile, O_CREAT | O_EXCL, 0777)) this
causes a OPEN op followed by a SETATTR op. When a
read delegation is given in the OPEN, it causes
the SETATTR to delay with EAGAIN until the
delegation is recalled.

This patch caused exclusive creates to give out
a write delegation (which turn into no delegation)
which allows the SETATTR seamlessly succeed.

Signed-off-by: Steve Dickson <steved@redhat.com>
[bfields: do this for any CREATE, not just exclusive; comment]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:07 -04:00
J. Bruce Fields
b78724b705 nfsd4: fail attempts to request gss on the backchannel
We don't support gss on the backchannel.  We should state that fact up
front rather than just letting things continue and later making the
client try to figure out why the backchannel isn't working.

Trond suggested instead returning NFS4ERR_NOENT.  I think it would be
tricky for the client to distinguish between the case "I don't support
gss on the backchannel" and "I can't find that in my cache, please
create another context and try that instead", and I'd prefer something
that currently doesn't have any other meaning for this operation, hence
the (somewhat arbitrary) NFS4ERR_ENCR_ALG_UNSUPP.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
57266a6e91 nfsd4: implement minimal SP4_MACH_CRED
Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring
the client-provided spo_must_* arrays and just enforcing credential
checks for the minimum required operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
J. Bruce Fields
0dc1531aca svcrpc: store gss mech in svc_cred
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:23:06 -04:00
Jeff Layton
1c8c601a8c locks: protect most of the file_lock handling with i_lock
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.

->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.

Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.

For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the  blocked_list is done without dropping the
lock in between.

On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.

With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.

Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:57:42 +04:00
Jim Rees
1a9357f443 nfsd: avoid undefined signed overflow
In C, signed integer overflow results in undefined behavior, but unsigned
overflow wraps around. So do the subtraction first, then cast to signed.

Reported-by: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-21 11:02:03 -04:00
J. Bruce Fields
4f540e29dc nfsd4: store correct client minorversion for >=4.2
This code assumes that any client using exchange_id is using NFSv4.1,
but with the introduction of 4.2 that will no longer true.

This main effect of this is that client callbacks will use the same
minorversion as that used on the exchange_id.

Note that clients are forbidden from mixing 4.1 and 4.2 compounds.  (See
rfc 5661, section 2.7, #13: "A client MUST NOT attempt to use a stateid,
filehandle, or similar returned object from the COMPOUND procedure with
minor version X for another COMPOUND procedure with minor version Y,
where X != Y.")  However, we do not currently attempt to enforce this
except in the case of mixing zero minor version with non-zero minor
versions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-13 10:11:45 -04:00
Linus Torvalds
1db772216f Merge branch 'for-3.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Highlights include:

   - Some more DRC cleanup and performance work from Jeff Layton

   - A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
     server using credentials from Active Directory often fail due to
     limitations of the svcgssd upcall interface.  This replacement
     lifts those limitations.  The existing upcall is still supported
     for backwards compatibility.

   - More NFSv4.1 support: at this point, if a user with a current
     client who upgrades from 4.0 to 4.1 should see no regressions.  In
     theory we do everything a 4.1 server is required to do.  Patches
     for a couple minor exceptions are ready for 3.11, and with those
     and some more testing I'd like to turn 4.1 on by default in 3.11."

Fix up semantic conflict as per Stephen Rothwell and linux-next:

Commit 030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication") adds two new users of "PDE(inode)->data", but we're
supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad
("procfs: new helper - PDE_DATA(inode)").

The old PDE() macro is no longer available since commit c30480b92c
("proc: Make the PROC_I() and PDE() macros internal to procfs")

* 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
  NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
  NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
  nfsd: make symbol nfsd_reply_cache_shrinker static
  svcauth_gss: fix error return code in rsc_parse()
  nfsd4: don't remap EISDIR errors in rename
  svcrpc: fix gss-proxy to respect user namespaces
  SUNRPC: gssp_procedures[] can be static
  SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
  nfsd4: better error return to indicate SSV non-support
  nfsd: fix EXDEV checking in rename
  SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
  SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
  SUNRPC: conditionally return endtime from import_sec_context
  SUNRPC: allow disabling idle timeout
  SUNRPC: attempt AF_LOCAL connect on setup
  nfsd: Decode and send 64bit time values
  nfsd4: put_client_renew_locked can be static
  nfsd4: remove unused macro
  nfsd4: remove some useless code
  nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
  ...
2013-05-03 10:59:39 -07:00
Jeff Layton
398c33aaa4 nfsd: convert nfs4_alloc_stid() to use idr_alloc_cyclic()
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 18:28:41 -07:00
J. Bruce Fields
dd30333cf5 nfsd4: better error return to indicate SSV non-support
As 4.1 becomes less experimental and SSV still isn't implemented, we
have to admit it's not going to be, and return some sensible error
rather than just saying "our server's broken".  Discussion in the ietf
group hasn't turned up any objections to using NFS4ERR_ENC_ALG_UNSUPP
for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-26 16:18:15 -04:00
Fengguang Wu
ba138435d1 nfsd4: put_client_renew_locked can be static
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 22:15:00 -04:00
fanchaoting
53584f6652 nfsd4: remove some useless code
The "list_empty(&oo->oo_owner.so_stateids)" is aways true, so remove it.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:31 -04:00
J. Bruce Fields
3bd64a5ba1 nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
A 4.1 server must notify a client that has had any state revoked using
the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag.  The client can figure
out exactly which state is the problem using CHECK_STATEID and then free
it using FREE_STATEID.  The status flag will be unset once all such
revoked stateids are freed.

Our server's only recallable state is delegations.  So we keep with each
4.1 client a list of delegations that have timed out and been recalled,
but haven't yet been freed by FREE_STATEID.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-16 10:59:30 -04:00
J. Bruce Fields
23340032e6 nfsd4: clean up validate_stateid
The logic here is better expressed with a switch statement.

While we're here, CLOSED stateids (or stateids of an unkown type--which
would indicate a server bug) should probably return nfserr_bad_stateid,
though this behavior shouldn't affect any non-buggy client.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 17:42:28 -04:00
J. Bruce Fields
06b332a522 nfsd4: check backchannel attributes on create_session
Make sure the client gives us an adequate backchannel.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:53:56 -04:00
J. Bruce Fields
55c760cfc4 nfsd4: fix forechannel attribute negotiation
Negotiation of the 4.1 session forechannel attributes is a mess.  Fix:

	- Move it all into check_forechannel_attrs instead of spreading
	  it between that, alloc_session, and init_forechannel_attrs.
	- set a minimum "slotsize" so that our drc memory limits apply
	  even for small maxresponsesize_cached.  This also fixes some
	  bugs when slotsize becomes <= 0.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 16:43:44 -04:00
J. Bruce Fields
373cd4098a nfsd4: cleanup check_forechannel_attrs
Pass this struct by reference, not by value, and return an error instead
of a boolean to allow for future additions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 15:49:50 -04:00
J. Bruce Fields
0c7c3e67ab nfsd4: don't close read-write opens too soon
Don't actually close any opens until we don't need them at all.

This means being left with write access when it's not really necessary,
but that's better than putting a file that might still have posix locks
held on it, as we have been.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:57 -04:00
J. Bruce Fields
eb2099f31b nfsd4: release lockowners on last unlock in 4.1 case
In the 4.1 case we're supposed to release lockowners as soon as they're
no longer used.

It would probably be more efficient to reference count them, but that's
slightly fiddly due to the need to have callbacks from locks.c to take
into account lock merging and splitting.

For most cases just scanning the inode's lock list on unlock for
matching locks will be sufficient.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:56 -04:00
J. Bruce Fields
3d74e6a5b6 nfsd4: no need for replay_owner in sessions case
The replay_owner will never be used in the sessions case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:55 -04:00
J. Bruce Fields
c383747ef6 nfsd4: remove some redundant comments
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:54 -04:00
Wei Yongjun
2c44a23471 nfsd: use kmem_cache_free() instead of kfree()
memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:47 -04:00
J. Bruce Fields
9411b1d4c7 nfsd4: cleanup handling of nfsv4.0 closed stateid's
Closed stateid's are kept around a little while to handle close replays
in the 4.0 case.  So we stash them in the last-used stateid in the
oo_last_closed_stateid field of the open owner.  We can free that in
encode_seqid_op_tail once the seqid on the open owner is next
incremented.  But we don't want to do that on the close itself; so we
set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
first time through encode_seqid_op_tail, then when we see that flag set
next time we free it.

This is unnecessarily baroque.

Instead, just move the logic that increments the seqid out of the xdr
code and into the operation code itself.

The justification given for the current placement is that we need to
wait till the last minute to be sure we know whether the status is a
sequence-id-mutating error or not, but examination of the code shows
that can't actually happen.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Tested-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-08 09:55:32 -04:00
J. Bruce Fields
41d22663cb nfsd4: remove unused nfs4_check_deleg argument
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:17 -04:00
J. Bruce Fields
e8c69d17d1 nfsd4: make del_recall_lru per-network-namespace
If nothing else this simplifies the nfs4_state_shutdown_net logic a tad.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:16 -04:00
J. Bruce Fields
68a3396178 nfsd4: shut down more of delegation earlier
Once we've unhashed the delegation, it's only hanging around for the
benefit of an oustanding recall, which only needs the encoded
filehandle, stateid, and dl_retries counter.  No point keeping the file
around any longer, or keeping it hashed.

This also fixes a race: calls to idr_remove should really be serialized
by the caller, but the nfs4_put_delegation call from the callback code
isn't taking the state lock.

(Better might be to cancel the callback before destroying the
delegation, and remove any need for reference counting--but I don't see
an easy way to cancel an rpc call.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-04 13:25:15 -04:00
Jeff Layton
89876f8c0d nfsd: convert the file_hashtbl to a hlist
We only ever traverse the hash chains in the forward direction, so a
double pointer list head isn't really necessary.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 15:11:04 -04:00
J. Bruce Fields
66b2b9b2b0 nfsd4: don't destroy in-use session
This changes session destruction to be similar to client destruction in
that attempts to destroy a session while in use (which should be rare
corner cases) result in DELAY.  This simplifies things somewhat and
helps meet a coming 4.2 requirement.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:40 -04:00
J. Bruce Fields
221a687669 nfsd4: don't destroy in-use clients
When a setclientid_confirm or create_session confirms a client after a
client reboot, it also destroys any previous state held by that client.

The shutdown of that previous state must be careful not to free the
client out from under threads processing other requests that refer to
the client.

This is a particular problem in the NFSv4.1 case when we hold a
reference to a session (hence a client) throughout compound processing.

The server attempts to handle this by unhashing the client at the time
it's destroyed, then delaying the final free to the end.  But this still
leaves some races in the current code.

I believe it's simpler just to fail the attempt to destroy the client by
returning NFS4ERR_DELAY.  This is a case that should never happen
anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:39 -04:00
J. Bruce Fields
4f6e6c1773 nfsd4: simplify bind_conn_to_session locking
The locking here is very fiddly, and there's no reason for us to be
setting cstate->session, since this is the only op in the compound.
Let's just take the state lock and drop the reference counting.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:39 -04:00
J. Bruce Fields
abcdff09a0 nfsd4: fix destroy_session race
destroy_session uses the session and client without continuously holding
any reference or locks.

Put the whole thing under the state lock for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:38 -04:00
J. Bruce Fields
bfa85e83a8 nfsd4: clientid lookup cleanup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:37 -04:00
J. Bruce Fields
c0293b0131 nfsd4: destroy_clientid simplification
I'm not sure what the check for clientid expiry was meant to do here.

The check for a matching session is redundant given the previous check
for state: a client without state is, in particular, a client without
sessions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:36 -04:00
J. Bruce Fields
1ca507920d nfsd4: remove some dprintk's
E.g. printk's that just report the return value from an op are
uninteresting as we already do that in the main proc_compound loop.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:36 -04:00
J. Bruce Fields
0eb6f20aa5 nfsd4: STALE_STATEID cleanup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:35 -04:00
J. Bruce Fields
78389046f7 nfsd4: warn on odd create_session state
This should never happen.

(Note: the comparable case in setclientid_confirm *can* happen, since
updating a client record can result in both confirmed and unconfirmed
records with the same clientid.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:34 -04:00
ycnian@gmail.com
491402a787 nfsd: fix bug on nfs4 stateid deallocation
NFS4_OO_PURGE_CLOSE is not handled properly. To avoid memory leak, nfs4
stateid which is pointed by oo_last_closed_stid is freed in nfsd4_close(),
but NFS4_OO_PURGE_CLOSE isn't cleared meanwhile. So the stateid released in
THIS close procedure may be freed immediately in the coming encoding function.
Sorry that Signed-off-by was forgotten in last version.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:34 -04:00
J. Bruce Fields
2e4b7239a6 nfsd4: fix use-after-free of 4.1 client on connection loss
Once we drop the lock here there's nothing keeping the client around:
the only lock still held is the xpt_lock on this socket, but this socket
no longer has any connection with the client so there's no way for other
code to know we're still using the client.

The solution is simple: all nfsd4_probe_callback does is set a few
variables and queue some work, so there's no reason we can't just keep
it under the lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:32 -04:00
J. Bruce Fields
b0a9d3ab57 nfsd4: fix race on client shutdown
Dropping the session's reference count after the client's means we leave
a window where the session's se_client pointer is NULL.  An xpt_user
callback that encounters such a session may then crash:

[  303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318
[  303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061] PGD 37811067 PUD 3d498067 PMD 0
[  303.959061] Oops: 0002 [#8] PREEMPT SMP
[  303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio
[  303.959061] CPU 0
[  303.959061] Pid: 264, comm: nfsd Tainted: G      D      3.8.0-ARCH+ #156 Bochs Bochs
[  303.959061] RIP: 0010:[<ffffffff81481a8e>]  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061] RSP: 0018:ffff880037877dd8  EFLAGS: 00010202
[  303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278
[  303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318
[  303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002
[  303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00
[  303.959061] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[  303.959061] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0
[  303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0)
[  303.959061] Stack:
[  303.959061]  ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278
[  303.959061]  ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f
[  303.959061]  0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00
[  303.959061] Call Trace:
[  303.959061]  [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd]
[  303.959061]  [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc]
[  303.959061]  [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc]
[  303.959061]  [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd]
[  303.959061]  [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd]
[  303.959061]  [<ffffffff8107ae00>] kthread+0xc0/0xd0
[  303.959061]  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100
[  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[  303.959061]  [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0
[  303.959061]  [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70
[  303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f
[  303.959061] RIP  [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40
[  303.959061]  RSP <ffff880037877dd8>
[  303.959061] CR2: 0000000000000318
[  304.001218] ---[ end trace 2d809cd4a7931f5a ]---
[  304.001903] note: nfsd[264] exited with preempt_count 2

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:48:31 -04:00
Tejun Heo
ebd6c70714 nfsd: convert to idr_alloc()
idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Only compile-tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:45 -07:00
Tejun Heo
801cb2d62d nfsd: remove unused get_new_stid()
get_new_stid() is no longer used since commit 3abdb60712 ("nfsd4:
simplify idr allocation").  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:45 -07:00
Linus Torvalds
b6669737d3 Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Miscellaneous bugfixes, plus:

   - An overhaul of the DRC cache by Jeff Layton.  The main effect is
     just to make it larger.  This decreases the chances of intermittent
     errors especially in the UDP case.  But we'll need to watch for any
     reports of performance regressions.

   - Containerized nfsd: with some limitations, we now support
     per-container nfs-service, thanks to extensive work from Stanislav
     Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

 - idr_remove_all() had been added by a memory leak fix, but has since
   become deprecated since idr_destroy() does it for us now.

 - xs_local_connect() had been added by this branch to make AF_LOCAL
   connections be synchronous, but in the meantime Trond had changed the
   calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
  SUNRPC: make AF_LOCAL connect synchronous
  nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
  svcrpc: fix rpc server shutdown races
  svcrpc: make svc_age_temp_xprts enqueue under sv_lock
  lockd: nlmclnt_reclaim(): avoid stack overflow
  nfsd: enable NFSv4 state in containers
  nfsd: disable usermode helper client tracker in container
  nfsd: use proper net while reading "exports" file
  nfsd: containerize NFSd filesystem
  nfsd: fix comments on nfsd_cache_lookup
  SUNRPC: move cache_detail->cache_request callback call to cache_read()
  SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
  SUNRPC: rework cache upcall logic
  SUNRPC: introduce cache_detail->cache_request callback
  NFS: simplify and clean cache library
  NFS: use SUNRPC cache creation and destruction helper for DNS cache
  nfsd4: free_stid can be static
  nfsd: keep a checksum of the first 256 bytes of request
  sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
  sunrpc: fix comment in struct xdr_buf definition
  ...
2013-02-28 18:02:55 -08:00
Linus Torvalds
94f2f14234 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace and namespace infrastructure changes from Eric W Biederman:
 "This set of changes starts with a few small enhnacements to the user
  namespace.  reboot support, allowing more arbitrary mappings, and
  support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
  user namespace root.

  I do my best to document that if you care about limiting your
  unprivileged users that when you have the user namespace support
  enabled you will need to enable memory control groups.

  There is a minor bug fix to prevent overflowing the stack if someone
  creates way too many user namespaces.

  The bulk of the changes are a continuation of the kuid/kgid push down
  work through the filesystems.  These changes make using uids and gids
  typesafe which ensures that these filesystems are safe to use when
  multiple user namespaces are in use.  The filesystems converted for
  3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs.  The
  changes for these filesystems were a little more involved so I split
  the changes into smaller hopefully obviously correct changes.

  XFS is the only filesystem that remains.  I was hoping I could get
  that in this release so that user namespace support would be enabled
  with an allyesconfig or an allmodconfig but it looks like the xfs
  changes need another couple of days before it they are ready."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
  cifs: Enable building with user namespaces enabled.
  cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
  cifs: Convert struct cifs_sb_info to use kuids and kgids
  cifs: Modify struct smb_vol to use kuids and kgids
  cifs: Convert struct cifsFileInfo to use a kuid
  cifs: Convert struct cifs_fattr to use kuid and kgids
  cifs: Convert struct tcon_link to use a kuid.
  cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
  cifs: Convert from a kuid before printing current_fsuid
  cifs: Use kuids and kgids SID to uid/gid mapping
  cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
  cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
  cifs: Override unmappable incoming uids and gids
  nfsd: Enable building with user namespaces enabled.
  nfsd: Properly compare and initialize kuids and kgids
  nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
  nfsd: Modify nfsd4_cb_sec to use kuids and kgids
  nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
  nfsd: Convert nfsxdr to use kuids and kgids
  nfsd: Convert nfs3xdr to use kuids and kgids
  ...
2013-02-25 16:00:49 -08:00
Zhang Yanfei
697ce9be7d fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used
The three variables are calculated from nr_free_buffer_pages so change
their types to unsigned long in case of overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:22 -08:00
Stanislav Kinsbursky
deb4534f4f nfsd: enable NFSv4 state in containers
Currently, NFSd is ready to operate in network namespace based containers.
So let's drop check for "init_net" and make it able to fly.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 11:21:02 -05:00
Eric W. Biederman
6fab877900 nfsd: Properly compare and initialize kuids and kgids
Use uid_eq(uid, GLOBAL_ROOT_UID) instead of !uid.
Use gid_eq(gid, GLOBAL_ROOT_GID) instead of !gid.
Use uid_eq(uid, INVALID_UID) instead of uid == -1
Use gid_eq(uid, INVALID_GID) instead of gid == -1
Use uid = GLOBAL_ROOT_UID instead of uid = 0;
Use gid = GLOBAL_ROOT_GID instead of gid = 0;
Use !uid_eq(uid1, uid2) instead of uid1 != uid2.
Use !gid_eq(gid1, gid2) instead of gid1 != gid2.
Use uid_eq(uid1, uid2) instead of uid1 == uid2.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-02-13 06:16:09 -08:00
Fengguang Wu
e56a316214 nfsd4: free_stid can be static
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
2013-02-11 16:22:50 -05:00
Jeff Layton
5976687a2b sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h
These routines are used by server and client code, so having them in a
separate header would be best.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05 09:41:14 -05:00
J. Bruce Fields
3abdb60712 nfsd4: simplify idr allocation
We don't really need to preallocate at all; just allocate and initialize
everything at once, but leave the sc_type field initially 0 to prevent
finding the stateid till it's fully initialized.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05 09:41:12 -05:00
majianpeng
2d32b29a1c nfsd: Fix memleak
When free nfs-client, it must free the ->cl_stateids.

Cc: stable@kernel.org
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-05 09:40:47 -05:00
Stanislav Kinsbursky
bca0ec6511 nfsd: fix unused "nn" variable warning in free_client()
If CONFIG_LOCKDEP is disabled, then there would be a warning like this:

  CC [M]  fs/nfsd/nfs4state.o
fs/nfsd/nfs4state.c: In function ‘free_client’:
fs/nfsd/nfs4state.c:1051:19: warning: unused variable ‘nn’ [-Wunused-variable]

So, let's add "maybe_unused" tag to this variable.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:40 -05:00
Yanchuan Nian
266533c6df nfsd: Don't unlock the state while it's not locked
In the procedure of CREATE_SESSION, the state is locked after
alloc_conn_from_crses(). If the allocation fails, the function
goes to "out_free_session", and then "out" where there is an
unlock function.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:37 -05:00
Yanchuan Nian
74b70dded3 nfsd: Pass correct slot number to nfsd4_put_drc_mem()
In alloc_session(), numslots is the correct slot number used by the session.
But the slot number passed to nfsd4_put_drc_mem() is the one from nfs client.

Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:36 -05:00
J. Bruce Fields
24ffb93872 nfsd4: don't leave freed stateid hashed
Note the stateid is hashed early on in init_stid(), but isn't currently
being unhashed on error paths.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 22:00:28 -05:00
J. Bruce Fields
9b3234b922 nfsd4: disable zero-copy on non-final read ops
To ensure ordering of read data with any following operations, turn off
zero copy if the read is not the final operation in the compound.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 16:02:41 -05:00
Bryan Schumaker
0a5c33e23c NFSD: Pass correct buffer size to rpc_ntop
I honestly have no idea where I got 129 from, but it's a much bigger
value than the actual buffer size (INET6_ADDRSTRLEN).

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 18:24:21 -05:00
Stanislav Kinsbursky
9dd9845f08 nfsd: make NFSd service structure allocated per net
This patch makes main step in NFSd containerisation.

There could be different approaches to how to make NFSd able to handle
incoming RPC request from different network namespaces.  The two main
options are:

1) Share NFSd kthreads betwween all network namespaces.
2) Create separated pool of threads for each namespace.

While first approach looks more flexible, second one is simpler and
non-racy.  This patch implements the second option.

To make it possible to allocate separate pools of threads, we have to
make it possible to allocate separate NFSd service structures per net.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:39 -05:00
J. Bruce Fields
9b2ef62b15 nfsd4: lockt, release_lockowner should renew clients
Fix nfsd4_lockt and release_lockowner to lookup the referenced client,
so that it can renew it, or correctly return "expired", as appropriate.

Also share some code while we're here.

Reported-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-04 07:51:12 -05:00
Bryan Schumaker
6c1e82a4b7 NFSD: Forget state for a specific client
Write the client's ip address to any state file and all appropriate
state for that client will be forgotten.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:59:03 -05:00
Bryan Schumaker
184c18471f NFSD: Reading a fault injection file prints a state count
I also log basic information that I can figure out about the type of
state (such as number of locks for each client IP address).  This can be
useful for checking that state was actually dropped and later for
checking if the client was able to recover.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:59:01 -05:00
Bryan Schumaker
8ce54e0d82 NFSD: Fault injection operations take a per-client forget function
The eventual goal is to forget state based on ip address, so it makes
sense to call this function in a for-each-client loop until the correct
amount of state is forgotten.  I also use this patch as an opportunity
to rename the forget function from "func()" to "forget()".

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:59:00 -05:00
Bryan Schumaker
269de30f10 NFSD: Clean up forgetting and recalling delegations
Once I have a client, I can easily use its delegation list rather than
searching the file hash table for delegations to remove.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:58:59 -05:00
Bryan Schumaker
4dbdbda84f NFSD: Clean up forgetting openowners
Using "forget_n_state()" forces me to implement the code needed to
forget a specific client's openowners.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:58:58 -05:00
Bryan Schumaker
fc29171f5b NFSD: Clean up forgetting locks
I use the new "forget_n_state()" function to iterate through each client
first when searching for locks.  This may slow down forgetting locks a
little bit, but it implements most of the code needed to forget a
specified client's locks.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:58:56 -05:00
Bryan Schumaker
44e34da60b NFSD: Clean up forgetting clients
I added in a generic for-each loop that takes a pass over the client_lru
list for the current net namespace and calls some function.  The next few
patches will update other operations to use this function as well.  A value
of 0 still means "forget everything that is found".

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:58:55 -05:00
Bryan Schumaker
043958395a NFSD: Lock state before calling fault injection function
Each function touches state in some way, so getting the lock earlier
can help simplify code.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-03 09:58:54 -05:00
Bryan Schumaker
f3c7521fe5 NFSD: Fold fault_inject.h into state.h
There were only a small number of functions in this file and since they
all affect stored state I think it makes sense to put them in state.h
instead.  I also dropped most static inline declarations since there are
no callers when fault injection is not enabled.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 13:01:02 -05:00
Stanislav Kinsbursky
5284b44e43 nfsd: make NFSv4 grace time per net
Grace time is a part of NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:39:47 -05:00
Stanislav Kinsbursky
3d7337115d nfsd: make NFSv4 lease time per net
Lease time is a part of NFSv4 state engine, which is constructed per network
namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:39:46 -05:00
Stanislav Kinsbursky
f252bc6806 nfsd: call state init and shutdown twice
Split NFSv4 state init and shutdown into two different calls: per-net one and
generic one.
Per-net cwinit/shutdown pair have to be called for any namespace, generic pair
- only once on NSFd kthreads start and shutdown respectively.

Refresh of diff-nfsd-call-state-init-twice

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:53 -05:00
Stanislav Kinsbursky
d85ed44305 nfsd: cleanup NFSd state start a bit
This patch renames nfs4_state_start_net() into nfs4_state_create_net(), where
get_net() now performed.
Also it introduces new nfs4_state_start_net(), which is now responsible for
state creation and initializing all per-net data and which is now called from
nfs4_state_start().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:52 -05:00
Stanislav Kinsbursky
4dce0ac906 nfsd: cleanup NFSd state shutdown a bit
This patch renames __nfs4_state_shutdown_net() into nfs4_state_shutdown_net(),
__nfs4_state_shutdown() into nfs4_state_shutdown_net() and moves all network
related shutdown operations to nfs4_state_shutdown_net().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:51 -05:00
Stanislav Kinsbursky
4e37a7c207 nfsd: make delegations shutdown network namespace aware
NFSv4 delegations are stored in global list. But they are nfs4_client
dependent, which is network namespace aware already.
State shutdown and laundromat are done per network namespace as well.
So, delegations unhash have to be done in network namespace context.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:50 -05:00
Stanislav Kinsbursky
c9a4962881 nfsd: make client_lock per net
This lock protects the client lru list and session hash table, which are
allocated per network namespace already.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:50 -05:00
Stanislav Kinsbursky
ec28e02ca5 nfsd4: remove state lock from nfs4_state_shutdown
Protection of __nfs4_state_shutdown() with nfs4_lock_state() looks redundant.

This function is called by the last NFSd thread on it's exit and state lock
protects actually two functions (del_recall_lru is protected by recall_lock):
1) nfsd4_client_tracking_exit
2) __nfs4_state_shutdown_net

"nfsd4_client_tracking_exit" doesn't require state lock protection, because it's
state can be modified only by tracker callbacks.
Here a re they:
1) create: is called only from nfsd4_proc_compound.
2) remove: is called from either nfsd4_proc_compound or nfs4_laundromat.
3) check: is called only from nfsd4_proc_compound.
4) grace_done; called only from nfs4_laundromat.

nfsd4_proc_compound is called onll by NFSd kthread, which is exiting right
now.
nfs4_laundromat is called by laundry_wq. But laundromat_work was canceled
already.

"__nfs4_state_shutdown_net" also doesn't require state lock protection,
because all NFSd kthreads are dead, and no race can happen with NFSd start,
because "nfsd_up" flag is still set.
Moreover, all Nfsd shutdown is protected with global nfsd_mutex.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:49 -05:00
J. Bruce Fields
063b0fb9fa nfsd4: downgrade some fs/nfsd/nfs4state.c BUG's
Linus has pointed out that indiscriminate use of BUG's can make it
harder to diagnose bugs because they can bring a machine down, often
before we manage to get any useful debugging information to the logs.
(Consider, for example, a BUG() that fires in a workqueue, or while
holding a spinlock).

Most of these BUG's won't do much more than kill an nfsd thread, but it
would still probably be safer to get out the warning without dying.

There's still more of this to do in nfsd/.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26 09:08:16 -05:00
Stanislav Kinsbursky
0912128149 nfsd: make laundromat network namespace aware
This patch moves laundromat_work to nfsd per-net context, thus allowing to run
multiple laundries.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:51 -05:00
Stanislav Kinsbursky
12760c6685 nfsd: pass nfsd_net instead of net to grace enders
Passing net context looks as overkill.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:50 -05:00
Stanislav Kinsbursky
3320fef19b nfsd: use service net instead of hard-coded init_net
This patch replaces init_net by SVC_NET(), where possible and also passes
proper context to nested functions where required.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:50 -05:00
Stanislav Kinsbursky
73758fed71 nfsd: make close_lru list per net
This list holds nfs4 clients (open) stateowner queue for last close replay,
which are network namespace aware. So let's make this list per network
namespace too.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:49 -05:00
Stanislav Kinsbursky
5ed58bb243 nfsd: make client_lru list per net
This list holds nfs4 clients queue for lease renewal, which are network
namespace aware. So let's make this list per network namespace too.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:48 -05:00
Stanislav Kinsbursky
1872de0e81 nfsd: make sessionid_hashtbl allocated per net
This hash holds established sessions state and closely associated with
nfs4_clients info, which are network namespace aware. So let's make it
allocated per network namespace too.

Note: this hash can be allocated in per-net operations. But it looks
better to allocate it on nfsd state start and thus don't waste resources
if server is not running.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:47 -05:00
Stanislav Kinsbursky
20e9e2bc98 nfsd: make lockowner_ino_hashtbl allocated per net
This hash holds file lock owners and closely associated with nfs4_clients info,
which are network namespace aware. So let's make it allocated per network
namespace too.

Note: this hash can be allocated in per-net operations. But it looks
better to allocate it on nfsd state start and thus don't waste resources
if server is not running.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:47 -05:00
Stanislav Kinsbursky
9b53113740 nfsd: make ownerstr_hashtbl allocated per net
This hash holds open owner state and closely associated with nfs4_clients
info, which are network namespace aware. So let's make it allocated per
network namespace too.

Note: this hash can be allocated in per-net operations. But it looks
better to allocate it on nfsd state start and thus don't waste resources
if server is not running.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:46 -05:00
Stanislav Kinsbursky
a99454aa4f nfsd: make unconf_name_tree per net
This hash holds nfs4_clients info, which are network namespace aware.
So let's make it allocated per network namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:45 -05:00
Stanislav Kinsbursky
0a7ec37727 nfsd: make unconf_id_hashtbl allocated per net
This hash holds nfs4_clients info, which are network namespace aware.
So let's make it allocated per network namespace.

Note: this hash can be allocated in per-net operations. But it looks
better to allocate it on nfsd state start and thus don't waste resources
if server is not running.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:45 -05:00
Stanislav Kinsbursky
382a62e76c nfsd: make conf_name_tree per net
This tree holds nfs4_clients info, which are network namespace aware.
So let's make it per network namespace.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:44 -05:00
Stanislav Kinsbursky
8daae4dc0d nfsd: make conf_id_hashtbl allocated per net
This hash holds nfs4_clients info, which are network namespace aware.
So let's make it allocated per network namespace.

Note: this hash can be allocated in per-net operations. But it looks
better to allocate it on nfsd state start and thus don't waste resources
if server is not running.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:43 -05:00
Stanislav Kinsbursky
52e19c09a1 nfsd: make reclaim_str_hashtbl allocated per net
This hash holds nfs4_clients info, which are network namespace aware.
So let's make it allocated per network namespace.

Note: this hash is used only by legacy tracker. So let's allocate hash in
tracker init.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:43 -05:00
Stanislav Kinsbursky
c212cecfa2 nfsd: make nfs4_client network namespace dependent
And use it's net where possible.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:42 -05:00
Stanislav Kinsbursky
7f2210fa6b nfsd: use service net instead of hard-coded net where possible
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:41 -05:00
Fengguang Wu
135ae8270d nfsd4: init_session should be declared static
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-14 11:23:00 -05:00
Jeff Layton
2216d449a9 nfsd: get rid of cl_recdir field
Remove the cl_recdir field from the nfs4_client struct. Instead, just
compute it on the fly when and if it's needed, which is now only when
the legacy client tracking code is in effect.

The error handling in the legacy client tracker is also changed to
handle the case where md5 is unavailable. In that case, we'll warn
the admin with a KERN_ERR message and disable the client tracking.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
ac55fdc408 nfsd: move the confirmed and unconfirmed hlists to a rbtree
The current code requires that we md5 hash the name in order to store
the client in the confirmed and unconfirmed trees. Change it instead
to store the clients in a pair of rbtrees, and simply compare the
cl_names directly instead of hashing them. This also necessitates that
we add a new flag to the clp->cl_flags field to indicate which tree
the client is currently in.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
0ce0c2b5d2 nfsd: don't search for client by hash on legacy reboot recovery gracedone
When nfsd starts, the legacy reboot recovery code creates a tracking
struct for each directory in the v4recoverydir. When the grace period
ends, it basically does a "readdir" on the directory again, and matches
each dentry in there to an existing client id to see if it should be
removed or not. If the matching client doesn't exist, or hasn't
reclaimed its state then it will remove that dentry.

This is pretty inefficient since it involves doing a lot of hash-bucket
searching. It also means that we have to keep relying on being able to
search for a nfs4_client by md5 hashed cl_recdir name.

Instead, add a pointer to the nfs4_client that indicates the association
between the nfs4_client_reclaim and nfs4_client. When a reclaim operation
comes in, we set the pointer to make that association. On gracedone, the
legacy client tracker will keep the recdir around iff:

1/ there is a reclaim record for the directory

...and...

2/ there's an association between the reclaim record and a client record
-- that is, a create or check operation was performed on the client that
matches that directory.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
772a9bbbb5 nfsd: make nfs4_client_to_reclaim return a pointer to the reclaim record
Later callers will need to make changes to the record.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
ce30e5392f nfsd: break out reclaim record removal into separate function
We'll need to be able to call this from nfs4recover.c eventually.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
278c931cb0 nfsd: have nfsd4_find_reclaim_client take a char * argument
Currently, it takes a client pointer, but later we're going to need to
search for these records without knowing whether a matching client even
exists.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12 18:55:11 -05:00
Jeff Layton
a0af710a65 nfsd: remove unused argument to nfs4_has_reclaimed_state
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-10 14:56:54 -05:00
J. Bruce Fields
57725155dc nfsd4: common helper to initialize callback work
I've found it confusing having the only references to
nfsd4_do_callback_rpc() in a different file.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:40:04 -05:00
J. Bruce Fields
cb73a9f464 nfsd4: implement backchannel_ctl operation
This operation is mandatory for servers to implement.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:39:58 -05:00
J. Bruce Fields
c6bb3ca27d nfsd4: use callback security parameters in create_session
We're currently ignoring the callback security parameters specified in
create_session, and just assuming the client wants auth_sys, because
that's all the current linux client happens to care about.  But this
could cause us callbacks to fail to a client that wanted something
different.

For now, all we're doing is no longer ignoring the uid and gid passed in
the auth_sys case.  Further patches will add support for auth_null and
gss (and possibly use more of the auth_sys information; the spec wants
us to use exactly the credential we're passed, though it's hard to
imagine why a client would care).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:31:35 -05:00
J. Bruce Fields
7fa10cd12d nfsd4: don't BUG in delegation break callback
These conditions would indeed indicate bugs in the code, but if we want
to hear about them we're likely better off warning and returning than
immediately dying while holding file_lock_lock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:31:33 -05:00
J. Bruce Fields
7c1f8b65af nfsd4: remove unused init_session return
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:31:31 -05:00
Yanchuan Nian
3c40794b2d nfs: fix wrong object type in lockowner_slab
The object type in the cache of lockowner_slab is wrong, and it is
better to fix it.

Cc: stable@vger.kernel.org
Signed-off-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:30:57 -05:00
Wei Yongjun
01f6c8fd94 nfsd4: remove unused variable in nfsd4_delegreturn()
The variable inode is initialized but never used
otherwise, so remove the unused variable.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:22:31 -05:00
J. Bruce Fields
f474af7051 UAPI Disintegration 2012-10-09
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAUHPmWxOxKuMESys7AQKN4w//XDwALfbf0MXIw+gwyRiUtJe9mGexvI6X
 1R4FWU9a3ImzEZP4cWnmPGT2wmC/x007DcIvx8cyvbdlSuqtR2i/DC+HbWabiLRn
 nJS7Eer1BJvLv5dn6NmXMEz7yB4Z46+frcmBs3WQeR0sqBMDm+rjQzCqECznO8Jc
 VtCbox+VR2DuWcM++YECTblYEH3Z+doDXUN2eBaD8L9x3klPbPXD7OcRyOnry8w+
 ynmUTKKyH4+hpxDakYrObPIg+vFCxb4QRck1mlgA4wbvb3eqjhM0oOCYJ8GvmILA
 vdFYztWCjkiuOl5djtXBlsClX8SAMOBYlRed+R1GvjNCSR+WCWrFJJ2F8qoQ1w87
 9ts2/8qrozS8luTB475SkT2uLdJkIUKX89Oh+dWeE8YkbPnRPj5lNAdtNY5QSyDq
 VaRpIo+YfmZygyvHJQlAXBuZ0mvzcPzArfcPgSVTD3B7xTEGVu/45V7SnQX5os/V
 v39ySPXMdGOIdvK51gw7OtZl64uqrEKu39PyYDX/GUADflp/CHD0J7PJrQePbsH9
 AQolVZDIxTfKqYQnUdL8+C8Zc24RowEzz3c2+aO89MSzwGqev3q8sXRVbW/Iqryg
 p+V3nHe+ipKcga5tOBlPr9KDtDd7j3xN2yaIwf5/QyO1OHBpjAZP1gjSVDcUcwpi
 svYy4kPn3PA=
 =etoL
 -----END PGP SIGNATURE-----

nfs: disintegrate UAPI for nfs

This is to complete part of the Userspace API (UAPI) disintegration for which
the preparatory patches were pulled recently.  After these patches, userspace
headers will be segregated into:

        include/uapi/linux/.../foo.h

for the userspace interface stuff, and:

        include/linux/.../foo.h

for the strictly kernel internal stuff.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-09 18:35:22 -04:00
J. Bruce Fields
0d22f68f02 nfsd4: don't allow reclaims of expired clients
When a confirmed client expires, we normally also need to expire any
stable storage record which would allow that client to reclaim state on
the next boot.  We forgot to do this in some cases.  (For example, in
destroy_clientid, and in the cases in exchange_id and create_session
that destroy and existing confirmed client.)

But in most other cases, there's really no harm to calling
nfsd4_client_record_remove(), because it is a no-op in the case the
client doesn't have an existing

The single exception is destroying a client on shutdown, when we want to
keep the stable storage records so we can recognize which clients will
be allowed to reclaim when we come back up.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:04 -04:00
J. Bruce Fields
6a3b156342 nfsd4: remove redundant callback probe
Both nfsd4_init_conn and alloc_init_session are probing the callback
channel, harmless but pointless.

Also, nfsd4_init_conn should probably be probing in the "unknown" case
as well.  In fact I don't see any harm to just doing it unconditionally
when we get a new backchannel connection.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:03 -04:00
J. Bruce Fields
8f9d3d3b7c nfsd4: expire old client earlier
Before we had to delay expiring a client till we'd found out whether the
session and connection allocations would succeed.  That's no longer
necessary.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:03 -04:00
J. Bruce Fields
81f0b2a496 nfsd4: separate session allocation and initialization
This will allow some further simplification.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:02 -04:00
J. Bruce Fields
a827bcb242 nfsd4: clean up session allocation
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:01 -04:00
J. Bruce Fields
1377b69e68 nfsd4: minor free_session cleanup
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:00 -04:00
J. Bruce Fields
e1ff371f9d nfsd4: new_conn_from_crses should only allocate
Do the initialization in the caller, and clarify that the only failure
ever possible here was due to allocation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:40:00 -04:00
J. Bruce Fields
3ba6367124 nfsd4: separate connection allocation and initialization
It'll be useful to have connection allocation and initialization as
separate functions.

Also, note we'd been ignoring the alloc_conn error return in
bind_conn_to_session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:59 -04:00
J. Bruce Fields
4973050148 nfsd4: reject bad forechannel attrs earlier
This could simplify the logic a little later.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:58 -04:00
J. Bruce Fields
d15c077e44 nfsd4: enforce per-client sessions/no-sessions distinction
Something like creating a client with setclientid and then trying to
confirm it with create_session may not crash the server, but I'm not
completely positive of that, and in any case it's obviously bad client
behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:58 -04:00
J. Bruce Fields
c116a0af76 nfsd4: set cl_minorversion at create time
And remove some mostly obsolete comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:57 -04:00
J. Bruce Fields
68eb35081e nfsd4: don't pin clientids to pseudoflavors
I added cr_flavor to the data compared in same_creds without any
justification, in d5497fc693 "nfsd4: move
rq_flavor into svc_cred".

Recent client changes then started making

	mount -osec=krb5 server:/export /mnt/
	echo "hello" >/mnt/TMP
	umount /mnt/
	mount -osec=krb5i server:/export /mnt/
	echo "hello" >/mnt/TMP

to fail due to a clid_inuse on the second open.

Mounting sequentially like this with different flavors probably isn't
that common outside artificial tests.  Also, the real bug here may be
that the server isn't just destroying the former clientid in this case
(because it isn't good enough at recognizing when the old state is
gone).  But it prompted some discussion and a look back at the spec, and
I think the check was probably wrong.  Fix and document.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:14 -04:00
Al Viro
cb0942b812 make get_file() return its argument
simplifies a bunch of callers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-09-26 21:10:25 -04:00
J. Bruce Fields
ef79859e04 nfsd4: eliminate redundant nfs4_free_stateid
Somehow we ended up with identical functions "nfs4_free_stateid" and
"free_generic_stateid".

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-09-10 17:46:17 -04:00
J. Bruce Fields
cf9182e90b nfsd4: fix nfs4 stateid leak
Processes that open and close multiple files may end up setting this
oo_last_closed_stid without freeing what was previously pointed to.
This can result in a major leak, visible for example by watching the
nfsd4_stateids line of /proc/slabinfo.

Reported-by: Cyril B. <cbay@excellency.fr>
Tested-by: Cyril B. <cbay@excellency.fr>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-09-10 10:55:14 -04:00
Jeff Layton
21179d81f1 knfsd: don't allocate file_locks on the stack
struct file_lock is pretty large and really ought not live on the stack.
On my x86_64 machine, they're almost 200 bytes each.

    (gdb) p sizeof(struct file_lock)
    $1 = 192

...allocate them dynamically instead.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 14:08:39 -04:00
Jeff Layton
5592a3f397 knfsd: remove bogus BUG_ON() call from nfsd4_locku
The code checks for a NULL filp and handles it gracefully just before
this BUG_ON.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 14:08:38 -04:00
J. Bruce Fields
da5c80a935 nfsd4: nfsd_process_n_delegations should be static
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 13:59:39 -04:00
Jeff Layton
1696c47ce2 nfsd: trivial comment updates
locks.c doesn't use the BKL anymore and there is no fi_perfile field.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:39:42 -04:00
Stanislav Kinsbursky
2c142baa7b NFSd: make boot_time variable per network namespace
NFSd's boot_time represents grace period start point in time.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky
a51c84ed50 NFSd: make grace end flag per network namespace
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky
5ccb0066f2 LockD: pass actual network namespace to grace period management functions
Passed network namespace replaced hard-coded init_net

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky
5e1533c788 NFSd: make nfsd4_manager allocated per network namespace context.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:49:21 -04:00
J. Bruce Fields
99dbb8fe09 nfsd4: fix missing fault_inject.h include
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:30:12 -04:00
Stanislav Kinsbursky
a6d88f293e NFSd: fix locking in nfsd_forget_delegations()
This patch adds recall_lock hold to nfsd_forget_delegations() to protect
nfsd_process_n_delegations() call.
Also, looks like it would be better to collect delegations to some local
on-stack list, and then unhash collected list. This split allows to
simplify locking, because delegation traversing is protected by recall_lock,
when delegation unhash is protected by client_mutex.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-25 09:18:27 -04:00
Vivek Trivedi
5559b50acd nfsd4: fix cr_principal comparison check in same_creds
This fixes a wrong check for same cr_principal in same_creds

Introduced by 8fbba96e5b "nfsd4: stricter
cred comparison for setclientid/exchange_id".

Cc: stable@vger.kernel.org
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-25 09:05:30 -04:00
J. Bruce Fields
74dbafaf5d nfsd4: release openowners on free in >=4.1 case
We don't need to keep openowners around in the >=4.1 case, because they
aren't needed to handle CLOSE replays any more (that's a problem for
sessions).  And doing so causes unexpected failures on a subsequent
destroy_clientid to fail.

We probably also need something comparable for lock owners on last
unlock.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-10 16:41:34 -04:00
J. Bruce Fields
4af825041b nfsd4: process_open2 cleanup
Note we can simplify the error handling a little by doing the truncate
earlier.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-20 08:59:43 -04:00
J. Bruce Fields
e1aaa8916f nfsd4: nfsd4_lock() cleanup
Share a little common logic.  And note the comments here are a little
out of date (e.g. we don't always create new state in the "new" case any
more.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-20 08:59:42 -04:00
Chuck Lever
7df302f75e NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID
According to RFC 5661, the TEST_STATEID operation is not allowed to
return NFS4ERR_STALE_STATEID.  In addition, RFC 5661 says:

15.1.16.5.  NFS4ERR_STALE_STATEID (Error Code 10023)

   A stateid generated by an earlier server instance was used.  This
   error is moot in NFSv4.1 because all operations that take a stateid
   MUST be preceded by the SEQUENCE operation, and the earlier server
   instance is detected by the session infrastructure that supports
   SEQUENCE.

I triggered NFS4ERR_STALE_STATEID while testing the Linux client's
NOGRACE recovery.  Bruce suggested an additional test that could be
useful to client developers.

Lastly, RFC 5661, section 18.48.3 has this:

 o  Special stateids are always considered invalid (they result in the
    error code NFS4ERR_BAD_STATEID).

An explicit check is made for those state IDs to avoid printk noise.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-20 08:59:40 -04:00
Weston Andros Adamson
2411967305 nfsd: probe the back channel on new connections
Initiate a CB probe when a new connection with the correct direction is added
to a session (IFF backchannel is marked as down).  Without this a
BIND_CONN_TO_SESSION has no effect on the internal backchannel state, which
causes the server to reply to every SEQUENCE op with the
SEQ4_STATUS_CB_PATH_DOWN flag set until DESTROY_SESSION.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-20 08:59:39 -04:00
J. Bruce Fields
bc2df47a40 nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels
Most frequent symptom was a BUG triggering in expire_client, with the
server locking up shortly thereafter.

Introduced by 508dc6e110 "nfsd41:
free_session/free_client must be called under the client_lock".

Cc: stable@kernel.org
Cc: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-14 13:54:08 -04:00
Linus Torvalds
419f431949 Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux
Pull the rest of the nfsd commits from Bruce Fields:
 "... and then I cherry-picked the remainder of the patches from the
  head of my previous branch"

This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.

I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.

* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
  nfsd4: fix, consolidate client_has_state
  nfsd4: don't remove rebooted client record until confirmation
  nfsd4: remove some dprintk's and a comment
  nfsd4: return "real" sequence id in confirmed case
  nfsd4: fix exchange_id to return confirm flag
  nfsd4: clarify that renewing expired client is a bug
  nfsd4: simpler ordering of setclientid_confirm checks
  nfsd4: setclientid: remove pointless assignment
  nfsd4: fix error return in non-matching-creds case
  nfsd4: fix setclientid_confirm same_cred check
  nfsd4: merge 3 setclientid cases to 2
  nfsd4: pull out common code from setclientid cases
  nfsd4: merge last two setclientid cases
  nfsd4: setclientid/confirm comment cleanup
  nfsd4: setclientid remove unnecessary terms from a logical expression
  nfsd4: move rq_flavor into svc_cred
  nfsd4: stricter cred comparison for setclientid/exchange_id
  nfsd4: move principal name into svc_cred
  nfsd4: allow removing clients not holding state
  nfsd4: rearrange exchange_id logic to simplify
  ...
2012-06-01 08:32:58 -07:00
Linus Torvalds
a00b6151a2 Merge branch 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd update from Bruce Fields.

* 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits)
  nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek
  SUNRPC: split upcall function to extract reusable parts
  nfsd: allocate id-to-name and name-to-id caches in per-net operations.
  nfsd: make name-to-id cache allocated per network namespace context
  nfsd: make id-to-name cache allocated per network namespace context
  nfsd: pass network context to idmap init/exit functions
  nfsd: allocate export and expkey caches in per-net operations.
  nfsd: make expkey cache allocated per network namespace context
  nfsd: make export cache allocated per network namespace context
  nfsd: pass pointer to export cache down to stack wherever possible.
  nfsd: pass network context to export caches init/shutdown routines
  Lockd: pass network namespace to creation and destruction routines
  NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches
  nfsd: pass pointer to expkey cache down to stack wherever possible.
  nfsd: use hash table from cache detail in nfsd export seq ops
  nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops
  nfsd: use exp_put() for svc_export_cache put
  nfsd: use cache detail pointer from svc_export structure on cache put
  nfsd: add link to owner cache detail to svc_export structure
  nfsd: use passed cache_detail pointer expkey_parse()
  ...
2012-05-31 18:18:11 -07:00
J. Bruce Fields
6eccece90b nfsd4: fix, consolidate client_has_state
Whoops: first, I reimplemented the already-existing has_resources
without noticing; second, I got the test backwards.  I did pick a better
name, though.  Combine the two....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:39 -04:00
J. Bruce Fields
b9831b59f3 nfsd4: don't remove rebooted client record until confirmation
In the NFSv4.1 client-reboot case we're currently removing the client's
previous state in exchange_id.  That's wrong--we should be waiting till
the confirming create_session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:34 -04:00
J. Bruce Fields
32f16b3823 nfsd4: remove some dprintk's and a comment
The comment is redundant, and if we really want dprintk's here they'd
probably be better in the common (check-slot_seqid) code.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:31 -04:00
J. Bruce Fields
778df3f0fe nfsd4: return "real" sequence id in confirmed case
The client should ignore the returned sequence_id in the case where the
CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed
case "1" is always the right response.  So it shouldn't actually matter
what we return here.

We could continue returning 1 just to catch clients ignoring the spec
here, but I'd rather be generous.  Other things equal, returning the
existing sequence_id seems more informative.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:27 -04:00
J. Bruce Fields
0f1ba0ef21 nfsd4: fix exchange_id to return confirm flag
Otherwise nfsd4_set_ex_flags writes over the return flags.

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:21 -04:00
J. Bruce Fields
7447758be7 nfsd4: clarify that renewing expired client is a bug
This can't happen:
	- cl_time is zeroed only by unhash_client_locked, which is only
	  ever called under both the state lock and the client lock.
	- every caller of renew_client() should have looked up a
	  (non-expired) client and then called renew_client() all
	  without dropping the state lock.
	- the only other caller of renew_client_locked() is
	  release_session_client(), which first checks under the
	  client_lock that the cl_time is nonzero.

So make it clear that this is a bug, not something we handle.  I can't
quite bring myself to make this a BUG(), though, as there are a lot of
renew_client() callers, and returning here is probably safer than a
BUG().

We'll consider making it a BUG() after some more cleanup.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:14 -04:00
J. Bruce Fields
90d700b779 nfsd4: simpler ordering of setclientid_confirm checks
The cases here divide into two main categories:

	- if there's an uncomfirmed record with a matching verifier,
	  then this is a "normal", succesful case: we're either creating
	  a new client, or updating an existing one.
	- otherwise, this is a weird case: a replay, or a server reboot.

Reordering to reflect that makes the code a bit more concise and the
logic a lot easier to understand.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:07 -04:00
J. Bruce Fields
f3d03b9202 nfsd4: setclientid: remove pointless assignment
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:06 -04:00
J. Bruce Fields
8695b90ac3 nfsd4: fix error return in non-matching-creds case
Note CLID_INUSE is for the case where two clients are trying to use the
same client-provided long-form client identifiers.  But what we're
looking at here is the server-returned shorthand client id--if those
clash there's a bug somewhere.

Fix the error return, pull the check out into common code, and do the
check unconditionally in all cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:04 -04:00
J. Bruce Fields
788c1eba50 nfsd4: fix setclientid_confirm same_cred check
New clients are created only by nfsd4_setclientid(), which always gives
any new client a unique clientid.  The only exception is in the
"callback update" case, in which case it may create an unconfirmed
client with the same clientid as a confirmed client.  In that case it
also checks that the confirmed client has the same credential.

Therefore, it is pointless for setclientid_confirm to check whether a
confirmed and unconfirmed client with the same clientid have matching
credentials--they're guaranteed to.

Instead, it should be checking whether the credential on the
setclientid_confirm matches either of those.  Otherwise, it could be
anyone sending the setclientid_confirm.  Granted, I can't see why anyone
would, but still it's probalby safer to check.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:03 -04:00
J. Bruce Fields
34b232bb37 nfsd4: merge 3 setclientid cases to 2
Boy, is this simpler.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:02 -04:00
J. Bruce Fields
8f9307119d nfsd4: pull out common code from setclientid cases
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:01 -04:00
J. Bruce Fields
ad72aae5ad nfsd4: merge last two setclientid cases
The code here is mostly the same.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:30:00 -04:00
J. Bruce Fields
63db46328a nfsd4: setclientid/confirm comment cleanup
Be a little more concise.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:59 -04:00
J. Bruce Fields
e98479b8d6 nfsd4: setclientid remove unnecessary terms from a logical expression
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:58 -04:00
J. Bruce Fields
d5497fc693 nfsd4: move rq_flavor into svc_cred
Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:58 -04:00
J. Bruce Fields
8fbba96e5b nfsd4: stricter cred comparison for setclientid/exchange_id
The typical setclientid or exchange_id will probably be performed with a
credential that maps to either root or nobody, so comparing just uid's
is unlikely to be useful.  So, use everything else we can get our hands
on.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:57 -04:00