Commit Graph

210686 Commits

Author SHA1 Message Date
J. Bruce Fields
edc7a89403 nfsd: provide callbacks on svc_xprt deletion
NFSv4.1 needs warning when a client tcp connection goes down, if that
connection is being used as a backchannel, so that it can warn the
client that it has lost the backchannel connection.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
c7662518c7 nfsd4: keep per-session list of connections
The spec requires us in various places to keep track of the connections
associated with each session.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
5b6feee960 nfsd4: clean up session allocation
Changes:
	- make sure session memory reservation is released on failure
	  path.
	- use min_t()/min() for more compact code in several places.
	- break alloc_init_session into smaller pieces.
	- miscellaneous other cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
dd93842457 nfsd4: fix alloc_init_session return type
This returns an nfs error, not -ERRNO.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
c23753dac1 nfsd4: fix alloc_init_session BUILD_BUG_ON()
Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
6ff8da0887 nfsd4: Move callback setup to callback queue
Instead of creating the new rpc client from a regular server thread,
set a flag, kick off a null call, and allow the null call to do the work
of setting up the client on the callback workqueue.

Use a spinlock to ensure the callback work gets a consistent view of the
callback parameters.

This allows, for example, changing the callback from contexts where
sleeping is not allowed.  I hope it will also keep the locking simple as
we add more session and trunking features, by serializing most of the
callback-specific work.

This also closes a small race where the the new cb_ident could be used
with an old connection (or vice-versa).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:44 -04:00
J. Bruce Fields
fb00392326 nfsd4: remove separate cb_args struct
I don't see the point of the separate struct.  It seems to just be
getting in the way.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
cee277d924 nfsd4: use generic callback code in null case
This will eventually allow us, for example, to kick off null callback
from contexts where we can't sleep.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
5878453dbd nfsd4: generic callback code
Make the recall callback code more generic, so that other callbacks
will be able to use it too.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
1c8556026e nfsd4: rename nfs4_rpc_args->nfsd4_cb_args
With apologies for the gratuitous rename, the new name seems more
helpful to me.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
586f36735e nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence
These two structs don't really need to be distinct as far as I can tell.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
07263f1efe nfsd4: minor variable renaming (cb -> conn)
Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get
confused if variables of both types are always named cb....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01 19:29:43 -04:00
J. Bruce Fields
1e7af1b806 nfsd4: remove spkm3
Unfortunately, spkm3 never got very far; while interoperability with one
other implementation was demonstrated at some point, problems were found
with the spec that were deemed not worth fixing.

The kernel code is useless on its own without nfs-utils patches which
were never merged into nfs-utils, and were only ever available from
citi.umich.edu.  They appear not to have been updated since 2005.

Therefore it seems safe to assume that this code has no users, and never
will.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 18:09:55 -04:00
NeilBrown
277f68dbba sunrpc: fix race in new cache_wait code.
If we set up to wait for a cache item to be filled in, and then find
that it is no longer pending, it could be that some other thread is
in 'cache_revisit_request' and has moved our request to its 'pending' list.
So when our setup_deferral calls cache_revisit_request it will find nothing to
put on the pending list, and do nothing.

We then return from cache_wait_req, thus leaving the 'sleeper'
on-stack structure open to being corrupted by subsequent stack usage.

However that 'sleeper' could still be on the 'pending' list that the
other thread is looking at and so any corruption could cause it to behave badly.

To avoid this race we simply take the same path as if the
'wait_for_completion_interruptible_timeout' was interrupted and if the
sleeper is no longer on the list (which it won't be) we wait on the
completion - which will ensure that any other cache_revisit_request
will have let go of the sleeper.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 18:09:54 -04:00
Pavel Emelyanov
14ec63c333 sunrpc: Create sockets in net namespaces
The context is already known in all the sock_create callers.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:19:00 -04:00
Pavel Emelyanov
721db93a55 net: Export __sock_create
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:59 -04:00
Pavel Emelyanov
37aa213373 sunrpc: Tag rpc_xprt with net
The net is known from the xprt_create and this tagging will also
give un the context in the conntection workers where real sockets
are created.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:58 -04:00
Pavel Emelyanov
9a23e332ec sunrpc: Add net to xprt_create
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:57 -04:00
Pavel Emelyanov
c653ce3f0a sunrpc: Add net to rpc_create_args
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:56 -04:00
Pavel Emelyanov
62832c039e sunrpc: Pull net argument downto svc_create_socket
After this the socket creation in it knows the context.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:55 -04:00
Pavel Emelyanov
fc5d00b04a sunrpc: Add net argument to svc_create_xprt
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:54 -04:00
Pavel Emelyanov
e204e621b4 sunrpc: Factor out rpc_xprt freeing
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:53 -04:00
Pavel Emelyanov
bd1722d431 sunrpc: Factor out rpc_xprt allocation
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:52 -04:00
Benny Halevy
2b44f1ba40 nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length
The existing code adjusted it based on the worst case scenario for the returned
bitmap and the best case scenario for the supported attrs attribute.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[bfields@redhat.com: removed likely/unlikely's]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 16:52:24 -04:00
Stephen Rothwell
c135e84afb sunrpc: fix up rpcauth_remove_module section mismatch
On Wed, 29 Sep 2010 14:02:38 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> After merging the final tree, today's linux-next build (powerpc
> ppc44x_defconfig) produced tis warning:
>
> WARNING: net/sunrpc/sunrpc.o(.init.text+0x110): Section mismatch in reference from the function init_sunrpc() to the function .exit.text:rpcauth_remove_module()
> The function __init init_sunrpc() references
> a function __exit rpcauth_remove_module().
> This is often seen when error handling in the init function
> uses functionality in the exit path.
> The fix is often to remove the __exit annotation of
> rpcauth_remove_module() so it may be used outside an exit section.
>
> Probably caused by commit 2f72c9b737
> ("sunrpc: The per-net skeleton").

This actually causes a build failure on a sparc32 defconfig build:

`rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o

I applied the following patch for today:

Fixes:

`rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-29 12:27:37 -04:00
Pavel Emelyanov
90d51b02fd sunrpc: Make the ip_map_cache be per-net
Everything that is required for that already exists:
* the per-net cache registration with respective proc entries
* the context (struct net) is available in all the users

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:12 -04:00
Pavel Emelyanov
4f42d0d53c sunrpc: Make the /proc/net/rpc appear in net namespaces
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:12 -04:00
Pavel Emelyanov
2f72c9b737 sunrpc: The per-net skeleton
Register empty per-net operations for the sunrpc layer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:12 -04:00
Pavel Emelyanov
4fb8518bda sunrpc: Tag svc_xprt with net
The transport representation should be per-net of course.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:12 -04:00
Pavel Emelyanov
593ce16b94 sunrpc: Add routines that allow registering per-net caches
Existing calls do the same, but for the init_net.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
Pavel Emelyanov
352114f395 sunrpc: Add net to pure API calls
There are two calls that operate on ip_map_cache and are
directly called from the nfsd code. Other places will be
handled in a different way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
Pavel Emelyanov
3be4479fdf sunrpc: Pass xprt to cached get/put routines
They do not require the rqst actually and having the xprt simplifies
further patching.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
Pavel Emelyanov
e3bfca01c1 sunrpc: Make xprt auth cache release work with the xprt
This is done in order to facilitate getting the ip_map_cache from
which to put the ip_map.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
Pavel Emelyanov
bf18ab32ff sunrpc: Pass the ip_map_parse's cd to lower calls
The target is to have many ip_map_cache-s in the system. This particular
patch handles its usage by the ip_map_parse callback.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
J. Bruce Fields
74ec1e1269 nfsd: fix /proc/net/rpc/nfsd.export/content display
Note with "first" always 0, and "lastflags" initially 0, we always dump
a spurious set of 0 flags at the start, among other problems.

Fix.  And attempt to make the code a little more obvious.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-26 14:48:25 -04:00
Pavel Emelyanov
049ef27b22 nfsd: Export get_task_comm for nfsd
The git://linux-nfs.org/~bfields/linux.git nfsd-next branch doesn't
compile when nfsd is a module with the following error:

   ERROR: "get_task_comm" [fs/nfsd/nfsd.ko] undefined!

Replace the get_task_comm call with direct comm access, which is
safe for current.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-23 10:34:21 -04:00
NeilBrown
1e1405673e nfsd: allow deprecated interface to be compiled out.
Add CONFIG_NFSD_DEPRECATED, default to y.
Only include deprecated interface if this is defined.
This allows distros to remove this interface before the official
removal, and allows developers to test without it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-22 15:33:14 -04:00
NeilBrown
c67874f942 nfsd: formally deprecate legacy nfsd syscall interface
The syscall interface is has been replaced by a more flexible
interface since 2.6.0.  It is time to work towards discarding
the old interface.

So add a entry in feature-removal-schedule.txt and print a warning
when the interface is used.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-22 15:33:13 -04:00
NeilBrown
e95dffa430 sunrpc/cache: fix recent breakage of cache_clean_deferred
commit 6610f720e9
broke cache_clean_deferred as entries are no longer added to the
pending list for subsequent revisiting.

So put those requests back on the pending list.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-22 15:33:12 -04:00
Bryan Schumaker
f904be9cc7 lockd: Mostly remove BKL from the server
This patch removes all but one call to lock_kernel() from the server.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-22 15:32:58 -04:00
Andy Shevchenko
e7f483eabe sunrpc/cache: don't use custom hex_to_bin() converter
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 22:45:07 -04:00
NeilBrown
1117449276 sunrpc/cache: change deferred-request hash table to use hlist.
Being a hash table, hlist is the best option.

There is currently some ugliness were we treat "->next == NULL" as
a special case to avoid having to initialise the whole array.
This change nicely gets rid of that case.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 21:51:11 -04:00
NeilBrown
2ed5282cd9 svcauth_gss: replace a trivial 'switch' with an 'if'
Code like:

  switch(xxx) {
  case -error1:
  case -error2:
     ..
     return;
  case 0:
     stuff;
  }

  can more naturally be written:

  if (xxx < 0)
      return;

  stuff;

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 19:16:31 -04:00
NeilBrown
839049a873 nfsd/idmap: drop special request deferal in favour of improved default.
The idmap code manages request deferal by waiting for a reply from
userspace rather than putting the NFS request on a queue to be retried
from the start.
Now that the common deferal code does this there is no need for the
special code in idmap.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 17:08:31 -04:00
NeilBrown
8ff30fa4ef nfsd: disable deferral for NFSv4
Now that a slight delay in getting a reply to an upcall doesn't
require deferring of requests, request deferral for all NFSv4
requests - the concept doesn't really fit with the v4 model.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 17:02:27 -04:00
NeilBrown
1ebede86b8 sunrpc: close connection when a request is irretrievably lost.
If we drop a request in the sunrpc layer, either due kmalloc failure,
or due to a cache miss when we could not queue the request for later
replay, then close the connection to encourage the client to retry sooner.

Note that if the drop happens in the NFS layer, NFSERR_JUKEBOX
(aka NFS4ERR_DELAY) is returned to guide the client concerning
replay.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 16:57:49 -04:00
J. Bruce Fields
0649752458 nfsd4: fix hang on fast-booting nfs servers
The last_close field of a cache_detail is initialized to zero, so the
condition

	detail->last_close < seconds_since_boot() - 30

may be false even for a cache that was never opened.

However, we want to immediately fail upcalls to caches that were never
opened: in the case of the auth_unix_gid cache, especially, which may
never be opened by mountd (if the --manage-gids option is not set), we
want to fail the upcall immediately.  Otherwise client requests will be
dropped unnecessarily on reboot.

Also document these conditions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-19 23:49:30 -04:00
J. Bruce Fields
c88739b373 Merge remote branch 'trond/bugfixes' into for-2.6.37
Without some client-side fixes, server testing is currently difficult.
2010-09-19 23:48:32 -04:00
Trond Myklebust
827e345702 SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
The NFSv4 client's callback server calls svc_gss_principal(), which
is defined in the auth_rpcgss.ko

The NFSv4 server has the same dependency, and in addition calls
svcauth_gss_flavor(), gss_mech_get_by_pseudoflavor(),
gss_pseudoflavor_to_service() and gss_mech_put() from the same module.

The module auth_rpcgss itself has no dependencies aside from sunrpc,
so we only need to select RPCSEC_GSS.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-12 19:57:50 -04:00
Menyhart Zoltan
fbf3fdd244 statfs() gives ESTALE error
Hi,

An NFS client executes a statfs("file", &buff) call.
"file" exists / existed, the client has read / written it,
but it has already closed it.

user_path(pathname, &path) looks up "file" successfully in the
directory-cache  and restarts the aging timer of the directory-entry.
Even if "file" has already been removed from the server, because the
lookupcache=positive option I use, keeps the entries valid for a while.

nfs_statfs() returns ESTALE if "file" has already been removed from the
server.

If the user application repeats the statfs("file", &buff) call, we
are stuck: "file" remains young forever in the directory-cache.

Signed-off-by: Zoltan Menyhart  <Zoltan.Menyhart@bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-09-12 19:55:26 -04:00