Commit Graph

215 Commits

Author SHA1 Message Date
Jeff Layton
65294c1f2c nfsd: add a new struct file caching facility to nfsd
Currently, NFSv2/3 reads and writes have to open a file, do the read or
write and then close it again for each RPC. This is highly inefficient,
especially when the underlying filesystem has a relatively slow open
routine.

This patch adds a new open file cache to knfsd. Rather than doing an
open for each RPC, the read/write handlers can call into this cache to
see if there is one already there for the correct filehandle and
NFS_MAY_READ/WRITE flags.

If there isn't an entry, then we create a new one and attempt to
perform the open. If there is, then we wait until the entry is fully
instantiated and return it if it is at the end of the wait. If it's
not, then we attempt to take over construction.

Since the main goal is to speed up NFSv2/3 I/O, we don't want to
close these files on last put of these objects. We need to keep them
around for a little while since we never know when the next READ/WRITE
will come in.

Cache entries have a hardcoded 1s timeout, and we have a recurring
workqueue job that walks the cache and purges any entries that have
expired.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Richard Sharpe <richard.sharpe@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:00:39 -04:00
Trond Myklebust
40373b125d lockd: Pass the user cred from knfsd when starting the lockd server
When starting up a new knfsd server, pass the user cred to the
supporting lockd server.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:35 -04:00
Trond Myklebust
4df493a260 SUNRPC: Cache the process user cred in the RPC server listener
In order to be able to interpret uids and gids correctly in knfsd, we
should cache the user namespace of the process that created the RPC
server's listener. To do so, we refcount the credential of that process.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:35 -04:00
Trond Myklebust
e333f3bbef nfsd: Allow containers to set supported nfs versions
Support use of the --nfs-version/--no-nfs-version arguments to rpc.nfsd
in containers.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:35 -04:00
Trond Myklebust
029be5d033 nfsd: Add custom rpcbind callbacks for knfsd
Add custom rpcbind callbacks in preparation for the knfsd
per-container version feature.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:35 -04:00
Trond Myklebust
642ee6b209 SUNRPC: Allow further customisation of RPC program registration
Add a callback to allow customisation of the rpcbind registration.
When clients have the ability to turn on and off version support,
we want to allow them to also prevent registration of those
versions with the rpc portmapper.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:35 -04:00
Trond Myklebust
8e5b67731d SUNRPC: Add a callback to initialise server requests
Add a callback to help initialise server requests before they are
processed. This will allow us to clean up the NFS server version
support, and to make it container safe.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-04-24 09:46:34 -04:00
Vasily Averin
2317dc557a race of nfsd inetaddr notifiers vs nn->nfsd_serv change
nfsd_inet[6]addr_event uses nn->nfsd_serv without taking nfsd_mutex,
which can be changed during execution of notifiers and crash the host.

Moreover if notifiers were enabled in one net namespace they are enabled
in all other net namespaces, from creation until destruction.

This patch allows notifiers to access nn->nfsd_serv only after the
pointer is correctly initialized and delays cleanup until notifiers are
no longer in use.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-27 16:45:11 -05:00
Linus Torvalds
4dd3c2e5a4 Lots of good bugfixes, including:
- fix a number of races in the NFSv4+ state code.
 	- fix some shutdown crashes in multiple-network-namespace cases.
 	- relax our 4.1 session limits; if you've an artificially low limit
 	  to the number of 4.1 clients that can mount simultaneously, try
 	  upgrading.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaEH3oAAoJECebzXlCjuG++t0P/2t7RvRUunQa4pngCmg5QbOA
 rldfEd1HM1F6+4fXzN0wcxWjphUNxs19VjEaWNjThYoGGTEdSOuFhBHgK18xmHjp
 Cjz5IYJ0yS7PClCxMTmz5u3gfyExPR83whmNaNK69CGvn5xu97gDntOv/06Llw4Y
 nCUJrEmVcMAOHek3tOD0Rlv8eYFyfLhF6zacp+qWFIlymU118iK1Or83M7pi6j51
 yVVOvxktDLzkyDq5gQD/Py3rKHikOWFMCoseOPfMnOiGF/Bp7YDzWt6HT17mwyU4
 xDeICbnfqve2SwT9NChpJOYtUAPuZDiQR6G2ZtnI8/JN7ob/wls/4CbDVlzYFN4r
 dLsRlEC5spQmg34j6dscOKkt1vRK9vKXTC46wEMfXZLtiDLA/uZ/J0gNh3EXqpbt
 LQQZI4B2MomYPcp64i4UHHO8BqSIX+lC5otVlAW105TQvZflJ8Mhtawmpu1O3nXZ
 DSUhkZrImlBmb7/ulhjyXpmNAxQLXsqb0lP5tUYR5Re+A2lyea/pMJmtBLu3fv6h
 tzHqq2JL13kblqJY+Frc1zqQGI5AAyKmdTTjmljBIGHxbVwAMzk1qO+VOI/f+J21
 MWNmFkEqw+Tnvwy6sIm1eUGtTWIGc6ejvMxXguAfa+QjT4iHAL3F4PkpSihzIZnm
 bzHDeJ87HRWWj/ICPQ1j
 =PBs+
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Lots of good bugfixes, including:

   -  fix a number of races in the NFSv4+ state code

   -  fix some shutdown crashes in multiple-network-namespace cases

   -  relax our 4.1 session limits; if you've an artificially low limit
      to the number of 4.1 clients that can mount simultaneously, try
      upgrading"

* tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits)
  SUNRPC: Improve ordering of transport processing
  nfsd: deal with revoked delegations appropriately
  svcrdma: Enqueue after setting XPT_CLOSE in completion handlers
  nfsd: use nfs->ns.inum as net ID
  rpc: remove some BUG()s
  svcrdma: Preserve CB send buffer across retransmits
  nfds: avoid gettimeofday for nfssvc_boot time
  fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t
  fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t
  fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t
  lockd: double unregister of inetaddr notifiers
  nfsd4: catch some false session retries
  nfsd4: fix cached replies to solo SEQUENCE compounds
  sunrcp: make function _svc_create_xprt static
  SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
  nfsd: use ARRAY_SIZE
  nfsd: give out fewer session slots as limit approaches
  nfsd: increase DRC cache limit
  nfsd: remove unnecessary nofilehandle checks
  nfs_common: convert int to bool
  ...
2017-11-18 11:22:04 -08:00
Arnd Bergmann
256a89fa3d nfds: avoid gettimeofday for nfssvc_boot time
do_gettimeofday() is deprecated and we should generally use time64_t
based functions instead.

In case of nfsd, all three users of nfssvc_boot only use the initial
time as a unique token, and are not affected by it overflowing, so they
are not affected by the y2038 overflow.

This converts the structure to timespec64 anyway and adds comments
to all uses, to document that we have thought about it and avoid
having to look at it again.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-07 16:44:00 -05:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
J. Bruce Fields
44d8660d3b nfsd: increase DRC cache limit
An NFSv4.1+ client negotiates the size of its duplicate reply cache size
in the initial CREATE_SESSION request.  The server preallocates the
memory for the duplicate reply cache to ensure that we'll never fail to
record the response to a nonidempotent operation.

To prevent a few CREATE_SESSIONs from consuming all of memory we set an
upper limit based on nr_free_buffer_pages().  1/2^10 has been too
limiting in practice; 1/2^7 is still less than one percent.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-10-04 16:25:01 -04:00
Chuck Lever
afea5657c2 sunrpc: Const-ify struct sv_serv_ops
Close an attack vector by moving the arrays of per-server methods to
read-only memory.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-08-24 22:13:50 -04:00
Christoph Hellwig
e9679189e3 sunrpc: mark all struct svc_version instances as const
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:31 +02:00
Christoph Hellwig
860bda29b9 sunrpc: mark all struct svc_procinfo instances as const
struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:31 +02:00
Christoph Hellwig
63f8de3795 sunrpc: properly type pc_encode callbacks
Drop the resp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-15 17:42:25 +02:00
Christoph Hellwig
026fec7e7c sunrpc: properly type pc_decode callbacks
Drop the argp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:24 +02:00
Christoph Hellwig
a6beb73272 sunrpc: properly type pc_func callbacks
Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument.  With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-05-15 17:42:23 +02:00
J. Bruce Fields
e6838a29ec nfsd: check for oversized NFSv2/v3 arguments
A client can append random data to the end of an NFSv2 or NFSv3 RPC call
without our complaining; we'll just stop parsing at the end of the
expected data and ignore the rest.

Encoded arguments and replies are stored together in an array of pages,
and if a call is too large it could leave inadequate space for the
reply.  This is normally OK because NFS RPC's typically have either
short arguments and long replies (like READ) or long arguments and short
replies (like WRITE).  But a client that sends an incorrectly long reply
can violate those assumptions.  This was observed to cause crashes.

Also, several operations increment rq_next_page in the decode routine
before checking the argument size, which can leave rq_next_page pointing
well past the end of the page array, causing trouble later in
svc_free_pages.

So, following a suggestion from Neil Brown, add a central check to
enforce our expectation that no NFSv2/v3 call has both a large call and
a large reply.

As followup we may also want to rewrite the encoding routines to check
more carefully that they aren't running off the end of the page array.

We may also consider rejecting calls that have any extra garbage
appended.  That would be safer, and within our rights by spec, but given
the age of our server and the NFS protocol, and the fact that we've
never enforced this before, we may need to balance that against the
possibility of breaking some oddball client.

Reported-by: Tuomas Haanpää <thaan@synopsys.com>
Reported-by: Ari Kauppi <ari@synopsys.com>
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-04-25 16:34:37 -04:00
NeilBrown
800a938f0b NFSD: fix nfsd_reset_versions for NFSv4.
If you write "-2 -3 -4" to the "versions" file, it will
notice that no versions are enabled, and nfsd_reset_versions()
is called.
This enables all major versions, not no minor versions.
So we lose the invariant that NFSv4 is only advertised when
at least one minor is enabled.

Fix the code to explicitly enable minor versions for v4,
change it to use nfsd_vers() to test and set, and simplify
the code.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-03-10 17:04:50 -05:00
NeilBrown
928c6fb3a9 NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
Current code will return 1 if the version is supported,
and -1 if it isn't.
This is confusing and inconsistent with the one place where this
is used.
So change to return 1 if it is supported, and zero if not.
i.e. an error is never returned.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-03-10 17:04:50 -05:00
Ingo Molnar
3f07c01441 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:29 +01:00
Trond Myklebust
d3635ff07e nfsd: fix configuration of supported minor versions
When the user turns off all minor versions of NFSv4, that should be
equivalent to turning off NFSv4 support, so a mount attempt using NFSv4
should get RPC_PROG_MISMATCH, not NFSERR_MINOR_VERS_MISMATCH.

Allow the user to use either '4.0' or '4' to enable or disable minor
version 0.  Other minor versions are still enabled or disabled using the
'4.x' format.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-27 18:04:08 -05:00
Scott Mayhew
7b19824de6 nfsd: initialize sin6_scope_id in nfsd_inet6addr_event()
I noticed this was missing when I was testing with link local addresses.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31 12:31:53 -05:00
Andreas Gruenbacher
47057abde5 nfsd: add support for the umask attribute
Clients can set the umask attribute when creating files to cause the
server to apply it always except when inheriting permissions from the
parent directory.  That way, the new files will end up with the same
permissions as files created locally.

See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more
details.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-12-15 20:42:48 -05:00
Vasily Averin
1eca45f8a8 NFSD: fix corruption in notifier registration
By design notifier can be registered once only, however nfsd registers
the same inetaddr notifiers per net-namespace.  When this happen it
corrupts list of notifiers, as result some notifiers can be not called
on proper event, traverse on list can be cycled forever, and second
unregister can access already freed memory.

Cc: stable@vger.kernel.org
fixes: 36684996 ("nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26 14:17:45 -04:00
Kinglong Mee
691412b443 nfsd: Fix nfsd leaks sunrpc module references
Stefan Hajnoczi reports,
nfsd leaks 3 references to the sunrpc module here:

  # echo -n "asdf 1234" >/proc/fs/nfsd/portlist
  bash: echo: write error: Protocol not supported

Now stop nfsd and try unloading the kernel modules:

  # systemctl stop nfs-server
  # systemctl stop nfs
  # systemctl stop proc-fs-nfsd.mount
  # systemctl stop var-lib-nfs-rpc_pipefs.mount
  # rmmod nfsd
  # rmmod nfs_acl
  # rmmod lockd
  # rmmod auth_rpcgss
  # rmmod sunrpc
  rmmod: ERROR: Module sunrpc is in use
  # lsmod | grep rpc
  sunrpc                315392  3

It is caused by nfsd don't cleanup rpcb program for nfsd
when destroying svc service after creating xprt fail.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-01-07 10:10:51 -05:00
Scott Mayhew
366849966f nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain
Register callbacks on inetaddr_chain and inet6addr_chain to trigger
cleanup of nfsd transport sockets when an ip address is deleted.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-12-23 10:08:15 -05:00
Jeff Layton
598e235909 nfsd/sunrpc: abstract out svc_set_num_threads to sv_ops
Add an operation that will do setup of the service. In the case of a
classic thread-based service that means starting up threads. In the case
of a workqueue-based service, the setup will do something different.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirliey.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:43 -04:00
Jeff Layton
b9e13cdfac nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation
For now, all services use svc_xprt_do_enqueue, but once we add
workqueue-based service support, we'll need to do something different.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:42 -04:00
Jeff Layton
758f62fff9 nfsd/sunrpc: move sv_module parm into sv_ops
...not technically an operation, but it's more convenient and cleaner
to pass the module pointer in this struct.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:41 -04:00
Jeff Layton
c369014f17 nfsd/sunrpc: move sv_function into sv_ops
Since we now have a container for holding svc_serv operations, move the
sv_function into it as well.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:41 -04:00
Jeff Layton
ea126e7435 nfsd/sunrpc: add a new svc_serv_ops struct and move sv_shutdown into it
In later patches we'll need to abstract out more operations on a
per-service level, besides sv_shutdown and sv_function.

Declare a new svc_serv_ops struct to hold these operations, and move
sv_shutdown into this struct.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:40 -04:00
J. Bruce Fields
c23ae60178 nfsd: default NFSv4.2 to on
The code seems to work.  The protocol looks stable.  The kernel's
version defaults can be overridden by rpc.nfsd arguments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-02-09 14:58:50 -05:00
Jeff Layton
78b65eb3fd sunrpc: move rq_dropme flag into rq_flags
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:20 -05:00
Kinglong Mee
d9499a9571 NFSD: Decrease nfsd_users in nfsd_startup_generic fail
A memory allocation failure could cause nfsd_startup_generic to fail, in
which case nfsd_users wouldn't be incorrectly left elevated.

After nfsd restarts nfsd_startup_generic will then succeed without doing
anything--the first consequence is likely nfs4_start_net finding a bad
laundry_wq and crashing.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 4539f14981 "nfsd: replace boolean nfsd_up flag by users counter"
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-01 16:26:09 -04:00
Jeff Layton
5b8db00bae nfsd: add a new /proc/fs/nfsd/max_connections file
Currently, the maximum number of connections that nfsd will allow
is based on the number of threads spawned. While this is fine for a
default, there really isn't a clear relationship between the two.

The number of threads corresponds to the number of concurrent requests
that we want to allow the server to process at any given time. The
connection limit corresponds to the maximum number of clients that we
want to allow the server to handle. These are two entirely different
quantities.

Break the dependency on increasing threads in order to allow for more
connections, by adding a new per-net parameter that can be set to a
non-zero value. The default is still to base it on the number of threads,
so there should be no behavior change for anyone who doesn't use it.

Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08 17:14:32 -04:00
Kinglong Mee
3c7aa15d20 NFSD: Using min/max/min_t/max_t for calculate
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-23 11:31:36 -04:00
NeilBrown
8658452e4a nfsd: Only set PF_LESS_THROTTLE when really needed.
PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks
and live-locks while writing to the page cache in a loop-back
NFS mount situation.

It therefore makes sense to *only* set PF_LESS_THROTTLE in this
situation.
We now know when a request came from the local-host so it could be a
loop-back mount.  We already know when we are handling write requests,
and when we are doing anything else.

So combine those two to allow nfsd to still be throttled (like any
other process) in every situation except when it is known to be
problematic.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-22 15:59:19 -04:00
Kinglong Mee
ff88825fbb NFSD: fix compile warning without CONFIG_NFSD_V3
Without CONFIG_NFSD_V3, compile will get warning as,

fs/nfsd/nfssvc.c: In function 'nfsd_svc':
>> fs/nfsd/nfssvc.c:246:60: warning: array subscript is above array bounds [-Warray-bounds]
        return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
                                                               ^

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-06 12:23:31 -05:00
Kinglong Mee
8ef667140c NFSD: Don't start lockd when only NFSv4 is running
When starting without nfsv2 and nfsv3, nfsd does not need to start
lockd (and certainly doesn't need to fail because lockd failed to
register with the portmapper).

Reported-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:50 -05:00
J. Bruce Fields
35f7a14fc1 nfsd4: fix minorversion support interface
You can turn on or off support for minorversions using e.g.

	echo "-4.2" >/proc/fs/nfsd/versions

However, the current implementation is a little wonky.  For example, the
above will turn off 4.2 support, but it will also turn *on* 4.1 support.

This didn't matter as long as we only had 2 minorversions, which was
true till very recently.

And do a little cleanup here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-12 16:48:52 -04:00
J. Bruce Fields
d109148111 nfsd4: support minorversion 1 by default
We now have minimal minorversion 1 support; turn it on by default.

This can still be turned off with "echo -4.1 >/proc/fs/nfsd/versions".

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-08 19:48:10 -04:00
Linus Torvalds
b6669737d3 Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux
Pull nfsd changes from J Bruce Fields:
 "Miscellaneous bugfixes, plus:

   - An overhaul of the DRC cache by Jeff Layton.  The main effect is
     just to make it larger.  This decreases the chances of intermittent
     errors especially in the UDP case.  But we'll need to watch for any
     reports of performance regressions.

   - Containerized nfsd: with some limitations, we now support
     per-container nfs-service, thanks to extensive work from Stanislav
     Kinsbursky over the last year."

Some notes about conflicts, since there were *two* non-data semantic
conflicts here:

 - idr_remove_all() had been added by a memory leak fix, but has since
   become deprecated since idr_destroy() does it for us now.

 - xs_local_connect() had been added by this branch to make AF_LOCAL
   connections be synchronous, but in the meantime Trond had changed the
   calling convention in order to avoid a RCU dereference.

There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.

* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
  SUNRPC: make AF_LOCAL connect synchronous
  nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
  svcrpc: fix rpc server shutdown races
  svcrpc: make svc_age_temp_xprts enqueue under sv_lock
  lockd: nlmclnt_reclaim(): avoid stack overflow
  nfsd: enable NFSv4 state in containers
  nfsd: disable usermode helper client tracker in container
  nfsd: use proper net while reading "exports" file
  nfsd: containerize NFSd filesystem
  nfsd: fix comments on nfsd_cache_lookup
  SUNRPC: move cache_detail->cache_request callback call to cache_read()
  SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
  SUNRPC: rework cache upcall logic
  SUNRPC: introduce cache_detail->cache_request callback
  NFS: simplify and clean cache library
  NFS: use SUNRPC cache creation and destruction helper for DNS cache
  nfsd4: free_stid can be static
  nfsd: keep a checksum of the first 256 bytes of request
  sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
  sunrpc: fix comment in struct xdr_buf definition
  ...
2013-02-28 18:02:55 -08:00
Zhang Yanfei
697ce9be7d fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used
The three variables are calculated from nr_free_buffer_pages so change
their types to unsigned long in case of overflow.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:22 -08:00
Stanislav Kinsbursky
11f779421a nfsd: containerize NFSd filesystem
This patch makes NFSD file system superblock to be created per net.
This makes possible to get proper network namespace from superblock instead of
using hard-coded "init_net".

Note: NFSd fs super-block holds network namespace. This garantees, that
network namespace won't disappear from underneath of it.
This, obviously, means, that in case of kill of a container's "init" (which is not a mount
namespace, but network namespace creator) netowrk namespace won't be
destroyed.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 11:21:00 -05:00
Jeff Layton
09662d58d5 nfsd: get rid of RC_INTR
The reply cache code never returns this status.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-04 09:16:20 -05:00
Stanislav Kinsbursky
88c4766617 nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
Since NFSd service is per-net now, we have to pass proper network
context in nfsd_shutdown() from NFSd kthreads.

The simplest way I found is to get proper net from one of transports
with permanent sockets.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:42 -05:00
Stanislav Kinsbursky
541e864f00 nfsd: simplify service shutdown
Function nfsd_shutdown is called from two places: nfsd_last_thread (when last
kernel thread is exiting) and nfsd_svc (in case of kthreads starting error).
When calling from nfsd_svc(), we can be sure that per-net resources are
allocated, so we don't need to check per-net nfsd_net_up boolean flag.
This allows us to remove nfsd_shutdown function at all and move check for
per-net nfsd_net_up boolean flag to nfsd_last_thread.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:42 -05:00
Stanislav Kinsbursky
4539f14981 nfsd: replace boolean nfsd_up flag by users counter
Since we have generic NFSd resurces, we have to introduce some way how to
allocate and destroy those resources on first per-net NFSd start and on
last per-net NFSd stop respectively.
This patch replaces global boolean nfsd_up flag (which is unused now) by users
counter and use it to determine either we need to allocate generic resources
or destroy them.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:41 -05:00
Stanislav Kinsbursky
903d9bf0ed nfsd: simplify NFSv4 state init and shutdown
This patch moves nfsd_startup_generic() and nfsd_shutdown_generic()
calls to nfsd_startup_net() and nfsd_shutdown_net() respectively, which
allows us to call nfsd_startup_net() instead of nfsd_startup() and makes
the code look clearer.  It also modifies nfsd_svc() and nfsd_shutdown()
to check nn->nfsd_net_up instead of global nfsd_up.  The latter is now
used only for generic resources shutdown and is currently useless.  It
will replaced by NFSd users counter later in this series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:40 -05:00
Stanislav Kinsbursky
bda9cac1db nfsd: introduce helpers for generic resources init and shutdown
NFSd have per-net resources and resources, used globally.
Let's move generic resources init and shutdown to separated functions since
they are going to be allocated on first NFSd service start and destroyed after
last NFSd service shutdown.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:39 -05:00
Stanislav Kinsbursky
9dd9845f08 nfsd: make NFSd service structure allocated per net
This patch makes main step in NFSd containerisation.

There could be different approaches to how to make NFSd able to handle
incoming RPC request from different network namespaces.  The two main
options are:

1) Share NFSd kthreads betwween all network namespaces.
2) Create separated pool of threads for each namespace.

While first approach looks more flexible, second one is simpler and
non-racy.  This patch implements the second option.

To make it possible to allocate separate pools of threads, we have to
make it possible to allocate separate NFSd service structures per net.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:39 -05:00
Stanislav Kinsbursky
b9c0ef8571 nfsd: make NFSd service boot time per-net
This is simple: an NFSd service can be started at different times in
different network environments. So, its "boot time" has to be assigned
per net.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:38 -05:00
Stanislav Kinsbursky
2c2fe2909e nfsd: per-net NFSd up flag introduced
This patch introduces introduces per-net "nfsd_net_up" boolean flag, which has
the same purpose as general "nfsd_up" flag - skip init or shutdown of per-net
resources in case of they are inited on shutted down respectively.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:37 -05:00
Stanislav Kinsbursky
6ff50b3dea nfsd: move per-net startup code to separated function
NFSd resources are partially per-net and partially globally used.
This patch splits resources init and shutdown and moves per-net code to
separated functions.
Generic and per-net init and shutdown are called sequentially for a while.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:36 -05:00
Stanislav Kinsbursky
3938a0d5eb nfsd: pass net to nfsd_set_nrthreads()
Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:35 -05:00
Stanislav Kinsbursky
d41a9417cd nfsd: pass net to nfsd_svc()
Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:34 -05:00
Stanislav Kinsbursky
6777436b0f nfsd: pass net to nfsd_create_serv()
Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:34 -05:00
Stanislav Kinsbursky
db42d1a76a nfsd: pass net to nfsd_startup() and nfsd_shutdown()
Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:33 -05:00
Stanislav Kinsbursky
db6e182c17 nfsd: pass net to nfsd_init_socks()
Precursor patch. Hard-coded "init_net" will be replaced by proper one in
future.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:32 -05:00
Stanislav Kinsbursky
f7fb86c6e6 nfsd: use "init_net" for portmapper
There could be a situation, when NFSd was started in one network namespace, but
stopped in another one.
This will trigger kernel panic, because RPCBIND client is stored on per-net
NFSd data, and will be NULL on NFSd shutdown.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:32 -05:00
Stanislav Kinsbursky
f252bc6806 nfsd: call state init and shutdown twice
Split NFSv4 state init and shutdown into two different calls: per-net one and
generic one.
Per-net cwinit/shutdown pair have to be called for any namespace, generic pair
- only once on NSFd kthreads start and shutdown respectively.

Refresh of diff-nfsd-call-state-init-twice

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-28 10:13:53 -05:00
J. Bruce Fields
57d276d71a nfsd: fix v4 reply caching
Very embarassing: 1091006c5e "nfsd: turn
on reply cache for NFSv4" missed a line, effectively leaving the reply
cache off in the v4 case.  I thought I'd tested that, but I guess not.

This time, wrote a pynfs test to confirm it works.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26 09:05:19 -05:00
J. Bruce Fields
5b444cc9a4 svcrpc: remove handling of unknown errors from svc_recv
svc_recv() returns only -EINTR or -EAGAIN.  If we really want to worry
about the case where it has a bug that causes it to return something
else, we could stick a WARN() in svc_recv.  But it's silly to require
every caller to have all this boilerplate to handle that case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:42:00 -04:00
J. Bruce Fields
38af2cabb6 nfsd: remove redundant "port" argument
"port" in all these functions is always NFS_PORT.

nfsd can already be run on a nonstandard port using the "nfsd/portlist"
interface.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:07:49 -04:00
Stanislav Kinsbursky
57c8b13e3c NFSd: set nfsd_serv to NULL after service destruction
In nfsd_destroy():

	if (destroy)
		svc_shutdown_net(nfsd_serv, net);
	svc_destroy(nfsd_server);

svc_shutdown_net(nfsd_serv, net) calls nfsd_last_thread(), which sets
nfsd_serv to NULL, causing a NULL dereference on the following line.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-25 09:21:31 -04:00
Stanislav Kinsbursky
19f7e2ca44 NFSd: introduce nfsd_destroy() helper
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-25 09:21:30 -04:00
Stanislav Kinsbursky
786185b5f8 SUNRPC: move per-net operations from svc_destroy()
The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:40 -04:00
Stanislav Kinsbursky
9793f7c889 SUNRPC: new svc_bind() routine introduced
This new routine is responsible for service registration in a specified
network context.

The idea is to separate service creation from per-net operations.

Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:39 -04:00
Stanislav Kinsbursky
b3853e0ea1 nfsd: make export cache allocated per network namespace context
This patch also changes prototypes of nfsd_export_flush() and exp_rootfh():
network namespace parameter added.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-04-12 09:11:11 -04:00
Stanislav Kinsbursky
e3f70eadb7 Lockd: pass network namespace to creation and destruction routines
v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().

These routines are called from locks reclaimer() kernel thread. This thread
works in "init_net" network context and currently relays on persence on lockd
thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay
on current network context. So let's pass corrent one into them.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-04-11 17:55:06 -04:00
J. Bruce Fields
1df00640c9 Merge nfs containerization work from Trond's tree
The nfs containerization work is a prerequisite for Jeff Layton's reboot
recovery rework.
2012-03-26 11:48:54 -04:00
J. Bruce Fields
508f922756 nfsd: fix default iosize calculation on 32bit
The rpc buffers will be allocated out of low memory, so we should really
only be taking that into account.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-03 15:33:17 -05:00
J. Bruce Fields
87b0fc7deb nfsd: cleanup setting of default max_block_size
Move calculation of the default into a helper function.

Get rid of an unused variable "err" while we're there.

Thanks to Mi Jinlong for catching an arithmetic error in a previous
version.

Cc: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-03 15:32:41 -05:00
Stanislav Kinsbursky
5ecebb7c7f SUNRPC: unregister service on creation in current network namespace
On service shutdown we can be sure, that no more users of it left except
current. Thus it looks like using current network namespace context is safe in
this case.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:14 -05:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Paul Gortmaker
143cb494cb fs: add module.h to files that were implicitly using it
Some files were using the complete module.h infrastructure without
actually including the header at all.  Fix them up in advance so
once the implicit presence is removed, we won't get failures like this:

  CC [M]  fs/nfsd/nfssvc.o
fs/nfsd/nfssvc.c: In function 'nfsd_create_serv':
fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function)
fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once
fs/nfsd/nfssvc.c:335: error: for each function it appears in.)
fs/nfsd/nfssvc.c: In function 'nfsd':
fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit'
make[3]: *** [fs/nfsd/nfssvc.o] Error 1

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:31 -04:00
Stanislav Kinsbursky
16d0587090 NFSd: call svc rpcbind cleanup explicitly
We have to call svc_rpcb_cleanup() explicitly from nfsd_last_thread() since
this function is registered as service shutdown callback and thus nobody else
will done it for us.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-25 13:19:40 +02:00
J. Bruce Fields
1091006c5e nfsd: turn on reply cache for NFSv4
It's sort of ridiculous that we've never had a working reply cache for
NFSv4.

On the other hand, we may still not: our current reply cache is likely
not very good, especially in the TCP case (which is the only case that
matters for v4).  What we really need here is some serious testing.

Anyway, here's a start.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-18 09:39:01 -04:00
NeilBrown
49b28684fd nfsd: Remove deprecated nfsctl system call and related code.
As promised in feature-removal-schedule.txt it is time to
remove the nfsctl system call.

Userspace has perferred to not use this call throughout 2.6 and it has been
excluded in the default configuration since 2.6.36 (9 months ago).

So this patch removes all the code that was being compiled out.

There are still references to sys_nfsctl in various arch systemcall tables
and related code.  These should be cleaned out too, probably in the next
merge window.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:42 -04:00
J. Bruce Fields
9e701c6109 svcrpc: simpler request dropping
Currently we use -EAGAIN returns to determine when to drop a deferred
request.  On its own, that is error-prone, as it makes us treat -EAGAIN
returns from other functions specially to prevent inadvertent dropping.

So, use a flag on the request instead.

Returning an error on request deferral is still required, to prevent
further processing, but we no longer need worry that an error return on
its own could result in a drop.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:22 -05:00
Pavel Emelyanov
fc5d00b04a sunrpc: Add net argument to svc_create_xprt
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01 17:18:54 -04:00
J. Bruce Fields
e844a7b980 nfsd: initialize nfsd versions before creating svc
Commit 59db4a0c10 "nfsd: move more into
nfsd_startup()" inadvertently moved nfsd_versions after
nfsd_create_svc().  On older distributions using an rpc.nfsd that does
not explicitly set the list of nfsd versions, this results in
svc-create_pooled() being called with an empty versions array.  The
resulting incomplete initialization leads to a NULL dereference in
svc_process_common() the first time a client accesses the server.

Move nfsd_reset_versions() back before the svc_create_pooled(); this
time, put it closer to the svc_create_pooled() call, to make this
mistake more difficult in the future.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 17:05:40 -04:00
J. Bruce Fields
774f8bbd9e nfsd: fix startup/shutdown order bug
We must create the server before we can call init_socks or check the
number of threads.

Symptoms were a NULL pointer dereference in nfsd_svc().  Problem
identified by Jeff Layton.

Also fix a minor cleanup-on-error case in nfsd_startup().

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 17:05:30 -04:00
J. Bruce Fields
af4718f3f9 nfsd: minor nfsd_svc() cleanup
More idiomatic to put the error case in the if clause.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:27 -04:00
J. Bruce Fields
59db4a0c10 nfsd: move more into nfsd_startup()
This is just cleanup--it's harmless to call nfsd_rachache_init,
nfsd_init_socks, and nfsd_reset_versions more than once.  But there's no
point to it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:26 -04:00
Jeff Layton
ac77efbe2b nfsd: just keep single lockd reference for nfsd
Right now, nfsd keeps a lockd reference for each socket that it has
open. This is unnecessary and complicates the error handling on
startup and shutdown. Change it to just do a lockd_up when starting
the first nfsd thread just do a single lockd_down when taking down the
last nfsd thread. Because of the strange way the sv_count is handled
this requires an extra flag to tell whether the nfsd_serv holds a
reference for lockd or not.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:26 -04:00
Jeff Layton
628b368728 nfsd: clean up nfsd_create_serv error handling
There doesn't seem to be any need to reset the nfssvc_boot time if the
nfsd startup failed.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:25 -04:00
Jeff Layton
4ad9a344be nfsd4: fix v4 state shutdown error paths
If someone tries to shut down the laundry_wq while it isn't up it'll
cause an oops.

This can happen because write_ports can create a nfsd_svc before we
really start the nfs server, and we may fail before the server is ever
started.

Also make sure state is shutdown on error paths in nfsd_svc().

Use a common global nfsd_up flag instead of nfs4_init, and create common
helper functions for nfsd start/shutdown, as there will be other work
that we want done only when we the number of nfsd threads transitions
between zero and nonzero.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-07-23 08:51:22 -04:00
Pavel Emelyanov
15ddb4aec5 NFSD: don't report compiled-out versions as present
The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether
the particular nfsd version is present/available. The problem is
that once I turn off e.g. NFSD-V4 this call returns -1 which is
true from the callers POV which is wrong.

The proposal is to report false in that case.

The bug has existed since 6658d3a7bb "[PATCH] knfsd: remove
nfsd_versbits as intermediate storage for desired versions".

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: stable@kernel.org
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-14 18:46:14 -04:00
J. Bruce Fields
7663dacd92 nfsd: remove pointless paths in file headers
The new .h files have paths at the top that are now out of date.  While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:47 -05:00
Boaz Harrosh
9a74af2133 nfsd: Move private headers to source directory
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:12 -05:00
Boaz Harrosh
341eb18446 nfsd: Source files #include cleanups
Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:09 -05:00
J. Bruce Fields
0a3adadee4 nfsd: make fs/nfsd/vfs.h for common includes
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-13 13:23:02 -05:00
Linus Torvalds
a87e84b5cd Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits)
  nfsd4: nfsv4 clients should cross mountpoints
  nfsd: revise 4.1 status documentation
  sunrpc/cache: avoid variable over-loading in cache_defer_req
  sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
  nfsd: return success for non-NFS4 nfs4_state_start
  nfsd41: Refactor create_client()
  nfsd41: modify nfsd4.1 backchannel to use new xprt class
  nfsd41: Backchannel: Implement cb_recall over NFSv4.1
  nfsd41: Backchannel: cb_sequence callback
  nfsd41: Backchannel: Setup sequence information
  nfsd41: Backchannel: Server backchannel RPC wait queue
  nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
  nfsd41: Backchannel: callback infrastructure
  nfsd4: use common rpc_cred for all callbacks
  nfsd4: allow nfs4 state startup to fail
  SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous
  nfsd4: fix null dereference creating nfsv4 callback client
  nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
  nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
  sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
  ...
2009-09-22 07:54:33 -07:00
J. Bruce Fields
29ab23cc5d nfsd4: allow nfs4 state startup to fail
The failure here is pretty unlikely, but we should handle it anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:33 -04:00
David Howells
e0e817392b CRED: Add some configurable debugging [try #6]
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management.  The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

	http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:01 +10:00
Andy Adamson
557ce2646e nfsd41: replace page based DRC with buffer based DRC
Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd
pages in cache.

Connectathon testing has shown that 1024 bytes for encoded compound operation
responses past the sequence operation is sufficient, 512 bytes is a little too
small. Set NFSD_SLOT_CACHE_SIZE to 1024.

Allocate memory for the session DRC in the CREATE_SESSION operation
to guarantee that the memory resource is available for caching responses.
Allocate each slot individually in preparation for slot table size negotiation.

Remove struct nfsd4_cache_entry and helper functions for the old page-based
DRC.

The iov_len calculation in nfs4svc_encode_compoundres is now always
correct.  Replay is now done in nfsd4_sequence under the state lock, so
the session ref count is only bumped on non-replay. Clean up the
nfs4svc_encode_compoundres session logic.

The nfsd4_compound_state statp pointer is also not used.
Remove nfsd4_set_statp().

Move useful nfsd4_cache_entry fields into nfsd4_slot.

Signed-off-by: Andy Adamson <andros@netapp.com
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:06 -04:00
Ryusei Yamaguchi
ed2d8aed52 knfsd: Replace lock_kernel with a mutex in nfsd pool stats.
lock_kernel() in knfsd was replaced with a mutex. The later
commit 03cf6c9f49 ("knfsd:
add file to export stats about nfsd pools") did not follow
that change. This patch fixes the issue.

Also move the get and put of nfsd_serv to the open and close methods
(instead of start and stop methods) to allow atomic check and increment
of reference count in the open method (where we can still return an
error).

Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com>
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Greg Banks <gnb@fmeh.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-08-25 12:39:37 -04:00