Exchange ID can be called in a lease reclaim situation, so it
will deadlock if it then tries to write out dirty NFS pages.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds the BIND_CONN_TO_SESSION operation which is needed for
upcoming SP4_MACH_CRED work and useful for recovering from broken connections
without destroying the session.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that a process that uses the nfs_client->cl_cons_state test
for whether the initialisation process is finished does not read
stale data.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Session initialisation is not complete until the lease manager
has run. We need to ensure that both nfs4_init_session and
nfs4_init_ds_session do so, and that they check for any resulting
errors in clp->cl_cons_state.
Only after this is done, can nfs4_ds_connect check the contents
of clp->cl_exchange_flags.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
Save the server major and minor ID results from EXCHANGE_ID, as they
are needed for detecting server trunking.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently our NFS client assigns a unique SETCLIENTID boot verifier
for each server IP address it knows about. It's set to CURRENT_TIME
when the struct nfs_client for that server IP is created.
During the SETCLIENTID operation, our client also presents an
nfs_client_id4 string to servers, as an identifier on which the server
can hang all of this client's NFSv4 state. Our client's
nfs_client_id4 string is unique for each server IP address.
An NFSv4 server is obligated to wipe all NFSv4 state associated with
an nfs_client_id4 string when the client presents the same
nfs_client_id4 string along with a changed SETCLIENTID boot verifier.
When our client unmounts the last of a server's shares, it destroys
that server's struct nfs_client. The next time the client mounts that
NFS server, it creates a fresh struct nfs_client with a fresh boot
verifier. On seeing the fresh verifer, the server wipes any previous
NFSv4 state associated with that nfs_client_id4.
However, NFSv4.1 clients are supposed to present the same
nfs_client_id4 string to all servers. And, to support Transparent
State Migration, the same nfs_client_id4 string should be presented
to all NFSv4.0 servers so they recognize that migrated state for this
client belongs with state a server may already have for this client.
(This is known as the Uniform Client String model).
If the nfs_client_id4 string is the same but the boot verifier changes
for each server IP address, SETCLIENTID and EXCHANGE_ID operations
from such a client could unintentionally result in a server wiping a
client's previously obtained lease.
Thus, if our NFS client is going to use a fixed nfs_client_id4 string,
either for NFSv4.0 or NFSv4.1 mounts, our NFS client should use a
boot verifier that does not change depending on server IP address.
Replace our current per-nfs_client boot verifier with a per-nfs_net
boot verifier.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
nfs4_reset_all_state() refreshes the boot verifier a server sees to
trigger that server to wipe this client's state. This function is
invoked when an NFSv4.1 server reports that it has revoked some or
all of a client's NFSv4 state.
To facilitate server trunking discovery, we will eventually want to
move the cl_boot_time field to a more global structure. The Uniform
Client String model (and specifically, server trunking detection)
requires that all servers see the same boot verifier until the client
actually does reboot, and not a fresh verifier every time the client
unmounts and remounts the server.
Without the cl_boot_time field, however, nfs4_reset_all_state() will
have to find some other way to force the server to purge the client's
NFSv4 state.
Because these verifiers are opaque (ie, the server doesn't know or
care that they happen to be timestamps), we can force the server
to wipe NFSv4 state by updating the boot verifier as we do now, then
immediately afterwards establish a fresh client ID using the old boot
verifier again.
Hopefully there are no extra paranoid server implementations that keep
track of the client's boot verifiers and prevent clients from reusing
a previous one.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: update to use matching types in "if" expressions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: When naming fields and data types, follow established
conventions to facilitate accurate grep/cscope searches.
Additionally, for consistency, move the impl_id field into the NFSv4-
specific part of the nfs_client, and free that memory in the logic
that shuts down NFSv4 nfs_clients.
Introduced by commit 7d2ed9ac "NFSv4: parse and display server
implementation ids," Fri Feb 17, 2012.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: When naming fields and data types, follow established
conventions to facilitate accurate grep/cscope searches.
Additionally, for consistency, move the scope field into the NFSv4-
specific part of the nfs_client, and free that memory in the logic
that shuts down NFSv4 nfs_clients.
Introduced by commit 99fe60d0 "nfs41: exchange_id operation", April
1 2009.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The SETCLIENTID boot verifier is opaque to NFSv4 servers, thus there
is no requirement for byte swapping before the client puts the
verifier on the wire.
This treatment is similar to other timestamp-based verifiers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Replaced by filelayout_reset_write and filelayout_reset_read
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch splits out the NFS v4 specific functionality of
nfs4_get_root() into its own rpc_op called by the generic client, and
leaves nfs4_proc_get_rootfh() as its own stand alone function. This
also allows me to change nfs4_remote_mount(), nfs4_xdev_mount() and
nfs4_remote_referral_mount() to use the generic client's nfs_get_root()
function. Later patches in this series will collapse these functions
into one common function, so using the same get_root() function
everywhere simplifies future changes.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This function is really getting the root filehandle and not the root
dentry of the filesystem. I also removed the rpc_ops lookup from
nfs4_get_rootfh() under the assumption that if we reach this function
then we already know we are using NFS v4.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
No attributes are supposed to change during a COMMIT call, so there
is no need to request post-op attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We don't need cache consistency information when we're doing O_DIRECT
writes. Ditto for the case of delegated writes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.
The cost is that if we later need to stat() the directory, then we
know that the ctime and mtime are likely to be invalid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In order to retrieve cache consistency attributes before
anyone else has a chance to change the inode, we need to
put the GETATTR op _before_ the DELEGRETURN op.
We can then use that as part of a 'nfs_post_op_update_inode_force_wcc()'
call, to ensure that we update the attributes without clearing our
cached data.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In order to do close-to-open cache consistency checking after
a delegreturn, we don't need to retrieve the full set of
attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need to use the hostname of the process that created the nfs_client.
That hostname is now stored in the rpc_client->cl_nodename.
Also remove the utsname()->domainname component. There is no reason
to include the NIS/YP domainname in a client identifier string.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that I'm doing secinfo automatically in the v4 code this extra
argument isn't needed.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This simplifies the code for v2 and v3 and gives v4 a chance to decide
on referrals without needing to modify the generic client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Decouple nfs_pgio_header and nfs_write_data, and have (possibly
multiple) nfs_write_datas each take a refcount on nfs_pgio_header.
For the moment keeps nfs_write_header as a way to preallocate a single
nfs_write_data with the nfs_pgio_header. The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.
This also fixes bug in pnfs_ld_handle_write_error. In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
replay attempt to do nothing.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Decouple nfs_pgio_header and nfs_read_data, and have (possibly
multiple) nfs_read_datas each take a refcount on nfs_pgio_header.
For the moment keeps nfs_read_header as a way to preallocate a single
nfs_read_data with the nfs_pgio_header. The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.
This also fixes bug in pnfs_ld_handle_read_error. In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
replay attempt to do nothing.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In order to avoid duplicating all the data in nfs_read_data whenever we
split it up into multiple RPC calls (either due to a short read result
or due to rsize < PAGE_SIZE), we split out the bits that are the same
per RPC call into a separate "header" structure.
The goal this patch moves towards is to have a single header
refcounted by several rpc_data structures. Thus, want to always refer
from rpc_data to the header, and not the other way. This patch comes
close to that ideal, but the directio code currently needs some
special casing, isolated in the nfs_direct_[read_write]hdr_release()
functions. This will be dealt with in a future patch.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commits don't need the vectors of pages, etc. that writes do. Split out
a separate structure for the commit operation.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
I create a new proc_lookup_mountpoint() to use when submounting an NFS
v4 share. This function returns an rpc_clnt to use for performing an
fs_locations() call on a referral's mountpoint.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Whenever lookup sees wrongsec do a secinfo and retry the lookup to find
attributes of the file or directory, such as "is this a referral
mountpoint?". This also allows me to remove handling -NFS4ERR_WRONSEC
as part of getattr xdr decoding.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We don't want to return -NFS4ERR_WRONGSEC to the VFS because it could
cause the kernel to oops.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When attempting to cache ACLs returned from the server, if the bitmap
size + the ACL size is greater than a PAGE_SIZE but the ACL size itself
is smaller than a PAGE_SIZE, we can read past the buffer page boundary.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bug noticed in commit
bf118a342f
When calling GETACL, if the size of the bitmap array, the length
attribute and the acl returned by the server is greater than the
allocated buffer(args.acl_len), we can Oops with a General Protection
fault at _copy_from_pages() when we attempt to read past the pages
allocated.
This patch allocates an extra PAGE for the bitmap and checks to see that
the bitmap + attribute_length + ACLs don't exceed the buffer space
allocated to it.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
[Trond: Fixed a size_t vs unsigned int printk() warning]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 spec is ambiguous about whether or not it is permissible
to reuse open owner names, so play it safe. This patch adds a timestamp
to the state_owner structure, and combines that with the IDA based
uniquifier.
Fixes a regression whereby the Linux server returns NFS4ERR_BAD_SEQID.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the file wasn't opened for writing, then truncate and ftruncate
need to report the appropriate errors.
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Since we may be simulating flock() locks using NFS byte range locks,
we can't rely on the VFS having checked the file open mode for us.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
All callers of nfs4_handle_exception() that need to handle
NFS4ERR_OPENMODE correctly should set exception->inode
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Firstly, task->tk_status will always return negative error values,
so the current tests for 'NFS4ERR_DELEG_REVOKED' etc. are all being
ignored.
Secondly, clean up the code so that we only need to test
task->tk_status once!
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
We can currently loop forever in nfs4_lookup_root() and in
nfs41_proc_secinfo_no_name(), if the first iteration returns a
NFS4ERR_DELAY or something else that causes exception.retry to get
set.
Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
_copy_from_pages() used to copy data from the temporary buffer to the
user passed buffer is passed the wrong size parameter when copying
data. res.acl_len contains both the bitmap and acl lenghts while
acl_len contains the acl length after adjusting for the bitmap size.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from
which the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow concurrent
access to idmapper entries. Start the process of migrating users from
the single-threaded daemon-based approach to the multi-threaded
request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each other
for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve debugging
of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
VBm+EI00Pvmm5+fUjIlp
=LuHE
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates for Linux 3.4 from Trond Myklebust:
"New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from which
the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow
concurrent access to idmapper entries. Start the process of
migrating users from the single-threaded daemon-based approach to
the multi-threaded request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each
other for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve
debugging of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
fail to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery)."
* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
NFS: fix sb->s_id in nfs debug prints
xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
xprtrdma: The transport should not bug-check when a dup reply is received
pnfs-obj: autologin: Add support for protocol autologin
NFS: Remove nfs4_setup_sequence from generic rename code
NFS: Remove nfs4_setup_sequence from generic unlink code
NFS: Remove nfs4_setup_sequence from generic read code
NFS: Remove nfs4_setup_sequence from generic write code
NFS: Fix more NFS debug related build warnings
SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
nfs: non void functions must return a value
SUNRPC: Kill compiler warning when RPC_DEBUG is unset
SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
NFS: Use cond_resched_lock() to reduce latencies in the commit scans
NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
NFS: ncommit count is being double decremented
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
Try using machine credentials for RENEW calls
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
...
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
It is quite possible for the release_lockowner RPC call to race with the
close RPC call, in which case, we cannot dereference lsp->ls_state in
order to find the nfs_server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>