Commit Graph

4822 Commits

Author SHA1 Message Date
Trond Myklebust
1ebf980127 NFS/filelayout: Fix racy setting of fl->dsaddr in filelayout_check_deviceid()
We must set fl->dsaddr once, and once only, even if there are multiple
processes calling filelayout_check_deviceid() for the same layout
segment.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 14:08:45 -04:00
Trond Myklebust
ecbb903c56 NFS: Be more careful about mapping file permissions
When mapping a directory, we want the MAY_WRITE permissions to reflect
whether or not we have permission to modify, add and delete the directory
entries. MAY_EXEC must map to lookup permissions.

On the other hand, for files, we want MAY_WRITE to reflect a permission
to modify and extend the file.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
bd8b244174 NFS: Store the raw NFS access mask in the inode's access cache
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
eda3e20847 NFSv3: Convert nfs3_proc_access() to use nfs_access_set_mask()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Trond Myklebust
15d4b73ac2 NFS: Refactor NFS access to kernel access mask calculation
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 11:51:19 -04:00
Eryu Guan
ecc7b435d2 nfs: count correct array for mnt3_counts array size
Array size of mnt3_counts should be the size of array
mnt3_procedures, not mnt_procedures, though they're same in size
right now. Found this by code inspection.

Fixes: 1c5876ddbd ("sunrpc: move p_count out of struct rpc_procinfo")
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-21 08:49:57 -04:00
Trond Myklebust
213297369c Revert commit 722f0b8911 ("pNFS: Don't send COMMITs to the DSes if...")
Doing the test without taking any locks is racy, and so really it makes
more sense to do it in the flexfiles code (which is the only case that
cares).

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Trond Myklebust
4b75053e9b pNFS/flexfiles: Handle expired layout segments in ff_layout_initiate_commit()
If the layout has expired due to a fencing event, then we should not
attempt to commit to the DS.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Trond Myklebust
4118188645 NFS: Fix another COMMIT race in pNFS
We must make sure that cinfo->ds->ncommitting is in sync with the
commit list, since it is checked as part of pnfs_commit_list().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Trond Myklebust
e39928f942 NFS: Fix a COMMIT race in pNFS
We must make sure that cinfo->ds->nwritten is in sync with the
commit list, since it is checked as part of pnfs_scan_commit_lists().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Steve Dickson
89a6814d9b mount: copy the port field into the cloned nfs_server structure.
Doing this copy eliminates the "port=0" entry in
the /proc/mounts entries

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=69241

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-19 15:28:21 -04:00
Trond Myklebust
301bfa4830 NFS: Don't run wake_up_bit() when nobody is waiting...
"perf lock" shows fairly heavy contention for the bit waitqueue locks
when doing an I/O heavy workload.
Use a bit to tell whether or not there has been contention for a lock
so that we can optimise away the bit waitqueue options in those cases.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:57:18 -04:00
Peng Tao
00422483ad nfs: add export operations
This support for opening files on NFS by file handle, both through the
open_by_handle syscall, and for re-exporting NFS (for example using a
different version).  The support is very basic for now, as each open by
handle will have to do an NFSv4 open operation on the wire.  In the
future this will hopefully be mitigated by an open file cache, as well
as various optimizations in NFS for this specific case.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
[hch: incorporated various changes, resplit the patches, new changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 16:00:15 -04:00
Jeff Layton
5b5faaf6df nfs4: add NFSv4 LOOKUPP handlers
This will be needed in order to implement the get_parent export op
for nfsd.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:15 -04:00
Peng Tao
f174ff7a0a nfs: add a nfs_ilookup helper
This helper will allow to find an existing NFS inode by the file handle
and fattr.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
[hch: split from a larger patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:15 -04:00
Peng Tao
774d9513a3 nfs: replace d_add with d_splice_alias in atomic_open
It's a trival change but follows knfsd export document that asks
for d_splice_alias during lookup.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:14 -04:00
Olga Kornievskaia
1ee48bdd22 NFSv4.2 fix size storage for nfs42_proc_copy
Return size of COPY is u64 but it was assigned to an "int" status.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:14 -04:00
Chuck Lever
838edb9497 NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
Transparent State Migration copies a client's lease state from the
server where a filesystem used to reside to the server where it now
resides. When an NFSv4.1 client first contacts that destination
server, it uses EXCHANGE_ID to detect trunking relationships.

The lease that was copied there is returned to that client, but the
destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
the client. This is because the lease was confirmed on the source
server (before it was copied).

When CONFIRMED_R is set, the client throws away the sequence ID
returned by the server. During a Transparent State Migration, however
there's no other way for the client to know what sequence ID to use
with a lease that's been migrated.

Therefore, the client must save and use the contrived slot sequence
value returned by the destination server even when CONFIRMED_R is
set.

Note that some servers always return a seqid of 1 after a migration.

Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:12 -04:00
Chuck Lever
8dcbec6d20 NFSv4.1: Handle EXCHGID4_FLAG_CONFIRMED_R during NFSv4.1 migration
Transparent State Migration copies a client's lease state from the
server where a filesystem used to reside to the server where it now
resides. When an NFSv4.1 client first contacts that destination
server, it uses EXCHANGE_ID to detect trunking relationships.

The lease that was copied there is returned to that client, but the
destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
the client. This is because the lease was confirmed on the source
server (before it was copied).

Normally, when CONFIRMED_R is set, a client purges the lease and
creates a new one. However, that throws away the entire benefit of
Transparent State Migration.

Therefore, the client must not purge that lease when it is possible
that Transparent State Migration has occurred.

Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:12 -04:00
NeilBrown
26fde4dfcb NFS: check for nfs_refresh_inode() errors in nfs_fhget()
If an NFS server returns a filehandle that we have previously
seen, and reports a different type, then nfs_refresh_inode()
will log a warning and return an error.

nfs_fhget() does not check for this error and may return an
inode with a different type than the one that the server
reported.

This is likely to cause confusion, and is one way that
->open_context() could return a directory inode as discussed
in the previous patch.

So if nfs_refresh_inode() returns and error, return that error
from nfs_fhget() to avoid the confusion propagating.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:09 -04:00
NeilBrown
eaa2b82c3b NFS: guard against confused server in nfs_atomic_open()
A confused server could return a filehandle for an
NFSv4 OPEN request, which it previously returned for a directory.
So the inode returned by  ->open_context() in nfs_atomic_open()
could conceivably be a directory inode.

This has particular implications for the call to
nfs_file_set_open_context() in nfs_finish_open().
If that is called on a directory inode, then the nfs_open_context
that gets stored in the filp->private_data will be linked to
nfs_inode->open_files.

When the directory is closed, nfs_closedir() will (ultimately)
free the ->private_data, but not unlink it from nfs_inode->open_files
(because it doesn't expect an nfs_open_context there).

Subsequently the memory could get used for something else and eventually
if the ->open_files list is walked, the walker will fall off the end and
crash.

So: change nfs_finish_open() to only call nfs_file_set_open_context()
for regular-file inodes.

This failure mode has been seen in a production setting (unknown NFS
server implementation).  The kernel was v3.0 and the specific sequence
seen would not affect more recent kernels, but I think a risk is still
present, and caution is wise.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:08 -04:00
NeilBrown
cc89684c9a NFS: only invalidate dentrys that are clearly invalid.
Since commit bafc9b754f ("vfs: More precise tests in d_invalidate")
in v3.18, a return of '0' from ->d_revalidate() will cause the dentry
to be invalidated even if it has filesystems mounted on or it or on a
descendant.  The mounted filesystem is unmounted.

This means we need to be careful not to return 0 unless the directory
referred to truly is invalid.  So -ESTALE or -ENOENT should invalidate
the directory.  Other errors such a -EPERM or -ERESTARTSYS should be
returned from ->d_revalidate() so they are propagated to the caller.

A particular problem can be demonstrated by:

1/ mount an NFS filesystem using NFSv3 on /mnt
2/ mount any other filesystem on /mnt/foo
3/ ls /mnt/foo
4/ turn off network, or otherwise make the server unable to respond
5/ ls /mnt/foo &
6/ cat /proc/$!/stack # note that nfs_lookup_revalidate is in the call stack
7/ kill -9 $! # this results in -ERESTARTSYS being returned
8/ observe that /mnt/foo has been unmounted.

This patch changes nfs_lookup_revalidate() to only treat
  -ESTALE from nfs_lookup_verify_inode() and
  -ESTALE or -ENOENT from ->lookup()
as indicating an invalid inode.  Other errors are returned.

Also nfs_check_inode_attributes() is changed to return -ESTALE rather
than -EIO.  This is consistent with the error returned in similar
circumstances from nfs_update_inode().

As this bug allows any user to unmount a filesystem mounted on an NFS
filesystem, this fix is suitable for stable kernels.

Fixes: bafc9b754f ("vfs: More precise tests in d_invalidate")
Cc: stable@vger.kernel.org (v3.18+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:08 -04:00
Olga Kornievskaia
22368ff11d PNFS for stateid errors retry against MDS first
Upon receiving a stateid error such as BAD_STATEID, the client
should retry the operation against the MDS before deciding to
do stateid recovery.

Previously, the code would initiate state recovery and it could
lead to a race in a state manager that could chose an incorrect
recovery method which would lead to the EIO failure for the
application.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:08 -04:00
Olga Kornievskaia
a0bc01e0f1 PNFS fix EACCESS on commit to DS handling
Commit fabbbee0eb "PNFS fix fallback to MDS if got error on
commit to DS" moved the pnfs_set_lo_fail() to unhandled errors
which was not correct and lead to a kernel oops on umount.

Instead, fix the original EACCESS on commit to DS error by
getting the new layout and re-doing the IO.

Fixes: fabbbee0eb ("PNFS fix fallback to MDS if got error on commit to DS")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Cc: stable@vger.kernel.org # v4.12
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:59:57 -04:00
Dan Carpenter
4cd1ec95bd NFS: silence a uninitialized variable warning
Static checkers have gotten clever enough to complain that "id_long" is
uninitialized on the failure path.  It's harmless, but simple to fix.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:28 -04:00
Tuo Chen Peng
ce85bd2921 nfs: Fix fscache stat printing in nfs_show_stats()
nfs_show_stats() was incorrectly reading statistics for bytes when printing that
for fsc. It caused files like /proc/self/mountstats to report incorrect fsc
statistics for NFS mounts.

Signed-off-by: Tuo Chen Peng <tpeng@nvidia.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:27 -04:00
Benjamin Coddington
2eb3aea7d9 NFS: Fix initialization of nfs_page_array->npages
Commit 8ef9b0b9e1 open-coded nfs_pgarray_set(), and left out the
initialization of the nfs_page_array's npages.  This mistake didn't show up
until testing with block layouts, and there shows that all pNFS reads
return -EIO.

Fixes: 8ef9b0b9e1 ("NFS: move nfs_pgarray_set() to open code")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # 4.12
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:06 -04:00
Trond Myklebust
1a4edf0f46 NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
Now that the writes will schedule a commit on their own, we don't
need nfs_write_inode() to schedule one if there are outstanding
writes, and we're being called in non-blocking mode.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:05 -04:00
Trond Myklebust
919e3bd9a8 NFS: Ensure we commit after writeback is complete
If the page cache is being flushed, then we want to ensure that we
do start a commit once the pages are done being flushed.
If we just wait until all I/O is done to that file, we can end up
livelocking until the balance_dirty_pages() mechanism puts its
foot down and forces I/O to stop.
So instead we do more or less the same thing that O_DIRECT does,
and set up a counter to tell us when the flush is done,

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:05 -04:00
Trond Myklebust
b5973a8c1c NFS: Remove unused fields in the page I/O structures
Remove the 'layout_private' fields that were only used by the pNFS OSD
layout driver.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:05 -04:00
Benjamin Coddington
818a8dbe83 NFS: nfs_rename() - revalidate directories on -ERESTARTSYS
An interrupted rename will leave the old dentry behind if the rename
succeeds.  Fix this by forcing a lookup the next time through
->d_revalidate.

A previous attempt at solving this problem took the approach to complete
the work of the rename asynchronously, however that approach was wrong
since it would allow the d_move() to occur after the directory's i_mutex
had been dropped by the original process.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:04 -04:00
Benjamin Coddington
a7a3b1e971 NFS: convert flags to bool
NFS uses some int, and unsigned int :1, and bool as flags in structs and
args.  Assert the preference for uniformly replacing these with the bool
type.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:04 -04:00
Anna Schumaker
18fe6a23e3 NFS: Set FATTR4_WORD0_TYPE for . and .. entries
The current code worked okay for getdents(), but getdents64() expects
the d_type field to get filled out properly in the stat structure.
Setting this field fixes xfstests generic/401.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 15:58:03 -04:00
Christoph Hellwig
aa8217d5dc sunrpc: mark all struct svc_version instances as const
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:58:03 -04:00
Christoph Hellwig
b9c744c19c sunrpc: mark all struct svc_procinfo instances as const
struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:58:02 -04:00
Christoph Hellwig
0becc1181c sunrpc: move pc_count out of struct svc_procinfo
pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:58:02 -04:00
Christoph Hellwig
d16d186721 sunrpc: properly type pc_encode callbacks
Drop the resp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:58:00 -04:00
Christoph Hellwig
cc6acc20a6 sunrpc: properly type pc_decode callbacks
Drop the argp argument as it can trivially be derived from the rqstp
argument.  With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:58:00 -04:00
Christoph Hellwig
1c8a5409f3 sunrpc: properly type pc_func callbacks
Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument.  With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:57:59 -04:00
Christoph Hellwig
511e936bf2 sunrpc: mark all struct rpc_procinfo instances as const
struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:57:57 -04:00
Christoph Hellwig
9ae7d8ff29 nfs: use ARRAY_SIZE() in the nfsacl_version3 declaration
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:57:57 -04:00
Christoph Hellwig
c551858a88 sunrpc: move p_count out of struct rpc_procinfo
p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-13 15:57:57 -04:00
Christoph Hellwig
947c6e431d nfs: don't cast callback decode/proc/encode routines
Instead declare all functions with the proper methods signature.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:57:56 -04:00
Christoph Hellwig
fc016483eb nfs: fix decoder callback prototypes
Declare the p_decode callbacks with the proper prototype instead of
casting to kxdrdproc_t and losing all type safety.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:57:56 -04:00
Christoph Hellwig
fcc85819ee nfs: fix encoder callback prototypes
Declare the p_encode callbacks with the proper prototype instead of
casting to kxdreproc_t and losing all type safety.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13 15:57:53 -04:00
Linus Torvalds
125f42b0e2 NFS client bugfixes for Linux 4.12
Bugfixes include:
 
 - Fix a typo in commit e092693443 that breaks copy offload
 - Fix the connect error propagation in xs_tcp_setup_socket()
 - Fix a lock leak in nfs40_walk_client_list
 - Verify that pNFS requests lie within the offset range of the layout segment.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZM0YBAAoJEGcL54qWCgDyLUwQALaPEVp00UMdDR0in7MIFKsO
 2mgi7pOyn6po3EjxKbGtjAbL4nSlVxdaFpCIGg47YXrl9/95Zjjmyke+iwRdnMsa
 ZPyXwfhVRa80fxbOogAverNCnCptoHoG7EzdWuCTcOOxMxR3Ixs7wVJXrs+7ig+r
 IdvIAyTsiDYuP5yVp5KkmJCtLGc0Ze20rb7VgdQJfdiLibWvfYCLZ9CgfAQkdAMU
 RIlbT0/BG13XDqwh/C2V1vLge0VfpT5p8qbIb/kFyQ0ZJUUiicGGGjp3u/yj0aG9
 ljldI34WmQpsy+nCNN4dEgsF461ECvWLwRZnnpN9nv7VurUBpJNUqHLnubvDbzhh
 w8QX54ceEWuQAjg96keNuYOhoG53Omle2/Cm+nmiJOmShJbJ0yh4OcB9DYe0gdYa
 5YXbKRjPvf/HfdE7PPpvbPG2E211zfvkLdHnFxswggWyGrh23kqlWrpcHpZomGNW
 GbJLfIfhyEfBjCPdNJT3Tzvewo2LkcTNLb+3mJhkxOegkdops8vGYA9G2mba3Daj
 1HWl1yFAdzlEf2H1Cb8Y2ZrJKHAmaYBKBkKZYUeAcr6EtoxNqnNMP+PEDcVIzPKg
 6Jq7DiYwYksK+XDWK9G4QBguKKGLvYtv0MIA3QDX+bBGLo+eFYxc2iaaJefYNdkK
 +vdLHclg/YpepLg+Ui21
 =P2bm
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Bugfixes include:

   - Fix a typo in commit e092693443 ("NFS append COMMIT after
     synchronous COPY") that breaks copy offload

   - Fix the connect error propagation in xs_tcp_setup_socket()

   - Fix a lock leak in nfs40_walk_client_list

   - Verify that pNFS requests lie within the offset range of the layout
     segment"

* tag 'nfs-for-4.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Mark unnecessarily extern functions as static
  SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()
  NFSv4.0: Fix a lock leak in nfs40_walk_client_list
  pnfs: Fix the check for requests in range of layout segment
  xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup()
  pNFS/flexfiles: missing error code in ff_layout_alloc_lseg()
  NFS fix COMMIT after COPY
2017-06-04 11:56:53 -07:00
Jan Kara
4f253e1eb6 nfs: Mark unnecessarily extern functions as static
nfs_initialise_sb() and nfs_clone_super() are declared as extern even
though they are used only in fs/nfs/super.c. Mark them as static.

Also remove explicit 'inline' directive from nfs_initialise_sb() and
leave it upto compiler to decide whether inlining is worth it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-06-03 16:06:38 -04:00
Kees Cook
fe3b81b446 NFS: Use ERR_CAST() to avoid cross-structure cast
When the call to nfs_devname() fails, the error path attempts to retain
the error via the mnt variable, but this requires a cast across very
different types (char * to struct vfsmount *), which the upcoming
structure layout randomization plugin flags as being potentially
dangerous in the face of randomization. This is a false positive, but
what this code actually wants to do is retain the error value, so this
patch explicitly sets it, instead of using what seems to be an
unexpected cast.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-05-28 10:11:47 -07:00
Trond Myklebust
b49c15f97c NFSv4.0: Fix a lock leak in nfs40_walk_client_list
Xiaolong Ye's kernel test robot detected the following Oops:
[  299.158991] BUG: scheduling while atomic: mount.nfs/9387/0x00000002
[  299.169587] 2 locks held by mount.nfs/9387:
[  299.176165]  #0:  (nfs_clid_init_mutex){......}, at: [<ffffffff8130cc92>] nfs4_discover_server_trunking+0x47/0x1fc
[  299.201802]  #1:  (&(&nn->nfs_client_lock)->rlock){......}, at: [<ffffffff813125fa>] nfs40_walk_client_list+0x2e9/0x338
[  299.221979] CPU: 0 PID: 9387 Comm: mount.nfs Not tainted 4.11.0-rc7-00021-g14d1bbb #45
[  299.235584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[  299.251176] Call Trace:
[  299.255192]  dump_stack+0x61/0x7e
[  299.260416]  __schedule_bug+0x65/0x74
[  299.266208]  __schedule+0x5d/0x87c
[  299.271883]  schedule+0x89/0x9a
[  299.276937]  schedule_timeout+0x232/0x289
[  299.283223]  ? detach_if_pending+0x10b/0x10b
[  299.289935]  schedule_timeout_uninterruptible+0x2a/0x2c
[  299.298266]  ? put_rpccred+0x3e/0x115
[  299.304327]  ? schedule_timeout_uninterruptible+0x2a/0x2c
[  299.312851]  msleep+0x1e/0x22
[  299.317612]  nfs4_discover_server_trunking+0x102/0x1fc
[  299.325644]  nfs4_init_client+0x13f/0x194

It looks as if we recently added a spin_lock() leak to
nfs40_walk_client_list() when cleaning up the code.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Fixes: 14d1bbb0ca ("NFS: Create a common nfs4_match_client() function")
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-24 08:05:16 -04:00
Benjamin Coddington
08cb5b0f05 pnfs: Fix the check for requests in range of layout segment
It's possible and acceptable for NFS to attempt to add requests beyond the
range of the current pgio->pg_lseg, a case which should be caught and
limited by the pg_test operation.  However, the current handling of this
case replaces pgio->pg_lseg with a new layout segment (after a WARN) within
that pg_test operation.  That will cause all the previously added requests
to be submitted with this new layout segment, which may not be valid for
those requests.

Fix this problem by only returning zero for the number of bytes to coalesce
from pg_test for this case which allows any previously added requests to
complete on the current layout segment.  The check for requests starting
out of range of the layout segment moves to pg_init, so that the
replacement of pgio->pg_lseg will be done when the next request is added.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-05-24 07:55:02 -04:00