NFSD 6.7 Release Notes

Merge tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release completes the SunRPC thread scheduler work that was begun
  in v6.6. The scheduler can now find an svc thread to wake in constant
  time and without a list walk. Thanks again to Neil Brown for this
  overhaul.
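
   To picture the mechanism: an idle svc thread parks itself on a lock-less
   list, and waking one is a single pop rather than a scan over the pool. A
   minimal illustrative sketch, not the actual SunRPC code (the idle_thread
   type and wake_one_idle() are invented names):

#include <linux/llist.h>
#include <linux/sched.h>

struct idle_thread {
	struct llist_node node;
	struct task_struct *task;
};

static LLIST_HEAD(idle_threads);

/* O(1) wake-up: pop one parked thread; no walk over all pool threads.
 * (Consumer-side serialization, which llist_del_first() requires, is
 * omitted for brevity.) */
static void wake_one_idle(void)
{
	struct llist_node *n = llist_del_first(&idle_threads);

	if (n)
		wake_up_process(llist_entry(n, struct idle_thread, node)->task);
}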

  Lorenzo Bianconi contributed infrastructure for a netlink-based NFSD
  control plane. The long-term plan is to provide the same functionality
  as found in /proc/fs/nfsd, plus some interesting additions, and then
  migrate the NFSD user space utilities to netlink.
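
   For a sense of how the new interface might be consumed from user space,
   here is a rough sketch using libnl-genl to issue the rpc-status-get dump
   added below; error handling and reply parsing are omitted, and the
   in-tree YNL CLI (tools/net/ynl/cli.py) can drive the same spec:

/* Hedged sketch: dump pending NFSD RPCs over generic netlink.
 * Compile against libnl-3/libnl-genl-3. */
#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>
#include <linux/nfsd_netlink.h>

int main(void)
{
	struct nl_sock *sk = nl_socket_alloc();
	struct nl_msg *msg = nlmsg_alloc();
	int family;

	genl_connect(sk);
	family = genl_ctrl_resolve(sk, NFSD_FAMILY_NAME);

	/* request a dump of all in-flight NFSD RPCs */
	genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, family, 0,
		    NLM_F_REQUEST | NLM_F_DUMP,
		    NFSD_CMD_RPC_STATUS_GET, NFSD_FAMILY_VERSION);
	nl_send_auto(sk, msg);
	nl_recvmsgs_default(sk);	/* install a callback to parse attrs */

	nlmsg_free(msg);
	nl_socket_free(sk);
	return 0;
}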

  A long series to overhaul NFSD's NFSv4 operation encoding was applied
  in this release. The goals are to bring this family of encoding
  functions in line with the matching NFSv4 decoding functions and with
  the NFSv2 and NFSv3 XDR functions, preparing the way for better memory
  safety and maintainability.
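
   The shape of the change is visible in the diffs below: encoders now
   mirror their decoders, taking an xdr_stream plus a const-qualified
   argument struct and using the bounds-checked xdr_stream helpers. An
   illustrative fragment of the pattern (example_res is an invented type,
   not an NFSD structure):

static __be32 nfsd4_encode_example(struct xdr_stream *xdr,
				   const struct example_res *resp)
{
	/* xdr_stream_encode_u32() fails cleanly when the buffer is full */
	if (xdr_stream_encode_u32(xdr, resp->value) < 0)
		return nfserr_resource;
	return nfs_ok;
}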

  A further improvement to NFSD's write delegation support was
  contributed by Dai Ngo. This adds a CB_GETATTR callback, enabling the
  server to retrieve cached size and mtime data from clients holding
  write delegations. If the server can retrieve this information, it
  does not have to recall the delegation in some cases.
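
   The resulting decision is easy to sketch; the real logic lives in
   nfsd4_deleg_getattr_conflict() in the fs/nfsd/nfs4state.c hunks below,
   and getattr_forces_recall() here is an invented name:

/* After CB_GETATTR completes, does the conflicting GETATTR still
 * force a recall of the write delegation? */
static bool getattr_forces_recall(const struct nfs4_cb_fattr *ncf)
{
	/* the callback failed: fall back to recalling the delegation */
	if (ncf->ncf_cb_status)
		return true;
	/* otherwise answer the GETATTR using the client's cached change
	 * attribute and size -- no recall is needed */
	return false;
}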

  The usual panoply of bug fixes and minor improvements rounds out this
  release. As always I am grateful to all contributors, reviewers, and
  testers"

* tag 'nfsd-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (127 commits)
  svcrdma: Fix tracepoint printk format
  svcrdma: Drop connection after an RDMA Read error
  NFSD: clean up alloc_init_deleg()
  NFSD: Fix frame size warning in svc_export_parse()
  NFSD: Rewrite synopsis of nfsd_percpu_counters_init()
  nfsd: Clean up errors in nfs3proc.c
  nfsd: Clean up errors in nfs4state.c
  NFSD: Clean up errors in stats.c
  NFSD: simplify error paths in nfsd_svc()
  NFSD: Clean up nfsd4_encode_seek()
  NFSD: Clean up nfsd4_encode_offset_status()
  NFSD: Clean up nfsd4_encode_copy_notify()
  NFSD: Clean up nfsd4_encode_copy()
  NFSD: Clean up nfsd4_encode_test_stateid()
  NFSD: Clean up nfsd4_encode_exchange_id()
  NFSD: Clean up nfsd4_do_encode_secinfo()
  NFSD: Clean up nfsd4_encode_access()
  NFSD: Clean up nfsd4_encode_readdir()
  NFSD: Clean up nfsd4_encode_entry4()
  NFSD: Add an nfsd4_encode_nfs_cookie4() helper
  ...
Committed by Linus Torvalds on 2023-10-30 10:12:29 -10:00
commit 8b16da681e
61 changed files with 3531 additions and 1744 deletions


@ -241,3 +241,10 @@ following flags are defined:
all of an inode's dirty data on last close. Exports that behave this
way should set EXPORT_OP_FLUSH_ON_CLOSE so that NFSD knows to skip
waiting for writeback when closing such files.
EXPORT_OP_ASYNC_LOCK - Indicates a filesystem capable of handling async lock
requests from lockd. Only set EXPORT_OP_ASYNC_LOCK if the filesystem has
its own ->lock() functionality, as the core posix_lock_file() implementation
has no async lock request handling yet. For more information about how to
indicate an async lock request from a ->lock() file_operations struct, see
fs/locks.c and the comment for the function vfs_lock_file().
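
For illustration, a filesystem meeting these conditions might wire things up
as follows; this is a hypothetical sketch, and the myfs_* names are invented:

/* Assumed helper that queues the request and later calls lm_grant() */
static void myfs_queue_lock(struct file *filp, struct file_lock *fl);

static int myfs_lock(struct file *filp, int cmd, struct file_lock *fl)
{
	if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
		/* defer: completion is signalled via ->lm_grant() */
		myfs_queue_lock(filp, fl);
		return FILE_LOCK_DEFERRED;
	}
	return posix_lock_file(filp, fl, NULL);
}

static const struct export_operations myfs_export_ops = {
	.fh_to_dentry	= myfs_fh_to_dentry,	/* assumed helper */
	.flags		= EXPORT_OP_ASYNC_LOCK,
};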


@ -0,0 +1,89 @@
# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
name: nfsd
protocol: genetlink
uapi-header: linux/nfsd_netlink.h
doc: NFSD configuration over generic netlink.
attribute-sets:
-
name: rpc-status
attributes:
-
name: xid
type: u32
byte-order: big-endian
-
name: flags
type: u32
-
name: prog
type: u32
-
name: version
type: u8
-
name: proc
type: u32
-
name: service_time
type: s64
-
name: pad
type: pad
-
name: saddr4
type: u32
byte-order: big-endian
display-hint: ipv4
-
name: daddr4
type: u32
byte-order: big-endian
display-hint: ipv4
-
name: saddr6
type: binary
display-hint: ipv6
-
name: daddr6
type: binary
display-hint: ipv6
-
name: sport
type: u16
byte-order: big-endian
-
name: dport
type: u16
byte-order: big-endian
-
name: compound-ops
type: u32
multi-attr: true
operations:
list:
-
name: rpc-status-get
doc: dump pending nfsd rpc
attribute-set: rpc-status
dump:
pre: nfsd-nl-rpc-status-get-start
post: nfsd-nl-rpc-status-get-done
reply:
attributes:
- xid
- flags
- prog
- version
- proc
- service_time
- saddr4
- daddr4
- saddr6
- daddr6
- sport
- dport
- compound-ops


@ -24,7 +24,6 @@
#include <linux/uio.h>
#include <linux/smp.h>
#include <linux/mutex.h>
#include <linux/kthread.h>
#include <linux/freezer.h>
#include <linux/inetdevice.h>
@ -135,11 +134,11 @@ lockd(void *vrqstp)
* The main request loop. We don't terminate until the last
* NFS mount or NFS daemon has gone away.
*/
while (!kthread_should_stop()) {
while (!svc_thread_should_stop(rqstp)) {
/* update sv_maxconn if it has changed */
rqstp->rq_server->sv_maxconn = nlm_max_connections;
nlmsvc_retry_blocked();
nlmsvc_retry_blocked(rqstp);
svc_recv(rqstp);
}
if (nlmsvc_ops)
@ -373,7 +372,9 @@ static void lockd_put(void)
unregister_inet6addr_notifier(&lockd_inet6addr_notifier);
#endif
svc_get(nlmsvc_serv);
svc_set_num_threads(nlmsvc_serv, NULL, 0);
svc_put(nlmsvc_serv);
timer_delete_sync(&nlmsvc_retry);
nlmsvc_serv = NULL;
dprintk("lockd_down: service destroyed\n");


@ -30,7 +30,6 @@
#include <linux/sunrpc/svc_xprt.h>
#include <linux/lockd/nlm.h>
#include <linux/lockd/lockd.h>
#include <linux/kthread.h>
#include <linux/exportfs.h>
#define NLMDBG_FACILITY NLMDBG_SVCLOCK
@ -481,9 +480,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_host *host, struct nlm_lock *lock, int wait,
struct nlm_cookie *cookie, int reclaim)
{
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
struct inode *inode = nlmsvc_file_inode(file);
#endif
struct nlm_block *block = NULL;
int error;
int mode;
@ -497,7 +494,7 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
(long long)lock->fl.fl_end,
wait);
if (nlmsvc_file_file(file)->f_op->lock) {
if (!exportfs_lock_op_is_async(inode->i_sb->s_export_op)) {
async_block = wait;
wait = 0;
}
@ -543,6 +540,25 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
goto out;
}
spin_lock(&nlm_blocked_lock);
/*
* If this is a lock request for an already pending
* lock request we return nlm_lck_blocked without calling
* vfs_lock_file() again. Otherwise we have two pending
* requests on the underlying ->lock() implementation but
* only one nlm_block to be granted by lm_grant().
*/
if (exportfs_lock_op_is_async(inode->i_sb->s_export_op) &&
!list_empty(&block->b_list)) {
spin_unlock(&nlm_blocked_lock);
ret = nlm_lck_blocked;
goto out;
}
/* Append to list of blocked */
nlmsvc_insert_block_locked(block, NLM_NEVER);
spin_unlock(&nlm_blocked_lock);
if (!wait)
lock->fl.fl_flags &= ~FL_SLEEP;
mode = lock_to_openmode(&lock->fl);
@ -552,16 +568,12 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
dprintk("lockd: vfs_lock_file returned %d\n", error);
switch (error) {
case 0:
nlmsvc_remove_block(block);
ret = nlm_granted;
goto out;
case -EAGAIN:
/*
* If this is a blocking request for an
* already pending lock request then we need
* to put it back on lockd's block list
*/
if (wait)
break;
if (!wait)
nlmsvc_remove_block(block);
ret = async_block ? nlm_lck_blocked : nlm_lck_denied;
goto out;
case FILE_LOCK_DEFERRED:
@ -572,17 +584,16 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
ret = nlmsvc_defer_lock_rqst(rqstp, block);
goto out;
case -EDEADLK:
nlmsvc_remove_block(block);
ret = nlm_deadlock;
goto out;
default: /* includes ENOLCK */
nlmsvc_remove_block(block);
ret = nlm_lck_denied_nolocks;
goto out;
}
ret = nlm_lck_blocked;
/* Append to list of blocked */
nlmsvc_insert_block(block, NLM_NEVER);
out:
mutex_unlock(&file->f_mutex);
nlmsvc_release_block(block);
@ -1020,13 +1031,13 @@ retry_deferred_block(struct nlm_block *block)
* be retransmitted.
*/
void
nlmsvc_retry_blocked(void)
nlmsvc_retry_blocked(struct svc_rqst *rqstp)
{
unsigned long timeout = MAX_SCHEDULE_TIMEOUT;
struct nlm_block *block;
spin_lock(&nlm_blocked_lock);
while (!list_empty(&nlm_blocked) && !kthread_should_stop()) {
while (!list_empty(&nlm_blocked) && !svc_thread_should_stop(rqstp)) {
block = list_entry(nlm_blocked.next, struct nlm_block, b_list);
if (block->b_when == NLM_NEVER)


@ -2264,11 +2264,13 @@ out:
* To avoid blocking kernel daemons, such as lockd, that need to acquire POSIX
* locks, the ->lock() interface may return asynchronously, before the lock has
* been granted or denied by the underlying filesystem, if (and only if)
* lm_grant is set. Callers expecting ->lock() to return asynchronously
* will only use F_SETLK, not F_SETLKW; they will set FL_SLEEP if (and only if)
* the request is for a blocking lock. When ->lock() does return asynchronously,
* it must return FILE_LOCK_DEFERRED, and call ->lm_grant() when the lock
* request completes.
* lm_grant is set. Additionally EXPORT_OP_ASYNC_LOCK in export_operations
* flags need to be set.
*
* Callers expecting ->lock() to return asynchronously will only use F_SETLK,
* not F_SETLKW; they will set FL_SLEEP if (and only if) the request is for a
* blocking lock. When ->lock() does return asynchronously, it must return
* FILE_LOCK_DEFERRED, and call ->lm_grant() when the lock request completes.
* If the request is for non-blocking lock the file system should return
* FILE_LOCK_DEFERRED then try to get the lock and call the callback routine
* with the result. If the request timed out the callback routine will return a


@ -78,7 +78,7 @@ nfs4_callback_svc(void *vrqstp)
set_freezable();
while (!kthread_freezable_should_stop(NULL))
while (!svc_thread_should_stop(rqstp))
svc_recv(rqstp);
svc_exit_thread(rqstp);
@ -86,45 +86,6 @@ nfs4_callback_svc(void *vrqstp)
}
#if defined(CONFIG_NFS_V4_1)
/*
* The callback service for NFSv4.1 callbacks
*/
static int
nfs41_callback_svc(void *vrqstp)
{
struct svc_rqst *rqstp = vrqstp;
struct svc_serv *serv = rqstp->rq_server;
struct rpc_rqst *req;
int error;
DEFINE_WAIT(wq);
set_freezable();
while (!kthread_freezable_should_stop(NULL)) {
prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_IDLE);
spin_lock_bh(&serv->sv_cb_lock);
if (!list_empty(&serv->sv_cb_list)) {
req = list_first_entry(&serv->sv_cb_list,
struct rpc_rqst, rq_bc_list);
list_del(&req->rq_bc_list);
spin_unlock_bh(&serv->sv_cb_lock);
finish_wait(&serv->sv_cb_waitq, &wq);
dprintk("Invoking bc_svc_process()\n");
error = bc_svc_process(serv, req, rqstp);
dprintk("bc_svc_process() returned w/ error code= %d\n",
error);
} else {
spin_unlock_bh(&serv->sv_cb_lock);
if (!kthread_should_stop())
schedule();
finish_wait(&serv->sv_cb_waitq, &wq);
}
}
svc_exit_thread(rqstp);
return 0;
}
static inline void nfs_callback_bc_serv(u32 minorversion, struct rpc_xprt *xprt,
struct svc_serv *serv)
{
@ -237,10 +198,7 @@ static struct svc_serv *nfs_callback_create_svc(int minorversion)
cb_info->users);
threadfn = nfs4_callback_svc;
#if defined(CONFIG_NFS_V4_1)
if (minorversion)
threadfn = nfs41_callback_svc;
#else
#if !defined(CONFIG_NFS_V4_1)
if (minorversion)
return ERR_PTR(-ENOTSUPP);
#endif


@ -12,7 +12,8 @@ nfsd-y += trace.o
nfsd-y += nfssvc.o nfsctl.o nfsfh.o vfs.o \
export.o auth.o lockd.o nfscache.o \
stats.o filecache.o nfs3proc.o nfs3xdr.o
stats.o filecache.o nfs3proc.o nfs3xdr.o \
netlink.o
nfsd-$(CONFIG_NFSD_V2) += nfsproc.o nfsxdr.o
nfsd-$(CONFIG_NFSD_V2_ACL) += nfs2acl.o
nfsd-$(CONFIG_NFSD_V3_ACL) += nfs3acl.o


@ -16,9 +16,9 @@
__be32
nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
struct nfsd4_layoutget *lgp)
const struct nfsd4_layoutget *lgp)
{
struct pnfs_block_extent *b = lgp->lg_content;
const struct pnfs_block_extent *b = lgp->lg_content;
int len = sizeof(__be32) + 5 * sizeof(__be64) + sizeof(__be32);
__be32 *p;
@ -77,7 +77,7 @@ nfsd4_block_encode_volume(struct xdr_stream *xdr, struct pnfs_block_volume *b)
__be32
nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
struct nfsd4_getdeviceinfo *gdp)
const struct nfsd4_getdeviceinfo *gdp)
{
struct pnfs_block_deviceaddr *dev = gdp->gd_device;
int len = sizeof(__be32), ret, i;


@ -51,9 +51,9 @@ struct pnfs_block_deviceaddr {
};
__be32 nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
struct nfsd4_getdeviceinfo *gdp);
const struct nfsd4_getdeviceinfo *gdp);
__be32 nfsd4_block_encode_layoutget(struct xdr_stream *xdr,
struct nfsd4_layoutget *lgp);
const struct nfsd4_layoutget *lgp);
int nfsd4_block_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,
u32 block_size);
int nfsd4_scsi_decode_layoutupdate(__be32 *p, u32 len, struct iomap **iomapp,


@ -339,12 +339,16 @@ static int export_stats_init(struct export_stats *stats)
static void export_stats_reset(struct export_stats *stats)
{
nfsd_percpu_counters_reset(stats->counter, EXP_STATS_COUNTERS_NUM);
if (stats)
nfsd_percpu_counters_reset(stats->counter,
EXP_STATS_COUNTERS_NUM);
}
static void export_stats_destroy(struct export_stats *stats)
{
nfsd_percpu_counters_destroy(stats->counter, EXP_STATS_COUNTERS_NUM);
if (stats)
nfsd_percpu_counters_destroy(stats->counter,
EXP_STATS_COUNTERS_NUM);
}
static void svc_export_put(struct kref *ref)
@ -353,7 +357,8 @@ static void svc_export_put(struct kref *ref)
path_put(&exp->ex_path);
auth_domain_put(exp->ex_client);
nfsd4_fslocs_free(&exp->ex_fslocs);
export_stats_destroy(&exp->ex_stats);
export_stats_destroy(exp->ex_stats);
kfree(exp->ex_stats);
kfree(exp->ex_uuid);
kfree_rcu(exp, ex_rcu);
}
@ -767,13 +772,15 @@ static int svc_export_show(struct seq_file *m,
seq_putc(m, '\t');
seq_escape(m, exp->ex_client->name, " \t\n\\");
if (export_stats) {
seq_printf(m, "\t%lld\n", exp->ex_stats.start_time);
struct percpu_counter *counter = exp->ex_stats->counter;
seq_printf(m, "\t%lld\n", exp->ex_stats->start_time);
seq_printf(m, "\tfh_stale: %lld\n",
percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_FH_STALE]));
percpu_counter_sum_positive(&counter[EXP_STATS_FH_STALE]));
seq_printf(m, "\tio_read: %lld\n",
percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_IO_READ]));
percpu_counter_sum_positive(&counter[EXP_STATS_IO_READ]));
seq_printf(m, "\tio_write: %lld\n",
percpu_counter_sum_positive(&exp->ex_stats.counter[EXP_STATS_IO_WRITE]));
percpu_counter_sum_positive(&counter[EXP_STATS_IO_WRITE]));
seq_putc(m, '\n');
return 0;
}
@ -819,7 +826,7 @@ static void svc_export_init(struct cache_head *cnew, struct cache_head *citem)
new->ex_layout_types = 0;
new->ex_uuid = NULL;
new->cd = item->cd;
export_stats_reset(&new->ex_stats);
export_stats_reset(new->ex_stats);
}
static void export_update(struct cache_head *cnew, struct cache_head *citem)
@ -856,7 +863,14 @@ static struct cache_head *svc_export_alloc(void)
if (!i)
return NULL;
if (export_stats_init(&i->ex_stats)) {
i->ex_stats = kmalloc(sizeof(*(i->ex_stats)), GFP_KERNEL);
if (!i->ex_stats) {
kfree(i);
return NULL;
}
if (export_stats_init(i->ex_stats)) {
kfree(i->ex_stats);
kfree(i);
return NULL;
}


@ -64,10 +64,10 @@ struct svc_export {
struct cache_head h;
struct auth_domain * ex_client;
int ex_flags;
int ex_fsid;
struct path ex_path;
kuid_t ex_anon_uid;
kgid_t ex_anon_gid;
int ex_fsid;
unsigned char * ex_uuid; /* 16 byte fsid */
struct nfsd4_fs_locations ex_fslocs;
uint32_t ex_nflavors;
@ -76,8 +76,8 @@ struct svc_export {
struct nfsd4_deviceid_map *ex_devid_map;
struct cache_detail *cd;
struct rcu_head ex_rcu;
struct export_stats ex_stats;
unsigned long ex_xprtsec_modes;
struct export_stats *ex_stats;
};
/* an "export key" (expkey) maps a filehandlefragement to an


@ -989,22 +989,21 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned char need = may_flags & NFSD_FILE_MAY_MASK;
struct net *net = SVC_NET(rqstp);
struct nfsd_file *new, *nf;
const struct cred *cred;
bool stale_retry = true;
bool open_retry = true;
struct inode *inode;
__be32 status;
int ret;
retry:
status = fh_verify(rqstp, fhp, S_IFREG,
may_flags|NFSD_MAY_OWNER_OVERRIDE);
if (status != nfs_ok)
return status;
inode = d_inode(fhp->fh_dentry);
cred = get_current_cred();
retry:
rcu_read_lock();
nf = nfsd_file_lookup_locked(net, cred, inode, need, want_gc);
nf = nfsd_file_lookup_locked(net, current_cred(), inode, need, want_gc);
rcu_read_unlock();
if (nf) {
@ -1026,7 +1025,7 @@ retry:
rcu_read_lock();
spin_lock(&inode->i_lock);
nf = nfsd_file_lookup_locked(net, cred, inode, need, want_gc);
nf = nfsd_file_lookup_locked(net, current_cred(), inode, need, want_gc);
if (unlikely(nf)) {
spin_unlock(&inode->i_lock);
rcu_read_unlock();
@ -1058,6 +1057,7 @@ wait_for_construction:
goto construction_err;
}
open_retry = false;
fh_put(fhp);
goto retry;
}
this_cpu_inc(nfsd_file_cache_hits);
@ -1074,7 +1074,6 @@ out:
nfsd_file_check_write_error(nf);
*pnf = nf;
}
put_cred(cred);
trace_nfsd_file_acquire(rqstp, inode, may_flags, nf, status);
return status;
@ -1088,8 +1087,20 @@ open_file:
status = nfs_ok;
trace_nfsd_file_opened(nf, status);
} else {
status = nfsd_open_verified(rqstp, fhp, may_flags,
&nf->nf_file);
ret = nfsd_open_verified(rqstp, fhp, may_flags,
&nf->nf_file);
if (ret == -EOPENSTALE && stale_retry) {
stale_retry = false;
nfsd_file_unhash(nf);
clear_and_wake_up_bit(NFSD_FILE_PENDING,
&nf->nf_flags);
if (refcount_dec_and_test(&nf->nf_ref))
nfsd_file_free(nf);
nf = NULL;
fh_put(fhp);
goto retry;
}
status = nfserrno(ret);
trace_nfsd_file_open(nf, status);
}
} else


@ -17,9 +17,9 @@ struct ff_idmap {
__be32
nfsd4_ff_encode_layoutget(struct xdr_stream *xdr,
struct nfsd4_layoutget *lgp)
const struct nfsd4_layoutget *lgp)
{
struct pnfs_ff_layout *fl = lgp->lg_content;
const struct pnfs_ff_layout *fl = lgp->lg_content;
int len, mirror_len, ds_len, fh_len;
__be32 *p;
@ -77,7 +77,7 @@ nfsd4_ff_encode_layoutget(struct xdr_stream *xdr,
__be32
nfsd4_ff_encode_getdeviceinfo(struct xdr_stream *xdr,
struct nfsd4_getdeviceinfo *gdp)
const struct nfsd4_getdeviceinfo *gdp)
{
struct pnfs_ff_device_addr *da = gdp->gd_device;
int len;


@ -43,8 +43,8 @@ struct pnfs_ff_layout {
};
__be32 nfsd4_ff_encode_getdeviceinfo(struct xdr_stream *xdr,
struct nfsd4_getdeviceinfo *gdp);
const struct nfsd4_getdeviceinfo *gdp);
__be32 nfsd4_ff_encode_layoutget(struct xdr_stream *xdr,
struct nfsd4_layoutget *lgp);
const struct nfsd4_layoutget *lgp);
#endif /* _NFSD_FLEXFILELAYOUTXDR_H */

fs/nfsd/netlink.c (new file, 32 lines)

@ -0,0 +1,32 @@
// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/nfsd.yaml */
/* YNL-GEN kernel source */
#include <net/netlink.h>
#include <net/genetlink.h>
#include "netlink.h"
#include <uapi/linux/nfsd_netlink.h>
/* Ops table for nfsd */
static const struct genl_split_ops nfsd_nl_ops[] = {
{
.cmd = NFSD_CMD_RPC_STATUS_GET,
.start = nfsd_nl_rpc_status_get_start,
.dumpit = nfsd_nl_rpc_status_get_dumpit,
.done = nfsd_nl_rpc_status_get_done,
.flags = GENL_CMD_CAP_DUMP,
},
};
struct genl_family nfsd_nl_family __ro_after_init = {
.name = NFSD_FAMILY_NAME,
.version = NFSD_FAMILY_VERSION,
.netnsok = true,
.parallel_ops = true,
.module = THIS_MODULE,
.split_ops = nfsd_nl_ops,
.n_split_ops = ARRAY_SIZE(nfsd_nl_ops),
};

fs/nfsd/netlink.h (new file, 22 lines)

@ -0,0 +1,22 @@
/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/nfsd.yaml */
/* YNL-GEN kernel header */
#ifndef _LINUX_NFSD_GEN_H
#define _LINUX_NFSD_GEN_H
#include <net/netlink.h>
#include <net/genetlink.h>
#include <uapi/linux/nfsd_netlink.h>
int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb);
int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb);
int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb);
extern struct genl_family nfsd_nl_family;
#endif /* _LINUX_NFSD_GEN_H */


@ -171,7 +171,8 @@ nfsd3_proc_read(struct svc_rqst *rqstp)
* + 1 (xdr opaque byte count) = 26
*/
resp->count = argp->count;
svc_reserve_auth(rqstp, ((1 + NFS3_POST_OP_ATTR_WORDS + 3)<<2) + resp->count +4);
svc_reserve_auth(rqstp, ((1 + NFS3_POST_OP_ATTR_WORDS + 3) << 2) +
resp->count + 4);
fh_copy(&resp->fh, &argp->fh);
resp->status = nfsd_read(rqstp, &resp->fh, argp->offset,
@ -194,7 +195,7 @@ nfsd3_proc_write(struct svc_rqst *rqstp)
SVCFH_fmt(&argp->fh),
argp->len,
(unsigned long long) argp->offset,
argp->stable? " stable" : "");
argp->stable ? " stable" : "");
resp->status = nfserr_fbig;
if (argp->offset > (u64)OFFSET_MAX ||


@ -84,7 +84,21 @@ static void encode_uint32(struct xdr_stream *xdr, u32 n)
static void encode_bitmap4(struct xdr_stream *xdr, const __u32 *bitmap,
size_t len)
{
WARN_ON_ONCE(xdr_stream_encode_uint32_array(xdr, bitmap, len) < 0);
xdr_stream_encode_uint32_array(xdr, bitmap, len);
}
static int decode_cb_fattr4(struct xdr_stream *xdr, uint32_t *bitmap,
struct nfs4_cb_fattr *fattr)
{
fattr->ncf_cb_change = 0;
fattr->ncf_cb_fsize = 0;
if (bitmap[0] & FATTR4_WORD0_CHANGE)
if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_change) < 0)
return -NFSERR_BAD_XDR;
if (bitmap[0] & FATTR4_WORD0_SIZE)
if (xdr_stream_decode_u64(xdr, &fattr->ncf_cb_fsize) < 0)
return -NFSERR_BAD_XDR;
return 0;
}
/*
@ -357,6 +371,30 @@ encode_cb_recallany4args(struct xdr_stream *xdr,
hdr->nops++;
}
/*
* CB_GETATTR4args
* struct CB_GETATTR4args {
* nfs_fh4 fh;
* bitmap4 attr_request;
* };
*
* The size and change attributes are the only ones
* guaranteed to be serviced by the client.
*/
static void
encode_cb_getattr4args(struct xdr_stream *xdr, struct nfs4_cb_compound_hdr *hdr,
struct nfs4_cb_fattr *fattr)
{
struct nfs4_delegation *dp =
container_of(fattr, struct nfs4_delegation, dl_cb_fattr);
struct knfsd_fh *fh = &dp->dl_stid.sc_file->fi_fhandle;
encode_nfs_cb_opnum4(xdr, OP_CB_GETATTR);
encode_nfs_fh4(xdr, fh);
encode_bitmap4(xdr, fattr->ncf_cb_bmap, ARRAY_SIZE(fattr->ncf_cb_bmap));
hdr->nops++;
}
/*
* CB_SEQUENCE4args
*
@ -492,6 +530,26 @@ static void nfs4_xdr_enc_cb_null(struct rpc_rqst *req, struct xdr_stream *xdr,
xdr_reserve_space(xdr, 0);
}
/*
* 20.1. Operation 3: CB_GETATTR - Get Attributes
*/
static void nfs4_xdr_enc_cb_getattr(struct rpc_rqst *req,
struct xdr_stream *xdr, const void *data)
{
const struct nfsd4_callback *cb = data;
struct nfs4_cb_fattr *ncf =
container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
struct nfs4_cb_compound_hdr hdr = {
.ident = cb->cb_clp->cl_cb_ident,
.minorversion = cb->cb_clp->cl_minorversion,
};
encode_cb_compound4args(xdr, &hdr);
encode_cb_sequence4args(xdr, cb, &hdr);
encode_cb_getattr4args(xdr, &hdr, ncf);
encode_cb_nops(&hdr);
}
/*
* 20.2. Operation 4: CB_RECALL - Recall a Delegation
*/
@ -547,6 +605,42 @@ static int nfs4_xdr_dec_cb_null(struct rpc_rqst *req, struct xdr_stream *xdr,
return 0;
}
/*
* 20.1. Operation 3: CB_GETATTR - Get Attributes
*/
static int nfs4_xdr_dec_cb_getattr(struct rpc_rqst *rqstp,
struct xdr_stream *xdr,
void *data)
{
struct nfsd4_callback *cb = data;
struct nfs4_cb_compound_hdr hdr;
int status;
u32 bitmap[3] = {0};
u32 attrlen;
struct nfs4_cb_fattr *ncf =
container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
status = decode_cb_compound4res(xdr, &hdr);
if (unlikely(status))
return status;
status = decode_cb_sequence4res(xdr, cb);
if (unlikely(status || cb->cb_seq_status))
return status;
status = decode_cb_op_status(xdr, OP_CB_GETATTR, &cb->cb_status);
if (status)
return status;
if (xdr_stream_decode_uint32_array(xdr, bitmap, 3) < 0)
return -NFSERR_BAD_XDR;
if (xdr_stream_decode_u32(xdr, &attrlen) < 0)
return -NFSERR_BAD_XDR;
if (attrlen > (sizeof(ncf->ncf_cb_change) + sizeof(ncf->ncf_cb_fsize)))
return -NFSERR_BAD_XDR;
status = decode_cb_fattr4(xdr, bitmap, ncf);
return status;
}
/*
* 20.2. Operation 4: CB_RECALL - Recall a Delegation
*/
@ -855,6 +949,7 @@ static const struct rpc_procinfo nfs4_cb_procedures[] = {
PROC(CB_NOTIFY_LOCK, COMPOUND, cb_notify_lock, cb_notify_lock),
PROC(CB_OFFLOAD, COMPOUND, cb_offload, cb_offload),
PROC(CB_RECALL_ANY, COMPOUND, cb_recall_any, cb_recall_any),
PROC(CB_GETATTR, COMPOUND, cb_getattr, cb_getattr),
};
static unsigned int nfs4_cb_counts[ARRAY_SIZE(nfs4_cb_procedures)];


@ -515,11 +515,11 @@ nfsd4_return_file_layouts(struct svc_rqst *rqstp,
if (!list_empty(&ls->ls_layouts)) {
if (found)
nfs4_inc_and_copy_stateid(&lrp->lr_sid, &ls->ls_stid);
lrp->lrs_present = 1;
lrp->lrs_present = true;
} else {
trace_nfsd_layoutstate_unhash(&ls->ls_stid.sc_stateid);
nfs4_unhash_stid(&ls->ls_stid);
lrp->lrs_present = 0;
lrp->lrs_present = false;
}
spin_unlock(&ls->ls_lock);
@ -539,7 +539,7 @@ nfsd4_return_client_layouts(struct svc_rqst *rqstp,
struct nfs4_layout *lp, *t;
LIST_HEAD(reaplist);
lrp->lrs_present = 0;
lrp->lrs_present = false;
spin_lock(&clp->cl_lock);
list_for_each_entry_safe(ls, n, &clp->cl_lo_states, ls_perclnt) {


@ -1329,7 +1329,8 @@ extern void nfs_sb_deactive(struct super_block *sb);
* setup a work entry in the ssc delayed unmount list.
*/
static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
struct nfsd4_ssc_umount_item **nsui)
struct nfsd4_ssc_umount_item **nsui,
struct svc_rqst *rqstp)
{
struct nfsd4_ssc_umount_item *ni = NULL;
struct nfsd4_ssc_umount_item *work = NULL;
@ -1351,7 +1352,7 @@ try_again:
spin_unlock(&nn->nfsd_ssc_lock);
/* allow 20secs for mount/unmount for now - revisit */
if (kthread_should_stop() ||
if (svc_thread_should_stop(rqstp) ||
(schedule_timeout(20*HZ) == 0)) {
finish_wait(&nn->nfsd_ssc_waitq, &wait);
kfree(work);
@ -1467,7 +1468,7 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
goto out_free_rawdata;
snprintf(dev_name, len + 5, "%s%s%s:/", startsep, ipaddr, endsep);
status = nfsd4_ssc_setup_dul(nn, ipaddr, nsui);
status = nfsd4_ssc_setup_dul(nn, ipaddr, nsui, rqstp);
if (status)
goto out_free_devname;
if ((*nsui)->nsui_vfsmount)
@ -1642,6 +1643,7 @@ static ssize_t _nfsd_copy_file_range(struct nfsd4_copy *copy,
if (bytes_total == 0)
bytes_total = ULLONG_MAX;
do {
/* Only async copies can be stopped here */
if (kthread_should_stop())
break;
bytes_copied = nfsd_copy_file_range(src, src_pos, dst, dst_pos,
@ -1760,6 +1762,7 @@ static int nfsd4_do_async_copy(void *data)
struct nfsd4_copy *copy = (struct nfsd4_copy *)data;
__be32 nfserr;
trace_nfsd_copy_do_async(copy);
if (nfsd4_ssc_is_inter(copy)) {
struct file *filp;
@ -1798,21 +1801,27 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
__be32 status;
struct nfsd4_copy *async_copy = NULL;
copy->cp_clp = cstate->clp;
if (nfsd4_ssc_is_inter(copy)) {
trace_nfsd_copy_inter(copy);
if (!inter_copy_offload_enable || nfsd4_copy_is_sync(copy)) {
status = nfserr_notsupp;
goto out;
}
status = nfsd4_setup_inter_ssc(rqstp, cstate, copy);
if (status)
if (status) {
trace_nfsd_copy_done(copy, status);
return nfserr_offload_denied;
}
} else {
trace_nfsd_copy_intra(copy);
status = nfsd4_setup_intra_ssc(rqstp, cstate, copy);
if (status)
if (status) {
trace_nfsd_copy_done(copy, status);
return status;
}
}
copy->cp_clp = cstate->clp;
memcpy(&copy->fh, &cstate->current_fh.fh_handle,
sizeof(struct knfsd_fh));
if (nfsd4_copy_is_async(copy)) {
@ -1847,6 +1856,7 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
copy->nf_dst->nf_file, true);
}
out:
trace_nfsd_copy_done(copy, status);
release_copy_files(copy);
return status;
out_err:
@ -1929,8 +1939,8 @@ nfsd4_copy_notify(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
if (status)
return status;
cn->cpn_sec = nn->nfsd4_lease;
cn->cpn_nsec = 0;
cn->cpn_lease_time.tv_sec = nn->nfsd4_lease;
cn->cpn_lease_time.tv_nsec = 0;
status = nfserrno(-ENOMEM);
cps = nfs4_alloc_init_cpntf_state(nn, stid);
@ -2347,10 +2357,10 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
mutex_unlock(&ls->ls_mutex);
if (new_size > i_size_read(inode)) {
lcp->lc_size_chg = 1;
lcp->lc_size_chg = true;
lcp->lc_newsize = new_size;
} else {
lcp->lc_size_chg = 0;
lcp->lc_size_chg = false;
}
nfserr = ops->proc_layoutcommit(inode, lcp);
@ -3200,6 +3210,7 @@ static const struct nfsd4_operation nfsd4_ops[] = {
},
[OP_LOCK] = {
.op_func = nfsd4_lock,
.op_release = nfsd4_lock_release,
.op_flags = OP_MODIFIES_SOMETHING |
OP_NONTRIVIAL_ERROR_ENCODE,
.op_name = "OP_LOCK",
@ -3208,6 +3219,7 @@ static const struct nfsd4_operation nfsd4_ops[] = {
},
[OP_LOCKT] = {
.op_func = nfsd4_lockt,
.op_release = nfsd4_lockt_release,
.op_flags = OP_NONTRIVIAL_ERROR_ENCODE,
.op_name = "OP_LOCKT",
.op_rsize_bop = nfsd4_lock_rsize,


@ -59,7 +59,7 @@
#define NFSDDBG_FACILITY NFSDDBG_PROC
#define all_ones {{~0,~0},~0}
#define all_ones {{ ~0, ~0}, ~0}
static const stateid_t one_stateid = {
.si_generation = ~0,
.si_opaque = all_ones,
@ -127,6 +127,7 @@ static void free_session(struct nfsd4_session *);
static const struct nfsd4_callback_ops nfsd4_cb_recall_ops;
static const struct nfsd4_callback_ops nfsd4_cb_notify_lock_ops;
static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops;
static struct workqueue_struct *laundry_wq;
@ -297,7 +298,7 @@ find_or_allocate_block(struct nfs4_lockowner *lo, struct knfsd_fh *fh,
nbl = find_blocked_lock(lo, fh, nn);
if (!nbl) {
nbl= kmalloc(sizeof(*nbl), GFP_KERNEL);
nbl = kmalloc(sizeof(*nbl), GFP_KERNEL);
if (nbl) {
INIT_LIST_HEAD(&nbl->nbl_list);
INIT_LIST_HEAD(&nbl->nbl_lru);
@ -1159,6 +1160,7 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
struct nfs4_clnt_odstate *odstate, u32 dl_type)
{
struct nfs4_delegation *dp;
struct nfs4_stid *stid;
long n;
dprintk("NFSD alloc_init_deleg\n");
@ -1167,9 +1169,10 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
goto out_dec;
if (delegation_blocked(&fp->fi_fhandle))
goto out_dec;
dp = delegstateid(nfs4_alloc_stid(clp, deleg_slab, nfs4_free_deleg));
if (dp == NULL)
stid = nfs4_alloc_stid(clp, deleg_slab, nfs4_free_deleg);
if (stid == NULL)
goto out_dec;
dp = delegstateid(stid);
/*
* delegation seqid's are never incremented. The 4.1 special
@ -1187,6 +1190,10 @@ alloc_init_deleg(struct nfs4_client *clp, struct nfs4_file *fp,
dp->dl_recalled = false;
nfsd4_init_cb(&dp->dl_recall, dp->dl_stid.sc_client,
&nfsd4_cb_recall_ops, NFSPROC4_CLNT_CB_RECALL);
nfsd4_init_cb(&dp->dl_cb_fattr.ncf_getattr, dp->dl_stid.sc_client,
&nfsd4_cb_getattr_ops, NFSPROC4_CLNT_CB_GETATTR);
dp->dl_cb_fattr.ncf_file_modified = false;
dp->dl_cb_fattr.ncf_cb_bmap[0] = FATTR4_WORD0_CHANGE | FATTR4_WORD0_SIZE;
get_nfs4_file(fp);
dp->dl_stid.sc_file = fp;
return dp;
@ -2894,11 +2901,56 @@ nfsd4_cb_recall_any_release(struct nfsd4_callback *cb)
spin_unlock(&nn->client_lock);
}
static int
nfsd4_cb_getattr_done(struct nfsd4_callback *cb, struct rpc_task *task)
{
struct nfs4_cb_fattr *ncf =
container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
ncf->ncf_cb_status = task->tk_status;
switch (task->tk_status) {
case -NFS4ERR_DELAY:
rpc_delay(task, 2 * HZ);
return 0;
default:
return 1;
}
}
static void
nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
{
struct nfs4_cb_fattr *ncf =
container_of(cb, struct nfs4_cb_fattr, ncf_getattr);
struct nfs4_delegation *dp =
container_of(ncf, struct nfs4_delegation, dl_cb_fattr);
nfs4_put_stid(&dp->dl_stid);
clear_bit(CB_GETATTR_BUSY, &ncf->ncf_cb_flags);
wake_up_bit(&ncf->ncf_cb_flags, CB_GETATTR_BUSY);
}
static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
.done = nfsd4_cb_recall_any_done,
.release = nfsd4_cb_recall_any_release,
};
static const struct nfsd4_callback_ops nfsd4_cb_getattr_ops = {
.done = nfsd4_cb_getattr_done,
.release = nfsd4_cb_getattr_release,
};
void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf)
{
struct nfs4_delegation *dp =
container_of(ncf, struct nfs4_delegation, dl_cb_fattr);
if (test_and_set_bit(CB_GETATTR_BUSY, &ncf->ncf_cb_flags))
return;
refcount_inc(&dp->dl_stid.sc_count);
nfsd4_run_cb(&ncf->ncf_getattr);
}
static struct nfs4_client *create_client(struct xdr_netobj name,
struct svc_rqst *rqstp, nfs4_verifier *verf)
{
@ -5634,13 +5686,15 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
struct svc_fh *parent = NULL;
int cb_up;
int status = 0;
struct kstat stat;
struct path path;
cb_up = nfsd4_cb_channel_good(oo->oo_owner.so_client);
open->op_recall = 0;
open->op_recall = false;
switch (open->op_claim_type) {
case NFS4_OPEN_CLAIM_PREVIOUS:
if (!cb_up)
open->op_recall = 1;
open->op_recall = true;
break;
case NFS4_OPEN_CLAIM_NULL:
parent = currentfh;
@ -5671,6 +5725,18 @@ nfs4_open_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
open->op_delegate_type = NFS4_OPEN_DELEGATE_WRITE;
trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
path.mnt = currentfh->fh_export->ex_path.mnt;
path.dentry = currentfh->fh_dentry;
if (vfs_getattr(&path, &stat,
(STATX_SIZE | STATX_CTIME | STATX_CHANGE_COOKIE),
AT_STATX_SYNC_AS_STAT)) {
nfs4_put_stid(&dp->dl_stid);
destroy_delegation(dp);
goto out_no_deleg;
}
dp->dl_cb_fattr.ncf_cur_fsize = stat.size;
dp->dl_cb_fattr.ncf_initial_cinfo =
nfsd4_change_attribute(&stat, d_inode(currentfh->fh_dentry));
} else {
open->op_delegate_type = NFS4_OPEN_DELEGATE_READ;
trace_nfsd_deleg_read(&dp->dl_stid.sc_stateid);
@ -5682,7 +5748,7 @@ out_no_deleg:
if (open->op_claim_type == NFS4_OPEN_CLAIM_PREVIOUS &&
open->op_delegate_type != NFS4_OPEN_DELEGATE_NONE) {
dprintk("NFSD: WARNING: refusing delegation reclaim\n");
open->op_recall = 1;
open->op_recall = true;
}
/* 4.1 client asking for a delegation? */
@ -7487,6 +7553,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
struct nfsd4_blocked_lock *nbl = NULL;
struct file_lock *file_lock = NULL;
struct file_lock *conflock = NULL;
struct super_block *sb;
__be32 status = 0;
int lkflg;
int err;
@ -7508,6 +7575,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
dprintk("NFSD: nfsd4_lock: permission denied!\n");
return status;
}
sb = cstate->current_fh.fh_dentry->d_sb;
if (lock->lk_is_new) {
if (nfsd4_has_session(cstate))
@ -7559,7 +7627,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
fp = lock_stp->st_stid.sc_file;
switch (lock->lk_type) {
case NFS4_READW_LT:
if (nfsd4_has_session(cstate))
if (nfsd4_has_session(cstate) ||
exportfs_lock_op_is_async(sb->s_export_op))
fl_flags |= FL_SLEEP;
fallthrough;
case NFS4_READ_LT:
@ -7571,7 +7640,8 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
fl_type = F_RDLCK;
break;
case NFS4_WRITEW_LT:
if (nfsd4_has_session(cstate))
if (nfsd4_has_session(cstate) ||
exportfs_lock_op_is_async(sb->s_export_op))
fl_flags |= FL_SLEEP;
fallthrough;
case NFS4_WRITE_LT:
@ -7599,7 +7669,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
* for file locks), so don't attempt blocking lock notifications
* on those filesystems:
*/
if (nf->nf_file->f_op->lock)
if (!exportfs_lock_op_is_async(sb->s_export_op))
fl_flags &= ~FL_SLEEP;
nbl = find_or_allocate_block(lock_sop, &fp->fi_fhandle, nn);
@ -7705,6 +7775,14 @@ out:
return status;
}
void nfsd4_lock_release(union nfsd4_op_u *u)
{
struct nfsd4_lock *lock = &u->lock;
struct nfsd4_lock_denied *deny = &lock->lk_denied;
kfree(deny->ld_owner.data);
}
/*
* The NFSv4 spec allows a client to do a LOCKT without holding an OPEN,
* so we do a temporary open here just to get an open file to pass to
@ -7810,6 +7888,14 @@ out:
return status;
}
void nfsd4_lockt_release(union nfsd4_op_u *u)
{
struct nfsd4_lockt *lockt = &u->lockt;
struct nfsd4_lock_denied *deny = &lockt->lt_denied;
kfree(deny->ld_owner.data);
}
__be32
nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
union nfsd4_op_u *u)
@ -8403,6 +8489,8 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
* nfsd4_deleg_getattr_conflict - Recall if GETATTR causes conflict
* @rqstp: RPC transaction context
* @inode: file to be checked for a conflict
* @modified: return true if file was modified
* @size: new size of file if modified is true
*
* This function is called when there is a conflict between a write
* delegation and a change/size GETATTR from another client. The server
@ -8411,21 +8499,23 @@ nfsd4_get_writestateid(struct nfsd4_compound_state *cstate,
* delegation before replying to the GETATTR. See RFC 8881 section
* 18.7.4.
*
* The current implementation does not support CB_GETATTR yet. However,
* support for it, which can avoid recalling the delegation, could be
* added in follow-up work.
*
* Returns 0 if there is no conflict; otherwise an nfs_stat
* code is returned.
*/
__be32
nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode)
nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode,
bool *modified, u64 *size)
{
__be32 status;
struct file_lock_context *ctx;
struct file_lock *fl;
struct nfs4_delegation *dp;
struct nfs4_cb_fattr *ncf;
struct file_lock *fl;
struct iattr attrs;
__be32 status;
might_sleep();
*modified = false;
ctx = locks_inode_context(inode);
if (!ctx)
return 0;
@ -8452,10 +8542,34 @@ nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp, struct inode *inode)
break_lease:
spin_unlock(&ctx->flc_lock);
nfsd_stats_wdeleg_getattr_inc();
status = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
if (status != nfserr_jukebox ||
!nfsd_wait_for_delegreturn(rqstp, inode))
return status;
dp = fl->fl_owner;
ncf = &dp->dl_cb_fattr;
nfs4_cb_getattr(&dp->dl_cb_fattr);
wait_on_bit(&ncf->ncf_cb_flags, CB_GETATTR_BUSY, TASK_INTERRUPTIBLE);
if (ncf->ncf_cb_status) {
status = nfserrno(nfsd_open_break_lease(inode, NFSD_MAY_READ));
if (status != nfserr_jukebox ||
!nfsd_wait_for_delegreturn(rqstp, inode))
return status;
}
if (!ncf->ncf_file_modified &&
(ncf->ncf_initial_cinfo != ncf->ncf_cb_change ||
ncf->ncf_cur_fsize != ncf->ncf_cb_fsize))
ncf->ncf_file_modified = true;
if (ncf->ncf_file_modified) {
/*
* The server would not update the file's metadata
* with the client's modified size.
*/
attrs.ia_mtime = attrs.ia_ctime = current_time(inode);
attrs.ia_valid = ATTR_MTIME | ATTR_CTIME;
setattr_copy(&nop_mnt_idmap, inode, &attrs);
mark_inode_dirty(inode);
ncf->ncf_cur_fsize = ncf->ncf_cb_fsize;
*size = ncf->ncf_cur_fsize;
*modified = true;
}
return 0;
}
break;

[File diff suppressed because it is too large]


@ -26,6 +26,7 @@
#include "pnfs.h"
#include "filecache.h"
#include "trace.h"
#include "netlink.h"
/*
* We have a single directory with several nodes in it.
@ -1495,6 +1496,203 @@ static int create_proc_exports_entry(void)
unsigned int nfsd_net_id;
/**
* nfsd_nl_rpc_status_get_start - Prepare rpc_status_get dumpit
* @cb: netlink metadata and command arguments
*
* Return values:
* %0: The rpc_status_get command may proceed
* %-ENODEV: There is no NFSD running in this namespace
*/
int nfsd_nl_rpc_status_get_start(struct netlink_callback *cb)
{
struct nfsd_net *nn = net_generic(sock_net(cb->skb->sk), nfsd_net_id);
int ret = -ENODEV;
mutex_lock(&nfsd_mutex);
if (nn->nfsd_serv) {
svc_get(nn->nfsd_serv);
ret = 0;
}
mutex_unlock(&nfsd_mutex);
return ret;
}
static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
struct netlink_callback *cb,
struct nfsd_genl_rqstp *rqstp)
{
void *hdr;
u32 i;
hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
&nfsd_nl_family, 0, NFSD_CMD_RPC_STATUS_GET);
if (!hdr)
return -ENOBUFS;
if (nla_put_be32(skb, NFSD_A_RPC_STATUS_XID, rqstp->rq_xid) ||
nla_put_u32(skb, NFSD_A_RPC_STATUS_FLAGS, rqstp->rq_flags) ||
nla_put_u32(skb, NFSD_A_RPC_STATUS_PROG, rqstp->rq_prog) ||
nla_put_u32(skb, NFSD_A_RPC_STATUS_PROC, rqstp->rq_proc) ||
nla_put_u8(skb, NFSD_A_RPC_STATUS_VERSION, rqstp->rq_vers) ||
nla_put_s64(skb, NFSD_A_RPC_STATUS_SERVICE_TIME,
ktime_to_us(rqstp->rq_stime),
NFSD_A_RPC_STATUS_PAD))
return -ENOBUFS;
switch (rqstp->rq_saddr.sa_family) {
case AF_INET: {
const struct sockaddr_in *s_in, *d_in;
s_in = (const struct sockaddr_in *)&rqstp->rq_saddr;
d_in = (const struct sockaddr_in *)&rqstp->rq_daddr;
if (nla_put_in_addr(skb, NFSD_A_RPC_STATUS_SADDR4,
s_in->sin_addr.s_addr) ||
nla_put_in_addr(skb, NFSD_A_RPC_STATUS_DADDR4,
d_in->sin_addr.s_addr) ||
nla_put_be16(skb, NFSD_A_RPC_STATUS_SPORT,
s_in->sin_port) ||
nla_put_be16(skb, NFSD_A_RPC_STATUS_DPORT,
d_in->sin_port))
return -ENOBUFS;
break;
}
case AF_INET6: {
const struct sockaddr_in6 *s_in, *d_in;
s_in = (const struct sockaddr_in6 *)&rqstp->rq_saddr;
d_in = (const struct sockaddr_in6 *)&rqstp->rq_daddr;
if (nla_put_in6_addr(skb, NFSD_A_RPC_STATUS_SADDR6,
&s_in->sin6_addr) ||
nla_put_in6_addr(skb, NFSD_A_RPC_STATUS_DADDR6,
&d_in->sin6_addr) ||
nla_put_be16(skb, NFSD_A_RPC_STATUS_SPORT,
s_in->sin6_port) ||
nla_put_be16(skb, NFSD_A_RPC_STATUS_DPORT,
d_in->sin6_port))
return -ENOBUFS;
break;
}
}
for (i = 0; i < rqstp->rq_opcnt; i++)
if (nla_put_u32(skb, NFSD_A_RPC_STATUS_COMPOUND_OPS,
rqstp->rq_opnum[i]))
return -ENOBUFS;
genlmsg_end(skb, hdr);
return 0;
}
/**
* nfsd_nl_rpc_status_get_dumpit - Handle rpc_status_get dumpit
* @skb: reply buffer
* @cb: netlink metadata and command arguments
*
* Returns the size of the reply or a negative errno.
*/
int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
struct netlink_callback *cb)
{
struct nfsd_net *nn = net_generic(sock_net(skb->sk), nfsd_net_id);
int i, ret, rqstp_index = 0;
rcu_read_lock();
for (i = 0; i < nn->nfsd_serv->sv_nrpools; i++) {
struct svc_rqst *rqstp;
if (i < cb->args[0]) /* already consumed */
continue;
rqstp_index = 0;
list_for_each_entry_rcu(rqstp,
&nn->nfsd_serv->sv_pools[i].sp_all_threads,
rq_all) {
struct nfsd_genl_rqstp genl_rqstp;
unsigned int status_counter;
if (rqstp_index++ < cb->args[1]) /* already consumed */
continue;
/*
* Acquire rq_status_counter before parsing the rqst
* fields. rq_status_counter is set to an odd value in
* order to notify the consumers the rqstp fields are
* meaningful.
*/
status_counter =
smp_load_acquire(&rqstp->rq_status_counter);
if (!(status_counter & 1))
continue;
genl_rqstp.rq_xid = rqstp->rq_xid;
genl_rqstp.rq_flags = rqstp->rq_flags;
genl_rqstp.rq_vers = rqstp->rq_vers;
genl_rqstp.rq_prog = rqstp->rq_prog;
genl_rqstp.rq_proc = rqstp->rq_proc;
genl_rqstp.rq_stime = rqstp->rq_stime;
genl_rqstp.rq_opcnt = 0;
memcpy(&genl_rqstp.rq_daddr, svc_daddr(rqstp),
sizeof(struct sockaddr));
memcpy(&genl_rqstp.rq_saddr, svc_addr(rqstp),
sizeof(struct sockaddr));
#ifdef CONFIG_NFSD_V4
if (rqstp->rq_vers == NFS4_VERSION &&
rqstp->rq_proc == NFSPROC4_COMPOUND) {
/* NFSv4 compound */
struct nfsd4_compoundargs *args;
int j;
args = rqstp->rq_argp;
genl_rqstp.rq_opcnt = args->opcnt;
for (j = 0; j < genl_rqstp.rq_opcnt; j++)
genl_rqstp.rq_opnum[j] =
args->ops[j].opnum;
}
#endif /* CONFIG_NFSD_V4 */
/*
* Acquire rq_status_counter before reporting the rqst
* fields to the user.
*/
if (smp_load_acquire(&rqstp->rq_status_counter) !=
status_counter)
continue;
ret = nfsd_genl_rpc_status_compose_msg(skb, cb,
&genl_rqstp);
if (ret)
goto out;
}
}
cb->args[0] = i;
cb->args[1] = rqstp_index;
ret = skb->len;
out:
rcu_read_unlock();
return ret;
}
/**
* nfsd_nl_rpc_status_get_done - rpc_status_get dumpit post-processing
* @cb: netlink metadata and command arguments
*
* Return values:
* %0: Success
*/
int nfsd_nl_rpc_status_get_done(struct netlink_callback *cb)
{
mutex_lock(&nfsd_mutex);
nfsd_put(sock_net(cb->skb->sk));
mutex_unlock(&nfsd_mutex);
return 0;
}
/**
* nfsd_net_init - Prepare the nfsd_net portion of a new net namespace
* @net: a freshly-created network namespace
@ -1589,6 +1787,10 @@ static int __init init_nfsd(void)
retval = register_filesystem(&nfsd_fs_type);
if (retval)
goto out_free_all;
retval = genl_register_family(&nfsd_nl_family);
if (retval)
goto out_free_all;
return 0;
out_free_all:
nfsd4_destroy_laundry_wq();
@ -1613,6 +1815,7 @@ out_free_slabs:
static void __exit exit_nfsd(void)
{
genl_unregister_family(&nfsd_nl_family);
unregister_filesystem(&nfsd_fs_type);
nfsd4_destroy_laundry_wq();
unregister_cld_notifier();


@ -62,6 +62,23 @@ struct readdir_cd {
__be32 err; /* 0, nfserr, or nfserr_eof */
};
/* Maximum number of operations per session compound */
#define NFSD_MAX_OPS_PER_COMPOUND 50
struct nfsd_genl_rqstp {
struct sockaddr rq_daddr;
struct sockaddr rq_saddr;
unsigned long rq_flags;
ktime_t rq_stime;
__be32 rq_xid;
u32 rq_vers;
u32 rq_prog;
u32 rq_proc;
/* NFSv4 compound */
u32 rq_opcnt;
u32 rq_opnum[NFSD_MAX_OPS_PER_COMPOUND];
};
extern struct svc_program nfsd_program;
extern const struct svc_version nfsd_version2, nfsd_version3, nfsd_version4;


@ -771,7 +771,7 @@ enum fsid_source fsid_source(const struct svc_fh *fhp)
* assume that the new change attr is always logged to stable storage in some
* fashion before the results can be seen.
*/
u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode)
u64 nfsd4_change_attribute(const struct kstat *stat, const struct inode *inode)
{
u64 chattr;


@ -293,7 +293,8 @@ static inline void fh_clear_pre_post_attrs(struct svc_fh *fhp)
fhp->fh_pre_saved = false;
}
u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode);
u64 nfsd4_change_attribute(const struct kstat *stat,
const struct inode *inode);
__be32 __must_check fh_fill_pre_attrs(struct svc_fh *fhp);
__be32 fh_fill_post_attrs(struct svc_fh *fhp);
__be32 __must_check fh_fill_both_attrs(struct svc_fh *fhp);


@ -572,7 +572,6 @@ static void nfsd_last_thread(struct net *net)
return;
nfsd_shutdown_net(net);
pr_info("nfsd: last server has exited, flushing export cache\n");
nfsd_export_flush(net);
}
@ -713,14 +712,13 @@ int nfsd_nrpools(struct net *net)
int nfsd_get_nrthreads(int n, int *nthreads, struct net *net)
{
int i = 0;
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
struct svc_serv *serv = nn->nfsd_serv;
int i;
if (nn->nfsd_serv != NULL) {
for (i = 0; i < nn->nfsd_serv->sv_nrpools && i < n; i++)
nthreads[i] = nn->nfsd_serv->sv_pools[i].sp_nrthreads;
}
if (serv)
for (i = 0; i < serv->sv_nrpools && i < n; i++)
nthreads[i] = atomic_read(&serv->sv_pools[i].sp_nrthreads);
return 0;
}
@ -787,7 +785,6 @@ int
nfsd_svc(int nrservs, struct net *net, const struct cred *cred)
{
int error;
bool nfsd_up_before;
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
struct svc_serv *serv;
@ -807,8 +804,6 @@ nfsd_svc(int nrservs, struct net *net, const struct cred *cred)
error = nfsd_create_serv(net);
if (error)
goto out;
nfsd_up_before = nn->nfsd_net_up;
serv = nn->nfsd_serv;
error = nfsd_startup_net(net, cred);
@ -816,17 +811,15 @@ nfsd_svc(int nrservs, struct net *net, const struct cred *cred)
goto out_put;
error = svc_set_num_threads(serv, NULL, nrservs);
if (error)
goto out_shutdown;
goto out_put;
error = serv->sv_nrthreads;
if (error == 0)
nfsd_last_thread(net);
out_shutdown:
if (error < 0 && !nfsd_up_before)
nfsd_shutdown_net(net);
out_put:
/* Threads now hold service active */
if (xchg(&nn->keep_active, 0))
svc_put(serv);
if (serv->sv_nrthreads == 0)
nfsd_last_thread(net);
svc_put(serv);
out:
mutex_unlock(&nfsd_mutex);
@ -957,7 +950,7 @@ nfsd(void *vrqstp)
/*
* The main request loop
*/
while (!kthread_should_stop()) {
while (!svc_thread_should_stop(rqstp)) {
/* Update sv_maxconn if it has changed */
rqstp->rq_server->sv_maxconn = nn->max_connections;
@ -998,6 +991,15 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
if (!proc->pc_decode(rqstp, &rqstp->rq_arg_stream))
goto out_decode_err;
/*
* Release rq_status_counter setting it to an odd value after the rpc
* request has been properly parsed. rq_status_counter is used to
* notify the consumers if the rqstp fields are stable
* (rq_status_counter is odd) or not meaningful (rq_status_counter
* is even).
*/
smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter | 1);
rp = NULL;
switch (nfsd_cache_lookup(rqstp, &rp)) {
case RC_DOIT:
@ -1015,6 +1017,12 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
goto out_encode_err;
/*
* Release rq_status_counter setting it to an even value after the rpc
* request has been properly processed.
*/
smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
out_cached_reply:
return 1;


@ -27,12 +27,12 @@ struct nfsd4_layout_ops {
struct nfs4_client *clp,
struct nfsd4_getdeviceinfo *gdevp);
__be32 (*encode_getdeviceinfo)(struct xdr_stream *xdr,
struct nfsd4_getdeviceinfo *gdevp);
const struct nfsd4_getdeviceinfo *gdevp);
__be32 (*proc_layoutget)(struct inode *, const struct svc_fh *fhp,
struct nfsd4_layoutget *lgp);
__be32 (*encode_layoutget)(struct xdr_stream *,
struct nfsd4_layoutget *lgp);
__be32 (*encode_layoutget)(struct xdr_stream *xdr,
const struct nfsd4_layoutget *lgp);
__be32 (*proc_layoutcommit)(struct inode *inode,
struct nfsd4_layoutcommit *lcp);


@ -117,6 +117,24 @@ struct nfs4_cpntf_state {
time64_t cpntf_time; /* last time stateid used */
};
struct nfs4_cb_fattr {
struct nfsd4_callback ncf_getattr;
u32 ncf_cb_status;
u32 ncf_cb_bmap[1];
/* from CB_GETATTR reply */
u64 ncf_cb_change;
u64 ncf_cb_fsize;
unsigned long ncf_cb_flags;
bool ncf_file_modified;
u64 ncf_initial_cinfo;
u64 ncf_cur_fsize;
};
/* bits for ncf_cb_flags */
#define CB_GETATTR_BUSY 0
/*
* Represents a delegation stateid. The nfs4_client holds references to these
* and they are put when it is being destroyed or when the delegation is
@ -150,6 +168,9 @@ struct nfs4_delegation {
int dl_retries;
struct nfsd4_callback dl_recall;
bool dl_recalled;
/* for CB_GETATTR */
struct nfs4_cb_fattr dl_cb_fattr;
};
#define cb_to_delegation(cb) \
@ -174,8 +195,6 @@ static inline struct nfs4_delegation *delegstateid(struct nfs4_stid *s)
/* Maximum number of slots per session. 160 is useful for long haul TCP */
#define NFSD_MAX_SLOTS_PER_SESSION 160
/* Maximum number of operations per session compound */
#define NFSD_MAX_OPS_PER_COMPOUND 50
/* Maximum session per slot cache size */
#define NFSD_SLOT_CACHE_SIZE 2048
/* Maximum number of NFSD_SLOT_CACHE_SIZE slots per session */
@ -642,6 +661,7 @@ enum nfsd4_cb_op {
NFSPROC4_CLNT_CB_SEQUENCE,
NFSPROC4_CLNT_CB_NOTIFY_LOCK,
NFSPROC4_CLNT_CB_RECALL_ANY,
NFSPROC4_CLNT_CB_GETATTR,
};
/* Returns true iff a is later than b: */
@ -734,5 +754,6 @@ static inline bool try_to_expire_client(struct nfs4_client *clp)
}
extern __be32 nfsd4_deleg_getattr_conflict(struct svc_rqst *rqstp,
struct inode *inode);
struct inode *inode, bool *file_modified, u64 *size);
extern void nfs4_cb_getattr(struct nfs4_cb_fattr *ncf);
#endif /* NFSD4_STATE_H */


@ -60,7 +60,7 @@ static int nfsd_show(struct seq_file *seq, void *v)
#ifdef CONFIG_NFSD_V4
/* Show count for individual nfsv4 operations */
/* Writing operation numbers 0 1 2 also for maintaining uniformity */
seq_printf(seq,"proc4ops %u", LAST_NFS4_OP + 1);
seq_printf(seq, "proc4ops %u", LAST_NFS4_OP + 1);
for (i = 0; i <= LAST_NFS4_OP; i++) {
seq_printf(seq, " %lld",
percpu_counter_sum_positive(&nfsdstats.counter[NFSD_STATS_NFS4_OP(i)]));
@ -76,7 +76,7 @@ static int nfsd_show(struct seq_file *seq, void *v)
DEFINE_PROC_SHOW_ATTRIBUTE(nfsd);
int nfsd_percpu_counters_init(struct percpu_counter counters[], int num)
int nfsd_percpu_counters_init(struct percpu_counter *counters, int num)
{
int i, err = 0;


@ -37,9 +37,9 @@ extern struct nfsd_stats nfsdstats;
extern struct svc_stat nfsd_svcstats;
int nfsd_percpu_counters_init(struct percpu_counter counters[], int num);
void nfsd_percpu_counters_reset(struct percpu_counter counters[], int num);
void nfsd_percpu_counters_destroy(struct percpu_counter counters[], int num);
int nfsd_percpu_counters_init(struct percpu_counter *counters, int num);
void nfsd_percpu_counters_reset(struct percpu_counter *counters, int num);
void nfsd_percpu_counters_destroy(struct percpu_counter *counters, int num);
int nfsd_stat_init(void);
void nfsd_stat_shutdown(void);
@ -61,22 +61,22 @@ static inline void nfsd_stats_rc_nocache_inc(void)
static inline void nfsd_stats_fh_stale_inc(struct svc_export *exp)
{
percpu_counter_inc(&nfsdstats.counter[NFSD_STATS_FH_STALE]);
if (exp)
percpu_counter_inc(&exp->ex_stats.counter[EXP_STATS_FH_STALE]);
if (exp && exp->ex_stats)
percpu_counter_inc(&exp->ex_stats->counter[EXP_STATS_FH_STALE]);
}
static inline void nfsd_stats_io_read_add(struct svc_export *exp, s64 amount)
{
percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_READ], amount);
if (exp)
percpu_counter_add(&exp->ex_stats.counter[EXP_STATS_IO_READ], amount);
if (exp && exp->ex_stats)
percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_READ], amount);
}
static inline void nfsd_stats_io_write_add(struct svc_export *exp, s64 amount)
{
percpu_counter_add(&nfsdstats.counter[NFSD_STATS_IO_WRITE], amount);
if (exp)
percpu_counter_add(&exp->ex_stats.counter[EXP_STATS_IO_WRITE], amount);
if (exp && exp->ex_stats)
percpu_counter_add(&exp->ex_stats->counter[EXP_STATS_IO_WRITE], amount);
}
static inline void nfsd_stats_payload_misses_inc(struct nfsd_net *nn)


@ -1863,6 +1863,93 @@ TRACE_EVENT(nfsd_end_grace,
)
);
DECLARE_EVENT_CLASS(nfsd_copy_class,
TP_PROTO(
const struct nfsd4_copy *copy
),
TP_ARGS(copy),
TP_STRUCT__entry(
__field(bool, intra)
__field(bool, async)
__field(u32, src_cl_boot)
__field(u32, src_cl_id)
__field(u32, src_so_id)
__field(u32, src_si_generation)
__field(u32, dst_cl_boot)
__field(u32, dst_cl_id)
__field(u32, dst_so_id)
__field(u32, dst_si_generation)
__field(u64, src_cp_pos)
__field(u64, dst_cp_pos)
__field(u64, cp_count)
__sockaddr(addr, sizeof(struct sockaddr_in6))
),
TP_fast_assign(
const stateid_t *src_stp = &copy->cp_src_stateid;
const stateid_t *dst_stp = &copy->cp_dst_stateid;
__entry->intra = test_bit(NFSD4_COPY_F_INTRA, &copy->cp_flags);
__entry->async = !test_bit(NFSD4_COPY_F_SYNCHRONOUS, &copy->cp_flags);
__entry->src_cl_boot = src_stp->si_opaque.so_clid.cl_boot;
__entry->src_cl_id = src_stp->si_opaque.so_clid.cl_id;
__entry->src_so_id = src_stp->si_opaque.so_id;
__entry->src_si_generation = src_stp->si_generation;
__entry->dst_cl_boot = dst_stp->si_opaque.so_clid.cl_boot;
__entry->dst_cl_id = dst_stp->si_opaque.so_clid.cl_id;
__entry->dst_so_id = dst_stp->si_opaque.so_id;
__entry->dst_si_generation = dst_stp->si_generation;
__entry->src_cp_pos = copy->cp_src_pos;
__entry->dst_cp_pos = copy->cp_dst_pos;
__entry->cp_count = copy->cp_count;
__assign_sockaddr(addr, &copy->cp_clp->cl_addr,
sizeof(struct sockaddr_in6));
),
TP_printk("client=%pISpc intra=%d async=%d "
"src_stateid[si_generation:0x%x cl_boot:0x%x cl_id:0x%x so_id:0x%x] "
"dst_stateid[si_generation:0x%x cl_boot:0x%x cl_id:0x%x so_id:0x%x] "
"cp_src_pos=%llu cp_dst_pos=%llu cp_count=%llu",
__get_sockaddr(addr), __entry->intra, __entry->async,
__entry->src_si_generation, __entry->src_cl_boot,
__entry->src_cl_id, __entry->src_so_id,
__entry->dst_si_generation, __entry->dst_cl_boot,
__entry->dst_cl_id, __entry->dst_so_id,
__entry->src_cp_pos, __entry->dst_cp_pos, __entry->cp_count
)
);
#define DEFINE_COPY_EVENT(name) \
DEFINE_EVENT(nfsd_copy_class, nfsd_copy_##name, \
TP_PROTO(const struct nfsd4_copy *copy), \
TP_ARGS(copy))
DEFINE_COPY_EVENT(inter);
DEFINE_COPY_EVENT(intra);
DEFINE_COPY_EVENT(do_async);
TRACE_EVENT(nfsd_copy_done,
TP_PROTO(
const struct nfsd4_copy *copy,
__be32 status
),
TP_ARGS(copy, status),
TP_STRUCT__entry(
__field(int, status)
__field(bool, intra)
__field(bool, async)
__sockaddr(addr, sizeof(struct sockaddr_in6))
),
TP_fast_assign(
__entry->status = be32_to_cpu(status);
__entry->intra = test_bit(NFSD4_COPY_F_INTRA, &copy->cp_flags);
__entry->async = !test_bit(NFSD4_COPY_F_SYNCHRONOUS, &copy->cp_flags);
__assign_sockaddr(addr, &copy->cp_clp->cl_addr,
sizeof(struct sockaddr_in6));
),
TP_printk("addr=%pISpc status=%d intra=%d async=%d ",
__get_sockaddr(addr), __entry->status, __entry->intra, __entry->async
)
);
#endif /* _NFSD_TRACE_H */
#undef TRACE_INCLUDE_PATH


@ -337,6 +337,24 @@ out:
return err;
}
static void
commit_reset_write_verifier(struct nfsd_net *nn, struct svc_rqst *rqstp,
int err)
{
switch (err) {
case -EAGAIN:
case -ESTALE:
/*
* Neither of these are the result of a problem with
* durable storage, so avoid a write verifier reset.
*/
break;
default:
nfsd_reset_write_verifier(nn);
trace_nfsd_writeverf_reset(nn, rqstp, err);
}
}
/*
* Commit metadata changes to stable storage.
*/
@ -647,8 +665,7 @@ __be32 nfsd4_clone_file_range(struct svc_rqst *rqstp,
&nfsd4_get_cstate(rqstp)->current_fh,
dst_pos,
count, status);
nfsd_reset_write_verifier(nn);
trace_nfsd_writeverf_reset(nn, rqstp, status);
commit_reset_write_verifier(nn, rqstp, status);
ret = nfserrno(status);
}
}
@ -823,7 +840,7 @@ int nfsd_open_break_lease(struct inode *inode, int access)
* and additional flags.
* N.B. After this call fhp needs an fh_put
*/
static __be32
static int
__nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
int may_flags, struct file **filp)
{
@ -831,14 +848,12 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
struct inode *inode;
struct file *file;
int flags = O_RDONLY|O_LARGEFILE;
__be32 err;
int host_err = 0;
int host_err = -EPERM;
path.mnt = fhp->fh_export->ex_path.mnt;
path.dentry = fhp->fh_dentry;
inode = d_inode(path.dentry);
err = nfserr_perm;
if (IS_APPEND(inode) && (may_flags & NFSD_MAY_WRITE))
goto out;
@ -847,7 +862,7 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
host_err = nfsd_open_break_lease(inode, may_flags);
if (host_err) /* NOMEM or WOULDBLOCK */
goto out_nfserr;
goto out;
if (may_flags & NFSD_MAY_WRITE) {
if (may_flags & NFSD_MAY_READ)
@ -859,13 +874,13 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
file = dentry_open(&path, flags, current_cred());
if (IS_ERR(file)) {
host_err = PTR_ERR(file);
goto out_nfserr;
goto out;
}
host_err = ima_file_check(file, may_flags);
if (host_err) {
fput(file);
goto out_nfserr;
goto out;
}
if (may_flags & NFSD_MAY_64BIT_COOKIE)
@ -874,10 +889,8 @@ __nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
file->f_mode |= FMODE_32BITHASH;
*filp = file;
out_nfserr:
err = nfserrno(host_err);
out:
return err;
return host_err;
}
__be32
@ -885,6 +898,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
int may_flags, struct file **filp)
{
__be32 err;
int host_err;
bool retried = false;
validate_process_creds();
@ -904,12 +918,13 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
retry:
err = fh_verify(rqstp, fhp, type, may_flags);
if (!err) {
err = __nfsd_open(rqstp, fhp, type, may_flags, filp);
if (err == nfserr_stale && !retried) {
host_err = __nfsd_open(rqstp, fhp, type, may_flags, filp);
if (host_err == -EOPENSTALE && !retried) {
retried = true;
fh_put(fhp);
goto retry;
}
err = nfserrno(host_err);
}
validate_process_creds();
return err;
@ -922,13 +937,13 @@ retry:
* @may_flags: internal permission flags
* @filp: OUT: open "struct file *"
*
* Returns an nfsstat value in network byte order.
* Returns zero on success, or a negative errno value.
*/
__be32
int
nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp, int may_flags,
struct file **filp)
{
__be32 err;
int err;
validate_process_creds();
err = __nfsd_open(rqstp, fhp, S_IFREG, may_flags, filp);
@ -1172,8 +1187,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
host_err = vfs_iter_write(file, &iter, &pos, flags);
file_end_write(file);
if (host_err < 0) {
nfsd_reset_write_verifier(nn);
trace_nfsd_writeverf_reset(nn, rqstp, host_err);
commit_reset_write_verifier(nn, rqstp, host_err);
goto out_nfserr;
}
*cnt = host_err;
@ -1185,10 +1199,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
if (stable && use_wgather) {
host_err = wait_for_concurrent_writes(file);
if (host_err < 0) {
nfsd_reset_write_verifier(nn);
trace_nfsd_writeverf_reset(nn, rqstp, host_err);
}
if (host_err < 0)
commit_reset_write_verifier(nn, rqstp, host_err);
}
out_nfserr:
@ -1331,8 +1343,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
err = nfserr_notsupp;
break;
default:
nfsd_reset_write_verifier(nn);
trace_nfsd_writeverf_reset(nn, rqstp, err2);
commit_reset_write_verifier(nn, rqstp, err2);
err = nfserrno(err2);
}
} else


@ -104,8 +104,8 @@ __be32 nfsd_setxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
int nfsd_open_break_lease(struct inode *, int);
__be32 nfsd_open(struct svc_rqst *, struct svc_fh *, umode_t,
int, struct file **);
__be32 nfsd_open_verified(struct svc_rqst *, struct svc_fh *,
int, struct file **);
int nfsd_open_verified(struct svc_rqst *rqstp, struct svc_fh *fhp,
int may_flags, struct file **filp);
__be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct file *file, loff_t offset,
unsigned long *count,


@ -50,6 +50,134 @@
#define HAS_CSTATE_FLAG(c, f) ((c)->sid_flags & (f))
#define CLEAR_CSTATE_FLAG(c, f) ((c)->sid_flags &= ~(f))
/**
* nfsd4_encode_bool - Encode an XDR bool type result
* @xdr: target XDR stream
* @val: boolean value to encode
*
* Return values:
* %nfs_ok: @val encoded; @xdr advanced to next position
* %nfserr_resource: stream buffer space exhausted
*/
static __always_inline __be32
nfsd4_encode_bool(struct xdr_stream *xdr, bool val)
{
__be32 *p = xdr_reserve_space(xdr, XDR_UNIT);
if (unlikely(p == NULL))
return nfserr_resource;
*p = val ? xdr_one : xdr_zero;
return nfs_ok;
}
/**
* nfsd4_encode_uint32_t - Encode an XDR uint32_t type result
* @xdr: target XDR stream
* @val: integer value to encode
*
* Return values:
* %nfs_ok: @val encoded; @xdr advanced to next position
* %nfserr_resource: stream buffer space exhausted
*/
static __always_inline __be32
nfsd4_encode_uint32_t(struct xdr_stream *xdr, u32 val)
{
__be32 *p = xdr_reserve_space(xdr, XDR_UNIT);
if (unlikely(p == NULL))
return nfserr_resource;
*p = cpu_to_be32(val);
return nfs_ok;
}
#define nfsd4_encode_aceflag4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_acemask4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_acetype4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_count4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_mode4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_nfs_lease4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_qop4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_sequenceid4(x, v) nfsd4_encode_uint32_t(x, v)
#define nfsd4_encode_slotid4(x, v) nfsd4_encode_uint32_t(x, v)
/**
* nfsd4_encode_uint64_t - Encode an XDR uint64_t type result
* @xdr: target XDR stream
* @val: integer value to encode
*
* Return values:
* %nfs_ok: @val encoded; @xdr advanced to next position
* %nfserr_resource: stream buffer space exhausted
*/
static __always_inline __be32
nfsd4_encode_uint64_t(struct xdr_stream *xdr, u64 val)
{
__be32 *p = xdr_reserve_space(xdr, XDR_UNIT * 2);
if (unlikely(p == NULL))
return nfserr_resource;
put_unaligned_be64(val, p);
return nfs_ok;
}
#define nfsd4_encode_changeid4(x, v) nfsd4_encode_uint64_t(x, v)
#define nfsd4_encode_nfs_cookie4(x, v) nfsd4_encode_uint64_t(x, v)
#define nfsd4_encode_length4(x, v) nfsd4_encode_uint64_t(x, v)
#define nfsd4_encode_offset4(x, v) nfsd4_encode_uint64_t(x, v)
/**
* nfsd4_encode_opaque_fixed - Encode a fixed-length XDR opaque type result
* @xdr: target XDR stream
* @data: pointer to data
* @size: length of data in bytes
*
* Return values:
* %nfs_ok: @data encoded; @xdr advanced to next position
* %nfserr_resource: stream buffer space exhausted
*/
static __always_inline __be32
nfsd4_encode_opaque_fixed(struct xdr_stream *xdr, const void *data,
size_t size)
{
__be32 *p = xdr_reserve_space(xdr, xdr_align_size(size));
size_t pad = xdr_pad_size(size);
if (unlikely(p == NULL))
return nfserr_resource;
memcpy(p, data, size);
if (pad)
memset((char *)p + size, 0, pad);
return nfs_ok;
}
/**
* nfsd4_encode_opaque - Encode a variable-length XDR opaque type result
* @xdr: target XDR stream
* @data: pointer to data
* @size: length of data in bytes
*
* Return values:
* %nfs_ok: @data encoded; @xdr advanced to next position
* %nfserr_resource: stream buffer space exhausted
*/
static __always_inline __be32
nfsd4_encode_opaque(struct xdr_stream *xdr, const void *data, size_t size)
{
size_t pad = xdr_pad_size(size);
__be32 *p;
p = xdr_reserve_space(xdr, XDR_UNIT + xdr_align_size(size));
if (unlikely(p == NULL))
return nfserr_resource;
*p++ = cpu_to_be32(size);
memcpy(p, data, size);
if (pad)
memset((char *)p + size, 0, pad);
return nfs_ok;
}
#define nfsd4_encode_component4(x, d, s) nfsd4_encode_opaque(x, d, s)
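These helpers are meant to be composed into per-operation encoders. As a minimal sketch (a hypothetical encoder, not one from this series), a result consisting of a boolean discriminator followed by an optional uint64 (the shape of LAYOUTCOMMIT's size-change result) could be built like so:

/* Illustrative only: encode a switched bool/uint64 result. */
static __be32
nfsd4_encode_example_newsize(struct xdr_stream *xdr, bool size_chg, u64 newsize)
{
	__be32 status;

	status = nfsd4_encode_bool(xdr, size_chg);	/* discriminator */
	if (status != nfs_ok)
		return status;
	if (!size_chg)
		return nfs_ok;				/* no optional arm */
	return nfsd4_encode_uint64_t(xdr, newsize);	/* optional uint64 arm */
}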
struct nfsd4_compound_state {
struct svc_fh current_fh;
struct svc_fh save_fh;
@ -170,12 +298,8 @@ struct nfsd4_lock {
} v;
/* response */
union {
struct {
stateid_t stateid;
} ok;
struct nfsd4_lock_denied denied;
} u;
stateid_t lk_resp_stateid;
struct nfsd4_lock_denied lk_denied;
};
#define lk_new_open_seqid v.new.open_seqid
#define lk_new_open_stateid v.new.open_stateid
@ -185,20 +309,15 @@ struct nfsd4_lock {
#define lk_old_lock_stateid v.old.lock_stateid
#define lk_old_lock_seqid v.old.lock_seqid
#define lk_resp_stateid u.ok.stateid
#define lk_denied u.denied
struct nfsd4_lockt {
u32 lt_type;
clientid_t lt_clientid;
struct xdr_netobj lt_owner;
u64 lt_offset;
u64 lt_length;
struct nfsd4_lock_denied lt_denied;
struct nfsd4_lock_denied lt_denied;
};
struct nfsd4_locku {
u32 lu_type;
u32 lu_seqid;
@ -267,9 +386,9 @@ struct nfsd4_open {
u32 op_deleg_want; /* request */
stateid_t op_stateid; /* response */
__be32 op_xdr_error; /* see nfsd4_open_omfg() */
u32 op_recall; /* recall */
struct nfsd4_change_info op_cinfo; /* response */
u32 op_rflags; /* response */
bool op_recall; /* response */
bool op_truncate; /* used during processing */
bool op_created; /* used during processing */
struct nfs4_openowner *op_openowner; /* used during processing */
@ -496,7 +615,7 @@ struct nfsd4_layoutcommit {
u32 lc_layout_type; /* request */
u32 lc_up_len; /* layout length */
void *lc_up_layout; /* decoded by callback */
u32 lc_size_chg; /* boolean for response */
bool lc_size_chg; /* response */
u64 lc_newsize; /* response */
};
@ -508,7 +627,7 @@ struct nfsd4_layoutreturn {
u32 lrf_body_len; /* request */
void *lrf_body; /* request */
stateid_t lr_sid; /* request/response */
u32 lrs_present; /* response */
bool lrs_present; /* response */
};
struct nfsd4_fallocate {
@ -626,8 +745,7 @@ struct nfsd4_copy_notify {
/* response */
stateid_t cpn_cnr_stateid;
u64 cpn_sec;
u32 cpn_nsec;
struct timespec64 cpn_lease_time;
struct nl4_server *cpn_src;
};
@ -820,8 +938,10 @@ extern __be32 nfsd4_open_downgrade(struct svc_rqst *rqstp,
struct nfsd4_compound_state *, union nfsd4_op_u *u);
extern __be32 nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *,
union nfsd4_op_u *u);
extern void nfsd4_lock_release(union nfsd4_op_u *u);
extern __be32 nfsd4_lockt(struct svc_rqst *rqstp, struct nfsd4_compound_state *,
union nfsd4_op_u *u);
extern void nfsd4_lockt_release(union nfsd4_op_u *u);
extern __be32 nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *,
union nfsd4_op_u *u);
extern __be32


@ -54,3 +54,21 @@
#define NFS4_dec_cb_recall_any_sz (cb_compound_dec_hdr_sz + \
cb_sequence_dec_sz + \
op_dec_sz)
/*
* 1: CB_GETATTR opcode (32-bit)
* N: file_handle
* 1: number of entries in attribute array (32-bit)
* 1: entry 0 in attribute array (32-bit)
*/
#define NFS4_enc_cb_getattr_sz (cb_compound_enc_hdr_sz + \
cb_sequence_enc_sz + \
1 + enc_nfs4_fh_sz + 1 + 1)
/*
* 4: fattr_bitmap_maxsz
* 1: attribute array len
* 2: change attr (64-bit)
* 2: size (64-bit)
*/
#define NFS4_dec_cb_getattr_sz (cb_compound_dec_hdr_sz + \
cb_sequence_dec_sz + 4 + 1 + 2 + 2 + op_dec_sz)


@ -224,9 +224,23 @@ struct export_operations {
atomic attribute updates
*/
#define EXPORT_OP_FLUSH_ON_CLOSE (0x20) /* fs flushes file data on close */
#define EXPORT_OP_ASYNC_LOCK (0x40) /* fs can do async lock request */
unsigned long flags;
};
/**
* exportfs_lock_op_is_async() - export op supports async lock operation
* @export_ops: the nfs export operations to check
*
* Returns true if the nfs export_operations structure has
* EXPORT_OP_ASYNC_LOCK set in its flags.
*/
static inline bool
exportfs_lock_op_is_async(const struct export_operations *export_ops)
{
return export_ops->flags & EXPORT_OP_ASYNC_LOCK;
}
extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
int *max_len, struct inode *parent,
int flags);


@ -256,7 +256,7 @@ inode_peek_iversion(const struct inode *inode)
* For filesystems without any sort of change attribute, the best we can
* do is fake one up from the ctime:
*/
static inline u64 time_to_chattr(struct timespec64 *t)
static inline u64 time_to_chattr(const struct timespec64 *t)
{
u64 chattr = t->tv_sec;


@ -73,6 +73,33 @@ static inline void init_llist_head(struct llist_head *list)
list->first = NULL;
}
/**
* init_llist_node - initialize lock-less list node
* @node: the node to be initialised
*
* In cases where there is a need to test if a node is on
* a list or not, this initialises the node to clearly
* not be on any list.
*/
static inline void init_llist_node(struct llist_node *node)
{
node->next = node;
}
/**
* llist_on_list - test if a lock-less list node is on a list
* @node: the node to test
*
* When a node is on a list the ->next pointer will be NULL or
* some other node. It can never point to itself. We use that
* in init_llist_node() to record that a node is not on any list,
* and here to test whether it is on any list.
*/
static inline bool llist_on_list(const struct llist_node *node)
{
return node->next != node;
}
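A minimal round-trip sketch (illustrative, not from this patch set) of how these two helpers pair up:

static void llist_on_list_demo(void)
{
	static LLIST_HEAD(head);
	struct llist_node node;

	init_llist_node(&node);		/* node->next == &node: off any list */
	WARN_ON(llist_on_list(&node));	/* reports false */
	llist_add(&node, &head);	/* ->next becomes NULL or another node */
	WARN_ON(!llist_on_list(&node));	/* reports true */
}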
/**
* llist_entry - get the struct of this entry
* @ptr: the &struct llist_node pointer.
@ -249,6 +276,25 @@ static inline struct llist_node *__llist_del_all(struct llist_head *head)
extern struct llist_node *llist_del_first(struct llist_head *head);
/**
* llist_del_first_init - delete first entry from lock-less list and mark it as being off-list
* @head: the head of lock-less list to delete from.
*
* This behaves the same as llist_del_first() except that init_llist_node() is called
* on the returned node so that llist_on_list() will report false for the node.
*/
static inline struct llist_node *llist_del_first_init(struct llist_head *head)
{
struct llist_node *n = llist_del_first(head);
if (n)
init_llist_node(n);
return n;
}
extern bool llist_del_first_this(struct llist_head *head,
struct llist_node *this);
struct llist_node *llist_reverse_order(struct llist_node *head);
#endif /* LLIST_H */


@ -282,7 +282,7 @@ __be32 nlmsvc_testlock(struct svc_rqst *, struct nlm_file *,
struct nlm_host *, struct nlm_lock *,
struct nlm_lock *, struct nlm_cookie *);
__be32 nlmsvc_cancel_blocked(struct net *net, struct nlm_file *, struct nlm_lock *);
void nlmsvc_retry_blocked(void);
void nlmsvc_retry_blocked(struct svc_rqst *rqstp);
void nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
nlm_host_match_fn_t match);
void nlmsvc_grant_reply(struct nlm_cookie *, __be32);

include/linux/lwq.h

@ -0,0 +1,124 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef LWQ_H
#define LWQ_H
/*
* Light-weight single-linked queue built from llist
*
* Entries can be enqueued from any context with no locking.
* Entries can be dequeued from process context with integrated locking.
*
* This is particularly suitable when work items are queued in
* BH or IRQ context, and where work items are handled one at a time
* by dedicated threads.
*/
#include <linux/container_of.h>
#include <linux/spinlock.h>
#include <linux/llist.h>
struct lwq_node {
struct llist_node node;
};
struct lwq {
spinlock_t lock;
struct llist_node *ready; /* entries to be dequeued */
struct llist_head new; /* entries being enqueued */
};
/**
* lwq_init - initialise a lwq
* @q: the lwq object
*/
static inline void lwq_init(struct lwq *q)
{
spin_lock_init(&q->lock);
q->ready = NULL;
init_llist_head(&q->new);
}
/**
* lwq_empty - test if lwq contains any entry
* @q: the lwq object
*
* This empty test contains an acquire barrier so that if a wakeup
* is sent after lwq_enqueue() returns true, it is safe to go to sleep
* after a test on lwq_empty().
*/
static inline bool lwq_empty(struct lwq *q)
{
/* acquire ensures ordering wrt lwq_enqueue() */
return smp_load_acquire(&q->ready) == NULL && llist_empty(&q->new);
}
struct llist_node *__lwq_dequeue(struct lwq *q);
/**
* lwq_dequeue - dequeue first (oldest) entry from lwq
* @q: the queue to dequeue from
* @type: the type of object to return
* @member: the member in the returned object which is an lwq_node.
*
* Remove a single object from the lwq and return it. This will take
* a spinlock and so must always be called in the same context, typically
* process context.
*/
#define lwq_dequeue(q, type, member) \
({ struct llist_node *_n = __lwq_dequeue(q); \
_n ? container_of(_n, type, member.node) : NULL; })
struct llist_node *lwq_dequeue_all(struct lwq *q);
/**
* lwq_for_each_safe - iterate over detached queue allowing deletion
* @_n: iterator variable
* @_t1: temporary struct llist_node **
* @_t2: temporary struct llist_node *
* @_l: address of llist_node pointer from lwq_dequeue_all()
* @_member: member in _n where lwq_node is found.
*
* Iterate over members in a dequeued list. If the iterator variable
* is set to NULL, the iterator removes that entry from the queue.
*/
#define lwq_for_each_safe(_n, _t1, _t2, _l, _member) \
for (_t1 = (_l); \
*(_t1) ? (_n = container_of(*(_t1), typeof(*(_n)), _member.node),\
_t2 = ((*_t1)->next), \
true) \
: false; \
(_n) ? (_t1 = &(_n)->_member.node.next, 0) \
: ((*(_t1) = (_t2)), 0))
/**
* lwq_enqueue - add a new item to the end of the queue
* @n: the lwq_node embedded in the item to be added
* @q: the lwq to append to.
*
* No locking is needed to append to the queue so this can
* be called from any context.
* Return %true if the list may have previously been empty.
*/
static inline bool lwq_enqueue(struct lwq_node *n, struct lwq *q)
{
/* acquire ensures ordering wrt lwq_dequeue */
return llist_add(&n->node, &q->new) &&
smp_load_acquire(&q->ready) == NULL;
}
/**
* lwq_enqueue_batch - add a list of new items to the end of the queue
* @n: the lwq_node embedded in the first item to be added
* @q: the lwq to append to.
*
* No locking is needed to append to the queue so this can
* be called from any context.
* Return %true if the list may have previously been empty.
*/
static inline bool lwq_enqueue_batch(struct llist_node *n, struct lwq *q)
{
struct llist_node *e = n;
/* acquire ensures ordering wrt lwq_dequeue */
return llist_add_batch(llist_reverse_order(n), e, &q->new) &&
smp_load_acquire(&q->ready) == NULL;
}
#endif /* LWQ_H */
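A minimal usage sketch with hypothetical names (lwq_init() must run once before first use): producers may enqueue from any context, waking a waiter only when the queue may have been empty, while a consumer dequeues from process context:

struct work_item {
	struct lwq_node node;
	int payload;
};

static struct lwq work_queue;			/* lwq_init(&work_queue) at setup */

static void produce(struct work_item *item)	/* any context */
{
	if (lwq_enqueue(&item->node, &work_queue))
		wake_up_var(&work_queue);	/* queue may have been empty */
}

static void consume(void)			/* process context */
{
	struct work_item *item;

	while ((item = lwq_dequeue(&work_queue, struct work_item, node)) != NULL)
		handle_payload(item->payload);	/* handle_payload() is hypothetical */
}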


@ -150,7 +150,7 @@ enum nfs_opnum4 {
OP_WRITE_SAME = 70,
OP_CLONE = 71,
/* xattr support (RFC8726) */
/* xattr support (RFC8276) */
OP_GETXATTR = 72,
OP_SETXATTR = 73,
OP_LISTXATTRS = 74,
@ -389,79 +389,203 @@ enum lock_type4 {
NFS4_WRITEW_LT = 4
};
/*
* Symbol names and values are from RFC 7531 Section 2.
* "XDR Description of NFSv4.0"
*/
enum {
FATTR4_SUPPORTED_ATTRS = 0,
FATTR4_TYPE = 1,
FATTR4_FH_EXPIRE_TYPE = 2,
FATTR4_CHANGE = 3,
FATTR4_SIZE = 4,
FATTR4_LINK_SUPPORT = 5,
FATTR4_SYMLINK_SUPPORT = 6,
FATTR4_NAMED_ATTR = 7,
FATTR4_FSID = 8,
FATTR4_UNIQUE_HANDLES = 9,
FATTR4_LEASE_TIME = 10,
FATTR4_RDATTR_ERROR = 11,
FATTR4_ACL = 12,
FATTR4_ACLSUPPORT = 13,
FATTR4_ARCHIVE = 14,
FATTR4_CANSETTIME = 15,
FATTR4_CASE_INSENSITIVE = 16,
FATTR4_CASE_PRESERVING = 17,
FATTR4_CHOWN_RESTRICTED = 18,
FATTR4_FILEHANDLE = 19,
FATTR4_FILEID = 20,
FATTR4_FILES_AVAIL = 21,
FATTR4_FILES_FREE = 22,
FATTR4_FILES_TOTAL = 23,
FATTR4_FS_LOCATIONS = 24,
FATTR4_HIDDEN = 25,
FATTR4_HOMOGENEOUS = 26,
FATTR4_MAXFILESIZE = 27,
FATTR4_MAXLINK = 28,
FATTR4_MAXNAME = 29,
FATTR4_MAXREAD = 30,
FATTR4_MAXWRITE = 31,
FATTR4_MIMETYPE = 32,
FATTR4_MODE = 33,
FATTR4_NO_TRUNC = 34,
FATTR4_NUMLINKS = 35,
FATTR4_OWNER = 36,
FATTR4_OWNER_GROUP = 37,
FATTR4_QUOTA_AVAIL_HARD = 38,
FATTR4_QUOTA_AVAIL_SOFT = 39,
FATTR4_QUOTA_USED = 40,
FATTR4_RAWDEV = 41,
FATTR4_SPACE_AVAIL = 42,
FATTR4_SPACE_FREE = 43,
FATTR4_SPACE_TOTAL = 44,
FATTR4_SPACE_USED = 45,
FATTR4_SYSTEM = 46,
FATTR4_TIME_ACCESS = 47,
FATTR4_TIME_ACCESS_SET = 48,
FATTR4_TIME_BACKUP = 49,
FATTR4_TIME_CREATE = 50,
FATTR4_TIME_DELTA = 51,
FATTR4_TIME_METADATA = 52,
FATTR4_TIME_MODIFY = 53,
FATTR4_TIME_MODIFY_SET = 54,
FATTR4_MOUNTED_ON_FILEID = 55,
};
/*
* Symbol names and values are from RFC 5662 Section 2.
* "XDR Description of NFSv4.1"
*/
enum {
FATTR4_DIR_NOTIF_DELAY = 56,
FATTR4_DIRENT_NOTIF_DELAY = 57,
FATTR4_DACL = 58,
FATTR4_SACL = 59,
FATTR4_CHANGE_POLICY = 60,
FATTR4_FS_STATUS = 61,
FATTR4_FS_LAYOUT_TYPES = 62,
FATTR4_LAYOUT_HINT = 63,
FATTR4_LAYOUT_TYPES = 64,
FATTR4_LAYOUT_BLKSIZE = 65,
FATTR4_LAYOUT_ALIGNMENT = 66,
FATTR4_FS_LOCATIONS_INFO = 67,
FATTR4_MDSTHRESHOLD = 68,
FATTR4_RETENTION_GET = 69,
FATTR4_RETENTION_SET = 70,
FATTR4_RETENTEVT_GET = 71,
FATTR4_RETENTEVT_SET = 72,
FATTR4_RETENTION_HOLD = 73,
FATTR4_MODE_SET_MASKED = 74,
FATTR4_SUPPATTR_EXCLCREAT = 75,
FATTR4_FS_CHARSET_CAP = 76,
};
/*
* Symbol names and values are from RFC 7863 Section 2.
* "XDR Description of NFSv4.2"
*/
enum {
FATTR4_CLONE_BLKSIZE = 77,
FATTR4_SPACE_FREED = 78,
FATTR4_CHANGE_ATTR_TYPE = 79,
FATTR4_SEC_LABEL = 80,
};
/*
* Symbol names and values are from RFC 8275 Section 5.
* "The mode_umask Attribute"
*/
enum {
FATTR4_MODE_UMASK = 81,
};
/*
* Symbol names and values are from RFC 8276 Section 8.6.
* "Numeric Values Assigned to Protocol Extensions"
*/
enum {
FATTR4_XATTR_SUPPORT = 82,
};
/*
* The following internal definitions enable processing the above
* attribute bits within 32-bit word boundaries.
*/
/* Mandatory Attributes */
#define FATTR4_WORD0_SUPPORTED_ATTRS (1UL << 0)
#define FATTR4_WORD0_TYPE (1UL << 1)
#define FATTR4_WORD0_FH_EXPIRE_TYPE (1UL << 2)
#define FATTR4_WORD0_CHANGE (1UL << 3)
#define FATTR4_WORD0_SIZE (1UL << 4)
#define FATTR4_WORD0_LINK_SUPPORT (1UL << 5)
#define FATTR4_WORD0_SYMLINK_SUPPORT (1UL << 6)
#define FATTR4_WORD0_NAMED_ATTR (1UL << 7)
#define FATTR4_WORD0_FSID (1UL << 8)
#define FATTR4_WORD0_UNIQUE_HANDLES (1UL << 9)
#define FATTR4_WORD0_LEASE_TIME (1UL << 10)
#define FATTR4_WORD0_RDATTR_ERROR (1UL << 11)
#define FATTR4_WORD0_SUPPORTED_ATTRS BIT(FATTR4_SUPPORTED_ATTRS)
#define FATTR4_WORD0_TYPE BIT(FATTR4_TYPE)
#define FATTR4_WORD0_FH_EXPIRE_TYPE BIT(FATTR4_FH_EXPIRE_TYPE)
#define FATTR4_WORD0_CHANGE BIT(FATTR4_CHANGE)
#define FATTR4_WORD0_SIZE BIT(FATTR4_SIZE)
#define FATTR4_WORD0_LINK_SUPPORT BIT(FATTR4_LINK_SUPPORT)
#define FATTR4_WORD0_SYMLINK_SUPPORT BIT(FATTR4_SYMLINK_SUPPORT)
#define FATTR4_WORD0_NAMED_ATTR BIT(FATTR4_NAMED_ATTR)
#define FATTR4_WORD0_FSID BIT(FATTR4_FSID)
#define FATTR4_WORD0_UNIQUE_HANDLES BIT(FATTR4_UNIQUE_HANDLES)
#define FATTR4_WORD0_LEASE_TIME BIT(FATTR4_LEASE_TIME)
#define FATTR4_WORD0_RDATTR_ERROR BIT(FATTR4_RDATTR_ERROR)
/* Mandatory in NFSv4.1 */
#define FATTR4_WORD2_SUPPATTR_EXCLCREAT (1UL << 11)
#define FATTR4_WORD2_SUPPATTR_EXCLCREAT BIT(FATTR4_SUPPATTR_EXCLCREAT - 64)
/* Recommended Attributes */
#define FATTR4_WORD0_ACL (1UL << 12)
#define FATTR4_WORD0_ACLSUPPORT (1UL << 13)
#define FATTR4_WORD0_ARCHIVE (1UL << 14)
#define FATTR4_WORD0_CANSETTIME (1UL << 15)
#define FATTR4_WORD0_CASE_INSENSITIVE (1UL << 16)
#define FATTR4_WORD0_CASE_PRESERVING (1UL << 17)
#define FATTR4_WORD0_CHOWN_RESTRICTED (1UL << 18)
#define FATTR4_WORD0_FILEHANDLE (1UL << 19)
#define FATTR4_WORD0_FILEID (1UL << 20)
#define FATTR4_WORD0_FILES_AVAIL (1UL << 21)
#define FATTR4_WORD0_FILES_FREE (1UL << 22)
#define FATTR4_WORD0_FILES_TOTAL (1UL << 23)
#define FATTR4_WORD0_FS_LOCATIONS (1UL << 24)
#define FATTR4_WORD0_HIDDEN (1UL << 25)
#define FATTR4_WORD0_HOMOGENEOUS (1UL << 26)
#define FATTR4_WORD0_MAXFILESIZE (1UL << 27)
#define FATTR4_WORD0_MAXLINK (1UL << 28)
#define FATTR4_WORD0_MAXNAME (1UL << 29)
#define FATTR4_WORD0_MAXREAD (1UL << 30)
#define FATTR4_WORD0_MAXWRITE (1UL << 31)
#define FATTR4_WORD1_MIMETYPE (1UL << 0)
#define FATTR4_WORD1_MODE (1UL << 1)
#define FATTR4_WORD1_NO_TRUNC (1UL << 2)
#define FATTR4_WORD1_NUMLINKS (1UL << 3)
#define FATTR4_WORD1_OWNER (1UL << 4)
#define FATTR4_WORD1_OWNER_GROUP (1UL << 5)
#define FATTR4_WORD1_QUOTA_HARD (1UL << 6)
#define FATTR4_WORD1_QUOTA_SOFT (1UL << 7)
#define FATTR4_WORD1_QUOTA_USED (1UL << 8)
#define FATTR4_WORD1_RAWDEV (1UL << 9)
#define FATTR4_WORD1_SPACE_AVAIL (1UL << 10)
#define FATTR4_WORD1_SPACE_FREE (1UL << 11)
#define FATTR4_WORD1_SPACE_TOTAL (1UL << 12)
#define FATTR4_WORD1_SPACE_USED (1UL << 13)
#define FATTR4_WORD1_SYSTEM (1UL << 14)
#define FATTR4_WORD1_TIME_ACCESS (1UL << 15)
#define FATTR4_WORD1_TIME_ACCESS_SET (1UL << 16)
#define FATTR4_WORD1_TIME_BACKUP (1UL << 17)
#define FATTR4_WORD1_TIME_CREATE (1UL << 18)
#define FATTR4_WORD1_TIME_DELTA (1UL << 19)
#define FATTR4_WORD1_TIME_METADATA (1UL << 20)
#define FATTR4_WORD1_TIME_MODIFY (1UL << 21)
#define FATTR4_WORD1_TIME_MODIFY_SET (1UL << 22)
#define FATTR4_WORD1_MOUNTED_ON_FILEID (1UL << 23)
#define FATTR4_WORD1_DACL (1UL << 26)
#define FATTR4_WORD1_SACL (1UL << 27)
#define FATTR4_WORD1_FS_LAYOUT_TYPES (1UL << 30)
#define FATTR4_WORD2_LAYOUT_TYPES (1UL << 0)
#define FATTR4_WORD2_LAYOUT_BLKSIZE (1UL << 1)
#define FATTR4_WORD2_MDSTHRESHOLD (1UL << 4)
#define FATTR4_WORD2_CLONE_BLKSIZE (1UL << 13)
#define FATTR4_WORD2_CHANGE_ATTR_TYPE (1UL << 15)
#define FATTR4_WORD2_SECURITY_LABEL (1UL << 16)
#define FATTR4_WORD2_MODE_UMASK (1UL << 17)
#define FATTR4_WORD2_XATTR_SUPPORT (1UL << 18)
#define FATTR4_WORD0_ACL BIT(FATTR4_ACL)
#define FATTR4_WORD0_ACLSUPPORT BIT(FATTR4_ACLSUPPORT)
#define FATTR4_WORD0_ARCHIVE BIT(FATTR4_ARCHIVE)
#define FATTR4_WORD0_CANSETTIME BIT(FATTR4_CANSETTIME)
#define FATTR4_WORD0_CASE_INSENSITIVE BIT(FATTR4_CASE_INSENSITIVE)
#define FATTR4_WORD0_CASE_PRESERVING BIT(FATTR4_CASE_PRESERVING)
#define FATTR4_WORD0_CHOWN_RESTRICTED BIT(FATTR4_CHOWN_RESTRICTED)
#define FATTR4_WORD0_FILEHANDLE BIT(FATTR4_FILEHANDLE)
#define FATTR4_WORD0_FILEID BIT(FATTR4_FILEID)
#define FATTR4_WORD0_FILES_AVAIL BIT(FATTR4_FILES_AVAIL)
#define FATTR4_WORD0_FILES_FREE BIT(FATTR4_FILES_FREE)
#define FATTR4_WORD0_FILES_TOTAL BIT(FATTR4_FILES_TOTAL)
#define FATTR4_WORD0_FS_LOCATIONS BIT(FATTR4_FS_LOCATIONS)
#define FATTR4_WORD0_HIDDEN BIT(FATTR4_HIDDEN)
#define FATTR4_WORD0_HOMOGENEOUS BIT(FATTR4_HOMOGENEOUS)
#define FATTR4_WORD0_MAXFILESIZE BIT(FATTR4_MAXFILESIZE)
#define FATTR4_WORD0_MAXLINK BIT(FATTR4_MAXLINK)
#define FATTR4_WORD0_MAXNAME BIT(FATTR4_MAXNAME)
#define FATTR4_WORD0_MAXREAD BIT(FATTR4_MAXREAD)
#define FATTR4_WORD0_MAXWRITE BIT(FATTR4_MAXWRITE)
#define FATTR4_WORD1_MIMETYPE BIT(FATTR4_MIMETYPE - 32)
#define FATTR4_WORD1_MODE BIT(FATTR4_MODE - 32)
#define FATTR4_WORD1_NO_TRUNC BIT(FATTR4_NO_TRUNC - 32)
#define FATTR4_WORD1_NUMLINKS BIT(FATTR4_NUMLINKS - 32)
#define FATTR4_WORD1_OWNER BIT(FATTR4_OWNER - 32)
#define FATTR4_WORD1_OWNER_GROUP BIT(FATTR4_OWNER_GROUP - 32)
#define FATTR4_WORD1_QUOTA_HARD BIT(FATTR4_QUOTA_AVAIL_HARD - 32)
#define FATTR4_WORD1_QUOTA_SOFT BIT(FATTR4_QUOTA_AVAIL_SOFT - 32)
#define FATTR4_WORD1_QUOTA_USED BIT(FATTR4_QUOTA_USED - 32)
#define FATTR4_WORD1_RAWDEV BIT(FATTR4_RAWDEV - 32)
#define FATTR4_WORD1_SPACE_AVAIL BIT(FATTR4_SPACE_AVAIL - 32)
#define FATTR4_WORD1_SPACE_FREE BIT(FATTR4_SPACE_FREE - 32)
#define FATTR4_WORD1_SPACE_TOTAL BIT(FATTR4_SPACE_TOTAL - 32)
#define FATTR4_WORD1_SPACE_USED BIT(FATTR4_SPACE_USED - 32)
#define FATTR4_WORD1_SYSTEM BIT(FATTR4_SYSTEM - 32)
#define FATTR4_WORD1_TIME_ACCESS BIT(FATTR4_TIME_ACCESS - 32)
#define FATTR4_WORD1_TIME_ACCESS_SET BIT(FATTR4_TIME_ACCESS_SET - 32)
#define FATTR4_WORD1_TIME_BACKUP BIT(FATTR4_TIME_BACKUP - 32)
#define FATTR4_WORD1_TIME_CREATE BIT(FATTR4_TIME_CREATE - 32)
#define FATTR4_WORD1_TIME_DELTA BIT(FATTR4_TIME_DELTA - 32)
#define FATTR4_WORD1_TIME_METADATA BIT(FATTR4_TIME_METADATA - 32)
#define FATTR4_WORD1_TIME_MODIFY BIT(FATTR4_TIME_MODIFY - 32)
#define FATTR4_WORD1_TIME_MODIFY_SET BIT(FATTR4_TIME_MODIFY_SET - 32)
#define FATTR4_WORD1_MOUNTED_ON_FILEID BIT(FATTR4_MOUNTED_ON_FILEID - 32)
#define FATTR4_WORD1_DACL BIT(FATTR4_DACL - 32)
#define FATTR4_WORD1_SACL BIT(FATTR4_SACL - 32)
#define FATTR4_WORD1_FS_LAYOUT_TYPES BIT(FATTR4_FS_LAYOUT_TYPES - 32)
#define FATTR4_WORD2_LAYOUT_TYPES BIT(FATTR4_LAYOUT_TYPES - 64)
#define FATTR4_WORD2_LAYOUT_BLKSIZE BIT(FATTR4_LAYOUT_BLKSIZE - 64)
#define FATTR4_WORD2_MDSTHRESHOLD BIT(FATTR4_MDSTHRESHOLD - 64)
#define FATTR4_WORD2_CLONE_BLKSIZE BIT(FATTR4_CLONE_BLKSIZE - 64)
#define FATTR4_WORD2_CHANGE_ATTR_TYPE BIT(FATTR4_CHANGE_ATTR_TYPE - 64)
#define FATTR4_WORD2_SECURITY_LABEL BIT(FATTR4_SEC_LABEL - 64)
#define FATTR4_WORD2_MODE_UMASK BIT(FATTR4_MODE_UMASK - 64)
#define FATTR4_WORD2_XATTR_SUPPORT BIT(FATTR4_XATTR_SUPPORT - 64)
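The word/bit mapping is plain integer arithmetic: attribute number attr lands in word attr / 32 at bit attr % 32, which is where the "- 32" and "- 64" offsets above come from (attribute 75, FATTR4_SUPPATTR_EXCLCREAT, lands in word 2 at bit 11). An illustrative helper, not part of this patch:

static bool fattr4_attr_is_set(u32 attr, u32 word0, u32 word1, u32 word2)
{
	u32 words[3] = { word0, word1, word2 };

	if (attr / 32 >= ARRAY_SIZE(words))
		return false;
	return words[attr / 32] & BIT(attr % 32);
}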
/* MDS threshold bitmap bits */
#define THRESHOLD_RD (1UL << 0)


@ -17,6 +17,7 @@
#include <linux/sunrpc/xdr.h>
#include <linux/sunrpc/auth.h>
#include <linux/sunrpc/svcauth.h>
#include <linux/lwq.h>
#include <linux/wait.h>
#include <linux/mm.h>
#include <linux/pagevec.h>
@ -33,10 +34,10 @@
*/
struct svc_pool {
unsigned int sp_id; /* pool id; also node id on NUMA */
spinlock_t sp_lock; /* protects all fields */
struct list_head sp_sockets; /* pending sockets */
unsigned int sp_nrthreads; /* # of threads in pool */
struct lwq sp_xprts; /* pending transports */
atomic_t sp_nrthreads; /* # of threads in pool */
struct list_head sp_all_threads; /* all server threads */
struct llist_head sp_idle_threads; /* idle server threads */
/* statistics on pool operation */
struct percpu_counter sp_messages_arrived;
@ -49,7 +50,8 @@ struct svc_pool {
/* bits for sp_flags */
enum {
SP_TASK_PENDING, /* still work to do even if no xprt is queued */
SP_CONGESTED, /* all threads are busy, none idle */
SP_NEED_VICTIM, /* One thread needs to agree to exit */
SP_VICTIM_REMAINS, /* One thread needs to actually exit */
};
@ -88,12 +90,9 @@ struct svc_serv {
int (*sv_threadfn)(void *data);
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
struct list_head sv_cb_list; /* queue for callback requests
struct lwq sv_cb_list; /* queue for callback requests
* that arrive over the same
* connection */
spinlock_t sv_cb_lock; /* protects the svc_cb_list */
wait_queue_head_t sv_cb_waitq; /* sleep here if there are no
* entries in the svc_cb_list */
bool sv_bc_enabled; /* service uses backchannel */
#endif /* CONFIG_SUNRPC_BACKCHANNEL */
};
@ -186,6 +185,7 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
*/
struct svc_rqst {
struct list_head rq_all; /* all threads list */
struct llist_node rq_idle; /* On the idle list */
struct rcu_head rq_rcu_head; /* for RCU deferred kfree */
struct svc_xprt * rq_xprt; /* transport ptr */
@ -251,6 +251,7 @@ struct svc_rqst {
* net namespace
*/
void ** rq_lease_breaker; /* The v4 client breaking a lease */
unsigned int rq_status_counter; /* RPC processing counter */
};
/* bits for rq_flags */
@ -261,8 +262,7 @@ enum {
RQ_DROPME, /* drop current reply */
RQ_SPLICE_OK, /* turned off in gss privacy to prevent
* encrypting page cache pages */
RQ_VICTIM, /* about to be shut down */
RQ_BUSY, /* request is busy */
RQ_VICTIM, /* Have agreed to shut down */
RQ_DATA, /* request has data */
};
@ -301,6 +301,28 @@ static inline struct sockaddr *svc_daddr(const struct svc_rqst *rqst)
return (struct sockaddr *) &rqst->rq_daddr;
}
/**
* svc_thread_should_stop - check if this thread should stop
* @rqstp: the thread that might need to stop
*
* To stop an svc thread, the pool flags SP_NEED_VICTIM and SP_VICTIM_REMAINS
* are set. The first thread which sees SP_NEED_VICTIM clears it, becoming
* the victim using this function. It should then promptly call
* svc_exit_thread() to complete the process, clearing SP_VICTIM_REMAINS
* so the task waiting for a thread to exit can wake and continue.
*
* Return values:
* %true: caller should invoke svc_exit_thread()
* %false: caller should do nothing
*/
static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
{
if (test_and_clear_bit(SP_NEED_VICTIM, &rqstp->rq_pool->sp_flags))
set_bit(RQ_VICTIM, &rqstp->rq_flags);
return test_bit(RQ_VICTIM, &rqstp->rq_flags);
}
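A hypothetical thread function (illustrative; the real nfsd and lockd loops carry more state) shows the intended calling sequence:

static int example_svc_thread(void *data)
{
	struct svc_rqst *rqstp = data;

	while (!svc_thread_should_stop(rqstp))
		svc_recv(rqstp);	/* wait for and handle one event */

	svc_exit_thread(rqstp);		/* clears SP_VICTIM_REMAINS, wakes the waiter */
	return 0;
}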
struct svc_deferred_req {
u32 prot; /* protocol (UDP or TCP) */
struct svc_xprt *xprt;
@ -413,8 +435,7 @@ struct svc_serv * svc_create_pooled(struct svc_program *, unsigned int,
int svc_set_num_threads(struct svc_serv *, struct svc_pool *, int);
int svc_pool_stats_open(struct svc_serv *serv, struct file *file);
void svc_process(struct svc_rqst *rqstp);
int bc_svc_process(struct svc_serv *, struct rpc_rqst *,
struct svc_rqst *);
void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp);
int svc_register(const struct svc_serv *, struct net *, const int,
const unsigned short, const unsigned short);


@ -54,7 +54,7 @@ struct svc_xprt {
const struct svc_xprt_ops *xpt_ops;
struct kref xpt_ref;
struct list_head xpt_list;
struct list_head xpt_ready;
struct lwq_node xpt_ready;
unsigned long xpt_flags;
struct svc_serv *xpt_server; /* service for transport */


@ -57,6 +57,7 @@ struct xprt_class;
struct seq_file;
struct svc_serv;
struct net;
#include <linux/lwq.h>
/*
* This describes a complete RPC request
@ -121,7 +122,7 @@ struct rpc_rqst {
int rq_ntrans;
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
struct list_head rq_bc_list; /* Callback service list */
struct lwq_node rq_bc_list; /* Callback service list */
unsigned long rq_bc_pa_state; /* Backchannel prealloc state */
struct list_head rq_bc_pa_list; /* Backchannel prealloc list */
#endif /* CONFIG_SUNRPC_BACKCHANNEL */


@ -1667,7 +1667,7 @@ TRACE_EVENT(svcrdma_encode_wseg,
__entry->offset = offset;
),
TP_printk("cq_id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
TP_printk("cq.id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
__entry->cq_id, __entry->completion_id,
__entry->segno, __entry->length,
(unsigned long long)__entry->offset, __entry->handle
@ -1703,7 +1703,7 @@ TRACE_EVENT(svcrdma_decode_rseg,
__entry->offset = segment->rs_offset;
),
TP_printk("cq_id=%u cid=%d segno=%u position=%u %u@0x%016llx:0x%08x",
TP_printk("cq.id=%u cid=%d segno=%u position=%u %u@0x%016llx:0x%08x",
__entry->cq_id, __entry->completion_id,
__entry->segno, __entry->position, __entry->length,
(unsigned long long)__entry->offset, __entry->handle
@ -1740,7 +1740,7 @@ TRACE_EVENT(svcrdma_decode_wseg,
__entry->offset = segment->rs_offset;
),
TP_printk("cq_id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
TP_printk("cq.id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
__entry->cq_id, __entry->completion_id,
__entry->segno, __entry->length,
(unsigned long long)__entry->offset, __entry->handle
@ -1959,7 +1959,7 @@ TRACE_EVENT(svcrdma_send_pullup,
__entry->msglen = msglen;
),
TP_printk("cq_id=%u cid=%d hdr=%u msg=%u (total %u)",
TP_printk("cq.id=%u cid=%d hdr=%u msg=%u (total %u)",
__entry->cq_id, __entry->completion_id,
__entry->hdrlen, __entry->msglen,
__entry->hdrlen + __entry->msglen)
@ -2014,7 +2014,7 @@ TRACE_EVENT(svcrdma_post_send,
wr->ex.invalidate_rkey : 0;
),
TP_printk("cq_id=%u cid=%d num_sge=%u inv_rkey=0x%08x",
TP_printk("cq.id=%u cid=%d num_sge=%u inv_rkey=0x%08x",
__entry->cq_id, __entry->completion_id,
__entry->num_sge, __entry->inv_rkey
)


@ -1677,7 +1677,6 @@ DEFINE_SVCXDRBUF_EVENT(sendto);
svc_rqst_flag(DROPME) \
svc_rqst_flag(SPLICE_OK) \
svc_rqst_flag(VICTIM) \
svc_rqst_flag(BUSY) \
svc_rqst_flag_end(DATA)
#undef svc_rqst_flag


@ -0,0 +1,39 @@
/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
/* Do not edit directly, auto-generated from: */
/* Documentation/netlink/specs/nfsd.yaml */
/* YNL-GEN uapi header */
#ifndef _UAPI_LINUX_NFSD_H
#define _UAPI_LINUX_NFSD_H
#define NFSD_FAMILY_NAME "nfsd"
#define NFSD_FAMILY_VERSION 1
enum {
NFSD_A_RPC_STATUS_XID = 1,
NFSD_A_RPC_STATUS_FLAGS,
NFSD_A_RPC_STATUS_PROG,
NFSD_A_RPC_STATUS_VERSION,
NFSD_A_RPC_STATUS_PROC,
NFSD_A_RPC_STATUS_SERVICE_TIME,
NFSD_A_RPC_STATUS_PAD,
NFSD_A_RPC_STATUS_SADDR4,
NFSD_A_RPC_STATUS_DADDR4,
NFSD_A_RPC_STATUS_SADDR6,
NFSD_A_RPC_STATUS_DADDR6,
NFSD_A_RPC_STATUS_SPORT,
NFSD_A_RPC_STATUS_DPORT,
NFSD_A_RPC_STATUS_COMPOUND_OPS,
__NFSD_A_RPC_STATUS_MAX,
NFSD_A_RPC_STATUS_MAX = (__NFSD_A_RPC_STATUS_MAX - 1)
};
enum {
NFSD_CMD_RPC_STATUS_GET = 1,
__NFSD_CMD_MAX,
NFSD_CMD_MAX = (__NFSD_CMD_MAX - 1)
};
#endif /* _UAPI_LINUX_NFSD_H */
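For orientation, a user-space consumer would resolve this family by name before issuing NFSD_CMD_RPC_STATUS_GET. A minimal sketch assuming libnl-genl-3 is available (the resolve-by-name step is standard generic netlink; everything else about a real client is omitted):

#include <netlink/netlink.h>
#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>
#include <stdio.h>

int main(void)
{
	struct nl_sock *sk = nl_socket_alloc();
	int family;

	if (!sk || genl_connect(sk) < 0)
		return 1;
	family = genl_ctrl_resolve(sk, "nfsd");	/* NFSD_FAMILY_NAME */
	if (family < 0)
		return 1;
	printf("nfsd generic netlink family id: %d\n", family);
	nl_socket_free(sk);
	return 0;
}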


@ -729,6 +729,11 @@ config PARMAN
config OBJAGG
tristate "objagg" if COMPILE_TEST
config LWQ_TEST
bool "Boot-time test for lwq queuing"
help
Run boot-time test of light-weight queuing.
endmenu
config GENERIC_IOREMAP


@ -45,7 +45,7 @@ obj-y += lockref.o
obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
list_sort.o uuid.o iov_iter.o clz_ctz.o \
bsearch.o find_bit.o llist.o memweight.o kfifo.o \
bsearch.o find_bit.o llist.o lwq.o memweight.o kfifo.o \
percpu-refcount.o rhashtable.o base64.o \
once.o refcount.o rcuref.o usercopy.o errseq.o bucket_locks.o \
generic-radix-tree.o


@ -65,6 +65,34 @@ struct llist_node *llist_del_first(struct llist_head *head)
}
EXPORT_SYMBOL_GPL(llist_del_first);
/**
* llist_del_first_this - delete given entry of lock-less list if it is first
* @head: the head for your lock-less list
* @this: a list entry.
*
* If head of the list is given entry, delete and return %true else
* return %false.
*
* Multiple callers can safely call this concurrently with multiple
* llist_add() callers, providing all the callers offer a different @this.
*/
bool llist_del_first_this(struct llist_head *head,
struct llist_node *this)
{
struct llist_node *entry, *next;
/* acquire ensures ordering wrt the try_cmpxchg() in llist_del_first() */
entry = smp_load_acquire(&head->first);
do {
if (entry != this)
return false;
next = READ_ONCE(entry->next);
} while (!try_cmpxchg(&head->first, &entry, next));
return true;
}
EXPORT_SYMBOL_GPL(llist_del_first_this);
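The intended pattern, used by the svc idle-thread list later in this series, is that each thread owns one node and only ever removes itself, so concurrent callers always pass distinct @this values. A sketch with hypothetical names:

struct worker {
	struct llist_node idle;
};

static void worker_park(struct llist_head *idle_list, struct worker *w)
{
	set_current_state(TASK_IDLE);
	llist_add(&w->idle, idle_list);
	schedule();				/* sleep until woken */
	while (!llist_del_first_this(idle_list, &w->idle)) {
		/* another parked worker took the new work; sleep again */
		set_current_state(TASK_IDLE);
		schedule();
	}
	__set_current_state(TASK_RUNNING);
}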
/**
* llist_reverse_order - reverse order of a llist chain
* @head: first item of the list to be reversed

lib/lwq.c

@ -0,0 +1,158 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Light-weight single-linked queue.
*
* Entries are enqueued to the head of an llist, with no blocking.
* This can happen in any context.
*
* Entries are dequeued using a spinlock to protect against concurrent
* access. The ready list is staged in reverse order, and refreshed
* from the llist when it is exhausted.
*
* This is particularly suitable when work items are queued in BH or
* IRQ context, and where work items are handled one at a time by
* dedicated threads.
*/
#include <linux/rcupdate.h>
#include <linux/lwq.h>
struct llist_node *__lwq_dequeue(struct lwq *q)
{
struct llist_node *this;
if (lwq_empty(q))
return NULL;
spin_lock(&q->lock);
this = q->ready;
if (!this && !llist_empty(&q->new)) {
/* ensure the queue doesn't transiently appear empty to lwq_empty() */
smp_store_release(&q->ready, (void *)1);
this = llist_reverse_order(llist_del_all(&q->new));
if (!this)
q->ready = NULL;
}
if (this)
q->ready = llist_next(this);
spin_unlock(&q->lock);
return this;
}
EXPORT_SYMBOL_GPL(__lwq_dequeue);
/**
* lwq_dequeue_all - dequeue all currently enqueued objects
* @q: the queue to dequeue from
*
* Remove and return a linked list of llist_nodes of all the objects that were
* in the queue. The first on the list will be the object that was least
* recently enqueued.
*/
struct llist_node *lwq_dequeue_all(struct lwq *q)
{
struct llist_node *r, *t, **ep;
if (lwq_empty(q))
return NULL;
spin_lock(&q->lock);
r = q->ready;
q->ready = NULL;
t = llist_del_all(&q->new);
spin_unlock(&q->lock);
ep = &r;
while (*ep)
ep = &(*ep)->next;
*ep = llist_reverse_order(t);
return r;
}
EXPORT_SYMBOL_GPL(lwq_dequeue_all);
#if IS_ENABLED(CONFIG_LWQ_TEST)
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/wait_bit.h>
#include <linux/kthread.h>
#include <linux/delay.h>
struct tnode {
struct lwq_node n;
int i;
int c;
};
static int lwq_exercise(void *qv)
{
struct lwq *q = qv;
int cnt;
struct tnode *t;
for (cnt = 0; cnt < 10000; cnt++) {
wait_var_event(q, (t = lwq_dequeue(q, struct tnode, n)) != NULL);
t->c++;
if (lwq_enqueue(&t->n, q))
wake_up_var(q);
}
while (!kthread_should_stop())
schedule_timeout_idle(1);
return 0;
}
static int lwq_test(void)
{
int i;
struct lwq q;
struct llist_node *l, **t1, *t2;
struct tnode *t;
struct task_struct *threads[8];
printk(KERN_INFO "testing lwq....\n");
lwq_init(&q);
printk(KERN_INFO " lwq: run some threads\n");
for (i = 0; i < ARRAY_SIZE(threads); i++)
threads[i] = kthread_run(lwq_exercise, &q, "lwq-test-%d", i);
for (i = 0; i < 100; i++) {
t = kmalloc(sizeof(*t), GFP_KERNEL);
if (!t)
break;
t->i = i;
t->c = 0;
if (lwq_enqueue(&t->n, &q))
wake_up_var(&q);
}
/* wait for threads to exit */
for (i = 0; i < ARRAY_SIZE(threads); i++)
if (!IS_ERR_OR_NULL(threads[i]))
kthread_stop(threads[i]);
printk(KERN_INFO " lwq: dequeue first 50:");
for (i = 0; i < 50 ; i++) {
if (i && (i % 10) == 0) {
printk(KERN_CONT "\n");
printk(KERN_INFO " lwq: ... ");
}
t = lwq_dequeue(&q, struct tnode, n);
if (t)
printk(KERN_CONT " %d(%d)", t->i, t->c);
kfree(t);
}
printk(KERN_CONT "\n");
l = lwq_dequeue_all(&q);
printk(KERN_INFO " lwq: delete the multiples of 3 (test lwq_for_each_safe())\n");
lwq_for_each_safe(t, t1, t2, &l, n) {
if ((t->i % 3) == 0) {
t->i = -1;
kfree(t);
t = NULL;
}
}
if (l)
lwq_enqueue_batch(l, &q);
printk(KERN_INFO " lwq: dequeue remaining:");
while ((t = lwq_dequeue(&q, struct tnode, n)) != NULL) {
printk(KERN_CONT " %d", t->i);
kfree(t);
}
printk(KERN_CONT "\n");
return 0;
}
module_init(lwq_test);
#endif /* CONFIG_LWQ_TEST */


@ -83,7 +83,6 @@ static struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt)
return NULL;
req->rq_xprt = xprt;
INIT_LIST_HEAD(&req->rq_bc_list);
/* Preallocate one XDR receive buffer */
if (xprt_alloc_xdr_buf(&req->rq_rcv_buf, gfp_flags) < 0) {
@ -349,10 +348,8 @@ found:
}
/*
* Add callback request to callback list. The callback
* service sleeps on the sv_cb_waitq waiting for new
* requests. Wake it up after adding enqueing the
* request.
* Add callback request to callback list. Wake a thread
* on the first pool (usually the only pool) to handle it.
*/
void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
{
@ -369,8 +366,6 @@ void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
dprintk("RPC: add callback request to list\n");
xprt_get(xprt);
spin_lock(&bc_serv->sv_cb_lock);
list_add(&req->rq_bc_list, &bc_serv->sv_cb_list);
wake_up(&bc_serv->sv_cb_waitq);
spin_unlock(&bc_serv->sv_cb_lock);
lwq_enqueue(&req->rq_bc_list, &bc_serv->sv_cb_list);
svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
}


@ -438,9 +438,7 @@ EXPORT_SYMBOL_GPL(svc_bind);
static void
__svc_init_bc(struct svc_serv *serv)
{
INIT_LIST_HEAD(&serv->sv_cb_list);
spin_lock_init(&serv->sv_cb_lock);
init_waitqueue_head(&serv->sv_cb_waitq);
lwq_init(&serv->sv_cb_list);
}
#else
static void
@ -509,9 +507,9 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
i, serv->sv_name);
pool->sp_id = i;
INIT_LIST_HEAD(&pool->sp_sockets);
lwq_init(&pool->sp_xprts);
INIT_LIST_HEAD(&pool->sp_all_threads);
spin_lock_init(&pool->sp_lock);
init_llist_head(&pool->sp_idle_threads);
percpu_counter_init(&pool->sp_messages_arrived, 0, GFP_KERNEL);
percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
@ -575,11 +573,12 @@ svc_destroy(struct kref *ref)
timer_shutdown_sync(&serv->sv_temptimer);
/*
* The last user is gone and thus all sockets have to be destroyed to
* the point. Check this.
* Remaining transports at this point are not expected.
*/
BUG_ON(!list_empty(&serv->sv_permsocks));
BUG_ON(!list_empty(&serv->sv_tempsocks));
WARN_ONCE(!list_empty(&serv->sv_permsocks),
"SVC: permsocks remain for %s\n", serv->sv_program->pg_name);
WARN_ONCE(!list_empty(&serv->sv_tempsocks),
"SVC: tempsocks remain for %s\n", serv->sv_program->pg_name);
cache_clean_deferred(serv);
@ -642,7 +641,6 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
folio_batch_init(&rqstp->rq_fbatch);
__set_bit(RQ_BUSY, &rqstp->rq_flags);
rqstp->rq_server = serv;
rqstp->rq_pool = pool;
@ -682,10 +680,13 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
serv->sv_nrthreads += 1;
spin_unlock_bh(&serv->sv_lock);
spin_lock_bh(&pool->sp_lock);
pool->sp_nrthreads++;
atomic_inc(&pool->sp_nrthreads);
/* Protected by whatever lock the service uses when calling
* svc_set_num_threads()
*/
list_add_rcu(&rqstp->rq_all, &pool->sp_all_threads);
spin_unlock_bh(&pool->sp_lock);
return rqstp;
}
@ -701,23 +702,25 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
void svc_pool_wake_idle_thread(struct svc_pool *pool)
{
struct svc_rqst *rqstp;
struct llist_node *ln;
rcu_read_lock();
list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
continue;
ln = READ_ONCE(pool->sp_idle_threads.first);
if (ln) {
rqstp = llist_entry(ln, struct svc_rqst, rq_idle);
WRITE_ONCE(rqstp->rq_qtime, ktime_get());
wake_up_process(rqstp->rq_task);
if (!task_is_running(rqstp->rq_task)) {
wake_up_process(rqstp->rq_task);
trace_svc_wake_up(rqstp->rq_task->pid);
percpu_counter_inc(&pool->sp_threads_woken);
}
rcu_read_unlock();
percpu_counter_inc(&pool->sp_threads_woken);
trace_svc_wake_up(rqstp->rq_task->pid);
return;
}
rcu_read_unlock();
set_bit(SP_CONGESTED, &pool->sp_flags);
}
EXPORT_SYMBOL_GPL(svc_pool_wake_idle_thread);
static struct svc_pool *
svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
@ -725,36 +728,38 @@ svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
return pool ? pool : &serv->sv_pools[(*state)++ % serv->sv_nrpools];
}
static struct task_struct *
svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
static struct svc_pool *
svc_pool_victim(struct svc_serv *serv, struct svc_pool *target_pool,
unsigned int *state)
{
struct svc_pool *pool;
unsigned int i;
struct task_struct *task = NULL;
retry:
pool = target_pool;
if (pool != NULL) {
spin_lock_bh(&pool->sp_lock);
if (atomic_inc_not_zero(&pool->sp_nrthreads))
goto found_pool;
return NULL;
} else {
for (i = 0; i < serv->sv_nrpools; i++) {
pool = &serv->sv_pools[--(*state) % serv->sv_nrpools];
spin_lock_bh(&pool->sp_lock);
if (!list_empty(&pool->sp_all_threads))
if (atomic_inc_not_zero(&pool->sp_nrthreads))
goto found_pool;
spin_unlock_bh(&pool->sp_lock);
}
return NULL;
}
found_pool:
if (!list_empty(&pool->sp_all_threads)) {
struct svc_rqst *rqstp;
rqstp = list_entry(pool->sp_all_threads.next, struct svc_rqst, rq_all);
set_bit(RQ_VICTIM, &rqstp->rq_flags);
list_del_rcu(&rqstp->rq_all);
task = rqstp->rq_task;
}
spin_unlock_bh(&pool->sp_lock);
return task;
set_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
set_bit(SP_NEED_VICTIM, &pool->sp_flags);
if (!atomic_dec_and_test(&pool->sp_nrthreads))
return pool;
/* Nothing left in this pool any more */
clear_bit(SP_NEED_VICTIM, &pool->sp_flags);
clear_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
goto retry;
}
static int
@ -795,18 +800,16 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
static int
svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
{
struct svc_rqst *rqstp;
struct task_struct *task;
unsigned int state = serv->sv_nrthreads-1;
struct svc_pool *victim;
do {
task = svc_pool_victim(serv, pool, &state);
if (task == NULL)
victim = svc_pool_victim(serv, pool, &state);
if (!victim)
break;
rqstp = kthread_data(task);
/* Did we lose a race to svo_function threadfn? */
if (kthread_stop(task) == -EINTR)
svc_exit_thread(rqstp);
svc_pool_wake_idle_thread(victim);
wait_on_bit(&victim->sp_flags, SP_VICTIM_REMAINS,
TASK_IDLE);
nrservs++;
} while (nrservs < 0);
return 0;
@ -832,13 +835,10 @@ svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
int
svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
{
if (pool == NULL) {
if (!pool)
nrservs -= serv->sv_nrthreads;
} else {
spin_lock_bh(&pool->sp_lock);
nrservs -= pool->sp_nrthreads;
spin_unlock_bh(&pool->sp_lock);
}
else
nrservs -= atomic_read(&pool->sp_nrthreads);
if (nrservs > 0)
return svc_start_kthreads(serv, pool, nrservs);
@ -924,11 +924,9 @@ svc_exit_thread(struct svc_rqst *rqstp)
struct svc_serv *serv = rqstp->rq_server;
struct svc_pool *pool = rqstp->rq_pool;
spin_lock_bh(&pool->sp_lock);
pool->sp_nrthreads--;
if (!test_and_set_bit(RQ_VICTIM, &rqstp->rq_flags))
list_del_rcu(&rqstp->rq_all);
spin_unlock_bh(&pool->sp_lock);
list_del_rcu(&rqstp->rq_all);
atomic_dec(&pool->sp_nrthreads);
spin_lock_bh(&serv->sv_lock);
serv->sv_nrthreads -= 1;
@ -938,6 +936,11 @@ svc_exit_thread(struct svc_rqst *rqstp)
svc_rqst_free(rqstp);
svc_put(serv);
/* That svc_put() cannot be the last, because the thread
* waiting for SP_VICTIM_REMAINS to clear must hold
* a reference. So it is still safe to access pool.
*/
clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
}
EXPORT_SYMBOL_GPL(svc_exit_thread);
@ -1544,24 +1547,20 @@ out_drop:
}
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
/*
* Process a backchannel RPC request that arrived over an existing
* outbound connection
/**
* svc_process_bc - process a reverse-direction RPC request
* @req: RPC request to be used for client-side processing
* @rqstp: server-side execution context
*
*/
int
bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
struct svc_rqst *rqstp)
void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp)
{
struct rpc_task *task;
int proc_error;
int error;
dprintk("svc: %s(%p)\n", __func__, req);
/* Build the svc_rqst used by the common processing routine */
rqstp->rq_xid = req->rq_xid;
rqstp->rq_prot = req->rq_xprt->prot;
rqstp->rq_server = serv;
rqstp->rq_bc_net = req->rq_xprt->xprt_net;
rqstp->rq_addrlen = sizeof(req->rq_xprt->addr);
@ -1590,10 +1589,8 @@ bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
* been processed by the caller.
*/
svcxdr_init_decode(rqstp);
if (!xdr_inline_decode(&rqstp->rq_arg_stream, XDR_UNIT * 2)) {
error = -EINVAL;
goto out;
}
if (!xdr_inline_decode(&rqstp->rq_arg_stream, XDR_UNIT * 2))
return;
/* Parse and execute the bc call */
proc_error = svc_process_common(rqstp);
@ -1602,26 +1599,18 @@ bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
if (!proc_error) {
/* Processing error: drop the request */
xprt_free_bc_request(req);
error = -EINVAL;
goto out;
return;
}
/* Finally, send the reply synchronously */
memcpy(&req->rq_snd_buf, &rqstp->rq_res, sizeof(req->rq_snd_buf));
task = rpc_run_bc_task(req);
if (IS_ERR(task)) {
error = PTR_ERR(task);
goto out;
}
if (IS_ERR(task))
return;
WARN_ON_ONCE(atomic_read(&task->tk_count) != 1);
error = task->tk_status;
rpc_put_task(task);
out:
dprintk("svc: %s(), error=%d\n", __func__, error);
return error;
}
EXPORT_SYMBOL_GPL(bc_svc_process);
EXPORT_SYMBOL_GPL(svc_process_bc);
#endif /* CONFIG_SUNRPC_BACKCHANNEL */
/**


@ -9,7 +9,6 @@
#include <linux/sched/mm.h>
#include <linux/errno.h>
#include <linux/freezer.h>
#include <linux/kthread.h>
#include <linux/slab.h>
#include <net/sock.h>
#include <linux/sunrpc/addr.h>
@ -17,6 +16,7 @@
#include <linux/sunrpc/svc_xprt.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/xprt.h>
#include <linux/sunrpc/bc_xprt.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <trace/events/sunrpc.h>
@ -201,7 +201,6 @@ void svc_xprt_init(struct net *net, struct svc_xprt_class *xcl,
kref_init(&xprt->xpt_ref);
xprt->xpt_server = serv;
INIT_LIST_HEAD(&xprt->xpt_list);
INIT_LIST_HEAD(&xprt->xpt_ready);
INIT_LIST_HEAD(&xprt->xpt_deferred);
INIT_LIST_HEAD(&xprt->xpt_users);
mutex_init(&xprt->xpt_mutex);
@ -472,9 +471,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
pool = svc_pool_for_cpu(xprt->xpt_server);
percpu_counter_inc(&pool->sp_sockets_queued);
spin_lock_bh(&pool->sp_lock);
list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
spin_unlock_bh(&pool->sp_lock);
lwq_enqueue(&xprt->xpt_ready, &pool->sp_xprts);
svc_pool_wake_idle_thread(pool);
}
@ -487,18 +484,9 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
{
struct svc_xprt *xprt = NULL;
if (list_empty(&pool->sp_sockets))
goto out;
spin_lock_bh(&pool->sp_lock);
if (likely(!list_empty(&pool->sp_sockets))) {
xprt = list_first_entry(&pool->sp_sockets,
struct svc_xprt, xpt_ready);
list_del_init(&xprt->xpt_ready);
xprt = lwq_dequeue(&pool->sp_xprts, struct svc_xprt, xpt_ready);
if (xprt)
svc_xprt_get(xprt);
}
spin_unlock_bh(&pool->sp_lock);
out:
return xprt;
}
@ -674,7 +662,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
continue;
set_current_state(TASK_IDLE);
if (kthread_should_stop()) {
if (svc_thread_should_stop(rqstp)) {
set_current_state(TASK_RUNNING);
return false;
}
@ -699,7 +687,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
}
static bool
rqst_should_sleep(struct svc_rqst *rqstp)
svc_thread_should_sleep(struct svc_rqst *rqstp)
{
struct svc_pool *pool = rqstp->rq_pool;
@ -708,65 +696,51 @@ rqst_should_sleep(struct svc_rqst *rqstp)
return false;
/* was a socket queued? */
if (!list_empty(&pool->sp_sockets))
if (!lwq_empty(&pool->sp_xprts))
return false;
/* are we shutting down? */
if (kthread_should_stop())
if (svc_thread_should_stop(rqstp))
return false;
/* are we freezing? */
if (freezing(current))
return false;
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
if (svc_is_backchannel(rqstp)) {
if (!lwq_empty(&rqstp->rq_server->sv_cb_list))
return false;
}
#endif
return true;
}
static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp)
static void svc_thread_wait_for_work(struct svc_rqst *rqstp)
{
struct svc_pool *pool = rqstp->rq_pool;
struct svc_pool *pool = rqstp->rq_pool;
/* rq_xprt should be clear on entry */
WARN_ON_ONCE(rqstp->rq_xprt);
if (svc_thread_should_sleep(rqstp)) {
set_current_state(TASK_IDLE | TASK_FREEZABLE);
llist_add(&rqstp->rq_idle, &pool->sp_idle_threads);
if (likely(svc_thread_should_sleep(rqstp)))
schedule();
rqstp->rq_xprt = svc_xprt_dequeue(pool);
if (rqstp->rq_xprt)
goto out_found;
set_current_state(TASK_IDLE);
smp_mb__before_atomic();
clear_bit(SP_CONGESTED, &pool->sp_flags);
clear_bit(RQ_BUSY, &rqstp->rq_flags);
smp_mb__after_atomic();
if (likely(rqst_should_sleep(rqstp)))
schedule();
else
while (!llist_del_first_this(&pool->sp_idle_threads,
&rqstp->rq_idle)) {
/* Work just became available. This thread can only
* handle it after removing rqstp from the idle
* list. If that attempt failed, some other thread
* must have queued itself after finding no
* work to do, so that thread has taken responsibility
* for this new work. This thread can safely sleep
* until woken again.
*/
schedule();
set_current_state(TASK_IDLE | TASK_FREEZABLE);
}
__set_current_state(TASK_RUNNING);
} else {
cond_resched();
}
try_to_freeze();
set_bit(RQ_BUSY, &rqstp->rq_flags);
smp_mb__after_atomic();
clear_bit(SP_TASK_PENDING, &pool->sp_flags);
rqstp->rq_xprt = svc_xprt_dequeue(pool);
if (rqstp->rq_xprt)
goto out_found;
if (kthread_should_stop())
return NULL;
return NULL;
out_found:
clear_bit(SP_TASK_PENDING, &pool->sp_flags);
/* Normally we will wait up to 5 seconds for any required
* cache information to be provided.
*/
if (!test_bit(SP_CONGESTED, &pool->sp_flags))
rqstp->rq_chandle.thread_wait = 5*HZ;
else
rqstp->rq_chandle.thread_wait = 1*HZ;
trace_svc_xprt_dequeue(rqstp);
return rqstp->rq_xprt;
}
 static void svc_add_new_temp_xprt(struct svc_serv *serv, struct svc_xprt *newxpt)
@@ -785,7 +759,7 @@ static void svc_add_new_temp_xprt(struct svc_serv *serv, struct svc_xprt *newxpt
 	svc_xprt_received(newxpt);
 }
 
-static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
+static void svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 {
 	struct svc_serv *serv = rqstp->rq_server;
 	int len = 0;
@@ -826,11 +800,35 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 			len = xprt->xpt_ops->xpo_recvfrom(rqstp);
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+		if (len <= 0)
+			goto out;
+
+		trace_svc_xdr_recvfrom(&rqstp->rq_arg);
+
+		clear_bit(XPT_OLD, &xprt->xpt_flags);
+
+		rqstp->rq_chandle.defer = svc_defer;
+
+		if (serv->sv_stats)
+			serv->sv_stats->netcnt++;
+		percpu_counter_inc(&rqstp->rq_pool->sp_messages_arrived);
+		rqstp->rq_stime = ktime_get();
+		svc_process(rqstp);
 	} else
 		svc_xprt_received(xprt);
+
 out:
-	return len;
+	rqstp->rq_res.len = 0;
+	svc_xprt_release(rqstp);
+}
+
+static void svc_thread_wake_next(struct svc_rqst *rqstp)
+{
+	if (!svc_thread_should_sleep(rqstp))
+		/* More work pending after I dequeued some,
+		 * wake another worker
+		 */
+		svc_pool_wake_idle_thread(rqstp->rq_pool);
 }
 /**
@@ -843,44 +841,51 @@ out:
  */
 void svc_recv(struct svc_rqst *rqstp)
 {
-	struct svc_xprt *xprt = NULL;
-	struct svc_serv *serv = rqstp->rq_server;
-	int len;
+	struct svc_pool *pool = rqstp->rq_pool;
 
 	if (!svc_alloc_arg(rqstp))
-		goto out;
+		return;
 
-	try_to_freeze();
-	cond_resched();
-	if (kthread_should_stop())
-		goto out;
+	svc_thread_wait_for_work(rqstp);
 
-	xprt = svc_get_next_xprt(rqstp);
-	if (!xprt)
-		goto out;
+	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
 
-	len = svc_handle_xprt(rqstp, xprt);
+	if (svc_thread_should_stop(rqstp)) {
+		svc_thread_wake_next(rqstp);
+		return;
+	}
 
-	/* No data, incomplete (TCP) read, or accept() */
-	if (len <= 0)
-		goto out_release;
+	rqstp->rq_xprt = svc_xprt_dequeue(pool);
+	if (rqstp->rq_xprt) {
+		struct svc_xprt *xprt = rqstp->rq_xprt;
 
-	trace_svc_xdr_recvfrom(&rqstp->rq_arg);
+		svc_thread_wake_next(rqstp);
+		/* Normally we will wait up to 5 seconds for any required
+		 * cache information to be provided. When there are no
+		 * idle threads, we reduce the wait time.
+		 */
+		if (pool->sp_idle_threads.first)
+			rqstp->rq_chandle.thread_wait = 5 * HZ;
+		else
+			rqstp->rq_chandle.thread_wait = 1 * HZ;
 
-	clear_bit(XPT_OLD, &xprt->xpt_flags);
+		trace_svc_xprt_dequeue(rqstp);
+		svc_handle_xprt(rqstp, xprt);
+	}
 
-	rqstp->rq_chandle.defer = svc_defer;
+#if defined(CONFIG_SUNRPC_BACKCHANNEL)
+	if (svc_is_backchannel(rqstp)) {
+		struct svc_serv *serv = rqstp->rq_server;
+		struct rpc_rqst *req;
 
-	if (serv->sv_stats)
-		serv->sv_stats->netcnt++;
-	percpu_counter_inc(&rqstp->rq_pool->sp_messages_arrived);
-	rqstp->rq_stime = ktime_get();
-	svc_process(rqstp);
-out:
-	return;
-out_release:
-	rqstp->rq_res.len = 0;
-	svc_xprt_release(rqstp);
+		req = lwq_dequeue(&serv->sv_cb_list,
+				  struct rpc_rqst, rq_bc_list);
+		if (req) {
+			svc_thread_wake_next(rqstp);
+			svc_process_bc(req, rqstp);
+		}
+	}
+#endif
 }
 EXPORT_SYMBOL_GPL(svc_recv);
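
Because svc_recv() now returns void and releases the transport itself, a service thread's main loop needs only a should-stop check around it. A rough sketch of such a loop, loosely modeled on the nfsd thread function; my_service_thread is a hypothetical name, and per-service setup, statistics, and error handling are omitted:

/* sketch: the shape of a sunrpc service thread under the new API */
static int my_service_thread(void *vrqstp)
{
	struct svc_rqst *rqstp = vrqstp;

	while (!svc_thread_should_stop(rqstp)) {
		/* Waits for work, dispatches one request (or one
		 * backchannel reply), and releases the transport.
		 */
		svc_recv(rqstp);
	}

	svc_exit_thread(rqstp);
	return 0;
}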
@@ -890,7 +895,6 @@ EXPORT_SYMBOL_GPL(svc_recv);
 void svc_drop(struct svc_rqst *rqstp)
 {
 	trace_svc_drop(rqstp);
-	svc_xprt_release(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_drop);
@@ -906,8 +910,6 @@ void svc_send(struct svc_rqst *rqstp)
 	int status;
 
 	xprt = rqstp->rq_xprt;
-	if (!xprt)
-		return;
 
 	/* calculate over-all length */
 	xb = &rqstp->rq_res;
@@ -920,7 +922,6 @@ void svc_send(struct svc_rqst *rqstp)
 	status = xprt->xpt_ops->xpo_sendto(rqstp);
 	trace_svc_send(rqstp, status);
-	svc_xprt_release(rqstp);
 }
 /*
@@ -1031,7 +1032,6 @@ static void svc_delete_xprt(struct svc_xprt *xprt)
 	spin_lock_bh(&serv->sv_lock);
 	list_del_init(&xprt->xpt_list);
-	WARN_ON_ONCE(!list_empty(&xprt->xpt_ready));
 	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
 		serv->sv_tmpcnt--;
 	spin_unlock_bh(&serv->sv_lock);
@@ -1082,36 +1082,26 @@ static int svc_close_list(struct svc_serv *serv, struct list_head *xprt_list, st
 	return ret;
 }
 
-static struct svc_xprt *svc_dequeue_net(struct svc_serv *serv, struct net *net)
-{
-	struct svc_pool *pool;
-	struct svc_xprt *xprt;
-	struct svc_xprt *tmp;
-	int i;
-
-	for (i = 0; i < serv->sv_nrpools; i++) {
-		pool = &serv->sv_pools[i];
-
-		spin_lock_bh(&pool->sp_lock);
-		list_for_each_entry_safe(xprt, tmp, &pool->sp_sockets, xpt_ready) {
-			if (xprt->xpt_net != net)
-				continue;
-			list_del_init(&xprt->xpt_ready);
-			spin_unlock_bh(&pool->sp_lock);
-			return xprt;
-		}
-		spin_unlock_bh(&pool->sp_lock);
-	}
-	return NULL;
-}
-
 static void svc_clean_up_xprts(struct svc_serv *serv, struct net *net)
 {
 	struct svc_xprt *xprt;
+	int i;
 
-	while ((xprt = svc_dequeue_net(serv, net))) {
-		set_bit(XPT_CLOSE, &xprt->xpt_flags);
-		svc_delete_xprt(xprt);
+	for (i = 0; i < serv->sv_nrpools; i++) {
+		struct svc_pool *pool = &serv->sv_pools[i];
+		struct llist_node *q, **t1, *t2;
+
+		q = lwq_dequeue_all(&pool->sp_xprts);
+		lwq_for_each_safe(xprt, t1, t2, &q, xpt_ready) {
+			if (xprt->xpt_net == net) {
+				set_bit(XPT_CLOSE, &xprt->xpt_flags);
+				svc_delete_xprt(xprt);
+				xprt = NULL;
+			}
+		}
+
+		if (q)
+			lwq_enqueue_batch(q, &pool->sp_xprts);
 	}
 }
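
Both the per-pool transport queue (sp_xprts) and, in the backchannel.c hunk below, the callback list (sv_cb_list) are now "lwq" lockless queues, infrastructure added for this release in include/linux/lwq.h. A minimal sketch of the producer/consumer pattern, assuming those interfaces; work_item and work_queue are hypothetical names for illustration:

#include <linux/lwq.h>

struct work_item {
	int payload;
	struct lwq_node node;	/* queue linkage lives in the item */
};

static struct lwq work_queue;	/* call lwq_init() before first use */

static void producer(struct work_item *item)
{
	/* lwq_enqueue() reports whether the queue was previously
	 * empty, i.e. whether a consumer may need waking.
	 */
	if (lwq_enqueue(&item->node, &work_queue)) {
		/* queue was empty: wake a consumer here */
	}
}

static struct work_item *consumer(void)
{
	/* FIFO dequeue; returns NULL when the queue is empty */
	return lwq_dequeue(&work_queue, struct work_item, node);
}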

--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -263,11 +263,9 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,
 	/* Queue rqst for ULP's callback service */
 	bc_serv = xprt->bc_serv;
 	xprt_get(xprt);
-	spin_lock(&bc_serv->sv_cb_lock);
-	list_add(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
-	spin_unlock(&bc_serv->sv_cb_lock);
-	wake_up(&bc_serv->sv_cb_waitq);
+	lwq_enqueue(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
 
+	svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
 	r_xprt->rx_stats.bcall_count++;
 	return;

--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -852,7 +852,8 @@ out_readfail:
 	if (ret == -EINVAL)
 		svc_rdma_send_error(rdma_xprt, ctxt, ret);
 	svc_rdma_recv_ctxt_put(rdma_xprt, ctxt);
-	return ret;
+	svc_xprt_deferred_close(xprt);
+	return -ENOTCONN;
 
 out_backchannel:
 	svc_rdma_handle_bc_reply(rqstp, ctxt);

--- a/tools/net/ynl/Makefile.deps
+++ b/tools/net/ynl/Makefile.deps
@@ -18,3 +18,4 @@ CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h)
 CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h)
 CFLAGS_handshake:=$(call get_hdr_inc,_LINUX_HANDSHAKE_H,handshake.h)
 CFLAGS_netdev:=$(call get_hdr_inc,_LINUX_NETDEV_H,netdev.h)
+CFLAGS_nfsd:=$(call get_hdr_inc,_LINUX_NFSD_H,nfsd.h)

--- a/tools/net/ynl/generated/Makefile
+++ b/tools/net/ynl/generated/Makefile
@@ -14,7 +14,7 @@ YNL_GEN_ARG_ethtool:=--user-header linux/ethtool_netlink.h \
 TOOL:=../ynl-gen-c.py
 
-GENS:=ethtool devlink handshake fou netdev
+GENS:=ethtool devlink handshake fou netdev nfsd
 SRCS=$(patsubst %,%-user.c,${GENS})
 HDRS=$(patsubst %,%-user.h,${GENS})
 OBJS=$(patsubst %,%-user.o,${GENS})

--- /dev/null
+++ b/tools/net/ynl/generated/nfsd-user.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/nfsd.yaml */
+/* YNL-GEN user source */
+
+#include <stdlib.h>
+#include <string.h>
+#include "nfsd-user.h"
+#include "ynl.h"
+#include <linux/nfsd_netlink.h>
+
+#include <libmnl/libmnl.h>
+#include <linux/genetlink.h>
+
+/* Enums */
+static const char * const nfsd_op_strmap[] = {
+	[NFSD_CMD_RPC_STATUS_GET] = "rpc-status-get",
+};
+
+const char *nfsd_op_str(int op)
+{
+	if (op < 0 || op >= (int)MNL_ARRAY_SIZE(nfsd_op_strmap))
+		return NULL;
+	return nfsd_op_strmap[op];
+}
+
+/* Policies */
+struct ynl_policy_attr nfsd_rpc_status_policy[NFSD_A_RPC_STATUS_MAX + 1] = {
+	[NFSD_A_RPC_STATUS_XID] = { .name = "xid", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_FLAGS] = { .name = "flags", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_PROG] = { .name = "prog", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_VERSION] = { .name = "version", .type = YNL_PT_U8, },
+	[NFSD_A_RPC_STATUS_PROC] = { .name = "proc", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_SERVICE_TIME] = { .name = "service_time", .type = YNL_PT_U64, },
+	[NFSD_A_RPC_STATUS_PAD] = { .name = "pad", .type = YNL_PT_IGNORE, },
+	[NFSD_A_RPC_STATUS_SADDR4] = { .name = "saddr4", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_DADDR4] = { .name = "daddr4", .type = YNL_PT_U32, },
+	[NFSD_A_RPC_STATUS_SADDR6] = { .name = "saddr6", .type = YNL_PT_BINARY, },
+	[NFSD_A_RPC_STATUS_DADDR6] = { .name = "daddr6", .type = YNL_PT_BINARY, },
+	[NFSD_A_RPC_STATUS_SPORT] = { .name = "sport", .type = YNL_PT_U16, },
+	[NFSD_A_RPC_STATUS_DPORT] = { .name = "dport", .type = YNL_PT_U16, },
+	[NFSD_A_RPC_STATUS_COMPOUND_OPS] = { .name = "compound-ops", .type = YNL_PT_U32, },
+};
+
+struct ynl_policy_nest nfsd_rpc_status_nest = {
+	.max_attr = NFSD_A_RPC_STATUS_MAX,
+	.table = nfsd_rpc_status_policy,
+};
+
+/* Common nested types */
+/* ============== NFSD_CMD_RPC_STATUS_GET ============== */
+/* NFSD_CMD_RPC_STATUS_GET - dump */
+void nfsd_rpc_status_get_list_free(struct nfsd_rpc_status_get_list *rsp)
+{
+	struct nfsd_rpc_status_get_list *next = rsp;
+
+	while ((void *)next != YNL_LIST_END) {
+		rsp = next;
+		next = rsp->next;
+
+		free(rsp->obj.saddr6);
+		free(rsp->obj.daddr6);
+		free(rsp->obj.compound_ops);
+		free(rsp);
+	}
+}
+
+struct nfsd_rpc_status_get_list *nfsd_rpc_status_get_dump(struct ynl_sock *ys)
+{
+	struct ynl_dump_state yds = {};
+	struct nlmsghdr *nlh;
+	int err;
+
+	yds.ys = ys;
+	yds.alloc_sz = sizeof(struct nfsd_rpc_status_get_list);
+	yds.cb = nfsd_rpc_status_get_rsp_parse;
+	yds.rsp_cmd = NFSD_CMD_RPC_STATUS_GET;
+	yds.rsp_policy = &nfsd_rpc_status_nest;
+
+	nlh = ynl_gemsg_start_dump(ys, ys->family_id, NFSD_CMD_RPC_STATUS_GET, 1);
+
+	err = ynl_exec_dump(ys, nlh, &yds);
+	if (err < 0)
+		goto free_list;
+
+	return yds.first;
+
+free_list:
+	nfsd_rpc_status_get_list_free(yds.first);
+	return NULL;
+}
+
+const struct ynl_family ynl_nfsd_family = {
+	.name = "nfsd",
+};
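
For a sense of how the generated interface is meant to be consumed, here is a hedged user-space sketch following the pattern of the existing YNL samples. The response field names (xid, proc) mirror the attribute names in the policy table above and are assumptions about struct nfsd_rpc_status_get_rsp:

#include <stdio.h>

#include "nfsd-user.h"
#include "ynl.h"

int main(void)
{
	struct nfsd_rpc_status_get_list *rsp, *pos;
	struct ynl_error yerr;
	struct ynl_sock *ys;

	ys = ynl_sock_create(&ynl_nfsd_family, &yerr);
	if (!ys) {
		fprintf(stderr, "YNL: %s\n", yerr.msg);
		return 1;
	}

	rsp = nfsd_rpc_status_get_dump(ys);
	for (pos = rsp; pos && (void *)pos != YNL_LIST_END; pos = pos->next)
		printf("xid 0x%08x proc %u\n",
		       pos->obj.xid, pos->obj.proc);	/* assumed field names */

	if (rsp)
		nfsd_rpc_status_get_list_free(rsp);
	ynl_sock_destroy(ys);
	return 0;
}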

--- /dev/null
+++ b/tools/net/ynl/generated/nfsd-user.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/nfsd.yaml */
+/* YNL-GEN user header */
+
+#ifndef _LINUX_NFSD_GEN_H
+#define _LINUX_NFSD_GEN_H
+
+#include <stdlib.h>
+#include <string.h>
+#include <linux/types.h>
+#include <linux/nfsd_netlink.h>
+
+struct ynl_sock;
+
+extern const struct ynl_family ynl_nfsd_family;
+
+/* Enums */
+const char *nfsd_op_str(int op);
+
+/* Common nested types */
+/* ============== NFSD_CMD_RPC_STATUS_GET ============== */
+/* NFSD_CMD_RPC_STATUS_GET - dump */
+struct nfsd_rpc_status_get_list {
+	struct nfsd_rpc_status_get_list *next;
+	struct nfsd_rpc_status_get_rsp obj __attribute__ ((aligned (8)));
+};
+
+void nfsd_rpc_status_get_list_free(struct nfsd_rpc_status_get_list *rsp);
+
+struct nfsd_rpc_status_get_list *nfsd_rpc_status_get_dump(struct ynl_sock *ys);
+
+#endif /* _LINUX_NFSD_GEN_H */