linux/fs/ceph
David Howells 7b589a9b45
netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags
The NETFS_RREQ_USE_PGPRIV2 and NETFS_RREQ_WRITE_TO_CACHE flags aren't used
correctly.  The problem is that we try to set them up in the request
initialisation, but the cache may still be in the process of setting up,
and so the state may not be correct.  Further, we secondarily sample the
cache state and make contradictory decisions later.

The issue arises because we set up the cache resources, which allows the
cache's ->prepare_read() to switch on NETFS_SREQ_COPY_TO_CACHE - which
triggers cache writing even if we didn't set the flags when allocating.

Fix this in the following way:

 (1) Drop NETFS_ICTX_USE_PGPRIV2 and instead set NETFS_RREQ_USE_PGPRIV2 in
     ->init_request() rather than trying to juggle that in
     netfs_alloc_request().

 (2) Repurpose NETFS_RREQ_USE_PGPRIV2 to merely indicate that if caching is
     to be done, then PG_private_2 is to be used rather than only setting
     it if we decide to cache and then having netfs_rreq_unlock_folios()
     set the non-PG_private_2 writeback-to-cache if it wasn't set.

 (3) Split netfs_rreq_unlock_folios() into two functions, one of which
     contains the deprecated code for using PG_private_2 to avoid
     accidentally doing the writeback path - and always use it if
     USE_PGPRIV2 is set.

 (4) As NETFS_ICTX_USE_PGPRIV2 is removed, make netfs_write_begin() always
     wait for PG_private_2.  This function is deprecated and only used by
     ceph anyway, and so label it so.

 (5) Drop the NETFS_RREQ_WRITE_TO_CACHE flag and use
     fscache_operation_valid() on the cache_resources instead.  This has
     the advantage of picking up the result of netfs_begin_cache_read() and
     fscache_begin_write_operation() - which are called after the object is
     initialised and will wait for the cache to come to a usable state.
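
The effect of points (2) and (5) can be sketched as a small userspace
model.  This is not the kernel code: struct model_cres, struct model_rreq,
model_cache_valid() and model_unlock_mode() are hypothetical stand-ins, and
treating a non-NULL ops pointer as "cache operation valid" is an assumption
made for illustration.  Only the NETFS_RREQ_USE_PGPRIV2 flag and the idea of
asking the cache resources instead of a flag set at allocation come from the
description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cache resources attached to a request. */
struct model_cres {
	const void *ops;	/* assumed: non-NULL once a cache operation has begun */
};

/* Hypothetical stand-in for a read request. */
struct model_rreq {
	struct model_cres cache_resources;
	bool use_pgpriv2;	/* models NETFS_RREQ_USE_PGPRIV2, set by the filesystem */
};

/* How the folios would be handed to the cache when they are unlocked. */
enum model_unlock_mode {
	MODEL_NO_CACHE,			/* cache not usable: nothing to write */
	MODEL_CACHE_VIA_PGPRIV2,	/* deprecated PG_private_2 path */
	MODEL_CACHE_VIA_WRITEBACK,	/* mark dirty and let writeback copy */
};

/*
 * Models the role of fscache_operation_valid(): the answer reflects the
 * state of the cache resources now, not a decision made at allocation.
 */
static bool model_cache_valid(const struct model_cres *cres)
{
	return cres->ops != NULL;
}

/*
 * The flag no longer says "we will cache"; it only selects which
 * mechanism is used if caching turns out to be possible.
 */
static enum model_unlock_mode model_unlock_mode(const struct model_rreq *rreq)
{
	if (!model_cache_valid(&rreq->cache_resources))
		return MODEL_NO_CACHE;
	return rreq->use_pgpriv2 ? MODEL_CACHE_VIA_PGPRIV2
				 : MODEL_CACHE_VIA_WRITEBACK;
}
```

In this model a request with use_pgpriv2 set but no usable cache resources
yields MODEL_NO_CACHE, so the flag alone never triggers a cache write.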

Just reverting ae678317b95e[1] isn't a sufficient fix, so this needs to be
applied on top of that.  Without this as well, things like:

 rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: {

and:

 WARNING: CPU: 13 PID: 3621 at fs/ceph/caps.c:3386

may happen, along with some UAFs due to PG_private_2 not getting used to
wait on writeback completion.

Fixes: 2ff1e97587 ("netfs: Replace PG_fscache by setting folio->private and marking dirty")
Reported-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Hristo Venev <hristo@venev.name>
cc: Jeff Layton <jlayton@kernel.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/3575457.1722355300@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/r/1173209.1723152682@warthog.procyon.org.uk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-12 22:03:27 +02:00
..
acl.c ceph: allow idmapped set_acl inode op 2023-11-03 23:28:34 +01:00
addr.c netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags 2024-08-12 22:03:27 +02:00
cache.c ceph: rename _to_client() to _to_fs_client() 2023-11-03 23:28:33 +01:00
cache.h netfs: Provide invalidate_folio and release_folio calls 2023-12-24 15:08:51 +00:00
caps.c ceph: use cap_wait_list only if debugfs is enabled 2024-07-23 10:01:57 +02:00
ceph_frag.c
crypto.c Two items: 2023-11-10 09:52:56 -08:00
crypto.h ceph: add support for encrypted snapshot names 2023-08-24 11:24:36 +02:00
debugfs.c ceph: print cluster fsid and client global_id in all debug logs 2023-11-03 23:28:33 +01:00
dir.c A small patchset to address bogus I/O errors and ultimately an 2024-07-26 10:34:42 -07:00
export.c ceph: d_obtain_{alias,root}(ERR_PTR(...)) will do the right thing 2024-01-15 15:40:51 +01:00
file.c ceph: check the cephx mds auth access for async dirop 2024-05-23 10:35:47 +02:00
inode.c netfs: Fix handling of USE_PGPRIV2 and WRITE_TO_CACHE flags 2024-08-12 22:03:27 +02:00
io.c ceph: fix kerneldoc copypasta over ceph_start_io_direct 2021-04-27 23:52:23 +02:00
io.h ceph: add buffered/direct exclusionary locking for reads and writes 2019-09-16 12:06:25 +02:00
ioctl.c ceph: print cluster fsid and client global_id in all debug logs 2023-11-03 23:28:33 +01:00
ioctl.h
Kconfig ceph: select FS_ENCRYPTION_ALGS if FS_ENCRYPTION 2024-01-15 15:40:50 +01:00
locks.c ceph: adapt to breakup of struct file_lock 2024-02-05 13:11:42 +01:00
Makefile ceph: fscrypt_auth handling for ceph 2023-08-22 09:01:48 +02:00
mds_client.c ceph: periodically flush the cap releases 2024-07-23 10:01:57 +02:00
mds_client.h ceph: use cap_wait_list only if debugfs is enabled 2024-07-23 10:01:57 +02:00
mdsmap.c ceph: switch to corrected encoding of max_xattr_size in mdsmap 2024-02-26 19:20:30 +01:00
mdsmap.h ceph: switch to corrected encoding of max_xattr_size in mdsmap 2024-02-26 19:20:30 +01:00
metric.c ceph: print cluster fsid and client global_id in all debug logs 2023-11-03 23:28:33 +01:00
metric.h ceph: include average/stdev r/w/m latency in mds metrics 2022-03-21 13:35:16 +01:00
quota.c ceph: fix invalid pointer access if get_quota_realm return ERR_PTR 2024-01-15 15:40:51 +01:00
snap.c Two items: 2023-11-10 09:52:56 -08:00
strings.c ceph: add getvxattr op 2022-03-01 18:26:37 +01:00
super.c ceph: fix incorrect kmalloc size of pagevec mempool 2024-07-23 10:01:57 +02:00
super.h ceph: always check dir caps asynchronously 2024-02-07 14:58:02 +01:00
util.c ceph: move net/ceph/ceph_fs.c to fs/ceph/util.c 2020-01-27 16:53:40 +01:00
xattr.c Two items: 2023-11-10 09:52:56 -08:00