Commit Graph

2945 Commits

Author SHA1 Message Date
Andreas Gruenbacher
f75efefb6d gfs2: Clean up glock demote logic
The logic for determining when to demote a glock in glock_work_func(),
introduced in commit 7cf8dcd3b6 ("GFS2: Automatically adjust glock min
hold time"), doesn't make sense: inode glocks have a minimum hold time
that delays demotion, while all other glocks are expected to be demoted
immediately.  Instead of demoting non-inode glocks immediately,
glock_work_func() schedules glock work for them to be demoted, however.
Get rid of that unnecessary indirection.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-07-09 10:40:03 +02:00
Andreas Gruenbacher
5a1906a476 gfs2: Revert "check for no eligible quota changes"
Since the previous commit, function gfs2_quota_sync() will not cause the
sync generation to creep forward by one every time the function is
called; this helps keep things a but more tidy.  We also don't care that
this function allocates a page of memory every time it is called, so no
good reason for keeping qd_changed() anymore, which just duplicates
qd_grab_sync().

This reverts commit 06aa6fd31a.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-20 16:38:15 +02:00
Andreas Gruenbacher
d9a75a6069 gfs2: Be more careful with the quota sync generation
The quota sync generation is only ever updated under sd_quota_sync_mutex
by gfs2_quota_sync(), but its current value is fetched ouside of that
mutex, so use WRITE_ONCE() and READ_ONCE() when accessing it without
holding that mutex.

Pass the current sync generation to do_sync() from its callers to ensure
that we're not recording the wrong generation when the syncing is
done.  Also, make sure that qd->qd_sync_gen only ever moves forward.

In gfs2_quota_sync(), only write the new sync generation when we know
that there are changes.  This eliminates the need for function
sd_changed(), which we will remove in the next commit.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-20 16:38:15 +02:00
Andreas Gruenbacher
8d89e068de gfs2: Get rid of some unnecessary quota locking
With the locking the previous patch has introduced for each struct
gfs2_quota_data object, sd_quota_mutex has become largely irrelevant.
By waiting on the buffer head instead of waiting on the mutex in
get_bh(), it becomes completely irrelevant and can be removed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-20 16:38:13 +02:00
Andreas Gruenbacher
d5563f42f5 gfs2: Add some missing quota locking
The quota code is missing some locking between local quota changes and
syncing those quota changes to the global quota file (gfs2_quota_sync);
in particular, qd->qd_change needs to be kept in sync with the
QDF_CHANGE change flag and the number of references held.  Use the
qd->qd_lockref.lock spinlock for that.

With the qd->qd_lockref.lock spinlock held, we can no longer call
lockref_get(), so turn qd_hold() into a variant that assumes that the
lock is held.  This function is really supposed to take an additional
reference when one or more references are already held, so check for
that instead of checking if the lockref is dead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-12 20:33:51 +02:00
Andreas Gruenbacher
614abc1187 gfs2: Fold qd_fish into gfs2_quota_sync
The split between qd_fish() and gfs2_quota_sync() is rather unfortunate
as qd_fish() is repeatedly called to scan sdp->sd_quota_list only to
find the next object to that needs syncing; if there are multiple
objects on the list that need syncing, it makes more sense to grab them
all in one go.  This is relatively easy to do when qd_fish() is folded
into gfs2_quota_sync().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:35:16 +02:00
Andreas Gruenbacher
b510af07aa gfs2: quota need_sync cleanup
Rename variable 'value' to 'change' as it stores a change in value.

Add new 'value' and 'limit' variables for the current value and limit.

Only fetch the tuning parameters when we need them.

Get rid of unnecessary nesting.

No change in functionality.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:35:16 +02:00
Andreas Gruenbacher
7da4d6e178 gfs2: Fix and clean up function do_qc
Function do_qc() is supposed to be conceptually simple: it alters the
current in-memory and on-disk quota change values for a given uid/gid by
a given delta.  If the on-disk record isn't defined yet, a new record is
created.  If the on-disk record exists and the resulting change value is
zero, there no longer is a need for that record and so the record is
deleted.  On top of that, some reference counting is involved when
creating and deleting records.

Currently, instead of doing the above, do_qc() alters the on-disk value
and then it sets the in-memory value to the on-disk value.  This is
incorrect when the on-disk value differs from the in-memory value.  The
two values are allowed to differ when quota changes are synced to the
global quota file.  Fix by changing both values by the same amount.

In addition, do_qc() currently gets confused when the delta value is 0.
It isn't supposed to be called that way, but that assumption isn't
mentioned and it makes the code harder to read.  Make the code more
explicit.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:35:16 +02:00
Andreas Gruenbacher
ec4b5200c8 gfs2: Revert "Add quota_change type"
Commit 432928c937 ("gfs2: Add quota_change type") makes the incorrect
assertion that function do_qc() should behave differently in the two
contexts it is used in, but that isn't actually true.  In all cases,
do_qc() grabs a "reference" when it starts using a slot in the per-node
quota changes file, and it releases that "reference" when no more
residual changes remain.  Revert that broken commit.

There are some remaining issues with function do_qc() which are
addressed in the next commit.

This reverts commit 432928c937.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:35:09 +02:00
Andreas Gruenbacher
4b4b6374dc gfs2: Revert "ignore negated quota changes"
Commit 4c6a08125f ("gfs2: ignore negated quota changes") skips quota
changes with qd_change == 0 instead of writing them back, which leaves
behind non-zero qd_change values in the affected slots.  The kernel then
assumes that those slots are unused, while the qd_change values on disk
indicate that they are indeed still in use.  The next time the
filesystem is mounted, those invalid slots are read in from disk, which
will cause inconsistencies.

Revert that commit to avoid filesystem corruption.

This reverts commit 4c6a08125f.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:34:57 +02:00
Andreas Gruenbacher
59ebc33201 gfs2: qd_check_sync cleanups
Rename qd_check_sync() to qd_grab_sync() and make it return a bool.
Turn the sync_gen pointer into a regular u64 and pass in U64_MAX instead
of a NULL pointer when sync generation checking isn't needed.

Introduce a new qd_ungrab_sync() helper for undoing the effects of
qd_grab_sync() if the subsequent bh_get() on the qd object fails.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:34:56 +02:00
Andreas Gruenbacher
2aedfe847b gfs2: Revert "introduce qd_bh_get_or_undo"
The qd_bh_get_or_undo() helper introduced by that commit doesn't improve
the code much, so revert it and clean things up in a more useful way in
the next commit.

This reverts commit 7dbc6ae60d.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-08 02:34:40 +02:00
Andreas Gruenbacher
de0d95c26c gfs2: Check quota consistency on mount
In gfs2_quota_init(), make sure that the per-node "quota_change%u" file
doesn't contain duplicate uids/gids.  Those duplicates would cause us to
acquire the glock corresponding to those ids repeatedly, which the glock
code doesn't allow.

When finding inconsistencies, we wipe them out and ignore them.  The
resulting quotas will likely be inconsistent, and running quotacheck(1)
is advised.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-07 22:36:16 +02:00
Andreas Gruenbacher
51316523d1 gfs2: Minor gfs2_quota_init error path cleanup
Add a fail_brelse label and use it where useful.  Move variable bh out
of the loop to extend its visibility to the new label.  No functional
change.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-06-04 16:00:34 +02:00
Andreas Gruenbacher
713f883438 gfs2: Get rid of demote_ok checks
The demote_ok glock operation is only still used to prevent the inode
glocks of the "jindex" and "rindex" directories from getting recycled
while they are still referenced by sdp->sd_jindex and sdp->sd_rindex.
However, the LRU walking code will no longer recycle glocks which are
referenced, so the demote_ok glock operation is obsolete and can be
removed.

Each of a glock's holders in the gl_holders list is holding a reference
on the glock, so when the list of holders isn't empty in demote_ok(),
the existing reference count check will already prevent the glock from
getting released.  This means that demote_ok() is obsolete as well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
3f4475bf24 Revert "GFS2: Don't add all glocks to the lru"
This reverts commit e7ccaf5fe1.

Before commit e7ccaf5fe1, every time a resource group glock was
dequeued by gfs2_glock_dq(), it was added to the glock LRU list even
though the glock was still referenced by the resource group and could
never be evicted, anyway.  Commit e7ccaf5fe1 added a GLOF_LRU hack to
avoid that overhead for resource group glocks, and that hack was since
adopted for some other types of glocks as well.

We now no longer add glocks to the glock LRU list while they are still
referenced.  This solves the underlying problem, and obsoletes the
GLOF_LRU hack.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 3e5257c810cba91e274d07f3db5cf013c7c830be)
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
767fd5a016 gfs2: Revise glock reference counting model
In the current glock reference counting model, a bias of one is added to
a glock's refcount when it is locked (gl->gl_state != LM_ST_UNLOCKED).
A glock is removed from the lru_list when it is enqueued, and added back
when it is dequeued.  This isn't a very appropriate model because most
glocks are held for long periods of time (for example, the inode "owns"
references to its inode and iopen glocks as long as the inode is cached
even when the glock state changes to LM_ST_UNLOCKED), and they can only
be freed when they are no longer referenced, anyway.

Fix this by getting rid of the refcount bias for locked glocks.  That
way, we can use lockref_put_or_lock() to efficiently drop all but the
last glock reference, and put the glock onto the lru_list when the last
reference is dropped.  When find_insert_glock() returns a reference to a
cached glock, it removes the glock from the lru_list.

Dumping the "glocks" and "glstats" debugfs files also takes glock
references, but instead of removing the glocks from the lru_list in that
case as well, we leave them on the list.  This ensures that dumping
those files won't perturb the order of the glocks on the lru_list.

In addition, when the last reference to an *unlocked* glock is dropped,
we immediately free it; this preserves the preexisting behavior.  If it
later turns out that caching unlocked glocks is useful in some
situations, we can change the caching strategy.

It is currently unclear if a glock that has no active references can
have the GLF_LFLUSH flag set.  To make sure that such a glock won't
accidentally be evicted due to memory pressure, we add a GLF_LFLUSH
check to gfs2_dispose_glock_lru().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
30e388d573 gfs2: Switch to a per-filesystem glock workqueue
Switch to a per-filesystem glock workqueue.  Additional workqueues are
cheap nowadays, and keeping separate workqueues allows to flush the work
of each filesystem without affecting the others.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
51568ac2e9 gfs2: Report when glocks cannot be freed for a long time
When glocks cannot be freed for a long time, avoid the "task blocked for
more than N seconds" messages and report how many glocks are still
outstanding, instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
8f6b8f142b gfs2: gfs2_glock_get cleanup
Clean up the messy code in gfs2_glock_get().  No change in
functionality.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
c8758ad005 gfs2: Invert the GLF_INITIAL flag
Invert the meaning of the GLF_INITIAL flag: right now, when GLF_INITIAL
is set, a DLM lock exists and we have a valid identifier for it; when
GLF_INITIAL is cleared, no DLM lock exists (yet).  This is confusing.
In addition, it makes more sense to highlight the exceptional case
(i.e., no DLM lock exists yet) in glock dumps and trace points than to
highlight the common case.

To avoid confusion between the "old" and the "new" meaning of the flag,
use 'a' instead of 'I' to represent the flag.

For improved code consistency, check if the GLF_INITIAL flag is cleared
to determine whether a DLM lock exists instead of checking if the lock
identifier is non-zero.

Document what the flag is used for.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
c8cf2d9f18 gfs2: Remove outdated comment in glock_work_func
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-29 15:34:55 +02:00
Andreas Gruenbacher
edeb180f1c gfs2: Rename handle_callback to request_demote
Function handle_callback() is used to request a glock demote.  This
often happens in response to a conflicting remote locking request and
subsequent bast callback from DLM, but there are other reasons for
triggering a demote request as well, such as when trying to release a
glock in response to memory pressure.  To clarify that, rename the
function to request_demote().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Andreas Gruenbacher
1fb5f67e21 gfs2: Rename GLF_FROZEN to GLF_HAVE_FROZEN_REPLY
The GLF_FROZEN flag indicates that a reply to a DLM locking request has
been received, but should not be processed at this time.  To clarify
that meaning, rename the flag to GLF_HAVE_FROZEN_REPLY.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Andreas Gruenbacher
0a0383a93e gfs2: Rename GLF_REPLY_PENDING to GLF_HAVE_REPLY
The GLF_REPLY_PENDING flag indicates to glock_work_func() that in
response to a locking request, DLM has sent a reply that needs to be
processed.  A flag with that name could as well indicate that we are
waiting on a reply from DLM, however.  To disambiguate these two cases,
rename the flag to GLF_HAVE_REPLY.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Andreas Gruenbacher
121e730112 gfs2: Rename GLF_FREEING to GLF_UNLOCKED
Rename the GLF_FREEING flag to GLF_UNLOCKED, and the ->go_free glock
operation to ->go_unlocked.  This mechanism is used to wait for the
underlying DLM lock to be unlocked; being able to free the glock is a
consequence of the DLM lock being unlocked.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Andreas Gruenbacher
932a9052dc gfs2: Remove useless return statement in run_queue
The return statement at the end of run_queue() is useless.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Andreas Gruenbacher
99b8520c00 gfs2: Remove unnecessary function prototype
Function __gfs2_glock_dq() gets defined before it is used, so there is
no need for a separate function declaration.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-28 16:59:53 +02:00
Linus Torvalds
38da32ee70 bd_inode series
Replacement of bdev->bd_inode with sane(r) set of primitives.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkwjlgAKCRBZ7Krx/gZQ
 66OmAP9nhZLASn/iM2+979I6O0GW+vid+uLh48uW3d+LbsmVIgD9GYpR+cuLQ/xj
 mJESWfYKOVSpFFSrqlzKg9PQlU/GFgs=
 =6LRp
 -----END PGP SIGNATURE-----

Merge tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull bdev bd_inode updates from Al Viro:
 "Replacement of bdev->bd_inode with sane(r) set of primitives by me and
  Yu Kuai"

* tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RIP ->bd_inode
  dasd_format(): killing the last remaining user of ->bd_inode
  nilfs_attach_log_writer(): use ->bd_mapping->host instead of ->bd_inode
  block/bdev.c: use the knowledge of inode/bdev coallocation
  gfs2: more obvious initializations of mapping->host
  fs/buffer.c: massage the remaining users of ->bd_inode to ->bd_mapping
  blk_ioctl_{discard,zeroout}(): we only want ->bd_inode->i_mapping here...
  grow_dev_folio(): we only want ->bd_inode->i_mapping there
  use ->bd_mapping instead of ->bd_inode->i_mapping
  block_device: add a pointer to struct address_space (page cache of bdev)
  missing helpers: bdev_unhash(), bdev_drop()
  block: move two helpers into bdev.c
  block2mtd: prevent direct access of bd_inode
  dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev->bd_inode)
  blkdev_write_iter(): saner way to get inode and bdev
  bcachefs: remove dead function bdev_sectors()
  ext4: remove block_device_ejected()
  erofs_buf: store address_space instead of inode
  erofs: switch erofs_bread() to passing offset instead of block number
2024-05-21 09:51:42 -07:00
Linus Torvalds
9518ae6ec5 gfs2 updates
- Properly fix the glock shrinker this time: it broke in commit "gfs2:
   Make glock lru list scanning safer" and commit "gfs2: fix glock
   shrinker ref issues" wasn't actually enough to fix it.
 
 - On unmount, keep glocks around long enough that no more dlm callbacks
   can occur on them.
 
 - Some more folio conversion patches from Matthew Wilcox.
 
 - Lots of other smaller fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmZCfuIUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTravg//UIAXF+o8POtTCozn9QCfPqP5eWIF
 r879u7cf1o2yH43Qfx9urQKrNFuqQnDoDbbkCnDNQW5nsWWAjIAidG2Vi1j02YdM
 HTsU8890Y9uVlFtJ41jbRIkk2KtIZgKtKRGtPx7+FlWdCFB5ai36/aKTXRK34xcY
 KOBM2dDpnPdByERJ7+EPRC0EonhSZKhug3PqVfaCcj3p9S0eZtQHQdxhnPSF8wDK
 8ds2ErUV8N5DQtfR5WZ4N89JTk83CgFMuNaSXp8Kc8b/Wxa4KuKBTz+h6K90CJMP
 HKAnCz+qYro9ptoQCdXc7Wi1lWWS8kNuzXpTrQyDiVC5TCivxaBwQssOT2dVyB4+
 dlLL/4ikXDDYOq6r5q1rU8eVpKkCuzImCRtUW3rG4q7/Rvva4dfsVqcD/ISuxYD/
 fmXw/cb3zMDC06/59OusEFlwrltHjxH/2/IPS9ZQLLcP5An8LIbemDPdMvD7s2si
 b/OjgWA4M08cdBgKAhF4Kzla762y78kQYtZ/v18CI+I2Kr9fRXM74OI/rjTLSRim
 EyFH82lrpU8b38UpbElG4LI6FfZGvGCwmZYaEFQx8MUuvqAOceS2TH1z7v65Ekfu
 3NSqabKi0PGcH4oydfLSdQMxoc6glgZpTl3L/rx09qTnm3SFH/ghKVlivODbM8OK
 oJNhtATgTFbODuk=
 =d50l
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Properly fix the glock shrinker this time: it broke in commit "gfs2:
   Make glock lru list scanning safer" and commit "gfs2: fix glock
   shrinker ref issues" wasn't actually enough to fix it

 - On unmount, keep glocks around long enough that no more dlm callbacks
   can occur on them

 - Some more folio conversion patches from Matthew Wilcox

 - Lots of other smaller fixes and cleanups

* tag 'gfs2-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (27 commits)
  gfs2: make timeout values more explicit
  gfs2: Convert gfs2_aspace_writepage() to use a folio
  gfs2: Add a migrate_folio operation for journalled files
  gfs2: Simplify gfs2_read_super
  gfs2: Convert gfs2_page_mkwrite() to use a folio
  gfs2: gfs2_freeze_unlock cleanup
  gfs2: Remove and replace gfs2_glock_queue_work
  gfs2: do_xmote fixes
  gfs2: finish_xmote cleanup
  gfs2: Unlock fewer glocks on unmount
  gfs2: Fix potential glock use-after-free on unmount
  gfs2: Remove ill-placed consistency check
  gfs2: Fix lru_count accounting
  gfs2: Fix "Make glock lru list scanning safer"
  Revert "gfs2: fix glock shrinker ref issues"
  gfs2: Fix "ignore unlock failures after withdraw"
  gfs2: Get rid of unnecessary test_and_set_bit
  gfs2: Don't set GLF_LOCK in gfs2_dispose_glock_lru
  gfs2: Replace gfs2_glock_queue_put with gfs2_glock_put_async
  gfs2: Get rid of gfs2_glock_queue_put in signal_our_withdraw
  ...
2024-05-14 17:35:22 -07:00
Wolfram Sang
c1c53c26e3 gfs2: make timeout values more explicit
'timeout' is a vague name for the return value of wait_event_*_timeout
because it actually returns the time left. Because the variable is never
used later, just drop the return value. Since variable 'timeout' is then
only used to carry a fixed timeout value, drop this in favor of a fixed
function argument as in the other call to wait_event_timeout() above.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-07 12:42:48 +02:00
Matthew Wilcox (Oracle)
50fabd42cb gfs2: Convert gfs2_aspace_writepage() to use a folio
Convert the incoming struct page to a folio and use it throughout.
Saves six calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-03 21:01:02 +02:00
Matthew Wilcox (Oracle)
b844048011 gfs2: Add a migrate_folio operation for journalled files
For journalled data, folio migration currently works by writing the folio
back, freeing the folio and faulting the new folio back in.  We can
bypass that by telling the migration code to migrate the buffer_heads
attached to our folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-03 21:01:02 +02:00
Al Viro
2d0026a490 gfs2: more obvious initializations of mapping->host
what's going on is copying the ->host of bdev's address_space

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20240411145346.2516848-4-viro@zeniv.linux.org.uk
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-03 02:36:51 -04:00
Matthew Wilcox (Oracle)
75377ae754 gfs2: Simplify gfs2_read_super
Use submit_bio_wait() instead of hand-rolling our own synchronous
wait.  Also allocate the BIO on the stack since we're not deep in
the call stack at this point.

There's no need to kmap the page, since it isn't allocated from HIGHMEM.
Turn the GFP_NOFS allocation into GFP_KERNEL; if the page allocator
enters reclaim, we cannot be called as the filesystem has not yet been
initialised and so has no pages to reclaim.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-05-02 19:24:08 +02:00
Matthew Wilcox (Oracle)
f3851fed07 gfs2: Convert gfs2_page_mkwrite() to use a folio
Convert the incoming page to a folio and use it throughout, saving
several calls to compound_head().  Also use 'pos' for file position
rather than the ambiguous 'offset' and convert 'length' to type size_t
in case we get some truly ridiculous sized folios in the future.  This
function should now be large-folio safe, but I may have missed
something.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-29 13:04:40 +02:00
Andreas Gruenbacher
fcd63086bc gfs2: gfs2_freeze_unlock cleanup
Function gfs2_freeze_unlock() is always called with &sdp->sd_freeze_gh
as its argument, so clean up the code by passing in sdp instead.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-29 12:35:15 +02:00
Andreas Gruenbacher
1e86044402 gfs2: Remove and replace gfs2_glock_queue_work
There are no more callers of gfs2_glock_queue_work() left, so remove
that helper.  With that, we can now rename __gfs2_glock_queue_work()
back to gfs2_glock_queue_work() to get rid of some unnecessary clutter.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
9947a06d29 gfs2: do_xmote fixes
Function do_xmote() is called with the glock spinlock held.  Commit
86934198ee added a 'goto skip_inval' statement at the beginning of the
function to further below where the glock spinlock is expected not to be
held anymore.  Then it added code there that requires the glock spinlock
to be held.  This doesn't make sense; fix this up by dropping and
retaking the spinlock where needed.

In addition, when ->lm_lock() returned an error, do_xmote() didn't fail
the locking operation, and simply left the glock hanging; fix that as
well.  (This is a much older error.)

Fixes: 86934198ee ("gfs2: Clear flags when withdraw prevents xmote")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
1cd28e1586 gfs2: finish_xmote cleanup
Currently, function finish_xmote() takes and releases the glock
spinlock.  However, all of its callers immediately take that spinlock
again, so it makes more sense to take the spin lock before calling
finish_xmote() already.

With that, thaw_glock() is the only place that sets the GLF_HAVE_REPLY
flag outside of the glock spinlock, but it also takes that spinlock
immediately thereafter.  Change that to set the bit when the spinlock is
already held.  This allows to switch from test_and_clear_bit() to
test_bit() and clear_bit() in glock_work_func().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
a3730c5ec5 gfs2: Unlock fewer glocks on unmount
At unmount time, we would generally like to explicitly unlock as few
glocks as possible for efficiency.  We are already skipping glocks that
don't have a lock value block (LVB), but we can also skip glocks which
are not held in DLM_LOCK_EX or DLM_LOCK_PW mode (of which gfs2 only uses
DLM_LOCK_EX under the name LM_ST_EXCLUSIVE).

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
d98779e687 gfs2: Fix potential glock use-after-free on unmount
When a DLM lockspace is released and there ares still locks in that
lockspace, DLM will unlock those locks automatically.  Commit
fb6791d100 started exploiting this behavior to speed up filesystem
unmount: gfs2 would simply free glocks it didn't want to unlock and then
release the lockspace.  This didn't take the bast callbacks for
asynchronous lock contention notifications into account, which remain
active until until a lock is unlocked or its lockspace is released.

To prevent those callbacks from accessing deallocated objects, put the
glocks that should not be unlocked on the sd_dead_glocks list, release
the lockspace, and only then free those glocks.

As an additional measure, ignore unexpected ast and bast callbacks if
the receiving glock is dead.

Fixes: fb6791d100 ("GFS2: skip dlm_unlock calls in unmount")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
59f6000579 gfs2: Remove ill-placed consistency check
This consistency check was originally added by commit 9287c6452d
("gfs2: Fix occasional glock use-after-free").  It is ill-placed in
gfs2_glock_free() because if it holds there, it must equally hold in
__gfs2_glock_put() already.  Either way, the check doesn't seem
necessary anymore.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-24 19:48:20 +02:00
Andreas Gruenbacher
7a1ad9d812 gfs2: Fix lru_count accounting
Currently, gfs2_scan_glock_lru() decrements lru_count when a glock is
moved onto the dispose list.  When such a glock is then stolen from the
dispose list while gfs2_dispose_glock_lru() doesn't hold the lru_lock,
lru_count will be decremented again, so the counter will eventually go
negative.

This bug has existed in one form or another since at least commit
97cc1025b1 ("GFS2: Kill two daemons with one patch").

Fix this by only decrementing lru_count when we actually remove a glock
and schedule for it to be unlocked and dropped.  We also don't need to
remove and then re-add glocks when we can just as well move them back
onto the lru_list when necessary.

In addition, return the number of glocks freed as we should, not the
number of glocks moved onto the dispose list.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-24 19:46:38 +02:00
Andreas Gruenbacher
acf1f42faf gfs2: Fix "Make glock lru list scanning safer"
Commit 228804a35c tried to add a refcount check to
gfs2_scan_glock_lru() to make sure that glocks that are still referenced
cannot be freed.  It failed to account for the bias state_change() adds
to the refcount for held glocks, so held glocks are no longer removed
from the glock cache, which can lead to out-of-memory problems.  Fix
that.  (The inodes those glocks are associated with do get shrunk and do
get pushed out of memory.)

In addition, use the same eligibility check in gfs2_scan_glock_lru() and
gfs2_dispose_glock_lru().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:58 +02:00
Andreas Gruenbacher
c9a0a4b028 Revert "gfs2: fix glock shrinker ref issues"
This reverts commit 62862485a4.

Commit 62862485a4 tried to fix issues introduced by commit
228804a35c ("gfs2: Make glock lru list scanning safer"), but like that
commit, it failed to account for the bias state_change() adds to the
glock reference count for locked glocks.  Revert commit 62862485a4 so
that we can fix commit 228804a35c properly.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:58 +02:00
Andreas Gruenbacher
5d92311119 gfs2: Fix "ignore unlock failures after withdraw"
Commit 3e11e53041 tries to suppress dlm_lock() lock conversion errors
that occur when the lockspace has already been released.

It does that by setting and checking the SDF_SKIP_DLM_UNLOCK flag.  This
conflicts with the intended meaning of the SDF_SKIP_DLM_UNLOCK flag, so
check whether the lockspace is still allocated instead.

(Given the current DLM API, checking for this kind of error after the
fact seems easier that than to make sure that the lockspace is still
allocated before calling dlm_lock().  Changing the DLM API so that users
maintain the lockspace references themselves would be an option.)

Fixes: 3e11e53041 ("GFS2: ignore unlock failures after withdraw")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:58 +02:00
Andreas Gruenbacher
262ee3a07e gfs2: Get rid of unnecessary test_and_set_bit
The GLF_LOCK flag is protected by the gl->gl_lockref.lock spin lock
which is held when entering run_queue(), so we can use test_bit() and
set_bit() here.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:58 +02:00
Andreas Gruenbacher
927cfc90d2 gfs2: Don't set GLF_LOCK in gfs2_dispose_glock_lru
In gfs2_dispose_glock_lru(), we want to skip glocks which are in the
process of transitioning state (as indicated by the set GLF_LOCK flag),
but we we don't need to set that flag for requesting a state transition.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:57 +02:00
Andreas Gruenbacher
ee2be7d7c7 gfs2: Replace gfs2_glock_queue_put with gfs2_glock_put_async
Function gfs2_glock_queue_put() puts a glock reference by enqueuing
glock work instead of putting the reference directly.  This ensures that
the operation won't sleep, but it is costly and really only necessary
when putting the final glock reference.  Replace it with a new
gfs2_glock_put_async() function that only queues glock work when putting
the last glock reference.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2024-04-09 18:35:57 +02:00