Commit Graph

141 Commits

Author SHA1 Message Date
Alexander Aring
94e180d625 dlm: async freeing of lockspace resources
This patch handles freeing of lockspace resources asynchronously besides
the release_lockspace() context. The release_lockspace() context is
sometimes called in a time critical context, e.g. umount syscall. Most
every user space init system will timeout if it takes too long. To
reduce the potential waiting time we deregister in release_lockspace()
the lockspace from the DLM subsystem and do the actual releasing of
lockspace resource in a worker of a workqueue following recommendation
of:

https://lore.kernel.org/all/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp/T/#u

as flushing of system workqueues are not allowed. The most time to
release the DLM resources are spent to release the data structures
"ls->ls_lkbxa" and "ls->ls_rsbtbl" as they iterate over each entries and
those data structures can contain millions of entries. This patch handles
for now only freeing of those data structures as those operations are
the most reason why release_lockspace() blocking of being returned.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-08-08 15:15:08 -05:00
Alexander Aring
8a4cf500f1 dlm: drop kobject release callback handling
This patch removes the releasing of the "struct dlm ls" resource out of
the kobject handling. Instead we run kfree() after kobject_put() of the
lockspace kobject structure that should always being the last put call.
This prepares to split the releasing of all lockspace resources
asynchronously in the background and just deregister everything in
release_lockspace().

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-08-08 15:15:08 -05:00
Alexander Aring
79ced51e2e dlm: remove DLM_LSFL_SOFTIRQ from exflags
The DLM rcom handling has a check that all exflags are the same for the
whole lockspace membership nodes. There are some flags that requires
such handling, however DLM_LSFL_SOFTIRQ does not require this handling
and it should be backwards compatibility with other lockspaces that does
not set this flag.

Fixes: f328a26eeb ("dlm: introduce DLM_LSFL_SOFTIRQ_SAFE")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-12 16:20:51 -05:00
Alexander Aring
68bde2a67a dlm: implement LSFL_SOFTIRQ_SAFE
When a lockspace user allows it, run callback functions directly from
softirq context, instead of queueing callbacks to be run from the
dlm_callback workqueue context.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-11 13:13:00 -05:00
Alexander Aring
f328a26eeb dlm: introduce DLM_LSFL_SOFTIRQ_SAFE
Introduce a new external lockspace flag DLM_LSFL_SOFTIRQ_SAFE.  A
lockspace user will set this flag if it can handle dlm running the
callback functions from softirq context.  When not set, dlm will
continue to run callback functions from the dlm_callback workqueue.
The new lockspace flag cannot be used for user space lockspaces, so
a uapi placeholder definition is used for the new flag value.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-11 12:57:49 -05:00
Alexander Aring
d3d85e9ad5 dlm: use LSFL_FS to check for kernel lockspace
The existing external lockspace flag DLM_LSFL_FS is now also
saved as an internal flag LSFL_FS, so it can be checked from
other code locations which want to know if a lockspace is
used from the kernel or user space.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-11 12:50:30 -05:00
David Teigland
4f5957a980 dlm: change list and timer names
The old terminology of "toss" and "keep" is no longer an
accurate description of the rsb states and lists, so change
the names to "inactive" and "active".  The old names had
also been copied into the scanning code, which is changed
back to use the "scan" name.

- "active" rsb structs have lkb's attached, and are ref counted.
- "inactive" rsb structs have no lkb's attached, are not ref counted.
- "scan" list is for rsb's that can be freed after a timeout period.
- "slow" lists are for infrequent iterations through active or
   inactive rsb structs.
- inactive rsb structs that are directory records will not be put
  on the scan list, since they are not freed based on timeouts.
- inactive rsb structs that are not directory records will be
  put on the scan list to be freed, since they are not longer needed.

Signed-off-by: David Teigland <teigland@redhat.com>
2024-06-10 15:11:46 -05:00
Alexander Aring
fa0b54f17a dlm: move recover idr to xarray datastructure
According to kdoc idr is deprecated and xarrays should be used nowadays.
This patch is moving the recover idr implementation to xarray
datastructure.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-05-31 11:04:54 -05:00
Alexander Aring
f455eb8490 dlm: move lkb idr to xarray datastructure
According to kernel doc idr is deprecated and xarrays should be used
nowadays. This patch is moving the lkb idr implementation to xarrays.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-05-31 11:04:54 -05:00
Alexander Aring
1ffefc19c4 dlm: drop own rsb pre allocation mechanism
This patch drops the own written rsb pre allocation mechanism as this is
already done by using kmem caches, we don't need another layer on top of
that to running some pre allocation scheme.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-05-31 11:04:54 -05:00
Alexander Aring
4db41bf4f0 dlm: remove ls_local_handle from struct dlm_ls
This patch removes ls_local_handle from struct dlm_ls as it stores the
ls pointer of the top level structure itesef and this isn't necessary.
There is a lookup functionality to lookup the lockspace in
dlm_find_lockspace_local() but the given input parameter is the pointer
already. This might be more safe to lookup a lockspace but given a wrong
lockspace pointer is a bug in the code and we save the additional lookup
here. The dlm_ls structure can be still hidden by using dlm_lockspace_t
handle pointer.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-05-31 11:04:54 -05:00
Alexander Aring
b88b249ba7 dlm: remove scand leftovers
This patch removes some leftover related code from dlm_scand that was
dropped in commit b1f2381c1a ("dlm: drop dlm_scand kthread and use
timers").

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-05-31 11:04:54 -05:00
Alexander Aring
7b72ab2c6a dlm: return -ENOMEM if ls_recover_buf fails
This patch fixes to return -ENOMEM in case of an allocation failure that
was forgotten to change in commit 6c648035cb ("dlm: switch to use
rhashtable for rsbs").

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202404200536.jGi6052v-lkp@intel.com/
Fixes: 6c648035cb ("dlm: switch to use rhashtable for rsbs")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-23 16:08:55 -05:00
Alexander Aring
7b012732d0 dlm: fix sleep in atomic context
This patch changes the orphans mutex to a spinlock since commit
c288745f1d ("dlm: avoid blocking receive at the end of recovery") is
using a rwlock_t to lock the DLM message receive path and do_purge() can
be called while this lock is held that forbids to sleep.

We need to use spin_lock_bh() because also a user context that calls
dlm_user_purge() can call do_purge() and since commit 92d59adfaf
("dlm: do message processing in softirq context") the DLM message
receive path is done under softirq context.

Fixes: c288745f1d ("dlm: avoid blocking receive at the end of recovery")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/gfs2/9ad928eb-2ece-4ad9-a79c-d2bce228e4bc@moroto.mountain/
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-17 14:20:04 -05:00
Alexander Aring
15fd7e5517 dlm: use rwlock for lkbidr
Convert the lock for lkbidr to an rwlock.  Most idr lookups will use
the read lock.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 14:45:57 -05:00
Alexander Aring
e91313591b dlm: use rwlock for rsb hash table
The conversion to rhashtable introduced a hash table lock per lockspace,
in place of per bucket locks.  To make this more scalable, switch to
using a rwlock for hash table access.  The common case fast path uses
it as a read lock.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 14:45:31 -05:00
Alexander Aring
b1f2381c1a dlm: drop dlm_scand kthread and use timers
Currently the scand kthread acts like a garbage collection for expired
rsbs on toss list, to clean them up after a certain timeout. It triggers
every couple of seconds and iterates over the toss list while holding
ls_rsbtbl_lock for the whole hash bucket iteration.

To reduce the amount of time holding ls_rsbtbl_lock, we now handle the
disposal of expired rsbs using a per-lockspace timer that expires for the
earliest tossed rsb on the lockspace toss queue. This toss queue is
ordered according to the rsb res_toss_time with the earliest tossed rsb
as the first entry. The toss timer will only trylock() necessary locks,
since it is low priority garbage collection, and will rearm the timer
if trylock() fails. If the timer function does not find any expired
rsb's, it rearms the timer with the next earliest expired rsb.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 14:40:27 -05:00
Alexander Aring
6c648035cb dlm: switch to use rhashtable for rsbs
Replace our own hash table with the more advanced rhashtable
for keeping rsb structs.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 14:34:39 -05:00
Alexander Aring
93a693d19d dlm: add rsb lists for iteration
To prepare for using rhashtable, add two rsb lists for iterating
through rsb's in two uncommon cases where this is necesssary:
- when dumping rsb state from debugfs, now using seq_list.
- when looking at all rsb's during recovery.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 14:33:25 -05:00
Alexander Aring
2d90354027 dlm: merge toss and keep hash table lists into one list
There are several places where lock processing can perform two hash table
lookups, first in the "keep" list, and if not found, in the "toss" list.
This patch introduces a new rsb state flag "RSB_TOSS" to represent the
difference between the state of being on keep vs toss list, so that the
two lists can be combined.  This avoids cases of two lookups.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 13:49:13 -05:00
Alexander Aring
dcdaad05ca dlm: change to single hashtable lock
Prepare to replace our own hash table with rhashtable by replacing
the per-bucket locks in our own hash table with a single lock.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 13:46:41 -05:00
Alexander Aring
700b04808f dlm: increment ls_count for dlm_scand
Increment the ls_count value while dlm_scand is processing a
lockspace so that release_lockspace()/remove_lockspace() will
wait for dlm_scand to finish.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-16 13:42:45 -05:00
Alexander Aring
578acf9a87 dlm: use spin_lock_bh for message processing
Use spin_lock_bh for all spinlocks involved in message processing,
in preparation for softirq message processing.  DLM lock requests
from user space involve dlm processing in user context, in addition
to the standard kernel context, necessitating bh variants.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:45:23 -05:00
Alexander Aring
d52c9b8fef dlm: convert ls_recv_active from rw_semaphore to rwlock
Convert ls_recv_active rw_semaphore to an rwlock to avoid
sleeping, in preparation for softirq message processing.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
c288745f1d dlm: avoid blocking receive at the end of recovery
The end of the recovery process transitioned to normal message
processing by temporarily blocking the receiving context,
processing saved messages, then unblocking the receiving
context.  To avoid blocking the receiving context, the old
wait_queue and mutex are replaced by a new rwlock and the new
RECV_MSG_BLOCKED flag.  Received messages are added to the
list of saved messages, protected by the rwlock, until the
flag is cleared, which happens when all saved messages have
been processed.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
097691dbad dlm: convert ls_waiters_mutex to spinlock
Convert the waiters mutex to a spinlock in prepration for
processing messages in softirq context.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
3ae6776056 dlm: add new struct to save position in dlm_copy_master_names
Add a new struct to save the current position in the rsb masters_list
while sending the rsb names to other nodes. The rsb names are sent in
multiple chunks, and for each new chunk, the new "dlm_dir_dump" struct
saves the last position in the masters_list. The new struct is also
used to save more information to sanity check the recovery process.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
3a747f4a2e dlm: move rsb root_list to ls_recover() stack
Move the rsb root_list from the lockspace to a stack variable since
it is now only used by the ls_recover() function.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
aff46e0f24 dlm: use a new list for recovery of master rsb names
Add a new "masters_list" for master rsb structs, with a new
rwlock. The new list is created and used during the recovery
process to send the master rsb names to new nodes. With this
change, the current "root_list" can be used without locking.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-04-09 11:44:49 -05:00
Alexander Aring
c6b6d6dcc7 fs: dlm: revert check required context while close
This patch reverts commit 2c3fa6ae4d ("dlm: check required context
while close"). The function dlm_midcomms_close(), which will call later
dlm_lowcomms_close(), is called when the cluster manager tells the node
got fenced which means on midcomms/lowcomms layer to disconnect the node
from the cluster communication. The node can rejoin the cluster later.
This patch was ensuring no new message were able to be triggered when we
are in the close() function context. This was done by checking if the
lockspace has been stopped. However there is a missing check that we
only need to check specific lockspaces where the fenced node is member
of. This is currently complicated because there is no way to easily
check if a node is part of a specific lockspace without stopping the
recovery. For now we just revert this commit as it is just a check to
finding possible leaks of stopping lockspaces before close() is called.

Cc: stable@vger.kernel.org
Fixes: 2c3fa6ae4d ("dlm: check required context while close")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-06-14 10:17:33 -05:00
Alexander Aring
e1af8728f6 fs: dlm: move internal flags to atomic ops
This patch will move the lkb_flags value to the recently introduced
lkb_iflags value. For lkb_iflags we use atomic bit operations because
some flags like DLM_IFL_CB_PENDING are used while non rsb lock is held
to avoid issues with other flag manipulations which might run at the
same time we switch to atomic bit operations. Snapshot the bit values to
an uint32_t value is only used for debugging/logging use cases and don't
need to be 100% correct.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-03-06 15:49:07 -06:00
Alexander Aring
a7e7ffacad fs: dlm: rename stub to local message flag
This patch renames DLM_IFL_STUB_MS to DLM_IFL_LOCAL_MS flag. The
DLM_IFL_STUB_MS flag is somewhat misnamed, it means the dlm message is
used for local message transfer only. It is used by recovery to resolve
lock states if a node got fenced.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-03-06 15:49:07 -06:00
Alexander Aring
01c7a59789 fs: dlm: remove deprecated code parts
This patch removes code parts which was declared deprecated by
commit 6b0afc0cc3 ("fs: dlm: don't use deprecated timeout features by
default"). This contains the following dlm functionality:

- start a cancel of a dlm request did not complete after certain timeout:
  The current way how dlm cancellation works and interfering with other
  dlm requests triggered by the user can end in an overlapping and
  returning in -EBUSY. The most user don't handle this case and are
  unaware that DLM can return such errno in such situation. Due the
  timeout the user are mostly unaware when this happens.
- start a netlink warning messages for user space if dlm requests did
  not complete after certain timeout:
  This feature was never being built in the only known dlm user space side.
  As we are to remove the timeout cancellation feature we can directly
  remove this feature as well.

There might be the possibility to bring the timeout cancellation feature
back. However the current way of handling the -EBUSY case which is only
a software limitation and not a hardware limitation should be changed.
We minimize the current code base in DLM cancellation feature to not have
to deal with those existing features while solving the DLM cancellation
feature in general.

UAPI define DLM_LSFL_TIMEWARN is commented as deprecated and reserved
value. We should avoid at first to give it a new meaning but let
possible users still compile by keeping this define. In far future we
can give this flag a new meaning. The same for the DLM_LKF_TIMEOUT lock
request flag.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-03-06 15:49:07 -06:00
Linus Torvalds
a93e884edf Driver core changes for 6.3-rc1
Here is the large set of driver core changes for 6.3-rc1.
 
 There's a lot of changes this development cycle, most of the work falls
 into two different categories:
   - fw_devlink fixes and updates.  This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.
   - driver core changes to work to make struct bus_type able to be moved
     into read-only memory (i.e. const)  The recent work with Rust has
     pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only making
     things safer overall.  This is the contuation of that work (started
     last release with kobject changes) in moving struct bus_type to be
     constant.  We didn't quite make it for this release, but the
     remaining patches will be finished up for the release after this
     one, but the groundwork has been laid for this effort.
 
 Other than that we have in here:
   - debugfs memory leak fixes in some subsystems
   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.
   - cacheinfo rework and fixes
   - Other tiny fixes, full details are in the shortlog
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
 6oeFOjD3JDju3cQsfGgd
 =Su6W
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.3-rc1.

  There's a lot of changes this development cycle, most of the work
  falls into two different categories:

   - fw_devlink fixes and updates. This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.

   - driver core changes to work to make struct bus_type able to be
     moved into read-only memory (i.e. const) The recent work with Rust
     has pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only
     making things safer overall. This is the contuation of that work
     (started last release with kobject changes) in moving struct
     bus_type to be constant. We didn't quite make it for this release,
     but the remaining patches will be finished up for the release after
     this one, but the groundwork has been laid for this effort.

  Other than that we have in here:

   - debugfs memory leak fixes in some subsystems

   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.

   - cacheinfo rework and fixes

   - Other tiny fixes, full details are in the shortlog

  All of these have been in linux-next for a while with no reported
  problems"

[ Geert Uytterhoeven points out that that last sentence isn't true, and
  that there's a pending report that has a fix that is queued up - Linus ]

* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
  debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
  OPP: fix error checking in opp_migrate_dentry()
  debugfs: update comment of debugfs_rename()
  i3c: fix device.h kernel-doc warnings
  dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
  driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
  Revert "driver core: add error handling for devtmpfs_create_node()"
  Revert "devtmpfs: add debug info to handle()"
  Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
  driver core: cpu: don't hand-override the uevent bus_type callback.
  devtmpfs: remove return value of devtmpfs_delete_node()
  devtmpfs: add debug info to handle()
  driver core: add error handling for devtmpfs_create_node()
  driver core: bus: update my copyright notice
  driver core: bus: add bus_get_dev_root() function
  driver core: bus: constify bus_unregister()
  driver core: bus: constify some internal functions
  driver core: bus: constify bus_get_kset()
  driver core: bus: constify bus_register/unregister_notifier()
  driver core: remove private pointer from struct bus_type
  ...
2023-02-24 12:58:55 -08:00
Greg Kroah-Hartman
56d5f362ad kobject: kset_uevent_ops: make uevent() callback take a const *
The uevent() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction.  When doing so, fix up all existing uevent() callbacks to
have the correct signature to preserve the build.

Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-27 13:45:53 +01:00
Alexander Aring
317dd6ba6c fs: dlm: make dlm sequence id more robust
When joining a new lockspace, use a random number to initialize
a sequence number used in messages. This makes it easier to detect
sequence number mismatches in message replies during tests that
repeatedly join and leave a lockspace.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-01-23 16:13:20 -06:00
Alexander Aring
b8b750e0c9 fs: dlm: wait until all midcomms nodes detect version
The current dlm version detection is very complex due to backwards
compatablilty with earlier dlm protocol versions. It takes some time to
detect if a peer node has a specific DLM version. If it's not detected,
we just cut the socket connection. There could be cases where the local
node has not detected the version yet, but the peer node has.  In these
cases, we are trying to shutdown the dlm connection with a FIN/ACK message
exchange to be sure the other peer is ready to shutdown the connection on
dlm application level.  However this mechanism is only available on DLM
protocol version 3.2 and we need to be sure the DLM version is detected
before.

To make it more robust we introduce a a "best effort" wait to wait for the
version detection before shutdown the dlm connection. This need to be
done before the kthread recoverd for recovery handling is stopped,
because recovery handling will trigger enough messages to have a version
detection going on.

It is a corner case which was detected by modprobe dlm_locktroture module
and rmmod dlm_locktorture module directly afterwards (in a looping
behaviour). In practice probably nobody would leave a lockspace immediately
after joining it.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-01-23 14:58:19 -06:00
Alexander Aring
aad633dc0c fs: dlm: start midcomms before scand
The scand kthread can send dlm messages out, especially dlm remove
messages to free memory for unused rsb on other nodes. To send out dlm
messages, midcomms must be initialized. This patch moves the midcomms
start before scand is started.

Cc: stable@vger.kernel.org
Fixes: e7fd41792f ("[DLM] The core of the DLM for GFS2/CLVM")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2023-01-23 12:43:53 -06:00
Alexander Aring
8b0188b0d6 fs: dlm: add midcomms init/start functions
This patch introduces leftovers of init, start, stop and exit
functionality. The dlm application layer should always call the midcomms
layer which getting aware of such event and redirect it to the lowcomms
layer. Some functionality which is currently handled inside the start
functionality of midcomms and lowcomms should be handled in the init
functionality as it only need to be initialized once when dlm is loaded.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-21 09:45:49 -06:00
Alexander Aring
3e54c9e80e fs: dlm: fix log of lowcomms vs midcomms
This patch will fix a small issue when printing out that
dlm_midcomms_start() failed to start and it was printing out that the
dlm subcomponent lowcomms was failed but lowcomms is behind the midcomms
layer.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-08 12:59:41 -06:00
Alexander Aring
3872f87b09 fs: dlm: remove ls_remove_wait waitqueue
This patch removes the ls_remove_wait waitqueue handling. The current
handling tries to wait before a lookup is send out for a identically
resource name which is going to be removed. Hereby the remove message
should be send out before the new lookup message. The reason is that
after a lookup request and response will actually use the specific
remote rsb. A followed remove message would delete the rsb on the remote
side but it's still being used.

To reach a similar behaviour we simple send the remove message out while
the rsb lookup lock is held and the rsb is removed from the toss list.
Other find_rsb() calls would never have the change to get a rsb back to
live while a remove message will be send out (without holding the lock).

This behaviour requires a non-sleepable context which should be provided
now and might be the reason why it was not implemented so in the first
place.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-08 12:59:41 -06:00
Alexander Aring
a4c0352bb1 fs: dlm: convert ls_cb_mutex mutex to spinlock
This patch converts the ls_cb_mutex mutex to a spinlock, there is no
sleepable context when this lock is held.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-08 12:59:41 -06:00
Paulo Miguel Almeida
d96d0f9617 dlm: replace one-element array with fixed size array
One-element arrays are deprecated. So, replace one-element array with
fixed size array member in struct dlm_ls, and refactor the rest of the
code, accordingly.

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/228
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
Link: https://lore.kernel.org/lkml/Y0W5jkiXUkpNl4ap@mail.google.com/

Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-08 12:58:47 -06:00
Alexander Aring
12cda13cfd fs: dlm: remove DLM_LSFL_FS from uapi
The DLM_LSFL_FS flag is set in lockspaces created directly
for a kernel user, as opposed to those lockspaces created
for user space applications.  The user space libdlm allowed
this flag to be set for lockspaces created from user space,
but then used by a kernel user.  No kernel user has ever
used this method, so remove the ability to do it.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-23 14:54:54 -05:00
Alexander Aring
296d9d1e98 fs: dlm: change ls_clear_proc_locks to spinlock
This patch changes the ls_clear_proc_locks to a spinlock because there
is no need to handle it as a mutex as there is no sleepable context when
ls_clear_proc_locks is held. This allows us to call those functionality
in non-sleepable contexts.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-23 14:54:02 -05:00
Alexander Aring
b5c9d37c7f fs: dlm: allow lockspaces have zero lvblen
A dlm user may not use the DLM_LKF_VALBLK flag in the DLM API,
so a zero lvblen should be allowed as a lockspace parameter.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-23 14:53:16 -05:00
Alexander Aring
6b0afc0cc3 fs: dlm: don't use deprecated timeout features by default
This patch will disable use of deprecated timeout features if
CONFIG_DLM_DEPRECATED_API is not set.  The deprecated features
will be removed in upcoming kernel release v6.2.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-01 09:31:38 -05:00
Alexander Aring
81eeb82fc2 fs: dlm: add deprecation Kconfig and warnings for timeouts
This patch adds a CONFIG_DLM_DEPRECATED_API Kconfig option
that must be enabled to use two timeout-related features
that we intend to remove in kernel v6.2.  Warnings are
printed if either is enabled and used.  Neither has ever
been used as far as we know.

. The DLM_LSFL_TIMEWARN lockspace creation flag will be
  removed, along with the associated configfs entry for
  setting the timeout.  Setting the flag and configfs file
  would cause dlm to track how long locks were waiting
  for reply messages.  After a timeout, a kernel message
  would be logged, and a netlink message would be sent
  to userspace.  Recently, midcomms messages have been
  added that produce much better logging about actual
  problems with messages.  No use has ever been found
  for the netlink messages.

. The userspace libdlm API has allowed the DLM_LKF_TIMEOUT
  flag with a timeout value to be set in lock requests.
  The lock request would be cancelled after the timeout.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-01 09:31:32 -05:00
Alexander Aring
2bb2a3d66c fs: dlm: remove waiter warnings
This patch removes warning messages that could be logged when
remote requests had been waiting on a reply message for some timeout
period (which could be set through configfs, but was rarely enabled.)
The improved midcomms layer now carefully tracks all messages and
replies, and logs much more useful messages if there is an actual
problem.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-06-24 11:57:52 -05:00
Alexander Aring
682bb91b6b fs: dlm: make new_lockspace() wait until recovery completes
Make dlm_new_lockspace() wait until a full recovery completes
sucessfully or fails. Previously, dlm_new_lockspace() returned
to the caller after dlm_recover_members() finished, which is
only partially through recovery.  The result of the previous
behavior is that the new lockspace would not be usable for some
time (especially with overlapping recoveries), and some errors
in the later part of recovery could not be returned to the caller.

Kernel callers gfs2 and cluster-md have their own wait handling to
wait for recovery to complete after calling dlm_new_lockspace().
This continues to work, but will be unnecessary.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-06-24 11:57:47 -05:00