linux/drivers/dax
Shiyang Ruan fa422b353d mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind
Now, if we suddenly remove a PMEM device(by calling unbind) which
contains FSDAX while programs are still accessing data in this device,
e.g.:
```
 $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
 # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
 echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
```
it could come into an unacceptable state:
  1. device has gone but mount point still exists, and umount will fail
       with "target is busy"
  2. programs will hang and cannot be killed
  3. may crash with NULL pointer dereference

To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we
are going to remove the whole device, and make sure all related processes
could be notified so that they could end up gracefully.

This patch is inspired by Dan's "mm, dax, pmem: Introduce
dev_pagemap_failure()"[1].  With the help of dax_holder and
->notify_failure() mechanism, the pmem driver is able to ask filesystem
on it to unmap all files in use, and notify processes who are using
those files.

Call trace:
trigger unbind
 -> unbind_store()
  -> ... (skip)
   -> devres_release_all()
    -> kill_dax()
     -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
      -> xfs_dax_notify_failure()
      `-> freeze_super()             // freeze (kernel call)
      `-> do xfs rmap
      ` -> mf_dax_kill_procs()
      `  -> collect_procs_fsdax()    // all associated processes
      `  -> unmap_and_kill()
      ` -> invalidate_inode_pages2_range() // drop file's cache
      `-> thaw_super()               // thaw (both kernel & user call)

Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
new dax mapping from being created.  Do not shutdown filesystem directly
if configuration is not supported, or if failure range includes metadata
area.  Make sure all files and processes(not only the current progress)
are handled correctly.  Also drop the cache of associated files before
pmem is removed.

[1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/
[2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-07 14:34:26 +05:30
..
hmem dax: Cleanup extra dax_region references 2023-06-23 01:03:50 -06:00
pmem dax: Kill DEV_DAX_PMEM_COMPAT 2021-11-24 19:21:35 -08:00
bus.c dax: refactor deprecated strncpy 2023-09-27 10:33:46 -07:00
bus.h dax: Cleanup extra dax_region references 2023-06-23 01:03:50 -06:00
cxl.c dax: Cleanup extra dax_region references 2023-06-23 01:03:50 -06:00
dax-private.h dax: Introduce alloc_dev_dax_id() 2023-06-23 01:03:50 -06:00
device.c mm: remove enum page_entry_size 2023-08-24 16:20:30 -07:00
Kconfig cxl for v6.3 2023-02-25 09:19:23 -08:00
kmem.c dax, kmem: calculate abstract distance with general interface 2023-10-16 15:44:39 -07:00
Makefile cxl/dax: Create dax devices for CXL RAM regions 2023-02-10 17:33:45 -08:00
pmem.c dax: Cleanup extra dax_region references 2023-06-23 01:03:50 -06:00
super.c mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind 2023-12-07 14:34:26 +05:30