linux/fs
Mike Christie f9010dbdce fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
<michael.christie@oracle.com> which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c.  Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn.  vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit.  This collection clears
__fatal_signal_pending.  This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file.  To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec.  The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d50 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Co-developed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-01 17:15:33 -04:00
..
9p Including fixes from netfilter. 2023-05-05 19:12:01 -07:00
adfs fs: port ->setattr() to pass mnt_idmap 2023-01-19 09:24:02 +01:00
affs for-6.3/dio-2023-02-16 2023-02-20 14:10:36 -08:00
afs Five hotfixes. Three are cc:stable, two pertain to post-6.3 merge window 2023-05-06 11:25:03 -07:00
autofs fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
befs befs: Convert befs_symlink_read_folio() to use a folio 2022-08-02 12:34:03 -04:00
bfs fs: port inode_init_owner() to mnt_idmap 2023-01-19 09:24:28 +01:00
btrfs for-6.4-rc4-tag 2023-05-30 17:23:50 -04:00
cachefiles fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls 2023-04-13 11:49:35 -07:00
ceph ceph: force updating the msg pointer in non-split case 2023-05-18 11:15:28 +02:00
coda sysctl-6.4-rc1 2023-04-27 16:52:33 -07:00
configfs fs: consolidate duplicate dt_type helpers 2023-04-03 09:23:54 +02:00
cramfs fs/cramfs/inode.c: initialize file_ra_state 2023-03-02 21:54:23 -08:00
crypto fscrypt: optimize fscrypt_initialize() 2023-04-06 11:16:39 -07:00
debugfs ARM: SoC drivers for 6.3 2023-02-27 10:04:49 -08:00
devpts devpts: simplify two-level sysctl registration for pty_kern_table 2023-03-13 12:36:34 +01:00
dlm Networking changes for 6.4. 2023-04-26 16:07:23 -07:00
ecryptfs fs: drop unused posix acl handlers 2023-03-06 09:57:12 +01:00
efivarfs A healthy mix of EFI contributions this time: 2023-02-23 14:41:48 -08:00
efs
erofs erofs: use HIPRI by default if per-cpu kthreads are enabled 2023-05-23 16:57:08 +08:00
exfat Description for this pull request: 2023-03-01 08:42:27 -08:00
exportfs fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
ext2 \n 2023-04-26 09:07:46 -07:00
ext4 ext4: enable the lazy init thread when remounting read/write 2023-05-30 15:33:57 -04:00
f2fs f2fs update for 6.4-rc1 2023-04-26 09:42:10 -07:00
fat There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
freevxfs There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
fscache fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() 2023-01-30 12:51:54 +00:00
fuse Driver core changes for 6.4-rc1 2023-04-27 11:53:57 -07:00
gfs2 gfs2: Don't deref jdesc in evict 2023-05-10 17:15:18 +02:00
hfs There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-04-12 11:29:32 +02:00
hostfs um: hostfs: define our own API boundary 2023-04-20 23:04:40 +02:00
hpfs fs: port ->rename() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
hugetlbfs mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() 2023-04-21 14:52:05 -07:00
iomap New code for 6.4: 2023-04-29 10:35:48 -07:00
isofs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
jbd2 jdb2: Don't refuse invalidation of already invalidated buffers 2023-04-14 19:38:50 -04:00
jffs2 fs: rename generic posix acl handlers 2023-03-06 09:57:13 +01:00
jfs write_one_page series 2023-04-24 19:20:27 -07:00
kernfs Driver core changes for 6.4-rc1 2023-04-27 11:53:57 -07:00
lockd nfsd-6.4 fixes: 2023-05-17 09:56:01 -07:00
minix Merge branch 'work.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:01:15 -08:00
netfs - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
nfs NFSv4.2: Fix a potential double free with READ_PLUS 2023-05-19 17:11:59 -04:00
nfs_common NFSv4.2: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:52 -07:00
nfsd nfsd-6.4 fixes: 2023-05-17 09:56:01 -07:00
nilfs2 nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode() 2023-05-17 15:24:34 -07:00
nls
notify inotify: Avoid reporting event with invalid wd 2023-04-25 12:36:55 +02:00
ntfs ntfs: simplfy one-level sysctl registration for ntfs_sysctls 2023-04-13 11:49:35 -07:00
ntfs3 driver ntfs3 for linux 6.4 2023-04-29 10:52:37 -07:00
ocfs2 Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
omfs fs: port inode_init_owner() to mnt_idmap 2023-01-19 09:24:28 +01:00
openpromfs
orangefs - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
overlayfs ovl: check for ->listxattr() support 2023-03-06 09:57:13 +01:00
proc sysctl: remove register_sysctl_paths() 2023-05-02 19:24:16 -07:00
pstore pstore update for v6.4-rc1 2023-04-27 17:03:40 -07:00
qnx4 qnx4: credit contributors in CREDITS 2023-03-14 12:56:30 -06:00
qnx6 qnx6: credit contributor and mark filesystem orphan 2023-03-14 12:56:30 -06:00
quota quota: mark PRINT_QUOTA_WARNING as BROKEN 2023-04-14 13:06:50 +02:00
ramfs mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
reiserfs \n 2023-04-26 09:07:46 -07:00
romfs mm/nommu: factor out check for NOMMU shared mappings into is_nommu_shared_mapping() 2023-01-18 17:12:56 -08:00
smb eight server fixes 2023-06-01 08:27:34 -04:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-03 17:52:25 -08:00
sysfs
sysv sysv: switch to put_and_unmap_page() 2023-03-12 20:03:41 -04:00
tracefs fs: port ->mkdir() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
ubifs ubifs: Fix memleak when insert_old_idx() failed 2023-04-23 23:36:38 +02:00
udf udf: use wrapper i_blocksize() in udf_discard_prealloc() 2023-03-13 11:16:16 +01:00
ufs ufs: don't flush page immediately for DIRSYNC directories 2023-03-28 16:20:14 -07:00
unicode unicode: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:54 -07:00
vboxsf fs: port ->rename() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
verity fsverity: reject FS_IOC_ENABLE_VERITY on mode 3 fds 2023-04-11 19:23:23 -07:00
xfs xfs: bug fixes for 6.4-rc2 2023-05-11 16:51:11 -05:00
zonefs zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space 2023-03-30 20:56:02 +09:00
aio.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
anon_inodes.c dynamic_dname(): drop unused dentry argument 2022-08-20 11:34:04 -04:00
attr.c nfs: use vfs setgid helper 2023-03-30 08:51:48 +02:00
bad_inode.c fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
binfmt_elf_fdpic.c ELF: fix all "Elf" typos 2023-04-08 13:45:37 -07:00
binfmt_elf_test.c
binfmt_elf.c Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-02 13:57:04 -08:00
binfmt_script.c
buffer.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-02 17:48:59 +01:00
compat_binfmt_elf.c
coredump.c fork, vhost: Use CLONE_THREAD to fix freezer/ps regression 2023-06-01 17:15:33 -04:00
d_path.c d_path.c: typo fix... 2022-08-20 11:34:33 -04:00
dax.c fsdax: force clear dirty mark if CoW 2023-04-05 18:06:23 -07:00
dcache.c tmpfile API change 2022-10-10 19:45:17 -07:00
direct-io.c __blockdev_direct_IO(): get rid of submit_io callback 2023-03-05 20:27:41 -05:00
drop_caches.c
eventfd.c eventfd: use wait_event_interruptible_locked_irq() helper 2023-04-06 10:01:50 +02:00
eventpoll.c Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
exec.c tracing updates for 6.4: 2023-04-28 15:57:53 -07:00
fcntl.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
fhandle.c do_sys_name_to_handle(): constify path 2022-09-01 17:36:39 -04:00
file_table.c filelock: move file locking definitions to separate header file 2023-01-11 06:52:32 -05:00
file.c fs: prevent out-of-bounds array speculation when closing a file descriptor 2023-03-09 22:46:21 -05:00
filesystems.c
fs_context.c
fs_parser.c ext4: journal_path mount options should follow links 2022-12-01 10:46:54 -05:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c for-6.4/block-2023-05-06 2023-05-06 08:28:58 -07:00
fsopen.c
init.c fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
inode.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
internal.h five ksmbd server fixes, and new lock_rename_child VFS helper routine 2023-04-29 11:10:39 -07:00
ioctl.c fs: port inode_owner_or_capable() to mnt_idmap 2023-01-19 09:24:29 +01:00
Kconfig smb: move client and server files to common directory fs/smb 2023-05-24 16:29:21 -05:00
Kconfig.binfmt Xtensa updates for v6.1 2022-10-10 14:21:11 -07:00
kernel_read_file.c
libfs.c fs: consolidate duplicate dt_type helpers 2023-04-03 09:23:54 +02:00
locks.c filelocks: use mount idmapping for setlease permission check 2023-03-09 22:36:12 +01:00
Makefile smb: move client and server files to common directory fs/smb 2023-05-24 16:29:21 -05:00
mbcache.c ext4: fix deadlock due to mbcache entry corruption 2022-12-08 21:49:25 -05:00
mnt_idmapping.c fs: move mnt_idmap 2023-01-19 09:24:30 +01:00
mount.h
mpage.c mpage: use folios in bio end_io handler 2023-04-18 16:30:02 -07:00
namei.c five ksmbd server fixes, and new lock_rename_child VFS helper routine 2023-04-29 11:10:39 -07:00
namespace.c fget() to fdget() conversions 2023-04-24 19:14:20 -07:00
no-block.c
nsfs.c kill the last remaining user of proc_ns_fget() 2023-04-20 22:55:35 -04:00
open.c open: return EINVAL for O_DIRECTORY | O_CREAT 2023-03-22 11:06:55 +01:00
pipe.c pipe: check for IOCB_NOWAIT alongside O_NONBLOCK 2023-05-12 17:17:27 +02:00
pnode.c pnode: pass mountpoint directly 2023-04-06 14:53:38 +02:00
pnode.h
posix_acl.c acl: don't depend on IOP_XATTR 2023-03-06 09:59:20 +01:00
proc_namespace.c
read_write.c iov_iter: add iter_iov_addr() and iter_iov_len() helpers 2023-03-30 08:12:29 -06:00
readdir.c Change calling conventions for filldir_t 2022-08-17 17:25:04 -04:00
remap_range.c fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap 2023-01-19 09:24:29 +01:00
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
signalfd.c
splice.c pipe-nonblock-2023-05-06 2023-05-06 08:15:20 -07:00
stack.c
stat.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-17 15:20:17 +02:00
super.c mm: shrinkers: convert shrinker_rwsem to mutex 2023-03-28 16:20:17 -07:00
sync.c
sysctls.c
timerfd.c
userfaultfd.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
utimes.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
xattr.c fs: don't call posix_acl_listxattr in generic_listxattr 2023-05-17 15:25:20 +02:00