linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-14 08:02:07 +00:00

Author	SHA1	Message	Date
Dongdong Zhang	b5f1a218ae	f2fs: fix normal discard process In the DPOLICY_BG mode, there is a conflict between the two conditions "i + 1 < dpolicy->granularity" and "i < DEFAULT_DISCARD_GRANULARITY". If i = 15, the first condition is false, it will enter the second condition and dispatch all small granularity discards in function __issue_discard_cmd_orderly. The restrictive effect of the first condition to small discards will be invalidated. These two conditions should align. Fixes: `20ee438232` ("f2fs: issue small discard by LBA order") Signed-off-by: Dongdong Zhang <zhangdongdong1@oppo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:02 -07:00
Yangtao Li	44b9d01f2e	f2fs: cleanup in f2fs_create_flush_cmd_control() Just cleanup for readable, no functional changes. Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:02 -07:00
Yangtao Li	6359a1aaca	f2fs: fix gc mode when gc_urgent_high_remaining is 1 Under the current logic, when gc_urgent_high_remaining is set to 1, the mode will be switched to normal at the beginning, instead of running in gc_urgent mode. Let's switch the gc mode back to normal when the gc ends. Fixes: `265576181b` ("f2fs: remove gc_urgent_high_limited for cleanup") Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:02 -07:00
Yangtao Li	3688cbe39b	f2fs: remove batched_trim_sections node commit 377224c47118("f2fs: don't split checkpoint in fstrim") obsolete batch mode and related sysfs entry. Since this testing sysfs node has been deprecated for a long time, let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:02 -07:00
Chao Yu	18792e64c8	f2fs: support fault injection for f2fs_is_valid_blkaddr() This patch supports to inject fault into f2fs_is_valid_blkaddr() to simulate accessing inconsistent data/meta block addressses from caller. Usage: a) echo 262144 > /sys/fs/f2fs/<dev>/inject_type or b) mount -o fault_type=262144 <dev> <mountpoint> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:01 -07:00
Chao Yu	91586ce0d3	f2fs: fix to invalidate dcc->f2fs_issue_discard in error path Syzbot reports a NULL pointer dereference issue as below: __refcount_add include/linux/refcount.h:193 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] get_task_struct include/linux/sched/task.h:110 [inline] kthread_stop+0x34/0x1c0 kernel/kthread.c:703 f2fs_stop_discard_thread+0x3c/0x5c fs/f2fs/segment.c:1638 kill_f2fs_super+0x5c/0x194 fs/f2fs/super.c:4522 deactivate_locked_super+0x70/0xe8 fs/super.c:332 deactivate_super+0xd0/0xd4 fs/super.c:363 cleanup_mnt+0x1f8/0x234 fs/namespace.c:1186 __cleanup_mnt+0x20/0x30 fs/namespace.c:1193 task_work_run+0xc4/0x14c kernel/task_work.c:177 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x26c/0xbe0 kernel/exit.c:795 do_group_exit+0x60/0xe8 kernel/exit.c:925 __do_sys_exit_group kernel/exit.c:936 [inline] __se_sys_exit_group kernel/exit.c:934 [inline] __wake_up_parent+0x0/0x40 kernel/exit.c:934 __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] invoke_syscall arch/arm64/kernel/syscall.c:52 [inline] el0_svc_common+0x138/0x220 arch/arm64/kernel/syscall.c:142 do_el0_svc+0x48/0x164 arch/arm64/kernel/syscall.c:206 el0_svc+0x58/0x150 arch/arm64/kernel/entry-common.c:636 el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:654 el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:581 The root cause of this issue is in error path of f2fs_start_discard_thread(), it missed to invalidate dcc->f2fs_issue_discard, later kthread_stop() may access invalid pointer. Fixes: `4d67490498` ("f2fs: Don't create discard thread when device doesn't support realtime discard") Reported-by: syzbot+035a381ea1afb63f098d@syzkaller.appspotmail.com Reported-by: syzbot+729c925c2d9fc495ddee@syzkaller.appspotmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-01 17:56:01 -07:00
Zhang Qilong	28fc4e9077	f2fs: Fix the race condition of resize flag between resizefs Because the set/clear SBI_IS_RESIZEFS flag not between any locks, In the following case: thread1 thread2 ->ioctl(resizefs) ->set RESIZEFS flag ->ioctl(resizefs) ... ->set RESIZEFS flag ->clear RESIZEFS flag ->resizefs stream # No RESIZEFS flag in the stream Also before freeze_super, the resizefs not started, we should not set the SBI_IS_RESIZEFS flag. So move the set/clear SBI_IS_RESIZEFS flag between the cp_mutex and gc_lock. Fixes: `b4b10061ef` ("f2fs: refactor resize_fs to avoid meta updates in progress") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-27 20:25:59 -07:00
Jaegeuk Kim	14dc00a0e2	f2fs: let's avoid to get cp_rwsem twice by f2fs_evict_inode by d_invalidate f2fs_unlink -> f2fs_lock_op -> d_invalidate -> shrink_dentry_list -> iput_final -> f2fs_evict_inode -> f2fs_lock_op Reviewed-by: Chao Yu <chao@kernel.org> Tested-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-25 10:44:27 -07:00
Pavel Machek	c3db3c2fd9	f2fs: should put a page when checking the summary info The commit introduces another bug. Cc: stable@vger.kernel.org Fixes: `c6ad7fd166` ("f2fs: fix to do sanity check on summary info") Signed-off-by: Pavel Machek <pavel@denx.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-25 10:44:21 -07:00
Linus Torvalds	f1947d7c8a	Random number generator fixes for Linux 6.1-rc1. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmNHYD0ACgkQSfxwEqXe A655AA//dJK0PdRghqrKQsl18GOCffV5TUw5i1VbJQbI9d8anfxNjVUQiNGZi4et qUwZ8OqVXxYx1Z1UDgUE39PjEDSG9/cCvOpMUWqN20/+6955WlNZjwA7Fk6zjvlM R30fz5CIJns9RFvGT4SwKqbVLXIMvfg/wDENUN+8sxt36+VD2gGol7J2JJdngEhM lW+zqzi0ABqYy5so4TU2kixpKmpC08rqFvQbD1GPid+50+JsOiIqftDErt9Eg1Mg MqYivoFCvbAlxxxRh3+UHBd7ZpJLtp1UFEOl2Rf00OXO+ZclLCAQAsTczucIWK9M 8LCZjb7d4lPJv9RpXFAl3R1xvfc+Uy2ga5KeXvufZtc5G3aMUKPuIU7k28ZyblVS XXsXEYhjTSd0tgi3d0JlValrIreSuj0z2QGT5pVcC9utuAqAqRIlosiPmgPlzXjr Us4jXaUhOIPKI+Musv/fqrxsTQziT0jgVA3Njlt4cuAGm/EeUbLUkMWwKXjZLTsv vDsBhEQFmyZqxWu4pYo534VX2mQWTaKRV1SUVVhQEHm57b00EAiZohoOvweB09SR 4KiJapikoopmW4oAUFotUXUL1PM6yi+MXguTuc1SEYuLz/tCFtK8DJVwNpfnWZpE lZKvXyJnHq2Sgod/hEZq58PMvT6aNzTzSg7YzZy+VabxQGOO5mc= =M+mV -----END PGP SIGNATURE----- Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_() namespace, not the get_random_() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1	2022-10-16 15:27:07 -07:00
Linus Torvalds	5e714bf171	- Alistair Popple has a series which addresses a race which causes page refcounting errors in ZONE_DEVICE pages. - Peter Xu fixes some userfaultfd test harness instability. - Various other patches in MM, mainly fixes. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0j6igAKCRDdBJ7gKXxA jnGxAP99bV39ZtOsoY4OHdZlWU16BUjKuf/cb3bZlC2G849vEwD+OKlij86SG20j MGJQ6TfULJ8f1dnQDd6wvDfl3FMl7Qc= =tbdp -----END PGP SIGNATURE----- Merge tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - fix a race which causes page refcounting errors in ZONE_DEVICE pages (Alistair Popple) - fix userfaultfd test harness instability (Peter Xu) - various other patches in MM, mainly fixes * tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits) highmem: fix kmap_to_page() for kmap_local_page() addresses mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page mm/selftest: uffd: explain the write missing fault check mm/hugetlb: use hugetlb_pte_stable in migration race check mm/hugetlb: fix race condition of uffd missing/minor handling zram: always expose rw_page LoongArch: update local TLB if PTE entry exists mm: use update_mmu_tlb() on the second thread kasan: fix array-bounds warnings in tests hmm-tests: add test for migrate_device_range() nouveau/dmem: evict device private memory during release nouveau/dmem: refactor nouveau_dmem_fault_copy_one() mm/migrate_device.c: add migrate_device_range() mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page() mm/memremap.c: take a pgmap reference on page allocation mm: free device private pages have zero refcount mm/memory.c: fix race when faulting a device private page mm/damon: use damon_sz_region() in appropriate place mm/damon: move sz_damon_region to damon_sz_region lib/test_meminit: add checks for the allocation functions ...	2022-10-14 12:28:43 -07:00
Matthew Wilcox (Oracle)	4fa0e3ff21	ext4,f2fs: fix readahead of verity data The recent change of page_cache_ra_unbounded() arguments was buggy in the two callers, causing us to readahead the wrong pages. Move the definition of ractl down to after the index is set correctly. This affected performance on configurations that use fs-verity. Link: https://lkml.kernel.org/r/20221012193419.1453558-1-willy@infradead.org Fixes: `73bb49da50` ("mm/readahead: make page_cache_ra_unbounded take a readahead_control") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Jintao Yin <nicememory@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-10-12 18:51:48 -07:00
Jason A. Donenfeld	a251c17aa5	treewide: use get_random_u32() when possible The prandom_u32() function has been a deprecated inline wrapper around get_random_u32() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. The same also applies to get_random_int(), which is just a wrapper around get_random_u32(). This was done as a basic find and replace. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Acked-by: Helge Deller <deller@gmx.de> # for parisc Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-10-11 17:42:58 -06:00
Jason A. Donenfeld	81895a65ec	treewide: use prandom_u32_max() when possible, part 1 Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int\|prandom_u32\|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) \| - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) \| - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) \| - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int\|prandom_u32\|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int\|prandom_u32\|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 232 - 1 or value == 231 - 1 or value == 224 - 1 or value == 216 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-10-11 17:42:55 -06:00
Linus Torvalds	5d170fe435	f2fs-for-6.1-rc1 This round looks fairly small comparing to the previous updates which includes mostly minor bug fixes. Nevertheless, as we've still interested in improving the stability, Chao added some debugging methods to diagnoze subtle runtime inconsistency problem. Enhancement - store all the corruption or failure reasons in superblock - detect meta inode, summary info, and block address inconsistency - increase the limit for reserve_root for low-end devices - add the number of compressed IO in iostat Bug fix - DIO write fix for zoned devices - do out-of-place writes for cold files - fix some stat updates (FS_CP_DATA_IO, dirty page count) - fix race condition on setting FI_NO_EXTENT flag - fix data races when freezing super - fix wrong continue condition check in GC - do not allow ATGC for LFS mode In addition, there're some code enhancement and clean-ups as usual. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmNEVIkACgkQQBSofoJI UNL/Qg//eu7k196yIKflDZmp5aJbb5ybpFmh7XkPiqAV17ns+R2uLGq68BvTs+Tg rqCjB7j2kkBh1kN32R7aGcx6tcbHjWc94pi59YTGQ6+pwkop3KJxFHSwAaUw6y34 8NZwmsnrm9rv0A0QPhQPK19yWmG/2smUE9b/u7M3+20I1WANaxdS/vOKbZz/amOu f/BvsIIGS7Zzm9OpBCvGmq9Qpd83jlH6PuYGTC/OVbCrUiAJEmwN8wGsKP/9qB/5 KxVpdlh3vxulS6ixNbMu2qw9GBAQpAOz50+eDL5ZtGvGIQNHZRpGlfpJoW1lz0EO 4fJtpf5OMGqUbNaPCTG4qQGYAtKWA9YnFeWSS7RViQ6MryRXZMK8ka5eIe5Qblcf AXD/eU2gKzOu0fuvdBRCt/wTSb4gY8sMNhe4psDsZxfhaYIpX8Ee/XVa4d+Z4frg irN9gid1k3laMTx9dwJL8m7gIFvy3pak6l3B0bA69fAXd3faI40enuyfubFxnDet OuRNxj8j3J5C140ag5KOuBCRub2/aPaj9YSQqUstf64d8FzN/Ypn5iVPTs2DP/3D bcAFBwCS2+MCsk9+ra0WldZ5awdd6CRHDkvaYeDEuLCaLHUCo6CXe3aIyWawJBvJ RnghKNv82RIV+rQlI1/sg8lseoDnEZTp5iwDGw/qZ+ZUyn05apM= =aZ9y -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This round looks fairly small comparing to the previous updates and includes mostly minor bug fixes. Nevertheless, as we've still interested in improving the stability, Chao added some debugging methods to diagnoze subtle runtime inconsistency problem. Enhancements: - store all the corruption or failure reasons in superblock - detect meta inode, summary info, and block address inconsistency - increase the limit for reserve_root for low-end devices - add the number of compressed IO in iostat Bug fixes: - DIO write fix for zoned devices - do out-of-place writes for cold files - fix some stat updates (FS_CP_DATA_IO, dirty page count) - fix race condition on setting FI_NO_EXTENT flag - fix data races when freezing super - fix wrong continue condition check in GC - do not allow ATGC for LFS mode In addition, there're some code enhancement and clean-ups as usual" * tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits) f2fs: change to use atomic_t type form sbi.atomic_files f2fs: account swapfile inodes f2fs: allow direct read for zoned device f2fs: support recording errors into superblock f2fs: support recording stop_checkpoint reason into super_block f2fs: remove the unnecessary check in f2fs_xattr_fiemap f2fs: introduce cp_status sysfs entry f2fs: fix to detect corrupted meta ino f2fs: fix to account FS_CP_DATA_IO correctly f2fs: code clean and fix a type error f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file f2fs: fix to do sanity check on summary info f2fs: port to vfs{g,u}id_t and associated helpers f2fs: fix to do sanity check on destination blkaddr during recovery f2fs: let FI_OPU_WRITE override FADVISE_COLD_BIT f2fs: fix race condition on setting FI_NO_EXTENT flag f2fs: remove redundant check in f2fs_sanity_check_cluster f2fs: add static init_idisk_time function to reduce the code f2fs: fix typo f2fs: fix wrong dirty page count when race between mmap and fallocate. ...	2022-10-10 20:28:41 -07:00
Linus Torvalds	f721d24e5d	tmpfile API change -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY0DP2AAKCRBZ7Krx/gZQ 6/+qAQCEGQWpcC5MB17zylaX7gqzhgAsDrwtpevlno3aIv/1pQD/YWr/E8tf7WTW ERXRXMRx1cAzBJhUhVgIY+3ANfU2Rg4= =cko4 -----END PGP SIGNATURE----- Merge tag 'pull-tmpfile' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs tmpfile updates from Al Viro: "Miklos' ->tmpfile() signature change; pass an unopened struct file to it, let it open the damn thing. Allows to add tmpfile support to FUSE" * tag 'pull-tmpfile' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fuse: implement ->tmpfile() vfs: open inside ->tmpfile() vfs: move open right after ->tmpfile() vfs: make vfs_tmpfile() static ovl: use vfs_tmpfile_open() helper cachefiles: use vfs_tmpfile_open() helper cachefiles: only pass inode to *mark_inode_inuse() helpers cachefiles: tmpfile error handling cleanup hugetlbfs: cleanup mknod and tmpfile vfs: add vfs_tmpfile_open() helper	2022-10-10 19:45:17 -07:00
Chao Yu	b4dac1203f	f2fs: change to use atomic_t type form sbi.atomic_files inode_lock[ATOMIC_FILE] was used for protecting sbi->atomic_files, update atomic_files variable's type to atomic_t instead of unsigned int, then inode_lock[ATOMIC_FILE] can be obsoleted. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-07 12:57:26 -07:00
Chao Yu	8ec071c363	f2fs: account swapfile inodes In order to check count of opened swapfile inodes. Signed-off-by: Chao Yu <chao.yu@oppo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-07 12:56:45 -07:00
Jaegeuk Kim	689fe57e7e	f2fs: allow direct read for zoned device This reverts `dbf8e63f48` ("f2fs: remove device type check for direct IO"), and apply the below first version, since it contributed out-of-order DIO writes. For zoned devices, f2fs forbids direct IO and forces buffered IO to serialize write IOs. However, the constraint does not apply to read IOs. Cc: stable@vger.kernel.org Fixes: `dbf8e63f48` ("f2fs: remove device type check for direct IO") Signed-off-by: Eunhee Rho <eunhee83.rho@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-05 15:01:32 -07:00
Chao Yu	95fa90c9e5	f2fs: support recording errors into superblock This patch supports to record detail reason of FSCORRUPTED error into f2fs_super_block.s_errors[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:45 -07:00
Chao Yu	a9cfee0ef9	f2fs: support recording stop_checkpoint reason into super_block This patch supports to record stop_checkpoint error into f2fs_super_block.s_stop_reason[]. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Zhang Qilong	ca7efd71c3	f2fs: remove the unnecessary check in f2fs_xattr_fiemap Whehter or not error occurs, checking "err == 1" is unnecessary in f2fs_xattr_fiemap(), and just remove it here. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Chao Yu	718693c84d	f2fs: introduce cp_status sysfs entry This patch adds a new sysfs entry named cp_status, it can output checkpoint flags in real time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Chao Yu	fcc2d8cc96	f2fs: fix to detect corrupted meta ino It is possible that ino of dirent or orphan inode is corrupted in a fuzzed image, occasionally, if corrupted ino is equal to meta ino: meta_ino, node_ino or compress_ino, caller of f2fs_iget() from below call paths will get meta inode directly, it's not allowed, let's add sanity check to detect such cases. case #1 - recover_dentry - __f2fs_find_entry - f2fs_iget_retry case #2 - recover_orphan_inode - f2fs_iget_retry Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Chao Yu	d80afefb17	f2fs: fix to account FS_CP_DATA_IO correctly f2fs_inode_info.cp_task was introduced for FS_CP_DATA_IO accounting since commit `b0af6d491a` ("f2fs: add app/fs io stat"). However, cp_task usage coverage has been increased due to below commits: commit `040d2bb318` ("f2fs: fix to avoid deadloop if data_flush is on") commit `186857c5a1` ("f2fs: fix potential recursive call when enabling data_flush") So that, if data_flush mountoption is on, when data flush was triggered from background, the IO from data flush will be accounted as checkpoint IO type incorrectly. In order to fix this issue, this patch splits cp_task into two: a) cp_task: used for IO accounting b) wb_task: used to avoid deadlock Fixes: `040d2bb318` ("f2fs: fix to avoid deadloop if data_flush is on") Fixes: `186857c5a1` ("f2fs: fix potential recursive call when enabling data_flush") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Zhang Qilong	544b53dadc	f2fs: code clean and fix a type error ERROR: code indent should use tabs where possible ERROR: spaces required around that ':' ERROR: incorrect tab Found serveral code type errors when review the code and fix it. There is no function change. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:44 -07:00
Zhang Qilong	a834aa3ec9	f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file The trace_f2fs_update_extent_tree_range could not record compressed block length in the cluster of compress file and we just add it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Chao Yu	c6ad7fd166	f2fs: fix to do sanity check on summary info As Wenqing Liu reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216456 BUG: KASAN: use-after-free in recover_data+0x63ae/0x6ae0 [f2fs] Read of size 4 at addr ffff8881464dcd80 by task mount/1013 CPU: 3 PID: 1013 Comm: mount Tainted: G W 6.0.0-rc4 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Call Trace: dump_stack_lvl+0x45/0x5e print_report.cold+0xf3/0x68d kasan_report+0xa8/0x130 recover_data+0x63ae/0x6ae0 [f2fs] f2fs_recover_fsync_data+0x120d/0x1fc0 [f2fs] f2fs_fill_super+0x4665/0x61e0 [f2fs] mount_bdev+0x2cf/0x3b0 legacy_get_tree+0xed/0x1d0 vfs_get_tree+0x81/0x2b0 path_mount+0x47e/0x19d0 do_mount+0xce/0xf0 __x64_sys_mount+0x12c/0x1a0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd The root cause is: in fuzzed image, SSA table is corrupted: ofs_in_node is larger than ADDRS_PER_PAGE(), result in out-of-range access on 4k-size page. - recover_data - do_recover_data - check_index_in_prev_nodes - f2fs_data_blkaddr This patch adds sanity check on summary info in recovery and GC flow in where the flows rely on them. After patch: [ 29.310883] F2FS-fs (loop0): Inconsistent ofs_in_node:65286 in summary, ino:0, nid:6, max:1018 Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Christian Brauner	1e8a9191cc	f2fs: port to vfs{g,u}id_t and associated helpers A while ago we introduced a dedicated vfs{g,u}id_t type in commit `1e5267cd08` ("mnt_idmapping: add vfs{g,u}id_t"). We already switched over a good part of the VFS. Ultimately we will remove all legacy idmapped mount helpers that operate only on k{g,u}id_t in favor of the new type safe helpers that operate on vfs{g,u}id_t. Cc: Seth Forshee (Digital Ocean) <sforshee@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Chao Yu	0ef4ca04a3	f2fs: fix to do sanity check on destination blkaddr during recovery As Wenqing Liu reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216456 loop5: detected capacity change from 0 to 131072 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): recover_inode: ino = 6, name = hln, inline = 1 F2FS-fs (loop5): recover_data: ino = 6 (i_size: recover) err = 0 F2FS-fs (loop5): Bitmap was wrongly set, blk:5634 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 1013 at fs/f2fs/segment.c:2198 RIP: 0010:update_sit_entry+0xa55/0x10b0 [f2fs] Call Trace: <TASK> f2fs_do_replace_block+0xa98/0x1890 [f2fs] f2fs_replace_block+0xeb/0x180 [f2fs] recover_data+0x1a69/0x6ae0 [f2fs] f2fs_recover_fsync_data+0x120d/0x1fc0 [f2fs] f2fs_fill_super+0x4665/0x61e0 [f2fs] mount_bdev+0x2cf/0x3b0 legacy_get_tree+0xed/0x1d0 vfs_get_tree+0x81/0x2b0 path_mount+0x47e/0x19d0 do_mount+0xce/0xf0 __x64_sys_mount+0x12c/0x1a0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd If we enable CONFIG_F2FS_CHECK_FS config, it will trigger a kernel panic instead of warning. The root cause is: in fuzzed image, SIT table is inconsistent with inode mapping table, result in triggering such warning during SIT table update. This patch introduces a new flag DATA_GENERIC_ENHANCE_UPDATE, w/ this flag, data block recovery flow can check destination blkaddr's validation in SIT table, and skip f2fs_replace_block() to avoid inconsistent status. Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Weichao Guo	f3b23c785a	f2fs: let FI_OPU_WRITE override FADVISE_COLD_BIT Cold files may be fragmented due to SSR, defragment is needed as sequential reads are dominant scenarios of these files. FI_OPU_WRITE should override FADVISE_COLD_BIT to avoid defragment fails. Signed-off-by: Weichao Guo <guoweichao@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Zhang Qilong	07725adc55	f2fs: fix race condition on setting FI_NO_EXTENT flag The following scenarios exist. process A: process B: ->f2fs_drop_extent_tree ->f2fs_update_extent_cache_range ->f2fs_update_extent_tree_range ->write_lock ->set_inode_flag ->is_inode_flag_set ->__free_extent_tree // Shouldn't // have been // cleaned up // here ->write_lock In this case, the "FI_NO_EXTENT" flag is set between f2fs_update_extent_tree_range and is_inode_flag_set by other process. it leads to clearing the whole exten tree which should not have happened. And we fix it by move the setting it to the range of write_lock. Fixes:5f281fab9b9a3 ("f2fs: disable extent_cache for fcollapse/finsert inodes") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:43 -07:00
Zhang Qilong	9df6d6f9be	f2fs: remove redundant check in f2fs_sanity_check_cluster It have checked "compressed" at the entry of f2fs_sanity_check_cluster, just remove the redundant check for better performance here. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Zhang Qilong	049ea86cb5	f2fs: add static init_idisk_time function to reduce the code We can use a inner function to init the disk time of f2fs_inode_info for cleaning redundant code. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Yonggil Song	d382e36970	f2fs: fix typo Fix typo in f2fs.h Detected by Jaeyoon Choi Signed-off-by: Yonggil Song <yonggil.song@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Shuqi Zhang	9b7eadd9bd	f2fs: fix wrong dirty page count when race between mmap and fallocate. This is a BUG_ON issue as follows when running xfstest-generic-503: WARNING: CPU: 21 PID: 1385 at fs/f2fs/inode.c:762 f2fs_evict_inode+0x847/0xaa0 Modules linked in: CPU: 21 PID: 1385 Comm: umount Not tainted 5.19.0-rc5+ #73 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 Call Trace: evict+0x129/0x2d0 dispose_list+0x4f/0xb0 evict_inodes+0x204/0x230 generic_shutdown_super+0x5b/0x1e0 kill_block_super+0x29/0x80 kill_f2fs_super+0xe6/0x140 deactivate_locked_super+0x44/0xc0 deactivate_super+0x79/0x90 cleanup_mnt+0x114/0x1a0 __cleanup_mnt+0x16/0x20 task_work_run+0x98/0x100 exit_to_user_mode_prepare+0x3d0/0x3e0 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Function flow analysis when BUG occurs: f2fs_fallocate mmap do_page_fault pte_spinlock // ---lock_pte do_wp_page wp_page_shared pte_unmap_unlock // unlock_pte do_page_mkwrite f2fs_vm_page_mkwrite down_read(invalidate_lock) lock_page if (PageMappedToDisk(page)) goto out; // set_page_dirty --NOT RUN out: up_read(invalidate_lock); finish_mkwrite_fault // unlock_pte f2fs_collapse_range down_write(i_mmap_sem) truncate_pagecache unmap_mapping_pages i_mmap_lock_write // down_write(i_mmap_rwsem) ...... zap_pte_range pte_offset_map_lock // ---lock_pte set_page_dirty f2fs_dirty_data_folio if (!folio_test_dirty(folio)) { fault_dirty_shared_page set_page_dirty f2fs_dirty_data_folio if (!folio_test_dirty(folio)) { filemap_dirty_folio f2fs_update_dirty_folio // ++ } unlock_page filemap_dirty_folio f2fs_update_dirty_folio // page count++ } pte_unmap_unlock // --unlock_pte i_mmap_unlock_write // up_write(i_mmap_rwsem) truncate_inode_pages up_write(i_mmap_sem) When race happens between mmap-do_page_fault-wp_page_shared and fallocate-truncate_pagecache-zap_pte_range, the zap_pte_range calls function set_page_dirty without page lock. Besides, though truncate_pagecache has immap and pte lock, wp_page_shared calls fault_dirty_shared_page without any. In this case, two threads race in f2fs_dirty_data_folio function. Page is set to dirty only ONCE, but the count is added TWICE by calling filemap_dirty_folio. Thus the count of dirty page cannot accord with the real dirty pages. Following is the solution to in case of race happens without any lock. Since folio_test_set_dirty in filemap_dirty_folio is atomic, judge return value will not be at risk of race. Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Zhang Qilong	173cdf2c32	f2fs: use COMPRESS_MAPPING to get compress cache mapping Just use the defined COMPRESS_MAPPING to get compress cache mapping instaed of direct accessing name. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Zhang Qilong	280dfeae56	f2fs: return the tmp_ptr directly in __bitmap_ptr Just return tmp_ptr here, it's no need to dereference checkpoint pointer again. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-10-04 13:31:42 -07:00
Linus Torvalds	725737e7c2	STATX_DIOALIGN for 6.1 Make statx() support reporting direct I/O (DIO) alignment information. This provides a generic interface for userspace programs to determine whether a file supports DIO, and if so with what alignment restrictions. Specifically, STATX_DIOALIGN works on block devices, and on regular files when their containing filesystem has implemented support. An interface like this has been requested for years, since the conditions for when DIO is supported in Linux have gotten increasingly complex over time. Today, DIO support and alignment requirements can be affected by various filesystem features such as multi-device support, data journalling, inline data, encryption, verity, compression, checkpoint disabling, log-structured mode, etc. Further complicating things, Linux v6.0 relaxed the traditional rule of DIO needing to be aligned to the block device's logical block size; now user buffers (but not file offsets) only need to be aligned to the DMA alignment. The approach of uplifting the XFS specific ioctl XFS_IOC_DIOINFO was discarded in favor of creating a clean new interface with statx(). For more information, see the individual commits and the man page update https://lore.kernel.org/r/20220722074229.148925-1-ebiggers@kernel.org. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYzpV2xQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKwF1AQDetPX5hyuq0/mwikOywLTTJsoHgGY5 euO+dISqjH/InwD9HAQqfPRkdM1j4ml82BjjkAfrhzZXOOWPKJm0zOhMIQg= =0Oav -----END PGP SIGNATURE----- Merge tag 'statx-dioalign-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull STATX_DIOALIGN support from Eric Biggers: "Make statx() support reporting direct I/O (DIO) alignment information. This provides a generic interface for userspace programs to determine whether a file supports DIO, and if so with what alignment restrictions. Specifically, STATX_DIOALIGN works on block devices, and on regular files when their containing filesystem has implemented support. An interface like this has been requested for years, since the conditions for when DIO is supported in Linux have gotten increasingly complex over time. Today, DIO support and alignment requirements can be affected by various filesystem features such as multi-device support, data journalling, inline data, encryption, verity, compression, checkpoint disabling, log-structured mode, etc. Further complicating things, Linux v6.0 relaxed the traditional rule of DIO needing to be aligned to the block device's logical block size; now user buffers (but not file offsets) only need to be aligned to the DMA alignment. The approach of uplifting the XFS specific ioctl XFS_IOC_DIOINFO was discarded in favor of creating a clean new interface with statx(). For more information, see the individual commits and the man page update[1]" Link: https://lore.kernel.org/r/20220722074229.148925-1-ebiggers@kernel.org [1] * tag 'statx-dioalign-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: xfs: support STATX_DIOALIGN f2fs: support STATX_DIOALIGN f2fs: simplify f2fs_force_buffered_io() f2fs: move f2fs_force_buffered_io() into file.c ext4: support STATX_DIOALIGN fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN vfs: support STATX_DIOALIGN on block devices statx: add direct I/O alignment information	2022-10-03 20:33:41 -07:00
Miklos Szeredi	863f144f12	vfs: open inside ->tmpfile() This is in preparation for adding tmpfile support to fuse, which requires that the tmpfile creation and opening are done as a single operation. Replace the 'struct dentry ' argument of i_op->tmpfile with 'struct file '. Call finish_open_simple() as the last thing in ->tmpfile() instances (may be omitted in the error case). Change d_tmpfile() argument to 'struct file *' as well to make callers more readable. Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2022-09-24 07:00:00 +02:00
Christoph Hellwig	0e91fc1e0f	fscrypt: work on block_devices instead of request_queues request_queues are a block layer implementation detail that should not leak into file systems. Change the fscrypt inline crypto code to retrieve block devices instead of request_queues from the file system. As part of that, clean up the interaction with multi-device file systems by returning both the number of devices and the actual device array in a single method call. Signed-off-by: Christoph Hellwig <hch@lst.de> [ebiggers: bug fixes and minor tweaks] Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220901193208.138056-4-ebiggers@kernel.org	2022-09-21 20:33:06 -07:00
Zhang Qilong	8140654e78	f2fs: simplify code in f2fs_prepare_decomp_mem It could return directly after init_decompress_ctx. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-09-12 23:08:25 -07:00
Zhang Qilong	ddd3b16c8c	f2fs: replace logical value "true" with a int number The "true" is not match the parametera type "int", and we modify it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-09-12 23:08:25 -07:00
Jaegeuk Kim	da35fe96d1	f2fs: increase the limit for reserve_root This patch increases the threshold that limits the reserved root space from 0.2% to 12.5% by using simple shift operation. Typically Android sets 128MB, but if the storage capacity is 32GB, 0.2% which is around 64MB becomes too small. Let's relax it. Cc: stable@vger.kernel.org Reported-by: Aran Dalton <arda@allwinnertech.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-09-12 23:08:21 -07:00
Jaegeuk Kim	4f99484d27	f2fs: complete checkpoints during remount Otherwise, pending checkpoints can contribute a race condition to give a quota warning. - Thread - checkpoint thread add checkpoints to the list do_remount() down_write(&sb->s_umount); f2fs_remount() block_operations() down_read_trylock(&sb->s_umount) = 0 up_write(&sb->s_umount); f2fs_quota_sync() dquot_writeback_dquots() WARN_ON_ONCE(!rwsem_is_locked(&sb->s_umount)); Or, do_remount() down_write(&sb->s_umount); f2fs_remount() create a ckpt thread f2fs_enable_checkpoint() adds checkpoints wait for f2fs_sync_fs() trigger another pending checkpoint block_operations() down_read_trylock(&sb->s_umount) = 0 up_write(&sb->s_umount); f2fs_quota_sync() dquot_writeback_dquots() WARN_ON_ONCE(!rwsem_is_locked(&sb->s_umount)); Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-09-12 23:07:20 -07:00
Jaegeuk Kim	c7b5857637	f2fs: flush pending checkpoints when freezing super This avoids -EINVAL when trying to freeze f2fs. Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-09-12 23:07:12 -07:00
Eric Biggers	c8c02272a9	f2fs: support STATX_DIOALIGN Add support for STATX_DIOALIGN to f2fs, so that direct I/O alignment restrictions are exposed to userspace in a generic way. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Link: https://lore.kernel.org/r/20220827065851.135710-8-ebiggers@kernel.org	2022-09-11 19:47:12 -05:00
Eric Biggers	bd36732931	f2fs: simplify f2fs_force_buffered_io() f2fs only allows direct I/O that is aligned to the filesystem block size. Given that fact, simplify f2fs_force_buffered_io() by removing the redundant call to block_unaligned_IO(). This makes it easier to reuse this code for STATX_DIOALIGN. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Link: https://lore.kernel.org/r/20220827065851.135710-7-ebiggers@kernel.org	2022-09-11 19:47:12 -05:00
Eric Biggers	2db0487faa	f2fs: move f2fs_force_buffered_io() into file.c f2fs_force_buffered_io() is only used in file.c, so move it into there. No behavior change. This makes it easier to review later patches. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Link: https://lore.kernel.org/r/20220827065851.135710-6-ebiggers@kernel.org	2022-09-11 19:47:12 -05:00
Eric Biggers	53dd3f802a	fscrypt: change fscrypt_dio_supported() to prepare for STATX_DIOALIGN To prepare for STATX_DIOALIGN support, make two changes to fscrypt_dio_supported(). First, remove the filesystem-block-alignment check and make the filesystems handle it instead. It previously made sense to have it in fs/crypto/; however, to support STATX_DIOALIGN the alignment restriction would have to be returned to filesystems. It ends up being simpler if filesystems handle this part themselves, especially for f2fs which only allows fs-block-aligned DIO in the first place. Second, make fscrypt_dio_supported() work on inodes whose encryption key hasn't been set up yet, by making it set up the key if needed. This is required for statx(), since statx() doesn't require a file descriptor. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20220827065851.135710-4-ebiggers@kernel.org	2022-09-11 19:47:12 -05:00

1 2 3 4 5 ...

3557 Commits