It turns out that I misunderstood the difference between the attr and
attr2 feature bits. "attr" means that at some point an attr fork was
created somewhere in the filesystem. "attr2" means that inodes have
variable-sized forks, but says nothing about whether or not there
actually /are/ attr forks in the system.
If we have an attr fork, we only need to check that attr is set.
Fixes: 99d9d8d05d ("xfs: scrub inode block mappings")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Replace a comma between expression statements by a semicolon.
Fixes: 178b48d588 ("xfs: remove the for_each_xbitmap_ helpers")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Since file_path() takes the output buffer as one of its arguments, we
might as well have it format directly into the tracepoint's char array
instead of wasting stack space.
Fixes: 3934e8ebb7 ("xfs: create a big array data structure")
Fixes: 5076a6040c ("xfs: support in-memory buffer cache targets")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403290419.HPcyvqZu-lkp@intel.com/
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
kmemleak reported that we don't free the parent pointer names here if we
found corruption.
Fixes: 0d29a20fbd ("xfs: scrub parent pointers")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Convert the boolean to skip discard on free into a proper flags field so
that we can add more flags in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Move the directory entry update hook code to xfs_dir2 so that it is
mostly consolidated with the higher level directory functions. Retain
the exports so that online fsck can still send notifications through the
hooks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
INIT_XATTRS is overloaded here -- it's set during the creat process when
we think that we're immediately going to set some ACL xattrs to save
time. However, it's also used by the parent pointers code to enable the
attr fork in preparation to receive ppptr xattrs. This results in
xfs_has_parent() branches scattered around the codebase to turn on
INIT_XATTRS.
Linkable files are created far more commonly than unlinkable temporary
files or directory tree roots, so we should centralize this logic in
xfs_inode_init. For the three callers that don't want parent pointers
(online repiar tempfiles, unlinkable tempfiles, rootdir creation) we
provide an UNLINKABLE flag to skip attr fork initialization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Create a helper that calls dqalloc to allocate and grab a reference to
dquots for the user, group, and project ids listed in an icreate
structure. This simplifies the creat-related dqalloc callsites
scattered around the code base.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Callers that want to create an inode currently pass all possible file
attribute values for the new inode into xfs_init_new_inode as ten
separate parameters. This causes two code maintenance issues: first, we
have large multi-line call sites which programmers must read carefully
to make sure they did not accidentally invert a value. Second, all
three file id parameters must be passed separately to the quota
functions; any discrepancy results in quota count errors.
Clean this up by creating a new icreate_args structure to hold all this
information, some helpers to initialize them properly, and make the
callers pass this structure through to the creation function, whose name
we shorten to xfs_icreate. This eliminates the issues, enables us to
keep the inode init code in sync with userspace via libxfs, and is
needed for future metadata directory tree management.
(A subsequent cleanup will also fix the quota alloc calls.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Following warning is reported, so remove these duplicated header
including:
./fs/xfs/libxfs/xfs_trans_resv.c: xfs_da_format.h is included more than once.
./fs/xfs/scrub/quota_repair.c: xfs_format.h is included more than once.
./fs/xfs/xfs_handle.c: xfs_da_btree.h is included more than once.
./fs/xfs/xfs_qm_bhv.c: xfs_mount.h is included more than once.
./fs/xfs/xfs_trace.c: xfs_bmap.h is included more than once.
This is just a clean code, no logic changed.
Signed-off-by: Wenchao Hao <haowenchao22@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Don't open-code what the kernel already provides.
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Chandan Babu reports the following livelock in xfs/708:
run fstests xfs/708 at 2024-05-04 15:35:29
XFS (loop16): EXPERIMENTAL online scrub feature in use. Use at your own risk!
XFS (loop5): Mounting V5 Filesystem e96086f0-a2f9-4424-a1d5-c75d53d823be
XFS (loop5): Ending clean mount
XFS (loop5): Quotacheck needed: Please wait.
XFS (loop5): Quotacheck: Done.
XFS (loop5): EXPERIMENTAL online scrub feature in use. Use at your own risk!
INFO: task xfs_io:143725 blocked for more than 122 seconds.
Not tainted 6.9.0-rc4+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:xfs_io state:D stack:0 pid:143725 tgid:143725 ppid:117661 flags:0x00004006
Call Trace:
<TASK>
__schedule+0x69c/0x17a0
schedule+0x74/0x1b0
io_schedule+0xc4/0x140
folio_wait_bit_common+0x254/0x650
shmem_undo_range+0x9d5/0xb40
shmem_evict_inode+0x322/0x8f0
evict+0x24e/0x560
__dentry_kill+0x17d/0x4d0
dput+0x263/0x430
__fput+0x2fc/0xaa0
task_work_run+0x132/0x210
get_signal+0x1a8/0x1910
arch_do_signal_or_restart+0x7b/0x2f0
syscall_exit_to_user_mode+0x1c2/0x200
do_syscall_64+0x72/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The shmem code is trying to drop all the folios attached to a shmem
file and gets stuck on a locked folio after a bnobt repair. It looks
like the process has a signal pending, so I started looking for places
where we lock an xfile folio and then deal with a fatal signal.
I found a bug in xfarray_sort_scan via code inspection. This function
is called to set up the scanning phase of a quicksort operation, which
may involve grabbing a locked xfile folio. If we exit the function with
an error code, the caller does not call xfarray_sort_scan_done to put
the xfile folio. If _sort_scan returns an error code while si->folio is
set, we leak the reference and never unlock the folio.
Therefore, change xfarray_sort to call _scan_done on exit. This is safe
to call multiple times because it sets si->folio to NULL and ignores a
NULL si->folio. Also change _sort_scan to use an intermediate variable
so that we never pollute si->folio with an errptr.
Fixes: 232ea05277 ("xfs: enable sorting of xfile-backed arrays")
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.
This means that with:
__string(field, mystring)
Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.
There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:
git grep -l __assign_str | while read a ; do
sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
mv /tmp/test-file $a;
done
I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.
Note, the same updates will need to be done for:
__assign_str_len()
__assign_rel_str()
__assign_rel_str_len()
I tested this with both an allmodconfig and an allyesconfig (build only for both).
[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
Create a helper function to compute the number of fsblocks needed to
store a maximally-sized extended attribute value.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
xfs_bmapi_write can return 0 without actually returning a mapping in
mval in two different cases:
1) when there is absolutely no space available to do an allocation
2) when converting delalloc space, and the allocation is so small
that it only covers parts of the delalloc extent before the
range requested by the caller
Callers at best can handle one of these cases, but in many cases can't
cope with either one. Switch xfs_bmapi_write to always return a
mapping or return an error code instead. For case 1) above ENOSPC is
the obvious choice which is very much what the callers expect anyway.
For case 2) there is no really good error code, so pick a funky one
from the SysV streams portfolio.
This fixes the reproducer here:
https://lore.kernel.org/linux-xfs/CAEJPjCvT3Uag-pMTYuigEjWZHn1sGMZ0GCjVVCv29tNHK76Cgg@mail.gmail.com0/
which uses reserved blocks to create file systems that are gravely
out of space and thus cause at least xfs_file_alloc_space to hang
and trigger the lack of ENOSPC handling in xfs_dquot_disk_alloc.
Note that this patch does not actually make any caller but
xfs_alloc_file_space deal intelligently with case 2) above.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: 刘通 <lyutoon@gmail.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Add a new enum and a xfs_dir2_format helper that returns it to allow
the code to switch on the format of a directory in a single operation
and switch all helpers of xfs_dir2_isblock and xfs_dir2_isleaf to it.
This also removes the explicit xfs_iread_extents call in a few of the
call sites given that xfs_bmap_last_offset already takes care of it
underneath.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Add a helper to switch between the different directory formats for
removing a directory entry.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Add a helper to switch between the different directory formats for
removing a directory entry.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Add a helper to switch between the different directory formats for
creating a directory entry and to handle the XFS_DA_OP_JUSTCHECK flag
based on the passed in ino number field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Add a helper to switch between the different directory formats for
lookup and to handle the -EEXIST return for a successful lookup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
The function are defined in the dir_repair.c file, but not called
elsewhere, so delete the unused function.
fs/xfs/scrub/dir_repair.c:186:1: warning: unused function 'xrep_dir_self_parent'.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8867
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Invalidate the cached dentries that point to the file that we're moving
to lost+found before we actually move it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The atomic file exchange-range functionality is now a permanent
filesystem feature instead of a dynamic log-incompat feature. It cannot
be turned on at runtime, so we no longer need the XCHK_FSGATES flags and
whatnot that supported it. Remove the flag and the enable function, and
move the xfs_has_exchange_range checks to the start of the repair
functions.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If the transaction allocation in xrep_adoption_trans_alloc fails, we
should drop only the locks that we took. In this case this is
ILOCK_EXCL of both the orphanage and the file being repaired. Dropping
any IOLOCK here is incorrect.
Found by fuzzing u3.sfdir3.list[1].name = zeroes in xfs/1546.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If the transaction allocation in the !orphanage_available case of
xrep_nlinks_repair_inode fails, we need to drop the IOLOCK of the file
being scrubbed before exiting.
Found by fuzzing u3.sfdir3.list[1].name = zeroes in xfs/1546.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If a program wants us to perform a scrub on a file handle and the fd
passed to ioctl() is not the file referenced in the handle, iget the
file once and pass it into the scrub code. This amortizes the untrusted
iget lookup over /all/ the scrubbers mentioned in the scrubv call.
When running fstests in "rebuild all metadata after each test" mode, I
observed a 10% reduction in runtime on account of avoiding repeated
inobt lookups.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Back when I wrote commit a03297a0ca, I had thought that we'd be doing
users a favor by only marking inodes dontcache at the end of a scrub
operation, and only if there's only one reference to that inode. This
was more or less true back when I_DONTCACHE was an XFS iflag and the
only thing it did was change the outcome of xfs_fs_drop_inode to 1.
Note: If there are dentries pointing to the inode when scrub finishes,
the inode will have positive i_count and stay around in cache until
dentry reclaim.
But now we have d_mark_dontcache, which cause the inode *and* the
dentries attached to it all to be marked I_DONTCACHE, which means that
we drop the dentries ASAP, which drops the inode ASAP.
This is bad if scrub found problems with the inode, because now they can
be scheduled for inactivation, which can cause inodegc to trip on it and
shut down the filesystem.
Even if the inode isn't bad, this is still suboptimal because phases 3-7
each initiate inode scans. Dropping the inode immediately during phase
3 is silly because phase 5 will reload it and drop it immediately, etc.
It's fine to mark the inodes dontcache, but if there have been accesses
to the file that set up dentries, we should keep them.
I validated this by setting up ftrace to capture xfs_iget_recycle*
tracepoints and ran xfs/285 for 30 seconds. With current djwong-wtf I
saw ~30,000 recycle events. I then dropped the d_mark_dontcache calls
and set XFS_IGET_DONTCACHE, and the recycle events dropped to ~5,000 per
30 seconds.
Therefore, grab the inode with XFS_IGET_DONTCACHE, which only has the
effect of setting I_DONTCACHE for cache misses. Remove the
d_mark_dontcache call that can happen in xchk_irele.
Fixes: a03297a0ca ("xfs: manage inode DONTCACHE status at irele time")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Introduce a variant on XFS_SCRUB_METADATA that allows for a vectored
mode. The caller specifies the principal metadata object that they want
to scrub (allocation group, inode, etc.) once, followed by an array of
scrub types they want called on that object. The kernel runs the scrub
operations and writes the output flags and errno code to the
corresponding array element.
A new pseudo scrub type BARRIER is introduced to force the kernel to
return to userspace if any corruptions have been found when scrubbing
the previous scrub types in the array. This enables userspace to
schedule, for example, the sequence:
1. data fork
2. barrier
3. directory
If the data fork scrub is clean, then the kernel will perform the
directory scrub. If not, the barrier in 2 will exit back to userspace.
The alternative would have been an interface where userspace passes a
pointer to an empty buffer, and the kernel formats that with
xfs_scrub_vecs that tell userspace what it scrubbed and what the outcome
was. With that the kernel would have to communicate that the buffer
needed to have been at least X size, even though for our cases
XFS_SCRUB_TYPE_NR + 2 would always be enough.
Compared to that, this design keeps all the dependency policy and
ordering logic in userspace where it already resides instead of
duplicating it in the kernel. The downside of that is that it needs the
barrier logic.
When running fstests in "rebuild all metadata after each test" mode, I
observed a 10% reduction in runtime due to fewer transitions across the
system call boundary.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Move the scrub ioctl handler to scrub.c to keep the code together and to
reduce unnecessary code when CONFIG_XFS_ONLINE_SCRUB=n.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
We really don't want to call cond_resched every single time we go
through a loop in scrub -- there may be billions of records, and probing
into the scheduler itself has overhead. Reduce this overhead by only
calling cond_resched 10x per second; and add a counter so that we only
check jiffies once every 1000 records or so.
Surprisingly, this reduces scrub-only fstests runtime by about 2%. I
used the bmapinflate xfs_db command to produce a billion-extent file and
this stupid gadget reduced the scrub runtime by about 4%.
From a stupid microbenchmark of calling these things 1 billion times, I
estimate that cond_resched costs about 5.5ns per call; jiffes costs
about 0.3ns per read; and fatal_signal_pending costs about 0.4ns per
call.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Repair corruptions in the directory tree itself. Cycles are broken by
removing an incoming parent->child link. Multiply-owned directories are
fixed by pruning the extra parent -> child links Disconnected subtrees
are reconnected to the lost and found.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Report directories that are the source of corruption in the directory
tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Add a dirent update hook so that we can detect directory tree updates
that affect any of the paths found by this scrubber and force it to
rescan.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Create a new scrubber that detects corruptions within the directory tree
structure itself. It can detect directories with multiple parents;
loops within the directory tree; and directory loops not accessible from
the root.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The runtime parent pointer update code expects that any file being moved
around the directory tree already has an attr fork. However, if we had
to rebuild an inode core record, there's a chance that we zeroed forkoff
as part of the inode to pass the iget verifiers.
Therefore, if we performed any repairs on an inode core, ensure that the
inode has a nonzero forkoff before unlocking the inode.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Since the parent pointer scrubber does not exhaustively search the
filesystem for missing parent pointers, it doesn't have a good way to
determine that there are pointers missing from an otherwise uncorrupt
xattr structure. Instead, for nondirectories it employs a heuristic of
comparing the file link count to the number of parent pointers found.
However, we don't want this heuristic flagging a false corruption after
a repair has actually scanned the entire filesystem to rebuild the
parent pointers. Therefore, reset the file link count in this one case
because we actually know the correct link count.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Adapt the orphanage's adoption code to update the child file's parent
pointers as part of the reparenting process. Also ensure that the child
has an attr fork to receive the parent pointer update, since the runtime
code assumes one exists.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Once we've assembled all the parent pointers for a file, we need to
commit the new dataset atomically to that file. Parent pointer records
are embedded in the xattr structure, which means that we must write a
new extended attribute structure, again, atomically. Therefore, we must
copy the non-parent-pointer attributes from the file being repaired into
the temporary file's extended attributes and then call the atomic extent
swap mechanism to exchange the blocks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Add a second callback function to xchk_xattr_walk so that we can do
something in between attr leaf blocks. This will be used by the next
patch to see if we should flush cached parent pointer updates to
constrain memory usage.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
While we're scanning the filesystem for dirents that we can turn into
parent pointers, we cannot hold the IOLOCK or ILOCK of the file being
repaired. Therefore, we need to set up a dirent hook so that we can
keep the temporary file's parent pionters up to date with the rest of
the filesystem. Hence we add the ability to *remove* pptrs from the
temporary file.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If parent pointers are enabled on the filesystem, we can repair the
entire dataset by walking the directories of the filesystem looking for
dirents that we can turn into parent pointers. Once we have a full
incore dataset, we'll figure out what to do with it, but that's for a
subsequent patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
There are a few places where the extended attribute repair code drops
the ILOCK to apply stashed xattrs to the temporary file. Although
setxattr and removexattr are still locked out because we retain our hold
on the IOLOCK, this doesn't prevent renames from updating parent
pointers, because the VFS doesn't take i_rwsem on children that are
being moved.
Therefore, set up a dirent hook to capture parent pointer updates for
this file, and replay(?) the updates.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
While we're scanning the filesystem for parent pointers that we can turn
into dirents, we cannot hold the IOLOCK or ILOCK of the directory being
repaired. Therefore, we need to set up a dirent hook so that we can
keep the temporary directory up to date with the rest of the filesystem.
Hence we add the ability to *remove* entries from the temporary dir.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
For filesystems with parent pointers, scan the entire filesystem looking
for parent pointers that target the directory we're rebuilding instead
of trying to salvage whatever we can from the directory data blocks.
This will be more robust than salvaging, but there's more code to come.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
When we're salvaging extended attributes, make sure we validate the ones
that claim to be parent pointers before adding them to the salvage pile.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Make the use of reserved blocks an explicit parameter to xfs_attr_set.
Userspace setting XFS_ATTR_ROOT attrs should continue to be able to use
it, but for online repairs we can back out and therefore do not care.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
In preparation for online/offline repair wanting to use xfs_attr_set,
move some of the boilerplate out of this function into the callers.
Repair can initialize the da_args completely, and the userspace flag
handling/twisting goes away once we move it to xfs_attr_change.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Check parent pointer xattrs as part of scrubbing xattrs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If the filesystem has parent pointers enabled, walk the parent pointers
of subdirectories to determine the true backref count. In theory each
subdir should have a single parent reachable via dotdot, but in the case
of (corrupt) subdirs with multiple parents, we need to keep the link
counts high enough that the directory loop detector will be able to
correct the multiple parents problems.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
If the trylock-based dirent check fails, retain those parent pointers
and check them at the end. This may involve dropping the locks on the
file being scanned, so yay.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>