Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.
Fixes: 5612daafb7 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes two different bugs:
- Looser locking with the rhashtable means we need to recheck if the
inode is still hashed after prepare_to_wait(), and add a corresponding
wakeup after removing from the hash table.
- da18ecbf0f ("fs: add i_state helpers") changed the bit waitqueues
used for inodes, and bcachefs wasn't updated and thus broke; this
updates bcachefs to the new helper.
Fixes: 112d21fd1a ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We are only applying JSET_ENTRY_TYPE_write_buffer_keys; the revert path
was missed.
Fixes: a3581ca35d ("bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We're returning an error code now, not a bool.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Builds on big endian systems fail as follows.
fs/bcachefs/bkey.h: In function 'bch2_bkey_format_add_key':
fs/bcachefs/bkey.h:557:41: error:
'const struct bkey' has no member named 'bversion'
The original commit only renamed the variable for little endian builds.
Rename it for big endian builds as well to fix the problem.
Fixes: cf49f8a8c2 ("bcachefs: rename version -> bversion")
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Assorted minor syzbot fixes, and for bigger stuff:
- Fix two disk accounting rewrite bugs
- Disk accounting keys use the version field of bkey so that journal
replay can tell which updates have been applied to the btree. This is
set in the transaction commit path, after we've gotten our journal
reservation (and our time ordering), but the
BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay uses
was incorrectly skipping this for new updates generated prior to
journal replay.
This fixes the underlying cause of an assertion pop in
disk_accounting_read.
- A couple fixes for disk accounting + device removal. Checking if
accounting replicas entries were marked in the superblock was being
done at the wrong point, when deltas in the journal could still zero
them out, and then additionally we'd try to add a missing replicas
entry to the superblock without checking if it referred to an invalid
(removed) device.
- A whole slew of repair fixes
- fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
an infinite loop when repairing a filesystem with many snapshots
- fix incorrect transaction restart handling leading to occasional
"fsck counted ..." warnings
- fix warning in __bch2_fsck_err() for bkey fsck errors
- check_inode() in fsck now correctly checks if the filesystem was
clean
- there shouldn't be pending logged ops if the fs was clean, we now
check for this
- remove_backpointer() doesn't remove a dirent that doesn't actually
point to the inode
- many more fsck errors are AUTOFIX
Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"Assorted minor syzbot fixes, and for bigger stuff:
Fix two disk accounting rewrite bugs:
- Disk accounting keys use the version field of bkey so that journal
replay can tell which updates have been applied to the btree.
This is set in the transaction commit path, after we've gotten our
journal reservation (and our time ordering), but the
BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
uses was incorrectly skipping this for new updates generated prior
to journal replay.
This fixes the underlying cause of an assertion pop in
disk_accounting_read.
- A couple of fixes for disk accounting + device removal.
Checking if accounting replicas entries were marked in the
superblock was being done at the wrong point, when deltas in the
journal could still zero them out, and then additionally we'd try
to add a missing replicas entry to the superblock without checking
if it referred to an invalid (removed) device.
A whole slew of repair fixes:
- fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
an infinite loop when repairing a filesystem with many snapshots
- fix incorrect transaction restart handling leading to occasional
"fsck counted ..." warnings
- fix warning in __bch2_fsck_err() for bkey fsck errors
- check_inode() in fsck now correctly checks if the filesystem was
clean
- there shouldn't be pending logged ops if the fs was clean, we now
check for this
- remove_backpointer() doesn't remove a dirent that doesn't actually
point to the inode
- many more fsck errors are AUTOFIX"
* tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
bcachefs: check_subvol_path() now prints subvol root inode
bcachefs: remove_backpointer() now checks if dirent points to inode
bcachefs: dirent_points_to_inode() now warns on mismatch
bcachefs: Fix lost wake up
bcachefs: Check for logged ops when clean
bcachefs: BCH_FS_clean_recovery
bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
bcachefs: Check for accounting keys with bversion=0
bcachefs: rename version -> bversion
bcachefs: Don't delete unlinked inodes before logged op resume
bcachefs: Fix BCH_SB_ERRS() so we can reorder
bcachefs: Fix fsck warnings from bkey validation
bcachefs: Move transaction commit path validation to as late as possible
bcachefs: Fix disk accounting attempting to mark invalid replicas entry
bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
bcachefs: Fix accounting read + device removal
bcachefs: bch_accounting_mode
bcachefs: fix transaction restart handling in check_extents(), check_dirents()
bcachefs: kill inode_walker_entry.seen_this_pos
...
cleanups.
Merge tag 'ceph-for-6.12-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"Three CephFS fixes from Xiubo and Luis and a bunch of assorted
cleanups"
* tag 'ceph-for-6.12-rc1' of https://github.com/ceph/ceph-client:
ceph: remove the incorrect Fw reference check when dirtying pages
ceph: Remove empty definition in header file
ceph: Fix typo in the comment
ceph: fix a memory leak on cap_auths in MDS client
ceph: flush all caps releases when syncing the whole filesystem
ceph: rename ceph_flush_cap_releases() to ceph_flush_session_cap_releases()
libceph: use min() to simplify code in ceph_dns_resolve_name()
ceph: Convert to use jiffies macro
ceph: Remove unused declarations
Merge tag 'v6.12-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- fix querying dentry for char/block special files
- small cleanup patches
* tag 'v6.12-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Correct typos in multiple comments across various files
ksmbd: fix open failure from block and char device file
ksmbd: remove unsafe_memcpy use in session setup
ksmbd: Replace one-element arrays with flexible-array members
ksmbd: fix warning: comparison of distinct pointer types lacks a cast
Merge tag '6.12rc-more-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Noisy log message cleanup
- Important netfs fix for cifs crash in generic/074
- Three minor improvements to use of hashing (multichannel and mount
improvements)
- Fix decryption crash for large read with small esize
* tag '6.12rc-more-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: make SHA-512 TFM ephemeral
smb: client: make HMAC-MD5 TFM ephemeral
smb: client: stop flooding dmesg in smb2_calc_signature()
smb: client: allocate crypto only for primary server
smb: client: fix UAF in async decryption
netfs: Fix write oops in generic/346 (9p) and generic/074 (cifs)
If an inode backpointer points to a dirent that doesn't point back,
that's an error we should warn about.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
If the reader acquires the read lock and then the writer enters the slow
path, while the reader proceeds to the unlock path, the following scenario
can occur without the change:
writer: pcpu_read_count(lock) returns 1 (so __do_six_trylock will return 0)
reader: this_cpu_dec(*lock->readers)
reader: smp_mb()
reader: state = atomic_read(&lock->state) (there is no waiting flag set)
writer: six_set_bitmask()
then the writer will sleep forever.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add a filesystem flag to indicate whether we did a clean recovery -
using c->sb.clean after we've got rw is incorrect, since c->sb is
updated whenever we write the superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
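
The reasoning above (a live superblock field stops describing how we
mounted once the superblock is rewritten) can be illustrated with a small
userspace sketch. All names here (demo_fs, demo_mount, DEMO_FS_clean_recovery
and so on) are hypothetical stand-ins, not the bcachefs definitions; the
only point is that the flag is latched once at mount time and survives
later superblock updates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the superblock and filesystem objects. */
struct demo_sb { bool clean; };
struct demo_fs {
	struct demo_sb	sb;	/* rewritten whenever the superblock is written */
	unsigned long	flags;	/* latched state, never rewritten by sb updates */
};

enum { DEMO_FS_clean_recovery = 0 };

/* At mount: record whether the on-disk state was clean. */
static void demo_mount(struct demo_fs *c)
{
	if (c->sb.clean)
		c->flags |= 1UL << DEMO_FS_clean_recovery;
}

/* Going read-write marks the superblock dirty, overwriting sb.clean. */
static void demo_go_rw(struct demo_fs *c)
{
	c->sb.clean = false;
}

/* Later checks ask the latched flag, not the live superblock field. */
static bool demo_was_clean_recovery(const struct demo_fs *c)
{
	return (c->flags & (1UL << DEMO_FS_clean_recovery)) != 0;
}
```

After demo_mount() followed by demo_go_rw() on a cleanly shut down
filesystem, sb.clean reads false while demo_was_clean_recovery() still
reports true.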
We had a bug where disk accounting keys didn't always have their version
field set in journal replay; change the BUG_ON() to a WARN(), and
exclude this case since it's now checked for elsewhere (in the bkey
validate function).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This was added to avoid double-counting accounting keys in journal
replay. But applied incorrectly (easily done since it applies to the
transaction commit, not a particular update), it leads to skipping
in-mem accounting for real accounting updates, and failure to give them
a version number - which leads to journal replay becoming very confused
the next time around.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Previously, check_inode() would delete unlinked inodes if they weren't
on the deleted list - this code dating from before there was a deleted
list.
But, if we crash during a logged op (truncate or finsert/fcollapse) of
an unlinked file, logged op resume will get confused if the inode has
already been deleted - instead, just add it to the deleted list if it
needs to be there; delete_dead_inodes runs after logged op resume.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
BCH_SB_ERRS() has a field for the actual enum val so that we can reorder
to reorganize, but the way BCH_SB_ERR_MAX was defined didn't allow for
this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
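
To see why an x-macro list with explicit values breaks a "last entry plus
one" bound once entries are reordered, here is a minimal sketch. The list
DEMO_ERRS() and every name in it are invented for illustration, and the
reorder-safe bound shown is just one possible approach; the commit message
does not say how BCH_SB_ERR_MAX is actually defined after the fix.

```c
#include <assert.h>

/*
 * Hypothetical error list: x(name, stable value). The entries have been
 * regrouped, so the values are no longer in ascending order.
 */
#define DEMO_ERRS()		\
	x(inode_bad,	0)	\
	x(dirent_bad,	2)	\
	x(extent_bad,	1)

enum demo_err {
#define x(name, val)	DEMO_ERR_##name = (val),
	DEMO_ERRS()
#undef x
	/*
	 * Fragile: an enumerator without a value is "previous + 1", so this
	 * is 2 here even though value 2 is already in use.
	 */
	DEMO_ERR_NAIVE_MAX
};

/* Reorder-safe bound: one past the largest value, checking every entry. */
static unsigned demo_err_max(void)
{
	unsigned max = 0;
#define x(name, val)	if ((unsigned) (val) + 1 > max) max = (unsigned) (val) + 1;
	DEMO_ERRS()
#undef x
	return max;
}
```

With the list above, DEMO_ERR_NAIVE_MAX evaluates to 2 while
demo_err_max() returns 3, so an array sized by the naive bound would be
one slot too small.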
__bch2_fsck_err() warns if the current task has a btree_trans object and
it wasn't passed in, because if it has to prompt for user input it has
to be able to unlock it.
But plumbing the btree_trans through bkey_validate(), as well as
transaction restarts, is problematic - so instead make bkey fsck errors
FSCK_AUTOFIX, which doesn't need to warn.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In order to check for accounting keys with version=0, we need to run
validation after they've been assigned version numbers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
accounting read was checking if accounting replicas entries were marked
in the superblock prior to applying accounting from the journal,
which meant that a recently removed device could spuriously trigger a
"not marked in superblock" error (when journal entries zero out the
offending counter).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Minor refactoring - replace multiple bool arguments with an enum; prep
work for fixing a bug in accounting read.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
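
The refactoring pattern named above, replacing several bool arguments with
one enum, might look like the following sketch. The mode names and what
each mode does are assumptions made up for this example; the commit
message does not list the real bch_accounting_mode values.

```c
#include <assert.h>

/* Hypothetical modes; not the real bch_accounting_mode values. */
enum demo_accounting_mode {
	DEMO_ACCOUNTING_normal,
	DEMO_ACCOUNTING_gc,
	DEMO_ACCOUNTING_read,
};

struct demo_counters {
	long live;	/* counters used at runtime */
	long gc;	/* counters recomputed by a checking pass */
};

/*
 * Before the refactor a call site would read
 *	demo_account(c, delta, false, true);
 * which says nothing about intent. With an enum the call site names the
 * mode, and adding a mode does not change the function signature.
 */
static void demo_account(struct demo_counters *c, long delta,
			 enum demo_accounting_mode mode)
{
	switch (mode) {
	case DEMO_ACCOUNTING_normal:
		c->live += delta;
		break;
	case DEMO_ACCOUNTING_gc:
		c->gc += delta;
		break;
	case DEMO_ACCOUNTING_read:
		/* assumed for the example: initial read seeds both sets */
		c->live += delta;
		c->gc += delta;
		break;
	}
}
```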
Dealing with outside state within a btree transaction is always tricky.
check_extents() and check_dirents() have to accumulate counters for
i_sectors and i_nlink (for subdirectories). There were two bugs:
- transaction commit may return a restart; therefore we have to commit
before accumulating to those counters
- get_inode_all_snapshots() may return a transaction restart, before
updating w->last_pos; then, on the restart,
check_i_sectors()/check_subdir_count() would see inodes that were not
for w->last_pos
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Returning a positive integer instead of an error code causes error paths
to become very confused.
Closes: syzbot+c0360e8367d6d8d04a66@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
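
A short sketch of why a positive return value confuses error paths,
assuming the usual kernel convention that callers test "ret < 0". The
functions below are hypothetical and are not the code touched by this
commit.

```c
#include <assert.h>
#include <errno.h>

/* Correct: 0 on success, negative errno on failure. */
static int demo_lookup(int key, int *out)
{
	if (key < 0)
		return -EINVAL;
	*out = key * 2;
	return 0;
}

/* Buggy: the failure case returns a positive integer. */
static int demo_lookup_buggy(int key, int *out)
{
	if (key < 0)
		return 1;
	*out = key * 2;
	return 0;
}

/* Typical caller: only negative values are treated as errors. */
static int demo_caller(int (*lookup)(int, int *), int key)
{
	int val = 0;
	int ret = lookup(key, &val);

	if (ret < 0)
		return ret;
	/* With the buggy variant we get here on failure, with val unset. */
	return val;
}
```

demo_caller(demo_lookup, -5) propagates -EINVAL, while
demo_caller(demo_lookup_buggy, -5) returns 0 as though the lookup had
produced valid data.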
The pointer clean points to memory allocated by kmemdup(). When
bch2_sb_clean_validate_late() returns nonzero, the memory pointed to by
clean is leaked, so we should free it in this case.
Fixes: a37ad1a3ab ("bcachefs: sb-clean.c")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
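
The shape of this fix, freeing a duplicated buffer when late validation
fails, can be sketched in userspace C. This is an illustration only:
demo_memdup(), demo_validate() and demo_read_clean() are hypothetical
helpers standing in for kmemdup(), the validation step and the caller, and
the allocation counter exists only so the example can be checked.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

static int demo_live_allocs;	/* outstanding allocations, for checking */

static void *demo_memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p) {
		memcpy(p, src, len);
		demo_live_allocs++;
	}
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		free(p);
		demo_live_allocs--;
	}
}

/* Hypothetical validation: accept only buffers starting with 0xC1. */
static int demo_validate(const unsigned char *buf, size_t len)
{
	return (len && buf[0] == 0xC1) ? 0 : -EINVAL;
}

static int demo_read_clean(const unsigned char *src, size_t len,
			   unsigned char **out)
{
	unsigned char *clean = demo_memdup(src, len);
	int ret;

	if (!clean)
		return -ENOMEM;

	ret = demo_validate(clean, len);
	if (ret) {
		demo_free(clean);	/* without this, the copy is leaked */
		return ret;
	}

	*out = clean;			/* ownership passes to the caller */
	return 0;
}
```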
The return value of downgrade_table_extra() needs to be checked. When it
fails, we should exit immediately.
Fixes: 7773df19c3 ("bcachefs: metadata version bucket_stripe_sectors")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
check_topology doesn't need the srcu lock and doesn't use normal btree
transactions - we can just drop the srcu lock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
fsck_err() jumps to the fsck_err label when bailing out; need to make
sure bp_iter was initialized...
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
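
The hazard described above is a cleanup label that runs on every exit,
including exits taken before the iterator was set up. A minimal sketch of
one way to make that safe is below; struct demo_iter and its helpers are
hypothetical, and the actual fix may initialize bp_iter differently.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_iter { bool active; };

static int demo_iters_open;	/* open iterators, for checking */

static void demo_iter_init(struct demo_iter *it)
{
	it->active = true;
	demo_iters_open++;
}

static void demo_iter_exit(struct demo_iter *it)
{
	/* Safe on a zero-initialized iterator that was never opened. */
	if (it->active) {
		it->active = false;
		demo_iters_open--;
	}
}

static int demo_check(bool early_error)
{
	/* Initialized up front, before anything can jump to fsck_err. */
	struct demo_iter bp_iter = { .active = false };
	int ret = 0;

	if (early_error) {
		ret = -1;
		goto fsck_err;	/* bails out before demo_iter_init() */
	}

	demo_iter_init(&bp_iter);
	/* ... work that uses bp_iter ... */
fsck_err:
	demo_iter_exit(&bp_iter);
	return ret;
}
```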
This fixes a kasan splat in propagate_key_to_snapshot_leaves() -
varint_decode_fast() does reads (that it never uses) up to 7 bytes past
the end of the integer.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
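
To illustrate the class of problem (a fast decoder that loads a fixed 8
bytes regardless of the encoded length), here is a sketch using an
invented encoding: one length byte followed by that many little-endian
payload bytes. This is not the bcachefs varint format, and the two safe
alternatives shown are generic options, not necessarily what this commit
does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Invented format: in[0] = payload length n (1..8), then n LE bytes. */

/* Fast path: always loads 8 payload bytes, so when n < 8 it reads up to
 * 7 bytes past the end of the integer and masks them off afterwards. */
static uint64_t demo_decode_fast(const uint8_t *in)
{
	unsigned n = in[0];
	uint8_t raw[8];
	uint64_t v = 0;

	memcpy(raw, in + 1, sizeof(raw));	/* the over-read */
	for (unsigned i = 0; i < 8; i++)
		v |= (uint64_t) raw[i] << (8 * i);

	return n >= 8 ? v : v & ((1ULL << (8 * n)) - 1);
}

/* Option 1: bytewise decode that touches only the n payload bytes. */
static uint64_t demo_decode_safe(const uint8_t *in)
{
	unsigned n = in[0];
	uint64_t v = 0;

	for (unsigned i = 0; i < n && i < 8; i++)
		v |= (uint64_t) in[1 + i] << (8 * i);
	return v;
}

/* Option 2: copy into a zero-padded scratch buffer, then use the fast path. */
static uint64_t demo_decode_padded(const uint8_t *in, size_t avail)
{
	uint8_t scratch[9] = { 0 };

	memcpy(scratch, in, avail < sizeof(scratch) ? avail : sizeof(scratch));
	return demo_decode_fast(scratch);
}
```

Calling demo_decode_fast() directly on a 3-byte buffer would read beyond
it, which is the kind of access KASAN reports; both alternatives return
the same value without leaving the buffer.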
Most or all errors will be autofix in the future; we're currently just
doing the ones that we know are well tested.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
ovl_open_realfile() is wrongly called twice after conversion to
new struct fd.
Fixes: 88a2f6468d ("struct fd: representation change")
Reported-by: syzbot+d9efec94dcbfa0de1c07@syzkaller.appspotmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a focus on fixes for the memfd_pin_folios() work which was added
into 6.11. Apart from that, the usual shower of singleton fixes.
Merge tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"19 hotfixes. 13 are cc:stable.
There's a focus on fixes for the memfd_pin_folios() work which was
added into 6.11. Apart from that, the usual shower of singleton fixes"
* tag 'mm-hotfixes-stable-2024-09-27-09-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
ocfs2: fix uninit-value in ocfs2_get_block()
zram: don't free statically defined names
memory tiers: use default_dram_perf_ref_source in log message
Revert "list: test: fix tests for list_cut_position()"
kselftests: mm: fix wrong __NR_userfaultfd value
compiler.h: specify correct attribute for .rodata..c_jump_table
mm/damon/Kconfig: update DAMON doc URL
mm: kfence: fix elapsed time for allocated/freed track
ocfs2: fix deadlock in ocfs2_get_system_file_inode
ocfs2: reserve space for inline xattr before attaching reflink tree
mm: migrate: annotate data-race in migrate_folio_unmap()
mm/hugetlb: simplify refs in memfd_alloc_folio
mm/gup: fix memfd_pin_folios alloc race panic
mm/gup: fix memfd_pin_folios hugetlb page allocation
mm/hugetlb: fix memfd_pin_folios resv_huge_pages leak
mm/hugetlb: fix memfd_pin_folios free_huge_pages leak
mm/filemap: fix filemap_get_folios_contig THP panic
mm: make SPLIT_PTE_PTLOCKS depend on SMP
tools: fix shared radix-tree build
no_llseek had been defined to NULL two years ago, in commit 868941b144
("fs: remove no_llseek").
To quote that commit,
At -rc1 we'll need do a mechanical removal of no_llseek -
git grep -l -w no_llseek | grep -v porting.rst | while read i; do
sed -i '/\<no_llseek\>/d' $i
done
would do it.
Unfortunately, that hadn't been done. Linus, could you do that now, so
that we could finally put that thing to rest? All instances are of the
form
.llseek = no_llseek,
so it's obviously safe.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SHA-512 shash TFM is used only briefly during Session Setup stage,
when computing SMB 3.1.1 preauth hash.
There's no need to keep it allocated in servers' secmech the whole time,
so keep its lifetime inside smb311_update_preauth_hash().
This also makes smb311_crypto_shash_allocate() redundant, so expose
smb3_crypto_shash_allocate() and use that.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
The HMAC-MD5 shash TFM is used only briefly during Session Setup stage,
when computing NTLMv2 hashes.
There's no need to keep it allocated in servers' secmech the whole time,
so keep its lifetime inside setup_ntlmv2_rsp().
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>