btrfs: tree-checker: add dev extent item checks

[REPORT]
There is a corruption report that btrfs refused to mount a fs that has
overlapping dev extents:

  BTRFS error (device sdc): dev extent devid 4 physical offset 14263979671552 overlap with previous dev extent end 14263980982272
  BTRFS error (device sdc): failed to verify dev extents against chunks: -117
  BTRFS error (device sdc): open_ctree failed

[CAUSE]
The direct cause is obvious: there is a bad dev extent item with an
incorrect length.

btrfs check reports two overlapping extents, and the second one gives a
clue to the cause:

  ERROR: dev extent devid 4 offset 14263979671552 len 6488064 overlap with previous dev extent end 14263980982272
  ERROR: dev extent devid 13 offset 2257707008000 len 6488064 overlap with previous dev extent end 2257707270144
  ERROR: errors found in extent allocation tree or chunk allocation

For the second one, the dev extent offset and the previous dev extent end
differ by a single bit:

  hex(2257707008000) = 0x20da9d30000
  hex(2257707270144) = 0x20da9d70000
  diff               = 0x00000040000

So it looks like a bitflip happened during new dev extent allocation,
resulting in the second overlap.

Currently the dev extent verification is only done at mount time, but if
the corruption is caused by a memory bitflip, we really want to catch it
before writing the corruption to storage.

Furthermore, dev extent items have the following key definition:

	(<device id> DEV_EXTENT <physical offset>)

The key only encodes the start of the dev extent, not its length. Thus we
can not rely on the generic key order check alone to make sure there is
no overlap.

[ENHANCEMENT]
Introduce dedicated dev extent checks, including:

- Fixed member checks
  * chunk_tree should always be BTRFS_CHUNK_TREE_OBJECTID (3)
  * chunk_objectid should always be
    BTRFS_FIRST_CHUNK_TREE_OBJECTID (256)

- Alignment checks
  * chunk_offset should be aligned to sectorsize
  * length should be aligned to sectorsize
  * key.offset should be aligned to sectorsize

- Overlap checks
  If the previous key is also a dev-extent item, with the same
  device id, make sure we do not overlap with the previous dev extent.

Reported-by: Stefan N <stefannnau@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CA+W5K0rSO3koYTo=nzxxTm1-Pdu1HYgVxEpgJ=aGc7d=E8mGEg@mail.gmail.com/
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo 2024-08-11 15:00:22 +09:30 committed by David Sterba
parent 3bc2ac2f8f
commit 008e2512dc

@@ -1764,6 +1764,72 @@ static int check_raid_stripe_extent(const struct extent_buffer *leaf,
	return 0;
}

static int check_dev_extent_item(const struct extent_buffer *leaf,
				 const struct btrfs_key *key,
				 int slot,
				 struct btrfs_key *prev_key)
{
	struct btrfs_dev_extent *de;
	const u32 sectorsize = leaf->fs_info->sectorsize;

	de = btrfs_item_ptr(leaf, slot, struct btrfs_dev_extent);
	/* Basic fixed member checks. */
	if (unlikely(btrfs_dev_extent_chunk_tree(leaf, de) !=
		     BTRFS_CHUNK_TREE_OBJECTID)) {
		generic_err(leaf, slot,
			    "invalid dev extent chunk tree id, has %llu expect %llu",
			    btrfs_dev_extent_chunk_tree(leaf, de),
			    BTRFS_CHUNK_TREE_OBJECTID);
		return -EUCLEAN;
	}
	if (unlikely(btrfs_dev_extent_chunk_objectid(leaf, de) !=
		     BTRFS_FIRST_CHUNK_TREE_OBJECTID)) {
		generic_err(leaf, slot,
			    "invalid dev extent chunk objectid, has %llu expect %llu",
			    btrfs_dev_extent_chunk_objectid(leaf, de),
			    BTRFS_FIRST_CHUNK_TREE_OBJECTID);
		return -EUCLEAN;
	}
	/* Alignment check. */
	if (unlikely(!IS_ALIGNED(key->offset, sectorsize))) {
		generic_err(leaf, slot,
			    "invalid dev extent key.offset, has %llu not aligned to %u",
			    key->offset, sectorsize);
		return -EUCLEAN;
	}
	if (unlikely(!IS_ALIGNED(btrfs_dev_extent_chunk_offset(leaf, de),
				 sectorsize))) {
		/* Report the chunk offset, the value that failed the check. */
		generic_err(leaf, slot,
			    "invalid dev extent chunk offset, has %llu not aligned to %u",
			    btrfs_dev_extent_chunk_offset(leaf, de),
			    sectorsize);
		return -EUCLEAN;
	}
	if (unlikely(!IS_ALIGNED(btrfs_dev_extent_length(leaf, de),
				 sectorsize))) {
		generic_err(leaf, slot,
			    "invalid dev extent length, has %llu not aligned to %u",
			    btrfs_dev_extent_length(leaf, de), sectorsize);
		return -EUCLEAN;
	}
	/* Overlap check with previous dev extent. */
	if (slot && prev_key->objectid == key->objectid &&
	    prev_key->type == key->type) {
		struct btrfs_dev_extent *prev_de;
		u64 prev_len;

		prev_de = btrfs_item_ptr(leaf, slot - 1, struct btrfs_dev_extent);
		prev_len = btrfs_dev_extent_length(leaf, prev_de);
		if (unlikely(prev_key->offset + prev_len > key->offset)) {
			/* "prev offset" is the key offset, not the device id. */
			generic_err(leaf, slot,
				    "dev extent overlap, prev offset %llu len %llu current offset %llu",
				    prev_key->offset, prev_len, key->offset);
			return -EUCLEAN;
		}
	}
	return 0;
}

/*
 * Common point to switch the item-specific validation.
 */
@@ -1800,6 +1866,9 @@ static enum btrfs_tree_block_status check_leaf_item(struct extent_buffer *leaf,
	case BTRFS_DEV_ITEM_KEY:
		ret = check_dev_item(leaf, key, slot);
		break;
	case BTRFS_DEV_EXTENT_KEY:
		ret = check_dev_extent_item(leaf, key, slot, prev_key);
		break;
	case BTRFS_INODE_ITEM_KEY:
		ret = check_inode_item(leaf, key, slot);
		break;