This patch changes printbufs dynamically allocate and reallocate a
buffer as needed. Stack usage has become a bit of a problem, and a major
cause of that has been static size string buffers on the stack.
The most involved part of this refactoring is that printbufs must now be
exited with printbuf_exit().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch improves the superblock .to_text() methods and adds methods
for all types that were missing them. It also improves printbufs by
allowing them to specfiy what units we want to be printing in, and adds
new wrapper methods for unifying our kernel and userspace environments.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This patch converts bch2_sb_validate() and the .validate methods for the
various superblock sections to take printbuf, to which they can print
detailed error messages, including printing the entire section that was
invalid.
This is a great improvement over the previous situation, where we could
only return static strings that didn't have precise information about
what was wrong.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This simplifies the code quite a bit and eliminates an inconsistency - a
given bkey doesn't necessarily translate to a single replicas entry for
disk space accounting.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This changes bch2_trans_fs_usage_apply() to handle failure (replicas
entry missing) by reverting the changes it made - meaning we can make
the main transaction commit path a bit slimmer, and perhaps also
simplify some locking in upcoming patches.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It's not an error if we don't have cached data - skip BCH_DATA_cached in
bch2_have_enough_devs().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is an important cleanup, eliminating an unnecessary copy in the
transaction commit path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch standardizes all the enums that have associated string tables
(probably more enums should have string tables).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
If a given set of replicas is entirely on failed devices, don't fail the
mount: we will still fail the mount if we have some copies on non failed
devices.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When the replicas mechanism was added, for tracking data by which drives
it's replicated on, the check for whether we have sufficient devices was
never updated to make use of it. This patch finally does that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes some arithmetic bugs in "bcachefs: Journal updates to dev
usage" - additionally, it cleans things up by switching everything that
goes in every journal entry to the journal_entry_res mechanism.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes a bug introduced by "bcachefs: Improve diagnostics when
journal entries are missing" - devices in a replicas entry are supposed
to be sorted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We're transitioning to memalloc_nofs_save/restore instead of GFP flags
with the rest of the kernel, and GFP_NOIO was excessively strict and
causing unnnecessary allocation failures - these allocations are done
with btree locks dropped.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes a regression introduced by "bcachefs: Refactor filesystem
usage accounting". We have to include all the replicas entries that have
any of the entries for different journal entries nonzero, we can't skip
them if they sum to zero.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This is something we clearly should be checking for, but weren't -
oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Various filesystem usage counters are kept in percpu counters, with one
set per in flight journal buffer. Right now all the code that deals with
it assumes that there's only two buffers/sets of counters, but the
number of journal bufs is getting increased to 4 in the next patch - so
refactor that code to not assume a constant.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Instead of trying to charge EC parity to the data within the stripe
(which is subject to rounding errors), let's charge it to the stripe
itself. It should also make -ENOSPC issues easier to deal with if we
charge for parity blocks up front, and means we can also make more fine
grained accounting available to the user.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Awhile back the mechanism for garbage collecting unused replicas entries
was significantly improved, but some cleanup was missed - this patch
does that now.
This is also prep work for a patch to account for erasure coded parity
blocks separately - we need to consolidate the logic for
checking/marking the various replicas entries from one bkey into a
single function.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add a new btree ptr type which contains the sequence number (random 64
bit cookie, actually) for that btree node - this lets us verify that
when we read in a btree node it really is the btree node we wanted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This needs to match bch2_mark_extent()/bch2_trans_mark_extent() in
buckets.c
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This make the disk accounting code saner, and it's not clear why we'd
ever want the same data to be in multiple stripes simultaneously.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When an extent is erasure coded, we need to record a replicas entry to
indicate that data is present on the devices that extent has pointers to
- but nr_required should be 0, because it's erasure coded.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The continue statement in bch2_trans_mark_extent() was wrong - by
bailing out early, we'd be constructing the wrong replicas list to
update. Also, the assertion in update_replicas() was wrong - due to
rounding with compressed extents, it is possible for sectors to be 0
sometimes.
Also, change extent_to_replicas() in replicas.c to match the replicas
list we construct in buckets.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.
Also improve the on disk format versioning stuff.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>