Commit Graph

156 Commits

Author SHA1 Message Date
Joel Becker
2b6cb576aa ocfs2: Set suballoc_loc on allocated metadata.
Get the suballoc_loc from ocfs2_claim_new_inode() or
ocfs2_claim_metadata().  Store it on the appropriate field of the block
we just allocated.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-03-26 10:09:15 +08:00
Joel Becker
1ed9b777f7 ocfs2: ocfs2_claim_*() don't need an ocfs2_super argument.
They all take an ocfs2_alloc_context, which has the allocation inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
2010-05-06 13:59:06 +08:00
Tao Ma
c901fb0073 ocfs2: Make ocfs2_extend_trans() really extend.
In ocfs2, we use ocfs2_extend_trans() to extend a journal handle's
blocks. But if jbd2_journal_extend() fails, it will only restart
with the the new number of blocks.  This tends to be awkward since
in most cases we want additional reserved blocks. It makes our code
harder to mantain since the caller can't be sure all the original
blocks will not be accessed and dirtied again.  There are 15 callers
of ocfs2_extend_trans() in fs/ocfs2, and 12 of them have to add
h_buffer_credits before they call ocfs2_extend_trans().  This makes
ocfs2_extend_trans() really extend atop the original block count.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-05 18:18:09 -07:00
Mark Fasheh
4fe370afaa ocfs2: use allocation reservations during file write
Add a per-inode reservations structure and pass it through to the
reservations code.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2010-05-05 18:17:30 -07:00
Joel Becker
ec20cec7a3 ocfs2: Make ocfs2_journal_dirty() void.
jbd[2]_journal_dirty_metadata() only returns 0.  It's been returning 0
since before the kernel moved to git.  There is no point in checking
this error.

ocfs2_journal_dirty() has been faithfully returning the status since the
beginning.  All over ocfs2, we have blocks of code checking this can't
fail status.  In the past few years, we've tried to avoid adding these
checks, because they are pointless.  But anyone who looks at our code
assumes they are needed.

Finally, ocfs2_journal_dirty() is made a void function.  All error
checking is removed from other files.  We'll BUG_ON() the status of
jbd2_journal_dirty_metadata() just in case they change it someday.  They
won't.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-05-05 18:17:29 -07:00
Linus Torvalds
e213e26ab3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
  quota: stop using QUOTA_OK / NO_QUOTA
  dquot: cleanup dquot initialize routine
  dquot: move dquot initialization responsibility into the filesystem
  dquot: cleanup dquot drop routine
  dquot: move dquot drop responsibility into the filesystem
  dquot: cleanup dquot transfer routine
  dquot: move dquot transfer responsibility into the filesystem
  dquot: cleanup inode allocation / freeing routines
  dquot: cleanup space allocation / freeing routines
  ext3: add writepage sanity checks
  ext3: Truncate allocated blocks if direct IO write fails to update i_size
  quota: Properly invalidate caches even for filesystems with blocksize < pagesize
  quota: generalize quota transfer interface
  quota: sb_quota state flags cleanup
  jbd: Delay discarding buffers in journal_unmap_buffer
  ext3: quota_write cross block boundary behaviour
  quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
  quota: split out compat_sys_quotactl support from quota.c
  quota: split out netlink notification support from quota.c
  quota: remove invalid optimization from quota_sync_all
  ...

Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
2010-03-05 13:20:53 -08:00
Christoph Hellwig
5dd4056db8 dquot: cleanup space allocation / freeing routines
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.

Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not.  Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05 00:20:28 +01:00
Tiger Yang
b89c54282d ocfs2: add extent block stealing for ocfs2 v5
This patch add extent block (metadata) stealing mechanism for
extent allocation. This mechanism is same as the inode stealing.
if no room in slot specific extent_alloc, we will try to
allocate extent block from the next slot.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Acked-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2010-02-26 15:41:07 -08:00
Linus Torvalds
849254e9bb Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2: Set i_nlink properly during reflink.
  ocfs2: Add reflinked file's inode to inode hash eariler.
  ocfs2: refcounttree.c cleanup.
  ocfs2: Find proper end cpos for a leaf refcount block.
2009-12-23 13:33:07 -08:00
Christoph Hellwig
2cfd30adf6 ocfs: stop using do_sync_mapping_range
do_sync_mapping_range(..., SYNC_FILE_RANGE_WRITE) is a very awkward way
to perform a filemap_fdatawrite_range.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
André Goddard Rosa
af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
Tao Ma
38a04e4327 ocfs2: Find proper end cpos for a leaf refcount block.
ocfs2 refcount tree is stored as an extent tree while
the leaf ocfs2_refcount_rec points to a refcount block.

The following step can trip a kernel panic.
mkfs.ocfs2 -b 512 -C 1M --fs-features=refcount $DEVICE
mount -t ocfs2 $DEVICE $MNT_DIR
FILE_NAME=$RANDOM
FILE_NAME_1=$RANDOM
FILE_REF="${FILE_NAME}_ref"
FILE_REF_1="${FILE_NAME}_ref_1"
for((i=0;i<305;i++))
do
# /mnt/1048576 is a file with 1048576 sizes.
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
done
for((i=0;i<3;i++))
do
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME
done

for((i=0;i<2;i++))
do
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
done

cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME

for((i=0;i<11;i++))
do
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME
cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
done
reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF
# write_f is a program which will write some bytes to a file at offset.
# write_f -f file_name -l offset -w write_bytes.
./write_f -f $MNT_DIR/$FILE_REF -l $[310*1048576] -w 4096
./write_f -f $MNT_DIR/$FILE_REF -l $[306*1048576] -w 4096
./write_f -f $MNT_DIR/$FILE_REF -l $[311*1048576] -w 4096
./write_f -f $MNT_DIR/$FILE_NAME -l $[310*1048576] -w 4096
./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096
reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF_1
./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096
#kernel panic here.

The reason is that if the ocfs2_extent_rec is the last record
in a leaf extent block, the old solution fails to find the
suitable end cpos. So this patch try to walk through the b-tree,
find the next sub root and get the c_pos the next sub-tree starts
from.

btw, I have runned tristan's test case against the patched kernel
for several days and this type of kernel panic never happens again.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-12-02 16:14:57 -08:00
Tao Ma
c18b812d12 ocfs2: Make transaction extend more efficient.
In ocfs2_extend_rotate_transaction, op_credits is the orignal
credits in the handle and we only want to extend the credits
for the rotation, but the old solution always double it. It
is harmless for some minor operations, but for actions like
reflink we may rotate tree many times and cause the credits
increase dramatically. So this patch try to only increase
the desired credits.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:46 -07:00
Tao Ma
6ae23c5555 ocfs2: CoW refcount tree improvement.
During CoW, if the old extent record is refcounted, we allocate
som new clusters and do CoW. Actually we can have some improvement
here. If the old extent has refcount=1, that means now it is only
used by this file. So we don't need to allocate new clusters, just
remove the refcounted flag and it is OK. We also have to remove
it from the refcount tree while not deleting it.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:36 -07:00
Tao Ma
6f70fa5199 ocfs2: Add CoW support.
This patch try CoW support for a refcounted record.

the whole process will be:
1. Calculate how many clusters we need to CoW and where we start.
   Extents that are not completely encompassed by the write will
   be broken on 1MB boundaries.
2. Do CoW for the clusters with the help of page cache.
3. Change the b-tree structure with the new allocated clusters.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:36 -07:00
Tao Ma
bcbbb24a6a ocfs2: Decrement refcount when truncating refcounted extents.
Add 'Decrement refcount for delete' in to the normal truncate
process. So for a refcounted extent record, call refcount rec
decrementation instead of cluster free.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:35 -07:00
Tao Ma
1aa75fea64 ocfs2: Add functions for extents refcounted.
Add function ocfs2_mark_extent_refcounted which can mark
an extent refcounted.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:34 -07:00
Tao Ma
1823cb0b9f ocfs2: Add support of decrementing refcount for delete.
Given a physical cpos and length, decrement the refcount
in the tree. If the refcount for any portion of the extent goes
to zero, that portion is queued for freeing.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:33 -07:00
Tao Ma
e2e9f6082b ocfs2: move tree path functions to alloc.h.
Now fs/ocfs2/alloc.c has more than 7000 lines. It contains our
basic b-tree operation. Although we have already make our b-tree
operation generic, the basic structrue ocfs2_path which is used
to iterate one b-tree branch is still static and limited to only
used in alloc.c. As refcount tree need them and I don't want to
add any more b-tree unrelated code to alloc.c, export them out.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:32 -07:00
Tao Ma
fe92441595 ocfs2: Add refcount b-tree as a new extent tree.
Add refcount b-tree as a new extent tree so that it can
use the b-tree to store and maniuplate ocfs2_refcount_rec.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:31 -07:00
Tao Ma
555936bfcb ocfs2: Abstract extent split process.
ocfs2_mark_extent_written actually does the following things:
1. check the parameters.
2. initialize the left_path and split_rec.
3. call __ocfs2_mark_extent_written. it will do:
   1) check the flags of unwritten
   2) do the real split work.
The whole process is packed tightly somehow. So this patch
will abstract 2 different functions so that future b-tree
operation can work with it.

1. __ocfs2_split_extent will accept path and split_rec and do
  the real split work.
2. ocfs2_change_extent_flag will accept a new flag and initialize
   path and split_rec.

So now ocfs2_mark_extent_written will do:
1. check the parameters.
2. call ocfs2_change_extent_flag.
   1) initalize the left_path and split_rec.
   2) check whether the new flags conflict with the old one.
   3) call __ocfs2_split_extent to do the split.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:31 -07:00
Tao Ma
853a3a1439 ocfs2: Wrap ocfs2_extent_contig in ocfs2_extent_tree.
Add a new operation eo_ocfs2_extent_contig int the extent tree's
operations vector. So that with the new refcount tree, We want
this so that refcount trees can always return CONTIG_NONE and
prevent extent merging.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
2009-09-22 20:09:30 -07:00
Joel Becker
5e404e9ed1 ocfs2: Pass ocfs2_caching_info into ocfs_init_*_extent_tree().
With this commit, extent tree operations are divorced from inodes and
rely on ocfs2_caching_info.  Phew!

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:13 -07:00
Joel Becker
a1cf076ba9 ocfs2: __ocfs2_mark_extent_written() doesn't need struct inode.
We only allow unwritten extents on data, so the toplevel
ocfs2_mark_extent_written() can use an inode all it wants.  But the
subfunction isn't even using the inode argument.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:12 -07:00
Joel Becker
f3868d0fa2 ocfs2: Teach ocfs2_replace_extent_rec() to use an extent_tree.
Don't use a struct inode anymore.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:11 -07:00
Joel Becker
d231129f44 ocfs2: ocfs2_split_and_insert() no longer needs struct inode.
It already has an extent_tree.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:11 -07:00
Joel Becker
dbdcf6a48a ocfs2: ocfs2_remove_extent() no longer needs struct inode.
One more generic btree function that is isolated from struct inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:10 -07:00
Joel Becker
cbee7e1a6a ocfs2: ocfs2_add_clusters_in_btree() no longer needs struct inode.
One more function that doesn't need a struct inode to pass to its
children.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:09 -07:00
Joel Becker
cc79d8c19e ocfs2: ocfs2_insert_extent() no longer needs struct inode.
One more function down, no inode in the entire insert-extent chain.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:09 -07:00
Joel Becker
92ba470c44 ocfs2: Make extent map insertion an extent_tree_operation.
ocfs2_insert_extent() wants to insert a record into the extent map if
it's an inode data extent.  But since many btrees can call that
function, let's make it an op on ocfs2_extent_tree.  Other tree types
can leave it empty.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:08 -07:00
Joel Becker
627961b77e ocfs2: ocfs2_figure_insert_type() no longer needs struct inode.
It's not using it, so remove it from the parameter list.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:08 -07:00
Joel Becker
1ef61b3314 ocfs2: Remove inode from ocfs2_figure_extent_contig().
It already has an ocfs2_extent_tree and doesn't need the inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:07 -07:00
Joel Becker
a29702914a ocfs2: Swap inode for extent_tree in ocfs2_figure_merge_contig_type().
We don't want struct inode in generic btree operations.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:07 -07:00
Joel Becker
b4a176515c ocfs2: ocfs2_extent_contig() only requires the superblock.
Don't pass the inode in.  We don't want it around for generic btree
operations.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:05 -07:00
Joel Becker
3505bec018 ocfs2: ocfs2_do_insert_extent() and ocfs2_insert_path() no longer need an inode.
They aren't using it, so remove it from their parameter lists.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:05 -07:00
Joel Becker
c38e52bb1c ocfs2: Give ocfs2_split_record() an extent_tree instead of an inode.
Another on the way to generic btree functions.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:05 -07:00
Joel Becker
d562862314 ocfs2: ocfs2_insert_at_leaf() doesn't need struct inode.
Give it an ocfs2_extent_tree and it is happy.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:04 -07:00
Joel Becker
4c911eefca ocfs2: Make truncating the extent map an extent_tree_operation.
ocfs2_remove_extent() wants to truncate the extent map if it's
truncating an inode data extent.  But since many btrees can call that
function, let's make it an op on ocfs2_extent_tree.  Other tree types
can leave it empty.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:03 -07:00
Joel Becker
043beebb6c ocfs2: ocfs2_truncate_rec() doesn't need struct inode.
It's not using it anymore.  Remove it from the parameter list.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:03 -07:00
Joel Becker
d401dc12fc ocfs2: ocfs2_grow_branch() and ocfs2_append_rec_to_path() lose struct inode.
ocfs2_grow_branch() not really using it other than to pass it to the
subfunctions ocfs2_shift_tree_depth(), ocfs2_find_branch_target(), and
ocfs2_add_branch().  The first two weren't it either, so they drop the
argument.  ocfs2_add_branch() only passed it to
ocfs2_adjust_rightmost_branch(), which drops the inode argument and uses
the ocfs2_extent_tree as well.

ocfs2_append_rec_to_path() can be take an ocfs2_extent_tree instead of
the inode.  The function ocfs2_adjust_rightmost_records() goes along for
the ride.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:02 -07:00
Joel Becker
c495dd24ac ocfs2: ocfs2_try_to_merge_extent() doesn't need struct inode.
It's not using it, so remove it from the parameter list.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:02 -07:00
Joel Becker
4fe82c312a ocfs2: ocfs2_merge_rec_left/right() no longer need struct inode.
Drop it from the parameters - they already have ocfs2_extent_list.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:01 -07:00
Joel Becker
70f18c08b4 ocfs2: ocfs2_rotate_tree_left() no longer needs struct inode.
It already gets ocfs2_extent_tree, so we can just use that.  This chains
to the same modification for ocfs2_remove_rightmost_path() and
ocfs2_rotate_rightmost_leaf_left().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:08:00 -07:00
Joel Becker
e46f74dc35 ocfs2: __ocfs2_rotate_tree_left() doesn't need struct inode.
It already has struct ocfs2_extent_tree, which has the caching info.  So
we don't need to pass it struct inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:59 -07:00
Joel Becker
1e2dd63fe0 ocfs2: ocfs2_rotate_subtree_left() doesn't need struct inode.
It already has struct ocfs2_extent_tree, which has the caching info.  So
we don't need to pass it struct inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:59 -07:00
Joel Becker
09106bae05 ocfs2: ocfs2_update_edge_lengths() doesn't need struct inode.
Pass in the extent tree, which is all we need.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:58 -07:00
Joel Becker
1bbf0b8d60 ocfs2: ocfs2_rotate_tree_right() doesn't need struct inode.
We don't need struct inode in ocfs2_rotate_tree_right() anymore.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:58 -07:00
Joel Becker
6136ca5f5f ocfs2: Drop struct inode from ocfs2_extent_tree_operations.
We can get to the inode from the caching information.  Other parent
types don't need it.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:57 -07:00
Joel Becker
7dc0280567 ocfs2: Pass ocfs2_extent_tree to ocfs2_get_subtree_root()
Get rid of the inode argument.  Use extent_tree instead.  This means a
few more functions have to pass an extent_tree around.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:55 -07:00
Joel Becker
5c601aba8c ocfs2: Get inode out of ocfs2_rotate_subtree_root_right().
Pass the ocfs2_extent_list down through ocfs2_rotate_tree_right() and
get rid of struct inode in ocfs2_rotate_subtree_root_right().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-09-04 16:07:55 -07:00