linux

Author	SHA1	Message	Date
Sage Weil	393f662096	ceph: fix possible double-free of mds request reference Clear pointer to mds request after dropping the reference to ensure we don't drop it again, as there is at least one error path through this function that does not reset fi->last_readdir to a new value. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:06 -07:00
Sage Weil	d96d60498f	ceph: fix session check on mds reply Fix a broken check that a reply came back from the same MDS we sent the request to. I don't think a case that actually triggers this would ever come up in practice, but it's clearly wrong and easy to fix. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:05 -07:00
Dan Carpenter	4736b009b8	ceph: handle kmalloc() failure Return ERR_PTR(-ENOMEM) if kmalloc() fails. We handle allocation failures the same way later in the function. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:04 -07:00
Sage Weil	9c423956b8	ceph: propagate mds session allocation failures to caller Return error to original caller if register_session() fails. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:04 -07:00
Sage Weil	8f883c24de	ceph: make write_begin wait propagate ERESTARTSYS Currently, if the wait_event_interruptible is interrupted, we return EAGAIN unconditionally and loop, such that we aren't, in fact, interruptible. So, propagate ERESTARTSYS if we get it. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:03 -07:00
Sage Weil	ec4318bcb4	ceph: fix snap rebuild condition We were rebuilding the snap context when it was not necessary (i.e. when the realm seq hadn't changed _and_ the parent seq was still older), which caused page snapc pointers to not match the realm's snapc pointer (even though the snap context itself was identical). This confused begin_write and put it into an endless loop. The correct logic is: rebuild snapc if _my_ realm seq changed, or if my parent realm's seq is newer than mine (and thus mine needs to be rebuilt too). Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:02 -07:00
Sage Weil	87b315a5b5	ceph: avoid reopening osd connections when address hasn't changed We get a fault callback on _every_ tcp connection fault. Normally, we want to reopen the connection when that happens. If the address we have is bad, however, and connection attempts always result in a connection refused or similar error, explicitly closing and reopening the msgr connection just prevents the messenger's backoff logic from kicking in. The result can be a console full of [ 3974.417106] ceph: osd11 10.3.14.138:6800 connection failed [ 3974.423295] ceph: osd11 10.3.14.138:6800 connection failed [ 3974.429709] ceph: osd11 10.3.14.138:6800 connection failed Instead, if we get a fault, and have outstanding requests, but the osd address hasn't changed and the connection never successfully connected in the first place, do nothing to the osd connection. The messenger layer will back off and retry periodically, because we never connected and thus the lossy bit is not set. Instead, touch each request's r_stamp so that handle_timeout can tell the request is still alive and kicking. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:01 -07:00
Sage Weil	3dd72fc0e6	ceph: rename r_sent_stamp r_stamp Make variable name slightly more generic, since it will (soon) reflect either the time the request was sent OR the time it was last determined to be still retrying. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:47:00 -07:00
Sage Weil	3c3f2e32ef	ceph: fix connection fault con_work reentrancy problem The messenger fault was clearing the BUSY bit, for reasons unclear. This made it possible for the con->ops->fault function to reopen the connection, and requeue work in the workqueue--even though the current thread was already in con_work. This avoids a problem where the client busy loops with connection failures on an unreachable OSD, but doesn't address the root cause of that problem. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:59 -07:00
Sage Weil	e4cb4cb8a0	ceph: prevent dup stale messages to console for restarting mds Prevent duplicate 'mds0 caps stale' message from spamming the console every few seconds while the MDS restarts. Set s_renew_requested earlier, so that we only print the message once, even if we don't send an actual request. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:58 -07:00
Sage Weil	efd7576b23	ceph: fix pg pool decoding from incremental osdmap update The incremental map decoding of pg pool updates wasn't skipping the snaps and removed_snaps vectors. This caused osd requests to stall when pool snapshots were created or fs snapshots were deleted. Use a common helper for full and incremental map decoders that decodes pools properly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:57 -07:00
Sage Weil	80fc7314a7	ceph: fix mds sync() race with completing requests The wait_unsafe_requests() helper dropped the mdsc mutex to wait for each request to complete, and then examined r_node to get the next request after retaking the lock. But the request completion removes the request from the tree, so r_node was always undefined at this point. Since it's a small race, it usually led to a valid request, but not always. The result was an occasional crash in rb_next() while dereferencing node->rb_left. Fix this by clearing the rb_node when removing the request from the request tree, and not walking off into the weeds when we are done waiting for a request. Since the request we waited on will _always_ be out of the request tree, take a ref on the next request, in the hopes that it won't be. But if it is, it's ok: we can start over from the beginning (and traverse over older read requests again). Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:56 -07:00
Sage Weil	916623da10	ceph: only release unused caps with mds requests We were releasing used caps (e.g. FILE_CACHE) from encode_inode_release with MDS requests (e.g. setattr). We don't carry refs on most caps, so this code worked most of the time, but for setattr (utimes) we try to drop Fscr. This causes cap state to get slightly out of sync with reality, and may result in subsequent mds revoke messages getting ignored. Fix by only releasing unused caps. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:55 -07:00
Sage Weil	15637c8b12	ceph: clean up handle_cap_grant, handle_caps wrt session mutex Drop session mutex unconditionally in handle_cap_grant, and do the check_caps from the handle_cap_grant helper. This avoids using a magic return value. Also avoid using a flag variable in the IMPORT case and call check_caps at the appropriate point. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:54 -07:00
Sage Weil	cdc2ce056a	ceph: fix session locking in handle_caps, ceph_check_caps Passing a session pointer to ceph_check_caps() used to mean it would leave the session mutex locked. That wasn't always possible if it wasn't passed CHECK_CAPS_AUTHONLY. If could unlock the passed session and lock a differet session mutex, which was clearly wrong, and also emitted a warning when it a racing CPU retook it and we did an unlock from the wrong context. This was only a problem when there was more than one MDS. First, make ceph_check_caps unconditionally drop the session mutex, so that it is free to lock other sessions as needed. Then adjust the one caller that passes in a session (handle_cap_grant) accordingly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:53 -07:00
Sage Weil	4ea0043a29	ceph: drop unnecessary WARN_ON in caps migration If we don't have the exported cap it's because we already released it. No need to WARN. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:52 -07:00
Sage Weil	12eadc1900	ceph: fix null pointer deref of r_osd in debug output This causes an oops when debug output is enabled and we kick an osd request with no current r_osd (sometime after an osd failure). Check the pointer before dereferencing. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:51 -07:00
Sage Weil	0a990e7093	ceph: clean up service ticket decoding Previously we would decode state directly into our current ticket_handler. This is problematic if for some reason we fail to decode, because we end up with half new state and half old state. We are probably already in bad shape if we get an update we can't decode, but we may as well be tidy anyway. Decode into new_* temporaries and update the ticket_handler only on success. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-23 07:46:47 -07:00
Sage Weil	5b3dbb44ab	ceph: release old ticket_blob buffer Release the old ticket_blob buffer when we get an updated service ticket from the monitor. Previously these were getting leaked. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:11 -07:00
Sage Weil	807c86e2ce	ceph: fix authenticator buffer size calculation The buffer size was incorrectly calculated for the ceph_x_encrypt() encapsulated ticket blob. Use a helper (with correct arithmetic) and BUG out if we were wrong. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:10 -07:00
Sage Weil	63733a0fc5	ceph: fix authenticator timeout We were failing to reconnect to services due to an old authenticator, even though we had the new ticket, because we weren't properly retrying the connect handshake, because we were calling an old/incorrect helper that left in_base_pos incorrect. The result was a failure to reconnect to the OSD or MDS (with an authentication error) if the MDS restarted after the service had been up a few hours (long enough for the original authenticator to be invalid). This was only a problem if the AUTH_X authentication was enabled. Now that the 'negotiate' and 'connect' stages are fully separated, we should use the prepare_read_connect() helper instead, and remove the obsolete one. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:09 -07:00
Sage Weil	8b218b8a4a	ceph: fix inode removal from snap realm when racing with migration When an inode was dropped while being migrated between two MDSs, i_cap_exporting_issued was non-zero such that issue caps were non-zero and __ceph_is_any_caps(ci) was true. This prevented the inode from being removed from the snap realm, even as it was dropped from the cache. Fix this by dropping any residual i_snap_realm ref in destroy_inode. Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:08 -07:00
Sage Weil	052bb34af3	ceph: add missing locking to protect i_snap_realm_item during split All ci->i_snap_realm_item/realm->inodes_with_caps manipulation should be protected by realm->inodes_with_caps_lock. This bug would have only bit us in a rare race with a realm split (during some snap creations). Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:07 -07:00
Sage Weil	978097c907	ceph: implemented caps should always be superset of issued caps Added assertion, and cleared one case where the implemented caps were not following the issued caps. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-03-20 21:33:06 -07:00
Linus Torvalds	fc7f99cf36	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (205 commits) ceph: update for write_inode API change ceph: reset osd after relevant messages timed out ceph: fix flush_dirty_caps race with caps migration ceph: include migrating caps in issued set ceph: fix osdmap decoding when pools include (removed) snaps ceph: return EBADF if waiting for caps on closed file ceph: set osd request message front length correctly ceph: reset front len on return to msgpool; BUG on mismatched front iov ceph: fix snaptrace decoding on cap migration between mds ceph: use single osd op reply msg ceph: reset bits on connection close ceph: remove bogus mds forward warning ceph: remove fragile __map_osds optimization ceph: fix connection fault STANDBY check ceph: invalidate_authorizer without con->mutex held ceph: don't clobber write return value when using O_SYNC ceph: fix client_request_forward decoding ceph: drop messages on unregistered mds sessions; cleanup ceph: fix comments, locking in destroy_inode ceph: move dereference after NULL test ... Fix trivial conflicts in Documentation/ioctl/ioctl-number.txt	2010-03-19 09:43:06 -07:00
Linus Torvalds	0a492fdef8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: trivial white space [CIFS] checkpatch cleanup cifs: add cifs_revalidate_file cifs: add a CIFSSMBUnixQFileInfo function cifs: add a CIFSSMBQFileInfo function cifs: overhaul cifs_revalidate and rename to cifs_revalidate_dentry	2010-03-19 09:36:18 -07:00
Linus Torvalds	441f4058a0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (30 commits) Btrfs: fix the inode ref searches done by btrfs_search_path_in_tree Btrfs: allow treeid==0 in the inode lookup ioctl Btrfs: return keys for large items to the search ioctl Btrfs: fix key checks and advance in the search ioctl Btrfs: buffer results in the space_info ioctl Btrfs: use __u64 types in ioctl.h Btrfs: fix search_ioctl key advance Btrfs: fix gfp flags masking in the compression code Btrfs: don't look at bio flags after submit_bio btrfs: using btrfs_stack_device_id() get devid btrfs: use memparse Btrfs: add a "df" ioctl for btrfs Btrfs: cache the extent state everywhere we possibly can V2 Btrfs: cache ordered extent when completing io Btrfs: cache extent state in find_delalloc_range Btrfs: change the ordered tree to use a spinlock instead of a mutex Btrfs: finish read pages in the order they are submitted btrfs: fix btrfs_mkdir goto for no free objectids Btrfs: flush data on snapshot creation Btrfs: make df be a little bit more understandable ...	2010-03-18 16:50:55 -07:00
Linus Torvalds	7c34691abe	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: ensure bdi_unregister is called on mount failure. NFS: Avoid a deadlock in nfs_release_page NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode() nfs4: Make the v4 callback service hidden nfs: fix unlikely memory leak rpc client can not deal with ENOSOCK, so translate it into ENOCONN	2010-03-18 16:50:09 -07:00
Linus Torvalds	01d61d0d64	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: don't warn about page discards on shutdown xfs: use scalable vmap API xfs: remove old vmap cache	2010-03-18 16:46:05 -07:00
Chris Mason	8ad6fcab56	Btrfs: fix the inode ref searches done by btrfs_search_path_in_tree This is used by the inode lookup ioctl to follow all the backrefs up to the subvol root. But the search being done would sometimes land one past the last item in the leaf instead of finding the backref. This changes the search to look for the highest possible backref and hop back one item. It also fixes a leaked path on failure to find the root. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-18 12:23:10 -04:00
Chris Mason	1b53ac4d1b	Btrfs: allow treeid==0 in the inode lookup ioctl When a root id of 0 is sent to the inode lookup ioctl, it will use the root of the file we're ioctling and pass the root id back to userland along with the results. This allows userland to do searches based on that root later on. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-18 12:17:05 -04:00
Chris Mason	90fdde147f	Btrfs: return keys for large items to the search ioctl The search ioctl was skipping large items entirely (ones that are too big for the results buffer). This changes things to at least copy the item header so that we can send information about the item back to userland. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-18 12:14:54 -04:00
Chris Mason	abc6e1341b	Btrfs: fix key checks and advance in the search ioctl The search ioctl was working well for finding tree roots, but using it for generic searches requires a few changes to how the keys are advanced. This treats the search control min fields for objectid, type and offset more like a key, where we drop the offset to zero once we bump the type, etc. The downside of this is that we are changing the min_type and min_offset fields during the search, and so the ioctl caller needs extra checks to make sure the keys in the result are the ones it wanted. This also changes key_in_sk to use btrfs_comp_cpu_keys, just to make things more readable. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-18 12:10:08 -04:00
Akinobu Mita	c4af96449e	ntfs: use bitmap_weight Use bitmap_weight() instead of doing hweight32() for each u32 element in the page. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-17 18:43:47 -07:00
Venkatesh Pallipadi	bcc54e2a6d	jffs2: fix up rb_root initializations to use RB_ROOT jffs2 uses rb_node = NULL; to zero rb_root. The problem with this is that `17d9ddc72f` ("rbtree: Add support for augmented rbtrees") in the linux-next tree adds a new field to that struct which needs to be NULL as well. This patch uses RB_ROOT as the intializer so all of the relevant fields will be NULL'd. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Eric Paris <eparis@redhat.com> Acked-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-17 18:43:47 -07:00
Dave Chinner	e8c3753ce4	xfs: don't warn about page discards on shutdown If we are doing a forced shutdown, we can get lots of noise about delalloc pages being discarded. This is happens by design during a forced shutdown, so don't spam the logs with these messages. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-03-16 15:40:53 -05:00
Alex Elder	8a262e573d	xfs: use scalable vmap API Re-apply a commit that had been reverted due to regressions that have since been fixed. From `95f8e302c0` Mon Sep 17 00:00:00 2001 From: Nick Piggin <npiggin@suse.de> Date: Tue, 6 Jan 2009 14:43:09 +1100 Implement XFS's large buffer support with the new vmap APIs. See the vmap rewrite (`db64fe02`) for some numbers. The biggest improvement that comes from using the new APIs is avoiding the global KVA allocation lock on every call. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Only modifications here were a minor reformat, plus making the patch apply given the new use of xfs_buf_is_vmapped(). Modified-by: Alex Elder <aelder@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-03-16 15:40:36 -05:00
Alex Elder	cd9640a70d	xfs: remove old vmap cache Re-apply a commit that had been reverted due to regressions that have since been fixed. Original commit: `d2859751cd` Author: Nick Piggin <npiggin@suse.de> Date: Tue, 6 Jan 2009 14:40:44 +1100 XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps track of them in a list. To purge the batch, it just goes through the list and calls vunamp on each one. This is pretty poor: a global TLB flush is generally still performed on each vunmap, with the most expensive parts of the operation being the broadcast IPIs and locking involved in the SMP callouts, and the locking involved in the vmap management -- none of these are avoided by just batching up the calls. I'm actually surprised it ever made much difference. (Now that the lazy vmap allocator is upstream, this description is not quite right, but the vunmap batching still doesn't seem to do much). Rip all this logic out of XFS completely. I will improve vmap performance and scalability directly in subsequent patch. Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> The only change I made was to use the "new" xfs_buf_is_vmapped() function in a place it had been open-coded in the original. Modified-by: Alex Elder <aelder@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-03-16 15:40:19 -05:00
Chris Mason	7fde62bffb	Btrfs: buffer results in the space_info ioctl The space_info ioctl was using copy_to_user inside rcu_read_lock. This commit changes things to copy into a buffer first and then dump the result down to userland. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-16 15:40:10 -04:00
Sage Weil	ce769a2904	Btrfs: use __u64 types in ioctl.h Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-16 14:24:27 -04:00
Sage Weil	854d2c3531	Btrfs: fix search_ioctl key advance key->type is u8, not u64. fs/btrfs/ioctl.c: In function 'copy_to_sk': fs/btrfs/ioctl.c:1024: warning: comparison is always true due to limited range of data type Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-16 14:24:27 -04:00
NeilBrown	cfbc0683af	NFS: ensure bdi_unregister is called on mount failure. bdi_unregister is called by nfs_put_super which is only called by generic_shutdown_super if ->s_root is not NULL. So if we error out in a circumstance where we called nfs_bdi_register (i.e. server != NULL) but have not set s_root, then we need to call bdi_unregister explicitly in nfs_get_sb and various other *_get_sb() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-03-15 15:37:45 -04:00
Dan Carpenter	8212cf7583	cifs: trivial white space I fixed the indent level. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2010-03-15 15:19:47 +00:00
Nick Piggin	ef5780c018	Btrfs: fix gfp flags masking in the compression code GFP_FS must be masked out, NOFS can't be or'd in. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:05:57 -04:00
Chris Mason	5ff7ba3a79	Btrfs: don't look at bio flags after submit_bio After callling submit_bio, the bio can be freed at any time. The btrfs submission thread helper was checking the bio flags too late, which might not give the correct answer. When CONFIG_DEBUG_PAGE_ALLOC is turned on, it can lead to oopsen. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:15 -04:00
Xiao Guangrong	a343832f1a	btrfs: using btrfs_stack_device_id() get devid We can use btrfs_stack_device_id() to get dev_item->devid Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:14 -04:00
Akinobu Mita	91748467a5	btrfs: use memparse Use memparse() instead of its own private implementation. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:14 -04:00
Josef Bacik	1406e4327b	Btrfs: add a "df" ioctl for btrfs df is a very loaded question in btrfs. This gives us a way to get the per-space usage information so we can tell exactly what is in use where. This will help us figure out ENOSPC problems, and help users better understand where their disk space is going. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:14 -04:00
Josef Bacik	2ac55d41b5	Btrfs: cache the extent state everywhere we possibly can V2 This patch just goes through and fixes everybody that does lock_extent() blah unlock_extent() to use lock_extent_bits() blah unlock_extent_cached() and pass around a extent_state so we only have to do the searches once per function. This gives me about a 3 mb/s boots on my random write test. I have not converted some things, like the relocation and ioctl's, since they aren't heavily used and the relocation stuff is in the middle of being re-written. I also changed the clear_extent_bit() to only unset the cached state if we are clearing EXTENT_LOCKED and related stuff, so we can do things like this lock_extent_bits() clear delalloc bits unlock_extent_cached() without losing our cached state. I tested this thoroughly and turned on LEAK_DEBUG to make sure we weren't leaking extent states, everything worked out fine. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:13 -04:00
Josef Bacik	5a1a3df1f6	Btrfs: cache ordered extent when completing io When finishing io we run btrfs_dec_test_ordered_pending, and then immediately run btrfs_lookup_ordered_extent, but btrfs_dec_test_ordered_pending does that already, so we're searching twice when we don't have to. This patch lets us pass a btrfs_ordered_extent in to btrfs_dec_test_ordered_pending so if we do complete io on that ordered extent we can just use the one we found then instead of having to do another btrfs_lookup_ordered_extent. This made my fio job with the other patch go from 24 mb/s to 29 mb/s. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-03-15 11:00:13 -04:00

1 2 3 4 5 ...

17368 Commits