linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-10 22:21:40 +00:00

Author	SHA1	Message	Date
Tao Ma	c1e8d35ef5	ocfs2: Remove EXIT from masklog. mlog_exit is used to record the exit status of a function. But because it is added in so many functions, if we enable it, the system logs get filled up quickly and cause too much I/O. So actually no one can open it for a production system or even for a test. This patch just try to remove it or change it. So: 1. if all the error paths already use mlog_errno, it is just removed. Otherwise, it will be replaced by mlog_errno. 2. if it is used to print some return value, it is replaced with mlog(0,...). mlog_exit_ptr is changed to mlog(0. All those mlog(0,...) will be replaced with trace events later. Signed-off-by: Tao Ma <boyu.mt@taobao.com>	2011-03-07 16:43:21 +08:00
Tao Ma	ef6b689b63	ocfs2: Remove ENTRY from masklog. ENTRY is used to record the entry of a function. But because it is added in so many functions, if we enable it, the system logs get filled up quickly and cause too much I/O. So actually no one can open it for a production system or even for a test. So for mlog_entry_void, we just remove it. for mlog_entry(...), we replace it with mlog(0,...), and they will be replace by trace event later. Signed-off-by: Tao Ma <boyu.mt@taobao.com>	2011-02-21 11:10:44 +08:00
Al Viro	ba87167c06	switch ocfs2, close races Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-12 20:02:46 -05:00
Linus Torvalds	498f7f505d	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (22 commits) MAINTAINERS: Update Joel Becker's email address ocfs2: Remove unused truncate function from alloc.c ocfs2/cluster: dereferencing before checking in nst_seq_show() ocfs2: fix build for OCFS2_FS_STATS not enabled ocfs2/cluster: Show o2net timing statistics ocfs2/cluster: Track process message timing stats for each socket ocfs2/cluster: Track send message timing stats for each socket ocfs2/cluster: Use ktime instead of timeval in struct o2net_sock_container ocfs2/cluster: Replace timeval with ktime in struct o2net_send_tracking ocfs2: Add DEBUG_FS dependency ocfs2/dlm: Hard code the values for enums ocfs2/dlm: Minor cleanup ocfs2/dlm: Cleanup dlmdebug.c ocfs2: Release buffer_head in case of error in ocfs2_double_lock. ocfs2/cluster: Pin the local node when o2hb thread starts ocfs2/cluster: Show pin state for each o2hb region ocfs2/cluster: Pin/unpin o2hb regions ocfs2/cluster: Remove dropped region from o2hb quorum region bitmap ocfs2/cluster: Pin the remote node item in configfs ocfs2/dlm: make existing convertion precedent over new lock ...	2011-01-11 11:28:34 -08:00
Nick Piggin	fb045adb99	fs: dcache reduce branches in lookup path Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])d_op([[:space:]])=' \| xargs sed -e 's/\([^\t ]\)->d_op = \(.\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]\)\.d_op = \(.\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:28 +11:00
Tao Ma	1e6d9153df	ocfs2: Release buffer_head in case of error in ocfs2_double_lock. In ocfs2_double_lock, when ocfs2_inode_lock for inode1 fails, we just unlock inode2 and return without releasing buffer we get from inode_lock(inode2). The good thing is that it is freed by the only caller ocfs2_rename when it exits. But I don't think this is a right way for error handling. We should free the buffer_head we get in ocfs2_double_lock before exit so that the caller doesn't need to take care of it. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-12-22 18:34:42 -08:00
Al Viro	7de9c6ee3e	new helper: ihold() Clones an existing reference to inode; caller must already hold one. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:26:11 -04:00
Goldwyn Rodrigues	5e98d49240	Track negative entries v3 Track negative dentries by recording the generation number of the parent directory in d_fsdata. The generation number for the parent directory is recorded in the inode_info, which increments every time the lock on the directory is dropped. If the generation number of the parent directory and the negative dentry matches, there is no need to perform the revalidate, else a revalidate is forced. This improves performance in situations where nodes look for the same non-existent file multiple times. Thanks Mark for explaining the DLM sequence. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-09-10 09:18:15 -07:00
Mark Fasheh	97b8f4a9df	ocfs2: Fix orphan add in ocfs2_create_inode_in_orphan ocfs2_create_inode_in_orphan() is used by reflink to create the newly reflinked inode simultaneously in the orphan dir. This allows us to easily handle partially-reflinked files during recovery cleanup. We have a problem though - the orphan dir stringifies inode # to determine a unique name under which the orphan entry dirent can be created. Since ocfs2_create_inode_in_orphan() needs the space allocated in the orphan dir before it can allocate the inode, we currently call into the orphan code: /* * We give the orphan dir the root blkno to fake an orphan name, * and allocate enough space for our insertion. */ status = ocfs2_prepare_orphan_dir(osb, &orphan_dir, osb->root_blkno, orphan_name, &orphan_insert); Using osb->root_blkno might work fine on unindexed directories, but the orphan dir can have an index. When it has that index, the above code fails to allocate the proper index entry. Later, when we try to remove the file from the orphan dir (using the actual inode #), the reflink operation will fail. To fix this, I created a function ocfs2_alloc_orphaned_file() which uses the newly split out orphan and inode alloc code to figure out what the inode block number will be (once allocated) and then prepare the orphan dir from that data. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-09-08 14:26:00 +08:00
Mark Fasheh	dd43bcde23	ocfs2: split out ocfs2_prepare_orphan_dir() into locking and prep functions We do this because ocfs2_create_inode_in_orphan() wants to order locking of the orphan dir with respect to locking of the inode allocator before making any changes to the directory. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-09-08 14:26:00 +08:00
Mark Fasheh	021960cab3	ocfs2: split out inode alloc code from ocfs2_mknod_locked Do this by splitting the bulk of the function away from the inode allocation code at the very tom of ocfs2_mknod_locked(). Existing callers don't need to change and won't see any difference. The new function created, __ocfs2_mknod_locked() will be used shortly. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-09-08 14:25:58 +08:00
Dmitry Monakhov	75fe0a2477	ocfs2: replace inode uid,gid,mode initialization with helper function Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:25 -04:00
Linus Torvalds	03e62303cf	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits) ocfs2: Silence a gcc warning. ocfs2: Don't retry xattr set in case value extension fails. ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break ocfs2: Reset xattr value size after xa_cleanup_value_truncate(). fs/ocfs2/dlm: Use kstrdup fs/ocfs2/dlm: Drop memory allocation cast Ocfs2: Optimize punching-hole code. Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public. Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing. Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead. ocfs2: Block signals for mkdir/link/symlink/O_CREAT. ocfs2: Wrap signal blocking in void functions. ocfs2/dlm: Increase o2dlm lockres hash size ocfs2: Make ocfs2_extend_trans() really extend. ocfs2/trivial: Code cleanup for allocation reservation. ocfs2: make ocfs2_adjust_resv_from_alloc simple. ocfs2: Make nointr a default mount option ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE o2net: log socket state changes ocfs2: print node # when tcp fails ...	2010-05-21 07:20:17 -07:00
Joel Becker	41841b0bce	Merge branch 'discontig-bg' of git://oss.oracle.com/git/tma/linux-2.6 into ocfs2-merge-window	2010-05-18 16:40:42 -07:00
Joel Becker	547ba7c8ef	ocfs2: Block signals for mkdir/link/symlink/O_CREAT. Once file or link creation gets going, it can't be interrupted by a signal. They're not idempotent. This blocks signals in ocfs2_mknod(), ocfs2_link(), and ocfs2_symlink() once we start actually changing things. ocfs2_mknod() covers mknod(), creat(), mkdir(), and open(O_CREAT). Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-10 11:56:52 -07:00
Joel Becker	ec20cec7a3	ocfs2: Make ocfs2_journal_dirty() void. jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0 since before the kernel moved to git. There is no point in checking this error. ocfs2_journal_dirty() has been faithfully returning the status since the beginning. All over ocfs2, we have blocks of code checking this can't fail status. In the past few years, we've tried to avoid adding these checks, because they are pointless. But anyone who looks at our code assumes they are needed. Finally, ocfs2_journal_dirty() is made a void function. All error checking is removed from other files. We'll BUG_ON() the status of jbd2_journal_dirty_metadata() just in case they change it someday. They won't. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-05-05 18:17:29 -07:00
Mark Fasheh	a9743fcdc0	ocfs2: Add directory entry later in ocfs2_symlink() and ocfs2_mknod() If we get a failure during creation of an inode we'll allow the orphan code to remove the inode, which is correct. However, we need to ensure that we don't get any errors after the call to ocfs2_add_entry(), otherwise we could leave a dangling directory reference. The solution is simple - in both cases, all I had to do was move ocfs2_dentry_attach_lock() above the ocfs2_add_entry() call. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:42:22 -07:00
Li Dongyang	062d340384	ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_mknod error path Mark the inode with flag OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_mknod, so we can kill the inode in case of error. [ Fixed up comment style -Mark ] Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:05:00 -07:00
Li Dongyang	ab41fdc8fd	ocfs2: use OCFS2_INODE_SKIP_ORPHAN_DIR in ocfs2_symlink error path Mark the inode with flag OCFS2_INODE_SKIP_ORPHAN_DIR when we get an error after allocating one, so that we can kill the inode. Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:05:00 -07:00
Li Dongyang	d4cd1871cf	ocfs2: add OCFS2_INODE_SKIP_ORPHAN_DIR flag and honor it in the inode wipe code Currently in the error path of ocfs2_symlink and ocfs2_mknod, we just call iput with the inode we failed with, but the inode wipe code will complain because we don't add the inode to orphan dir. One solution would be to lock the orphan dir during the entire transaction, but that's too heavy for a rare error path. Instead, we add a flag, OCFS2_INODE_SKIP_ORPHAN_DIR which tells the inode wipe code that it won't find this inode in the orphan dir. [ Merge fixes and comment style cleanups -Mark ] Signed-off-by: Li Dongyang <lidongyang@novell.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2010-04-23 11:03:49 -07:00
Tristan Ye	3939fda4b3	Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir. Currently, some callers were missing to journal the dirty inode after adding it to orphan dir. Now we're going to journal such modifications within the ocfs2_orphan_add() itself, It's safe to do so, though some existing caller may duplicate this, and it makes the logic look more straightforward anyway. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-23 18:22:51 -07:00
Joel Becker	2b6cb576aa	ocfs2: Set suballoc_loc on allocated metadata. Get the suballoc_loc from ocfs2_claim_new_inode() or ocfs2_claim_metadata(). Store it on the appropriate field of the block we just allocated. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2010-03-26 10:09:15 +08:00
Joel Becker	1ed9b777f7	ocfs2: ocfs2_claim_*() don't need an ocfs2_super argument. They all take an ocfs2_alloc_context, which has the allocation inode. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>	2010-05-06 13:59:06 +08:00
Christoph Hellwig	871a293155	dquot: cleanup dquot initialize routine Get rid of the initialize dquot operation - it is now always called from the filesystem and if a filesystem really needs it's own (which none currently does) it can just call into it's own routine directly. Rename the now static low-level dquot_initialize helper to __dquot_initialize and vfs_dq_init to dquot_initialize to have a consistent namespace. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:30 +01:00
Christoph Hellwig	907f4554e2	dquot: move dquot initialization responsibility into the filesystem Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:30 +01:00
Christoph Hellwig	63936ddaa1	dquot: cleanup inode allocation / freeing routines Get rid of the alloc_inode and free_inode dquot operations - they are always called from the filesystem and if a filesystem really needs their own (which none currently does) it can just call into it's own routine directly. Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always call the lowlevel dquot_alloc_inode / dqout_free_inode routines directly, which now lose the number argument which is always 1. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:28 +01:00
Christoph Hellwig	5dd4056db8	dquot: cleanup space allocation / freeing routines Get rid of the alloc_space, free_space, reserve_space, claim_space and release_rsv dquot operations - they are always called from the filesystem and if a filesystem really needs their own (which none currently does) it can just call into it's own routine directly. Move shared logic into the common __dquot_alloc_space, dquot_claim_space_nodirty and __dquot_free_space low-level methods, and rationalize the wrappers around it to move as much as possible code into the common block for CONFIG_QUOTA vs not. Also rename all these helpers to be named dquot_* instead of vfs_dq_*. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:28 +01:00
Linus Torvalds	45e62974fb	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c ocfs2/trivial: Use proper mask for 2 places in hearbeat.c Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink. Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? ocfs2: Use FIEMAP_EXTENT_SHARED fiemap: Add new extent flag FIEMAP_EXTENT_SHARED ocfs2: replace u8 by __u8 in ocfs2_fs.h ocfs2: explicit declare uninitialized var in user_cluster_connect() ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock() ocfs2: return -EAGAIN instead of EAGAIN in dlm ocfs2/cluster: Make fence method configurable - v2 ocfs2: Set MS_POSIXACL on remount ocfs2: Make acl use the default ocfs2: Always include ACL support	2009-12-24 12:59:11 -08:00
Tao Ma	10cf1a02f4	ocfs2: Set i_nlink properly during reflink. We create a file in orphan dir for reflink so that if there is any error, we don't create any wrong dentry in the dir. But actually the file in orphan dir should be i_nlink = 0 so that it can be replayed and freed successfully. This patch first set i_nlink to 0 when creating the file in orphan dir and then set it to 1(reflink now only works for regular file) when we move it to the dest dir. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-18 13:32:28 -08:00
Tao Ma	c7d260afcb	ocfs2: Add reflinked file's inode to inode hash eariler. We used to add reflinked file's inode to inode hash when we add it to the dest dir. But actually there is a race. Consider the following sequence. 1. reflink happens and create the inode in orphan dir. 2. reflink thread is scheduled out because of some io. 3. recovery begins to work and calls ocfs2_recover_orphans. It calls ocfs2_iget and get a new inode and i_count = 1. It calls iput then and delete inode. the buffer's uptodate state is cleared. This patch move insert_inode_hash to the create function so that it can be found by step 3 and prevented from deleting because i_count > 1. This resolves the bug http://oss.oracle.com/bugzilla/show_bug.cgi?id=1183. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-18 13:32:20 -08:00
Tristan Ye	55f4946ed2	Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? Let userspace have a chance to get the extent info of a directory just like extN did. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-17 21:21:32 -08:00
Tao Ma	bc13d34757	ocfs2: Create reflinked file in orphan dir. reflink is a very complicated process, so it can't be integrated into one transaction. So if the system panic in the operation, we may leave a unfinished inode in the destication directory. So we will try to create an inode in orphan_dir first, reflink it to the src file and then move it to the destication file in the end. In that way we won't be afraid of any corruption during the reflink. This patch adds 2 functions for orphan_dir operation: 1. Create a new inode in orphand dir. 2. Move an inode to a target dir. Note: fsck.ocfs2 should work for us to remove the unfinished file in the orphan_dir. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2009-09-22 20:09:48 -07:00
Tao Ma	19bd341f6a	ocfs2: Use proper parameter for some inode operation. In order to make the original function more suitable for reflink, we modify the following inode operations. Both are tiny. 1. ocfs2_mknod_locked only use dentry for mlog, so move it to the caller so that reflink can use it without dentry. 2. ocfs2_prepare_orphan_dir only want inode to get its ip_blkno. So use ip_blkno instead. Signed-off-by: Tao Ma <tao.ma@oracle.com>	2009-09-22 20:09:47 -07:00
Joel Becker	0cf2f7632b	ocfs2: Pass struct ocfs2_caching_info to the journal functions. The next step in divorcing metadata I/O management from struct inode is to pass struct ocfs2_caching_info to the journal functions. Thus the journal locks a metadata cache with the cache io_lock function. It also can compare ci_last_trans and ci_created_trans directly. This is a large patch because of all the places we change ocfs2_journal_access..(handle, inode, ...) to ocfs2_journal_access..(handle, INODE_CACHE(inode), ...). Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-09-04 16:07:50 -07:00
Joel Becker	8cb471e8f8	ocfs2: Take the inode out of the metadata read/write paths. We are really passing the inode into the ocfs2_read/write_blocks() functions to get at the metadata cache. This commit passes the cache directly into the metadata block functions, divorcing them from the inode. Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-09-04 16:07:48 -07:00
Jan Kara	cb25797d45	ocfs2: Add lockdep annotations Add lockdep support to OCFS2. The support also covers all of the cluster locks except for open locks, journal locks, and local quotafile locks. These are special because they are acquired for a node, not for a particular process and lockdep cannot deal with such type of locking. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-06-22 14:34:26 -07:00
Tao Ma	7e31a966ad	ocfs2/trivial: Remove unused variable in ocfs2_rename. With indexed dir enabled, now we use ocfs2_dir_lookup_result to wrap all the bh used for dir. So remove the 2 unused variables. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-04-29 10:57:18 -07:00
Tao Ma	138211515c	ocfs2: Optimize inode allocation by remembering last group In ocfs2, the inode block search looks for the "emptiest" inode group to allocate from. So if an inode alloc file has many equally (or almost equally) empty groups, new inodes will tend to get spread out amongst them, which in turn can put them all over the disk. This is undesirable because directory operations on conceptually "nearby" inodes force a large number of seeks. So we add ip_last_used_group in core directory inodes which records the last used allocation group. Another field named ip_last_used_slot is also added in case inode stealing happens. When claiming new inode, we passed in directory's inode so that the allocation can use this information. For more details, please see http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-04-03 11:39:17 -07:00
Mark Fasheh	b80b549c35	ocfs2: re-order ocfs2_empty_dir checks ocfs2_empty_dir() is far more expensive than checking link count. Since both need to be checked at the same time, we can improve performance by checking link count first. Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-04-03 11:39:17 -07:00
Mark Fasheh	198a1ca3b7	ocfs2: Increase max links count Since we've now got a directory format capable of handling a large number of entries, we can increase the maximum link count supported. This only gets increased if the directory indexing feature is turned on. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>	2009-04-03 11:39:16 -07:00
Mark Fasheh	4ed8a6bb08	ocfs2: Store dir index records inline Allow us to store a small number of directory index records in the ocfs2_dx_root_block. This saves us a disk read on small to medium sized directories (less than about 250 entries). The inline root is automatically turned into a root block with extents if the directory size increases beyond it's capacity. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>	2009-04-03 11:39:16 -07:00
Mark Fasheh	9b7895efac	ocfs2: Add a name indexed b-tree to directory inodes This patch makes use of Ocfs2's flexible btree code to add an additional tree to directory inodes. The new tree stores an array of small, fixed-length records in each leaf block. Each record stores a hash value, and pointer to a block in the traditional (unindexed) directory tree where a dirent with the given name hash resides. Lookup exclusively uses this tree to find dirents, thus providing us with constant time name lookups. Some of the hashing code was copied from ext3. Unfortunately, it has lots of unfixed checkpatch errors. I left that as-is so that tracking changes would be easier. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>	2009-04-03 11:39:15 -07:00
Mark Fasheh	4a12ca3a00	ocfs2: Introduce dir lookup helper struct Many directory manipulation calls pass around a tuple of dirent, and it's containing buffer_head. Dir indexing has a bit more state, but instead of adding yet more arguments to functions, we introduce 'struct ocfs2_dir_lookup_result'. In this patch, it simply holds the same tuple, but future patches will add more state. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>	2009-04-03 11:39:15 -07:00
Tiger Yang	d9ae49d6e2	ocfs2: tweak to get the maximum inline data size with xattr Replace max_inline_data with max_inline_data_with_xattr to ensure it correct when xattr inlined. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-03-12 16:45:46 -07:00
Joel Becker	13723d00e3	ocfs2: Use metadata-specific ocfs2_journal_access_() functions. The per-metadata-type ocfs2_journal_access_() functions hook up jbd2 commit triggers and allow us to compute metadata ecc right before the buffers are written out. This commit provides ecc for inodes, extent blocks, group descriptors, and quota blocks. It is not safe to use extened attributes and metaecc at the same time yet. The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide the type of block at their root. Before, it didn't matter, but now the root block must use the appropriate ocfs2_journal_access_*() function. To keep this abstract, the structures now have a pointer to the matching journal_access function and a wrapper call to call it. A few places use naked ocfs2_write_block() calls instead of adding the blocks to the journal. We make sure to calculate their checksum and ecc before the write. Since we pass around the journal_access functions. Let's typedef them in ocfs2.h. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:32 -08:00
Jan Kara	a90714c150	ocfs2: Add quota calls for allocation and freeing of inodes and space Add quota calls for allocation and freeing of inodes and space, also update estimates on number of needed credits for a transaction. Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:40:23 -08:00
Joel Becker	b657c95c11	ocfs2: Wrap inode block reads in a dedicated function. The ocfs2 code currently reads inodes off disk with a simple ocfs2_read_block() call. Each place that does this has a different set of sanity checks it performs. Some check only the signature. A couple validate the block number (the block read vs di->i_blkno). A couple others check for VALID_FL. Only one place validates i_fs_generation. A couple check nothing. Even when an error is found, they don't all do the same thing. We wrap inode reading into ocfs2_read_inode_block(). This will validate all the above fields, going readonly if they are invalid (they never should be). ocfs2_read_inode_block_full() is provided for the places that want to pass read_block flags. Every caller is passing a struct inode with a valid ip_blkno, so we don't need a separate blkno argument either. We will remove the validation checks from the rest of the code in a later commit, as they are no longer necessary. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:36:52 -08:00
Tiger Yang	89c38bd0ad	ocfs2: add ocfs2_init_acl in mknod We need to get the parent directories acls and let the new child inherit it. To this, we add additional calculations for data/metadata allocation. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:34:20 -08:00
Tiger Yang	534eadddc1	ocfs2: add ocfs2_init_security in during file create Security attributes must be set when creating a new inode. We do this in three steps. - First, get security xattr's name and value by security_operation - Calculate and reserve the meta data and clusters needed by this security xattr before starting transaction - Finally, we set it before add_entry Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:34:20 -08:00
Tiger Yang	f5d362022a	ocfs2: move new inode allocation out of the transaction Move out inode allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called outside of a transaction. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2009-01-05 08:34:19 -08:00

1 2 3

109 Commits