Commit Graph

1957 Commits

Author SHA1 Message Date
Felix Blyakher
35fd035968 Merge branch 'master' of git://git.kernel.org/pub/scm/fs/xfs/xfs 2009-06-11 16:56:49 -05:00
Linus Torvalds
c9059598ea Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
  block: add request clone interface (v2)
  floppy: fix hibernation
  ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
  fs/bio.c: add missing __user annotation
  block: prevent possible io_context->refcount overflow
  Add serial number support for virtio_blk, V4a
  block: Add missing bounce_pfn stacking and fix comments
  Revert "block: Fix bounce limit setting in DM"
  cciss: decode unit attention in SCSI error handling code
  cciss: Remove no longer needed sendcmd reject processing code
  cciss: change SCSI error handling routines to work with interrupts enabled.
  cciss: separate error processing and command retrying code in sendcmd_withirq_core()
  cciss: factor out fix target status processing code from sendcmd functions
  cciss: simplify interface of sendcmd() and sendcmd_withirq()
  cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
  cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
  block: needs to set the residual length of a bidi request
  Revert "block: implement blkdev_readpages"
  block: Fix bounce limit setting in DM
  Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
  ...

Manually fix conflicts with tracing updates in:
	block/blk-sysfs.c
	drivers/ide/ide-atapi.c
	drivers/ide/ide-cd.c
	drivers/ide/ide-floppy.c
	drivers/ide/ide-tape.c
	include/trace/events/block.h
	kernel/trace/blktrace.c
2009-06-11 11:10:35 -07:00
Christoph Hellwig
ef14f0c157 xfs: use generic Posix ACL code
This patch rips out the XFS ACL handling code and uses the generic
fs/posix_acl.c code instead.  The ondisk format is of course left
unchanged.

This also introduces the same ACL caching all other Linux filesystems do
by adding pointers to the acl and default acl in struct xfs_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-10 17:07:47 +02:00
Christoph Hellwig
8b5403a6d7 xfs: remove SYNC_BDFLUSH
SYNC_BDFLUSH is a leftover from IRIX and rather misnamed for todays
code.  Make xfs_sync_fsdata and xfs_dq_sync use the SYNC_TRYLOCK flag
for not blocking on logs just as the inode sync code already does.

For xfs_sync_fsdata it's a trivial 1:1 replacement, but for xfs_qm_sync
I use the opportunity to decouple the non-blocking lock case from the
different flushing modes, similar to the inode sync code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:37:16 +02:00
Christoph Hellwig
b0710ccc6d xfs: remove SYNC_IOWAIT
We want to wait for all I/O to finish when we do data integrity syncs.  So
there is no reason to keep SYNC_WAIT separate from SYNC_IOWAIT.  This
causes a little change in behaviour for the ENOSPC flushing code which now
does a second submission and wait of buffered I/O, but that should finish
ASAP as we already did an asynchronous writeout earlier.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:37:11 +02:00
Christoph Hellwig
075fe10286 xfs: split xfs_sync_inodes
xfs_sync_inodes is used to write back either file data or inode metadata.
In general we always do these separately, except for one fishy case in
xfs_fs_put_super that does both.  So separate xfs_sync_inodes into
separate xfs_sync_data and xfs_sync_attr functions.  In xfs_fs_put_super
we first call the data sync and then the attr sync as that was the previous
order.  The moved log force in that path doesn't make a difference because
we will force the log again as part of the real unmount process.

The filesystem readonly checks are not performed by the new function but
instead moved into the callers, given that most callers alredy have it
further up in the stack.  Also add debug checks that we do not pass in
incorrect flags in the new xfs_sync_data and xfs_sync_attr function and
fix the one place that did pass in a wrong flag.

Also remove a comment mentioning xfs_sync_inodes that has been incorrect
for a while because we always take either the iolock or ilock in the
sync path these days.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:48 +02:00
Christoph Hellwig
fe588ed328 xfs: use generic inode iterator in xfs_qm_dqrele_all_inodes
Use xfs_inode_ag_iterator instead of opencoding the inode walk in the
quota code.  Mark xfs_inode_ag_iterator and xfs_sync_inode_valid non-static
to allow using them from the quota code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:27 +02:00
Dave Chinner
75f3cb1393 xfs: introduce a per-ag inode iterator
Given that we walk across the per-ag inode lists so often, it makes sense to
introduce an iterator for this.

Convert the sync and reclaim code to use this new iterator, quota code will
follow in the next patch.

Also change xfs_reclaim_inode to return -EGAIN instead of 1 for an inode
already under reclaim.  This simplifies the AG iterator and doesn't
matter for the only other caller.

[hch: merged the lookup and execute callbacks back into one to get the
 pag_ici_lock locking correct and simplify the code flow]

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:14 +02:00
Dave Chinner
abc1064742 xfs: remove unused parameter from xfs_reclaim_inodes
The noblock parameter of xfs_reclaim_inodes is only ever set to zero. Remove
it and all the conditional code that is never executed.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:12 +02:00
Dave Chinner
1da8eecab5 xfs: factor out inode validation for sync
Separate the validation of inodes found by the radix
tree walk from the radix tree lookup.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:07 +02:00
Christoph Hellwig
845b6d0cbb xfs: split inode flushing from xfs_sync_inodes_ag
In many cases we only want to sync inode metadata. Split out the inode
flushing into a separate helper to prepare factoring the inode sync code.

Based on a patch from Dave Chinner, but redone to keep the current behaviour
exactly and leave changes to the flushing logic to another patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:05 +02:00
Dave Chinner
5a34d5cd09 xfs: split inode data writeback from xfs_sync_inodes_ag
In many cases we only want to sync inode data. Start spliting the inode sync
into data sync and inode sync by factoring out the inode data flush.

[hch: minor cleanups]

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:35:03 +02:00
Christoph Hellwig
7d095257e3 xfs: kill xfs_qmops
Kill the quota ops function vector and replace it with direct calls or
stubs in the CONFIG_XFS_QUOTA=n case.

Make sure we check XFS_IS_QUOTA_RUNNING in the right spots.  We can remove
the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set
otherwise.

This brings us back closer to the way this code worked in IRIX and earlier
Linux versions, but we keep a lot of the more useful factoring of common
code.

Eventually we should also kill xfs_qm_bhv.c, but that's left for a later
patch.

Reduces the size of the source code by about 250 lines and the size of
XFS module by about 1.5 kilobytes with quotas enabled:

   text	   data	    bss	    dec	    hex	filename
 615957	   2960	   3848	 622765	  980ad	fs/xfs/xfs.o
 617231	   3152	   3848	 624231	  98667	fs/xfs/xfs.o.old

Fallout:

 - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects
   the inode locked and xfs_qm_dqattach which does the locking around it,
   thus removing XFS_QMOPT_ILOCKED.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:33:32 +02:00
Christoph Hellwig
0c5e1ce89f xfs: validate quota log items during log recovery
Arkadiusz has seen really strange crashes in xfs_qm_dqcheck that
I can only explain by a log item being too smal to actually fit the
xfs_dqblk_t we're dereferencing all over xfs_qm_dqcheck.  So add
graceful checks for NULL or too small quota items to the log recovery
code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:33:21 +02:00
Christoph Hellwig
e1696834e8 xfs: update max log size
Commit a6634fba3dec4a92f0a2c4e30c80b634c0576ad5 in xfsprogs increased the
maximum log size supported by mkfs.  Merged back the changes to xfs_fs.h
so the growfs enforced the same limit and the headers are in sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2009-06-08 15:32:59 +02:00
Linus Torvalds
4157fd85fc Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: prevent deadlock in xfs_qm_shake()
  xfs: fix overflow in xfs_growfs_data_private
  xfs: fix double unlock in xfs_swap_extents()
2009-06-02 09:47:21 -07:00
Felix Blyakher
1b17d76646 xfs: prevent deadlock in xfs_qm_shake()
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01 22:59:45 -05:00
Eric Sandeen
e6da7c9fed xfs: fix overflow in xfs_growfs_data_private
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number.  If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:

 # xfs_io -f -c "truncate 19998630180864" fsfile
 # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
 # mount -o loop fsfile /mnt
 # xfs_growfs /mnt

meta-data=/dev/loop0             isize=256    agcount=52,
agsize=76288719 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3905982455, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01 22:59:38 -05:00
Felix Blyakher
1f23920dbf xfs: fix double unlock in xfs_swap_extents()
Regreesion from commit ef8f7fc, which rearranged the code in
xfs_swap_extents() leading to double unlock of xfs inode ilock.
That resulted in xfs_fsr deadlocking itself on platforms, which
don't handle double unlock of rw_semaphore nicely. It caused the
count go negative, which represents the write holder, without
really having one. ia64 is one of the platforms where deadlock
was easily reproduced and the fix was tested.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01 22:59:29 -05:00
Felix Blyakher
4156e735d3 xfs: prevent deadlock in xfs_qm_shake()
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-06-01 13:13:24 -05:00
Eric Sandeen
096324873f xfs: fix overflow in xfs_growfs_data_private
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number.  If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:

 # xfs_io -f -c "truncate 19998630180864" fsfile
 # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
 # mount -o loop fsfile /mnt
 # xfs_growfs /mnt

meta-data=/dev/loop0             isize=256    agcount=52,
agsize=76288719 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=3905982455, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-05-26 17:46:37 -05:00
Martin K. Petersen
e1defc4ff0 block: Do away with the notion of hardsect_size
Until now we have had a 1:1 mapping between storage device physical
block size and the logical block sized used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case.  The
sector size will be 4KB but the logical block size will remain
512-bytes.  Hence we need to distinguish between the physical block size
and the logical ditto.

This patch renames hardsect_size to logical_block_size.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-05-22 23:22:54 +02:00
Felix Blyakher
ec91d1335f xfs: fix double unlock in xfs_swap_extents()
Regreesion from commit ef8f7fc, which rearranged the code in
xfs_swap_extents() leading to double unlock of xfs inode ilock.
That resulted in xfs_fsr deadlocking itself on platforms, which
don't handle double unlock of rw_semaphore nicely. It caused the
count go negative, which represents the write holder, without
really having one. ia64 is one of the platforms where deadlock
was easily reproduced and the fix was tested.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-05-08 00:29:44 -05:00
Linus Torvalds
b4348f32da Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: fix getbmap vs mmap deadlock
  xfs: a couple getbmap cleanups
  xfs: add more checks to superblock validation
  xfs_file_last_byte() needs to acquire ilock
2009-05-02 16:52:50 -07:00
Christoph Hellwig
28e211700a xfs: fix getbmap vs mmap deadlock
xfs_getbmap (or rather the formatters called by it) copy out the getbmap
structures under the ilock, which can deadlock against mmap.  This has
been reported via bugzilla a while ago (#717) and has recently also
shown up via lockdep.

So allocate a temporary buffer to format the kernel getbmap structures
into and then copy them out after dropping the locks.

A little problem with this is that we limit the number of extents we
can copy out by the maximum allocation size, but I see no real way
around that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30 00:29:02 -05:00
Christoph Hellwig
5f79ed685f xfs: a couple getbmap cleanups
- reshuffle various conditionals for data vs attr fork to make the code
   more readable
 - do fine-grainded goto-based error handling
 - exit early from conditionals instead of keeping a long else branch around
 - allow kmem_alloc to fail

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30 00:28:31 -05:00
Olaf Weber
b9ec9068d7 xfs: add more checks to superblock validation
There had been reports where xfs filesystem was randomly
corrupted with fsfuzzer, and xfs failed to handle it
gracefully. This patch fixes couple of reported problem
by providing additional checks in the superblock
validation routine.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30 00:26:14 -05:00
Lachlan McIlroy
def6b3ba56 xfs_file_last_byte() needs to acquire ilock
We had some systems crash with this stack:

[<a00000010000cb20>] ia64_leave_kernel+0x0/0x280
[<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs]
[<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs]
[<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs]
[<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs]
[<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs]
[<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs]
[<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs]
[<a000000100162ea0>] __fput+0x1a0/0x420
[<a000000100163160>] fput+0x40/0x60

The problem here is that xfs_file_last_byte() does not acquire the
inode lock and can therefore race with another thread that is modifying
the extext list.  While xfs_bmap_last_offset() is trying to lookup
what was the last extent some extents were merged and the extent list
shrunk so the index we lookup is now beyond the end of the extent list
and potentially in a freed buffer.

Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-30 00:25:25 -05:00
Christoph Hellwig
6321e3ed2a xfs: fix getbmap vs mmap deadlock
xfs_getbmap (or rather the formatters called by it) copy out the getbmap
structures under the ilock, which can deadlock against mmap.  This has
been reported via bugzilla a while ago (#717) and has recently also
shown up via lockdep.

So allocate a temporary buffer to format the kernel getbmap structures
into and then copy them out after dropping the locks.

A little problem with this is that we limit the number of extents we
can copy out by the maximum allocation size, but I see no real way
around that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 13:25:29 -05:00
Christoph Hellwig
4be4a00fb5 xfs: a couple getbmap cleanups
- reshuffle various conditionals for data vs attr fork to make the code
   more readable
 - do fine-grainded goto-based error handling
 - exit early from conditionals instead of keeping a long else branch around
 - allow kmem_alloc to fail

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 10:00:01 -05:00
Olaf Weber
2ac00af7a6 xfs: add more checks to superblock validation
There had been reports where xfs filesystem was randomly
corrupted with fsfuzzer, and xfs failed to handle it
gracefully. This patch fixes couple of reported problem
by providing additional checks in the superblock
validation routine.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 09:24:29 -05:00
Lachlan McIlroy
f25181f598 xfs_file_last_byte() needs to acquire ilock
We had some systems crash with this stack:

[<a00000010000cb20>] ia64_leave_kernel+0x0/0x280
[<a00000021291ca00>] xfs_bmbt_get_startoff+0x0/0x20 [xfs]
[<a0000002129080b0>] xfs_bmap_last_offset+0x210/0x280 [xfs]
[<a00000021295b010>] xfs_file_last_byte+0x70/0x1a0 [xfs]
[<a00000021295b200>] xfs_itruncate_start+0xc0/0x1a0 [xfs]
[<a0000002129935f0>] xfs_inactive_free_eofblocks+0x290/0x460 [xfs]
[<a000000212998fb0>] xfs_release+0x1b0/0x240 [xfs]
[<a0000002129ad930>] xfs_file_release+0x70/0xa0 [xfs]
[<a000000100162ea0>] __fput+0x1a0/0x420
[<a000000100163160>] fput+0x40/0x60

The problem here is that xfs_file_last_byte() does not acquire the
inode lock and can therefore race with another thread that is modifying
the extext list.  While xfs_bmap_last_offset() is trying to lookup
what was the last extent some extents were merged and the extent list
shrunk so the index we lookup is now beyond the end of the extent list
and potentially in a freed buffer.

Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-04-29 09:14:10 -05:00
Li Zefan
0e639bdeef xfs: use memdup_user()
Remove open-coded memdup_user()

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-20 23:02:51 -04:00
Linus Torvalds
3c1795cc4b Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: remove xfs_flush_space
  xfs: flush delayed allcoation blocks on ENOSPC in create
  xfs: block callers of xfs_flush_inodes() correctly
  xfs: make inode flush at ENOSPC synchronous
  xfs: use xfs_sync_inodes() for device flushing
  xfs: inform the xfsaild of the push target before sleeping
  xfs: prevent unwritten extent conversion from blocking I/O completion
  xfs: fix double free of inode
  xfs: validate log feature fields correctly
2009-04-13 14:35:13 -07:00
Felix Blyakher
dc2a5536d6 Merge branch 'master' into for-linus 2009-04-09 14:12:07 -05:00
Dave Chinner
8de2bf937a xfs: remove xfs_flush_space
The only thing we need to do now when we get an ENOSPC condition during delayed
allocation reservation is flush all the other inodes with delalloc blocks on
them and retry without EOF preallocation. Remove the unneeded mess that is
xfs_flush_space() and just call xfs_flush_inodes() directly from
xfs_iomap_write_delay().

Also, change the location of the retry label to avoid trying to do EOF
preallocation because we don't want to do that at ENOSPC. This enables us to
remove the BMAPI_SYNC flag as it is no longer used.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:49:12 +02:00
Dave Chinner
153fec43ce xfs: flush delayed allcoation blocks on ENOSPC in create
If we are creating lots of small files, we can fail to get
a reservation for inode create earlier than we should due to
EOF preallocation done during delayed allocation reservation.
Hence on the first reservation ENOSPC failure flush all the
delayed allocation blocks out of the system and retry.

This fixes the last commonly triggered spurious ENOSPC issue
that has been reported.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:48:30 +02:00
Dave Chinner
e43afd72d2 xfs: block callers of xfs_flush_inodes() correctly
xfs_flush_inodes() currently uses a magic timeout to wait for
some inodes to be flushed before returning. This isn't
really reliable but used to be the best that could be done
due to deadlock potential of waiting for the entire flush.

Now the inode flush is safe to execute while we hold page
and inode locks, we can wait for all the inodes to flush
synchronously. Convert the wait mechanism to a completion
to do this efficiently. This should remove all remaining
spurious ENOSPC errors from the delayed allocation reservation
path.

This is extracted almost line for line from a larger patch
from Mikulas Patocka.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:47:27 +02:00
Dave Chinner
5825294edd xfs: make inode flush at ENOSPC synchronous
When we are writing to a single file and hit ENOSPC, we trigger a background
flush of the inode and try again.  Because we hold page locks and the iolock,
the flush won't proceed until after we release these locks. This occurs once
we've given up and ENOSPC has been reported. Hence if this one is the only
dirty inode in the system, we'll get an ENOSPC prematurely.

To fix this, remove the async flush from the allocation routines and move
it to the top of the write path where we can do a synchronous flush
and retry the write again. Only retry once as a second ENOSPC indicates
that we really are ENOSPC.

This avoids a page cache deadlock when trying to do this flush synchronously
in the allocation layer that was identified by Mikulas Patocka.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:45:44 +02:00
Dave Chinner
a8d770d987 xfs: use xfs_sync_inodes() for device flushing
Currently xfs_device_flush calls sync_blockdev() which is
a no-op for XFS as all it's metadata is held in a different
address to the one sync_blockdev() works on.

Call xfs_sync_inodes() instead to flush all the delayed
allocation blocks out. To do this as efficiently as possible,
do it via two passes - one to do an async flush of all the
dirty blocks and a second to wait for all the IO to complete.
This requires some modification to the xfs-sync_inodes_ag()
flush code to do efficiently.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:44:54 +02:00
Dave Chinner
9d7fef74b2 xfs: inform the xfsaild of the push target before sleeping
When trying to reserve log space, we find the amount of space
we need, then go to sleep waiting for space. When we are
woken, we try to push the tail of the log forward to make
sure we have space available.

Unfortunately, this means that if there is not space available, and
everyone who needs space goes to sleep there is no-one left to push
the tail of the log to make space available. Once we have a thread
waiting for space to become available, the others queue up behind
it in a FIFO, and none of them push the tail of the log.

This can result in everyone going to sleep in xlog_grant_log_space()
if the first sleeper races with the last I/O that moves the tail
of the log forward. With no further I/O tomove the tail of the log,
there is nothing to wake the sleepers and hence all transactions
just stop.

Fix this by making sure the xfsaild will create enough space for the
transaction that is about to sleep by moving the push target far
enough forwards to ensure that that the curent proceeees will have
enough space available when it is woken. That is, we push the
AIL before we go to sleep.

Because we've inserted the log ticket into the queue before we've
pushed and gone to sleep, subsequent transactions will wait behind
this one. Hence we are guaranteed to have space available when we
are woken.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:42:59 +02:00
Dave Chinner
c626d174cf xfs: prevent unwritten extent conversion from blocking I/O completion
Unwritten extent conversion can recurse back into the filesystem due
to memory allocation. Memory reclaim requires I/O completions to be
processed to allow the callers to make progress. If the I/O
completion workqueue thread is doing the recursion, then we have a
deadlock situation.

Move unwritten extent completion into it's own workqueue so it
doesn't block I/O completions for normal delayed allocation or
overwrite data.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:42:11 +02:00
Dave Chinner
705db3fd46 xfs: fix double free of inode
If we fail to initialise the VFS inode in inode_init_always(),
it will call ->delete_inode internally resulting in the inode being
freed. Hence we need to delay the call to inode_init_always()
until after the XFS inode is sufficient set up to handle a
call to ->delete_inode, and then if that fails do not touch
the inode again at all as it has been freed.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:40:17 +02:00
Dave Chinner
a6cb767e24 xfs: validate log feature fields correctly
If the large log sector size feature bit is set in the
superblock by accident (say disk corruption), the then
fields that are now considered valid are not checked on
production kernels. The checks are present as ASSERT
statements so cause a panic on a debug kernel.

Change this so that the fields are validity checked if
the feature bit is set and abort the log mount if the
fields do not contain valid values.

Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-04-06 18:39:27 +02:00
Linus Torvalds
ac7c1a776d Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (61 commits)
  Revert "xfs: increase the maximum number of supported ACL entries"
  xfs: cleanup uuid handling
  xfs: remove m_attroffset
  xfs: fix various typos
  xfs: pagecache usage optimization
  xfs: remove m_litino
  xfs: kill ino64 mount option
  xfs: kill mutex_t typedef
  xfs: increase the maximum number of supported ACL entries
  xfs: factor out code to find the longest free extent in the AG
  xfs: kill VN_BAD
  xfs: kill vn_atime_* helpers.
  xfs: cleanup xlog_bread
  xfs: cleanup xlog_recover_do_trans
  xfs: remove another leftover of the old inode log item format
  xfs: cleanup log unmount handling
  Fix xfs debug build breakage by pushing xfs_error.h after
  xfs: include header files for prototypes
  xfs: make symbols static
  xfs: move declaration to header file
  ...
2009-04-03 09:52:29 -07:00
Linus Torvalds
8fe74cf053 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  Remove two unneeded exports and make two symbols static in fs/mpage.c
  Cleanup after commit 585d3bc06f
  Trim includes of fdtable.h
  Don't crap into descriptor table in binfmt_som
  Trim includes in binfmt_elf
  Don't mess with descriptor table in load_elf_binary()
  Get rid of indirect include of fs_struct.h
  New helper - current_umask()
  check_unsafe_exec() doesn't care about signal handlers sharing
  New locking/refcounting for fs_struct
  Take fs_struct handling to new file (fs/fs_struct.c)
  Get rid of bumping fs_struct refcount in pivot_root(2)
  Kill unsharing fs_struct in __set_personality()
2009-04-02 21:09:10 -07:00
Felix Blyakher
f36345ff9a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus 2009-04-01 16:58:39 -05:00
Nick Piggin
c2ec175c39 mm: page_mkwrite change prototype to match fault
Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg.  virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix.  And will also make it easier to
merge page_mkwrite() with fault() in future.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:14 -07:00
Al Viro
ce3b0f8d5c New helper - current_umask()
current->fs->umask is what most of fs_struct users are doing.
Put that into a helper function.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31 23:00:26 -04:00
Felix Blyakher
1aacc064e0 Revert "xfs: increase the maximum number of supported ACL entries"
This reverts commit 8b11217173.
Premature unintended commit.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-31 00:23:37 -05:00
Felix Blyakher
5123bc35d2 Merge branch 'master' of git://git.kernel.org/pub/scm/fs/xfs/xfs 2009-03-30 22:17:44 -05:00
Christoph Hellwig
27174203f5 xfs: cleanup uuid handling
The uuid table handling should not be part of a semi-generic uuid library
but in the XFS code using it, so move those bits to xfs_mount.c and
refactor the whole glob to make it a proper abstraction.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-30 10:21:31 +02:00
Christoph Hellwig
1a5902c5d2 xfs: remove m_attroffset
With the upcoming v3 inodes the default attroffset needs to be calculated
for each specific inode, so we can't cache it in the superblock anymore.

Also replace the assert for wrong inode sizes with a proper error check
also included in non-debug builds.  Note that the ENOSYS return for
that might seem odd, but that error is returned by xfs_mount_validate_sb
for all theoretically valid but not supported filesystem geometries.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
2009-03-29 19:26:46 +02:00
Malcolm Parsons
9da096fd13 xfs: fix various typos
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-03-29 09:55:42 +02:00
Hisashi Hifumi
bddaafa11a xfs: pagecache usage optimization
Hi.

I introduced "is_partially_uptodate" aops for XFS.

A page can have multiple buffers and even if a page is not uptodate,
some buffers can be uptodate on pagesize != blocksize environment.

This aops checks that all buffers which correspond to a part of a file
that we want to read are uptodate. If so, we do not have to issue actual
read IO to HDD even if a page is not uptodate because the portion we
want to read are uptodate.

"block_is_partially_uptodate" function is already used by ext2/3/4.
With the following patch random read/write mixed workloads or random read
after random write workloads can be optimized and we can get performance
improvement.

I did a performance test using the sysbench.

#sysbench --num-threads=4 --max-requests=100000 --test=fileio --file-num=1 \
--file-block-size=8K --file-total-size=1G --file-test-mode=rndrw \
--file-fsync-freq=0 --file-rw-ratio=0.5 run

-2.6.29-rc6
Test execution summary:
    total time:                          123.8645s
    total number of events:              100000
    total time taken by event execution: 442.4994
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0044s
         max:                            0.3387s
         approx.  95 percentile:         0.0118s

-2.6.29-rc6-patched
Test execution summary:
    total time:                          108.0757s
    total number of events:              100000
    total time taken by event execution: 417.7505
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0042s
         max:                            0.3217s
         approx.  95 percentile:         0.0118s

arch: ia64
pagesize: 16k
blocksize: 4k

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:53:38 +02:00
Christoph Hellwig
6447c36209 xfs: remove m_litino
With the upcoming v3 inodes the inode data/attr area size needs to be
calculated for each specific inode, so we can't cache it in the superblock
anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:14 +02:00
Christoph Hellwig
a19d9f887d xfs: kill ino64 mount option
The ino64 mount option adds a fixed offset to 32bit inode numbers
to bring them into the 64bit range.  There's no need for this kind
of debug tool given that it's easy to produce real 64bit inode numbers
for testing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:08 +02:00
Christoph Hellwig
a0b0b8a5b3 xfs: kill mutex_t typedef
People continue to complain about this for weird reasons, but there's
really no point in keeping this typedef for a couple of users anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-03-29 09:51:00 +02:00
Felix Blyakher
8b11217173 xfs: increase the maximum number of supported ACL entries
With big installation current 25 maximum number of
supported ACL entries is not enough any more.
Increase the limit to 100.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-27 17:28:43 -05:00
Dave Chinner
6cc87645e2 xfs: factor out code to find the longest free extent in the AG
Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-03-16 08:29:46 +01:00
Christoph Hellwig
cb4c8cc1e9 xfs: kill VN_BAD
Remove this rather pointless wrapper and use is_bad_inode directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:25:25 +01:00
Christoph Hellwig
8fab451e3c xfs: kill vn_atime_* helpers.
Two out of three are unused already, and the third is better done open-coded
with a comment describing what's going on here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:24:46 +01:00
Christoph Hellwig
076e6acb8f xfs: cleanup xlog_bread
Most callers of xlog_bread need to call xlog_align to get the actual offset.
Consolidate that call into the main xlog_bread and provide a _xlog_bread
for those few that don't want the actual offset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:24:13 +01:00
Christoph Hellwig
ff0205e032 xfs: cleanup xlog_recover_do_trans
Change the big if-elsif-else block handling the different item types
into a more natural switch, remove assignments in conditionals and
remove an out of place comment from centuries ago on IRIX.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:20:52 +01:00
Christoph Hellwig
dd0bbad81c xfs: remove another leftover of the old inode log item format
There's another little snipplet of code left from the handling of the old
inode log item format in xlog_recover_do_inode_trans.  Kill it as it
can't be reached anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:19:59 +01:00
Christoph Hellwig
21b699c895 xfs: cleanup log unmount handling
Kill the current xfs_log_unmount wrapper and opencode the two function
calls in the only caller.  Rename the current xfs_log_unmount_dealloc to
xfs_log_unmount as it undoes xfs_log_mount and the new name makes that
more clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:19:29 +01:00
Felix Blyakher
da5309cd28 Fix xfs debug build breakage by pushing xfs_error.h after
xfs_mount.h, which it depends on.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-15 08:10:25 -05:00
Christoph Hellwig
c141b2928f xfs: only issues a cache flush on unmount if barriers are enabled
Currently we unconditionally issue a flush from xfs_free_buftarg, but
since 2.6.29-rc1 this gives a warning in the style of

	end_request: I/O error, dev vdb, sector 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:35:12 -06:00
Christoph Hellwig
7d46be4a25 xfs: prevent lockdep false positive in xfs_iget_cache_miss
The inode can't be locked by anyone else as we just created it a few
lines above and it's not been added to any lookup data structure yet.

So use a trylock that must succeed to get around the lockdep warnings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:34:59 -06:00
Christoph Hellwig
ff392c497b xfs: prevent kernel crash due to corrupted inode log format
Andras Korn reported an oops on log replay causes by a corrupted
xfs_inode_log_format_t passing a 0 size to kmem_zalloc.  This patch handles
to small or too large numbers of log regions gracefully by rejecting the
log replay with a useful error message.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andras Korn <korn-sgi.com@chardonnay.math.bme.hu>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:34:45 -06:00
Hannes Eder
7bf446f8b5 xfs: include header files for prototypes
Fix this sparse warnings:
  fs/xfs/linux-2.6/xfs_ioctl.c:72:1: warning: symbol 'xfs_find_handle' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:249:1: warning: symbol 'xfs_open_by_handle' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:361:1: warning: symbol 'xfs_readlink_by_handle' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:496:1: warning: symbol 'xfs_attrmulti_attr_get' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:525:1: warning: symbol 'xfs_attrmulti_attr_set' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:555:1: warning: symbol 'xfs_attrmulti_attr_remove' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:657:1: warning: symbol 'xfs_ioc_space' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_ioctl.c:1340:1: warning: symbol 'xfs_file_ioctl' was not declared. Should it be static?
  fs/xfs/support/debug.c:65:1: warning: symbol 'xfs_fs_vcmn_err' was not declared. Should it be static?
  fs/xfs/support/debug.c:112:1: warning: symbol 'xfs_hex_dump' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:21:16 -06:00
Hannes Eder
3180e66d77 xfs: make symbols static
Instead of the keyword 'static' the macro 'STATIC' is used, so the
symbols are still global with CONFIG_XFS_DEBUG.

Fix this sparse warnings:
  fs/xfs/linux-2.6/xfs_super.c:638:1: warning: symbol 'xfs_blkdev_get' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_super.c:655:1: warning: symbol 'xfs_blkdev_put' was not declared. Should it be static?
  fs/xfs/linux-2.6/xfs_super.c:876:1: warning: symbol 'xfsaild' was not declared. Should it be static?
  fs/xfs/xfs_bmap.c:6208:1: warning: symbol 'xfs_check_block' was not declared. Should it be static?
  fs/xfs/xfs_dir2_leaf.c:553:1: warning: symbol 'xfs_dir2_leaf_check' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:20:56 -06:00
Hannes Eder
24418492aa xfs: move declaration to header file
Fix this sparse warning:
  fs/xfs/xfs_da_btree.c:1550:26: warning: symbol 'xfs_default_nameops' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-06 17:20:15 -06:00
Christoph Hellwig
b79631330a xfs: only issues a cache flush on unmount if barriers are enabled
Currently we unconditionally issue a flush from xfs_free_buftarg, but
since 2.6.29-rc1 this gives a warning in the style of

	end_request: I/O error, dev vdb, sector 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-04 07:31:55 -06:00
Christoph Hellwig
ed93ec3907 xfs: prevent lockdep false positive in xfs_iget_cache_miss
The inode can't be locked by anyone else as we just created it a few
lines above and it's not been added to any lookup data structure yet.

So use a trylock that must succeed to get around the lockdep warnings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-04 07:31:48 -06:00
Christoph Hellwig
e8fa6b483f xfs: prevent kernel crash due to corrupted inode log format
Andras Korn reported an oops on log replay causes by a corrupted
xfs_inode_log_format_t passing a 0 size to kmem_zalloc.  This patch handles
to small or too large numbers of log regions gracefully by rejecting the
log replay with a useful error message.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Andras Korn <korn-sgi.com@chardonnay.math.bme.hu>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-04 07:31:42 -06:00
Felix Blyakher
27e88bf6af Revert "[XFS] remove old vmap cache"
This reverts commit d2859751cd.

This commit caused regression. We'll try to fix use of new
vmap API for next release.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-19 13:15:55 -06:00
Felix Blyakher
7fdf582447 Revert "[XFS] use scalable vmap API"
This reverts commit 95f8e302c0.

This commit caused regression. We'll try to fix use of new
vmap API for next release.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-19 13:15:44 -06:00
Felix Blyakher
3a011a1719 Revert "[XFS] remove old vmap cache"
This reverts commit d2859751cd.

This commit caused regression. We'll try to fix use of new
vmap API for next release.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-18 15:57:51 -06:00
Felix Blyakher
cf7dab8017 Revert "[XFS] use scalable vmap API"
This reverts commit 95f8e302c0.

This commit caused regression. We'll try to fix use of new
vmap API for next release.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-18 15:41:28 -06:00
Christoph Hellwig
7c8f7af67d xfs: reject swapext ioctl on swapfiles
Swapfiles are magic - I/O is directly initialized by the VM without
involving the filesystem.  Swapping out extents underneath the VM thus
can cause severe problems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-12 19:56:00 +01:00
Christoph Hellwig
264307520b xfs: fix error handling in xfs_log_mount
We can't just call xfs_log_unmount_dealloc on any failure because the
ail thread which is torn down by xfs_log_unmount_dealloc might not
be initialized yet.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Reported-by: Lachlan McIlroy <lachlan@sgi.com>
2009-02-12 19:55:48 +01:00
Christoph Hellwig
fcafb71b57 xfs: get rid of indirections in the quotaops implementation
Currently we call from the nicely abstracted linux quotaops into a ugly
multiplexer just to split the calls out at the same boundary again.
Rewrite the quota ops handling to remove that obfucation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:47:34 +01:00
Christoph Hellwig
c9a192dcf9 xfs: sanitize qh_lock wrappers
Get rid of various obsfucating wrappers for accessing the quota hash lock,
we only keep the accessors for accessing the mplist and freelist locks as
they encode a multi-level datastructure walk.  But make sure all of them
are defined in the same way as simple macros.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:47:22 +01:00
Christoph Hellwig
7201813bf5 xfs: use mutex_is_locked in XFS_DQ_IS_LOCKED
Now that we have a helper to test if a mutex is held use it instead of our
own little hacks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:39:24 +01:00
Christoph Hellwig
e249458220 xfs: remove XFS_QM_LOCK/XFS_QM_UNLOCK/XFS_QM_HOLD/XFS_QM_RELE
Remove these macros which only obsfucated the code in rather nast ways.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:38:39 +01:00
Christoph Hellwig
517b5e8c85 xfs: merge xfs_mkdir into xfs_create
xfs_create and xfs_mkdir only have minor differences, so merge both of them
into a sigle function.  While we're at it also make the error handling code
more straight-forward.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Dave Chinner <david@fromorbit.com>
2009-02-09 08:38:02 +01:00
Christoph Hellwig
a568778739 xfs: remove uchar_t/ushort_t/uint_t/ulong_t types
Just another set of types obsfucating the code, remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:37:39 +01:00
Christoph Hellwig
0d87e656dd xfs: remove superflous inobt macros
xfs_ialloc_btree.h has a a cuple of macros that only obsfucate the code
but don't provide any abstraction benefits.  This patches removes those
and cleans up the reamaining defintions up a little.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:37:14 +01:00
Christoph Hellwig
7153f8ba2b xfs: remove iclog calculation special cases
Our default has been to always use 8 32KB log buffers for a while now, so
remove the special casing for larger block size filesystem to use the same
or even lower number of buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-09 08:36:46 +01:00
Christoph Hellwig
8e9b6e7fa4 xfs: remove the unused XFS_QMOPT_DQLOCK flag
The XFS_QMOPT_DQLOCK flag introduces major complexity in the quota subsystem
but isn't actually used anywhere.  So remove it and all the hazzles it
introduces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-08 21:51:42 +01:00
Christoph Hellwig
4346cdd464 xfs: cleanup xfs_find_handle
Remove the superflous igrab by keeping a reference on the path/file all the
time and clean up various bits of surrounding code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-08 21:51:14 +01:00
Josef 'Jeff' Sipek
ef8f7fc549 xfs: cleanup error handling in xfs_swap_extents
Use multiple lables for proper error unwinding and get rid of some now
superflous variables.

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:37:43 +01:00
Christoph Hellwig
d4bb6d0698 xfs: merge xfs_inode_flush into xfs_fs_write_inode
Splitting the task for a VFS-induced inode flush into two functions doesn't
make any sense, so merge the two functions dealing with it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-02-04 09:36:19 +01:00
Christoph Hellwig
e1486dea0b xfs: factor out attr fork reset handling
We currently duplicate code to reset the attribute fork after the last
attribute has been deleted.  Factor this out into a small helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:36:00 +01:00
Christoph Hellwig
c52e9fd8a9 xfs: remove unused XFS_MOUNT_ILOCK/XFS_MOUNT_IUNLOCK
These aren't only unused but also reference a lock that doesn't exist anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:34:34 +01:00
Christoph Hellwig
cb3f35bb3b xfs: tiny cleanup for xfs_link
The source and target inodes are guaranteed to never be the same by the VFS,
so no need to check for that (and we would get into bad trouble later anyway
if that were the case).  Also clean up the error handling to use two gotos
instead of nested conditions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:34:20 +01:00
Christoph Hellwig
b93b6e434c xfs: make sure to free the real-time inodes in the mount error path
When mount fails after allocating the real-time inodes we currently leak
them.  Add a new helper to free the real-time inodes which can be used by
both the mount and unmount path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:33:58 +01:00
Christoph Hellwig
f9057e3da7 xfs: cleanup error handling in xfs_mountfs:
Clean up the error handling in xfs_mountfs.  Use readable goto label names,
simplify the uuid handling and other error conditions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
2009-02-04 09:31:52 +01:00
Felix Blyakher
43f3f057c5 [XFS] Warn on transaction in flight on read-only remount
Till VFS can correctly support read-only remount without racing,
use WARN_ON instead of BUG_ON on detecting transaction in flight
after quiescing filesystem.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-02-03 11:04:54 -06:00
Dave Chinner
6139a23609 xfs: Check buffer lengths in log recovery
Before trying to obtain, read or write a buffer,
check that the buffer length is actually valid. If
it is not valid, then something read in the recovery
process has been corrupted and we should abort
recovery.

Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-03 11:01:32 -06:00
Dave Chinner
3228149ceb xfs: Check buffer lengths in log recovery
Before trying to obtain, read or write a buffer,
check that the buffer length is actually valid. If
it is not valid, then something read in the recovery
process has been corrupted and we should abort
recovery.

Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-02-03 10:19:33 -06:00
Eric Sandeen
f0e0059b9c don't reallocate sxp variable passed into xfs_swapext
fixes kernel.org bugzilla 12538, xfs_fsr fails on 2.6.29-rc kernels

Regression caused by 743bb4650d

This was an embarrasing mistake, reallocating the sxp pointer passed
in from the main ioctl switch.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net
Reported-by: Paul Martin <pm@debian.org>
Tested-by: Paul Martin <pm@debian.org>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-01-27 14:51:39 -06:00
Eric Sandeen
ac12b4e25e don't reallocate sxp variable passed into xfs_swapext
fixes kernel.org bugzilla 12538, xfs_fsr fails on 2.6.29-rc kernels

Regression caused by 743bb4650d

This was an embarrasing mistake, reallocating the sxp pointer passed
in from the main ioctl switch.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net
Reported-by: Paul Martin <pm@debian.org>
Tested-by: Paul Martin <pm@debian.org>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-01-27 13:59:43 -06:00
Felix Blyakher
5e1065726e [XFS] Warn on transaction in flight on read-only remount
Till VFS can correctly support read-only remount without racing,
use WARN_ON instead of BUG_ON on detecting transaction in flight
after quiescing filesystem.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-01-27 13:37:24 -06:00
Dave Chinner
74e2d06521 Long btree pointers are still 64 bit on disk
[XFS] Long btree pointers are still 64 bit on disk

On 32 bit machines with CONFIG_LBD=n, XFS reduces the
in memory size of xfs_fsblock_t to 32 bits so that it
will fit within 32 bit addressing. However, the disk format
for long btree pointers are still 64 bits in size.

The recent btree rewrite failed to take this into account
when initialising new btree blocks, setting sibling pointers
to NULL and checking if they are NULL. Hence checking whether
a 64 bit NULL was the same as a 32 bit NULL was failingi
resulting in NULL sibling pointers failing to be detected
correctly. This showed up as WANT_CORRUPTED_GOTO shutdowns
in xfs_btree_delrec.

Fix this by making all the comparisons and setting of long
pointer btree NULL blocks to the disk format, not the
in memory format. i.e. use NULLDFSBNO.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-by: Jacek Luczak <difrost.kernel@gmail.com>
Reported-by: Danny ter Haar <dth@dth.net>
Tested-by: Jacek Luczak <difrost.kernel@gmail.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-01-22 01:23:11 -06:00
Felix Blyakher
957274d7ce Merge branch 'master' of git+ssh://oss.sgi.com/oss/git/xfs/xfs 2009-01-21 22:39:29 -06:00
Eric Sandeen
5253a11a81 [XFS] remove always-true #ifndef HAVE_FORMAT32 tests
There are several tests for #ifndef HAVE_FORMAT32, but
this is never defined anywhere so it is always the default
behavior; just remove the ifndef goop.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-22 14:07:31 +11:00
Dave Chinner
33ad965dde Long btree pointers are still 64 bit on disk
[XFS] Long btree pointers are still 64 bit on disk

On 32 bit machines with CONFIG_LBD=n, XFS reduces the
in memory size of xfs_fsblock_t to 32 bits so that it
will fit within 32 bit addressing. However, the disk format
for long btree pointers are still 64 bits in size.

The recent btree rewrite failed to take this into account
when initialising new btree blocks, setting sibling pointers
to NULL and checking if they are NULL. Hence checking whether
a 64 bit NULL was the same as a 32 bit NULL was failingi
resulting in NULL sibling pointers failing to be detected
correctly. This showed up as WANT_CORRUPTED_GOTO shutdowns
in xfs_btree_delrec.

Fix this by making all the comparisons and setting of long
pointer btree NULL blocks to the disk format, not the
in memory format. i.e. use NULLDFSBNO.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-by: Jacek Luczak <difrost.kernel@gmail.com>
Reported-by: Danny ter Haar <dth@dth.net>
Tested-by: Jacek Luczak <difrost.kernel@gmail.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-01-21 18:33:46 -06:00
Eric Sandeen
b6e3222732 [XFS] Remove the rest of the macro-to-function indirections.
Remove the last of the macros-defined-to-static-functions.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-19 14:45:55 +11:00
Christoph Hellwig
b828d8c338 xfs: sanity check attr fork size
Recently we have quite a few kerneloops reports about dereferencing a NULL
if_data in the attribute fork.  From looking over the code this can only
happen if we pass a 0 size argument to xfs_iformat_local.  This implies some
sort of corruption and in fact the only mailinglist report about this from
earlier this year was after a powerfail presumably on a system with write
cache and without barriers.

Add a quick sanity check for the attr fork size in xfs_iformat to catch
these early and without an oops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:45:11 +11:00
Christoph Hellwig
49739140e5 xfs: fix bad_features2 fixups for the root filesystem
Currently the bad_features2 fixup and the alignment updates in the superblock
are skipped if we mount a filesystem read-only.  But for the root filesystem
the typical case is to mount read-only first and only later remount writeable
so we'll never perform this update at all.  It's not a big problem but means
the logs of people needing the fixup get spammed at every boot because they
never happen on disk.

Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:45:04 +11:00
Christoph Hellwig
5aa2dc0a06 xfs: add a lock class for group/project dquots
We can have both a user and a group/project dquot locked at the same time,
as long as the user dquot is locked first.  Tell lockdep about that fact
by making the group/project dquots a different lock class.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:44:59 +11:00
Christoph Hellwig
4f2d4ac6e5 xfs: lockdep annotations for xfs_dqlock2
xfs_dqlock2 locks two xfs_dquots, which is fine as it always locks the
dquot with the lower id first.  Use mutex_lock_nested to tell lockdep
about this fact.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:44:52 +11:00
Christoph Hellwig
080dda7f5e xfs: add a separate lock class for the per-mount list of dquots
We can have both a a quota hash chain and the per-mount list locked at
the same time.  But given that both use the same struct dqhash as list
head we have to tell lockdep that they are different lock classes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:44:44 +11:00
Christoph Hellwig
62e194ecda xfs: use mnt_want_write in compat_attrmulti ioctl
The compat version of the attrmulti ioctl needs to ask for and then
later release write access to the mount just like the native version,
otherwise we could potentially write to read-only mounts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:44:30 +11:00
Christoph Hellwig
ab596ad897 xfs: fix dentry aliasing issues in open_by_handle
Open by handle just grabs an inode by handle and then creates itself
a dentry for it.  While this works for regular files it is horribly
broken for directories, where the VFS locking relies on the fact that
there is only just one single dentry for a given inode, and that
these are always connected to the root of the filesystem so that
it's locking algorithms work (see Documentations/filesystems/Locking)

Remove all the existing open by handle code and replace it with a small
wrapper around the exportfs code which deals with all these issues.
At the same time we also make the checks for a valid handle strict
enough to reject all not perfectly well formed handles - given that
we never hand out others that's okay and simplifies the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 14:43:18 +11:00
Christoph Hellwig
2809f76afc xfs: sanity check attr fork size
Recently we have quite a few kerneloops reports about dereferencing a NULL
if_data in the attribute fork.  From looking over the code this can only
happen if we pass a 0 size argument to xfs_iformat_local.  This implies some
sort of corruption and in fact the only mailinglist report about this from
earlier this year was after a powerfail presumably on a system with write
cache and without barriers.

Add a quick sanity check for the attr fork size in xfs_iformat to catch
these early and without an oops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:04:16 +01:00
Christoph Hellwig
7884bc8617 xfs: fix bad_features2 fixups for the root filesystem
Currently the bad_features2 fixup and the alignment updates in the superblock
are skipped if we mount a filesystem read-only.  But for the root filesystem
the typical case is to mount read-only first and only later remount writeable
so we'll never perform this update at all.  It's not a big problem but means
the logs of people needing the fixup get spammed at every boot because they
never happen on disk.

Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:04:07 +01:00
Christoph Hellwig
98b8c7a0c4 xfs: add a lock class for group/project dquots
We can have both a user and a group/project dquot locked at the same time,
as long as the user dquot is locked first.  Tell lockdep about that fact
by making the group/project dquots a different lock class.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:03:25 +01:00
Christoph Hellwig
5bb87a33b2 xfs: lockdep annotations for xfs_dqlock2
xfs_dqlock2 locks two xfs_dquots, which is fine as it always locks the
dquot with the lower id first.  Use mutex_lock_nested to tell lockdep
about this fact.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:03:19 +01:00
Christoph Hellwig
a4edd1da20 xfs: add a separate lock class for the per-mount list of dquots
We can have both a a quota hash chain and the per-mount list locked at
the same time.  But given that both use the same struct dqhash as list
head we have to tell lockdep that they are different lock classes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:03:11 +01:00
Christoph Hellwig
178eae342b xfs: use mnt_want_write in compat_attrmulti ioctl
The compat version of the attrmulti ioctl needs to ask for and then
later release write access to the mount just like the native version,
otherwise we could potentially write to read-only mounts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:03:03 +01:00
Christoph Hellwig
d296d30a99 xfs: fix dentry aliasing issues in open_by_handle
Open by handle just grabs an inode by handle and then creates itself
a dentry for it.  While this works for regular files it is horribly
broken for directories, where the VFS locking relies on the fact that
there is only just one single dentry for a given inode, and that
these are always connected to the root of the filesystem so that
it's locking algorithms work (see Documentations/filesystems/Locking)

Remove all the existing open by handle code and replace it with a small
wrapper around the exportfs code which deals with all these issues.
At the same time we also make the checks for a valid handle strict
enough to reject all not perfectly well formed handles - given that
we never hand out others that's okay and simplifies the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-01-19 02:02:57 +01:00
Eric Sandeen
9d87c3192d [XFS] Remove the rest of the macro-to-function indirections.
Remove the last of the macros-defined-to-static-functions.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-16 17:10:42 +11:00
Lachlan McIlroy
cb7a97d015 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus 2009-01-14 16:29:51 +11:00
Lachlan McIlroy
c088f4e9da Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-01-14 16:29:08 +11:00
Takashi Sato
8e961870bb filesystem freeze: remove XFS specific ioctl interfaces for freeze feature
It removes XFS specific ioctl interfaces and request codes
for freeze feature.

This patch has been supplied by David Chinner.

Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-09 16:54:42 -08:00
Takashi Sato
c4be0c1dc4 filesystem freeze: add error handling of write_super_lockfs/unlockfs
Currently, ext3 in mainline Linux doesn't have the freeze feature which
suspends write requests.  So, we cannot take a backup which keeps the
filesystem's consistency with the storage device's features (snapshot and
replication) while it is mounted.

In many case, a commercial filesystem (e.g.  VxFS) has the freeze feature
and it would be used to get the consistent backup.

If Linux's standard filesystem ext3 has the freeze feature, we can do it
without a commercial filesystem.

So I have implemented the ioctls of the freeze feature.
I think we can take the consistent backup with the following steps.
1. Freeze the filesystem with the freeze ioctl.
2. Separate the replication volume or create the snapshot
   with the storage device's feature.
3. Unfreeze the filesystem with the unfreeze ioctl.
4. Take the backup from the separated replication volume
   or the snapshot.

This patch:

VFS:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they can return an error.
Rename write_super_lockfs and unlockfs of the super block operation
freeze_fs and unfreeze_fs to avoid a confusion.

ext3, ext4, xfs, gfs2, jfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that write_super_lockfs returns an error if needed,
and unlockfs always returns 0.

reiserfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they always return 0 (success) to keep a current behavior.

Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-09 16:54:42 -08:00
Nick Piggin
0087167c9d [XFS] use scalable vmap API
Implement XFS's large buffer support with the new vmap APIs. See the vmap
rewrite (db64fe02) for some numbers. The biggest improvement that comes from
using the new APIs is avoiding the global KVA allocation lock on every call.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 17:09:47 +11:00
Nick Piggin
958f8c0e4f [XFS] remove old vmap cache
XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps
track of them in a list. To purge the batch, it just goes through the list and
calls vunamp on each one. This is pretty poor: a global TLB flush is generally
still performed on each vunmap, with the most expensive parts of the operation
being the broadcast IPIs and locking involved in the SMP callouts, and the
locking involved in the vmap management -- none of these are avoided by just
batching up the calls. I'm actually surprised it ever made much difference.
(Now that the lazy vmap allocator is upstream, this description is not quite
right, but the vunmap batching still doesn't seem to do much)

Rip all this logic out of XFS completely. I will improve vmap performance
and scalability directly in subsequent patch.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 17:09:25 +11:00
Lachlan McIlroy
ce79735c12 Merge branch 'for-linus' of git+ssh://git.melbourne.sgi.com/git/xfs 2009-01-09 16:24:48 +11:00
Christoph Hellwig
058652a37d [XFS] make xfs_ino_t an unsigned long long
Currently xfs_ino_t is defined as a u64 which can either be an unsigned
long long or on some 64 bit platforms and unsigned long.  Just making
it and unsigned long long mean's it's still always 64 bits wide, but we
don't need to resort to cases to print it.

Fixes a warning regression on 64 bit powerpc in current git.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 16:19:14 +11:00
Christoph Hellwig
1544031976 [XFS] truncate readdir offsets to signed 32 bit values
John Stanley reported EOVERFLOW errors in readdir from his self-build
glibc.  I traced this down to glibc enabling d_off overflow checks
in one of the about five million different getdents implementations.

In 2.6.28 Dave Woodhouse moved our readdir double buffering required
for NFS4 readdirplus into nfsd and at that point we lost the capping
of the directory offsets to 32 bit signed values.  Johns glibc used
getdents64 to even implement readdir for normal 32 bit offset dirents,
and failed with EOVERFLOW only if this happens on the first dirent in
a getdents call.  I managed to come up with a testcase that uses
raw getdents and does the EOVERFLOW check manually.  We always hit
it with our last entry due to the special end of directory marker.

The patch below is a dumb version of just putting back the masking,
to make sure we have the same behavior as in 2.6.27 and earlier.

I will work on a better and cleaner fix for 2.6.30.

Reported-by: John Stanley <jpsinthemix@verizon.net>
Tested-by: John Stanley <jpsinthemix@verizon.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 16:18:24 +11:00
Christoph Hellwig
e6edbd1c1c [XFS] fix compile of xfs_btree_readahead_lblock on m68k
Change the left/right variables to the proper always 64bit xfs_dfsbo_t
type because otherwise compilation fails for Geert on m68k without
CONFIG_LBD:

| fs/xfs/xfs_btree.c: In function 'xfs_btree_readahead_lblock':
| fs/xfs/xfs_btree.c:736: warning: comparison is always true due to limited range of data type
| fs/xfs/xfs_btree.c:741: warning: comparison is always true due to limited range of data type

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 16:16:51 +11:00
Eric Sandeen
fb82557f16 [XFS] Remove macro-to-function indirections in the mask code
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 15:53:54 +11:00
Eric Sandeen
c9fb86a917 [XFS] Remove macro-to-function indirections in attr code
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 15:46:44 +11:00
Eric Sandeen
9800b55035 [XFS] Remove several unused typedefs.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 15:46:16 +11:00
Christoph Hellwig
c9a98553d5 [XFS] pass XFS_IGET_BULKSTAT to xfs_iget for handle operations
NFS clients or users of the handle ioctls can pass us arbitrary inode
numbers through the exportfs interface.  Make sure we use the
XFS_IGET_BULKSTAT so that these don't cause shutdowns due to the corruption
checks.  Also translate the EINVAL we get back for invalid inode clusters
into an ESTALE which is more appropinquate, and remove the useless check
for a NULL inode on a successfull xfs_iget return.

I have a testcase to reproduce this using the handle interface which
I will submit to xfsqa.

Reported-by: Mario Becroft <mb@gem.win.co.nz>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-09 15:17:17 +11:00
Lachlan McIlroy
6206aa8b2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-01-08 13:22:55 +11:00
Frederik Schwarzer
025dfdafe7 trivial: fix then -> than typos in comments and documentation
- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06 11:28:06 +01:00
Nick Piggin
95f8e302c0 [XFS] use scalable vmap API
Implement XFS's large buffer support with the new vmap APIs. See the vmap
rewrite (db64fe02) for some numbers. The biggest improvement that comes from
using the new APIs is avoiding the global KVA allocation lock on every call.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-06 14:43:09 +11:00
Nick Piggin
d2859751cd [XFS] remove old vmap cache
XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps
track of them in a list. To purge the batch, it just goes through the list and
calls vunamp on each one. This is pretty poor: a global TLB flush is generally
still performed on each vunmap, with the most expensive parts of the operation
being the broadcast IPIs and locking involved in the SMP callouts, and the
locking involved in the vmap management -- none of these are avoided by just
batching up the calls. I'm actually surprised it ever made much difference.
(Now that the lazy vmap allocator is upstream, this description is not quite
right, but the vunmap batching still doesn't seem to do much)

Rip all this logic out of XFS completely. I will improve vmap performance
and scalability directly in subsequent patch.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-06 14:40:44 +11:00
Lachlan McIlroy
0a8c5395f9 [XFS] Fix merge failures
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

	fs/xfs/linux-2.6/xfs_cred.h
	fs/xfs/linux-2.6/xfs_globals.h
	fs/xfs/linux-2.6/xfs_ioctl.c
	fs/xfs/xfs_vnodeops.h

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-29 16:47:18 +11:00
James Morris
cbacc2c7f0 Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
Lachlan McIlroy
25051158bb [XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI
The iolock is dropped and re-acquired around the call to XFS_SEND_NAMESP().
While the iolock is released the file can become cached.  We then
'goto retry' and - if we are doing direct I/O - mapping->nrpages may now be
non zero but need_i_mutex will be zero and we will hit the WARN_ON().

Since we have dropped the I/O lock then the file size may have also changed
so what we need to do here is 'goto start' like we do for the XFS_SEND_DATA()
DMAPI event.

We also need to update the filesize before releasing the iolock so that
needs to be done before the XFS_SEND_NAMESP event.  If we drop the iolock
before setting the filesize we could race with a truncate.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-24 14:07:32 +11:00
Christoph Hellwig
ad1ad968f4 [XFS] handle unaligned data in xfs_bmbt_disk_get_all
In libxfs xfs_bmbt_disk_get_all needs to handle unaligned data and thus
has been updated to use get_unaligned_be64.  In kernelspace we don't strictly
need it as the routine is only used for tracing and xfsidbg, but let's keep
the two implementations in sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-23 11:54:46 +11:00
Christoph Hellwig
efc557570d [XFS] avoid memory allocations in xfs_fs_vcmn_err
xfs_fs_vcmn_err can be called under a spinlock, but does a sleeping memory
allocation to create buffer for it's internal sprintf.  Fortunately it's
the only caller of icmn_err, so we can merge the two and have one single
static buffer and spinlock protecting it.  While we're at it make sure
we proper __attribute__ format annotations so that the compiler can detect
mismatched format strings.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 18:02:01 +11:00
Lachlan McIlroy
9f6c92b9cc [XFS] Fix speculative allocation beyond eof
Speculative allocation beyond eof doesn't work properly.  It was
broken some time ago after a code cleanup that moved what is now
xfs_iomap_eof_align_last_fsb() and xfs_iomap_eof_want_preallocate()
out of xfs_iomap_write_delay() into separate functions.  The code
used to use the current file size in various checks but got changed
to be max(file_size, i_new_size).  Since i_new_size is the result
of 'offset + count' then in xfs_iomap_eof_want_preallocate() the
check for '(offset + count) <= isize' will always be true.

ie if 'offset + count' is > ip->i_size then isize will be i_new_size
and equal to 'offset + count'.

This change fixes all the places that used to use the current file
size.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:56:49 +11:00
Lachlan McIlroy
4fdc778179 [XFS] Remove XFS_BUF_SHUT() and friends
Code does nothing so remove it.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:52:58 +11:00
Lachlan McIlroy
d415867e0a [XFS] Use the incore inode size in xfs_file_readdir()
We should be using the incore inode size here not the linux inode
size.  The incore inode size is always up to date for directories
whereas the linux inode size is not updated for directories.

We've hit assertions in xfs_bmap() and traced it back to the linux
inode size being zero but the incore size being correct.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:50:56 +11:00
Lachlan McIlroy
4d9d4ebf5d Merge branch 'master' of git+ssh://git.melbourne.sgi.com/git/xfs 2008-12-12 15:28:02 +11:00
Lachlan McIlroy
cfbe52672f [XFS] set b_error from bio error in xfs_buf_bio_end_io
Preserve any error returned by the bio layer.

Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-12 15:27:25 +11:00
Christoph Hellwig
c4cd747ee6 [XFS] use inode_change_ok for setattr permission checking
Instead of implementing our own checks use inode_change_ok to check for
necessary permission in setattr.  There is a slight change in behaviour
as inode_change_ok doesn't allow i_mode updates to add the suid or sgid
without superuser privilegues while the old XFS code just stripped away
those bits from the file mode.

(First sent on Semptember 29th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11 13:15:10 +11:00
Christoph Hellwig
4d4be482a4 [XFS] add a FMODE flag to make XFS invisible I/O less hacky
XFS has a mode called invisble I/O that doesn't update any of the
timestamps.  It's used for HSM-style applications and exposed through
the nasty open by handle ioctl.

Instead of doing directly assignment of file operations that set an
internal flag for it add a new FMODE_NOCMTIME flag that we can check
in the normal file operations.

(addition of the generic VFS flag has been ACKed by Al as an interims
 solution)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11 13:14:41 +11:00
Christoph Hellwig
6d73cf133c [XFS] resync headers with libxfs
- xfs_sb.h add the XFS_SB_VERSION2_PARENTBIT features2 that has been
   around in userspace for some time
 - xfs_inode.h: move a few things out of __KERNEL__ that are needed by
   userspace
 - xfs_mount.h: only include xfs_sync.h under __KERNEL__
 - xfs_inode.c: minor whitespace fixup.  I accidentaly changes this when
   importing this file for use by userspace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11 13:14:17 +11:00
Christoph Hellwig
2175dd9574 [XFS] simplify projid check in xfs_rename
Check for the project ID after attaching all inodes to the transaction.
That way the unlock in the error case is done by the transaction subsystem,
which guaratees that is uses the right flags (which was wrong from day one
of this check), and avoids having special code unlocking an array of inodes
with potential duplicates.  Attaching the inode first is the method used
by xfs_rename and the other namespace methods all other error that require
multiple locked inodes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11 13:13:52 +11:00
Christoph Hellwig
15ac08a8b2 [XFS] replace b_fspriv with b_mount
Replace the b_fspriv pointer and it's ugly accessors with a properly types
xfs_mount pointer.  Also switch log reocvery over to it instead of using
b_fspriv for the mount pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-11 13:13:33 +11:00
Lachlan McIlroy
e055f13a6d [XFS] Remove unused tracing code
None of this code appears to be used anywhere so remove it.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-10 11:51:54 +11:00
Dave Chinner
576a488a27 [XFS] Fix hang after disallowed rename across directory quota domains
When project quota is active and is being used for directory tree
quota control, we disallow rename outside the current directory
tree. This requires a check to be made after all the inodes
involved in the rename are locked. We fail to unlock the inodes
correctly if we disallow the rename when the target is outside the
current directory tree. This results in a hang on the next access
to the inodes involved in failed rename.

Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-05 15:39:13 +11:00
Lachlan McIlroy
797eaed40e [XFS] Remove unnecessary assertion
Hit this assert because an inode was tagged with XFS_ICI_RECLAIM_TAG but
not XFS_IRECLAIMABLE|XFS_IRECLAIM.  This is because xfs_iget_cache_hit()
first clears XFS_IRECLAIMABLE and then calls __xfs_inode_clear_reclaim_tag()
while only holding the pag_ici_lock in read mode so we can race with
xfs_reclaim_inodes_ag().  Looks like xfs_reclaim_inodes_ag() will do the
right thing anyway so just remove the assert.

Thanks to Christoph for pointing out where the problem was.

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
2008-12-05 14:15:49 +11:00
Lachlan McIlroy
a5b429d41f [XFS] Remove unused variable in ktrace_free()
entries_size is probably left over from when we used to pass the
size to kmem_free().

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2008-12-05 13:31:51 +11:00
Lachlan McIlroy
c6422617a1 [XFS] Check return value of xfs_buf_get_noaddr()
We check the return value of all other calls to xfs_buf_get_noaddr().
Make sense to do it here too.

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
2008-12-05 13:16:15 +11:00
Dave Chinner
6a0775a991 [XFS] Fix hang after disallowed rename across directory quota domains
When project quota is active and is being used for directory tree
quota control, we disallow rename outside the current directory
tree. This requires a check to be made after all the inodes
involved in the rename are locked. We fail to unlock the inodes
correctly if we disallow the rename when the target is outside the
current directory tree. This results in a hang on the next access
to the inodes involved in failed rename.

Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-05 12:50:04 +11:00
Christoph Hellwig
8bb57320f3 [XFS] Fix compile with CONFIG_COMPAT enabled
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-05 11:23:10 +11:00
Christoph Hellwig
5a8d0f3c7a move inode tracing out of xfs_vnode.
Move the inode tracing into xfs_iget.c / xfs_inode.h and kill xfs_vnode.c
now that it's empty.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:25 +11:00
Christoph Hellwig
25e41b3d52 move vn_iowait / vn_iowake into xfs_aops.c
The whole machinery to wait on I/O completion is related to the I/O path
and should be there instead of in xfs_vnode.c.  Also give the functions
more descriptive names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:24 +11:00
Christoph Hellwig
583fa586f0 kill vn_ioerror
There's just one caller of this helper, and it's much cleaner to just merge
the xfs_do_force_shutdown call into it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:24 +11:00
Christoph Hellwig
f95099ba5a kill xfs_unmount_flush
There's almost nothing left in this function, instead remove the IRELE
on the real times inodes and the call to XFS_QM_UNMOUNT into xfs_unmountfs.

For the regular unmount case that means it now also happenes after dmapi
notification, but otherwise there is no difference in behaviour.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:24 +11:00
Christoph Hellwig
e57481dc26 no explicit xfs_iflush for special inodes during unmount
Currently we explicitly call xfs_iflush on the quota, real-time and root
inodes from xfs_unmount_flush.  But we just called xfs_sync_inodes with
SYNC_ATTR and do an XFS_bflush aka xfs_flush_buftarg to make sure all inodes
are on disk already, so there is no need for these special cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:23 +11:00
Christoph Hellwig
070c4616ec use xfs_trans_ijoin in xfs_trans_iget
Use xfs_trans_ijoin in xfs_trans_iget in case we need to join an inode into
a transaction instead of opencoding it.  Based on a discussion with and an
incomplete patch from Niv Sardi.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:23 +11:00
Christoph Hellwig
b56757becf remove leftovers of shared read-only support
We never supported shared read-only filesystems, so remove the dead
code left over from IRIX for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:23 +11:00
Christoph Hellwig
e88f11abe0 remove unused m_inode_quiesce member from struct xfs_mount
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:22 +11:00
Christoph Hellwig
6bd16ff270 kill dead inode flags
There are a few inode flags around that aren't used anywhere, so remove
them.  Also update xfsidbg to display all used inode flags correctly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:22 +11:00
Christoph Hellwig
5efcbb853b cleanup xfs_sb.h feature flag helpers
The various inlines in xfs_sb.h that deal with the superblock version
and fature flags were converted from macros a while ago, and this
show by the odd coding style full of useless braces and backslashes
and the avoidance of conditionals.

Clean these up to look like normal C code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:22 +11:00
Christoph Hellwig
df6771bde1 kill dead quota flags
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:22 +11:00
Christoph Hellwig
63ad2a5c4c remove dead code from sv_t implementation
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:21 +11:00
Christoph Hellwig
39e2defe73 reduce l_icloglock roundtrips
All but one caller of xlog_state_want_sync drop and re-acquire
l_icloglock around the call to it, just so that xlog_state_want_sync can
acquire and drop it.

Move all lock operation out of l_icloglock and assert that the lock is
held when it is called.

Note that it would make sense to extende this scheme to
xlog_state_release_iclog, but the locking in there is more complicated
and we'd like to keep the atomic_dec_and_lock optmization for those
callers not having l_icloglock yet.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:21 +11:00
Christoph Hellwig
d9424b3c4a stop using igrab in xfs_vn_link
->link is guranteed to get an already reference inode passed so we
can do a simple increment of i_count instead of using igrab and thus
avoid banging on the global inode_lock.  This is what most filesystems
already do.

Also move the increment after the call to xfs_link to simplify error
handling.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:21 +11:00
Christoph Hellwig
5d765b976c kill xfs_buf_iostart
xfs_buf_iostart is a "shared" helper for xfs_buf_read_flags,
xfs_bawrite, and xfs_bdwrite - except that there isn't much shared
code but rather special cases for each caller.

So remove this function and move the functionality to the caller.
xfs_bawrite and xfs_bdwrite are now big enough to be moved out of
line and the xfs_buf_read_flags is moved into a new helper called
_xfs_buf_read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:20 +11:00
Christoph Hellwig
5cafdeb289 cleanup the inode reclaim path
Merge xfs_iextract and xfs_idestroy into xfs_ireclaim as they are never
called individually.  Also rewrite most comments in this area as they
were severly out of date.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:20 +11:00
Christoph Hellwig
ccd0be6cfc remove unused prototypes for xfs_ihash_init / xfs_ihash_free
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:20 +11:00
Christoph Hellwig
73e6335c14 remove unused behvavior cruft in xfs_super.h
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:19 +11:00
Christoph Hellwig
2234d54d3d remove useless mnt_want_write call in xfs_write
When mnt_want_write was introduced a call to it was added around
xfs_ichgtime, but there is no need for this because a file can't be open
read/write on a r/o mount, and a mount can't degrade r/o while we still
have files open for writing.  As the mnt_want_write changes were never
merged into the CVS tree this patch is for mainline only.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-04 15:39:19 +11:00
Christoph Hellwig
ddcd856d81 [XFS] fix compile on 32 bit systems
The recent compat patches make xfs_file.c include xfs_ioctl32.h unconditional,
which breaks the build on 32 bit systems which don't have the various compat
defintions.

Remove the include and move the defintion of xfs_file_compat_ioctl to
xfs_ioctl.h so that we can avoid including all the compat defintions in
xfs_file.c

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-04 13:07:29 +11:00
sandeen@sandeen.net
e5d412f178 [XFS] Reorder xfs_ioctl32.c for some tidiness
Put things in IMHO a more readable order, now
that it's all done; add some comments.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:18:21 +11:00
sandeen@sandeen.net
710d62aaaf [XFS] Hook up compat XFS_IOC_FSSETDM_BY_HANDLE ioctl handler
Add a compat handler for XFS_IOC_FSSETDM_BY_HANDLE.

I haven't tested this, lacking dmapi tools to do so
(unless xfsqa magically gets this somehow?)

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:17:43 +11:00
sandeen@sandeen.net
28750975ac [XFS] Hook up compat XFS_IOC_ATTRMULTI_BY_HANDLE ioctl handler
Add a compat handler for XFS_IOC_ATTRMULTI_BY_HANDLE

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:17:07 +11:00
sandeen@sandeen.net
ebeecd2b04 [XFS] Hook up compat XFS_IOC_ATTRLIST_BY_HANDLE ioctl handler
Add a compat handler for XFS_IOC_ATTRLIST_BY_HANDLE

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:16:45 +11:00
sandeen@sandeen.net
af819d2763 [XFS] Fix compat XFS_IOC_FSBULKSTAT_SINGLE ioctl
The XFS_IOC_FSBULKSTAT_SINGLE ioctl passes in the
desired inode number, while XFS_IOC_FSBULKSTAT passes
in the previous/last-stat'd inode number.  The
compat handler wasn't differentiating these, so
when a XFS_IOC_FSBULKSTAT_SINGLE request for inode
128 was sent in, stat information for 131 was sent out.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:16:24 +11:00
sandeen@sandeen.net
65fbaf2489 [XFS] Fix xfs_bulkstat_one size checks & error handling
The 32-bit xfs_blkstat_one handler was failing because
a size check checked whether the remaining (32-bit)
user buffer was less than the (64-bit) bulkstat buffer,
and failed with ENOMEM if so.  Move this check
into the respective handlers so that they check the
correct sizes.

Also, the formatters were returning negative errors
or positive bytes copied; this was odd in the positive
error value world of xfs, and handled wrong by at least
some of the callers, which treated the bytes returned
as an error value.  Move the bytes-used assignment
into the formatters.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:16:03 +11:00
sandeen@sandeen.net
2ee4fa5cb7 [XFS] Make the bulkstat_one compat ioctl handling more sane
Currently the compat formatter was handled by passing
in "private_data" for the xfs_bulkstat_one formatter,
which was really just another formatter... IMHO this
got confusing.

Instead, just make a new xfs_bulkstat_one_compat
formatter for xfs_bulkstat, and call it via a wrapper.

Also, don't translate the ioctl nrs into their native
counterparts, that just clouds the issue; we're in a
compat handler anyway, just switch on the 32-bit cmds.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:15:36 +11:00
sandeen@sandeen.net
471d591031 [XFS] Add compat handlers for data & rt growfs ioctls
The args for XFS_IOC_FSGROWFSDATA and XFS_IOC_FSGROWFSRTA
have padding on the end on intel, so add arg copyin functions,
and then just call the growfs ioctl helpers.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:15:09 +11:00
sandeen@sandeen.net
e94fc4a43e [XFS] Add compat handlers for swapext ioctl
The big hitter here was the bstat field, which contains
different sized time_t on 32 vs. 64 bit.  Add a copyin
function to translate the 32-bit arg to 64-bit, and
call the swapext ioctl helper.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:10:04 +11:00
sandeen@sandeen.net
d5547f9fee [XFS] Clean up some existing compat ioctl calls
Create a new xfs_ioctl.h file which has prototypes for
ioctl helpers that may be called in compat mode.

Change several compat ioctl cases which are IOW to simply copy
in the userspace argument, then call the common ioctl helper.

This also fixes xfs_compat_ioc_fsgeometry_v1(), which had
it backwards before; it copied in an (empty) arg, then copied
out the native result, which probably corrupted userspace.  It
should be translating on the copyout.

Also, a bit of formatting cleanup for consistency, and conversion
of all error returns to use XFS_ERROR().

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:09:43 +11:00
sandeen@sandeen.net
ffae263a64 [XFS] Move compat ioctl structs & numbers into xfs_ioctl32.h
This makes the c file less cluttered and a bit more
readable.   Consistently name the ioctl number
macros with "_32" and the compatibility stuctures
with "_compat."  Rename the helpers which simply
copy in the arg with "_copyin" for easy identification.

Finally, for a few of the existing helpers, modify them
so that they directly call the native ioctl helper
after userspace argument fixup.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:08:44 +11:00
sandeen@sandeen.net
743bb4650d [XFS] Move copy_from_user calls out of ioctl helpers into ioctl switch.
Moving the copy_from_user out of some of the ioctl helpers will
make it easier for the compat ioctl switch to copy in the right
struct, then just pass to the underlying helper.

Also, move common access checks into the helpers themselves,
and out of the native ioctl switch code, to reduce code
duplication between native & compat ioctl callers.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:08:01 +11:00
Christoph Hellwig
0e446673a1 [XFS] fix error handling in xlog_recover_process_one_iunlink
If we fail after xfs_iget we have to drop the reference count, spotted
by Dave Chinner.  Also remove some useless asserts and stop trying to
deal with di_mode == 0 inodes because never gets those without passing
the IGET_CREATE flag to xfs_iget.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:22 +11:00
Christoph Hellwig
24f211bad0 [XFS] move inode allocation out xfs_iread
Allocate the inode in xfs_iget_cache_miss and pass it into xfs_iread.  This
simplifies the error handling and allows xfs_iread to be shared with userspace
which already uses these semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:17 +11:00
Christoph Hellwig
b48d8d6437 [XFS] kill the XFS_IMAP_BULKSTAT flag
Just pass down the XFS_IGET_* flags all the way down to xfs_imap instead
of translating them mid-way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:13 +11:00
Christoph Hellwig
92bfc6e7c4 [XFS] embededd struct xfs_imap into xfs_inode
Most uses of struct xfs_imap are to map and inode to a buffer.  To avoid
copying around the inode location information we should just embedd a
strcut xfs_imap into the xfs_inode.  To make sure it doesn't bloat an
inode the im_len is changed to a ushort, which is fine as that's what
the users exepect anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:08 +11:00
Christoph Hellwig
94e1b69d1a [XFS] merge xfs_imap into xfs_dilocate
xfs_imap is the only caller of xfs_dilocate and doesn't add any significant
value.  Merge the two functions and document the various cases we have for
inode cluster lookup in the new xfs_imap.

Also remove the unused im_agblkno and im_ioffset fields from struct xfs_imap
while we're at it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:03 +11:00
Christoph Hellwig
a194189503 [XFS] remove dead code for old inode item recovery
We have removed the support for old-style inode items a while ago and
xlog_recover_do_inode_trans is now only called for XFS_LI_INODE items.
That means we can remove the call to xfs_imap there and with it the
XFS_IMAP_LOOKUP that is set by all other callers.  We can also mark
xfs_imap static now.

(First sent on October 21st)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:58 +11:00
Christoph Hellwig
76d8b277f7 [XFS] stop using xfs_itobp in xfs_iread
The only caller of xfs_itobp that doesn't have i_blkno setup is now
the initial inode read.  It needs access to the whole xfs_imap so using
xfs_inotobp is not an option.  Instead opencode the buffer lookup in
xfs_iread and kill all the functionality for the initial map from
xfs_itobp.

(First sent on October 21st)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:52 +11:00
Christoph Hellwig
23fac50f95 [XFS] split up xlog_recover_process_iunlinks
Split out the body of the main loop into a separate helper to make the
code readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:48 +11:00
Christoph Hellwig
51ce16d519 [XFS] kill XFS_DINODE_VERSION_ defines
These names don't add any value at all over just using the numerical
values.

(First sent on October 9th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:42 +11:00
Christoph Hellwig
81591fe2db [XFS] kill xfs_dinode_core_t
Now that we have a separate xfs_icdinode_t for the in-core inode which
gets logged there is no need anymore for the xfs_dinode vs xfs_dinode_core
split - the fact that part of the structure gets logged through the inode
log item and a small part not can better be described in a comment.

All sizeof operations on the dinode_core either really wanted the
icdinode and are switched to that one, or had already added the size
of the agi unlinked list pointer.  Later both will be replaced with
helpers once we get the larger CRC-enabled dinode.

Removing the data and attribute fork unions also has the advantage that
xfs_dinode.h doesn't need to pull in every header under the sun.

While we're at it also add some more comments describing the dinode
structure.

(First sent on October 7th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:35 +11:00
Christoph Hellwig
d42f08f61c [XFS] kill xfs_ialloc_log_di
xfs_ialloc_log_di is only used to log the full inode core + di_next_unlinked.
That means all the offset magic is not nessecary and we can simply use
xfs_trans_log_buf directly.  Also add a comment describing what we should do
here instead.

(First sent on October 7th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:31 +11:00
Christoph Hellwig
b28708d6a0 [XFS] sanitize xlog_in_core_t definition
Move all fields from xlog_iclog_fields_t into xlog_in_core_t instead of having
them in a substructure and the using #defines to make it look like they were
directly in xlog_in_core_t.  Also document that xlog_in_core_2_t is grossly
misnamed, and make all references to it typesafe.

(First sent on Semptember 15th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:25 +11:00
From: Christoph Hellwig
4805621a37 [XFS] factor out xfs_read_agf helper
Add a helper to read the AGF header and perform basic verification.
Based on hunks from a larger patch from Dave Chinner.

(First sent on Juli 23rd)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:20 +11:00
Christoph Hellwig
5e1be0fb1a [XFS] factor out xfs_read_agi helper
Add a helper to read the AGI header and perform basic verification.
Based on hunks from a larger patch from Dave Chinner.

(First sent on Juli 23rd)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:15 +11:00
Dave Chinner
26c5295135 [XFS] remove i_gen from incore inode
i_gen is incremented in directory operations when the
directory is changed. It is never read or otherwise used
so it should be removed to help reduce the size of the
struct xfs_inode.

The patch also removes a duplicate logging of the directory
inode core. We only need to do this once per transaction
so kill the one associated with the i_gen increment.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:10 +11:00
Christoph Hellwig
207fcfad58 [XFS] remove xfs_vfsops.h
The only thing left is xfs_do_force_shutdown which already has a defintion
in xfs_mount.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:06 +11:00
Christoph Hellwig
2b5decd09e [XFS] remove xfs_vfs.h
The only thing left are the forced shutdown flags and freeze macros which
fit into xfs_mount.h much better.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:36:59 +11:00
Christoph Hellwig
00dd4029e9 [XFS] remove bhv_statvfs_t typedef
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:36:46 +11:00
Eric Sandeen
f35642e2f8 [XFS] Hook up the fiemap ioctl.
This adds the fiemap inode_operation, which for us converts the
fiemap values & flags into a getbmapx structure which can be sent
to xfs_getbmap.  The formatter then copies the bmv array back into
the user's fiemap buffer via the fiemap helpers.

If we wanted to be more clever, we could also return mapping data
for in-inode attributes, but I'm not terribly motivated to do that
just yet.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:29:42 +11:00
Eric Sandeen
5af317c942 [XFS] Add new getbmap flags.
This adds a new output flag, BMV_OF_LAST to indicate if we've hit
the last extent in the inode.  This potentially saves an extra call
from userspace to see when the whole mapping is done.

It also adds BMV_IF_DELALLOC and BMV_OF_DELALLOC to request, and
indicate, delayed-allocation extents.  In this case bmv_block
is set to -2 (-1 was already taken for HOLESTARTBLOCK; unfortunately
these are the reverse of the in-kernel constants.)

These new flags facilitate addition of the new fiemap interface.

Rather than adding sh_delalloc, remove sh_unwritten & just test
the flags directly.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:29:28 +11:00
Eric Sandeen
8a7141a8b9 [XFS] convert xfs_getbmap to take formatter functions
Preliminary work to hook up fiemap, this allows us to pass in an
arbitrary formatter to copy extent data back to userspace.

The formatter takes info for 1 extent, a pointer to the user "thing*"
and a pointer to a "filled" variable to indicate whether a userspace
buffer did get filled in (for fiemap, hole "extents" are skipped).

I'm just using the getbmapx struct as a "common denominator" because
as far as I can see, it holds all info that any formatters will care
about.

("*thing" because fiemap doesn't pass the user pointer around, but rather
has a pointer to a fiemap info structure, and helpers associated with it)

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:29:00 +11:00
Dave Chinner
0924b585fc [XFS] fix uninitialised variable bug in dquot release.
gcc is warning about an uninitialised variable in xfs_growfs_rt().
This is a false positive. Fix it by changing the scope of the
transaction pointer to wholly within the internal loop inside
the function.

While there, preemptively change xfs_growfs_rt_alloc() in the
same way as it has exactly the same structure as xfs_growfs_rt()
but gcc is not warning about it. Yet.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:11:36 +11:00
Dave Chinner
2e6560929d [XFS] fix error inversion problems with data flushing
XFS gets the sign of the error wrong in several places when
gathering the error from generic linux functions. These functions
return negative error values, while the core XFS code returns
positive error values. Hence when XFS inverts the error to be
returned to the VFS, it can incorrectly invert a negative
error and this error will be ignored by the syscall return.

Fix all the problems related to calling filemap_* functions.

Problem initially identified by Nick Piggin in xfs_fsync().

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:11:10 +11:00
Christoph Hellwig
65795910c1 [XFS] fix spurious gcc warnings
Some recent gcc warnings don't like passing string variables to
printf-like functions without using at least a "%s" format string.
Change the two occurances of that in xfs to please gcc.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:07:37 +11:00
Christoph Hellwig
6c31b93a14 [XFS] allow inode64 mount option on 32 bit systems
Now that we've stopped using the Linux inode cache when can trivally
support the inode64 mount option on 32bit architectures.  As far as the
kernel and most userspace is concerned this works perfectly, but
applications still using really old stat and readdir interfaces will get
an EOVERFLOW error when hitting an inode number not fitting into 32
bits (that problem of course also exists when using these applications
on a 64bit kernel).

Note that because inode64 is simply a mount option we can currently
mount a filesystem having > 32 bit inode numbers and cause a variety of
problems, all this is solved but this patch which enables XFS_BIG_INUMS,
even when inode64 is not used.

(First sent on October 18th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:07:20 +11:00
Christoph Hellwig
f999a5bf3f [XFS] wire up ->open for directories
Currently there's no ->open method set for directories on XFS.  That
means we don't perform any check for opening too large directories
without O_LARGEFILE, we don't check for shut down filesystems, and we
don't actually do the readahead for the first block in the directory.

Instead of just setting the directories open routine to xfs_file_open
we merge the shutdown check directly into xfs_file_open and create
a new xfs_dir_open that first calls xfs_file_open and then performs
the readahead for block 0.

(First sent on September 29th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:07:08 +11:00
Christoph Hellwig
bac8dca9f9 [XFS] fix NULL pointer dereference in xfs_log_force_umount
xfs_log_force_umount may be called very early during log recovery where

If we fail a buffer read in xlog_recover_do_inode_trans we abort the mount.
But at that point log recovery has started delayed writeback of inode
buffers.   As part of the aborted mount we try to flush out all delwri
buffers, but at that point we have already freed the superblock, and set
mp->m_sb_bp to NULL, and xfs_log_force_umount which gets called after
the inode buffer writeback trips over it.

Make xfs_log_force_umount a little more careful when accessing mp->m_sb_bp
to avoid this.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:06:44 +11:00
Dave Chinner
cc09c0dc57 [XFS] Fix double free of log tickets
When an I/O error occurs during an intermediate commit on a rolling
transaction, xfs_trans_commit() will free the transaction structure
and the related ticket. However, the duplicate transaction that
gets used as the transaction continues still contains a pointer
to the ticket. Hence when the duplicate transaction is cancelled
and freed, we free the ticket a second time.

Add reference counting to the ticket so that we hold an extra
reference to the ticket over the transaction commit. We drop the
extra reference once we have checked that the transaction commit
did not return an error, thus avoiding a double free on commit
error.

Credit to Nick Piggin for tripping over the problem.

SGI-PV: 989741

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-17 17:37:10 +11:00
James Morris
2b82892565 Merge branch 'master' into next
Conflicts:
	security/keys/internal.h
	security/keys/process_keys.c
	security/keys/request_key.c

Fixed conflicts above by using the non 'tsk' versions.

Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 11:29:12 +11:00
David Howells
745ca2475a CRED: Pass credentials through dentry_open()
Pass credentials through dentry_open() so that the COW creds patch can have
SELinux's flush_unauthorized_files() pass the appropriate creds back to itself
when it opens its null chardev.

The security_dentry_open() call also now takes a creds pointer, as does the
dentry_open hook in struct security_operations.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:22 +11:00
David Howells
b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells
82ab8deda7 CRED: Wrap task credential accesses in the XFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: xfs@oss.sgi.com
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:04 +11:00
David Chinner
220ca310a5 [XFS] XFS: Check for valid transaction headers in recovery
When we are about to add a new item to a transaction in recovery, we need
to check that it is valid first. Currently we just assert that header
magic number matches, but in production systems that is not present and we
add a corrupted transaction to the list to be processed. This results in a
kernel oops later when processing the corrupted transaction.

Instead, if we detect a corrupted transaction, abort recovery and leave
the user to clean up the mess that has occurred.

SGI-PV: 988145

SGI-Modid: xfs-linux-melb:xfs-kern:32356a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 18:01:50 +11:00
Dave Chinner
8f330f5149 [XFS] handle memory allocation failures during log initialisation
When there is no memory left in the system, xfs_buf_get_noaddr()
can fail. If this happens at mount time during xlog_alloc_log()
we fail to catch the error and oops.

Catch the error from xfs_buf_get_noaddr(), and allow other memory
allocations to fail and catch those errors too. Report the error
to the console and fail the mount with ENOMEM.

Tested by manually injecting errors into xfs_buf_get_noaddr() and
xlog_alloc_log().

Version 2:
o remove unnecessary casts of the returned pointer from kmem_zalloc()

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:57:06 +11:00
David Chinner
6f9f51adb6 [XFS] Account for allocated blocks when expanding directories
When we create a directory, we reserve a number of blocks for the maximum
possible expansion of of the directory due to various btree splits,
freespace allocation, etc. Unfortunately, each allocation is not reflected
in the total number of blocks still available to the transaction, so the
maximal reservation is used over and over again.

This leads to problems where an allocation group has only enough blocks
for *some* of the allocations required for the directory modification.
After the first N allocations, the remaining blocks in the allocation
group drops below the total reservation, and subsequent allocations fail
because the allocator will not allow the allocation to proceed if the AG
does not have the enough blocks available for the entire allocation total.

This results in an ENOSPC occurring after an allocation has already
occurred. This results in aborting the directory operation (leaving the
directory in an inconsistent state) and cancelling a dirty transaction,
which results in a filesystem shutdown.

Avoid the problem by reflecting the number of blocks allocated in any
directory expansion in the total number of blocks available to the
modification in progress. This prevents a directory modification from
being aborted part way through with an ENOSPC.

SGI-PV: 988144

SGI-Modid: xfs-linux-melb:xfs-kern:32340a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:51:14 +11:00
Lachlan McIlroy
2cf7f0da3a [XFS] Wait for all I/O on truncate to zero file size
It's possible to have outstanding xfs_ioend_t's queued when the file size
is zero. This can happen in the direct I/O path when a direct I/O write
fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie
xfs_end_io_direct() does not know that the I/O failed so can't force the
xfs_ioend_t to be flushed synchronously).

When we truncate a file on unlink we don't know to wait for these
xfs_ioend_ts and we can have a use-after-free situation if the inode is
reclaimed before the xfs_ioend_t is finally processed.

As was suggested by Dave Chinner lets wait for all I/Os to complete when
truncating the file size to zero.

SGI-PV: 981668

SGI-Modid: xfs-linux-melb:xfs-kern:32216a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-11-10 17:51:00 +11:00
Lachlan McIlroy
9ccbece546 [XFS] Fix use-after-free with log and quotas
Destroying the quota stuff on unmount can access the log - ie
XFS_QM_DONE() ends up in xfs_dqunlock() which calls
xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
log has already been destroyed. Just move the cleanup of the quota code
earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving
XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.

SGI-PV: 987086

SGI-Modid: xfs-linux-melb:xfs-kern:32148a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
2008-11-10 17:43:23 +11:00
Dave Chinner
6307091fe6 [XFS] Avoid using inodes that haven't been completely initialised
The radix tree walks in xfs_sync_inodes_ag and xfs_qm_dqrele_all_inodes()
can find inodes that are still undergoing initialisation. Avoid
them by checking for the the XFS_INEW() flag once we have a reference
on the inode. This flag is cleared once the inode is properly initialised.

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:13:23 +11:00
Dave Chinner
cb4f0d1d42 [XFS] fix uninitialised variable bug in dquot release
gcc on ARM warns about an using an uninitialised variable
in xfs_qm_dqrele_all_inodes(). This is a real bug, but gcc
on x86_64 is not reporting this warning so it went unnoticed.

Fix the bug by bring the inode radix tree walk code up to
date with xfs_sync_inodes_ag().

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:11:18 +11:00
Dave Chinner
644c3567d1 [XFS] handle memory allocation failures during log initialisation
When there is no memory left in the system, xfs_buf_get_noaddr()
can fail. If this happens at mount time during xlog_alloc_log()
we fail to catch the error and oops.

Catch the error from xfs_buf_get_noaddr(), and allow other memory
allocations to fail and catch those errors too. Report the error
to the console and fail the mount with ENOMEM.

Tested by manually injecting errors into xfs_buf_get_noaddr() and
xlog_alloc_log().

Version 2:
o remove unnecessary casts of the returned pointer from kmem_zalloc()

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 16:50:24 +11:00
David Howells
91b7771251 CRED: Wrap task credential accesses in the XFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
2008-10-31 15:50:04 +11:00
David Chinner
6bfb3d065f [XFS] Fix race when looking up reclaimable inodes
If we get a race looking up a reclaimable inode, we can end up with the
winner proceeding to use the inode before it has been completely
re-initialised. This is a Bad Thing.

Fix the race by checking whether we are still initialising the inod eonce
we have a reference to it, and if so wait for the initialisation to
complete before continuing.

While there, fix a leaked reference count in the same code when
encountering an unlinked inode and we are not doing a lookup for a create
operation.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32429a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:32:43 +11:00
Tim Shimmin
e0b8e8b65d [XFS] remove restricted chown parameter from xfs linux
On Linux all filesystems are supposed to be operating under Posix'
restricted chown. Restricted chown means it restricts chown to the owner
unless you have CAP_FOWNER.

NOTE: that 2 files outside of fs/xfs have been modified too for this
change.

Reviewed-by: Dave Chinner <david@fromorbit.com>

SGI-PV: 988919

SGI-Modid: xfs-linux-melb:xfs-kern:32413a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:30:48 +11:00
Christoph Hellwig
ea5a3dc835 [XFS] kill sys_cred
capable_cred has been unused for a while so we can kill it and sys_cred.
That also means the cred argument to xfs_setattr and xfs_change_file_space
can be removed now.

SGI-PV: 988918

SGI-Modid: xfs-linux-melb:xfs-kern:32412a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:27:48 +11:00
David Chinner
7ee49acfe5 [XFS] correctly select first log item to push
Under heavy metadata load we are seeing log hangs. The AIL has items in it
ready to be pushed, and they are within the push target window. However,
we are not pushing them when the last pushed LSN is less than the LSN of
the first log item on the AIL. This is a regression introduced by the AIL
push cursor modifications.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32409a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-10-30 18:26:51 +11:00
Christoph Hellwig
9ed0451ee0 [XFS] free partially initialized inodes using destroy_inode
To make sure we free the security data inodes need to be freed using the
proper VFS helper (which we also need to export for this). We mark these
inodes bad so we can skip the flush path for them.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32398a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:26:04 +11:00
Christoph Hellwig
c679eef052 [XFS] stop using xfs_itobp in xfs_bulkstat
xfs_bulkstat only wants the dinode, offset and buffer from a given inode
number. Instead of using xfs_itobp on a fake inode which is complicated
and currently leads to leaks of the security data just use xfs_inotobp
which is designed to do exactly the kind of lookup xfs_bulkstat wants. The
only thing that's missing in xfs_inotobp is a flags paramter that let's us
pass down XFS_IMAP_BULKSTAT, but that can easily added.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32397a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:04:13 +11:00
David Chinner
455486b9cc [XFS] avoid all reclaimable inodes in xfs_sync_inodes_ag
If we are syncing data in xfs_sync_inodes_ag(), the VFS inode must still
be referencable as the dirty data state is carried on the VFS inode. hence
if we can't get a reference via igrab(), the inode must be in reclaim
which implies that it has no dirty data attached.

Leave such inodes to the reclaim code to flush the dirty inode state to
disk and so avoid attempting to access the VFS inode when it may not exist
in xfs_sync_inodes_ag().

Version 4:
o don't reference linux inode until after igrab() succeeds

Version 3:
o converted unlock/rele to an xfs_iput() call.

Version 2:
o change igrab logic to be more linear
o remove initial reclaimable inode check now that we are using
  igrab() failure to find reclaimable inodes
o assert that igrab failure occurs only on reclaimable inodes
o clean up inode locking - only grab the iolock if we are doing
  a SYNC_DELWRI call and we have a dirty inode.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32391a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:03:14 +11:00
David Chinner
56e73ec47d [XFS] Can't lock inodes in radix tree preload region
When we are inside a radix tree preload region, we cannot sleep. Recently
we moved the inode locking inside the preload region for the inode radix
tree. Fix that, and fix a missed unlock in another error path in the same
code at the same time.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32385a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:55:27 +11:00
Christoph Hellwig
2b7035fd74 [XFS] Trivial xfs_remove comment fixup
The dp to ip comment should be for the unconditional xfs_droplink call,
and the "." link obviously only exists for directories, so it should be in
the is_dir conditional.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32374a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:55:18 +11:00
Christoph Hellwig
1ec7944beb [XFS] fix biosize option
iosizelog shouldn't be the same as iosize but the logarithm of it. Then
again the current biosize option doesn't make much sense to me as it
doesn't set the preferred I/O size as mentioned in the comment next to it
but rather the allocation size and thus is identical to the allocsize
option (except for the missing logarithm). It's also not documented in
Documentation/filesystems/xfs.txt or the mount manpage.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32373a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:55:08 +11:00
Christoph Hellwig
469fc23d5d [XFS] fix the noquota mount option
Noquota should clear all mount options, and not just user and group quota.
Probably doesn't matter very much in real life.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32372a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:54:57 +11:00
Christoph Hellwig
9d565ffa33 [XFS] kill struct xfs_mount_args
No need to parse the mount option into a structure before applying it to
struct xfs_mount.

The content of xfs_start_flags gets merged into xfs_parseargs. Calls
inbetween don't care and can use mount members instead of the args struct.

This patch uncovered that the mount option for shared filesystems wasn't
ever exposed on Linux. The code to handle it is #if 0'ed in this patch
pending a decision on this feature. I'll send a writeup about it to the
list soon.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32371a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:53:24 +11:00
David Chinner
5a792c4579 [XFS] XFS: Check for valid transaction headers in recovery
When we are about to add a new item to a transaction in recovery, we need
to check that it is valid first. Currently we just assert that header
magic number matches, but in production systems that is not present and we
add a corrupted transaction to the list to be processed. This results in a
kernel oops later when processing the corrupted transaction.

Instead, if we detect a corrupted transaction, abort recovery and leave
the user to clean up the mess that has occurred.

SGI-PV: 988145

SGI-Modid: xfs-linux-melb:xfs-kern:32356a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:40:09 +11:00
David Chinner
783a2f656f [XFS] Finish removing the mount pointer from the AIL API
Change all the remaining AIL API functions that are passed struct
xfs_mount pointers to pass pointers directly to the struct xfs_ail being
used. With this conversion, all external access to the AIL is via the
struct xfs_ail. Hence the operation and referencing of the AIL is almost
entirely independent of the xfs_mount that is using it - it is now much
more tightly tied to the log and the items it is tracking in the log than
it is tied to the xfs_mount.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32353a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:58 +11:00
David Chinner
fc1829f34d [XFS] Add ail pointer into log items
Add an xfs_ail pointer to log items so that the log items can reference
the AIL directly during callbacks without needed a struct xfs_mount.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32352a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:46 +11:00
David Chinner
a9c21c1b9d [XFS] Given the log a pointer to the AIL
When we need to go from the log to the AIL, we have to go via the
xfs_mount. Add a xfs_ail pointer to the log so we can go directly to the
AIL associated with the log.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32351a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:35 +11:00
David Chinner
c7e8f26827 [XFS] Move the AIL lock into the struct xfs_ail
Bring the ail lock inside the struct xfs_ail. This means the AIL can be
entirely manipulated via the struct xfs_ail rather than needing both the
struct xfs_mount and the struct xfs_ail.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32350a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:23 +11:00
David Chinner
7b2e2a31f5 [XFS] Allow 64 bit machines to avoid the AIL lock during flushes
When copying lsn's from the log item to the inode or dquot flush lsn, we
currently grab the AIL lock. We do this because the LSN is a 64 bit
quantity and it needs to be read atomically. The lock is used to guarantee
atomicity for 32 bit platforms.

Make the LSN copying a small function, and make the function used
conditional on BITS_PER_LONG so that 64 bit machines don't need to take
the AIL lock in these places.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32349a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:12 +11:00
David Chinner
5b00f14fbd [XFS] move the AIl traversal over to a consistent interface
With the new cursor interface, it makes sense to make all the traversing
code use the cursor interface and make the old one go away. This means
more of the AIL interfacing is done by passing struct xfs_ail pointers
around the place instead of struct xfs_mount pointers.

We can replace the use of xfs_trans_first_ail() in xfs_log_need_covered()
as it is only checking if the AIL is empty. We can do that with a call to
xfs_trans_ail_tail() instead, where a zero LSN returned indicates and
empty AIL...

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32348a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:00 +11:00
David Chinner
27d8d5fe0e [XFS] Use a cursor for AIL traversal.
To replace the current generation number ensuring sanity of the AIL
traversal, replace it with an external cursor that is linked to the AIL.

Basically, we store the next item in the cursor whenever we want to drop
the AIL lock to do something to the current item. When we regain the lock.
the current item may already be free, so we can't reference it, but the
next item in the traversal is already held in the cursor.

When we move or delete an object, we search all the active cursors and if
there is an item match we clear the cursor(s) that point to the object.
This forces the traversal to restart transparently.

We don't invalidate the cursor on insert because the cursor still points
to a valid item. If the intem is inserted between the current item and the
cursor it does not matter; the traversal is considered to be past the
insertion point so it will be picked up in the next traversal.

Hence traversal restarts pretty much disappear altogether with this method
of traversal, which should substantially reduce the overhead of pushing on
a busy AIL.

Version 2 o add restart logic o comment cursor interface o minor cleanups

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32347a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:39 +11:00
David Chinner
82fa901245 [XFS] Allocate the struct xfs_ail
Rather than embedding the struct xfs_ail in the struct xfs_mount, allocate
it during AIL initialisation. Add a back pointer to the struct xfs_ail so
that we can pass around the xfs_ail and still be able to access the
xfs_mount if need be. This is th first step involved in isolating the AIL
implementation from the surrounding filesystem code.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32346a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:26 +11:00
David Chinner
a7444053fb [XFS] Account for allocated blocks when expanding directories
When we create a directory, we reserve a number of blocks for the maximum
possible expansion of of the directory due to various btree splits,
freespace allocation, etc. Unfortunately, each allocation is not reflected
in the total number of blocks still available to the transaction, so the
maximal reservation is used over and over again.

This leads to problems where an allocation group has only enough blocks
for *some* of the allocations required for the directory modification.
After the first N allocations, the remaining blocks in the allocation
group drops below the total reservation, and subsequent allocations fail
because the allocator will not allow the allocation to proceed if the AG
does not have the enough blocks available for the entire allocation total.

This results in an ENOSPC occurring after an allocation has already
occurred. This results in aborting the directory operation (leaving the
directory in an inconsistent state) and cancelling a dirty transaction,
which results in a filesystem shutdown.

Avoid the problem by reflecting the number of blocks allocated in any
directory expansion in the total number of blocks available to the
modification in progress. This prevents a directory modification from
being aborted part way through with an ENOSPC.

SGI-PV: 988144

SGI-Modid: xfs-linux-melb:xfs-kern:32340a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:38:12 +11:00
David Chinner
8c38ab0320 [XFS] Prevent looping in xfs_sync_inodes_ag
If the last block of the AG has inodes in it and the AG is an exactly
power-of-2 size then the last inode in the AG points to the last block in
the AG. If we try to find the next inode in the AG by adding one to the
inode number, we increment the inode number past the size of the AG. The
result is that the macro XFS_INO_TO_AGINO() will strip the AG portion of
the inode number and return an inode number of zero.

That is, instead of terminating the lookup loop because we hit the inode
number went outside the valid range for the AG, the search index returns
to zero and we start traversing the radix tree from the start again. This
results in an endless loop in xfs_sync_inodes_ag().

Fix it be detecting if the new search index decreases as a result of
incrementing the current inode number. That indicate an overflow and hence
that we have finished processing the AG so we can terminate the loop.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32335a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:00 +11:00
David Chinner
116545130c [XFS] kill deleted inodes list
Now that the deleted inodes list is unused, kill it. This also removes the
i_reclaim list head from the xfs_inode, shrinking it by two pointers.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32334a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:49 +11:00
David Chinner
7a3be02bae [XFS] use the inode radix tree for reclaiming inodes
Use the reclaim tag to walk the radix tree and find the inodes under
reclaim. This was the only user of the deleted inode list.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32333a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:37 +11:00
David Chinner
396beb8531 [XFS] mark inodes for reclaim via a tag in the inode radix tree
Prepare for removing the deleted inode list by marking inodes for reclaim
in the inode radix trees so that we can use the radix trees to find
reclaimable inodes.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32331a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:26 +11:00
David Chinner
1dc3318ae1 [XFS] rename inode reclaim functions
The function names xfs_finish_reclaim and xfs_finish_reclaim_all are not
very descriptive of what they are reclaiming. Rename to
xfs_reclaim_inode[s] to match the xfs_sync_inodes() function.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32330a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:15 +11:00
David Chinner
fce08f2f3b [XFS] move inode reclaim functions to xfs_sync.c
Background inode reclaim is run by the xfssyncd. Move the reclaim worker
functions to be close to the sync code as the are very similar in
structure and are both run from the same background thread.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32329a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:03 +11:00
Lachlan McIlroy
493dca6178 [XFS] Fix build warning - xfs_fs_alloc_inode() needs a return statement
SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32325a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:36:52 +11:00
David Chinner
99fa8cb3c5 [XFS] Prevent use-after-free caused by synchronous inode reclaim
With the combined linux and XFS inode, we need to ensure that the combined
structure is not freed before the generic code is finished with the inode.
As it turns out, there is a case where the XFS inode is freed before the
linux inode - when xfs_reclaim() is called from ->clear_inode() on a clean
inode, the xfs inode is freed during that call. The generic code
references the inode after the ->clear_inode() call, so this is a use
after free situation.

Fix the problem by moving the xfs_reclaim() call to ->destroy_inode()
instead of in ->clear_inode(). This ensures the combined inode structure
is not freed until after the generic code has finished with it.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32324a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:36:40 +11:00
David Chinner
bf904248a2 [XFS] Combine the XFS and Linux inodes
To avoid issues with different lifecycles of XFS and Linux inodes, embedd
the linux inode inside the XFS inode. This means that the linux inode has
the same lifecycle as the XFS inode, even when it has been released by the
OS. XFS inodes don't live much longer than this (a short stint in reclaim
at most), so there isn't significant memory usage penalties here.

Version 3 o kill xfs_icount()

Version 2 o remove unused commented out code from xfs_iget(). o kill
useless cast in VFS_I()

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32323a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:36:14 +11:00
David Chinner
94b97e39b0 [XFS] Never call mark_inode_dirty_sync() directly
Once the Linux inode and the XFS inode are combined, we cannot rely on
just check if the linux inode exists as a method of determining if it is
valid or not. Hence we should always call xfs_mark_inode_dirty_sync()
instead as it does the correct checks to determine if the liinux inode is
in a valid state or not.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32318a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:21:30 +11:00
David Chinner
6441e54915 [XFS] factor xfs_iget_core() into hit and miss cases
There are really two cases in xfs_iget_core(). The first is the cache hit
case, the second is the miss case. They share very little code, and hence
can easily be factored out into separate functions. This makes the code
much easier to understand and subsequently modify.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32317a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:21:19 +11:00
Christoph Hellwig
3471394ba5 [XFS] fix instant oops with tracing enabled
We can only read inode->i_count if the inode is actually there and not a
NULL pointer. This was introduced in one of the recent sync patches.

SGI-PV: 988255

SGI-Modid: xfs-linux-melb:xfs-kern:32315a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:21:10 +11:00
David Chinner
76bf105cb1 [XFS] Move remaining quiesce code.
With all the other filesystem sync code it in xfs_sync.c including the
data quiesce code, it makes sense to move the remaining quiesce code to
the same place.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32312a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:21 +11:00
David Chinner
a4e4c4f4a8 [XFS] Kill xfs_sync()
There are no more callers to xfs_sync() now, so remove the function
altogther.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32311a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:11 +11:00
David Chinner
cb56a4b995 [XFS] Kill SYNC_CLOSE
SYNC_CLOSE is only ever used and checked in conjunction with SYNC_WAIT,
and this only done in one spot. The only thing this does is make
XFS_bflush() calls to the data buftargs.

This will happen very shortly afterwards the xfs_sync() call anyway in the
unmount path via the xfs_close_devices(), so this code is redundant and
can be removed. That only user of SYNC_CLOSE is now gone, so kill the flag
completely.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32310a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:00 +11:00
David Chinner
e9f1c6ee12 [XFS] make SYNC_DELWRI no longer use xfs_sync
Continue to de-multiplex xfs_sync be replacing all SYNC_DELWRI callers
with direct calls functions that do the work. Isolate the data quiesce
case to a function in xfs_sync.c. Isolate the FSDATA case with explicit
calls to xfs_sync_fsdata().

Version 2: o Push delwri related log forces into xfs_sync_inodes().

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32309a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:50 +11:00
David Chinner
be97d9d557 [XFS] make SYNC_ATTR no longer use xfs_sync
Continue to de-multiplex xfs_sync be replacing all SYNC_ATTR callers with
direct calls xfs_sync_inodes(). Add an assert into xfs_sync() to ensure we
caught all the SYNC_ATTR callers.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32308a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:38 +11:00
David Chinner
aacaa880bf [XFS] xfssyncd: don't call xfs_sync
Start de-multiplexing xfs_sync() by making xfs_sync_worker() call the
specific sync functions it needs. This is only a small, unique subset of
the entire xfs_sync() code so is easier to follow.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32307a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:29 +11:00
David Chinner
dfd837a9eb [XFS] kill xfs_syncsub
Now that the only caller is xfs_sync(), merge the two together as it makes
no sense to keep them separate.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32306a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:21 +11:00
David Chinner
2030b5aba8 [XFS] use xfs_sync_inodes rather than xfs_syncsub
Kill the unused arg in xfs_syncsub() and xfs_sync_inodes(). For callers of
xfs_syncsub() that only want to flush inodes, replace xfs_syncsub() with
direct calls to xfs_sync_inodes() as that is all that is being done with
the specific flags being passed in.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32305a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:12 +11:00
David Chinner
bc60a99323 [XFS] Use struct inodes instead of vnodes to kill vn_grab
With the sync code relocated to the linux-2.6 directory we can use struct
inodes directly. If we do the same thing for the quota release code, we
can remove vn_grab altogether. While here, convert the VN_BAD() checks to
is_bad_inode() so we can remove vnodes entirely from this code.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32304a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:03 +11:00
Christoph Hellwig
2af75df7be [XFS] split out two helpers from xfs_syncsub
Split out two helpers from xfs_syncsub for the dummy log commit and the
superblock writeout.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32303a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:53 +11:00
Christoph Hellwig
4e8938feba [XFS] Move XFS_BMAP_SANITY_CHECK out of line.
Move the XFS_BMAP_SANITY_CHECK macro out of line and make it a properly
typed function. Also pass the xfs_buf for the btree block instead of just
the btree block header, as we will need some additional information for it
to implement CRC checking of btree blocks.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32301a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:43 +11:00
Christoph Hellwig
7cc95a821d [XFS] Always use struct xfs_btree_block instead of short / longform
structures.

Always use the generic xfs_btree_block type instead of the short / long
structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for
the length of a short / long form block. The rationale for this is that we
will grow more btree block header variants to support CRCs and other RAS
information, and always accessing them through the same datatype with
unions for the short / long form pointers makes implementing this much
easier.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32300a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:34 +11:00
Christoph Hellwig
136341b41a [XFS] cleanup btree record / key / ptr addressing macros.
Replace the generic record / key / ptr addressing macros that use cpp
token pasting with simpler macros that do the job for just one given btree
type. The new macros lose the cur argument and thus can be used outside
the core btree code, but also gain an xfs_mount * argument to allow for
checking the CRC flag in the near future. Note that many of these macros
aren't actually used in the kernel code, but only in userspace (mostly in
xfs_repair).

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32295a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:11:40 +11:00
David Chinner
6c7699c047 [XFS] remove the mount inode list
Now we've removed all users of the mount inode list, we can kill it. This
reduces the size of the xfs_inode by 2 pointers.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32293a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:11:29 +11:00
Christoph Hellwig
60197e8df3 [XFS] Cleanup maxrecs calculation.
Clean up the way the maximum and minimum records for the btree blocks are
calculated. For the alloc and inobt btrees all the values are
pre-calculated in xfs_mount_common, and we switch the current loop around
the ugly generic macros that use cpp token pasting to generate type names
to two small helpers in normal C code. For the bmbt and bmdr trees these
helpers also exist, but can be called during runtime, too. Here we also
kill various macros dealing with them and inline the logic into the
get_minrecs / get_maxrecs / get_dmaxrecs methods in xfs_bmap_btree.c.

Note that all these new helpers take an xfs_mount * argument which will be
needed to determine the size of a btree block once we add support for
extended btree blocks with CRCs and other RAS information.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32292a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:11:19 +11:00
David Chinner
5b4d89ae0f [XFS] Traverse inode trees when releasing dquots
Make releasing all inode dquots traverse the per-ag inode radix trees
rather than the mount inode list. This removes another user of the mount
inode list.

Version 3 o fix comment relating to avoiding trying to release the

quota inodes and those in reclaim.

Version 2 o add comment explaining use of gang lookups for a single inode
o use IRELE, not VN_RELE o move check for ag initialisation to caller.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32291a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:08:03 +11:00
David Chinner
683a897080 [XFS] Use the inode tree for finding dirty inodes
Update xfs_sync_inodes to walk the inode radix tree cache to find dirty
inodes. This removes a huge bunch of nasty, messy code for traversing the
mount inode list safely and removes another user of the mount inode list.

Version 3 o rediff against new linux-2.6/xfs_sync.c code

Version 2 o add comment explaining use of gang lookups for a single inode
o use IRELE, not VN_RELE o move check for ag initialisation to caller.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32290a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:07:29 +11:00
David Chinner
2f8a3ce1c2 [XFS] don't block in xfs_qm_dqflush() during async writeback.
Normally dquots are written back via delayed write mechanisms. They are
flushed to their backing buffer by xfssyncd, which is then pushed out by
either AIL or xfsbufd flushing. The flush from the xfssyncd is supposed to
be non-blocking, but xfs_qm_dqflush() always waits for pinned duots, which
means that it will block for the length of time it takes to do a
synchronous log force. This causes unnecessary extra log I/O to be issued
whenever we try to flush a busy dquot.

Avoid the log forces and blocking xfssyncd by making xfs_qm_dqflush() pay
attention to what type of sync it is doing when it sees a pinned dquot and
not waiting when doing non-blocking flushes.

SGI-PV: 988147

SGI-Modid: xfs-linux-melb:xfs-kern:32287a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:07:20 +11:00
David Chinner
75c68f411b [XFS] Remove xfs_iflush_all and clean up xfs_finish_reclaim_all()
xfs_iflush_all() walks the m_inodes list to find inodes that need
reclaiming. We already have such a list - the m_del_inodes list. Replace
xfs_iflush_all() with a call to xfs_finish_reclaim_all() and clean up
xfs_finish_reclaim_all() to handle the different flush modes now needed.

Originally based on a patch from Christoph Hellwig.

Version 3 o rediff against new linux-2.6/xfs_sync.c code

Version 2 o revert xfs_syncsub() inode reclaim behaviour back to original

code o xfs_quiesce_fs() should use XFS_IFLUSH_DELWRI_ELSE_ASYNC, not

XFS_IFLUSH_ASYNC, to prevent change of behaviour.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32284a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:28 +11:00
David Chinner
a167b17e89 [XFS] move xfssyncd code to xfs_sync.c
Move all the xfssyncd code to the new xfs_sync.c file. This places it
closer to the actual code that it interacts with, rather than just being
associated with high level VFS code.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32283a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:18 +11:00
David Chinner
fe4fa4b8e4 [XFS] move sync code to its own file
The sync code in XFS is spread around several files. While it used to make
sense to have such a distribution, the code is about to be cleaned up and
so centralising it in one spot as the first step makes sense.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32282a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:08 +11:00
Barry Naujok
34519daae6 [XFS] Show buffer address with debug hexdump on corruption
SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32233a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:58 +11:00
Barry Naujok
89b2839319 [XFS] Check agf_btreeblks is valid when reading in the AGF
SGI-PV: 987683

SGI-Modid: xfs-linux-melb:xfs-kern:32232a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:49 +11:00
Barry Naujok
847fff5ca8 [XFS] Sync up kernel and user-space headers
SGI-PV: 986558

SGI-Modid: xfs-linux-melb:xfs-kern:32231a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:38 +11:00
Lachlan McIlroy
24ee0e49c9 [XFS] Make xfs_btree_check_ptr() debug-only code.
SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32224a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:05:26 +11:00
Peter Leckie
d1de802155 [XFS] Fix build brakage from patch "Clean up dquot pincount code"
This is a fix for patch " Clean up dquot pincount code" which introduced a
build breakage due to a missing & in xfs_qm_dquot_logitem_pin.

SGI-PV: 986789

SGI-Modid: xfs-linux-melb:xfs-kern:32221a

Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:18 +11:00
Peter Leckie
bc3048e3cd [XFS] Clean up dquot pincount code.
This is a code cleanup and optimization that removes a per mount point
spinlock from the quota code and cleans up the code.

The patch changes the pincount from being an int protected by a spinlock
to an atomic_t allowing the pincount to be manipulated without holding the
spinlock.

This cleanup also protects against random wakup's of both the aild and
xfssyncd by reevaluating the pincount after been woken. Two latter patches
will address the Spurious wakeups.

SGI-PV: 986789

SGI-Modid: xfs-linux-melb:xfs-kern:32215a

Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:04 +11:00
Lachlan McIlroy
d112f29845 [XFS] Wait for all I/O on truncate to zero file size
It's possible to have outstanding xfs_ioend_t's queued when the file size
is zero. This can happen in the direct I/O path when a direct I/O write
fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie
xfs_end_io_direct() does not know that the I/O failed so can't force the
xfs_ioend_t to be flushed synchronously).

When we truncate a file on unlink we don't know to wait for these
xfs_ioend_ts and we can have a use-after-free situation if the inode is
reclaimed before the xfs_ioend_t is finally processed.

As was suggested by Dave Chinner lets wait for all I/Os to complete when
truncating the file size to zero.

SGI-PV: 981668

SGI-Modid: xfs-linux-melb:xfs-kern:32216a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:59:06 +11:00
Christoph Hellwig
7f7c39ccb6 [XFS] make btree tracing generic
Make the existing bmap btree tracing generic so that it applies to all
btree types.

Some fragments lifted from a patch by Dave Chinner.

This adds two files that were missed from the previous btree tracing
checkin.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32210a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:50 +11:00
Christoph Hellwig
3cc7524c84 [XFS] mark various functions in xfs_btree.c static
Lots of functionality in xfs_btree.c isn't needed by callers outside of
this file anymore, so mark these functions static.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32209a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:41 +11:00
Christoph Hellwig
4a26e66e77 [XFS] add keys_inorder and recs_inorder btree methods
Add methods to check whether two keys/records are in the righ order. This
replaces the xfs_btree_check_key and xfs_btree_check_rec methods. For the
callers from xfs_bmap.c just opencode the bmbt-specific asserts.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32208a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:32 +11:00
Christoph Hellwig
fd6bcc5b63 [XFS] kill xfs_bmbt_log_block and xfs_bmbt_log_recs
These are equivalent to the xfs_btree_* versions, and the only remaining
caller can be switched to the generic one after they are exported. Also
remove some now dead infrastructure in xfs_bmap_btree.c.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32207a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:21 +11:00
Christoph Hellwig
8cc938fe42 [XFS] implement generic xfs_btree_get_rec
Not really much reason to make it generic given that it's so small, but
this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c,
so it makes the whole btree implementation more structured.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32206a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:11 +11:00
Christoph Hellwig
91cca5df9b [XFS] implement generic xfs_btree_delete/delrec
Make the btree delete code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32205a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:01 +11:00
Christoph Hellwig
d4b3a4b7dd [XFS] move xfs_bmbt_killroot to common code
xfs_bmbt_killroot is a mostly generic implementation of moving from a real
block based root to an inode based root. So move it to xfs_btree.c where
it can use all the nice infrastructure there and make it pointer size
agnostic

The new name for it is xfs_btree_kill_iroot, following the old naming but
making it clear we're dealing with the root in inode case here, and to
avoid confusion with xfs_btree_new_root which is used for the not inode
rooted case. I've also added a comment describing what it does and why
it's named the way it is.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32203a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:51 +11:00
Christoph Hellwig
4b22a57188 [XFS] implement generic xfs_btree_insert/insrec
Make the btree insert code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32202a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:40 +11:00
Christoph Hellwig
ea77b0a66e [XFS] move xfs_bmbt_newroot to common code
xfs_bmbt_newroot is a mostly generic implementation of moving from an
inode root to a real block based root. So move it to xfs_btree.c where it
can use all the nice infrastructure there and make it pointer size
agnostic

The new name for it is xfs_btree_new_iroot, following the old naming but
making it clear we're dealing with the root in inode case here, and to
avoid confusion with xfs_btree_new_root which is used for the not inode
rooted case.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32201a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:28 +11:00
Christoph Hellwig
344207ce84 [XFS] implement semi-generic xfs_btree_new_root
From: Dave Chinner <dgc@sgi.com>

Add a xfs_btree_new_root helper for the alloc and ialloc btrees. The bmap
btree needs it's own version and is not converted.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32200a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:16 +11:00
Christoph Hellwig
f5eb8e7ca5 [XFS] implement generic xfs_btree_split
Make the btree split code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32198a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:03 +11:00
Christoph Hellwig
687b890a18 [XFS] implement generic xfs_btree_lshift
Make the btree left shift code generic. Based on a patch from David
Chinner with lots of changes to follow the original btree implementations
more closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32197a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:53 +11:00
Christoph Hellwig
9eaead51be [XFS] implement generic xfs_btree_rshift
Make the btree right shift code generic. Based on a patch from David
Chinner with lots of changes to follow the original btree implementations
more closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32196a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:43 +11:00
Christoph Hellwig
278d0ca14e [XFS] implement generic xfs_btree_update
From: Dave Chinner <dgc@sgi.com>

The most complicated part here is the lastrec tracking for the alloc
btree. Most logic is in the update_lastrec method which has to do some
hopefully good enough dirty magic to maintain it.

[hch: split out from bigger patch and a rework of the lastrec

logic]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32194a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:32 +11:00
Christoph Hellwig
38bb74237d [XFS] implement generic xfs_btree_updkey
From: Dave Chinner <dgc@sgi.com>

Note that there are many > 80 char lines introduced due to the
xfs_btree_key casts. But the places where this happens is throw-away code
once the whole btree code gets merged into a common implementation.

The same is true for the temporary xfs_alloc_log_keys define to the new
name. All old users will be gone after a few patches.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32193a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:22 +11:00
Christoph Hellwig
fe033cc848 [XFS] implement generic xfs_btree_lookup
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32192a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:09 +11:00
Christoph Hellwig
8df4da4a0a [XFS] implement generic xfs_btree_decrement
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32191a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:58 +11:00
Christoph Hellwig
637aa50f46 [XFS] implement generic xfs_btree_increment
From: Dave Chinner <dgc@sgi.com>

Because this is the first major generic btree routine this patch includes
some infrastrucure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:45 +11:00
Christoph Hellwig
65f1eaeac0 [XFS] add helpers for addressing entities inside a btree block
Add new helpers in xfs_btree.c to find the record, key and block pointer
entries inside a btree block. To implement this genericly the
->get_maxrecs methods and two new xfs_btree_ops entries for the key and
record sizes are used. Also add a big comment describing how the
addressing inside a btree block works.

Note that these helpers are unused until users are introduced in the next
patches and this patch will thus cause some harmless compiler warnings.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32189a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:34 +11:00
Christoph Hellwig
ce5e42db42 [XFS] add get_maxrecs btree operation
Factor xfs_btree_maxrecs into a per-btree operation.

The get_maxrecs method is based on a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32188a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:23 +11:00
Christoph Hellwig
8c4ed633e6 [XFS] make btree tracing generic
Make the existing bmap btree tracing generic so that it applies to all
btree types.

Some fragments lifted from a patch by Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32187a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:13 +11:00
David Chinner
854929f058 [XFS] add new btree statistics
From: Dave Chinner <dgc@sgi.com>

Introduce statistics coverage of all the btrees and cover all the btree
operations, not just some.

Invaluable for determining test code coverage of all the btree
operations....

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32184a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
2008-10-30 16:55:03 +11:00
Christoph Hellwig
a23f6ef8ce [XFS] refactor btree validation helpers
Move the various btree validation helpers around in xfs_btree.c so that
they are close to each other and in common #ifdef DEBUG sections.

Also add a new xfs_btree_check_ptr helper to check a btree ptr that can be
either long or short form.

Split out from a bigger patch from Dave Chinner with various small changes
applied by me.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32183a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:53 +11:00
Christoph Hellwig
b524bfeee2 [XFS] refactor xfs_btree_readahead
From: Dave Chinner <dgc@sgi.com>

Refactor xfs_btree_readahead to make it more readable:

(a) remove the inline xfs_btree_readahead wrapper and move all checks out

of line into the main routine.

(b) factor out helpers for short/long form btrees

(c) move check for root in inodes from the callers into
xfs_btree_readahead

[hch: split out from a big patch and minor cleanups]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32182a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:43 +11:00
Christoph Hellwig
e99ab90d6a [XFS] add a long pointers flag to xfs_btree_cur
Add a flag to the xfs btree cursor when using long (64bit) block pointers
instead of checking btnum == XFS_BTNUM_BMAP.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32181a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:33 +11:00
Christoph Hellwig
8186e517fa [XFS] make btree root in inode support generic
The bmap btree is rooted in the inode and not in a disk block. Make the
support for this feature more generic by adding a btree flag to for this
feature instead of relying on the XFS_BTNUM_BMAP btnum check.

Also clean up xfs_btree_get_block where this new flag is used.

Based upon a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32180a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:22 +11:00
Christoph Hellwig
de227dd960 [XFS] add generic btree types
Add generic union types for btree pointers, keys and records. The generic
btree pointer contains either a 32 and 64bit big endian scalar for short
and long form btrees, and the key and record contain the relevant type for
each possible btree.

Split out from a bigger patch from Dave Chinner and simplified a little
further.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32178a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:12 +11:00
Christoph Hellwig
561f7d1739 [XFS] split up xfs_btree_init_cursor
xfs_btree_init_cursor contains close to little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:59 +11:00
Christoph Hellwig
f2277f06e6 [XFS] kill struct xfs_btree_hdr
This type is only embedded in struct xfs_btree_block and never used
directly. By moving the fields directly into struct xfs_btree_block a lot
of the macros for struct xfs_btree_sblock and struct xfs_btree_lblock can
be used for struct xfs_btree_block too now which helps greatly with some
of the migrations during implementing the generic btree code.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32174a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:47 +11:00
Lachlan McIlroy
f338f90364 [XFS] Unlock inode before calling xfs_idestroy()
Lock debugging reported the ilock was being destroyed without being
unlocked. We don't need to lock the inode until we are going to insert it
into the radix tree.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32159a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:53:38 +11:00
Lachlan McIlroy
a357a12156 [XFS] Fix use-after-free with log and quotas
Destroying the quota stuff on unmount can access the log - ie
XFS_QM_DONE() ends up in xfs_dqunlock() which calls
xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
log has already been destroyed. Just move the cleanup of the quota code
earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving
XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.

SGI-PV: 987086

SGI-Modid: xfs-linux-melb:xfs-kern:32148a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
2008-10-30 16:53:25 +11:00
Barry Naujok
46039928c9 [XFS] Remove final remnants of dirv1 macros and other stuff
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:32002a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:52:35 +11:00
Lachlan McIlroy
d07c60e54f [XFS] Use xfs_idestroy() to cleanup an inode.
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31927a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:50:35 +11:00
Lachlan McIlroy
be8b78a626 [XFS] Remove kmem_zone_t argument from xfs_inode_init_once()
kmem cache constructor no longer takes a kmem_zone_t argument.

SGI-PV: 957103

SGI-Modid: xfs-linux-melb:xfs-kern:32254a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:42:34 +11:00
David Chinner
07c8f67587 [XFS] Make use of the init-once slab optimisation.
To avoid having to initialise some fields of the XFS inode on every
allocation, we can use the slab init-once feature to initialise them. All
we have to guarantee is that when we free the inode, all it's entries are
in the initial state. Add asserts where possible to ensure debug kernels
check this initial state before freeing and after allocation.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31925a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:11:59 +11:00
Linus Torvalds
2248485640 Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits)
  [PATCH] kill the rest of struct file propagation in block ioctls
  [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET
  [PATCH] get rid of blkdev_locked_ioctl()
  [PATCH] get rid of blkdev_driver_ioctl()
  [PATCH] sanitize blkdev_get() and friends
  [PATCH] remember mode of reiserfs journal
  [PATCH] propagate mode through swsusp_close()
  [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
  [PATCH] pass fmode_t to blkdev_put()
  [PATCH] kill the unused bsize on the send side of /dev/loop
  [PATCH] trim file propagation in block/compat_ioctl.c
  [PATCH] end of methods switch: remove the old ones
  [PATCH] switch sr
  [PATCH] switch sd
  [PATCH] switch ide-scsi
  [PATCH] switch tape_block
  [PATCH] switch dcssblk
  [PATCH] switch dasd
  [PATCH] switch mtd_blkdevs
  [PATCH] switch mmc
  ...
2008-10-23 10:23:07 -07:00
David Woodhouse
d88f1833fc [PATCH] Remove XFS buffered readdir hack
Now that we've moved the readdir hack to the nfsd code, we can
remove the local version from the XFS code.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:06 -04:00
Christoph Hellwig
440037287c [PATCH] switch all filesystems over to d_obtain_alias
Switch all users of d_alloc_anon to d_obtain_alias.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:01 -04:00
Al Viro
30c40d2c01 [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
replace open_bdev_excl/close_bdev_excl with variants taking fmode_t.
superblock gets the value used to mount it stored in sb->s_mode

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21 07:49:00 -04:00
Christoph Hellwig
6c5e51dae2 xfs: fix remount rw with unrecognized options
When we skip unrecognized options in xfs_fs_remount we should just break
out of the switch and not return because otherwise we may skip clearing
the xfs-internal read-only flag.  This will only show up on some
operations like touch because most read-only checks are done by the VFS
which thinks this filesystem is r/w.  Eventually we should replace the
XFS read-only flag with a helper that always checks the VFS flag to make
sure they can never get out of sync.

Bug reported and fix verified by Marcel Beister on #xfs.
Bug fix verified by updated xfstests/189.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-15 10:00:00 -07:00
Steven Whitehouse
a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
Christoph Hellwig
73f6aa4d44 Fix barrier fail detection in XFS
Currently we disable barriers as soon as we get a buffer in xlog_iodone
that has the XBF_ORDERED flag cleared.  But this can be the case not only
for buffers where the barrier failed, but also the first buffer of a
split log write in case of a log wraparound.  Due to the disabled
barriers we can easily get directory corruption on unclean shutdowns.
So instead of using this check add a new buffer flag for failed barrier
writes.

This is a regression vs 2.6.26 caused by patch to use the right macro
to check for the ORDERED flag, as we previously got true returned for
every buffer.

Thanks to Toei Rei for reporting the bug.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-10 11:08:07 -07:00
Lachlan McIlroy
71a8c87fb3 [XFS] Remove xfs_iext_irec_compact_full()
Yet another bug was found in xfs_iext_irec_compact_full() and while the
source of the bug was found it wasn't an easy task to track it down
because the conditions are very difficult to reproduce.

A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their
significant effort in tracking down the source of this corruption.

xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost
identical - they both compact indirect extent lists by moving extents from
subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only
moves extents if all of the extents in the next buffer will fit into the
empty space in the buffer before it. xfs_iext_irec_compact_full() will go
a step further and move part of the next buffer if all the extents wont
fit. It will then shift the remaining extents in the next buffer up to the
start of the buffer. The bug here was that we did not update er_extoff and
this caused extent list corruption.

It does not appear that this extra functionality gains us much. Calling
xfs_iext_irec_compact_pages() instead will do a good enough job at
compacting the indirect list and will be quicker too.

For the case in xfs_iext_indirect_to_direct() the total number of extents
in the indirect list will fit into one buffer so we will never need the
extra functionality of xfs_iext_irec_compact_full() there.

Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the
buffers will never overlap) so we don't want the performance hit that can
incur.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32166a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
2008-09-26 12:17:57 +10:00
Lachlan McIlroy
f1ccd29551 [XFS] Fix extent list corruption in xfs_iext_irec_compact_full().
If we don't move all the records from the next buffer into the current
buffer then we need to update the er_extoff field of the next buffer as we
shift the remaining records to the start of the buffer.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32165a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Russell Cattelan <cattelan@thebarn.com>
2008-09-26 12:16:46 +10:00
Lachlan McIlroy
2fd6f6ec64 [XFS] Don't do I/O beyond eof when unreserving space
When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:32055a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:52:50 +10:00
Lachlan McIlroy
e1f5dbd707 [XFS] Fix use-after-free with buffers
We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.

SGI-PV: 985757

SGI-Modid: xfs-linux-melb:xfs-kern:32025a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:52:13 +10:00
David Chinner
f9114eba1e [XFS] Prevent lockdep false positives when locking two inodes.
If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.

To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks
at once - force calers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.

SGI-PV: 986238

SGI-Modid: xfs-linux-melb:xfs-kern:31999a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17 16:51:21 +10:00
David Chinner
b5b8c9acd5 [XFS] Fix barrier status change detection.
The current code in xlog_iodone() uses the wrong macro to check if the
barrier has been cleared due to an EOPNOTSUPP error form the lower layer.

SGI-PV: 986143

SGI-Modid: xfs-linux-melb:xfs-kern:31984a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17 16:50:50 +10:00
Lachlan McIlroy
364f358a73 [XFS] Prevent direct I/O from mapping extents beyond eof
With the help from some tracing I found that we try to map extents beyond
eof when doing a direct I/O read. It appears that the way to inform the
generic direct I/O path (ie do_direct_IO()) that we have breached eof is
to return an unmapped buffer from xfs_get_blocks_direct(). This will cause
do_direct_IO() to jump to the hole handling code where is will check for
eof and then abort.

This problem was found because a direct I/O read was trying to map beyond
eof and was encountering delayed allocations. The delayed allocations
beyond eof are speculative allocations and they didn't get converted when
the direct I/O flushed the file because there was only enough space in the
current AG to convert and write out the dirty pages within eof. Note that
xfs_iomap_write_allocate() wont necessarily convert all the delayed
allocation passed to it - it will return after allocating the first extent
- so if the delayed allocation extends beyond eof then it will stay that
way.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:31929a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:50:14 +10:00
Christoph Hellwig
6efdf28177 [XFS] Fix regression introduced by remount fixup
Logically we would return an error in xfs_fs_remount code to prevent users
from believing they might have changed mount options using remount which
can't be changed.

But unfortunately mount(8) adds all options from mtab and fstab to the
mount arguments in some cases so we can't blindly reject options, but have
to check for each specified option if it actually differs from the
currently set option and only reject it if that's the case.

Until that is implemented we return success for every remount request, and
silently ignore all options that we can't actually change.

SGI-PV: 985710

SGI-Modid: xfs-linux-melb:xfs-kern:31908a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17 16:49:33 +10:00
Lachlan McIlroy
31bd61f2bb [XFS] Move memory allocations for log tracing out of the critical path
Memory allocations for log->l_grant_trace and iclog->ic_trace are done on
demand when the first event is logged. In xlog_state_get_iclog_space() we
call xlog_trace_iclog() under a spinlock and allocating memory here can
cause us to sleep with a spinlock held and deadlock the system.

For the log grant tracing we use KM_NOSLEEP but that means we can lose
trace entries. Since there is no locking to serialize the log grant
tracing we could race and have multiple allocations and leak memory.

So move the allocations to where we initialize the log/iclog structures.
Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace
since it's not even used.

SGI-PV: 983738

SGI-Modid: xfs-linux-melb:xfs-kern:31896a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:45:37 +10:00
Al Viro
59af1584bf [PATCH] fix ->llseek() for a bunch of directories
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25 01:18:09 -04:00
Christoph Hellwig
e45b590b97 [PATCH] change d_add_ci argument ordering
As pointed out during review d_add_ci argument order should match d_add,
so switch the dentry and inode arguments.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-08-25 01:18:05 -04:00
Adrian Bunk
7a8fc9b248 removed unused #include <linux/version.h>'s
This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23 12:14:12 -07:00
David Howells
9e2b2dc413 CRED: Introduce credential access wrappers
The patches that are intended to introduce copy-on-write credentials for 2.6.28
require abstraction of access to some fields of the task structure,
particularly for the case of one task accessing another's credentials where RCU
will have to be observed.

Introduced here are trivial no-op versions of the desired accessors for current
and other tasks so that other subsystems can start to be converted over more
easily.

Wrappers are introduced into a new header (linux/cred.h) for UID/GID,
EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed
user_struct.  These wrappers are macros because the ordering between header
files mitigates against making them inline functions.

linux/cred.h is #included from linux/sched.h.

Further, XFS is modified such that it no longer defines and uses parameterised
versions of current_fs[ug]id(), thus getting rid of the namespace collision
otherwise incurred.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-08-14 09:35:23 +10:00
Lachlan McIlroy
c6a7b0f8a4 [XFS] Fix use after free in xfs_log_done().
The ticket allocation code got reworked in 2.6.26 and we now free tickets
whereas before we used to cache them so the use-after-free went
undetected.

SGI-PV: 985525

SGI-Modid: xfs-linux-melb:xfs-kern:31877a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-08-13 16:52:50 +10:00
Ruben Porras
c94312de22 [XFS] Make xfs_bmap_*_count_leaves void.
xfs_bmap_count_leaves and xfs_bmap_disk_count_leaves always return always
0, make them void.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31844a

Signed-off-by: Ruben Porras <ruben.porras@linworks.de>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:52:25 +10:00
Lachlan McIlroy
5695ef46ef [XFS] Use KM_NOFS for debug trace buffers
Use KM_NOFS to prevent recursion back into the filesystem which can cause
deadlocks.

In the case of xfs_iread() we hold the lock on the inode cluster buffer
while allocating memory for the trace buffers. If we recurse back into XFS
to flush data that may require a transaction to allocate extents which
needs log space. This can deadlock with the xfsaild thread which can't
push the tail of the log because it is trying to get the inode cluster
buffer lock.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31838a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-08-13 16:51:57 +10:00
Christoph Hellwig
d62c251fe4 [XFS] use KM_MAYFAIL in xfs_mountfs
Use KM_MAYFAIL for the m_perag allocation, we can deal with the error
easily and blocking forever during mount is not a good idea either.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31837a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:51:29 +10:00
Christoph Hellwig
ff4f038c6b [XFS] refactor xfs_mount_free
xfs_mount_free mostly frees the perag data, which is something that is
duplicated in the mount error path.

Move the XFS_QM_DONE call to the caller and remove the useless
mutex_destroy/spinlock_destroy calls so that we can re-use it for the
mount error path. Also rename it to xfs_free_perag to reflect what it
does.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31836a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:50:47 +10:00
Christoph Hellwig
6203300e5e [XFS] don't call xfs_freesb from xfs_unmountfs
xfs_readsb is called before xfs_mount so xfs_freesb should be called after
xfs_unmountfs, too. This means it now happens after a few things during
the of xfs_unmount which all have nothing to do with the superblock.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31835a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:50:21 +10:00
Christoph Hellwig
41b5c2e77a [XFS] xfs_unmountfs should return void
xfs_unmounts can't and shouldn't return errors so declare it as returning
void.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31833a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:49:57 +10:00
Christoph Hellwig
4249023a5d [XFS] cleanup xfs_mountfs
Remove all the useless flags and code keyed off it in xfs_mountfs.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31831a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:49:32 +10:00
Christoph Hellwig
77508ec8e6 [XFS] move root inode IRELE into xfs_unmountfs
The root inode is allocated in xfs_mountfs so it should be release in
xfs_unmountfs. For the unmount case that means we do it after the the
xfs_sync(mp, SYNC_WAIT | SYNC_CLOSE) in the forced shutdown case and the
dmapi unmount event. Note that both reference the rip variable which might
be freed by that time in case inode flushing has kicked in, so strictly
speaking this might count as a bug fix

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31830a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:49:04 +10:00
Christoph Hellwig
3a76c1ea07 [XFS] stop using file_update_time
xfs_ichtime updates the xfs_inode and Linux inode timestamps just fine, no
need to call file_update_time and then copy the values over to the XFS
inode. The only additional thing in file_update_time are checks not
applicable to the write path.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31829a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-08-13 16:48:12 +10:00
Christoph Hellwig
8e5975c82f [XFS] optimize xfs_ichgtime
Port a little optmization from file_update_time to xfs_ichgtime, and only
update the timestamp and mark the inode dirty if the timestamp actually
changes in the timer tick resultion supported by the running kernel.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31827a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:45:13 +10:00
Christoph Hellwig
dff35fd41f [XFS] update timestamp in xfs_ialloc manually
In xfs_ialloc we just want to set all timestamps to the current time. We
don't need to mark the inode dirty like xfs_ichgtime does, and we don't
need nor want the opimizations in xfs_ichgtime that I will introduce in
the next patch.

So just opencode the timestamp update in xfs_ialloc, and remove the new
unused XFS_ICHGTIME_ACC case in xfs_ichgtime.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31825a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:44:15 +10:00
David Chinner
ab4a9b04a3 [XFS] remove the sema_t from XFS.
Now that all users of the sema_t are gone from XFS we can finally kill it.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31823a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:42:10 +10:00
David Chinner
e1f49cf20c [XFS] replace dquot flush semaphore with a completion
Use the new completion flush code to implement the dquot flush lock.
Removes one of the final users of semaphores in the XFS code base.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31822a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:41:43 +10:00
David Chinner
c63942d3ee [XFS] replace inode flush semaphore with a completion
Use the new completion flush code to implement the inode flush lock.
Removes one of the final users of semaphores in the XFS code base.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31817a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:41:16 +10:00
David Chinner
b4dd330b9e [XFS] replace the XFS buf iodone semaphore with a completion
The xfs_buf_t b_iodonesema is really just a semaphore that wants to be a
completion. Change it to a completion and remove the last user of the
sema_t from XFS.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31815a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:36:11 +10:00
David Chinner
12017faf38 [XFS] clean up stale references to semaphores
A lot of code has been converted away from semaphores, but there are still
comments that reference semaphore behaviour. The log code is the worst
offender. Update the comments to reflect what the code really does now.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31814a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:34:31 +10:00
Harvey Harrison
597bca6378 [XFS] use get_unaligned_* helpers
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31813a

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:29:21 +10:00
Lachlan McIlroy
d63f154a36 [XFS] Fix compile failure in xfs_buf_trace()
SGI-PV: 957103

SGI-Modid: xfs-linux-melb:xfs-kern:31804a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:28:40 +10:00
Christoph Hellwig
169d6227a7 [XFS] Use the same btree_cur union member for alloc and inobt trees.
The alloc and inobt btree use the same agbp/agno pair in the btree_cur
union. Make them use the same bc_private.a union member so that code for
these two short form btree implementations can be shared.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31788a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:25:27 +10:00
Christoph Hellwig
cdcf43335c [XFS] small cleanups in xfs_btree.c
Remove unneeded xfs_btree_get_block forward declaration. Move
xfs_btree_firstrec next to xfs_btree_lastrec.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31787a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:23:50 +10:00
Christoph Hellwig
41be8bed1f [XFS] sanitize xfs_initialize_vnode
Sanitize setting up the Linux indode.

Setting up the xfs_inode <-> inode link is opencoded in xfs_iget_core now
because that's the only place it needs to be done, xfs_initialize_vnode is
renamed to xfs_setup_inode and loses all superflous paramaters. The check
for I_NEW is removed because it always is true and the di_mode check moves
into xfs_iget_core because it's only needed there.

xfs_set_inodeops and xfs_revalidate_inode are merged into xfs_setup_inode
and the whole things is moved into xfs_iops.c where it belongs.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31782a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:23:13 +10:00
Christoph Hellwig
5ec7f8c7d1 [XFS] kill bhv_vnode_t
All remaining bhv_vnode_t instance are in code that's more or less Linux
specific. (Well, for xfs_acl.c that could be argued, but that code is on
the removal list, too). So just do an s/bhv_vnode_t/struct inode/ over the
whole tree. We can clean up variable naming and some useless helpers
later.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31781a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:22:40 +10:00
Christoph Hellwig
df80c933f9 [XFS] remove some easy bhv_vnode_t instances
In various places we can just move a VFS_I call into the argument list of
called functions/macros instead of having a local bhv_vnode_t.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31776a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:22:09 +10:00
Christoph Hellwig
e1cccd917b [XFS] kill xfs_lock_dir_and_entry
When multiple inodes are locked in XFS it happens in order of the inode
number, with the everything but the first inode trylocked if any of the
previous inodes is in the AIL.

Except for the sorting of the inodes this logic is implemented in
xfs_lock_inodes, but also partially duplicated in xfs_lock_dir_and_entry
in a particularly stupid way adds a lock roundtrip if the inode ordering
is not optimal.

This patch adds a new helper xfs_lock_two_inodes that takes two inodes and
locks them in the most optimal way according to the above locking protocol
and uses it for all places that want to lock two inodes.

The only caller of xfs_lock_inodes is xfs_rename which might lock up to
four inodes.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31772a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:18:07 +10:00
Christoph Hellwig
1550d0b0b0 [XFS] kill INDUCE_IO_ERROR
All the error injection is already enabled through ifdef DEBUG, so kill
the never set second cpp symbol to activate it without the rest of the
debugging infrastructure.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31771a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:17:37 +10:00
Christoph Hellwig
907f49a8f5 [XFS] implement IHOLD/IRELE directly
Now that all direct calls to VN_HOLD/VN_RELE are gone we can implement
IHOLD/IRELE directly.

For the IHOLD case also replace igrab with a direct increment of i_count
because we are guaranteed to already have a live and referenced inode by
the VFS. Also remove the vn_hold statistic because it's been rather
meaningless for some time with most references done by other callers.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31764a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:13:45 +10:00
Christoph Hellwig
0b1f917730 [XFS] remove remaining VN_HOLD calls
Use IHOLD(ip) instead of VN_HOLD(VFS_I(ip)).

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31765a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:13:09 +10:00
Christoph Hellwig
604323ca76 [XFS] remove spurious VN_HOLD/VN_RELE calls from xfs_acl.c
All the ACL routines are called from inode operations which are guaranteed
to have a referenced inode by the VFS, so there's no need for the ACL code
to grab another temporary one.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31763a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:12:37 +10:00
Christoph Hellwig
863890cd90 [XFS] kill vn_to_inode
bhv_vnode_t is just a typedef for struct inode, so there's
no need for a helper to convert between the two.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31761a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:12:05 +10:00
Christoph Hellwig
a19d033cd2 [XFS] Remove vn_from_inode()
bhv_vnode_t is just a typedef for struct inode, so there's
no need for a helper to convert between the two.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31760a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:11:26 +10:00
Eric Sandeen
39dab9d7da [XFS] remove shouting-indirection macros from xfs_trans.h
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31758a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:10:52 +10:00
Eric Sandeen
db7a2c71d2 [XFS] convert xfs to use ERR_CAST
Looks like somehow xfs got missed in the conversion that took place in
e231c2ee64, "Convert ERR_PTR(PTR_ERR(p))
instances to ERR_CAST(p)
<http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit
diff;h=e231c2ee64eb1c5cd3c63c31da9dac7d888dcf7f>"

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31757a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:09:25 +10:00
Eric Sandeen
cdeb380aa2 [XFS] remove INT_GET and friends
Thanks to hch's endian work, INT_GET etc are no longer used, and may as
well be removed. INT_SET is still used in the acl code, though.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31756a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:07:53 +10:00
Niv Sardi
322ff6b8cd [XFS] Move xfs_attr_rolltrans to xfs_trans_roll
Move it from the attr code to the transaction code and make
the attr code call the new function.

We rolltrans is really usefull whenever we want to use rolling
transaction, should be generic, it isn't dependent on any part
of the attr code anyway.

We use this excuse to change all the:

if ((error = xfs_attr_rolltrans()))

calls into:

error = xfs_trans_roll();

if (error)

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31729a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:05:49 +10:00
Christoph Hellwig
a738159df2 [XFS] don't leak m_fsname/m_rtname/m_logname
Add a helper to free the m_fsname/m_rtname/m_logname allocations and use
it properly for all mount failure cases. Also switch the allocations for
these to kstrdup while we're at it.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31728a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:04:05 +10:00
Niv Sardi
5e9da7b7a1 [XFS] Move attr log alloc size calculator to another function.
We will need that to be able to calculate the size of log we need for a
specific attr (for Create+EA). The local flag is needed so that we can
fail if we run into ENOSPC when trying to alloc blocks.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31727a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:03:35 +10:00
David Chinner
6785073ba1 [XFS] Use KM_NOFS for incore inode extent tree allocation V2
If we allow incore extent tree allocations to recurse into the
filesystem under memory pressure, new delayed allocations through
xfs_iomap_write_delay() can deadlock on themselves if memory
reclaim tries to write back dirty pages from that inode.

It will deadlock in xfs_iomap_write_allocate() trying to take the
ilock we already hold. This can also show up as complex ABBA deadlocks
when multiple threads are triggering memory reclaim when trying to
allocate extents.

The main cause of this is the fact that delayed allocation is not done in
a transaction, so KM_NOFS is not automatically added to the allocations to
prevent this recursion.

Mark all allocations done for the incore inode extent tree as KM_NOFS to
ensure they never recurse back into the filesystem.

Version 2: o KM_NOFS implies KM_SLEEP, so just use KM_NOFS

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31726a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:02:51 +10:00
David Chinner
e6064d30c3 [XFS] XFS: Kill xfs_vtoi()
xfs_vtoi() is redundant and only unsed in small sections of code.
Replace them with widely used XFS_I() inline and kill xfs_vtoi().

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31725a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:01:45 +10:00
David Chinner
e4f7529108 [XFS] Kill shouty XFS_ITOV() macro
Replace XFS_ITOV() with the new VFS_I() inline.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31724a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:00:45 +10:00
David Chinner
705db4a24e [XFS] kill shouty XFS_ITOV_NULL macro
Replace XFS_ITOV_NULL() with the new VFS_I() inline.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31722a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 15:47:43 +10:00
David Chinner
0165164625 [XFS] Avoid directly referencing the VFS inode.
In several places we directly convert from the XFS inode
to the linux (VFS) inode by a simple deference of ip->i_vnode.
We should not do this - a helper function should be used to
extract the VFS inode from the XFS inode.

Introduce the function VFS_I() to extract the VFS inode
from the XFS inode. The name was chosen to match XFS_I() which
is used to extract the XFS inode from the VFS inode.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31720a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 15:45:15 +10:00
Lachlan McIlroy
3790689fa3 [XFS] Do not access buffers after dropping reference count
We should not access a buffer after dropping it's reference count
otherwise we could race with another thread that releases the final
reference count and frees the buffer causing us to access potentially
unmapped memory. The bug this change fixes only occured on DEBUG XFS since
the offending code was in an ASSERT.

SGI-PV: 984429

SGI-Modid: xfs-linux-melb:xfs-kern:31715a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-08-13 15:42:10 +10:00
David Chinner
79071eb0b2 [XFS] Use the generic bitops rather than implementing them ourselves.
This keeps xfs_lowbit64 as it was since there aren't good generic helpers
there ... Patch inspired by Andi Kleen.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31472a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 15:41:12 +10:00
Nick Piggin
ca5de404ff fs: rename buffer trylock
Like the page lock change, this also requires name change, so convert the
raw test_and_set bitop to a trylock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 21:56:09 -07:00
Nick Piggin
529ae9aaa0 mm: rename page trylock
Converting page lock to new locking bitops requires a change of page flag
operation naming, so we might as well convert it to something nicer
(!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked).

This also facilitates lockdeping of page lock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-04 21:31:34 -07:00
Christoph Hellwig
f13fae2d2a [XFS] Remove vn_revalidate calls in xfs.
These days most of the attributes in struct inode are properly kept in
sync by XFS. This patch removes the need for vn_revalidate completely by:

- keeping inode.i_flags uptodate after any flags are updated in

xfs_ioctl_setattr

- keeping i_mode, i_uid and i_gid uptodate in xfs_setattr

SGI-PV: 984566

SGI-Modid: xfs-linux-melb:xfs-kern:31679a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:39 +10:00
Christoph Hellwig
0f285c8a1c [XFS] Now that xfs_setattr is only used for attributes set from ->setattr
it can be switched to take struct iattr directly and thus simplify the
implementation greatly. Also rename the ATTR_ flags to XFS_ATTR_ to not
conflict with the ATTR_ flags used by the VFS.

SGI-PV: 984565

SGI-Modid: xfs-linux-melb:xfs-kern:31678a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:37 +10:00
Christoph Hellwig
25fe55e814 [XFS] xfs_setattr currently doesn't just handle the attributes set through
->setattr but also addition XFS-specific attributes: project id, inode
flags and extent size hint. Having these in a single function makes it
more complicated and forces to have us a bhv_vattr intermediate structure
eating up stackspace.

This patch adds a new xfs_ioctl_setattr helper for the XFS ioctls that set
these attributes and remove the code to set them through xfs_setattr.

SGI-PV: 984564

SGI-Modid: xfs-linux-melb:xfs-kern:31677a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:36 +10:00
Lachlan McIlroy
c032bfcf46 [XFS] fix use after free with external logs or real-time devices
SGI-PV: 983806

SGI-Modid: xfs-linux-melb:xfs-kern:31666a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:59:34 +10:00
Tim Shimmin
6a617dd22b [XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a
particular case, the delta param which is supposed to describe the region
where extents have changed was not updated appropriately.

SGI-PV: 984030

SGI-Modid: xfs-linux-melb:xfs-kern:31663a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
2008-07-28 16:59:32 +10:00
Christoph Hellwig
766b0925c0 [XFS] fix compilation without CONFIG_PROC_FS
SGI-PV: 984019

SGI-Modid: xfs-linux-melb:xfs-kern:31408a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:31 +10:00
Christoph Hellwig
26cc002180 [XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31405a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:29 +10:00
Christoph Hellwig
62a877e35d [XFS] fix mount option parsing in remount
Remount currently happily accept any option thrown at it, although the
only filesystem specific option it actually handles is barrier/nobarrier.
And it actually doesn't handle these correctly either because it only uses
the value it parsed when we're doing a ro->rw transition. In addition to
that there's also a bad bug in xfs_parseargs which doesn't touch the
actual option in the mount point except for a single one,
XFS_MOUNT_SMALL_INUMS and thus forced any filesystem that's every
remounted in some way to not support 64bit inodes with no way to recover
unless unmounted.

This patch changes xfs_fs_remount to use it's own linux/parser.h based
options parse instead of xfs_parseargs and reject all options except for
barrier/nobarrier and to the right thing in general. Eventually I'd like
to have a single big option table used for mount aswell but that can wait
for a while.

SGI-PV: 983964

SGI-Modid: xfs-linux-melb:xfs-kern:31382a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:28 +10:00
Eric Sandeen
deeb5912db [XFS] Disable queue flag test in barrier check.
md raid1 can pass down barriers, but does not set an ordered flag on the
queue, so xfs does not even attempt a barrier write, and will never use
barriers on these block devices.

Remove the flag check and just let the barrier write test determine
barrier support.

A possible risk here is that if something does not set an ordered flag and
also does not properly return an error on a barrier write... but if it's
any consolation jbd/ext3/reiserfs never test the flag, and don't even do a
test write, they just disable barriers the first time an actual journal
barrier write fails.

SGI-PV: 983924

SGI-Modid: xfs-linux-melb:xfs-kern:31377a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:26 +10:00
Christoph Hellwig
9f8868ffb3 [XFS] streamline init/exit path
Currently the xfs module init/exit code is a mess. It's farmed out over a
lot of function with very little error checking. This patch makes sure we
propagate all initialization failures properly and clean up after them.
Various runtime initializations are replaced with compile-time
initializations where possible to make this easier. The exit path is
similarly consolidated.

There's now split out function to create/destroy the kmem zones and
alloc/free the trace buffers. I've also changed the ktrace allocations to
KM_MAYFAIL and handled errors resulting from that.

And yes, we really should replace the XFS_*_TRACE ifdefs with a single
XFS_TRACE..

SGI-PV: 976035

SGI-Modid: xfs-linux-melb:xfs-kern:31354a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:25 +10:00
Tim Shimmin
136f8f21b6 [XFS] Fix up problem when CONFIG_XFS_POSIX_ACL is not set and yet we still
can use the _ACL_TYPE_* definitions in linux-2.6/xfs_xattr.c. The
forthcoming generic acl code will also fix this problem.

SGI-PV: 982343

SGI-Modid: xfs-linux-melb:xfs-kern:31369a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:17 +10:00
Lachlan McIlroy
2edbddd5f4 [XFS] Don't assert if trying to mount with blocksize > pagesize
If we don't do the blocksize/PAGESIZE check before calling
xfs_sb_validate_fsb_count() we can assert if we try to mount with a
blocksize > pagesize. The assert is valid so leave it and just move the
blocksize/pagesize check earlier.

SGI-PV: 983734

SGI-Modid: xfs-linux-melb:xfs-kern:31365a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:59:15 +10:00
Christoph Hellwig
8f8670bb1c [XFS] Don't update mtime on rename source
As reported by Michael-John Turner XFS updates the mtime on the source
inode of a rename call in case it's a directory and changes the parent.

This doesn't make any sense, is not mentioned in the standards and not
performed by any other Linux filesystems so remove it.

SGI-PV: 983684

SGI-Modid: xfs-linux-melb:xfs-kern:31364a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:14 +10:00
Lachlan McIlroy
313b5c767a [XFS] Allow xfs_bmbt_split() to fallback to the lowspace allocator
algorithm

If xfs_bmbt_split() cannot find an AG with sufficient free space to
satisfy a full extent btree split then fall back to the lowspace allocator
algorithm.

SGI-PV: 983338

SGI-Modid: xfs-linux-melb:xfs-kern:31359a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:59:13 +10:00
Lachlan McIlroy
b877e3d37d [XFS] Restore the lowspace extent allocator algorithm
When free space is running low the extent allocator may choose to allocate
an extent from an AG without leaving sufficient space for a btree split
when inserting the new extent (see where xfs_bmap_btalloc() sets minleft
to 0). In this case the allocator will enable the lowspace algorithm which
is supposed to allow further allocations (such as btree splits and
newroots) to allocate from sequential AGs. This algorithm has been broken
for a long time and this patch restores its behaviour.

SGI-PV: 983338

SGI-Modid: xfs-linux-melb:xfs-kern:31358a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:59:11 +10:00
Lachlan McIlroy
4ddd8bb1d2 [XFS] use minleft when allocating in xfs_bmbt_split()
The bmap btree split code relies on a previous data extent allocation
(from xfs_bmap_btalloc()) to find an AG that has sufficient space to
perform a full btree split, when inserting the extent. When converting
unwritten extents we don't allocate a data extent so a btree split will be
the first allocation. In this case we need to set minleft so the allocator
will pick an AG that has space to complete the split(s).

SGI-PV: 983338

SGI-Modid: xfs-linux-melb:xfs-kern:31357a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:59:10 +10:00
Christoph Hellwig
e182f57ac0 [XFS] attrmulti cleanup
xfs_attrmulti_by_handle currently request the size based on
sizeof(attr_multiop_t) but should be using sizeof(xfs_attr_multiop_t)
because that is what it is dealing with. Despite beeing wrong this
actually harmless in practice because both structures are the same size on
all platforms.

But this sizeof was the only user of struct attr_multiop so we can just
kill it. Also move the ATTR_OP_* defines xfs_attr.h into the struct
xfs_attr_multiop defintion in xfs_fs.h because they are only used with
that structure, and are part of the user ABI for the
XFS_IOC_ATTRMULTI_BY_HANDLE ioctl.

SGI-PV: 983508

SGI-Modid: xfs-linux-melb:xfs-kern:31352a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:09 +10:00
Christoph Hellwig
90ad58a83a [XFS] Check for invalid flags in xfs_attrlist_by_handle.
xfs_attrlist_by_handle should only take the ATTR_ flags for the root
namespaces. The ATTR_KERN* flags may change at anytime and expect special
preconditions that can't be guaranteed for userspace-originating requests.
For example passing down ATTR_KERNNOVAL through xfs_attrlist_by_handle
will hit an assert in debug builds currently.

SGI-PV: 983677

SGI-Modid: xfs-linux-melb:xfs-kern:31351a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:07 +10:00
Barry Naujok
07fe4dd48d [XFS] Fix CI lookup in leaf-form directories
Instead of comparing buffer pointers, compare buffer block numbers and
don't keep buff

SGI-PV: 983564

SGI-Modid: xfs-linux-melb:xfs-kern:31346a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:06 +10:00
Lachlan McIlroy
f9e09f095f [XFS] Use the generic xattr methods.
Add missing file fs/xfs/linux-2.6/xfs_xattr.c

SGI-PV: 982343

SGI-Modid: xfs-linux-melb:xfs-kern:31234a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:04 +10:00
Lachlan McIlroy
ddea2d5246 [XFS] Always reset btree cursor after an insert
After a btree insert operation a cursor can be invalid due to block splits
and a maybe a new root block. We reset the cursor in xfs_bmbt_insert() in
the cases where we think we need to but it isn't enough as we still see
assertions. Just do what we do elsewhere and reset the cursor
unconditionally. Also remove the fix to revalidate the original cursor in
xfs_bmbt_insert().

SGI-PV: 983336

SGI-Modid: xfs-linux-melb:xfs-kern:31342a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:59:03 +10:00
Lachlan McIlroy
6bd8fc8a55 [XFS] Convert ASSERTs to XFS_WANT_CORRUPTED_GOTOs
ASSERTs are no good to us on a non-debug build so use
XFS_WANT_CORRUPTED_GOTOs to report extent btree corruption ASAP.

SGI-PV: 983500

SGI-Modid: xfs-linux-melb:xfs-kern:31338a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:59:02 +10:00
Barry Naujok
90bb7ab077 [XFS] Fix returning case-preserved name with CI node form directories
xfs_dir2_node_lookup() calls xfs_da_node_lookup_int() which iterates
through leaf blocks containing the matching hash value for the name being
looked up. Inside xfs_da_node_lookup_int(), it calls the
xfs_dir2_leafn_lookup_for_entry() for each leaf block.
xfs_dir2_leafn_lookup_for_entry() iterates through each matching
hash/offset pair doing a name comparison to find the matching dirent.

For CI mode, the state->extrablk retains the details of the block that has
the CI match so xfs_dir2_node_lookup() can return the case-preserved name.

The original implementation didn't retain the xfs_da_buf_t properly, so
the lookup was returning a bogus name to be stored in the dentry.

In the case of unlink, the bad name was passed and in debug mode, ASSERTed
when it can't find the entry.

SGI-PV: 983284

SGI-Modid: xfs-linux-melb:xfs-kern:31337a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:01 +10:00
Christoph Hellwig
e5700704b2 [XFS] Don't update i_size for directories and special files
The core kernel uses vfs_getattr to look at the inode size and similar
attributes, so there is no need to keep i_size uptodate for directories or
special files. This means we can remove xfs_validate_fields because the
I/O path already keeps i_size uptodate for regular files.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31336a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:59:00 +10:00
Christoph Hellwig
8f112e3bc3 [XFS] Merge xfs_rmdir into xfs_remove
xfs_remove and xfs_rmdir are almost the same with a little more work
performed in xfs_rmdir due to the . and .. entries. This patch merges
xfs_rmdir into xfs_remove and performs these actions conditionally.

Also clean up the error handling which was a nightmare in both versions
before.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31335a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:58 +10:00
Tim Shimmin
61f10fad19 [XFS] Fix up warning for xfs_vn_listxatt's call of list_one_attr() with
context count of ssize_t versus int. Change context count to be ssize_t.

SGI-PV: 983395

SGI-Modid: xfs-linux-melb:xfs-kern:31333a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:57 +10:00
Lachlan McIlroy
6278debdf9 [XFS] fix extent corruption in xfs_iext_irec_compact_full()
This function is used to compact the indirect extent list by moving
extents from one page to the previous to fill them up. After we move some
extents to an earlier page we need to shuffle the remaining extents to the
start of the page. The actual bug here is the second argument to memmove()
needs to index past the extents, that were copied to the previous page,
and move the remaining extents. For pages that are already full (ie
ext_avail == 0) the compaction code has no net effect so don't do it.

SGI-PV: 983337

SGI-Modid: xfs-linux-melb:xfs-kern:31332a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:56 +10:00
Lachlan McIlroy
7f871d5d1b [XFS] make inode reclaim wait for log I/O to complete
During a forced shutdown a xfs inode can be destroyed before log I/O
involving that inode is complete. We need to wait for the inode to be
unpinned before tearing it down. Version 2 cleans up the code a bit by
relying on xfs_iflush() to do the unpinning and forced shutdown check.

SGI-PV: 981240

SGI-Modid: xfs-linux-melb:xfs-kern:31326a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:58:54 +10:00
Christoph Hellwig
ad9b463aa2 [XFS] Switches xfs_vn_listxattr to set it's put_listent callback directly
and not go through xfs_attr_list.

SGI-PV: 983395

SGI-Modid: xfs-linux-melb:xfs-kern:31324a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:53 +10:00
Christoph Hellwig
caf8aabdbc [XFS] Factor out code for whether inode has attributes or not.
SGI-PV: 983394

SGI-Modid: xfs-linux-melb:xfs-kern:31323a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:52 +10:00
Eric Sandeen
ae23a5e87d [XFS] Pack some shortform dir2 structures for the ARM old ABI
architecture.

This should fix the longstanding issues with xfs and old ABI arm boxes,
which lead to various asserts and xfs shutdowns, and for which an
(incorrect) patch has been floating around for years.

I've verified this patch by comparing the on-disk structure layouts using
pahole from the dwarves package, as well as running through a bit of xfsqa
under qemu-arm, modified so that the check/repair phase after each test
actually executes check/repair from the x86 host, on the filesystem
populated by the arm emulator. Thus far it all looks good.

There are 2 other structures with extra padding at the end, but they don't
seem to cause trouble. I suppose they could be packed as well:
xfs_dir2_data_unused_t and xfs_dir2_sf_t.

Note that userspace needs a similar treatment, and any filesystems which
were running with the previous rogue "fix" will now see corruption (either
in the kernel, or during xfs_repair) with this fix properly in place; it
may be worth teaching xfs_repair to identify and fix that specific issue.

SGI-PV: 982930

SGI-Modid: xfs-linux-melb:xfs-kern:31280a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:50 +10:00
Lachlan McIlroy
0ec585163a [XFS] Use the generic xattr methods.
Use the generic set, get and removexattr methods and supply the s_xattr
array with fine-grained handlers. All XFS/Linux highlevel attr handling is
rewritten from scratch and placed into fs/xfs/linux-2.6/xfs_xattr.c so
that it's separated from the generic low-level code.

SGI-PV: 982343

SGI-Modid: xfs-linux-melb:xfs-kern:31234a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:49 +10:00
Barry Naujok
d532506cd8 [XFS] Invalidate dentry in unlink/rmdir if in case-insensitive mode
The vfs_unlink/d_delete functionality in the Linux VFS make the
dentry negative if it is the only inode being referenced. Case-insensitive
mode doesn't work with negative dentries, so if using CI-mode, invalidate
the dentry on unlink/rmdir.

SGI-PV: 983102
SGI-Modid: xfs-linux-melb:xfs-kern:31308a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:47 +10:00
Barry Naujok
87affd08bc [XFS] Zero uninitialised xfs_da_args structure in xfs_dir2.c
Fixes a problem in the xfs_dir2_remove and xfs_dir2_replace paths which
intenally call directory format specific lookup funtions that assume
args->cmpresult is zeroed.

SGI-PV: 982606
SGI-Modid: xfs-linux-melb:xfs-kern:31268a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:46 +10:00
Barry Naujok
866d5dc974 [XFS] Remove d_add call for an ENOENT lookup return code
SGI-PV: 981521
SGI-Modid: xfs-linux-melb:xfs-kern:31214a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-07-28 16:58:44 +10:00
Barry Naujok
d3689d7687 [XFS] kmem_free and kmem_realloc to use const void *
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31212a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:43 +10:00
Barry Naujok
189f4bf22b [XFS] XFS: ASCII case-insensitive support
Implement ASCII case-insensitive support. It's primary purpose is for
supporting existing filesystems that already use this case-insensitive
mode migrated from IRIX. But, if you only need ASCII-only case-insensitive
support (ie. English only) and will never use another language, then this
mode is perfectly adequate.

ASCII-CI is implemented by generating hashes based on lower-case letters
and doing lower-case compares. It implements a new xfs_nameops vector for
doing the hashes and comparisons for all filename operations.

To create a filesystem with this CI mode, use: # mkfs.xfs -n version=ci
<device>

SGI-PV: 981516
SGI-Modid: xfs-linux-melb:xfs-kern:31209a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:42 +10:00
Barry Naujok
384f3ced07 [XFS] Return case-insensitive match for dentry cache
This implements the code to store the actual filename found during a
lookup in the dentry cache and to avoid multiple entries in the dcache
pointing to the same inode.

To avoid polluting the dcache, we implement a new directory inode
operations for lookup. xfs_vn_ci_lookup() stores the correct case name in
the dcache.

The "actual name" is only allocated and returned for a case- insensitive
match and not an actual match.

Another unusual interaction with the dcache is not storing negative
dentries like other filesystems doing a d_add(dentry, NULL) when an ENOENT
is returned. During the VFS lookup, if a dentry returned has no inode,
dput is called and ENOENT is returned. By not doing a d_add, this actually
removes it completely from the dcache to be reused. create/rename have to
be modified to support unhashed dentries being passed in.

SGI-PV: 981521
SGI-Modid: xfs-linux-melb:xfs-kern:31208a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:40 +10:00
Barry Naujok
6a178100ab [XFS] Add op_flags field and helpers to xfs_da_args
The end of the xfs_da_args structure has 4 unsigned char fields for
true/false information on directory and attr operations using the
xfs_da_args structure.

The following converts these 4 into a op_flags field that uses the first 4
bits for these fields and allows expansion for future operation
information (eg. case-insensitive lookup request).

SGI-PV: 981520
SGI-Modid: xfs-linux-melb:xfs-kern:31206a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:37 +10:00
Barry Naujok
5163f95a08 [XFS] Name operation vector for hash and compare
Adds two pieces of functionality for the basis of case-insensitive support
in XFS:

1. A comparison result enumerated type: xfs_dacmp. It represents an

exact match, case-insensitive match or no match at all. This patch

only implements different and exact results.

2. xfs_nameops vector for specifying how to perform the hash generation

of filenames and comparision methods. In this patch the hash vector

points to the existing xfs_da_hashname function and the comparison

method does a length compare, and if the same, does a memcmp and

return the xfs_dacmp result.

All filename functions that use the hash (create, lookup remove, rename,
etc) now use the xfs_nameops.hashname function and all directory lookup
functions also use the xfs_nameops.compname function.

The lookup functions also handle case-insensitive results even though the
default comparison function cannot return that. And important aspect of
the lookup functions is that an exact match always has precedence over a
case-insensitive. So while a case-insensitive match is found, we have to
keep looking just in case there is an exact match. In the meantime, the
info for the first case-insensitive match is retained if no exact match is
found.

SGI-PV: 981519
SGI-Modid: xfs-linux-melb:xfs-kern:31205a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:36 +10:00
Eric Sandeen
68f34d5107 [XFS]
de-duplicate calls to xfs_attr_trace_enter

Every call to xfs_attr_trace_enter() shares the exact same 16 args in the
middle... just send in the context pointer and let the next level down
split it into the ktrace.

Compile tested only.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:31200a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:35 +10:00
Christoph Hellwig
120226c11a [XFS] add missing call to xfs_filestream_unmount on xfs_mountfs failure
SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31199a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:33 +10:00
Christoph Hellwig
effa2eda3a [XFS] rename error2 goto label in xfs_fs_fill_super
SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31198a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:31 +10:00
Christoph Hellwig
95db4e21b7 [XFS] kill calls to xfs_binval in the mount error path
xfs_binval aka xfs_flush_buftarg is the first thing done in
xfs_free_buftarg, so there is no need to have duplicated calls just before
xfs_free_buftarg in the mount failure path.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31197a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:30 +10:00
Christoph Hellwig
c962fb7902 [XFS] kill xfs_mount_init
xfs_mount_init is inlined into xfs_fs_fill_super and allocation switched
to kzalloc. Plug a leak of the mount structure for most early mount
failures. Move xfs_icsb_init_counters to as late as possible in the mount
path and make sure to undo it so that no stale hotplug cpu notifiers are
left around on mount failures.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31196a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:29 +10:00
Christoph Hellwig
bdd907bab7 [XFS] allow xfs_args_allocate to fail
Switch xfs_args_allocate to kzalloc and handle failures.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31195a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:27 +10:00
Christoph Hellwig
e34b562c6b [XFS] add xfs_setup_devices helper
Split setting the block and sector size out of xfs_fs_fill_super into a
small helper to make xfs_fs_fill_super more readable.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31194a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:26 +10:00
Christoph Hellwig
19f354d4c3 [XFS] sort out opening and closing of the block devices
Currently closing the rt/log block device is done in the wrong spot, and
far too early. So revampt it:

- xfs_blkdev_put moved out of xfs_free_buftarg into the caller so that

it is done after tearing down the buftarg completely.

- call to xfs_unmountfs_close moved from xfs_mountfs into caller so

that it's done after tearing down the filesystem completely.

- xfs_unmountfs_close is renamed to xfs_close_devices and made static

in xfs_super.c

- opening of the block devices is split into a helper xfs_open_devices

that is symetric in use to xfs_close_devices

- xfs_unmountfs can now lose struct cred

- error handling around device opening sanitized in xfs_fs_fill_super

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31193a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:25 +10:00
Christoph Hellwig
af15b8953a [XFS] don't call xfs_freesb from xfs_mountfs failure case
Freeing of the superblock is already handled in the caller, and that is
more symmetric with the mount path, too.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31192a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:23 +10:00
Christoph Hellwig
f8f15e42b4 [XFS] merge xfs_mount into xfs_fs_fill_super
xfs_mount is already pretty linux-specific so merge it into
xfs_fs_fill_super to allow for a more structured mount code in the next
patches. xfs_start_flags and xfs_finish_flags also move to xfs_super.c.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31189a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:21 +10:00
Christoph Hellwig
e48ad3160e [XFS] merge xfs_unmount into xfs_fs_put_super / xfs_fs_fill_super
xfs_unmount is small and already pretty Linux specific, so merge it into
the callers. The real unmount path is simplified a little by doing a
WARN_ON on the xfs_unmount_flush retval directly instead of propagating
the error back to the caller, and the mout failure case in simplified
significantly by removing the forced shutdown case and all the dmapi
events that shouldn't be sent because the dmapi mount event hasn't been
sent by that time either.

SGI-PV: 981951
SGI-Modid: xfs-linux-melb:xfs-kern:31188a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:20 +10:00
Christoph Hellwig
61436febae [XFS] kill xfs_igrow_start and xfs_igrow_finish
xfs_igrow_start just expands to xfs_zero_eof with two asserts that are
useless in the context of the only caller and some rather confusing
comments.

xfs_igrow_finish is just a few lines of code decorated again with useless
asserts and confusing comments.

Just kill those two and merge them into xfs_setattr.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31186a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:18 +10:00
Christoph Hellwig
48b62a1a97 [XFS] merge xfs_mntupdate into xfs_fs_remount
xfs_mntupdate already is completely Linux specific due to the VFS flags
passed in, so it might aswell be merged into xfs_fs_remount.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31185a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:17 +10:00
Christoph Hellwig
fa6adbe088 [XFS] kill xfs_uuid_unmount
Quite useless wrapper that doesn't help making the code more readable.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31184a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:16 +10:00
David Chinner
4b166de0a0 [XFS] Update valid fields in xfs_mount_log_sb()
Recent changes to update the version number during mount (attr2 stuff)
failed to change the assert that checked for calid flags being changed on
mount. Clearly this path hasn't been exercised by the test code....

SGI-PV: 981950
SGI-Modid: xfs-linux-melb:xfs-kern:31183a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:14 +10:00
Christoph Hellwig
911ee3de3d [XFS] Kill attr_capable checks as already done in xattr_permission.
No need for addition permission checks in the xattr handler,
fs/xattr.c:xattr_permission() already does them, and in fact slightly more
strict then what was in the attr_capable handlers.

SGI-PV: 981809
SGI-Modid: xfs-linux-melb:xfs-kern:31164a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:13 +10:00
Matthew Wilcox
d748c62367 [XFS] Convert l_flushsema to a sv_t
The l_flushsema doesn't exactly have completion semantics, nor mutex
semantics. It's used as a list of tasks which are waiting to be notified
that a flush has completed. It was also being used in a way that was
potentially racy, depending on the semaphore implementation.

By using a sv_t instead of a semaphore we avoid the need for a separate
counter, since we know we just need to wake everything on the queue.

Original waitqueue implementation from Matthew Wilcox. Cleanup and
conversion to sv_t by Christoph Hellwig.

SGI-PV: 981507
SGI-Modid: xfs-linux-melb:xfs-kern:31059a

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:12 +10:00
Michael Nishimoto
d729eae893 [XFS] Ensure that 2 GiB xfs logs work properly.
We found this while experimenting with 2GiB xfs logs. The previous code
never assumed that xfs logs would ever get so large.

SGI-PV: 981502
SGI-Modid: xfs-linux-melb:xfs-kern:31058a

Signed-off-by: Michael Nishimoto <miken@agami.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:11 +10:00
Denys Vlasenko
b41759cf11 [XFS] Remove unused wbc parameter from xfs_start_page_writeback()
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31057a

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:09 +10:00
Denys Vlasenko
4f0e8a9816 [XFS] Remove unused Falgs parameter from xfs_qm_dqpurge()
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31056a

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:08 +10:00
Denys Vlasenko
f0e2d93c29 [XFS] Remove unused arg from kmem_free()
kmem_free() function takes (ptr, size) arguments but doesn't actually use
second one.

This patch removes size argument from all callsites.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31050a

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:07 +10:00
Tim Shimmin
7c12f29650 [XFS] Fix up noattr2 so that it will properly update the versionnum and
features2 fields.

Previously, mounting with noattr2 failed to achieve anything because
although it cleared the attr2 mount flag, it would set it again as soon as
it processed the superblock fields. The fix now has an explicit noattr2
flag and uses it later to fix up the versionnum and features2 fields.

SGI-PV: 980021
SGI-Modid: xfs-linux-melb:xfs-kern:31003a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:05 +10:00
Barry Naujok
f9f6dce019 [XFS] Split xfs_dir2_leafn_lookup_int into its two pieces of functionality
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30834a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:03 +10:00
Al Viro
2d8f30380a [PATCH] sanitize __user_walk_fd() et.al.
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
	user_path_at(dfd, pathname, flags, &path)
	user_path(pathname, &path)
	user_lpath(pathname, &path)
	user_path_dir(pathname, &path)  (fail if not a directory)
  The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:34 -04:00
Miklos Szeredi
2f1936b877 [patch 3/5] vfs: change remove_suid() to file_remove_suid()
All calls to remove_suid() are made with a file pointer, because
(similarly to file_update_time) it is called when the file is written.

Clean up callers by passing in a file instead of a dentry.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-07-26 20:53:16 -04:00
Al Viro
e6305c43ed [PATCH] sanitize ->permission() prototype
* kill nameidata * argument; map the 3 bits in ->flags anybody cares
  about to new MAY_... ones and pass with the mask.
* kill redundant gfs2_iop_permission()
* sanitize ecryptfs_permission()
* fix remaining places where ->permission() instances might barf on new
  MAY_... found in mask.

The obvious next target in that direction is permission(9)

folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:14 -04:00
Alexey Dobriyan
51cc50685a SL*B: drop kmem cache argument from constructor
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres.  Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.

Non-trivial places are:
	arch/powerpc/mm/init_64.c
	arch/powerpc/mm/hugetlbpage.c

This is flag day, yes.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:07 -07:00
Dave Chinner
49641f1acf Fix reference counting race on log buffers
When we release the iclog, we do an atomic_dec_and_lock to determine if
we are the last reference and need to trigger update of log headers and
writeout.  However, in xlog_state_get_iclog_space() we also need to
check if we have the last reference count there.  If we do, we release
the log buffer, otherwise we decrement the reference count.

But the compare and decrement in xlog_state_get_iclog_space() is not
atomic, so both places can see a reference count of 2 and neither will
release the iclog.  That leads to a filesystem hang.

Close the race by replacing the atomic_read() and atomic_dec() pair with
atomic_add_unless() to ensure that they are executed atomically.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Tim Shimmin <tes@sgi.com>
Tested-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-11 11:37:18 -07:00
Christoph Hellwig
6ab455eeaf [XFS] Fix memory corruption with small buffer reads
When we have multiple buffers in a single page for a blocksize == pagesize
filesystem we might overwrite the page contents if two callers hit it
shortly after each other. To prevent that we need to keep the page locked
until I/O is completed and the page marked uptodate.

Thanks to Eric Sandeen for triaging this bug and finding a reproducible
testcase and Dave Chinner for additional advice.

This should fix kernel.org bz #10421.

Tested-by: Eric Sandeen <sandeen@sandeen.net>

SGI-PV: 981813
SGI-Modid: xfs-linux-melb:xfs-kern:31173a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23 18:12:49 +10:00
David Chinner
c8f5f12e46 [XFS] Fix inode list allocation size in writeback.
We only need to allocate space for the number of inodes in the cluster
when writing back inodes, not every byte in the inode cluster. This
reduces the amount of memory needing to be allocated to 256 bytes instead
of 64k.

SGI-PV: 981949
SGI-Modid: xfs-linux-melb:xfs-kern:31182a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23 15:26:15 +10:00
David Chinner
49383b0e98 [XFS] Don't allow memory reclaim to wait on the filesystem in inode
writeback

If we allow memory reclaim to wait on the pages under writeback in inode
cluster writeback we could deadlock because we are currently holding the
ILOCK on the initial writeback inode which is needed in data I/O
completion to change the file size or do unwritten extent conversion
before the pages are taken out of writeback state.

SGI-PV: 981091
SGI-Modid: xfs-linux-melb:xfs-kern:31015a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23 15:26:03 +10:00
David Chinner
978b723712 [XFS] Fix fsync() b0rkage.
xfs_fsync() fails to wait for data I/O completion before checking if the
inode is dirty or clean to decide whether to log the inode or not. This
misses inode size updates when the data flushed by the fsync() is
extending the file.

Hence, like fdatasync(), we need to wait for I/o completion first, then
check the inode for cleanliness. Doing so makes the behaviour of
xfs_fsync() identical for fsync and fdatasync and we *always* use
synchronous semantics if the inode is dirty. Therefore also kill the
differences and remove the unused flags from the xfs_fsync function and
callers.

SGI-PV: 981296
SGI-Modid: xfs-linux-melb:xfs-kern:31033a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-05-23 15:25:25 +10:00
David Chinner
a94477da38 [XFS] Include linux/random.h in all builds, not just debug builds.
SGI-PV: 979416
SGI-Modid: xfs-linux-melb:xfs-kern:31008a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-30 18:17:44 +10:00
Stephen Rothwell
adaa693b84 [XFS] Fix build failure after enabling CONFIG_XFS_DEBUG
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 16:08:44 +10:00
Christoph Hellwig
c5acbaf43d [XFS] remove dmapi cruft in xfs_file.c
The dmapi cruft in xfs_file.c is totally out of date in mainline vs
CVS, and at this point just removing this code which can't be used on
mainline at all seems to be the best option to keep it maintainable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 16:08:27 +10:00
Christoph Hellwig
3a738a5c73 [XFS] remove sendfile leftovers
Remove the last sendfile leftovers in mainline.  This code is already
gone in CVS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 16:08:14 +10:00
Christoph Hellwig
7788fae6cc [XFS] allow enabling CONFIG_XFS_DEBUG
Back when I first submitted XFS for mainline inclusion we made the
decision that the debug code is far to extensive to be accidentally
enabled by users in mainline.  But then again it's often quite useful
to track problems down and hacking the makefile all the time is rather
annoying.  Given all the debug options with even more overhead like
lockdep or DEBUG_PAGE_ALLOC users (or rather developers) should know
by now what they're doing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 16:07:48 +10:00
David Chinner
359346a965 [XFS] Don't initialise new inode generation numbers to zero
When we allocation new inode chunks, we initialise the generation numbers
to zero. This works fine until we delete a chunk and then reallocate it,
resulting in the same inode numbers but with a reset generation count.
This can result in inode/generation pairs of different inodes occurring
relatively close together.

Given that the inode/gen pair makes up the "unique" portion of an NFS
filehandle on XFS, this can result in file handles cached on clients being
seen on the wire from the server but refer to a different file. This
causes .... issues for NFS clients.

Hence we need a unique generation number initialisation for each inode to
prevent reuse of a small portion of the generation number space. Use a
random number to initialise the generation number so we don't need to keep
any new state on disk whilst making the new number difficult to guess from
previous allocations.

SGI-PV: 979416
SGI-Modid: xfs-linux-melb:xfs-kern:31001a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:58:56 +10:00
David Chinner
86c4d62305 [XFS] Fix check for block zero access in xfs_write_iomap_allocate()
The check for block zero access should be done on non-realtime inodes. Fix
the logic error in xfs_write_iomap_allocate(), and simplify the logic on
all checks for block zero access in xfs_iomap.c

SGI-PV: 980888
SGI-Modid: xfs-linux-melb:xfs-kern:30998a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:58:40 +10:00
David Chinner
d349404ff1 [XFS] Don't double count reserved block changes on UP.
On uniprocessor machines, the incore superblock is used for all in memory
accounting of free blocks. in this situation, changes to the reserved
block count are accounted twice; once directly and once via
xfs_mod_incore_sb(). Seeing as the modification on SMP is done via
xfs_mod_incore_sb(), make this the only update mechanism that UP uses as
well.

SGI-PV: 980654
SGI-Modid: xfs-linux-melb:xfs-kern:30997a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:58:27 +10:00
Alexey Dobriyan
fe0754f0e5 [XFS] remove xfs_log_ticket_zone on rmmod
Fix bug introduced in commit eb01c9cd87 aka
"[XFS] Remove the xlog_ticket allocator"

SGI-PV: 980887
SGI-Modid: xfs-linux-melb:xfs-kern:30995a

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:58:14 +10:00
Eric Sandeen
7155054c9d [XFS] fix non-smp xfs build
xfs_reserve_blocks() calls xfs_icsb_sync_counters_locked(), which is not
defined if !CONFIG_SMP/!HAVE_PERCPU_SB

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30991a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:58:00 +10:00
Donald Douwsma
18d18208da [XFS] Fix broken HAVE_SPLICE removal commit.
Commit e687330b5e was meant to remove the
unused HAVE_SPLICE macro, instead an unrelated change was checked enabling
QUOTADEBUG when building DEBUG XFS. Restore the intended changes.

SGI-PV: 971046
SGI-Modid: xfs-linux-melb:xfs-kern:30924a

Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:57:49 +10:00
Christoph Hellwig
ce46193bca [XFS] kill XFS_ICSB_SB_LOCKED
With the last two patches XFS_ICSB_SB_LOCKED is never checked and only
superflously passed to xfs_icsb_count, so kill it.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30920a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:57:38 +10:00
Christoph Hellwig
45af6c6de6 [XFS] split xfs_icsb_balance_counter
Add an xfs_icsb_balance_counter_locked for the case where mp->m_sb_lock is
already locked.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30918a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:57:28 +10:00
Christoph Hellwig
d4d90b577e [XFS] Add xfs_icsb_sync_counters_locked for when m_sb_lock already held
Add a new xfs_icsb_sync_counters_locked for the case where m_sb_lock
is already taken and add a flags argument to xfs_icsb_sync_counters so
that xfs_icsb_sync_counters_flags is not needed.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30917a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:57:11 +10:00
Barry Naujok
e8b0ebaa11 [XFS] Cleanup xfs_attr a bit with xfs_name and remove cred
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30913a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:55 +10:00
Christoph Hellwig
5df78e73d3 [XFS] kill usesless IHOLD calls in xfs_remove and xfs_rmdir
The VFS always has an inode reference when we call these functions. So we
only need to grab a signle reference to each inode that's joined to a
transaction - all the other bumping and dropping is as useless as the
comments describing the IRIX semantics.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30912a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:45 +10:00
Christoph Hellwig
82dab941a1 [XFS] kill parent == child checks in xfs_remove and xfs_rmdir
VFS guaranteed these can't happen.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30911a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:34 +10:00
Christoph Hellwig
1ac74e01df [XFS] kill usesless IHOLD calls in xfs_rename
Similar to to the previous patch for remove and rmdir only grab a
reference to inodes when we join them to transaction to balance the
decrement on transaction completion. Everything else it taken care of by
the VFS.

Note that the old case had leaks of inode count when src == target or src
or target == one of the parent inodes, but these cases are fortunately
already rejected by the VFS.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30904a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:24 +10:00
Christoph Hellwig
cfa853e47d [XFS] remove manual lookup from xfs_rename and simplify locking
->rename already gets the target inode passed if it exits. Pass it down to
xfs_rename so that we can avoid looking it up again. Also simplify locking
as the first lock section in xfs_rename can go away now: the isdir is an
invariant over the lifetime of the inode, and new_parent and the nlink
check are namespace topology protected by i_mutex in the VFS. The projid
check needs to move into the second lock section anyway to not be racy.

Also kill the now unused xfs_dir_lookup_int and remove the now-unused
first_locked argumet to xfs_lock_inodes.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30903a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:12 +10:00
Christoph Hellwig
579aa9caf5 [XFS] shrink mrlock_t
The writer field is not needed for non_DEBU builds so remove it. While
we're at i also clean up the interface for is locked asserts to go through
and xfs_iget.c helper with an interface like the xfs_ilock routines to
isolated the XFS codebase from mrlock internals. That way we can kill
mrlock_t entirely once rw_semaphores grow an islocked facility. Also
remove unused flags to the ilock family of functions.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30902a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:54:02 +10:00
Christoph Hellwig
eca450b7c2 [XFS] simplify xfs_lookup
Opencode xfs-kill-xfs_dir_lookup_int here, which gets rid of a lock
roundtrip, and lots of stack space. Also kill the di_mode == 0 check that
has been done in xfs_iget for a few years now.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30901a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:52 +10:00
Christoph Hellwig
d4377d8418 [XFS] xfs_rename: pass resblks to xfs_dir_removename
Similar to rmdir and remove - avoids a potential transaction reservation
overrun.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30900a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:41 +10:00
Christoph Hellwig
6a7f422d47 [XFS] kill di_mode checks after xfs_iget
Unless XFS_IGET_CREATE is passed xfs_iget will return ENOENT if it
encounters an inode with di_mode == 0. Remove the duplicated checks in the
callers.

(the log recovery case is not touched for now)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30898a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:31 +10:00
Christoph Hellwig
4e5dbb3498 [XFS] kill xfs_getattr
It's currently used by the ACL code to read di_mode/di_uid, but these are
simple 32bit scalar values we can just read directly without locking.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30897a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:16 +10:00
Christoph Hellwig
42173f6860 [XFS] Remove VN_IS* macros and related cruft.
We can just check i_mode / di_mode directly.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30896a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:05 +10:00
Linus Torvalds
429f731dea Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
  Deprecate the asm/semaphore.h files in feature-removal-schedule.
  Convert asm/semaphore.h users to linux/semaphore.h
  security: Remove unnecessary inclusions of asm/semaphore.h
  lib: Remove unnecessary inclusions of asm/semaphore.h
  kernel: Remove unnecessary inclusions of asm/semaphore.h
  include: Remove unnecessary inclusions of asm/semaphore.h
  fs: Remove unnecessary inclusions of asm/semaphore.h
  drivers: Remove unnecessary inclusions of asm/semaphore.h
  net: Remove unnecessary inclusions of asm/semaphore.h
  arch: Remove unnecessary inclusions of asm/semaphore.h
2008-04-21 15:41:27 -07:00
Dave Hansen
ec82687f29 [PATCH] r/o bind mounts: elevate count for xfs timestamp updates
Elevate the write count during the xfs m/ctime updates.

XFS has to do it's own timestamp updates due to an unfortunate VFS
design limitation, so it will have to track writers by itself aswell.

[hch: split out from the touch_atime patch as it's not related to it at all]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:26 -04:00
Dave Hansen
42a74f206b [PATCH] r/o bind mounts: elevate write count for ioctls()
Some ioctl()s can cause writes to the filesystem.  Take these, and make them
use mnt_want/drop_write() instead.

[AV: updated]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:24 -04:00
Matthew Wilcox
6188e10d38 Convert asm/semaphore.h users to linux/semaphore.h
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:22:54 -04:00
Lachlan McIlroy
65e67f5165 [XFS] Fix merge failure 2008-04-18 12:59:45 +10:00
Lachlan McIlroy
3b2816be27 [XFS] The forward declarations for the xfs_ioctl() helpers and the
associated comment about gcc behavior really aren't needed; all of these
functions are marked STATIC which includes noinline, and the stack usage
won't be a problem.

This effectively just removes the forward declarations and moves
xfs_ioctl() back to the end of the file.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30534a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:43:35 +10:00
Donald Douwsma
e687330b5e [XFS] Remove unused HAVE_SPLICE macro.
HAVE_SPLICE was part of the infrastructure for building 2.4 and 2.6
kernels out of the same tree. Now we don't build 2.4 kernels this

SGI-PV: 971046
SGI-Modid: xfs-linux-melb:xfs-kern:30878a

Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:04:29 +10:00
Eric Sandeen
f7d3c34788 [XFS] Remove CONFIG_XFS_SECURITY.
There is no point to the CONFIG_XFS_SECURITY option; it disables the
ability to set security attributes at runtime, but it does not actually
slim down or remove any code for runtime. Just remove it and always allow
security attributes to be set.

SGI-PV: 980310
SGI-Modid: xfs-linux-melb:xfs-kern:30877a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:04:19 +10:00
Tim Shimmin
6d1337b29b [XFS] xfs_bmap_compute_maxlevels should be based on di_forkoff
Fix up xfs_bmap_compute_maxlevels() to account for the case when we go
from using attr2 to using attr1. In that case attr1 will no longer
necessarily be at m_attr_offset>>3, but could be at a different value for
di_forkoff. Therefore, we return the worst case scenario using MINDBTPTRS
and MINABTPTRS, as this function is used for determining the maximum log
space.

SGI-PV: 979606
SGI-Modid: xfs-linux-melb:xfs-kern:30862a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:04:08 +10:00
Eric Sandeen
cb49dbb130 [XFS] Always use di_forkoff when checking for attr space.
In the case where we mount a filesystem which was previously using the
attr2 format as attr1, returning the default mp->m_attroffset instead of
the per-inode di_forkoff for inline attribute fit calculations, may result
in corruption, if for example, the data fork is already taking more space
than the default fork offset and we try to add an extended attribute. Fix
tested by xfstests/186.

SGI-PV: 979606
SGI-Modid: xfs-linux-melb:xfs-kern:30861a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:03:40 +10:00
David Chinner
f6485057c5 [XFS] Ensure the inode is joined in xfs_itruncate_finish
On success, we still need to join the inode to the current transaction in
xfs_itruncate_finish(). Fixes regression from error handling changes.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30845a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:03:26 +10:00
David Chinner
7e20694d91 [XFS] Remove periodic logging of in-core superblock counters.
xfssyncd triggers the logging of superblock counters every 30s if the
filesystem is made with lazy-count=1. This will prevent disks from idling
and spinning down as there will be a log write every 30s. With the way
counter recovery works for lazy-count=1, this code is unnecessary and
provides no real benefit, so just remove it.

SGI-PV: 980145
SGI-Modid: xfs-linux-melb:xfs-kern:30840a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:03:12 +10:00
David Chinner
e6430037e9 [XFS] fix logic error in xfs_alloc_ag_vextent_near()
Fix a logic error in xfs_alloc_ag_vextent_near(). This is a regression
introduced by the error handling changes.

SGI-PV: 890084
SGI-Modid: xfs-linux-melb:xfs-kern:30838a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:03:02 +10:00
David Chinner
d4055947bd [XFS] Don't error out on good I/Os.
xfsbdstrat() made all I/Os error out, good or bad. Fix it.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30836a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:02:41 +10:00
David Chinner
1bb7d6b5a8 [XFS] Catch log unmount failures.
Unmounting the log can fail. unlikely, but it can. Catch all the error
conditions an make sure it's propagated upwards.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30833a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:02:30 +10:00
David Chinner
b911ca0472 [XFS] Sanitise xfs_log_force error checking.
xfs_log_force() is declared to return an error, but we almost never check
it. We don't need to check it in most cases; if there's a log I/O error
then we'll be shutting down the filesystem anyway and that means we'll
catch the error somewhere else.

However, on certain calls we should be returning an error - sync
transactions, fsync, sync writes, etc. so this isn't a pure black and
white distinction. Hence make xfs_log_force() a void function that issues
a warning to the syslog on error, and call _xfs_log_force() in all the
places where we actually care about the error status returned.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30832a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:02:20 +10:00
David Chinner
234f56aca2 [XFS] Check for errors when changing buffer pointers.
xfs_buf_associate_memory() can fail, but the return is never checked.
Propagate the error through XFS_BUF_SET_PTR() so that failures are
detected.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30831a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:02:10 +10:00
David Chinner
78e9da77f1 [XFS] Don't allow silent errors in xfs_inactive().
xfs_inactive() fails to report errors when committing the inactive
transaction. Hence we can get silent failures either finishing off the
truncation or committing the transaction. Even if we get errors, we need
to continue, so simply warn loudly to the system if we get errors here.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30830a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:01:58 +10:00
David Chinner
64bfe1bfae [XFS] Catch errors from xfs_imap().
Catch errors from xfs_imap() in log recovery when we might be trying to
map an invalid inode number due to a corrupted log.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30829a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:01:39 +10:00
David Chinner
7b07339048 [XFS] xfs_bulkstat_one_dinode() never returns an error.
Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30828a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:01:27 +10:00
David Chinner
e4ac967b11 [XFS] xfs_iflush_fork() never returns an error.
xfs_iflush_fork() never returns an error. Mark it void and clean up the
code calling it that checks for errors.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30827a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:01:11 +10:00
David Chinner
cc88466f3f [XFS] Catch unwritten extent conversion errors.
On unwritten I/O completion, we fail to propagate an error when converting
the extent to a written extent. This means that the I/O silently fails.
propagate the error onto the ioend so that the inode is marked with an
error appropriately.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30826a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:00:58 +10:00
David Chinner
958d4ec606 [XFS] xfs_bdwrite() does not return errors.
xfs_bdwrite() cannot return an error; it only queues buffers to the
delayed write list and as such never encounters anything that can fail.
Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30825a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:00:46 +10:00
David Chinner
db7a19f2c8 [XFS] Ensure xfs_bawrite() errors are checked.
xfs_bawrite() can return immediate error status on async writes. Unlike
xfsbdstrat() we don't ever check the error on the buffer after the call,
so we currently do not catch errors at all here. Ensure we catch and
propagate or warn to the syslog about up-front async write errors.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30824a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:00:35 +10:00
David Chinner
d64e31a2f5 [XFS] Ensure errors from xfs_bdstrat() are correctly checked.
xfsbdstrat() is declared to return an error. That is never checked because
the error is propagated by the xfs_buf_t that is passed through the
function.

Mark xfsbdstrat() as returning void and comment the prototype on the
methods needed for error checking.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30823a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:00:24 +10:00
Barry Naujok
556b8b166c [XFS] remove bhv_vname_t and xfs_rename code
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30804a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:00:12 +10:00
David Chinner
7c9ef85c56 [XFS] Catch errors returned from xfs_bmap_last_offset().
xfs_bmap_last_offset() can fail and return an error.
xfs_iomap_write_allocate() fails to detect and propagate the error.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30802a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:59:45 +10:00
David Chinner
fc6149d8d9 [XFS] Check for xfs_free_extent() failing.
xfs_free_extent() can fail, but log recovery never bothers to check if it
successfully free the extent it was supposed to. This could lead to silent
corruption during log recovery. Abort log recovery if we fail to free an
extent.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30801a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:59:23 +10:00
David Chinner
d87dd6360d [XFS] Warn if errors come from block_truncate_page().
block_truncate_page() can return errors that we currently ignore and
silently discard. We should not ever get errors reported here - an error
indicates a bug somewhere else. Hence catch the error and issue a stack
dump to the syslog because we cannot propagate the error any further up
the call chain.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30800a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:59:12 +10:00
David Chinner
c2b1cba683 [XFS] xfs_bmap_adjacent() never returns an error.
Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30798a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:58:46 +10:00
David Chinner
12375c8237 [XFS] Make xfs_alloc_compute_aligned() void.
xfs_alloc_compute_aligned() returns a value based on a comparison of the
computed extent length and the minimum length allowed. This is only used
by some callers - the other four return parameters are used more often.
Hence move the comparison to the code that actually needs to do it and
make xfs_alloc_compute_aligned() a void function.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30797a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:58:36 +10:00
David Chinner
f4586e4061 [XFS] Clean up xfs_alloc_search_busy() return values.
xfs_alloc_search_busy() returns an index into the busy array if the extent
was found in the array. This is never checked, and the
xfs_alloc_search_busy() does a log force to prevent reuse of the extent
before the free transaction hits the disk. Hence the return value is
useless. Declare the function void and remove the slot number from the
tracing as well.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30796a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:58:27 +10:00
David Chinner
e5720eec05 [XFS] Propagate errors from xfs_trans_commit().
xfs_trans_commit() can return errors when there are problems in the
transaction subsystem. They are indicative that the entire transaction may
be incomplete, and hence the error should be propagated as there is a good
possibility that there is something fatally wrong in the filesystem. Catch
and propagate or warn about commit errors in the places where they are
currently ignored.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30795a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:58:17 +10:00
David Chinner
3c1e2bbe5b [XFS] Propagate xfs_trans_reserve() errors.
xfs_trans_reserve() reports errors that should not be ignored. For
example, a shutdown filesystem will report errors through
xfs_trans_reserve() to prevent further changes from being attempted on a
damaged filesystem. Catch and propagate all error conditions from
xfs_trans_reserve().

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30794a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:58:08 +10:00
David Chinner
5ca1f261a0 [XFS] Catch errors from xfs_acl_vremove().
Removing an ACL can return an error. Propagate it.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30793a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:57 +10:00
David Chinner
0c92829967 [XFS] Catch errors from xfs_acl_setmode().
Propagate the error status from xfs_acl_setmode() so that callers know if
the ACl was set correctly or not.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30792a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:46 +10:00
David Chinner
88ab020853 [XFS] Propagate quota file truncation errors.
Truncating the quota files can silently fail. Ensure that truncation
errors are propagated to the callers.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30791a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:36 +10:00
David Chinner
cb6edc26c3 [XFS] Catch errors when turning off quotas.
When turning off quota, we need to write various transactions to the log
to ensure that they are cleanly removed in the case of a crash. We need to
check that the transactions hit the disk correctly. If we fail to write
the final quota off transaction, we are corrupt in memory and so the only
option is to shut the filesystem down at this point.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30790a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:26 +10:00
David Chinner
31d5577b35 [XFS] Catch errors resetting quota flags.
Warn to the syslog if we fail to reset the quota flags in the superblock
when a quota check fails.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30789a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:16 +10:00
David Chinner
53aa7915d6 [XFS] Clean up quotamount error handling.
xfs_qm_mount_quotas() returns an error status that is ignored. If we fail
to mount quotas, we continue with quota's turned off, which is all handled
inside xfs_qm_mount_quotas(). Mark it as void to indicate that errors need
not be returned to the callers.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30788a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:57:05 +10:00
David Chinner
3c56836f92 [XFS] Check for dquot flush errors
xfs_qm_dqflush() can fail, but the return is not checked anywhere. Hence
we never know if we've failed to flush a dquot to disk. Propagate the
error and warn to the syslog if a flush ever fails.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30787a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:56:55 +10:00
David Chinner
4b8879df8c [XFS] Propagate xfs_qm_dqflush_all() errors.
xfs_qm_dqflush_all() can return flush errors. Ensure they are propagated
into the quotacheck code to determine if the quotacheck succeeded or not.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30786a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:54:56 +10:00
David Chinner
5b1397385b [XFS] xfs_qm_reset_dqcounts() does not return errors.
Declare it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30785a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:54:06 +10:00
David Chinner
714082bc12 [XFS] Report errors from xfs_reserve_blocks().
xfs_reserve_blocks() can fail in interesting ways. In neither case is it a
fatal error, but the result can lead to sub-optimal behaviour. Warn to the
syslog if the call fails but otherwise continue.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30784a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:53:51 +10:00
David Chinner
36fbe6e6bd [XFS] xfs_icsb_counter_disabled() never returns an error.
Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30782a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:52:43 +10:00
David Chinner
a414047fc9 [XFS] Remove useless whitespace in function prototypes
Makes it simpler to annotate function prototypes with __must_check via sed
scripts.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30781a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:58 +10:00
David Chinner
3c85c36cc2 [XFS] xfs_quiesce_fs() never returns an error. Mark it void.
SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30780a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:46 +10:00
Christoph Hellwig
b6ddc4e6fe [XFS] Don't validate symlink target component length
This target component validation is not POSIX conformant and it is not
done by any other Linux filesystem so remove it from XFS.

SGI-PV: 980080
SGI-Modid: xfs-linux-melb:xfs-kern:30776a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:36 +10:00
Harvey Harrison
34a622b2e1 [XFS] replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30775a

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:26 +10:00
Harvey Harrison
0225da1f35 [XFS] Replace __inline with inline
Remove the remaining uses of __inline in the XFS code base.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30774a

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:15 +10:00
David Chinner
6b1d1a732f [XFS] Fix lock inversion in forced shutdown.
Recent changes to xlog_state_release_iclog() placed the grant_lock inside
the icloglock. forced unmount of the log does this the opposite way
around, but does not depend on the order for correct working. Fix the
inversion by changing the order locks are gained in
xfs_log_force_umount().

SGI-PV: 979661
SGI-Modid: xfs-linux-melb:xfs-kern:30773a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:51:04 +10:00
David Chinner
4679b2d36d [XFS] Reorganise xlog_t for better cacheline isolation of contention
To reduce contention on the log in large CPU count, separate out different
parts of the xlog_t structure onto different cachelines. Move each lock
onto a different cacheline along with all the members that are
accessed/modified while that lock is held.

Also, move the debugging code into debug code.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30772a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:50:53 +10:00
David Chinner
eb01c9cd87 [XFS] Remove the xlog_ticket allocator
The ticket allocator is just a simple slab implementation internal to the
log. It requires the icloglock to be held when manipulating it and this
contributes to contention on that lock.

Just kill the entire allocator and use a memory zone instead. While there,
allow us to gracefully fail allocation with ENOMEM.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30771a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:50:39 +10:00
David Chinner
114d23aae5 [XFS] Per iclog callback chain lock
Rather than use the icloglock for protecting the iclog completion callback
chain, use a new per-iclog lock so that walking the callback chain doesn't
require holding a global lock.

This reduces contention on the icloglock during transaction commit and log
I/O completion by reducing the number of times we need to hold the global
icloglock during these operations.

SGI-PV: 978729
SGI-Modid: xfs-linux-melb:xfs-kern:30770a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:50:22 +10:00
Lachlan McIlroy
2abdb8c881 [XFS] Prevent xfs_bmap_check_leaf_extents() referencing unmapped memory.
While investigating the extent corruption bug I ran into this bug in debug
only code. xfs_bmap_check_leaf_extents() loops through the leaf blocks of
the extent btree checking that every extent is entirely before the next
extent. It also compares the last extent in the previous block to the
first extent in the current block when the previous block has been
released and potentially unmapped. So take a copy of the last extent
instead of a pointer. Also move the last extent check out of the loop
because we only need to do it once.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30718a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-04-18 11:49:51 +10:00
Christoph Hellwig
433550990e [XFS] remove most calls to VN_RELE
Most VN_RELE calls either directly contain a XFS_ITOV or have the
corresponding xfs_inode already in scope. Use the IRELE helper instead of
VN_RELE to clarify the code. With a little more work we can kill VN_RELE
altogether and define IRELE in terms of iput directly.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30710a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:49:08 +10:00
Lachlan McIlroy
df26cfe849 [XFS] split xfs_ioc_xattr
The three subcases of xfs_ioc_xattr don't share any semantics and almost
no code, so split it into three separate helpers.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30709a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:44:03 +10:00
Christoph Hellwig
f3dcc13f6f [XFS] cleanup root inode handling in xfs_fs_fill_super
- rename rootvp to root for clarify
- remove useless vn_to_inode call
- check is_bad_inode before calling d_alloc_root
- use iput instead of VN_RELE in the error case

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30708a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:42:36 +10:00
David Chinner
59a33f9f77 [XFS] Ensure a btree insert returns a valid cursor.
When writing into preallocated regions there is a case where XFS can oops
or hang doing the unwritten extent conversion on I/O completion. It turns
out that the problem is related to the btree cursor being invalid.

When we do an insert into the tree, we may need to split blocks in the
tree. When we only split at the leaf level (i.e. level 0), everything
works just fine. However, if we have a multi-level split in the btreee,
the cursor passed to the insert function is no longer valid once the
insert is complete.

The leaf level split is handled correctly because all the operations at
level 0 are done using the original cursor, hence it is updated correctly.
However, when we need to update the next level up the tree, we don't use
that cursor - we use a cloned cursor that points to the index in the next
level up where we need to do the insert.

Hence if we need to split a second level, the changes to the tree are
reflected in the cloned cursor and not the original cursor. This
clone-and-move-up-a-level-on-split behaviour recurses all the way to the
top of the tree.

The complexity here is that these cloned cursors do not point to the
original index that was inserted - they point to the newly allocated block
(the right block) and the original cursor pointer to that level may still
point to the left block. Hence, without deep examination of the cloned
cursor and buffers, we cannot update the original cursor with the new path
from the cloned cursor.

In these cases the original cursor could be pointing to the wrong block(s)
and hence a subsequent modification to the tree using that cursor will
lead to corruption of the tree.

The crash case occurs when the tree changes height - we insert a new level
in the tree, and the cursor does not have a buffer in it's path for that
level. Hence any attempt to walk back up the cursor to the root block will
result in a null pointer dereference.

To make matters even more complex, the BMAP BT is rooted in an inode, so
we can have a change of height in the btree *without a root split*. That
is, if the root block in the inode is full when we split a leaf node, we
cannot fit the pointer to the new block in the root, so we allocate a new
block, migrate all the ptrs out of the inode into the new block and point
the inode root block at the newly allocated block. This changes the height
of the tree without a root split having occurred and hence invalidates the
path in the original cursor.

The patch below prevents xfs_bmbt_insert() from returning with an invalid
cursor by detecting the cases that invalidate the original cursor and
refresh it by do a lookup into the btree for the original index we were
inserting at.

Note that the INOBT, AGFBNO and AGFCNT btree implementations also have
this bug, but the cursor is currently always destroyed or revalidated
after an insert for those trees. Hence this patch only address the problem
in the BMBT code.

SGI-PV: 979339
SGI-Modid: xfs-linux-melb:xfs-kern:30701a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:42:21 +10:00
David Chinner
75de2a91c9 [XFS] Account for inode cluster alignment in all allocations
At ENOSPC, we can get a filesystem shutdown due to a cancelling a dirty
transaction in xfs_mkdir or xfs_create. This is due to the initial
allocation attempt not taking into account inode alignment and hence we
can prepare the AGF freelist for allocation when it's not actually
possible to do an allocation. This results in inode allocation returning
ENOSPC with a dirty transaction, and hence we shut down the filesystem.

Because the first allocation is an exact allocation attempt, we must tell
the allocator that the alignment does not affect the allocation attempt.
i.e. we will accept any extent alignment as long as the extent starts at
the block we want. Unfortunately, this means that if the longest free
extent is less than the length + alignment necessary for fallback
allocation attempts but is long enough to attempt a non-aligned
allocation, we will modify the free list.

If we then have the exact allocation fail, all other allocation attempts
will also fail due to the alignment constraint being taken into account.
Hence the initial attempt needs to set the "alignment slop" field so that
alignment, while not required, must be taken into account when determining
if there is enough space left in the AG to do the allocation.

That means if the exact allocation fails, we will not dirty the freelist
if there is not enough space available fo a subsequent allocation to
succeed. Hence we get an ENOSPC error back to userspace without shutting
down the filesystem.

SGI-PV: 978886
SGI-Modid: xfs-linux-melb:xfs-kern:30699a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:42:09 +10:00
Josef 'Jeff' Sipek
535f6b3735 [XFS] Replace custom AIL linked-list code with struct list_head
Replace the xfs_ail_entry_t with a struct list_head and clean the
surrounding code up. Also fixes a livelock in xfs_trans_first_push_ail()
by terminating the loop at the head of the list correctly.

SGI-PV: 978682
SGI-Modid: xfs-linux-melb:xfs-kern:30636a

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:57 +10:00
Christoph Hellwig
a45c796867 [XFS] Remove superflous xfs_readsb call in xfs_mountfs.
When xfs_mountfs is called by xfs_mount xfs_readsb was called 35 lines
above unconditionally, so there is no need to try to read the superblock
if it's not present. If any other port doesn't have the superblock read at
this point it should just call it directly from it's xfs_mount equivalent.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30603a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:46 +10:00
Niv Sardi
dfa18b1179 [XFS] kill t_sema member of struct xfs_trans
It's completely unused so we might aswell kill it. Note that there is
another t_sema in struct xlog_ticket, which is used and actually an sv_t
despite the name. That one is left untouched by this patch.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30591a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:35 +10:00
Christoph Hellwig
5f90150aba [XFS] cleanup vnode use in xfs_bmap.c
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30553a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:25 +10:00
Christoph Hellwig
af048193fc [XFS] cleanup vnode use in xfs_iops.c
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30552a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:14 +10:00
Christoph Hellwig
dcf49cc5cf [XFS] cleanup vnode use in xfs_lrw.c
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30551a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:41:04 +10:00
Christoph Hellwig
ef1f5e7ad3 [XFS] cleanup vnode use in xfs_lookup
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30550a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:55 +10:00
Christoph Hellwig
3937be5ba8 [XFS] cleanup vnode use in xfs_symlink and xfs_rename
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30548a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:45 +10:00
Christoph Hellwig
a3da789640 [XFS] cleanup vnode use in xfs_link
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30547a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:35 +10:00
Christoph Hellwig
979ebab116 [XFS] cleanup vnode use in xfs_create/mknod/mkdir
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30546a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:25 +10:00
Christoph Hellwig
bc4ac74a4e [XFS] cleanup vnode use in dmapi calls
SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30545a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:15 +10:00
David Chinner
d234154125 [XFS] Use power-of-2 sized buffers to reduce overhead
Now that the ktrace_enter() code is using atomics, the non-power-of-2
buffer sizes - which require modulus operations to get the index - are
showing up as using substantial CPU in the profiles.

Force the buffer sizes to be rounded up to the nearest power of two and
use masking rather than modulus operations to convert the index counter to
the buffer index. This reduces ktrace_enter overhead to 8% of a CPU time,
and again almost halves the trace intensive test runtime.

SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30538a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:40:04 +10:00
David Chinner
6ee4752ffe [XFS] Use atomic counters for ktrace buffer indexes
ktrace_enter() is consuming vast amounts of CPU time due to the use of a
single global lock for protecting buffer index increments. Change it to
use per-buffer atomic counters - this reduces ktrace_enter() overhead
during a trace intensive test on a 4p machine from 58% of all CPU time to
12% and halves test runtime.

SGI-PV: 977546
SGI-Modid: xfs-linux-melb:xfs-kern:30537a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:55 +10:00
David Chinner
44d814ced4 [XFS] Update c/mtime correctly on truncates
XFS changes the c/mtime of an inode when truncating it to the same size.
The c/mtime is only supposed to change if the size is changed. Not to be
confused with ftruncate, where the c/mtime is supposed to be changed even
if the size is not changed.

The Linux VFS encodes this semantic difference in the flags it sends down
to ->setattr, which XFS currently ignores. We need to make XFS pay
attention to the VFS flags and hence Do The Right Thing.

SGI-PV: 977547
SGI-Modid: xfs-linux-melb:xfs-kern:30536a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:45 +10:00
Christoph Hellwig
24bd861d1c [XFS] don't encode parent in nfs filehandles unless nessecary
As Dave pointed out after the export ops changes we now always encode the
parent into the filehandle for regular files, but it's not actually needed
when the filesystem is export with no_subtree_check. This one-liner fixes
xfs_fs_encode_fh to skip encoding the parent unless nessecary.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30535a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:35 +10:00
Christoph Hellwig
126468b115 [XFS] kill xfs_rwlock/xfs_rwunlock
We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly
bhv_vrwlock_t.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30533a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:25 +10:00
Christoph Hellwig
43973964a3 [XFS] kill xfs_get_dir_entry
Instead of of xfs_get_dir_entry use a macro to get the xfs_inode from the
dentry in the callers and grab the reference manually.

Only grab the reference once as it's fine to keep it over the dmapi calls.
(And even that reference is actually superflous in Linux but I'll leave
that for another patch)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30531a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:14 +10:00
Christoph Hellwig
a8b3acd57e [XFS] vnode cleanup in xfs_fs_subr.c
Cleanup the unneeded intermediate vnode step in the flushing helpers and
go directly from the xfs_inode to the struct address_space.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30530a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:39:03 +10:00
Christoph Hellwig
db0bb7baa1 [XFS] cleanup xfs_vn_mknod
- use proper goto based unwinding instead of the current mess of
  multiple conditionals
- rename ip to inode because that's the normal convention for Linux
  inodes while ip is the convention for xfs_inodes
- remove unlikely checks for the default_acl - branches marked unlikely
  might lead to extreme branch bredictor slowdons if taken and for some
  workloads a default acl is quite common
- properly indent the switch statements
- remove xfs_has_fs_struct as nfsd has a fs_struct in any semi-recent
  kernel

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30529a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:53 +10:00
David Chinner
155cc6b784 [XFS] Use atomics for iclog reference counting
Now that we update the log tail LSN less frequently on transaction
completion, we pass the contention straight to the global log state lock
(l_iclog_lock) during transaction completion.

We currently have to take this lock to decrement the iclog reference
count. there is a reference count on each iclog, so we need to take he
global lock for all refcount changes.

When large numbers of processes are all doing small trnasctions, the iclog
reference counts will be quite high, and the state change that absolutely
requires the l_iclog_lock is the except rather than the norm.

Change the reference counting on the iclogs to use atomic_inc/dec so that
we can use atomic_dec_and_lock during transaction completion and avoid the
need for grabbing the l_iclog_lock for every reference count decrement
except the one that matters - the last.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30505a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:10 +10:00
David Chinner
b589334c7a [XFS] Prevent AIL lock contention during transaction completion
When hundreds of processors attempt to commit transactions at the same
time, they can contend on the AIL lock when updating the tail LSN held in
the in-core log structure.

At the moment, the tail LSN is only needed when actually writing out an
iclog, so it really does not need to be updated on every single
transaction completion - only those that result in switching iclogs and
flushing them to disk.

The result is that we reduce the number of times we need to grab the AIL
lock and the log grant lock by up to two orders of magnitude on large
processor count machines. The problem has previously been hidden by AIL
lock contention walking the AIL list which was recently solved and
uncovered this issue.

SGI-PV: 975671
SGI-Modid: xfs-linux-melb:xfs-kern:30504a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:38:01 +10:00
David Chinner
3354040897 [XFS] Use xfs_inode_clean() in more places
Remove open coded checks for the whether the inode is clean and replace
them with an inlined function.

SGI-PV: 977461
SGI-Modid: xfs-linux-melb:xfs-kern:30503a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:51 +10:00
David Chinner
bad5584332 [XFS] Remove the xfs_icluster structure
Remove the xfs_icluster structure and replace with a radix tree lookup.

We don't need to keep a list of inodes in each cluster around anymore as
we can look them up quickly when we need to. The only time we need to do
this now is during inode writeback.

Factor the inode cluster writeback code out of xfs_iflush and convert it
to use radix_tree_gang_lookup() instead of walking a list of inodes built
when we first read in the inodes.

This remove 3 pointers from each xfs_inode structure and the xfs_icluster
structure per inode cluster. Hence we reduce the cache footprint of the
xfs_inodes by between 5-10% depending on cluster sparseness.

To be truly efficient we need a radix_tree_gang_lookup_range() call to
stop searching once we are past the end of the cluster instead of trying
to find a full cluster's worth of inodes.

Before (ia64):

$ cat /sys/slab/xfs_inode/object_size 536

After:

$ cat /sys/slab/xfs_inode/object_size 512

SGI-PV: 977460
SGI-Modid: xfs-linux-melb:xfs-kern:30502a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:41 +10:00
David Chinner
a3f74ffb6d [XFS] Don't block pdflush when writing back inodes
When pdflush is writing back inodes, it can get stuck on inode cluster
buffers that are currently under I/O. This occurs when we write data to
multiple inodes in the same inode cluster at the same time.

Effectively, delayed allocation marks the inode dirty during the data
writeback. Hence if the inode cluster was flushed during the writeback of
the first inode, the writeback of the second inode will block waiting for
the inode cluster write to complete before writing it again for the newly
dirtied inode.

Basically, we want to avoid this from happening so we don't block pdflush
and slow down all of writeback. Hence we introduce a non-blocking async
inode flush flag that pdflush uses. If this flag is set, we use
non-blocking operations (e.g. try locks) whereever we can to avoid
blocking or extra I/O being issued.

SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30501a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:32 +10:00
David Chinner
4ae29b4321 [XFS] Factor xfs_itobp() and xfs_inotobp().
The only difference between the functions is one passes an inode for the
lookup, the other passes an inode number. However, they don't do the same
validity checking or set all the same state on the buffer that is returned
yet they should.

Factor the functions into a common implementation.

SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30500a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:19 +10:00
Lachlan McIlroy
e9a56b7cda [XFS] Fix regression due to refcache removal
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30490a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
2008-04-18 11:37:06 +10:00
Donald Douwsma
163d3686bb [XFS] Remove the xfs_refcache
Remove the xfs_refcache, it was only needed while we were still
building for 2.4 kernels.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30472a

Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:36:55 +10:00
Lachlan McIlroy
461aa8a225 [XFS] make inode reclaim synchronise with xfs_iflush_done()
On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode.
If the inode flush lock is not already held and there is an outstanding
xfs_iflush_done() then we might free the inode prematurely. By acquiring
and releasing the flush lock we will synchronise with xfs_iflush_done().

SGI-PV: 909874
SGI-Modid: xfs-linux-melb:xfs-kern:30468a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2008-04-18 11:34:54 +10:00
Niv Sardi
e12070a5dc [XFS] actually check error returned by xfs_flush_pages, clean up and
bailout if fails.

SGI-PV: 973041
SGI-Modid: xfs-linux-melb:xfs-kern:30462a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:34:47 +10:00
Eric Sandeen
e6957ea484 [XFS] Ensure "both" features2 slots are consistent
Since older kernels may look in the sb_bad_features2 slot for flags,
rather than zeroing it out on fixup, we should make it equal to the
sb_features2 value.

Also, if the ATTR2 flag was not found prior to features2 fixup, it was not
set in the mount flags, so re-check after the fixup so that the current
session will use the feature.

Also fix up the comments to reflect these changes.

SGI-PV: 980085
SGI-Modid: xfs-linux-melb:xfs-kern:30778a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:25:26 +10:00
David Chinner
ee1c090825 [XFS] Fix superblock features2 field alignment problem
Due to the xfs_dsb_t structure not being 64 bit aligned, the last field of
the on-disk superblock can vary in location This causes problems when the
filesystem gets moved to a different platform, or there is a 32 bit
userspace and 64 bit kernel.

This patch detects the defect at mount time, logs a warning such as:

XFS: correcting sb_features alignment problem

in dmesg and corrects the problem so that everything is OK. it also
blacklists the bad field in the superblock so it does not get used for
something else later on.

SGI-PV: 977636
SGI-Modid: xfs-linux-melb:xfs-kern:30539a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:25:15 +10:00
Eric Sandeen
6211870992 [XFS] remove shouting-indirection macros from xfs_sb.h
Remove macro-to-small-function indirection from xfs_sb.h, and remove some
which are completely unused.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30528a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:24:45 +10:00
David Chinner
72772a3b5b [XFS] fix inode leak in xfs_iget_core()
If the radix_tree_preload() fails, we need to destroy the inode we just
read in before trying again. This could leak xfs_vnode structures when
there is memory pressure. Noticed by Christoph Hellwig.

SGI-PV: 977823
SGI-Modid: xfs-linux-melb:xfs-kern:30606a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-03-06 16:38:50 +11:00
David Chinner
92d9cd1059 [XFS] 977545 977545 977545 977545 977545 977545 xfsaild causing too many
wakeups

Idle state is not being detected properly by the xfsaild push code. The
current idle state is detected by an empty list which may never happen
with mostly idle filesystem or one using lazy superblock counters. A
single dirty item in the list that exists beyond the push target can
result repeated looping attempting to push up to the target because it
fails to check if the push target has been acheived or not.

Fix by considering a dirty list with everything past the target as an idle
state and set the timeout appropriately.

SGI-PV: 977545
SGI-Modid: xfs-linux-melb:xfs-kern:30532a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-03-06 16:38:17 +11:00
Josef Jeff Sipek
1bd960ee2b [XFS] If you mount an XFS filesystem with no mount options at all, then
the "ikeep" option is set rather than "noikeep".

This regression was introduced in 970451.

With no mount options specified, xfs_parseargs() does the following:

int ikeep = 0;

args->flags |= XFSMNT_BARRIER;

args->flags2 |= XFSMNT2_COMPAT_IOSIZE;

if (!options)

goto done;

It only sets the above two options by default and before, it also used to
set XFSMNT_IDELETE by default.

If options are specified, then

if (!(args->flags & XFSMNT_DMAPI) && !ikeep)

args->flags |= XFSMNT_IDELETE;

is executed later on which is skipped by the "goto done;" above.

The solution is to invert the logic.

SGI-PV: 977771
SGI-Modid: xfs-linux-melb:xfs-kern:30590a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-28 20:37:56 -08:00
Lachlan McIlroy
ef8ece55d9 [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
platform.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30559a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-26 17:05:44 +11:00
Lachlan McIlroy
db69c915e6 [XFS] Undo bit ops cleanup mod due to regression on 32-bit powermac
platform.

SGI-PV: 974005
SGI-Modid: xfs-linux-melb:xfs-kern:30558a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-26 17:05:37 +11:00
Lachlan McIlroy
6e5e93424d Remove empty file fs/xfs/Makefile-linux-2.6. 2008-02-22 15:39:10 +11:00
Lachlan McIlroy
c58310bf49 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus 2008-02-18 13:51:42 +11:00
Lachlan McIlroy
269cdfaf76 [XFS] Added quota targets and removed dmapi directory
Fixes build failures introduced by bad merge to mainline.
2008-02-18 13:06:17 +11:00
Eric Sandeen
794f744b22 [XFS] Fix up xfs out-of-tree builds. (a.k.a. external modules)
Change -I include directives to find headers in the out-of-tree spot. This
allows a directory containing only xfs files to be built as:

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29878a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-18 12:59:11 +11:00
Andi Kleen
58b7983d15 [XFS] Remove Makefile wrappers in XFS
Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will
really still compile on 2.4; so drop that. This moves Makefile-linux-26
into Makefile and drops Kbuild. Also having wrappers as both Kbuild and
Makefile seemed redundant anyways.

The patch is relatively large because it renames a file, but no functional
changes.

SGI-PV: 971050
SGI-Modid: xfs-linux-melb:xfs-kern:29781a

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-18 12:48:03 +11:00
Jan Blunck
1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Marcin Slusarz
413d57c990 xfs: convert beX_add to beX_add_cpu (new common API)
remove beX_add functions and replace all uses with beX_add_cpu

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Dave Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:19 -08:00
Linus Torvalds
0b61a2ba5d Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (62 commits)
  [XFS] add __init/__exit mark to specific init/cleanup functions
  [XFS] Fix oops in xfs_file_readdir()
  [XFS] kill xfs_root
  [XFS] keep i_nlink updated and use proper accessors
  [XFS] stop updating inode->i_blocks
  [XFS] Make xfs_ail_check check less by default
  [XFS] Move AIL pushing into it's own thread
  [XFS] use generic_permission
  [XFS] stop re-checking permissions in xfs_swapext
  [XFS] clean up xfs_swapext
  [XFS] remove permission check from xfs_change_file_space
  [XFS] prevent panic during log recovery due to bogus op_hdr length
  [XFS] Cleanup various fid related bits:
  [XFS] Fix xfs_lowbit64
  [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros.
  [XFS] kill superflous buffer locking (2nd attempt)
  [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity
  [XFS] Remove the BPCSHIFT and NB* based macros from XFS.
  [XFS] Remove bogus assert
  [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config
  ...
2008-02-07 19:12:12 -08:00
Lachlan McIlroy
de2eeea609 [XFS] add __init/__exit mark to specific init/cleanup functions
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30459a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Denis Cheng <crquan@gmail.com>
2008-02-07 18:25:19 +11:00
David Chinner
450790a2c5 [XFS] Fix oops in xfs_file_readdir()
When xfs_file_readdir() exactly fills a buffer, it can move it's index
past the end of the buffer and dereference it even though the result of
the dereference is never used. On some platforms this causes an oops.

SGI-PV: 976923
SGI-Modid: xfs-linux-melb:xfs-kern:30458a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:24:13 +11:00
Christoph Hellwig
cbc89dcfd2 [XFS] kill xfs_root
The only caller (xfs_fs_fill_super) can simplify call igrab on the root
inode.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30393a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:24:00 +11:00
Christoph Hellwig
4188c78d95 [XFS] keep i_nlink updated and use proper accessors
To get the read-only bind mounts in -mm to work correctly with XFS we need
to call the drop_nlink and inc_nlink helpers to monitor the link count.
Add calls to these to xfs_bumplink and xfs_droplink and stop copying over
di_nlink to i_nlink in xfs_validate_fields and vn_revalidate.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30392a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:23:38 +11:00
Christoph Hellwig
222096ae7f [XFS] stop updating inode->i_blocks
The VFS doesn't use i_blocks, it's only used by generic_fillattr and the
generic quota code which XFS doesn't use. In XFS there is one use to check
whether we have an inline or out of line sumlink, but we can replace that
with a check of the XFS_IFINLINE inode flag.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30391a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:23:15 +11:00
David Chinner
de08dbc197 [XFS] Make xfs_ail_check check less by default
Checking the entire AIL on every insert and remove is prohibitively
expensive - the sustained sequntial create rate on a single disk drops
from about 1800/s to 60/s because of this checking resulting in the
xfslogd becoming cpu bound.

By default on debug builds, only check the next and previous entries in
the list to ensure they are ordered correctly. If you really want, define
XFS_TRANS_DEBUG to use the old behaviour.

SGI-PV: 972759
SGI-Modid: xfs-linux-melb:xfs-kern:30372a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:23:05 +11:00
David Chinner
249a8c1124 [XFS] Move AIL pushing into it's own thread
When many hundreds to thousands of threads all try to do simultaneous
transactions and the log is in a tail-pushing situation (i.e. full), we
can get multiple threads walking the AIL list and contending on the AIL
lock.

The AIL push is, in effect, a simple I/O dispatch algorithm complicated by
the ordering constraints placed on it by the transaction subsystem. It
really does not need multiple threads to push on it - even when only a
single CPU is pushing the AIL, it can push the I/O out far faster that
pretty much any disk subsystem can handle.

So, to avoid contention problems stemming from multiple list walkers, move
the list walk off into another thread and simply provide a "target" to
push to. When a thread requires a push, it sets the target and wakes the
push thread, then goes to sleep waiting for the required amount of space
to become available in the log.

This mechanism should also be a lot fairer under heavy load as the waiters
will queue in arrival order, rather than queuing in "who completed a push
first" order.

Also, by moving the pushing to a separate thread we can do more
effectively overload detection and prevention as we can keep context from
loop iteration to loop iteration. That is, we can push only part of the
list each loop and not have to loop back to the start of the list every
time we run. This should also help by reducing the number of items we try
to lock and/or push items that we cannot move.

Note that this patch is not intended to solve the inefficiencies in the
AIL structure and the associated issues with extremely large list
contents. That needs to be addresses separately; parallel access would
cause problems to any new structure as well, so I'm only aiming to isolate
the structure from unbounded parallelism here.

SGI-PV: 972759
SGI-Modid: xfs-linux-melb:xfs-kern:30371a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:22:51 +11:00
Christoph Hellwig
4576758db5 [XFS] use generic_permission
Now that all direct caller of xfs_iaccess are gone we can kill xfs_iaccess
and xfs_access and just use generic_permission with a check_acl callback.
This is required for the per-mount read-only patchset in -mm to work
properly with XFS.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30370a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:22:38 +11:00
Christoph Hellwig
f6aa7f2184 [XFS] stop re-checking permissions in xfs_swapext
xfs_swapext should simplify check if we have a writeable file descriptor
instead of re-checking the permissions using xfs_iaccess. Add an
additional check to refuse O_APPEND file descriptors because swapext is
not an append-only write operation.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30369a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:22:24 +11:00
Christoph Hellwig
35fec8df65 [XFS] clean up xfs_swapext
- stop using vnodes
- use proper multiple label goto unwinding
- give the struct file * variables saner names

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30366a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:22:02 +11:00
Christoph Hellwig
199037c598 [XFS] remove permission check from xfs_change_file_space
Both callers of xfs_change_file_space alreaedy do the file->f_mode &
FMODE_WRITE check to ensure we have a file descriptor that has been opened
for write mode, so there is no need to re-check that with xfs_iaccess.
Especially as the later might wrongly deny it for corner cases like file
descriptor passing through unix domain sockets.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30365a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:21:14 +11:00
Lachlan McIlroy
9742bb93da [XFS] prevent panic during log recovery due to bogus op_hdr length
A problem was reported where a system panicked in log recovery due to a
corrupt log record. The cause of the corruption is not known but this
change will at least prevent a crash for this specific scenario. Log
recovery definitely needs some more work in this area.

SGI-PV: 974151
SGI-Modid: xfs-linux-melb:xfs-kern:30318a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-02-07 18:20:58 +11:00
Christoph Hellwig
f71354bc3a [XFS] Cleanup various fid related bits:
- merge xfs_fid2 into it's only caller xfs_dm_inode_to_fh.
- remove xfs_vget and opencode it in the two callers, simplifying
  both of them by avoiding the awkward calling convetion.
- assign directly to the dm_fid_t members in various places in the
  dmapi code instead of casting them to xfs_fid_t first (which
  is identical to dm_fid_t)

SGI-PV: 974747
SGI-Modid: xfs-linux-melb:xfs-kern:30258a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:20:11 +11:00
David Chinner
edd319dc52 [XFS] Fix xfs_lowbit64
xfs_lowbit64 was broken on 32 bit platforms in a recent cleanup of the xfs
bitops. Fix it back up again.

SGI-PV: 974005
SGI-Modid: xfs-linux-melb:xfs-kern:30202a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:19:41 +11:00
Christoph Hellwig
45ba598e56 [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros.
Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of
XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode
that embedds and xfs_icdinode_core and XFS_DFORK_* operates on an
xfs_dinode that embedds a xfs_dinode_core one will have to do endian
swapping while the other doesn't. Instead of having the current mess with
the CFORK macros that have byteswapping and non-byteswapping version
(which are inconsistantly named while we're at it) just define each family
of the macros to stand by itself and simplify the whole matter.

A few direct references to the CFORK variants were cleaned up to use IFORK
or DFORK to make this possible.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30163a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:19:24 +11:00
Christoph Hellwig
a9759f2de3 [XFS] kill superflous buffer locking (2nd attempt)
There is no need to lock any page in xfs_buf.c because we operate on our
own address_space and all locking is covered by the buffer semaphore. If
we ever switch back to main blockdeive address_space as suggested e.g. for
fsblock with a similar scheme the locking will have to be totally revised
anyway because the current scheme is neither correct nor coherent with
itself.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30156a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:18:50 +11:00
Robert P. J. Day
40ebd81d1a [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity
SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30098a

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:18:19 +11:00
Tim Shimmin
e6a4b37f38 [XFS] Remove the BPCSHIFT and NB* based macros from XFS.
The BPCSHIFT based macros, btoc*, ctob*, offtoc* and ctooff are either not
used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to
be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE
where appropriate. Initial patch and motivation from Nicolas Kaiser.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30096a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:17:58 +11:00
Niv Sardi
f7b7c3673e [XFS] Remove bogus assert
This assert is bogus. We can have a forced shutdown occur
between the check for the XLOG_FORCED_SHUTDOWN and the ASSERT. Also, the
logging system shouldn't care about the state of XFS_FORCED_SHUTDOWN, it
should only check XLOG_FORCED_SHUTDOWN. The logging system has it's own
forced shutdown flag so, for the case of a forced shutdown that's not due
to a logging error, we can flush the log.

SGI-PV: 972985
SGI-Modid: xfs-linux-melb:xfs-kern:30029a

Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:17:39 +11:00
Eric Sandeen
71ddabb94a [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config
Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if
CONFIG_XFS_RT is off. This should be safe because mount checks in
xfs_rtmount_init:

so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be
encountered after that.

Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space,
presumeably gcc can optimize around the various "if (0)" type checks:

xfs_alloc_file_space -8 xfs_bmap_adjacent -16 xfs_bmapi -8
xfs_bmap_rtalloc -16 xfs_bunmapi -28 xfs_free_file_space -64 xfs_imap +8
<-- ? hmm. xfs_iomap_write_direct -12 xfs_qm_dqusage_adjust -4
xfs_qm_vop_chown_reserve -4

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30014a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:16:43 +11:00
David Chinner
a67d7c5f5d [XFS] Move platform specific mount option parse out of core XFS code
Mount option parsing is platform specific. Move it out of core code into
the platform specific superblock operation file.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30012a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:16:30 +11:00
David Chinner
3ed6526441 [XFS] Implement fallocate.
Implement the new generic callout for file preallocation. Atomically
change the file size if requested.

SGI-PV: 972756
SGI-Modid: xfs-linux-melb:xfs-kern:30009a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:16:17 +11:00
David Chinner
5d51eff453 [XFS] Fix inode allocation latency
The log force added in xfs_iget_core() has been a performance issue since
it was introduced for tight loops that allocate then unlink a single file.
under heavy writeback, this can introduce unnecessary latency due tothe
log I/o getting stuck behind bulk data writes.

Fix this latency problem by avoinding the need for the log force by moving
the place we mark linux inode dirty to the transaction commit rather than
on transaction completion.

This also closes a potential hole in the sync code where a linux inode is
not dirty between the time it is modified and the time the log buffer has
been written to disk.

SGI-PV: 972753
SGI-Modid: xfs-linux-melb:xfs-kern:30007a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:16:07 +11:00
David Chinner
e4143a1cf5 [XFS] Fix transaction overrun during writeback.
Prevent transaction overrun in xfs_iomap_write_allocate() if we race with
a truncate that overlaps the delalloc range we were planning to allocate.

If we race, we may allocate into a hole and that requires block
allocation. At this point in time we don't have a reservation for block
allocation (apart from metadata blocks) and so allocating into a hole
rather than a delalloc region results in overflowing the transaction block
reservation.

Fix it by only allowing a single extent to be allocated at a time.

SGI-PV: 972757
SGI-Modid: xfs-linux-melb:xfs-kern:30005a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:15:55 +11:00
David Chinner
786f486f81 [XFS] Show all mount args in /proc/mounts
There are several mount options that don't show up in /proc/mounts. Add
them in and clean up the showargs code at the same time.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30004a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:15:43 +11:00
David Chinner
8ae2c0f64a [XFS] Fix sparse warning in xlog_recover_do_efd_trans.
Sparse trips over the locking order in xlog_recover_do_efd_trans() when
xfs_trans_delete_ail() drops the ail lock. Because the unlock is
conditional, we need to either annotate with a "fake unlock" or change the
structure of the code so sparse thinks the function always unlocks.

Reordering the code makes it simpler, so do that.

SGI-PV: 972755
SGI-Modid: xfs-linux-melb:xfs-kern:30003a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:15:29 +11:00
David Chinner
a8272ce0c1 [XFS] Fix up sparse warnings.
These are mostly locking annotations, marking things static, casts where
needed and declaring stuff in header files.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30002a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:14:38 +11:00
David Chinner
a69b176df2 [XFS] Use the generic bitops rather than implementing them ourselves.
Patch inspired by Andi Kleen.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30000a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:14:22 +11:00
Vlad Apostolov
c319b58b13 [XFS] Make xfs_bulkstat() to report unlinked but referenced inodes
We need xfs_bulkstat() to report inode stat for inodes with link count
zero but reference count non zero.

The fix here:

http://oss.sgi.com/archives/xfs/2007-09/msg00266.html

changed this behavior and made xfs_bulkstat() to filter all unlinked
inodes including those that are not destroyed yet but held by reference.

The attached patch returns back to the original behavior by marking the
on-disk inode buffer "dirty" when di_mode is cleared (at that time both
inode link and reference counter are zero).

SGI-PV: 972004
SGI-Modid: xfs-linux-melb:xfs-kern:29914a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:13:37 +11:00
Lachlan McIlroy
98ce2b5b1b [XFS] 971186 Undo mod xfs-linux-melb:xfs-kern:29845a due to a regression
SGI-PV: 971596
SGI-Modid: xfs-linux-melb:xfs-kern:29902a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:13:27 +11:00
Eric Sandeen
bc58f9bb6b [XFS] fix 32-bit compat ioctls for GETXFLAGS, SETXFLAGS, GETVERSION
XFS_IOC_GETVERSION, XFS_IOC_GETXFLAGS and XFS_IOC_SETXFLAGS all take a
"long" which changes size between 32 and 64 bit platforms.

So, the ioctl cmds that come in from a 32-bit app aren't as expected, for
example on GETXFLAGS,

unknown cmd fd(3) cmd(80046601){t:'f';sz:4}

due to the size mismatch.

So, use instead the 32-bit version of the commands for compat ioctls, and
other than that it doesn't take any more manipulation.

Also, for both native and compat versions, just define them to the values
as defined in fs.h

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29849a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:13:17 +11:00
Eric Sandeen
d4f3cc016f [XFS] lose xfs_hex_dump in favor of print_hex_dump
No need for xfs to have its own hex dumping routine now that the kernel
has one.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29847a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:13:05 +11:00
Christoph Hellwig
91906a882a [XFS] kill XFS_INOBT_IS_FREE_DISK
This macro is unused an all other acros in this family operate on native
types, so we most likely won't grow a user either.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29846a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:12:41 +11:00
Christoph Hellwig
c40ea74101 [XFS] kill superflous buffer locking
There is no need to lock any page in xfs_buf.c because we operate on our
own address_space and all locking is covered by the buffer semaphore. If
we ever switch back to main blockdeive address_space as suggested e.g. for
fsblock with a similar scheme the locking will have to be totally revised
anyway because the current scheme is neither correct nor coherent with
itself.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29845a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:12:07 +11:00
Eric Sandeen
0771fb4515 [XFS] Refactor xfs_mountfs
Refactoring xfs_mountfs() to call sub-functions for logical chunks can
help save a bit of stack, and can make it easier to read this long
function.

The mount path is one of the longest common callchains, easily getting to
within a few bytes of the end of a 4k stack when over lvm, quotas are
enabled, and quotacheck must be done.

With this change on top of the other stack-related changes I've sent, I
can get xfs to survive a normal xfsqa run on 4k stacks over lvm.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29834a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:11:56 +11:00
Christoph Hellwig
b53e675dc8 [XFS] xlog_rec_header/xlog_rec_ext_header endianess annotations
Mostly trivial conversion with one exceptions: h_num_logops was kept in
native endian previously and only converted to big endian in xlog_sync,
but we always keep it big endian now. With todays cpus fast byteswap
instructions that's not an issue but the new variant keeps the code clean
and maintainable.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29821a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:11:47 +11:00
Christoph Hellwig
67fcb7bfb6 [XFS] clean up some xfs_log_priv.h macros
- the various assign lsn macros are replaced by a single inline,
xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except
for a more sane calling convention. ASSIGN_LSN_DISK is replaced
by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same,
except we pass the cycle and block arguments explicitly instead of a
log paramter. The latter two variants only had 2, respectively one
user anyway.
- the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the
same calling conventions.
- GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away
the unused arch argument. Instead of conditional defintions
depending on host endianess we now do an unconditional swap and shift
then, which generates equal code.
- the unused XLOG_SET macro is removed.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29820a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:11:38 +11:00
Christoph Hellwig
03bea6fe6c [XFS] clean up some xfs_log_priv.h macros
- the various assign lsn macros are replaced by a single inline,
xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except
for a more sane calling convention. ASSIGN_LSN_DISK is replaced
by xlog_assign_lsn and a manual bytespap, and ASSIGN_LSN by the same,
except we pass the cycle and block arguments explicitly instead of a
log paramter. The latter two variants only had 2, respectively one
user anyway.
- the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the
same calling conventions.
- GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away
the unused arch argument. Instead of conditional defintions
depending on host endianess we now do an unconditional swap and shift
then, which generates equal code.
- the unused XLOG_SET macro is removed.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29819a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:10:31 +11:00
Christoph Hellwig
9909c4aa1a [XFS] kill xfs_freeze.
No need to have a wrapper just two call two more functions.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29816a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 18:09:56 +11:00
Christoph Hellwig
10090be25c [XFS] cleanup vnode useage in xfs_iget.c
Get rid of vnode useage in xfs_iget.c and pass Linux inode / xfs_inode
where apropinquate. And kill some useless helpers while we're at it.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29808a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:55:46 +11:00
Christoph Hellwig
6e7f75eafb [XFS] cleanup vnode useage in xfs_ioctl.c
xfs_ioctl.c passes around vnode pointers quite a lot, but all places
already have the Linux inode which is identical to the vnode these days.
Clean the code up to always use the Linux inode.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29807a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:55:35 +11:00
Christoph Hellwig
4ca488eb45 [XFS] Kill off xfs_statvfs.
We were already filling the Linux struct statfs anyway, and doing this
trivial task directly in xfs_fs_statfs makes the code quite a bit cleaner.
While I was at it I also moved copying attributes that don't change over
the lifetime of the filesystem outside the superblock lock.

xfs_fs_fill_super used to get the magic number and blocksize through
xfs_statvfs, but assigning them directly is a lot cleaner and will save
some stack space during mount.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:29802a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:53:27 +11:00
Christoph Hellwig
c43f408795 [XFS] simplify xfs_vn_getattr
Just fill in struct kstat directly from the xfs_inode instead of doing a
detour through a bhv_vattr_t and xfs_getattr.

SGI-PV: 970980
SGI-Modid: xfs-linux-melb:xfs-kern:29770a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:49:06 +11:00
Christoph Hellwig
613d70436c [XFS] kill xfs_iocore_t
xfs_iocore_t is a structure embedded in xfs_inode. Except for one field it
just duplicates fields already in xfs_inode, and there is nothing this
abstraction buys us on XFS/Linux. This patch removes it and shrinks source
and binary size of xfs aswell as shrinking the size of xfs_inode by 60/44
bytes in debug/non-debug builds.

SGI-PV: 970852
SGI-Modid: xfs-linux-melb:xfs-kern:29754a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:48:58 +11:00
Eric Sandeen
007c61c686 [XFS] Remove spin.h
remove spinlock init abstraction macro in spin.h, remove the callers, and
remove the file. Move no-op spinlock_destroy to xfs_linux.h Cleanup
spinlock locals in xfs_mount.c

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29751a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:47:45 +11:00
Eric Sandeen
36e41eebda [XFS] Cleanup lock goop.
Switch last couple lock_t's to spinlock_t's. Remove now-unused
spinlock-related macros & types.

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29748a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-02-07 16:47:35 +11:00