linux/fs
Dave Chinner 9f87832a82 xfs: fix shutdown hang on invalid inode during create
When the new inode verify in xfs_iread() fails, the create
transaction is aborted and a shutdown occurs. The subsequent unmount
then hangs in xfs_wait_buftarg() on a buffer that has an elevated
hold count. Debugging showed that it was an AGI buffer getting stuck:

[   22.576147] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
[   22.976213] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
[   23.376206] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck
[   23.776325] XFS (vdb): buffer 0x2/0x1, hold 0x2 stuck

The trace of this buffer leading up to the shutdown (trimmed for
brevity) looks like:

xfs_buf_init:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_get_map
xfs_buf_get:         bno 0x2 len 0x200 hold 1 caller xfs_buf_read_map
xfs_buf_read:        bno 0x2 len 0x200 hold 1 caller xfs_trans_read_buf_map
xfs_buf_iorequest:   bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
xfs_buf_hold:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_iorequest
xfs_buf_rele:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_iorequest
xfs_buf_iowait:      bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
xfs_buf_ioerror:     bno 0x2 len 0x200 hold 1 caller xfs_buf_bio_end_io
xfs_buf_iodone:      bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_ioend
xfs_buf_iowait_done: bno 0x2 nblks 0x1 hold 1 caller _xfs_buf_read
xfs_buf_hold:        bno 0x2 nblks 0x1 hold 1 caller xfs_buf_item_init
xfs_trans_read_buf:  bno 0x2 len 0x200 hold 2 recur 0 refcount 1
xfs_trans_brelse:    bno 0x2 len 0x200 hold 2 recur 0 refcount 1
xfs_buf_item_relse:  bno 0x2 nblks 0x1 hold 2 caller xfs_trans_brelse
xfs_buf_rele:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_relse
xfs_buf_unlock:      bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
xfs_buf_rele:        bno 0x2 nblks 0x1 hold 1 caller xfs_trans_brelse
xfs_buf_trylock:     bno 0x2 nblks 0x1 hold 2 caller _xfs_buf_find
xfs_buf_find:        bno 0x2 len 0x200 hold 2 caller xfs_buf_get_map
xfs_buf_get:         bno 0x2 len 0x200 hold 2 caller xfs_buf_read_map
xfs_buf_read:        bno 0x2 len 0x200 hold 2 caller xfs_trans_read_buf_map
xfs_buf_hold:        bno 0x2 nblks 0x1 hold 2 caller xfs_buf_item_init
xfs_trans_read_buf:  bno 0x2 len 0x200 hold 3 recur 0 refcount 1
xfs_trans_log_buf:   bno 0x2 len 0x200 hold 3 recur 0 refcount 1
xfs_buf_item_unlock: bno 0x2 len 0x200 hold 3 flags DIRTY liflags ABORTED
xfs_buf_unlock:      bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock
xfs_buf_rele:        bno 0x2 nblks 0x1 hold 3 caller xfs_buf_item_unlock

That is the life of the AGI buffer, from a cold cache read into
memory through to transaction abort. At transaction abort the bli is
dirty and has only a single reference. The item is not pinned, and it
is not in the AIL. Hence the only reference to it is this transaction.

The problem is that the xfs_buf_item_unlock() call is dropping the
last reference to the xfs_buf_log_item attached to the buffer (which
holds a reference to the buffer), but it is not freeing the
xfs_buf_log_item. Hence nothing will ever release the buffer, and
the unmount hangs waiting for this reference to go away.

The fix is simple: xfs_buf_item_unlock() needs to detect the last
reference going away in this case and free the xfs_buf_log_item to
release the reference it holds on the buffer.
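
The reference counting problem described above can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: the
toy_* types and functions are hypothetical stand-ins, not the kernel's
struct xfs_buf, xfs_buf_log_item or xfs_buf_item_unlock(), and the hold
counts are simplified and do not match the trace. It shows that if the last
reference to the log item is dropped on abort without freeing the item, the
buffer hold owned by the item is never released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; NOT the kernel structures. */
struct toy_bli;

struct toy_buf {
	int hold;		/* outstanding references to the buffer */
	struct toy_bli *bli;	/* attached log item, if any */
};

struct toy_bli {
	int refcount;		/* transactions referencing the item */
	struct toy_buf *buf;	/* the item owns one hold on buf */
};

static void toy_buf_rele(struct toy_buf *bp)
{
	assert(bp->hold > 0);
	bp->hold--;
}

/* Attach a log item to the buffer; the item takes its own hold. */
static struct toy_bli *toy_bli_attach(struct toy_buf *bp)
{
	struct toy_bli *bip = calloc(1, sizeof(*bip));

	if (!bip)
		abort();
	bip->buf = bp;
	bip->refcount = 1;
	bp->hold++;
	bp->bli = bip;
	return bip;
}

/* Free the item and drop the buffer hold it owned. */
static void toy_bli_free(struct toy_bli *bip)
{
	struct toy_buf *bp = bip->buf;

	bp->bli = NULL;
	toy_buf_rele(bp);
	free(bip);
}

/*
 * Model of unlocking the item when the transaction is done with it.
 * free_on_last_ref == false models the bug: the last item reference
 * goes away on abort but the item (and its buffer hold) is left behind.
 */
static void toy_bli_unlock(struct toy_bli *bip, bool aborted,
			   bool free_on_last_ref)
{
	struct toy_buf *bp = bip->buf;
	bool last = (--bip->refcount == 0);

	if (last && aborted && free_on_last_ref)
		toy_bli_free(bip);
	toy_buf_rele(bp);	/* drop the transaction's own hold */
}

/* Returns the buffer hold count left behind after an aborted transaction. */
int toy_demo_abort(bool free_on_last_ref)
{
	struct toy_buf bp = { .hold = 1, .bli = NULL };	/* hold from the read */
	struct toy_bli *bip = toy_bli_attach(&bp);	/* hold is now 2 */
	int remaining;

	toy_bli_unlock(bip, true, free_on_last_ref);
	remaining = bp.hold;
	free(bp.bli);	/* tidy up the leaked item in the buggy case */
	return remaining;
}
```

Calling toy_demo_abort(false) from a test harness leaves one hold on the
buffer, which is the kind of reference an unmount would wait on forever.
toy_demo_abort(true) leaves none.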

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-01-28 12:51:12 -06:00
9p The following changes since commit 4cbe5a555f: 2012-10-12 09:59:23 +09:00
adfs adfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
affs affs: drop vmtruncate 2012-12-20 14:00:01 -05:00
afs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
autofs4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
befs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
bfs bfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-12-20 18:14:31 -08:00
cachefiles FS-Cache: Mark cancellation of in-progress operation 2012-12-20 22:34:00 +00:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-12-20 14:00:13 -08:00
cifs cifs: eliminate cifsERROR variable 2012-12-20 11:27:17 -06:00
coda fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
configfs lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
cramfs userns: Convert cramfs to use kuid/kgid where appropriate 2012-09-21 03:13:08 -07:00
debugfs fs/debugsfs: remove unnecessary inode->i_private initialization 2012-11-15 17:46:42 -08:00
devpts TTY: devpts, document devpts inode operations 2012-10-22 16:50:13 -07:00
dlm dlm: fix lvb invalidation conditions 2012-11-16 11:20:42 -06:00
ecryptfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
efs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
exofs exofs: don't leak io_state and pages on read error 2012-12-14 12:17:32 +02:00
exportfs Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux 2012-12-20 14:04:11 -08:00
ext2 ext2: fix return values on parse_options() failure 2012-10-09 23:23:53 +02:00
ext3 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
ext4 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
f2fs f2fs: fix tracking parent inode number 2012-12-11 13:43:45 +09:00
fat fat: fix incorrect function comment 2012-12-20 17:40:20 -08:00
freevxfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
fscache FS-Cache: Clear remaining page count on retrieval cancellation 2012-12-20 22:35:15 +00:00
fuse Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-17 20:58:12 -08:00
gfs2 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
hfs hfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
hfsplus Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
hostfs Merge branch 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2012-10-10 11:15:20 +09:00
hpfs hpfs: drop vmtruncate 2012-12-20 18:40:00 -05:00
hppfs pidns: Use task_active_pid_ns where appropriate 2012-11-19 05:59:09 -08:00
hugetlbfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
isofs tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking 2012-10-09 23:33:55 -04:00
jbd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
jbd2 There are two major features for this merge window. The first is 2012-12-16 17:33:01 -08:00
jffs2 jffs2: hold erase_completion_lock on exit 2012-11-18 11:59:01 +02:00
jfs jfs: drop vmtruncate 2012-12-20 18:40:52 -05:00
lockd lockd: Remove BUG_ON()s from fs/lockd/clntproc.c 2012-11-04 14:43:40 -05:00
logfs logfs: drop vmtruncate 2012-12-20 18:40:53 -05:00
minix minix: drop vmtruncate 2012-12-20 18:40:53 -05:00
ncpfs ncpfs: drop vmtruncate 2012-12-20 18:40:54 -05:00
nfs NFS: Kill fscache warnings when mounting without -ofsc 2012-12-21 08:32:09 -08:00
nfs_common
nfsd Revert "nfsd: warn on odd reply state in nfsd_vfs_read" 2012-12-21 17:07:45 -08:00
nilfs2 nilfs2: drop vmtruncate 2012-12-20 18:40:54 -05:00
nls
notify Merge branch 'for-next' of git://git.infradead.org/users/eparis/notify 2012-12-20 20:11:52 -08:00
ntfs ntfs: drop vmtruncate 2012-12-20 18:40:55 -05:00
ocfs2 ocfs2: drop vmtruncate 2012-12-20 14:00:01 -05:00
omfs omfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
openpromfs fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
proc Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
pstore lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
qnx4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
qnx6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
quota quota: Use the pre-processor to compile out quotactl_cmd_write when !CONFIG_BLOCK 2012-12-13 16:33:24 +01:00
ramfs don't pass nameidata to ->create() 2012-07-14 16:34:47 +04:00
reiserfs reiserfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
romfs fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
squashfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
sysfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
sysv sysv: drop vmtruncate 2012-12-20 14:00:01 -05:00
ubifs ubifs: use prandom_bytes 2012-12-17 17:15:26 -08:00
udf udf: remove un-needed variable from inode_getblk 2012-12-13 16:33:23 +01:00
ufs ufs: drop vmtruncate 2012-12-20 14:00:01 -05:00
xfs xfs: fix shutdown hang on invalid inode during create 2013-01-28 12:51:12 -06:00
aio.c aio: now fput() is OK from interrupt context; get rid of manual delayed __fput() 2012-07-22 23:57:59 +04:00
anon_inodes.c
attr.c userns: Allow chown and setgid preservation 2012-11-20 04:17:24 -08:00
bad_inode.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
binfmt_aout.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_elf_fdpic.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_elf.c binfmt_elf: fix corner case kfree of uninitialized data 2012-12-17 17:15:19 -08:00
binfmt_em86.c exec: use -ELOOP for max recursion depth 2012-12-17 17:15:23 -08:00
binfmt_flat.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_misc.c exec: do not leave bprm->interp on stack 2012-12-20 17:40:19 -08:00
binfmt_script.c exec: do not leave bprm->interp on stack 2012-12-20 17:40:19 -08:00
binfmt_som.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
bio-integrity.c block: Ues bi_pool for bio_integrity_alloc() 2012-09-09 10:35:38 +02:00
bio.c vfs: fix: don't increase bio_slab_max if krealloc() fails 2012-10-22 22:00:26 +02:00
block_dev.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
buffer.c fs/buffer.c: remove redundant initialization in alloc_page_buffers() 2012-12-12 17:38:35 -08:00
char_dev.c char_dev: pin parent kobject 2012-10-22 08:50:37 +03:00
compat_binfmt_elf.c coredump: extend core dump note section to contain file names of mapped files 2012-10-06 03:05:17 +09:00
compat_ioctl.c Merge 3.7-rc3 into tty-next 2012-10-29 09:00:57 -07:00
compat.c vfs: define struct filename and have getname() return it 2012-10-12 20:14:55 -04:00
coredump.c do_coredump(): get rid of pt_regs argument 2012-11-29 00:01:25 -05:00
coredump.h coredump: update coredump-related headers 2012-10-06 03:05:15 +09:00
dcache.c vfs: d_obtain_alias() needs to use "/" as default name. 2012-12-20 18:49:10 -05:00
dcookies.c
direct-io.c direct-io: don't read inode->i_blkbits multiple times 2012-11-29 12:38:44 -08:00
drop_caches.c
eventfd.c fs, eventfd: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
eventpoll.c fs, epoll: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
exec.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
fcntl.c Fix F_DUPFD_CLOEXEC breakage 2012-10-09 15:52:31 +09:00
fhandle.c Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux 2012-12-20 14:04:11 -08:00
fifo.c fifo: Do not restart open() if it already found a partner 2012-07-16 08:33:14 -07:00
file_table.c fs: Fix imbalance in freeze protection in mark_files_ro() 2012-12-20 13:57:36 -05:00
file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2012-12-12 12:22:13 -08:00
filesystems.c vfs: define struct filename and have getname() return it 2012-10-12 20:14:55 -04:00
fs_struct.c kill daemonize() 2012-11-28 21:49:02 -05:00
fs-writeback.c writeback: fix a typo in comment 2012-12-12 17:38:34 -08:00
generic_acl.c userns: Pass a userns parameter into posix_acl_to_xattr and posix_acl_from_xattr 2012-09-18 01:01:35 -07:00
inode.c mm: redefine address_space.assoc_mapping 2012-12-11 17:22:26 -08:00
internal.h writeback: put unused inodes to LRU after writeback completion 2012-11-26 17:41:24 -08:00
ioctl.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
ioprio.c
Kconfig Introduce a new file system, Flash-Friendly File System (F2FS), to Linux 3.8. 2012-12-20 13:54:52 -08:00
Kconfig.binfmt coredump: make core dump functionality optional 2012-10-06 03:05:15 +09:00
libfs.c vfs: drop vmtruncate 2012-12-20 18:46:29 -05:00
locks.c UAPI Disintegration 2012-10-09 2012-10-09 18:35:22 -04:00
Makefile f2fs: update Kconfig and Makefile 2012-12-11 13:43:42 +09:00
mbcache.c
mount.h proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
mpage.c
namei.c vfs: fix renameat to retry on ESTALE errors 2012-12-20 18:50:05 -05:00
namespace.c vfs, freeze: use ACCESS_ONCE() to guard access to ->mnt_flags 2012-12-20 13:36:18 -05:00
no-block.c
open.c vfs: make fchownat retry once on ESTALE errors 2012-12-20 18:50:07 -05:00
pipe.c pipe(2) - race-free error recovery 2012-09-26 21:08:52 -04:00
pnode.c VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors 2012-07-14 16:37:27 +04:00
pnode.h vfs: Only support slave subtrees across different user namespaces 2012-11-19 05:59:20 -08:00
posix_acl.c userns: Convert vfs posix_acl support to use kuids and kgids 2012-09-18 01:01:35 -07:00
proc_namespace.c get rid of magic in proc_namespace.c 2012-07-14 16:32:48 +04:00
read_write.c sendfile: allows bypassing of notifier events 2012-12-20 17:40:21 -08:00
read_write.h compat: fs: Generic compat_sys_sendfile implementation 2012-10-02 21:35:55 -04:00
readdir.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
select.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
seq_file.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
signalfd.c fs, epoll: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
splice.c writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr() 2012-12-11 17:22:21 -08:00
stack.c
stat.c vfs: fix readlinkat to retry on ESTALE 2012-12-20 18:50:01 -05:00
statfs.c vfs: fix user_statfs to retry once on ESTALE errors 2012-12-20 18:50:07 -05:00
super.c vfs: drop lock/unlock super 2012-10-09 23:33:39 -04:00
sync.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
timerfd.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
utimes.c vfs: allow utimensat() calls to retry once on an ESTALE error 2012-12-20 18:50:08 -05:00
xattr_acl.c userns: Fix posix_acl_file_xattr_userns gid conversion 2012-10-12 13:16:48 -07:00
xattr.c vfs: make lremovexattr retry once on ESTALE error 2012-12-20 18:50:11 -05:00