linux/fs
Jens Axboe 18ce3751cc Properly notify block layer of sync writes
fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
then immediately wait on them. Conceptually, that makes them sync writes
and we should treat them as such so that the IO schedulers can handle
them appropriately.

This patch fixes a write starvation issue that Lin Ming reported, where
xx is stuck for more than 2 minutes because of a large number of
synchronous IO in the system:

INFO: task kjournald:20558 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kjournald     D ffff810010820978  6712 20558      2
ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2
ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb
0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537
Call Trace:
[<ffffffff803ba6f2>] kobject_get+0x12/0x17
[<ffffffff80247537>] getnstimeofday+0x2f/0x83
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d195>] io_schedule+0x5d/0x9f
[<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f
[<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f
[<ffffffff8029c1ac>] sync_buffer+0x0/0x3f
[<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78
[<ffffffff80243909>] wake_bit_function+0x0/0x23
[<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb
[<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6
[<ffffffff8023a676>] lock_timer_base+0x26/0x4b
[<ffffffff8030300a>] kjournald+0xc1/0x1fb
[<ffffffff802438db>] autoremove_wake_function+0x0/0x2e
[<ffffffff80302f49>] kjournald+0x0/0x1fb
[<ffffffff802437bb>] kthread+0x47/0x74
[<ffffffff8022de51>] schedule_tail+0x28/0x5d
[<ffffffff8020cac8>] child_rip+0xa/0x12
[<ffffffff80243774>] kthread+0x0/0x74
[<ffffffff8020cabe>] child_rip+0x0/0x12

Lin Ming confirms that this patch fixes the issue. I've run tests with
it for the past week and no ill effects have been observed, so I'm
proposing it for inclusion into 2.6.26.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-07-01 09:07:34 +02:00
..
9p 9p: fix error path during early mount 2008-05-14 19:23:27 -05:00
adfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
affs [PATCH] fix reservation discarding in affs 2008-05-06 13:45:33 -04:00
afs Fix various old email addresses for dwmw2 2008-06-06 11:29:10 -07:00
autofs mount options: fix autofs 2008-02-08 09:22:40 -08:00
autofs4 autofs: path_{get,put}() cleanups 2008-05-01 08:04:01 -07:00
befs byteorder: don't directly include linux/byteorder/generic.h 2008-05-16 12:01:45 -07:00
bfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cifs Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 2008-06-11 09:45:51 -07:00
coda codafs: fix build warning 2008-04-29 08:06:04 -07:00
configfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cramfs fs: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:16:44 -04:00
debugfs DEBUGFS: Correct location of debugfs API documentation. 2008-04-30 16:52:47 -07:00
devpts devpts: factor out PTY index allocation 2008-04-30 08:29:48 -07:00
dlm dlm: fix plock dev_write return value 2008-05-19 15:37:27 -05:00
ecryptfs eCryptfs: remove unnecessary page decrypt call 2008-06-06 11:29:09 -07:00
efs efs: update error msg to not refer to deleted read_inode() 2008-04-02 15:28:19 -07:00
exportfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
ext2 ext2: retry block allocation if new blocks are allocated from system zone 2008-04-28 08:58:43 -07:00
ext3 ext3: fix online resize bug 2008-06-06 11:29:13 -07:00
ext4 Ext4: Fix online resize block group descriptor corruption 2008-06-20 11:48:48 -04:00
fat fat: relax the permission check of fat_setattr() 2008-06-12 18:05:39 -07:00
freevxfs fs/freevxfs/: proper externs 2008-04-29 08:06:00 -07:00
fuse fuse: fix thinko in max I/O size calucation 2008-06-17 18:08:10 -07:00
gfs2 [GFS2] fix gfs2 block allocation (cleaned up) 2008-06-24 19:02:28 +01:00
hfs hfs: fix warning with 64k PAGE_SIZE 2008-04-30 08:29:52 -07:00
hfsplus Fix hfsplus oops on image without extents 2008-05-13 08:02:24 -07:00
hostfs uml: fix hostfs tv_usec calculations 2008-02-05 09:44:30 -08:00
hpfs mount options: fix hpfs 2008-02-08 09:22:40 -08:00
hppfs fix hppfs Makefile breakage 2008-05-21 16:55:58 -07:00
hugetlbfs mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
isofs isofs: fix access to unallocated memory when reading corrupted filesystem 2008-04-30 08:29:33 -07:00
jbd jbd: need to hold j_state_lock to updates to transaction t_state to T_COMMIT 2008-05-14 19:11:14 -07:00
jbd2 jbd2: Fix barrier fallback code to re-lock the buffer head 2008-06-03 22:31:11 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2008-05-01 11:15:28 -07:00
jfs proc: remove proc_root_fs 2008-04-29 08:06:18 -07:00
lockd fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
minix iget: stop the MINIX filesystem from using iget() and read_inode() 2008-02-07 08:42:28 -08:00
msdos fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
ncpfs ncpfs: use get/put_unaligned_* helpers 2008-04-29 08:06:28 -07:00
nfs NFS: nfs_updatepage(): don't mark page as dirty if an error occurred 2008-06-23 17:09:07 -04:00
nfs_common
nfsd nfsd: reorder printk in do_probe_callback to avoid use-after-free 2008-05-18 19:13:07 -04:00
nls
ntfs ntfs: le*_add_cpu conversion 2008-05-24 09:56:08 -07:00
ocfs2 ocfs2: Remove ->hangup() from stack glue operations. 2008-06-16 10:46:52 -07:00
openpromfs iget: stop OPENPROMFS from using iget() and read_inode() 2008-02-07 08:42:29 -08:00
partitions fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
proc pagemap: fix large pages in pagemap 2008-06-12 18:05:41 -07:00
qnx4 iget: stop QNX4 from using iget() and read_inode() 2008-02-07 08:42:28 -08:00
ramfs mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
reiserfs reiserfs: use open_bdev_excl 2008-04-30 08:29:51 -07:00
romfs ROMFS: Fix up an error in iget removal 2008-03-19 18:53:36 -07:00
smbfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
sysfs sysfs: remove error messages for -EEXIST case 2008-05-14 22:34:16 -07:00
sysv sysv: [bl]e*_add_cpu conversion 2008-04-30 08:29:52 -07:00
udf udf: Fix regression in UDF anchor block detection 2008-06-24 11:38:03 +02:00
ufs ufs: remove unneeded ufs_put_inode prototype 2008-05-13 08:02:23 -07:00
vfat fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
xfs [XFS] Fix memory corruption with small buffer reads 2008-05-23 18:12:49 +10:00
aio.c uml: activate_mm: remove the dead PF_BORROWED_MM check 2008-06-06 11:36:22 -07:00
anon_inodes.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
attr.c
bad_inode.c iget: introduce a function to register iget failure 2008-02-07 08:42:26 -08:00
binfmt_aout.c fs/binfmt_aout.c: use printk_ratelimit() 2008-04-29 08:06:04 -07:00
binfmt_elf_fdpic.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_elf.c Remove last traces of a.out support from ELF loader. 2008-06-16 10:20:57 -07:00
binfmt_em86.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_flat.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_misc.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_script.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_som.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
bio.c docbook: fix bio missing parameter 2008-05-07 18:35:03 +02:00
block_dev.c [PATCH] fix cgroup-inflicted breakage in block_dev.c 2008-06-23 08:30:55 -04:00
buffer.c Properly notify block layer of sync writes 2008-07-01 09:07:34 +02:00
char_dev.c fs: remove unused fops from struct char_device_struct 2008-04-29 08:06:01 -07:00
compat_binfmt_elf.c x86: compat_binfmt_elf 2008-01-30 13:31:46 +01:00
compat_ioctl.c tty: The big operations rework 2008-04-30 08:29:47 -07:00
compat.c [PATCH] get rid of leak in compat_execve() 2008-05-16 17:23:05 -04:00
dcache.c [patch 2/3] vfs: dcache cleanups 2008-06-23 13:07:00 -04:00
dcookies.c d_path: Make d_path() use a struct path 2008-02-14 21:17:09 -08:00
direct-io.c Pagecache zeroing: zero_user_segment, zero_user_segments and zero_user 2008-02-05 09:44:13 -08:00
dnotify.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
dquot.c quota: don't call sync_fs() from vfs_quota_off() when there's no quota turn off 2008-05-13 08:02:23 -07:00
drop_caches.c vfs: skip inodes without pages to free in drop_pagecache_sb() 2008-04-29 08:06:05 -07:00
eventfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
eventpoll.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
exec.c Include <asm/a.out.h> in fs/exec.c only for Alpha. 2008-06-16 10:20:57 -07:00
fcntl.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
fifo.c
file_table.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
file.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
filesystems.c
fs-writeback.c fs/fs-writeback.c: make 2 functions static 2008-04-29 08:06:00 -07:00
generic_acl.c
inode.c VFS: fix unused variable warning 2008-05-06 13:13:37 -07:00
inotify_user.c Remove duplicated unlikely() in IS_ERR() 2008-04-29 08:06:25 -07:00
inotify.c inotify: remove debug code 2008-02-06 10:41:07 -08:00
internal.h [PATCH] move a bunch of declarations to fs/internal.h 2008-04-21 23:11:01 -04:00
ioctl.c make vfs_ioctl() static 2008-04-29 08:06:00 -07:00
ioprio.c cfq-iosched: relax IOPRIO_CLASS_IDLE restrictions 2008-01-28 11:38:15 +01:00
Kconfig [S390] System z large page support. 2008-04-30 13:38:47 +02:00
Kconfig.binfmt frv: don't offer BINFMT_FLAT 2008-06-06 11:29:08 -07:00
libfs.c introduce memory_read_from_buffer() 2008-06-06 11:29:11 -07:00
locks.c [patch 4/4] flock: remove unused fields from file_lock_operations 2008-06-23 11:52:30 -04:00
Makefile x86: compat_binfmt_elf Kconfig 2008-01-30 13:31:46 +01:00
mbcache.c vfs: fix possible deadlock in ext2, ext3, ext4 when using xattrs 2008-04-15 19:35:41 -07:00
mpage.c docbook: fix filesystems.tmpl source files 2008-03-03 10:47:13 -08:00
namei.c [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink 2008-06-23 11:52:30 -04:00
namespace.c fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
nfsctl.c Introduce path_put() 2008-02-14 21:13:33 -08:00
no-block.c
open.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
pipe.c [patch 1/4] vfs: path_{get,put}() cleanups 2008-06-23 11:52:29 -04:00
pnode.c [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
pnode.h [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
posix_acl.c
quota_v1.c quota: do not allow setting of quota limits to too high values 2008-04-28 08:58:32 -07:00
quota_v2.c quota: le*_add_cpu conversion 2008-04-30 08:29:51 -07:00
quota.c quota: quota core changes for quotaon on remount 2008-04-28 08:58:33 -07:00
read_write.c fs: use loff_t type instead of long long 2008-04-22 15:17:11 -07:00
read_write.h
readdir.c Use mutex_lock_killable in vfs_readdir 2007-12-06 17:39:54 -05:00
select.c Fix performance regression on lmbench select benchmark 2008-06-22 12:23:15 -07:00
seq_file.c [patch 2/7] vfs: mountinfo: add seq_file_root() 2008-04-23 00:04:38 -04:00
signalfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
splice.c splice: handle try_to_release_page() failure 2008-05-28 14:49:27 +02:00
stack.c
stat.c Introduce path_put() 2008-02-14 21:13:33 -08:00
super.c make __put_super() static 2008-04-29 08:06:00 -07:00
sync.c vfs: fix unconditional write_super() call in file_fsync() 2008-04-29 08:06:06 -07:00
timerfd.c [PATCH] sanitize anon_inode_getfd() 2008-05-01 13:08:50 -04:00
utimes.c [patch for 2.6.26 4/4] vfs: utimensat(): fix write access check for futimens() 2008-06-23 08:43:52 -04:00
xattr_acl.c
xattr.c xattr: add missing consts to function arguments 2008-04-29 08:06:06 -07:00