Commit Graph

383 Commits

Author SHA1 Message Date
Linus Torvalds
82062e7b50 Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
  reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks
  reiserfs: Fix unreachable statement
  reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock
  reiserfs: Relax lock on xattr removing
  reiserfs: Relax the lock before truncating pages
  reiserfs: Fix recursive lock on lchown
  reiserfs: Fix mistake in down_write() conversion
2010-01-08 14:03:55 -08:00
Frederic Weisbecker
31370f62ba reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks
Fix remaining xattr locks acquired in reiserfs_xattr_set_handle()
while we are holding the reiserfs lock to avoid lock inversions.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-07 16:02:53 +01:00
Jiri Slaby
e0baec1b63 reiserfs: Fix unreachable statement
Stanse found an unreachable statement in reiserfs_ioctl. There is a
if followed by error assignment and `break' with no braces. Add the
braces so that we don't break every time, but only in error case,
so that REISERFS_IOC_SETVERSION actually works when it returns no
error.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Reiserfs <reiserfs-devel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-01-07 14:03:18 +01:00
Frederic Weisbecker
6c28705418 reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock
reiserfs_get_acl is usually not called under the reiserfs lock,
as it doesn't need it. But it happens when it is called by
reiserfs_acl_chmod(), which creates a dependency inversion against
the private xattr inodes mutexes for the given inode.

We need to call it without the reiserfs lock, especially since
it's unnecessary.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-07 13:46:48 +01:00
Frederic Weisbecker
4f3be1b5a9 reiserfs: Relax lock on xattr removing
When we remove an xattr, we call lookup_and_delete_xattr()
that takes some private xattr inodes mutexes. But we hold
the reiserfs lock at this time, which leads to dependency
inversions.

We can safely call lookup_and_delete_xattr() without the
reiserfs lock, where xattr inodes lookups only need the
xattr inodes mutexes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-05 08:00:50 +01:00
Frederic Weisbecker
108d3943c0 reiserfs: Relax the lock before truncating pages
While truncating a file, reiserfs_setattr() calls inode_setattr()
that will truncate the mapping for the given inode, but for that
it needs the pages locks.

In order to release these, the owners need the reiserfs lock to
complete their jobs. But they can't, as we don't release it before
calling inode_setattr().

We need to do that to fix the following softlockups:

INFO: task flush-8:0:2149 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0     D f51af998     0  2149      2 0x00000000
 f51af9ac 00000092 00000002 f51af998 c2803304 00000000 c1894ad0 010f3000
 f51af9cc c1462604 c189ef80 f51af974 c1710304 f715b450 f715b5ec c2807c40
 00000000 0005bb00 c2803320 c102c55b c1710304 c2807c50 c2803304 00000246
Call Trace:
 [<c1462604>] ? schedule+0x434/0xb20
 [<c102c55b>] ? resched_task+0x4b/0x70
 [<c106fa22>] ? mark_held_locks+0x62/0x80
 [<c146414d>] ? mutex_lock_nested+0x1fd/0x350
 [<c14640b9>] mutex_lock_nested+0x169/0x350
 [<c1178cde>] ? reiserfs_write_lock+0x2e/0x40
 [<c1178cde>] reiserfs_write_lock+0x2e/0x40
 [<c11719a2>] do_journal_end+0xc2/0xe70
 [<c1172912>] journal_end+0xb2/0x120
 [<c11686b3>] ? pathrelse+0x33/0xb0
 [<c11729e4>] reiserfs_end_persistent_transaction+0x64/0x70
 [<c1153caa>] reiserfs_get_block+0x12ba/0x15f0
 [<c106fa22>] ? mark_held_locks+0x62/0x80
 [<c1154b24>] reiserfs_writepage+0xa74/0xe80
 [<c1465a27>] ? _raw_spin_unlock_irq+0x27/0x50
 [<c11f3d25>] ? radix_tree_gang_lookup_tag_slot+0x95/0xc0
 [<c10b5377>] ? find_get_pages_tag+0x127/0x1a0
 [<c106fa22>] ? mark_held_locks+0x62/0x80
 [<c106fcd4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bc1e0>] __writepage+0x10/0x40
 [<c10bc9ab>] write_cache_pages+0x16b/0x320
 [<c10bc1d0>] ? __writepage+0x0/0x40
 [<c10bcb88>] generic_writepages+0x28/0x40
 [<c10bcbd5>] do_writepages+0x35/0x40
 [<c11059f7>] writeback_single_inode+0xc7/0x330
 [<c11067b2>] writeback_inodes_wb+0x2c2/0x490
 [<c1106a86>] wb_writeback+0x106/0x1b0
 [<c1106cf6>] wb_do_writeback+0x106/0x1e0
 [<c1106c18>] ? wb_do_writeback+0x28/0x1e0
 [<c1106e0a>] bdi_writeback_task+0x3a/0xb0
 [<c10cbb13>] bdi_start_fn+0x63/0xc0
 [<c10cbab0>] ? bdi_start_fn+0x0/0xc0
 [<c105d1f4>] kthread+0x74/0x80
 [<c105d180>] ? kthread+0x0/0x80
 [<c100327a>] kernel_thread_helper+0x6/0x10
3 locks held by flush-8:0/2149:
 #0:  (&type->s_umount_key#30){+++++.}, at: [<c110676f>] writeback_inodes_wb+0x27f/0x490
 #1:  (&journal->j_mutex){+.+...}, at: [<c117199a>] do_journal_end+0xba/0xe70
 #2:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178cde>] reiserfs_write_lock+0x2e/0x40
INFO: task fstest:3813 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fstest        D 00000002     0  3813   3812 0x00000000
 f5103c94 00000082 f5103c40 00000002 f5ad5450 00000007 f5103c28 011f3000
 00000006 f5ad5450 c10bb005 00000480 c1710304 f5ad5450 f5ad55ec c2907c40
 00000001 f5ad5450 f5103c74 00000046 00000002 f5ad5450 00000007 f5103c6c
Call Trace:
 [<c10bb005>] ? free_hot_cold_page+0x1d5/0x280
 [<c1462d64>] io_schedule+0x74/0xc0
 [<c10b5a45>] sync_page+0x35/0x60
 [<c146325a>] __wait_on_bit_lock+0x4a/0x90
 [<c10b5a10>] ? sync_page+0x0/0x60
 [<c10b59e5>] __lock_page+0x85/0x90
 [<c105d660>] ? wake_bit_function+0x0/0x60
 [<c10bf654>] truncate_inode_pages_range+0x1e4/0x2d0
 [<c10bf75f>] truncate_inode_pages+0x1f/0x30
 [<c10bf7cf>] truncate_pagecache+0x5f/0xa0
 [<c10bf86a>] vmtruncate+0x5a/0x70
 [<c10fdb7d>] inode_setattr+0x5d/0x190
 [<c1150117>] reiserfs_setattr+0x1f7/0x2f0
 [<c1464569>] ? down_write+0x49/0x70
 [<c10fde01>] notify_change+0x151/0x330
 [<c10e6f3d>] do_truncate+0x6d/0xa0
 [<c10f4ce2>] do_filp_open+0x9a2/0xcf0
 [<c1465aec>] ? _raw_spin_unlock+0x2c/0x50
 [<c10fec50>] ? alloc_fd+0xe0/0x100
 [<c10e602d>] do_sys_open+0x6d/0x130
 [<c1002cfb>] ? sysenter_exit+0xf/0x16
 [<c10e615e>] sys_open+0x2e/0x40
 [<c1002ccc>] sysenter_do_call+0x12/0x32
3 locks held by fstest/3813:
 #0:  (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c10e6f33>] do_truncate+0x63/0xa0
 #1:  (&sb->s_type->i_alloc_sem_key#3){+.+.+.}, at: [<c10fdf07>] notify_change+0x257/0x330
 #2:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178c8e>] reiserfs_write_lock_once+0x2e/0x50

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-05 08:00:29 +01:00
Frederic Weisbecker
5fe1533fda reiserfs: Fix recursive lock on lchown
On chown, reiserfs will call reiserfs_setattr() to change the owner
of the given inode, but it may also recursively call
reiserfs_setattr() to propagate the owner change to the private xattr
files for this inode.

Hence, the reiserfs lock may be acquired twice which is not wanted
as reiserfs_setattr() calls journal_begin() that is going to try to
relax the lock in order to safely acquire the journal mutex.

Using reiserfs_write_lock_once() from reiserfs_setattr() solves
the problem.

This fixes the following warning, that precedes a lockdep report.

WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3f/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 4189, comm: fsstress Not tainted 2.6.33-rc2-tip-atom+ #195
Call Trace:
 [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
 [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
 [<c103f7ac>] warn_slowpath_common+0x6c/0xc0
 [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
 [<c103f84b>] warn_slowpath_fmt+0x2b/0x30
 [<c1178bff>] reiserfs_lock_check_recursive+0x3f/0x50
 [<c1172ae3>] do_journal_begin_r+0x83/0x350
 [<c1172f2d>] journal_begin+0x7d/0x140
 [<c106509a>] ? in_group_p+0x2a/0x30
 [<c10fda71>] ? inode_change_ok+0x91/0x140
 [<c115007d>] reiserfs_setattr+0x15d/0x2e0
 [<c10f9bf3>] ? dput+0xe3/0x140
 [<c1465adc>] ? _raw_spin_unlock+0x2c/0x50
 [<c117831d>] chown_one_xattr+0xd/0x10
 [<c11780a3>] reiserfs_for_each_xattr+0x113/0x2c0
 [<c1178310>] ? chown_one_xattr+0x0/0x10
 [<c14641e9>] ? mutex_lock_nested+0x2a9/0x350
 [<c117826f>] reiserfs_chown_xattrs+0x1f/0x60
 [<c106509a>] ? in_group_p+0x2a/0x30
 [<c10fda71>] ? inode_change_ok+0x91/0x140
 [<c1150046>] reiserfs_setattr+0x126/0x2e0
 [<c1177c20>] ? reiserfs_getxattr+0x0/0x90
 [<c11b0d57>] ? cap_inode_need_killpriv+0x37/0x50
 [<c10fde01>] notify_change+0x151/0x330
 [<c10e659f>] chown_common+0x6f/0x90
 [<c10e67bd>] sys_lchown+0x6d/0x80
 [<c1002ccc>] sysenter_do_call+0x12/0x32
---[ end trace 7c2b77224c1442fc ]---

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-05 07:59:38 +01:00
Frederic Weisbecker
f3e22f48f3 reiserfs: Fix mistake in down_write() conversion
Fix a mistake in commit 0719d34347
(reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion)
that has converted a down_write() into a down_read() accidentally.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-03 03:44:53 +01:00
Linus Torvalds
45d28b0972 Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
  reiserfs: Safely acquire i_mutex from xattr_rmdir
  reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
  reiserfs: Fix journal mutex <-> inode mutex lock inversion
  reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
  reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
  reiserfs: Relax reiserfs lock while freeing the journal
  reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
  reiserfs: Warn on lock relax if taken recursively
  reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
  reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion
  reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion
  reiserfs: Fix reiserfs lock and journal lock inversion dependency
  reiserfs: Fix possible recursive lock
2010-01-02 11:17:05 -08:00
Frederic Weisbecker
835d5247d9 reiserfs: Safely acquire i_mutex from xattr_rmdir
Relax the reiserfs lock before taking the inode mutex from
xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex
bad dependency.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:59:48 +01:00
Frederic Weisbecker
8b513f56d4 reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
Relax the reiserfs lock before taking the inode mutex from
reiserfs_for_each_xattr() to avoid the usual bad dependencies:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #179
-------------------------------------------------------
rm/3242 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401aab>] mutex_lock_nested+0x5b/0x340
       [<c1143339>] reiserfs_write_lock_once+0x29/0x50
       [<c1117022>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e3a>] open_xa_dir+0xea/0x1b0
       [<c1142720>] reiserfs_for_each_xattr+0x70/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401aab>] mutex_lock_nested+0x5b/0x340
       [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

1 lock held by rm/3242:
 #0:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40

stack backtrace:
Pid: 3242, comm: rm Not tainted 2.6.32-atom #179
Call Trace:
 [<c13ffa13>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
 [<c1401098>] ? mutex_unlock+0x8/0x10
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c1401aab>] mutex_lock_nested+0x5b/0x340
 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
 [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
 [<c1143180>] ? delete_one_xattr+0x0/0x100
 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
 [<c1143339>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d4f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c1401098>] ? mutex_unlock+0x8/0x10
 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
 [<c10c3af0>] ? filldir64+0x0/0xf0
 [<c1002ef3>] ? sysenter_exit+0xf/0x16
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0b13>] sys_unlinkat+0x23/0x40
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:59:14 +01:00
Frederic Weisbecker
4dd859697f reiserfs: Fix journal mutex <-> inode mutex lock inversion
We need to relax the reiserfs lock before locking the inode mutex
from xattr_unlink(), otherwise we'll face the usual bad dependencies:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #178
-------------------------------------------------------
rm/3202 is trying to acquire lock:
 (&journal->j_mutex){+.+...}, at: [<c113c234>] do_journal_begin_r+0x94/0x360

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c1142a67>] xattr_unlink+0x57/0xb0
       [<c1143179>] delete_one_xattr+0x29/0x100
       [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c1143359>] reiserfs_write_lock+0x29/0x40
       [<c113c23c>] do_journal_begin_r+0x9c/0x360
       [<c113c680>] journal_begin+0x80/0x130
       [<c1127363>] reiserfs_remount+0x223/0x4e0
       [<c10b6dd6>] do_remount_sb+0xa6/0x140
       [<c10ce6a0>] do_mount+0x560/0x750
       [<c10ce914>] sys_mount+0x84/0xb0
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&journal->j_mutex){+.+...}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a7b>] mutex_lock_nested+0x5b/0x340
       [<c113c234>] do_journal_begin_r+0x94/0x360
       [<c113c680>] journal_begin+0x80/0x130
       [<c1116d63>] reiserfs_unlink+0x83/0x2e0
       [<c1142a74>] xattr_unlink+0x64/0xb0
       [<c1143179>] delete_one_xattr+0x29/0x100
       [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
       [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0b13>] sys_unlinkat+0x23/0x40
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by rm/3202:
 #0:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c114274b>] reiserfs_for_each_xattr+0x9b/0x290
 #1:  (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0

stack backtrace:
Pid: 3202, comm: rm Not tainted 2.6.32-atom #178
Call Trace:
 [<c13ff9e3>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c1142a67>] ? xattr_unlink+0x57/0xb0
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c1401a7b>] mutex_lock_nested+0x5b/0x340
 [<c113c234>] ? do_journal_begin_r+0x94/0x360
 [<c113c234>] do_journal_begin_r+0x94/0x360
 [<c10411b6>] ? run_timer_softirq+0x1a6/0x220
 [<c103cb00>] ? __do_softirq+0x50/0x140
 [<c113c680>] journal_begin+0x80/0x130
 [<c103cba2>] ? __do_softirq+0xf2/0x140
 [<c104f72f>] ? hrtimer_interrupt+0xdf/0x220
 [<c1116d63>] reiserfs_unlink+0x83/0x2e0
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c11b8d08>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c1142a67>] ? xattr_unlink+0x57/0xb0
 [<c1142a74>] xattr_unlink+0x64/0xb0
 [<c1143179>] delete_one_xattr+0x29/0x100
 [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
 [<c1143150>] ? delete_one_xattr+0x0/0x100
 [<c1401cb9>] ? mutex_lock_nested+0x299/0x340
 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
 [<c1143309>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d1f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c1401068>] ? mutex_unlock+0x8/0x10
 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
 [<c10c3af0>] ? filldir64+0x0/0xf0
 [<c1002ef3>] ? sysenter_exit+0xf/0x16
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0b13>] sys_unlinkat+0x23/0x40
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:58:32 +01:00
Frederic Weisbecker
c674905ca7 reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
reiserfs_unlink() may or may not be called under the reiserfs
lock.
But it also takes the reiserfs lock and can then acquire it
recursively which leads to do_journal_begin_r() that fails to
relax the reiserfs lock before grabbing the journal mutex,
creating an unexpected lock inversion.

We need to ensure reiserfs_unlink() won't get the reiserfs lock
recursively using reiserfs_write_lock_once().

This fixes the following warning that precedes a lock inversion
report (reiserfs lock <-> journal mutex).

------------[ cut here ]------------
WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177
Call Trace:
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c10373a7>] warn_slowpath_common+0x67/0xc0
 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
 [<c1037446>] warn_slowpath_fmt+0x26/0x30
 [<c114327a>] reiserfs_lock_check_recursive+0x3a/0x50
 [<c113c213>] do_journal_begin_r+0x83/0x360
 [<c105eb16>] ? __lock_acquire+0x1296/0x19e0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c113c670>] journal_begin+0x80/0x130
 [<c1116d5d>] reiserfs_unlink+0x7d/0x2d0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a57>] ? xattr_unlink+0x57/0xb0
 [<c1142a64>] xattr_unlink+0x64/0xb0
 [<c1143169>] delete_one_xattr+0x29/0x100
 [<c11427ab>] reiserfs_for_each_xattr+0x10b/0x290
 [<c1143140>] ? delete_one_xattr+0x0/0x100
 [<c1401ca9>] ? mutex_lock_nested+0x299/0x340
 [<c11429aa>] reiserfs_delete_xattrs+0x1a/0x60
 [<c11432f9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
 [<c11b0d0f>] ? _atomic_dec_and_lock+0x4f/0x70
 [<c111e980>] ? reiserfs_delete_inode+0x0/0x150
 [<c10c9c32>] generic_delete_inode+0xa2/0x170
 [<c10c9d4f>] generic_drop_inode+0x4f/0x70
 [<c10c8b07>] iput+0x47/0x50
 [<c10c0965>] do_unlinkat+0xd5/0x160
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1022ab7>] ? do_page_fault+0x187/0x330
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c1022930>] ? do_page_fault+0x0/0x330
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10c0a00>] sys_unlink+0x10/0x20
 [<c1002ec4>] sysenter_do_call+0x12/0x32
---[ end trace 2e35d71a6cc69d0c ]---

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:57:32 +01:00
Frederic Weisbecker
3f14fea6bb reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
We call xattr_lookup() from reiserfs_xattr_get(). We then hold
the reiserfs lock when we grab the i_mutex. But later, we may
relax the reiserfs lock, creating dependency inversion between
both locks.

The lookups and creation jobs ar already protected by the
inode mutex, so we can safely relax the reiserfs lock, dropping
the unwanted reiserfs lock -> i_mutex dependency, as shown
in the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c1141d83>] open_xa_dir+0x43/0x1b0
       [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
       [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by cp/3204:
 #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
 #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
 [<c13ff993>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105d3aa>] ? check_usage+0x6a/0x460
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c1401a2b>] mutex_lock_nested+0x5b/0x340
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
 [<c1117012>] reiserfs_lookup+0x62/0x140
 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bd85f>] __lookup_hash+0xef/0x110
 [<c10bf21d>] lookup_one_len+0x8d/0xc0
 [<c1141e2a>] open_xa_dir+0xea/0x1b0
 [<c1141fe5>] xattr_lookup+0x15/0x160
 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
 [<c1144042>] reiserfs_get_acl+0xa2/0x360
 [<c10ca2e7>] ? new_inode+0x27/0xa0
 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
 [<c1402eb7>] ? _spin_unlock+0x27/0x40
 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
 [<c10c7cb8>] ? __d_lookup+0x108/0x190
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
 [<c10bd17a>] ? generic_permission+0x1a/0xa0
 [<c11788fe>] ? security_inode_permission+0x1e/0x20
 [<c10bea96>] vfs_mkdir+0xd6/0x180
 [<c10c0c10>] sys_mkdirat+0xc0/0xd0
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c10c0c40>] sys_mkdir+0x20/0x30
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:57:01 +01:00
Frederic Weisbecker
0523676d3f reiserfs: Relax reiserfs lock while freeing the journal
Keeping the reiserfs lock while freeing the journal on
umount path triggers a lock inversion between bdev->bd_mutex
and the reiserfs lock.

We don't need the reiserfs lock at this stage. The filesystem
is not usable anymore, and there are no more pending commits,
everything got flushed (even this operation was done in parallel
and didn't required the reiserfs lock from the current process).

This fixes the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #172
-------------------------------------------------------
umount/3904 is trying to acquire lock:
 (&bdev->bd_mutex){+.+.+.}, at: [<c10de2c2>] __blkdev_put+0x22/0x160

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c1143229>] reiserfs_write_lock_once+0x29/0x50
       [<c111c485>] reiserfs_get_block+0x85/0x1620
       [<c10e1040>] do_mpage_readpage+0x1f0/0x6d0
       [<c10e1640>] mpage_readpages+0xc0/0x100
       [<c1119b89>] reiserfs_readpages+0x19/0x20
       [<c108f1ec>] __do_page_cache_readahead+0x1bc/0x260
       [<c108f2b8>] ra_submit+0x28/0x40
       [<c1087e3e>] filemap_fault+0x40e/0x420
       [<c109b5fd>] __do_fault+0x3d/0x430
       [<c109d47e>] handle_mm_fault+0x12e/0x790
       [<c1022a65>] do_page_fault+0x135/0x330
       [<c1403663>] error_code+0x6b/0x70
       [<c10ef9ca>] load_elf_binary+0x82a/0x1a10
       [<c10ba130>] search_binary_handler+0x90/0x1d0
       [<c10bb70f>] do_execve+0x1df/0x250
       [<c1001746>] sys_execve+0x46/0x70
       [<c1002fa5>] syscall_call+0x7/0xb

-> #2 (&mm->mmap_sem){++++++}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c109b1ab>] might_fault+0x8b/0xb0
       [<c11b8f52>] copy_to_user+0x32/0x70
       [<c10c3b94>] filldir64+0xa4/0xf0
       [<c1109116>] sysfs_readdir+0x116/0x210
       [<c10c3e1d>] vfs_readdir+0x8d/0xb0
       [<c10c3ea9>] sys_getdents64+0x69/0xb0
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #1 (sysfs_mutex){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c110951c>] sysfs_addrm_start+0x2c/0xb0
       [<c1109aa0>] create_dir+0x40/0x90
       [<c1109b1b>] sysfs_create_dir+0x2b/0x50
       [<c11b2352>] kobject_add_internal+0xc2/0x1b0
       [<c11b2531>] kobject_add_varg+0x31/0x50
       [<c11b25ac>] kobject_add+0x2c/0x60
       [<c1258294>] device_add+0x94/0x560
       [<c11036ea>] add_partition+0x18a/0x2a0
       [<c110418a>] rescan_partitions+0x33a/0x450
       [<c10de5bf>] __blkdev_get+0x12f/0x2d0
       [<c10de76a>] blkdev_get+0xa/0x10
       [<c11034b8>] register_disk+0x108/0x130
       [<c11a87a9>] add_disk+0xd9/0x130
       [<c12998e5>] sd_probe_async+0x105/0x1d0
       [<c10528af>] async_thread+0xcf/0x230
       [<c104bfd4>] kthread+0x74/0x80
       [<c1003aab>] kernel_thread_helper+0x7/0x3c

-> #0 (&bdev->bd_mutex){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c140199b>] mutex_lock_nested+0x5b/0x340
       [<c10de2c2>] __blkdev_put+0x22/0x160
       [<c10de40a>] blkdev_put+0xa/0x10
       [<c113ce22>] free_journal_ram+0xd2/0x130
       [<c113ea18>] do_journal_release+0x98/0x190
       [<c113eb2a>] journal_release+0xa/0x10
       [<c1128eb6>] reiserfs_put_super+0x36/0x130
       [<c10b776f>] generic_shutdown_super+0x4f/0xe0
       [<c10b7825>] kill_block_super+0x25/0x40
       [<c11255df>] reiserfs_kill_sb+0x7f/0x90
       [<c10b7f4a>] deactivate_super+0x7a/0x90
       [<c10cccd8>] mntput_no_expire+0x98/0xd0
       [<c10ccfcc>] sys_umount+0x4c/0x310
       [<c10cd2a9>] sys_oldumount+0x19/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by umount/3904:
 #0:  (&type->s_umount_key#30){+++++.}, at: [<c10b7f45>] deactivate_super+0x75/0x90
 #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40

stack backtrace:
Pid: 3904, comm: umount Not tainted 2.6.32-atom #172
Call Trace:
 [<c13ff903>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c108b66f>] ? free_pcppages_bulk+0x1f/0x250
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c140199b>] mutex_lock_nested+0x5b/0x340
 [<c10de2c2>] ? __blkdev_put+0x22/0x160
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c10afe12>] ? kfree+0x92/0xd0
 [<c10de2c2>] __blkdev_put+0x22/0x160
 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
 [<c10de40a>] blkdev_put+0xa/0x10
 [<c113ce22>] free_journal_ram+0xd2/0x130
 [<c113ea18>] do_journal_release+0x98/0x190
 [<c113eb2a>] journal_release+0xa/0x10
 [<c1128eb6>] reiserfs_put_super+0x36/0x130
 [<c1050596>] ? up_write+0x16/0x30
 [<c10b776f>] generic_shutdown_super+0x4f/0xe0
 [<c10b7825>] kill_block_super+0x25/0x40
 [<c10f41e0>] ? vfs_quota_off+0x0/0x20
 [<c11255df>] reiserfs_kill_sb+0x7f/0x90
 [<c10b7f4a>] deactivate_super+0x7a/0x90
 [<c10cccd8>] mntput_no_expire+0x98/0xd0
 [<c10ccfcc>] sys_umount+0x4c/0x310
 [<c10cd2a9>] sys_oldumount+0x19/0x20
 [<c1002ec4>] sysenter_do_call+0x12/0x32

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:56:54 +01:00
Frederic Weisbecker
27026a05bb reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
While deleting the xattrs of an inode, we hold the reiserfs lock
and grab the inode->i_mutex of the targeted inode and the root
private xattr directory.

Later on, we may relax the reiserfs lock for various reasons, this
creates inverted dependencies.

We can remove the reiserfs lock -> i_mutex dependency by relaxing
the former before calling open_xa_dir(). This is fine because the
lookup and creation of xattr private directories done in
open_xa_dir() are covered by the targeted inode mutexes. And deeper
operations in the tree are still done under the write lock.

This fixes the following lockdep report:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50

but task is already holding lock:
 (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
       [<c105ea7f>] __lock_acquire+0x11ff/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c1141d83>] open_xa_dir+0x43/0x1b0
       [<c1142722>] reiserfs_for_each_xattr+0x62/0x260
       [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
       [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
       [<c10c9c32>] generic_delete_inode+0xa2/0x170
       [<c10c9d4f>] generic_drop_inode+0x4f/0x70
       [<c10c8b07>] iput+0x47/0x50
       [<c10c0965>] do_unlinkat+0xd5/0x160
       [<c10c0a00>] sys_unlink+0x10/0x20
       [<c1002ec4>] sysenter_do_call+0x12/0x32

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c105f176>] __lock_acquire+0x18f6/0x19e0
       [<c105f2c8>] lock_acquire+0x68/0x90
       [<c1401a2b>] mutex_lock_nested+0x5b/0x340
       [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
       [<c1117012>] reiserfs_lookup+0x62/0x140
       [<c10bd85f>] __lookup_hash+0xef/0x110
       [<c10bf21d>] lookup_one_len+0x8d/0xc0
       [<c1141e2a>] open_xa_dir+0xea/0x1b0
       [<c1141fe5>] xattr_lookup+0x15/0x160
       [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
       [<c1144042>] reiserfs_get_acl+0xa2/0x360
       [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
       [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
       [<c10bea96>] vfs_mkdir+0xd6/0x180
       [<c10c0c10>] sys_mkdirat+0xc0/0xd0
       [<c10c0c40>] sys_mkdir+0x20/0x30
       [<c1002ec4>] sysenter_do_call+0x12/0x32

other info that might help us debug this:

2 locks held by cp/3204:
 #0:  (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
 #1:  (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0

stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
 [<c13ff993>] ? printk+0x18/0x1a
 [<c105d33a>] print_circular_bug+0xca/0xd0
 [<c105f176>] __lock_acquire+0x18f6/0x19e0
 [<c105d3aa>] ? check_usage+0x6a/0x460
 [<c105f2c8>] lock_acquire+0x68/0x90
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c1401a2b>] mutex_lock_nested+0x5b/0x340
 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
 [<c1117012>] reiserfs_lookup+0x62/0x140
 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10bd85f>] __lookup_hash+0xef/0x110
 [<c10bf21d>] lookup_one_len+0x8d/0xc0
 [<c1141e2a>] open_xa_dir+0xea/0x1b0
 [<c1141fe5>] xattr_lookup+0x15/0x160
 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0
 [<c1144042>] reiserfs_get_acl+0xa2/0x360
 [<c10ca2e7>] ? new_inode+0x27/0xa0
 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
 [<c1402eb7>] ? _spin_unlock+0x27/0x40
 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0
 [<c10c7cb8>] ? __d_lookup+0x108/0x190
 [<c105c932>] ? mark_held_locks+0x62/0x80
 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
 [<c10bd17a>] ? generic_permission+0x1a/0xa0
 [<c11788fe>] ? security_inode_permission+0x1e/0x20
 [<c10bea96>] vfs_mkdir+0xd6/0x180
 [<c10c0c10>] sys_mkdirat+0xc0/0xd0
 [<c10505c6>] ? up_read+0x16/0x30
 [<c1002fd8>] ? restore_all_notrace+0x0/0x18
 [<c10c0c40>] sys_mkdir+0x20/0x30
 [<c1002ec4>] sysenter_do_call+0x12/0x32

v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still
    need it later

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:54:47 +01:00
Frederic Weisbecker
c4a62ca362 reiserfs: Warn on lock relax if taken recursively
When we relax the reiserfs lock to avoid creating unwanted
dependencies against others locks while grabbing these,
we want to ensure it has not been taken recursively, otherwise
the lock won't be really relaxed. Only its depth will be decreased.
The unwanted dependency would then actually happen.

To prevent from that, add a reiserfs_lock_check_recursive() call
in the places that need it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:54:37 +01:00
Frederic Weisbecker
0719d34347 reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
i_xattr_sem depends on the reiserfs lock. But after we grab
i_xattr_sem, we may relax/relock the reiserfs lock while waiting
on a freezed filesystem, creating a dependency inversion between
the two locks.

In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's
create a reiserfs_down_read_safe() that acts like
reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing
another lock to avoid undesired dependencies induced by the
heivyweight reiserfs lock.

This fixes the following warning:

[  990.005931] =======================================================
[  990.012373] [ INFO: possible circular locking dependency detected ]
[  990.013233] 2.6.33-rc1 #1
[  990.013233] -------------------------------------------------------
[  990.013233] dbench/1891 is trying to acquire lock:
[  990.013233]  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]
[  990.013233] but task is already holding lock:
[  990.013233]  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]
[  990.013233] which lock already depends on the new lock.
[  990.013233]
[  990.013233]
[  990.013233] the existing dependency chain (in reverse order) is:
[  990.013233]
[  990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}:
[  990.013233]        [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560
[  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]        [<ffffffff814ac194>] down_write+0x44/0x80
[  990.013233]        [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[  990.013233]
[  990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[  990.013233]        [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]        [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[  990.013233]        [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[  990.013233]        [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]        [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[  990.013233]        [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[  990.013233]
[  990.013233] other info that might help us debug this:
[  990.013233]
[  990.013233] 2 locks held by dbench/1891:
[  990.013233]  #0:  (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0
[  990.013233]  #1:  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]
[  990.013233] stack backtrace:
[  990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1
[  990.013233] Call Trace:
[  990.013233]  [<ffffffff81061639>] print_circular_bug+0xe9/0xf0
[  990.013233]  [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]  [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[  990.013233]  [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0
[  990.013233]  [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140
[  990.013233]  [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[  990.013233]  [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[  990.013233]  [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[  990.013233]  [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[  990.013233]  [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[  990.013233]  [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[  990.013233]  [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0
[  990.013233]  [<ffffffff8115a6aa>] user_set+0x8a/0x90
[  990.013233]  [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[  990.013233]  [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[  990.013233]  [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[  990.013233]  [<ffffffff810e2780>] setxattr+0xc0/0x150
[  990.013233]  [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100
[  990.013233]  [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10
[  990.013233]  [<ffffffff810560a3>] ? cpu_clock+0x43/0x50
[  990.013233]  [<ffffffff810c6820>] ? fget+0xb0/0x110
[  990.013233]  [<ffffffff810c6770>] ? fget+0x0/0x110
[  990.013233]  [<ffffffff81002ddc>] ? sysret_check+0x27/0x62
[  990.013233]  [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[  990.013233]  [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b

Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2010-01-02 01:54:04 +01:00
Frederic Weisbecker
98ea3f50bc reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion
Commit 500f5a0bf5
(reiserfs: Fix possible recursive lock) fixed a vmalloc under reiserfs
lock that triggered a lockdep warning because of a
IN-FS-RECLAIM <-> RECLAIM-FS-ON locking dependency inversion.

But this patch has ommitted another vmalloc call in the same path
that allocates the journal. Relax the lock for this one too.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
2009-12-29 22:34:59 +01:00
Jan Kara
ec8e2f7466 reiserfs: truncate blocks not used by a write
It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory.  We truncate
these blocks so that we don't have dangling blocks beyond i_size.

Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17 15:45:30 -08:00
Linus Torvalds
b6e3224fb2 Revert "task_struct: make journal_info conditional"
This reverts commit e4c570c4cb, as
requested by Alexey:

 "I think I gave a good enough arguments to not merge it.
  To iterate:
   * patch makes impossible to start using ext3 on EXT3_FS=n kernels
     without reboot.
   * this is done only for one pointer on task_struct"

  None of config options which define task_struct are tristate directly
  or effectively."

Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-17 13:23:24 -08:00
Frederic Weisbecker
47376ceba5 reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion
The reiserfs lock -> inode mutex dependency gets inverted when we
relax the lock while walking to the tree.

To fix this, use a specialized version of reiserfs_mutex_lock_safe
that takes care of mutex subclasses. Then we can grab the inode
mutex with I_MUTEX_XATTR subclass without any reiserfs lock
dependency.

This fixes the following report:

[ INFO: possible circular locking dependency detected ]
2.6.32-06793-gf405425-dirty #2
-------------------------------------------------------
mv/18566 is trying to acquire lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1110708>] reiserfs_write_lock+0x28=
/0x40

but task is already holding lock:
 (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
reiserfs_for_each_xattr+0x10c/0x380

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&sb->s_type->i_mutex_key#5/3){+.+.+.}:
       [<c104f723>] validate_chain+0xa23/0xf70
       [<c1050155>] __lock_acquire+0x4e5/0xa70
       [<c105075a>] lock_acquire+0x7a/0xa0
       [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
       [<c11102b4>] reiserfs_for_each_xattr+0x84/0x380
       [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
       [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
       [<c10a565c>] generic_delete_inode+0x9c/0x150
       [<c10a574d>] generic_drop_inode+0x3d/0x60
       [<c10a4667>] iput+0x47/0x50
       [<c109cc0b>] do_unlinkat+0xdb/0x160
       [<c109cca0>] sys_unlink+0x10/0x20
       [<c1002c50>] sysenter_do_call+0x12/0x36

-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c104fc68>] validate_chain+0xf68/0xf70
       [<c1050155>] __lock_acquire+0x4e5/0xa70
       [<c105075a>] lock_acquire+0x7a/0xa0
       [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
       [<c1110708>] reiserfs_write_lock+0x28/0x40
       [<c1103d6b>] search_by_key+0x1f7b/0x21b0
       [<c10e73ef>] search_by_entry_key+0x1f/0x3b0
       [<c10e77f7>] reiserfs_find_entry+0x77/0x400
       [<c10e81e5>] reiserfs_lookup+0x85/0x130
       [<c109a144>] __lookup_hash+0xb4/0x110
       [<c109b763>] lookup_one_len+0xb3/0x100
       [<c1110350>] reiserfs_for_each_xattr+0x120/0x380
       [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
       [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
       [<c10a565c>] generic_delete_inode+0x9c/0x150
       [<c10a574d>] generic_drop_inode+0x3d/0x60
       [<c10a4667>] iput+0x47/0x50
       [<c10a1c4f>] dentry_iput+0x6f/0xf0
       [<c10a1d74>] d_kill+0x24/0x50
       [<c10a396b>] dput+0x5b/0x120
       [<c109ca89>] sys_renameat+0x1b9/0x230
       [<c109cb28>] sys_rename+0x28/0x30
       [<c1002c50>] sysenter_do_call+0x12/0x36

other info that might help us debug this:

2 locks held by mv/18566:
 #0:  (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [<c109b6ac>]
lock_rename+0xcc/0xd0
 #1:  (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
reiserfs_for_each_xattr+0x10c/0x380

stack backtrace:
Pid: 18566, comm: mv Tainted: G         C 2.6.32-06793-gf405425-dirty #2
Call Trace:
 [<c134b252>] ? printk+0x18/0x1e
 [<c104e790>] print_circular_bug+0xc0/0xd0
 [<c104fc68>] validate_chain+0xf68/0xf70
 [<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
 [<c1050155>] __lock_acquire+0x4e5/0xa70
 [<c105075a>] lock_acquire+0x7a/0xa0
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c134b60a>] ? schedule+0x27a/0x440
 [<c1110708>] reiserfs_write_lock+0x28/0x40
 [<c1103d6b>] search_by_key+0x1f7b/0x21b0
 [<c1050176>] ? __lock_acquire+0x506/0xa70
 [<c1051267>] ? lock_release_non_nested+0x1e7/0x340
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c104e3ab>] ? trace_hardirqs_on+0xb/0x10
 [<c1042a55>] ? T.316+0x15/0x1a0
 [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
 [<c10e73ef>] search_by_entry_key+0x1f/0x3b0
 [<c134bf2a>] ? __mutex_unlock_slowpath+0x9a/0x120
 [<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
 [<c10e77f7>] reiserfs_find_entry+0x77/0x400
 [<c10e81e5>] reiserfs_lookup+0x85/0x130
 [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
 [<c109a144>] __lookup_hash+0xb4/0x110
 [<c109b763>] lookup_one_len+0xb3/0x100
 [<c1110350>] reiserfs_for_each_xattr+0x120/0x380
 [<c110ffe0>] ? delete_one_xattr+0x0/0x1c0
 [<c1003342>] ? math_error+0x22/0x150
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
 [<c1110708>] ? reiserfs_write_lock+0x28/0x40
 [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
 [<c10a561f>] ? generic_delete_inode+0x5f/0x150
 [<c10ef4f0>] ? reiserfs_delete_inode+0x0/0x140
 [<c10a565c>] generic_delete_inode+0x9c/0x150
 [<c10a574d>] generic_drop_inode+0x3d/0x60
 [<c10a4667>] iput+0x47/0x50
 [<c10a1c4f>] dentry_iput+0x6f/0xf0
 [<c10a1d74>] d_kill+0x24/0x50
 [<c10a396b>] dput+0x5b/0x120
 [<c109ca89>] sys_renameat+0x1b9/0x230
 [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
 [<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
 [<c1042dde>] ? cpu_clock+0x4e/0x60
 [<c1350825>] ? do_page_fault+0x155/0x370
 [<c1041816>] ? up_read+0x16/0x30
 [<c1350825>] ? do_page_fault+0x155/0x370
 [<c109cb28>] sys_rename+0x28/0x30
 [<c1002c50>] sysenter_do_call+0x12/0x36

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-12-16 23:25:50 +01:00
Linus Torvalds
bac5e54c29 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
  direct I/O fallback sync simplification
  ocfs: stop using do_sync_mapping_range
  cleanup blockdev_direct_IO locking
  make generic_acl slightly more generic
  sanitize xattr handler prototypes
  libfs: move EXPORT_SYMBOL for d_alloc_name
  vfs: force reval of target when following LAST_BIND symlinks (try #7)
  ima: limit imbalance msg
  Untangling ima mess, part 3: kill dead code in ima
  Untangling ima mess, part 2: deal with counters
  Untangling ima mess, part 1: alloc_file()
  O_TRUNC open shouldn't fail after file truncation
  ima: call ima_inode_free ima_inode_free
  IMA: clean up the IMA counts updating code
  ima: only insert at inode creation time
  ima: valid return code from ima_inode_alloc
  fs: move get_empty_filp() deffinition to internal.h
  Sanitize exec_permission_lite()
  Kill cached_lookup() and real_lookup()
  Kill path_lookup_open()
  ...

Trivial conflicts in fs/direct-io.c
2009-12-16 12:04:02 -08:00
Christoph Hellwig
431547b3c4 sanitize xattr handler prototypes
Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods.  This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute.  With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.

Also change the inode argument to the handlers to a dentry to allow
using the handlers mechnism for filesystems that require it later,
e.g. cifs.

[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:49 -05:00
Alexey Dobriyan
e3c96f53ac reiserfs: don't compile procfs.o at all if no support
* small define cleanup in header
* fix #ifdeffery in procfs.c via Kconfig

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:06 -08:00
Alexey Dobriyan
904e812931 reiserfs: remove /proc/fs/reiserfs/version
/proc/fs/reiserfs/version is on the way of removing ->read_proc interface.
 It's empty however, so simply remove it instead of doing dummy
conversion.  It's hard to see what information userspace can extract from
empty file.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:06 -08:00
Hiroshi Shimamoto
e4c570c4cb task_struct: make journal_info conditional
journal_info in task_struct is used in journaling file system only.  So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:27 -08:00
Frederic Weisbecker
cb1c2e51c5 reiserfs: Fix reiserfs lock and journal lock inversion dependency
When we were using the bkl, we didn't care about dependencies against
other locks, but the mutex conversion created new ones, which is why
we have reiserfs_mutex_lock_safe(), which unlocks the reiserfs lock
before acquiring another mutex.

But this trick actually fails if we have acquired the reiserfs lock
recursively, as we try to unlock it to acquire the new mutex without
inverted dependency, but we eventually only decrease its depth.

This happens in the case of a nested inode creation/deletion.
Say we have no space left on the device, we create an inode
and tak the lock but fail to create its entry, then we release the
inode using iput(), which calls reiserfs_delete_inode() that takes
the reiserfs lock recursively. The path eventually ends up in
journal_begin() where we try to take the journal safely but we
fail because of the reiserfs lock recursion:

[ INFO: possible circular locking dependency detected ]
2.6.32-06486-g053fe57 #2
-------------------------------------------------------
vi/23454 is trying to acquire lock:
 (&journal->j_mutex){+.+...}, at: [<c110dac4>] do_journal_begin_r+0x64/0x2f0

but task is already holding lock:
 (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11106a8>] reiserfs_write_lock+0x28/0x40

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
       [<c104f8f3>] validate_chain+0xa23/0xf70
       [<c1050325>] __lock_acquire+0x4e5/0xa70
       [<c105092a>] lock_acquire+0x7a/0xa0
       [<c134c78f>] mutex_lock_nested+0x5f/0x2b0
       [<c11106a8>] reiserfs_write_lock+0x28/0x40
       [<c110dacb>] do_journal_begin_r+0x6b/0x2f0
       [<c110ddcf>] journal_begin+0x7f/0x120
       [<c10f76c2>] reiserfs_remount+0x212/0x4d0
       [<c1093997>] do_remount_sb+0x67/0x140
       [<c10a9ca6>] do_mount+0x436/0x6b0
       [<c10a9f86>] sys_mount+0x66/0xa0
       [<c1002c50>] sysenter_do_call+0x12/0x36

-> #0 (&journal->j_mutex){+.+...}:
       [<c104fe38>] validate_chain+0xf68/0xf70
       [<c1050325>] __lock_acquire+0x4e5/0xa70
       [<c105092a>] lock_acquire+0x7a/0xa0
       [<c134c78f>] mutex_lock_nested+0x5f/0x2b0
       [<c110dac4>] do_journal_begin_r+0x64/0x2f0
       [<c110ddcf>] journal_begin+0x7f/0x120
       [<c10ef52f>] reiserfs_delete_inode+0x9f/0x140
       [<c10a55fc>] generic_delete_inode+0x9c/0x150
       [<c10a56ed>] generic_drop_inode+0x3d/0x60
       [<c10a4607>] iput+0x47/0x50
       [<c10e915c>] reiserfs_create+0x16c/0x1c0
       [<c109a9c1>] vfs_create+0xc1/0x130
       [<c109dbec>] do_filp_open+0x81c/0x920
       [<c109004f>] do_sys_open+0x4f/0x110
       [<c1090179>] sys_open+0x29/0x40
       [<c1002c50>] sysenter_do_call+0x12/0x36

other info that might help us debug this:

2 locks held by vi/23454:
 #0:  (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [<c109d64e>]
do_filp_open+0x27e/0x920
 #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11106a8>]
reiserfs_write_lock+0x28/0x40

stack backtrace:
Pid: 23454, comm: vi Not tainted 2.6.32-06486-g053fe57 #2
Call Trace:
 [<c134b202>] ? printk+0x18/0x1e
 [<c104e960>] print_circular_bug+0xc0/0xd0
 [<c104fe38>] validate_chain+0xf68/0xf70
 [<c104ca9b>] ? trace_hardirqs_off+0xb/0x10
 [<c1050325>] __lock_acquire+0x4e5/0xa70
 [<c105092a>] lock_acquire+0x7a/0xa0
 [<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
 [<c134c78f>] mutex_lock_nested+0x5f/0x2b0
 [<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
 [<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
 [<c110ff80>] ? delete_one_xattr+0x0/0x1c0
 [<c110dac4>] do_journal_begin_r+0x64/0x2f0
 [<c110ddcf>] journal_begin+0x7f/0x120
 [<c11105b5>] ? reiserfs_delete_xattrs+0x15/0x50
 [<c10ef52f>] reiserfs_delete_inode+0x9f/0x140
 [<c10a55bf>] ? generic_delete_inode+0x5f/0x150
 [<c10ef490>] ? reiserfs_delete_inode+0x0/0x140
 [<c10a55fc>] generic_delete_inode+0x9c/0x150
 [<c10a56ed>] generic_drop_inode+0x3d/0x60
 [<c10a4607>] iput+0x47/0x50
 [<c10e915c>] reiserfs_create+0x16c/0x1c0
 [<c1099a5d>] ? inode_permission+0x7d/0xa0
 [<c109a9c1>] vfs_create+0xc1/0x130
 [<c10e8ff0>] ? reiserfs_create+0x0/0x1c0
 [<c109dbec>] do_filp_open+0x81c/0x920
 [<c104ca9b>] ? trace_hardirqs_off+0xb/0x10
 [<c134dc0d>] ? _spin_unlock+0x1d/0x20
 [<c10a6eea>] ? alloc_fd+0xba/0xf0
 [<c109004f>] do_sys_open+0x4f/0x110
 [<c1090179>] sys_open+0x29/0x40
 [<c1002c50>] sysenter_do_call+0x12/0x36

To fix this, use reiserfs_lock_once() from reiserfs_delete_inode()
which prevents from adding reiserfs lock recursion.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-12-14 11:47:11 +01:00
Frederic Weisbecker
500f5a0bf5 reiserfs: Fix possible recursive lock
While allocating the bitmap using vmalloc, we hold the reiserfs lock,
which makes lockdep later reporting a possible deadlock as we may
swap out pages to allocate memory and then take the reiserfs lock
recursively:

inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c11108a8>] reiserfs_write_lock+0x28/0x40
{RECLAIM_FS-ON-W} state was registered at:
  [<c104e1c2>] mark_held_locks+0x62/0x90
  [<c104e28a>] lockdep_trace_alloc+0x9a/0xc0
  [<c108e396>] kmem_cache_alloc+0x26/0xf0
  [<c10850ec>] __get_vm_area_node+0x6c/0xf0
  [<c10857de>] __vmalloc_node+0x7e/0xa0
  [<c108597b>] vmalloc+0x2b/0x30
  [<c10e00b9>] reiserfs_init_bitmap_cache+0x39/0x70
  [<c10f8178>] reiserfs_fill_super+0x2e8/0xb90
  [<c1094345>] get_sb_bdev+0x145/0x180
  [<c10f5a11>] get_super_block+0x21/0x30
  [<c10931f0>] vfs_kern_mount+0x40/0xd0
  [<c10932d9>] do_kern_mount+0x39/0xd0
  [<c10a9857>] do_mount+0x2c7/0x6b0
  [<c10a9ca6>] sys_mount+0x66/0xa0
  [<c161589b>] mount_block_root+0xc4/0x245
  [<c1615a75>] mount_root+0x59/0x5f
  [<c1615b8c>] prepare_namespace+0x111/0x14b
  [<c1615269>] kernel_init+0xcf/0xdb
  [<c10031fb>] kernel_thread_helper+0x7/0x1c

This is actually fine for two reasons: we call vmalloc at mount time
then it's not in the swapping out path. Also the reiserfs lock can be
acquired recursively, but since its implementation depends on a mutex,
it's hard and not necessary worth it to teach that to lockdep.

The lock is useless at mount time anyway, at least until we replay the
journal. But let's remove it from this path later as this needs
more thinking and is a sensible change.

For now we can just relax the lock around vmalloc,

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-12-14 11:43:09 +01:00
Linus Torvalds
4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Linus Torvalds
a9280fed38 Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: (31 commits)
  kill-the-bkl/reiserfs: turn GFP_ATOMIC flag to GFP_NOFS in reiserfs_get_block()
  kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()
  kill-the-bkl/reiserfs: definitely drop the bkl from reiserfs_ioctl()
  kill-the-bkl/reiserfs: always lock the ioctl path
  kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
  kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
  kill-the-bkl/reiserfs: panic in case of lock imbalance
  kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write()
  kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir()
  kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency
  kill-the-bkl/reiserfs: move the concurrent tree accesses checks per superblock
  kill-the-bkl/reiserfs: acquire the inode mutex safely
  kill-the-bkl/reiserfs: unlock only when needed in search_by_key
  kill-the-bkl/reiserfs: use mutex_lock in reiserfs_mutex_lock_safe
  kill-the-bkl/reiserfs: factorize the locking in reiserfs_write_end()
  kill-the-bkl/reiserfs: reduce number of contentions in search_by_key()
  kill-the-bkl/reiserfs: don't hold the write recursively in reiserfs_lookup()
  kill-the-bkl/reiserfs: lock only once on reiserfs_get_block()
  kill-the-bkl/reiserfs: conditionaly release the write lock on fs_changed()
  kill-the-BKL/reiserfs: add reiserfs_cond_resched()
  ...
2009-12-09 07:58:15 -08:00
Frederic Weisbecker
6548698f92 Merge commit 'v2.6.32' into reiserfs/kill-bkl
Merge-reason: The tree was based 2.6.31. It's better to be up to date
with 2.6.32. Although no conflicting changes were made in between,
it gives benchmarking results closer to the lastest kernel behaviour.
2009-12-07 07:29:22 +01:00
Adam Buchbinder
febe29d957 reiserfs: fix misspelling of "journaled"
"Journaled" is misspelled "journlaled" in an output string; this patch
fixed it. No changes in functionality.

Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 23:39:11 +01:00
Frederic Weisbecker
1d2c6cfd40 kill-the-bkl/reiserfs: turn GFP_ATOMIC flag to GFP_NOFS in reiserfs_get_block()
GFP_ATOMIC was used in reiserfs_get_block to not lose the Bkl so that
nobody can modify the tree in the middle of its work. Now that we
kicked out the bkl, we can use a more friendly flag. We use GFP_NOFS
here because we already hold the reiserfs lock.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-11-20 18:25:02 +01:00
Frederic Weisbecker
27b3a5c51b kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()
We had a watchdog in _get_block_create_0() that jumped to a fixup retry
path in case the bkl got relaxed while calling kmap().
This is not necessary anymore since we now have a reiserfs lock that is
not implicitly relaxed while sleeping.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-10-14 23:34:31 +02:00
Frederic Weisbecker
205cb37b89 kill-the-bkl/reiserfs: definitely drop the bkl from reiserfs_ioctl()
The reiserfs ioctl path doesn't need the big kernel lock anymore , now
that the filesystem synchronizes through its own lock.

We can then turn reiserfs_ioctl() into an unlocked_ioctl callback.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-10-14 23:28:12 +02:00
Frederic Weisbecker
ac78a07893 kill-the-bkl/reiserfs: always lock the ioctl path
Reiserfs uses the ioctl callback for its file operations, which means
that its ioctl path is still locked by the bkl, this was synchronizing
with the rest of the filsystem operations. We have changed that by
locking it with the new reiserfs lock but we do that only from the
compat_ioctl callback.

Fix that by locking reiserfs_ioctl() everytime.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
2009-10-14 23:27:57 +02:00
Frederic Weisbecker
48f6ba5e69 kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
While creating the reiserfs workqueue during the journal
initialization, we are holding the reiserfs lock, but
create_workqueue() also holds the cpu_add_remove_lock, creating
then the following dependency:

- reiserfs lock -> cpu_add_remove_lock

But we also have the following existing dependencies:

- mm->mmap_sem -> reiserfs lock
- cpu_add_remove_lock -> cpu_hotplug.lock -> slub_lock -> sysfs_mutex

The merged dependency chain then becomes:

- mm->mmap_sem -> reiserfs lock -> cpu_add_remove_lock ->
	cpu_hotplug.lock -> slub_lock -> sysfs_mutex

But when we fill a dir entry in sysfs_readir(), we are holding the
sysfs_mutex and we also might fault while copying the directory entry
to the user, leading to the following dependency:

- sysfs_mutex -> mm->mmap_sem

The end result is then a lock inversion between sysfs_mutex and
mm->mmap_sem, as reported in the following lockdep warning:

[ INFO: possible circular locking dependency detected ]
2.6.31-07095-g25a3912 #4
-------------------------------------------------------
udevadm/790 is trying to acquire lock:
 (&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0

but task is already holding lock:
 (sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #5 (sysfs_mutex){+.+.+.}:
      [...]

-> #4 (slub_lock){+++++.}:
      [...]

-> #3 (cpu_hotplug.lock){+.+.+.}:
      [...]

-> #2 (cpu_add_remove_lock){+.+.+.}:
      [...]

-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
      [...]

-> #0 (&mm->mmap_sem){++++++}:
      [...]

This can be fixed by relaxing the reiserfs lock while creating the
workqueue.
This is fine to relax the lock here, we just keep it around to pass
through reiserfs lock checks and for paranoid reasons.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-10-05 16:31:37 +02:00
Alexey Dobriyan
0d54b217a2 const: make struct super_block::s_qcop const
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:24 -07:00
Alexey Dobriyan
61e225dc34 const: make struct super_block::dq_op const
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:24 -07:00
Frederic Weisbecker
193be0ee17 kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
Alexander Beregalov reported the following warning:

	=======================================================
	[ INFO: possible circular locking dependency detected ]
	2.6.31-03149-gdcc030a #1
	-------------------------------------------------------
	udevadm/716 is trying to acquire lock:
	 (&mm->mmap_sem){++++++}, at: [<c107249a>] might_fault+0x4a/0xa0

	but task is already holding lock:
	 (sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200

	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #3 (sysfs_mutex){+.+.+.}:
	       [...]

	-> #2 (&bdev->bd_mutex){+.+.+.}:
	       [...]

	-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
	       [...]

	-> #0 (&mm->mmap_sem){++++++}:
	       [...]

On reiserfs mount path, we take the reiserfs lock and while
initializing the journal, we open the device, taking the
bdev->bd_mutex. Then rescan_partition() may signal the change
to sysfs.

We have then the following dependency:

	reiserfs_lock -> bd_mutex -> sysfs_mutex

Later, while entering reiserfs_readpage() after a pagefault in an
mmaped reiserfs file, we are holding the mm->mmap_sem, and we are going
to take the reiserfs lock too.
We have then the following dependency:

	mm->mmap_sem -> reiserfs_lock

which, expanded with the previous dependency gives us:

	mm->mmap_sem -> reiserfs_lock -> bd_mutex -> sysfs_mutex

Now while entering the sysfs readdir path, we are holding the
sysfs_mutex. And when we copy a directory entry to the user buffer, we
might fault and then take the mm->mmap_sem lock. Which leads to the
circular locking dependency reported.

We can fix that by relaxing the reiserfs lock during the call to
journal_init_dev(), which is the place where we open the mounted
device.

This is fine to relax the lock here because we are in the begining of
the reiserfs mount path and there is nothing to protect at this time,
the journal is not intialized.
We just keep this lock around for paranoid reasons.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-09-17 05:31:37 +02:00
Frederic Weisbecker
8050318598 kill-the-bkl/reiserfs: panic in case of lock imbalance
Until now, trying to unlock the reiserfs write lock whereas the current
task doesn't hold it lead to a simple warning.
We should actually warn and panic in this case to avoid the user datas
to reach an unstable state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-09-14 07:18:30 +02:00
Frederic Weisbecker
7e94277050 kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write()
reiserfs_commit_write() is always called with the write lock held.
Thus the current calls to reiserfs_write_lock() in this function are
acquiring the lock recursively.
We can safely drop them.

This also solves further assumptions for this lock to be really
released while calling reiserfs_write_unlock().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-09-14 07:18:29 +02:00
Frederic Weisbecker
b10ab4c337 kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir()
reiserfs_mkdir() acquires the reiserfs lock, assuming it has been called
from the dir inodes callbacks, without the lock held.

But it can also be called from other internal sites such as
reiserfs_xattr_init() which already holds the lock. This recursive
locking leads to further wrong assumptions. For example, later calls
to reiserfs_mutex_lock_safe() won't actually unlock the reiserfs lock
the time we acquire a given mutex, creating unexpected lock inversions.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-09-14 07:18:27 +02:00
Frederic Weisbecker
ae635c0bbd kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency
reiserfs_xattr_init is called with the reiserfs write lock held, but
if the ".reiserfs_priv" entry is not created, we take the superblock
root directory inode mutex until .reiserfs_priv is created.

This creates a lock dependency inversion against other sites such as
reiserfs_file_release() which takes an inode mutex and the reiserfs
lock after.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
2009-09-14 07:18:26 +02:00
Frederic Weisbecker
08f14fc896 kill-the-bkl/reiserfs: move the concurrent tree accesses checks per superblock
When do_balance() balances the tree, a trick is performed to
provide the ability for other tree writers/readers to check whether
do_balance() is executing concurrently (requires CONFIG_REISERFS_CHECK).

This is done to protect concurrent accesses to the tree. The trick
is the following:

When do_balance is called, a unique global variable called cur_tb
takes a pointer to the current tree to be rebalanced.
Once do_balance finishes its work, cur_tb takes the NULL value.

Then, concurrent tree readers/writers just have to check the value
of cur_tb to ensure do_balance isn't executing concurrently.
If it is, then it proves that schedule() occured on do_balance(),
which then relaxed the bkl that protected the tree.

Now that the bkl has be turned into a mutex, this check is still
fine even though do_balance() becomes preemptible: the write lock
will not be automatically released on schedule(), so the tree is
still protected.

But this is only fine if we have a single reiserfs mountpoint.
Indeed, because the bkl is a global lock, it didn't allowed
concurrent executions between a tree reader/writer in a mount point
and a do_balance() on another tree from another mountpoint.

So assuming all these readers/writers weren't supposed to be
reentrant, the current check now sometimes detect false positives with
the current per-superblock mutex which allows this reentrancy.

This patch keeps the concurrent tree accesses check but moves it
per superblock, so that only trees from a same mount point are
checked to be not accessed concurrently.

[ Impact: fix spurious panic while running several reiserfs mount-points ]

Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-09-14 07:18:25 +02:00
Frederic Weisbecker
c72e05756b kill-the-bkl/reiserfs: acquire the inode mutex safely
While searching a pathname, an inode mutex can be acquired
in do_lookup() which calls reiserfs_lookup() which in turn
acquires the write lock.

On the other side reiserfs_fill_super() can acquire the write_lock
and then call reiserfs_lookup_privroot() which can acquire an
inode mutex (the root of the mount point).

So we theoretically risk an AB - BA lock inversion that could lead
to a deadlock.

As for other lock dependencies found since the bkl to mutex
conversion, the fix is to use reiserfs_mutex_lock_safe() which
drops the lock dependency to the write lock.

[ Impact: fix a possible deadlock with reiserfs ]

Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-09-14 07:18:24 +02:00
Frederic Weisbecker
2ac626955e kill-the-bkl/reiserfs: unlock only when needed in search_by_key
search_by_key() is the site which most requires the lock.
This is mostly because it is a very central function and also
because it releases/reaqcuires the write lock at least once each
time it is called.

Such release/reacquire creates a lot of contention in this place and
also opens more the window which let another thread changing the tree.
When it happens, the current path searching over the tree must be
retried from the beggining (the root) which is a wasteful and
time consuming recovery.

This patch factorizes two release/reacquire sequences:

- reading leaf nodes blocks
- reading current block

The latter immediately follows the former.

The whole sequence is safe as a single unlocked section because
we check just after if the tree has changed during these operations.

Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-09-14 07:18:22 +02:00
Frederic Weisbecker
c63e3c0b24 kill-the-bkl/reiserfs: use mutex_lock in reiserfs_mutex_lock_safe
reiserfs_mutex_lock_safe() is a hack to avoid any dependency between
an internal reiserfs mutex and the write lock, it has been proposed
to follow the old bkl logic.

The code does the following:

while (!mutex_trylock(m)) {
	reiserfs_write_unlock(s);
	schedule();
	reiserfs_write_lock(s);
}

It then imitate the implicit behaviour of the lock when it was
a Bkl and hadn't such dependency:

mutex_lock(m) {
	if (fastpath)
		let's go
	else {
		wait_for_mutex() {
			schedule() {
				unlock_kernel()
				reacquire_lock_kernel()
			}
		}
	}
}

The problem is that by using such explicit schedule(), we don't
benefit of the adaptive mutex spinning on owner.

The logic in use now is:

reiserfs_write_unlock(s);
mutex_lock(m); // -> possible adaptive spinning
reiserfs_write_lock(s);

[ Impact: restore the use of adaptive spinning mutexes in reiserfs ]

Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-09-14 07:18:21 +02:00
Frederic Weisbecker
d6f5b0aa08 kill-the-bkl/reiserfs: factorize the locking in reiserfs_write_end()
reiserfs_write_end() is a hot path in reiserfs.
We have two wasteful write lock lock/release inside that can be gathered
without changing the code logic.

This patch factorizes them out in a single protected section, reducing the
number of contentions inside.

[ Impact: reduce lock contention in a reiserfs hotpath ]

Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-09-14 07:18:20 +02:00