linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-16 17:12:06 +00:00

Author	SHA1	Message	Date
Kurt Hackel	71ac106243	ocfs2_dlm: Make dlmunlock() wait for migration to complete dlmunlock() was not waiting for migration to complete before releasing locks on locally mastered locks. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-02-07 12:01:49 -08:00
Kurt Hackel	ddc09c8dda	ocfs2_dlm: Fixes race between migrate and dirty dlmthread was removing lockres' from the dirty list and resetting the dirty flag before shuffling the list. This patch retains the dirty state flag until the lists are shuffled. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Sunil Mushran <Sunil.Mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-02-07 12:00:57 -08:00
Adrian Bunk	faf0ec9f13	[PATCH] fs/ocfs2/dlm/: make functions static This patch makes some needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-02-07 11:53:31 -08:00
Kurt Hackel	ba2bf21851	ocfs2_dlm: fix cluster-wide refcounting of lock resources This was previously broken and migration of some locks had to be temporarily disabled. We use a new (and backward-incompatible) set of network messages to account for all references to a lock resources held across the cluster. once these are all freed, the master node may then free the lock resource memory once its local references are dropped. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-02-07 11:53:07 -08:00
Eric W. Biederman	b592fcfe7f	sysfs: Shadow directory support The problem. When implementing a network namespace I need to be able to have multiple network devices with the same name. Currently this is a problem for /sys/class/net/*. What I want is a separate /sys/class/net directory in sysfs for each network namespace, and I want to name each of them /sys/class/net. I looked and the VFS actually allows that. All that is needed is for /sys/class/net to implement a follow link method to redirect lookups to the real directory you want. Implementing a follow link method that is sensitive to the current network namespace turns out to be 3 lines of code so it looks like a clean approach. Modifying sysfs so it doesn't get in my was is a bit trickier. I am calling the concept of multiple directories all at the same path in the filesystem shadow directories. With the directory entry really at that location the shadow master. The following patch modifies sysfs so it can handle a directory structure slightly different from the kobject tree so I can implement the shadow directories for handling /sys/class/net/. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:14 -08:00
Oliver Neukum	82244b169e	sysfs: error handling in sysfs, fill_read_buffer() if a driver returns an error in fill_read_buffer(), the buffer will be marked as filled. Subsequent reads will return eof. But there is no data because of an error, not because it has been read. Not marking the buffer filled is the obvious fix. Signed-off-by: Oliver Neukum <oliver@neukum.name> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Mariusz Kozlowski	f750653670	sysfs: kobject_put cleanup This patch removes redundant argument checks for kobject_put(). Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Frederik Deweerdt	d3fc373ac5	sysfs: suppress lockdep warnings Lockdep issues the following warning: [ 9.064000] ============================================= [ 9.064000] [ INFO: possible recursive locking detected ] [ 9.064000] 2.6.20-rc3-mm1 #3 [ 9.064000] --------------------------------------------- [ 9.064000] init/1 is trying to acquire lock: [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.064000] [ 9.064000] but task is already holding lock: [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] [ 9.065000] other info that might help us debug this: [ 9.065000] 2 locks held by init/1: [ 9.065000] #0: (tty_mutex){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] #1: (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] [ 9.065000] stack backtrace: [ 9.065000] [<c010390d>] show_trace_log_lvl+0x1a/0x30 [ 9.066000] [<c0103935>] show_trace+0x12/0x14 [ 9.066000] [<c0103a2f>] dump_stack+0x16/0x18 [ 9.066000] [<c0138cb8>] print_deadlock_bug+0xb9/0xc3 [ 9.066000] [<c0138d17>] check_deadlock+0x55/0x5a [ 9.066000] [<c013a953>] __lock_acquire+0x371/0xbf0 [ 9.066000] [<c013b7a9>] lock_acquire+0x69/0x83 [ 9.066000] [<c03e6b7e>] __mutex_lock_slowpath+0x75/0x2d1 [ 9.066000] [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.066000] [<c01b249c>] sysfs_drop_dentry+0xb1/0x133 [ 9.066000] [<c01b25d1>] sysfs_hash_and_remove+0xb3/0x142 [ 9.066000] [<c01b30ed>] sysfs_remove_file+0xd/0x10 [ 9.067000] [<c02849e0>] device_remove_file+0x23/0x2e [ 9.067000] [<c02850b2>] device_del+0x188/0x1e6 [ 9.067000] [<c028511b>] device_unregister+0xb/0x15 [ 9.067000] [<c0285318>] device_destroy+0x9c/0xa9 [ 9.067000] [<c0261431>] vcs_remove_sysfs+0x1c/0x3b [ 9.067000] [<c0267a08>] con_close+0x5e/0x6b [ 9.067000] [<c02598f2>] release_dev+0x4c4/0x6e5 [ 9.067000] [<c0259faa>] tty_release+0x12/0x1c [ 9.067000] [<c0174872>] __fput+0x177/0x1a0 [ 9.067000] [<c01746f5>] fput+0x3b/0x41 [ 9.068000] [<c0172ee1>] filp_close+0x36/0x65 [ 9.068000] [<c0172f73>] sys_close+0x63/0xa4 [ 9.068000] [<c0102a96>] sysenter_past_esp+0x5f/0x99 [ 9.068000] ======================= This is due to sysfs_hash_and_remove() holding dir->d_inode->i_mutex before calling sysfs_drop_dentry() which calls orphan_all_buffers() which in turn takes node->i_mutex. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Oliver Neukum	94bebf4d1b	Driver core: fix race in sysfs between sysfs_remove_file() and read()/write() This patch prevents a race between IO and removing a file from sysfs. It introduces a list of sysfs_buffers associated with a file at the inode. Upon removal of a file the list is walked and the buffers marked orphaned. IO to orphaned buffers fails with -ENODEV. The driver can safely free associated data structures or be unloaded. Signed-off-by: Oliver Neukum <oliver@neukum.name> Acked-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Cornelia Huck	c744aeae9d	driver core: Allow device_move(dev, NULL). If we allow NULL as the new parent in device_move(), we need to make sure that the device is placed into the same place as it would if it was newly registered: - Consider the device virtual tree. In order to be able to reuse code, setup_parent() has been tweaked a bit. - kobject_move() can fall back to the kset's kobject. - sysfs_move_dir() uses the sysfs root dir as fallback. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:11 -08:00
Linus Torvalds	5331be0905	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: JFS: Remove incorrect kgdb define JFS: call io_schedule() instead of schedule() to avoid deadlock JFS: Add lockdep annotations JFS: Avoid BUG() on a damaged file system	2007-02-07 08:10:48 -08:00
Linus Torvalds	d3f8fd765e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits) [GFS2] make gfs2_writepages() static [GFS2] Unlock page on prepare_write try lock failure [GFS2] nfsd readdirplus assertion failure [DLM] fix softlockup in dlm_recv [DLM] zero new user lvbs [DLM/GFS2] indent help text [GFS2] Fix unlink deadlocks [GFS2] Put back semaphore to avoid umount problem [GFS2] more CURRENT_TIME_SEC [GFS2/DLM] fix GFS2 circular dependency [GFS2/DLM] use sysfs [GFS2] make lock_dlm drop_count tunable in sysfs [GFS2] increase default lock limit [GFS2] Fix list corruption in lops.c [GFS2] Fix recursive locking attempt with NFS [DLM] can miss clearing resend flag [DLM] saved dlm message can be dropped [DLM] Make sock_sem into a mutex [GFS2] Fix typo in glock.c [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 ...	2007-02-07 08:09:00 -08:00
Adrian Bunk	a2cf822274	[GFS2] make gfs2_writepages() static On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc6-mm2: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly global gfs2_writepages() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:48:48 -05:00
Steven Whitehouse	2d72e7101c	[GFS2] Unlock page on prepare_write try lock failure When the try lock of the glock failed in prepare_write we were incorrectly exiting this function with the page still locked. This was resulting in further I/O to this page hanging. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:25:59 -05:00
Linus Torvalds	dda2ac15d2	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ocfs2_link() journal credits update	2007-02-06 14:59:27 -08:00
Linus Torvalds	768c242b30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Minor cleanup [CIFS] Missing free in error path [CIFS] Reduce cifs stack space usage [CIFS] lseek polling returned stale EOF	2007-02-06 14:42:20 -08:00
Herbert Xu	f1ddcaf339	[CRYPTO] api: Remove deprecated interface This patch removes the old cipher interface and related code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-02-07 09:21:00 +11:00
Steve French	a850790f6c	[CIFS] Minor cleanup Missing tab. Missing entry in changelog Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-02-06 20:43:30 +00:00
Wendy Cheng	549ae0ac3d	[GFS2] nfsd readdirplus assertion failure Glock assertion failure found in '07 NFS connectathon. One of the NFSDs is doing a "readdirplus" procedure call. It passes the logic into gfs2_readdir() where it obtains its directory inode glock. This is then followed by filehandle construction that invokes lookup code. It hits the assertion failure while trying to obtain the inode glock again inside gfs2_drevalidate(). This patch bypasses the recursive glock call if caller already holds the lock. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-06 11:36:01 -05:00
Patrick Caulfield	a34fbc6363	[DLM] fix softlockup in dlm_recv This patch stops the dlm_recv workqueue from busy-waiting when a node disconnects. This can cause soft lockup errors on debug systems and bad performance generally. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:27 -05:00
David Teigland	62a0f62369	[DLM] zero new user lvbs A new lvb for a userland lock wasn't being initialized to zero. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:24 -05:00
Randy Dunlap	9beeb9f3c5	[DLM/GFS2] indent help text Indent help text as expected. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:20 -05:00
Russell Cattelan	ddee76089c	[GFS2] Fix unlink deadlocks Move the glock acquisition to outside of the transactions. Lock odering must be preserved in order to prevent ABBA deadlocks. The current gfs2_change_nlink code would tries to grab the glock after having started a transaction and thus is holding the log lock. This is inconsistent with other code paths in gfs that grab the resource group glock prior to staring a tranactions. One problem with this fix is that the resource group lock is always grabbed now even if the inode still has ref count and can not be marked for unlink. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:17 -05:00
Steven Whitehouse	61be084efc	[GFS2] Put back semaphore to avoid umount problem Dave Teigland fixed this bug a while back, but I managed to mistakenly remove the semaphore during later development. It is required to avoid the list of inodes changing during an invalidate_inodes call. I have made it an rwsem since the read side will be taken frequently during normal filesystem operation. The write site will only happen during umount of the file system. Also the bug only triggers when using the DLM lock manager and only then under certain conditions as its timing related. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>	2007-02-05 13:38:14 -05:00
Eric Sandeen	bbb28ab759	[GFS2] more CURRENT_TIME_SEC Whoops, quilt user error, missed this one in the previous patch. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:11 -05:00
Adrian Bunk	0011727785	[GFS2/DLM] fix GFS2 circular dependency On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote: > Andrew Morton napsal(a): > >Temporarily at > > > > http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/ > > Unable to select IPV6. Menuconfig doesn't offer it when INET is selected. > When it's not it appears in the menu, but after state change it gets away. > The same behaviour in xconfig, gconfig. > > $ mkdir ../a/tst > $ make O=../a/tst menuconfig > HOSTCC scripts/basic/fixdep > [...] > HOSTLD scripts/kconfig/mconf > scripts/kconfig/mconf arch/i386/Kconfig > Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS > OCFS2_FS INET > > Maybe this is the problem? Yes, patch below. > regards, cu Adrian <-- snip --> This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM and DLM depend on instead of select SYSFS. Since SYSFS depends on EMBEDDED this change shouldn't cause any problems for users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:08 -05:00
Randy Dunlap	67f55897ee	[GFS2/DLM] use sysfs With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build fails with: WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined! WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined! WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined! make[1]: * [__modpost] Error 1 make: * [modules] Error 2 Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use kernel_subsys, they should either DEPEND on it or SELECT it. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:05 -05:00
David Teigland	ee32e4f3d3	[GFS2] make lock_dlm drop_count tunable in sysfs We want to be able to change or disable the default drop_count (number at which the dlm asks gfs to limit the the number of locks it's holding). Add it to the collection of sysfs tunables for an fs. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:01 -05:00
David Teigland	2f708649ba	[GFS2] increase default lock limit Increase the number of locks at which point the dlm begins asking gfs to reduce its lock usage. The default value is largely arbitrary, but the current value of 50,000 ends up limiting performance unnecessarily for too many users. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:59 -05:00
Steven Whitehouse	8bd9572769	[GFS2] Fix list corruption in lops.c The patch below appears to fix the list corruption that we are seeing on occasion. Although the transaction structure is private to a single thread, when the queued structures are dismantled during an in-core commit, its possible for a different thread to be trying to add the same structure to another, new, transaction at the same time. To avoid this, this patch takes the log spinlock during this operation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:56 -05:00
Steven Whitehouse	d7c103d0bd	[GFS2] Fix recursive locking attempt with NFS In certain cases, its possible for NFS to call the lookup code while holding the glock (when doing a readdirplus operation) so we need to check for that and not try and lock the glock twice. This also fixes a typo in a previous NFS related GFS2 patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:53 -05:00
David Teigland	b790c3b7c3	[DLM] can miss clearing resend flag A long, complicated sequence of events, beginning with the RESEND flag not being cleared on an lkb, can result in an unlock never completing. - lkb on waiters list for remote lookup - the remote node is both the dir node and the master node, so it optimizes the lookup into a request and sends a request reply back - the request reply is saved on the requestqueue to be processed after recovery - recovery runs dlm_recover_waiters_pre() which sets RESEND flag so the lookup will be resent after recovery - end of recovery: process_requestqueue takes saved request reply which removes the lkb off the waitesr list, _without_ clearing the RESEND flag - end of recovery: dlm_recover_waiters_post() doesn't do anything with the now completed lookup lkb (would usually clear RESEND) - later, the node unmounts, unlocks this lkb that still has RESEND flag set - the lkb is on the waiters list again, now for unlock, when recovery occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND set, doesn't do anything since the master still exists - end of recovery: dlm_recover_waiters_post() takes this lkb off the waiters list because it has the RESEND flag set, then reports an error because unlocks are never supposed to be handled in recover_waiters_post(). - later, the unlock reply is received, doesn't find the lkb on the waiters list because recover_waiters_post() has wrongly removed it. - the unlock operation has been lost, and we're left with a stray granted lock - unmount spins waiting for the unlock to complete The visible evidence of this problem will be a node where gfs umount is spinning, the dlm waiters list will be empty, and the dlm locks list will show a granted lock. The fix is simply to clear the RESEND flag when taking an lkb off the waiters list. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:50 -05:00
David Teigland	8fd3a98f2c	[DLM] saved dlm message can be dropped dlm_receive_message() returns 0 instead of returning 'error'. What would happen is that process_requestqueue would take a saved message off the requestqueue and call receive_message on it. receive_message would then see that recovery had been aborted, set error to EINTR, and 'goto out', expecting that the error would be returned. Instead, 0 was always returned, so process_requestqueue would think that the message had been processed and delete it instead of saving it to process next time. This means the message (usually an unlock in my tests) would be lost. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:47 -05:00
Patrick Caulfield	f1f1c1ccf7	[DLM] Make sock_sem into a mutex Now that there can be multiple dlm_recv threads running we need to prevent two recvs running for the same connection - it's unlikely but it can happen and it causes message corruption. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:44 -05:00
Steven Whitehouse	d043e1900c	[GFS2] Fix typo in glock.c This is a one letter typo fix in glock.c, spotted by Rob Kenna. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:41 -05:00
Eric Sandeen	ddfe062783	[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 I was looking something else up and came across this... I don't honestly have a good reason to change it other than to make it like every other Linux filesystem in this regard. ;-) It doesn't functionally change anything, but makes some lines shorter. :) I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but not in timespec_t format, and only stores second resolutions? Seems like you're halfway to sub-second resolutions already. I suppose if that gets implemented then all of the below should instead be CURRENT_TIME not CURRENT_TIME_SEC. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:38 -05:00
Steven Whitehouse	90101c3186	[GFS2] Compile fix for glock.c This one liner got missed from the previous patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:35 -05:00
Steven Whitehouse	12132933c4	[GFS2] Remove queue_empty() function This function is not longer required since we do not do recursive locking in the glock layer. As a result all its callers can be replaceed with list_empty() calls. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:32 -05:00
Patrick Caulfield	bd44e2b007	[DLM] fix lowcomms receiving This patch fixes a bug whereby data on a newly accepted connection would be ignored if it arrived soon after the accept. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:29 -05:00
Steven Whitehouse	b5d32bead1	[GFS2] Tidy up glops calls This patch doesn't make any changes to the ordering of the various operations related to glocking, but it does tidy up the calls to the glops.c functions to make the structure more obvious. The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be made static within glock.c since they are called by every set of glock operations. The xmote_th and drop_th glock operations are then made conditional upon those two routines existing and called from the previously mentioned functions in glock.c respectively. Also it can be seen that the go_sync operation isn't needed since it can easily be replaced by calls to xmote_bh and drop_bh respectively. This results in no longer (confusingly) calling back into routines in glock.c from glops.c and also reducing the glock operations by one member. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:26 -05:00
Patrick Caulfield	f2f5095f9e	[DLM] lowcomms tidy This patch removes some redundant fields from the connection structure and adds some lockdep annotation to remove spurious warnings. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:23 -05:00
Steven Whitehouse	1c0f4872dc	[GFS2] Remove local exclusive glock mode Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutex's are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:20 -05:00
Steven Whitehouse	6bd9c8c2fb	[GFS2] Remove unused go_callback operation This is never used, so we might as well remove it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:17 -05:00
Steven Whitehouse	e5dab552c8	[GFS2] Remove the "greedy" function from glock.[ch] The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and its overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:14 -05:00
Steven Whitehouse	fee852e374	[GFS2] Shrink gfs2_inode memory by half Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no thats not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:11 -05:00
Steven Whitehouse	330005c2b2	[GFS2] Remove max_atomic_write tunable This removes an unused sysfs tunable parameter. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:08 -05:00
Steven Whitehouse	3699e3a44b	[GFS2] Clean up/speed up readdir This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:04 -05:00
Steven Whitehouse	a8d638e30e	[GFS2] Add writepages for "data=writeback" mounts It occurred to me that although a gfs2 specific writepages for ordered writes and journaled data would be tricky, by hooking writepages only for "data=writeback" mounts we could take advantage of not needing buffer heads (we don't use them on the read side, nor have we for some time) and create much larger I/Os for the block layer. Using blktrace both before and after, its possible to see that for large I/Os, most of the requests generated through writepages are now 1024 sectors after this patch is applied as opposed to 8 sectors before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:01 -05:00
David Teigland	222d396092	[DLM] fix master recovery If master recovery happens on an rsb in one recovery sequence, then that sequence is aborted before lock recovery happens, then in the next sequence, we rely on the previous master recovery (which may now be invalid due to another node ignoring a lookup result) and go on do to the lock recovery where we get stuck due to an invalid master value. recovery cycle begins: master of rsb X has left nodes A and B send node C an rcom lookup for X to find the new master C gets lookup from B first, sets B as new master, and sends reply back to B C gets lookup from A next, and sends reply back to A saying B is master A gets lookup reply from C and sets B as the new master in the rsb recovery cycle on A, B and C is aborted to start a new recovery B gets lookup reply from C and ignores it since there's a new recovery recovery cycle begins: some other node has joined B doesn't think it's the master of X so it doesn't rebuild it in the directory C looks up the master of X, no one is master, so it becomes new master B looks up the master of X, finds it's C A believes that B is the master of X, so it sends its lock to B B sends an error back to A A resends this repeats forever, the incorrect master value on A is never corrected The fix is to do master recovery on an rsb that still has the NEW_MASTER flag set from an earlier recovery sequence, and therefore didn't complete lock recovery. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:58 -05:00
David Teigland	a1bc86e6bd	[DLM] fix user unlocking When a user process exits, we clear all the locks it holds. There is a problem, though, with locks that the process had begun unlocking before it exited. We couldn't find the lkb's that were in the process of being unlocked remotely, to flag that they are DEAD. To solve this, we move lkb's being unlocked onto a new list in the per-process structure that tracks what locks the process is holding. We can then go through this list to flag the necessary lkb's when clearing locks for a process when it exits. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:55 -05:00
Patrick Caulfield	1d6e8131cf	[DLM] Use workqueues for dlm lowcomms This patch converts the DLM TCP lowcomms to use workqueues rather than using its own daemon functions. Simultaneously removing a lot of code and making it more scalable on multi-processor machines. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:52 -05:00
Adrian Bunk	03dc6a538e	[GFS2] make gfs2_change_nlink_i() static On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly globlal gfs2_change_nlink_i() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:49 -05:00
Robert Peterson	7083146564	[GFS2] gfs2 knows of directories which it chooses not to display This is for Red Hat bugzilla bug bz #222302: Moving a virtual IP from node to node between two NFS-over-GFS2 servers was causing one of the GFS2 servers to become confused and reference a deleted inode. The problem was due to vfs dentries that did not reference the gfs2_dops and therefore didn't call the gfs2 revalidate code to revalidate a dentry after a directory had been deleted & recreated. This patch is a crosswrite from a RHEL4 bug found in GFS1 as bz #190756 and it is against the latest -nmw git tree. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:46 -05:00
David Teigland	d200778e12	[DLM] expose dlm_config_info fields in configfs Make the dlm_config_info values readable and writeable via configfs entries. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:43 -05:00
David Teigland	99fc64874a	[DLM] add config entry to enable log_debug Add a new dlm_config_info field to enable log_debug output and change log_debug() to use it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:40 -05:00
David Teigland	68c817a1c4	[DLM] rename dlm_config_info fields Add a "ci_" prefix to the fields in the dlm_config_info struct so that we can use macros to add configfs functions to access them (in a later patch). No functional changes in this patch, just naming changes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:37 -05:00
David Teigland	8ec6886748	[DLM] change some log_error to log_debug Some common, non-error messages should use log_debug instead of log_error so they can be turned off. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:34 -05:00
S. Wendy Cheng	87d21e07f3	[GFS2] Fix gfs2_rename deadlock Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:31 -05:00
Russell Cattelan	6c93fd1e57	[GFS2] BZ 217008 fsfuzzer fix. Update the quilt header comments to match the code changes. Change gfs2_lookup_simple to return an error in the case of a NULL inode. The callers of gfs2_lookup_simple do not check for NULL in the no entry case and such would end up dereferencing a NULL ptr. This fixes: http://projects.info-pull.com/mokb/MOKB-15-11-2006.html Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:28 -05:00
Steven Whitehouse	49686f7106	[GFS2] Fix ordering of page disposal vs. glock_dq In case of unlinked files with dirty pages GFS2 wasn't clearing the pages in quite the right order. This patch clears the pages earlier (before the qlock_dq) to avoid the situation that the release of the glock results in attempting to write back data that has already been deallocated. This fixes Red Hat bugzilla: #220117 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:24 -05:00
Patrick Caulfield	4edde74eed	[DLM] Fix spin lock already unlocked bug I just noticed this message when testing some other changes I'd made to lowcomms (to use workqueues) but the problem seems to be in the current git trees too. I'm amazed no-one has seen it. BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:21 -05:00
Patrick Caulfield	3fb4a251fe	[DLM] Fix schedule() calls I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton. These four should really be calls to schedule() or the dlm can busy-wait. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:18 -05:00
S. Wendy Cheng	5509826f1e	[GFS2] Fix change nlink deadlock Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:15 -05:00
Steven Whitehouse	e1d5b18ae9	[GFS2] Fail over to readpage for stuffed files This is partially derrived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>	2007-02-05 13:36:12 -05:00
Steven Whitehouse	c7b3383437	[GFS2] Fix DIO deadlock This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>	2007-02-05 13:36:09 -05:00
Adrian Bunk	927255f038	[DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions Remove the following unused functions: - lowcomms_send_message() - lowcomms_max_buffer_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:05 -05:00
David Teigland	075529b5e1	[DLM] fix lost flags in stub replies When the dlm fakes an unlock/cancel reply from a failed node using a stub message struct, it wasn't setting the flags in the stub message. So, in the process of receiving the fake message the lkb flags would be updated and cleared from the zero flags in the message. The problem observed in tests was the loss of the USER flag which caused the dlm to think a user lock was a kernel lock and subsequently fail an assertion checking the validity of the ast/callback field. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:02 -05:00
David Teigland	8d07fd509e	[DLM] fix receive_request() lvb copying LVB's are not sent as part of new requests, but the code receiving the request was copying data into the lvb anyway. The space in the message where it mistakenly thought the lvb lived actually contained the resource name, so it wound up incorrectly copying this name data into the lvb. Fix is to just create the lvb, not copy junk into it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:59 -05:00
David Teigland	da49f36f4f	[DLM] fix send_args() lvb copying The send_args() function is used to copy parameters into a message for a number different message types. Only some of those types are set up beforehand (in create_message) to include space for sending lvb data. send_args was wrongly copying the lvb for all message types as long as the lock had an lvb. This means that the lvb data was being written past the end of the message into unknown space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:56 -05:00
David Teigland	9e971b715d	[DLM] add version check Check if we receive a message from another lockspace member running a version of the dlm with an incompatible inter-node message protocol. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:53 -05:00
David Teigland	38aa8b0c59	[DLM] fix old rcom messages A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:50 -05:00
David Teigland	dc200a8848	[DLM] fix resend rcom lock There's a chance the new master of resource hasn't learned it's the new master before another node sends it a lock during recovery. The node sending the lock needs to resend if this happens. - A sends a master lookup for resource R to C - B sends a master lookup for resource R to C - C receives A's lookup, assigns A to be master of R and sends a reply back to A - C receives B's lookup and sends a reply back to B saying that A is the master - B receives lookup reply from C and sends its lock for R to A - A receives lock from B, doesn't think it's the master of R and sends an error back to B - A receives lookup reply from C and becomes master of R - B gets error back from A and resends its lock back to A (this resending is what this patch does) - A receives lock from B, it now sees it's the master of R and takes the lock Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:47 -05:00
David Teigland	c378051177	[GFS2] don't try to lockfs after shutdown If an fs has already been shut down, a lockfs callback should do nothing. An fs that's been shut down can't acquire locks or do anything with respect to the cluster. Also, remove FIXME comment in withdraw function. The missing bits of the withdraw procedure are now all done by user space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:35:44 -05:00
Andrew Morton	b2e895dbd8	[PATCH] revert blockdev direct io back to 2.6.19 version Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-03 11:26:06 -08:00
Ken Chen	dee11c2364	[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2 An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-03 11:26:06 -08:00
Steve French	914afcf55a	[CIFS] Missing free in error path Thanks to jra for pointing this out Signed-off-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-02-02 14:42:12 +00:00
Steve French	9a0c8230e8	[CIFS] Reduce cifs stack space usage The two cifs functions that used the most stack according to "make checkstack" have been changed to use less stack. Thanks to jra and Shaggy for helpful ideas Signed-off-by: Steve French <sfrench@us.ibm.com> cc: jra@samba.org cc: shaggy@us.ibm.com	2007-02-02 04:21:57 +00:00
Guillaume Chazarain	7d8952440f	[PATCH] procfs: Fix listing of /proc/NOT_A_TGID/task Listing /proc/PID/task were PID is not a TGID should not result in duplicated entries. [g ~]$ pidof thunderbird-bin 2751 [g ~]$ ls /proc/2751/task 2751 2770 2771 2824 2826 2834 2835 2851 2853 [g ~]$ ls /proc/2770/task 2751 2770 2771 2824 2826 2834 2835 2851 2853 2770 2771 2824 2826 2834 2835 2851 2853 [g ~]$ Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-01 16:22:41 -08:00
Al Viro	fc2dd2e51a	[PATCH] endianness bug: ntohl() misspelled as >> 24 in fh_verify(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-01 16:17:06 -08:00
Mark Fasheh	e051fda4fd	ocfs2: ocfs2_link() journal credits update Commit `592282cf2e` fixed some missing directory c/mtime updates in part by introducing a dinode update in ocfs2_add_entry(). Unfortunately, ocfs2_link() (which didn't update the directory inode before) is now missing a single journal credit. Fix this by doubling the number of inode updates expected during hard link creation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-02-01 12:03:19 -08:00
Steve French	030e9d8147	[CIFS] lseek polling returned stale EOF Fixes Samba bug 4362 Discovered by Jeremy Allison Clipper database polls on EOF via lseek and can get stale EOF when file is open on different client Signed-off-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-02-01 04:27:59 +00:00
Andrew Morton	fa8609da99	[PATCH] ntfs: kmap_atomic() atomicity fix The KM_BIO_SRC_IRQ kmap slot requires local irq protection. Acked-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 16:01:35 -08:00
Neil Brown	46bae1a9a7	[PATCH] Remove warning: VFS is out of sync with lock manager But keep it as a dprintk The message can be generated in a quite normal situation: If a 'lock' request is interrupted, then the lock client needs to record that the server has the lock, incase it does. When we come the unlock, the server might say it doesn't, even though we think it does (or might) and this generates the message. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 16:01:35 -08:00
Evgeniy Dushistov	efee2b8126	[PATCH] ufs: reallocation fix In blocks reallocation function sometimes does not update some of buffer_head::b_blocknr, which may and cause data damage. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Evgeniy Dushistov	8682164a66	[PATCH] ufs: truncate negative to unsigned fix During ufs_trunc_direct which is subroutine of ufs::truncate, we try the first of all free parts of block and then whole blocks. But we calculate size of block's part to free in the wrong way. This may cause bad update of used blocks and fragments statistic, and you can got report that you have free 32T on 1Gb partition. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Evgeniy Dushistov	a685e26fff	[PATCH] ufs: alloc metadata null page fix These series of patches result of UFS1 write support stress testing, like running fsx-linux, untar and build linux kernel etc We pass from ufs::get_block_t to levels below: pointer to the current page, to make possible things like reallocation of blocks on the fly, and we also uses this pointer for indication, what actually we allocate data block or meta data block, but currently we make decision about what we allocate on the wrong level, this may and cause oops if we allocate blocks in some special order. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Miklos Szeredi	ff79544754	[PATCH] fuse: fix bug in control filesystem mount The BUG in fuse_ctl_add_dentry() could be triggered if the control filesystem was unmounted and mounted again while one or more fuse filesystems were present. The fix is to reset the dentry counter in fuse_ctl_kill_sb(). Bug reported by Florent Mertens. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
NeilBrown	34e9a63b4f	[PATCH] knfsd: ratelimit some nfsd messages that are triggered by external events Also remove {NFSD,RPC}_PARANOIA as having the defines doesn't really add anything. The printks covered by RPC_PARANOIA were triggered by badly formatted packets and so should be ratelimited. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Adrian Bunk	d019bcf0eb	[PATCH] fs/lockd/clntlock.c: add missing newlines to dprintk's This patch adds missing newlines to dprintk's. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Johannes Stezenbach	88f6cd0c3b	[PATCH] uml: fix mknod Fix UML hostfs mknod(): userspace has differernt dev_t size and encoding than kernel, so extract major/minor and reencode using glibc makedev() macro. Signed-off-by: Johannes Stezenbach <js@linuxtv.org> Acked-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:44 -08:00
Nick Piggin	87df7241bd	[PATCH] Fix try_to_free_buffer() locking Fix commit `ecdfc9787f` Not to put too fine a point on it, but in a nutshell... __set_page_dirty_buffers() \| try_to_free_buffers() ---------------------------+--------------------------- \| spin_lock(private_lock); \| drop_bufers() \| spin_unlock(private_lock); spin_lock(private_lock) \| !page_has_buffers() \| spin_unlock(private_lock) \| SetPageDirty() \| \| cancel_dirty_page() oops! Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-29 20:20:42 -08:00
Mark Fasheh	a8a75a20e9	[PATCH] ocfs2: fix thinko in ocfs2_backup_super_blkno() Fix a bug which was introduced when I synced up ocfs2_fs.h with ocfs2-tools. We can't do u64/u32 in kernel. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 14:53:27 -08:00
Alexey Dobriyan	1fb8449618	[PATCH] core-dumping unreadable binaries via PT_INTERP Proposed patch to fix #5 in http://www.isec.pl/vulnerabilities/isec-0017-binfmt_elf.txt aka http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2004-1073 To reproduce, do * grab poc at the end of advisory. * add line "eph.p_memsz = 4096;" after "eph.p_filesz = 4096;" where first "4096" is something equal to or greater than 4096. * ./poc /usr/bin/sudo && ls -l Here I get with 2.6.20-rc5: -rw------- 1 ad ad 102400 2007-01-15 19:17 core ---s--x--x 2 root root 101820 2007-01-15 19:15 /usr/bin/sudo Check for MAY_READ like binfmt_misc.c does. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:51:00 -08:00
NeilBrown	a0ad13ef64	[PATCH] knfsd: Fix type mismatch with filldir_t used by nfsd nfsd defines a type 'encode_dent_fn' which is much like 'filldir_t' except that the first pointer is 'struct readdir_cd ' rather than 'void '. It then casts encode_dent_fn points to 'filldir_t' as needed. This hides any other type mismatches between the two such as the fact that the 'ino' arg recently changed from ino_t to u64. So: get rid of 'encode_dent_fn', get rid of the cast of the function type, change the first arg of various functions from 'struct readdir_cd ' to 'void ', and live with the fact that we have a little less type checking on the calling of these functions now. Less internal (to nfsd) checking offset by more external checking, which is more important. Thanks to Gabriel Paubert <paubert@iram.es> for discovering this and providing an initial patch. Signed-off-by: Gabriel Paubert <paubert@iram.es> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:51:00 -08:00
Eric Van Hensbergen	e540eb45a5	[PATCH] 9p: null terminate error strings for debug print We weren't properly NULL terminating protocol error strings for our debug printk resulting in garbage being included in the output when debug was enabled. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:51:00 -08:00
Eric Van Hensbergen	da977b2c7e	[PATCH] 9p: fix segfault caused by race condition in meta-data operations Running dbench multithreaded exposed a race condition where fid structures were removed while in use. This patch adds semaphores to meta-data operations to protect the fid structure. Some cleanup of error-case handling in the inode operations is also included. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:51:00 -08:00
Eric Van Hensbergen	621997cd39	[PATCH] 9p: fix rename return code 9p doesn't handle renames between directories -- however, we were returning EPERM instead of EXDEV when we detected this case. Signed-off-by: Eric Van Hensbergren <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:59 -08:00
Eric Van Hensbergen	f94b347059	[PATCH] 9p: fix bogus return code checks during initialization There is a simple logic error in init_v9fs - the return code checks are reversed. This patch fixes the return code and adds some messages to prevent module initialization from failing silently. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:59 -08:00
Peter Staubach	c397852c3d	[PATCH] knfsd: Don't mess with the 'mode' when storing a exclusive-create cookie NFS V3 (and V4) support exclusive create by passing a 'cookie' which can get stored with the file. If the file exists but has exactly the right cookie stored, then we assume this is a retransmit and the exclusive create was successful. The cookie is 64bits and is traditionally stored in the mtime and atime fields. This causes a problem with Solaris7 as negative mtime or atime confuse it. So we moved two bits into the mode word instead. But inherited ACLs sometimes overwrite the mode word on create, so this is a problem. So we give up and just store 62 of the 64 bits and assume that is close enough. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:59 -08:00
NeilBrown	250f391518	[PATCH] knfsd: fix an NFSD bug with full sized, non-page-aligned reads NFSd assumes that largest number of pages that will be needed for a request+response is 2+N where N pages is the size of the largest permitted read/write request. The '2' are 1 for the non-data part of the request, and 1 for the non-data part of the reply. However, when a read request is not page-aligned, and we choose to use ->sendfile to send it directly from the page cache, we may need N+1 pages to hold the whole reply. This can overflow and array and cause an Oops. This patch increases size of the array for holding pages by one and makes sure that entry is NULL when it is not in use. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-26 13:50:59 -08:00

1 2 3 4 5 ...

4832 Commits