linux

Author	SHA1	Message	Date
Roland McGrath	4e4c22c711	signals: add set_restore_sigmask This adds the set_restore_sigmask() inline in <linux/thread_info.h> and replaces every set_thread_flag(TIF_RESTORE_SIGMASK) with a call to it. No change, but abstracts the details of the flag protocol from all the calls. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:37 -07:00
Oleg Nesterov	7a5e873f09	signals: de_thread: simplify the ->child_reaper switching Now that we rely on SIGNAL_UNKILLABLE flag, de_thread() doesn't need the nasty hack to kill the old ->child_reaper during the mt-exec. This also means we can avoid taking tasklist_lock around zap_other_threads(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:37 -07:00
Oleg Nesterov	06fffb1267	do_task_stat: don't take rcu_read_lock() lock_task_sighand() was changed, and do_task_stat() doesn't need rcu_read_lock any longer. sighand->siglock protects all "interesting" fields. Except: it doesn't protect ->tty->pgrp, but neither does rcu_read_lock(), this should be fixed. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Pavel Emelyanov <xemul@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:34 -07:00
Jan Kara	2deb1acc65	isofs: fix access to unallocated memory when reading corrupted filesystem When a directory on isofs is corrupted, we did not check whether length of the name in a directory entry and the length of the directory entry itself are consistent. This could lead to possible access beyond the end of buffer when the length of the name was too big. Add this sanity check to directory reading code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:33 -07:00
David Chinner	64275ea4f3	[XFS] Include linux/random.h in all builds, not just debug. Noted-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 07:53:50 -07:00
Gerald Schaefer	53492b1de4	[S390] System z large page support. This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:47 +02:00
Linus Torvalds	c4755d16fc	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (48 commits) ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta ext4: fix test ext_generic_write_end() copied return value ext3: fix test ext_generic_write_end() copied return value ext4: Move mballoc headers/structures to a seperate header file mballoc.h ext4: cleanup for compiling mballoc with verification and debugging #defines ext4: don't use ext4_error in ext4_check_descriptors ext4: mark inode dirty after initializing the extent tree ext4: update ctime and mtime for truncate with extents. ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group ext4: move headers out of include/linux ext4: fix wrong gfp type under transaction ext4: Fix hang on umount with quotas when journal is aborted ext4: Fix update of mtime and ctime on rename jdb2: replace remaining __FUNCTION__ occurrences ext4: replace remaining __FUNCTION__ occurrences jbd2: only create debugfs and stats entries if init is successful jbd2: fix kernel-doc notation jbd2: replace potentially false assertion with if block jbd2: eliminate duplicated code in revocation table init/destroy functions jbd2: tidy up revoke cache initialisation and destruction ...	2008-04-29 20:34:49 -07:00
Linus Torvalds	c15a2434ed	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (24 commits) [XFS] Fix build failure after enabling CONFIG_XFS_DEBUG [XFS] remove dmapi cruft in xfs_file.c [XFS] remove sendfile leftovers [XFS] allow enabling CONFIG_XFS_DEBUG [XFS] Don't initialise new inode generation numbers to zero [XFS] Fix check for block zero access in xfs_write_iomap_allocate() [XFS] Don't double count reserved block changes on UP. [XFS] remove xfs_log_ticket_zone on rmmod [XFS] fix non-smp xfs build [XFS] Fix broken HAVE_SPLICE removal commit. [XFS] kill XFS_ICSB_SB_LOCKED [XFS] split xfs_icsb_balance_counter [XFS] Add xfs_icsb_sync_counters_locked for when m_sb_lock already held [XFS] Cleanup xfs_attr a bit with xfs_name and remove cred [XFS] kill usesless IHOLD calls in xfs_remove and xfs_rmdir [XFS] kill parent == child checks in xfs_remove and xfs_rmdir [XFS] kill usesless IHOLD calls in xfs_rename [XFS] remove manual lookup from xfs_rename and simplify locking [XFS] shrink mrlock_t [XFS] simplify xfs_lookup ...	2008-04-29 20:34:17 -07:00
Roel Kluin	f1fa3342e2	ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta In ext4_mb_init_backend() 'i' is of type ext4_group_t. Since unsigned, i >= 0 is always true, so fix hot spins after err_freebuddy: and -meta: and prevent decrements when zero. Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:15 -04:00
Roel Kluin	f8a87d8930	ext4: fix test ext_generic_write_end() copied return value 'copied' is unsigned, whereas 'ret2' is not. The test (copied < 0) fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:18 -04:00
Roel Kluin	7c2f3d6f89	ext3: fix test ext_generic_write_end() copied return value 'copied' is unsigned, whereas 'ret2' is not. The test (copied < 0) fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:27 -04:00
Mingming Cao	8f6e39a7ad	ext4: Move mballoc headers/structures to a seperate header file mballoc.h Move function and structure definitions out of mballoc.c and put them in a new header file, mballoc.h. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:31 -04:00
Solofo Ramangalahy	60bd63d192	ext4: cleanup for compiling mballoc with verification and debugging #defines This patch allows compiling mballoc with: #define AGGRESSIVE_CHECK #define DOUBLE_CHECK #define MB_DEBUG It fixes: Compilation errors: fs/ext4/mballoc.c: In function '__mb_check_buddy': fs/ext4/mballoc.c:605: error: 'struct ext4_prealloc_space' has no member named 'group_list' fs/ext4/mballoc.c:606: error: 'struct ext4_prealloc_space' has no member named 'pstart' fs/ext4/mballoc.c:608: error: 'struct ext4_prealloc_space' has no member named 'len' Compilation warnings: fs/ext4/mballoc.c: In function 'ext4_mb_normalize_group_request': fs/ext4/mballoc.c:2863: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int' fs/ext4/mballoc.c: In function 'ext4_mb_use_inode_pa': fs/ext4/mballoc.c:3103: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int' Sparse check: fs/ext4/mballoc.c:3818:2: warning: context imbalance in 'ext4_mb_show_ac' - different lock contexts for basic block Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 21:59:59 -04:00
Josef Bacik	c19204b0ae	ext4: don't use ext4_error in ext4_check_descriptors Because ext4_check_descriptors is called at mount time you can't use ext4_error as it calls ext4_commit_sb, which since the sb isn't all the way initialized causes bad things to happen (ie a panic). This patch changes the ext4_error's to printk's to keep this problem from happening. Thanks much, Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:28 -04:00
Aneesh Kumar K.V	8753e88f1b	ext4: mark inode dirty after initializing the extent tree We should mark the inode dirty only after initializing the extent tree. Also if we fail during extent initialization we need to call DQUOT_FREE_INODE. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:36 -04:00
Solofo Ramangalahy	ef7377289a	ext4: update ctime and mtime for truncate with extents. The recently announced "Linux POSIX file system test suite" caught a truncate issue when using extents: mtime and ctime are not updated when truncate is successful. This is the single issue caught with "default" ext4 (mkfs and mount with minimal options). The testsuite does not report failure with -o noextents. With the following patch, all tests of the testsuite pass. Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:41 -04:00
Aneesh Kumar K.V	c83617db76	ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group We can't do GFP_NOFS allocation after taking ext4_lock_group BUG: sleeping function called from invalid context at mm/slab.c:3054 in_atomic():1, irqs_disabled():0 1 lock held by vi/2426: #0: (&ei->i_data_sem){----}, at: [<c01cf665>] ext4_release_file+0x23/0x66 Pid: 2426, comm: vi Not tainted 2.6.25-rc7 #24 [<c011a3dc>] __might_sleep+0xbe/0xc5 [<c01620c9>] kmem_cache_alloc+0x22/0xa6 [<c01e382a>] ext4_mb_release_inode_pa+0x73/0x1b3 [<c01e6adf>] ext4_mb_discard_inode_preallocations+0x22d/0x2d4 [<c013000a>] ? param_set_ushort+0x32/0x39 [<c01ceba1>] ext4_discard_reservation+0x27/0x6a [<c01cf66c>] ext4_release_file+0x2a/0x66 [<c0165bd6>] __fput+0xae/0x155 [<c0165e46>] fput+0x17/0x19 [<c0163756>] filp_close+0x50/0x5a [<c01647c0>] sys_close+0x71/0xad [<c0104aba>] sysenter_past_esp+0x5f/0xa5 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:47 -04:00
Christoph Hellwig	3dcf54515a	ext4: move headers out of include/linux Move ext4 headers out of include/linux. This is just the trivial move, there's some more thing that could be done later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 18:13:32 -04:00
Josef Bacik	216553c4b7	ext4: fix wrong gfp type under transaction This fixes the allocations with GFP_KERNEL while under a transaction problems in ext4. This patch is the same as its ext3 counterpart, just switches these to GFP_NOFS. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:02 -04:00
Jan Kara	2887df139c	ext4: Fix hang on umount with quotas when journal is aborted Call dquot_drop() from ext4_dquot_drop() even if we fail to start a transaction. Otherwise we never get to dropping references to quota structures from the inode and umount will hang indefinitely. Thanks to Payphone LIOU for spotting the problem. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> CC: Payphone LIOU <lioupayphone@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:07 -04:00
Jan Kara	53b7e9f680	ext4: Fix update of mtime and ctime on rename The patch below makes ext4 update mtime and ctime of the directory into which we move file even if the directory entry already exists. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:11 -04:00
Dave Jones	355a46961b	trivial: fix user-visible typo in hfsplus Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 13:23:21 -07:00
Steve French	9b1ec9ecea	[CIFS] Remove duplicate call to mode_to_acl The current logic in cifs_setattr calls mode_to_acl twice on mode changes if cifsacl is enabled. Remove the duplicate call. Signed-off-by: Jeff Layton <jlayton@redhat.com> CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-29 20:15:43 +00:00
Linus Torvalds	bd5d435a96	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: Skip I/O merges when disabled block: add large command support block: replace sizeof(rq->cmd) with BLK_MAX_CDB ide: use blk_rq_init() to initialize the request block: use blk_rq_init() to initialize the request block: rename and export rq_init() block: no need to initialize rq->cmd with blk_get_request block: no need to initialize rq->cmd in prepare_flush_fn hook block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline block/elevator.c:elv_rq_merge_ok() mustn't be inline block: make queue flags non-atomic block: add dma alignment and padding support to blk_rq_map_kern unexport blk_max_pfn ps3disk: Remove superfluous cast block: make rq_init() do a full memset() relay: fix splice problem	2008-04-29 08:18:03 -07:00
Jeff Moyer	39fa00311f	aio: fix misleading comments The FIXME comments are inaccurate. The locking comment over lookup_ioctx() is wrong. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:29 -07:00
Harvey Harrison	97a4feb4a7	ncpfs: use get/put_unaligned_* helpers [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	58d485d481	isofs: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	8b3789e5d5	hfsplus: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	803f445f17	fat: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
David Howells	9396d496d7	afs: support the CB.ProbeUuid RPC op Add support for the CB.ProbeUuid cache manager RPC op. This allows a modern OpenAFS server to quickly ask if the client has been rebooted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
David Howells	7c80bcce34	afs: the AFS RPC op CBGetCapabilities is actually CBTellMeAboutYourself The AFS RxRPC op CBGetCapabilities is actually CBTellMeAboutYourself. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Robert P. J. Day	0ae52d6fba	afs: use the shorter LIST_HEAD for brevity Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Hirofumi Nakagawa	801678c5a3	Remove duplicated unlikely() in IS_ERR() Some drivers have duplicated unlikely() macros. IS_ERR() already has unlikely() in itself. This patch cleans up such pointless code. Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jeff@garzik.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Pavel Emelyanov	d7321cd624	sysctl: add the ->permissions callback on the ctl_table_root When reading from/writing to some table, a root, which this table came from, may affect this table's permissions, depending on who is working with the table. The core hunk is at the bottom of this patch. All the rest is just pushing the ctl_table_root argument up to the sysctl_perm() function. This will be mostly (only?) used in the net sysctls. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:23 -07:00
Pavel Emelyanov	7708bfb1c8	sysctl: merge equal proc_sys_read and proc_sys_write Many (most of) sysctls do not have a per-container sense. E.g. kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on and so forth. Besides, tuning them from inside a container is not even secure. On the other hand, hiding them completely from the container's tasks sometimes causes user-space to stop working. When developing net sysctl, the common practice was to duplicate a table and drop the write bits in table->mode, but this approach was not very elegant, led to excessive memory consumption and was not suitable in general. Here's the alternative solution. To facilitate the per-container sysctls ctl_table_root-s were introduced. Each root contains a list of ctl_table_header-s that are visible to different namespaces. The idea of this set is to add the permissions() callback on the ctl_table_root to allow ctl root limit permissions to the same ctl_table-s. The main user of this functionality is the net-namespaces code, but later this will (should) be used by more and more namespaces, containers and control groups. Actually, this idea's core is in a single hunk in the third patch. First two patches are cleanups for sysctl code, while the third one mostly extends the arguments set of some sysctl functions. This patch: These ->read and ->write callbacks act in a very similar way, so merge these paths to reduce the number of places to patch later and shrink the .text size (a bit). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:23 -07:00
Denis V. Lunev	79da3664f6	jbd2: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: <linux-ext4@vger.kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	19b4fc52d6	reiserfs: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. /proc entry owner is also added. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	46fe74f2ae	ext4: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: <linux-ext4@vger.kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	21ac295b42	afs: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: David Howells <dhowells@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	34b37235c6	nfs: use proc_create to setup de->proc_fops Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	9ef2db2630	nfsd: use proc_create to setup de->proc_fops Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	59b7435149	proc: introduce proc_create_data to setup de->data This set of patches fixes a proc ->open'less usage due to ->proc_fops flip in the most part of the kernel code. The original OOPS is described in the commit `2d3a4e3666`: Typical PDE creation code looks like: pde = create_proc_entry("foo", 0, NULL); if (pde) pde->proc_fops = &foo_proc_fops; Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below). The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like: pde = proc_create("foo", 0, NULL, &foo_proc_fops); if (!pde) return -ENOMEM; Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go. In addition to this, proc_create_data is introduced to fix reading from proc without PDE->data. The race is basically the same as above. create_proc_entries is replaced in the entire kernel code as new method is also simply better. This patch: The problem is the same as for de->proc_fops. Right now PDE becomes visible without data set. So, the entry could be looked up without data. This, in most cases, will simply OOPS. proc_create_data call is created to address this issue. proc_create now becomes a wrapper around it. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Chris Mason <chris.mason@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Karsten Keil <kkeil@suse.de> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Neil Brown <neilb@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Osterlund <petero2@telia.com> Cc: Pierre Peiffer <peifferp@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Alexey Dobriyan	b640a89ddd	proc: convert /proc/tty/ldiscs to seq_file interface Note: THIS_MODULE and header addition aren't technically needed because this code is not modular, but let's keep it anyway because people can copy this code into modular code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Alexey Dobriyan	8731f14d37	proc: remove ->get_info infrastructure Now that last dozen or so users of ->get_info were removed, ditch it too. Everyone sane should have switched to seq_file interface long ago. P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc is long-standing crap, BTW, thus a) put ->read_proc/->write_proc/read_proc_entry() users on death row, b) new such users should be rejected, c) everyone is encouraged to convert his favourite ->read_proc user or I'll do it, lazy bastards. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:19 -07:00
Alexey Dobriyan	c74c120a21	proc: remove proc_root from drivers Remove proc_root export. Creation and removal works well if parent PDE is supplied as NULL -- it worked always that way. So, one useless export removed and consistency added, some drivers created PDEs with &proc_root as parent but removed them as NULL and so on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	928b4d8c89	proc: remove proc_root_driver Use creation by full path: "driver/foo". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	36a5aeb878	proc: remove proc_root_fs Use creation by full path instead: "fs/foo". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	9c37066d88	proc: remove proc_bus Remove proc_bus export and variable itself. Using pathnames works fine and is slightly more understandable and greppable. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	5e971dce0b	proc: drop several "PDE valid/invalid" checks proc-misc code is noticeably full of "if (de)" checks when PDE passed is always valid. Remove them. Addition of such check in proc_lookup_de() is for failed lookup case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	7cee4e00e0	proc: less special case in xlate code If valid "parent" is passed to proc_create/remove_proc_entry(), then name of PDE should consist of only one path component, otherwise creation or removal will fail. However, if NULL is passed as parent then create/remove accept full path as an argument. This is arbitrary restriction -- all infrastructure is in place. So, patch allows the following to succeed: create_proc_entry("foo/bar", 0, pde_baz); remove_proc_entry("baz/foo/bar", &proc_root); Also makes the following to behave identically: create_proc_entry("foo/bar", 0, NULL); create_proc_entry("foo/bar", 0, &proc_root); Discrepancy noticed by Den Lunev (IIRC). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	f649d6d326	proc: simplify locking in remove_proc_entry() proc_subdir_lock protects only modifying and walking through PDE lists, so after we've found PDE to remove and actually removed it from lists, there is no need to hold proc_subdir_lock for the rest of operation. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Roland McGrath	638fa202cd	procfs: mem permission cleanup This cleans up the permission checks done for /proc/PID/mem i/o calls. It puts all the logic in a new function, check_mem_permission(). The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task)) magical expression multiple times. The new function does all that work in one place, with clear comments. The old code called security_ptrace() twice on successful checks, once in MAY_PTRACE() and once in __ptrace_may_attach(). Now it's only called once, and only if all other checks have succeeded. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	0d5c9f5f59	proc: switch to proc_create() Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Matt Helsley	925d1c401f	procfs task exe symlink The kernel implements readlink of /proc/pid/exe by getting the file from the first executable VMA. Then the path to the file is reconstructed and reported as the result. Because of the VMA walk the code is slightly different on nommu systems. This patch avoids separate /proc/pid/exe code on nommu systems. Instead of walking the VMAs to find the first executable file-backed VMA we store a reference to the exec'd file in the mm_struct. That reference would prevent the filesystem holding the executable file from being unmounted even after unmapping the VMAs. So we track the number of VM_EXECUTABLE VMAs and drop the new reference when the last one is unmapped. This avoids pinning the mounted filesystem. [akpm@linux-foundation.org: improve comments] [yamamoto@valinux.co.jp: fix dup_mmap] Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com> Cc:"Eric W. Biederman" <ebiederm@xmission.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	e93b4ea20a	proc: print more information when removing non-empty directories This usually saves one recompile to insert similar printk like below. :) Sample nastygram: remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar' ------------[ cut here ]------------ WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200() Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor Pid: 3034, comm: rmmod Tainted: G M 2.6.25-rc1 #5 Call Trace: [<ffffffff80231974>] warn_on_slowpath+0x64/0x90 [<ffffffff80232a6e>] printk+0x4e/0x60 [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200 [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0 [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40 [<ffffffff8025effd>] sys_delete_module+0x14d/0x200 [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff8031c307>] __up_read+0x27/0xa0 [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 10ef850597e89c54 ]--- Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
WANG Cong	4220b7fe89	elf: fix shadowed variables in fs/binfmt_elf.c Fix these sparse warnings: fs/binfmt_elf.c:1749:29: warning: symbol 'tmp' shadows an earlier one fs/binfmt_elf.c:1734:28: originally declared here fs/binfmt_elf.c:2009:26: warning: symbol 'vma' shadows an earlier one fs/binfmt_elf.c:1892:24: originally declared here [akpm@linux-foundation.org: chose better variable name] Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:16 -07:00
Cyrill Gorcunov	6970c8eff8	BINFMT: fill_elf_header cleanup - use straight memset first This patch simplifies the fill_elf_header function by zeroing the whole elf header first, so we fill in only the fields we really need. before: text data bss dec hex filename 11735 80 0 11815 2e27 fs/binfmt_elf.o after: text data bss dec hex filename 11710 80 0 11790 2e0e fs/binfmt_elf.o voilà, 25 bytes of text are freed Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:16 -07:00
Balbir Singh	cf475ad28a	cgroups: add an owner to the mm_struct Remove the mem_cgroup member from mm_struct and instead add an owner. This approach was suggested by Paul Menage. The advantage of this approach is that, once the mm->owner is known, using the subsystem id, the cgroup can be determined. It also allows several control groups that are virtually grouped by mm_struct, to exist independent of the memory controller i.e., without adding mem_cgroup's for each controller, to mm_struct. A new config option CONFIG_MM_OWNER is added and the memory resource controller selects this config option. This patch also adds cgroup callbacks to notify subsystems when mm->owner changes. The mm_cgroup_changed callback is called with the task_lock() of the new task held and is called just prior to changing the mm->owner. I am indebted to Paul Menage for the several reviews of this patchset and helping me make it lighter and simpler. This patch was tested on a powerpc box, it was compiled with both the MM_OWNER config turned on and off. After the thread group leader exits, it's moved to init_css_set by cgroup_exit(), thus all future charges from running threads would be redirected to the init_css_set's subsystem. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: Hirokazu Takahashi <taka@valinux.co.jp> Cc: David Rientjes <rientjes@google.com>, Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: Paul Menage <menage@google.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:10 -07:00
Serge E. Hallyn	08ce5f16ee	cgroups: implement device whitelist Implement a cgroup to track and enforce open and mknod restrictions on device files. A device cgroup associates a device access whitelist with each cgroup. A whitelist entry has 4 fields. 'type' is a (all), c (char), or b (block). 'all' means it applies to all types and all major and minor numbers. Major and minor are either an integer or * for all. Access is a composition of r (read), w (write), and m (mknod). The root device cgroup starts with rwm to 'all'. A child devcg gets a copy of the parent. Admins can then remove devices from the whitelist or add new entries. A child cgroup can never receive a device access which is denied its parent. However when a device access is removed from a parent it will not also be removed from the child(ren). An entry is added using devices.allow, and removed using devices.deny. For instance echo 'c 1:3 mr' > /cgroups/1/devices.allow allows cgroup 1 to read and mknod the device usually known as /dev/null. Doing echo a > /cgroups/1/devices.deny will remove the default 'a : mrw' entry. CAP_SYS_ADMIN is needed to change permissions or move another task to a new cgroup. A cgroup may not be granted more permissions than the cgroup's parent has. Any task can move itself between cgroups. This won't be sufficient, but we can decide the best way to adequately restrict movement later. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix may-be-used-uninitialized warning] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: James Morris <jmorris@namei.org> Looks-good-to: Pavel Emelyanov <xemul@openvz.org> Cc: Daniel Hokka Zakrisson <daniel@hozac.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:09 -07:00
Michael Halcrow	2f9b12a31f	eCryptfs: protect crypt_stat->flags in ecryptfs_open() Make sure crypt_stat->flags is protected with a lock in ecryptfs_open(). Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:07 -07:00
Michael Halcrow	6a3fd92e73	eCryptfs: make key module subsystem respect namespaces Make eCryptfs key module subsystem respect namespaces. Since I will be removing the netlink interface in a future patch, I just made changes to the netlink.c code so that it will not break the build. With my recent patches, the kernel module currently defaults to the device handle interface rather than the netlink interface. [akpm@linux-foundation.org: export free_user_ns()] Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:07 -07:00
Michael Halcrow	f66e883eb6	eCryptfs: integrate eCryptfs device handle into the module. Update the versioning information. Make the message types generic. Add an outgoing message queue to the daemon struct. Make the functions to parse and write the packet lengths available to the rest of the module. Add functions to create and destroy the daemon structs. Clean up some of the comments and make the code a little more consistent with itself. [akpm@linux-foundation.org: printk fixes] Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:07 -07:00
Michael Halcrow	8bf2debd5f	eCryptfs: introduce device handle for userspace daemon communications A regular device file was my real preference from the get-go, but I went with netlink at the time because I thought it would be less complex for managing send queues (i.e., just do a unicast and move on). It turns out that we do not really get that much complexity reduction with netlink, and netlink is more heavyweight than a device handle. In addition, the netlink interface to eCryptfs has been broken since 2.6.24. I am assuming this is a bug in how eCryptfs uses netlink, since the other in-kernel users of netlink do not seem to be having any problems. I have had one report of a user successfully using eCryptfs with netlink on 2.6.24, but for my own systems, when starting the userspace daemon, the initial helo message sent to the eCryptfs kernel module results in an oops right off the bat. I spent some time looking at it, but I have not yet found the cause. The netlink interface breaking gave me the motivation to just finish my patch to migrate to a regular device handle. If I cannot find out soon why the netlink interface in eCryptfs broke, I am likely to just send a patch to disable it in 2.6.24 and 2.6.25. I would like the device handle to be the preferred means of communicating with the userspace daemon from 2.6.26 on forward. This patch: Functions to facilitate reading and writing to the eCryptfs miscellaneous device handle. This will replace the netlink interface as the preferred mechanism for communicating with the userspace eCryptfs daemon. Each user has his own daemon, which registers itself by opening the eCryptfs device handle. Only one daemon per euid may be registered at any given time. The eCryptfs module sends a message to a daemon by adding its message to the daemon's outgoing message queue. The daemon reads the device handle to get the oldest message off the queue. Incoming messages from the userspace daemon are immediately handled. If the message is a response, then the corresponding process that is blocked waiting for the response is awakened. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:07 -07:00
Miklos Szeredi	9c3580aa52	ecryptfs: add missing lock around notify_change Callers of notify_change() need to hold i_mutex. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:07 -07:00
Harvey Harrison	18d1dbf1d4	ecryptfs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
Adrian Bunk	05db67a4f2	remove ecryptfs_header_cache_0 Remove the no longer used ecryptfs_header_cache_0. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
OGAWA Hirofumi	762873c251	vfs: fix unconditional write_super() call in file_fsync() We need to check ->s_dirt before calling write_super(). It became the cause of an unneeded write. This bug was noticed by Sudhanshu Saxena. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
David Howells	8f0cfa52a1	xattr: add missing consts to function arguments Add missing consts to xattr function arguments. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
Jan Blunck	7ec02ef159	vfs: remove lives_below_in_same_fs() Remove lives_below_in_same_fs() since is_subdir() from fs/dcache.c is providing the same functionality. Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
Matthias Kaehlcke	c5c8be3ce5	fs/inode.c: use hlist_for_each_entry() fs/inode.c: use hlist_for_each_entry() in find_inode() and find_inode_fast() [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:06 -07:00
Jan Kara	af065b8a19	vfs: skip inodes without pages to free in drop_pagecache_sb() Many inodes have no pagecache, so we can avoid lots of lock-takings. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:05 -07:00
Jan Kara	eccb95cee4	vfs: fix lock inversion in drop_pagecache_sb() Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock before calling __invalidate_mapping_pages(). We just have to make sure inode won't go away from under us by keeping reference to it and putting the reference only after we have safely resumed the scan of the inode list. A bit tricky but not too bad... Signed-off-by: Jan Kara <jack@suse.cz> Cc: Fengguang Wu <wfg@mail.ustc.edu.cn> Cc: David Chinner <dgc@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:05 -07:00
David Howells	e1d2c8b69a	fdpic: check that the size returned by kernel_read() is what we asked for Check that the size of the read returned by kernel_read() is what we asked for. If it isn't, then reject the binary as being badly formatted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:05 -07:00
S.Caglar Onur	2e50b6ccda	fs/binfmt_aout.c: use printk_ratelimit() Use printk_ratelimit() instead of jiffies based arithmetic, suggested by Geert Uytterhoeven Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:04 -07:00
Pavel Emelyanov	3a2e7f47d7	binfmt_misc.c: avoid potential kernel stack overflow This can be triggered with root help only, but... Register the ':text:E::txt::/root/cat.txt:' rule in binfmt_misc (by root) and try launching the cat.txt file (by anyone) :) The result is endless recursion in the load_misc_binary -> open_exec -> load_misc_binary chain and a stack overflow. There's a similar problem with binfmt_script, and there's a sh_bang member in the linux_binprm structure to handle this, but simply raising this in binfmt_misc may break some setups when the interpreter of some misc binaries is a script. So the proposal is to turn sh_bang into a bit, add a new one (the misc_bang) and raise it in load_misc_binary. After this, even if we set up the misc -> script -> misc loop for binfmts one of them will step on its own bang and exit. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:04 -07:00
Andrew Morton	a2d416dcc9	codafs: fix build warning powerpc: fs/coda/coda_linux.c: In function 'coda_iattr_to_vattr': fs/coda/coda_linux.c:137: warning: large integer implicitly truncated to unsigned type Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:04 -07:00
Tetsuo Handa	175a06ae30	exec: remove argv_len from struct linux_binprm I noticed that 2.6.24.2 calculates bprm->argv_len at do_execve(). But it doesn't update bprm->argv_len after "remove_arg_zero() + copy_strings_kernel()" at load_script() etc. audit_bprm() is called from search_binary_handler() and search_binary_handler() is called from load_script() etc. Thus, I think the condition check if (bprm->argv_len > (audit_argv_kb << 10)) return -E2BIG; in audit_bprm() might return a wrong result when strlen(removed_arg) != strlen(spliced_args). Why not update bprm->argv_len at load_script() etc.? By the way, 2.6.25-rc3 seems not to be doing the condition check. Is the field bprm->argv_len no longer needed? Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Ollie Wild <aaw@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:03 -07:00
Julia Lawall	8d4b69002e	fs/affs/file.c: use BUG_ON if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:02 -07:00
Jim Meyering	cd6fda3608	hfsplus: handle match_strdup failure fs/hfsplus/options.c (hfsplus_parse_options): Handle match_strdup failure. Signed-off-by: Jim Meyering <meyering@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:02 -07:00
Jim Meyering	3fbe5c3100	hfs: handle match_strdup failure fs/hfs/super.c (parse_options): Handle match_strdup failure, twice. Signed-off-by: Jim Meyering <meyering@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Jim Meyering	6db27dd9d2	affs: handle match_strdup failure fs/affs/super.c (parse_options): Remove useless initialization. Handle match_strdup failure. Signed-off-by: Jim Meyering <meyering@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Jiri Olsa	61d64576a2	fs: remove unused fops from struct char_device_struct struct char_device_struct::fops is no longer used: remove it. Signed-off-by: Jiri Olsa <olsajiri@gmail.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Harvey Harrison	f249fdd8c1	autofs4: fix sparse warning in root.c fs/autofs4/root.c:536:23: warning: symbol 'ino' shadows an earlier one fs/autofs4/root.c:510:22: originally declared here There is no need to redeclare, we are at the end of the loop and in the next iteration of the loop, ino will be reset. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Adrian Bunk	3202e1811f	make BINFMT_FLAT a bool I have not yet seen anyone saying he has a reasonable use case for using BINFMT_FLAT modular on his embedded device. Considering that fs/binfmt_flat.c even lacks a MODULE_LICENSE() I really doubt there is any, and this patch therefore makes BINFMT_FLAT a bool. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Bryan Wu <cooloney.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Adrian Bunk	f1e3af72c1	make fs/buffer.c:cont_expand_zero() static cont_expand_zero() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Adrian Bunk	946a57b526	remove generic_commit_write() Remove the obsolete and no longer used generic_commit_write(). Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Adrian Bunk	45cc2b96f2	fs/timerfd.c should #include <linux/syscalls.h> Every file should include the headers containing the prototypes for its global functions (in this case for sys_timerfd_*()). Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:01 -07:00
Adrian Bunk	d5470b596a	fs/aio.c: make 3 functions static Make the following needlessly global functions static: - __put_ioctx() - lookup_ioctx() - io_submit_one() Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Zach Brown <zach.brown@oracle.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	07d45da616	fs/drop_caches.c: make 2 functions static Make the following needlessly global functions static: - drop_pagecache() - drop_slab() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	f11b00f3bd	fs/fs-writeback.c: make 2 functions static Make the following needlessly global functions static: - writeback_acquire() - writeback_release() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	67cde59537	make vfs_ioctl() static Make the needlessly global vfs_ioctl() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	6b09ae6692	make __put_super() static Make the needlessly global __put_super() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	8b1919a1e8	fs/freevxfs/: proper externs Move the extern declarations of several structs to vxfs_extern.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	4b0a8da7a7	fs/hfsplus/: proper externs Add proper extern declarations for two structs in fs/hfsplus/hfsplus_fs.h Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Adrian Bunk	4488c59c94	fs/ramfs/ extern cleanup - internal.h shouldn't duplicate the extern declaration for ramfs_file_operations already in include/linux/ramfs.h - file-mmu.c needs two #include's for seeing the extern declarations of its global structs Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:00 -07:00
Harvey Harrison	63e3453e54	befs: fix sparse warning in linuxvfs.c Use link as the variable name to avoid shadowing the arg. fs/befs/linuxvfs.c:492:8: warning: symbol 'p' shadows an earlier one fs/befs/linuxvfs.c:488:77: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:05:59 -07:00
Harvey Harrison	9fe76c763f	coda: add static to functions in dir.c coda_unlink, coda_rmdir, coda_readdir can all be static, the forward declarations already were. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:05:59 -07:00
Harvey Harrison	e5949050f2	adfs: work around bogus sparse warning fs/adfs/dir_f.c:126:4: warning: do-while statement is not a compound statement Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:05:59 -07:00
Davide Libenzi	cdac75e6f2	epoll: avoid kmemcheck warning Epoll calls rb_set_parent(n, n) to initialize the rb-tree node, but rb_set_parent() accesses node's pointer in its code. This creates a warning in kmemcheck (reported by Vegard Nossum) about an uninitialized memory access. The warning is harmless since the following rb-tree node insert is going to overwrite the node data. In any case I think it's better to not have that happening at all, and fix it by simplifying the code to get rid of a few lines that became superfluous after the previous epoll changes. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:05:59 -07:00
FUJITA Tomonori	68154e90c9	block: add dma alignment and padding support to blk_rq_map_kern This patch adds bio_copy_kern similar to bio_copy_user. blk_rq_map_kern uses bio_copy_kern instead of bio_map_kern if necessary. bio_copy_kern uses temporary pages and the bi_end_io callback frees these pages. bio_copy_kern saves the original kernel buffer at bio->bi_private; it doesn't use something like struct bio_map_data to store the information about the caller. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 09:50:34 +02:00
Tom Zanussi	c3270e577c	relay: fix splice problem Splice isn't always incrementing the ppos correctly, which broke relay splice. Signed-off-by: Tom Zanussi <zanussi@comcast.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-04-29 09:48:15 +02:00
Stephen Rothwell	adaa693b84	[XFS] Fix build failure after enabling CONFIG_XFS_DEBUG Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 16:08:44 +10:00
Christoph Hellwig	c5acbaf43d	[XFS] remove dmapi cruft in xfs_file.c The dmapi cruft in xfs_file.c is totally out of date in mainline vs CVS, and at this point just removing this code which can't be used on mainline at all seems to be the best option to keep it maintainable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 16:08:27 +10:00
Christoph Hellwig	3a738a5c73	[XFS] remove sendfile leftovers Remove the last sendfile leftovers in mainline. This code is already gone in CVS. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 16:08:14 +10:00
Christoph Hellwig	7788fae6cc	[XFS] allow enabling CONFIG_XFS_DEBUG Back when I first submitted XFS for mainline inclusion we made the decision that the debug code is far too extensive to be accidentally enabled by users in mainline. But then again it's often quite useful to track problems down and hacking the makefile all the time is rather annoying. Given all the debug options with even more overhead like lockdep or DEBUG_PAGE_ALLOC users (or rather developers) should know by now what they're doing. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 16:07:48 +10:00
David Chinner	359346a965	[XFS] Don't initialise new inode generation numbers to zero When we allocate new inode chunks, we initialise the generation numbers to zero. This works fine until we delete a chunk and then reallocate it, resulting in the same inode numbers but with a reset generation count. This can result in inode/generation pairs of different inodes occurring relatively close together. Given that the inode/gen pair makes up the "unique" portion of an NFS filehandle on XFS, this can result in file handles cached on clients being seen on the wire from the server but referring to a different file. This causes .... issues for NFS clients. Hence we need a unique generation number initialisation for each inode to prevent reuse of a small portion of the generation number space. Use a random number to initialise the generation number so we don't need to keep any new state on disk whilst making the new number difficult to guess from previous allocations. SGI-PV: 979416 SGI-Modid: xfs-linux-melb:xfs-kern:31001a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:58:56 +10:00
David Chinner	86c4d62305	[XFS] Fix check for block zero access in xfs_write_iomap_allocate() The check for block zero access should be done on non-realtime inodes. Fix the logic error in xfs_write_iomap_allocate(), and simplify the logic on all checks for block zero access in xfs_iomap.c SGI-PV: 980888 SGI-Modid: xfs-linux-melb:xfs-kern:30998a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:58:40 +10:00
David Chinner	d349404ff1	[XFS] Don't double count reserved block changes on UP. On uniprocessor machines, the incore superblock is used for all in-memory accounting of free blocks. In this situation, changes to the reserved block count are accounted twice: once directly and once via xfs_mod_incore_sb(). Seeing as the modification on SMP is done via xfs_mod_incore_sb(), make this the only update mechanism that UP uses as well. SGI-PV: 980654 SGI-Modid: xfs-linux-melb:xfs-kern:30997a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:58:27 +10:00
Alexey Dobriyan	fe0754f0e5	[XFS] remove xfs_log_ticket_zone on rmmod Fix bug introduced in commit `eb01c9cd87` aka "[XFS] Remove the xlog_ticket allocator" SGI-PV: 980887 SGI-Modid: xfs-linux-melb:xfs-kern:30995a Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:58:14 +10:00
Eric Sandeen	7155054c9d	[XFS] fix non-smp xfs build xfs_reserve_blocks() calls xfs_icsb_sync_counters_locked(), which is not defined if !CONFIG_SMP/!HAVE_PERCPU_SB SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30991a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:58:00 +10:00
Donald Douwsma	18d18208da	[XFS] Fix broken HAVE_SPLICE removal commit. Commit `e687330b5e` was meant to remove the unused HAVE_SPLICE macro; instead an unrelated change was checked in, enabling QUOTADEBUG when building DEBUG XFS. Restore the intended changes. SGI-PV: 971046 SGI-Modid: xfs-linux-melb:xfs-kern:30924a Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:57:49 +10:00
Christoph Hellwig	ce46193bca	[XFS] kill XFS_ICSB_SB_LOCKED With the last two patches XFS_ICSB_SB_LOCKED is never checked and only superfluously passed to xfs_icsb_count, so kill it. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30920a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:57:38 +10:00
Christoph Hellwig	45af6c6de6	[XFS] split xfs_icsb_balance_counter Add an xfs_icsb_balance_counter_locked for the case where mp->m_sb_lock is already locked. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30918a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:57:28 +10:00
Christoph Hellwig	d4d90b577e	[XFS] Add xfs_icsb_sync_counters_locked for when m_sb_lock already held Add a new xfs_icsb_sync_counters_locked for the case where m_sb_lock is already taken and add a flags argument to xfs_icsb_sync_counters so that xfs_icsb_sync_counters_flags is not needed. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30917a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:57:11 +10:00
Barry Naujok	e8b0ebaa11	[XFS] Cleanup xfs_attr a bit with xfs_name and remove cred SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30913a Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:55 +10:00
Christoph Hellwig	5df78e73d3	[XFS] kill useless IHOLD calls in xfs_remove and xfs_rmdir The VFS always has an inode reference when we call these functions. So we only need to grab a single reference to each inode that's joined to a transaction - all the other bumping and dropping is as useless as the comments describing the IRIX semantics. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30912a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:45 +10:00
Christoph Hellwig	82dab941a1	[XFS] kill parent == child checks in xfs_remove and xfs_rmdir VFS guaranteed these can't happen. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30911a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:34 +10:00
Christoph Hellwig	1ac74e01df	[XFS] kill useless IHOLD calls in xfs_rename Similar to the previous patch for remove and rmdir, only grab a reference to inodes when we join them to a transaction to balance the decrement on transaction completion. Everything else is taken care of by the VFS. Note that the old case had leaks of inode count when src == target or src or target == one of the parent inodes, but these cases are fortunately already rejected by the VFS. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30904a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:24 +10:00
Christoph Hellwig	cfa853e47d	[XFS] remove manual lookup from xfs_rename and simplify locking ->rename already gets the target inode passed if it exists. Pass it down to xfs_rename so that we can avoid looking it up again. Also simplify locking as the first lock section in xfs_rename can go away now: the isdir is an invariant over the lifetime of the inode, and new_parent and the nlink check are namespace topology protected by i_mutex in the VFS. The projid check needs to move into the second lock section anyway to not be racy. Also kill the now unused xfs_dir_lookup_int and remove the now-unused first_locked argument to xfs_lock_inodes. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30903a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:12 +10:00
Christoph Hellwig	579aa9caf5	[XFS] shrink mrlock_t The writer field is not needed for non-DEBUG builds so remove it. While we're at it also clean up the interface for is-locked asserts to go through an xfs_iget.c helper with an interface like the xfs_ilock routines to isolate the XFS codebase from mrlock internals. That way we can kill mrlock_t entirely once rw_semaphores grow an islocked facility. Also remove unused flags to the ilock family of functions. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30902a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:54:02 +10:00
Christoph Hellwig	eca450b7c2	[XFS] simplify xfs_lookup Opencode xfs-kill-xfs_dir_lookup_int here, which gets rid of a lock roundtrip, and lots of stack space. Also kill the di_mode == 0 check that has been done in xfs_iget for a few years now. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30901a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:53:52 +10:00
Christoph Hellwig	d4377d8418	[XFS] xfs_rename: pass resblks to xfs_dir_removename Similar to rmdir and remove - avoids a potential transaction reservation overrun. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30900a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:53:41 +10:00
Christoph Hellwig	6a7f422d47	[XFS] kill di_mode checks after xfs_iget Unless XFS_IGET_CREATE is passed xfs_iget will return ENOENT if it encounters an inode with di_mode == 0. Remove the duplicated checks in the callers. (the log recovery case is not touched for now) SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30898a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:53:31 +10:00
Christoph Hellwig	4e5dbb3498	[XFS] kill xfs_getattr It's currently used by the ACL code to read di_mode/di_uid, but these are simple 32bit scalar values we can just read directly without locking. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30897a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:53:16 +10:00
Christoph Hellwig	42173f6860	[XFS] Remove VN_IS* macros and related cruft. We can just check i_mode / di_mode directly. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:30896a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-04-29 15:53:05 +10:00
Steve French	4b18f2a9c3	[CIFS] convert usage of implicit booleans to bool Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-29 00:06:05 +00:00
Igor Mammedov	e9f20d6f03	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-28 23:08:21 +00:00
Igor Mammedov	bf62fd887c	[CIFS] fixed compatibility issue with samba referral request treeName part is canonicalized to '/' path separator Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-28 23:05:58 +00:00
Adrian Bunk	22ba0317c8	udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline This patch fixes the following build error with UML and gcc 4.3: <-- snip --> ... CC fs/udf/partition.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c: In function ‘udf_get_pblock_virt15’: /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c:32: sorry, unimplemented: inlining failed in call to ‘udf_get_pblock’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c:102: sorry, unimplemented: called from here make[3]: *** [fs/udf/partition.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>	2008-04-28 18:44:26 +02:00
Olof Johansson	9f354858b8	fatfs: fix build warning with 64k PAGE_SIZE Annoying gcc warning: fs/fat/inode.c: In function 'fat_fill_super': fs/fat/inode.c:1222: warning: comparison is always false due to limited range of data type Change it to compare with 4K instead of PAGE_CACHE_SIZE, as suggested by OGAWA-san. [FAT spec says: logical_sector_size should be 512, 1024, 2048 4096] So, at least for now, we limit it to 4096. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
Frank Seidel	0607fd0258	fat: detect media without partition table correctly I received a complaint that some FAT formatted media (e.g. SD memory cards) trigger an "unknown partition table" message even though there is no partition table and they work correctly, while in general (when e.g. formatted with mkdosfs or even Windows Vista) this message is not shown. Currently this seems only to happen when the media get formatted with Windows XP (and possibly Win 2000). Then the boot indicator byte contains garbage (part of a text message) and so do the other parts checked by msdos_partition, which then later triggers this message. References: novell bug #364365 Most FAT formatted media without a partition table contain zeros in the boot indicator and the other tested bytes and so fall through the checks in msdos_partition, leading it to return with 1 (all is fine). But some (e.g. WinXP formatted) FAT formatted media don't use boot_ind and so the check fails and causes an "unknown partition table" warning even though there is none and everything would be fine. This additional check directly verifies if there is a FAT formatted medium without a partition table. Signed-off-by: Frank Seidel <fseidel@suse.de> Cc: Andreas Dilger <adilger@sun.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
Andrew Morton	73f20e58b1	FAT_VALID_MEDIA(): remove pointless test The on-disk media specification field in FAT is only 8-bits, so testing for <=0xff is pointless, and can generate a "comparison is always true due to limited range of data type" warning. While we're there, convert FAT_VALID_MEDIA() into a C function - the present implementation is buggy: it generates either one or two references to its argument. Cc: Frank Seidel <fseidel@suse.de> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	c7a6c4edc7	fat: use __getname() __getname() is faster than __get_free_page(). Use it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
Keith Mok	f22032ba8d	vfat: bug fix for vfat cannot handle filename with 255 This patch fixes the problem that the buffer allocated for conversion of unicode to utf8 in fat/dir.c is too small and cannot handle a filename with 255 Asian characters when mounted with the utf8 option. It also fixes the filename length limit check in vfat/namei.c: the filename length should be checked against the number of converted unicode characters, not the length before NLS/UTF8 conversion. Signed-off-by: Keith Mok <ek9852@gmail.com> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	061e97469f	Add balance_dirty_pages_ratelimited() to cont_expand_zero() On some systems, an ftruncate() that expands the size of a file on FAT became the cause of OOM. The cont_expand_zero() filled all memory with dirty pages, and since the disk is very slow, the limit of page scanning was exceeded, which then triggered OOM. This adds balance_dirty_pages_ratelimited() to avoid filling memory with dirty pages. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	e69be4c9c4	fat: Remove fat_clusters_flush() This removes unneeded fat_clusters_flush(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	606e423e43	fat: Update free_clusters even if it is untrusted Currently, free_clusters is not updated until it is trusted, because Windows doesn't update it correctly. But if the user is using the FAT driver of Linux, it updates free_clusters correctly. Instead, this updates it even if it's untrusted, so if free_clusters is correct, we now keep the correct value. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	1ae43f826b	fat: Add allow_utime option Normally utime(2) checks that the current process is the owner of the file, or that it has the CAP_FOWNER capability. But the FAT filesystem doesn't have uid/gid as on-disk info, so the normal check is too inflexible. With this option you can relax it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	e97e8de388	fat: fat_setattr() fix Fix fat_setattr() in the case of the showexec option. If the user specified the showexec option, inode->i_mode may not have S_IXUGO. This just uses inode->i_mode to fix it. And with this patch, we don't allow chmod() on the in-memory inode, it's just bad behaviour. IOW, we allow changing only S_IWUGO, which can be stored to disk. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	1278fdd34b	fat: fat_notify_change() and check_mode() cleanup - Rename fat_notify_change() to fat_setattr() - check_mode() cleanup - Change layout of code Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:47 -07:00
OGAWA Hirofumi	3754a54447	fat: kill is_bad_inode() check FAT doesn't need to check bad inode anymore. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Jan Kara	d5dee5c395	reiserfs: unpack tails on quota files Quota files cannot have tails because quota_write and quota_read functions do not support them. So far when quota files did have a tail, we just refused to turn quotas on it. Sadly this check has been wrong and so there are now plenty of installations where quota files don't have the NOTAIL flag set and so now after fixing the check, they suddenly fail to turn quotas on. Since it's easy to unpack the tail from the kernel, do this from reiserfs_quota_on() which solves the problem and is generally nicer to users anyway. Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: <urhausen@urifabi.net> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Jan Kara	a2fe594fa3	reiserfs: fix hang on umount with quotas when journal is aborted Call dquot_drop() from reiserfs_dquot_drop() even if we fail to start a transaction. Otherwise we never get to dropping references to quota structures from the inode and umount will hang indefinitely. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Harvey Harrison	fbe5498b3d	reiserfs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Harvey Harrison	8acc570fab	reiserfs: fix more sparse warnings in do_balan.c fs/reiserfs/do_balan.c:1467:10: warning: symbol 'ret_val' shadows an earlier one fs/reiserfs/do_balan.c:275:6: originally declared here fs/reiserfs/do_balan.c:1471:23: warning: symbol 'ih' shadows an earlier one fs/reiserfs/do_balan.c:249:67: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Harvey Harrison	e13601bc6a	reiserfs: fix sparse warning in journal.c fs/reiserfs/journal.c:4319:2: warning: returning void-valued expression Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Marcin Slusarz	9e902df6be	reiserfs: le*_add_cpu conversion replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Harvey Harrison	78e917d59c	udf: fix sparse warning in namei.c Let's use bsize instead. fs/udf/namei.c:960:12: warning: symbol 'elen' shadows an earlier one fs/udf/namei.c:937:15: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:46 -07:00
Harvey Harrison	36a53ddf85	ufs: replace __inline with inline Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Marcin Slusarz	0045edaaf9	ufs: remove unused fs64_add and fs64_sub remove fs64_add and fs64_sub - they probably weren't ever used because their prototypes used u32 instead of __fs64 Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Harvey Harrison	9746077a71	ufs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Marcin Slusarz	3c5afae2ba	ufs: [bl]e*_add_cpu conversion replace all: big/little_endian_variable = cpu_to_[bl]eX([bl]eX_to_cpu(big/little_endian_variable) + expression_in_cpu_byteorder); with: [bl]eX_add_cpu(&big/little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Harvey Harrison	08fc99bfc3	jbd: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Harvey Harrison	e05b6b524b	ext3: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Jan Kara	fa1ff1e02f	ext3: fix mount messages when quota disabled When quota is disabled, we should not print 'journaled quota not supported' when user tried to mount non-journaled quota. Also fix typo in the message. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Aneesh Kumar K.V	2588ef83f7	ext3: retry block allocation if new blocks are allocated from system zone If the block allocator gets blocks out of the system zone, ext3 calls ext3_error. But if the file system is mounted with errors=continue, retry block allocation. We need to mark the system zone blocks as in use to make sure the retry doesn't pick them again. The system zone is the block range mapping block bitmap, inode bitmap and inode table. [akpm@linux-foundation.org: fix typo in comment] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Jan Kara	07c9938a4e	ext3: fix hang on umount with quotas when journal is aborted Call dquot_drop() from ext3_dquot_drop() even if we fail to start a transaction. Otherwise we never get to dropping references to quota structures from the inode and umount will hang indefinitely. Thanks to Payphone LIOU for spotting the problem. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Payphone LIOU <lioupayphone@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:45 -07:00
Jan Kara	0b23076988	ext3: fix update of mtime and ctime on rename Make ext3 update mtime and ctime of the directory into which we move file even if the directory entry already exists. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Josef Bacik	5b9a499d77	jbd: fix possible journal overflow issues There are several cases where the running transaction can get buffers added to its BJ_Metadata list which it never dirtied, which makes its t_nr_buffers counter end up larger than its t_outstanding_credits counter. This will cause issues when starting new transactions as while we are logging buffers we decrement t_outstanding_buffers, so when t_outstanding_buffers goes negative, we will report that we need less space in the journal than we actually need, so transactions will be started even though there may not be enough room for them. In the worst case scenario (which admittedly is almost impossible to reproduce) this will result in the journal running out of space. The fix is to only refile buffers from the committing transaction to the running transactions BJ_Modified list when b_modified is set on that journal, which is the only way to be sure if the running transaction has modified that buffer. This patch also fixes an accounting error in journal_forget, it is possible that we can call journal_forget on a buffer without having modified it, only gotten write access to it, so instead of freeing a credit, we only do so if the buffer was modified. The assert will help catch if this problem occurs. Without these two patches I could hit this assert within minutes of running postmark, with them this issue no longer arises. Thank you, Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Josef Bacik	5bc833feaa	jbd: fix the way the b_modified flag is cleared Currently at the start of a journal commit we loop through all of the buffers on the committing transaction and clear the b_modified flag (the flag that is set when a transaction modifies the buffer) under the j_list_lock. The problem is that everywhere else this flag is modified only under the jbd lock buffer flag, so it will race with a running transaction who could potentially set it, and have it unset by the committing transaction. This is also a big waste, you can have several thousands of buffers that you are clearing the modified flag on when you may not need to. This patch removes this code and instead clears the b_modified flag upon entering do_get_write_access/journal_get_create_access, so if that transaction does indeed use the buffer then it will be accounted for properly, and if it does not then we know we didn't use it. That will be important for the next patch in this series. Tested thoroughly by myself using postmark/iozone/bonnie++. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Julia Lawall	269b261916	fs/ext3: use BUG_ON if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } \| - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } \| - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Akinobu Mita	33575f8ffe	ext3: check ext3_journal_get_write_access() errors Check ext3_journal_get_write_access() errors. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Akinobu Mita	e0e369a7dd	ext3: use ext3_get_group_desc() Use ext3_get_group_desc() Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Akinobu Mita	22a5daf537	ext3: add missing ext3_journal_stop() Add missing ext3_journal_stop() in error handling. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Akinobu Mita	1eaafeae4b	ext3: use ext3_group_first_block_no() Use ext3_group_first_block_no() Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Adrian Bunk	15633005e0	make ext3_xattr_list() static Make the needlessly global ext3_xattr_list() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Marcin Slusarz	e7f23ebdef	ext3: convert byte order of constant instead of variable Convert byte order of constant instead of variable which can be done at compile time (vs run time). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:44 -07:00
Hisashi Hifumi	3d61f75eef	ext3: fdatasync should skip metadata writeout when overwriting Currently fdatasync is identical to fsync in ext3. I think fdatasync should skip journal flush in data=ordered and data=writeback mode when it overwrites to already-instantiated blocks on HDD. When I_DIRTY_DATASYNC flag is not set, fdatasync should skip journal writeout because this indicates only atime or/and mtime updates. Following patch is the same approach of ext2's fsync code(ext2_sync_file). I did a performance test using the sysbench. #sysbench --num-threads=128 --max-requests=50000 --test=fileio --file-total-size=128G --file-test-mode=rndwr --file-fsync-mode=fdatasync run The result on ext3 was: -2.6.24 Operations performed: 0 Read, 50080 Write, 59600 Other = 109680 Total Read 0b Written 782.5Mb Total transferred 782.5Mb (12.116Mb/sec) 775.45 Requests/sec executed Test execution summary: total time: 64.5814s total number of events: 50080 total time taken by event execution: 3713.9836 per-request statistics: min: 0.0000s avg: 0.0742s max: 0.9375s approx. 95 percentile: 0.2901s Threads fairness: events (avg/stddev): 391.2500/23.26 execution time (avg/stddev): 29.0155/1.99 -2.6.24-patched Operations performed: 0 Read, 50009 Write, 61596 Other = 111605 Total Read 0b Written 781.39Mb Total transferred 781.39Mb (16.419Mb/sec) 1050.83 Requests/sec executed Test execution summary: total time: 47.5900s total number of events: 50009 total time taken by event execution: 2934.5768 per-request statistics: min: 0.0000s avg: 0.0587s max: 0.8938s approx. 95 percentile: 0.1993s Threads fairness: events (avg/stddev): 390.6953/22.64 execution time (avg/stddev): 22.9264/1.17 Filesystem I/O throughput was improved. Signed-off-by :Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Aneesh Kumar K.V	8b91582500	ext2: retry block allocation if new blocks are allocated from system zone If the block allocator gets blocks out of system zone ext2 calls ext2_error. But if the file system is mounted with errors=continue, retry block allocation. We need to mark the system zone blocks as in use to make sure the retry doesn't pick them again. System zone is the block range mapping block bitmap, inode bitmap and inode table. [akpm@linux-foundation.org: fix typo in comment] Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Harvey Harrison	605afd60ef	ext2: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Julia Lawall	2c11619a59	fs/ext2: use BUG_ON if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Akinobu Mita	4c8b3125f8	ext2: use ext2_fsblk_t type Use ext2_fsblk_t type for filesystem-wide blocks number Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Akinobu Mita	24097d12ef	ext2: use ext2_group_first_block_no() Use ext2_group_first_block_no() and assign the return values to ext2_fsblk_t variables. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Akinobu Mita	bbff286024	ext2: improve ext2_readdir() return value Improve ext2_readdir() return value for ext2_get_page() failure by using the actual result of ext2_get_page(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Marcin Slusarz	31f68e1301	ext2: convert byte order of constant instead of variable Convert byte order of constant instead of variable which can be done at compile time (vs run time). Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Marcin Slusarz	fba4d3997f	ext2: le*_add_cpu conversion replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:43 -07:00
Jan Kara	1b445a9c21	quota: reiserfs: make reiserfs handle quotaon on remount Update reiserfs to handle quotaon on remount RW. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Jan Kara	6f28e08794	quota: ext4: make ext4 handle quotaon on remount Update ext4 to handle quotaon on remount RW. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Jan Kara	2fd83a4f3c	quota: ext3: make ext3 handle quotaon on remount Update ext3 to handle quotaon on remount RW. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Jan Kara	0ff5af8340	quota: quota core changes for quotaon on remount Currently, we just turn quotas off on remount of filesystem to read-only state. The patch below adds necessary framework so that we can turn quotas off on remount RO but we are able to automatically reenable them again when filesystem is remounted to RW state. All we need to do is to keep references to inodes of quota files when remounting RO and using these references to reenable quotas when remounting RW. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Jan Kara	03f6e92bdd	quota: various style cleanups Cleanups in quota code: Change __inline__ to inline. Change some macros to inline functions. Remove vfs_quota_off_mount() macro. DQUOT_OFF() should be (0) if CONFIG_QUOTA is disabled. Move declaration of mark_dquot_dirty and dirty_dquot from quota.h to dquot.c [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Jan Kara	8794b5b246	quota: remove superfluous DQUOT_OFF() in fs/namespace.c We don't need to turn quotas off before remounting root ro, because do_remount_sb() already handles this. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:33 -07:00
Andrew Perepechko	338bf9afda	quota: do not allow setting of quota limits to too high values We should check whether quota limits set via Q_SETQUOTA are not exceeding limits which quota format is able to handle. Signed-off-by: Andrew Perepechko <andrew.perepechko@sun.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:32 -07:00
Harvey Harrison	eee3754f5e	ncpfs: fix sparse warning in ncpsign_kernel.c We're casting anyway, might as well cast to the correct sign. Specific to i386 (ifdef __i386__) fs/ncpfs/ncpsign_kernel.c:58:23: warning: incorrect type in initializer (different signedness) fs/ncpfs/ncpsign_kernel.c:58:23: expected unsigned int data2 fs/ncpfs/ncpsign_kernel.c:58:23: got int <noident> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Petr Vandrovec <VANDROVE@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:29 -07:00
Harvey Harrison	305787e44e	ncpfs: fix sparse warnings in ioctl.c In both cases, these inode variables are being used to test the server's root inode against NULL. Change them to s_inode. fs/ncpfs/ioctl.c:391:18: warning: symbol 'inode' shadows an earlier one fs/ncpfs/ioctl.c:264:28: originally declared here fs/ncpfs/ioctl.c:441:17: warning: symbol 'inode' shadows an earlier one fs/ncpfs/ioctl.c:264:28: originally declared here In this case, we are about to return anyway, just reuse result. fs/ncpfs/ioctl.c:521:8: warning: symbol 'result' shadows an earlier one fs/ncpfs/ioctl.c:268:6: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Petr Vandrovec <VANDROVE@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:29 -07:00
Harvey Harrison	cdf8803768	ncpfs: add prototypes to ncp_fs.h Removes some externs from C files, noticed from the sparse warnings: fs/ncpfs/dir.c:90:26: warning: symbol 'ncp_root_dentry_operations' was not declared. Should it be static? fs/ncpfs/symlink.c:107:5: warning: symbol 'ncp_symlink' was not declared. Should it be static? fs/ncpfs/symlink.c:101:39: warning: symbol 'ncp_symlink_aops' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Petr Vandrovec <VANDROVE@vc.cvut.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:29 -07:00
Lee Schermerhorn	71fe804b6d	mempolicy: use struct mempolicy pointer in shmem_sb_info This patch replaces the mempolicy mode, mode_flags, and nodemask in the shmem_sb_info struct with a struct mempolicy pointer, initialized to NULL. This removes dependency on the details of mempolicy from shmem.c and hugetlbfs inode.c and simplifies the interfaces. mpol_parse_str() in mempolicy.c is changed to return, via a pointer to a pointer arg, a struct mempolicy pointer on success. For MPOL_DEFAULT, the returned pointer is NULL. Further, mpol_parse_str() now takes a 'no_context' argument that causes the input nodemask to be stored in the w.user_nodemask of the created mempolicy for use when the mempolicy is installed in a tmpfs inode shared policy tree. At that time, any cpuset contextualization is applied to the original input nodemask. This preserves the previous behavior where the input nodemask was stored in the superblock. We can think of the returned mempolicy as "context free". Because mpol_parse_str() is now calling mpol_new(), we can remove from mpol_to_str() the semantic checks that mpol_new() already performs. Add 'no_context' parameter to mpol_to_str() to specify that it should format the nodemask in w.user_nodemask for 'bind' and 'interleave' policies. Change mpol_shared_policy_init() to take a pointer to a "context free" struct mempolicy and to create a new, "contextualized" mempolicy using the mode, mode_flags and user_nodemask from the input mempolicy. Note: we know that the mempolicy passed to mpol_to_str() or mpol_shared_policy_init() from a tmpfs superblock is "context free". This is currently the only instance thereof. However, if we found more uses for this concept, and introduced any ambiguity as to whether a mempolicy was context free or not, we could add another internal mode flag to identify context free mempolicies. Then, we could remove the 'no_context' argument from mpol_to_str(). 
Added shmem_get_sbmpol() to return a reference counted superblock mempolicy, if one exists, to pass to mpol_shared_policy_init(). We must add the reference under the sb stat_lock to prevent races with replacement of the mpol by remount. This reference is removed in mpol_shared_policy_init(). [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: another build fix] [akpm@linux-foundation.org: yet another build fix] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:25 -07:00
Nick Piggin	70688e4dd1	xip: support non-struct page backed memory Convert XIP to support non-struct page backed memory, using VM_MIXEDMAP for the user mappings. This requires the get_xip_page API to be changed to an address based one. Improve the API layering a little bit too, while we're here. This is required in order to support XIP filesystems on memory that isn't backed with struct page (but memory with struct page is still supported too). Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Carsten Otte <cotte@de.ibm.com> Cc: Jared Hulbert <jaredeh@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:23 -07:00
Jared Hulbert	30afcb4bd2	return pfn from direct_access, for XIP Alter the block device ->direct_access() API to work with the new get_xip_mem() API (that requires both kaddr and pfn are returned). Some architectures will not do the right thing in their virt_to_page() for use by XIP (to translate from the kernel virtual address returned by direct_access(), to a user mappable pfn in XIP's page fault handler). However, we can't switch it to just return the pfn and not the kaddr, because we have no good way to get a kva from a pfn, and XIP requires the kva for its read(2) and write(2) handlers. So we have to return both. Signed-off-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-mm@kvack.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:23 -07:00
Peter Zijlstra	214e471ff9	smaps: account swap entries Show the amount of swap for each vma. This can be used to see where all the swap goes. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Matt Mackall <mpm@selenic.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:22 -07:00
Christoph Lameter	a10aa57987	vmalloc: show vmalloced areas via /proc/vmallocinfo Implement a new proc file that allows the display of the currently allocated vmalloc memory. It allows to see the users of vmalloc. That is important if vmalloc space is scarce (i386 for example). And it's going to be important for the compound page fallback to vmalloc. Many of the current users can be switched to use compound pages with fallback. This means that the number of users of vmalloc is reduced and page tables no longer necessary to access the memory. /proc/vmallocinfo allows to review how that reduction occurs. If memory becomes fragmented and larger order allocations are no longer possible then /proc/vmallocinfo allows to see which compound page allocations fell back to virtual compound pages. That is important for new users of virtual compound pages. Such as order 1 stack allocation etc that may fallback to virtual compound pages in the future. /proc/vmallocinfo permissions are made readable-only-by-root to avoid possible information leakage. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: CONFIG_MMU=n build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:21 -07:00
David Rientjes	028fec414d	mempolicy: support optional mode flags With the evolution of mempolicies, it is necessary to support mempolicy mode flags that specify how the policy shall behave in certain circumstances. The most immediate need for mode flag support is to suppress remapping the nodemask of a policy at the time of rebind. Both the mempolicy mode and flags are passed by the user in the 'int policy' formal of either the set_mempolicy() or mbind() syscall. A new constant, MPOL_MODE_FLAGS, represents the union of legal optional flags that may be passed as part of this int. Mempolicies that include illegal flags as part of their policy are rejected as invalid. An additional member to struct mempolicy is added to support the mode flags: struct mempolicy { ... unsigned short policy; unsigned short flags; } The splitting of the 'int' actual passed by the user is done in sys_set_mempolicy() and sys_mbind() for their respective syscalls. This is done by intersecting the actual with MPOL_MODE_FLAGS, rejecting the syscall if there are additional flags, and storing it in the new 'flags' member of struct mempolicy. The intersection of the actual with ~MPOL_MODE_FLAGS is stored in the 'policy' member of the struct and all current users of pol->policy remain unchanged. The union of the policy mode and optional mode flags is passed back to the user in get_mempolicy(). This combination of mode and flags within the same actual does not break userspace code that relies on get_mempolicy(&policy, ...) and either switch (policy) { case MPOL_BIND: ... case MPOL_INTERLEAVE: ... }; statements or if (policy == MPOL_INTERLEAVE) { ... } statements. Such applications would need to use optional mode flags when calling set_mempolicy() or mbind() for these previously implemented statements to stop working. If an application does start using optional mode flags, it will need to mask the optional flags off the policy in switch and conditional statements that only test mode. 
An additional member is also added to struct shmem_sb_info to store the optional mode flags. [hugh@veritas.com: shmem mpol: fix build warning] Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:19 -07:00
Mel Gorman	19770b3260	mm: filter based on a nodemask as well as a gfp_mask The MPOL_BIND policy creates a zonelist that is used for allocations controlled by that mempolicy. As the per-node zonelist is already being filtered based on a zone id, this patch adds a version of __alloc_pages() that takes a nodemask for further filtering. This eliminates the need for MPOL_BIND to create a custom zonelist. A positive benefit of this is that allocations using MPOL_BIND now use the local node's distance-ordered zonelist instead of a custom node-id-ordered zonelist. I.e., pages will be allocated from the closest allowed node with available memory. [Lee.Schermerhorn@hp.com: Mempolicy: update stale documentation and comments] [Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask] [Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask rework] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:19 -07:00
Mel Gorman	dd1a239f6f	mm: have zonelist contain structs with both a zone pointer and zone_idx Filtering zonelists requires very frequent use of zone_idx(). This is costly as it involves a lookup of another structure and a subtraction operation. As the zone_idx is often required, it should be quickly accessible. The node idx could also be stored here if it was found that accessing zone->node is significant which may be the case on workloads where nodemasks are heavily used. This patch introduces a struct zoneref to store a zone pointer and a zone index. The zonelist then consists of an array of these struct zonerefs which are looked up as necessary. Helpers are given for accessing the zone index as well as the node index. [kamezawa.hiroyu@jp.fujitsu.com: Suggested struct zoneref instead of embedding information in pointers] [hugh@veritas.com: mm-have-zonelist: fix memcg ooms] [hugh@veritas.com: just return do_try_to_free_pages] [hugh@veritas.com: do_try_to_free_pages gfp_mask redundant] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Mel Gorman	54a6eb5c47	mm: use two zonelists that are filtered by GFP mask Currently a node has two sets of zonelists, one for each zone type in the system and a second set for GFP_THISNODE allocations. Based on the zones allowed by a gfp mask, one of these zonelists is selected. All of these zonelists consume memory and occupy cache lines. This patch replaces the multiple zonelists per-node with two zonelists. The first contains all populated zones in the system, ordered by distance, for fallback allocations when the target/preferred node has no free pages. The second contains all populated zones in the node suitable for GFP_THISNODE allocations. An iterator macro is introduced called for_each_zone_zonelist() that iterates through each zone allowed by the GFP flags in the selected zonelist. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Mel Gorman	0e88460da6	mm: introduce node_zonelist() for accessing the zonelist for a GFP mask Introduce a node_zonelist() helper function. It is used to lookup the appropriate zonelist given a node and a GFP mask. The patch on its own is a cleanup but it helps clarify parts of the two-zonelist-per-node patchset. If necessary, it can be merged with the next patch in this set without problems. Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Mel Gorman	dac1d27bc8	mm: use zonelists instead of zones when direct reclaiming pages The following patches replace multiple zonelists per node with two zonelists that are filtered based on the GFP flags. The patches as a set fix a bug with regard to the use of MPOL_BIND and ZONE_MOVABLE. With this patchset, the MPOL_BIND will apply to the two highest zones when the highest zone is ZONE_MOVABLE. This should be considered as an alternative fix for the MPOL_BIND+ZONE_MOVABLE in 2.6.23 to the previously discussed hack that filters only custom zonelists. The first patch cleans up an inconsistency where direct reclaim uses zonelist->zones where other places use zonelist. The second patch introduces a helper function node_zonelist() for looking up the appropriate zonelist for a GFP mask which simplifies patches later in the set. The third patch defines/remembers the "preferred zone" for numa statistics, as it is no longer always the first zone in a zonelist. The fourth patch replaces multiple zonelists with two zonelists that are filtered. The two zonelists are due to the fact that the memoryless patchset introduces a second set of zonelists for __GFP_THISNODE. The fifth patch introduces helper macros for retrieving the zone and node indices of entries in a zonelist. The final patch introduces filtering of the zonelists based on a nodemask. Two zonelists exist per node, one for normal allocations and one for __GFP_THISNODE. Performance results varied depending on the machine configuration. In real workloads the gain/loss will depend on how much the userspace portion of the benchmark benefits from having more cache available due to reduced referencing of zonelists. These are the range of performance losses/gains when running against 2.6.24-rc4-mm1. The set and these machines are a mix of i386, x86_64 and ppc64 both NUMA and non-NUMA. 
loss to gain Total CPU time on Kernbench: -0.86% to 1.13% Elapsed time on Kernbench: -0.79% to 0.76% page_test from aim9: -4.37% to 0.79% brk_test from aim9: -0.71% to 4.07% fork_test from aim9: -1.84% to 4.60% exec_test from aim9: -0.71% to 1.08% This patch: The allocator deals with zonelists which indicate the order in which zones should be targeted for an allocation. Similarly, direct reclaim of pages iterates over an array of zones. For consistency, this patch converts direct reclaim to use a zonelist. No functionality is changed by this patch. This simplifies zonelist iterators in the next patch. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Adrian Bunk	9d02dbc813	make swap_pte_to_pagemap_entry() static Make the needlessly global swap_pte_to_pagemap_entry() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Nick Piggin	3c18ddd160	mm: remove nopage Nothing in the tree uses nopage any more. Remove support for it in the core mm code and documentation (and a few stray references to it in comments). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:18 -07:00
Christoph Lameter	488514d179	Remove set_migrateflags() Migrate flags must be set on slab creation as agreed upon when the antifrag logic was reviewed. Otherwise some slabs of a slabcache will end up in the unmovable and others in the reclaimable section depending on which flag was active when a new slab page was allocated. This likely slid in somehow when antifrag was merged. Remove it. The buffer_heads are always allocated with __GFP_RECLAIMABLE because the SLAB_RECLAIM_ACCOUNT option is set. The set_migrateflags() never had any effect there. Radix tree allocations are not directly reclaimable but they are allocated with __GFP_RECLAIMABLE set on each allocation. We now set SLAB_RECLAIM_ACCOUNT on radix tree slab creation making sure that radix tree slabs are consistently placed in the reclaimable section. Radix tree slabs will also be accounted as such. There is then no user left of set_migrateflags(). So remove it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Jeff Moyer	e92adcba26	aio: io_getevents() should return if io_destroy() is invoked This patch wakes up a thread waiting in io_getevents if another thread destroys the context. This was tested using a small program that spawns a thread to wait in io_getevents while the parent thread destroys the io context and then waits for the getevents thread to exit. Without this patch, the program hangs indefinitely. With the patch, the program exits as expected. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Christopher Smith <x@xman.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-28 08:58:17 -07:00
Steve French	39da984711	[CIFS] Fix statfs formatting Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-28 04:04:34 +00:00
Steve French	1dbbb60774	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-28 04:01:34 +00:00
Linus Torvalds	064922a805	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (40 commits) [SCSI] jazz_esp, sgiwd93, sni_53c710, sun3x_esp: fix platform driver hotplug/coldplug [SCSI] aic7xxx: add const [SCSI] aic7xxx: add static [SCSI] aic7xxx: Update _shipped files [SCSI] aic7xxx: teach aicasm to not emit unused debug code/data [SCSI] qla2xxx: Update version number to 8.02.01-k2. [SCSI] qla2xxx: Correct regression in relogin code. [SCSI] qla2xxx: Correct misc. endian and byte-ordering issues. [SCSI] qla2xxx: make qla2x00_issue_iocb_timeout() static [SCSI] qla2xxx: qla_os.c, make 2 functions static [SCSI] qla2xxx: Re-register FDMI information after a LIP. [SCSI] qla2xxx: Correct SRB usage-after-completion/free issues. [SCSI] qla2xxx: Correct ISP84XX verify-chip response handling. [SCSI] qla2xxx: Wakeup DPC thread to process any deferred-work requests. [SCSI] qla2xxx: Collapse RISC-RAM retrieval code during a firmware-dump. [SCSI] m68k: new mac_esp scsi driver [SCSI] zfcp: Add some statistics provided by the FCP adapter to the sysfs [SCSI] zfcp: Print some messages only during ERP [SCSI] zfcp: Wait for free SBAL during exchange config [SCSI] scsi_transport_fc: fc_user_scan correction ...	2008-04-27 11:25:00 -07:00
Linus Torvalds	bc84e0a160	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] sanitize locate_fd() [PATCH] sanitize unshare_files/reset_files_struct [PATCH] sanitize handling of shared descriptor tables in failing execve() [PATCH] close race in unshare_files() [PATCH] restore sane ->umount_begin() API cifs: timeout dfs automounts +little fix.	2008-04-25 19:05:55 -07:00
Steve French	d09e860cf0	[CIFS] Adds to dns_resolver checking if the server name is an IP addr and skipping upcall in this case. Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: sfrench@us.ibm.com	2008-04-26 00:22:23 +00:00
Roland Dreier	3dd7b71ca0	Export __locks_copy_lock() so modular lockd builds Commit `1a747ee0` ("locks: don't call ->copy_lock methods on return of conflicting locks") changed fs/lockd/svclock.c to call __locks_copy_lock() instead of locks_copy_lock(), but lockd can be built as a module and __locks_copy_lock() is not exported, which causes a build error ERROR: "__locks_copy_lock" [fs/lockd/lockd.ko] undefined! with CONFIG_LOCKD=m. Fix this by exporting __locks_copy_lock(). Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-25 15:49:46 -07:00
Steve French	404e86e155	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-25 20:20:10 +00:00
Linus Torvalds	7e97b28309	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: (82 commits) [MTD] m25p80: Add Support for ATMEL AT25DF641 64-Megabit SPI Flash [MTD] m25p80: add FAST_READ access support to M25Pxx [MTD] [NAND] bf5xx_nand: Avoid crash if bfin_mac is installed. [MTD] [NAND] at91_nand: control NCE signal [MTD] [NAND] AT91 hardware ECC compile fix for at91sam9263 / at91sam9260 [MTD] [NAND] Hardware ECC controller on at91sam9263 / at91sam9260 [JFFS2] Introduce dbg_readinode2 log level, use it to shut read_dnode() up [JFFS2] Fix jffs2_reserve_space() when all blocks are pending erasure. [JFFS2] Add erase_checking_list to hold blocks being marked. UBI: add a message [JFFS2] Return values of jffs2_block_check_erase error paths [MTD] Clean up AR7 partition map support [MTD] [NOR] Fix Intel CFI driver for collie flash [JFFS2] Finally remove redundant ref->__totlen field. [JFFS2] Honour TEST_TOTLEN macro in debugging code. ref->__totlen is going! [JFFS2] Add paranoia debugging for superblock counts [JFFS2] Fix free space leak with in-band cleanmarkers [JFFS2] Self-sufficient #includes in jffs2_fs_i.h: include <linux/mutex.h> [MTD] [NAND] Verify probe by retrying to checking the results match [MTD] [NAND] S3C2410 Allow ECC disable to be specified by the board ...	2008-04-25 12:25:48 -07:00
Steve French	0206e61b46	[CIFS] Fix spelling mistake Noticed by Joe Perches Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-25 18:19:40 +00:00
J. Bruce Fields	e36cd4a287	nfsd: don't allow setting ctime over v4 Presumably this is left over from earlier drafts of v4, which listed TIME_METADATA as writeable. It's read-only in rfc 3530, and shouldn't be modifiable anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-25 13:00:11 -04:00
J. Bruce Fields	1a747ee0cc	locks: don't call ->copy_lock methods on return of conflicting locks The file_lock structure is used both as a heavy-weight representation of an active lock, with pointers to reference-counted structures, etc., and as a simple container for parameters that describe a file lock. The conflicting lock returned from __posix_lock_file is an example of the latter; so don't call the filesystem or lock manager callbacks when copying to it. This also saves the need for an unnecessary locks_init_lock in the nfsv4 server. Thanks to Trond for pointing out the error. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-25 13:00:11 -04:00
Wendy Cheng	17efa372cf	lockd: unlock lockd locks held for a certain filesystem Add /proc/fs/nfsd/unlock_filesystem, which allows e.g.: shell> echo /mnt/sfs1 > /proc/fs/nfsd/unlock_filesystem so that a filesystem can be unmounted before allowing a peer nfsd to take over nfs service for the filesystem. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Cc: Lon Hohberger <lhh@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> fs/lockd/svcsubs.c \| 66 +++++++++++++++++++++++++++++++++++++++----- fs/nfsd/nfsctl.c \| 65 +++++++++++++++++++++++++++++++++++++++++++ include/linux/lockd/lockd.h \| 7 ++++ 3 files changed, 131 insertions(+), 7 deletions(-)	2008-04-25 13:00:11 -04:00
Wendy Cheng	4373ea84c8	lockd: unlock lockd locks associated with a given server ip For high-availability NFS service, we generally need to be able to drop file locks held on the exported filesystem before moving clients to a new server. Currently the only way to do that is by shutting down lockd entirely, which is often undesirable (for example, if you want to continue exporting other filesystems). This patch allows the administrator to release all locks held by clients accessing the client through a given server ip address, by echoing that address to a new file, /proc/fs/nfsd/unlock_ip, as in: shell> echo 10.1.1.2 > /proc/fs/nfsd/unlock_ip The expected sequence of events can be: 1. Tear down the IP address 2. Unexport the path 3. Write IP to /proc/fs/nfsd/unlock_ip to unlock files 4. Signal peer to begin take-over. For now we only support IPv4 addresses and NFSv2/v3 (NFSv4 locks are not affected). Also, if unmounting the filesystem is required, we assume at step 3 that clients using the given server ip are the only clients holding locks on the given filesystem; otherwise, an additional patch is required to allow revoking all locks held by lockd on a given filesystem. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Cc: Lon Hohberger <lhh@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> fs/lockd/svcsubs.c \| 66 +++++++++++++++++++++++++++++++++++++++----- fs/nfsd/nfsctl.c \| 65 +++++++++++++++++++++++++++++++++++++++++++ include/linux/lockd/lockd.h \| 7 ++++ 3 files changed, 131 insertions(+), 7 deletions(-)	2008-04-25 13:00:10 -04:00
David M. Richter	9d91cdcc0c	leases: remove unneeded variable from fcntl_setlease(). fcntl_setlease() has a struct dentry* that is used only once; this patch removes it. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-25 12:58:22 -04:00
David M. Richter	1908555767	leases: move lock allocation earlier in generic_setlease() In generic_setlease(), the struct file_lock is allocated after tests for the presence of conflicting readers/writers is done, despite the fact that the allocation might block; this patch moves the allocation earlier. A subsequent set of patches will rely on this behavior to properly serialize between a modified __break_lease() and generic_setlease(). Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-25 12:58:22 -04:00
David M. Richter	288b2fd825	leases: when unlocking, skip locking-related steps In generic_setlease(), we don't need to allocate a new struct file_lock or check for readers or writers when called with F_UNLCK. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-25 12:58:22 -04:00
David M. Richter	5fcc60c3a0	leases: fix a return-value mixup Fixes a return-value mixup from `85c59580b3` "locks: Fix potential OOPS in generic_setlease()", in which -ENOMEM replaced what had been intended to stay -EAGAIN in the variable "error". Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-25 12:58:22 -04:00
Al Viro	f8f95702f0	[PATCH] sanitize locate_fd() * 'file' argument is unused; lose it. * move setting flags from the caller (dupfd()) to locate_fd(); pass cloexec flag as new argument. Note that files_fdtable() that used to be in dupfd() isn't needed in the place in locate_fd() where the moved code ends up - we know that ->file_lock hadn't been dropped since the last time we calculated fdt because we can get there only if expand_files() returns 0 and it doesn't drop/reacquire in that case. * move getting/dropping ->file_lock into locate_fd(). Now the caller doesn't need to do anything with files_struct files anymore and we can move that inside locate_fd() as well, killing the struct files_struct argument. At that point locate_fd() is extremely similar to get_unused_fd_flags() and the next patches will merge those two. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:24:05 -04:00
Al Viro	3b1253880b	[PATCH] sanitize unshare_files/reset_files_struct * let unshare_files() give caller the displaced files_struct * don't bother with grabbing reference only to drop it in the caller if it hadn't been shared in the first place * in that form unshare_files() is trivially implemented via unshare_fd(), so we eliminate the duplicate logic in fork.c * reset_files_struct() is not just only called for current; it will break the system if somebody ever calls it for anything else (we can't modify ->files of somebody else). Lose the task_struct * argument. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:23:59 -04:00
Al Viro	fd8328be87	[PATCH] sanitize handling of shared descriptor tables in failing execve() * unshare_files() can fail; doing it after irreversible actions is wrong and de_thread() is certainly irreversible. * since we do it unconditionally anyway, we might as well do it in do_execve() and save ourselves the PITA in binfmt handlers, etc. * while we are at it, binfmt_som actually leaked files_struct on failure. As a side benefit, unshare_files(), put_files_struct() and reset_files_struct() become unexported. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:23:53 -04:00
Al Viro	42faad9965	[PATCH] restore sane ->umount_begin() API Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:23:25 -04:00
Igor Mammedov	78d31a3a87	cifs: timeout dfs automounts +little fix. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:15:26 -04:00
Steve French	47df179317	[CIFS] Update cifs version number Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-25 02:01:44 +00:00
Linus Torvalds	57675e6e75	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Fix typo in previous commit [CIFS] Fix define for new proxy cap to match documentation [CIFS] Fix UNC path prefix on QueryUnixPathInfo to have correct slash [CIFS] Reserve new proxy cap for WAFS [CIFS] Add various missing flags and defintions [CIFS] make cifs_dfs_automount_list_static [CIFS] Fix oops when slow oplock process races with unmount [CIFS] Fix acl length when very short ACL being modified by chmod [CIFS] Fix looping on reconnect to Samba when unexpected tree connect fail on reconnect [CIFS] minor update to change log	2008-04-24 13:47:31 -07:00
Linus Torvalds	563307b2fa	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits) SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request make nfs_automount_list static NFS: remove duplicate flags assignment from nfs_validate_mount_data NFS - fix potential NULL pointer dereference v2 SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use SUNRPC: Fix a race in gss_refresh_upcall() SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests SUNRPC: Remove the unused export of xprt_force_disconnect SUNRPC: remove XS_SENDMSG_RETRY SUNRPC: Protect creds against early garbage collection NFSv4: Attempt to use machine credentials in SETCLIENTID calls NFSv4: Reintroduce machine creds NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid() nfs: fix printout of multiword bitfields nfs: return negative error value from nfs{,4}_stat_to_errno NLM/lockd: Ensure client locking calls use correct credentials NFS: Remove the buggy lock-if-signalled case from do_setlk() NLM/lockd: Fix a race when cancelling a blocking lock NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel ...	2008-04-24 11:46:16 -07:00
Trond Myklebust	233607dbbc	Merge branch 'devel'	2008-04-24 14:01:02 -04:00
Steve French	a7f796a60b	[CIFS] Fix typo in previous commit Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-24 16:39:07 +00:00
Steve French	ee4987ab5c	[CIFS] Fix define for new proxy cap to match documentation The transport encryption capability and new SetFSInfo level were missing, and the new proxy capability (which Samba server is implementing) and proxy setfsinfo needed to be moved down to not collide with Samba's transport encryption capability. CC: Jeremy Allison <jra@samba.org> CC: Sam Liddicott <sam@lidicott.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-24 16:31:12 +00:00
Steve French	36d99df2fb	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-24 15:26:50 +00:00
Jeff Layton	ca456252db	knfsd: clear both setuid and setgid whenever a chown is done Currently, knfsd only clears the setuid bit if the owner of a file is changed on a SETATTR call, and only clears the setgid bit if the group is changed. POSIX says this in the spec for chown(): "If the specified file is a regular file, one or more of the S_IXUSR, S_IXGRP, or S_IXOTH bits of the file mode are set, and the process does not have appropriate privileges, the set-user-ID (S_ISUID) and set-group-ID (S_ISGID) bits of the file mode shall be cleared upon successful return from chown()." If I'm reading this correctly, then knfsd is doing this wrong. It should be clearing both the setuid and setgid bit on any SETATTR that changes the uid or gid. This wasn't really as noticeable before, but now that the ATTR_KILL_S*ID bits are a no-op for the NFS client, it's more evident. This patch corrects the nfsd_setattr logic so that this occurs. It also does a bit of cleanup to the function. There is also one small behavioral change. If a SETATTR call comes in that changes the uid/gid and the mode, then we now only clear the setgid bit if the group execute bit isn't set. The setgid bit without a group execute bit signifies mandatory locking and we likely don't want to clear the bit in that case. Since there is no call in POSIX that should generate a SETATTR call like this, then this should rarely happen, but it's worth noting. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Jeff Layton	dee3209d99	knfsd: get rid of imode variable in nfsd_setattr ...it's not really needed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Jeff Layton	f97c650dda	NLM: don't let lockd exit on unexpected svc_recv errors (try #2 ) When svc_recv returns an unexpected error, lockd will print a warning and exit. This is problematic for several reasons. In particular, it will cause the reference counts for the thread to be wrong, and can lead to a potential BUG() call. Rather than exiting on error from svc_recv, have the thread do a 1s sleep and then retry the loop. This is unlikely to cause any harm, and if the error turns out to be something temporary then it may be able to recover. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Jeff Layton	06e02d66fa	NFS: don't let nfs_callback_svc exit on unexpected svc_recv errors (try #2 ) When svc_recv returns an unexpected error, nfs_callback_svc will print a warning and exit. This is problematic for several reasons. In particular, it will cause the reference counts for the thread to be wrong, and no new thread will be started until all nfs4 mounts are unmounted. Rather than exiting on error from svc_recv, have the thread do a 1s sleep and then retry the loop. This is unlikely to cause any harm, and if the error turns out to be something temporary then it may be able to recover. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:42 -04:00
Olga Kornievskaia	ff7d9756b5	nfsd: use static memory for callback program and stats There's no need to dynamically allocate this memory, and doing so may create the possibility of races on shutdown of the rpc client. (We've witnessed it only after adding rpcsec_gss support to the server, after which the rpc code can send destroy calls that expect to still be able to access the rpc_stats structure after it has been destroyed.) Such races are in theory possible if the module containing this "static" memory is removed very quickly after an rpc client is destroyed, but we haven't seen that happen. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:42 -04:00
J. Bruce Fields	e1ba1ab76e	nfsd: fix comment Obvious comment nit. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:42 -04:00
J. Bruce Fields	3c61eecb60	lockd: Fix stale nlmsvc_unlink_block comment As of `5996a298da` ("NLM: don't unlock on cancel requests") we no longer unlock in this case, so the comment is no longer accurate. Thanks to Stuart Friedberg for pointing out the inconsistency. Cc: Stuart Friedberg <sfriedberg@hp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:42 -04:00
Chuck Lever	1a448fdb3c	NFSD: Remove NFSv4 dependency on NFSv3 Clean up: Because NFSD_V4 "depends on" NFSD_V3, it appears as a child of the NFSD_V3 menu entry, and is not visible if NFSD_V3 is unselected. Replace the dependency on NFSD_V3 with a "select NFSD_V3". This makes NFSD_V4 look and work just like NFS_V3, while ensuring that NFSD_V3 is enabled if NFSD_V4 is. Sam Ravnborg adds: "This use of select is questionable. In general it is bad to select a symbol with dependencies. In this case the dependencies of NFSD_V3 are duplicated for NFSD_V4 so we will not see erratic configurations but do you remember to update NFSD_V4 when you add a depends on NFSD_V3? But I see no other clean way to do it right now." Later he said: "My comment was more to say we have things to address in kconfig. This is abuse in the acceptable range." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:41 -04:00
Chuck Lever	3329ba0523	SUNRPC: Remove PROC_FS dependency Recently, commit `440bcc59` added a reverse dependency to fs/Kconfig to ensure that PROC_FS was enabled if SUNRPC_GSS was enabled. Apparently this isn't necessary because the auth_gss components under net/sunrpc will build correctly even if PROC_FS is disabled, though RPCSEC_GSS will not work without /proc. It also violates the guideline in Documentation/kbuild/kconfig-language.txt that states "In general use select only for non-visible symbols (no prompts anywhere) and for symbols with no dependencies." To address these issues, remove the dependency. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:41 -04:00
Chuck Lever	6ffd4be633	NFSD: Use "depends on" for PROC_FS dependency Recently, commit `440bcc59` added a reverse dependency to fs/Kconfig to ensure that PROC_FS was enabled if NFSD_V4 was enabled. There is a guideline in Documentation/kbuild/kconfig-language.txt that states "In general use select only for non-visible symbols (no prompts anywhere) and for symbols with no dependencies." A quick grep around other Kconfig files reveals that no entry currently uses "select PROC_FS" -- every one uses "depends on". Thus CONFIG_NFSD_V4 should use "depends on PROC_FS" as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:41 -04:00
J. Bruce Fields	03550fac06	nfsd: move most of fh_verify to separate function Move the code that actually parses the filehandle and looks up the dentry and export to a separate function. This simplifies the reference counting a little and moves fh_verify() a little closer to the kernel ideal of small, minimally-indented functions. Clean up a few other minor style sins along the way. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de>	2008-04-23 16:13:41 -04:00
Felix Blyakher	9167f501c6	nfsd: initialize lease type in nfs4_open_delegation() While lease is correctly checked by supplying the type argument to vfs_setlease(), it's stored with fl_type uninitialized. This breaks the logic when checking the type of the lease. The fix is to initialize fl_type. The old code still happened to function correctly since F_RDLCK is zero, and we only implement read delegations currently (not write delegations). But that's no excuse for not fixing this. Signed-off-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Jeff Layton	a277e33cbe	NFS: convert nfs4 callback thread to kthread API There's a general push to convert kernel threads to use the (much cleaner) kthread API. This patch converts the NFSv4 callback kernel thread to the kthread API. In addition to being generally cleaner this also removes the dependency on signals when shutting down the thread. Note that this patch depends on the recent patches to svc_recv() to make it check kthread_should_stop() periodically. Those patches are in Bruce's tree at the moment and are slated for 2.6.26 along with the lockd conversion, so this conversion is probably also appropriate for 2.6.26. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Harvey Harrison	3ba1514815	nfsd: fix sparse warning in vfs.c fs/nfsd/vfs.c:991:27: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
Harvey Harrison	a254b246ee	nfsd: fix sparse warnings Add extern to nfsd/nfsd.h fs/nfsd/nfssvc.c:146:5: warning: symbol 'nfsd_nrthreads' was not declared. Should it be static? fs/nfsd/nfssvc.c:261:5: warning: symbol 'nfsd_nrpools' was not declared. Should it be static? fs/nfsd/nfssvc.c:269:5: warning: symbol 'nfsd_get_nrthreads' was not declared. Should it be static? fs/nfsd/nfssvc.c:281:5: warning: symbol 'nfsd_set_nrthreads' was not declared. Should it be static? fs/nfsd/export.c:1534:23: warning: symbol 'nfs_exports_op' was not declared. Should it be static? Add include of auth.h fs/nfsd/auth.c:27:5: warning: symbol 'nfsd_setuser' was not declared. Should it be static? Make static, move forward declaration closer to where it's needed. fs/nfsd/nfs4state.c:1877:1: warning: symbol 'laundromat_main' was not declared. Should it be static? Make static, forward declaration was already marked static. fs/nfsd/nfs4idmap.c:206:1: warning: symbol 'idtoname_parse' was not declared. Should it be static? fs/nfsd/vfs.c:1156:1: warning: symbol 'nfsd_create_setattr' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
J. Bruce Fields	d842120212	lockd: convert nsm_mutex to a spinlock There's no reason for a mutex here, except to allow an allocation under the lock, which we can avoid with the usual trick of preallocating memory for the new object and freeing it if it turns out to be unnecessary. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
J. Bruce Fields	a95e56e72c	lockd: clean up __nsm_find() Use list_for_each_entry(). Also, in keeping with kernel style, make the normal case (kzalloc succeeds) unindented and handle the abnormal case with a goto. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
J. Bruce Fields	164f98adbb	lockd: fix race in nlm_release() The sm_count is decremented to zero but left on the nsm_handles list. So in the space between decrementing sm_count and acquiring nsm_mutex, it is possible for another task to find this nsm_handle, increment the use count and then enter nsm_release itself. Thus there's nothing to prevent the nsm being freed before we acquire nsm_mutex here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
Harvey Harrison	93245d11fc	lockd: fix sparse warning in svcshare.c fs/lockd/svcshare.c:74:50: warning: Using plain integer as NULL pointer Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
Adrian Bunk	f2b0dee2ec	make nfsd_create_setattr() static This patch makes the needlessly global nfsd_create_setattr() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
Chuck Lever	6aaa67b5f3	NFSD: Remove redundant "select" clauses in fs/Kconfig As far as I can tell, selecting the CRYPTO and CRYPTO_MD5 entries under CONFIG_NFSD is redundant, since CONFIG_NFSD_V4 already selects RPCSEC_GSS_KRB5, which selects these entries. Testing with "make menuconfig" shows that the entries under CRYPTO still properly reflect "Y" or "M" based on the setting of CONFIG_NFSD after this change is applied. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
Chuck Lever	78dd0992a3	NFSD: Move "select NFSD_V2_ACL if NFSD_V3_ACL" Clean up: since NFSD_V2_ACL is a boolean, it can be selected safely under the NFSD_V3_ACL entry (also a boolean). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
Chuck Lever	892069552e	NFSD: Move "select FS_POSIX_ACL if NFSD_V4" Clean up: since FS_POSIX_ACL is a non-visible boolean entry, it can be selected safely under the NFSD_V4 entry (also a boolean). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
Chuck Lever	d24455b5ff	NFSD: Update help text for CONFIG_NFSD Clean up: refresh the help text for Kconfig items related to the NFS server. Remove obsolete URLs, and make the language consistent among the options. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
Chuck Lever	5ea0dd61f2	NFSD: Remove NFSD_TCP kernel build option Likewise, distros usually leave CONFIG_NFSD_TCP enabled. TCP support in the Linux NFS server is stable enough that we can leave it on always. CONFIG_NFSD_TCP adds about 10 lines of code, and defaults to "Y" anyway. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
J. Bruce Fields	c0ce6ec87c	nfsd: clarify readdir/mountpoint-crossing code The code here is difficult to understand; attempt to clarify somewhat by pulling out one of the more mystifying conditionals into a separate function. While we're here, also add lease_time to the list of attributes that we don't really need to cross a mountpoint to fetch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Peter Staubach <staubach@redhat.com>	2008-04-23 16:13:38 -04:00
J. Bruce Fields	6a85fa3add	nfsd4: kill unnecessary check in preprocess_stateid_op This condition is always true. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:37 -04:00
J. Bruce Fields	0836f58725	nfsd4: simplify stateid sequencing checks Pull this common code into a separate function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:37 -04:00
J. Bruce Fields	f3362737be	nfsd4: remove unnecessary CHECK_FH check in preprocess_seqid_op Every caller sets this flag, so it's meaningless. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:37 -04:00
J. Bruce Fields	065f30ec14	nfs: remove unnecessary NFS_NEED_* defines Thanks to Robert Day for pointing out that these two defines are unused. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Trond Myklebust <trond@netapp.com> Cc: Neil Brown <neilb@suse.de> Cc: "Robert P. J. Day" <rpjday@crashcourse.ca>	2008-04-23 16:13:37 -04:00
Aurélien Charbon	f15364bd4c	IPv6 support for NFS server export caches This adds IPv6 support to the interfaces that are used to express nfsd exports. All addresses are stored internally as IPv6; backwards compatibility is maintained using mapped addresses. Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji for comments Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net> Cc: Neil Brown <neilb@suse.de> Cc: Brian Haley <brian.haley@hp.com> Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:36 -04:00
Jeff Layton	d751a7cd06	NLM: Convert lockd to use kthreads Have lockd_up start lockd using kthread_run. With this change, lockd_down now blocks until lockd actually exits, so there's no longer need for the waitqueue code at the end of lockd_down. This also means that only one lockd can be running at a time which simplifies the code within lockd's main loop. This also adds a check for kthread_should_stop in the main loop of nlmsvc_retry_blocked and after that function returns. There's no sense continuing to retry blocks if lockd is coming down anyway. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:36 -04:00
NeilBrown	1447d25eb3	knfsd: Remove NLM_HOST_MAX and associated logic. Lockd caches information about hosts that have recently held locks to expedite the taking of further locks. It periodically discards this information for hosts that have not been used for a few minutes. lockd currently has a value NLM_HOST_MAX, and changes the 'garbage collection' behaviour when the number of hosts exceeds this threshold. However its behaviour is strange, and likely not what was intended. When the number of hosts exceeds the max, it scans less often (every 2 minutes vs every minute) and allows unused host information to remain around longer (5 minutes instead of 2). Having this limit is of dubious value anyway, and we have not suffered from the code not getting the limit right, so remove the limit altogether. We go with the larger values (discard 5 minute old hosts every 2 minutes) as they are probably safer. Maybe the periodic garbage collection should be replaced with a 'shrinker' handler so we just respond to memory pressure.... Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:35 -04:00
David Woodhouse	2c61cb250c	[JFFS2] Introduce dbg_readinode2 log level, use it to shut read_dnode() up We haven't seen bugs in this for a while now, since the rewrite. No need to be _quite_ so verbose... Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 16:43:15 +01:00
David Woodhouse	422b120238	[JFFS2] Fix jffs2_reserve_space() when all blocks are pending erasure. When _all_ the blocks were on the erase_pending_list, we couldn't find a block to GC from but there was no _actually_ free space, and jffs2_reserve_space() would get a little unhappy. Handle this case by returning -EAGAIN from jffs2_garbage_collect_pass(). There are two callers of that function -- jffs2_flush_wbuf_gc(), which will interpret it as an error and flush the writebuffer by other means, and jffs2_reserve_space(), which we modify to respond to -EAGAIN with an immediate call to jffs2_erase_pending_blocks() and another run round the loop. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 16:01:37 +01:00
David Woodhouse	e2bc322bf0	[JFFS2] Add erase_checking_list to hold blocks being marked. Just to keep the debug code happy when it's adding all the blocks up. Otherwise, they disappear for a while while the locks are dropped to check them and write the cleanmarker. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 14:15:24 +01:00
Anders Grafström	8a0f572397	[JFFS2] Return values of jffs2_block_check_erase error paths It looks like the error paths in jffs2_block_check_erase() have wrong return values. A block that failed to be erased never gets marked as bad. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 10:06:46 +01:00
Miklos Szeredi	97e7e0f71d	[patch 7/7] vfs: mountinfo: show dominating group id Show peer group ID of nearest dominating group that has intersection with the mount's namespace. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:05:09 -04:00
Ram Pai	2d4d4864ac	[patch 6/7] vfs: mountinfo: add /proc/<pid>/mountinfo [mszeredi@suse.cz] rewrite and split big patch into manageable chunks /proc/mounts in its current form lacks important information: - propagation state - root of mount for bind mounts - the st_dev value used within the filesystem - identifier for each mount and its parent It also suffers from the following problems: - not easily extendable - ambiguity of mountpoints within a chrooted environment - doesn't distinguish between filesystem dependent and independent options - doesn't distinguish between per mount and per super block options This patch introduces /proc/<pid>/mountinfo which attempts to address all these deficiencies. Code shared between /proc/<pid>/mounts and /proc/<pid>/mountinfo is extracted into separate functions. Thanks to Al Viro for the help in getting the design right. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:05:03 -04:00
Miklos Szeredi	a1a2c409b6	[patch 5/7] vfs: mountinfo: allow using process root Allow /proc/<pid>/mountinfo to use the root of <pid> to calculate mountpoints. - move definition of 'struct proc_mounts' to <linux/mnt_namespace.h> - add the process's namespace and root to this structure - pass a pointer to 'struct proc_mounts' into seq_operations In addition the following cleanups are made: - use a common open function for /proc/<pid>/{mounts,mountstat} - surround namespace.c part of these proc files with #ifdef CONFIG_PROC_FS - make the seq_operations structures const Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:57 -04:00
Miklos Szeredi	719f5d7f0b	[patch 4/7] vfs: mountinfo: add mount peer group ID Add a unique ID to each peer group using the IDR infrastructure. The identifiers are reused after the peer group dissolves. The IDR structures are protected by holding namespace_sem for write while allocating or deallocating IDs. IDs are allocated when a previously unshared vfsmount becomes the first member of a peer group. When a new member is added to an existing group, the ID is copied from one of the old members. IDs are freed when the last member of a peer group is unshared. Setting the MNT_SHARED flag on members of a subtree is done as a separate step, after all the IDs have been allocated. This way an allocation failure can be cleaned up easily, without affecting the propagation state. Based on design sketch by Al Viro. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:51 -04:00
Miklos Szeredi	73cd49ecdd	[patch 3/7] vfs: mountinfo: add mount ID Add a unique ID to each vfsmount using the IDR infrastructure. The identifiers are reused after the vfsmount is freed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:45 -04:00
Miklos Szeredi	9d1bc60138	[patch 2/7] vfs: mountinfo: add seq_file_root() Add a new function: seq_file_root() This is similar to seq_path(), but calculates the path relative to the given root, instead of current->fs->root. If the path was unreachable from root, then modify the root parameter to reflect this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:38 -04:00
Ram Pai	6092d04818	[patch 1/7] vfs: mountinfo: add dentry_path() [mszeredi@suse.cz] split big patch into manageable chunks Add the following functions: dentry_path() seq_dentry() These are similar to d_path() and seq_path(). But instead of calculating the path within a mount namespace, they calculate the path from the root of the filesystem to a given dentry, ignoring mounts completely. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:32 -04:00
Al Viro	934b25c597	[PATCH] remove unused label in xattr.c (noise from ro-bind) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-23 00:04:04 -04:00
Linus Torvalds	94bc891b00	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] get rid of __exit_files(), __exit_fs() and __put_fs_struct() [PATCH] proc_readfd_common() race fix [PATCH] double-free of inode on alloc_file() failure exit in create_write_pipe() [PATCH] teach seq_file to discard entries [PATCH] umount_tree() will unhash everything itself [PATCH] get rid of more nameidata passing in namespace.c [PATCH] switch a bunch of LSM hooks from nameidata to path [PATCH] lock exclusively in collect_mounts() and drop_collected_mounts() [PATCH] move a bunch of declarations to fs/internal.h	2008-04-22 18:28:34 -07:00
David Woodhouse	19e56ceae7	[JFFS2] Finally remove redundant ref->__totlen field. Haven't had any complaints about it recently, despite having the test code enabled to verify that the calculated length is correct. Kill it off, just by #undef TEST_TOTLEN for now; removing it for real can come a little later. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 01:26:12 +01:00
David Woodhouse	27e6b8e388	[JFFS2] Honour TEST_TOTLEN macro in debugging code. ref->__totlen is going! Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 01:25:33 +01:00
David Woodhouse	85a62db624	[JFFS2] Add paranoia debugging for superblock counts The problem fixed in commit `014b164e13` (space leak with in-band cleanmarkers) would have been caught a lot quicker if our paranoid debugging mode had included adding up the size counts from all the eraseblocks and comparing the totals with the counts in the superblock. Add that. Make jffs2_mark_erased_block() file the newly-erased block on the free_list before calling the debug function, to make it happy. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-23 01:24:17 +01:00
Al Viro	9b4f526cdc	[PATCH] proc_readfd_common() race fix Since we drop the rcu_read_lock inside the loop, we can't assume that files->fdt will remain unchanged (and not freed) between iterations. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-22 19:55:03 -04:00
Al Viro	ed15243717	[PATCH] double-free of inode on alloc_file() failure exit in create_write_pipe() Duh... Fortunately, the bug is quite recent (post-2.6.25) and, embarrassingly, mine ;-/ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-22 19:54:57 -04:00
David Woodhouse	014b164e13	[JFFS2] Fix free space leak with in-band cleanmarkers We were accounting for the cleanmarker by calling jffs2_link_node_ref() (without locking!), which adjusted both superblock and per-eraseblock accounting, subtracting the size of the cleanmarker from {jeb,c}->free_size and adding it to {jeb,c}->used_size. But only _then_ were we adding the size of the newly-erased block back to the superblock counts, and we were adding each of jeb->{free,used}_size to the corresponding superblock counts. Thus, the size of the cleanmarker was effectively subtracted from the superblock's free_size _twice_. Fix this, by always adding a full eraseblock size to c->free_size when we've erased a block. And call jffs2_link_node_ref() under the proper lock, while we're at it. Thanks to Alexander Yurchenko and/or Damir Shayhutdinov for (almost) pinpointing the problem. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 23:54:38 +01:00
David Woodhouse	cf9d1e428c	[JFFS2] Self-sufficient #includes in jffs2_fs_i.h: include <linux/mutex.h> ... instead of <linux/semaphore.h> which we don't need any more anyway. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 23:53:26 +01:00
David Sterba	16abef0e9e	fs: use loff_t type instead of long long Use offset type consistently. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-22 15:17:11 -07:00
Linus Torvalds	03b883840c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: linux/{dlm,dlm_device}.h: cleanup for userspace dlm: common max length definitions dlm: move plock code from gfs2 dlm: recover nodes that are removed and re-added dlm: save master info after failed no-queue request dlm: make dlm_print_rsb() static dlm: match signedness between dlm_config_info and cluster_set	2008-04-22 13:44:23 -07:00
Linus Torvalds	62429f4340	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: (41 commits) udf: use crc_itu_t from lib instead of udf_crc udf: Fix compilation warnings when UDF debug is on udf: Fix bug in VAT mapping code udf: Add read-only support for 2.50 UDF media udf: Fix handling of multisession media udf: Mount filesystem read-only if it has pseudooverwrite partition udf: Handle VAT packed inside inode properly udf: Allow loading of VAT inode udf: Fix detection of VAT version udf: Silence warning about accesses beyond end of device udf: Improve anchor block detection udf: Cleanup anchor block detection. udf: Move processing of virtual partitions udf: Move filling of partition descriptor info into a separate function udf: Improve error recovery on mount udf: Cleanup volume descriptor sequence processing udf: fix anchor point detection udf: Remove declarations of arrays of size UDF_NAME_LEN (256 bytes) udf: Remove checking of existence of filename in udf_add_entry() udf: Mark udf_process_sequence() as noinline ...	2008-04-22 13:40:47 -07:00
James Bottomley	0f4238958d	[SCSI] sysfs: make group is_valid return a mode_t We have a problem in scsi_transport_spi in that we need to customise not only the visibility of the attributes, but also their mode. Fix this by making the is_visible() callback return a mode, with 0 indicating is not visible. Also add a sysfs_update_group() API to allow us to change either the visibility or mode of the files at any time on the fly. Acked-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-04-22 15:16:31 -05:00
David Woodhouse	ced2207036	[JFFS2] semaphore->mutex conversion Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 15:13:40 +01:00
michael	cca1584171	[JFFS2] add write verify on dataflash. Add the write verification buffer to the dataflash. The mtd_dataflash has the CONFIG_DATAFLASH_WRITE_VERIFY so is better a change to Kconfig. Signed-off-by: Michael Trimarchi <trimarchimichael@yahoo.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 12:35:50 +01:00
David Woodhouse	25dc30b4cd	[JFFS2] fix sparse warnings in gc.c fs/jffs2/gc.c:1147:29: warning: symbol 'jeb' shadows an earlier one fs/jffs2/gc.c:1084:89: originally declared here fs/jffs2/gc.c:1197:29: warning: symbol 'jeb' shadows an earlier one fs/jffs2/gc.c:1084:89: originally declared here Rename the unused 'jeb' argument to avoid this. We could potentially remove the argument, but GCC should be doing that anyway. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 12:35:47 +01:00
Harvey Harrison	bf66737ca8	[JFFS2] fix sparse warning in write.c fs/jffs2/write.c:585:28: warning: symbol 'fd' shadows an earlier one fs/jffs2/write.c:536:27: originally declared here No need to redeclare fd; use the original one. After this point, fd is always reassigned before it is used again. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 12:35:46 +01:00
David Woodhouse	8ca646abb4	[JFFS2] Fix sparse warning in nodemgmt.c fs/jffs2/nodemgmt.c:60:8: warning: symbol 'ret' shadows an earlier one fs/jffs2/nodemgmt.c:45:6: originally declared here (reported by Harvey Harrison) Just remove the offending declaration of 'int ret' and use the earlier one. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 12:35:44 +01:00
Harvey Harrison	f876a59dae	[JFFS2] include function prototype for jffs2_ioctl fs/jffs2/ioctl.c:14:5: warning: symbol 'jffs2_ioctl' was not declared. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-04-22 12:35:42 +01:00
David Woodhouse	f838bad1b3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-22 12:34:25 +01:00
Al Viro	521b5d0c40	[PATCH] teach seq_file to discard entries Allow ->show() return SEQ_SKIP; that will discard all output from that element and move on. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:14:02 -04:00
Al Viro	4e1b36fb48	[PATCH] umount_tree() will unhash everything itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:13:54 -04:00
Al Viro	8c3ee42e80	[PATCH] get rid of more nameidata passing in namespace.c Further reduction of stack footprint (sys_pivot_root()); lose useless BKL in there, while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:13:47 -04:00
Al Viro	b5266eb4c8	[PATCH] switch a bunch of LSM hooks from nameidata to path Namely, ones from namespace.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:13:23 -04:00
Al Viro	1a60a28077	[PATCH] lock exclusively in collect_mounts() and drop_collected_mounts() Taking namespace_sem shared there isn't worth the trouble, especially with vfsmount ID allocation about to be added. That way we know that umount_tree(), copy_tree() and clone_mnt() are _always_ serialized by namespace_sem. umount_tree() still needs vfsmount_lock (it manipulates hash chains, among other things), but that's a separate story. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:11:09 -04:00
Al Viro	6d59e7f582	[PATCH] move a bunch of declarations to fs/internal.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-21 23:11:01 -04:00

9134 Commits
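
The accounting error described in commit 014b164e13 ([JFFS2] Fix free space leak with in-band cleanmarkers) can be checked with a small toy model. This is only a sketch of the arithmetic in that commit message, not the kernel code: the Counts class, the helper names, and the sizes SECTOR_SIZE and CLEANMARKER_SIZE are all hypothetical, and link_node_ref() is a simplified stand-in for jffs2_link_node_ref().

```python
from dataclasses import dataclass

SECTOR_SIZE = 128 * 1024   # hypothetical eraseblock size, in bytes
CLEANMARKER_SIZE = 12      # hypothetical cleanmarker size, in bytes


@dataclass
class Counts:
    """Toy stand-in for the free/used counters of a superblock or eraseblock."""
    free_size: int = 0
    used_size: int = 0


def link_node_ref(c: Counts, jeb: Counts, size: int) -> None:
    """Simplified model of jffs2_link_node_ref(): move `size` bytes from
    free to used in both the eraseblock (jeb) and the superblock (c)."""
    for counts in (jeb, c):
        counts.free_size -= size
        counts.used_size += size


def mark_erased_buggy(c: Counts) -> None:
    """Old order: account for the cleanmarker first, then add the
    eraseblock's own counters to the superblock counters."""
    jeb = Counts(free_size=SECTOR_SIZE)
    link_node_ref(c, jeb, CLEANMARKER_SIZE)   # c.free_size already reduced here
    c.free_size += jeb.free_size              # jeb.free_size is reduced as well
    c.used_size += jeb.used_size


def mark_erased_fixed(c: Counts) -> None:
    """Fixed order: add a full eraseblock to the superblock's free_size,
    then account for the cleanmarker exactly once."""
    jeb = Counts(free_size=SECTOR_SIZE)
    c.free_size += SECTOR_SIZE
    link_node_ref(c, jeb, CLEANMARKER_SIZE)


buggy = Counts()
mark_erased_buggy(buggy)
fixed = Counts()
mark_erased_fixed(fixed)

# Bytes of free space lost per erased block in the old ordering.
leak = fixed.free_size - buggy.free_size
print(f"buggy free_size={buggy.free_size} fixed free_size={fixed.free_size} leak={leak}")
```

In this model the old ordering loses exactly one cleanmarker's worth of free space for every erased block, which matches the "subtracted twice" description in the commit message.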