linux

Author	SHA1	Message	Date
Al Viro	9dec3c4d30	[PATCH] dup_fd() part 2 use alloc_fdtable() instead of expand_files(), get rid of pointless grabbing newf->file_lock, kill magic in copy_fdtable() that used to be there only to skip copying when called from dup_fd(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-16 17:22:33 -04:00
Al Viro	02afc6267f	[PATCH] dup_fd() fixes, part 1 Move the sucker to fs/file.c in preparation to the rest Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-16 17:22:26 -04:00
Al Viro	f52111b154	[PATCH] take init_files to fs/file.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-16 17:22:20 -04:00
Harvey Harrison	9a6ab769bd	byteorder: don't directly include linux/byteorder/generic.h Use asm/byteorder.h instead. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-16 12:01:45 -07:00
Steve French	a1fe78f16e	[CIFS] Add missing defines for DFS Also has minor cleanup of previous patch CC: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-16 18:48:38 +00:00
Igor Mammedov	fec4585fd7	CIFSGetDFSRefer cleanup + dfs_referral_level_3 fixed to conform REFERRAL_V3 the MS-DFSC spec. Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-16 18:41:43 +00:00
Adrian Bunk	1d2e88e73e	nfs: make nfs4_drop_state_owner() static nfs4_drop_state_owner() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:31 -07:00
Jan Blunck	31f31db1a1	nfs: path_{get,put}() cleanups Here are some more places where path_{get,put}() can be used instead of dput()/mntput() pair. Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:30 -07:00
Harvey Harrison	3110ff8048	nfs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:29 -07:00
Eric Paris	46c8ac7425	nfs/lsm: make NFSv4 set LSM mount options NFSv3 get_sb operations call into the LSM layer to set security options passed from userspace. NFSv4 hooks were not originally added since it was reasonably late in the merge window and NFSv3 was the only thing that had regressed (v4 has never supported any LSM options) This patch makes NFSv4 call into the LSM to set security options rather than just blindly dropping them with no notice to the user as happens today. This patch was tested in a simple NFSv4 environment with the context= option and appeared to work as expected. Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:27 -07:00
Trond Myklebust	3a6258e1fb	NFSv4: Check the return value of decode_compound_hdr_arg() If decode_compound_hdr_arg() returns a resource error, then we cannot proceed to process the callback. Return a 'GARBAGE_ARGS' rpc-level error to the caller instead. If, however, the minor version field is incorrect, then we need to propagate the resulting NFS4ERR_MINOR_VERS_MISMATCH error back as the compound status field (setting the nops field to 0). Finally, if encode_compound_hdr_res() returns an error, we need to return an RPC_SYSTEM_ERR to the caller. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:26 -07:00
Fred Isaman	38def50fab	nfs: fix race in nfs_dirty_request When called from nfs_flush_incompatible, the req is not locked, so req->wb_page might be set to NULL before it is used by PageWriteback. Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:23 -07:00
Trond Myklebust	b0b539739f	NFS: Ensure that 'noac' and/or 'actimeo=0' turn off attribute caching Both the 'noac' and 'actimeo=0' mount options should ensure that attributes are not cached, however a bug in nfs_attribute_timeout() means that currently, the attributes may in fact get cached for up to one jiffy. This has been seen to cause corruption in some applications. The reason for the bug is that the time_in_range() test returns 'true' as long as the current time lies between nfsi->read_cache_jiffies and nfsi->read_cache_jiffies + nfsi->attrtimeo. In other words, if jiffies equals nfsi->read_cache_jiffies, then we still cache the attribute data. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-16 09:43:21 -07:00
Igor Mammedov	de2db8d790	Fixed DFS code to work with new 'build_path_from_dentry', that returns full path if share in the dfs, now. Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-16 15:23:28 +00:00
Mingming Cao	02c471cb17	jbd2: update transaction t_state to T_COMMIT fix Updating the current transaction's t_state is protected by j_state_lock. We need to do the same when updating the t_state to T_COMMIT. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-05-15 14:46:17 -04:00
Aneesh Kumar K.V	519deca049	ext4: Retry block allocation if new blocks are allocated from system zone. If the block allocator gets blocks out of system zone ext4 calls ext4_error. But if the file system is mounted with errors=continue retry block allocation. We need to mark the system zone blocks as in use to make sure retry don't pick them again System zone is the block range mapping block bitmap, inode bitmap and inode table. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-15 14:43:20 -04:00
Steve French	95b1cb90b7	[CIFS] enable parsing for transport encryption mount parm Samba now supports transport encryption on particular exports (mounted tree ids can be encrypted for servers which support the unix extensions). This adds parsing support to cifs mount option parsing for this. Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-15 16:44:38 +00:00
Steve French	c2cf07d591	[CIFS] Finishup DFS code Fixup GetDFSRefer to prepare for cleanup of SMB response processing Fix build warning in link.c Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-15 06:20:02 +00:00
Steve French	f9ddcca4cf	[CIFS] BKL-removal: convert CIFS over to unlocked_ioctl cifs_ioctl doesn't seem to need the BKL for anything, so convert it over to use unlocked_ioctl. Signed-off-by: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-15 05:51:55 +00:00
Steve French	c32916374b	[CIFS] suppress duplicate warning fs/cifs/dir.c: In function 'cifs_ci_compare': fs/cifs/dir.c:582: warning: passing argument 1 of 'memcpy' discards qualifiers from pointer target type Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-15 05:41:54 +00:00
Stephen Hemminger	0599ad53fe	sysfs: remove error messages for -EEXIST case It is possible that the entry in sysfs already exists, one case of this is when a network device is renamed to bonding_masters. Anyway, in this case the proper error path is for device_rename to return an error code, not to generate bogus backtrace and errors. Also, to avoid possible races, the create link should be done before the remove link. This makes a device rename atomic operation like other renames. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-14 22:34:16 -07:00
Linus Torvalds	8f40f672e6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix error path during early mount 9p: make cryptic unknown error from server less scary 9p: fix flags length in net 9p: Correct fidpool creation failure in p9_client_create 9p: use struct mutex instead of struct semaphore 9p: propagate parse_option changes to client and transports fs/9p/v9fs.c (v9fs_parse_options): Handle kstrdup and match_strdup failure. 9p: Documentation updates add match_strlcpy() us it to make v9fs make uname and remotename parsing more robust	2008-05-14 19:30:51 -07:00
Tiger Yang	7e01c8e542	ext3/4: fix uninitialized bs in ext3/4_xattr_set_handle() This fix the uninitialized bs when we try to replace a xattr entry in ibody with the new value which require more than free space. This situation only happens we format ext3/4 with inode size more than 128 and we have put xattr entries both in ibody and block. The consequences about this bug is we will lost the xattr block which pointed by i_file_acl with all xattr entires in it. We will alloc a new xattr block and put that large value entry in it. The old xattr block will become orphan block. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Cc: <linux-ext4@vger.kernel.org> Cc: Andreas Gruenbacher <agruen@suse.de> Acked-by: Andreas Dilger <adilger@sun.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-14 19:11:14 -07:00
Mingming Cao	772279c5f1	jbd: need to hold j_state_lock to updates to transaction t_state to T_COMMIT Updating the current transaction's t_state is protected by j_state_lock. We need to do the same when updating the t_state to T_COMMIT. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-14 19:11:14 -07:00
Steve French	646dd53987	[CIFS] Fix paths when share is in DFS to include proper prefix Some versions of Samba (3.2-pre e.g.) are stricter about checking to make sure that paths in DFS name spaces are sent in the form \\server\share\dir\subdir ... instead of \dir\subdir Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-15 01:50:56 +00:00
Eric Van Hensbergen	887b3ece65	9p: fix error path during early mount There was some cleanup issues during early mount which would trigger a kernel bug for certain types of failure. This patch reorganizes the cleanup to get rid of the bad behavior. This also merges the 9pnet and 9pnet_fd modules for the purpose of configuration and initialization. Keeping the fd transport separate from the core 9pnet code seemed like a good idea at the time, but in practice has caused more harm and confusion than good. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-05-14 19:23:27 -05:00
Jim Meyering	ab31267dfe	fs/9p/v9fs.c (v9fs_parse_options): Handle kstrdup and match_strdup failure. Now that this function can fail, return an int, diagnose other option-parsing failures, and adjust the sole caller: (v9fs_session_init): Handle kstrdup failure. Propagate any new v9fs_parse_options failure "up". Signed-off-by: Jim Meyering <meyering@redhat.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-05-14 19:23:25 -05:00
Eric Van Hensbergen	ee443996a3	9p: Documentation updates The kernel-doc comments of much of the 9p system have been in disarray since reorganization. This patch fixes those problems, adds additional documentation and a template book which collects the 9p information. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-05-14 19:23:25 -05:00
Markus Armbruster	b32a09db4f	add match_strlcpy() us it to make v9fs make uname and remotename parsing more robust match_strcpy() is a somewhat creepy function: the caller needs to make sure that the destination buffer is big enough, and when he screws up or forgets, match_strcpy() happily overruns the buffer. There's exactly one customer: v9fs_parse_options(). I believe it currently can't overflow its buffer, but that's not exactly obvious. The source string is a substing of the mount options. The kernel silently truncates those to PAGE_SIZE bytes, including the terminating zero. See compat_sys_mount() and do_mount(). The destination buffer is obtained from __getname(), which allocates from name_cachep, which is initialized by vfs_caches_init() for size PATH_MAX. We're safe as long as PATH_MAX <= PAGE_SIZE. PATH_MAX is 4096. As far as I know, the smallest PAGE_SIZE is also 4096. Here's a patch that makes the code a bit more obviously correct. It doesn't depend on PATH_MAX <= PAGE_SIZE. Signed-off-by: Markus Armbruster <armbru@redhat.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Jim Meyering <meyering@redhat.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-05-14 19:23:25 -05:00
Jeff Layton	35fc37d517	add function to convert access flags to legacy open mode SMBLegacyOpen always opens a file as r/w. This could be problematic for files with ATTR_READONLY set. Have it interpret the access_mode into a sane open mode. Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-14 18:45:30 +00:00
Jeff Layton	e10f7b551d	clarify return value of cifs_convert_flags() cifs_convert_flags returns 0x20197 in the default case. It's not immediately evident where that number comes from, so change it to be an or'ed set of flags. The compiler will boil it down anyway. (Thanks to Guenter Kukkukk for clarifying the flags). Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-14 18:44:35 +00:00
Valerie Clement	1930479c4b	ext4: mballoc fix mb_normalize_request algorithm for 1KB block size filesystems In case of inode preallocation, the number of blocks to allocate depends on the file size and it is calculated in ext4_mb_normalize_request(). Each group in the filesystem is then checked to find one that can be used for allocation; this is done in ext4_mb_good_group(). When a file bigger than 4MB is created, the requested number of blocks to preallocate, calculated by ext4_mb_normalize_request is 4096. However for a filesystem with 1KB block size, the maximum size of the block buddies used by the multiblock allocator is 2048, so none of groups in the filesystem satisfies the search criteria in ext4_mb_good_group(). Scanning all the filesystem groups impacts performance. This was demonstrated by using a freshly created, 70GB, 1k block filesystem, with caches dropped write before the test via /proc/sys/vm/drop_caches, and with the filesystem mounted with nodelalloc and nodealloc,nomballoc. The time to write an 8 megabyte file using "dd if=/dev/zero of=/mnt/test/fo bs=8k count=1k conv=fsync" took 35.5091 seconds (236kB/s) with nodellaloc, and 0.233754 seconds (35.9 MB/s) with the nodelloc,nomballoc options. With a 1TB partition, it took several minutes to write 8MB! This patch modifies the algorithm in ext4_mb_normalize_group_request to calculate the number of blocks to allocate by taking into account the maximum size of free blocks chunks handled by the multiblock allocator. It has also been tested for filesystems with 2KB and 4KB block sizes to ensure that those cases don't regress. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Valerie Clement <valerie.clement@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-13 19:31:14 -04:00
Jan Kara	2c8be6b222	ext4: fix typos in messages and comments (journalled -> journaled) Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-13 21:27:55 -04:00
Jan Kara	0623543b33	ext4: fix synchronization of quota files in journal=data mode In journal=data mode, it is not enough to do write_inode_now as done in vfs_quota_on() to write all data to their final location (which is needed for quota_read to work correctly). Calling journal_flush() does its job. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-13 19:11:51 -04:00
Jan Kara	cd59e7b978	ext4: Fix mount messages when quota disabled When quota is disabled, we should not print 'journaled quota not supported' when user tried to mount non-journaled quota. Also fix typo in the message. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-13 19:11:51 -04:00
Jan Kara	dfc5d03f12	ext4: correct mount option parsing to detect when quota options can be changed We should not allow user to change quota mount options when quota is just suspended. It would make mount options and internal quota state inconsistent. Also we should not allow user to change quota format when quota is turned on. On the other hand we can just silently ignore when some option is set to the value it already has (mount does this on remount). Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-05-13 19:11:51 -04:00
Steve French	77c57ec896	[CIFS] don't explicitly do a FindClose on rewind when directory search has ended Do the following series of operations on a CIFS share: opendir(dir) readdir(dir) unlink(file in dir) rewinddir(dir) readdir(dir) If the readdir read all entries in the directory this will make CIFS throw an error like this: CIFS VFS: Send error in FindClose = -9 CIFS requests "Close at end of search" of the server by setting this bit when issuing FindFirst or FindNext. Therefore when all search entries are returned, the server may return "end of search" and close the search implicitly when this bit is set by the client on the request. We check for this when a readdir is explicitly closed - but when the client notices that a directory has changed after the last operation, we attempt to close the directory before reopening by reissuing a second FindFirst. But, the directory may already been implicitly closed (due to end of search) because the first readdir finished. So we only want to issue a FindClose call in this case when we don't expect it to already be closed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-13 21:39:32 +00:00
Cyrill Gorcunov	43f14d856f	eCryptFS: fix imbalanced mutex locking Fix imbalanced calls for mutex lock/unlock on ecryptfs_daemon_hash_mux Revealed by Ingo Molnar: http://lkml.org/lkml/2008/5/7/260 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
Jean Delvare	f36f21ecca	Fix misuses of bdevname() bdevname() fills the buffer that it is given as a parameter, so calling strcpy() or snprintf() on the returned value is redundant (and probably not guaranteed to work - I don't think strcpy and snprintf support overlapping buffers.) Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Stephen Tweedie <sct@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
Miklos Szeredi	78bb6cb9a8	fuse: add flag to turn on big writes Prior to 2.6.26 fuse only supported single page write requests. In theory all fuse filesystem should be able support bigger than 4k writes, as there's nothing in the API to prevent it. Unfortunately there's a known case in NTFS-3G where big writes cause filesystem corruption. There could also be other filesystems, where the lack of testing with big write requests would result in bugs. To prevent such problems on a kernel upgrade, disable big writes by default, but let filesystems set a flag to turn it on. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
KOSAKI Motohiro	4cd1a8fc3d	memcg: fix possible panic when CONFIG_MM_OWNER=y When mm destruction happens, we should pass mm_update_next_owner() the old mm. But unfortunately new mm is passed in exec_mmap(). Thus, kernel panic is possible when a multi-threaded process uses exec(). Also, the owner member comment description is wrong. mm->owner does not necessarily point to the thread group leader. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: "Paul Menage" <menage@google.com> Cc: "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:25 -07:00
Eric Sesterhenn	706322496b	Fix hfsplus oops on image without extents Fix an oops with a corrupted hfs+ image. See http://bugzilla.kernel.org/show_bug.cgi?id=10548 for details. Problem is that we call hfs_btree_open() from hfsplus_fill_super() to set HFSPLUS_SB(sb).[ext_tree\|cat_tree] Both trees are still NULL at this moment. If hfs_btree_open() fails for any reason it calls iput() on the page, which gets to hfsplus_releasepage() which tries to access HFSPLUS_SB(sb).* which is still NULL and oopses while dereferencing it. [akpm@linux-foundation.org: build fix] Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:24 -07:00
Serge E. Hallyn	289f8e27ed	capabilities: add bounding set to /proc/self/status There is currently no way to query the bounding set of another task. As there appears to be no security reason not to, and as Michael Kerrisk points out the following valid reasons to do so exist: * consistency (I can see all of the other per-thread/process sets in /proc/.../status) * debugging -- I could imagine that it would make the job of debugging an application that uses capabilities a little simpler. this patch adds the bounding set to /proc/self/status right after the effective set. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:24 -07:00
Jan Kara	9377abd026	quota: don't call sync_fs() from vfs_quota_off() when there's no quota turn off Sometimes, vfs_quota_off() is called on a partially set up super block (for example when fill_super() fails for some reason). In such cases we cannot call ->sync_fs() because it can Oops because of not properly filled in super block. So in case we find there's not quota to turn off, we just skip everything and return which fixes the above problem. [akpm@linux-foundation.org: fxi tpyo] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:23 -07:00
Christoph Hellwig	bb45d64224	ufs: remove unneeded ufs_put_inode prototype Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:23 -07:00
Miklos Szeredi	8dc4e37362	ecryptfs: clean up (un)lock_parent dget(dentry->d_parent) --> dget_parent(dentry) unlock_parent() is racy and unnecessary. Replace single caller with unlock_dir(). There are several other suspect uses of ->d_parent in ecryptfs... Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:23 -07:00
Jeff Dike	46d7b522eb	uml: move hppfs_kern.c to hppfs.c There's no reason for the _kern in hppfs_kern.c, so move it to hppfs.c. Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:21 -07:00
Jeff Dike	a0612b1f0b	uml: hppfs fixes hppfs tidying and fixes noticed during hch's get_inode work - style fixes a copy_to_user got its return value checked hppfs_write no longer fiddles file->f_pos because it gets and returns pos in its arguments hppfs_delete_inode dputs the underlyng procfs dentry stored in its private data and mntputs the vfsmnt stashed in s_fs_info hppfs_put_super no longer needs to mntput the s_fs_info, so it no longer needs to exist hppfs_readlink and hppfs_follow_link were doing a bunch of stuff with a struct file which they didn't use there is now a ->permission which calls generic_permission get_inode was always returning 0 for some reason - it now returns an inode if nothing bad happened Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:21 -07:00
Steve French	582d21e5e3	[CIFS] cleanup old checkpatch warnings Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-13 04:54:12 +00:00
Marcin Slusarz	ed5f037005	[CIFS] CIFSSMBPosixLock should return -EINVAL on error all other codepaths in this function return negative values on errors Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-13 04:01:01 +00:00
Jeff Layton	6353450a2d	fix memory leak in CIFSFindNext When CIFSFindNext gets back an -EBADF from a call, it sets the return code of the function to 0 and eventually exits. Doing this makes the cleanup at the end of the function skip freeing the SMB buffer, so we need to make sure we free the buffer explicitly when doing this. If we don't you end up with errors like this when unplugging the cifs kernel module: slab error in kmem_cache_destroy(): cache `cifs_request': Can't free all objects [<c046bdbf>] kmem_cache_destroy+0x61/0xf3 [<e0f03045>] cifs_destroy_request_bufs+0x14/0x28 [cifs] [<e0f2016e>] exit_cifs+0x1e/0x80 [cifs] [<c043aeae>] sys_delete_module+0x192/0x1b8 [<c04451fd>] audit_syscall_entry+0x14b/0x17d [<c0405413>] syscall_call+0x7/0xb ======================= Signed-off-by: Jeff Layton <jlayton@redhat.com>	2008-05-13 03:06:13 +00:00
Jeff Layton	d0a9c078db	[CIFS] CIFS currently allows for permissions to be changed on files, even when unix extensions and cifsacl support are disabled. These permissions changes are "ephemeral" however. They are lost whenever a share is mounted and unmounted, or when memory pressure forces the inode out of the cache. Because of this, we'd like to introduce a behavior change to make CIFS behave more like local DOS/Windows filesystems. When unix extensions and cifsacl support aren't enabled, then don't silently ignore changes to permission bits that can't be reflected on the server. Still, there may be people relying on the current behavior for certain applications. This patch adds a new "dynperm" (and a corresponding "nodynperm") mount option that will be intended to make the client fall back to legacy behavior when setting these modes. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-12 22:23:49 +00:00
Linus Torvalds	542dafadd8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] don't allow demultiplex thread to exit until kthread_stop is called [CIFS] when not using unix extensions, check for and set ATTR_READONLY on create and mkdir [CIFS] add local struct inode pointer to cifs_setattr [CIFS] cifs_find_tcp_session cleanup	2008-05-12 13:29:15 -07:00
Jean Delvare	00377d8e38	[GFS2] Prefer strlcpy() over snprintf() strlcpy is faster than snprintf when you don't use the returned value. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-05-12 08:57:11 +01:00
Andrew Price	ad99f77778	[GFS2] Fix cast from unsigned int to s64 This fixes bz 444829 where allocating a new block caused gfs2 file systems to report 0 bytes used in df. It was caused by a broken cast from an unsigned int in gfs2_block_alloc() to a negative s64 in gfs2_statfs_change(). This patch casts the unsigned int to an s64 before the unary minus is applied. Signed-off-by: Andrew Price <andy@andrewprice.me.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-05-12 08:54:56 +01:00
Bob Peterson	091806edd4	[GFS2] filesystem consistency error from do_strip This patch fixes a GFS2 filesystem consistency error reported from function do_strip. The problem was caused by a timing window that allowed two vfs inodes to be created in memory that point to the same file. The problem is fixed by making the vfs's iget_test, iget_set mechanism check and set a new bit in the in-core gfs2_inode structure while the vfs inode spin_lock is held. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-05-12 08:54:53 +01:00
Linus Torvalds	c3921ab715	Add new 'cond_resched_bkl()' helper function It acts exactly like a regular 'cond_resched()', but will not get optimized away when CONFIG_PREEMPT is set. Normal kernel code is already preemptable in the presense of CONFIG_PREEMPT, so cond_resched() is optimized away (see commit `02b67cc3ba` "sched: do not do cond_resched() when CONFIG_PREEMPT"). But when wanting to conditionally reschedule while holding a lock, you need to use "cond_sched_lock(lock)", and the new function is the BKL equivalent of that. Also make fs/locks.c use it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-11 16:04:48 -07:00
Steve French	e691b9d1a0	[CIFS] don't allow demultiplex thread to exit until kthread_stop is called cifs_demultiplex_thread can exit under several conditions: 1) if it's signaled 2) if there's a problem with session setup 3) if kthread_stop is called on it The first two are problems. If kthread_stop is called on the thread, there is no guarantee that it will still be up. We need to have the thread stay up until kthread_stop is called on it. One option would be to not even try to tear things down until after kthread_stop is called. However, in the case where there is a problem setting up the session, there's no real reason to try continuing the loop. This patch allows the thread to clean up and prepare for exit under all three conditions, but it has the thread go to sleep until kthread_stop is called. This allows us to simplify the shutdown code somewhat since we can be reasonably sure that the thread won't exit after being signaled but before kthread_stop is called. It also removes the places where the thread itself set the tsk variable since it appeared that it could have a potential race where the thread might never be shut down. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-11 17:45:44 +00:00
Jeff Layton	67750fb9e0	[CIFS] when not using unix extensions, check for and set ATTR_READONLY on create and mkdir When creating a directory on a CIFS share without POSIX extensions, and the given mode has no write bits set, set the ATTR_READONLY bit. When creating a file, set ATTR_READONLY if the create mode has no write bits set and we're not using unix extensions. There are some comments about this being problematic due to the VFS splitting creates into 2 parts. I'm not sure what that's actually talking about, but I'm assuming that it has something to do with how mknod is implemented. In the simple case where we have no unix extensions and we're just creating a regular file, there's no reason we can't set ATTR_READONLY. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-11 17:45:43 +00:00
Jeff Layton	02eadeffda	[CIFS] add local struct inode pointer to cifs_setattr Clean up cifs_setattr a bit by adding a local inode pointer, and changing all of the direntry->d_inode references to it. This also adds a bit of micro-optimization. d_inode shouldn't change over the life of this function, so we only need to dereference it once. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-11 17:45:43 +00:00
Cyrill Gorcunov	1b20d67218	[CIFS] cifs_find_tcp_session cleanup This patch cleans up cifs_find_tcp_session so it become less indented. Also the error of skipping IPv6 matched addresses fixed. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-11 17:45:43 +00:00
Linus Torvalds	26c5e98e88	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] fix build warning [CIFS] Fixed build warning in is_ip [CIFS] cleanup cifsd completion [CIFS] Remove over-indented code in find_unc(). [CIFS] fix typo [CIFS] Remove duplicate call to mode_to_acl [CIFS] convert usage of implicit booleans to bool [CIFS] fixed compatibility issue with samba refferal request [CIFS] Fix statfs formatting [CIFS] Adds to dns_resolver checking if the server name is an IP addr and skipping upcall in this case. [CIFS] Fix spelling mistake [CIFS] Update cifs version number	2008-05-09 08:10:09 -07:00
Steve French	af4b3c355c	[CIFS] fix build warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-09 03:48:05 +00:00
Igor Mammedov	7c5e628f95	[CIFS] Fixed build warning in is_ip Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-08 20:48:42 +00:00
Huang Weiyi	19566ca6dc	fs/proc/task_mmu.c: remove duplicated include files Removed duplicated include files <linux/ptrace.h> and <linux/seq_file.h> in fs/proc/task_mmu.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-08 10:56:22 -07:00
Linus Torvalds	7a34912d90	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "relay: fix splice problem" docbook: fix bio missing parameter block: use unitialized_var() in bio_alloc_bioset() block: avoid duplicate calls to get_part() in disk stat code cfq-iosched: make io priorities inherit CPU scheduling class as well as nice block: optimize generic_unplug_device() block: get rid of likely/unlikely predictions in merge logic vfs: splice remove_suid() cleanup cfq-iosched: fix RCU race in the cfq io_context destructor handling block: adjust tagging function queue bit locking block: sysfs store function needs to grab queue_lock and use queue_flag_*()	2008-05-08 10:48:36 -07:00
Linus Torvalds	0f1bce41fe	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: Fix memory corruption when fs mounted with noadinicb option udf: Make udf exportable udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline	2008-05-08 10:48:03 -07:00
Ulrich Drepper	ba719baeab	sys_pipe(): fix file descriptor leaks Remember to close the files if copy_to_user() failed. Spotted by dm.n9107@gmail.com. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: DM <dm.n9107@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-08 10:46:56 -07:00
Jens Axboe	75065ff619	Revert "relay: fix splice problem" This reverts commit `c3270e577c`.	2008-05-08 14:06:19 +02:00
Randy Dunlap	ffee0259c9	docbook: fix bio missing parameter Fix fs/bio.c kernel-doc parameter warning: Warning(linux-2.6.25-git14//fs/bio.c:972): No description found for parameter 'reading' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 18:35:03 +02:00
Jens Axboe	eeae1d48c0	block: use unitialized_var() in bio_alloc_bioset() Better than setting idx to some random value and it silences the same bogus gcc warning. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 13:26:27 +02:00
Jan Kara	9afadc4b1f	udf: Fix memory corruption when fs mounted with noadinicb option When UDF filesystem is mounted with noadinicb mount option, it happens that we extend an empty directory with a block. A code in udf_add_entry() didn't count with this possibility and used uninitialized data leading to memory and filesystem corruption. Add a check whether file already has some extents before operating on them. Signed-off-by: Jan Kara <jack@suse.cz>	2008-05-07 09:49:52 +02:00
Rasmus Rohde	221e583a73	udf: Make udf exportable Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Rasmus Rohde <rohde@duff.dk> Signed-off-by: Jan Kara <jack@suse.cz>	2008-05-07 09:48:23 +02:00
Miklos Szeredi	7f3d4ee108	vfs: splice remove_suid() cleanup generic_file_splice_write() duplicates remove_suid() just because it doesn't hold i_mutex. But it grabs i_mutex inside splice_from_pipe() anyway, so this is rather pointless. Move locking to generic_file_splice_write() and call remove_suid() and __splice_from_pipe() instead. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-05-07 09:29:00 +02:00
Steve French	cf432eb50f	[CIFS] cleanup cifsd completion Was a holdover from the old kernel_thread based cifsd code. We needed to know that the thread had set the task variable before proceeding. Now that kthread_run returns the new task, this doesn't appear to be needed anymore. As best I can tell, this sleep was intended to try to prevent cifs_umount from freeing the cifsSesInfo struct before cifsd had exited. Now that cifsd is using the kthread API, we know that when kthread_stop returns that cifsd has exited, so I don't think this is needed any longer. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Christop Hellwig <hch@infradead.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-06 22:27:16 +00:00
Steve French	dea570e08a	[CIFS] Remove over-indented code in find_unc(). Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-06 22:05:51 +00:00
Linus Torvalds	6ce07c7b61	VFS: fix unused variable warning Commit `33dcdac2df` ("kill ->put_inode") removed the final use of i_op->put_inode, but left the now totally unused "op" variable in iput(). Get rid of it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-06 13:13:37 -07:00
Al Viro	0b2bac2f1e	[PATCH] fix SMP ordering hole in fcntl_setlk() fcntl_setlk()/close() race prevention has a subtle hole - we need to make sure that if we do have an fcntl/close race on SMP box, the access to descriptor table and inode->i_flock won't get reordered. As it is, we get STORE inode->i_flock, LOAD descriptor table entry vs. STORE descriptor table entry, LOAD inode->i_flock with not a single lock in common on both sides. We do have BKL around the first STORE, but check in locks_remove_posix() is outside of BKL and for a good reason - we don't want BKL on common path of close(2). Solution is to hold ->file_lock around fcheck() in there; that orders us wrt removal from descriptor table that preceded locks_remove_posix() on close path and we either come first (in which case eviction will be handled by the close side) or we'll see the effect of close and do eviction ourselves. Note that even though it's read-only access, we do need ->file_lock here - rcu_read_lock() won't be enough to order the things. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-06 13:58:34 -04:00
Steve French	a815752ac0	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-05-06 17:55:32 +00:00
Christoph Hellwig	33dcdac2df	[PATCH] kill ->put_inode And with that last patch to affs killing the last put_inode instance we can finally, after many years of transition kill this racy and awkward interface. (It's kinda funny that even the description in Documentation/filesystems/vfs.txt was entirely wrong..) Also remove a very misleading comment above the defintion of struct super_operations. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-06 13:45:34 -04:00
Roman Zippel	dca3c33652	[PATCH] fix reservation discarding in affs - remove affs_put_inode, so preallocations aren't discared unnecessarily often. - remove affs_drop_inode, it's called with a spinlock held, so it can't use a mutex. - make i_opencnt atomic - avoid direct b_count manipulations - a few allocation failure fixes, so that these are more gracefully handled now. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-06 13:45:33 -04:00
Bryan Wu	eb28062f13	task_nommu: fix compile failing bug because of spilt file.h CC fs/proc/task_nommu.o fs/proc/task_nommu.c: In function ‘task_mem’: fs/proc/task_nommu.c:55: error: dereferencing pointer to incomplete type make[2]: * [fs/proc/task_nommu.o] Error 1 make[1]: * [fs/proc] Error 2 make: *** [fs] Error 2 Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-04 17:08:48 -07:00
Ulrich Drepper	d35c7b0e54	unified (weak) sys_pipe implementation This replaces the duplicated arch-specific versions of "sys_pipe()" with one unified implementation. This removes almost 250 lines of duplicated code. It's marked __weak, so that if an architecture wants to override the default implementation it can do so by simply having its own replacement version, since many architectures use alternate calling conventions for the 'pipe()' system call for legacy reasons (ie traditional UNIX implementations often return the two file descriptors in registers) I still haven't changed the cris version even though Linus says the BKL isn't needed. The arch maintainer can easily do it if there are really no obstacles. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-03 13:50:33 -07:00
Linus Torvalds	4f9faaace2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) rose: Wrong list_lock argument in rose_node seqops netns: Fix reassembly timer to use the right namespace netns: Fix device renaming for sysfs bnx2: Update version to 1.7.5. bnx2: Update RV2P firmware for 5709. bnx2: Zero out context memory for 5709. bnx2: Fix register test on 5709. bnx2: Fix remote PHY initial link state. bnx2: Refine remote PHY locking. bridge: forwarding table information for >256 devices tg3: Update version to 3.92 tg3: Add link state reporting to UMP firmware tg3: Fix ethtool loopback test for 5761 BX devices tg3: Fix 5761 NVRAM sizes tg3: Use constant 500KHz MI clock on adapters with a CPMU hci_usb.h: fix hard-to-trigger race dccp: ccid2.c, ccid3.c use clamp(), clamp_t() net: remove NR_CPUS arrays in net/core/dev.c net: use get/put_unaligned_* helpers bluetooth: use get/put_unaligned_* helpers ...	2008-05-03 10:18:21 -07:00
Steve French	5ade9deaaa	[CIFS] fix typo Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-05-02 20:56:23 +00:00
Linus Torvalds	be2e88011b	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Use GFP_NOFS in kmalloc during localalloc window move ocfs2: Allow uid/gid/perm changes of symlinks ocfs2/dlm: dlmdebug.c: make 2 functions static ocfs2: make struct o2cb_stack_ops static ocfs2: make struct ocfs2_control_device static ocfs2: Correct merge of `52f7c21` (Move /sys/o2cb to /sys/fs/o2cb)	2008-05-02 13:53:07 -07:00
Linus Torvalds	b66e1f11eb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] fix sysctl_nr_open bugs [PATCH] sanitize anon_inode_getfd() [PATCH] split linux/file.h [PATCH] make osf_select() use core_sys_select() [PATCH] remove horrors with irix tty ioctls handling [PATCH] fix file and descriptor handling in perfmon	2008-05-02 11:23:14 -07:00
Denis V. Lunev	78e92b99ec	netns: assign PDE->data before gluing entry into /proc tree In this unfortunate case, proc_mkdir_mode wrapper can't be used anymore and this is no way to reuse proc_create_data due to nlinks assignment. So, copy the code from proc_mkdir and assign PDE->data at the appropriate moment. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 04:12:41 -07:00
Linus Torvalds	2c4aabcca8	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [MTD][NOR] Add physical address to point() method [JFFS2] Track parent inode for directories (for NFS export) [JFFS2] Invert last argument of jffs2_gc_fetch_inode(), make it boolean. [JFFS2] Quiet lockdep false positive. [JFFS2] Clean up jffs2_alloc_inode() and jffs2_i_init_once() [MTD] Delete long-unused jedec.h header file. [MTD] [NAND] at91_nand: use at91_nand_{en,dis}able consistently.	2008-05-01 11:15:28 -07:00
Jared Hulbert	a98889f3d8	[MTD][NOR] Add physical address to point() method Adding the ability to get a physical address from point() in addition to virtual address. This physical address is required for XIP of userspace code from flash. Signed-off-by: Jared Hulbert <jaredeh@gmail.com> Reviewed-by: Jörn Engel <joern@logfs.org> Acked-by: Nicolas Pitre <nico@cam.org> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-05-01 18:59:11 +01:00
David Woodhouse	27c72b040c	[JFFS2] Track parent inode for directories (for NFS export) To support NFS export, we need to know the parent inode of directories. Rather than growing the jffs2_inode_cache structure, share space with the nlink field -- which was always set to 1 for directories anyway. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-05-01 18:47:17 +01:00
Al Viro	5c598b3428	[PATCH] fix sysctl_nr_open bugs * if luser with root sets it to something that is not a multiple of BITS_PER_LONG, the system is screwed. * if it gets decreased at the wrong time, we can get expand_files() returning success and _not_ increasing the size of table as asked. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-01 13:08:57 -04:00
Al Viro	2030a42cec	[PATCH] sanitize anon_inode_getfd() a) none of the callers even looks at inode or file returned by anon_inode_getfd() b) any caller that would try to look at those would be racy, since by the time it returns we might have raced with close() from another thread and that file would be pining for fjords. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-01 13:08:50 -04:00
Al Viro	9f3acc3140	[PATCH] split linux/file.h Initial splitoff of the low-level stuff; taken to fdtable.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-01 13:08:16 -04:00
Al Viro	a2dcb44c3c	[PATCH] make osf_select() use core_sys_select() ... instead of open-coding it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-05-01 13:07:28 -04:00
David Woodhouse	1b690b4878	[JFFS2] Invert last argument of jffs2_gc_fetch_inode(), make it boolean. We don't actually care about nlink; we only care whether the inode in question is unlinked or not. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-05-01 17:24:28 +01:00
Harvey Harrison	bd7309677c	fuse: use clamp() rather than nested min/max clamp() exists for this use. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:02 -07:00
Jan Blunck	868eb7a853	autofs: path_{get,put}() cleanups Here are some more places where path_{get,put}() can be used instead of dput()/mntput() pair. Besides that it fixes a bug in autofs4_mount_busy() where mntput() was called before dput(). Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:01 -07:00
Jeff Moyer	9d2de6ad2a	autofs4: fix incorrect return from root.c:try_to_fill_dentry() Jeff Moyer has identified a case where the autofs4 function root.c:try_to_fill_dentry() can return -EBUSY when it should return 0. Jeff's description of the way this happens is: "automount starts an expire for directory d. after the callout to the daemon, but before the rmdir, another process tries to walk into the same directory. It puts itself onto the waitq, pending the expiration. When the expire finishes, the second process is woken up. In try_to_fill_dentry, it does this check: status = d_invalidate(dentry); if (status != -EBUSY) return -EAGAIN; And status is EBUSY. The dentry still has a non-zero d_inode, and the flags do not contain LOOKUP_CONTINUE or LOOKUP_DIRECTORY So, we fall through and return -EBUSY to the caller." Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:01 -07:00
Jeff Moyer	033790449b	autofs4: fix execution order race in mount request code Jeff Moyer has identified a race in due to an execution order dependency in the autofs4 function root.c:try_to_fill_dentry(). Jeff's description of this race is: "P1 does a lookup of /mount/submount/foo. Since the VFS can't find an entry for "foo" under /mount/submount, it calls into the autofs4 kernel module to allocate a new dentry, D1. The kernel creates a new waitq for this lookup and calls the daemon to perform the mount. The daemon performs a mkdir of the "foo" directory under /mount/submount, which ends up creating a new dentry, D2. Then, P2 does a lookup of /mount/submount/foo. The VFS path walking logic finds a dentry in the dcache, D2, and calls the revalidate function with this. In the autofs4 revalidate code, we then trigger a mount, since the dentry is an empty directory that isn't a mountpoint, and so set DCACHE_AUTOFS_PENDING and call into the wait code to trigger the mount. The wait code finds our existing waitq entry (since it is keyed off of the directory name) and adds itself to the list of waiters. After the daemon finishes the mount, it calls back into the kernel to release the waiters. When this happens, P1 is woken up and goes about clearing the DCACHE_AUTOFS_PENDING flag, but it does this in D1! So, given that P1 in our case is a program that will immediately try to access a file under /mount/submount/foo, we end up finding the dentry D2 which still has the pending flag set, and we set out to wait for a mount again! So, one way to address this is to re-do the lookup at the end of try_to_fill_dentry, and to clear the pending flag on the hashed dentry. This seems a sane approach to me." And Jeff's patch does this. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:01 -07:00
Ian Kent	cab0936aac	autofs4: check for invalid dentry in getpath Catch invalid dentry when calculating its path. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:01 -07:00
Ian Kent	afec570c32	autofs4: fix sparse warning in waitq.c:autofs4_expire_indirect() Re-order some code in expire.c:autofs4_expire_indirect() to avoid compile warning, reported by Harvey Harrison: CHECK fs/autofs4/expire.c fs/autofs4/expire.c:383:2: warning: context imbalance in 'autofs4_expire_indirect' - unexpected unlock Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:01 -07:00
Miklos Szeredi	02c6be615f	vfs: fix permission checking in sys_utimensat If utimensat() is called with both times set to UTIME_NOW or one of them to UTIME_NOW and the other to UTIME_OMIT, then it will update the file time without any permission checking. I don't think this can be used for anything other than a local DoS, but could be quite bewildering at that (e.g. "Why was that large source tree rebuilt when I didn't modify anything???") This affects all kernels from 2.6.22, when the utimensat() syscall was introduced. Fix by doing the same permission checking as for the "times == NULL" case. Thanks to Michael Kerrisk, whose utimensat-non-conformances-and-fixes.patch in -mm also fixes this (and breaks other stuff), only he didn't realize the security implications of this bug. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:03:59 -07:00
David Woodhouse	590fe34c47	[JFFS2] Quiet lockdep false positive. Don't hold f->sem while calling into jffs2_do_create(). It makes lockdep unhappy, and we don't really need it -- the _reason_ it's a false positive is because nobody else can see this inode yet and so nobody will be trying to lock it anyway. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-05-01 15:53:28 +01:00
David Woodhouse	4e571aba7b	[JFFS2] Clean up jffs2_alloc_inode() and jffs2_i_init_once() Ditch a couple of pointless casts from void *, and use the normal variable name 'f' for jffs2_inode_info pointers -- especially since it actually shows up in lockdep reports. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-05-01 12:29:37 +01:00
Al Viro	214b7049a7	Fix dnotify/close race We have a race between fcntl() and close() that can lead to dnotify_struct inserted into inode's list after the last descriptor had been gone from current->files. Since that's the only point where dnotify_struct gets evicted, we are screwed - it will stick around indefinitely. Even after struct file in question is gone and freed. Worse, we can trigger send_sigio() on it at any later point, which allows to send an arbitrary signal to arbitrary process if we manage to apply enough memory pressure to get the page that used to host that struct file and fill it with the right pattern... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 20:09:00 -07:00
Sunil Mushran	4ba1c5bfd2	ocfs2: Use GFP_NOFS in kmalloc during localalloc window move kmalloc() during a localalloc window move can trigger the mm to prune the dcache which inturn can trigger the fs to delete an inode causing it start a recursive transaction. The fix also makes the change in kmalloc during localalloc shutdown just to be safe. Fixes oss bugzilla#901 http://oss.oracle.com/bugzilla/show_bug.cgi?id=901 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:09:58 -07:00
Sunil Mushran	bc535809c0	ocfs2: Allow uid/gid/perm changes of symlinks This patch adds the ability to change attributes of a symlink. Fixes oss bugzilla#963 http://oss.oracle.com/bugzilla/show_bug.cgi?id=963 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:09:54 -07:00
Adrian Bunk	95642e5664	ocfs2/dlm: dlmdebug.c: make 2 functions static This patch makes the following needlessly global functions static: - stringify_lockname() - dlm_debug_put() Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:09:40 -07:00
Adrian Bunk	4af694e672	ocfs2: make struct o2cb_stack_ops static This patch makes the needlessly global struct o2cb_stack_ops static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:09:25 -07:00
Adrian Bunk	4d8755b5e6	ocfs2: make struct ocfs2_control_device static This patch makes the needlessly global struct ocfs2_control_device static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:09:08 -07:00
Joel Becker	9d80f7539a	ocfs2: Correct merge of `52f7c21` (Move /sys/o2cb to /sys/fs/o2cb) Commit `52f7c21b61` was intended to move /sys/o2cb to /sys/fs/o2cb, providing /sys/o2cb as a symlink for backwards compatibility. However, the merge apparently added the symlink but failed to move the directory, resulting in a duplicate filename error. It's a one-line change that was missing. Signed-off-by: Joel Becker <joel.becker@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-04-30 17:07:59 -07:00
Robert P. J. Day	883ce42ec4	DEBUGFS: Correct location of debugfs API documentation. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-04-30 16:52:47 -07:00
Ben Hutchings	40a2159abf	sysfs: Disallow truncation of files in sysfs sysfs allows attribute files to be truncated, e.g. using ftruncate(), with the expected effect on their inode. For most attributes, this doesn't change the "real" size of the file i.e. how much can be read from it. However, the parameter validation for reading and writing binary attribute files is based on the inode size and not the size specified in the file's bin_attribute, so it can be broken by this. For example, if we try using dd to write to such a file: # pwd /sys/bus/pci/devices/0000:08:00.0 # ls -l config -rw-r--r-- 1 root root 4096 Feb 1 17:35 config # dd if=/dev/zero of=config bs=4 count=1 1+0 records in 1+0 records out # ls -l config -rw-r--r-- 1 root root 0 Feb 1 17:50 config # dd if=/dev/zero of=config bs=4 count=1 seek=128 dd: writing `config': No space left on device 1+0 records in 0+0 records out Also, after truncation to 0, parameter validation for read and write is disabled. Most bin_attribute read and write methods also validate the size and offset, but for some this will allow out-of-range access. This may be a security issue, though access to such files is often limited to root. In any case, the validation should remain for safety's sake!) This was previously reported in Bugzilla as bug 9867. sysfs should ignore size changes or else refuse them (by returning -EINVAL). This patch makes it ignore them. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-04-30 16:52:46 -07:00
Linus Torvalds	d67c6f869c	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] Update default configuration. [S390] use generic sys_ptrace [S390] Remove self ptrace IEEE_IP hack. [S390] Convert to SPARSEMEM & SPARSEMEM_VMEMMAP [S390] System z large page support. [S390] Convert machine feature detection code to C. [S390] vmemmap: use clear_table to initialise page tables. [S390] Move stfl to system.h and delete duplicated version. [S390] uaccess_mvcos: #ifdef config dependent code. [S390] cpu topology: Fix possible deadlock. [S390] Add topology_core_siblings to topology.h [S390] cio: Make isc handling more robust. [S390] remove -traditional [S390] Automatically detect added cpus. [S390] smp: Fix locking order. [S390] Add missing ifndef/define to include/asm-s390/sysinfo.h. [S390] Move show_regs to traps.c. [S390] cio: Use strict_strtoul() for attributes.	2008-04-30 08:38:30 -07:00
Harvey Harrison	8e24eea728	fs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:54 -07:00
Harvey Harrison	530b641278	afs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:54 -07:00
Thomas Gleixner	c6f3a97f86	debugobjects: add timer specific object debugging code Add calls to the generic object debugging infrastructure and provide fixup functions which allow to keep the system alive when recoverable problems have been detected by the object debugging core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Greg KH <greg@kroah.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:53 -07:00
Andrew Morton	487798df6d	hfsplus: fix warning with 64k PAGE_SIZE fs/hfsplus/btree.c: In function 'hfsplus_bmap_alloc': fs/hfsplus/btree.c:239: warning: comparison is always false due to limited range of data type But this might hide a real bug? Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:52 -07:00
Andrew Morton	3e5a509730	hfs: fix warning with 64k PAGE_SIZE fs/hfs/btree.c: In function 'hfs_bmap_alloc': fs/hfs/btree.c:263: warning: comparison is always false due to limited range of data type The patch makes the warning go away, but the code might actually be buggy? Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:52 -07:00
Marcin Slusarz	07132922aa	sysv: [bl]e*_add_cpu conversion replace all: big/little_endian_variable = cpu_to_[bl]eX([bl]eX_to_cpu(big/little_endian_variable) + expression_in_cpu_byteorder); with: [bl]eX_add_cpu(&big/little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:52 -07:00
Marcin Slusarz	e3592b12f5	quota: le*_add_cpu conversion replace all: little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) + expression_in_cpu_byteorder); with: leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Marcin Slusarz	20c79e785a	hfs/hfsplus: be*_add_cpu conversion replace all: big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) + expression_in_cpu_byteorder); with: beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Marcin Slusarz	6369a4abb4	affs: be*_add_cpu conversion replace all: big_endian_variable = cpu_to_beX(beX_to_cpu(big_endian_variable) + expression_in_cpu_byteorder); with: beX_add_cpu(&big_endian_variable, expression_in_cpu_byteorder); generated with semantic patch Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Christoph Hellwig	86098fa011	reiserfs: use open_bdev_excl Use the proper helper to open a blockdevice by name for filesystem use, this makes sure it's properly claimed (also added for open-by-number) and gets rid of the struct file abuse. Tested by mounting a reiserfs filesystem with external journal. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jeff Mahoney <jeffm@suse.com> Acked-by: Edward Shishkin <edward.shishkin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	4dbf930ed6	fuse: fix sparse warnings fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5559b8f4d1	fuse: fix race in llseek Fuse doesn't use i_mutex to protect setting i_size, and so generic_file_llseek() can be racy: it doesn't use i_size_read(). So do a fuse specific llseek method, which does use i_size_read(). [akpm@linux-foundation.org: make `retval' loff_t] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	b48badf013	fuse: fix node ID type Node ID is 64bit but it is passed as unsigned long to some functions. This breakage wasn't noticed, because libfuse uses unsigned long too. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	e5d9a0df07	fuse: fix max i/o size calculation Fix a bug that Werner Baumann reported: fuse can send a bigger write request than the maximum specified. This only affected direct_io operation. In addition set a sane minimum for the max_read and max_write tunables, so I/O always makes some progress. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5c5c5e51b2	fuse: update file size on short read If the READ request returned a short count, then either - cached size is incorrect - filesystem is buggy, as short reads are only allowed on EOF So assume that the size is wrong and refresh it, so that cached read() doesn't zero fill the missing chunk. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Nick Piggin	ea9b9907b8	fuse: implement perform_write Introduce fuse_perform_write. With fusexmp (a passthrough filesystem), large (1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times (256MB/s vs 71MB/s). [mszeredi@suse.cz]: - split into smaller functions - testing - duplicate generic_file_aio_write(), so that there's no need to add a new ->perform_write() a_op. Comment from hch. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	854512ec35	fuse: clean up setting i_size in write Extract common code for setting i_size in write functions into a common helper. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	3be5a52b30	fuse: support writable mmap Quoting Linus (3 years ago, FUSE inclusion discussions): "User-space filesystems are hard to get right. I'd claim that they are almost impossible, unless you limit them somehow (shared writable mappings are the nastiest part - if you don't have those, you can reasonably limit your problems by limiting the number of dirty pages you accept through normal "write()" calls)." Instead of attempting the impossible, I've just waited for the dirty page accounting infrastructure to materialize (thanks to Peter Zijlstra and others). This nicely solved the biggest problem: limiting the number of pages used for write caching. Some small details remained, however, which this largish patch attempts to address. It provides a page writeback implementation for fuse, which is completely safe against VM related deadlocks. Performance may not be very good for certain usage patterns, but generally it should be acceptable. It has been tested extensively with fsx-linux and bash-shared-mapping. Fuse page writeback design -------------------------- fuse_writepage() allocates a new temporary page with GFP_NOFS\|__GFP_HIGHMEM. It copies the contents of the original page, and queues a WRITE request to the userspace filesystem using this temp page. The writeback is finished instantly from the MM's point of view: the page is removed from the radix trees, and the PageDirty and PageWriteback flags are cleared. For the duration of the actual write, the NR_WRITEBACK_TEMP counter is incremented. The per-bdi writeback count is not decremented until the actual write completes. On dirtying the page, fuse waits for a previous write to finish before proceeding. This makes sure, there can only be one temporary page used at a time for one cached page. This approach is wasteful in both memory and CPU bandwidth, so why is this complication needed? The basic problem is that there can be no guarantee about the time in which the userspace filesystem will complete a write. It may be buggy or even malicious, and fail to complete WRITE requests. We don't want unrelated parts of the system to grind to a halt in such cases. Also a filesystem may need additional resources (particularly memory) to complete a WRITE request. There's a great danger of a deadlock if that allocation may wait for the writepage to finish. Currently there are several cases where the kernel can block on page writeback: - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER - page migration - throttle_vm_writeout (through NR_WRITEBACK) - sync(2) Of course in some cases (fsync, msync) we explicitly want to allow blocking. So for these cases new code has to be added to fuse, since the VM is not tracking writeback pages for us any more. As an extra safetly measure, the maximum dirty ratio allocated to a single fuse filesystem is set to 1% by default. This way one (or several) buggy or malicious fuse filesystems cannot slow down the rest of the system by hogging dirty memory. With appropriate privileges, this limit can be raised through '/sys/class/bdi/<bdi>/max_ratio'. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	fc3ba692a4	mm: Add NR_WRITEBACK_TEMP counter Fuse will use temporary buffers to write back dirty data from memory mappings (normal writes are done synchronously). This is needed, because there cannot be any guarantee about the time in which a write will complete. By using temporary buffers, from the MM's point if view the page is written back immediately. If the writeout was due to memory pressure, this effectively migrates data from a full zone to a less full zone. This patch adds a new counter (NR_WRITEBACK_TEMP) for the number of pages used as temporary buffers. [Lee.Schermerhorn@hp.com: add vmstat_text for NR_WRITEBACK_TEMP] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	e4ad08fe64	mm: bdi: add separate writeback accounting capability Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is set, then don't update the per-bdi writeback stats from test_set_page_writeback() and test_clear_page_writeback(). Misc cleanups: - convert bdi_cap_writeback_dirty() and friends to static inline functions - create a flag that includes all three dirty/writeback related flags, since almst all users will want to have them toghether Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	b6f2fcbcfc	mm: bdi: expose the BDI object in sysfs for FUSE Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR" Make the fuse control filesystem use s_dev instead of a fuse specific ID. This makes it easier to match directories under /sys/fs/fuse/connections/ with directories under /sys/class/bdi, and with actual mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:49 -07:00
Miklos Szeredi	fa799759f9	mm: bdi: expose the BDI object in sysfs for NFS Register NFS' backing_dev_info under sysfs with the name "nfs-MAJOR:MINOR" Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:49 -07:00
Sukadev Bhattiprolu	718a916338	devpts: factor out PTY index allocation Factor out the code used to allocate/free a pts index into new interfaces, devpts_new_index() and devpts_kill_index(). This localizes the external data structures used in managing the pts indices. [akpm@linux-foundation.org: undo accidental mutex2sem conversion] Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:48 -07:00
Alan Cox	f34d7a5b70	tty: The big operations rework - Operations are now a shared const function block as with most other Linux objects - Introduce wrappers for some optional functions to get consistent behaviour - Wrap put_char which used to be patched by the tty layer - Document which functions are needed/optional - Make put_char report success/fail - Cache the driver->ops pointer in the tty as tty->ops - Remove various surplus lock calls we no longer need - Remove proc_write method as noted by Alexey Dobriyan - Introduce some missing sanity checks where certain driver/ldisc combinations would oops as they didn't check needed methods were present [akpm@linux-foundation.org: fix fs/compat_ioctl.c build] [akpm@linux-foundation.org: fix isicom] [akpm@linux-foundation.org: fix arch/ia64/hp/sim/simserial.c build] [akpm@linux-foundation.org: fix kgdb] Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Cc: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:47 -07:00
Alan Cox	5d0fdf1e01	tty_io: fix remaining pid struct locking This fixes the last couple of pid struct locking failures I know about. [oleg@tv-sign.ru: clean up do_task_stat()] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:40 -07:00
Alan Cox	04f378b198	tty: BKL pushdown - Push the BKL down into the line disciplines - Switch the tty layer to unlocked_ioctl - Introduce a new ctrl_lock spin lock for the control bits - Eliminate much of the lock_kernel use in n_tty - Prepare to (but don't yet) call the drivers with the lock dropped on the paths that historically held the lock BKL now primarily protects open/close/ldisc change in the tty layer [jirislaby@gmail.com: a couple of fixes] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:40 -07:00
Oleg Nesterov	2800d8d19e	document de_thread() with exit_notify() connection Add a couple of small comments, it is not easy to see what this code does. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:38 -07:00
Roland McGrath	f3de272b82	signals: use HAVE_SET_RESTORE_SIGMASK Change all the #ifdef TIF_RESTORE_SIGMASK conditionals in non-arch code to #ifdef HAVE_SET_RESTORE_SIGMASK. If arch code defines it first, the generic set_restore_sigmask() using TIF_RESTORE_SIGMASK is not defined. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:37 -07:00
Roland McGrath	4e4c22c711	signals: add set_restore_sigmask This adds the set_restore_sigmask() inline in <linux/thread_info.h> and replaces every set_thread_flag(TIF_RESTORE_SIGMASK) with a call to it. No change, but abstracts the details of the flag protocol from all the calls. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:37 -07:00
Oleg Nesterov	7a5e873f09	signals: de_thread: simplify the ->child_reaper switching Now that we rely on SIGNAL_UNKILLABLE flag, de_thread() doesn't need the nasty hack to kill the old ->child_reaper during the mt-exec. This also means we can avoid taking tasklist_lock around zap_other_threads(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:37 -07:00
Oleg Nesterov	06fffb1267	do_task_stat: don't take rcu_read_lock() lock_task_sighand() was changed, and do_task_stat() doesn't need rcu_read_lock any longer. sighand->siglock protects all "interesting" fields. Except: it doesn't protect ->tty->pgrp, but neither does rcu_read_lock(), this should be fixed. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Pavel Emelyanov <xemul@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:34 -07:00
Jan Kara	2deb1acc65	isofs: fix access to unallocated memory when reading corrupted filesystem When a directory on isofs is corrupted, we did not check whether length of the name in a directory entry and the length of the directory entry itself are consistent. This could lead to possible access beyond the end of buffer when the length of the name was too big. Add this sanity check to directory reading code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:33 -07:00
David Chinner	64275ea4f3	[XFS] Include linux/random.h in all builds, not just debug. Noted-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dave Chinner <dgc@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 07:53:50 -07:00
Gerald Schaefer	53492b1de4	[S390] System z large page support. This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:47 +02:00
Linus Torvalds	c4755d16fc	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (48 commits) ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta ext4: fix test ext_generic_write_end() copied return value ext3: fix test ext_generic_write_end() copied return value ext4: Move mballoc headers/structures to a seperate header file mballoc.h ext4: cleanup for compiling mballoc with verification and debugging #defines ext4: don't use ext4_error in ext4_check_descriptors ext4: mark inode dirty after initializing the extent tree ext4: update ctime and mtime for truncate with extents. ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group ext4: move headers out of include/linux ext4: fix wrong gfp type under transaction ext4: Fix hang on umount with quotas when journal is aborted ext4: Fix update of mtime and ctime on rename jdb2: replace remaining __FUNCTION__ occurrences ext4: replace remaining __FUNCTION__ occurrences jbd2: only create debugfs and stats entries if init is successful jbd2: fix kernel-doc notation jbd2: replace potentially false assertion with if block jbd2: eliminate duplicated code in revocation table init/destroy functions jbd2: tidy up revoke cache initialisation and destruction ...	2008-04-29 20:34:49 -07:00
Linus Torvalds	c15a2434ed	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (24 commits) [XFS] Fix build failure after enabling CONFIG_XFS_DEBUG [XFS] remove dmapi cruft in xfs_file.c [XFS] remove sendfile leftovers [XFS] allow enabling CONFIG_XFS_DEBUG [XFS] Don't initialise new inode generation numbers to zero [XFS] Fix check for block zero access in xfs_write_iomap_allocate() [XFS] Don't double count reserved block changes on UP. [XFS] remove xfs_log_ticket_zone on rmmod [XFS] fix non-smp xfs build [XFS] Fix broken HAVE_SPLICE removal commit. [XFS] kill XFS_ICSB_SB_LOCKED [XFS] split xfs_icsb_balance_counter [XFS] Add xfs_icsb_sync_counters_locked for when m_sb_lock already held [XFS] Cleanup xfs_attr a bit with xfs_name and remove cred [XFS] kill usesless IHOLD calls in xfs_remove and xfs_rmdir [XFS] kill parent == child checks in xfs_remove and xfs_rmdir [XFS] kill usesless IHOLD calls in xfs_rename [XFS] remove manual lookup from xfs_rename and simplify locking [XFS] shrink mrlock_t [XFS] simplify xfs_lookup ...	2008-04-29 20:34:17 -07:00
Roel Kluin	f1fa3342e2	ext4: fix hot spins in mballoc after err_freebuddy and err_freemeta In ext4_mb_init_backend() 'i' is of type ext4_group_t. Since unsigned, i >= 0 is always true, so fix hot spins after err_freebuddy: and -meta: and prevent decrements when zero. Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:15 -04:00
Roel Kluin	f8a87d8930	ext4: fix test ext_generic_write_end() copied return value 'copied' is unsigned, whereas 'ret2' is not. The test (copied < 0) fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:18 -04:00
Roel Kluin	7c2f3d6f89	ext3: fix test ext_generic_write_end() copied return value 'copied' is unsigned, whereas 'ret2' is not. The test (copied < 0) fails Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:27 -04:00
Mingming Cao	8f6e39a7ad	ext4: Move mballoc headers/structures to a seperate header file mballoc.h Move function and structure definiations out of mballoc.c and put it under a new header file mballoc.h Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:01:31 -04:00
Solofo Ramangalahy	60bd63d192	ext4: cleanup for compiling mballoc with verification and debugging #defines This patch allows compiling mballoc with: #define AGGRESSIVE_CHECK #define DOUBLE_CHECK #define MB_DEBUG It fixes: Compilation errors: fs/ext4/mballoc.c: In function '__mb_check_buddy': fs/ext4/mballoc.c:605: error: 'struct ext4_prealloc_space' has no member named 'group_list' fs/ext4/mballoc.c:606: error: 'struct ext4_prealloc_space' has no member named 'pstart' fs/ext4/mballoc.c:608: error: 'struct ext4_prealloc_space' has no member named 'len' Compilation warnings: fs/ext4/mballoc.c: In function 'ext4_mb_normalize_group_request': fs/ext4/mballoc.c:2863: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int' fs/ext4/mballoc.c: In function 'ext4_mb_use_inode_pa': fs/ext4/mballoc.c:3103: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'int' Sparse check: fs/ext4/mballoc.c:3818:2: warning: context imbalance in 'ext4_mb_show_ac' - different lock contexts for basic block Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 21:59:59 -04:00
Josef Bacik	c19204b0ae	ext4: don't use ext4_error in ext4_check_descriptors Because ext4_check_descriptors is called at mount time you can't use ext4_error as it calls ext4_commit_sb, which since the sb isn't all the way initialized causes bad things to happen (ie a panic). This patch changes the ext4_error's to printk's to keep this problem from happening. Thanks much, Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:28 -04:00
Aneesh Kumar K.V	8753e88f1b	ext4: mark inode dirty after initializing the extent tree We should mark the inode dirty only after initializing the extent tree. Also if we fail during extent initialization we need to call DQUOT_FREE_INODE. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:36 -04:00
Solofo Ramangalahy	ef7377289a	ext4: update ctime and mtime for truncate with extents. The recently announced "Linux POSIX file system test suite" caught a truncate issue when using extents: mtime and ctime are not updated when truncate is successful. This is the single issue caught with "default" ext4 (mkfs and mount with minimal options). The testsuite does not report failure with -o noextents. With the following patch, all tests of the testsuite pass. Signed-off-by: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:41 -04:00
Aneesh Kumar K.V	c83617db76	ext4: Don't do GFP_NOFS allocations after taking ext4_lock_group We can't do GFP_NOFS allocation after taking ext4_lock_group BUG: sleeping function called from invalid context at mm/slab.c:3054 in_atomic():1, irqs_disabled():0 1 lock held by vi/2426: #0: (&ei->i_data_sem){----}, at: [<c01cf665>] ext4_release_file+0x23/0x66 Pid: 2426, comm: vi Not tainted 2.6.25-rc7 #24 [<c011a3dc>] __might_sleep+0xbe/0xc5 [<c01620c9>] kmem_cache_alloc+0x22/0xa6 [<c01e382a>] ext4_mb_release_inode_pa+0x73/0x1b3 [<c01e6adf>] ext4_mb_discard_inode_preallocations+0x22d/0x2d4 [<c013000a>] ? param_set_ushort+0x32/0x39 [<c01ceba1>] ext4_discard_reservation+0x27/0x6a [<c01cf66c>] ext4_release_file+0x2a/0x66 [<c0165bd6>] __fput+0xae/0x155 [<c0165e46>] fput+0x17/0x19 [<c0163756>] filp_close+0x50/0x5a [<c01647c0>] sys_close+0x71/0xad [<c0104aba>] sysenter_past_esp+0x5f/0xa5 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:00:47 -04:00
Christoph Hellwig	3dcf54515a	ext4: move headers out of include/linux Move ext4 headers out of include/linux. This is just the trivial move, there's some more thing that could be done later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 18:13:32 -04:00
Josef Bacik	216553c4b7	ext4: fix wrong gfp type under transaction This fixes the allocations with GFP_KERNEL while under a transaction problems in ext4. This patch is the same as its ext3 counterpart, just switches these to GFP_NOFS. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:02 -04:00
Jan Kara	2887df139c	ext4: Fix hang on umount with quotas when journal is aborted Call dquot_drop() from ext4_dquot_drop() even if we fail to start a transaction. Otherwise we never get to dropping references to quota structures from the inode and umount will hang indefinitely. Thanks to Payphone LIOU for spotting the problem. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> CC: Payphone LIOU <lioupayphone@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:07 -04:00
Jan Kara	53b7e9f680	ext4: Fix update of mtime and ctime on rename The patch below makes ext4 update mtime and ctime of the directory into which we move file even if the directory entry already exists. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-04-29 22:02:11 -04:00
Dave Jones	355a46961b	trivial: fix user-visible typo in hfsplus Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 13:23:21 -07:00
Steve French	9b1ec9ecea	[CIFS] Remove duplicate call to mode_to_acl The current logic in cifs_setattr calls mode_to_acl twice on mode changes if cifsacl is enabled. Remove the duplicate call. Signed-off-by: Jeff Layton <jlayton@redhat.com> CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-04-29 20:15:43 +00:00
Linus Torvalds	bd5d435a96	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: Skip I/O merges when disabled block: add large command support block: replace sizeof(rq->cmd) with BLK_MAX_CDB ide: use blk_rq_init() to initialize the request block: use blk_rq_init() to initialize the request block: rename and export rq_init() block: no need to initialize rq->cmd with blk_get_request block: no need to initialize rq->cmd in prepare_flush_fn hook block/blk-barrier.c:blk_ordered_cur_seq() mustn't be inline block/elevator.c:elv_rq_merge_ok() mustn't be inline block: make queue flags non-atomic block: add dma alignment and padding support to blk_rq_map_kern unexport blk_max_pfn ps3disk: Remove superfluous cast block: make rq_init() do a full memset() relay: fix splice problem	2008-04-29 08:18:03 -07:00
Jeff Moyer	39fa00311f	aio: fix misleading comments The FIXME comments are inaccurate. The locking comment over lookup_ioctx() is wrong. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:29 -07:00
Harvey Harrison	97a4feb4a7	ncpfs: use get/put_unaligned_* helpers [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	58d485d481	isofs: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	8b3789e5d5	hfsplus: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
Harvey Harrison	803f445f17	fat: use get/put_unaligned_* helpers Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:28 -07:00
David Howells	9396d496d7	afs: support the CB.ProbeUuid RPC op Add support for the CB.ProbeUuid cache manager RPC op. This allows a modern OpenAFS server to quickly ask if the client has been rebooted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
David Howells	7c80bcce34	afs: the AFS RPC op CBGetCapabilities is actually CBTellMeAboutYourself The AFS RxRPC op CBGetCapabilities is actually CBTellMeAboutYourself. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Robert P. J. Day	0ae52d6fba	afs: use the shorter LIST_HEAD for brevity Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:26 -07:00
Hirofumi Nakagawa	801678c5a3	Remove duplicated unlikely() in IS_ERR() Some drivers have duplicated unlikely() macros. IS_ERR() already has unlikely() in itself. This patch cleans up such pointless code. Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jeff@garzik.org> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Pavel Emelyanov	d7321cd624	sysctl: add the ->permissions callback on the ctl_table_root When reading from/writing to some table, a root, which this table came from, may affect this table's permissions, depending on who is working with the table. The core hunk is at the bottom of this patch. All the rest is just pushing the ctl_table_root argument up to the sysctl_perm() function. This will be mostly (only?) used in the net sysctls. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:23 -07:00
Pavel Emelyanov	7708bfb1c8	sysctl: merge equal proc_sys_read and proc_sys_write Many (most of) sysctls do not have a per-container sense. E.g. kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on and so forth. Besides, tuning then from inside a container is not even secure. On the other hand, hiding them completely from the container's tasks sometimes causes user-space to stop working. When developing net sysctl, the common practice was to duplicate a table and drop the write bits in table->mode, but this approach was not very elegant, lead to excessive memory consumption and was not suitable in general. Here's the alternative solution. To facilitate the per-container sysctls ctl_table_root-s were introduced. Each root contains a list of ctl_table_header-s that are visible to different namespaces. The idea of this set is to add the permissions() callback on the ctl_table_root to allow ctl root limit permissions to the same ctl_table-s. The main user of this functionality is the net-namespaces code, but later this will (should) be used by more and more namespaces, containers and control groups. Actually, this idea's core is in a single hunk in the third patch. First two patches are cleanups for sysctl code, while the third one mostly extends the arguments set of some sysctl functions. This patch: These ->read and ->write callbacks act in a very similar way, so merge these paths to reduce the number of places to patch later and shrink the .text size (a bit). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:23 -07:00
Denis V. Lunev	79da3664f6	jbd2: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: <linux-ext4@vger.kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	19b4fc52d6	reiserfs: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. /proc entry owner is also added. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	46fe74f2ae	ext4: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: <linux-ext4@vger.kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	21ac295b42	afs: use non-racy method for proc entries creation Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: David Howells <dhowells@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	34b37235c6	nfs: use proc_create to setup de->proc_fops Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	9ef2db2630	nfsd: use proc_create to setup de->proc_fops Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Denis V. Lunev	59b7435149	proc: introduce proc_create_data to setup de->data This set of patches fixes an proc ->open'less usage due to ->proc_fops flip in the most part of the kernel code. The original OOPS is described in the commit `2d3a4e3666`: Typical PDE creation code looks like: pde = create_proc_entry("foo", 0, NULL); if (pde) pde->proc_fops = &foo_proc_fops; Notice that PDE is first created, only then ->proc_fops is set up to final value. This is a problem because right after creation a) PDE is fully visible in /proc , and b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's possible to ->read without ->open (see one class of oopses below). The fix is new API called proc_create() which makes sure ->proc_fops are set up before gluing PDE to main tree. Typical new code looks like: pde = proc_create("foo", 0, NULL, &foo_proc_fops); if (!pde) return -ENOMEM; Fix most networking users for a start. In the long run, create_proc_entry() for regular files will go. In addition to this, proc_create_data is introduced to fix reading from proc without PDE->data. The race is basically the same as above. create_proc_entries is replaced in the entire kernel code as new method is also simply better. This patch: The problem is the same as for de->proc_fops. Right now PDE becomes visible without data set. So, the entry could be looked up without data. This, in most cases, will simply OOPS. proc_create_data call is created to address this issue. proc_create now becomes a wrapper around it. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Chris Mason <chris.mason@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jaroslav Kysela <perex@suse.cz> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Karsten Keil <kkeil@suse.de> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Neil Brown <neilb@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Osterlund <petero2@telia.com> Cc: Pierre Peiffer <peifferp@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Alexey Dobriyan	b640a89ddd	proc: convert /proc/tty/ldiscs to seq_file interface Note: THIS_MODULE and header addition aren't technically needed because this code is not modular, but let's keep it anyway because people can copy this code into modular code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:20 -07:00
Alexey Dobriyan	8731f14d37	proc: remove ->get_info infrastructure Now that last dozen or so users of ->get_info were removed, ditch it too. Everyone sane shouldd have switched to seq_file interface long ago. P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc is long-standing crap, BTW, thus a) put ->read_proc/->write_proc/read_proc_entry() users on death row, b) new such users should be rejected, c) everyone is encouraged to convert his favourite ->read_proc user or I'll do it, lazy bastards. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:19 -07:00
Alexey Dobriyan	c74c120a21	proc: remove proc_root from drivers Remove proc_root export. Creation and removal works well if parent PDE is supplied as NULL -- it worked always that way. So, one useless export removed and consistency added, some drivers created PDEs with &proc_root as parent but removed them as NULL and so on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	928b4d8c89	proc: remove proc_root_driver Use creation by full path: "driver/foo". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	36a5aeb878	proc: remove proc_root_fs Use creation by full path instead: "fs/foo". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	9c37066d88	proc: remove proc_bus Remove proc_bus export and variable itself. Using pathnames works fine and is slightly more understandable and greppable. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	5e971dce0b	proc: drop several "PDE valid/invalid" checks proc-misc code is noticeably full of "if (de)" checks when PDE passed is always valid. Remove them. Addition of such check in proc_lookup_de() is for failed lookup case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:18 -07:00
Alexey Dobriyan	7cee4e00e0	proc: less special case in xlate code If valid "parent" is passed to proc_create/remove_proc_entry(), then name of PDE should consist of only one path component, otherwise creation or or removal will fail. However, if NULL is passed as parent then create/remove accept full path as a argument. This is arbitrary restriction -- all infrastructure is in place. So, patch allows the following to succeed: create_proc_entry("foo/bar", 0, pde_baz); remove_proc_entry("baz/foo/bar", &proc_root); Also makes the following to behave identically: create_proc_entry("foo/bar", 0, NULL); create_proc_entry("foo/bar", 0, &proc_root); Discrepancy noticed by Den Lunev (IIRC). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	f649d6d326	proc: simplify locking in remove_proc_entry() proc_subdir_lock protects only modifying and walking through PDE lists, so after we've found PDE to remove and actually removed it from lists, there is no need to hold proc_subdir_lock for the rest of operation. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Roland McGrath	638fa202cd	procfs: mem permission cleanup This cleans up the permission checks done for /proc/PID/mem i/o calls. It puts all the logic in a new function, check_mem_permission(). The old code repeated the (!MAY_PTRACE(task) \|\| !ptrace_may_attach(task)) magical expression multiple times. The new function does all that work in one place, with clear comments. The old code called security_ptrace() twice on successful checks, once in MAY_PTRACE() and once in __ptrace_may_attach(). Now it's only called once, and only if all other checks have succeeded. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	0d5c9f5f59	proc: switch to proc_create() Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Matt Helsley	925d1c401f	procfs task exe symlink The kernel implements readlink of /proc/pid/exe by getting the file from the first executable VMA. Then the path to the file is reconstructed and reported as the result. Because of the VMA walk the code is slightly different on nommu systems. This patch avoids separate /proc/pid/exe code on nommu systems. Instead of walking the VMAs to find the first executable file-backed VMA we store a reference to the exec'd file in the mm_struct. That reference would prevent the filesystem holding the executable file from being unmounted even after unmapping the VMAs. So we track the number of VM_EXECUTABLE VMAs and drop the new reference when the last one is unmapped. This avoids pinning the mounted filesystem. [akpm@linux-foundation.org: improve comments] [yamamoto@valinux.co.jp: fix dup_mmap] Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com> Cc:"Eric W. Biederman" <ebiederm@xmission.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
Alexey Dobriyan	e93b4ea20a	proc: print more information when removing non-empty directories This usually saves one recompile to insert similar printk like below. :) Sample nastygram: remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar' ------------[ cut here ]------------ WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200() Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor Pid: 3034, comm: rmmod Tainted: G M 2.6.25-rc1 #5 Call Trace: [<ffffffff80231974>] warn_on_slowpath+0x64/0x90 [<ffffffff80232a6e>] printk+0x4e/0x60 [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200 [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0 [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40 [<ffffffff8025effd>] sys_delete_module+0x14d/0x200 [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67 [<ffffffff8031c307>] __up_read+0x27/0xa0 [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80 ---[ end trace 10ef850597e89c54 ]--- Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:17 -07:00
WANG Cong	4220b7fe89	elf: fix shadowed variables in fs/binfmt_elf.c Fix these sparse warings: fs/binfmt_elf.c:1749:29: warning: symbol 'tmp' shadows an earlier one fs/binfmt_elf.c:1734:28: originally declared here fs/binfmt_elf.c:2009:26: warning: symbol 'vma' shadows an earlier one fs/binfmt_elf.c:1892:24: originally declared here [akpm@linux-foundation.org: chose better variable name] Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:16 -07:00
Cyrill Gorcunov	6970c8eff8	BINFMT: fill_elf_header cleanup - use straight memset first This patch does simplify fill_elf_header function by setting to zero the whole elf header first. So we fillup the fields we really need only. before: text data bss dec hex filename 11735 80 0 11815 2e27 fs/binfmt_elf.o after: text data bss dec hex filename 11710 80 0 11790 2e0e fs/binfmt_elf.o viola, 25 bytes of text is freed Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:16 -07:00

... 2 3 4 5 6 ...

9177 Commits