linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-18 18:11:56 +00:00

Author	SHA1	Message	Date
Wanlong Gao	b2f4edb335	jbd2: use kmem_cache_zalloc wrapper instead of flag Use kmem_cache_zalloc wrapper instead of flag __GFP_ZERO. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-06-01 00:10:32 -04:00
Salman Qazi	95599968d1	ext4: remove mb_groups before tearing down the buddy_cache We can't have references held on pages in the s_buddy_cache while we are trying to truncate its pages and put the inode. All the pages must be gone before we reach clear_inode. This can only be guaranteed if we can prevent new users from grabbing references to s_buddy_cache's pages. The original bug can be reproduced and the bug fix can be verified by: while true; do mount -t ext4 /dev/ram0 /export/hda3/ram0; \ umount /export/hda3/ram0; done & while true; do cat /proc/fs/ext4/ram0/mb_groups; done Signed-off-by: Salman Qazi <sqazi@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:52:14 -04:00
Salman Qazi	02b7831019	ext4: add ext4_mb_unload_buddy in the error path ext4_free_blocks fails to pair an ext4_mb_load_buddy with a matching ext4_mb_unload_buddy when it fails a memory allocation. Signed-off-by: Salman Qazi <sqazi@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:51:27 -04:00
Theodore Ts'o	79906964a1	ext4: don't trash state flags in EXT4_IOC_SETFLAGS In commit `353eb83c` we removed i_state_flags with 64-bit longs, but when handling the EXT4_IOC_SETFLAGS ioctl, we replace i_flags directly, which trashes the state flags which are stored in the high 32-bits of i_flags on 64-bit platforms. So use the ext4_{set,clear}_inode_flags() functions which use atomic bit manipulation functions instead. Reported-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:46:01 -04:00
Tao Ma	9660755100	ext4: let getattr report the right blocks in delalloc+bigalloc In delayed allocation, i_reserved_data_blocks now indicates clusters, not blocks. So report it in the right number. This can be easily exposed by the following command: echo foo > blah; du -hc blah; sync; du -hc blah Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-05-31 22:54:16 -04:00
Linus Torvalds	a00b6151a2	Merge branch 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields. * 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits) nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek SUNRPC: split upcall function to extract reusable parts nfsd: allocate id-to-name and name-to-id caches in per-net operations. nfsd: make name-to-id cache allocated per network namespace context nfsd: make id-to-name cache allocated per network namespace context nfsd: pass network context to idmap init/exit functions nfsd: allocate export and expkey caches in per-net operations. nfsd: make expkey cache allocated per network namespace context nfsd: make export cache allocated per network namespace context nfsd: pass pointer to export cache down to stack wherever possible. nfsd: pass network context to export caches init/shutdown routines Lockd: pass network namespace to creation and destruction routines NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches nfsd: pass pointer to expkey cache down to stack wherever possible. nfsd: use hash table from cache detail in nfsd export seq ops nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops nfsd: use exp_put() for svc_export_cache put nfsd: use cache detail pointer from svc_export structure on cache put nfsd: add link to owner cache detail to svc_export structure nfsd: use passed cache_detail pointer expkey_parse() ...	2012-05-31 18:18:11 -07:00
Linus Torvalds	08615d7d85	Merge branch 'akpm' (Andrew's patch-bomb) Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...	2012-05-31 18:10:18 -07:00
Cyrill Gorcunov	5b172087f9	c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat We would like to have an ability to restore command line arguments and program environment pointers but first we need to obtain them somehow. Thus we put these values into /proc/$pid/stat. The exit_code is needed to restore zombie tasks. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Cyrill Gorcunov	818411616b	fs, proc: introduce /proc/<pid>/task/<tid>/children entry When we checkpoint a task we need to know the list of children the task has, but there is no easy and fast way to generate the reverse parent->children chain from an arbitrary <pid> (while a parent pid is provided in the "PPid" field of /proc/<pid>/status). So instead of walking over all pids in the system (creating one big process tree in memory, just to figure out which children a task has) -- we add an explicit /proc/<pid>/task/<tid>/children entry, because the kernel already has this kind of information but it is not yet exported. This lists first-level children only, not the whole process tree. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Christopher Yeoh	ac34ebb3a6	aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after changes made to support CMA in an earlier patch. Rather than having an additional check_access parameter to these functions, the first parameter type is overloaded to allow the caller to specify CHECK_IOVEC_ONLY which means check that the contents of the iovec are valid, but do not check the memory that they point to. This is used by process_vm_readv/writev where we need to validate that an iovec passed to the syscall is valid but do not want to check the memory that it points to at this point because it refers to an address space in another process. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Sha Zhengju	ee62c6b2dc	eventfd: change int to __u64 in eventfd_signal() eventfd_ctx->count is an __u64 counter which is allowed to reach ULLONG_MAX. eventfd_write() adds a __u64 value to "count", but the kernel side eventfd_signal() only adds an int value to it. Make them consistent. [akpm@linux-foundation.org: update interface documentation] Signed-off-by: Sha Zhengju <handai.szj@taobao.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Vladimir Serbinenko	71ca97da9d	fs/nls: add Apple NLS HFS has support for NLS. However the relevant NLS tables are missing. Here they are automatically transformed from the tables at unicode.org. Codepages requiring special handling like CJK, RTL or Brahmic ones are not included in this patch. [akpm@linux-foundation.org: add unicode.org copyright and permission notices] Signed-off-by: Vladimir Serbinenko <phcoder@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Konstantin Khlebnikov	bca1554373	proc/smaps: show amount of nonlinear ptes in vma Currently, nonlinear mappings can not be distinguished from ordinary mappings. This patch adds into /proc/pid/smaps line "Nonlinear: <size> kB", where size is amount of nonlinear ptes in vma, this line appears only if VM_NONLINEAR is set. This information may be useful not only for checkpoint/restore project. Requested by Pavel Emelyanov. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Konstantin Khlebnikov	b1d4d9e0cb	proc/smaps: carefully handle migration entries Currently smaps reports migration entries as "swap"; as a result "swap" can appear in a shared mapping. This patch converts migration entries into pages and handles them as usual. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Konstantin Khlebnikov	052fb0d635	proc: report file/anon bit in /proc/pid/pagemap This is an implementation of Andrew's proposal to extend the pagemap file bits to report what is missing about tasks' working set. The problem with the working set detection is multilateral. In the criu (checkpoint/restore) project we dump the tasks' memory into image files and to do it properly we need to detect which pages inside mappings are really in use. The mincore syscall I thought could help with this did not. First, it doesn't report swapped pages, thus we cannot find out which parts of anonymous mappings to dump. Next, it does report pages from page cache as present even if they are not mapped, and it doesn't make that has not been cow-ed. Note, that issue with swap pages is critical -- we must dump swap pages to image file. But the issues with file pages are optimization -- we can take all file pages to image, this would be correct, but if we know that a page is not mapped or not cow-ed, we can remove them from dump file. The dump would still be self-consistent, though significantly smaller in size (up to 10 times smaller on real apps). Andrew noticed, that the proc pagemap file solved 2 of 3 above issues -- it reports whether a page is present or swapped and it doesn't report not mapped page cache pages. But, it doesn't distinguish cow-ed file pages from not cow-ed. I would like to make the last unused bit in this file to report whether the page mapped into respective pte is PageAnon or not. [comment stolen from Pavel Emelyanov's v1 patch] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Jan Engelhardt	715be1fce0	procfs: use more appropriate types when dumping /proc/N/stat - use int for priority and nice, since task_nice()/task_prio() return that - field 24: get_mm_rss() returns unsigned long Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Alexey Dobriyan	af5e617143	proc: pass "fd" by value in /proc/*/{fd,fdinfo} code Pass "fd" directly, not via pointer -- one less memory read. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Alexey Dobriyan	f05ed3f1ab	proc: don't do dummy rcu_read_lock/rcu_read_unlock on error path rcu_read_lock()/rcu_read_unlock() is nop for TINY_RCU, but is not a nop for, say, PREEMPT_RCU. proc_fill_cache() is called without RCU lock, there is no need to lock/unlock on error path, simply jump out of the loop. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Cong Wang	2344bec788	proc: use mm_access() instead of ptrace_may_access() mm_access() handles this much better, and avoids some race conditions. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Cong Wang	e7dcd9990e	proc: remove mm_for_maps() mm_for_maps() is a simple wrapper for mm_access(), and the name is misleading, so just remove it and use mm_access() directly. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Cong Wang	b409e578d9	proc: clean up /proc/<pid>/environ handling Similar to `e268337dfe` ("proc: clean up and fix /proc/<pid>/mem handling"), move the check of permission to open(), this will simplify read() code. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Namjae Jeon	f0aac6162e	fat: use fat_msg_ratelimit() in fat__get_entry() If an application tries to lookup (opendir/readdir/stat) 5000 files on a fatfs USB device and the device is unplugged, many messages occur, shown below. This makes the application slow. So use the new fat_msg_ratelimit() to decrease the messaging rate. #> ./file_lookup_testcase ./files_directory/ usb 2-1.4: USB disconnect, device number 4 FAT-fs (sda1): FAT read failed (blocknr 2631) FAT-fs (sda1): Directory bread(block 396816) failed FAT-fs (sda1): Directory bread(block 396817) failed FAT-fs (sda1): Directory bread(block 396818) failed FAT-fs (sda1): Directory bread(block 396819) failed FAT-fs (sda1): Directory bread(block 396820) failed FAT-fs (sda1): Directory bread(block 396821) failed FAT-fs (sda1): Directory bread(block 396822) failed FAT-fs (sda1): Directory bread(block 396823) failed FAT-fs (sda1): Directory bread(block 406824) failed FAT-fs (sda1): Directory bread(block 406825) failed FAT-fs (sda1): Directory bread(block 406826) failed FAT-fs (sda1): Directory bread(block 406827) failed FAT-fs (sda1): Directory bread(block 406828) failed FAT-fs (sda1): Directory bread(block 406829) failed FAT-fs (sda1): Directory bread(block 406830) failed FAT-fs (sda1): Directory bread(block 406831) failed FAT-fs (sda1): Directory bread(block 417696) failed FAT-fs (sda1): Directory bread(block 417697) failed FAT-fs (sda1): Directory bread(block 417698) failed FAT-fs (sda1): Directory bread(block 417699) failed FAT-fs (sda1): Directory bread(block 417700) failed FAT-fs (sda1): Directory bread(block 417701) failed FAT-fs (sda1): Directory bread(block 417702) failed FAT-fs (sda1): Directory bread(block 417703) failed FAT-fs (sda1): FAT read failed (blocknr 2631) FAT-fs (sda1): Directory bread(block 396816) failed FAT-fs (sda1): Directory bread(block 396817) failed FAT-fs (sda1): Directory bread(block 396818) failed FAT-fs (sda1): Directory bread(block 396819) failed FAT-fs (sda1): Directory bread(block 396820) failed FAT-fs (sda1): Directory bread(block 396821) failed Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Namjae Jeon	b742c34153	fat: add fat_msg_ratelimit() Add a fat_msg_ratelimit() to limit the message generation rate. Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	78491189dd	fat: switch to fsinfo_inode Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO block. The FSINFO block contains non-essential data about the amount of free clusters and the next free cluster. FAT file-system can always find out this information by scanning the FAT table, but having it in the FSINFO block may speed things up sometimes. So FAT file-system relies on the VFS superblock write-out services to make sure the FSINFO block is written out to the media from time to time. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds no matter what. So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super' VFS service, and then remove it together with the kernel thread. This patch switches the FAT FSINFO block management from '->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as dirty and let VFS invoke the '->write_inode' call-back when needed, where we write-out the FSINFO block. This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in R/O mode. As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT call-back function because they become unneeded. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	330fe3c4c6	fat: mark superblock as dirty less often Preparation for further changes. It touches few functions in fatent.c and prevents them from marking the superblock as dirty unnecessarily often. Namely, instead of marking it as dirty in the internal tight loops - do it only once at the end of the functions. And instead of marking it as dirty while holding the FAT table lock, do it outside the lock. The reason for this patch is that marking the superblock as dirty will soon become a little bit heavier operation, so it is cleaner to do this only when it is necessary. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	90b436657e	fat: introduce mark_fsinfo_dirty helper A preparation patch which introduces a 'mark_fsinfo_dirty()' helper function which just sets the 's_dirt' flag to 1 so far. I'll add more code to this helper later, so I do not mark it as 'inline'. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Artem Bityutskiy	020ac5b6be	fat: introduce special inode for managing the FSINFO block This patchset makes fatfs stop using the VFS '->write_super()' method for writing out the FSINFO block. The final goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread wakes up every 5 seconds (by default) and calls '->write_super()' for all mounted file-systems. And the bad thing is that this is done even if all the superblocks are clean. Moreover, some file-systems do not even need this and they do not register the '->write_super()' method at all (e.g., btrfs). So 'sync_supers()' most often just generates useless wake-ups and wastes power. I am trying to make all file-systems independent of '->write_super()' and plan to remove 'sync_supers()' and '->write_super' completely once there are no more users. The '->write_super()' method is mostly used by baroque file-systems like hfs, udf, etc. Modern file-systems like btrfs and xfs do not use it. This justifies removing this stuff from VFS completely and making every FS self-manage its own superblock. Tested with xfstests. This patch: Preparation for further changes. It introduces a special inode ('fsinfo_inode') in FAT file-system which we'll later use for managing the FSINFO block. Note, there is already one special inode ('fat_inode') which is used for managing the FAT tables. Introduce new 'MSDOS_FSINFO_INO' constant for this special inode. It is safe to do because FAT file-system does not store inode numbers on the media but generates them run-time. I've also cleaned up the comment to existing 'MSDOS_ROOT_INO' constant, while on it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Dan Carpenter	7bc1bac77a	HPFS: remove PRINTK() macro The PRINTK() macro isn't really used. Let's just remove it because it is ugly and out of date. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Ryusuke Konishi	11475975dd	nilfs2: flush disk caches in syncing There are two cases that the cache flush is needed to avoid data loss against unexpected hang or power failure. One is sync file function (i.e. nilfs_sync_file) and another is checkpointing ioctl. This issues a cache flush request to device for such cases if barrier mount option is enabled, and makes sure data really is on persistent storage on their completion. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Will Deacon	a1d494495c	pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command As described in commit `07d106d0a3` ("vfs: fix up ENOIOCTLCMD error handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl command which they don't understand. Doing so will result in -ENOTTY being returned to userspace, which matches the behaviour of the compat layer if it fails to translate an ioctl command. This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of -EINVAL when passed an unknown ioctl command. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Alan Cox <alan@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Xi Wang	a3860c1c5d	introduce SIZE_MAX ULONG_MAX is often used to check for integer overflow when calculating allocation size. While ULONG_MAX happens to work on most systems, there is no guarantee that `size_t' must be the same size as `long'. This patch introduces SIZE_MAX, the maximum value of `size_t', to improve portability and readability for allocation size validation. Signed-off-by: Xi Wang <xi.wang@gmail.com> Acked-by: Alex Elder <elder@dreamhost.com> Cc: David Airlie <airlied@linux.ie> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:26 -07:00
J. Bruce Fields	6eccece90b	nfsd4: fix, consolidate client_has_state Whoops: first, I reimplemented the already-existing has_resources without noticing; second, I got the test backwards. I did pick a better name, though. Combine the two.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:39 -04:00
J. Bruce Fields	b9831b59f3	nfsd4: don't remove rebooted client record until confirmation In the NFSv4.1 client-reboot case we're currently removing the client's previous state in exchange_id. That's wrong--we should be waiting till the confirming create_session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:34 -04:00
J. Bruce Fields	32f16b3823	nfsd4: remove some dprintk's and a comment The comment is redundant, and if we really want dprintk's here they'd probably be better in the common (check-slot_seqid) code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:31 -04:00
J. Bruce Fields	778df3f0fe	nfsd4: return "real" sequence id in confirmed case The client should ignore the returned sequence_id in the case where the CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed case "1" is always the right response. So it shouldn't actually matter what we return here. We could continue returning 1 just to catch clients ignoring the spec here, but I'd rather be generous. Other things equal, returning the existing sequence_id seems more informative. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:27 -04:00
J. Bruce Fields	0f1ba0ef21	nfsd4: fix exchange_id to return confirm flag Otherwise nfsd4_set_ex_flags writes over the return flags. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:21 -04:00
J. Bruce Fields	7447758be7	nfsd4: clarify that renewing expired client is a bug This can't happen: - cl_time is zeroed only by unhash_client_locked, which is only ever called under both the state lock and the client lock. - every caller of renew_client() should have looked up a (non-expired) client and then called renew_client() all without dropping the state lock. - the only other caller of renew_client_locked() is release_session_client(), which first checks under the client_lock that the cl_time is nonzero. So make it clear that this is a bug, not something we handle. I can't quite bring myself to make this a BUG(), though, as there are a lot of renew_client() callers, and returning here is probably safer than a BUG(). We'll consider making it a BUG() after some more cleanup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:14 -04:00
J. Bruce Fields	90d700b779	nfsd4: simpler ordering of setclientid_confirm checks The cases here divide into two main categories: - if there's an unconfirmed record with a matching verifier, then this is a "normal", successful case: we're either creating a new client, or updating an existing one. - otherwise, this is a weird case: a replay, or a server reboot. Reordering to reflect that makes the code a bit more concise and the logic a lot easier to understand. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:07 -04:00
J. Bruce Fields	f3d03b9202	nfsd4: setclientid: remove pointless assignment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:06 -04:00
J. Bruce Fields	8695b90ac3	nfsd4: fix error return in non-matching-creds case Note CLID_INUSE is for the case where two clients are trying to use the same client-provided long-form client identifiers. But what we're looking at here is the server-returned shorthand client id--if those clash there's a bug somewhere. Fix the error return, pull the check out into common code, and do the check unconditionally in all cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:04 -04:00
J. Bruce Fields	788c1eba50	nfsd4: fix setclientid_confirm same_cred check New clients are created only by nfsd4_setclientid(), which always gives any new client a unique clientid. The only exception is in the "callback update" case, in which case it may create an unconfirmed client with the same clientid as a confirmed client. In that case it also checks that the confirmed client has the same credential. Therefore, it is pointless for setclientid_confirm to check whether a confirmed and unconfirmed client with the same clientid have matching credentials--they're guaranteed to. Instead, it should be checking whether the credential on the setclientid_confirm matches either of those. Otherwise, it could be anyone sending the setclientid_confirm. Granted, I can't see why anyone would, but still it's probably safer to check. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:03 -04:00
J. Bruce Fields	34b232bb37	nfsd4: merge 3 setclientid cases to 2 Boy, is this simpler. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:02 -04:00
J. Bruce Fields	8f9307119d	nfsd4: pull out common code from setclientid cases Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:01 -04:00
J. Bruce Fields	ad72aae5ad	nfsd4: merge last two setclientid cases The code here is mostly the same. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:00 -04:00
J. Bruce Fields	63db46328a	nfsd4: setclientid/confirm comment cleanup Be a little more concise. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:59 -04:00
J. Bruce Fields	e98479b8d6	nfsd4: setclientid remove unnecessary terms from a logical expression Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	d5497fc693	nfsd4: move rq_flavor into svc_cred Move the rq_flavor into struct svc_cred, and use it in setclientid and exchange_id comparisons as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	8fbba96e5b	nfsd4: stricter cred comparison for setclientid/exchange_id The typical setclientid or exchange_id will probably be performed with a credential that maps to either root or nobody, so comparing just uid's is unlikely to be useful. So, use everything else we can get our hands on. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:57 -04:00
J. Bruce Fields	03a4e1f6dd	nfsd4: move principal name into svc_cred Instead of keeping the principal name associated with a request in a structure that's private to auth_gss and using an accessor function, move it to svc_cred. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:55 -04:00
J. Bruce Fields	631fc9ea05	nfsd4: allow removing clients not holding state RFC 5661 actually says we should allow an exchange_id to remove a matching client, even if the exchange_id comes from a different principal, if the victim client lacks any state. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:55 -04:00
J. Bruce Fields	136e658d62	nfsd4: rearrange exchange_id logic to simplify Minor cleanup: it's simpler to have separate code paths for the update and non-update cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:54 -04:00
J. Bruce Fields	2dbb269dfe	nfsd4: exchange_id cleanup: comments Make these comments a bit more concise and uniform. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:53 -04:00
J. Bruce Fields	83e08fd46c	nfsd4: exchange_id cleanup: local shorthands for repeated tests Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:52 -04:00
J. Bruce Fields	1a308118c2	nfsd4: allow an EXCHANGE_ID to kill a 4.0 client Following rfc 5661 section 2.4.1, we can permit a 4.1 client to remove an established 4.0 client's state. (But we don't allow updates.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:52 -04:00
J. Bruce Fields	ea236d0704	nfsd4: exchange_id: check creds before killing confirmed client We mustn't allow a client to destroy another client with established state unless it has the right credential. And some minor cleanup. (Note: our comparison of credentials is actually pretty bogus currently; that will need to be fixed in another patch.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:51 -04:00
J. Bruce Fields	2786cc3a05	nfsd4: exchange_id error cleanup There's no point to the dprintk here as the main proc_compound loop already does this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:50 -04:00
J. Bruce Fields	11ae681052	nfsd4: exchange_id has a pointless copy We just verified above that these two verifiers are already the same. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:50 -04:00
Weston Andros Adamson	5fb35a3a9b	nfsd: return 0 on reads of fault injection files debugfs read operations were returning the contents of an uninitialized u64. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:48 -04:00
Jeff Layton	ce0fc43c5a	nfsd: wrap all accesses to st_deny_bmap Handle the st_deny_bmap in a similar fashion to the st_access_bmap. Add accessor functions and use those instead of bare bitops. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:48 -04:00
Jeff Layton	82c5ff1b14	nfsd: wrap accesses to st_access_bmap Currently, we do this for the most part with "bare" bitops, but eventually we'll need to expand the share mode code to handle access and deny modes on other nodes. In order to facilitate that code in the future, move to some generic accessor functions. For now, these are mostly static inlines, but eventually we'll want to move these to "real" functions that are able to handle multi-node configurations or have a way to "swap in" new operations to be done in lieu of or in conjunction with these atomic bitops. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:47 -04:00
Jeff Layton	3a3286147f	nfsd: make test_share a bool return All of the callers treat the return that way already. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:46 -04:00
Jeff Layton	5ae037e599	nfsd: consolidate set_access and set_deny These functions are identical. Also, rename them to bmap_to_share_mode to better reflect what they do, and have them just return the result instead of passing in a pointer to the storage location. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:46 -04:00
Chuck Lever	f07ea10dc8	NFSD: SETCLIENTID_CONFIRM returns NFS4ERR_CLID_INUSE too often According to RFC 3530bis, the only items SETCLIENTID_CONFIRM processing should be concerned with is the clientid, clientid verifier, and principal. The client's IP address is not supposed to be interesting. And, NFS4ERR_CLID_INUSE is meant only for principal mismatches. I triggered this logic with a prototype UCS client -- one that uses the same nfs_client_id4 string for all servers. The client mounted our server via its IPv4, then via its IPv6 address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:45 -04:00
Stanislav Kinsbursky	8dbf28e495	LockD: add debug message to start and stop functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:44 -04:00
Stanislav Kinsbursky	3d1221dfa9	LockD: service start function introduced This is just a code move, which from my POV makes the code look better. I.e. now on start we have 3 different stages: 1) Service creation. 2) Service per-net data allocation. 3) Service start. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:44 -04:00
Stanislav Kinsbursky	7d13ec761a	LockD: move global usage counter manipulation from error path Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:43 -04:00
Stanislav Kinsbursky	2445223909	LockD: service creation function introduced This function creates service if it doesn't exist, or increases usage counter if it does, and returns a pointer to it. The usage counter will be dropped by svc_destroy() later in lockd_up(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:42 -04:00
Stanislav Kinsbursky	dbf9b5d74c	LockD: use existing per-net data function on service creation This patch also replaces svc_rpcb_setup() with svc_bind(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:42 -04:00
Stanislav Kinsbursky	4db77695bf	LockD: pass service to per-net up and down functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:41 -04:00
Stanislav Kinsbursky	786185b5f8	SUNRPC: move per-net operations from svc_destroy() The idea is to separate service destruction and per-net operations, because these are two different things and the mix looks ugly. Notes: 1) For NFS server this patch looks ugly (sorry for that). But this place will be rewritten soon during NFSd containerization. 2) LockD per-net counter increase in lockd_up() was moved prior to make_socks() to make lockd_down_net() call safe in case of error. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:40 -04:00
Stanislav Kinsbursky	9793f7c889	SUNRPC: new svc_bind() routine introduced This new routine is responsible for service registration in a specified network context. The idea is to separate service creation from per-net operations. Note also: since registering service with svc_bind() can fail, the service will be destroyed and during destruction it will try to unregister itself from rpcbind. In this case unregistration has to be skipped. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:39 -04:00
Weston Andros Adamson	e7a0444aef	nfsd: add IPv6 addr escaping to fs_location hosts The fs_location->hosts list is split on colons, but this doesn't work when IPv6 addresses are used (they contain colons). This patch adds the function nfsd4_encode_components_esc() to allow the caller to specify escape characters when splitting on 'sep'. In order to fix referrals, this patch must be used with the mountd patch that similarly fixes IPv6 [] escaping. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:38 -04:00
J. Bruce Fields	45eaa1c1a1	nfsd4: fix change attribute endianness Though actually this doesn't matter much, as NFSv4.0 clients are required to treat the change attribute as opaque. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:38 -04:00
J. Bruce Fields	d1829b3824	nfsd4: fix free_stateid return endianness Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:37 -04:00
J. Bruce Fields	57b7b43b40	nfsd4: int/__be32 fixes In each of these cases there's a simple unambiguous correct choice, and no actual bug. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:37 -04:00
J. Bruce Fields	bc1b542be9	nfsd4: preserve __user annotation on cld downcall msg Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:36 -04:00
J. Bruce Fields	2355c59644	nfsd4: fix missing "static" Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:35 -04:00
J. Bruce Fields	bfa4b36525	nfsd: state.c should include current_stateid.h OK, admittedly I'm mainly just trying to shut sparse up. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:35 -04:00
Chris Mason	1e20932a23	Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus Conflicts: fs/btrfs/ulist.h Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-05-31 16:49:53 -04:00
Trond Myklebust	1d59d61f60	NFS: Ensure that setattr and getattr wait for O_DIRECT write completion Use the same mechanism as the block devices are using, but move the helper functions from fs/direct-io.c into fs/inode.c to remove the dependency on CONFIG_BLOCK. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Fred Isaman <iisaman@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 11:41:36 -07:00
Jan Schmidt	c31931088f	Btrfs: fix tree mod log rewinded level and rewinding of moved keys When we rewind REMOVE_WHILE_FREEING operations, there's code that allocates a fresh buffer instead of cloning the old one. Setting that buffer's level correctly was missing in this case. When rewinding a MOVE_KEYS operation, btrfs_node_key_ptr_offset(slot) was missing for memmove_extent_buffer()'s arguments. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-05-31 19:56:19 +02:00
Jan Schmidt	f395694c2c	Btrfs: fix tree mod log del_ptr Logging for del_ptr when we're not deleting the last pointer was wrong. This fixes both, duplicate log entries and log sequence. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-05-31 19:56:19 +02:00
Jan Schmidt	e9b7fd4d8b	Btrfs: add tree_mod_dont_log helper Replace duplicate code by small inline helper function. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-05-31 19:56:18 +02:00
Jan Schmidt	926dd8a640	Btrfs: add missing spin_lock for insertion into tree mod log tree_mod_alloc calls __get_tree_mod_seq and must acquire a spinlock before doing so. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-05-31 19:56:18 +02:00
Jan Schmidt	3301958b7c	Btrfs: add inodes before dropping the extent lock in find_all_leafs We must build up the inode list with the extent lock held after following indirect refs. This also requires an extension to ulists, which allows to modify the stored aux value in case a key already exists in the list. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-05-31 19:53:08 +02:00
Al Viro	e5467859f7	split ->file_mmap() into ->mmap_addr()/->mmap_file() ... i.e. file-dependent and address-dependent checks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-31 13:11:54 -04:00
Theodore Ts'o	f3fc0210c0	ext4: add missing save_error_info() to ext4_error() The ext4_error() function is missing a call to save_error_info(). Since this is the function which marks the file system as containing an error, this oversight (which was introduced in 2.6.36) is quite significant, and should be backported to older stable kernels with high urgency. Reported-by: Ken Sumrall <ksumrall@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: ksumrall@google.com Cc: stable@kernel.org	2012-05-30 23:00:16 -04:00
Theodore Ts'o	2c0544b235	ext4: add debugging trigger for ext4_error() Make it easy to test whether or not the error handling subsystem in ext4 is working correctly. This allows us to simulate an ext4_error() by echoing a string to /sys/fs/ext4/<dev>/trigger_fs_error. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: ksumrall@google.com	2012-05-30 22:56:46 -04:00
Al Viro	7696e0c37f	binfmt_flat: use vm_munmap, we are missing ->mmap_sem there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:56 -04:00
Al Viro	5a5e4c2eca	binfmt_elf: switch elf_map() to vm_mmap/vm_munmap No reason to hold ->mmap_sem over the sequence Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:55 -04:00
Al Viro	63d37a84ab	vfs: umount_tree() might be called on subtree that had never made it __mnt_make_shortterm() in there undoes the effect of __mnt_make_longterm() we'd done back when we set ->mnt_ns non-NULL; it should not be done to vfsmounts that had never gone through commit_tree() and friends. Kudos to lczerner for catching that one... Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:55 -04:00
Will Deacon	46ce341b2f	pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command As described in commit `07d106d0a` ("vfs: fix up ENOIOCTLCMD error handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl command which they don't understand. Doing so will result in -ENOTTY being returned to userspace, which matches the behaviour of the compat layer if it fails to translate an ioctl command. This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of -EINVAL when passed an unknown ioctl command. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:55 -04:00
J. Bruce Fields	3f50fff4da	vfs: remove unused __d_splice_alias argument Nobody sets want_disconn any more. Reported-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:54 -04:00
J. Bruce Fields	7732a557b1	vfs: stop d_splice_alias creating directory aliases A directory should never have more than one dentry pointing to it. But d_splice_alias() will add one if it finds a directory with an already-existing non-DISCONNECTED dentry. I can't find an obvious reproducer, but I also can't see what prevents d_splice_alias() from encountering such a case. It therefore seems safest to allow d_splice_alias to use any dentry it finds. (Prior to the removal of dentry_unhash() from vfs_rmdir(), around v3.0, this could cause an nfsd deadlock like this: - Somebody attempts to remove a non-empty directory. - The dentry_unhash() in vfs_rmdir() unhashes the dentry pointing to the non-empty directory. - ->rmdir() then fails with -ENOTEMPTY - Before the vfs_rmdir() caller reaches dput(), an nfsd process in rename looks up the directory by filehandle; at the end of that lookup, this dentry is found by d_alloc_anon(), and a reference is taken on it, preventing dput() from removing it. - A regular lookup of the directory calls d_splice_alias(), finds only an unhashed (not a DISCONNECTED) dentry, and instead adds a new one, so the directory now has two dentries. - The nfsd process in rename, which was previously looking up the source directory of the rename, now looks up the target directory (which is the same), and gets the dentry newly created by the previous lookup. - The rename, seeing two different dentries, assumes this is a cross-directory rename and attempts to take the i_mutex on the directory twice. That reproducer no longer exists, but I don't think there was anything fundamentally incorrect about the vfs_rmdir() behavior there, so I think the real fault was here in d_splice_alias().) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:54 -04:00
Dan Carpenter	fd657170c0	fsnotify: remove unused parameter from send_to_group() We don't use "mnt" anymore in send_to_group() after `1968f5eed5` ("fanotify: use both marks when possible") was applied. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:53 -04:00
Dmitry Kasatkin	799243a389	vfs: increment iversion when a file is truncated When a file is truncated with truncate()/ftruncate() and then closed, iversion is not updated. This patch uses ATTR_SIZE flag as an indication to increment iversion. Mimi said: On fput(), i_version is used to detect and flag files that have changed and need to be re-measured in the IMA measurement policy. When a file is truncated with truncate()/ftruncate() and then closed, i_version is not updated. As a result, although the file has changed, it will not be re-measured and added to the IMA measurement list on subsequent access. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:53 -04:00
Shai Fultheim	a0a9b04337	fs: Move bh_cachep to the __read_mostly section bh_cachep is only written to once on initialization, so move it to the __read_mostly section. Signed-off-by: Shai Fultheim <shai@scalemp.com> Signed-off-by: Vlad Zolotarov <vlad@scalemp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:52 -04:00
Cong Wang	3ed37648e1	fs: move file_remove_suid() to fs/inode.c file_remove_suid() is a generic function operates on struct file, it almost has no relations with file mapping, so move it to fs/inode.c. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:52 -04:00
Artem Bityutskiy	8bdc81c506	jffs2: get rid of jffs2_sync_super Currently JFFS2 file-system maps the VFS "superblock" abstraction to the write-buffer. Namely, it uses VFS services to synchronize the write-buffer periodically. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds no matter what. So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super' VFS service, and then remove it together with the kernel thread. This patch switches the JFFS2 write-buffer management from '->write_super()'/'->s_dirt' to a delayed work. Instead of setting the 's_dirt' flag we just schedule a delayed work for synchronizing the write-buffer. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:52 -04:00
Artem Bityutskiy	06688905cc	jffs2: remove unnecessary GC pass on sync We do not need to call 'jffs2_write_super()' on sync. This function causes a GC pass to make sure the current contents is pushed out with the data which we already have on the media. But this is not needed on unmount and only slows sync down unnecessarily. It is enough to just sync the write-buffer. This call was added by one of the generic VFS rework patch-sets, see `d579ed00aa`. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-30 21:04:51 -04:00

27407 Commits