linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-16 00:52:01 +00:00

Author	SHA1	Message	Date
Liu Bo	ed0eaa1498	Btrfs: make sure that we've made everything in pinned tree clean Since we have two trees for recording pinned extents, we need to go through both of them to make sure that we've done everything clean. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2012-06-15 11:42:27 -04:00
Liu Bo	6e841e32b1	Btrfs: avoid memory leak of extent state in error handling routine We've forgotten to clear extent states in pinned tree, which will results in space counter mismatch and memory leak: WARNING: at fs/btrfs/extent-tree.c:7537 btrfs_free_block_groups+0x1f3/0x2e0 [btrfs]() ... space_info 2 has 8380416 free, is not full space_info total=12582912, used=4096, pinned=4096, reserved=0, may_use=0, readonly=4194304 btrfs state leak: start 29364224 end 29376511 state 1 in tree ffff880075f20090 refs 1 ... Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2012-06-15 11:42:27 -04:00
Liu Bo	4e42ae1bdc	Btrfs: do not resize a seeding device Seeding devices are not supposed to change any more. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2012-06-15 11:42:26 -04:00
Liu Bo	bc1782374b	Btrfs: fix missing inherited flag in rename When we move a file into a directory with compression flag, we need to inherite BTRFS_INODE_COMPRESS and clear BTRFS_INODE_NOCOMPRESS as well. But if we move a file into a directory without compression flag, we need to clear both of them. It is the way how our setflags deals with compression flag, so keep the same behaviour here. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>	2012-06-15 11:33:30 -04:00
Chris Mason	acbcabd2de	Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus	2012-06-15 11:33:16 -04:00
Li Zefan	69e380d176	Btrfs: fix incompat flags setting It's a bug, but it happens to work, as BTRFS_COMPRESS_LZO == 2, which has only one bit set. Signed-off-by: Li Zefan <lizefan@huawei.com>	2012-06-14 21:30:57 -04:00
Li Zefan	6c282eb40e	Btrfs: fix defrag regression If a file has 3 small extents: \| ext1 \| ext2 \| ext3 \| Running "btrfs fi defrag" will only defrag the last two extents, if those extent mappings hasn't been read into memory from disk. This bug was introduced by commit `17ce6ef8d7` ("Btrfs: add a check to decide if we should defrag the range") The cause is, that commit looked into previous and next extents using lookup_extent_mapping() only. While at it, remove the code that checks the previous extent, since it's sufficient to check the next extent. Signed-off-by: Li Zefan <lizefan@huawei.com>	2012-06-14 21:30:55 -04:00
Josef Bacik	7ddf5a42d3	Btrfs: call filemap_fdatawrite twice for compression I removed this in an earlier commit and I was wrong. Because compression can return from filemap_fdatawrite() without having actually set any of it's pages as writeback() it can make filemap_fdatawait() do essentially nothing, and then we won't find any ordered extents because they may not have been created yet. So not only does this make fsync() completely useless, but it will also screw up if you truncate on a non-page aligned offset since we zero out the end and then wait on ordered extents and then call drop caches. We can drop the cache before the io completes and then we try to unpin the extent we just wrote we won't find it and everything goes sideways. So fix this by putting it back and put a giant comment there to keep me from trying to remove it in the future. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:30:54 -04:00
Josef Bacik	8180ef8894	Btrfs: keep inode pinned when compressing writes A user reported lots of problems using compression on the new code and it turns out part of the problem was that igrab() was failing when we added a new ordered extent. This is because when writing out an inode under compression we immediately return without actually doing anything to the pages, and then in another thread at some point down the line actually do the ordered dance. The problem is between the point that we start writeback and we actually add the ordered extent we could be trying to reclaim the inode, which makes igrab() return NULL. So we need to do an igrab() when we create the async extent and then drop it when we are done with it. This makes sure we stay pinned in memory until the ordered extent can get a reference on it and we are good to go. With this patch we no longer panic in btrfs_finish_ordered_io(). Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:30:53 -04:00
Josef Bacik	9c5085c147	Btrfs: implement ->show_devname Because btrfs can remove the device that was mounted we need to have a ->show_devname so that in this case we can print out some other device in the file system to /proc/mount. So if there are multiple devices in a btrfs file system we will just print the device with the lowest devid that we can find. This will make everything consistent and deal with device removal properly. The drawback is if you mount with a device that is higher than the lowest devicd it won't show up as the mounted device in /proc/mounts, but this is a small price to pay. This was inspired by Miao Xie's patch. Thanks, Reviewed-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:30:37 -04:00
Josef Bacik	606686eeac	Btrfs: use rcu to protect device->name Al pointed out that we can just toss out the old name on a device and add a new one arbitrarily, so anybody who uses device->name in printk could possibly use free'd memory. Instead of adding locking around all of this he suggested doing it with RCU, so I've introduced a struct rcu_string that does just that and have gone through and protected all accesses to device->name that aren't under the uuid_mutex with rcu_read_lock(). This protects us and I will use it for dealing with removing the device that we used to mount the file system in a later patch. Thanks, Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:16 -04:00
Josef Bacik	17ca04aff7	Btrfs: unlock everything properly in the error case for nocow I was getting hung on umount when a transaction was aborted because a range of one of the free space inodes was still locked. This is because the nocow stuff doesn't unlock anything on error. This fixed the problem and I verified that is what was happening. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:15 -04:00
Josef Bacik	ee670f0af3	Btrfs: fix btrfs_destroy_marked_extents So we're forcing the eb's to have their ref count set to 1 so invalidatepage works but this breaks lots of things, for example root nodes, and is just plain wrong, we don't need to just evict all of this stuff. Also drop the invalidatepage altogether and add a page_cache_release(). With this patch we no longer hang when trying to access the root nodes after an aborted transaction and we no longer leak memory. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:14 -04:00
Josef Bacik	7b8b92af58	Btrfs: abort the transaction if the commit fails If a transaction commit fails we don't abort it so we don't set an error on the file system. This patch fixes that by actually calling the abort stuff and then adding a check for a fs error in the transaction start stuff to make sure it is caught properly. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:13 -04:00
Josef Bacik	d7096fc3ef	Btrfs: wake up transaction waiters when aborting a transaction I was getting lots of hung tasks and a NULL pointer dereference because we are not cleaning up the transaction properly when it aborts. First we need to reset the running_transaction to NULL so we don't get a bad dereference for any start_transaction callers after this. Also we cannot rely on waitqueue_active() since it's just a list_empty(), so just call wake_up() directly since that will do the barrier for us and such. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:12 -04:00
Josef Bacik	b939d1ab76	Btrfs: fix locking in btrfs_destroy_delayed_refs The transaction abort stuff was throwing warnings from the list debugging code because we do a list_del_init outside of the delayed_refs spin lock. The delayed refs locking makes baby Jesus cry so it's not hard to get wrong, but we need to take the ref head mutex to make sure it's not being processed currently, and so if it is we need to drop the spin lock and then take and drop the mutex and do the search again. If we can take the mutex then we can safely remove the head from the list and carry on. Now when the transaction aborts I don't get the list debugging warnings. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:11 -04:00
Josef Bacik	beb42dd793	Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error While doing my enospc work I got a transaction abortion that resulted in a panic when we tried to unlock_page() an already unlocked page. This is because we aren't calling extent_clear_unlock_delalloc with the locked page so it was unlocking all the pages in the range. This is wrong since __extent_writepage expects to have the page locked still unless we return *page_started as 1. This should keep us from panicing. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-14 21:29:09 -04:00
J. Bruce Fields	bc2df47a40	nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels Most frequent symptom was a BUG triggering in expire_client, with the server locking up shortly thereafter. Introduced by `508dc6e110` "nfsd41: free_session/free_client must be called under the client_lock". Cc: stable@kernel.org Cc: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-14 13:54:08 -04:00
Stanislav Kinsbursky	12918b10d5	NFS: hard-code init_net for NFS callback transports In case of destroying mount namespace on child reaper exit, nsproxy is zeroed to the point already. So, dereferencing of it is invalid. This patch hard-code "init_net" for all network namespace references for NFS callback services. This will be fixed with proper NFS callback containerization. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-06-14 13:53:43 -04:00
Jan Schmidt	3310c36eef	Btrfs: fix race in tree mod log addition When adding to the tree modification log, we grab two locks at different stages. We must not drop the outer lock until we're done with section protected by the inner lock. This moves the unlock call for the outer lock to the appropriate position. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-14 18:52:39 +02:00
Jan Schmidt	3d7806eca4	Btrfs: add btrfs_next_old_leaf To make sense of the tree mod log, the backref walker not only needs btrfs_search_old_slot, but it also called btrfs_next_leaf, which in turn was calling btrfs_search_slot. This obviously didn't give the correct result. This commit adds btrfs_next_old_leaf, a drop-in replacement for btrfs_next_leaf with a time_seq parameter. If it is zero, it behaves exactly like btrfs_next_leaf. If it is non-zero, it will use btrfs_search_old_slot with this time_seq parameter. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-14 18:52:09 +02:00
Jan Schmidt	a95236d99f	Btrfs: fix return value for __tree_mod_log_oldest_root In __tree_mod_log_oldest_root() we must return the found operation even if it's not a ROOT_REPLACE operation. Otherwise, the caller assumes that there are no operations to be rewinded and returns immediately. The code in the caller is modified to improve readability. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-14 18:44:22 +02:00
Jan Schmidt	8ba97a15e7	Btrfs: use btrfs_read_lock_root_node in get_old_root get_old_root could race with root node updates because we weren't locking the node early enough. Use btrfs_read_lock_root_node to grab the root locked in the very beginning and release the lock as soon as possible (just like btrfs_search_slot does). Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-14 18:44:21 +02:00
Jan Schmidt	f617e2fd52	Btrfs: remove obsolete btrfs_next_leaf call from __resolve_indirect_ref When resolving indirect refs, we used to call btrfs_next_leaf in case we didn't find an exact match. While we should find exact matches most of the time, in case we don't, we must continue searching. Treating those matches differently depending on the level we're searching doesn't make sense. Even worse, we might end up searching for a key larger than the largest, in which case there is no next_leaf and subsequent jobs would fail. This commit drops the bogous lines. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-14 18:44:20 +02:00
Anton Vorontsov	364ed2f465	pstore/inode: Make pstore_fill_super() static There's no reason to extern it. The patch fixes the annoying sparse warning: CHECK fs/pstore/inode.c fs/pstore/inode.c:264:5: warning: symbol 'pstore_fill_super' was not declared. Should it be static? Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-13 16:52:40 -07:00
Anton Vorontsov	93cce04968	pstore/ram: Should zap persistent zone on unlink Otherwise, unlinked file will reappear on the next boot. Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-13 16:52:40 -07:00
Anton Vorontsov	fce3979304	pstore/ram_core: Factor persistent_ram_zap() out of post_init() A handy function that we will use outside of ram_core soon. But so far just factor it out and start using it in post_init(). Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-13 16:52:40 -07:00
Anton Vorontsov	25b63da647	pstore/ram_core: Do not reset restored zone's position and size Otherwise, the files will survive just one reboot, and on a subsequent boot they will disappear. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-13 16:52:39 -07:00
Anton Vorontsov	201e4aca5a	pstore/ram: Should update old dmesg buffer before reading Without the update, we'll only see the new dmesg buffer after the reboot, but previously we could see it right away. Making an oops visible in pstore filesystem before reboot is a somewhat dubious feature, but removing it wasn't an intentional change, so let's restore it. For this we have to make persistent_ram_save_old() safe for calling multiple times, and also extern it. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-13 16:52:39 -07:00
Eric Dumazet	047fe36052	splice: fix racy pipe->buffers uses Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered by splice_shrink_spd() called from vmsplice_to_pipe() commit `35f3d14dbb` (pipe: add support for shrinking and growing pipes) added capability to adjust pipe->buffers. Problem is some paths don't hold pipe mutex and assume pipe->buffers doesn't change for their duration. Fix this by adding nr_pages_max field in struct splice_pipe_desc, and use it in place of pipe->buffers where appropriate. splice_shrink_spd() loses its struct pipe_inode_info argument. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tom Herbert <therbert@google.com> Cc: stable <stable@vger.kernel.org> # 2.6.35 Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-06-13 21:16:42 +02:00
Suresh Jayaraman	e73f843a32	cifs: fix parsing of password mount option The double delimiter check that allows a comma in the password parsing code is unconditional. We set "tmp_end" to the end of the string and we continue to check for double delimiter. In the case where the password doesn't contain a comma we end up setting tmp_end to NULL and eventually setting "options" to "end". This results in the premature termination of the options string and hence the values of UNCip and UNC are being set to NULL. This results in mount failure with "Connecting to DFS root not implemented yet" error. This error is usually not noticable as we have password as the last option in the superblock mountdata. But when we call expand_dfs_referral() from cifs_mount() and try to compose mount options for the submount, the resulting mountdata will be of the form ",ver=1,user=foo,pass=bar,ip=x.x.x.x,unc=\\server\share" and hence results in the above error. This bug has been seen with older NAS servers running Samba 3.0.24. Fix this by moving the double delimiter check inside the conditional loop. Changes since -v1 - removed the wrong strlen() micro optimization. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Cc: stable@vger.kernel.org [3.1+] Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-06-12 12:53:02 -05:00
Linus Torvalds	266ae4e615	fix unbalanced wb->list_lock in 3.5-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJP0tJwAAoJECvKgwp+S8JaSVAP/1FKw7u5E9QzOW8kPCj58nig FVF3BgaYgVrJglRjNOGTSfhJl3TVHGTzEMGICtsKUvXmgs7VmMudyWFuyokxfrZi z70DIbSuPDOMEt31MXiACq9z4E2IrZZ2kTYpRVd+zCV/2s4p789ejDLxOkk2ikpt M4TcIMPCAU2e4/ljRjNEQkg8d23ehdJsH9Hg8k56Jtb57e0LNpwnaZ0FzFKCzYh0 Orm2dm9375vyzXZ2WNvgzl8ClJdEjkCyTq79i/39aQs4WeXsEP3nk+PFZYd+jrYA XwvV1sHJ43BOCCWwYS5dP0i9uGnRVlh/F2tyWWSSU2OizHzUXuOUWfXgY1JWyzaT uwEoCa8atYUOD7gp6ndnNR3uqBXtw0SihESst5fC7rKWuKOl62XToD6Pg6rPMex4 ZuRaUw4Ts4efbX0dw3iDHMrS6Cf7a0JH52QOEYovrge4mDjiJzClHR69hsmVyoVb 7s3j0q+mhur+NcnMkvC0C+pHn/m18q1i/yIBrD2K/UoLVGA17QMIZGoKfGnkeoaF a4fcoJ7fQE09tJ9KV1NLEdeD9PXQtETcNIAIZCfBCEJP9CPhHaUYkky+sOMcvKTr xEAe4f4WiHY7mTy8CRcDQjiOBigJe7Ksm+0qElauGcYw5Kqqrpdap1yPKbG4QWPL ABVYJyOvKSC3HLA6i7Gi =RBRw -----END PGP SIGNATURE----- Merge tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull writeback locking fix from Wu Fengguang: "fix unbalanced wb->list_lock in 3.5-rc1" * tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Fix lock imbalance in writeback_sb_inodes()	2012-06-12 18:28:58 +03:00
Dan Carpenter	e216c8c771	NFS: add an endian notation for sparse This is supposed to be a __be32 value. Sparse complains a lot: fs/nfs/callback_xdr.c:699:30: warning: incorrect type in initializer (different base types) fs/nfs/callback_xdr.c:699:30: expected unsigned int [unsigned] status fs/nfs/callback_xdr.c:699:30: got restricted __be32 const [usertype] csr_status fs/nfs/callback_xdr.c:715:9: warning: cast to restricted __be32 fs/nfs/callback_xdr.c:716:16: warning: incorrect type in return expression (different base types) fs/nfs/callback_xdr.c:716:16: expected restricted __be32 fs/nfs/callback_xdr.c:716:16: got unsigned int [unsigned] status Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-12 09:54:40 -04:00
Dan Carpenter	0439f31c35	NFSv4.1: integer overflow in decode_cb_sequence_args() This seems like it could overflow on 32 bits. Use kmalloc_array() which has overflow protection built in. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-12 09:54:36 -04:00
Randy Dunlap	b84297197c	exofs: fix sparse non-ANSI function warning Fix sparse non-ANSI function warning: fs/exofs/sys.c:112:28: warning: non-ANSI function declaration of function 'exofs_sysfs_dbg_print' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-12 06:33:22 +03:00
Andy Adamson	2669940db8	NFSv4 do not send an empty SETATTR compound Commit `536e43d12b` ATTR_OPEN check can result in an ia_valid with only ATTR_FILE set, and no NFS_VALID_ATTRS attributes to request from the server. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-11 17:25:53 -04:00
Sachin Prabhu	64f9a83665	NFSv2: EOF incorrectly set on short read In cases where the server returns fewer bytes then those requested, we can incorrectly set the eof flag for the file. Fixing this allows the request to be retried with updated offset and count arguments. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-11 17:25:00 -04:00
Bryan Schumaker	c5afc8da5b	NFS: Use the NFS_DEFAULT_VERSION for v2 and v3 mounts Older versions of nfs utils don't always pass a "vers=" mount option for NFS. This chould lead to attempts at using NFS v0 due to a zeroed out nfs_parsed_mount_data struct. I solve this by setting the default NFS version to NFS_DEFAULT_VERSION in the v2 and v3 cases (v4 has already been taken care of by a similar patch). Reported-by: Joerg Roedel <joro@&bytes.org> Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-09 14:38:59 -04:00
Fred Isaman	906369e43c	NFS: fix directio refcount bug on commit This reverts a hunk from commit `0427708657` "NFS: Clean up - Simplify reference counting in fs/nfs/direct.c" The cleanups in that patch affect the write path, but by the time processing hits commit the removed reference has been added back by nfs_scan_commit_list(). Without this reversion, any page that is sent to commit holds on to an unbalanced reference that is never freed. The immediate effect is an imbalance over the wire between OPENs and CLOSEs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-09 14:32:45 -04:00
Jan Kara	ead188f9f9	writeback: Fix lock imbalance in writeback_sb_inodes() Fix bug introduced by `169ebd90`. We have to have wb_list_lock locked when restarting writeback loop after having waited for inode writeback. Bug description by Ted Tso: I can reproduce this fairly easily by using ext4 w/o a journal, running under KVM with 1024megs memory, with fsstress (xfstests #13): [ 45.153294] ===================================== [ 45.154784] [ BUG: bad unlock balance detected! ] [ 45.155591] 3.5.0-rc1-00002-gb22b1f1 #124 Not tainted [ 45.155591] ------------------------------------- [ 45.155591] flush-254:16/2499 is trying to release lock (&(&wb->list_lock)->rlock) at: [ 45.155591] [<c022c3da>] writeback_sb_inodes+0x160/0x327 [ 45.155591] but there are no more locks to release! Reported-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>	2012-06-09 08:32:15 +09:00
Linus Torvalds	77249539cd	ext4 bug fixes for 3.5-rc2 This update contains two bug fixes, both destined for the stable tree. Perhaps the most important is one which fixes ext4 when used with file systems originally formatted for use with ext3, but then later converted to take advantage of ext4. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABCAAGBQJP0jyAAAoJENNvdpvBGATwGSIP/R0+4j/SqSncjLIEx6EvrCeY eyR9JYb0F44P1bKlZ6WErrupysmhPqHubnbxMg9bVMTxefN/m6tEyG+EG4MwMxFP mRtc5EVT0i/9k7U6s4Ey0kSnBnZqx6IBdFUR5lHrvAf7Ud920eY12Cx4oscI7Fj6 E81lWASOkDTjzsJpzFIqcW7b257Ne1oxPAu1p290W5riSkDzle7qqT5xT+lBAbr0 YM8ZFhznUdyv2aeuchf/dfys5YdVgSporplIjvP69bRyNr4x5B6QOUGxuX3RNhV3 Hqf83GW8HycyAEdl4AtakNtxiMPhrIy0S/RamWqBTE2+nws3Wnkd1wN4c5HjZZad w5u2E/uSxedZfgGDZZriRwHCgR+2I2CEtMy45gL96o+2Cel6unN+vjVPTVbMpMUy 3PsPoPjPVmNPLyn4vIfHlEQ5ZLhBzVpXFEsca9s+6JeHQ4dKP77fa8iq5AjzCjzA FYvmf1fNbDW1da81fRUOSnwuaC0JuUvOu5eQLLQ3jzz5gj+DvZvQqvleMHTXkQ3X hLpSXNTtXUBYeXIw5aJzh5VPJqqGzPkpJhAEbMRgCobRA1HtuAEJzZf4J7+jiWjV IrQpw0HIeZiK1oZW8iC8YYbIz8AakU+MnIVhvcPc3rIr5eeEtWPM/9ETx/mrO9s2 /y0bwu84eQLIctFjzR2s =9oyU -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Theodore Ts'o: "This update contains two bug fixes, both destined for the stable tree. Perhaps the most important is one which fixes ext4 when used with file systems originally formatted for use with ext3, but then later converted to take advantage of ext4." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: don't set i_flags in EXT4_IOC_SETFLAGS ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg	2012-06-08 11:15:31 -07:00
Linus Torvalds	e72643088f	Fix UBI and UBIFS - they refuse to work without debugfs. This was broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging Kconfig switches. Also, correct locking in 'ubi_wl_flush()' - it was extended to support flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJP0jAHAAoJECmIfjd9wqK0qN8P/1JRuky3xsX8o5S5tqVLKkzO 4naM8bH1AuUemg8f+K+2IUyHp1Mp0lsBj7zA46+3xWXvRcZBJaWgqZFkLOuWSZn0 ChRGKKhdwRkXokq7JHgUJCrdTLyXsBV+uuBbJ+oTPQa2+7f1mxgZuLc+MJzpoKAs sNdE3A2pH3hhXDJKuT6hrM2wqD/2rW5bGurW4sYBxVUJEZ793HeXu30lTffZuWHo ClzDtC0iWIS5NmdN0sbQL6Os3p+uiR5mjJek0eABTHAmgXvfBUWpf9ewTXYS2ulK zY8brtIy2N8qmXFa8ZyguZKpCdulDqRLPz/2bmX5rOJSsyogoNxHI+UlZFCfq0u/ txA+Wm4dQZtvSM3Stws6gxuIDCX83wOgHq/gUitMsVkxo+McwevjXpZb4Ha80OVl Iw7ERGrbxZn+B+lTh+9jgD6kstqpefO7fBP5gkLRvHToxDKY+6sdH6b3UuE6pDJI SA2D1uDyk33A/dyvYZThcvsq7KMdF0mhfLuxkScSfP17hc0hnh/FSzkj3fHrADfy oA4Oo/jKFbOFnmByNHjzvOaqeYHVsKwzq+feDcZ7sYA5GNrVMVViLZQljQBdJWo5 E9howusrb81AGvjCIFobntBLE8DZF5OGfgt/n+60+jo3/qern/4jeuWfGL/Q9Wcw x27oVmOBLjuHuG3hXlGt =X7QS -----END PGP SIGNATURE----- Merge tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs Pull UBI/UBIFS fixes from Artem Bityutskiy: "Fix UBI and UBIFS - they refuse to work without debugfs. This was broken by the 3.5-rc1 UBI/UBIFS changes when we removed the debugging Kconfig switches. Also, correct locking in 'ubi_wl_flush()' - it was extended to support flushing a specific LEB in 3.5-rc1, and the locking was sub-optimal." * tag 'upstream-3.5-rc2' of git://git.infradead.org/linux-ubifs: UBI: correct ubi_wl_flush locking UBIFS: fix debugfs-less systems support UBI: fix debugfs-less systems support	2012-06-08 11:04:06 -07:00
Linus Torvalds	32ba9c3fca	Revert "vfs: stop d_splice_alias creating directory aliases" This reverts commit `7732a557b1` (and commit `3f50fff4da`, which was a follow-up cleanup). We're chasing an elusive bug that Dave Jones can apparently reproduce using his system call fuzzer tool, and that looks like some kind of locking ordering problem on the directory i_mutex chain. Our i_mutex locking is rather complex, and depends on the topological ordering of the directories, which is why we have been very wary of splicing directory entries around. Of course, we really don't want to ever see aliased unconnected directories anyway, so none of this should ever happen, but this revert aims to basically get us back to a known older state. Bruce points to some of the previous discussion at http://marc.info/?i=<20110310105821.GE22723@ZenIV.linux.org.uk> and in particular a long post from Neil: http://marc.info/?i=<20110311150749.2fa2be66@notabene.brown> It should be noted that it's possible that Dave's problems come from other changes altohgether, including possibly just the fact that Dave constantly is teachning his fuzzer new tricks. So what appears to be a new bug could in fact be an old one that just gets newly triggered, but reverting these patches as "still under heavy discussion" is the right thing regardless. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-08 10:34:03 -07:00
Trond Myklebust	2d0dbc6ae8	NFSv4: Fix unnecessary delegation returns in nfs4_do_open While nfs4_do_open() expects the fmode argument to be restricted to combinations of FMODE_READ and FMODE_WRITE, both nfs4_atomic_open() and nfs4_proc_create will pass the nfs_open_context->mode, which contains the full fmode_t. This patch ensures that nfs4_do_open strips the other fmode_t bits, fixing a problem in which the nfs4_do_open call would result in an unnecessary delegation return. Reported-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-06-08 11:08:42 -04:00
Linus Torvalds	48d212a2ee	Revert "mm: correctly synchronize rss-counters at exit/exec" This reverts commit `40af1bbdca`. It's horribly and utterly broken for at least the following reasons: - calling sync_mm_rss() from mmput() is fundamentally wrong, because there's absolutely no reason to believe that the task that does the mmput() always does it on its own VM. Example: fork, ptrace, /proc - you name it. - calling it after having done mmdrop() on it is doubly insane, since the mm struct may well be gone now. - testing mm against NULL before you call it is insane too, since a NULL mm there would have caused oopses long before. .. and those are just the three bugs I found before I decided to give up looking for me and revert it asap. I should have caught it before I even took it, but I trusted Andrew too much. Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-07 17:54:07 -07:00
Tao Ma	b22b1f178f	ext4: don't set i_flags in EXT4_IOC_SETFLAGS Commit `7990696` uses the ext4_{set,clear}_inode_flags() functions to change the i_flags automatically but fails to remove the error setting of i_flags. So we still have the problem of trashing state flags. Fix this by removing the assignment. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-06-07 19:04:19 -04:00
Theodore Ts'o	b0dd6b70f0	ext4: fix the free blocks calculation for ext3 file systems w/ uninit_bg Ext3 filesystems that are converted to use as many ext4 file system features as possible will enable uninit_bg to speed up e2fsck times. These file systems will have a native ext3 layout of inode tables and block allocation bitmaps (as opposed to ext4's flex_bg layout). Unfortunately, in these cases, when first allocating a block in an uninitialized block group, ext4 would incorrectly calculate the number of free blocks in that block group, and then errorneously report that the file system was corrupt: EXT4-fs error (device vdd): ext4_mb_generate_buddy:741: group 30, 32254 clusters in bitmap, 32258 in gd This problem can be reproduced via: mke2fs -q -t ext4 -O ^flex_bg /dev/vdd 5g mount -t ext4 /dev/vdd /mnt fallocate -l 4600m /mnt/test The problem was caused by a bone headed mistake in the check to see if a particular metadata block was part of the block group. Many thanks to Kees Cook for finding and bisecting the buggy commit which introduced this bug (commit `fd034a84e1`, present since v3.2). Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Kees Cook <keescook@chromium.org> Cc: stable@kernel.org	2012-06-07 18:56:06 -04:00
Konstantin Khlebnikov	40af1bbdca	mm: correctly synchronize rss-counters at exit/exec mm->rss_stat counters have per-task delta: task->rss_stat. Before changing task->mm pointer the kernel must flush this delta with sync_mm_rss(). do_exit() already calls sync_mm_rss() to flush the rss-counters before committing the rss statistics into task->signal->maxrss, taskstats, audit and other stuff. Unfortunately the kernel does this before calling mm_release(), which can call put_user() for processing task->clear_child_tid. So at this point we can trigger page-faults and task->rss_stat becomes non-zero again. As a result mm->rss_stat becomes inconsistent and check_mm() will print something like this: \| BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1 \| BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1 This patch moves sync_mm_rss() into mm_release(), and moves mm_release() out of do_exit() and calls it earlier. After mm_release() there should be no pagefaults. [akpm@linux-foundation.org: tweak comment] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> [3.4.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-07 14:43:55 -07:00
Trond Myklebust	02c67525cf	NFSv4.1: Convert another trivial printk into a dprintk Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-07 13:45:53 -04:00
Fred Isaman	fb47ddc9d5	NFS4: Fix open bug when pnfs module blacklisted Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-07 13:44:24 -04:00
Trond Myklebust	f07936f2c4	NFS: Remove incorrect BUG_ON in nfs_found_client It is perfectly valid for nfs_get_client() to return a nfs_client that is in the process of setting up the NFSv4.1 session. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-07 10:21:03 -04:00
Artem Bityutskiy	818039c7d5	UBIFS: fix debugfs-less systems support Commit "f70b7e5 UBIFS: remove Kconfig debugging option" broke UBIFS and it refuses to initialize if debugfs (CONFIG_DEBUG_FS) is disabled. I incorrectly assumed that debugfs files creation function will return success if debugfs is disabled, but they actually return -ENODEV. This patch fixes the issue. Reported-by: Paul Parsons <lost.distance@yahoo.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Tested-by: Paul Parsons <lost.distance@yahoo.com>	2012-06-07 10:43:54 +03:00
Steve Dickson	f25efd851c	NFS: Map minor mismatch error to protocol not support error. Sservers that only have NFSv4.1 support the NFS4ERR_MINOR_VERS_MISMATCH error is return on v4.0 mounts. Mapping that error to EPROTONOSUPPORT will cause the mount to back off to v3 instead of failing. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-06 14:32:40 -04:00
Trond Myklebust	9bce008bae	NFS: Fix a commit bug The new commit code fails to copy the verifier into the wb_verf field of _all_ the nfs_page structures; it only copies it into the first entry. The consequence is that most requests end up failing to match in nfs_commit_release. Fix is to copy the verifier into the req->wb_verf field in nfs_write_completion. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>	2012-06-05 18:38:47 -04:00
Bryan Schumaker	cdf66442fa	NFS4: Set parsed mount data version to 4 This patch only affects mounting through "-t nfs4" since it doesn't set up an nfs version to use in the mount data. The nfs client was trying to mount using NFS v0, causing either a BUG() or a protocol not supported message. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-05 15:50:47 -04:00
Linus Torvalds	f9ba7179ce	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix blksize calculation fuse: fix stat call on 32 bit platforms fuse: optimize fallocate on permanent failure fuse: add FALLOCATE operation fuse: Convert to kstrtoul_from_user	2012-06-05 10:11:11 -07:00
Trond Myklebust	bda197f5d7	NFSv4.1: Ensure we clear session state flags after a session creation Both nfs4_reset_session and nfs41_init_clientid need to clear all the session related state flags on success. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-05 10:22:14 -04:00
Trond Myklebust	08106ac7c8	NFSv4.1: Convert a trivial printk into a dprintk There is no need to bug the user about the server returning an error on destroy_session. The error will be handled by the state manager, without any need for further input from anyone else. So convert that printk into a debugging dprintk. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-06-05 10:08:24 -04:00
Trond Myklebust	029c534737	NFSv4: Fix up decode_attr_mdsthreshold Fix an incorrect use of 'likely()'. The FATTR4_WORD2_MDSTHRESHOLD bit is only expected in NFSv4.1 OPEN calls, and so is actually rather _unlikely_. decode_attr_mdsthreshold needs to clear FATTR4_WORD2_MDSTHRESHOLD from the attribute bitmap after it has decoded the data. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com>	2012-06-05 10:00:47 -04:00
Trond Myklebust	1549210fcc	NFSv4: Fix an Oops in the open recovery code The open recovery code does not need to request a new value for the mdsthreshold, and so does not allocate a struct nfs4_threshold. The problem is that encode_getfattr_open() will still request an mdsthreshold, and so we end up Oopsing in decode_attr_mdsthreshold. This patch fixes encode_getfattr_open so that it doesn't request an mdsthreshold when the caller isn't asking for one. It also fixes decode_attr_mdsthreshold so that it errors if the server returns an mdsthreshold that we didn't ask for (instead of Oopsing). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com>	2012-06-05 10:00:14 -04:00
Linus Torvalds	bf2785a818	Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French. * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Move get_next_mid to ops struct CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe CIFS: Improve identation in cifs_unlock_range CIFS: Fix possible wrong memory allocation	2012-06-04 15:00:58 -07:00
Linus Torvalds	0640113be2	vfs: Fix /proc/<tid>/fdinfo/<fd> file handling Cyrill Gorcunov reports that I broke the fdinfo files with commit `30a08bf2d3` ("proc: move fd symlink i_mode calculations into tid_fd_revalidate()"), and he's quite right. The tid_fd_revalidate() function is not just used for the <tid>/fd symlinks, it's also used for the <tid>/fdinfo/<fd> files, and the permission model for those are different. So do the dynamic symlink permission handling just for symlinks, making the fdinfo files once more appear as the proper regular files they are. Of course, Al Viro argued (probably correctly) that we shouldn't do the symlink permission games at all, and make the symlinks always just be the normal 'lrwxrwxrwx'. That would have avoided this issue too, but since somebody noticed that the permissions had changed (which was the reason for that original commit `30a08bf2d3` in the first place), people do apparently use this feature. [ Basically, you can use the symlink permission data as a cheap "fdinfo" replacement, since you see whether the file is open for reading and/or writing by just looking at st_mode of the symlink. So the feature does make sense, even if the pain it has caused means we probably shouldn't have done it to begin with. ] Reported-and-tested-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-04 11:00:45 -07:00
Jan Schmidt	4d5a0565ce	Btrfs: remove call to btrfs_header_nritems with no effect This is a leftover from cleanup patch `559af821`. Before the cleanup, btrfs_header_nritems was called inside an if condition. As it has no side effects we need to preserve here, it should simply be dropped. Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>	2012-06-04 14:35:29 +02:00
Linus Torvalds	829f51dbd8	Merge branch 'akpm' (Fixups for Andrew's patchbomb) Merge fixups for the mac NLS tables from Andrew. * emailed from Andrew Morton, and one cleanup by me: nls: fix (and rename) mac NLS table files and config options fs/nls/Makefile: remove bogus CONFIG_ assignments	2012-06-01 19:56:23 -07:00
Linus Torvalds	8b8c0daa2c	nls: fix (and rename) mac NLS table files and config options The config options in the Kconfig file (with _CODEPAGE_ in the name) didn't match the config option name in the Makefile (no _CODEPAGE_). And both of them were of the hard-to-read MACXYZZY variety, which made them hard to parse for normal humans: MACROMAN easily reads as "macro man", not as "Mac Roman". So rename the options to be consistent, and be NLS_MAC_xyzzy. Rename the files to be mac-xyzzy.c too, and drop the "nls" part entirely (it's already in the directory name). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-01 19:51:22 -07:00
Andrew Morton	92a8956364	fs/nls/Makefile: remove bogus CONFIG_ assignments These were debug things which snuck through. Reported-by: Yinghai Lu <yinghai@kernel.org> Cc: Vladimir Serbinenko <phcoder@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-06-01 19:47:26 -07:00
Linus Torvalds	f5e7e844a5	- More robust parsing especially of xattr data in JFFS2 - Updates to mxc_nand and gpmi drivers to support new boards and device tree - Improve consistency of information about ECC strength in NAND devices - Clean up partition handling of plat_nand - Support NAND drivers without dedicated access to OOB area - BCH hardware ECC support for OMAP - Other fixes and cleanups, and a few new device IDs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEABECAAYFAk/JG1wACgkQdwG7hYl686M80wCglN4kutx20j+KJWuZofkr9Hog weEAoI4jrqEWEdW9EcT2CIWQw7eG+1v+ =7tdo -----END PGP SIGNATURE----- Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd Pull mtd update from David Woodhouse: - More robust parsing especially of xattr data in JFFS2 - Updates to mxc_nand and gpmi drivers to support new boards and device tree - Improve consistency of information about ECC strength in NAND devices - Clean up partition handling of plat_nand - Support NAND drivers without dedicated access to OOB area - BCH hardware ECC support for OMAP - Other fixes and cleanups, and a few new device IDs Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to added include files next to each other. * tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits) mtd: mxc_nand: move ecc strengh setup before nand_scan_tail mtd: block2mtd: fix recursive call of mtd_writev mtd: gpmi-nand: define ecc.strength mtd: of_parts: fix breakage in Kconfig mtd: nand: fix scan_read_raw_oob mtd: docg3 fix in-middle of blocks reads mtd: cfi_cmdset_0002: Slight cleanup of fixup messages mtd: add fixup for S29NS512P NOR flash. jffs2: allow to complete xattr integrity check on first GC scan jffs2: allow to discriminate between recoverable and non-recoverable errors mtd: nand: omap: add support for hardware BCH ecc ARM: OMAP3: gpmc: add BCH ecc api and modes mtd: nand: check the return code of 'read_oob/read_oob_raw' mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw' mtd: m25p80: Add support for Winbond W25Q80BW jffs2: get rid of jffs2_sync_super jffs2: remove unnecessary GC pass on sync jffs2: remove unnecessary GC pass on umount jffs2: remove lock_super mtd: gpmi: add gpmi support for mx6q ...	2012-06-01 16:55:42 -07:00
Linus Torvalds	86c47b70f6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of signal handling patches from Al Viro: "This time it's mostly helpers and conversions to them; there's a lot of stuff remaining in the tree, but that'll either go in -rc2 (isolated bug fixes, ideally via arch maintainers' trees) or will sit there until the next cycle." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: x86: get rid of calling do_notify_resume() when returning to kernel mode blackfin: check __get_user() return value whack-a-mole with TIF_FREEZE FRV: Optimise the system call exit path in entry.S [ver #2] FRV: Shrink TIF_WORK_MASK [ver #2] FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions new helper: signal_delivered() powerpc: get rid of restore_sigmask() most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set set_restore_sigmask() is never called without SIGPENDING (and never should be) TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set don't call try_to_freeze() from do_signal() pull clearing RESTORE_SIGMASK into block_sigmask() sh64: failure to build sigframe != signal without handler openrisc: tracehook_signal_handler() is supposed to be called on success new helper: sigmask_to_save() new helper: restore_saved_sigmask() new helpers: {clear,test,test_and_clear}_restore_sigmask() HAVE_RESTORE_SIGMASK is defined on all architectures now	2012-06-01 11:53:44 -07:00
Pavel Shilovsky	8825736060	CIFS: Move get_next_mid to ops struct Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-06-01 12:35:19 -05:00
Pavel Shilovsky	7f0adb53bc	CIFS: Make accessing is_valid_oplock/dump_detail ops struct field safe Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-06-01 12:35:16 -05:00
Pavel Shilovsky	ea319d57d3	CIFS: Improve identation in cifs_unlock_range Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-06-01 12:35:12 -05:00
Pavel Shilovsky	0013fb4ca3	CIFS: Fix possible wrong memory allocation when cifs_reconnect sets maxBuf to 0 and we try to calculate a size of memory we need to store locks. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-06-01 12:35:08 -05:00
Linus Torvalds	1193755ac6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs changes from Al Viro. "A lot of misc stuff. The obvious groups: * Miklos' atomic_open series; kills the damn abuse of ->d_revalidate() by NFS, which was the major stumbling block for all work in that area. * ripping security_file_mmap() and dealing with deadlocks in the area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in general. * ->encode_fh() switched to saner API; insane fake dentry in mm/cleancache.c gone. * assorted annotations in fs (endianness, __user) * parts of Artem's ->s_dirty work (jff2 and reiserfs parts) * ->update_time() work from Josef. * other bits and pieces all over the place. Normally it would've been in two or three pull requests, but signal.git stuff had eaten a lot of time during this cycle ;-/" Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the 'truncate_range' inode method was removed by the VM changes, the VFS update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due to sparse fix added twice, with other changes nearby). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits) nfs: don't open in ->d_revalidate vfs: retry last component if opening stale dentry vfs: nameidata_to_filp(): don't throw away file on error vfs: nameidata_to_filp(): inline __dentry_open() vfs: do_dentry_open(): don't put filp vfs: split __dentry_open() vfs: do_last() common post lookup vfs: do_last(): add audit_inode before open vfs: do_last(): only return EISDIR for O_CREAT vfs: do_last(): check LOOKUP_DIRECTORY vfs: do_last(): make ENOENT exit RCU safe vfs: make follow_link check RCU safe vfs: do_last(): use inode variable vfs: do_last(): inline walk_component() vfs: do_last(): make exit RCU safe vfs: split do_lookup() Btrfs: move over to use ->update_time fs: introduce inode operation ->update_time reiserfs: get rid of resierfs_sync_super reiserfs: mark the superblock as dirty a bit later ...	2012-06-01 10:34:35 -07:00
Linus Torvalds	4edebed866	Ext4 updates for 3.5 The major new feature added in this update is Darrick J. Wong's metadata checksum feature, which adds crc32 checksums to ext4's metadata fields. There is also the usual set of cleanups and bug fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABCAAGBQJPyNleAAoJENNvdpvBGATwtLMP/i3WsPyTvxmYP6HttHXQb8Jk GYCoTQ5bZMuTbOwOGg3w137cXWBv5uuPpxIk79YVLHSWx6HuanlGIa7/VnPKIaLu 2ihuvVfnrDqpwQ4MJaSq4R1Eka9JCwZ7HbYYo+fYOVobxgw588JVV9VVI9EdKRGz z11UkW8iHE0f6Xa5gOhdAMkR0uaPnxwJX/qHZYiHuognRivuwMglqWJSiMr8nQmo A2GmeoLehhW+k65IqgTCmSW6ZgFTvZdk6bskQIij3fOYHW3hHn/gcLFtmLTIZ/B5 LZdg/lngPYve+R/UyypliGKi+pv1qNEiTiBm0rrBgsdZFkBdGj0soSvGZzeK+Mp4 Q1vAmOBPYPFzs6nVzPst2n/osryyykFCK6TgSGZ50dosJ0NO8cBeDdX/gh9JKD2R yQUMUltOCCSj/eWU4iwqZ0T3FXRiH/+S3XMHznoKJiwUyGDBNQy4+Yg2k2WzUXrz Cu5t5BwNG2WNP7y5Et/wmUIzpC7VPId4qYmGyHe7OwTxSJgW+6f7GVkHfjWcDMuv pGgEUiInbMmLajP3v2/LKfVU4hXLZy4uJbhoBgDdeIpZrnPifJG/MwDOS4W+dLVT tDzgO1SAh3/E4jATreZ5bjzD/HGsfe1OX09UH3Pbc1EcgkrLnyrQXFwdHshdVu4A cxMoKNPVCQJySb1UrLkO =SdJJ -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull Ext4 updates from Theodore Ts'o: "The major new feature added in this update is Darrick J Wong's metadata checksum feature, which adds crc32 checksums to ext4's metadata fields. There is also the usual set of cleanups and bug fixes." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits) ext4: hole-punch use truncate_pagecache_range jbd2: use kmem_cache_zalloc wrapper instead of flag ext4: remove mb_groups before tearing down the buddy_cache ext4: add ext4_mb_unload_buddy in the error path ext4: don't trash state flags in EXT4_IOC_SETFLAGS ext4: let getattr report the right blocks in delalloc+bigalloc ext4: add missing save_error_info() to ext4_error() ext4: add debugging trigger for ext4_error() ext4: protect group inode free counting with group lock ext4: use consistent ssize_t type in ext4_file_write() ext4: fix format flag in ext4_ext_binsearch_idx() ext4: cleanup in ext4_discard_allocated_blocks() ext4: return ENOMEM when mounts fail due to lack of memory ext4: remove redundundant "(char *) bh->b_data" casts ext4: disallow hard-linked directory in ext4_lookup ext4: fix potential integer overflow in alloc_flex_gd() ext4: remove needs_recovery in ext4_mb_init() ext4: force ro mount if ext4_setup_super() fails ext4: fix potential NULL dereference in ext4_free_inodes_counts() ext4/jbd2: add metadata checksumming to the list of supported features ...	2012-06-01 10:12:15 -07:00
Al Viro	754421c8ca	HAVE_RESTORE_SIGMASK is defined on all architectures now Everyone either defines it in arch thread_info.h or has TIF_RESTORE_SIGMASK and picks default set_restore_sigmask() in linux/thread_info.h. Kill the ifdefs, slap #error in linux/thread_info.h to catch breakage when new ones get merged. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:58:46 -04:00
Miklos Szeredi	0ef97dcfce	nfs: don't open in ->d_revalidate NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a mount needs to be followed or not. It does check d_mountpoint() on the dentry, which can result in a weird error if the VFS found that the mount does not in fact need to be followed, e.g.: # mount --bind /mnt/nfs /mnt/nfs-clone # echo something > /mnt/nfs/tmp/bar # echo x > /tmp/file # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar # cat /mnt/nfs/tmp/bar cat: /mnt/nfs/tmp/bar: Not a directory Which should, by any sane filesystem, result in "something" being printed. So instead do the open in f_op->open() and in the unlikely case that the cached dentry turned out to be invalid, drop the dentry and return EOPENSTALE to let the VFS retry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:02 -04:00
Miklos Szeredi	16b1c1cd71	vfs: retry last component if opening stale dentry NFS optimizes away d_revalidates for last component of open. This means that open itself can find the dentry stale. This patch allows the filesystem to return EOPENSTALE and the VFS will retry the lookup on just the last component if possible. If the lookup was done using RCU mode, including the last component, then this is not possible since the parent dentry is lost. In this case fall back to non-RCU lookup. Currently this is not used since NFS will always leave RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:01 -04:00
Miklos Szeredi	50ee93afca	vfs: nameidata_to_filp(): don't throw away file on error If open fails, don't put the file. This allows it to be reused if open needs to be retried. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:01 -04:00
Miklos Szeredi	91daee988d	vfs: nameidata_to_filp(): inline __dentry_open() Copy __dentry_open() into nameidata_to_filp(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:01 -04:00
Miklos Szeredi	78f71eff3c	vfs: do_dentry_open(): don't put filp Move put_filp() out to __dentry_open(), the only caller now. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:00 -04:00
Miklos Szeredi	90ad1a8ecb	vfs: split __dentry_open() Split __dentry_open() into two functions: do_dentry_open() - does most of the actual work, doesn't put file on failure open_check_o_direct() - after a successful open, checks direct_IO method This will allow i_op->atomic_open to do just the file initialization and leave the direct_IO checking to the VFS. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:00 -04:00
Miklos Szeredi	5f5daac12a	vfs: do_last() common post lookup Now the post lookup code can be shared between O_CREAT and plain opens since they are essentially the same. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:12:00 -04:00
Miklos Szeredi	d7fdd7f6e1	vfs: do_last(): add audit_inode before open This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:59 -04:00
Miklos Szeredi	050ac841ea	vfs: do_last(): only return EISDIR for O_CREAT This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:59 -04:00
Miklos Szeredi	af2f55426d	vfs: do_last(): check LOOKUP_DIRECTORY Check for ENOTDIR before finishing open. This allows this code to be shared between O_CREAT and plain opens. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:58 -04:00
Miklos Szeredi	54c33e7f95	vfs: do_last(): make ENOENT exit RCU safe This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:58 -04:00
Miklos Szeredi	d45ea86792	vfs: make follow_link check RCU safe This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:58 -04:00
Miklos Szeredi	decf340087	vfs: do_last(): use inode variable Use helper variable instead of path->dentry->d_inode before complete_walk(). This will allow this code to be used in RCU mode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:57 -04:00
Miklos Szeredi	a1eb331530	vfs: do_last(): inline walk_component() Copy walk_component() into do_lookup(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:57 -04:00
Miklos Szeredi	e276ae672f	vfs: do_last(): make exit RCU safe Allow returning from do_last() with LOOKUP_RCU still set on the "out:" and "exit:" labels. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:57 -04:00
Miklos Szeredi	697f514df1	vfs: split do_lookup() Split do_lookup() into two functions: lookup_fast() - does cached lookup without i_mutex lookup_slow() - does lookup with i_mutex Both follow managed dentries. The new functions are needed by atomic_open. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 12:11:56 -04:00
Josef Bacik	e41f941a23	Btrfs: move over to use ->update_time Btrfs had been doing it's own file_update_time so we could catch ENOSPC properly, so just update our btrfs_update_time to work with the new stuff and then we'll be fancy later. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-01 12:07:52 -04:00
Josef Bacik	c3b2da3148	fs: introduce inode operation ->update_time Btrfs has to make sure we have space to allocate new blocks in order to modify the inode, so updating time can fail. We've gotten around this by having our own file_update_time but this is kind of a pain, and Christoph has indicated he would like to make xfs do something different with atime updates. So introduce ->update_time, where we will deal with i_version an a/m/c time updates and indicate which changes need to be made. The normal version just does what it has always done, updates the time and marks the inode dirty, and then filesystems can choose to do something different. I've gone through all of the users of file_update_time and made them check for errors with the exception of the fault code since it's complicated and I wasn't quite sure what to do there, also Jan is going to be pushing the file time updates into page_mkwrite for those who have it so that should satisfy btrfs and make it not a big deal to check the file_update_time() return code in the generic fault path. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-01 12:07:25 -04:00
Linus Torvalds	51eab603f5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This includes a fairly large change from Josef around data writeback completion. Before, the writeback wasn't completed until the metadata insertions for the extent were done, and this made for fairly large latency spikes on the last page of each ordered extent. We already had a separate mechanism for tracking pending metadata insertions, so Josef just needed to tweak things a little to end writeback earlier on the page. Overall it makes us much friendly to memory reclaim and lowers latencies quite a lot for synchronous IO. Jan Schmidt has finished some background work required to track btree blocks as they go through changes in ownership. It's the missing piece he needed for both btrfs send/receive and subvolume quotas. Neither of those are ready yet, but the new tracking code is included here. Most of the time, the new code is off. It is only used by scrub and other backref walkers. Stefan Behrens has added io failure tracking. This includes counters for which drives are causing the most trouble so the admin (or an automated tool) can choose to kick them out. We're tracking IO errors, crc errors, and generation checks we do on each metadata block. RAID5/6 did miss the cut this time because I'm having trouble with corruptions. I'll nail it down next week and post as a beta testing before 3.6" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (58 commits) Btrfs: fix tree mod log rewinded level and rewinding of moved keys Btrfs: fix tree mod log del_ptr Btrfs: add tree_mod_dont_log helper Btrfs: add missing spin_lock for insertion into tree mod log Btrfs: add inodes before dropping the extent lock in find_all_leafs Btrfs: use delayed ref sequence numbers for all fs-tree updates Btrfs: fix false positive in check-integrity on unmount Btrfs: fix runtime warning in check-integrity check data mode Btrfs: set ioprio of scrub readahead to idle Btrfs: fix return code in drop_objectid_items Btrfs: check to see if the inode is in the log before fsyncing Btrfs: return value of btrfs_read_buffer is checked correctly Btrfs: read device stats on mount, write modified ones during commit Btrfs: add ioctl to get and reset the device stats Btrfs: add device counters for detected IO and checksum errors btrfs: Drop unused function btrfs_abort_devices() Btrfs: fix the same inode id problem when doing auto defragment Btrfs: fall back to non-inline if we don't have enough space Btrfs: fix how we deal with the orphan block rsv Btrfs: convert the inode bit field to use the actual bit operations ...	2012-06-01 08:37:31 -07:00
Linus Torvalds	419f431949	Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux Pull the rest of the nfsd commits from Bruce Fields: "... and then I cherry-picked the remainder of the patches from the head of my previous branch" This is the rest of the original nfsd branch, rebased without the delegation stuff that I thought really needed to be redone. I don't like rebasing things like this in general, but in this situation this was the lesser of two evils. * 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits) nfsd4: fix, consolidate client_has_state nfsd4: don't remove rebooted client record until confirmation nfsd4: remove some dprintk's and a comment nfsd4: return "real" sequence id in confirmed case nfsd4: fix exchange_id to return confirm flag nfsd4: clarify that renewing expired client is a bug nfsd4: simpler ordering of setclientid_confirm checks nfsd4: setclientid: remove pointless assignment nfsd4: fix error return in non-matching-creds case nfsd4: fix setclientid_confirm same_cred check nfsd4: merge 3 setclientid cases to 2 nfsd4: pull out common code from setclientid cases nfsd4: merge last two setclientid cases nfsd4: setclientid/confirm comment cleanup nfsd4: setclientid remove unnecessary terms from a logical expression nfsd4: move rq_flavor into svc_cred nfsd4: stricter cred comparison for setclientid/exchange_id nfsd4: move principal name into svc_cred nfsd4: allow removing clients not holding state nfsd4: rearrange exchange_id logic to simplify ...	2012-06-01 08:32:58 -07:00
Artem Bityutskiy	033369d1af	reiserfs: get rid of resierfs_sync_super This patch stops reiserfs using the VFS 'write_super()' method along with the s_dirt flag, because they are on their way out. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds, even if there are no diry superblocks, or there are no client file-systems which would need this (e.g., btrfs does not use '->write_super()'). So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super()' VFS service, and then remove it together with the kernel thread. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:36 -04:00
Artem Bityutskiy	5c5fd81962	reiserfs: mark the superblock as dirty a bit later The 'journal_mark_dirty()' function currently first marks the superblock as dirty by setting 's_dirt' to 1, then does various sanity checks and returns, then actuall does all the magic with the journal. This is not an ideal order, though. It makes more sense to first do all the checks, then do all the internal stuff, and at the end notify the VFS that the superblock is now dirty. This patch moves the 's_dirt = 1' assignment from the very beginning of this function to the very end. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:36 -04:00
Artem Bityutskiy	717f03c4d7	reiserfs: remove useless superblock dirtying The 'reiserfs_resize()' function marks the superblock as dirty by assigning 1 to 's_dirt' and then calls 'journal_mark_dirty()' which does the same. Thus, we can remove the assignment from 'reiserfs_resize()'. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:36 -04:00
Artem Bityutskiy	25729b0e94	reiserfs: clean-up function return type Turn 'reiserfs_flush_old_commits()' into a void function because the callers do not cares about what it returns anyway. We are going to remove the 'sb->s_dirt' field completely and this patch is a small step towards this direction. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:36 -04:00
Artem Bityutskiy	efaa33eb13	reiserfs: cleanup reiserfs_fill_super a bit We have the reiserfs superblock pointer in the 'sbi' variable in this function, no need to use the 'REISERFS_SB(s)' macro which is the same. This is jut a small clean-up. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:36 -04:00
Al Viro	e3fc629d7b	switch aio and shm to do_mmap_pgoff(), make do_mmap() static after all, 0 bytes and 0 pages is the same thing... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:17 -04:00
Hugh Dickins	5e44f8c374	ext4: hole-punch use truncate_pagecache_range When truncating a file, we unmap pages from userspace first, as that's usually more efficient than relying, page by page, on the fallback in truncate_inode_page() - particularly if the file is mapped many times. Do the same when punching a hole: 3.4 added truncate_pagecache_range() to do the unmap and trunc, so use it in ext4_ext_punch_hole(), instead of calling truncate_inode_pages_range() directly. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-06-01 00:15:28 -04:00
Wanlong Gao	b2f4edb335	jbd2: use kmem_cache_zalloc wrapper instead of flag Use kmem_cache_zalloc wrapper instead of flag __GFP_ZERO. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-06-01 00:10:32 -04:00
Salman Qazi	95599968d1	ext4: remove mb_groups before tearing down the buddy_cache We can't have references held on pages in the s_buddy_cache while we are trying to truncate its pages and put the inode. All the pages must be gone before we reach clear_inode. This can only be gauranteed if we can prevent new users from grabbing references to s_buddy_cache's pages. The original bug can be reproduced and the bug fix can be verified by: while true; do mount -t ext4 /dev/ram0 /export/hda3/ram0; \ umount /export/hda3/ram0; done & while true; do cat /proc/fs/ext4/ram0/mb_groups; done Signed-off-by: Salman Qazi <sqazi@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:52:14 -04:00
Salman Qazi	02b7831019	ext4: add ext4_mb_unload_buddy in the error path ext4_free_blocks fails to pair an ext4_mb_load_buddy with a matching ext4_mb_unload_buddy when it fails a memory allocation. Signed-off-by: Salman Qazi <sqazi@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:51:27 -04:00
Theodore Ts'o	79906964a1	ext4: don't trash state flags in EXT4_IOC_SETFLAGS In commit `353eb83c` we removed i_state_flags with 64-bit longs, But when handling the EXT4_IOC_SETFLAGS ioctl, we replace i_flags directly, which trashes the state flags which are stored in the high 32-bits of i_flags on 64-bit platforms. So use the the ext4_{set,clear}_inode_flags() functions which use atomic bit manipulation functions instead. Reported-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org	2012-05-31 23:46:01 -04:00
Tao Ma	9660755100	ext4: let getattr report the right blocks in delalloc+bigalloc In delayed allocation, i_reserved_data_blocks now indicates clusters, not blocks. So report it in the right number. This can be easily exposed by the following command: echo foo > blah; du -hc blah; sync; du -hc blah Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2012-05-31 22:54:16 -04:00
Linus Torvalds	a00b6151a2	Merge branch 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields. * 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits) nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek SUNRPC: split upcall function to extract reusable parts nfsd: allocate id-to-name and name-to-id caches in per-net operations. nfsd: make name-to-id cache allocated per network namespace context nfsd: make id-to-name cache allocated per network namespace context nfsd: pass network context to idmap init/exit functions nfsd: allocate export and expkey caches in per-net operations. nfsd: make expkey cache allocated per network namespace context nfsd: make export cache allocated per network namespace context nfsd: pass pointer to export cache down to stack wherever possible. nfsd: pass network context to export caches init/shutdown routines Lockd: pass network namespace to creation and destruction routines NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches nfsd: pass pointer to expkey cache down to stack wherever possible. nfsd: use hash table from cache detail in nfsd export seq ops nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops nfsd: use exp_put() for svc_export_cache put nfsd: use cache detail pointer from svc_export structure on cache put nfsd: add link to owner cache detail to svc_export structure nfsd: use passed cache_detail pointer expkey_parse() ...	2012-05-31 18:18:11 -07:00
Linus Torvalds	08615d7d85	Merge branch 'akpm' (Andrew's patch-bomb) Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...	2012-05-31 18:10:18 -07:00
Cyrill Gorcunov	5b172087f9	c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat We would like to have an ability to restore command line arguments and program environment pointers but first we need to obtain them somehow. Thus we put these values into /proc/$pid/stat. The exit_code is needed to restore zombie tasks. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Cyrill Gorcunov	818411616b	fs, proc: introduce /proc/<pid>/task/<tid>/children entry When we do checkpoint of a task we need to know the list of children the task, has but there is no easy and fast way to generate reverse parent->children chain from arbitrary <pid> (while a parent pid is provided in "PPid" field of /proc/<pid>/status). So instead of walking over all pids in the system (creating one big process tree in memory, just to figure out which children a task has) -- we add explicit /proc/<pid>/task/<tid>/children entry, because the kernel already has this kind of information but it is not yet exported. This is a first level children, not the whole process tree. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Christopher Yeoh	ac34ebb3a6	aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after changes made to support CMA in an earlier patch. Rather than having an additional check_access parameter to these functions, the first paramater type is overloaded to allow the caller to specify CHECK_IOVEC_ONLY which means check that the contents of the iovec are valid, but do not check the memory that they point to. This is used by process_vm_readv/writev where we need to validate that a iovec passed to the syscall is valid but do not want to check the memory that it points to at this point because it refers to an address space in another process. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Sha Zhengju	ee62c6b2dc	eventfd: change int to __u64 in eventfd_signal() eventfd_ctx->count is an __u64 counter which is allowed to reach ULLONG_MAX. eventfd_write() adds a __u64 value to "count", but the kernel side eventfd_signal() only adds an int value to it. Make them consistent. [akpm@linux-foundation.org: update interface documentation] Signed-off-by: Sha Zhengju <handai.szj@taobao.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Vladimir Serbinenko	71ca97da9d	fs/nls: add Apple NLS HFS has support for NLS. However the relevant NLS tables are missing. Here they are automatically transformed from the tables at unicode.org. Codepages requiring special handling like CJK, RTL or Brahmic ones are not included in this patch. [akpm@linux-foundation.org: add unicode.org copyright and permission notices] Signed-off-by: Vladimir Serbinenko <phcoder@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Clemens Ladisch <clemens@ladisch.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Konstantin Khlebnikov	bca1554373	proc/smaps: show amount of nonlinear ptes in vma Currently, nonlinear mappings can not be distinguished from ordinary mappings. This patch adds into /proc/pid/smaps line "Nonlinear: <size> kB", where size is amount of nonlinear ptes in vma, this line appears only if VM_NONLINEAR is set. This information may be useful not only for checkpoint/restore project. Requested by Pavel Emelyanov. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Konstantin Khlebnikov	b1d4d9e0cb	proc/smaps: carefully handle migration entries Currently smaps reports migration entries as "swap", as result "swap" can appears in shared mapping. This patch converts migration entries into pages and handles them as usual. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Konstantin Khlebnikov	052fb0d635	proc: report file/anon bit in /proc/pid/pagemap This is an implementation of Andrew's proposal to extend the pagemap file bits to report what is missing about tasks' working set. The problem with the working set detection is multilateral. In the criu (checkpoint/restore) project we dump the tasks' memory into image files and to do it properly we need to detect which pages inside mappings are really in use. The mincore syscall I though could help with this did not. First, it doesn't report swapped pages, thus we cannot find out which parts of anonymous mappings to dump. Next, it does report pages from page cache as present even if they are not mapped, and it doesn't make that has not been cow-ed. Note, that issue with swap pages is critical -- we must dump swap pages to image file. But the issues with file pages are optimization -- we can take all file pages to image, this would be correct, but if we know that a page is not mapped or not cow-ed, we can remove them from dump file. The dump would still be self-consistent, though significantly smaller in size (up to 10 times smaller on real apps). Andrew noticed, that the proc pagemap file solved 2 of 3 above issues -- it reports whether a page is present or swapped and it doesn't report not mapped page cache pages. But, it doesn't distinguish cow-ed file pages from not cow-ed. I would like to make the last unused bit in this file to report whether the page mapped into respective pte is PageAnon or not. [comment stolen from Pavel Emelyanov's v1 patch] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Jan Engelhardt	715be1fce0	procfs: use more apprioriate types when dumping /proc/N/stat - use int fpr priority and nice, since task_nice()/task_prio() return that - field 24: get_mm_rss() returns unsigned long Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Alexey Dobriyan	af5e617143	proc: pass "fd" by value in /proc/*/{fd,fdinfo} code Pass "fd" directly, not via pointer -- one less memory read. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Alexey Dobriyan	f05ed3f1ab	proc: don't do dummy rcu_read_lock/rcu_read_unlock on error path rcu_read_lock()/rcu_read_unlock() is nop for TINY_RCU, but is not a nop for, say, PREEMPT_RCU. proc_fill_cache() is called without RCU lock, there is no need to lock/unlock on error path, simply jump out of the loop. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Cong Wang	2344bec788	proc: use mm_access() instead of ptrace_may_access() mm_access() handles this much better, and avoids some race conditions. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:29 -07:00
Cong Wang	e7dcd9990e	proc: remove mm_for_maps() mm_for_maps() is a simple wrapper for mm_access(), and the name is misleading, so just remove it and use mm_access() directly. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Cong Wang	b409e578d9	proc: clean up /proc/<pid>/environ handling Similar to `e268337dfe` ("proc: clean up and fix /proc/<pid>/mem handling"), move the check of permission to open(), this will simplify read() code. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Namjae Jeon	f0aac6162e	fat: use fat_msg_ratelimit() in fat__get_entry() If an application tries to lookup (opendir/readdir/stat) 5000 files on a fatfs USB device and the device is unplugged, many message occur, shown below. This makes the application slow. So use the new fat_msg_ratelimit() decrease the messaging rate. #> ./file_lookup_testcase ./files_directory/ usb 2-1.4: USB disconnect, device number 4 FAT-fs (sda1): FAT read failed (blocknr 2631) FAT-fs (sda1): Directory bread(block 396816) failed FAT-fs (sda1): Directory bread(block 396817) failed FAT-fs (sda1): Directory bread(block 396818) failed FAT-fs (sda1): Directory bread(block 396819) failed FAT-fs (sda1): Directory bread(block 396820) failed FAT-fs (sda1): Directory bread(block 396821) failed FAT-fs (sda1): Directory bread(block 396822) failed FAT-fs (sda1): Directory bread(block 396823) failed FAT-fs (sda1): Directory bread(block 406824) failed FAT-fs (sda1): Directory bread(block 406825) failed FAT-fs (sda1): Directory bread(block 406826) failed FAT-fs (sda1): Directory bread(block 406827) failed FAT-fs (sda1): Directory bread(block 406828) failed FAT-fs (sda1): Directory bread(block 406829) failed FAT-fs (sda1): Directory bread(block 406830) failed FAT-fs (sda1): Directory bread(block 406831) failed FAT-fs (sda1): Directory bread(block 417696) failed FAT-fs (sda1): Directory bread(block 417697) failed FAT-fs (sda1): Directory bread(block 417698) failed FAT-fs (sda1): Directory bread(block 417699) failed FAT-fs (sda1): Directory bread(block 417700) failed FAT-fs (sda1): Directory bread(block 417701) failed FAT-fs (sda1): Directory bread(block 417702) failed FAT-fs (sda1): Directory bread(block 417703) failed FAT-fs (sda1): FAT read failed (blocknr 2631) FAT-fs (sda1): Directory bread(block 396816) failed FAT-fs (sda1): Directory bread(block 396817) failed FAT-fs (sda1): Directory bread(block 396818) failed FAT-fs (sda1): Directory bread(block 396819) failed FAT-fs (sda1): Directory bread(block 396820) failed FAT-fs (sda1): Directory bread(block 396821) failed Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Namjae Jeon	b742c34153	fat: add fat_msg_ratelimit() Add a fat_msg_ratelimit() to limit the message generation rate. Signed-off-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	78491189dd	fat: switch to fsinfo_inode Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO block. The FSINFO block contains non-essential data about the amount of free clusters and the next free cluster. FAT file-system can always find out this information by scanning the FAT table, but having it in the FSINFO block may speed things up sometimes. So FAT file-system relies on the VFS superblock write-out services to make sure the FSINFO block is written out to the media from time to time. The whole "superblock write-out" VFS infrastructure is served by the 'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and writes out all dirty superblock using the '->write_super()' call-back. But the problem with this thread is that it wastes power by waking up the system every 5 seconds no matter what. So we want to kill it completely and thus, we need to make file-systems to stop using the '->write_super' VFS service, and then remove it together with the kernel thread. This patch switches the FAT FSINFO block management from '->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as dirty and let VFS invoke the '->write_inode' call-back when needed, where we write-out the FSINFO block. This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in R/O mode. As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT call-back function because they become unneeded. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	330fe3c4c6	fat: mark superblock as dirty less often Preparation for further changes. It touches few functions in fatent.c and prevents them from marking the superblock as dirty unnecessarily often. Namely, instead of marking it as dirty in the internal tight loops - do it only once at the end of the functions. And instead of marking it as dirty while holding the FAT table lock, do it outside the lock. The reason for this patch is that marking the superblock as dirty will soon become a little bit heavier operation, so it is cleaner to do this only when it is necessary. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Artem Bityutskiy	90b436657e	fat: introduce mark_fsinfo_dirty helper A preparation patch which introduces a 'mark_fsinfo_dirty()' helper function which just sets the 's_dirt' flag to 1 so far. I'll add more code to this helper later, so I do not mark it as 'inline'. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Artem Bityutskiy	020ac5b6be	fat: introduce special inode for managing the FSINFO block This is patchset makes fatfs stop using the VFS '->write_super()' method for writing out the FSINFO block. The final goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread wakes up every 5 seconds (by default) and calls '->write_super()' for all mounted file-systems. And the bad thing is that this is done even if all the superblocks are clean. Moreover, some file-systems do not even need this end they do not register the '->write_super()' method at all (e.g., btrfs). So 'sync_supers()' most often just generates useless wake-ups and wastes power. I am trying to make all file-systems independent of '->write_super()' and plan to remove 'sync_supers()' and '->write_super' completely once there are no more users. The '->write_supers()' method is mostly used by baroque file-systems like hfs, udf, etc. Modern file-systems like btrfs and xfs do not use it. This justifies removing this stuff from VFS completely and make every FS self-manage own superblock. Tested with xfstests. This patch: Preparation for further changes. It introduces a special inode ('fsinfo_inode') in FAT file-system which we'll later use for managing the FSINFO block. Note, this there is already one special inode ('fat_inode') which is used for managing the FAT tables. Introduce new 'MSDOS_FSINFO_INO' constant for this special inode. It is safe to do because FAT file-system does not store inode numbers on the media but generates them run-time. I've also cleaned up the comment to existing 'MSDOS_ROOT_INO' constant, while on it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Dan Carpenter	7bc1bac77a	HPFS: remove PRINTK() macro The PRINTK() macro isn't really used. Let's just remove it because it is ugly and out of date. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Ryusuke Konishi	11475975dd	nilfs2: flush disk caches in syncing There are two cases that the cache flush is needed to avoid data loss against unexpected hang or power failure. One is sync file function (i.e. nilfs_sync_file) and another is checkpointing ioctl. This issues a cache flush request to device for such cases if barrier mount option is enabled, and makes sure data really is on persistent storage on their completion. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Will Deacon	a1d494495c	pipe: return -ENOIOCTLCMD instead of -EINVAL on unknown ioctl command As described in commit `07d106d0a3` ("vfs: fix up ENOIOCTLCMD error handling"), drivers should return -ENOIOCTLCMD if they receive an ioctl command which they don't understand. Doing so will result in -ENOTTY being returned to userspace, which matches the behaviour of the compat layer if it fails to translate an ioctl command. This patch fixes the pipe ioctl to return -ENOIOCTLCMD instead of -EINVAL when passed an unknown ioctl command. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Alan Cox <alan@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Xi Wang	a3860c1c5d	introduce SIZE_MAX ULONG_MAX is often used to check for integer overflow when calculating allocation size. While ULONG_MAX happens to work on most systems, there is no guarantee that `size_t' must be the same size as `long'. This patch introduces SIZE_MAX, the maximum value of `size_t', to improve portability and readability for allocation size validation. Signed-off-by: Xi Wang <xi.wang@gmail.com> Acked-by: Alex Elder <elder@dreamhost.com> Cc: David Airlie <airlied@linux.ie> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:26 -07:00
J. Bruce Fields	6eccece90b	nfsd4: fix, consolidate client_has_state Whoops: first, I reimplemented the already-existing has_resources without noticing; second, I got the test backwards. I did pick a better name, though. Combine the two.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:39 -04:00
J. Bruce Fields	b9831b59f3	nfsd4: don't remove rebooted client record until confirmation In the NFSv4.1 client-reboot case we're currently removing the client's previous state in exchange_id. That's wrong--we should be waiting till the confirming create_session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:34 -04:00
J. Bruce Fields	32f16b3823	nfsd4: remove some dprintk's and a comment The comment is redundant, and if we really want dprintk's here they'd probably be better in the common (check-slot_seqid) code. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:31 -04:00
J. Bruce Fields	778df3f0fe	nfsd4: return "real" sequence id in confirmed case The client should ignore the returned sequence_id in the case where the CONFIRMED flag is set on an exchange_id reply--and in the unconfirmed case "1" is always the right response. So it shouldn't actually matter what we return here. We could continue returning 1 just to catch clients ignoring the spec here, but I'd rather be generous. Other things equal, returning the existing sequence_id seems more informative. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:27 -04:00
J. Bruce Fields	0f1ba0ef21	nfsd4: fix exchange_id to return confirm flag Otherwise nfsd4_set_ex_flags writes over the return flags. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:21 -04:00
J. Bruce Fields	7447758be7	nfsd4: clarify that renewing expired client is a bug This can't happen: - cl_time is zeroed only by unhash_client_locked, which is only ever called under both the state lock and the client lock. - every caller of renew_client() should have looked up a (non-expired) client and then called renew_client() all without dropping the state lock. - the only other caller of renew_client_locked() is release_session_client(), which first checks under the client_lock that the cl_time is nonzero. So make it clear that this is a bug, not something we handle. I can't quite bring myself to make this a BUG(), though, as there are a lot of renew_client() callers, and returning here is probably safer than a BUG(). We'll consider making it a BUG() after some more cleanup. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:14 -04:00
J. Bruce Fields	90d700b779	nfsd4: simpler ordering of setclientid_confirm checks The cases here divide into two main categories: - if there's an uncomfirmed record with a matching verifier, then this is a "normal", succesful case: we're either creating a new client, or updating an existing one. - otherwise, this is a weird case: a replay, or a server reboot. Reordering to reflect that makes the code a bit more concise and the logic a lot easier to understand. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:07 -04:00
J. Bruce Fields	f3d03b9202	nfsd4: setclientid: remove pointless assignment Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:06 -04:00
J. Bruce Fields	8695b90ac3	nfsd4: fix error return in non-matching-creds case Note CLID_INUSE is for the case where two clients are trying to use the same client-provided long-form client identifiers. But what we're looking at here is the server-returned shorthand client id--if those clash there's a bug somewhere. Fix the error return, pull the check out into common code, and do the check unconditionally in all cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:04 -04:00
J. Bruce Fields	788c1eba50	nfsd4: fix setclientid_confirm same_cred check New clients are created only by nfsd4_setclientid(), which always gives any new client a unique clientid. The only exception is in the "callback update" case, in which case it may create an unconfirmed client with the same clientid as a confirmed client. In that case it also checks that the confirmed client has the same credential. Therefore, it is pointless for setclientid_confirm to check whether a confirmed and unconfirmed client with the same clientid have matching credentials--they're guaranteed to. Instead, it should be checking whether the credential on the setclientid_confirm matches either of those. Otherwise, it could be anyone sending the setclientid_confirm. Granted, I can't see why anyone would, but still it's probalby safer to check. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:03 -04:00
J. Bruce Fields	34b232bb37	nfsd4: merge 3 setclientid cases to 2 Boy, is this simpler. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:02 -04:00
J. Bruce Fields	8f9307119d	nfsd4: pull out common code from setclientid cases Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:01 -04:00
J. Bruce Fields	ad72aae5ad	nfsd4: merge last two setclientid cases The code here is mostly the same. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:30:00 -04:00
J. Bruce Fields	63db46328a	nfsd4: setclientid/confirm comment cleanup Be a little more concise. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:59 -04:00
J. Bruce Fields	e98479b8d6	nfsd4: setclientid remove unnecessary terms from a logical expression Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	d5497fc693	nfsd4: move rq_flavor into svc_cred Move the rq_flavor into struct svc_cred, and use it in setclientid and exchange_id comparisons as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:58 -04:00
J. Bruce Fields	8fbba96e5b	nfsd4: stricter cred comparison for setclientid/exchange_id The typical setclientid or exchange_id will probably be performed with a credential that maps to either root or nobody, so comparing just uid's is unlikely to be useful. So, use everything else we can get our hands on. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:57 -04:00

1 2 3 4 5 ...

27559 Commits