Replace EXTRA_CFLAGS with ccflags-y. And change ntfs-objs to ntfs-y
for cleaner conditional inclusion.
Signed-off-by: matt mooney <mfm@muteddisk.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
* 'mnt_devname' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
vfs: bury ->get_sb()
nfs: switch NFS from ->get_sb() to ->mount()
nfs: stop mangling ->mnt_devname on NFS
vfs: new superblock methods to override /proc/*/mount{s,info}
nfs: nfs_do_{ref,sub}mount() superblock argument is redundant
nfs: make nfs_path() work without vfsmount
nfs: store devname at disconnected NFS roots
nfs: propagate devname to nfs{,4}_get_root()
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] tioca: Fix assignment from incompatible pointer warnings
[IA64] mca.c: Fix cast from integer to pointer warning
[IA64] setup.c Typo fix "Architechtuallly"
[IA64] Add CONFIG_MISC_DEVICES=y to configs that need it.
[IA64] disable interrupts at end of ia64_mca_cpe_int_handler()
[IA64] Add DMA_ERROR_CODE define.
pstore: fix build warning for unused return value from sysfs_create_file
pstore: X86 platform interface using ACPI/APEI/ERST
pstore: new filesystem interface to platform persistent storage
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: That's all, folks
fs/locks.c: Remove stale FIXME left over from BKL conversion
ipx: remove the BKL
appletalk: remove the BKL
x25: remove the BKL
ufs: remove the BKL
hpfs: remove the BKL
drivers: remove extraneous includes of smp_lock.h
tracing: don't trace the BKL
adfs: remove the big kernel lock
The last remaining instances of ->get_sb() can be converted ->mount()
now - nothing in them uses new vfsmount anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a) ->show_devname(m, mnt) - what to put into devname columns in mounts,
mountinfo and mountstats
b) ->show_path(m, mnt) - what to put into relative path column in mountinfo
Leaving those NULL gives old behaviour. NFS switched to using those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
part 3: now we have everything to get nfs_path() just by dentry -
just follow to (disconnected) root and pick the rest of the thing
there.
Start killing propagation of struct vfsmount * on the paths that
used to bring it to nfs_path().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
part 2: make sure that disconnected roots have corresponding mnt_devname
values stashed into them.
Have nfs*_get_root() stuff a copy of devname into ->d_fsdata of the
found root, provided that it is disconnected.
Have ->d_release() free it when dentry goes away.
Have the places where NFS uses ->d_fsdata for sillyrename (and that
can *never* happen to a disconnected root - dentry will be attached
to its parent) free old devname copies if they find those.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
AppArmor: kill unused macros in lsm.c
AppArmor: cleanup generated files correctly
KEYS: Add an iovec version of KEYCTL_INSTANTIATE
KEYS: Add a new keyctl op to reject a key with a specified error code
KEYS: Add a key type op to permit the key description to be vetted
KEYS: Add an RCU payload dereference macro
AppArmor: Cleanup make file to remove cruft and make it easier to read
SELinux: implement the new sb_remount LSM hook
LSM: Pass -o remount options to the LSM
SELinux: Compute SID for the newly created socket
SELinux: Socket retains creator role and MLS attribute
SELinux: Auto-generate security_is_socket_class
TOMOYO: Fix memory leak upon file open.
Revert "selinux: simplify ioctl checking"
selinux: drop unused packet flow permissions
selinux: Fix packet forwarding checks on postrouting
selinux: Fix wrong checks for selinux_policycap_netpeer
selinux: Fix check for xfrm selinux context algorithm
ima: remove unnecessary call to ima_must_measure
IMA: remove IMA imbalance checking
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (46 commits)
fs/9p: Make the writeback_fid owned by root
fs/9p: Writeback dirty data before setattr
fs/9p: call vmtruncate before setattr 9p opeation
fs/9p: Properly update inode attributes on link
fs/9p: Prevent multiple inclusion of same header
fs/9p: Workaround vfs rename rehash bug
fs/9p: Mark directory inode invalid for many directory inode operations
fs/9p: Add . and .. dentry revalidation flag
fs/9p: mark inode attribute invalid on rename, unlink and setattr
fs/9p: Add support for marking inode attribute invalid
fs/9p: Initialize root inode number for dotl
fs/9p: Update link count correctly on different file system operations
fs/9p: Add drop_inode 9p callback
fs/9p: Add direct IO support in cached mode
fs/9p: Fix inode i_size update in file_write
fs/9p: set default readahead pages in cached mode
fs/9p: Move writeback fid to v9fs_inode
fs/9p: Add v9fs_inode
fs/9p: Don't set stat.st_blocks based on nrpages
fs/9p: Add inode hashing
...
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix build failure introduced by s/freezeable/freezable/
workqueue: add system_freezeable_wq
rds/ib: use system_wq instead of rds_ib_fmr_wq
net/9p: replace p9_poll_task with a work
net/9p: use system_wq instead of p9_mux_wq
xfs: convert to alloc_workqueue()
reiserfs: make commit_wq use the default concurrency level
ocfs2: use system_wq instead of ocfs2_quota_wq
ext4: convert to alloc_workqueue()
scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
misc/iwmc3200top: use system_wq instead of dedicated workqueues
i2o: use alloc_workqueue() instead of create_workqueue()
acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
fs/aio: aio_wq isn't used in memory reclaim path
input/tps6507x-ts: use system_wq instead of dedicated workqueue
cpufreq: use system_wq instead of dedicated workqueues
wireless/ipw2x00: use system_wq instead of dedicated workqueues
arm/omap: use system_wq in mailbox
workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
It turns out that while a maximum of 8 partitions may be what people
"should" have had, you can actually fit up to 18 entries(*) in a sector.
And some people clearly were taking advantage of that, like Michael
Cree, who had ten partitions on one of his OSF disks.
(*) The OSF partition data starts at byte offset 64 in the first sector,
and the array of 16-byte partition entries start at offset 148 in
the on-disk partition structure.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Cc: stable@kernel.org (v2.6.38)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
iprune_sem is continously giving us lockdep warnings because we do take it in
read mode in the reclaim path, but we're also doing non-NOFS allocations under
it taken in write mode.
Taking a bit deeper look at it I think it's fixable quite trivially:
- for invalidate_inodes we do not need iprune_sem at all. We have an active
reference on the superblock, so the filesystem is not going away until it
has finished.
- for evict_inodes we do need it, to make sure prune_icache has done it's
work before we tear down the superblock. But there is no reason to
hold it over the actual reclaim operation - it's enough to cycle through
it after the actual reclaim to make sure we wait for any pending
prune_icache to complete. We just have to remove the WARN_ON for
otherwise busy inodes as they can actually happen now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When debugging is enabled, we allocate a buffer of PEB size for
various debugging purposes. However, now all users of this buffer
are gone and we can safely remove it and save 128KiB or more RAM.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Instead of using pre-allocated 'c->dbg->buf' buffer in
'dbg_scan_orphans()', dynamically allocate it when needed. The intend
is to get rid of the pre-allocated 'c->dbg->buf' buffer and save
128KiB of RAM (or more if PEB size is larger). Indeed, currently we
allocate this memory even if the user never enables any self-check,
which is wasteful.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Instead of using pre-allocated 'c->dbg->buf' buffer in
'dump_lpt_leb()', dynamically allocate it when needed. The intend
is to get rid of the pre-allocated 'c->dbg->buf' buffer and save
128KiB of RAM (or more if PEB size is larger). Indeed, currently we
allocate this memory even if the user never enables any self-check,
which is wasteful.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Instead of using pre-allocated 'c->dbg->buf' buffer in
'dbg_check_ltab_lnum()', dynamically allocate it when needed. The
intend is to get rid of the pre-allocated 'c->dbg->buf' buffer and
save 128KiB of RAM (or more if PEB size is larger). Indeed,
currently we allocate this memory even if the user never enables
any self-check, which is wasteful.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Instead of using pre-allocated 'c->dbg->buf' buffer in
'scan_check_cb()', dynamically allocate it when needed. The intend
is to get rid of the pre-allocated 'c->dbg->buf' buffer and save
128KiB of RAM (or more if PEB size is larger). Indeed, currently we
allocate this memory even if the user never enables any self-check,
which is wasteful.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Instead of using pre-allocated 'c->dbg->buf' buffer in
'dbg_dump_leb()', dynamically allocate it when needed. The intend
is to get rid of the pre-allocated 'c->dbg->buf' buffer and save
128KiB of RAM (or more if PEB size is larger). Indeed, currently we
allocate this memory even if the user never enables any self-check,
which is wasteful.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
tidy the trailing symlinks traversal up
Turn resolution of trailing symlinks iterative everywhere
simplify link_path_walk() tail
Make trailing symlink resolution in path_lookupat() iterative
update nd->inode in __do_follow_link() instead of after do_follow_link()
pull handling of one pathname component into a helper
fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
readlinkat(), fchownat() and fstatat() with empty relative pathnames
Allow O_PATH for symlinks
New kind of open files - "location only".
ext4: Copy fs UUID to superblock
ext3: Copy fs UUID to superblock.
vfs: Export file system uuid via /proc/<pid>/mountinfo
unistd.h: Add new syscalls numbers to asm-generic
x86: Add new syscalls for x86_64
x86: Add new syscalls for x86_32
fs: Remove i_nlink check from file system link callback
fs: Don't allow to create hardlink for deleted file
vfs: Add open by file handle support
...
The new vfs locking scheme introduced in 2.6.38 breaks NFS sillyrename
because the latter relies on being able to determine the parent
directory of the dentry in the ->iput() callback in order to send the
appropriate unlink rpc call.
Looking at the code that cares about races with dput(), there doesn't
seem to be anything that specifically uses d_parent as a test for
whether or not there is a race:
- __d_lookup_rcu(), __d_lookup() all test for d_hashed() after d_parent
- shrink_dcache_for_umount() is safe since nothing else can rearrange
the dentries in that super block.
- have_submount(), select_parent() and d_genocide() can test for a
deletion if we set the DCACHE_DISCONNECTED flag when the dentry
is removed from the parent's d_subdirs list.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org (2.6.38, needs commit c826cb7dfc "dcache.c:
create helper function for duplicated functionality" )
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This creates a helper function for he "try to ascend into the parent
directory" case, which was written out in triplicate before. With all
the locking and subtle sequence number stuff, we really don't want to
duplicate that kind of code.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pull the handling of current->total_link_count into
__do_follow_link()
* put the common "do ->put_link() if needed and path_put() the link"
stuff into a helper (put_link(nd, link, cookie))
* rename __do_follow_link() to follow_link(), while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The last remaining place (resolution of nested symlink) converted
to the loop of the same kind we have in path_lookupat() and
path_openat().
Note that we still *do* have a recursion in pathname resolution;
can't avoid it, really. However, it's strictly for nested symlinks
now - i.e. ones in the middle of a pathname.
link_path_walk() has lost the tail now - it always walks everything
except the last component.
do_follow_link() renamed to nested_symlink() and moved down.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that link_path_walk() is called without LOOKUP_PARENT
only from do_follow_link(), we can simplify the checks in
last component handling. First of all, checking if we'd
arrived to a directory is not needed - the caller will check
it anyway. And LOOKUP_FOLLOW is guaranteed to be there,
since we only get to that place with nd->depth > 0.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new helper: walk_component(). Handles everything except symlinks;
returns negative on error, 0 on success and 1 on symlinks we decided
to follow. Drops out of RCU mode on such symlinks.
link_path_walk() and do_last() switched to using that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We don't want to allow creation of private hardlinks by different application
using the fd passed to them via SCM_RIGHTS. So limit the null relative name
usage in linkat syscall to CAP_DAC_READ_SEARCH
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
d_move puts the renamed dentry at the end of d_subdirs, screwing with our
cached dentry directory offsets. We were just clearing I_COMPLETE to avoid
any possibility of trouble. However, assigning the renamed dentry an
offset at the end of the directory (to match it's new d_subdirs position)
is sufficient to maintain correct behavior and hold onto I_COMPLETE.
This is especially important for workloads like rsync, which renames files
into place. Before, we would lose I_COMPLETE and do MDS lookups for each
file. With this patch we only talk to the MDS on create and rename.
Signed-off-by: Sage Weil <sage@newdream.net>
Changes to make sure writeback fid is owned by root
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
change file attribute can result in making the file readonly.
So flush the dirty pages before that.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We need to call vmtruncate before 9p setattr operation, otherwise we
could write back some dirty pages between setattr with ATTR_SIZE and vmtruncate
causing some truncated pages to be written back to server
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
With caching enabled, we need to make sure we don't
update inode->i_size via stat2inode because we could
have dirty data which is not yet written to the server
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This is similar to what ceph, ocfs2 and nfs does
http://kerneltrap.org/mailarchive/linux-fsdevel/2008/4/18/1498534
May be we should get vfs fixed
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
One successfull directory operation we would have changed directory
inode attribute. So mark them invalid
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We need to revalidate . and .. entries also
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
rename, unlink and setattr can result in update of inode attribute.
So mark the cached copy invalid
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
With cached mode some of the file system operation result
in updating inode attributes (ctime). Add support for
marking inode attribute invalid in such cases so that
we fetch the updated inode attribute on dentry revalidation.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We want to immediately drop the inode in non cached mode
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Only update inode i_size when we write towards end of file.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We want to enable readahead in cached mode
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Switch to the fscache code to v9fs_inode. We will later use
v9fs_inode in cache=loose mode to track the inode cache
validity timeout. Ie if we find an inode in cache older
that a specific jiffie range we will consider it stale
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
simple_getattr does set stat.st_blocks to a value
derived from nrpages. That is not correct with 9p
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We didn't add the inode to inode hash in 9p. We need to do that
to get sync to work, otherwise __mark_inode_dirty will not
add the inode to super block's dirty list.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
FIXME!! what about dotu ?
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We should not mark file system synchronous if mounted cache=* option
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Update the comment to indicate that we don't want to cache
negative dentries.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We can now support writeable mmaps.
Based on the original patch from Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The fid attached to inode will be opened O_RDWR mode and is used
for dirty page writeback only.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We add read write helper function here which will
be used later by the mmap patch
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We need to call fscache_wait_on_page_write in launder_page
for fscache
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We need to ihold even in cached mode
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
We need to call v9fs_cache_inode_set_cookie in create
path also
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
With the old code we were not setting the file->f_op
with cached file operations during creat.
(format correction by jvrao@linux.vnet.ibm.com)
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Current code sets access=user as default for all protocol versions.
This patch chagnes it to "client" only for dotl.
User can always specify particular access mode with -o access= option.
No change there.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The mount option access=client is overloaded as it assumes acl too.
Adding posixacl option to enable POSIX ACLs makes it explicit and clear.
Also it is convenient in the future to add other types of acls like richacls.
Ideally, the access mode 'client' should be just like V9FS_ACCESS_USER
except it underscores the location of access check.
Traditional 9P protocol lets the server perform access checks but with
this mode, all the access checks will be performed on the client itself.
Server just follows the client's directive.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If the kernel is not compiled with CONFIG_9P_FS_POSIX_ACL and the
mount option is specified to enable ACLs current code fails the mount.
This patch brings the behavior inline with other filesystems like ext3
by proceeding with the mount and log a warning to syslog.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
With create/mkdir/mknod in non cached mode we initialize the inode using
v9fs_get_inode. v9fs_get_inode doesn't initialize the cache inode value
to NULL. This is causing to trip on BUG_ON in v9fs_get_cached_acl.
Fix is to initialize acls to NULL and not to leave them in ACL_NOT_CACHED
state.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
In v9fs_get_acl() if __v9fs_get_acl() gets only one of the
dacl/pacl we are not releasing it.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
As per RCU glock patch review comments, don't use the _raw
version of this function here.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Just need to make sure that AF_UNIX garbage collector won't
confuse O_PATHed socket on filesystem for real AF_UNIX opened
socket.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For readlinkat() we simply allow empty pathname; it will fail unless
we have dfd equal to O_PATH-opened symlink, so we are outside of
POSIX scope here. For fchownat() and fstatat() we allow AT_EMPTY_PATH;
let the caller explicitly ask for such behaviour.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
At that point we can't do almost nothing with them. They can be opened
with O_PATH, we can manipulate such descriptors with dup(), etc. and
we can see them in /proc/*/{fd,fdinfo}/*.
We can't (and won't be able to) follow /proc/*/fd/* symlinks for those;
there's simply not enough information for pathname resolution to go on
from such point - to resolve a symlink we need to know which directory
does it live in.
We will be able to do useful things with them after the next commit, though -
readlinkat() and fchownat() will be possible to use with dfd being an
O_PATH-opened symlink and empty relative pathname. Combined with
open_by_handle() it'll give us a way to do realink-by-handle and
lchown-by-handle without messing with more redundant syscalls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
New flag for open(2) - O_PATH. Semantics:
* pathname is resolved, but the file itself is _NOT_ opened
as far as filesystem is concerned.
* almost all operations on the resulting descriptors shall
fail with -EBADF. Exceptions are:
1) operations on descriptors themselves (i.e.
close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
fcntl(fd, F_SETFD, ...))
2) fcntl(fd, F_GETFL), for a common non-destructive way to
check if descriptor is open
3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
points of pathname resolution
* closing such descriptor does *NOT* affect dnotify or
posix locks.
* permissions are checked as usual along the way to file;
no permission checks are applied to the file itself. Of course,
giving such thing to syscall will result in permission checks (at
the moment it means checking that starting point of ....at() is
a directory and caller has exec permissions on it).
fget() and fget_light() return NULL on such descriptors; use of
fget_raw() and fget_raw_light() is needed to get them. That protects
existing code from dealing with those things.
There are two things still missing (they come in the next commits):
one is handling of symlinks (right now we refuse to open them that
way; see the next commit for semantics related to those) and another
is descriptor passing via SCM_RIGHTS datagrams.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
File system UUID is made available to application
via /proc/<pid>/mountinfo
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
File system UUID is made available to application
via /proc/<pid>/mountinfo
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We add a per superblock uuid field. File systems should
update the uuid in the fill_super callback
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that VFS check for inode->i_nlink == 0 and returns proper
error, remove similar check from file system
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add inode->i_nlink == 0 check in VFS. Some of the file systems
do this internally. A followup patch will remove those instance.
This is needed to ensure that with link by handle we don't allow
to create hardlink of an unlinked file. The check also prevent a race
between unlink and link
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The syscall also return mount id which can be used
to lookup file system specific information such as uuid
in /proc/<pid>/mountinfo
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
For name_to_handle_at(2) we'll want both ...at()-style syscall that
would be usable for non-directory descriptors (with empty relative
pathname). Introduce new flag (AT_EMPTY_PATH) to deal with that and
corresponding LOOKUP_EMPTY; teach user_path_at() and path_init() to
deal with the latter.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: NFSROOT should default to "proto=udp"
nfs4: remove duplicated #include
NFSv4: nfs4_state_mark_reclaim_nograce() should be static
NFSv4: Fix the setlk error handler
NFSv4.1: Fix the handling of the SEQUENCE status bits
NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
NFSv4.1 reclaim complete must wait for completion
NFSv4: remove duplicate clientid in struct nfs_client
NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
sunrpc: Propagate errors from xs_bind() through xs_create_sock()
(try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
nfs: fix compilation warning
nfs: add kmalloc return value check in decode_and_add_ds
SUNRPC: Remove resource leak in svc_rdma_send_error()
nfs: close NFSv4 COMMIT vs. CLOSE race
SUNRPC: Close a race in __rpc_wait_for_completion_task()
The kernel automatically evaluates partition tables of storage devices.
The code for evaluating OSF partitions contains a bug that leaks data
from kernel heap memory to userspace for certain corrupted OSF
partitions.
In more detail:
for (i = 0 ; i < le16_to_cpu(label->d_npartitions); i++, partition++) {
iterates from 0 to d_npartitions - 1, where d_npartitions is read from
the partition table without validation and partition is a pointer to an
array of at most 8 d_partitions.
Add the proper and obvious validation.
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: stable@kernel.org
[ Changed the patch trivially to not repeat the whole le16_to_cpu()
thing, and to use an explicit constant for the magic value '8' ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gfs2_write_begin() calls grab_cache_page_write_begin() that returns *locked*
page. Correspondent error-handling path lacks for unlock_page() call:
> out:
> if (error == 0)
> return 0;
>
> page_cache_release(page);
The whole system hangs if gfs2_unstuff_dinode() called from gfs2_write_begin()
failed for some reason.
Reported-by: Maxim <maxim.patlasov@gmail.com>
Signed-off-by: Maxim <maxim.patlasov@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The exportfs encode handle function should return the minimum required
handle size. This helps user to find out the handle size by passing 0
handle size in the first step and then redoing to the call again with
the returned handle size value.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
New helpers: user_statfs() and fd_statfs(), taking userland pathname and
descriptor resp. and filling struct kstatfs. Syscalls of statfs family
(native, compat and foreign - osf and hpux on alpha and parisc resp.)
switched to those. Removes some boilerplate code, simplifies cleanup
on errors...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
new function: file_open_root(dentry, mnt, name, flags) opens the file
vfs_path_lookup would arrive to.
Note that name can be empty; in that case the usual requirement that
dentry should be a directory is lifted.
open-coded equivalents switched to it, may_open() got down exactly
one caller and became static.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>