This patch ties all operation vectors into a file system superblock
and registers the exofs file_system_type at module's load time.
* The file system control block (AKA on-disk superblock) resides in
an object with a special ID (defined in common.h).
Information included in the file system control block is used to
fill the in-memory superblock structure at mount time. This object
is created before the file system is used by mkexofs.c It contains
information such as:
- The file system's magic number
- The next inode number to be allocated
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
implementation of directory and inode operations.
* A directory is treated as a file, and essentially contains a list
of <file name, inode #> pairs for files that are found in that
directory. The object IDs correspond to the files' inode numbers
and are allocated using a 64bit incrementing global counter.
* Each file's control block (AKA on-disk inode) is stored in its
object's attributes. This applies to both regular files and other
types (directories, device files, symlinks, etc.).
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
OK Now we start to read and write from osd-objects. We try to
collect at most contiguous pages as possible in a single write/read.
The first page index is the object's offset.
TODO:
In 64-bit a single bio can carry at most 128 pages.
Add support of chaining multiple bios
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
implementation of the file_operations and inode_operations for
regular data files.
Most file_operations are generic vfs implementations except:
- exofs_truncate will truncate the OSD object as well
- Generic file_fsync is not good for none_bd devices so open code it
- The default for .flush in Linux is todo nothing so call exofs_fsync
on the file.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
This patch includes osd infrastructure that will be used later by
the file system.
Also the declarations of constants, on disk structures,
and prototypes.
And the Kbuild+Kconfig files needed to build the exofs module.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
Revert "proc: revert /proc/uptime to ->read_proc hook"
proc 2/2: remove struct proc_dir_entry::owner
proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
proc: fix sparse warnings in pagemap_read()
proc: move fs/proc/inode-alloc.txt comment into a source file
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.
We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.
But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.
->read_proc/->write_proc were just fixed to not require ->owner for
protection.
rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.
Removing ->owner will also make PDE smaller.
So, let's nuke it.
Kudos to Jeff Layton for reminding about this, let's say, oversight.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
struct proc_dir_entry::owner is going to be removed. Now it's only necessary
to protect PDEs which are using ->read_proc, ->write_proc hooks.
However, ->owner assignments are racy and make it very easy for someone to switch
->owner on live PDE (as some subsystems do) without fixing refcounts and so on.
http://bugzilla.kernel.org/show_bug.cgi?id=12454
So, ->owner is on death row.
Proxy file operations exist already (proc_file_operations), just bump usecount
when necessary.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
fs/proc/task_mmu.c:696:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:696:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:696:9: expected unsigned long long [noderef] [usertype] <asn:1>*out
fs/proc/task_mmu.c:696:9: got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:697:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:697:9: warning: incorrect type in assignment (different address spaces)
fs/proc/task_mmu.c:697:9: expected unsigned long long [noderef] [usertype] <asn:1>*end
fs/proc/task_mmu.c:697:9: got unsigned long long [usertype] *<noident>
fs/proc/task_mmu.c:723:12: warning: cast removes address space of expression
fs/proc/task_mmu.c:723:26: error: subtraction of different types can't work (different address spaces)
fs/proc/task_mmu.c:725:24: error: subtraction of different types can't work (different address spaces)
Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
so that people will realize that it exists and can update it as needed.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
This patch renames n_, c_, etc variables to something more sane. This
is the sixth in a series of patches to rip out some of the awful
variable naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_._//g to the reiserfs code. This is the
fifth in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_tb/tb/g to the reiserfs code. This is the
fourth in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_inode/inode/g to the reiserfs code. This
is the third in a series of patches to rip out some of the awful
variable naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_bh/bh/g to the reiserfs code. This is the
second in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is a simple s/p_s_sb/sb/g to the reiserfs code. This is the
first in a series of patches to rip out some of the awful variable
naming in reiserfs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch strips trailing whitespace from the reiserfs code.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch cleans up some redundancies in the reiserfs tree path code.
decrement_bcount() is essentially the same function as brelse(), so we use
that instead.
decrement_counters_in_path() is exactly the same function as pathrelse(), so
we kill that and use pathrelse() instead.
There's also a bit of cleanup that makes the code a bit more readable.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the first in a series of patches to make balance_leaf() not
quite so insane.
This patch factors out the open coded initializations of buffer_info
structures and defines a few initializers for the 4 cases they're used.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some time ago, some changes were made to make security inode attributes
be atomically written during inode creation. ReiserFS fell behind in
this area, but with the reworking of the xattr code, it's now fairly
easy to add.
The following patch adds the ability for security attributes to be added
automatically during inode creation.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current reiserfs xattr implementation open codes reiserfs_readdir
and frees the path before calling the filldir function. Typically, the
filldir function is something that modifies the file system, such as a
chown or an inode deletion that also require reading of an inode
associated with each direntry. Since the file system is modified, the
path retained becomes invalid for the next run. In addition, it runs
backwards in attempt to minimize activity.
This is clearly suboptimal from a code cleanliness perspective as well
as performance-wise.
This patch implements a generic reiserfs_for_each_xattr that uses the
generic readdir and a specific filldir routine that simply populates an
array of dentries and then performs a specific operation on them. When
all files have been operated on, it then calls the operation on the
directory itself.
The result is a noticable code reduction and better performance.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Deadlocks are possible in the xattr code between the journal lock and the
xattr sems.
This patch implements journalling for xattr operations. The benefit is
twofold:
* It gets rid of the deadlock possibility by always ensuring that xattr
write operations are initiated inside a transaction.
* It corrects the problem where xattr backing files aren't considered any
differently than normal files, despite the fact they are metadata.
I discussed the added journal load with Chris Mason, and we decided that
since xattrs (versus other journal activity) is fairly rare, the introduction
of larger transactions to support journaled xattrs wouldn't be too big a deal.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig had asked me quite some time ago to port the reiserfs
xattrs to the generic xattr interface.
This patch replaces the reiserfs-specific xattr handling code with the
generic struct xattr_handler.
However, since reiserfs doesn't split the prefix and name when accessing
xattrs, it can't leverage generic_{set,get,list,remove}xattr without
needlessly reconstructing the name on the back end.
Update 7/26/07: Added missing dput() to deletion path.
Update 8/30/07: Added missing mark_inode_dirty when i_mode is used to
represent an ACL and no previous ACL existed.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the changes to xattr root locking, the i_has_xattr_dir flag
is no longer needed. This patch removes it.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The per-inode locking can be made more fine-grained to surround just the
interaction with the filesystem itself. This really only applies to
protecting reads during a write, since concurrent writes are barred with
inode->i_mutex at the vfs level.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the switch to using inode->i_mutex locking during lookups/creation
in the xattr root, the per-super xattr lock is no longer needed.
This patch removes it.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The xattr file open/lookup code is needlessly complex. We can use
vfs-level operations to perform the same work, and also simplify the
locking constraints. The locking advantages will be exploited in future
patches.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current reiserfs xattr implementation will not clean up old xattr
files if files are deleted when REISERFS_FS_XATTR is unset. This
results in inaccessible lost files, wasting space.
This patch compiles in basic xattr knowledge, such as how to delete them
and change ownership for quota tracking. If the file system has never
used xattrs, then the operation is quite fast: it returns immediately
when it sees there is no .reiserfs_priv directory.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a number of helper functions for marking a reiserfs inode
private that were leftover from reiserfs did its own thing wrt to
private inodes. S_PRIVATE has been in the kernel for some time, so this
patch removes the helpers and uses IS_PRIVATE instead.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Early in the reiserfs xattr development, there was a plan to use
hardlinks to save disk space for identical xattrs. That code never
materialized and isn't going to, so this patch removes the detection
code.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes reiserfs_get_page to take an offset rather than an
index since no callers calculate the index differently.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the xinode and mapping variables from
reiserfs_xattr_{get,set}.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes many paths that are currently using warnings to handle
the error.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Although reiserfs can currently handle severe errors such as journal failure,
it cannot handle less severe errors like metadata i/o failure. The following
patch adds a reiserfs_error() function akin to the one in ext3.
Subsequent patches will use this new error handler to handle errors more
gracefully in general.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch kills off reiserfs_journal_abort as it is never called, and
combines __reiserfs_journal_abort_{soft,hard} into one function called
reiserfs_abort_journal, which performs the same work. It is silent
as opposed to the old version, since the message was always issued
after a regular 'abort' message.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ReiserFS panics can be somewhat inconsistent.
In some cases:
* a unique identifier may be associated with it
* the function name may be included
* the device may be printed separately
This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The formatting of the error buffer is race prone. It uses static buffers
for both formatting and output. While overwriting the error buffer
can product garbled output, overwriting the format buffer with incompatible
% directives can cause crashes.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vsprintf will consume varargs on its own. Skipping them manually
results in garbage in the error buffer, or Oopses in the case of
pointers.
This patch removes the advancement and fixes a number of bugs where
crashes were observed as side effects of a regular error report.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ReiserFS warnings can be somewhat inconsistent.
In some cases:
* a unique identifier may be associated with it
* the function name may be included
* the device may be printed separately
This patch aims to make warnings more consistent. reiserfs_warning() prints
the device name, so printing it a second time is not required. The function
name for a warning is always helpful in debugging, so it is now automatically
inserted into the output. Hans has stated that every warning should have
a unique identifier. Some cases lack them, others really shouldn't have them.
reiserfs_warning() now expects an id associated with each message. In the
rare case where one isn't needed, "" will suffice.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In several places, reiserfs_warning is used when there is no warning, just
a notice. This patch changes some of them to indicate that the message
is merely informational.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The output format between a warning/error/panic/info/etc changes with
which one is used.
The following patch makes the messages more internally consistent, but also
more consistent with other Linux filesystems.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes leaf_paste_entries more consistent with respect to the
other leaf operations. Using buffer_info instead of buffer_head
directly allows us to get a superblock pointer for use in error
handling.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes up the reiserfs code such that transaction ids are
always unsigned ints. In places they can currently be signed ints or
unsigned longs.
The former just causes an annoying clm-2200 warning and may join a
transaction when it should wait.
The latter is just for correctness since the disk format uses a 32-bit
transaction id. There aren't any runtime problems that result from it
not wrapping at the correct location since the value is truncated
correctly even on big endian systems. The 0 value might make it to
disk, but the mount-time checks will bump it to 10 itself.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following patch adds the fields for tracking mount counts and last
fsck timestamps to the superblock. It also increments the mount count
on every read-write mount.
Reiserfsprogs 3.6.21 added support for these fields.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>