Always run commit in sync_fs, because even if the journal seems
to be almost empty, there may be a deletion which removes a large
file, which affects the index greatly. And because we want
better free space predictions after 'sync_fs()', we have to
commit.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Argh. The ->sync_fs call is called _before_ all inodes are flushed.
This means we first sync write buffers and commit, then all
inodes are synced, and we end up with unflushed write buffers!
Fix this by forcing synching all indoes from 'ubifs_sync_fs()'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The c->min_idx_lebs constant depends on c->old_idx_sz, which
is read from the master node. This means that we have to
initialize c->min_idx_lebs only after we have read the master
node.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When we commit, but before we try to write anything to the flash
media, @c->min_idx_size is inaccurate, because we do not re-calculate
it after the commit. Do not forget to do this.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Take into account that 2 eraseblocks are never available because
they are reserved for the index. This gives more realistic count
of FS blocks.
To avoid future confusions like this, introduce a constant.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch fixes the following section mismatch:
WARNING: fs/ubifs/ubifs.o(.init.text+0xec): Section mismatch in reference from the function init_module() to the function .exit.text:ubifs_compressors_exit()
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Conflicts:
fs/nfsd/nfs4recover.c
Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().
Signed-off-by: James Morris <jmorris@namei.org>
Do not forget to check whether lpt debugging is enabled before
running the check functions. This commit also makes some spelling
fixes.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We need to have a possibility to see various UBIFS variables
and ask UBIFS to dump various information. Debugfs is what
we need.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Introduce a new data structure which contains all debugging
stuff inside. This is cleaner than having debugging stuff
directly in 'c'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
I have a habit of compiling kernel with
EXTRA_CFLAGS="-Wextra -Wno-unused -Wno-sign-compare -Wno-missing-field-initializers"
and so fs/ubifs/key.h give lots (~10) of these every time:
CC fs/ubifs/tnc_misc.o
In file included from fs/ubifs/ubifs.h:1725,
from fs/ubifs/tnc_misc.c:30:
fs/ubifs/key.h: In function 'key_r5_hash':
fs/ubifs/key.h:64: warning: comparison of unsigned expression >= 0 is always true
fs/ubifs/key.h: In function 'key_test_hash':
fs/ubifs/key.h:81: warning: comparison of unsigned expression >= 0 is always true
This patch fixes the warnings.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
It is very handy to be able to change default UBIFS compressor
via mount options. Introduce -o compr=<name> mount option support.
Currently only "none", "lzo" and "zlib" compressors are supported.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Save a 4 bytes of RAM per 'struct inode' by stroring inode
compression type in bit-filed, instead of using 'int'.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
If data does not compress, it is better to leave it uncompressed
because we'll read it faster then. So do not compress data if we
save less than 64 bytes.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
To avoid memory allocation failure during bulk-read, pre-allocate
a bulk-read buffer, so that if there is only one bulk-reader at
a time, it would just use the pre-allocated buffer and would not
do any memory allocation. However, if there are more than 1 bulk-
reader, then only one reader would use the pre-allocated buffer,
while the other reader would allocate the buffer for itself.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Bulk-read allocates 128KiB or more using kmalloc. The allocation
starts failing often when the memory gets fragmented. UBIFS still
works fine in this case, because it falls-back to standard
(non-optimized) read method, though. This patch teaches bulk-read
to allocate exactly the amount of memory it needs, instead of
allocating 128KiB every time.
This patch is also a preparation to the further fix where we'll
have a pre-allocated bulk-read buffer as well. For example, now
the @bu object is prepared in 'ubifs_bulk_read()', so we could
path either pre-allocated or allocated information to
'ubifs_do_bulk_read()' later. Or teaching 'ubifs_do_bulk_read()'
not to allocate 'bu->buf' if it is already there.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Bulk-read allocates a lot of memory with 'kmalloc()', and when it
is/gets fragmented 'kmalloc()' fails with a scarry warning. But
because bulk-read is just an optimization, UBIFS keeps working fine.
Supress the warning by passing __GFP_NOWARN option to 'kmalloc()'.
This patch also introduces a macro for the magic 128KiB constant.
This is just neater.
Note, this is not really fixes the problem we had, but just hides
the warnings. The further patches fix the problem.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.
Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().
Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Adrian Hunter <ext-adrian.hunter@nokia.com>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: James Morris <jmorris@namei.org>
The LPT may have gaps in it because initially empty LEBs
are not added by mkfs.ubifs - because it does not know how
many there are. Then UBIFS allocates empty LEBs in the
reverse order that they are discovered i.e. they are
added to, and removed from, the front of a list. That
creates a gap in the middle of the LPT.
The function dirtying the LPT tree (for the purpose of
small model garbage collection) assumed that a gap could
only occur at the very end of the LPT and stopped dirtying
prematurely, which in turn resulted in the LPT running
out of space - something that is designed to be impossible.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
We print 'ino_t' type using '%lu' printk() placeholder, but this
results in many warnings when compiling for Alpha platform. Fix
this by adding (unsingned long) casts.
Fixes these warnings:
fs/ubifs/journal.c:693: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/journal.c:1131: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/dir.c:163: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2700: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/replay.c:1066: warning: format '%lu' expects type 'long unsigned int', but argument 7 has type 'ino_t'
fs/ubifs/orphan.c:108: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:135: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:142: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:159: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:451: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:539: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:612: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:843: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:856: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1438: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1443: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1475: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1495: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:1591: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1671: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1674: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1699: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1788: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1821: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1833: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1924: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1932: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1938: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1945: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1953: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1960: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1967: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1973: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1988: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1991: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:2009: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ino_t'
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Noticed by sparse:
fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer
fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer
fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer
This should be checked to ensure the ubifs_assert is working as
intended, I've done the suggested annotation in this patch.
fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:298:6: expected int [signed] [assigned] tmp
fs/ubifs/sb.c:298:6: got restricted __le64 [usertype] <noident>
fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:299:19: expected restricted __le64 [usertype] atime_sec
fs/ubifs/sb.c:299:19: got int [signed] [assigned] tmp
fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:300:19: expected restricted __le64 [usertype] ctime_sec
fs/ubifs/sb.c:300:19: got int [signed] [assigned] tmp
fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:301:19: expected restricted __le64 [usertype] mtime_sec
fs/ubifs/sb.c:301:19: got int [signed] [assigned] tmp
This looks like a bugfix as your tmp was a u32 so there was truncation in
the atime, mtime, ctime value, probably not intentional, add a tmp_le64
and use it here.
fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:419:9: warning: cast to restricted __le32
Read from the annotated union member instead.
fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:175:13: expected unsigned int [unsigned] [usertype] save_flags
fs/ubifs/recovery.c:175:13: got restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:186:13: expected restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13: got unsigned int [unsigned] [usertype] save_flags
Do byteshifting at compile time of the flag value. Annotate the saved_flags
as le32.
fs/ubifs/debug.c:368:10: warning: cast to restricted __le32
fs/ubifs/debug.c:368:10: warning: cast from restricted __le64
Should be checked if the truncation was intentional, I've changed the
printk to print the full width.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Remove the "UBIFS background thread ubifs_bgd0_0 started" message.
We kill the background thread when we switch to R/O mode, and
start it again whan we switch to R/W mode. OLPC is doing this
many times during boot, and we see this message many times as
well, which is irritating. So just kill the message.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* 'linux-next' of git://git.infradead.org/ubifs-2.6: (25 commits)
UBIFS: fix ubifs_compress commentary
UBIFS: amend printk
UBIFS: do not read unnecessary bytes when unpacking bits
UBIFS: check buffer length when scanning for LPT nodes
UBIFS: correct condition to eliminate unecessary assignment
UBIFS: add more debugging messages for LPT
UBIFS: fix bulk-read handling uptodate pages
UBIFS: improve garbage collection
UBIFS: allow for sync_fs when read-only
UBIFS: commit on sync_fs
UBIFS: correct comment for commit_on_unmount
UBIFS: update dbg_dump_inode
UBIFS: fix commentary
UBIFS: fix races in bit-fields
UBIFS: ensure data read beyond i_size is zeroed out correctly
UBIFS: correct key comparison
UBIFS: use bit-fields when possible
UBIFS: check data CRC when in error state
UBIFS: improve znode splitting rules
UBIFS: add no_chk_data_crc mount option
...
Update the comment for ubifs_compress(), which incorrectly states that it
returnsa success/failure indicator.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
It is better to print "Reserved for root" than
"Reserved pool size", because it is more obvious for users
what this means.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.
This was posted for review some time ago and I believe its been in -mm
since then.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Also add debugging checks for LPT size and separate
out c->check_lpt_free from unrelated bitfields.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Bulk-read skips uptodate pages but this was putting its
array index out and causing it to treat subsequent pages
as holes.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Make garbage collection try to keep data nodes from the same
inode together and in ascending order. This improves
performance when reading those nodes especially when bulk-read
is used.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
sync_fs can be called even if the file system is mounted
read-only. Ensure the commit is not run in that case.
Reported-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Commit the journal when the FS is sync'ed. This will make
statfs provide better free space report. And we anyway
advice our users to sync the FS if they want better statfs
report.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We cannot store bit-fields together if the processes which
change them may race, unless we serialize them.
Thus, move the nospc and nospc_rp bit-fields eway from
the mount option/constant bit-fields, to avoid races.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The "bulk_read" and "no_chk_data_crc" have only 2 values -
0 and 1. We already have bit-fields in corresponding data
structers, so make "bulk_read" and "no_chk_data_crc"
bit-fields as well.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When UBIFS switches to R/O mode because of an error,
it is reasonable to enable data CRC checking.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When inserting into a full znode it is split into two
znodes. Because data node keys are usually consecutive,
it is better to try to keep them together. This patch
does a better job of that.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
UBIFS read performance can be improved by skipping the CRC
check when data nodes are read. This option can be used if
the underlying media is considered to be highly reliable.
Note that CRCs are always checked for metadata.
Read speed on Arm platform with OneNAND goes from 19 MiB/s
to 27 MiB/s with data CRC checking disabled.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Some flash media are capable of reading sequentially at faster rates.
UBIFS bulk-read facility is designed to take advantage of that, by
reading in one go consecutive data nodes that are also located
consecutively in the same LEB.
Read speed on Arm platform with OneNAND goes from 17 MiB/s to
19 MiB/s.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
In case of error, the function kthread_create returns an ERR pointer,
but never returns a NULL pointer. So a NULL test that comes before an
IS_ERR test should be deleted.
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = kthread_create(...)
... when != x = E
* if (x == NULL)
S1 else S2
// </smpl>
Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
'ubifs_get_lprops()' and 'ubifs_release_lprops()' basically wrap
mutex lock and unlock. We have them because we want lprops subsystem
be separate and as independent as possible. And we planned better
locking rules for lprops.
Anyway, because they are short, it is better to inline them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
IS_ERR() macro already has unlikely(), so do not use constructions
like 'if (unlikely(IS_ERR())'.
Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This commit adds a reserved pool size print and tweaks the
prints to make them look nicer.
It also fixes and cleans-up some comments.
Additionally, it deletes some blank lines to make the code look
a little nicer.
In other words, nothing essential.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long
unsigned int', but argument 5 has type 'long unsigned int'
fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long
unsigned int', but argument 2 has type 'long unsigned int'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The assert was not valid because one of the variables
'taken_empty_lebs' has transient values out of sync
with the other variables.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
- update GC sequence number if any nodes may have been moved
even if GC did not finish the LEB
- don't ignore error return when reading
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
If the ubifs partition is mounted RO and then remounted RW we end
up with no thread name in ubifs_remount_rw() and the thread appears
nameless.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBIFS does not really work correctly when fanout is 2,
because of the way we manage the indexing tree. It may
just become a list and UBIFS screws up.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
David Woodhouse suggested to be consistent with other FSes
and xor the beginning and the end of the UUID.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBIFS stores 16-bit UUID in the superblock, and it is a good
idea to return part of it in 'f_fsid' filed of kstatfs structure.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Since free space we report in statfs is file size which should
fit to the FS - change the way we calculate free space and use
leb_overhead instead of dark_wm in calculations.
Results of "freespace" test (120MiB volume, 16KiB LEB size,
512 bytes page size). Before the change:
freespace: Test 1: fill the space we have 3 times
freespace: was free: 85204992 bytes 81.3 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 11284480 bytes 10.8 MiB, wrote 13.2% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 12935168 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 12939264 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: Test 1 finished
freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 7596218 bytes 7.2 MiB each time
freespace: was free: 78675968 bytes 75.0 MiB, wrote: 88903680 bytes 84.8 MiB, delta: 10227712 bytes 9.8 MiB, wrote 13.0% more than predicted
freespace: was free: 72015872 bytes 68.7 MiB, wrote: 81514496 bytes 77.7 MiB, delta: 9498624 bytes 9.1 MiB, wrote 13.2% more than predicted
freespace: was free: 63938560 bytes 61.0 MiB, wrote: 72589312 bytes 69.2 MiB, delta: 8650752 bytes 8.2 MiB, wrote 13.5% more than predicted
freespace: was free: 56127488 bytes 53.5 MiB, wrote: 63762432 bytes 60.8 MiB, delta: 7634944 bytes 7.3 MiB, wrote 13.6% more than predicted
freespace: was free: 48336896 bytes 46.1 MiB, wrote: 54935552 bytes 52.4 MiB, delta: 6598656 bytes 6.3 MiB, wrote 13.7% more than predicted
freespace: was free: 40587264 bytes 38.7 MiB, wrote: 46157824 bytes 44.0 MiB, delta: 5570560 bytes 5.3 MiB, wrote 13.7% more than predicted
freespace: was free: 32841728 bytes 31.3 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 4542464 bytes 4.3 MiB, wrote 13.8% more than predicted
freespace: was free: 25100288 bytes 23.9 MiB, wrote: 28618752 bytes 27.3 MiB, delta: 3518464 bytes 3.4 MiB, wrote 14.0% more than predicted
freespace: was free: 17342464 bytes 16.5 MiB, wrote: 19841024 bytes 18.9 MiB, delta: 2498560 bytes 2.4 MiB, wrote 14.4% more than predicted
freespace: was free: 9605120 bytes 9.2 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 1458176 bytes 1.4 MiB, wrote 15.2% more than predicted
freespace: Test 2 finished
freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 7606272 bytes 7.3 MiB each time
freespace: trashing: was free: 83668992 bytes 79.8 MiB, need free: 7606272 bytes 7.3 MiB, files created: 248297, delete 225724 (90.9% of them)
freespace: was free: 70803456 bytes 67.5 MiB, wrote: 82485248 bytes 78.7 MiB, delta: 11681792 bytes 11.1 MiB, wrote 16.5% more than predicted
freespace: trashing: was free: 81080320 bytes 77.3 MiB, need free: 15212544 bytes 14.5 MiB, files created: 248711, delete 202047 (81.2% of them)
freespace: was free: 59867136 bytes 57.1 MiB, wrote: 71897088 bytes 68.6 MiB, delta: 12029952 bytes 11.5 MiB, wrote 20.1% more than predicted
freespace: trashing: was free: 82243584 bytes 78.4 MiB, need free: 22818816 bytes 21.8 MiB, files created: 248866, delete 179817 (72.3% of them)
freespace: was free: 50905088 bytes 48.5 MiB, wrote: 63168512 bytes 60.2 MiB, delta: 12263424 bytes 11.7 MiB, wrote 24.1% more than predicted
freespace: trashing: was free: 83402752 bytes 79.5 MiB, need free: 30425088 bytes 29.0 MiB, files created: 248920, delete 158114 (63.5% of them)
freespace: was free: 42651648 bytes 40.7 MiB, wrote: 55406592 bytes 52.8 MiB, delta: 12754944 bytes 12.2 MiB, wrote 29.9% more than predicted
freespace: trashing: was free: 84402176 bytes 80.5 MiB, need free: 38031360 bytes 36.3 MiB, files created: 248709, delete 136641 (54.9% of them)
freespace: was free: 35233792 bytes 33.6 MiB, wrote: 48250880 bytes 46.0 MiB, delta: 13017088 bytes 12.4 MiB, wrote 36.9% more than predicted
freespace: trashing: was free: 82530304 bytes 78.7 MiB, need free: 45637632 bytes 43.5 MiB, files created: 248778, delete 111208 (44.7% of them)
freespace: was free: 27287552 bytes 26.0 MiB, wrote: 40267776 bytes 38.4 MiB, delta: 12980224 bytes 12.4 MiB, wrote 47.6% more than predicted
freespace: trashing: was free: 85114880 bytes 81.2 MiB, need free: 53243904 bytes 50.8 MiB, files created: 248508, delete 93052 (37.4% of them)
freespace: was free: 22437888 bytes 21.4 MiB, wrote: 35328000 bytes 33.7 MiB, delta: 12890112 bytes 12.3 MiB, wrote 57.4% more than predicted
freespace: trashing: was free: 84103168 bytes 80.2 MiB, need free: 60850176 bytes 58.0 MiB, files created: 248637, delete 68743 (27.6% of them)
freespace: was free: 15536128 bytes 14.8 MiB, wrote: 28319744 bytes 27.0 MiB, delta: 12783616 bytes 12.2 MiB, wrote 82.3% more than predicted
freespace: trashing: was free: 84357120 bytes 80.4 MiB, need free: 68456448 bytes 65.3 MiB, files created: 248567, delete 46852 (18.8% of them)
freespace: was free: 9015296 bytes 8.6 MiB, wrote: 22044672 bytes 21.0 MiB, delta: 13029376 bytes 12.4 MiB, wrote 144.5% more than predicted
freespace: trashing: was free: 84942848 bytes 81.0 MiB, need free: 76062720 bytes 72.5 MiB, files created: 248636, delete 25993 (10.5% of them)
freespace: was free: 6086656 bytes 5.8 MiB, wrote: 8331264 bytes 7.9 MiB, delta: 2244608 bytes 2.1 MiB, wrote 36.9% more than predicted
freespace: Test 3 finished
freespace: finished successfully
After the change:
freespace: Test 1: fill the space we have 3 times
freespace: was free: 94048256 bytes 89.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 2441216 bytes 2.3 MiB, wrote 2.6% more than predicted
freespace: was free: 92246016 bytes 88.0 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 4247552 bytes 4.1 MiB, wrote 4.6% more than predicted
freespace: was free: 92254208 bytes 88.0 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 4235264 bytes 4.0 MiB, wrote 4.6% more than predicted
freespace: Test 1 finished
freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 8386001 bytes 8.0 MiB each time
freespace: was free: 86605824 bytes 82.6 MiB, wrote: 88252416 bytes 84.2 MiB, delta: 1646592 bytes 1.6 MiB, wrote 1.9% more than predicted
freespace: was free: 78667776 bytes 75.0 MiB, wrote: 80715776 bytes 77.0 MiB, delta: 2048000 bytes 2.0 MiB, wrote 2.6% more than predicted
freespace: was free: 69615616 bytes 66.4 MiB, wrote: 71630848 bytes 68.3 MiB, delta: 2015232 bytes 1.9 MiB, wrote 2.9% more than predicted
freespace: was free: 61018112 bytes 58.2 MiB, wrote: 62783488 bytes 59.9 MiB, delta: 1765376 bytes 1.7 MiB, wrote 2.9% more than predicted
freespace: was free: 52424704 bytes 50.0 MiB, wrote: 53968896 bytes 51.5 MiB, delta: 1544192 bytes 1.5 MiB, wrote 2.9% more than predicted
freespace: was free: 43880448 bytes 41.8 MiB, wrote: 45199360 bytes 43.1 MiB, delta: 1318912 bytes 1.3 MiB, wrote 3.0% more than predicted
freespace: was free: 35332096 bytes 33.7 MiB, wrote: 36425728 bytes 34.7 MiB, delta: 1093632 bytes 1.0 MiB, wrote 3.1% more than predicted
freespace: was free: 26771456 bytes 25.5 MiB, wrote: 27643904 bytes 26.4 MiB, delta: 872448 bytes 852.0 KiB, wrote 3.3% more than predicted
freespace: was free: 18231296 bytes 17.4 MiB, wrote: 18878464 bytes 18.0 MiB, delta: 647168 bytes 632.0 KiB, wrote 3.5% more than predicted
freespace: was free: 9674752 bytes 9.2 MiB, wrote: 10088448 bytes 9.6 MiB, delta: 413696 bytes 404.0 KiB, wrote 4.3% more than predicted
freespace: Test 2 finished
freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 8397544 bytes 8.0 MiB each time
freespace: trashing: was free: 92372992 bytes 88.1 MiB, need free: 8397552 bytes 8.0 MiB, files created: 248296, delete 225723 (90.9% of them)
freespace: was free: 71909376 bytes 68.6 MiB, wrote: 82472960 bytes 78.7 MiB, delta: 10563584 bytes 10.1 MiB, wrote 14.7% more than predicted
freespace: trashing: was free: 88989696 bytes 84.9 MiB, need free: 16795096 bytes 16.0 MiB, files created: 248794, delete 201838 (81.1% of them)
freespace: was free: 60354560 bytes 57.6 MiB, wrote: 71782400 bytes 68.5 MiB, delta: 11427840 bytes 10.9 MiB, wrote 18.9% more than predicted
freespace: trashing: was free: 90304512 bytes 86.1 MiB, need free: 25192640 bytes 24.0 MiB, files created: 248733, delete 179342 (72.1% of them)
freespace: was free: 51187712 bytes 48.8 MiB, wrote: 62943232 bytes 60.0 MiB, delta: 11755520 bytes 11.2 MiB, wrote 23.0% more than predicted
freespace: trashing: was free: 91209728 bytes 87.0 MiB, need free: 33590184 bytes 32.0 MiB, files created: 248779, delete 157160 (63.2% of them)
freespace: was free: 42704896 bytes 40.7 MiB, wrote: 55050240 bytes 52.5 MiB, delta: 12345344 bytes 11.8 MiB, wrote 28.9% more than predicted
freespace: trashing: was free: 92700672 bytes 88.4 MiB, need free: 41987728 bytes 40.0 MiB, files created: 248848, delete 136135 (54.7% of them)
freespace: was free: 35250176 bytes 33.6 MiB, wrote: 48115712 bytes 45.9 MiB, delta: 12865536 bytes 12.3 MiB, wrote 36.5% more than predicted
freespace: trashing: was free: 93986816 bytes 89.6 MiB, need free: 50385272 bytes 48.1 MiB, files created: 248723, delete 115385 (46.4% of them)
freespace: was free: 29995008 bytes 28.6 MiB, wrote: 41582592 bytes 39.7 MiB, delta: 11587584 bytes 11.1 MiB, wrote 38.6% more than predicted
freespace: trashing: was free: 91881472 bytes 87.6 MiB, need free: 58782816 bytes 56.1 MiB, files created: 248645, delete 89569 (36.0% of them)
freespace: was free: 22511616 bytes 21.5 MiB, wrote: 34705408 bytes 33.1 MiB, delta: 12193792 bytes 11.6 MiB, wrote 54.2% more than predicted
freespace: trashing: was free: 91774976 bytes 87.5 MiB, need free: 67180360 bytes 64.1 MiB, files created: 248580, delete 66616 (26.8% of them)
freespace: was free: 16908288 bytes 16.1 MiB, wrote: 26898432 bytes 25.7 MiB, delta: 9990144 bytes 9.5 MiB, wrote 59.1% more than predicted
freespace: trashing: was free: 92450816 bytes 88.2 MiB, need free: 75577904 bytes 72.1 MiB, files created: 248654, delete 45381 (18.3% of them)
freespace: was free: 10170368 bytes 9.7 MiB, wrote: 19111936 bytes 18.2 MiB, delta: 8941568 bytes 8.5 MiB, wrote 87.9% more than predicted
freespace: trashing: was free: 93282304 bytes 89.0 MiB, need free: 83975448 bytes 80.1 MiB, files created: 248513, delete 24794 (10.0% of them)
freespace: was free: 3911680 bytes 3.7 MiB, wrote: 7872512 bytes 7.5 MiB, delta: 3960832 bytes 3.8 MiB, wrote 101.3% more than predicted
freespace: Test 3 finished
freespace: finished successfully
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The assertion was incorrect, because it did not take into
account free space.
This patch also amends the comments correspondingly, and
cleans them up a little.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When we report free space to user-space, we should not report
0 if the amount of empty LEBs is too low, because they would
be produced by GC when needed. Thus, just call
'ubifs_calc_available()' straight away which would take
'min_idx_lebs' into account anyway.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We have a hack which forces the amount of flash space to be
equivalent to 'c->blocks_cnt' in case of empty FS. This is
to make users happy and see '%0' used in 'df' when they
mount an empty FS. This hack is not needed in
'ubifs_calc_available()', but it is only needed the caller,
in 'ubifs_budg_get_free_space()'. So push it down there.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is bad because the rest of the code should not depend on it,
and this may hide bugss, instead of revealing them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The TNC mutex is unlocked prematurely when reading leaf nodes
with non-hashed keys. This is unsafe because the node may be
moved by garbage collection and the eraseblock unmapped, although
that has never actually happened during stress testing.
This patch fixes the flaw by detecting the race and retrying with
the TNC mutex locked.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Leaf-nodes that have a hashed key are stored in the
leaf-node-cache (LNC) which is protected by the TNC
mutex. Consequently, when reading a leaf node with
a hashed key (i.e. directory entries, xattr entries)
the TNC mutex is always required.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Always allow truncations to zero, even if budgeting thinks there
is no space. UBIFS reserves some space for deletions anyway.
Otherwise, the following happans:
1. create a file, and write as much as possible there, until ENOSPC
2. truncate the file, which fails with ENOSPC, which is not good.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Xattr code has not been tested for a while and there were
serveral bugs. One of them is using wrong inode in
'ubifs_jnl_change_xattr()'. The other is a deadlock in
'ubifs_setxattr()': the i_mutex is locked in
'cap_inode_need_killpriv()' path, so deadlock happens when
'ubifs_setxattr()' tries to lock it again.
Thanks to Zoltan Sogor for finding these bugs.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Commit d70b67c8bc fixed VFS and
it never calls FS lookup function in deleted directories now.
We may remove corresponding UBIFS check.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Data length has to be aligned in the budgeting request. Code
in xattr.c did not do this.
Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Use "if (0) printk()" construct in debugging print macros to
make the debugging messages be checked even if debugging is
off.
This patch also removes some unneeded spaces and blank lines.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBIFS does not presently re-use inode numbers, so leaving
i_generation zero is most appropriate for now.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
At the moment UBIFS reserves twice old index size space for the
index. But this is not enough in some cases, because if the indexing
node are very fragmented and there are many small gaps, while the
dirty index has big znodes - in-the-gaps method would fail.
Thus, reserve trise as more, in which case we are guaranteed that
we can commit in any case.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBIFS aligns node lengths to 8, so budgeting has to do the
same. Well, direntry, inode, and page budgets are already
aligned, but not inode data budget (e.g., data in special
devices or symlinks). Do this for inode data as well.
Also, add corresponding debugging checks.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Budgeting is a crucial UBIFS subsystem - add more assertions
to improve requests checking. This is not compiled in when
UBIFS debugging is disabled.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The debug function that checks orphans, does so using the
TNC mutex. That means it will not see a correct picture
if the inode is removed from the orphan tree before it is
removed from TNC.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
The values in these two fields need to be preserved independently
and so a union cannot be used.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Every time anything is deleted, UBIFS writes the deletion inode
node twice - once in 'ubifs_jnl_update()' and the second time in
'ubifs_jnl_write_inode()'. However, the second write is not needed
if no commit happened after 'ubifs_jnl_update()'. This patch
checks that condition and avoids writing the deletion inode for
the second time.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Increment the commit number at the beginnig of the commit, instead
of doing this after the commit. This is needed for further
optimizations.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The 'last_reference' parameter of 'pack_inode()' is not really
needed because 'inode->i_nlink' may be tested instead. Zap it.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Simplify 'ubifs_jnl_write_inode()' by removing the 'deletion'
parameter which is not really needed because we may test
inode->i_nlink and check whether this is a deletion or not.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Orphan inodes are deleted inodes which will disappear after FS
re-mount. There is not need to write orphan inodes back, because
they are not needed on the flash media.
So optimize orphans a little by not writing them back. Just mark
them as clean, free the budget, and report success to VFS.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We use ubifs_ro_mode() quite a lot, and not in fast-path, so
there is no reason to blow the code up by having it inlined.
Also, we usually want R/O mode change to be seen to other
CPUs as soon as possible, so when we make this a function
call, we will automatically have a memory barrier.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI transparently handles write errors by automatically copying
and remapping the affected eraseblock. If UBI is unable to do
that, for example its pool of eraseblocks reserved for bad block
handling is empty, then the error is propagated to UBIFS. UBIFS
must protect the media from falling into an inconsistent state
by immediately switching to read-only mode. In the case of log
updates, this was not being done.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
UBIFS recovery testing debug facility simulates media failures.
When simulating an IO error, the error code returned must be
-EIO but it was not always if the user switched off the
debug recovery testing option at the same time.
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Although the inode is marked as clean when it is being deleted,
it might stay and be used as orphan, and be marked as dirty.
So we have to free the budget when we delete it.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The 'ubifs_release_dirty_inode_budget()' was buggy and incorrectly
freed the budget, which led to not freeing all dirty data budget.
This patch fixes that.
Also, this patch fixes ubifs_mkdir() which passed 1 in dirty_ino_d,
which makes no sense. Well, it is harmless though.
Also, add few more useful assertions. And improve few debugging
messages.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
We encouredge people to mount using volume name, not device
numbers. So print the name of the mounted UBI volume, not just
IDs.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all.
Several places in the tree needed direct include.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.
Non-trivial places are:
arch/powerpc/mm/init_64.c
arch/powerpc/mm/hugetlbpage.c
This is flag day, yes.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add UBIFS to Makefile and Kbuild.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
This is a new flash file system. See
http://www.linux-mtd.infradead.org/doc/ubifs.html
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>