linux/fs
Jeff Layton 866b04fccb inode numbering: make static counters in new_inode and iunique be 32 bits
The problems are:

- on filesystems w/o permanent inode numbers, i_ino values can be larger
  than 32 bits, which can cause problems for some 32 bit userspace programs on
  a 64 bit kernel.  We can't do anything for filesystems that have actual
  >32-bit inode numbers, but on filesystems that generate i_ino values on the
  fly, we should try to have them fit in 32 bits.  We could trivially fix this
  by making the static counters in new_inode and iunique 32 bits, but...

- many filesystems call new_inode and assume that the i_ino values they are
  given are unique.  They are not guaranteed to be so, since the static
  counter can wrap.  This problem is exacerbated by the fix for #1.

- after allocating a new inode, some filesystems call iunique to try to get
  a unique i_ino value, but they don't actually add their inodes to the
  hashtable, and so they're still not guaranteed to be unique if that counter
  wraps.

This patch set takes the simpler approach of simply using iunique and hashing
the inodes afterward.  Christoph H.  previously mentioned that he thought that
this approach may slow down lookups for filesystems that currently hash their
inodes.

The questions are:

1) how much would this slow down lookups for these filesystems?
2) is it enough to justify adding more infrastructure to avoid it?

What might be best is to start with this approach and then only move to using
IDR or some other scheme if these extra inodes in the hashtable prove to be
problematic.

I've done some cursory testing with this patch and the overhead of hashing and
unhashing the inodes with pipefs is pretty low -- just a few seconds of system
time added on to the creation and destruction of 10 million pipes (very
similar to the overhead that the IDR approach would add).

The hard thing to measure is what effect this has on other filesystems. I'm
open to ways to try and gauge this.

Again, I've only converted pipefs as an example. If this approach is
acceptable then I'll start work on patches to convert other filesystems.

With a pretty-much-worst-case microbenchmark provided by Eric Dumazet
<dada1@cosmosbay.com>:

hashing patch (pipebench):
sys     1m15.329s
sys     1m16.249s
sys     1m17.169s

unpatched (pipebench):
sys     1m9.836s
sys     1m12.541s
sys     1m14.153s

Which works out to 1.05642174294555027017.  So ~5-6% slowdown.

This patch:

When a 32-bit program that was not compiled with large file offsets does a
stat and gets a st_ino value back that won't fit in the 32 bit field, glibc
(correctly) generates an EOVERFLOW error.  We can't do anything about fs's
with larger permanent inode numbers, but when we generate them on the fly, we
ought to try and have them fit within a 32 bit field.

This patch takes the first step toward this by making the static counters in
these two functions be 32 bits.

[jlayton@redhat.com: mention that it's only the case for 32bit, non-LFS stat]
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:16 -07:00
..
9p header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
adfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
affs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
afs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
autofs [PATCH] Mark struct super_operations const 2007-02-12 09:48:47 -08:00
autofs4 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
befs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
bfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
cifs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
coda slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
configfs remove "struct subsystem" as it is no longer needed 2007-05-02 18:57:59 -07:00
cramfs mm: make read_cache_page synchronous 2007-05-07 12:12:51 -07:00
debugfs remove "struct subsystem" as it is no longer needed 2007-05-02 18:57:59 -07:00
devpts devpts: add fsnotify create event 2007-05-08 11:14:59 -07:00
dlm Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw 2007-05-07 12:26:27 -07:00
ecryptfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
efs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
exportfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
ext2 ext3: copy i_flags to inode flags on write 2007-05-08 11:15:13 -07:00
ext3 ext3: copy i_flags to inode flags on write 2007-05-08 11:15:12 -07:00
ext4 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
fat fat: fix VFAT compat ioctls on 64-bit systems 2007-05-08 11:15:14 -07:00
freevxfs freevxfs: possible null pointer dereference fix 2007-05-08 11:14:59 -07:00
fuse add filesystem subtype support 2007-05-08 11:15:01 -07:00
gfs2 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
hfs is_power_of_2 in fs/hfs 2007-05-08 11:14:59 -07:00
hfsplus is_power_of_2 in fs/hfs 2007-05-08 11:14:59 -07:00
hostfs uml: hostfs style fixes 2007-05-08 11:14:57 -07:00
hpfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
hppfs [PATCH] Mark struct super_operations const 2007-02-12 09:48:47 -08:00
hugetlbfs hugetlbfs: add NULL check in hugetlb_zero_setup() 2007-05-07 12:12:57 -07:00
isofs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
jbd jbd: check for error returned by kthread_create on creating journal thread 2007-05-08 11:15:13 -07:00
jbd2 jbd: check for error returned by kthread_create on creating journal thread 2007-05-08 11:15:13 -07:00
jffs2 slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
jfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
lockd header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
minix slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
msdos [PATCH] mark struct inode_operations const 2 2007-02-12 09:48:46 -08:00
ncpfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
nfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
nfs_common [PATCH] nfs_common endianness annotations 2006-10-20 10:26:41 -07:00
nfsd header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
nls [PATCH] fs: make nls_cp936.c handle some U00XY characters and U20AC correctly 2006-12-07 08:39:46 -08:00
ntfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
ocfs2 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
openpromfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
partitions partition: add support for sysv68 partitions 2007-05-08 11:15:09 -07:00
proc procfs: use simple_read_from_buffer() 2007-05-08 11:15:14 -07:00
qnx4 slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
ramfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
reiserfs reiserfs: use __set_current_state() 2007-05-08 11:15:13 -07:00
romfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
smbfs smbfs: remove unnecessary allow_signal 2007-05-08 11:15:11 -07:00
sysfs remove "struct subsystem" as it is no longer needed 2007-05-02 18:57:59 -07:00
sysv header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
udf udf: decrement correct link count in udf_rmdir 2007-05-08 11:15:14 -07:00
ufs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
vfat [PATCH] mark struct inode_operations const 3 2007-02-12 09:48:46 -08:00
xfs mm: move common segment checks to separate helper function 2007-05-08 11:14:57 -07:00
aio.c KMEM_CACHE(): simplify slab cache creation 2007-05-07 12:12:55 -07:00
attr.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
bad_inode.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_aout.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
binfmt_elf_fdpic.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_elf.c Invalid return value of execve() resulting in oopses 2007-05-08 11:15:15 -07:00
binfmt_em86.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_flat.c [PATCH] uclinux: correctly remap bin_fmtflat exe allocated mem regions 2007-02-09 10:45:33 -08:00
binfmt_misc.c [PATCH] Mark struct super_operations const 2007-02-12 09:48:47 -08:00
binfmt_script.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_som.c [PARISC] Fix fs/binfmt_som.c 2006-10-04 06:51:26 -06:00
bio.c KMEM_CACHE(): simplify slab cache creation 2007-05-07 12:12:55 -07:00
block_dev.c is_power_of_2 in fs/block_dev.c 2007-05-08 11:14:59 -07:00
buffer.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
char_dev.c [PATCH] remove protection of LANANA-reserved majors 2007-04-04 21:12:47 -07:00
compat_ioctl.c Fix error handling in HDIO_GETGEO compat wrapper 2007-05-08 11:15:14 -07:00
compat.c cleanup compat ioctl handling 2007-05-08 11:15:09 -07:00
dcache.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
dcookies.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
direct-io.c [PATCH] dio: lock refcount operations 2006-12-10 09:57:21 -08:00
dnotify.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
dquot.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
drop_caches.c [PATCH] remove invalidate_inode_pages() 2007-02-11 10:51:31 -08:00
eventpoll.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
exec.c (re)register_binfmt returns with -EBUSY 2007-05-08 11:15:08 -07:00
fcntl.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
fifo.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
file_table.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
file.c [PATCH] fdtable: Provide free_fdtable() wrapper 2006-12-22 08:55:50 -08:00
filesystems.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
fs-writeback.c Write back inode data pages even when the inode itself is locked 2007-01-26 12:53:20 -08:00
generic_acl.c
inode.c inode numbering: make static counters in new_inode and iunique be 32 bits 2007-05-08 11:15:16 -07:00
inotify_user.c [PATCH] inotify: read return val fix 2007-02-12 09:48:28 -08:00
inotify.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
internal.h cleanup compat ioctl handling 2007-05-08 11:15:09 -07:00
ioctl.c vfs: remove superflous sb == NULL checks 2007-05-08 11:15:02 -07:00
ioprio.c [PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task 2007-02-12 09:48:32 -08:00
Kconfig reiserfs: proc support requires PROC_FS 2007-05-08 11:15:04 -07:00
Kconfig.binfmt blackfin architecture 2007-05-07 12:12:58 -07:00
libfs.c [PATCH] shmem and simple const super_operations 2007-03-05 07:57:51 -08:00
locks.c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux 2007-05-07 12:34:24 -07:00
Makefile Remove JFFS (version 1), as scheduled. 2007-02-17 16:10:59 -05:00
mbcache.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
mpage.c Factor outstanding I/O error handling 2007-05-08 11:14:57 -07:00
namei.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
namespace.c check privileges before setting mount propagation 2007-05-08 11:15:12 -07:00
nfsctl.c
no-block.c
open.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
pipe.c VFS: delay the dentry name generation on sockets and pipes 2007-05-08 11:15:03 -07:00
pnode.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
pnode.h [PATCH] rename struct namespace to struct mnt_namespace 2006-12-08 08:28:51 -08:00
posix_acl.c [PATCH] kmemdup: some users 2006-10-01 00:39:19 -07:00
quota_v1.c
quota_v2.c
quota.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
read_write.c use use SEEK_MAX to validate user lseek arguments 2007-05-08 11:14:59 -07:00
read_write.h [PATCH] Remove readv/writev methods and use aio_read/aio_write instead 2006-10-01 00:39:28 -07:00
readdir.c ROUND_UP macro cleanup in fs/(select|compat|readdir).c 2007-05-08 11:15:09 -07:00
select.c ROUND_UP macro cleanup in fs/(select|compat|readdir).c 2007-05-08 11:15:09 -07:00
seq_file.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
splice.c [PATCH] splice: partial write fix 2007-03-29 14:26:42 +02:00
stack.c [PATCH] fs/stack.c: Copy i_nlink after all other attributes are copied 2007-02-19 14:21:50 -08:00
stat.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
super.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
sync.c Remove do_sync_file_range() 2007-05-08 11:15:04 -07:00
utimes.c [PATCH] severing fs.h, radix-tree.h -> sched.h 2006-12-04 02:00:24 -05:00
xattr_acl.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
xattr.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00