linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-11 14:42:24 +00:00

Boaz Harrosh 9ff19309a9 ore: Fix NFS crash by supporting any unaligned RAID IO		2012-07-20 11:45:28 +03:00

In RAID_5/6 we used to not permit an IO whose end byte is not stripe_size aligned and which spans more than one stripe, i.e. the caller must check whether the actual transferred bytes after submission are fewer than requested, and would need to resubmit a new IO with the remainder.

Exofs supports this, and NFS was supposed to support this as well with its short write mechanism. But late testing has exposed a CRASH when this is used with non-RPC layout drivers.

The change at NFS is deep and risky; in its place, the fix at ORE to lift the limitation is actually clean and simple. So here it is below.

The principle here is that in the case of IO that is unaligned on both ends, beginning and end, we will send two read requests: one like the old code, before the calculation of the first stripe, and also one at a new site, before the calculation of the last stripe. If any "boundary" is aligned, or the complete IO is within a single stripe, we do a single read like before.

The code is clean and simple by splitting the old _read_4_write into 3 even parts:
1. _read_4_write_first_stripe
2. _read_4_write_last_stripe
3. _read_4_write_execute

And calling 1+3 at the same place as before, 2+3 before the last stripe, and in the case of all in a single stripe, 1+2+3 is performed additively.

Why did I not think of it before? Well, I had a stroke of genius, because I have stared at this code for 2 years and did not find this simple solution until today. Not that I did not try.

This solution is much better for NFS than the previous supposed solution, because the short write was dealt with out-of-band after IO_done, which would cause a seeky IO pattern, whereas here we execute in order. In both solutions we do 2 separate reads, only here we do it within a single IO request. (And actually combine two writes into a single submission.)

NFS/exofs code need not change, since the ORE API communicates the new shorter length on return; what will happen is that this case would not occur anymore.

hurray!!

[Stable: this is an NFS bug since the 3.2 kernel; should apply cleanly]

CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
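
The boundary rule described in the commit message above can be illustrated with a small user-space sketch. This is not the code in ore_raid.c: the struct `r4w_plan`, the function `plan_read_4_write`, and the flat `stripe_size` model (parity layout ignored) are all hypothetical names and simplifications introduced here. It only shows which of the two read-for-write steps an IO range would need and how many separate reads result, assuming that a fully aligned range needs no read at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical plan structure; not part of the ORE API. */
struct r4w_plan {
	bool read_first;   /* head of the first stripe must be read in */
	bool read_last;    /* tail of the last stripe must be read in */
	unsigned executes; /* number of separate read submissions */
};

/*
 * Decide which read-for-write steps a write of [offset, offset + length)
 * needs, under the simplified model stated in the lead-in.
 */
static struct r4w_plan plan_read_4_write(uint64_t offset, uint64_t length,
					 uint64_t stripe_size)
{
	struct r4w_plan plan = { false, false, 0 };
	uint64_t end, first_stripe, last_stripe;

	assert(stripe_size > 0 && length > 0);

	end = offset + length;
	first_stripe = offset / stripe_size;
	last_stripe = (end - 1) / stripe_size;

	/* An unaligned start needs the bytes before it in the first stripe. */
	plan.read_first = (offset % stripe_size) != 0;
	/* An unaligned end needs the bytes after it in the last stripe. */
	plan.read_last = (end % stripe_size) != 0;

	if (first_stripe == last_stripe)
		/* Single stripe: both steps are combined into one read. */
		plan.executes = (plan.read_first || plan.read_last) ? 1 : 0;
	else
		/* Multiple stripes: one read per unaligned boundary. */
		plan.executes = (unsigned)plan.read_first +
				(unsigned)plan.read_last;

	return plan;
}
```

With a 64 KiB stripe, for example, `plan_read_4_write(100, 200000, 65536)` reports both boundaries as unaligned and two reads, while `plan_read_4_write(100, 1000, 65536)` stays within one stripe and reports a single combined read.
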
..
BUGS	exofs: Documentation	2009-03-31 19:44:38 +03:00
common.h	Fix common misspellings	2011-03-31 11:26:23 -03:00
dir.c	exofs: remove the second argument of k[un]map_atomic()	2012-03-20 21:48:22 +08:00
exofs.h	exofs: Add SYSFS info for autologin/pNFS export	2012-05-21 12:24:01 +03:00
file.c	fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers	2011-07-20 20:47:59 -04:00
inode.c	vfs: Rename end_writeback() to clear_inode()	2012-05-06 13:43:41 +08:00
Kbuild	exofs: Add SYSFS info for autologin/pNFS export	2012-05-21 12:24:01 +03:00
Kconfig	ore: FIX breakage when MISC_FILESYSTEMS is not set	2012-01-06 16:48:14 +02:00
Kconfig.ore	ore: FIX breakage when MISC_FILESYSTEMS is not set	2012-01-06 16:48:14 +02:00
namei.c	vfs: check i_nlink limits in vfs_{mkdir,rename_dir,link}	2012-03-20 21:29:32 -04:00
ore_raid.c	ore: Fix NFS crash by supporting any unaligned RAID IO	2012-07-20 11:45:28 +03:00
ore_raid.h	ore: RAID5 Write	2011-10-24 17:15:33 -07:00
ore.c	ore: fix BUG_ON, too few sgs when reading	2012-01-06 16:49:07 +02:00
super.c	exofs: Add SYSFS info for autologin/pNFS export	2012-05-21 12:24:01 +03:00
symlink.c	exofs: Remove IBM copyrights	2009-06-21 17:53:47 +03:00
sys.c	exofs: fix sparse non-ANSI function warning	2012-06-12 06:33:22 +03:00