linux/fs
Robbie Ko 8ecebf4d76 Btrfs: fix unexpected failure of nocow buffered writes after snapshotting when low on space
Commit e9894fd3e3 ("Btrfs: fix snapshot vs nocow writting") forced
nocow writes to fallback to COW, during writeback, when a snapshot is
created. This resulted in writes made before creating the snapshot to
unexpectedly fail with ENOSPC during writeback when success (0) was
returned to user space through the write system call.

The steps leading to this problem are:

1. When it's not possible to allocate data space for a write, the
   buffered write path checks if a NOCOW write is possible.  If it is,
   it will not reserve space and success (0) is returned to user space.

2. Then when a snapshot is created, the root's will_be_snapshotted
   atomic is incremented and writeback is triggered for all inode's that
   belong to the root being snapshotted. Incrementing that atomic forces
   all previous writes to fallback to COW during writeback (running
   delalloc).

3. This results in the writeback for the inodes to fail and therefore
   setting the ENOSPC error in their mappings, so that a subsequent
   fsync on them will report the error to user space. So it's not a
   completely silent data loss (since fsync will report ENOSPC) but it's
   a very unexpected and undesirable behaviour, because if a clean
   shutdown/unmount of the filesystem happens without previous calls to
   fsync, it is expected to have the data present in the files after
   mounting the filesystem again.

So fix this by adding a new atomic named snapshot_force_cow to the
root structure which prevents this behaviour and works the following way:

1. It is incremented when we start to create a snapshot after triggering
   writeback and before waiting for writeback to finish.

2. This new atomic is now what is used by writeback (running delalloc)
   to decide whether we need to fallback to COW or not. Because we
   incremented this new atomic after triggering writeback in the
   snapshot creation ioctl, we ensure that all buffered writes that
   happened before snapshot creation will succeed and not fallback to
   COW (which would make them fail with ENOSPC).

3. The existing atomic, will_be_snapshotted, is kept because it is used
   to force new buffered writes, that start after we started
   snapshotting, to reserve data space even when NOCOW is possible.
   This makes these writes fail early with ENOSPC when there's no
   available space to allocate, preventing the unexpected behaviour of
   writeback later failing with ENOSPC due to a fallback to COW mode.

Fixes: e9894fd3e3 ("Btrfs: fix snapshot vs nocow writting")
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-17 18:35:43 +02:00
..
9p treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
adfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
affs affs: fix potential memory leak when parsing option 'prefix' 2018-05-28 12:36:41 +02:00
afs Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:32:04 +09:00
autofs autofs: fix slab out of bounds read in getname_kernel() 2018-07-14 11:11:09 -07:00
befs fix a series of Documentation/ broken file name references 2018-06-15 18:10:01 -03:00
bfs bfs_add_entry: pass name/len as qstr pointer 2018-05-22 14:27:50 -04:00
btrfs Btrfs: fix unexpected failure of nocow buffered writes after snapshotting when low on space 2018-08-17 18:35:43 +02:00
cachefiles cachefiles: Wait rather than BUG'ing on "Unexpected object collision" 2018-07-25 14:49:00 +01:00
ceph ceph: fix dentry leak in splice_dentry() 2018-06-26 18:42:44 +02:00
cifs cifs: Fix stack out-of-bounds in smb{2,3}_create_lease_buf() 2018-07-05 13:48:25 -05:00
coda vfs: change inode times to use struct timespec64 2018-06-05 16:57:31 -07:00
configfs vfs: change inode times to use struct timespec64 2018-06-05 16:57:31 -07:00
cramfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
crypto f2fs-for-4.18-rc1 2018-06-11 10:16:13 -07:00
debugfs Revert "debugfs: inode: debugfs_create_dir uses mode permission from parent" 2018-06-12 20:52:16 -07:00
devpts
dlm treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
ecryptfs Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into aio-base 2018-05-26 09:16:25 +02:00
efivarfs
efs
exofs exofs: avoid VLA in structures 2018-06-15 07:55:24 +09:00
exportfs
ext2 ext2: add warning when specifying nocheck option 2018-06-20 11:04:26 +02:00
ext4 ext4: fix check to prevent initializing reserved inodes 2018-07-29 15:34:00 -04:00
f2fs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
fat fat: fix memory allocation failure handling of match_strdup() 2018-07-21 12:50:46 -07:00
freevxfs freevxfs_lookup(): use d_splice_alias() 2018-05-22 14:27:51 -04:00
fscache fscache: Fix reference overput in fscache_attach_object() error handling 2018-07-25 14:49:00 +01:00
fuse vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
gfs2 vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
hfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
hfsplus vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
hostfs vfs: change inode times to use struct timespec64 2018-06-05 16:57:31 -07:00
hpfs treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
hugetlbfs mm: use vma_init() to initialize VMAs on stack and data segments 2018-07-26 19:38:03 -07:00
isofs
jbd2 Bug fixes for ext4; most of which relate to vulnerabilities where a 2018-07-08 11:10:30 -07:00
jffs2 vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
jfs jfs: Fix usercopy whitelist for inline inode data 2018-08-04 07:53:46 -07:00
kernfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
lockd
minix minix_lookup: use d_splice_alias() 2018-05-22 14:27:52 -04:00
nfs NFSv4: Fix _nfs4_do_setlk() 2018-08-01 23:17:06 -04:00
nfs_common
nfsd vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
nilfs2 do d_instantiate/unlock_new_inode combinations safely 2018-05-11 15:36:37 -04:00
nls
notify fsnotify: add fsnotify_add_inode_mark() wrappers 2018-05-18 14:58:22 +02:00
ntfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
ocfs2 vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
omfs omfs_lookup(): report IO errors, use d_splice_alias() 2018-05-22 14:27:58 -04:00
openpromfs openpromfs: switch to d_splice_alias() 2018-05-22 14:27:57 -04:00
orangefs Solve a series of broken links for files under Documentation: 2018-06-17 05:25:18 +09:00
overlayfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
proc fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps* 2018-07-14 11:11:09 -07:00
pstore pstore: Remove bogus format string definition 2018-06-14 14:57:24 +02:00
qnx4 qnx4_lookup: use d_splice_alias() 2018-05-22 14:27:52 -04:00
qnx6 qnx6_lookup: switch to d_splice_alias() 2018-05-22 14:27:54 -04:00
quota quota: Cleanup list iteration in dqcache_shrink_scan() 2018-06-20 11:04:26 +02:00
ramfs
reiserfs reiserfs: fix buffer overflow with long warning messages 2018-07-14 11:11:10 -07:00
romfs romfs_lookup: switch to d_splice_alias() 2018-05-22 14:27:55 -04:00
squashfs Squashfs: Compute expected length from inode size rather than block length 2018-08-02 09:34:02 -07:00
sysfs unfuck sysfs_mount() 2018-05-21 14:30:09 -04:00
sysv sysv_lookup: use d_splice_alias() 2018-05-22 14:27:53 -04:00
tracefs
ubifs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
udf udf: Drop unused arguments of udf_delete_aext() 2018-06-20 11:05:49 +02:00
ufs treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
xfs xfs: properly handle free inodes in extent hint validators 2018-07-24 11:34:52 -07:00
aio.c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-07-22 12:04:51 -07:00
anon_inodes.c
attr.c vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
bad_inode.c vfs: change inode times to use struct timespec64 2018-06-05 16:57:31 -07:00
binfmt_aout.c
binfmt_elf_fdpic.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
binfmt_elf.c fs, elf: make sure to page align bss in load_elf_library 2018-07-14 11:11:10 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c docs: Fix more broken references 2018-06-15 18:11:26 -03:00
binfmt_script.c
block_dev.c for-linus-20180727 2018-07-27 12:51:00 -07:00
buffer.c fs: move page_cache_seek_hole_data to iomap.c 2018-06-01 18:37:33 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c autofs: clean up includes 2018-06-07 17:34:40 -07:00
compat.c ncpfs: remove compat functionality 2018-06-05 19:23:26 +02:00
coredump.c
d_path.c
dax.c libnvdimm for 4.18 2018-06-08 17:21:52 -07:00
dcache.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-04 10:14:28 -07:00
dcookies.c
direct-io.c block: consistently use GFP_NOIO instead of __GFP_NORECLAIM 2018-05-14 08:55:18 -06:00
drop_caches.c
eventfd.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
eventpoll.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
exec.c mm: fix vma_is_anonymous() false-positives 2018-07-26 19:38:03 -07:00
fcntl.c mm: restructure memfd code 2018-06-07 17:34:35 -07:00
fhandle.c
file_table.c
file.c
filesystems.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
fs_pin.c
fs_struct.c
fs-writeback.c bdi: Fix oops in wb_workfn() 2018-05-03 16:11:37 -06:00
inode.c Fix up non-directory creation in SGID directories 2018-07-05 12:36:36 -07:00
internal.h drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open() 2018-07-10 23:29:03 -04:00
ioctl.c fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems 2018-05-24 12:04:28 -05:00
iomap.c fs: fix iomap_bmap position calculation 2018-08-02 13:09:27 -07:00
Kconfig autofs: remove left-over autofs4 stubs 2018-06-11 08:22:34 -07:00
Kconfig.binfmt docs: Fix more broken references 2018-06-15 18:11:26 -03:00
libfs.c
locks.c vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
Makefile autofs: remove left-over autofs4 stubs 2018-06-11 08:22:34 -07:00
mbcache.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
mount.h
mpage.c
namei.c Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:32:04 +09:00
namespace.c fs: Allow superblock owner to access do_remount_sb() 2018-05-24 12:02:25 -05:00
no-block.c
nsfs.c
open.c Revert "fs: fold open_check_o_direct into do_dentry_open" 2018-06-03 10:58:23 -07:00
pipe.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
readdir.c
select.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
seq_file.c proc: fix smaps and meminfo alignment 2018-05-25 18:12:11 -07:00
signalfd.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:21:50 +09:00
splice.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:21:50 +09:00
stack.c
stat.c
statfs.c
super.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-04 10:14:28 -07:00
sync.c
timerfd.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
userfaultfd.c userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails 2018-08-02 16:03:40 -07:00
utimes.c
xattr.c vfs: delete unnecessary assignment in vfs_listxattr 2018-05-29 13:22:41 -04:00