linux/drivers/md/bcache
Coly Li 7e027ca4b5 bcache: add stop_when_cache_set_failed option to backing device
When there are too many I/O errors on cache device, current bcache code
will retire the whole cache set, and detach all bcache devices. But the
detached bcache devices are not stopped, which is problematic when bcache
is in writeback mode.

If the retired cache set has dirty data of backing devices, continue
writing to bcache device will write to backing device directly. If the
LBA of write request has a dirty version cached on cache device, next time
when the cache device is re-registered and backing device re-attached to
it again, the stale dirty data on cache device will be written to backing
device, and overwrite latest directly written data. This situation causes
a quite data corruption.

But we cannot simply stop all attached bcache devices when the cache set is
broken or disconnected. For example, use bcache to accelerate performance
of an email service. In such workload, if cache device is broken but no
dirty data lost, keep the bcache device alive and permit email service
continue to access user data might be a better solution for the cache
device failure.

Nix <nix@esperi.org.uk> points out the issue and provides the above example
to explain why it might be necessary to not stop bcache device for broken
cache device. Pavel Goran <via-bcache@pvgoran.name> provides a brilliant
suggestion to provide "always" and "auto" options to per-cached device
sysfs file stop_when_cache_set_failed. If cache set is retiring and the
backing device has no dirty data on cache, it should be safe to keep the
bcache device alive. In this case, if stop_when_cache_set_failed is set to
"auto", the device failure handling code will not stop this bcache device
and permit application to access the backing device with a unattached
bcache device.

Changelog:
[mlyle: edited to not break string constants across lines]
v3: fix typos pointed out by Nix.
v2: change option values of stop_when_cache_set_failed from 1/0 to
    "auto"/"always".
v1: initial version, stop_when_cache_set_failed can be 0 (not stop) or 1
    (always stop).

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Cc: Nix <nix@esperi.org.uk>
Cc: Pavel Goran <via-bcache@pvgoran.name>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-18 20:15:20 -06:00
..
alloc.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
bcache.h bcache: add stop_when_cache_set_failed option to backing device 2018-03-18 20:15:20 -06:00
bset.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
bset.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
btree.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
btree.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
closure.c bcache: mark closure_sync() __sched 2018-01-08 13:29:00 -07:00
closure.h bcache: closures: move control bits one bit right 2018-01-09 12:18:51 -07:00
debug.c bcache: fix wrong return value in bch_debug_init() 2018-01-08 13:29:00 -07:00
debug.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
extents.c bcache: Fix building error on MIPS 2017-11-24 16:22:58 -07:00
extents.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
io.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
journal.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
journal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig bcache: Kill dead cgroup code 2014-03-18 12:22:35 -07:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
movinggc.c block: move bio_alloc_pages() to bcache 2018-01-06 09:18:00 -07:00
request.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
request.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
stats.c md: Convert timers to use timer_setup() 2017-11-14 20:11:57 -07:00
stats.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
super.c bcache: add stop_when_cache_set_failed option to backing device 2018-03-18 20:15:20 -06:00
sysfs.c bcache: add stop_when_cache_set_failed option to backing device 2018-03-18 20:15:20 -06:00
sysfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
trace.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
util.c block: move bio_alloc_pages() to bcache 2018-01-06 09:18:00 -07:00
util.h bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
writeback.c bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags 2018-03-18 20:15:20 -06:00
writeback.h bcache: fix cached_dev->count usage for bch_cache_set_error() 2018-03-18 20:15:20 -06:00