linux/drivers/md
Song Liu b4c625c673 md/r5cache: r5cache recovery: part 1
Recovery of write-back cache has different logic to write-through only
cache. Specifically, for write-back cache, the recovery need to scan
through all active journal entries before flushing data out. Therefore,
large portion of the recovery logic is rewritten here.

To make the diffs cleaner, we split the rewrite as follows:

1. In this patch, we:
      - add new data to r5l_recovery_ctx
      - add new functions to recovery write-back cache
   The new functions are not used in this patch, so this patch does not
   change the behavior of recovery.

2. In next patch, we:
      - modify main recovery procedure r5l_recovery_log() to call new
        functions
      - remove old functions

With cache feature, there are 2 different scenarios of recovery:
1. Data-Parity stripe: a stripe with complete parity in journal.
2. Data-Only stripe: a stripe with only data in journal (or partial
   parity).

The code differentiate Data-Parity stripe from Data-Only stripe with
flag STRIPE_R5C_CACHING.

For Data-Parity stripes, we use the same procedure as raid5 journal,
where all the data and parity are replayed to the RAID devices.

For Data-Only strips, we need to finish complete calculate parity and
finish the full reconstruct write or RMW write. For simplicity, in
the recovery, we load the stripe to stripe cache. Once the array is
started, the stripe cache state machine will handle these stripes
through normal write path.

r5c_recovery_flush_log contains the main procedure of recovery. The
recovery code first scans through the journal and loads data to
stripe cache. The code keeps tracks of all these stripes in a list
(use sh->lru and ctx->cached_list), stripes in the list are
organized in the order of its first appearance on the journal.
During the scan, the recovery code assesses each stripe as
Data-Parity or Data-Only.

During scan, the array may run out of stripe cache. In these cases,
the recovery code will also call raid5_set_cache_size to increase
stripe cache size. If the array still runs out of stripe cache
because there isn't enough memory, the array will not assemble.

At the end of scan, the recovery code replays all Data-Parity
stripes, and sets proper states for Data-Only stripes. The recovery
code also increases seq number by 10 and rewrites all Data-Only
stripes to journal. This is to avoid confusion after repeated
crashes. More details is explained in raid5-cache.c before
r5c_recovery_rewrite_data_only_stripes().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-18 13:28:14 -08:00
..
bcache block: export bio_free_pages to other modules 2016-09-22 07:48:03 -06:00
persistent-data dm array: introduce cursor api 2016-09-22 11:15:04 -04:00
bitmap.c md/bitmap: add blktrace event for writes to the bitmap 2016-11-18 09:34:45 -08:00
bitmap.h md-cluster: sync bitmap when node received RESYNCING msg 2016-05-04 12:39:35 -07:00
dm-bio-prison.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
dm-bio-prison.h
dm-bio-record.h
dm-bufio.c . various fixes and cleanups for request-based DM core 2016-10-09 17:16:18 -07:00
dm-bufio.h
dm-builtin.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: switch to using the new cursor api for loading metadata 2016-09-22 11:15:05 -04:00
dm-cache-metadata.h dm cache: make sure every metadata function checks fail_io 2016-03-10 17:12:12 -05:00
dm-cache-policy-cleaner.c dm cache: speed up writing of the hint array 2016-09-22 11:15:02 -04:00
dm-cache-policy-internal.h dm cache: speed up writing of the hint array 2016-09-22 11:15:02 -04:00
dm-cache-policy-smq.c dm cache policy smq: distribute entries to random levels when switching to smq 2016-09-22 11:15:03 -04:00
dm-cache-policy.c
dm-cache-policy.h dm cache: speed up writing of the hint array 2016-09-22 11:15:02 -04:00
dm-cache-target.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-core.h dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-crypt.c . various fixes and cleanups for request-based DM core 2016-10-09 17:16:18 -07:00
dm-delay.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-era-target.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-exception-store.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-exception-store.h dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-flakey.c dm flakey: fix reads to be issued if drop_writes configured 2016-08-24 21:55:05 -04:00
dm-io.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-ioctl.c dm: allow bio-based table to be upgraded to bio-based with DAX support 2016-07-20 23:49:52 -04:00
dm-kcopyd.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-linear.c libnvdimm for 4.8 2016-07-28 17:38:16 -07:00
dm-log-userspace-base.c dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() 2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block 2016-10-07 14:42:05 -07:00
dm-log.c dm log: fix unitialized bio operation flags 2016-08-24 21:55:05 -04:00
dm-mpath.c dm mpath: always return reservation conflict without failing over 2016-09-29 10:57:07 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-queue-length.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-raid1.c dm mirror: use all available legs on multiple failures 2016-10-14 11:55:17 -04:00
dm-raid.c dm raid: fix activation of existing raid4/10 devices 2016-10-17 16:41:31 -04:00
dm-region-hash.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-round-robin.c dm round robin: do not use this_cpu_ptr() without having preemption disabled 2016-08-15 09:23:14 -04:00
dm-rq.c - A couple DM raid and DM mirror fixes 2016-10-28 09:27:58 -07:00
dm-rq.h dm rq: introduce dm_mq_kick_requeue_list() 2016-09-15 11:16:05 -04:00
dm-service-time.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-snap-persistent.c dm: use bio op accessors 2016-06-07 13:41:38 -06:00
dm-snap-transient.c dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-snap.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-stats.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-stats.h dm stats: support precise timestamps 2015-06-17 12:40:40 -04:00
dm-stripe.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-switch.c dm switch: simplify conditional in alloc_region_table() 2015-10-31 19:06:06 -04:00
dm-sysfs.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-table.c dm table: fix missing dm_put_target_type() in dm_table_add_target() 2016-10-24 11:17:46 -04:00
dm-target.c libnvdimm for 4.8 2016-07-28 17:38:16 -07:00
dm-thin-metadata.c dm thin: fix a race condition between discarding and provisioning a block 2016-07-20 12:43:35 -04:00
dm-thin-metadata.h dm thin: fix a race condition between discarding and provisioning a block 2016-07-20 12:43:35 -04:00
dm-thin.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c dm verity fec: fix block calculation 2016-07-01 23:29:08 -04:00
dm-verity-fec.h dm verity: add support for forward error correction 2015-12-10 10:39:03 -05:00
dm-verity-target.c dm: rename target's per_bio_data_size to per_io_data_size 2016-02-22 22:34:37 -05:00
dm-verity.h dm verity: add ignore_zero_blocks feature 2015-12-10 10:39:03 -05:00
dm-zero.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm.c - A couple DM raid and DM mirror fixes 2016-10-28 09:27:58 -07:00
dm.h dm: add infrastructure for DAX support 2016-07-20 23:49:49 -04:00
faulty.c MD: rename some functions 2016-01-20 13:52:20 -08:00
Kconfig dm: add missing newline between DM_DEBUG_BLOCK_STACK_TRACING and DM_BUFIO 2016-03-10 17:12:11 -05:00
linear.c md: add block tracing for bio_remapping 2016-11-18 09:32:50 -08:00
linear.h
Makefile dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
md-cluster.c md-cluster: make resync lock also could be interruptted 2016-09-21 09:09:44 -07:00
md-cluster.h md-cluster: gather resync infos and enable recv_thread after bitmap is ready 2016-05-09 09:24:03 -07:00
md.c md: add blktrace event for writes to superblock 2016-11-18 09:47:57 -08:00
md.h md: define mddev flags, recovery flags and r1bio state bits using enums 2016-11-09 12:53:52 -08:00
multipath.c md/multipath: replace printk() with pr_*() 2016-11-07 15:08:22 -08:00
multipath.h
raid0.c md: add block tracing for bio_remapping 2016-11-18 09:32:50 -08:00
raid0.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
raid1.c md/raid1, raid10: add blktrace records when IO is delayed 2016-11-18 09:35:37 -08:00
raid1.h md: define mddev flags, recovery flags and r1bio state bits using enums 2016-11-09 12:53:52 -08:00
raid5-cache.c md/r5cache: r5cache recovery: part 1 2016-11-18 13:28:14 -08:00
raid5.c md/r5cache: sysfs entry journal_mode 2016-11-18 13:27:24 -08:00
raid5.h md/r5cache: sysfs entry journal_mode 2016-11-18 13:27:24 -08:00
raid10.c md/raid1, raid10: add blktrace records when IO is delayed 2016-11-18 09:35:37 -08:00
raid10.h raid10: improve random reads performance 2016-07-19 15:20:28 -07:00