linux/drivers/md
Joe Thornber 66a6363566 dm cache: add stochastic-multi-queue (smq) policy
The stochastic-multi-queue (smq) policy addresses some of the problems
with the current multiqueue (mq) policy.

Memory usage
------------

The mq policy uses a lot of memory; 88 bytes per cache block on a 64
bit machine.

SMQ uses 28bit indexes to implement it's data structures rather than
pointers.  It avoids storing an explicit hit count for each block.  It
has a 'hotspot' queue rather than a pre cache which uses a quarter of
the entries (each hotspot block covers a larger area than a single
cache block).

All these mean smq uses ~25bytes per cache block.  Still a lot of
memory, but a substantial improvement nontheless.

Level balancing
---------------

MQ places entries in different levels of the multiqueue structures
based on their hit count (~ln(hit count)).  This means the bottom
levels generally have the most entries, and the top ones have very
few.  Having unbalanced levels like this reduces the efficacy of the
multiqueue.

SMQ does not maintain a hit count, instead it swaps hit entries with
the least recently used entry from the level above.  The over all
ordering being a side effect of this stochastic process.  With this
scheme we can decide how many entries occupy each multiqueue level,
resulting in better promotion/demotion decisions.

Adaptability
------------

The MQ policy maintains a hit count for each cache block.  For a
different block to get promoted to the cache it's hit count has to
exceed the lowest currently in the cache.  This means it can take a
long time for the cache to adapt between varying IO patterns.
Periodically degrading the hit counts could help with this, but I
haven't found a nice general solution.

SMQ doesn't maintain hit counts, so a lot of this problem just goes
away.  In addition it tracks performance of the hotspot queue, which
is used to decide which blocks to promote.  If the hotspot queue is
performing badly then it starts moving entries more quickly between
levels.  This lets it adapt to new IO patterns very quickly.

Performance
-----------

In my tests SMQ shows substantially better performance than MQ.  Once
this matures a bit more I'm sure it'll become the default policy.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-06-11 17:12:59 -04:00
..
bcache block: remove management of bi_remaining when restoring original bi_end_io 2015-05-22 08:58:55 -06:00
persistent-data dm thin metadata: remove in-core 'read_only' flag 2015-05-29 14:18:59 -04:00
bitmap.c md/bitmap: remove rcu annotation from pointer arithmetic. 2015-05-21 09:14:41 +10:00
bitmap.h md-cluster: re-add capabilities 2015-04-22 07:59:39 +10:00
dm-bio-prison.c dm bio prison: add dm_cell_promote_or_release() 2015-05-29 14:19:06 -04:00
dm-bio-prison.h dm bio prison: add dm_cell_promote_or_release() 2015-05-29 14:19:06 -04:00
dm-bio-record.h dm: Refactor for new bio cloning/splitting 2013-11-23 22:33:55 -08:00
dm-bufio.c dm bufio: fix time comparison to use time_after_eq() 2015-02-09 13:06:48 -05:00
dm-bufio.h dm snapshot: use dm-bufio prefetch 2014-01-14 23:23:03 -05:00
dm-builtin.c dm sysfs: fix a module unload race 2014-01-14 23:23:04 -05:00
dm-cache-block-types.h dm cache: revert "remove remainder of distinct discard block size" 2014-11-10 15:25:30 -05:00
dm-cache-metadata.c dm cache: fix missing ERR_PTR returns and handling 2015-01-28 09:59:20 -05:00
dm-cache-metadata.h dm cache: revert "remove remainder of distinct discard block size" 2014-11-10 15:25:30 -05:00
dm-cache-policy-cleaner.c dm cache: pass a new 'critical' flag to the policies when requesting writeback work 2015-05-29 14:19:04 -04:00
dm-cache-policy-internal.h dm cache: pull out some bitset utility functions for reuse 2015-05-29 14:19:05 -04:00
dm-cache-policy-mq.c dm cache: pass a new 'critical' flag to the policies when requesting writeback work 2015-05-29 14:19:04 -04:00
dm-cache-policy-smq.c dm cache: add stochastic-multi-queue (smq) policy 2015-06-11 17:12:59 -04:00
dm-cache-policy.c dm cache: add policy name to status output 2014-01-16 13:44:11 -05:00
dm-cache-policy.h dm cache: pass a new 'critical' flag to the policies when requesting writeback work 2015-05-29 14:19:04 -04:00
dm-cache-target.c dm cache: boost promotion of blocks that will be overwritten 2015-05-29 14:19:07 -04:00
dm-crypt.c dm crypt: add comments to better describe crypto processing logic 2015-05-29 14:19:02 -04:00
dm-delay.c dm delay: use msecs_to_jiffies for time conversion 2015-04-15 12:10:21 -04:00
dm-era-target.c dm era: check for a non-NULL metadata object before closing it 2014-06-03 13:44:08 -04:00
dm-exception-store.c dm: replace simple_strtoul 2012-07-27 15:07:59 +01:00
dm-exception-store.h
dm-flakey.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
dm-io.c dm io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME 2015-02-27 14:53:32 -05:00
dm-ioctl.c dm: only initialize the request_queue once 2015-04-30 10:25:21 -04:00
dm-kcopyd.c dm: stop using WQ_NON_REENTRANT 2013-08-23 09:02:13 -04:00
dm-linear.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
dm-log-userspace-base.c dm log userspace base: fix compile warning 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.c dm log userspace transfer: match wait_for_completion_timeout return type 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h
dm-log-writes.c dm log writes: use ULL suffix for 64-bit constants 2015-05-29 14:19:01 -04:00
dm-log.c dm: use memweight() 2012-07-30 17:25:16 -07:00
dm-mpath.c dm mpath: fix leak of dm_mpath_io structure in blk-mq .queue_rq error path 2015-05-27 17:37:22 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-raid.c dm raid: add support for the MD RAID0 personality 2015-05-29 14:19:00 -04:00
dm-raid1.c dm raid1: keep issuing IO after leg failure 2015-05-29 14:19:02 -04:00
dm-region-hash.c block: Abstract out bvec iterator 2013-11-23 22:33:47 -08:00
dm-round-robin.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-service-time.c dm: reject trailing characters in sccanf input 2012-03-28 18:41:26 +01:00
dm-snap-persistent.c dm snapshot: remove unnecessary NULL checks before vfree() calls 2015-02-09 13:06:49 -05:00
dm-snap-transient.c
dm-snap.c block: remove management of bi_remaining when restoring original bi_end_io 2015-05-22 08:58:55 -06:00
dm-stats.c - Significant DM thin-provisioning performance improvements to meet 2014-12-08 21:10:03 -08:00
dm-stats.h dm: add statistics support 2013-09-05 20:46:06 -04:00
dm-stripe.c dm stripe: drop useless exit point from dm_stripe_init() 2015-05-29 14:19:01 -04:00
dm-switch.c dm switch: efficiently support repetitive patterns 2014-08-01 12:30:37 -04:00
dm-sysfs.c dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr 2015-04-15 12:10:17 -04:00
dm-table.c dm: do not allocate any mempools for blk-mq request-based DM 2015-05-29 14:18:57 -04:00
dm-target.c dm: allocate requests in target when stacking on blk-mq devices 2015-02-09 13:06:47 -05:00
dm-thin-metadata.c dm thin metadata: remove in-core 'read_only' flag 2015-05-29 14:18:59 -04:00
dm-thin-metadata.h dm thin metadata: remove unused dm_pool_get_data_block_size() 2015-02-09 13:06:49 -05:00
dm-thin.c dm thin: cleanup schedule_zero() to read more logically 2015-05-29 14:18:58 -04:00
dm-uevent.c
dm-uevent.h
dm-verity.c block: remove management of bi_remaining when restoring original bi_end_io 2015-05-22 08:58:55 -06:00
dm-zero.c dm crypt, dm zero: update author name following legal name change 2014-07-10 16:44:14 -04:00
dm.c dm: factor out a common cleanup_mapped_device() 2015-05-29 14:18:58 -04:00
dm.h block, dm: don't copy bios for request clones 2015-05-22 08:58:57 -06:00
faulty.c md: rename ->stop to ->free 2015-02-04 08:35:52 +11:00
Kconfig dm cache: add stochastic-multi-queue (smq) policy 2015-06-11 17:12:59 -04:00
linear.c md: rename ->stop to ->free 2015-02-04 08:35:52 +11:00
linear.h
Makefile dm cache: add stochastic-multi-queue (smq) policy 2015-06-11 17:12:59 -04:00
md-cluster.c md-cluster: re-add capabilities 2015-04-22 07:59:39 +10:00
md-cluster.h md-cluster: re-add capabilities 2015-04-22 07:59:39 +10:00
md.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2015-05-08 19:49:35 -07:00
md.h md: remove 'go_faster' option from ->sync_request() 2015-04-22 08:00:40 +10:00
multipath.c md: rename ->stop to ->free 2015-02-04 08:35:52 +11:00
multipath.h
raid1.c md: remove 'go_faster' option from ->sync_request() 2015-04-22 08:00:40 +10:00
raid1.h md: make ->congested robust against personality changes. 2015-02-04 08:35:52 +11:00
raid5.c raid5: fix broken async operation chain 2015-05-21 09:14:20 +10:00
raid5.h md/raid5: allow the stripe_cache to grow and shrink. 2015-04-22 08:00:43 +10:00
raid10.c md: remove 'go_faster' option from ->sync_request() 2015-04-22 08:00:40 +10:00
raid10.h md: make ->congested robust against personality changes. 2015-02-04 08:35:52 +11:00
raid0.c md/raid0: fix restore to sector variable in raid0_make_request 2015-05-21 09:14:25 +10:00
raid0.h