linux/drivers/block
Ming Lei bc07c10a36 block: loop: support DIO & AIO
There are at least 3 advantages to use direct I/O and AIO on
read/write loop's backing file:

1) double cache can be avoided, then memory usage gets
decreased a lot

2) not like user space direct I/O, there isn't cost of
pinning pages

3) avoid context switch for obtaining good throughput
- in buffered file read, random I/O top throughput is often obtained
only if they are submitted concurrently from lots of tasks; but for
sequential I/O, most of times they can be hit from page cache, so
concurrent submissions often introduce unnecessary context switch
and can't improve throughput much. There was such discussion[1]
to use non-blocking I/O to improve the problem for application.
- with direct I/O and AIO, concurrent submissions can be
avoided and random read throughput can't be affected meantime

xfstests(-g auto, ext4) is basically passed when running with
direct I/O(aio), one exception is generic/232, but it failed in
loop buffered I/O(4.2-rc6-next-20150814) too.

Follows the fio test result for performance purpose:
	4 jobs fio test inside ext4 file system over loop block

1) How to run
	- KVM: 4 VCPUs, 2G RAM
	- linux kernel: 4.2-rc6-next-20150814(base) with the patchset
	- the loop block is over one image on SSD.
	- linux psync, 4 jobs, size 1500M, ext4 over loop block
	- test result: IOPS from fio output

2) Throughput(IOPS) becomes a bit better with direct I/O(aio)
        -------------------------------------------------------------
        test cases          |randread   |read   |randwrite  |write  |
        -------------------------------------------------------------
        base                |8015       |113811 |67442      |106978
        -------------------------------------------------------------
        base+loop aio       |8136       |125040 |67811      |111376
        -------------------------------------------------------------

- somehow, it should be caused by more page cache avaiable for
application or one extra page copy is avoided in case of direct I/O

3) context switch
        - context switch decreased by ~50% with loop direct I/O(aio)
	compared with loop buffered I/O(4.2-rc6-next-20150814)

4) memory usage from /proc/meminfo
        -------------------------------------------------------------
                                   | Buffers       | Cached
        -------------------------------------------------------------
        base                       | > 760MB       | ~950MB
        -------------------------------------------------------------
        base+loop direct I/O(aio)  | < 5MB         | ~1.6GB
        -------------------------------------------------------------

- so there are much more page caches available for application with
direct I/O

[1] https://lwn.net/Articles/612483/

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 11:01:16 -06:00
..
aoe Revert "block: remove artifical max_hw_sectors cap" 2015-08-18 13:21:13 -07:00
drbd block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
mtip32xx mtip32x: fix regression introduced by blk-mq per-hctx flush 2015-08-25 14:35:51 -06:00
paride Char/Misc driver patches for 4.2-rc1 2015-06-26 14:51:15 -07:00
rsxx block: make generic_make_request handle arbitrarily sized bios 2015-08-13 12:31:33 -06:00
xen-blkback Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block 2015-09-02 13:10:25 -07:00
zram zram: fix possible use after free in zcomp_create() 2015-09-17 21:16:07 -07:00
amiflop.c block: drop owner assignment from platform_drivers 2014-10-20 16:20:18 +02:00
ataflop.c Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next 2014-06-02 09:29:34 -07:00
brd.c libnvdimm for 4.3: 2015-09-08 14:35:59 -07:00
cciss_cmd.h
cciss_scsi.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
cciss_scsi.h
cciss.c cciss: correct the non-resettable board list 2015-05-31 11:14:34 -07:00
cciss.h
cpqarray.c genirq: Remove the deprecated 'IRQF_DISABLED' request_irq() flag entirely 2015-03-05 20:53:06 +01:00
cpqarray.h
cryptoloop.c
DAC960.c block: use pci_zalloc_consistent 2014-08-08 15:57:28 -07:00
DAC960.h
floppy.c block: add a bi_error field to struct bio 2015-07-29 08:55:15 -06:00
hd.c block: hd: remove deprecated IRQF_DISABLED 2014-10-01 08:16:07 -06:00
ida_cmd.h
ida_ioctl.h
Kconfig libnvdimm, pmem: move pmem to drivers/nvdimm/ 2015-06-24 21:24:10 -04:00
loop.c block: loop: support DIO & AIO 2015-09-23 11:01:16 -06:00
loop.h block: loop: support DIO & AIO 2015-09-23 11:01:16 -06:00
Makefile libnvdimm, pmem: move pmem to drivers/nvdimm/ 2015-06-24 21:24:10 -04:00
mg_disk.c block: drop owner assignment from platform_drivers 2014-10-20 16:20:18 +02:00
nbd.c nbd: flags is a u32 variable 2015-08-17 08:23:01 -06:00
null_blk.c null_blk: fix wrong capacity when bs is not 512 bytes 2015-09-02 16:49:42 -06:00
nvme-core.c Merge branch 'for-4.3/drivers' of git://git.kernel.dk/linux-block 2015-09-02 13:14:58 -07:00
nvme-scsi.c Merge branch 'for-4.2/drivers' of git://git.kernel.dk/linux-block 2015-06-25 15:12:50 -07:00
osdblk.c block: support different tag allocation policy 2015-01-23 14:15:46 -07:00
pktcdvd.c block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
ps3disk.c
ps3vram.c block: make generic_make_request handle arbitrarily sized bios 2015-08-13 12:31:33 -06:00
rbd_types.h
rbd.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-09-11 12:33:03 -07:00
skd_main.c block: have drivers use blk_queue_max_discard_sectors() 2015-07-17 08:41:53 -06:00
skd_s1120.h
smart1,2.h
sunvdc.c sunvdc: reconnect ldc after vds service domain restarts 2014-12-11 18:52:45 -08:00
swim3.c powerpc: Move Power Macintosh drivers to generic byteswappers 2015-03-23 14:29:40 +11:00
swim_asm.S
swim.c block: drop owner assignment from platform_drivers 2014-10-20 16:20:18 +02:00
sx8.c block: rename REQ_TYPE_SPECIAL to REQ_TYPE_DRV_PRIV 2015-05-05 13:40:03 -06:00
umem.c block: make generic_make_request handle arbitrarily sized bios 2015-08-13 12:31:33 -06:00
umem.h
virtio_blk.c virtio-blk: Allow extended partitions 2015-09-08 13:31:07 +03:00
xen-blkfront.c xen: MFN/GFN/BFN terminology changes for 4.3-rc0 2015-09-10 16:21:11 -07:00
xsysace.c block: systemace: Remove .owner field for driver 2014-08-21 20:37:54 -05:00
z2ram.c block: remove struct request buffer member 2014-04-15 14:03:02 -06:00