linux/net/smc
Wen Gu b8d199451c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R
On long-running enterprise production servers, high-order contiguous
memory pages are usually very rare and in most cases we can only get
fragmented pages.

When replacing TCP with SMC-R in such production scenarios, attempting
to allocate high-order physically contiguous sndbufs and RMBs may result
in frequent memory compaction, which will cause unexpected hung issue
and further stability risks.

So this patch is aimed to allow SMC-R link group to use virtually
contiguous sndbufs and RMBs to avoid potential issues mentioned above.
Whether to use physically or virtually contiguous buffers can be set
by sysctl smcr_buf_type.

Note that using virtually contiguous buffers will bring an acceptable
performance regression, which can be mainly divided into two parts:

1) regression in data path, which is brought by additional address
   translation of sndbuf by RNIC in Tx. But in general, translating
   address through MTT is fast.

   Taking 256KB sndbuf and RMB as an example, the comparisons in qperf
   latency and bandwidth test with physically and virtually contiguous
   buffers are as follows:

- client:
  smc_run taskset -c <cpu> qperf <server> -oo msg_size:1:64K:*2\
  -t 5 -vu tcp_{bw|lat}
- server:
  smc_run taskset -c <cpu> qperf

   [latency]
   msgsize              tcp            smcr        smcr-use-virt-buf
   1               11.17 us         7.56 us         7.51 us (-0.67%)
   2               10.65 us         7.74 us         7.56 us (-2.31%)
   4               11.11 us         7.52 us         7.59 us ( 0.84%)
   8               10.83 us         7.55 us         7.51 us (-0.48%)
   16              11.21 us         7.46 us         7.51 us ( 0.71%)
   32              10.65 us         7.53 us         7.58 us ( 0.61%)
   64              10.95 us         7.74 us         7.80 us ( 0.76%)
   128             11.14 us         7.83 us         7.87 us ( 0.47%)
   256             10.97 us         7.94 us         7.92 us (-0.28%)
   512             11.23 us         7.94 us         8.20 us ( 3.25%)
   1024            11.60 us         8.12 us         8.20 us ( 0.96%)
   2048            14.04 us         8.30 us         8.51 us ( 2.49%)
   4096            16.88 us         9.13 us         9.07 us (-0.64%)
   8192            22.50 us        10.56 us        11.22 us ( 6.26%)
   16384           28.99 us        12.88 us        13.83 us ( 7.37%)
   32768           40.13 us        16.76 us        16.95 us ( 1.16%)
   65536           68.70 us        24.68 us        24.85 us ( 0.68%)
   [bandwidth]
   msgsize                tcp              smcr          smcr-use-virt-buf
   1                1.65 MB/s         1.59 MB/s         1.53 MB/s (-3.88%)
   2                3.32 MB/s         3.17 MB/s         3.08 MB/s (-2.67%)
   4                6.66 MB/s         6.33 MB/s         6.09 MB/s (-3.85%)
   8               13.67 MB/s        13.45 MB/s        11.97 MB/s (-10.99%)
   16              25.36 MB/s        27.15 MB/s        24.16 MB/s (-11.01%)
   32              48.22 MB/s        54.24 MB/s        49.41 MB/s (-8.89%)
   64             106.79 MB/s       107.32 MB/s        99.05 MB/s (-7.71%)
   128            210.21 MB/s       202.46 MB/s       201.02 MB/s (-0.71%)
   256            400.81 MB/s       416.81 MB/s       393.52 MB/s (-5.59%)
   512            746.49 MB/s       834.12 MB/s       809.99 MB/s (-2.89%)
   1024          1292.33 MB/s      1641.96 MB/s      1571.82 MB/s (-4.27%)
   2048          2007.64 MB/s      2760.44 MB/s      2717.68 MB/s (-1.55%)
   4096          2665.17 MB/s      4157.44 MB/s      4070.76 MB/s (-2.09%)
   8192          3159.72 MB/s      4361.57 MB/s      4270.65 MB/s (-2.08%)
   16384         4186.70 MB/s      4574.13 MB/s      4501.17 MB/s (-1.60%)
   32768         4093.21 MB/s      4487.42 MB/s      4322.43 MB/s (-3.68%)
   65536         4057.14 MB/s      4735.61 MB/s      4555.17 MB/s (-3.81%)

2) regression in buffer initialization and destruction path, which is
   brought by additional MR operations of sndbufs. But thanks to link
   group buffer reuse mechanism, the impact of this kind of regression
   decreases as times of buffer reuse increases.

   Taking 256KB sndbuf and RMB as an example, latency of some key SMC-R
   buffer-related function obtained by bpftrace are as follows:

   Function                         Phys-bufs           Virt-bufs
   smcr_new_buf_create()             67154 ns            79164 ns
   smc_ib_buf_map_sg()                 525 ns              928 ns
   smc_ib_get_memory_region()       162294 ns           161191 ns
   smc_wr_reg_send()                  9957 ns             9635 ns
   smc_ib_put_memory_region()       203548 ns           198374 ns
   smc_ib_buf_unmap_sg()               508 ns             1158 ns

------------
Test environment notes:
1. Above tests run on 2 VMs within the same Host.
2. The NIC is ConnectX-4Lx, using SRIOV and passing through 2 VFs to
   the each VM respectively.
3. VMs' vCPUs are binded to different physical CPUs, and the binded
   physical CPUs are isolated by `isolcpus=xxx` cmdline.
4. NICs' queue number are set to 1.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-18 11:19:17 +01:00
..
af_smc.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile net/smc: fix compile warning for smc_sysctl 2022-03-07 11:59:17 +00:00
smc_cdc.c net/smc: fixes for converting from "struct smc_cdc_tx_pend **" to "struct smc_wr_tx_pend_priv *" 2022-05-28 12:36:26 +01:00
smc_cdc.h net/smc: fix kernel panic caused by race of smc_sock 2021-12-28 12:42:45 +00:00
smc_clc.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_clc.h net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_close.c net/smc: Fix slab-out-of-bounds issue in fallback 2022-04-25 11:03:48 -07:00
smc_close.h net/smc: remove close abort worker 2019-10-22 11:23:44 -07:00
smc_core.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_core.h net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_diag.c Partially revert "net/smc: Add netlink net namespace support" 2022-02-02 07:42:41 -08:00
smc_ib.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_ib.h net/smc: optimize for smc_sndbuf_sync_sg_for_device and smc_rmb_sync_sg_for_cpu 2022-07-18 11:19:17 +01:00
smc_ism.c net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
smc_ism.h net/smc: keep static copy of system EID 2021-09-14 12:49:10 +01:00
smc_llc.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_llc.h net/smc: extend LLC layer for SMC-Rv2 2021-10-16 14:58:13 +01:00
smc_netlink.c net/smc: Add global configure for handshake limitation by netlink 2022-02-11 11:14:58 +00:00
smc_netlink.h net/smc: add support for user defined EIDs 2021-09-14 12:49:10 +01:00
smc_netns.h net/smc: introduce list of pnetids for Ethernet devices 2020-09-28 15:19:03 -07:00
smc_pnet.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
smc_pnet.h net/smc: Use a mutex for locking "struct smc_pnettable" 2022-02-24 09:09:33 -08:00
smc_rx.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_rx.h
smc_stats.c net/smc: Fix ENODATA tests in smc_nl_get_fback_stats() 2021-06-21 12:16:58 -07:00
smc_stats.h net/smc: Make SMC statistics network namespace aware 2021-06-16 12:54:02 -07:00
smc_sysctl.c net/smc: Introduce a sysctl for setting SMC-R buffer type 2022-07-18 11:19:17 +01:00
smc_sysctl.h net/smc: fix -Wmissing-prototypes warning when CONFIG_SYSCTL not set 2022-03-09 20:02:35 -08:00
smc_tracepoint.c net/smc: Introduce tracepoint for smcr link down 2021-11-01 13:39:14 +00:00
smc_tracepoint.h net/smc: Add net namespace for tracepoints 2022-01-02 12:07:39 +00:00
smc_tx.c net/smc: Allow virtually contiguous sndbufs or RMBs for SMC-R 2022-07-18 11:19:17 +01:00
smc_tx.h net/smc: Cork when sendpage with MSG_SENDPAGE_NOTLAST flag 2022-01-31 15:08:20 +00:00
smc_wr.c net/smc: send cdc msg inline if qp has sufficient inline space 2022-05-17 17:34:12 -07:00
smc_wr.h net/smc: Remove unused function declaration 2022-01-15 22:57:21 +00:00
smc.h net/smc: Only save the original clcsock callback functions 2022-04-25 11:03:48 -07:00