linux/drivers/infiniband/hw/mlx4
Jack Morgenstein f95ccffc71 IB/mlx4: Use 4K pages for kernel QP's WQE buffer
In the current implementation, the driver tries to allocate contiguous
memory, and if it fails, it falls back to 4K fragmented allocation.

Once memory is fragmented, the initial contiguous allocation attempt
can take a long time, or even fail, which can cause connection failures.

This patch changes the logic to always allocate with 4K granularity,
since it's more robust and more likely to succeed.
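
The difference between the two strategies can be sketched with a small
userspace model. This is not the driver code: every name below
(toy_mem, toy_alloc_old, toy_alloc_new, and so on) is hypothetical, the
4 KiB page size is an assumption, and "fragmentation" is reduced to a
single number, the largest order for which a free block still exists.
The model only captures success or failure, not the time the allocator
may spend trying to satisfy a high-order request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical model of a fragmented system: the largest free block
 * that can still be found is 2^max_free_order contiguous pages. */
struct toy_mem {
    unsigned max_free_order;
};

enum toy_result { TOY_FAIL, TOY_CONTIGUOUS, TOY_FRAGMENTED_4K };

/* Smallest order such that 2^order pages cover `size` bytes. */
static unsigned toy_order_for(size_t size)
{
    size_t pages = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
    unsigned order = 0;

    assert(size > 0);
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* One physically contiguous block: succeeds only if a large enough
 * free block exists in the model. */
static bool toy_alloc_contiguous(const struct toy_mem *m, size_t size)
{
    return toy_order_for(size) <= m->max_free_order;
}

/* Many independent 4K chunks: every chunk is an order-0 request. */
static bool toy_alloc_4k_chunks(const struct toy_mem *m, size_t size)
{
    size_t pages = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

    for (size_t i = 0; i < pages; i++)
        if (!toy_alloc_contiguous(m, TOY_PAGE_SIZE))
            return false;
    return true;
}

/* Behaviour described as "current": try contiguous, then fall back. */
static enum toy_result toy_alloc_old(const struct toy_mem *m, size_t size)
{
    if (toy_alloc_contiguous(m, size))
        return TOY_CONTIGUOUS;
    return toy_alloc_4k_chunks(m, size) ? TOY_FRAGMENTED_4K : TOY_FAIL;
}

/* Behaviour after the patch: always allocate with 4K granularity. */
static enum toy_result toy_alloc_new(const struct toy_mem *m, size_t size)
{
    return toy_alloc_4k_chunks(m, size) ? TOY_FRAGMENTED_4K : TOY_FAIL;
}
```

In this model a 1 MiB buffer needs an order-8 block for the contiguous
path. With max_free_order = 3 the old policy only succeeds through its
fallback, whereas the new policy never issues the high-order request.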

This patch was tested with Lustre and no performance degradation
was observed.

Note: This commit eliminates the "shrinking WQE" feature. This feature
depended on using vmap to create a virtually contiguous send WQ.
vmap use was abandoned due to problems with several processors (see the
commit cited below). As a result, shrinking WQE was available only with
physically contiguous send WQs. Allocating such send WQs caused the
problems described above.
Therefore, as a side effect of eliminating the use of large physically
contiguous send WQs, the shrinking WQE feature became unavailable.

Warning example:
kworker/20:1: page allocation failure: order:8, mode:0x80d0
CPU: 20 PID: 513 Comm: kworker/20:1 Tainted: G OE ------------
Workqueue: ib_cm cm_work_handler [ib_cm]
Call Trace:
[<ffffffff81686d81>] dump_stack+0x19/0x1b
[<ffffffff81186160>] warn_alloc_failed+0x110/0x180
[<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
[<ffffffff811ce868>] alloc_pages_current+0x98/0x110
[<ffffffff81184fae>] __get_free_pages+0xe/0x50
[<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
[<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
[<ffffffffa056b4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
[<ffffffffa056b73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
[<ffffffffa0b15496>] create_qp_common+0x536/0x1000 [mlx4_ib]
[<ffffffff811c6ef7>] ? dma_pool_free+0xa7/0xd0
[<ffffffffa0b163c1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
[<ffffffffa0b01bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
[<ffffffffa0b21f20>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
[<ffffffffa08f152a>] ib_create_qp+0x7a/0x2f0 [ib_core]
[<ffffffffa06205d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
[<ffffffffa08275c9>] kiblnd_create_conn+0xbf9/0x1950 [ko2iblnd]
[<ffffffffa074077a>] ? cfs_percpt_unlock+0x1a/0xb0 [libcfs]
[<ffffffffa0835519>] kiblnd_passive_connect+0xa99/0x18c0 [ko2iblnd]
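
To read the "order:8" in the warning above: a page allocation of order
n asks for 2^n physically contiguous pages. The helper below is a
minimal illustration with hypothetical names. It assumes 4 KiB pages,
which is the usual x86 base page size but is not stated in the trace
itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of pages requested by an allocation of the given order. */
static size_t alloc_order_to_pages(unsigned order)
{
    /* Guard against shifting past the width of size_t. */
    assert(order < sizeof(size_t) * CHAR_BIT);
    return (size_t)1 << order;
}

/* Bytes requested: 2^order pages of page_size bytes each.
 * page_size is a parameter because it is an assumption here. */
static size_t alloc_order_to_bytes(unsigned order, size_t page_size)
{
    return alloc_order_to_pages(order) * page_size;
}
```

Under that assumption, alloc_order_to_bytes(8, 4096) is 256 pages, or
1 MiB, that must be physically contiguous. With 4K granularity the same
buffer is built from 256 separate order-0 pages.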

Fixes: 73898db043 ("net/mlx4: Avoid wrong virtual mappings")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30 20:25:46 -06:00
..
ah.c IB/mlx4: Create slave AH's directly 2018-06-26 14:37:26 -06:00
alias_GUID.c IB/mlx4: Fix some spelling mistakes 2017-08-24 16:27:10 -04:00
cm.c IB/mlx4: Fix CM REQ retries in paravirt mode 2017-07-20 11:20:50 -04:00
cq.c IB/mlx: Set slid to zero in Ethernet completion struct 2018-02-28 12:10:32 -07:00
doorbell.c IB: Refactor umem to use linear SG table 2014-03-04 10:34:28 -08:00
Kconfig net: mellanox: add DEVLINK dependencies 2016-03-03 17:08:59 -05:00
mad.c RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const 2018-07-30 20:09:34 -06:00
main.c RDMA: Fix storage of PortInfo CapabilityMask in the kernel 2018-07-10 11:06:45 -06:00
Makefile IB/mlx4: Add iov directory in sysfs under the ib device 2012-09-30 20:33:39 -07:00
mcg.c IB/mlx4: Suppress gcc 7 fall-through complaints 2017-10-14 20:47:06 -04:00
mlx4_ib.h IB/mlx4: Use 4K pages for kernel QP's WQE buffer 2018-07-30 20:25:46 -06:00
mr.c IB/mlx4: Mark user MR as writable if actual virtual memory is writable 2018-05-28 11:41:39 -06:00
qp.c IB/mlx4: Use 4K pages for kernel QP's WQE buffer 2018-07-30 20:25:46 -06:00
srq.c RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const 2018-07-30 20:09:34 -06:00
sysfs.c IB/mlx4: fix sprintf format warning 2017-09-13 18:53:15 -07:00