linux/drivers/infiniband/hw
Jason Gunthorpe f28b1932ea RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MR
mlx5_ib_update_xlt() must be protected against a parallel free of the MR
it is accessing, and it must be called single-threaded while updating the
HW. Otherwise we can have races of the form:

    CPU0                               CPU1
  mlx5_ib_update_xlt()
   mlx5_odp_populate_klm()
     odp_lookup() == NULL
     pklm = ZAP
                                      implicit_mr_get_data()
                                        implicit_mr_alloc()
                                          <update interval tree>
                                        mlx5_ib_update_xlt()
                                          mlx5_odp_populate_klm()
                                            odp_lookup() != NULL
                                            pklm = VALID
                                           mlx5_ib_post_send_wait()

    mlx5_ib_post_send_wait() // Replaces VALID with ZAP

This can be solved by putting both the SRCU and the umem_mutex lock around
every call to mlx5_ib_update_xlt(). This ensures that the content of the
interval tree relevant to mlx5_odp_populate_klm() (i.e. mr->parent == mr)
will not change while it is running, and thus the posted WRs to update the
KLM will always reflect the correct information.

The race above is then resolved in one of two ways: either CPU1 waits
until CPU0 completes the ZAP, or CPU0 runs after the add and stores VALID
instead.
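
The effect of holding the mutex across both the lookup and the post can be
sketched with a small userspace analogue. This is not the driver code: the
struct imr_model type and the update_xlt_locked(), cpu0_update(),
cpu1_add_child() and run_race_once() helpers are hypothetical stand-ins, a
pthread mutex plays the role of umem_mutex, and the SRCU part (protection
against the MR being freed) is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum klm_state { KLM_ZAP, KLM_VALID };

/* Hypothetical model: one interval-tree slot and one HW KLM entry. */
struct imr_model {
	pthread_mutex_t umem_mutex;	/* stands in for umem_mutex */
	bool child_present;		/* stands in for odp_lookup() != NULL */
	enum klm_state hw_klm;		/* what the HW would end up seeing */
};

/*
 * Stands in for mlx5_ib_update_xlt(). The caller must hold umem_mutex so
 * that the lookup and the post form a single critical section.
 */
static void update_xlt_locked(struct imr_model *imr)
{
	enum klm_state pklm = imr->child_present ? KLM_VALID : KLM_ZAP;

	imr->hw_klm = pklm;	/* stands in for mlx5_ib_post_send_wait() */
}

/* CPU0 in the diagram: refresh the KLM from the current tree content. */
static void *cpu0_update(void *arg)
{
	struct imr_model *imr = arg;

	pthread_mutex_lock(&imr->umem_mutex);
	update_xlt_locked(imr);
	pthread_mutex_unlock(&imr->umem_mutex);
	return NULL;
}

/* CPU1 in the diagram: add a child, then refresh the KLM. */
static void *cpu1_add_child(void *arg)
{
	struct imr_model *imr = arg;

	pthread_mutex_lock(&imr->umem_mutex);
	imr->child_present = true;	/* <update interval tree> */
	update_xlt_locked(imr);
	pthread_mutex_unlock(&imr->umem_mutex);
	return NULL;
}

/* Run both sides concurrently and return the final HW KLM state. */
enum klm_state run_race_once(void)
{
	struct imr_model imr = { .child_present = false, .hw_klm = KLM_ZAP };
	pthread_t t0, t1;
	int rc;

	pthread_mutex_init(&imr.umem_mutex, NULL);
	rc = pthread_create(&t0, NULL, cpu0_update, &imr);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, cpu1_add_child, &imr);
	assert(rc == 0);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	pthread_mutex_destroy(&imr.umem_mutex);
	return imr.hw_klm;
}
```

In this model run_race_once() returns KLM_VALID whichever thread takes the
mutex first, because a ZAP computed from a stale lookup can no longer be
posted after the child's VALID entry.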

The pagefault path adding children already holds the umem_mutex and SRCU,
so the only missed lock is during MR destruction.

Fixes: 81713d3788 ("IB/mlx5: Add implicit MR support")
Link: https://lore.kernel.org/r/20191001153821.23621-3-jgg@ziepe.ca
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-04 15:54:21 -03:00
Name        Last commit                                                          Date
bnxt_re     RDMA/bnxt_re: Fix spelling mistake "missin_resp" -> "missing_resp"   2019-09-16 10:58:57 -03:00
cxgb3       RDMA/{cxgb3, cxgb4, i40iw}: Remove common code                       2019-08-12 10:19:43 -04:00
cxgb4       RDMA/cxgb4: Do not dma memory off of the stack                       2019-10-04 15:13:27 -03:00
efa         RDMA/efa: Fix incorrect error print                                  2019-09-16 14:25:43 -03:00
hfi1        RDMA/hfi1: Prevent memory leak in sdma_init                          2019-10-01 11:34:55 -03:00
hns         RDMA subsystem updates for 5.4                                       2019-09-21 10:26:24 -07:00
i40iw       RDMA/i40iw: Associate ibdev to netdev before IB device registration  2019-10-04 14:29:14 -03:00
mlx4        Merge tag 'v5.3-rc8' into rdma.git for-next                          2019-09-13 16:59:51 -03:00
mlx5        RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MR      2019-10-04 15:54:21 -03:00
mthca       IB: Remove unneeded memset                                           2019-07-03 14:26:49 -03:00
ocrdma      RDMA: Introduce ib_port_phys_state enum                              2019-08-12 10:18:52 -04:00
qedr        RDMA: Introduce ib_port_phys_state enum                              2019-08-12 10:18:52 -04:00
qib         mm/gup: add make_dirty arg to put_user_pages_dirty_lock()            2019-09-24 15:54:08 -07:00
usnic       mm/gup: add make_dirty arg to put_user_pages_dirty_lock()            2019-09-24 15:54:08 -07:00
vmw_pvrdma  RDMA/vmw_pvrdma: Free SRQ only once                                  2019-10-01 10:47:58 -03:00
Makefile    rdma: Remove nes                                                     2019-06-13 09:59:49 -04:00