linux/drivers/infiniband/core
Mike Marciniszyn 4fbc3a52cd RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz
64k pages introduce the situation in this diagram when the HCA 4k page
size is being used:

 +-------------------------------------------+ <--- 64k aligned VA
 |                                           |
 |              HCA 4k page                  |
 |                                           |
 +-------------------------------------------+
 |                   o                       |
 |                                           |
 |                   o                       |
 |                                           |
 |                   o                       |
 +-------------------------------------------+
 |                                           |
 |              HCA 4k page                  |
 |                                           |
 +-------------------------------------------+ <--- Live HCA page
 |OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO| <--- offset
 |                                           | <--- VA
 |                MR data                    |
 +-------------------------------------------+
 |                                           |
 |              HCA 4k page                  |
 |                                           |
 +-------------------------------------------+
 |                   o                       |
 |                                           |
 |                   o                       |
 |                                           |
 |                   o                       |
 +-------------------------------------------+
 |                                           |
 |              HCA 4k page                  |
 |                                           |
 +-------------------------------------------+

The VA addresses coming from rdma-core in this diagram can be
arbitrary, but for 64k pages, the VA may be offset by some number of HCA
4k pages and followed by some number of HCA 4k pages.

The current iterator doesn't account for either the preceding 4k pages or
the following 4k pages.
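
As a rough illustration of the arithmetic behind the diagram, the following user-space sketch counts the HCA blocks that precede the MR data, hold it, and follow it within the pinned CPU pages. The helper names (sketch_leading_blocks() and friends) are hypothetical and exist only for this example; they are not kernel functions, and the page sizes are passed in as plain parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true if v is a non-zero power of two. */
static int sketch_is_pow2(uint64_t v)
{
	return v && (v & (v - 1)) == 0;
}

/*
 * HCA blocks between the start of the first CPU page and the first
 * HCA block that contains MR data (the "preceding" pages).
 */
static uint64_t sketch_leading_blocks(uint64_t va, uint64_t hca_pgsz,
				      uint64_t cpu_pgsz)
{
	assert(sketch_is_pow2(hca_pgsz) && sketch_is_pow2(cpu_pgsz));
	return (va & (cpu_pgsz - 1)) / hca_pgsz;
}

/* HCA blocks that actually hold MR data (the "live" pages). */
static uint64_t sketch_live_blocks(uint64_t va, uint64_t len,
				   uint64_t hca_pgsz)
{
	uint64_t first, last;

	assert(sketch_is_pow2(hca_pgsz));
	first = va & ~(hca_pgsz - 1);                       /* round down */
	last = (va + len + hca_pgsz - 1) & ~(hca_pgsz - 1); /* round up */
	return (last - first) / hca_pgsz;
}

/*
 * HCA blocks between the end of the MR data and the end of the last
 * CPU page (the "following" pages).
 */
static uint64_t sketch_trailing_blocks(uint64_t va, uint64_t len,
				       uint64_t hca_pgsz, uint64_t cpu_pgsz)
{
	uint64_t mr_end, cpu_end;

	assert(sketch_is_pow2(hca_pgsz) && sketch_is_pow2(cpu_pgsz));
	mr_end = (va + len + hca_pgsz - 1) & ~(hca_pgsz - 1);
	cpu_end = (va + len + cpu_pgsz - 1) & ~(cpu_pgsz - 1);
	return (cpu_end - mr_end) / hca_pgsz;
}
```

For a VA of 0x53010 and a length of 0x1000 with 64k CPU pages and 4k HCA pages, this gives 3 preceding blocks, 2 live blocks and 11 following blocks, which together make up the 16 blocks of one 64k page.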

Fix the issue by extending the ib_block_iter to contain the number of DMA
pages like comment [1] says and by using __sg_advance to start the
iterator at the first live HCA page.

The changes are contained in a parallel set of iterator start and next
functions that are umem aware and specific to umem since there is one user
of the rdma_for_each_block() without umem.

These two fixes prevent extra pages from being returned before and after
the user MR data.

Fix the preceding pages by using the __sg_advance field to start at the
first 4k page containing MR data.

Fix the following pages by saving the number of pgsz blocks in the
iterator state and counting down on each next.
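
The two parts of the fix can be sketched in user space as follows, assuming for simplicity that the pinned memory is one physically contiguous region rather than a scatterlist. The struct and function names (umem_sketch_iter and so on) are hypothetical; only the ideas of a start offset, as held in __sg_advance, and a saved block count that is decremented on each step come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical iterator state for one contiguous pinned region. */
struct umem_sketch_iter {
	uint64_t dma_base;  /* DMA address of the first pinned CPU page */
	uint64_t dma_len;   /* number of bytes pinned */
	uint64_t advance;   /* byte offset of the next block to return */
	uint64_t numblocks; /* pgsz blocks still to be returned */
	uint64_t pgsz;      /* HCA page size */
	uint64_t dma_addr;  /* block returned by the last successful next */
};

/*
 * Start the iterator at the first HCA block that contains MR data and
 * record how many HCA blocks the MR spans.
 * offset is the byte offset of the MR inside the pinned region.
 */
static void umem_sketch_iter_start(struct umem_sketch_iter *it,
				   uint64_t dma_base, uint64_t dma_len,
				   uint64_t offset, uint64_t mr_len,
				   uint64_t pgsz)
{
	uint64_t first, last;

	assert(pgsz && (pgsz & (pgsz - 1)) == 0); /* power of two */
	first = offset & ~(pgsz - 1);                        /* round down */
	last = (offset + mr_len + pgsz - 1) & ~(pgsz - 1);   /* round up */

	it->dma_base = dma_base;
	it->dma_len = dma_len;
	it->pgsz = pgsz;
	it->dma_addr = 0;
	it->advance = first;                   /* skips the preceding blocks */
	it->numblocks = (last - first) / pgsz; /* bounds the following blocks */
}

/* Return the next live block, or false once the count is used up. */
static bool umem_sketch_iter_next(struct umem_sketch_iter *it)
{
	if (it->numblocks == 0 || it->advance >= it->dma_len)
		return false;
	it->dma_addr = it->dma_base + it->advance;
	it->advance += it->pgsz;
	it->numblocks--;
	return true;
}
```

With a single 64k pinned page at DMA address 0x100000 and an MR that starts at offset 0x3010 and is 0x1000 bytes long, the sketch returns the blocks 0x103000 and 0x104000 and then stops, instead of walking all 16 blocks of the 64k page.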

This fix allows for the elimination of the small page crutch noted in the
Fixes tag.

Fixes: 10c75ccb54 ("RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()")
Link: https://lore.kernel.org/r/20231129202143.1434-2-shiraz.saleem@intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-12-04 20:02:41 -04:00
..
addr.c RDMA/core: Delete useless module.h include 2022-01-28 13:03:12 -04:00
agent.c
agent.h
cache.c RDMA/core: Annotate struct ib_pkey_cache with __counted_by 2023-10-02 14:44:54 +03:00
cgroup.c
cm_msgs.h
cm_trace.c
cm_trace.h trace: Relocate event helper files 2022-12-10 11:01:12 -05:00
cm.c RDMA/cm: Trace icm_send_rej event before the cm state is reset 2023-04-09 12:52:57 +03:00
cma_configfs.c RDMA/cma: Fix truncation compilation warning in make_cma_ports 2023-09-18 11:52:10 +03:00
cma_priv.h RDMA/core: Add an rb_tree that stores cm_ids sorted by ifindex and remote IP 2022-06-16 09:54:35 +03:00
cma_trace.c
cma_trace.h trace: Relocate event helper files 2022-12-10 11:01:12 -05:00
cma.c RDMA/cma: Initialize ib_sa_multicast structure to 0 when join 2023-10-02 13:10:40 +03:00
core_priv.h RDMA/core: Add support to set privileged QKEY parameter 2023-10-19 09:31:00 +03:00
counters.c RDMA/counter: Add optional counter support 2021-10-12 12:48:05 -03:00
cq.c RDMA/core: Delete useless module.h include 2022-01-28 13:03:12 -04:00
device.c RDMA/core: Add support to dump SRQ resource in RAW format 2023-09-20 10:50:54 +03:00
ib_core_uverbs.c
iwcm.c infiniband: Remove the now superfluous sentinel element from ctl_table array 2023-10-11 12:16:13 -07:00
iwcm.h
iwpm_msg.c RDMA/iwpm: Rely on the rdma_nl_[un]register() to ensure that requests are valid 2021-07-30 10:01:41 -03:00
iwpm_util.c RDMA: Remove unnecessary NULL values 2023-08-07 16:56:57 +03:00
iwpm_util.h RDMA/core: Delete useless module.h include 2022-01-28 13:03:12 -04:00
lag.c RDMA/core: Remove NULL check before dev_{put, hold} 2023-10-24 18:16:04 +03:00
mad_priv.h
mad_rmpp.c
mad_rmpp.h
mad.c IB/mad: Don't call to function that might sleep while in atomic context 2022-11-10 10:57:15 +02:00
Makefile
mr_pool.c
multicast.c
netlink.c RDMA: Remove unnecessary ternary operators 2023-07-31 15:16:12 +03:00
nldev.c Merge tag 'v6.6' into rdma.git for-next 2023-10-31 10:54:48 -03:00
opa_smi.h
packer.c
rdma_core.c RDMA: Correct duplicated words in comments 2022-06-24 16:52:28 -03:00
rdma_core.h
restrack.c RDMA/restrack: Release MR restrack when delete 2022-11-15 09:56:32 +02:00
restrack.h
roce_gid_mgmt.c IB: Fix repeated words 'the the' comments 2022-07-22 12:02:29 -03:00
rw.c RDMA/core: Fix a couple of obvious typos in comments 2023-10-04 21:55:44 +03:00
sa_query.c RDMA/core: Use size_{add,sub,mul}() in calls to struct_size() 2023-09-19 10:33:45 +03:00
sa.h
security.c
smi.c
smi.h
sysfs.c IB/core: Add support for XDR link speed 2023-09-26 12:38:39 +03:00
trace.c
ucma.c infiniband: Remove the now superfluous sentinel element from ctl_table array 2023-10-11 12:16:13 -07:00
ud_header.c
umem_dmabuf.c RDMA/umem: Use dma-buf locked API to solve deadlock 2023-01-31 10:24:49 -04:00
umem_odp.c Linux 6.0 2022-10-06 19:48:45 -03:00
umem.c RDMA/core: Fix umem iterator when PAGE_SIZE is greater then HCA pgsz 2023-12-04 20:02:41 -04:00
user_mad.c RDMA/core: Use size_{add,sub,mul}() in calls to struct_size() 2023-09-19 10:33:45 +03:00
uverbs_cmd.c RDMA/core: Add support to set privileged QKEY parameter 2023-10-19 09:31:00 +03:00
uverbs_ioctl.c RDMA/core: Add UVERBS_ATTR_RAW_FD 2022-09-27 10:15:24 -03:00
uverbs_main.c RDMA/uverbs: Fix typo of sizeof argument 2023-09-11 15:01:09 +03:00
uverbs_marshall.c RDMA/core: Don't infoleak GRH fields 2022-01-05 16:30:19 -04:00
uverbs_std_types_async_fd.c
uverbs_std_types_counters.c IB/uverbs: Fix an potential error pointer dereference 2023-08-07 16:49:59 +03:00
uverbs_std_types_cq.c
uverbs_std_types_device.c IB/core: Add support for XDR link speed 2023-09-26 12:38:39 +03:00
uverbs_std_types_dm.c
uverbs_std_types_flow_action.c RDMA/core: Delete IPsec flow action logic from the core 2022-04-09 08:25:06 +03:00
uverbs_std_types_mr.c RDMA/uverbs: Track dmabuf memory regions 2021-08-19 09:59:53 -03:00
uverbs_std_types_qp.c IB/uverbs: fix the typo of optional 2022-10-19 09:46:45 +03:00
uverbs_std_types_srq.c
uverbs_std_types_wq.c
uverbs_std_types.c
uverbs_uapi.c RDMA/uverbs: Check for null return of kmalloc_array 2022-01-05 14:16:53 -04:00
uverbs.h
verbs.c RDMA/core: Fix uninit-value access in ib_get_eth_speed() 2023-11-13 10:19:07 +02:00