Commit Graph

13106 Commits

Author SHA1 Message Date
Jason Gunthorpe
bf0480a2df IB/cm: Remove dgid from the cm_id_priv av
It turns out this is only being used to store the LID for SIDR mode to
search the RB tree for request de-duplication. Store the LID value
directly and don't pretend it is a GID.

Link: https://lore.kernel.org/r/2e7c87b6f662c90c642fc1838e363ad3e6ef14a4.1623236345.git.leonro@nvidia.com
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 09:39:27 -03:00
Alaa Hleihel
2ba0aa2fee IB/mlx5: Fix initializing CQ fragments buffer
The function init_cq_frag_buf() can be called to initialize the current CQ
fragments buffer cq->buf, or the temporary cq->resize_buf that is filled
during CQ resize operation.

However, the offending commit started to use function get_cqe() for
getting the CQEs, the issue with this change is that get_cqe() always
returns CQEs from cq->buf, which leads us to initialize the wrong buffer,
and in case of enlarging the CQ we try to access elements beyond the size
of the current cq->buf and eventually hit a kernel panic.

 [exception RIP: init_cq_frag_buf+103]
  [ffff9f799ddcbcd8] mlx5_ib_resize_cq at ffffffffc0835d60 [mlx5_ib]
  [ffff9f799ddcbdb0] ib_resize_cq at ffffffffc05270df [ib_core]
  [ffff9f799ddcbdc0] llt_rdma_setup_qp at ffffffffc0a6a712 [llt]
  [ffff9f799ddcbe10] llt_rdma_cc_event_action at ffffffffc0a6b411 [llt]
  [ffff9f799ddcbe98] llt_rdma_client_conn_thread at ffffffffc0a6bb75 [llt]
  [ffff9f799ddcbec8] kthread at ffffffffa66c5da1
  [ffff9f799ddcbf50] ret_from_fork_nospec_begin at ffffffffa6d95ddd

Fix it by getting the needed CQE by calling mlx5_frag_buf_get_wqe() that
takes the correct source buffer as a parameter.

Fixes: 388ca8be00 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Link: https://lore.kernel.org/r/90a0e8c924093cfa50a482880ad7e7edb73dc19a.1623309971.git.leonro@nvidia.com
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:34 -03:00
Aharon Landau
6466f03fdf RDMA/mlx5: Delete right entry from MR signature database
The value mr->sig is stored in the entry upon mr allocation, however, ibmr
is wrongly entered here as "old", therefore, xa_cmpxchg() does not replace
the entry with NULL, which leads to the following trace:

 WARNING: CPU: 28 PID: 2078 at drivers/infiniband/hw/mlx5/main.c:3643 mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_tad
 CPU: 28 PID: 2078 Comm: reboot Tainted: G               X --------- ---  5.13.0-0.rc2.19.el9.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
 RIP: 0010:mlx5_ib_stage_init_cleanup+0x4d/0x60 [mlx5_ib]
 Code: 8d bb 70 1f 00 00 be 00 01 00 00 e8 9d 94 ce da 48 3d 00 01 00 00 75 02 5b c3 0f 0b 5b c3 0f 0b 48 83 bb b0 20 00 00 00 74 d5 <0f> 0b eb d1 4
 RSP: 0018:ffffa8db06d33c90 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffff97f890a44000 RCX: ffff97f900ec0160
 RDX: 0000000000000000 RSI: 0000000080080001 RDI: ffff97f890a44000
 RBP: ffffffffc0c189b8 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000300 R12: ffff97f890a44000
 R13: ffffffffc0c36030 R14: 00000000fee1dead R15: 0000000000000000
 FS:  00007f0d5a8a3b40(0000) GS:ffff98077fb80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000555acbf4f450 CR3: 00000002a6f56002 CR4: 00000000001706e0
 Call Trace:
  mlx5r_remove+0x39/0x60 [mlx5_ib]
  auxiliary_bus_remove+0x1b/0x30
  __device_release_driver+0x17a/0x230
  device_release_driver+0x24/0x30
  bus_remove_device+0xdb/0x140
  device_del+0x18b/0x3e0
  mlx5_detach_device+0x59/0x90 [mlx5_core]
  mlx5_unload_one+0x22/0x60 [mlx5_core]
  shutdown+0x31/0x3a [mlx5_core]
  pci_device_shutdown+0x34/0x60
  device_shutdown+0x15b/0x1c0
  __do_sys_reboot.cold+0x2f/0x5b
  ? vfs_writev+0xc7/0x140
  ? handle_mm_fault+0xc5/0x290
  ? do_writev+0x6b/0x110
  do_syscall_64+0x40/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: e6fb246cca ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/f3f585ea0db59c2a78f94f65eedeafc5a2374993.1623309971.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:34 -03:00
Maor Gottlieb
2adcb4c5a5 RDMA: Verify port when creating flow rule
Validate port value provided by the user and with that remove no longer
needed validation by the driver.  The missing check in the mlx5_ib driver
could cause to the below oops.

Call trace:
  _create_flow_rule+0x2d4/0xf28 [mlx5_ib]
  mlx5_ib_create_flow+0x2d0/0x5b0 [mlx5_ib]
  ib_uverbs_ex_create_flow+0x4cc/0x624 [ib_uverbs]
  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xd4/0x150 [ib_uverbs]
  ib_uverbs_cmd_verbs.isra.7+0xb28/0xc50 [ib_uverbs]
  ib_uverbs_ioctl+0x158/0x1d0 [ib_uverbs]
  do_vfs_ioctl+0xd0/0xaf0
  ksys_ioctl+0x84/0xb4
  __arm64_sys_ioctl+0x28/0xc4
  el0_svc_common.constprop.3+0xa4/0x254
  el0_svc_handler+0x84/0xa0
  el0_svc+0x10/0x26c
 Code: b9401260 f9615681 51000400 8b001c20 (f9403c1a)

Fixes: 436f2ad05a ("IB/core: Export ib_create/destroy_flow through uverbs")
Link: https://lore.kernel.org/r/faad30dc5219a01727f47db3dc2f029d07c82c00.1623309971.git.leonro@nvidia.com
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-10 08:59:33 -03:00
Yevgeny Kliteynik
3f3f05ab88 net/mlx5: Added new parameters to reformat context
Adding new reformat context type (INSERT_HEADER) requires adding two new
parameters to reformat context - reformat_param_0 and reformat_param_1.
As defined by HW spec, these parameters have different meaning for
different reformat context type.

The first parameter (reformat_param_0) is not new to HW spec, but it
wasn't used by any of the supported reformats. The second parameter
(reformat_param_1) is new to the HW spec - it was added to allow
supporting INSERT_HEADER.

For NSERT_HEADER, reformat_param_0 indicates the header used to
reference the location of the inserted header, and reformat_param_1
indicates the offset of the inserted header from the reference point
defined by reformat_param_0.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-06-09 18:36:07 -07:00
Shiraz Saleem
6246f1ccb9 RDMA/irdma: Use list_last_entry/list_first_entry
Use list_last_entry and list_first_entry instead of using prev and next
pointers.

Link: https://lore.kernel.org/r/20210608211415.680-1-shiraz.saleem@intel.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 20:04:00 -03:00
Mark Bloch
edc0b0bccc RDMA/mlx5: Block FDB rules when not in switchdev mode
Allow creating FDB steering rules only when in switchdev mode.

The only software model where a userspace application can manipulate
FDB entries is when it manages the eswitch. This is only possible in
switchdev mode where we expose a single RDMA device with representors
for all the vports that are connected to the eswitch.

Fixes: 52438be441 ("RDMA/mlx5: Allow inserting a steering rule to the FDB")
Link: https://lore.kernel.org/r/e928ae7c58d07f104716a2a8d730963d1bd01204.1623052923.git.leonro@nvidia.com
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 17:02:15 -03:00
Baokun Li
ac477efcf0 RDMA/irdma: Use list_move instead of list_del/list_add
Using list_move() instead of list_del() + list_add().

Link: https://lore.kernel.org/r/20210608031041.2820429-1-libaokun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 16:55:41 -03:00
Weihang Li
ec9bf373f2 RDMA/core: Use refcount_t instead of atomic_t on refcount of ib_uverbs_device
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-8-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 15:04:36 -03:00
Weihang Li
a5e27fb68f RDMA/ipoib: Use refcount_t instead of atomic_t for reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-13-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 15:00:04 -03:00
Weihang Li
7183451f84 RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-12-git-send-email-liweihang@huawei.com
Cc: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:59:05 -03:00
Weihang Li
8f9513d89f RDMA/hns: Use refcount_t instead of atomic_t for QP reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-11-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:58:51 -03:00
Weihang Li
33649cd3f9 RDMA/hns: Use refcount_t instead of atomic_t for SRQ reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-10-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:58:30 -03:00
Weihang Li
cc9e5a844a RDMA/hns: Use refcount_t instead of atomic_t for CQ reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-9-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:56:11 -03:00
Weihang Li
58d33b4fa9 RDMA/core: Use refcount_t instead of atomic_t on refcount of mcast_port
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-6-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:45:07 -03:00
Weihang Li
cd74db6cbb RDMA/core: Use refcount_t instead of atomic_t on refcount of mcast_member
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-5-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:43:56 -03:00
Jason Gunthorpe
6448508076 RDMA/core: Remove refcount from struct ib_mad_snoop_private
The member is never used, delete it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:43:28 -03:00
Weihang Li
fcc57a7b2b RDMA/core: Use refcount_t instead of atomic_t on refcount of iwpm_admin_data
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks. Increase refcount_t from 0 to 1 is
regarded as there is a risk about use-after-free. So it should be set to 1
directly during initialization.

Link: https://lore.kernel.org/r/1622194663-2383-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:36:24 -03:00
Weihang Li
60dff56d77 RDMA/core: Use refcount_t instead of atomic_t on refcount of iwcm_id_private
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 14:35:44 -03:00
Kamal Heib
61c7d826b8 RDMA/irdma: Fix return error sign from irdma_modify_qp
There is a typo in the returned error code sign from irdma_modify_qp()
when the attr_mask is not supported - Fix it.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210607221543.254144-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-07 20:21:10 -03:00
Colin Ian King
205be5dc99 RDMA/irdma: Fix spelling mistake "Allocal" -> "Allocate"
There is a spelling mistake in a literal string. Fix it.

Link: https://lore.kernel.org/r/20210607113345.82206-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-07 20:20:32 -03:00
Colin Ian King
cd16b32c15 RDMA/irdma: remove redundant initialization of variable val
The variable val is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Link: https://lore.kernel.org/r/20210605131347.26293-1-colin.king@canonical.com
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-07 20:20:05 -03:00
Colin Ian King
1b01a42c9c RDMA/irdma: remove extraneous indentation on a statement
A single statement is indented one level too deeply, clean up the
code by removing the extraneous tab.

Link: https://lore.kernel.org/r/20210605130400.25987-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-07 20:19:31 -03:00
Colin Ian King
7fde2dbe03 RDMA/irdma: Fix issues with u8 left shift operation
The shifting of the u8 integer info->map[i] the left will be promoted
to a 32 bit signed int and then sign-extended to a u64. In the event
that the top bit of the u8 is set then all then all the upper 32 bits
of the u64 end up as also being set because of the sign-extension.
Fix this by casting the u8 values to a u64 before the left shift. This

Link: https://lore.kernel.org/r/20210605122059.25105-1-colin.king@canonical.com
Addresses-Coverity: ("Unitentional integer overflow / bad shift operation")
Fixes: 3f49d68425 ("RDMA/irdma: Implement HW Admin Queue OPs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-07 20:19:03 -03:00
Christian König
c43f2f9842 RDMA/umem: fix missing automated rename
This occasions was missed during the recent rename of the function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210607070658.11586-1-christian.koenig@amd.com
2021-06-07 15:41:50 +02:00
Jiapeng Chong
aeb27bb76a RDMA/cxgb4: Fix missing error code in create_qp()
The error code is missing in this code scenario so 0 will be returned. Add
the error code '-EINVAL' to the return value 'ret'.

Eliminates the follow smatch warning:

drivers/infiniband/hw/cxgb4/qp.c:298 create_qp() warn: missing error code 'ret'.

Link: https://lore.kernel.org/r/1622545669-20625-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-04 11:55:34 -03:00
Devesh Sharma
35f5ace5de RDMA/bnxt_re: Enable global atomic ops if platform supports
Enabling Atomic operations for Gen P5 devices if the underlying platform
supports global atomic ops.

Link: https://lore.kernel.org/r/20210603131534.982257-2-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 16:59:11 -03:00
Kamal Heib
32a25f2ea6 RDMA/rxe: Fix failure during driver load
To avoid the following failure when trying to load the rdma_rxe module
while IPv6 is disabled, add a check for EAFNOSUPPORT and ignore the
failure, also delete the needless debug print from rxe_setup_udp_tunnel().

$ modprobe rdma_rxe
modprobe: ERROR: could not insert 'rdma_rxe': Operation not permitted

Fixes: dfdd6158ca ("IB/rxe: Fix kernel panic in udp_setup_tunnel")
Link: https://lore.kernel.org/r/20210603090112.36341-1-kamalheib1@gmail.com
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 16:50:29 -03:00
Bob Pearson
5bcf5a59c4 RDMA/rxe: Protext kernel index from user space
In order to prevent user space from modifying the index that belongs to
the kernel for shared queues let the kernel use a local copy of the index
and copy any new values of that index to the shared rxe_queue_bus struct.

This adds more switch statements which decreases the performance of the
queue API. Move the type into the parameter list for these functions so
that the compiler can optimize out the switch statements when the explicit
type is known. Modify all the calls in the driver on performance paths to
pass in the explicit queue type.

Link: https://lore.kernel.org/r/20210527194748.662636-4-rpearsonhpe@gmail.com
Link: https://lore.kernel.org/linux-rdma/20210526165239.GP1002214@@nvidia.com/
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 15:53:01 -03:00
Bob Pearson
0a67c46d2e RDMA/rxe: Protect user space index loads/stores
Modify the queue APIs to protect all user space index loads with
smp_load_acquire() and all user space index stores with
smp_store_release(). Base this on the types of the queues which can be one
of ..KERNEL, ..FROM_USER, ..TO_USER. Kernel space indices are protected by
locks which also provide memory barriers.

Link: https://lore.kernel.org/r/20210527194748.662636-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 15:53:01 -03:00
Bob Pearson
59daff49f2 RDMA/rxe: Add a type flag to rxe_queue structs
To create optimal code only want to use smp_load_acquire() and
smp_store_release() for user indices in rxe_queue APIs since kernel
indices are protected by locks which also act as memory barriers. By
adding a type to the queues we can determine which indices need to be
protected.

Link: https://lore.kernel.org/r/20210527194748.662636-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 15:53:01 -03:00
Shay Drory
404e5a1269 RDMA/mlx4: Do not map the core_clock page to user space unless enabled
Currently when mlx4 maps the hca_core_clock page to the user space there
are read-modifiable registers, one of which is semaphore, on this page as
well as the clock counter. If user reads the wrong offset, it can modify
the semaphore and hang the device.

Do not map the hca_core_clock page to the user space unless the device has
been put in a backwards compatibility mode to support this feature.

After this patch, mlx4 core_clock won't be mapped to user space on the
majority of existing devices and the uverbs device time feature in
ibv_query_rt_values_ex() will be disabled.

Fixes: 52033cfb5a ("IB/mlx4: Add mmap call to map the hardware clock")
Link: https://lore.kernel.org/r/9632304e0d6790af84b3b706d8c18732bc0d5e27.1622726305.git.leonro@nvidia.com
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 14:19:53 -03:00
Mark Zhang
a0ffb4c12f RDMA/mlx5: Use different doorbell memory for different processes
In a fork scenario, the parent and child can have same virtual address and
also share the uverbs fd.  That causes to the list_for_each_entry search
return same doorbell physical page for all processes, even though that
page has been COW' or copied.

This patch takes the mm_struct into consideration during search, to make
sure that VA's belonging to different processes are not intermixed.

Resolves the malfunction of uverbs after fork in some specific cases.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/feacc23fe0bc6e1088c6824d5583798745e72405.1622726212.git.leonro@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 14:19:53 -03:00
Jason Gunthorpe
50971e3915 Merge branch 'irdma' into rdma.git for-next
Shiraz Saleem says:

====================
Add Intel Ethernet Protocol Driver for RDMA (irdma)

The following patch series introduces a unified Intel Ethernet Protocol
Driver for RDMA (irdma) for the X722 iWARP device and a new E810 device
which supports iWARP and RoCEv2. The irdma module replaces the legacy
i40iw module for X722 and extends the ABI already defined for i40iw. It is
backward compatible with legacy X722 rdma-core provider (libi40iw).

X722 and E810 are PCI network devices that are RDMA capable. The RDMA
block of this parent device is represented via an auxiliary device
exported to 'irdma' using the core auxiliary bus infrastructure recently
added for 5.11 kernel.  The parent PCI netdev drivers 'i40e' and 'ice'
register auxiliary RDMA devices with private data/ops encapsulated that
bind to auxiliary drivers registered in irdma module.

Currently, default is RoCEv2 for E810. Runtime support for protocol switch
to iWARP will be made available via devlink in a future patch.
====================

Link: https://lore.kernel.org/r/20210602205138.889-1-shiraz.saleem@intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* branch 'irdma':
  RDMA/irdma: Update MAINTAINERS file
  RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
  RDMA/irdma: Add ABI definitions
  RDMA/irdma: Add dynamic tracing for CM
  RDMA/irdma: Add miscellaneous utility definitions
  RDMA/irdma: Add user/kernel shared libraries
  RDMA/irdma: Add RoCEv2 UD OP support
  RDMA/irdma: Implement device supported verb APIs
  RDMA/irdma: Add PBLE resource manager
  RDMA/irdma: Add connection manager
  RDMA/irdma: Add QoS definitions
  RDMA/irdma: Add privileged UDA queue implementation
  RDMA/irdma: Add HMC backing store setup functions
  RDMA/irdma: Implement HW Admin Queue OPs
  RDMA/irdma: Implement device initialization definitions
  RDMA/irdma: Register auxiliary driver and implement private channel OPs
  i40e: Register auxiliary devices to provide RDMA
  i40e: Prep i40e header for aux bus conversion
  ice: Register auxiliary device to provide RDMA
  ice: Implement iidc operations
  ice: Initialize RDMA support
  iidc: Introduce iidc.h
  i40e: Replace one-element array with flexible-array member
2021-06-02 20:07:59 -03:00
Shiraz Saleem
fa0cf568fd RDMA/irdma: Add irdma Kconfig/Makefile and remove i40iw
Add Kconfig and Makefile to build irdma driver.

Remove i40iw driver and add an alias in irdma.

Remove legacy exported symbols i40e_register_client
and i40e_unregister_client from i40e as they are no
longer used.

irdma is the replacement driver that supports X722.

Link: https://lore.kernel.org/r/20210602205138.889-16-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 20:06:36 -03:00
Michael J. Ruhl
ddae5d62f3 RDMA/irdma: Add dynamic tracing for CM
Add dynamic tracing functionality to debug connection
management issues.

Link: https://lore.kernel.org/r/20210602205138.889-14-shiraz.saleem@intel.com
Signed-off-by: "Michael J. Ruhl" <michael.j.ruhl@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:19 -03:00
Mustafa Ismail
915cc7ac0f RDMA/irdma: Add miscellaneous utility definitions
Add miscellaneous utility functions and headers.

Link: https://lore.kernel.org/r/20210602205138.889-13-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:19 -03:00
Mustafa Ismail
551c46edc7 RDMA/irdma: Add user/kernel shared libraries
Building the WQE descriptors for different verb
operations are similar in kernel and user-space.
Add these shared libraries.

Link: https://lore.kernel.org/r/20210602205138.889-12-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00
Mustafa Ismail
dd90451fac RDMA/irdma: Add RoCEv2 UD OP support
Add the header, data structures and functions
to populate the WQE descriptors and issue the
Control QP commands that support RoCEv2 UD operations.

Link: https://lore.kernel.org/r/20210602205138.889-11-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00
Mustafa Ismail
b48c24c2d7 RDMA/irdma: Implement device supported verb APIs
Implement device supported verb APIs. The supported APIs
vary based on the underlying transport the ibdev is
registered as (i.e. iWARP or RoCEv2).

Link: https://lore.kernel.org/r/20210602205138.889-10-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00
Mustafa Ismail
e8c4dbc2fc RDMA/irdma: Add PBLE resource manager
Implement a Physical Buffer List Entry (PBLE) resource manager
to manage a pool of PBLE HMC resource objects.

Link: https://lore.kernel.org/r/20210602205138.889-9-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00
Mustafa Ismail
146b9756f1 RDMA/irdma: Add connection manager
Add connection management (CM) implementation for
iWARP including accept, reject, connect, create_listen,
destroy_listen and CM utility functions

Link: https://lore.kernel.org/r/20210602205138.889-8-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:18 -03:00
Mustafa Ismail
3ae331c751 RDMA/irdma: Add QoS definitions
Add definitions for managing the RDMA HW work scheduler (WS) tree.

A WS node is created via a control QP operation with the bandwidth
allocation, arbitration scheme, and traffic class of the QP specified.
The Qset handle returned associates the QoS parameters for the QP.
The Qset is registered with the LAN and an equivalent node is created
in the LAN packet scheduler tree.

Link: https://lore.kernel.org/r/20210602205138.889-7-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:17 -03:00
Mustafa Ismail
a3a06db504 RDMA/irdma: Add privileged UDA queue implementation
Implement privileged UDA queues to handle iWARP connection
packets and receive exceptions.

Link: https://lore.kernel.org/r/20210602205138.889-6-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:17 -03:00
Mustafa Ismail
d1850c005a RDMA/irdma: Add HMC backing store setup functions
HW uses host memory as a backing store for a number of
protocol context objects and queue state tracking.
The Host Memory Cache (HMC) is a component responsible for
managing these objects stored in host memory.

Add the functions and data structures to manage the allocation
of backing pages used by the HMC for the various objects

Link: https://lore.kernel.org/r/20210602205138.889-5-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:17 -03:00
Mustafa Ismail
3f49d68425 RDMA/irdma: Implement HW Admin Queue OPs
The driver posts privileged commands to the HW
Admin Queue (Control QP or CQP) to request administrative
actions from the HW. Implement create/destroy of CQP
and the supporting functions, data structures and headers
to handle the different CQP commands

Link: https://lore.kernel.org/r/20210602205138.889-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:17 -03:00
Mustafa Ismail
44d9e52977 RDMA/irdma: Implement device initialization definitions
Implement device initialization routines, interrupt set-up,
and allocate object bit-map tracking structures.
Also, add device specific attributes and register definitions.

Link: https://lore.kernel.org/r/20210602205138.889-3-shiraz.saleem@intel.com
[flexible array transformation]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:16 -03:00
Mustafa Ismail
8498a30e1b RDMA/irdma: Register auxiliary driver and implement private channel OPs
Register auxiliary drivers which can attach to auxiliary RDMA
devices from Intel PCI netdev drivers i40e and ice. Implement the private
channel ops, and register net notifiers.

Link: https://lore.kernel.org/r/20210602205138.889-2-shiraz.saleem@intel.com
[flexible array transformation]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 19:55:16 -03:00
Mark Zhang
76039ac909 IB/cm: Protect cm_dev, cm_ports and mad_agent with kref and lock
During cm_dev deregistration in cm_remove_one(), the cm_device and
cm_ports will be freed, after that they should not be accessed. The
mad_agent needs to be protected as well.

This patch adds a cm_device kref to protect cm_dev and cm_ports, and a
mad_agent_lock spinlock to protect mad_agent.

Link: https://lore.kernel.org/r/501ba7a2ff203dccd0e6755d3f93329772adce52.1622629024.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 15:41:58 -03:00
Mark Zhang
7345201c39 IB/cm: Improve the calling of cm_init_av_for_lap and cm_init_av_by_path
The cm_init_av_for_lap() and cm_init_av_by_path() function calls have the
following issues:

1. Both of them might sleep and should not be called under spinlock.
2. The access of cm_id_priv->av should be under cm_id_priv->lock, which
   means it can't be initialized directly.

This patch splits the calling of 2 functions into two parts: first one
initializes an AV outside of the spinlock, the second one copies AV to
cm_id_priv->av under spinlock.

Fixes: e1444b5a16 ("IB/cm: Fix automatic path migration support")
Link: https://lore.kernel.org/r/038fb8ad932869b4548b0c7708cab7f76af06f18.1622629024.git.leonro@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 15:41:58 -03:00