linux

Author	SHA1	Message	Date
Jason Gunthorpe	a6f844da39	Merge tag 'v5.18' into rdma.git for-next Following patches have dependencies. Resolve the merge conflict in drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names for the fs functions following linux-next: https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-24 12:40:28 -03:00
Bob Pearson	4e05a4b329	RDMA/rxe: Check rxe_get() return value In the tasklets (completer, responder, and requester) check the return value from rxe_get() to detect failures to get a reference. This only occurs if the qp has had its reference count drop to zero which indicates that it no longer should be used. The ref is never 0 today because the tasklets are flushed before the ref is dropped. The next patch changes this so that the ref is dropped then the tasklets are flushed. Link: https://lore.kernel.org/r/20220421014042.26985-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	570a4bf744	RDMA/rxe: Recheck the MR in when generating a READ reply The rping benchmark fails on long runs. The root cause of this failure has been traced to a failure to compute a nonzero value of mr in rare situations. Fix this failure by correctly handling the computation of mr in read_reply() in rxe_resp.c in the replay flow. Fixes: `8a1a0be894` ("RDMA/rxe: Replace mr by rkey in responder resources") Link: https://lore.kernel.org/r/20220418174103.3040-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-20 11:21:24 -03:00
Bob Pearson	290c4a902b	RDMA/rxe: Fix "Replace mr by rkey in responder resources" The referenced commit generates a reference counting error if the rkey has the same index but the wrong key. In this case the reference taken by rxe_pool_get_index() is not dropped. Drop the reference if the keys don't match in rxe_recheck_mr(). Check that the mw and mr are still valid. Fixes: `8a1a0be894` ("RDMA/rxe: Replace mr by rkey in responder resources") Link: https://lore.kernel.org/r/20220411030647.20011-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-12 11:17:52 -03:00
Bob Pearson	98c8026331	RDMA/rxe: Remove reliable datagram support The rdma_rxe driver does not actually support the reliable datagram transport but contains two references to RD opcodes in driver code. This commit removes these references to RD transport opcodes which are never used. Link: https://lore.kernel.org/r/cce0f07d-25fc-5880-69e7-001d951750b7@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:38:50 -03:00
Bob Pearson	409baed5d7	RDMA/rxe: Remove support for SMI QPs from rdma_rxe Currently the rdma_rxe driver supports SMI type QPs in a few places which is incorrect. RoCE devices never should support SMI QPs. This commit removes SMI QP support from the driver. Link: https://lore.kernel.org/r/20220407185416.16372-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:38:33 -03:00
Bob Pearson	3197706abd	RDMA/rxe: Use standard names for ref counting Rename rxe_add_ref() to rxe_get() and rxe_drop_ref() to rxe_put(). Significantly improves readability for new readers. Link: https://lore.kernel.org/r/20220304000808.225811-10-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-16 10:34:42 -03:00
Bob Pearson	8a1a0be894	RDMA/rxe: Replace mr by rkey in responder resources Currently rxe saves a copy of MR in responder resources for RDMA reads. Since the responder resources are never freed just over written if more are needed this MR may not have a reference freed until the QP is destroyed. This patch uses the rkey instead of the MR and on subsequent packets of a multipacket read reply message it looks up the MR from the rkey for each packet. This makes it possible for a user to deregister an MR or unbind a MW on the fly and get correct behaviour. Link: https://lore.kernel.org/r/20220304000808.225811-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:56 -03:00
Bob Pearson	63221acb0c	RDMA/rxe: Fix ref error in rxe_av.c The commit referenced below can take a reference to the AH which is never dropped. This only happens in the UD request path. This patch optionally passes that AH back to the caller so that it can hold the reference while the AV is being accessed and then drop it. Code to do this is added to rxe_req.c. The AV is also passed to rxe_prepare in rxe_net.c as an optimization. Fixes: `e2fe06c908` ("RDMA/rxe: Lookup kernel AH from ah index in UD WQEs") Link: https://lore.kernel.org/r/20220304000808.225811-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:56 -03:00
Bob Pearson	a099b08599	RDMA/rxe: Revert changes from irqsave to bh locks A previous patch replaced all irqsave locks in rxe with bh locks. This ran into problems because rdmacm has a bad habit of calling rdma verbs APIs while disabling irqs. This is not allowed during spin_unlock_bh() causing programs that use rdmacm to fail. This patch reverts the changes to locks that had this problem or got dragged into the same mess. After this patch blktests/check -q srp now runs correctly. Link: https://lore.kernel.org/r/20220215194448.44369-1-rpearsonhpe@gmail.com Fixes: `21adfa7a3c` ("RDMA/rxe: Replace irqsave locks with bh locks") Reported-by: Guoqing Jiang <guoqing.jiang@linux.dev> Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 11:51:28 -04:00
Xiao Yang	b1377cc37f	RDMA/rxe: Check the last packet by RXE_END_MASK It's wrong to check the last packet by RXE_COMP_MASK because the flag is to indicate if responder needs to generate a completion. Fixes: `9fcd67d177` ("IB/rxe: increment msn only when completing a request") Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20211229034438.1854908-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-08 11:54:38 -04:00
Xiao Yang	115fda3509	RDMA/rxe: Remove duplicate settings Remove duplicate settings for vendor_err and qp_num. Link: https://lore.kernel.org/r/20210930094813.226888-5-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-06 19:45:30 -03:00
Xiao Yang	45216d6363	RDMA/rxe: Add MASK suffix for RXE_READ_OR_ATOMIC and RXE_WRITE_OR_SEND To reflect the intention, since it is not just a single bit. Link: https://lore.kernel.org/r/20210914080253.1145353-3-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-28 11:42:24 -03:00
Xiao Yang	373efe0f30	RDMA/rxe: Add new RXE_READ_OR_WRITE_MASK 1) Replace (RXE_READ_MASK \| RXE_WRITE_MASK) with RXE_READ_OR_WRITE_MASK. 2) Change (RXE_READ_MASK \| RXE_WRITE_OR_SEND) to RXE_READ_OR_WRITE_MASK because we don't need to check RETH for RXE_SEND_MASK. Link: https://lore.kernel.org/r/20210914080253.1145353-2-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-28 11:42:24 -03:00
Bob Pearson	ae6e843fe0	RDMA/rxe: Add memory barriers to kernel queues Earlier patches added memory barriers to protect user space to kernel space communications. The user space queues were previously shown to have occasional memory synchonization errors which were removed by adding smp_load_acquire, smp_store_release barriers. This patch extends that to the case where queues are used between kernel space threads. This patch also extends the queue types to include kernel ULP queues which access the other end of the queues in kernel verbs calls like poll_cq and post_send/recv. Link: https://lore.kernel.org/r/20210914164206.19768-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:14:59 -03:00
Jason Gunthorpe	6a217437f9	Merge branch 'sg_nents' into rdma.git for-next From Maor Gottlieb ==================== Fix the use of nents and orig_nents in the sg table append helpers. The nents should be used by the DMA layer to store the number of DMA mapped sges, the orig_nents is the number of CPU sges. Since the sg append logic doesn't always create a SGL with exactly orig_nents entries store a total_nents as well to allow the table to be properly free'd and reorganize the freeing logic to share across all the use cases. ==================== Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * 'sg_nents': RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append	2021-08-30 09:49:59 -03:00
Bob Pearson	e2a05339fa	RDMA/rxe: Use the correct size of wqe when processing SRQ The memcpy() that copies a WQE from a SRQ the QP uses an incorrect size. The size should have been the size of the rxe_send_wqe struct not the size of a pointer to it. The result is that IO operations using a SRQ on the responder side will fail. Fixes: `ec0fa2445c` ("RDMA/rxe: Fix over copying in get_srq_wqe") Link: https://lore.kernel.org/r/20210729220039.18549-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-02 12:45:22 -03:00
Bob Pearson	1117f26ea7	RDMA/rxe: Move ICRC generation to a subroutine Isolate ICRC generation into a single subroutine named rxe_generate_icrc() in rxe_icrc.c. Remove scattered crc generation code from elsewhere. Link: https://lore.kernel.org/r/20210707040040.15434-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Dan Carpenter	36941dfe0e	RDMA/rxe: Missing unlock on error in get_srq_wqe() This error path needs to unlock before returning. Fixes: `ec0fa2445c` ("RDMA/rxe: Fix over copying in get_srq_wqe") Link: https://lore.kernel.org/r/YNXUCmnPsSkPyhkm@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Majd Dibbiny <majd@nvidia.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-25 12:00:28 -03:00
Bob Pearson	3896bde92d	RDMA/rxe: Fix extra copy in prepare_ack_packet Currently prepare_ack_packet writes almost all the fields of the BTH in the ack packet twice. Replace code with the subroutine init_bth(). Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210618045742.204195-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:53 -03:00
Bob Pearson	ec0fa2445c	RDMA/rxe: Fix over copying in get_srq_wqe Currently get_srq_wqe() in rxe_resp.c copies the maximum possible number of bytes from the wqe into the QPs copy of the SRQ wqe. This is usually extra work and risks reading past the end of the SRQ circular buffer if the SRQ is configured with less than the maximum possible number of SGEs. Check the number of SGEs is not too large. Compute the actual number of bytes in the WR and copy only those. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210618045742.204195-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	1993cbed65	RDMA/rxe: Fix extra copies in build_rdma_network_hdr build_rdma_network_hdr() in rxe_resp.c does more copying than is needed. Remove this subroutine and eliminate the extra copies for IPV6 and reduce the extra copying for IPV4. Fixes: `e404f945a6` ("IB/rxe: improved debug prints & code cleanup") Link: https://lore.kernel.org/r/20210618045742.204195-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	fceb24a73e	RDMA/rxe: Fix useless copy in send_atomic_ack In send_atomic_ack() in rxe_resp.c there is code copying ack_pkt into the skb->cb[]. This doesn't do anything useful because the cb[] is not used in the transmit path by the rxe driver. Remove this code. Fixes: `4c93496f18` ("IB/rxe: do not copy extra stack memory to skb") Link: https://lore.kernel.org/r/20210618045742.204195-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	cdd0b85675	RDMA/rxe: Implement memory access through MWs Add code to implement memory access through memory windows. Link: https://lore.kernel.org/r/20210608042552.33275-10-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	3902b429ca	RDMA/rxe: Implement invalidate MW operations Implement invalidate MW and cleaned up invalidate MR operations. Added code to perform remote invalidate for send with invalidate. Added code to perform local invalidation. Deleted some blank lines in rxe_loc.h. Link: https://lore.kernel.org/r/20210608042552.33275-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	15ae1375ea	RDMA/rxe: Fix qp reference counting for atomic ops Currently the rdma_rxe driver attempts to protect atomic responder resources by taking a reference to the qp which is only freed when the resource is recycled for a new read or atomic operation. This means that in normal circumstances there is almost always an extra qp reference once an atomic operation has been executed which prevents cleaning up the qp and associated pd and cqs when the qp is destroyed. This patch removes the call to rxe_add_ref() in send_atomic_ack() and the call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is destroyed while a peer is retrying an atomic op it will cause the operation to fail which is acceptable. Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com> Fixes: `86af617641` ("IB/rxe: remove unnecessary skb_clone") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:20:23 -03:00
Bob Pearson	5bcf5a59c4	RDMA/rxe: Protext kernel index from user space In order to prevent user space from modifying the index that belongs to the kernel for shared queues let the kernel use a local copy of the index and copy any new values of that index to the shared rxe_queue_bus struct. This adds more switch statements which decreases the performance of the queue API. Move the type into the parameter list for these functions so that the compiler can optimize out the switch statements when the explicit type is known. Modify all the calls in the driver on performance paths to pass in the explicit queue type. Link: https://lore.kernel.org/r/20210527194748.662636-4-rpearsonhpe@gmail.com Link: https://lore.kernel.org/linux-rdma/20210526165239.GP1002214@@nvidia.com/ Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-03 15:53:01 -03:00
Bob Pearson	ea49225189	RDMA/rxe: Fix missing acks from responder All responder errors from request packets that do not consume a receive WQE fail to generate acks for RC QPs. This patch corrects this behavior by making the flow follow the same path as request packets that do consume a WQE after the completion. Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/ Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-08 15:59:28 -03:00
Bob Pearson	364e282c4f	RDMA/rxe: Split MEM into MR and MW In the original rxe implementation it was intended to use a common object to represent MRs and MWs but they are different enough to separate these into two objects. This allows replacing the mem name with mr for MRs which is more consistent with the style for the other objects and less likely to be confusing. This is a long patch that mostly changes mem to mr where it makes sense and adds a new rxe_mw struct. Link: https://lore.kernel.org/r/20210325212425.2792-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-30 17:11:30 -03:00
Jason Gunthorpe	7289e26f39	Merge tag 'v5.11' into rdma.git for-next Linux 5.11 Merged to resolve conflicts with RDMA rc commits - drivers/infiniband/sw/rxe/rxe_net.c The final logic is to call rxe_get_dev_from_net() again with the master netdev if the packet was rx'd on a vlan. To keep the elimination of the local variables requires a trivial edit to the code in -rc Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-18 11:19:29 -04:00
Bob Pearson	bf139b58af	RDMA/rxe: Remove unused pkt->offset The pkt->offset field is never used except to assign it to 0. But it adds lots of unneeded code. This patch removes the field and related code. This causes a measurable improvement in performance. Link: https://lore.kernel.org/r/20210211210455.3274-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-16 14:42:59 -04:00
Bob Pearson	899aba891c	RDMA/rxe: Fix FIXME in rxe_udp_encap_recv() rxe_udp_encap_recv() drops the reference to rxe->ib_dev taken by rxe_get_dev_from_net() which should be held until each received skb is freed. This patch moves the calls to ib_device_put() to each place a received skb is freed. It also takes references to the ib_device for each cloned skb created to process received multicast packets. Fixes: `4c173f596b` ("RDMA/rxe: Use ib_device_get_by_netdev() instead of open coding") Link: https://lore.kernel.org/r/20210128233318.2591-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-08 15:33:51 -04:00
Martin Wilck	f1b0a8ea9f	Revert "RDMA/rxe: Remove VLAN code leftovers from RXE" This reverts commit `b2d2440430`. It's true that creating rxe on top of 802.1q interfaces doesn't work. Thus, commit `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") was absolutely correct. But `b2d2440430` was incorrect assuming that with this change, RDMA and VLAN don't work togehter at all. It just has to be set up differently. Rather than creating rxe on top of the VLAN interface, rxe must be created on top of the physical interface. RDMA then works just fine through VLAN interfaces on top of that physical interface, via the "upper device" logic. This is hard to see in the rxe logic because it never talks about vlan, but instead rxe carefully selects upper vlan netdevices when working with packets which in turn imply certain vlan tagging. This is all done correctly and interacts with the gid table with VLAN support the same as real HW does. `b2d2440430` broke this setup deliberately and should thus be reverted. Also, `b2d2440430` removed rxe_dma_device(), so adapt the revert to discard that hunk. Fixes: `b2d2440430` ("RDMA/rxe: Remove VLAN code leftovers from RXE") Link: https://lore.kernel.org/r/20210120161913.7347-1-mwilck@suse.com Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-20 13:29:28 -04:00
Zhu Yanjun	b2d2440430	RDMA/rxe: Remove VLAN code leftovers from RXE Since the commit `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") does not permit rxe on top of vlan device, all the stuff related with vlan should be removed. Fixes: `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") Link: https://lore.kernel.org/r/1604326422-18625-1-git-send-email-yanjunz@nvidia.com Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-12 11:38:27 -04:00
Bob Pearson	63fa15dbd4	RDMA/rxe: Add SPDX hdrs to rxe source files Add SPDX headers to all rxe .c and .h files. Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:20:02 -03:00
Steve Wise	2030abddec	rxe: correctly calculate iCRC for unaligned payloads If RoCE PDUs being sent or received contain pad bytes, then the iCRC is miscalculated, resulting in PDUs being emitted by RXE with an incorrect iCRC, as well as ingress PDUs being dropped due to erroneously detecting a bad iCRC in the PDU. The fix is to include the pad bytes, if any, in iCRC computations. Note: This bug has caused broken on-the-wire compatibility with actual hardware RoCE devices since the soft-RoCE driver was first put into the mainstream kernel. Fixing it will create an incompatibility with the original soft-RoCE devices, but is necessary to be compatible with real hardware devices. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Steve Wise <larrystevenwise@gmail.com> Link: https://lore.kernel.org/r/20191203020319.15036-2-larrystevenwise@gmail.com Signed-off-by: Doug Ledford <dledford@redhat.com>	2019-12-09 13:55:26 -05:00
Konstantin Taranov	bdce129049	RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM Calculate the correct byte_len on the receiving side when a work completion is generated with IB_WC_RECV_RDMA_WITH_IMM opcode. According to the IBA byte_len must indicate the number of written bytes, whereas it was always equal to zero for the IB_WC_RECV_RDMA_WITH_IMM opcode, even though data was transferred. Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Konstantin Taranov <konstantin.taranov@inf.ethz.ch> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-07-08 16:40:15 -03:00
Zhu Yanjun	9802c335e7	IB/rxe: Remove unnecessary rxe variable The variable rxe in the function is not used. So it is removed. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2019-01-21 16:46:08 -07:00
Zhu Yanjun	a854b1e890	IB/rxe: move the variable into the function that uses it The variable rxe is only used in the function rxe_xmit_packet, and the caller functions do not use it. So move this variable into the function rxe_xmit_packet. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-11-08 14:22:54 -07:00
Andrew Boyer	6e5559b275	RDMA/rxe: Add link_down, rdma_sends, rdma_recvs stats counters link_down is self-explanatory. rdma_sends and rdma_recvs count the number of RDMA Send and RDMA Receive operations completed successfully. This is different from the existing sent_pkts and rcvd_pkts counters because the existing counters measure packets, not RDMA operations. ack_deffered is renamed to ack_deferred to fix the spelling. out_of_sequence is renamed to out_of_seq_request to make clear that it is counting only requests and not other packets which can be out of sequence. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-11-08 14:22:54 -07:00
Sagi Grimberg	e48d8ed9c6	rxe: fix error completion wr_id and qp_num Error completions must still contain a valid wr_id and qp_num such that the consumer can rely on. Correctly fill these fields in receive error completions. Reported-by: Walker Benjamin <benjamin.walker@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Tested-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-11-06 16:25:04 -05:00
Zhu Yanjun	4e588c8d03	IB/rxe: clean skb queue directly When resp is in error state, the queued SKBs will not be handled. The function get_req cleans up the skb queue directly. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-11-06 16:03:14 -05:00
Vijay Immanuel	b97db58557	IB/rxe: fix for duplicate request processing and ack psns Don't reset the resp opcode for a replayed read response. The resp opcode could be in the middle of a write or send sequence, when the duplicate read request was received. An example sequence is as follows: - Receive read request for 12KB PSN 20. Transmit read response first, middle and last with PSNs 20,21,22. - Receive write first PSN 23. At this point the resp psn is 24 and resp opcode is write first. - The sender notices that PSN 20 is dropped and retransmits. Receive read request for 12KB PSN 20. Transmit read response first, middle and last with PSNs 20,21,22. The resp opcode is set to -1, the resp psn remains 24. - Receive write first PSN 23. This is processed by duplicate_request(). The resp opcode remains -1 and resp psn remains 24. - Receive write middle PSN 24. check_op_seq() reports a missing first error since the resp opcode is -1. When sending an ack for a duplicate send or write request, use the psn of the previous ack sent. Do not use the psn of a read response for the ack. An example sequence is as follows: - Receive write PSN 30. Transmit ACK for PSN 30. - Receive read request 4KB PSN 31. Transmit read response with PSN 31. The resp psn is now 32. - The sender notices that PSN 30 is dropped and retransmits. Receive write PSN 30. duplicate_request() sends an ACK with PSN 31. That is incorrect since PSN 31 was a read request. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-08-30 17:22:12 -04:00
Parav Pandit	3db2bceb29	IB/rxe: Simplify rxe_find_route() to avoid GID query for netdev rxe_prepare() is called on an skb which has ndev already initialized by rxe_init_packet(). Therefore avoid querying the GID attribute again and use the available netdevice from the skb->dev. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-08-30 16:31:50 -04:00
Vijay Immanuel	92cf36eec2	IB/rxe: support for 802.1q VLAN on the listener Set the vlan flag and vlan_id field in the wc for rdma_listen() to work over VLAN. This is required by ib_init_ah_attr_from_wc() which is called by the CM REQ handler. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2018-06-18 13:16:04 -06:00
Doug Ledford	f5e27a203f	Merge branch 'k.o/for-rc' into k.o/wip/dl-for-next Several items of conflict have arisen between the RDMA stack's for-rc branch and upcoming for-next work: `9fd4350ba8` ("IB/rxe: avoid double kfree_skb") directly conflicts with `2e47350789` ("IB/rxe: optimize the function duplicate_request") Patches already submitted by Intel for the hfi1 driver will fail to apply cleanly without this merge Other people on the mailing list have notified that their upcoming patches also fail to apply cleanly without this merge Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-05-09 15:48:48 -04:00
Zhu Yanjun	9fd4350ba8	IB/rxe: avoid double kfree_skb When skb is sent, it will pass the following functions in soft roce. rxe_send [rdma_rxe] ip_local_out __ip_local_out ip_output ip_finish_output ip_finish_output2 dev_queue_xmit __dev_queue_xmit dev_hard_start_xmit In the above functions, if error occurs in the above functions or iptables rules drop skb after ip_local_out, kfree_skb will be called. So it is not necessary to call kfree_skb in soft roce module again. Or else crash will occur. The steps to reproduce: server client --------- --------- \|1.1.1.1\|<----rxe-channel--->\|1.1.1.2\| --------- --------- On server: rping -s -a 1.1.1.1 -v -C 10000 -S 512 On client: rping -c -a 1.1.1.1 -v -C 10000 -S 512 The kernel configs CONFIG_DEBUG_KMEMLEAK and CONFIG_DEBUG_OBJECTS are enabled on both server and client. When rping runs, run the following command in server: iptables -I OUTPUT -p udp --dport 4791 -j DROP Without this patch, crash will occur. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-27 14:20:47 -04:00
Zhu Yanjun	e12ee8ce51	IB/rxe: remove unused function variable In the functions rxe_mem_init_dma, rxe_mem_init_user, rxe_mem_init_fast and copy_data, the function variable rxe is not used. So this function variable rxe is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-27 12:18:29 -04:00
Zhu Yanjun	fe896ceb57	IB/rxe: replace refcount_inc with skb_get Follow the advice from Bart, the function refcount_inc is replaced with skb_get in commit `99dae69025` ("IB/rxe: optimize mcast recv process") and commit `86af617641` ("IB/rxe: remove unnecessary skb_clone"). CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Suggested-by: Bart Van Assche <Bart.VanAssche@wdc.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-19 13:58:16 -04:00
Zhu Yanjun	2e47350789	IB/rxe: optimize the function duplicate_request In the function duplicate_request, the reference of skb can be increased to replace the function skb_clone. This will make rxe performace better and save memory. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2018-04-19 13:58:16 -04:00

1 2

76 Commits