Commit Graph

260 Commits

Author SHA1 Message Date
Jason Gunthorpe
81ddb41f87 RDMA/cm: Allow ib_send_cm_rej() to be done under lock
The first thing ib_send_cm_rej() does is obtain the lock, so use the usual
unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

While here simplify some of the logic in the implementation.

Link: https://lore.kernel.org/r/20200310092545.251365-14-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:54 -03:00
Jason Gunthorpe
87cabf3e09 RDMA/cm: Allow ib_send_cm_drep() to be done under lock
The first thing ib_send_cm_drep() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-13-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe
e029fdc068 RDMA/cm: Allow ib_send_cm_dreq() to be done under lock
The first thing ib_send_cm_dreq() does is obtain the lock, so use the
usual unlocked wrapper, locked actor pattern here.

This avoids a sketchy lock/unlock sequence (which could allow state to
change) during cm_destroy_id().

Link: https://lore.kernel.org/r/20200310092545.251365-12-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe
00777a68ae RDMA/cm: Add some lockdep assertions for cm_id_priv->lock
These functions all touch state, so must be called under the lock.
Inspection shows this is currently true.

Link: https://lore.kernel.org/r/20200310092545.251365-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe
d1de9a8807 RDMA/cm: Add missing locking around id.state in cm_dup_req_handler
All accesses to id.state must be done under the spinlock.

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:53 -03:00
Jason Gunthorpe
c206f8bad1 RDMA/cm: Make it clearer how concurrency works in cm_req_handler()
ib_crate_cm_id() immediately places the id in the xarray, and publishes it
into the remote_id and remote_qpn rbtrees. This makes it visible to other
threads before it is fully set up.

It appears the thinking here was that the states IB_CM_IDLE and
IB_CM_REQ_RCVD do not allow any MAD handler or lookup in the remote_id and
remote_qpn rbtrees to advance.

However, cm_rej_handler() does take an action on IB_CM_REQ_RCVD, which is
not really expected by the design.

Make the whole thing clearer:
 - Keep the new cm_id out of the xarray until it is completely set up.
   This directly prevents MAD handlers and all rbtree lookups from seeing
   the pointer.
 - Move all the trivial setup right to the top so it is obviously done
   before any concurrency begins
 - Move the mutation of the cm_id_priv out of cm_match_id() and into the
   caller so the state transition is obvious
 - Place the manipulation of the work_list at the end, under lock, after
   the cm_id is placed in the xarray. The work_count cannot change on an
   ID outside the xarray.
 - Add some comments

Link: https://lore.kernel.org/r/20200310092545.251365-9-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:52 -03:00
Jason Gunthorpe
083bfdbfd5 RDMA/cm: Make it clear that there is no concurrency in cm_sidr_req_handler()
ib_create_cm_id() immediately places the id in the xarray, so it is visible
to network traffic.

The state is initially set to IB_CM_IDLE and all the MAD handlers will
test this state under lock and refuse to advance from IDLE, so adding to
the xarray is harmless.

Further, the set to IB_CM_SIDR_REQ_RCVD also excludes all MAD handlers.

However, the local_id isn't even used for SIDR mode, and there will be no
input MADs related to the newly created ID.

So, make the whole flow simpler so it can be understood:
 - Do not put the SIDR cm_id in the xarray. This directly shows that there
   is no concurrency
 - Delete the confusing work_count and pending_list manipulations. This
   mechanism is only used by MAD handlers and timewait, neither of which
   apply to SIDR.
 - Add a few comments and rename 'cur_cm_id_priv' to 'listen_cm_id_priv'
 - Move other loose sets up to immediately after cm_id creation so that
   the cm_id is fully configured right away. This fixes an oversight where
   the service_id will not be returned back on a IB_SIDR_UNSUPPORTED
   reject.

Link: https://lore.kernel.org/r/20200310092545.251365-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:52 -03:00
Jason Gunthorpe
153a2e432e RDMA/cm: Read id.state under lock when doing pr_debug()
The lock should not be dropped before doing the pr_debug() print as it is
accessing data protected by the lock, such as id.state.

Fixes: 119bf81793 ("IB/cm: Add debug prints to ib_cm")
Link: https://lore.kernel.org/r/20200310092545.251365-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:52 -03:00
Jason Gunthorpe
98f67156a8 RDMA/cm: Simplify establishing a listen cm_id
Any manipulation of cm_id->state must be done under the cm_id_priv->lock,
the two routines that added listens did not follow this rule, because they
never participate in any concurrent access around the state.

However, since this exception makes the code hard to understand, simplify
the flow so that it can be fully locked:
 - Move manipulation of listen_sharecount into cm_insert_listen() so it is
   trivially under the cm.lock without having to expose the cm.lock to the
   caller.
 - Push the cm.lock down into cm_insert_listen() and have the function
   increment the reference count before returning an existing pointer.
 - Split ib_cm_listen() into an cm_init_listen() and do not call
   ib_cm_listen() from ib_cm_insert_listen()
 - Make both ib_cm_listen() and ib_cm_insert_listen() directly call
   cm_insert_listen() under their cm_id_priv->lock which does both a
   collision detect and, if needed, the insert (atomically)
 - Enclose all state manipulation within the cm_id_priv->lock, notice this
   set can be done safely after cm_insert_listen() as no reader is allowed
   to read the state without holding the lock.
 - Do not set the listen cm_id in the xarray, as it is never correct to
   look it up. This makes the concurrency simpler to understand.

Many needless error unwinds are removed in the process.

Link: https://lore.kernel.org/r/20200310092545.251365-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:52 -03:00
Jason Gunthorpe
2305d6864a RDMA/cm: Make the destroy_id flow more robust
Too much of the destruction is very carefully sensitive to the state
and various other things. Move more code to the unconditional path and
add several WARN_ONs to check consistency.

Link: https://lore.kernel.org/r/20200310092545.251365-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:52 -03:00
Jason Gunthorpe
bede86a39d RDMA/cm: Remove a race freeing timewait_info
When creating a cm_id during REQ the id immediately becomes visible to the
other MAD handlers, and shortly after the state is moved to IB_CM_REQ_RCVD

This allows cm_rej_handler() to run concurrently and free the work:

        CPU 0                                CPU1
 cm_req_handler()
  ib_create_cm_id()
  cm_match_req()
    id_priv->state = IB_CM_REQ_RCVD
                                       cm_rej_handler()
                                         cm_acquire_id()
                                         spin_lock(&id_priv->lock)
                                         switch (id_priv->state)
  					   case IB_CM_REQ_RCVD:
                                            cm_reset_to_idle()
                                             kfree(id_priv->timewait_info);
   goto destroy
  destroy:
    kfree(id_priv->timewait_info);
                                             id_priv->timewait_info = NULL

Causing a double free or worse.

Do not free the timewait_info without also holding the
id_priv->lock. Simplify this entire flow by making the free unconditional
during cm_destroy_id() and removing the confusing special case error
unwind during creation of the timewait_info.

This also fixes a leak of the timewait if cm_destroy_id() is called in
IB_CM_ESTABLISHED with an XRC TGT QP. The state machine will be left in
ESTABLISHED while it needed to transition through IB_CM_TIMEWAIT to
release the timewait pointer.

Also fix a leak of the timewait_info if the caller mis-uses the API and
does ib_send_cm_reqs().

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:51 -03:00
Jason Gunthorpe
ca21cb7fb1 RDMA/cm: Fix checking for allowed duplicate listens
The test here typod the cm_id_priv to use, it used the one that was
freshly allocated. By definition the allocated one has the matching
cm_handler and zero context, so the condition was always true.

Instead check that the existing listening ID is compatible with the
proposed handler so that it can be shared, as was originally intended.

Fixes: 067b171b86 ("IB/cm: Share listening CM IDs")
Link: https://lore.kernel.org/r/20200310092545.251365-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:51 -03:00
Jason Gunthorpe
e8dc4e885c RDMA/cm: Fix ordering of xa_alloc_cyclic() in ib_create_cm_id()
xa_alloc_cyclic() is a SMP release to be paired with some later acquire
during xa_load() as part of cm_acquire_id().

As such, xa_alloc_cyclic() must be done after the cm_id is fully
initialized, in particular, it absolutely must be after the
refcount_set(), otherwise the refcount_inc() in cm_acquire_id() may not
see the set.

As there are several cases where a reader will be able to use the
id.local_id after cm_acquire_id in the IB_CM_IDLE state there needs to be
an unfortunate split into a NULL allocate and a finalizing xa_store.

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Link: https://lore.kernel.org/r/20200310092545.251365-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 17:05:51 -03:00
Leon Romanovsky
a4f994a059 RDMA/cm: Delete not implemented CM peer to peer communication
Peer to peer support was never implemented, so delete it to make code less
clutter.

Link: https://lore.kernel.org/r/20200310091438.248429-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-13 10:46:53 -03:00
Jason Gunthorpe
6f00a54c2c Linux 5.6-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5lkYceHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGpHQH/RJrzcaZHo4lw88m
 Jf7vBZ9DYUlRgqE0pxTHWmodNObKRqpwOUGflUcWbb/7GD2LQUfeqhSECVQyTID9
 N9y7FcPvx321Qhc3EkZ24DBYk0+DQ0K2FVUrSa/PxO0n7czxxXWaLRDmlSULEd3R
 D4pVs3zEWOBXJHUAvUQ5R+lKfkeWKNeeepeh+rezuhpdWFBRNz4Jjr5QUJ8od5xI
 sIwobYmESJqTRVBHqW8g2T2/yIsFJ78GCXs8DZLe1wxh40UbxdYDTA0NDDTHKzK6
 lxzBgcmKzuge+1OVmzxLouNWMnPcjFlVgXWVerpSy3/SIFFkzzUWeMbqm6hKuhOn
 wAlcIgI=
 =VQUc
 -----END PGP SIGNATURE-----

Merge tag 'v5.6-rc5' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-10 12:49:09 -03:00
Jason Gunthorpe
c14dfddbd8 RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()
The algorithm pre-allocates a cm_id since allocation cannot be done while
holding the cm.lock spinlock, however it doesn't free it on one error
path, leading to a memory leak.

Fixes: 067b171b86 ("IB/cm: Share listening CM IDs")
Link: https://lore.kernel.org/r/20200221152023.GA8680@ziepe.ca
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-27 16:43:29 -04:00
Gustavo A. R. Silva
5b361328ca RDMA: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more
2020-02-20 13:33:51 -04:00
Leon Romanovsky
ca750d4a9c RDMA/ucma: Mask QPN to be 24 bits according to IBTA
IBTA declares QPN as 24bits, mask input to ensure that kernel
doesn't get higher bits and ensure by adding WANR_ONCE() that
other CM users do the same.

Link: https://lore.kernel.org/r/20200212072635.682689-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-13 11:38:19 -04:00
Jason Gunthorpe
13e0af1801 RDMA/cm: Remove CM message structs
All accesses now use the new IBA acessor scheme, so delete the structs
entirely and generate the structures from the schema file.

Link: https://lore.kernel.org/r/20200116170037.30109-8-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:11:37 -04:00
Jason Gunthorpe
4ca662a30a RDMA/cm: Use IBA functions for complex structure members
Use a Coccinelle spatch to replace CM structure members used as
structures, arrays, or pointers with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression src;
expression len;
{struct} *msg;
@@
- memcpy(msg->{old_name}, src, len)
+ IBA_SET_MEM({new_name}, msg, src, len)
@@
{struct} *msg;
identifier x;
@@
- msg->{old_name}.x
+ IBA_GET_MEM_PTR({new_name}, msg)->x
@@
{struct} *msg;
@@
- &msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

For GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ *IBA_GET_MEM_PTR({new_name}, msg)

For non-GIDs:
@@
{struct} *msg;
@@
- msg->{old_name}
+ IBA_GET_MEM_PTR({new_name}, msg)

Iterated for every remaining IBA_CHECK_OFF()/IBA_CHECK_GET()
pairing. Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-7-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:06:01 -04:00
Jason Gunthorpe
91b60a7128 RDMA/cm: Use IBA functions for simple structure members
Use a Coccinelle spatch script to replace use of simple CM structure
members with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- msg->{old_name} = val
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- msg->{old_name}
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_OFF that isn't a CM_FIELD_MLOC.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh varients)

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-6-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:06:00 -04:00
Jason Gunthorpe
01adb7f46f RDMA/cm: Use IBA functions for swapping get/set acessors
Use a Coccinelle spatch script to replace CM helper functions that
return/accept BE values with IBA_GET/SET versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}(msg, val)
+ IBA_SET({new_name}, msg, be{bits}_to_cpu(val))
@@
{struct} *msg;
@@
- {old_getter}(msg)
+ cpu_to_be{bits}(IBA_GET({new_name}, msg))

Iterated for every IBA_CHECK_GET_BE()/IBA_CHECK_SET_BE() pairing.

And the below iterated over all byte sizes to remove doubled byte swaps:

@@
expression val;
@@
-be{bits}_to_cpu(cpu_to_be{bits}(val))
+val

(and __be_to_cpu and ntoh varients)

Touched up with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-5-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:06:00 -04:00
Jason Gunthorpe
b6bbee6889 RDMA/cm: Use IBA functions for simple get/set acessors
Use a Coccinelle spatch to replace CM helper functions with IBA_GET/SET
versions. Applied with

$ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c

The spatch file was generated using the template pattern:

@@
expression val;
{struct} *msg;
@@
- {old_setter}
+ IBA_SET({new_name}, msg, val)
@@
{struct} *msg;
@@
- {old_getter}
+ IBA_GET({new_name}, msg)

Iterated for every IBA_CHECK_GET()/IBA_CHECK_GET() pairing. Touched up
with clang-format after.

Link: https://lore.kernel.org/r/20200116170037.30109-4-jgg@ziepe.ca
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:05:59 -04:00
Leon Romanovsky
d05d4ac4c9 RDMA/cm: Add SET/GET implementations to hide IBA wire format
There is no separation between RDMA-CM wire format as it is declared in
IBTA and kernel logic which implements needed support. Such situation
causes to many mistakes in conversion between big-endian (wire format)
and CPU format used by kernel. It also mixes RDMA core code with
combination of uXX and beXX variables.

The idea that all accesses to IBA definitions will go through special
GET/SET macros to ensure that no conversion mistakes are made. The
shifting and masking required to read the value is automatically deduced
using the field offset description from the tables in the IBA
specification.

This starts with the CM MADs described in IBTA release 1.3 volume 1.

To confirm that the new macros behave the same as the old accessors a
self-test is included in this patch.

Each macro replacing a straightforward struct field compile-time tests
that the new field has the same offsetof() and width as the old field.

For the fields with accessor functions a runtime test, the 'all ones'
value is placed in a dummy message and read back in several ways to
confirm that both approaches give identical results.

Later patches in this series delete the self test.

This creates a tested table of new field name, old field name(s) and some
meta information like BE coding for the functions which will be used in
the next patches.

Link: https://lore.kernel.org/r/20200116170037.30109-3-jgg@ziepe.ca
Link: https://lore.kernel.org/r/20191212093830.316934-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25 15:05:59 -04:00
Leon Romanovsky
ad9efa05a0 RDMA/cm: Delete unused CM ARP functions
Clean the code by deleting ARP functions, which are not called anyway.

Link: https://lore.kernel.org/r/20191212093830.316934-46-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 21:07:21 -04:00
Leon Romanovsky
017d8eada8 RDMA/cm: Delete unused CM LAP functions
Clean the code by deleting LAP functions, which are not called anyway.

Link: https://lore.kernel.org/r/20191212093830.316934-43-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 21:06:21 -04:00
Danit Goldberg
4d6e8a033f RDMA/cm: Use RCU synchronization mechanism to protect cm_id_private xa_load()
The RCU mechanism is optimized for read-mostly scenarios and therefore
more suitable to protect the cm_id_private to decrease "cm.lock"
congestion.

This patch replaces the existing spinlock locking mechanism and kfree with
RCU mechanism in places where spinlock(cm.lock) protected xa_load
returning the cm_id_priv

In addition, delete the cm_get_id() function as there is no longer a
distinction if the caller already holds the cm_lock.

Remove an open coded version of cm_get_id().

Link: https://lore.kernel.org/r/20191219134750.413429-1-leon@kernel.org
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 16:29:44 -04:00
Danit Goldberg
0acc637dac RDMA/cm: Use refcount_t type for refcount variable
This atomic in struct cm_id_private is being used as a refcount, change it
to refcount_t for better clarity and to get the refcount protections.

Link: https://lore.kernel.org/r/1573997601-4502-1-git-send-email-danitg@mellanox.com
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-19 16:10:04 -04:00
Jason Gunthorpe
036313316d Linux 5.4-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl210Z8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGv+kIAKRpO7EuDokQL4qp
 hxEEaCMJA1T055EMlNU6FVAq/ZbmapzreUyNYiRMpPWKGTWNMkhIcZQfysYeGZz5
 y/KRxAiVxlcB+3v3yRmoZd/XoQmhgvJmqD4zhaGI2Utonow4f/SGSEFFZqqs9WND
 4HJROjZHgQ4JBxg9Z+QMo0FxbV/DCZpEOUq51N9WJywyyDRb18zotE83stpU+pE2
 fjqT7mk0NLrnYXuDRAbFC1Aau9ed4H6LlwLmxaqxq/Pt5Rz7wIKwKL9HIT4Dm/0a
 qpani6phhHWL7MwUpa2wkEonFCD03rJFl3DUVJo64Ijh4up5D/jyXQ+GKV2P4WKJ
 275Rb5Q=
 =WiZZ
 -----END PGP SIGNATURE-----

Merge tag 'v5.4-rc5' into rdma.git for-next

Linux 5.4-rc5

For dependencies in the next patches

Conflict resolved by keeping the delete of the unlock.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28 16:36:29 -03:00
Leon Romanovsky
24f5214923 RDMA/cm: Update copyright together with SPDX tag
Add Mellanox to lust of copyright holders and replace copyright
boilerplate with relevant SPDX tag.

Link: https://lore.kernel.org/r/20191020071559.9743-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28 10:15:11 -03:00
Leon Romanovsky
8d625101a7 RDMA/cm: Delete unused cm_is_active_peer function
Function cm_is_active_peer is not used, delete it.

Link: https://lore.kernel.org/r/20191020071559.9743-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28 10:15:10 -03:00
Parav Pandit
909624d8db IB/cm: Use container_of() instead of typecast
Use container_of() macro to get to timewait info structure instead of
typecasting.

Link: https://lore.kernel.org/r/20191002122517.17721-5-leon@kernel.org
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-04 15:38:16 -03:00
Jack Morgenstein
94635c36f3 RDMA/cm: Fix memory leak in cm_add/remove_one
In the process of moving the debug counters sysfs entries, the commit
mentioned below eliminated the cm_infiniband sysfs directory.

This sysfs directory was tied to the cm_port object allocated in procedure
cm_add_one().

Before the commit below, this cm_port object was freed via a call to
kobject_put(port->kobj) in procedure cm_remove_port_fs().

Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated.
This, however, meant that kfree was never called for the cm_port buffers.

Fix this by adding explicit kfree(port) calls to functions cm_add_one()
and cm_remove_one().

Note: the kfree call in the first chunk below (in the cm_add_one error
flow) fixes an old, undetected memory leak.

Fixes: c87e65cfb9 ("RDMA/cm: Move debug counters to be under relevant IB device")
Link: https://lore.kernel.org/r/20190916071154.20383-2-leon@kernel.org
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-04 14:58:31 -03:00
Parav Pandit
8f97486024 IB/cm: Reduce dependency on gid attribute ndev check
GID type to path record type conversion can be done directly based on port
type and gid attribute type.  There is no need to find out using indirect
way by its GID attribute's ndev field.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03 11:10:02 -03:00
Leon Romanovsky
c87e65cfb9 RDMA/cm: Move debug counters to be under relevant IB device
The sysfs layout is created by CM incorrectly presented RDMA devices with
InfiniBand link layer. Layout of such devices represents device tree of
connections. By moving CM statistics to be under relevant port of IB
device, we will fix the following issues:

 * Symlink name - It used device name instead of specific identifier.
 * Target location - It was supposed to point to PCI-ID/infiniband_cm/
   instead of PCI-ID/infiniband/
 * Target name - It created extra device file under already existing
   device folder, e.g. mlx5_0/mlx5_0
 * Crash during boot with RDMA persistent naming patches.

 sysfs: cannot create duplicate filename '/class/infiniband_cm/mlx5_0'
 CPU: 29 PID: 433 Comm: modprobe Not tainted 5.0.0-rc5+ #178
 Call Trace:
  dump_stack+0xcc/0x180
  sysfs_warn_dup.cold.3+0x17/0x2d
  sysfs_do_create_link_sd.isra.2+0xd0/0xf0
  device_add+0x7cb/0x1450
  device_create_groups_vargs+0x1ae/0x220
  device_create+0x93/0xc0
  cm_add_one+0x38f/0xf60 [ib_cm]
  add_client_context+0x167/0x210 [ib_core]
  enable_device_and_get+0x230/0x3f0 [ib_core]
  ib_register_device+0x823/0xbf0 [ib_core]
  __mlx5_ib_add+0x45/0x150 [mlx5_ib]
  mlx5_ib_add+0x1b3/0x5e0 [mlx5_ib]
  mlx5_add_device+0x130/0x3a0 [mlx5_core]
  mlx5_register_interface+0x1a9/0x270 [mlx5_core]
  do_one_initcall+0x14f/0x5de
  do_init_module+0x247/0x7c0
  load_module+0x4c2f/0x60d0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

After this change:
[leonro@server ~]$ ls -al /sys/class/infiniband/ibp0s12f0/ports/1/
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_rx_duplicates
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_rx_msgs
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_tx_msgs
drwxr-xr-x  2 root root    0 Mar 11 11:17 cm_tx_retries

Fixes: 110cf374a8 ("infiniband: make cm_device use a struct device and not a kobject.")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:24 -03:00
Leon Romanovsky
c7252a6532 RDMA/cm: Remove useless zeroing of static global variable
Static global variables are initialized to zero by C standard,
there is no need to zero them again.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-04 08:29:04 -03:00
Matthew Wilcox
ae78ff3a0f RDMA/cm: Convert local_id_table to XArray
Also introduce cm_local_id() to reduce the amount of boilerplate when
converting a local ID to an XArray index.

Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-26 11:44:22 -03:00
Gustavo A. R. Silva
b5c61b968d IB/cm: Use struct_size() in kmalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    void *entry[];
};

instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-07 11:43:03 -07:00
Gal Pressman
2553ba217e RDMA: Mark if destroy address handle is in a sleepable context
Introduce a 'flags' field to destroy address handle callback and add a
flag that marks whether the callback is executed in an atomic context or
not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:03 -07:00
Gal Pressman
b090c4e3a0 RDMA: Mark if create address handle is in a sleepable context
Introduce a 'flags' field to create address handle callback and add a flag
that marks whether the callback is executed in an atomic context or not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:17:19 -07:00
Leon Romanovsky
e54b6a3bcd RDMA/cm: Respect returned status of cm_init_av_by_path
Add missing check for failure of cm_init_av_by_path

Fixes: e1444b5a16 ("IB/cm: Fix automatic path migration support")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 08:29:24 -06:00
Jason Gunthorpe
896de0090a RDMA/core: Use dev_name instead of ibdev->name
These return the same thing but dev_name is a more conventional use of the
kernel API.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
2018-09-26 13:51:48 -06:00
Parav Pandit
cee104334c IB/core: Introduce and use sgid_attr in CM requests
For RoCE, when CM requests are received for RC and UD connections,
netdevice of the incoming request is unavailable. Because of that CM
requests are always forwarded to init_net namespace.

Now that we have the GID attribute available, introduce SGID attribute in
incoming CM requests and refer to the netdevice of it.  This is similar to
existing SGID attribute field in outgoing CM requests for RC and UD
transports.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-26 09:47:47 -06:00
Håkon Bugge
87a37ce9e4 IB/cm: Remove unused and erroneous msg sequence encoding
In cm_form_tid(), a two bit message sequence number is OR'ed into bit
31-30 of the lower TID value.

After commit f06d265375 ("IB/cm: Randomize starting comm ID"), the
local_id is XOR'ed with a 32-bit random value. Hence, bit 31-30 in the
lower TID now has an arbitrarily value and it makes no sense to OR in
the message sequence number.

Adding to that, the evolution in use of IDR routines in cm_alloc_id()
has always had the possibility of returning a value with bit 30 set.

In addition, said bits are never checked.

Hence, remove the encoding and the corresponding enum.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-09 11:39:28 -06:00
Parav Pandit
a8872d53e9 IB/cm: Use sgid_attr from the AV
Prior patches now ensure that the AV has a sgid_attr, if one would have
been required.  Instead of querying for one, take it directly from the AH.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25 14:19:57 -06:00
Parav Pandit
398391071f IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *'
While processing a path record entry in CM messages the associated GID
attribute is now also supplied.

Currently for RoCE a netdevice's net namespace pointer and ifindex are
stored in path record entry. Both of these fields of the netdev can change
anytime while processing CM messages. Additionally storing net namespace
without holding reference will lead to use-after-free crash. Therefore it
is removed. Netdevice information for RoCE is instead provided via
referenced gid attribute in ib_cm requests.

Such a design leads to a situation where the kernel can crash when the net
pointer becomes invalid. However today it is always initialized to
init_net, which cannot become invalid. In order to support processing
packets in any arbitrary namespace of the received packet, it is necessary
to avoid such conditions.

This patch removes the dependency on the net pointer and ifindex; instead
it will rely on SGID attribute which contains a pointer to netdev.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25 14:19:57 -06:00
Parav Pandit
815d456ef2 IB/cm: Pass the sgid_attr through various events
Make the sgid_attr available along with path information to the event
consumer, this allows the consumer to keep using the same GID table entry
as the event is related to.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25 14:19:57 -06:00
Parav Pandit
aa74f4878d IB: Make init_ah_attr_grh_fields set sgid_attr
Use the sgid and other information from the path record to figure out the
sgid_attrs.

Store the selected table entry in the sgid_attr for everything else to
use.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25 14:19:56 -06:00
Parav Pandit
b740321765 IB: Make ib_init_ah_attr_from_wc set sgid_attr
The work completion is inspected to determine what dgid table entry was
used to receieve the packet, produces a sgid_attr that matches and sticks
it in the ah_attr.

All callers of this function are now required to release the ah_attr on
success.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25 14:19:56 -06:00
Jason Gunthorpe
d97099fe53 IB{cm, core}: Introduce and use ah_attr copy, move, replace APIs
Introduce AH attribute copy, move and replace APIs to be used by core and
provider drivers.

In CM code flow when ah attribute might be re-initialized twice while
processing incoming request, or initialized once while from path record
while sending out CM requests. Therefore use rdma_move_ah_attr API to
handle such scenarios instead of memcpy().

Provider drivers keeps a copy ah_attr during the lifetime of the ah.
Therefore, use rdma_replace_ah_attr() which conditionally release
reference to old ah_attr and holds reference to new attribute whose
referrence is released when the AH is freed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18 11:11:26 -06:00
Parav Pandit
1dfce29457 IB: Replace ib_query_gid/ib_get_cached_gid with rdma_query_gid
If the gid_attr argument is NULL then the functions behave identically to
rdma_query_gid. ib_query_gid just calls ib_get_cached_gid, so everything
can be consolidated to one function.

Now that all callers either use rdma_query_gid() or ib_get_cached_gid(),
ib_query_gid() API is removed.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18 11:09:05 -06:00
Linus Torvalds
a1cdde8c41 4.18 Merge window pull request
This has been a quiet cycle for RDMA, the big bulk is the usual smallish
 driver updates and bug fixes. About four new uAPI related things. Not as much
 Szykaller patches this time, the bugs it finds are getting harder to fix.
 
 - More work cleaning up the RDMA CM code
 - Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns, i40iw, iw_cxgb4, mlx5, rxe
 - Driver specific resource tracking and reporting via netlink
 - Continued work for name space support from Parav
 - MPLS support for the verbs flow steering uAPI
 - A few tricky IPoIB fixes improving robustness
 - HFI1 driver support for the '16B' management packet format
 - Some auditing to not print kernel pointers via %llx or similar
 - Mark the entire 'UCM' user-space interface as BROKEN with the intent to remove it
   entirely. The user space side of this was long ago replaced with RDMA-CM and
   syzkaller is finding bugs in the residual UCM interface nobody wishes to fix because
   nobody uses it.
 - Purge more bogus BUG_ON's from Leon
 - 'flow counters' verbs uAPI
 - T10 fixups for iser/isert, these are Acked by Martin but going through the RDMA
   tree due to dependencies
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCgAGBQJbGEcPAAoJEDht9xV+IJsarBMQAIsAFOizycF0kQfDtvz1yHyV
 YjkT3NA71379DsDsCOezVKqZ6RtXdQncJoqqEG1FuNKiXh/rShR3rk9XmdBwUCTq
 mIY0ySiQggdeSIJclROiBuzLE3F/KIIkY3jwM80DzT9GUEbnVuvAMt4M56X48Xo8
 RpFc13/1tY09ZLBVjInlfmCpRWyNgNccDBDywB/5hF5KCFR/BG/vkp4W0yzksKiU
 7M/rZYyxQbtwSfe/ZXp7NrtwOpkpn7vmhED59YgKRZWhqnHF9KKmV+K1FN+BKdXJ
 V1KKJ2RQINg9bbLJ7H2JPdQ9EipvgAjUJKKBoD+XWnoVJahp6X2PjX351R/h4Lo5
 TH+0XwuCZ2EdjRxhnm3YE+rU10mDY9/UUi1xkJf9vf0r25h6Fgt6sMnN0QBpqkTh
 euRZnPyiFeo1b+hCXJfKqkQ6An+F3zes5zvVf59l0yfVNLVmHdlz0lzKLf/RPk+t
 U+YZKxfmHA+mwNhMXtKx7rKVDrko+uRHjaX2rPTEvZ0PXE7lMzFMdBWYgzP6sx/b
 4c55NiJMDAGTyLCxSc7ziGgdL9Lpo/pRZJtFOHqzkDg8jd7fb07ID7bMPbSa05y0
 BU5VpC8yEOYRpOEFbkJSPtHc0Q8cMCv/q1VcMuuhKXYnfSho3TWvtOSQIjUoU/q0
 8T6TXYi2yF+f+vZBTFlV
 =Mb8m
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a quiet cycle for RDMA, the big bulk is the usual
  smallish driver updates and bug fixes. About four new uAPI related
  things. Not as much Szykaller patches this time, the bugs it finds are
  getting harder to fix.

  Summary:

   - More work cleaning up the RDMA CM code

   - Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns,
     i40iw, iw_cxgb4, mlx5, rxe

   - Driver specific resource tracking and reporting via netlink

   - Continued work for name space support from Parav

   - MPLS support for the verbs flow steering uAPI

   - A few tricky IPoIB fixes improving robustness

   - HFI1 driver support for the '16B' management packet format

   - Some auditing to not print kernel pointers via %llx or similar

   - Mark the entire 'UCM' user-space interface as BROKEN with the
     intent to remove it entirely. The user space side of this was long
     ago replaced with RDMA-CM and syzkaller is finding bugs in the
     residual UCM interface nobody wishes to fix because nobody uses it.

   - Purge more bogus BUG_ON's from Leon

   - 'flow counters' verbs uAPI

   - T10 fixups for iser/isert, these are Acked by Martin but going
     through the RDMA tree due to dependencies"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (138 commits)
  RDMA/mlx5: Update SPDX tags to show proper license
  RDMA/restrack: Change SPDX tag to properly reflect license
  IB/hfi1: Fix comment on default hdr entry size
  IB/hfi1: Rename exp_lock to exp_mutex
  IB/hfi1: Add bypass register defines and replace blind constants
  IB/hfi1: Remove unused variable
  IB/hfi1: Ensure VL index is within bounds
  IB/hfi1: Fix user context tail allocation for DMA_RTAIL
  IB/hns: Use zeroing memory allocator instead of allocator/memset
  infiniband: fix a possible use-after-free bug
  iw_cxgb4: add INFINIBAND_ADDR_TRANS dependency
  IB/isert: use T10-PI check mask definitions from core layer
  IB/iser: use T10-PI check mask definitions from core layer
  RDMA/core: introduce check masks for T10-PI offload
  IB/isert: fix T10-pi check mask setting
  IB/mlx5: Add counters read support
  IB/mlx5: Add flow counters read support
  IB/mlx5: Add flow counters binding support
  IB/mlx5: Add counters create and destroy support
  IB/uverbs: Add support for flow counters
  ...
2018-06-07 13:04:07 -07:00
Kees Cook
acafe7e302 treewide: Use struct_size() for kmalloc()-family
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    void *entry[];
};

instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
uses. It was done via automatic conversion with manual review for the
"CHECKME" non-standard cases noted below, using the following Coccinelle
script:

// pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
//                      sizeof *pkey_cache->table, GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@

- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

// mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@

- alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

// Same pattern, but can't trivially locate the trailing element name,
// or variable name.
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
expression SOMETHING, COUNT, ELEMENT;
@@

- alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
+ alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-06 11:15:43 -07:00
Leon Romanovsky
2f5059a7af RDMA/cm: Abort loop in case of CM dequeue
In case CM work list is empty, the work pointer will be NULL,
so instead of kernel crash it is better to abort processing
of works.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-01 11:16:23 -04:00
Parav Pandit
e822ff213f IB/cm: Store and restore ah_attr during CM message processing
During CM request processing flow, ah_attr is initialized twice.
First based on wc. Secondly based on primary path record.
ah_attr initialization from path record can fail, which leads to ah_attr
zeroed out.

Therefore, always initialize ah_attr on stack during reinitialization
phase. If ah_attr init is successful, use the new ah_attry by
overwriting the old one. If the ah_attr init fails, continue to use the
last ah_attr.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-16 14:11:46 -06:00
Parav Pandit
0e225dcb76 IB/cm: Store and restore ah_attr during LAP msg processing
During CM LAP processing, ah_attr is reinitialized on receiving LAP
request. First likely during CM request processing.

ah_attr might get zero out if LAP processing fails.
Therefore, attempt to create new ah_attr for the LAP message.
If the initialization fails, continue with older ah_attr.
If the initialization passes, consider the new ah_attr by overwriting
the older one.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-16 14:11:46 -06:00
Parav Pandit
a5c57d3272 IB/cm: Avoid AV ah_attr overwriting during LAP message handling
AH attribute of the cm_id can be overwritten if LAP message is received
on CM request which is in progress. This bug got introduced to avoid
sleeping when spin lock is held as part of commit in Fixes tag.

Therefore validate the cm_id state first and continue to perform AV
ah_attr initialization.

Given that Aleternative path related messages are not supported for
RoCE, init_av_from_response/path is such messages are ok to be called
from blocking context.

Fixes: 33f93e1ebc ("IB/cm: Fix sleeping while spin lock is held")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-16 14:11:46 -06:00
Parav Pandit
97c45c2c28 IB/cm: Block processing alternate path handling RoCE Rx cm messages
Due to below reasons, it is better to not support alternate path receive
messages for RoCE in near term.

1. Alternate path for RoCE is not supported at rdmacm layer.
2. It is not supported in uverbs/core layer for RoCE.
3. Alternate path for IPv6 for link local address cannot resolve route
determinstically without a valid incoming interface id whose usecase
make sense only with dual port mode.
4. init_av_from_path while processing LAP messages for IB and RoCE can
lead to adding duplicate entry of AV into the port list, leads to list
corruption.
5. rdma-core userspace a well known userspace implementation has removed
support of libucm which use ucm.ko module, which is the only module that
can trigger alternate path related messages.
6. ucm kernel module is requested to be removed from the IB core in
patch [1].

[1] https://patchwork.kernel.org/patch/10268503/

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-27 16:22:12 -06:00
Parav Pandit
114cc9c4b1 IB/cma: Resolve route only while receiving CM requests
Currently CM request for RoCE follows following flow.
rdma_create_id()
rdma_resolve_addr()
rdma_resolve_route()
For RC QPs:
rdma_connect()
->cma_connect_ib()
  ->ib_send_cm_req()
    ->cm_init_av_by_path()
      ->ib_init_ah_attr_from_path()
For UD QPs:
rdma_connect()
->cma_resolve_ib_udp()
  ->ib_send_cm_sidr_req()
    ->cm_init_av_by_path()
      ->ib_init_ah_attr_from_path()

In both the flows, route is already resolved before sending CM requests.
Therefore, code is refactored to avoid resolving route second time in
ib_cm layer.
ib_init_ah_attr_from_path() is extended to resolve route when it is not
yet resolved for RoCE link layer. This is achieved by caller setting
route_resolved field in path record whenever it has route already
resolved.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-23 10:58:05 -06:00
Parav Pandit
cb12a8e2fa IB/cm: Introduce and use helper function to get cm_port from path
Introduce and use helper function get_cm_port_from_path() to get
cm_port based on the the path record entry.

Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:17:49 -06:00
Parav Pandit
a22af59ea9 IB/cm: Add and use a helper function to add cm_id's to the port list
Add and use helper function add_cm_id_to_port_list() to attach
cm_id to port list.

Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:17:48 -06:00
Leon Romanovsky
925f7ea7a6 RDMA/cm: Fix access to uninitialized variable
The ndev will be initialized and held only for successful
ib_get_cached_gid(), otherwise it is garbage stack memory.
Calling dev_put() in failure path is wrong.

Fixes: 16c72e4028 ("IB/cm: Refactor to avoid setting path record software only fields")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-28 14:07:16 -07:00
Parav Pandit
708ea056b3 IB/core: Avoid SGID attributes query while converting GID from OPA to IB
SGID attributes are not used during OPA to IB GID conversion.
Therefore don't query it.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-28 14:07:16 -07:00
Parav Pandit
16c72e4028 IB/cm: Refactor to avoid setting path record software only fields
When path ah_attr initialization from path record
fails, ib_cm_send_rej() uses av.ah_attr fields to send out reject
message. In such cases initialization of path record software fields
is not needed. Code is simplified for same.

Additionally in current code in cm_req_handler, when ib_get_cached_gid
fails for a given sgid_index of the GID of the GRH of the incoming CM MAD,
error code 12 is sent. This error code refers to primary GID in incoming
CM REQ and not for the GID in in MAD packet.
Therefore code is refactored to send code 5 (unsupported request) for such
error.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:12 -07:00
Parav Pandit
f6bdb14267 IB/{core, umad, cm}: Rename ib_init_ah_from_wc to ib_init_ah_attr_from_wc
Currently ib_init_ah_from_wc initializes address handle attributes and
not the address handle object itself.
To avoid confusion between ah_attr vs ah, ib_init_ah_from_wc is
renamed to ib_init_ah_attr_from_wc to reflect that its initialzes
ah_attr.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
4ad6a0245e IB/{core, cm, cma, ipoib}: Rename ib_init_ah_from_path to ib_init_ah_attr_from_path
Since ib_init_ah_from_path initializes the address handle attribute, it is
renamed to reflect so.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
33f93e1ebc IB/cm: Fix sleeping while spin lock is held
In case of LAP are used for RoCE, it can lead to a problem of sleeping a
context while spin lock is held in below flow.

cm_lap_handler
	->spin_lock
	-> <..switch_case..>
	-> cm_init_av_for_response
		-> ib_init_ah_from_wc
			-> rdma_addr_find_l2_eth_by_grh
				wait_for_completion()

Therefore ah attribute initialization is done for incoming lap requests
outside of the lock context.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
5cf3968afc IB/cm: Handle address handle attribute init error
cm_init_av_by_path depends on ib_init_ah_from_path to initialize ah
attribute and ib_init_ah_from_path() can fail, such error should not
be ignored.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:11 -07:00
Parav Pandit
0c4386ec77 IB/{cm, umad}: Handle av init error
cm_init_av_for_response depends on ib_init_ah_from_wc() whose return
status is ignored.
ib_init_ah_from_wc() can fail and its return status should be handled as
done in this patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:10 -07:00
Daniel Jurgens
119bf81793 IB/cm: Add debug prints to ib_cm
Add debug prints to the error paths in the connection manager control
flows, to help debug connection management problems.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:06 -07:00
Matan Barak
8b00914654 IB/core: Fix memory leak in cm_req_handler error flows
In cm_req_handler error flows, sometimes cm_id_priv->timewait_info
isn't free'd.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Mukesh Kacker <mukesh.kacker@oracle.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18 15:37:05 -07:00
Don Hiatt
c5c4e40e90 IB/CM: Change sgid to IB GID when handling CM request
ULPs do not understand OPA GIDs and will reject CM requests
if the sgid does not match the local_gid. In order to
fix this behavior we convert the OPA GID back to an IB GID.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-11 16:19:40 -07:00
Parav Pandit
5a3dc32372 IB/cm: Fix memory corruption in handling CM request
In recent code, two path record entries are alwasy cleared while
allocated could be either one or two path record entries.
This leads to zero out of unallocated memory.

This fix initializes alternative path record only when alternative path
is set.

While we are at it, path record allocation doesn't check for OPA
alternative path, but rest of the code checks for OPA alternative path.
Path record allocation code doesn't check for OPA alternative LID.
This can further lead to memory corruption when only one path record is
allocated, but there is actually alternative OPA path record present in CM
request.

Cc: <stable@vger.kernel.org> # v4.12+
Fixes: 9fdca4da4d ("IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:37:03 -04:00
Bart Van Assche
401c6ae363 IB/cm: Suppress gcc 7 fall-through complaints
Avoid that gcc 7 reports the following warning when building with W=1:

warning: this statement may fall through [-Wimplicit-fallthrough=]

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-14 20:47:05 -04:00
Bart Van Assche
9d18717790 IB/core: Simplify sa_path_set_[sd]lid() calls
Instead of making every caller convert the second argument of
sa_path_set_slid() and sa_path_set_dlid() to big endian format,
make these two functions accept LIDs in CPU endian format.
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Cc: Don Hiatt <don.hiatt@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-10 10:49:44 -04:00
Roland Dreier
c761611811 IB/cm: Fix sleeping in atomic when RoCE is used
A couple of places in the CM do

    spin_lock_irq(&cm_id_priv->lock);
    ...
    if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))

However when the underlying transport is RoCE, this leads to a sleeping function
being called with the lock held - the callchain is

    cm_alloc_response_msg() ->
      ib_create_ah_from_wc() ->
        ib_init_ah_from_wc() ->
          rdma_addr_find_l2_eth_by_grh() ->
            rdma_resolve_ip()

and rdma_resolve_ip() starts out by doing

    req = kzalloc(sizeof *req, GFP_KERNEL);

not to mention rdma_addr_find_l2_eth_by_grh() doing

    wait_for_completion(&ctx.comp);

to wait for the task that rdma_resolve_ip() queues up.

Fix this by moving the AH creation out of the lock.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-31 08:35:07 -04:00
Hiatt, Don
62ede77799 Add OPA extended LID support
This patch series primarily increases sizes of variables that hold
lid values from 16 to 32 bits. Additionally, it adds a check in
the IB mad stack to verify a properly formatted MAD when OPA
extended LIDs are used.

Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18 14:47:37 -04:00
Doug Ledford
d3cf4d9915 Merge branch 'misc' into k.o/for-next
Conflicts:
	drivers/infiniband/core/iwcm.c - The rdma_netlink patches in
	HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM,
	we aren't safe for that context) touched the same code.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18 14:10:23 -04:00
Sagi Grimberg
cb93e59777 cm: Don't allocate ib_cm workqueue with WQ_MEM_RECLAIM
create_workqueue always creates the workqueue with WQ_MEM_RECLAIM
and silences a flush dependency warn for WQ_LEGACY. Instead, we
want to keep the warn in case the allocator tries to flush the
cm workqueue because its very likely that cm work execution will
yield memory allocations (for example cm connection requests).

Reported-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18 10:46:20 -04:00
Dasaratharaman Chandramouli
ac3a949fb2 IB/CM: Set appropriate slid and dlid when handling CM request
If extended LIDs are being used, a connection request contains
OPA GIDs in them. Extract the lids from the OPA gids and populate
slid/dlid fields in the path records that are created when handling
a connection request.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-08 14:50:40 -04:00
Dasaratharaman Chandramouli
6b3c0e6e6d IB/CM: Create appropriate path records when handling CM request
When handling an incoming conection request, ib_cm creates
either an IB or an OPA path record based on the gid field
in the request.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-08 14:50:25 -04:00
Hiatt, Don
e92aa00a51 IB/CM: Add OPA Path record support to CM
Add OPA path record support to the Connection Manager.

Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-08 14:50:25 -04:00
Hiatt, Don
7db20ecd1d IB/core: Change wc.slid from 16 to 32 bits
slid field in struct ib_wc is increased to 32 bits.
This enables core components to use larger LIDs if needed.
The user ABI is unchanged and return 16 bit values when queried.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-08 14:50:25 -04:00
Majd Dibbiny
d3957b86a4 RDMA/SA: Fix kernel panic in CMA request handler flow
Commit 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB and
ROCE specific fields) moved the service_id to be specific attribute
for IB and OPA SA Path Record, and thus wasn't assigned for RoCE.

This caused to the following kernel panic in the CMA request handler flow:

[   27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[   27.074731] IP: __radix_tree_lookup+0x1d/0xe0
...
[   27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
[   27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
[   27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
...
[   27.075979] Call Trace:
[   27.076015]  radix_tree_lookup+0xd/0x10
[   27.076055]  cma_ps_find+0x59/0x70 [rdma_cm]
[   27.076097]  cma_id_from_event+0xd2/0x470 [rdma_cm]
[   27.076144]  ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
[   27.076193]  cma_req_handler+0x25/0x480 [rdma_cm]
[   27.076237]  cm_process_work+0x25/0x120 [ib_cm]
[   27.076280]  ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
[   27.076350]  cm_req_handler+0xb03/0xd40 [ib_cm]
[   27.076430]  ? sched_clock_cpu+0x11/0xb0
[   27.076478]  cm_work_handler+0x194/0x1588 [ib_cm]
[   27.076525]  process_one_work+0x160/0x410
[   27.076565]  worker_thread+0x137/0x4a0
[   27.076614]  kthread+0x112/0x150
[   27.076684]  ? max_active_store+0x60/0x60
[   27.077642]  ? kthread_park+0x90/0x90
[   27.078530]  ret_from_fork+0x2c/0x40

This patch moves it back to the common SA Path Record structure
and removes the redundant setter and getter.

Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively.

Fixes: 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB ands
	ROCE specific fields)
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-06-01 17:20:14 -04:00
Dasaratharaman Chandramouli
5752075144 IB/SA: Add OPA path record type
Add opa_sa_path_rec to sa_path_rec data structure.
The 'type' field in sa_path_rec identifies the
type of the path record.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:39:02 -04:00
Dasaratharaman Chandramouli
9fdca4da4d IB/SA: Split struct sa_path_rec based on IB and ROCE specific fields
sa_path_rec now contains a union of sa_path_rec_ib and sa_path_rec_roce
based on the type of the path record. Note that fields applicable to
path record type ROCE v1 and ROCE v2 fall under sa_path_rec_roce.
Accessor functions are added to these fields so the caller doesn't have
to know the type.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:38:19 -04:00
Dasaratharaman Chandramouli
dfa834e1d9 IB/SA: Introduce path record specific types
struct sa_path_rec has a gid_type field. This patch introduces a more
generic path record specific type 'rec_type' which is either IB, ROCE v1
or ROCE v2. The patch also provides conversion functions to get
a gid type from a path record type and vice versa

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:37:28 -04:00
Dasaratharaman Chandramouli
c2f8fc4ec4 IB/SA: Rename ib_sa_path_rec to sa_path_rec
Rename ib_sa_path_rec to a more generic sa_path_rec.
This is part of extending ib_sa to also support OPA
path records in addition to the IB defined path records.

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:37:28 -04:00
Dasaratharaman Chandramouli
82ffc22648 IB/CM: Add braces when using sizeof
This patch adds braces around parameters to sizeof
as called out by checkpatch

Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:36:39 -04:00
Dasaratharaman Chandramouli
44c58487d5 IB/core: Define 'ib' and 'roce' rdma_ah_attr types
rdma_ah_attr can now be either ib or roce allowing
core components to use one type or the other and also
to define attributes unique to a specific type. struct
ib_ah is also initialized with the type when its first
created. This ensures that calls such as modify_ah
dont modify the type of the address handle attribute.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli
d8966fcd4c IB/core: Use rdma_ah_attr accessor functions
Modify core and driver components to use accessor functions
introduced to access individual fields of rdma_ah_attr

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli
3652315934 IB/core: Rename ib_destroy_ah to rdma_destroy_ah
Rename ib_destroy_ah to rdma_destroy_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli
0a18cfe4f6 IB/core: Rename ib_create_ah to rdma_create_ah
Rename ib_create_ah to rdma_create_ah so its in sync with the
rename of the ib address handle attribute

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Dasaratharaman Chandramouli
90898850ec IB/core: Rename struct ib_ah_attr to rdma_ah_attr
This patch simply renames struct ib_ah_attr to
rdma_ah_attr as these fields specify attributes that are
not necessarily specific to IB.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01 14:32:43 -04:00
Moni Shoua
498683c6a7 IB/cma: Add debug messages to error flows
Print debug messages to the kernel log to add more
information about RDMA_CM events that indicate an error.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 16:20:37 -05:00
Linus Torvalds
4d5b57e05a Updates for 4.10 kernel merge window
- Shared mlx5 updates with net stack (will drop out on merge if Dave's
   tree has already been merged)
 - Driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe
 - Debug cleanups
 - New connection rejection helpers
 - SRP updates
 - Various misc fixes
 - New paravirt driver from vmware
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYUbAPAAoJELgmozMOVy/dMXcP/iuG5MNzfN8Ny1JftyBQGWg3
 cqoQ2OLj9CsXjwVB+5EqbcZHRZY852lKONaLoDKkIOx4YAXO2YuIKOp944vN7EQx
 96wfqzT1F5jzAcy5mYZXgLaStGFDAwejKMqeHd0LfJj3OEtemGnVPWYzyqSQmSKo
 dzJraS1Z9GIRppzU5WaRpB9PtRBkqIqGJ5vZ0EKLGhed5hYY5r0iMJB0GfriMRDO
 lJ4UUVfpsAoLPnqDBFH6IMn2V2UeAw9IR5zNa1mrM1RBfvt/uYTxrw1w3p9WoaNs
 GRodhk4DCeAfeyqzVPNBLyXZ4Zq4FzGe3UWM4qysJ1RR4oFNw9Cuw0Fqk8mrfznr
 7hv5TpGIckRZiKf8l6e+qLirF0qGtXJg29j2vPVQI9i5nSj95g1agA81PnLQlLLb
 flWyxeMj81my7lfMHN1xcV6pqPEKMCOysZmfcvVfJd2XxpjuVD7ekl/YXWp8o8kU
 YPdQMqPD626XsD8VpPdMszb9FPmx0JD0HEv+Y1rIFX8JegEI+c3H2X0dqC27T/Ou
 FEPWOy025EgHm0Fh/7eIzkG6tjZ4JHoCugJAcxNZGj2XW4eB6r5vY8UwJ8iQRv+n
 PVYHiy0UoIRePh0mrdOSSphGZMi/GO/DsqKwCtAMEK43WqZQju6wR7QSIGkh66mp
 4uSHJqpf3YEYylxGMhk3
 =QeGy
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is the complete update for the rdma stack for this release cycle.

  Most of it is typical driver and core updates, but there is the
  entirely new VMWare pvrdma driver. You may have noticed that there
  were changes in DaveM's pull request to the bnxt Ethernet driver to
  support a RoCE RDMA driver. The bnxt_re driver was tentatively set to
  be pulled in this release cycle, but it simply wasn't ready in time
  and was dropped (a few review comments still to address, and some
  multi-arch build issues like prefetch() not working across all
  arches).

  Summary:

   - shared mlx5 updates with net stack (will drop out on merge if
     Dave's tree has already been merged)

   - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe

   - debug cleanups

   - new connection rejection helpers

   - SRP updates

   - various misc fixes

   - new paravirt driver from vmware"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits)
  IB: Add vmw_pvrdma driver
  IB/mlx4: fix improper return value
  IB/ocrdma: fix bad initialization
  infiniband: nes: return value of skb_linearize should be handled
  MAINTAINERS: Update Intel RDMA RNIC driver maintainers
  MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers
  IB/core: fix unmap_sg argument
  qede: fix general protection fault may occur on probe
  IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc
  mlx5, calc_sq_size(): Make a debug message more informative
  mlx5: Remove a set-but-not-used variable
  mlx5: Use { } instead of { 0 } to init struct
  IB/srp: Make writing the add_target sysfs attr interruptible
  IB/srp: Make mapping failures easier to debug
  IB/srp: Make login failures easier to debug
  IB/srp: Introduce a local variable in srp_add_one()
  IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build
  IB/multicast: Check ib_find_pkey() return value
  IPoIB: Avoid reading an uninitialized member variable
  IB/mad: Fix an array index check
  ...
2016-12-15 12:03:32 -08:00
Doug Ledford
9032ad78bb Merge branches 'misc', 'qedr', 'reject-helpers', 'rxe' and 'srp' into merge-test 2016-12-14 14:44:47 -05:00
Hans Westgaard Ry
9315bc9a13 IB/core: Issue DREQ when receiving REQ/REP for stale QP
from "InfiBand Architecture Specifications Volume 1":

  A QP is said to have a stale connection when only one side has
  connection information. A stale connection may result if the remote CM
  had dropped the connection and sent a DREQ but the DREQ was never
  received by the local CM. Alternatively the remote CM may have lost
  all record of past connections because its node crashed and rebooted,
  while the local CM did not become aware of the remote node's reboot
  and therefore did not clean up stale connections.

and:

   A local CM may receive a REQ/REP for a stale connection. It shall
   abort the connection issuing REJ to the REQ/REP. It shall then issue
   DREQ with "DREQ:remote QPN” set to the remote QPN from the REQ/REP.

This patch solves a problem with reuse of QPN. Current codebase, that
is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
in CM. A problem with this is the timeconstants governing this
mechanism; they are up to 768 seconds and the interface may look
inresponsive in that period.  Issuing a DREQ (and receiving a DREP)
does the necessary cleanup and the interface comes up.

Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14 12:56:24 -05:00
Steve Wise
77a5db1315 rdma_cm: add rdma_reject_msg() helper function
rdma_reject_msg() returns a pointer to a string message associated with
the transport reject reason codes.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14 11:38:28 -05:00
Mark Bloch
9db0ff53cb IB/cm: Mark stale CM id's whenever the mad agent was unregistered
When there is a CM id object that has port assigned to it, it means that
the cm-id asked for the specific port that it should go by it, but if
that port was removed (hot-unplug event) the cm-id was not updated.
In order to fix that the port keeps a list of all the cm-id's that are
planning to go by it, whenever the port is removed it marks all of them
as invalid.

This commit fixes a kernel panic which happens when running traffic between
guests and we force reboot a guest mid traffic, it triggers a kernel panic:

 Call Trace:
  [<ffffffff815271fa>] ? panic+0xa7/0x16f
  [<ffffffff8152b534>] ? oops_end+0xe4/0x100
  [<ffffffff8104a00b>] ? no_context+0xfb/0x260
  [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
  [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
  [<ffffffff81084240>] ? process_timeout+0x0/0x10
  [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
  [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
  [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
  [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
  [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
  [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
  [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
  [<ffffffff8152a815>] ? page_fault+0x25/0x30
  [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
  [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
  [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
  [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
  [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
  [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
  [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
  [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
  [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
  [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
  [<ffffffff8109aef6>] ? kthread+0x96/0xa0
  [<ffffffff8100c20a>] ? child_rip+0xa/0x20
  [<ffffffff8109ae60>] ? kthread+0x0/0xa0
  [<ffffffff8100c200>] ? child_rip+0x0/0x20

Fixes: a977049dac ("[PATCH] IB: Add the kernel CM implementation")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-16 20:03:44 -05:00