Commit Graph

3460 Commits

Author SHA1 Message Date
Naresh Gottumukkala
c43e9ab84d RDMA/ocrdma: Increase STAG array size
1) Increase STAG Array size.
2) Max inline data size should be set to the same value
   used during QP creation
3) Set max_sge_rd to zero since we dont support RD transport in our adapters.
4) Max cqes reported in ibv_devinfo should be from QUERY_CONFIG.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:18:42 -07:00
Naresh Gottumukkala
cffce99051 RDMA/ocrdma: Dont use PD 0 for userpace CQ DB
Create_CQ verb doesn't provide a PD pointer.  So, until now we are
creating all (both userspace and kernel) CQ DB regions from PD0.  This
will result in mmapping PD0 to applications.  A rogue userspace
application can mess things up.

Also more serious issues is even the be2net NIC uses PD0.

This patch addresses this problem by:

1) Create a PD page for every userspace application when the
   alloc_ucontext is called. This will be destroyed in
   dealloc_ucontext.
2) All CQs for that context will use the PD allocated in ucontext.
3) The first create_PD call from application will result in returning
   the PD address from its ucontext (no new PD will be created).
4) For subsecquent create_pd calls from application, we create new PDs for
   the application.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:18:32 -07:00
Naresh Gottumukkala
2b51a9b9eb RDMA/ocrdma: FRMA code cleanup
1) Fixed setting FR_MR bit for FRWR stag allocation
2) Access rights are passsed during FRWR stage and not during STAT allocation stage
3) FRWR WQE structure cleanup
4) Add QP level signaled bit.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:17:56 -07:00
Naresh Gottumukkala
f11220ee69 RDMA/ocrdma: For ERX2 irrespective of Qid, num_posted offset is 24
1) All RQ doorbells are handled by ERX2 and doorbell->num_posted
   offset is constant to bit offset 24 for ERX2 irrspective of Q id.

2) Fixed RESET to INIT state change (from ERR->RST->INIT->RTR case).

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:17:55 -07:00
Naresh Gottumukkala
c88bd03ffc RDMA/ocrdma: Fix to work with even a single MSI-X vector
There are cases like SRIOV where can get only one MSI-X vector
allocated for RoCE.  In that case we need to use the vector for both
data plane and control plane.  We need to use EQ create version V2.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:17:54 -07:00
Naresh Gottumukkala
d3cb6c0b2a RDMA/ocrdma: Remove the MTU check based on Ethernet MTU
Also increase MAX AH to 512.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:17:53 -07:00
Naresh Gottumukkala
7c33880c3c RDMA/ocrdma: Add support for fast register work requests (FRWR)
Also get the max_srq value from query_config mailbox response.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:17:48 -07:00
Naresh Gottumukkala
43a6b4025c RDMA/ocrdma: Create IRD queue fix
1)	Fix ocrdma_get_num_posted_shift for upto 128 QPs.
2)	Create for min of dev->max_wqe and requested wqe in create_qp.
3)	As part of creating ird queue, populate with basic header templates.
4)	Make sure all the DB memory allocated to userspace are page aligned.
5)	Fix issue in checking the mmap local cache.
6)	Some code cleanup.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-09-02 21:16:21 -07:00
Naresh Gottumukkala
45e86b33ec RDMA/ocrdma: Cache recv DB until QP moved to RTR
1) In post recv, don't ring the DB doorbell if the QP is in RTR state.
   Cache the DB calls, until the QP is moved to RTS state.
2) Add max_rd_sge support to dev->attr.
3) Code cleanup in alloc_pd path.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 11:00:51 -07:00
Naresh Gottumukkala
7b9b1a596e RDMA/ocrdma: Remove __packed
1) Remove __packed for structures.
2) Align and pad all ABI structure to 64 bit boundaries
   instead of using __packed.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 10:59:44 -07:00
Naresh Gottumukkala
057729cb23 RDMA/ocrdma: Remove driver QP state machine
Remove QP state machine in ocrdma low-level driver and use on the core
IB stack's instead.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 10:58:38 -07:00
Naresh Gottumukkala
9c58726ba9 RDMA/ocrdma: Don't allow zero/invalid sgid usage
Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 10:58:38 -07:00
Naresh Gottumukkala
1afc0454b6 RDMA/ocrdma: Remove redundant dev reference
Remove redundant dev reference from structures:

1) ocrdma_cq.
2) ocrdma_ah.
3) ocrdma_hw_mr.
4) ocrdma_mw.
5) ocrdma_srq.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 10:58:38 -07:00
Naresh Gottumukkala
f99b1649db RDMA/ocrdma: Style and redundant code cleanup
Code cleanup and remove redundant code:

1) redundant initialization removed
2) braces changed as per CodingStyle.
3) redundant checks removed
4) extra braces in return statements removed.
5) removed unused pd pointer from mr.
6) reorganized get_dma_mr()
7) fixed set_av() to return error on invalid sgid index.
8) reference to ocrdma_dev removed from struct ocrdma_pd.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-08-12 10:58:37 -07:00
Roland Dreier
569935db80 Merge branches 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'misc', 'mlx4', 'mlx5', 'nes', 'ocrdma' and 'qib' into for-next 2013-07-31 14:24:06 -07:00
Erez Shitrit
c290414169 IPoIB: Fix pkey change flow for virtualization environments
IPoIB's required behaviour w.r.t to the pkey used by the device is the following:

- For "parent" interfaces (e.g ib0, ib1, etc) who are created
  automatically as a result of hot-plug events from the IB core, the
  driver needs to take whatever pkey vlaue it finds in index 0, and
  stick to that index.

- For child interfaces (e.g ib0.8001, etc) created by admin directive,
  the driver needs to use and stick to the value provided during its
  creation.

In SR-IOV environment its possible for the VF probe to take place
before the cloud management software provisions the suitable pkey for
the VF in the paravirtualed PKEY table index 0. When this is the case,
the VF IB stack will find in index 0 an invalide pkey, which is all
zeros.

Moreover, the cloud managment can assign the pkey value at index 0 at
any time of the guest life cycle.

The correct behavior for IPoIB to address these requirements for
parent interfaces is to use PKEY_CHANGE event as trigger to optionally
re-init the device pkey value and re-create all the relevant resources
accordingly, if the value of the pkey in index 0 has changed (from
invalid to valid or from valid value X to invalid value Y).

This patch enhances the heavy flushing code which is triggered by pkey
change event, to behave correctly for parent devices. For child
devices, the code remains the same, namely chases pkey value and not
index.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:23:44 -07:00
Or Gerlitz
3d790a4c26 IPoIB: Make sure child devices use valid/proper pkeys
Make sure that the IB invalid pkey (0x0000 or 0x8000) isn't used for
child devices.

Also, make sure to always set the full membership bit for the pkey of
devices created by rtnl link ops.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:23:40 -07:00
Jack Morgenstein
ef5ed4166f IB/core: Create QP1 using the pkey index which contains the default pkey
Currently, QP1 is created using pkey index 0. This patch simply looks
for the index containing the default pkey, rather than hard-coding
pkey index 0.

This change will have no effect in native mode, since QP0 and QP1 are
created before the SM configures the port, so pkey table will still be
the default table defined by the IB Spec, in C10-123: "If non-volatile
storage is not used to hold P_Key Table contents, then if a PM
(Partition Manager) is not present, and prior to PM initialization of
the P_Key Table, the P_Key Table must act as if it contains a single
valid entry, at P_Key_ix = 0, containing the default partition
key. All other entries in the P_Key Table must be invalid."

Thus, in the native mode case, the driver will find the default pkey
at index 0 (so it will be no different than the hard-coding).

However, in SR-IOV mode, for VFs, the pkey table may be
paravirtualized, so that the VF's pkey index zero may not necessarily
be mapped to the real pkey index 0. For VFs, therefore, it is
important to find the virtual index which maps to the real default
pkey.

This commit does the following for QP1 creation:

1. Find the pkey index containing the default pkey, and use that index
   if found.  ib_find_pkey() returns the index of the
   limited-membership default pkey (0x7FFF) if the full-member default
   pkey is not in the table.

2. If neither form of the default pkey is found, use pkey index 0
   (previous behavior).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:15:17 -07:00
Andi Shyti
618af3846b mlx5_core: Variable may be used uninitialized
In the sq_overhead() function, if qp_typ is equal to IB_QPT_RC, size
will be used uninitialized.

Signed-off-by: Andi Shyti <andi@etezian.org>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:12:33 -07:00
Dan Carpenter
92b0ca7cb1 IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()
We don't set "resp.reserved".  Since it's at the end of the struct
that means we don't have to copy it to the user.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:12:08 -07:00
Wei Yongjun
281d1a9211 IB/mlx5: Fix error return code in init_one()
Fix to return a negative error code from the error handling case
instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 14:12:07 -07:00
Jack Morgenstein
3eac103f83 IB/mlx4: Use default pkey when creating tunnel QPs
When creating tunnel QPs for special QP tunneling, look for the
default pkey in the slave's virtual pkey table.  If it is present, use
the real pkey index where the default pkey is located.

If the default pkey is not found in the pkey table, use the real pkey
index which is stored at index 0 in the slave's virtual pkey table
(this is the current behavior).

This change is required to support cloud computing, where the
paravirtualized index of the default pkey is moved to index 1 or
higher.  The pkey at paravirtualized index 0 is used for the default
IPoIB interface created by the VF.

Its possible for the pkey value at paravirtualized index 0 to be
invalid (zero) at VF probe time (pkey index 0 is mapped to real pkey
index 127, which contains pkey = 0).

At some point after the VF probe, the cloud computing interface at the
hypervisor maps virtual index 0 for the VF to the pkey index
containing the pkey that IPoIB will use in its operation.  However,
when the tunnel QP is created, the pkey at the slave's virtual index 0
is still mapped to the invalid pkey index, so tunnel QP creation
fails.

This commit causes the hypervisor to search for the default pkey in
the slave's pkey table -- and this pkey is present in the table (at
index > 0) at tunnel QP creation time, so that the tunnel QP creation
will succeed.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 12:22:12 -07:00
Sean Hefty
5eb695c177 RDMA/cma: Only call cma_save_ib_info() for CM REQs
Calling cma_save_ib_info() for CM SIDR REQs results in a crash
accessing an invalid path record pointer.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 00:50:44 -07:00
Sean Hefty
e511d1ae16 RDMA/cma: Fix accessing invalid private data for UD
If a application is using AF_IB with a UD QP, but does not provide any
private data, we will end up accessing invalid memory.  Check for this
case and handle it appropriately.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-31 00:50:40 -07:00
Paul Bolle
8fb488d740 RDMA/cma: Fix gcc warning
Building cma.o triggers this gcc warning:

    drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
    drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here

This is a false positive, as "port" will always be initialized if we're
at "found". But if we assign to "id_priv->id.port_num" directly, we can
drop "port". That will, obviously, silence gcc.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 16:11:22 -07:00
Roland Dreier
3c93f039d2 Revert "RDMA/nes: Fix compilation error when nes_debug is enabled"
This reverts commit bca1935ccd, which removes variables
nes_tcp_state_str and nes_iwarp_state_str, assuming that they aren't
defined.  However, they are defined within a #ifdef NES_DEBUG statement,
which if enabled causes "defined but not used" compiler warning, when
the variables are removed.

Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 15:48:35 -07:00
Mike Marciniszyn
b268e4db3d IB/qib: Add err_decode() call for ring dump
Commit 0b3ddf380c ("Log all SDMA errors unconditionally") missed
part of the patch.

This also corrects a format warning when dma_addr_t is 32 bits
on a 64 bit system.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 10:13:00 -07:00
Dan Carpenter
246fcdbc9d RDMA/cxgb3: Fix stack info leak in iwch_create_cq()
The "uresp.reserved" field isn't initialized on this path so it could
leak uninitialized stack information to the user.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 10:11:33 -07:00
Dan Carpenter
604296303f RDMA/nes: Fix info leaks in nes_create_qp() and nes_create_cq()
We pass a few bytes of uninitialized stack memory to the user here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 10:10:35 -07:00
Dan Carpenter
63ea374957 RDMA/ocrdma: Fix several stack info leaks
A grab bag of places which don't properly initialize stack data.  I
removed one place which cleared ".rsvd" because it's not needed now
that I have added a memset() earlier in the function.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 10:09:16 -07:00
Dan Carpenter
ae1fe07f3f RDMA/cxgb4: Fix stack info leak in c4iw_create_qp()
"uresp.ma_sync_key" doesn't get set on this path so we leak 8 bytes of data.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-30 10:07:56 -07:00
Roland Dreier
3606b99971 RDMA/ocrdma: Remove unused include
I'd like to remove rdma/ib_cache.h some day, so let's avoid
proliferating uses of it unnecessarily.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-26 10:03:04 -07:00
Linus Torvalds
c552441373 Main batch of InfiniBand/RDMA changes for 3.11 merge window:
- AF_IB (native IB addressing) for CMA from Sean Hefty
  - New mlx5 driver for Mellanox Connect-IB adapters (including post merge request fixes)
  - SRP fixes from Bart Van Assche (including fix to first merge request)
  - qib HW driver updates
  - Resurrection of ocrdma HW driver development
  - uverbs conversion to create fds with O_CLOEXEC set
  - Other small changes and fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJR30TKAAoJEENa44ZhAt0h854P/jvAhK5u+XTM5VyjAi0DKJ7P
 bWcsu+KxbOIFnjEdsYQl1mGP44gdO8GPZp7+JR5nDHDRpw9K76qy6QQiPbaF6Y8D
 cZH8Xlq4hzBfElTWBkExEemPrVUUq77j03FE9TBatdLAtEyYkgrNyqr7Ys6zVwVK
 ugR8nAahvnB7Jh1tsyZBBd9kfbWtXJnaGC8/Zk3Na4n4zXRAbr0DcnRF0sncTL38
 VFnWbi33OQAxu5bsb2jGec/SNP3BbNwspFPjSCKqiiItRaCj13JiHhrKKvVk4RZe
 hIRnPH47kjLRp2/PwBo6o+gTXZuRg48VGBx4CKUTwx1nCzPPN1iz9ZOfqUv9Qwcv
 LX8mxC7QS/Yvud4KeEBsj6kotb80EkRF2KV5RkIKCxQiwetGD9127bZylC8ttxGw
 2f6MzYtAGD4R4C10lO8N+59VugSg1xAvwsqz0a/jy2XyVHbI1ugQedzkB20x5WPY
 51S08ABvtU9yIxIYrw2VEaa/5WN+XJ6+LpG9QBAGXdMLiCiiAe7n/YzyXI6AgwaW
 Jl/uKr6H6/jEHUHKwkyqsmbpVGPhtGWu8deyr1oYvOEP4i48gcDqMQsfMcCISrQV
 MeQU3hS/obykUlNeqjmMI2CXrecqSsiq0hXd4DLaSoZ2Rb4Drx2Wj6sTQLIAgL2q
 GBYjHWMUpZXIFHQaH7am
 =nZh8
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA changes from Roland Dreier:
 - AF_IB (native IB addressing) for CMA from Sean Hefty
 - new mlx5 driver for Mellanox Connect-IB adapters (including post
   merge request fixes)
 - SRP fixes from Bart Van Assche (including fix to first merge request)
 - qib HW driver updates
 - resurrection of ocrdma HW driver development
 - uverbs conversion to create fds with O_CLOEXEC set
 - other small changes and fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
  mlx5: Return -EFAULT instead of -EPERM
  IB/qib: Log all SDMA errors unconditionally
  IB/qib: Fix module-level leak
  mlx5_core: Adjust hca_cap.uar_page_sz to conform to Connect-IB spec
  IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline
  IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
  mlx5_core: Fixes for sparse warnings
  IB/mlx5: Make profile[] static in main.c
  mlx5: Fix parameter type of health_handler_t
  mlx5: Add driver for Mellanox Connect-IB adapters
  IB/core: Add reserved values to enums for low-level driver use
  IB/srp: Bump driver version and release date
  IB/srp: Make HCA completion vector configurable
  IB/srp: Maintain a single connection per I_T nexus
  IB/srp: Fail I/O fast if target offline
  IB/srp: Skip host settle delay
  IB/srp: Avoid skipping srp_reset_host() after a transport error
  IB/srp: Fix remove_one crash due to resource exhaustion
  IB/qib: New transmitter tunning settings for Dell 1.1 backplane
  IB/core: Fix error return code in add_port()
  ...
2013-07-13 12:57:21 -07:00
Roland Dreier
e04abfa243 Merge branches 'mlx5', 'qib' and 'srp' into for-next 2013-07-11 16:49:30 -07:00
Dan Carpenter
5e631a03af mlx5: Return -EFAULT instead of -EPERM
For copy_to/from_user() failure, the correct error code is -EFAULT not
-EPERM.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-11 16:48:45 -07:00
Dean Luick
0b3ddf380c IB/qib: Log all SDMA errors unconditionally
This patch adds code to log SDMA errors for supportability purposes.

Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-11 16:47:06 -07:00
Mike Marciniszyn
308c813b19 IB/qib: Fix module-level leak
The vzalloc()'ed field physshadow is leaked on module unload.

This patch adds vfree after the sibling page shadow is freed.

Reported-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-11 16:46:44 -07:00
Bart Van Assche
80d5e8a235 IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline
If the transport layer is offline it is more appropriate to let
srp_abort() return FAST_IO_FAIL instead of SUCCESS.

Reported-by: Sebastian Riemer <sebastian.riemer@profitbricks.com>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-11 16:43:48 -07:00
Linus Torvalds
6d2fa9e141 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Lots of activity this round on performance improvements in target-core
  while benchmarking the prototype scsi-mq initiator code with
  vhost-scsi fabric ports, along with a number of iscsi/iser-target
  improvements and hardening fixes for exception path cases post v3.10
  merge.

  The highlights include:

   - Make persistent reservations APTPL buffer allocated on-demand, and
     drop per t10_reservation buffer.  (grover)
   - Make virtual LUN=0 a NULLIO device, and skip allocation of NULLIO
     device pages (grover)
   - Add transport_cmd_check_stop write_pending bit to avoid extra
     access of ->t_state_lock is WRITE I/O submission fast-path.  (nab)
   - Drop unnecessary CMD_T_DEV_ACTIVE check from
     transport_lun_remove_cmd to avoid extra access of ->t_state_lock in
     release fast-path.  (nab)
   - Avoid extra t_state_lock access in __target_execute_cmd fast-path
     (nab)
   - Drop unnecessary vhost-scsi wait_for_tasks=true usage +
     ->t_state_lock access in release fast-path.  (nab)
   - Convert vhost-scsi to use modern se_cmd->cmd_kref
     TARGET_SCF_ACK_KREF usage (nab)
   - Add tracepoints for SCSI commands being processed (roland)
   - Refactoring of iscsi-target handling of ISCSI_OP_NOOP +
     ISCSI_OP_TEXT to be transport independent (nab)
   - Add iscsi-target SendTargets=$IQN support for in-band discovery
     (nab)
   - Add iser-target support for in-band discovery (nab + Or)
   - Add iscsi-target demo-mode TPG authentication context support (nab)
   - Fix isert_put_reject payload buffer post (nab)
   - Fix iscsit_add_reject* usage for iser (nab)
   - Fix iscsit_sequence_cmd reject handling for iser (nab)
   - Fix ISCSI_OP_SCSI_TMFUNC handling for iser (nab)
   - Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED (nab)

  The last five iscsi/iser-target items are CC'ed to stable, as they do
  address issues present in v3.10 code.  They are certainly larger than
  I'd like for stable patch set, but are important to ensure proper
  REJECT exception handling in iser-target for 3.10.y"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
  iser-target: Ignore non TEXT + LOGOUT opcodes for discovery
  target: make queue_tm_rsp() return void
  target: remove unused codes from enum tcm_tmrsp_table
  iscsi-target: kstrtou* configfs attribute parameter cleanups
  iscsi-target: Fix tfc_tpg_auth_cit configfs length overflow
  iscsi-target: Fix tfc_tpg_nacl_auth_cit configfs length overflow
  iser-target: Add support for ISCSI_OP_TEXT opcode + payload handling
  iser-target: Rename sense_buf_[dma,len] to pdu_[dma,len]
  iser-target: Add vendor_err debug output
  target: Add (obsolete) checking for PMI/LBA fields in READ CAPACITY(10)
  target: Return correct sense data for IO past the end of a device
  target: Add tracepoints for SCSI commands being processed
  iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED
  iscsi-target: Fix ISCSI_OP_SCSI_TMFUNC handling for iser
  iscsi-target: Fix iscsit_sequence_cmd reject handling for iser
  iscsi-target: Fix iscsit_add_reject* usage for iser
  iser-target: Fix isert_put_reject payload buffer post
  iscsi-target: missing kfree() on error path
  iscsi-target: Drop left-over iscsi_conn->bad_hdr
  target: Make core_scsi3_update_and_write_aptpl return sense_reason_t
  ...
2013-07-11 12:57:19 -07:00
Linus Torvalds
496322bc91 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "This is a re-do of the net-next pull request for the current merge
  window.  The only difference from the one I made the other day is that
  this has Eliezer's interface renames and the timeout handling changes
  made based upon your feedback, as well as a few bug fixes that have
  trickeled in.

  Highlights:

   1) Low latency device polling, eliminating the cost of interrupt
      handling and context switches.  Allows direct polling of a network
      device from socket operations, such as recvmsg() and poll().

      Currently ixgbe, mlx4, and bnx2x support this feature.

      Full high level description, performance numbers, and design in
      commit 0a4db187a9 ("Merge branch 'll_poll'")

      From Eliezer Tamir.

   2) With the routing cache removed, ip_check_mc_rcu() gets exercised
      more than ever before in the case where we have lots of multicast
      addresses.  Use a hash table instead of a simple linked list, from
      Eric Dumazet.

   3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from
      Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
      Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

   4) Support reporting the TUN device persist flag to userspace, from
      Pavel Emelyanov.

   5) Allow controlling network device VF link state using netlink, from
      Rony Efraim.

   6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

   7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
      Daniel Borkmann and Eric Dumazet.

   8) Allow controlling of TCP quickack behavior on a per-route basis,
      from Cong Wang.

   9) Several bug fixes and improvements to vxlan from Stephen
      Hemminger, Pravin B Shelar, and Mike Rapoport.  In particular,
      support receiving on multiple UDP ports.

  10) Major cleanups, particular in the area of debugging and cookie
      lifetime handline, to the SCTP protocol code.  From Daniel
      Borkmann.

  11) Allow packets to cross network namespaces when traversing tunnel
      devices.  From Nicolas Dichtel.

  12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
      manner akin to how we monitor real network traffic via ptype_all.
      From Daniel Borkmann.

  13) Several bug fixes and improvements for the new alx device driver,
      from Johannes Berg.

  14) Fix scalability issues in the netem packet scheduler's time queue,
      by using an rbtree.  From Eric Dumazet.

  15) Several bug fixes in TCP loss recovery handling, from Yuchung
      Cheng.

  16) Add support for GSO segmentation of MPLS packets, from Simon
      Horman.

  17) Make network notifiers have a real data type for the opaque
      pointer that's passed into them.  Use this to properly handle
      network device flag changes in arp_netdev_event().  From Jiri
      Pirko and Timo Teräs.

  18) Convert several drivers over to module_pci_driver(), from Peter
      Huewe.

  19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
      O(1) calculation instead.  From Eric Dumazet.

  20) Support setting of explicit tunnel peer addresses in ipv6, just
      like ipv4.  From Nicolas Dichtel.

  21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

  22) Prevent a single high rate flow from overruning an individual cpu
      during RX packet processing via selective flow shedding.  From
      Willem de Bruijn.

  23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
      Dumazet.

  24) Don't just drop GSO packets which are above the TBF scheduler's
      burst limit, chop them up so they are in-bounds instead.  Also
      from Eric Dumazet.

  25) VLAN offloads are missed when configured on top of a bridge, fix
      from Vlad Yasevich.

  26) Support IPV6 in ping sockets.  From Lorenzo Colitti.

  27) Receive flow steering targets should be updated at poll() time
      too, from David Majnemer.

  28) Fix several corner case regressions in PMTU/redirect handling due
      to the routing cache removal, from Timo Teräs.

  29) We have to be mindful of ipv4 mapped ipv6 sockets in
      upd_v6_push_pending_frames().  From Hannes Frederic Sowa.

  30) Fix L2TP sequence number handling bugs, from James Chapman."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
  drivers/net: caif: fix wrong rtnl_is_locked() usage
  drivers/net: enic: release rtnl_lock on error-path
  vhost-net: fix use-after-free in vhost_net_flush
  net: mv643xx_eth: do not use port number as platform device id
  net: sctp: confirm route during forward progress
  virtio_net: fix race in RX VQ processing
  virtio: support unlocked queue poll
  net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
  Documentation: Fix references to defunct linux-net@vger.kernel.org
  net/fs: change busy poll time accounting
  net: rename low latency sockets functions to busy poll
  bridge: fix some kernel warning in multicast timer
  sfc: Fix memory leak when discarding scattered packets
  sit: fix tunnel update via netlink
  dt:net:stmmac: Add dt specific phy reset callback support.
  dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
  dt:net:stmmac: Allocate platform data only if its NULL.
  net:stmmac: fix memleak in the open method
  ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
  net: ipv6: fix wrong ping_v6_sendmsg return value
  ...
2013-07-09 18:24:39 -07:00
Roland Dreier
0eba551148 Merge branches 'af_ib', 'cxgb4', 'misc', 'mlx5', 'ocrdma', 'qib' and 'srp' into for-next 2013-07-08 11:22:11 -07:00
Roland Dreier
da183c7af8 IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
The macro get_unused_fd() is used to allocate a file descriptor with
default flags.  Those default flags (0) can be "unsafe": O_CLOEXEC must
be used by default to not leak file descriptor across exec().

Replace calls to get_unused_fd() in uverbs with calls to
get_unused_fd_flags(O_CLOEXEC).  Inheriting uverbs fds across exec()
cannot be used to do anything useful.

Based on a patch/suggestion from Yann Droneaud <ydroneaud@opteya.com>.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-08 11:15:45 -07:00
Roland Dreier
ad32b95f82 IB/mlx5: Make profile[] static in main.c
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-08 10:32:32 -07:00
Eli Cohen
e126ba97db mlx5: Add driver for Mellanox Connect-IB adapters
The driver is comprised of two kernel modules: mlx5_ib and mlx5_core.
This partitioning resembles what we have for mlx4, except that mlx5_ib
is the pci device driver and not mlx5_core.

mlx5_core is essentially a library that provides general functionality
that is intended to be used by other Mellanox devices that will be
introduced in the future.  mlx5_ib has a similar role as any hardware
device under drivers/infiniband/hw.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

[ Merge in coccinelle fixes from Fengguang Wu <fengguang.wu@intel.com>.
  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-08 10:32:24 -07:00
Nicholas Bellinger
ca40d24eb8 iser-target: Ignore non TEXT + LOGOUT opcodes for discovery
This patch adds a check in isert_rx_opcode() to ignore non TEXT + LOGOUT
opcodes when SessionType=Discovery has been negotiated.

Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:37:02 -07:00
Joern Engel
b79fafac70 target: make queue_tm_rsp() return void
The return value wasn't checked by any of the callers.  Assuming this is
correct behaviour, we can simplify some code by not bothering to
generate it.

nab: Add srpt_queue_data_in() + srpt_queue_tm_rsp() nops around
     srpt_queue_response() void return

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:36:53 -07:00
Nicholas Bellinger
adb54c2931 iser-target: Add support for ISCSI_OP_TEXT opcode + payload handling
This patch adds isert_handle_text_cmd() to handle incoming
ISCSI_OP_TEXT PDU processing, along with isert_put_text_rsp()
for posting ISCSI_OP_TEXT_RSP ib_send_wr response.

It copies ISCSI_OP_TEXT payload using unsolicited payload at
&iser_rx_desc->data[0] into iscsi_cmd->text_in_ptr for usage
with outgoing isert_put_text_rsp() -> iscsit_build_text_rsp()

v2 changes:
  - Let iscsit_build_text_rsp() determine any extra padding

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:36:48 -07:00
Nicholas Bellinger
dbbc5d1107 iser-target: Rename sense_buf_[dma,len] to pdu_[dma,len]
Now that these two variables are used for REJECT payloads as well
as SCSI response sense payloads, rename them to something that
makes more sense.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:36:47 -07:00
Nicholas Bellinger
c5a2adbfcb iser-target: Add vendor_err debug output
Add output for ib_wc.vendor_err in isert_cq_[t,r]x_work(), which
is useful for debugging future issues.

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:36:46 -07:00
Nicholas Bellinger
b2cb96494d iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED
This patch addresses a bug where RDMA_CM_EVENT_DISCONNECTED may occur
before the connection shutdown has been completed by rx/tx threads,
that causes isert_free_conn() to wait indefinately on ->conn_wait.

This patch allows isert_disconnect_work code to invoke rdma_disconnect
when isert_disconnect_work() process context is started by client
session reset before isert_free_conn() code has been reached.

It also adds isert_conn->conn_mutex protection for ->state within
isert_disconnect_work(), isert_cq_comp_err() and isert_free_conn()
code, along with isert_check_state() for wait_event usage.

(v2: Add explicit iscsit_cause_connection_reinstatement call
     during isert_disconnect_work() to force conn reset)

Cc: stable@vger.kernel.org  # 3.10+
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-07-07 18:35:56 -07:00